Facebook has a huge data bank, and it allows us to make use of it to some extent.
October is a month of celebration in India. Festivals like Diwali and Dushehra fall in October, which makes the entire month a time for celebrations and reunions. Each year we meet our friends and relatives in a different place, so that it is easier for everyone to get together. Before going to the city, I update my FB/Twitter status to “Going to city xyz” and get a bunch of replies from people who are traveling to the same place. But almost every time I miss some of my friends simply for lack of information. This time I went a step further: I connected my FB account to R and looked for people whose current city is my target location. To my surprise, I found 10 more friends in the city whom I would have missed without this exercise. Along the way, I had a lot of fun with the other user profile variables FB lets us look at. In this article, I will walk readers through connecting FB to R and show how simple the process is. This type of analysis is not restricted to the case at hand; it applies to a much broader set of everyday problems.
We will use the Rfacebook library for this analysis.
[stextbox id=”section”] How to connect R to Facebook [/stextbox]
Facebook provides two simple ways to import data from the website. I will demonstrate the simpler one in this article.
Step 1: Go to https://developers.facebook.com/tools/explorer/. This will open the FB developer page.
Step 2: Change the API Version (red box in the picture) to “unversioned”.
Step 3: Click the “Get Access Token” (Green box in the picture).
Step 4: Check all the boxes in all three tabs. These are the permissions you are granting yourself. Assuming you do not wish to hide anything from yourself, you can safely check all boxes.
Step 5: Click the “Get Access Token” button. This token is valid for 2 hours.
Step 6: Store your token as a variable in RStudio. You can use the following code:
[stextbox id=”grey”]
> library(Rfacebook)
> token <- "XXXXX12333YYY"
> me <- getUsers("me", token=token)
> me$name
[1] "Tavish Srivastava"
[/stextbox]
Now you have Facebook connected to your R session for the next 2 hours.
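Since the token lasts only 2 hours, it can help to note when you copied it. The sketch below uses a hypothetical helper, `token_is_fresh`, which is not part of Rfacebook; it only compares timestamps in base R and assumes the 2-hour lifetime mentioned above.

```r
# Hypothetical helper (not an Rfacebook function): record when the token was
# copied from the Graph API Explorer and check whether 2 hours have passed.
token_issued_at <- Sys.time()

token_is_fresh <- function(issued_at, now = Sys.time(), lifetime_hours = 2) {
  # Age of the token in hours
  age_hours <- as.numeric(difftime(now, issued_at, units = "hours"))
  age_hours < lifetime_hours
}

# Right after copying the token
fresh_now <- token_is_fresh(token_issued_at)

# Simulate a check made 3 hours later
fresh_later <- token_is_fresh(token_issued_at, now = token_issued_at + 3 * 3600)
```

Data you have already pulled into objects such as `me` stays in your R session; only new requests to Facebook need a valid token.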
[stextbox id=”section”] Search people in a particular city among your friend list [/stextbox]
My relatives and I decided to meet in Pune (Maharashtra) this year, so “Pune” is the value I am looking for in the current location field of my friends' profiles. Imagine doing the same thing manually on Facebook. Let’s take a smarter route and check the frequency distribution of current location among the user IDs in my friend list. A few lines of R accomplish this.
Step 1: Pull out the list of all friends and their IDs.
Step 2: Pull the user details corresponding to this table of IDs.
Step 3: Check the frequency distribution of current locations. This shows whether the same city, “Pune”, appears in different formats.
This frequency distribution is one reason this method is more powerful than a traditional search on Facebook. For example, if we were meeting in Delhi, I would want to search Delhi, Gurgaon, Noida and possibly Faridabad for my friends. With this method, a single query covers all of them.
You can use the following code for these three steps:
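As a rough sketch of such a single query, the code below matches several city names at once with `grepl`. The `friends` data frame and its names and location strings are made up for illustration; with real data you would use the location column returned by `getUsers`, and the exact strings may differ.

```r
# Made-up stand-in for the friends table; real data would come from getUsers().
friends <- data.frame(
  first_name = c("Asha", "Ravi", "Meera", "Kabir", "Neha", "Arjun"),
  location   = c("New Delhi, India", "Gurgaon, Haryana", "Pune, Maharashtra",
                 "Noida, Uttar Pradesh", NA, "Faridabad, Haryana"),
  stringsAsFactors = FALSE
)

# One pattern covering every city of interest; ignore.case guards against
# differences in capitalisation. grepl() returns FALSE for missing locations.
ncr_pattern <- "Delhi|Gurgaon|Noida|Faridabad"
in_ncr <- grepl(ncr_pattern, friends$location, ignore.case = TRUE)

# Names of friends whose location matches any of the cities
ncr_friends <- friends$first_name[in_ncr]
ncr_friends
```

A pattern match is more forgiving than an exact comparison, which is useful when the same city shows up under slightly different strings.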
[stextbox id=”grey”]
> my_friends <- getFriends(token, simplify=TRUE)
> my_friends_info <- getUsers(my_friends$id, token=token, private_info=TRUE)
> table(my_friends_info$location)
[/stextbox]
We see that I have 16 friends with Pune as their current location. The table also gives me the exact string to search for. The following code finds these 16 friends.
[stextbox id=”grey”]
> Pune_resident <- subset(my_friends_info$first_name,my_friends_info$location == "Pune, Maharashtra")
> Pune_resident
[/stextbox]
Finally, I get the list of names of my friends whose current location is Pune. While doing this exercise, I found some other interesting facts about my friend list. It is very easy to tabulate the relationship_status of all your friends, and because it takes only a few possible values, the result is interesting to analyze. The following code tabulates the relationship_status of my friends.
[stextbox id=”grey”]
> table(my_friends_info$relationship_status)
Engaged : 3
In a relationship : 6
It’s complicated : 5
Married : 126
Single : 434
[/stextbox]
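One thing to keep in mind is that `table` drops missing values by default, so friends who have not filled in the field would not be counted. The sketch below, on a made-up vector and assuming missing values arrive as `NA`, counts them explicitly and also shows shares instead of raw counts.

```r
# Made-up relationship statuses; NA stands for friends who left the field blank.
status <- c("Single", "Married", NA, "Single", "Engaged", NA, "Single")

# Count every value, including the missing ones
counts <- table(status, useNA = "ifany")

# Most common status first
sorted_counts <- sort(counts, decreasing = TRUE)

# Share of each status, rounded to two decimals
shares <- round(prop.table(counts), 2)
shares
```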
As I have been busy with work lately, I completely lost track of who got engaged. Here is an easy way to find out:
[stextbox id=”grey”]
> engaged.friends <- subset(my_friends_info$first_name,my_friends_info$relationship_status == "Engaged")
> engaged.friends
[/stextbox]
I did a tabulation on each of the user information Facebook shared and discovered new things about my friends every single time.
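Tabulating each field by hand gets repetitive, so one possible shortcut is to loop over the columns with `lapply`. The data frame and column selection below are made up for illustration; the fields actually available depend on what Facebook returns for your account.

```r
# Made-up stand-in for the friends table returned by getUsers().
friends <- data.frame(
  first_name          = c("Asha", "Ravi", "Meera", "Kabir"),
  gender              = c("female", "male", "female", "male"),
  relationship_status = c("Single", "Married", "Single", NA),
  location            = c("Pune, Maharashtra", "Pune, Maharashtra", NA,
                          "Mumbai, Maharashtra"),
  stringsAsFactors = FALSE
)

# Columns worth tabulating (names are free text, so they are left out)
fields <- c("gender", "relationship_status", "location")

# One frequency table per column, keeping missing values visible
summaries <- lapply(friends[fields], table, useNA = "ifany")

summaries$location
```

The result is a named list, so `summaries$gender` or `summaries$relationship_status` gives the table for that field.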
[stextbox id=”section”] End Notes [/stextbox]
I found this small piece of analysis both interesting and insightful. It gives you a quick summary of your friend list: you can go through the user information of all your friends in less than 5 minutes. You can use this data to visualize your friends on a graph and see various clusters of population (Hint – you will need to use the igraph library for this). You can also do some cool things like defining the distance between nodes based on interactions on Facebook and seeing which people are closest to you as per Facebook.
How would you play around with your social media data? Have you done such small experiments on your Facebook or Twitter profile? Did you go beyond the scope of this article in your analysis? Please share your thoughts on the topic with us.
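To make the distance idea a little more concrete, here is a minimal base R sketch. The interaction counts are made up, and the formula `1 / (1 + total)` is just one assumed way of turning interactions into a distance; it is not something Facebook or Rfacebook provides.

```r
# Made-up interaction counts between me and each friend.
interactions <- data.frame(
  friend   = c("Asha", "Ravi", "Meera", "Kabir"),
  comments = c(12, 0, 3, 7),
  likes    = c(30, 2, 10, 5),
  stringsAsFactors = FALSE
)

# Assumed definition: more interactions means a smaller distance.
interactions$total    <- interactions$comments + interactions$likes
interactions$distance <- 1 / (1 + interactions$total)

# Friends ordered from closest to farthest
closest <- interactions[order(interactions$distance), c("friend", "distance")]
closest$friend
```

Distances defined this way could later serve as edge weights when you draw the network with a graph library.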
Hi Tavish, Thanks for the nicely explained article. As you mentioned that token is valid for 2 hours. So what happens to the data post the expiry of 2 hours. Also, can one extract entire fb data or only corresponding to one's friendlist. regards & care, Gaurav
Hi Tavish, Nice short sweet and most importantly concise and graphical article on FB . Similarly we can mine the tweets from twitter to get the sentiment on a particular person,product,party,celebrity etc.
very interesting