Social insights are one of today's Big Data topics. It is not enough to run analytics on the data you already have; you also need to incorporate data you can obtain, and the most readily obtainable data today is social. This presentation walks through setting up a Twitter API account and accessing its data from R. Along the way, it gives a basic understanding of JSON and of the R methods for working with ReST.
R-Users Group JSON and ReST Introduction using Twitter
1. Introduction to JSON
A walk through Twitter using R
and
Basics of ReST API access using R
By: Kevin J. Smith
Kansas City R-Users Group
2. Introduction
● Kevin J. Smith – IBM Informix – Automation Engineer
● Twitter API
● JSON
● ReST
● R httr and jsonlite
3. Twitter APIs
● Twitter ReST APIs:
https://dev.twitter.com/rest/public
● Step 1: Sign up for an API account –
https://dev.twitter.com
● Create an app to obtain OAuth authentication credentials –
https://dev.twitter.com/apps
● OAuth: https://dev.twitter.com/oauth
6. Why JSON
● http://www.json.org/
● JSON is most often compared to XML
● JSON is leaner and faster
– Smaller in size: an XML document is larger than the JSON
document holding the same data
– Writing and parsing JSON is faster for both humans and machines
● JSON is becoming the de facto format for storing object
data, replacing XML
● In the database world, JSON allows unstructured data,
e.g. photo metadata, to be stored and accessed
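As a rough illustration of the size claim, the sketch below encodes one made-up record both ways and counts characters in base R. The record and its field names are invented for the example, and the real saving depends on the data.

```r
# The same hypothetical record encoded as JSON and as XML.
json_doc <- '{"name":"Kevin","city":"Kansas City"}'
xml_doc  <- '<person><name>Kevin</name><city>Kansas City</city></person>'

# nchar() counts characters; XML repeats every tag name in a closing tag.
sizes <- c(json = nchar(json_doc), xml = nchar(xml_doc))
print(sizes)
```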
7. JSON (JavaScript Object Notation)
● Key:Value pairs
– Key – any valid Unicode string, written in double quotes. Special
characters such as double quotes must be escaped with a backslash.
– Value – Number, String, Boolean, Array, Object, null
– Example "Name":"Kevin"
● Objects or Documents
– Key:Value pairs enclosed in curly brackets
– Example {"Name":"Kevin"}
● Collection
– An array of documents
– Example [ {"name":"Kevin"} ,
{"name":"Sheryl"} ,
{"huh?":"what something different"} ]
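The object and collection examples above can be checked from R. This is a minimal sketch assuming the jsonlite package is installed; the variable names are chosen only for illustration.

```r
library(jsonlite)

# A key:value pair is only valid JSON once it is wrapped in an object.
obj_txt <- '{"Name":"Kevin"}'
# A collection: an array of documents that do not need to share keys.
coll_txt <- '[{"name":"Kevin"},{"name":"Sheryl"},{"huh?":"what something different"}]'

# validate() reports whether a string is syntactically valid JSON.
print(validate(obj_txt))
print(validate(coll_txt))

# simplifyVector = FALSE keeps the JSON shape: objects become named lists
# and arrays become unnamed lists.
obj  <- fromJSON(obj_txt, simplifyVector = FALSE)
coll <- fromJSON(coll_txt, simplifyVector = FALSE)

print(obj$Name)          # value stored under the key "Name"
print(length(coll))      # number of documents in the collection
print(names(coll[[3]]))  # the third document uses a different key
```

Keeping `simplifyVector = FALSE` is a deliberate choice here: with the default settings jsonlite would try to merge the three documents into one data frame, which hides the fact that they have different keys.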
8. JSON – Data Types
Value Type   Description
Number       Decimal number, typically handled as double-precision
             floating point, e.g. 12.138
String       Double-quoted Unicode with escaped special characters,
             e.g. "Hello \"Kevin\" World"
Boolean      true or false
Array        Square-bracketed list of values
Object       Curly-bracketed list of key:value pairs
null         Empty value
JSON Document example:
{ "name" : "kevin",
  "age" : 36.06 ,
  "titles" : [ "Dr." , "Sir.", "Mr.", 8284, true ] ,
  "offspring_oldest" : { "name" : "son", "age" : 12.21},
  "offspring_all" : [ { "name" : "son", "age" : 12.21},
                      { "name" : "daughter" , "age" : 8.92} ] ,
  "greying?" : false,
  "status" : null
}
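To see how these value types arrive in R, the document above can be parsed with jsonlite. This is a sketch assuming jsonlite is installed; the type mappings in the comments describe what `fromJSON()` returns with `simplifyVector = FALSE`.

```r
library(jsonlite)

doc_txt <- '{
  "name": "kevin",
  "age": 36.06,
  "titles": ["Dr.", "Sir.", "Mr.", 8284, true],
  "offspring_oldest": {"name": "son", "age": 12.21},
  "offspring_all": [{"name": "son", "age": 12.21},
                    {"name": "daughter", "age": 8.92}],
  "greying?": false,
  "status": null
}'

# Keep arrays and objects as lists so the JSON structure stays visible.
doc <- fromJSON(doc_txt, simplifyVector = FALSE)

print(class(doc$name))              # String  -> character
print(class(doc$age))               # Number  -> numeric
print(class(doc[["greying?"]]))     # Boolean -> logical
print(class(doc$titles))            # Array   -> list (mixed types allowed)
print(names(doc$offspring_oldest))  # Object  -> named list
print(is.null(doc$status))          # null    -> NULL

# Nested values are reached by chaining list access.
print(doc$offspring_all[[2]]$name)
```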
9. ReST (Representational State Transfer)
● HTTP vs ReST vs SOAP
● Access Methods: GET, POST, PUT, DELETE,
OPTIONS, HEAD, TRACE, CONNECT and
PATCH
● Headers describe the interaction between the
client and the server
● The body carries the data exchanged between
the client and the server
● Extra data can be sent in the URL as query
parameters, i.e. ?key=value&key=value, and must be
URL encoded
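The query-parameter rule can be sketched in base R. The search endpoint and the `q` and `count` parameters are used only as an illustration of the Twitter ReST API as documented around the time of this talk; no request is sent.

```r
# Query values may contain characters that are reserved in URLs.
params <- c(q = "#rstats json", count = "5")

# URLencode(reserved = TRUE) percent-encodes reserved characters
# such as '#' and the space.
encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)

# Join as key=value pairs separated by '&' and introduced by '?'.
query <- paste(names(encoded), encoded, sep = "=", collapse = "&")
url <- paste0("https://api.twitter.com/1.1/search/tweets.json", "?", query)
print(url)
```

In practice httr does this encoding itself when parameters are passed as `query = list(...)` to `GET()`, so building the string by hand is mainly useful for understanding what goes over the wire.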
10. R httr
● OAuth support comes with the httr package. This
allows OAuth tokens and secrets to be used
for authenticating a user to a data source
● ReST request
– GET(), POST(), PUT(),...
● JSON reply
– headers(), cookies(), http_status(), content()
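Put together, a request might look like the sketch below. It is not the code from the talk: the function name `fetch_timeline` and the environment variable names are hypothetical, the credentials are the four values shown on the Twitter app page, and the `user_timeline` endpoint is assumed to behave as the v1.1 API did at the time.

```r
# Sketch only: reads four credentials from (hypothetical) environment
# variables and fetches the most recent tweets of one account.
fetch_timeline <- function(screen_name, count = 1) {
  app <- httr::oauth_app(
    "twitter",
    key    = Sys.getenv("TWITTER_CONSUMER_KEY"),
    secret = Sys.getenv("TWITTER_CONSUMER_SECRET")
  )
  # sign_oauth1.0() builds a request config that signs the call.
  sig <- httr::sign_oauth1.0(
    app,
    token        = Sys.getenv("TWITTER_ACCESS_TOKEN"),
    token_secret = Sys.getenv("TWITTER_ACCESS_SECRET")
  )
  resp <- httr::GET(
    "https://api.twitter.com/1.1/statuses/user_timeline.json",
    query = list(screen_name = screen_name, count = count),
    sig
  )
  httr::stop_for_status(resp)             # turn HTTP errors into R errors
  print(httr::http_status(resp)$message)  # human-readable status line
  httr::content(resp, as = "parsed")      # JSON body parsed into R lists
}

# Not run here: needs valid credentials and network access.
# tweets <- fetch_timeline("ScienceChannel", count = 1)
```

Reading the secrets from the environment rather than writing them into the script is a design choice for the sketch, so that the keys do not end up in shared code or slides.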
12. cat(json1)
[{"created_at":["Fri Jul 08 19:35:07 +0000 2016"],"id":
[7.51499876924621e+17],"id_str":["751499876924620800"],"text":["Jupiter's moon,
Ganymede, has more water than the Pacific. https://t.co/RGgVqACgq2
https://t.co/Fguz7rXpBj"],"truncated":[false],"extended_entities":{"media":[{"id":
[7.51499874420548e+17],"id_str":["751499874420547584"],"indices":[[83],
[106]],"media_url":
["http://pbs.twimg.com/media/Cm3dJ3eWYAAPCYZ.jpg"],"media_url_https":
["https://pbs.twimg.com/media/Cm3dJ3eWYAAPCYZ.jpg"],"url":
["https://t.co/Fguz7rXpBj"],"display_url":
["pic.twitter.com/Fguz7rXpBj"],"expanded_url":
["http://twitter.com/ScienceChannel/status/751499876924620800/photo/1"],"type":
["photo"],"sizes":{"medium":{"w":[600],"h":[308],"resize":["fit"]},"large":{"w":[600],"h":
[308],"resize":["fit"]},"thumb":{"w":[150],"h":[150],"resize":["crop"]},"small":{"w":
[600],"h":[308],"resize":["fit"]}}}]},"source":["<a href=\"http://www.hootsuite.com\"
rel=\"nofollow\">Hootsuite</a>"],"in_reply_to_status_id":
{},"in_reply_to_status_id_str":{},"in_reply_to_user_id":{},"in_reply_to_user_id_str":
{},"in_reply_to_screen_name":{},"user":{"id":[16895274],"id_str":["16895274"]},"geo":
{},"coordinates":{},"place":{},"contributors":{},"is_quote_status":[false],"retweet_count":
[8],"favorite_count":[10],"favorited":[false],"retweeted":[false],"possibly_sensitive":
[false],"possibly_sensitive_appealable":[false],"lang":["en"]}]
json2$text
[[1]]
[1] "Jupiter's moon, Ganymede, has more water than the Pacific.
https://t.co/RGgVqACgq2 https://t.co/Fguz7rXpBj"
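The shape of this output, with every scalar wrapped in square brackets and empty fields shown as {}, is what jsonlite's `toJSON()` produces by default from a parsed list. The sketch below reproduces it offline with a hand-built, shortened stand-in for the tweet. How `json1` and `json2` were actually created in the talk is not shown, so the two assignments here are an assumption.

```r
library(jsonlite)

# Hand-built stand-in for one parsed tweet: a list of lists, with NULL
# for a missing field.
tweets <- list(list(
  id     = 751499876924620800,
  id_str = "751499876924620800",
  text   = "Jupiter's moon, Ganymede, has more water than the Pacific.",
  geo    = NULL
))

# Default toJSON(): scalars become length-one arrays and NULL becomes {}.
json1 <- toJSON(tweets)
cat(json1, "\n")

# auto_unbox = TRUE writes length-one vectors as plain JSON scalars instead.
cat(toJSON(tweets, auto_unbox = TRUE), "\n")

# Reading it back and pulling out the text of the first tweet.
json2 <- fromJSON(json1)
print(json2$text[[1]])

# The numeric id is a double and cannot hold every 18-digit integer
# exactly, so the string field id_str is the safer identifier.
print(json2$id_str[[1]])
```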