Saveface - Save your Facebook content as RDF data

These slides share experience on how to crawl Facebook content and save it as RDF, then use Joseki (Jena) to serve the RDF data through a SPARQL endpoint.

Published in: Technology, Education

  1. Saveface – Save Facebook’s data as an RDF graph, using Jena, Joseki & the Facebook Graph API. Fuming Shih, fuming@mit.edu
  2. About Me
     • 4th-year graduate student at CSAIL, working with Hal Abelson
     • Member of the DIG group (Decentralized Information Group) at CSAIL
     • Working on topics relating to privacy, mobile context, and accountability
  3. Walled Gardens (picture taken from TimBL's presentation; labels on the picture: Saveflickr, Saveface, Simond, Seconos)
  4. Outline
     • Demo Saveface SPARQL endpoint
     • Overview
     • Set up Joseki SPARQL endpoint
     • From Protégé (data modeling) to Jena/Jastor (RDF library/SPARQL endpoint)
       – Protégé
       – Jastor
       – Facebook Graph API
       – Jena
  5. Overview
     • Protégé 4.1 (data modeling)
     • Jastor library (RDF to POJO)
     • Facebook Graph API
     • RestFB*
     • Jena/Jastor
  6. Set up Joseki (Jena)
     • Joseki is an HTTP engine that supports SPARQL (it uses Jetty and supports ARQ for Jena)
       – configured with a Turtle file
     • Get Jena 2.6.3, TDB 0.8.7, Joseki 3.4.2 at http://sourceforge.net/projects/jena/files/
       – or go to http://dig.csail.mit.edu/2010/aintno/rdfData/aintno_joseki.tar.gz for everything in one archive
       – Jena is now an Apache Incubator project (http://incubator.apache.org/jena/index.html)
     Source: http://ricroberts.com/articles/installing-jena-and-joseki-on-os-x-or-linux
  7. Set up the environment
     • export JOSEKIROOT=/path/to/Joseki-3.4.2
     • export TDBROOT=/path/to/TDB-0.8.7
     • export JENAROOT=/path/to/Jena-2.6.3
     • export CLASSPATH=.:$JENAROOT/lib/*.jar:$TDBROOT/lib/*.jar:$JOSEKIROOT/lib/*.jar
     • export PATH="$TDBROOT/bin:$JOSEKIROOT/bin:$PATH"
     • If you downloaded the all-in-one package (* all jars are under Joseki’s lib folder):
       – export JOSEKIROOT="/path/to/Joseki-aintno"
       – export PATH="$JOSEKIROOT/bin:$PATH"
       – export CLASSPATH=".:$JOSEKIROOT/lib/*.jar"
  8. Run Joseki
     • cd /path/to/Joseki
     • ./bin/rdfserver
       – ./bin/rdfserver --help (usage: joseki.rdfserver [--verbose] [--port N] dataSourceConfigFile)
     • Now open a browser at http://localhost:2020/
       – try the SPARQL query interface with the example data
  9. Joseki – HTTP access to the SPARQL endpoint
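
As a rough illustration of HTTP access to a SPARQL endpoint, the sketch below builds the GET form of a SPARQL protocol request (the query passed URL-encoded in the `query` parameter) using only the Java standard library. The endpoint path `http://localhost:2020/sparql` and the class name `SparqlGetUrl` are assumptions for illustration; the slides only give `http://localhost:2020/` as the server address.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SparqlGetUrl {

    /** Builds "<endpoint>?query=<url-encoded SPARQL>", the GET form of the SPARQL protocol. */
    static String build(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this should not occur.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: a service named "sparql" is routed on the server at port 2020.
        String url = build("http://localhost:2020/sparql",
                           "SELECT * WHERE { ?s ?p ?o } LIMIT 10");
        System.out.println(url);
    }
}
```

The resulting URL can be pasted into a browser or fetched with any HTTP client once the server is running.
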
  10. Saveface
      • Goal: save my Facebook data as linked data
      • Facebook *finally* provides a RESTful API to access its data (Facebook Graph API)
        – http://developers.facebook.com/docs/reference/api/
        – graph structure (e.g. Album class): http://developers.facebook.com/docs/reference/api/album/
  11. From Data Model to Java POJO
      • Used Protégé to create an OWL class for each of the Facebook classes
        – be aware that mapping from an object-oriented model to an ontology needs care
        – serialize as RDF files
      • Mapping ontologies (OWL files) to Java classes
        – used the Jastor library to generate Java interfaces, implementations, factories, and listeners based on the properties and class hierarchies in the Web ontologies
        – easier for Java developers without a Semantic Web background to make use of the ontology
  12. Jastor
      • Typesafe, ontology-driven RDF access from Java: http://jastor.sourceforge.net/
        – Uses Jena 2.4
      • Provides an interface for accessing, setting, and adding event listeners to the RDF model
      [Architecture diagram; labels: Jastor, Operator, listener, iCal, SIOC, Mapping tool, Tag, FOAF, Jena2 Platform (RDF model + Reasoning Engine + Persistence System), RDF DB, Ontology files, Java VM]
  13. Example
      Create mapping:
          // Register the ontology file, its URI, and the Java package to generate into
          JastorContext ctx = new JastorContext();
          ctx.addOntologyToGenerate(new FileInputStream("src/data/Tag.owl"),
                                    "http://www.mit.edu/dig/ns/tag",
                                    "edu.mit.dig.model.Tag");
          // Write the generated sources under gensrc/
          JastorGenerator gen = new JastorGenerator(new File("gensrc").getCanonicalFile(), ctx);
          gen.run();
      Make use of the class:
          // Create a Tag resource in the Jena model and set its properties
          Tag tag = edu.mit.dig.model.Tag.tagFactory.createTag(NS_PREFIX + "id_1", model);
          tag.addName("A tag");
          tag.addX(45);
          tag.addY(32);
  14. RestFB + RDF
      • RestFB is a Facebook Graph API client
      • Forked RestFB 1.5.4 and added RDFUtil.java
        – used Java reflection to convert each Facebook object in RestFB to a Jena RDF model (method toRDF())
      • Default domain name for Saveface data
        – http://servername:port_num/data/saveface/
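
The reflection idea can be sketched with the standard library alone. This is not the RDFUtil.java from the fork and it does not use Jena: it walks an object's declared fields and prints one N-Triples-style line per non-null field, treating every value as a plain string literal. The `Album` class, the `http://example.org/...` URIs, and the method name `toTriples` are hypothetical stand-ins.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class ReflectToTriples {

    /** Hypothetical stand-in for a Facebook object; not a real RestFB class. */
    static final class Album {
        String id = "123";
        String name = "Holiday";
        Integer count = 12;
        String location = null; // unset fields produce no triple
    }

    /** Emits one N-Triples-style line per non-null instance field of obj. */
    static List<String> toTriples(Object obj, String subjectUri, String vocabNs) {
        List<String> triples = new ArrayList<String>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            f.setAccessible(true);
            Object value;
            try {
                value = f.get(obj);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null) {
                continue;
            }
            // Minimal escaping only (backslash and double quote).
            String literal = value.toString().replace("\\", "\\\\").replace("\"", "\\\"");
            triples.add("<" + subjectUri + "> <" + vocabNs + f.getName() + "> \"" + literal + "\" .");
        }
        return triples;
    }

    public static void main(String[] args) {
        for (String t : toTriples(new Album(),
                "http://example.org/data/saveface/album/123",
                "http://example.org/fb#")) {
            System.out.println(t);
        }
    }
}
```

A version built on Jena would add statements to a model instead of formatting strings, and would map numeric and date fields to typed literals.
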
  15. Demo
      • git clone git@github.com:fumingshih/savefaceDemo.git
      • Log in to Facebook
      • Go to http://developers.facebook.com/docs/reference/api/
        – click on one of the links to view your content in JSON format (graph)
        – copy the access_token value that follows https://graph.facebook.com/me/friends?access_token
      • Run saveface.tutorial.Exercise1.java
        – paste the access_token string (* only valid for one hour)
        – change the directory for storing RDF (TDB files)
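
The copy-and-paste step for the access token can be sketched as follows. The helper pulls the `access_token` value out of a pasted Graph API URL and rebuilds a request URL of the same shape as `https://graph.facebook.com/me/friends?access_token=...`. The class and method names are hypothetical, the token in `main` is a placeholder rather than a real one, and this is not code from the savefaceDemo repository.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class GraphApiToken {

    static final String PARAM = "access_token=";

    /** Returns the decoded access_token value from a pasted URL, or null if it has none. */
    static String extractToken(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return null;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.startsWith(PARAM)) {
                try {
                    return URLDecoder.decode(pair.substring(PARAM.length()), "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return null;
    }

    /** Builds https://graph.facebook.com/<objectId>/<connection>?access_token=<token>. */
    static String connectionUrl(String objectId, String connection, String token) {
        try {
            return "https://graph.facebook.com/" + objectId + "/" + connection
                    + "?" + PARAM + URLEncoder.encode(token, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder token for illustration only.
        String pasted = "https://graph.facebook.com/me/friends?access_token=ABC%7C123";
        String token = extractToken(pasted);
        System.out.println(connectionUrl("me", "friends", token));
    }
}
```
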
  16. Access Saveface Data through Joseki
      • Open /path/to/your/Joseki/joseki-config.ttl
      • Three concepts in the configuration file
        – services
          • Services are the points that requests are sent to
          • Each service needs a dataset and a processor
          • The service reference and the routing of incoming requests by URI, as defined in web.xml, have to align
        – datasets
          • can be a path to the dataset
          • or a Jena assembler description that combines different named graphs
        – processors
          • set limitations on SPARQL queries (locking, no FROM/FROM NAMED)
      Reference: http://www.joseki.org/configuration.html
  17. Configuration Example (Service)
          # Service 3 - SPARQL processor only handling a given dataset (TDB)
          <#service3>
              rdf:type            joseki:Service ;
              rdfs:label          "SPARQL on the named graph of saveface" ;
              joseki:serviceRef   "saveface" ;   # web.xml must route this name to Joseki
              # dataset part
              joseki:dataset      <#savefacedata> ;
              # Service part.
              # This processor will not allow either the protocol,
              # nor the query, to specify the dataset.
              joseki:processor    joseki:ProcessorSPARQL_MultiDS ;
              .
  18. Configuration Example (Dataset)
          # init tdb
          [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
          tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
          tdb:GraphTDB    rdfs:subClassOf  ja:Model .

          <#savefacedata>  rdf:type tdb:DatasetTDB ;
              rdfs:label "saveface dataset" ;
              # change the line below to your path to the dataset
              tdb:location "/Users/fuming/tmp/saveface_demo" ;
              .
  19. Facebook Data as Linked Data!
      • Change the graph name to <urn:saveface:dataGraph:FumingShih> in the SPARQL query
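
To show where the graph name goes, here is a small sketch that builds a SPARQL query restricted to one named graph with a `GRAPH` clause. Only the graph URI `urn:saveface:dataGraph:FumingShih` comes from the slide; the query shape, the class name, and the limit are assumptions, and the actual demo queries may look different.

```java
final class NamedGraphQuery {

    /** Builds a SELECT over all triples in the given named graph. */
    static String selectFromGraph(String graphUri, int limit) {
        // Reject characters that would break out of the <...> IRI reference.
        for (int i = 0; i < graphUri.length(); i++) {
            char c = graphUri.charAt(i);
            if (c == '<' || c == '>' || c == '"' || Character.isWhitespace(c)) {
                throw new IllegalArgumentException("Not a usable graph URI: " + graphUri);
            }
        }
        return "SELECT ?s ?p ?o\n"
             + "WHERE {\n"
             + "  GRAPH <" + graphUri + "> { ?s ?p ?o }\n"
             + "}\n"
             + "LIMIT " + limit;
    }

    public static void main(String[] args) {
        // Replace the last segment of the URN with your own graph name.
        System.out.println(selectFromGraph("urn:saveface:dataGraph:FumingShih", 10));
    }
}
```

The printed query can be pasted into the endpoint's query form after swapping in your own graph name.
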
  20. References
      • http://incubator.apache.org/jena/index.html
      • http://www.joseki.org/
      • Graph API
        – http://developers.facebook.com/docs/reference/api/
      • Jastor
        – http://jastor.sourceforge.net/
      • RestFB (http://restfb.com/)
        – FB API browser (http://zestyping.livejournal.com/257224.html)
      • SavefaceDemo
        – https://github.com/fumingshih/savefaceDemo
        – More on the Saveface demo
          • http://dice.csail.mit.edu/aintno/ui/#aintno
          • http://dig.csail.mit.edu/wiki/SocialWebs_Data_Crawler/RDF_Repository_Setup
