Published in: Technology, Education
Hacking with Semantic Web

  1. Hacking with Semantic Web
     Tom Praison, Developer @ Yahoo!
     http://twitter.com/tompraison
  2. What’s in here?
     • Evolution of the web
     • Poorly Solved Information Needs
     • Semantic Web Technologies
     • Linked Data
     • Demo of confhopper.in, a site built using open datasets
     • Some techniques for getting Structured Information from the Web
     • Demo of Yahoo! Contextual Analysis Platform and Open Dapper
  3. “I just had to take the hypertext idea and connect it to the Transmission Control Protocol and domain name system ideas and—ta-da!—the World Wide Web.”
     Tim Berners-Lee, inventor of the WWW
  4. WEB 1.0
     Few Content Creators! Majority Consumers!
     http://www.flickr.com/photos/leandrociuffo/3665883373/
  5. WEB 2.0
     Web as a platform
     http://www.flickr.com/photos/lambertwm/4737580179/
  6. WEB 1.0 vs WEB 2.0
     WEB 1.0                       WEB 2.0
     Ofoto                         Flickr
     Personal Website              Blogging
     Britannica Online             Wikipedia
     Directories (taxonomy)        Tagging (“folksonomy”)
     Content Management Systems    Wikis
  7. WEB 3.0
     Which direction will it take?
     http://www.flickr.com/photos/markhillary/337685031
  8. WEB 3.0
     Could be anything!
     • Semantic Web
     • Virtual Web
     • Pervasive Web
     • Artificial Intelligence
     • Personalization
  9. Today’s Web
     A Web of Documents rather than Data!
  10. Poorly Solved Information Needs
      • Multiple interpretations
         – Apple
      • Long tail queries
         – Roja (I meant a South Indian actress)
      • Imprecise or overly precise searches
         – jim hendler
         – pictures of strong adventures people
      • Searches for descriptions
         – countries in africa
         – 25 year old computer engineer living in Bangalore
         – Reliable smart phone under 15,000 rupees
  11. THE SOLUTION
      Semantic Web
  12. Publish data on the Web
      • Linked Data: linking data similar to how we link documents on the Web
      • Query databases over the Web
  13. Architectural Challenges
      • A common format for sharing data
      • Sharing the meaning of data
      • Infrastructure
  14. Semantic Web standards from W3C
      • Data and schema languages (RDF, OWL, RIF)
      • Document formats (RDF/XML, RDFa)
      • Protocols (SPARQL, HTTP)
  15. Current Research & Other Efforts
      • Semantic Web research into knowledge representation and reasoning, data integration, data quality and many other topics
      • Community effort (Linked Data movement)
  16. RDF (Resource Description Framework)
      • The basic data model of the Semantic Web
         – A universal model to capture all sorts of data: networks, relational, object-oriented…
      • Basic unit of information is a triple
         – A tuple of (subject, predicate, object)
         – Example: (Joe, loves, Mary)
         – Each triple gives the value of a property for a given resource or relates two objects to one another
            • Object is either a resource or a literal
      • An RDF model is a set of triples
         – Ordering of statements in an RDF document is irrelevant (unlike XML)
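The "set of triples" idea can be sketched in a few lines of Python. This is only an illustration, not how any RDF library stores data: the `Triple` type and the identifiers `my:Joe`, `my:loves` and `my:Mary` are made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper type, not from an RDF library)."""
    subject: str
    predicate: str
    obj: str  # either a resource identifier or a literal value


# Two models holding the same statements, written down in a different order.
graph_a = {
    Triple("my:Joe", "my:loves", "my:Mary"),
    Triple("my:Joe", "foaf:name", '"Joe A."'),
}
graph_b = {
    Triple("my:Joe", "foaf:name", '"Joe A."'),
    Triple("my:Joe", "my:loves", "my:Mary"),
}

# A Python set ignores ordering and duplicates, so the two models compare equal.
same_model = graph_a == graph_b
```

Using a set rather than a list is the design choice that mirrors the slide: statement order carries no meaning, and repeating a statement adds nothing.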
  17. Graphical and textual notation
      my:Joe --type--> foaf:Person
      my:Joe --name--> “Joe A.”
      A number of ways to serialize an RDF model into an RDF document: RDF/XML, Turtle, N3, N-Triples
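To make one of these serializations concrete, here is a rough sketch that writes the two statements from the diagram as N-Triples, the simplest of the listed formats (one statement per line, full URIs in angle brackets, literals in double quotes). The `http://example.org/joe` URI and the helper names are assumptions for illustration; a real project would more likely use an RDF library.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
JOE = "http://example.org/joe"  # hypothetical URI standing in for my:Joe


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object, is_literal) tuples, one per line."""
    lines = []
    for subject, predicate, obj, is_literal in sorted(triples):
        rendered = f'"{escape_literal(obj)}"' if is_literal else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines) + "\n"


model = {
    (JOE, RDF_TYPE, FOAF + "Person", False),
    (JOE, FOAF + "name", "Joe A.", True),
}
ntriples_text = to_ntriples(model)
```

The sort is only there to make the output reproducible; any ordering of the lines describes the same model.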
  18. RDF is designed for the Web
      • URIs provide web-wide global identification across datasets
         – A resource may be described by multiple documents
         – URIs are intended to be reused
         – Unique, but not single identifiers: two URIs may denote the same thing
  19. RDF is designed for the Web
      • URIs can be retrieved from the Web
         – A well-behaved URI returns a description of the resource
         – Provides authority: the definition of foaf:Person lives at that URI
      • Ontologies can be looked up as well
         – Typically at the root of the URIs, also known as the namespace
         – Example: http://xmlns.com/foaf/0.1/Person redirects to the specification
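The link between a short name such as foaf:Person and the URI that can be looked up is plain string concatenation with the namespace. A small sketch, assuming a hand-maintained prefix table (the `PREFIXES` dict and `expand` function are illustrative names, not part of any standard API):

```python
# Prefix table: the two namespace URIs are the published FOAF and RDF namespaces.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(prefixed_name: str, prefixes=PREFIXES) -> str:
    """Turn 'prefix:local' into a full URI using the namespace for 'prefix'."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


person_uri = expand("foaf:Person")
```

Here `person_uri` is the same URI quoted on the slide, so the expanded name can be pasted into a browser to reach the specification.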
  20. URIs implicitly link data together
      • A social networking site
         – (#joe, #loves, #mary)
      • Joe’s homepage
         – (#joe, #name, “Joe A.”)
         – (#joe, #email, mailto:joe@joe.com)
      • Mary’s homepage
         – (#mary, #name, “Mary B.”)
         – (#mary, #gender, “female”)
      • Schema doc
         – (#name, #type, #Property)
         – (#name, #domain, #Person)
  21. Put together, triples form a single ‘global’ graph
      #joe  --#name-->   “Joe A.”
      #joe  --#email-->  “joe@joe.com”
      #joe  --#loves-->  #mary
      #mary --#name-->   “Mary B.”
      #mary --#gender--> “female”
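A toy sketch of how separately published triples combine into one graph simply because they share identifiers. Which source publishes which triple is an assumption made for this example, and the helper `names_loved_by` is hypothetical.

```python
# Triples as (subject, predicate, object) tuples, grouped by an assumed source.
social_site = {("#joe", "#loves", "#mary")}
joe_homepage = {
    ("#joe", "#name", "Joe A."),
    ("#joe", "#email", "mailto:joe@joe.com"),
}
mary_homepage = {
    ("#mary", "#name", "Mary B."),
    ("#mary", "#gender", "female"),
}

# Merging is a set union: no schema mapping or join keys are needed,
# because "#joe" and "#mary" mean the same thing in every source.
global_graph = social_site | joe_homepage | mary_homepage


def names_loved_by(graph, person):
    """Follow #loves edges from `person`, then read each target's #name."""
    loved = {o for s, p, o in graph if s == person and p == "#loves"}
    return sorted(o for s, p, o in graph if s in loved and p == "#name")


answer = names_loved_by(global_graph, "#joe")
```

No single source can answer "what is the name of the person Joe loves?"; the merged graph can, which is the point of the 'global' graph.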
  22. RDF Example
  23. Linked Data cloud: interlinked RDF datasets on the Web
      http://linkeddata.org/
  24. DBpedia
      • DBpedia is a dataset that contains much of the structured data in Wikipedia
         – Data from the info-boxes
         – Links between Wikipedia pages
         – Categories
         – Disambiguation and redirect pages
      • Links to other datasets
  25. Fetching individual resources
      • Use your web browser
         – http://dbpedia.org/resource/Yahoo redirects to http://dbpedia.org/page/Yahoo
         – You can plug this URI into other Linked Data browsers
      • HTTP GET to fetch data
         – Using curl: send the header Accept: application/rdf+xml to request RDF, and pass -L to follow redirects
         – curl -L -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Berlin
      • Data dumps
         – http://wiki.dbpedia.org/Datasets
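The same content-negotiated request can be sketched with Python's standard library. The helper names below are made up for the example, and whether the endpoint still answers as described on the slide is an assumption; the request is only built here, not sent.

```python
import urllib.request


def build_rdf_request(resource_uri: str) -> urllib.request.Request:
    """Build a GET request that asks for RDF/XML instead of the HTML page."""
    return urllib.request.Request(
        resource_uri,
        headers={"Accept": "application/rdf+xml"},
    )


def fetch_rdf(resource_uri: str, timeout: float = 10.0) -> bytes:
    """Send the request; urlopen follows HTTP redirects by default (like curl -L)."""
    with urllib.request.urlopen(build_rdf_request(resource_uri), timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network.
request = build_rdf_request("http://dbpedia.org/resource/Berlin")
```

Calling `fetch_rdf("http://dbpedia.org/resource/Berlin")` would perform the actual download and return the raw RDF/XML bytes.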
  26. Querying using SPARQL
      • Interactive query builders
         – SPARQL Explorer: http://dbpedia.org/snorql/
         – Examples at: http://wiki.dbpedia.org/OnlineAccess
      • Using HTTP GET
         – GET /sparql/?query=EncodedQuery HTTP/1.1
         – Example:
              SELECT ?film ?x WHERE {
                ?film <http://dbpedia.org/ontology/language> <http://dbpedia.org/resource/French_language> .
                ?film <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Film>
              }
         – curl 'http://dbpedia.org/sparql?query=EncodedQuery'
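"EncodedQuery" means the SPARQL text is percent-encoded into the `query` parameter. The sketch below builds such a URL and reads a response in the standard SPARQL JSON results layout (`results` → `bindings` → variable → `value`). It projects only `?film`, the helper names are hypothetical, and `SAMPLE_RESPONSE` is a hand-written stand-in with example.org URIs, not real DBpedia output.

```python
import json
import urllib.parse

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """SELECT ?film WHERE {
  ?film <http://dbpedia.org/ontology/language> <http://dbpedia.org/resource/French_language> .
  ?film <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Film>
}"""


def build_query_url(endpoint: str, query: str) -> str:
    """Percent-encode the query text into the 'query' parameter of a GET URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def film_uris(response_text: str) -> list:
    """Pull the ?film values out of a SPARQL JSON results document."""
    data = json.loads(response_text)
    return [binding["film"]["value"] for binding in data["results"]["bindings"]]


# Hand-written sample in the SPARQL JSON results format (placeholder URIs).
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["film"]},
  "results": {"bindings": [
    {"film": {"type": "uri", "value": "http://example.org/film/1"}},
    {"film": {"type": "uri", "value": "http://example.org/film/2"}}
  ]}
}
"""

query_url = build_query_url(ENDPOINT, QUERY)
films = film_uris(SAMPLE_RESPONSE)
```

To get JSON back from a live endpoint, the request would normally also carry an `Accept: application/sparql-results+json` header.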
  27. ConfHopper.in
      • Award-winning app in the WWW2012 Metadata Challenge
      • Confhopper.in is a desktop / mobile HTML5-based application designed for conference attendees
      • Built with the help of open datasets from http://data.semanticweb.org/ and various other sources
  28. Some Techniques for getting Structured Information from the Web
      • Semantic Markup
      • NER
      • Extraction Tools (Dapper)
  29. Semantic Markup
      • Microdata (Schema.org)
      • RDFa
      • Open Graph Protocol (ogp.me)
      • Example: http://getschema.org/microdataextractor?url=http://www.tompraison.com&out=json
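As a rough illustration of what an extractor for such markup does, the sketch below collects Open Graph properties (`<meta property="og:..." content="...">`) from a page using only the standard library. The `OpenGraphParser` class and the sample HTML are invented for the example; it is not the extractor behind the URL above, and Microdata or RDFa would need more elaborate handling.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect og:* properties from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        # Keep only Open Graph properties that actually carry a value.
        if name.startswith("og:") and content is not None:
            self.properties[name] = content


# Hand-written sample page, not fetched from any real site.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Hacking with Semantic Web" />
<meta property="og:type" content="article" />
<meta name="description" content="ignored: not an Open Graph property" />
</head><body></body></html>
"""

parser = OpenGraphParser()
parser.feed(SAMPLE_HTML)
open_graph = parser.properties
```

The result is a plain dict of property names to values, which is roughly the kind of structured output a markup extractor returns as JSON.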
  30. NER – Named Entity Recognition
      • Yahoo! Content Analysis API
      • http://developer.yahoo.com/contentanalysis/
  31. Dapper
      http://open.dapper.net
      Dapper is a tool that enables users to create update feeds for their favorite sites and website owners to optimize and distribute their content in new ways.
  32. References
      • http://www.slideshare.net/tompraison
      • http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
      • http://www.freebase.com/
      • http://dbpedia.org/About