Fun with the Semantic Web
Peter Mika
Yahoo! Research Barcelona
pmika@yahoo-inc.com

Vague, but exciting…
Berners-Lee and the dawn of the Web

Semantic Web
  • Publish data on the Web
      • Linked Data: linking data similar to how we link documents on the Web
  • Query databases over the Web
  • Architectural challenges
      • A common format for sharing data
      • Sharing the meaning of data
  • Infrastructure
      • Semantic Web standards from W3C
          • Data and schema languages (RDF, OWL, RIF)
          • Document formats (RDF/XML, RDFa)
          • Protocols (SPARQL, HTTP)
      • Semantic Web research into knowledge representation and reasoning, data integration, data quality and many other topics
      • Community effort (Linked Data movement)

RDF (Resource Description Framework)
  • The basic data model of the Semantic Web
  • A universal model to capture all sorts of data: networks, relational, object-oriented…
  • Basic unit of information is a triple
      • A tuple of (subject, predicate, object)
      • Example: (Joe, loves, Mary)
      • Each triple gives the value of a property for a given resource or relates two objects to one another
      • Object is either a resource or a literal
  • An RDF model is a set of triples
      • Ordering of statements in an RDF document is irrelevant (unlike XML)
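
The "set of triples" idea can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the plain-string terms are assumptions made for the example, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; plain strings stand in for resources and literals."""
    subject: str
    predicate: str
    obj: str


# An RDF model is a set of triples, so a Python set is a natural stand-in.
model_a = {
    Triple("Joe", "loves", "Mary"),
    Triple("Joe", "name", "Joe A."),
}

# The same statements written down in a different order.
model_b = {
    Triple("Joe", "name", "Joe A."),
    Triple("Joe", "loves", "Mary"),
}

# Stating something twice does not change the model either.
model_b.add(Triple("Joe", "loves", "Mary"))

same_model = model_a == model_b
print(same_model)
```

Because sets ignore both order and repetition, the two models compare equal, which is the behaviour the slide contrasts with XML.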

Resources vs. literals
  • Resources are identified by a URI; a resource without a URI is called a blank node
      • URIs are a generalization of URLs
      • Notation: <http://www.example.org/Person> or ex:Person
  • Literals have an optional language and datatype (string, integer etc.)
      • Literals cannot be subjects of statements
      • Datatypes are identified by URIs, e.g. XML Schema datatypes
      • Two literals are the same if their components are the same
      • Notation: “Joe B.”, “Joe”@en or “Joe”^^<http://…#string>
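
A minimal sketch of "two literals are the same if their components are the same", assuming a literal is modelled as a lexical form plus an optional language tag and an optional datatype URI. The `Literal` class is hypothetical; the two datatype URIs are the standard XML Schema ones.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal: equality compares all three components."""
    lexical: str
    language: Optional[str] = None
    datatype: Optional[str] = None


# Same components, so the same literal.
plain_equal = Literal("Joe") == Literal("Joe")

# A language tag is a component, so these differ.
language_differs = Literal("Joe", language="en") != Literal("Joe")

# Same characters but different datatypes, so these differ too.
datatype_differs = (
    Literal("42", datatype=XSD_INTEGER) != Literal("42", datatype=XSD_STRING)
)

print(plain_equal, language_differs, datatype_differs)
```

Making the dataclass frozen also makes literals hashable, so they can be stored in a set of triples.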

Graphical and textual notation
  • [Figure: graph with edges my:Joe --type--> foaf:Person and my:Joe --name--> “Joe A.”]
  • A number of ways to serialize an RDF model into an RDF document
      • RDF/XML, Turtle, N3, N-Triples
      • Example: http://www.cs.vu.nl/~pmika/foaf.rdf
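
To make "serializing a model" concrete, here is a rough sketch that writes the small graph from this slide as N-Triples, the simplest of the formats listed. Several things are assumptions: the `my:` prefix is expanded to a made-up example.org namespace, the `name` edge is taken to be foaf:name, and the helper functions are invented for the illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
MY = "http://example.org/my#"  # hypothetical expansion of the my: prefix


def uri(value):
    """Tag a string as a URI reference."""
    return ("uri", value)


def literal(value):
    """Tag a string as a plain literal."""
    return ("literal", value)


def term_to_ntriples(term):
    """Render one term: URIs in angle brackets, literals in escaped quotes."""
    kind, value = term
    if kind == "uri":
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """One statement per line; sorted only to make the output repeatable."""
    lines = [
        " ".join(term_to_ntriples(term) for term in triple) + " ."
        for triple in triples
    ]
    return "\n".join(sorted(lines))


graph = {
    (uri(MY + "Joe"), uri(RDF_TYPE), uri(FOAF + "Person")),
    (uri(MY + "Joe"), uri(FOAF + "name"), literal("Joe A.")),
}

document = to_ntriples(graph)
print(document)
```

The sort is there purely for reproducible output; since an RDF model is a set, any line order describes the same graph.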

RDF is designed for the Web
  • URIs provide web-wide global identification across datasets
      • A resource may be described by multiple documents
      • We know it’s the same resource because the same URI is used or through reasoning (advanced topic…)
  • URIs are intended to be reused
      • Unique, but not the only, identifiers: a URI denotes one thing, yet two URIs may denote the same thing
  • URIs can be retrieved from the Web
      • A well-behaved URI returns a description of the resource
      • Provides authority: the definition of foaf:Person lives at that URI
  • Ontologies can be looked up as well
      • Typically at the root of the URIs, also known as the namespace
      • Example: http://xmlns.com/foaf/0.1/Person redirects to the specification

URIs implicitly link data together
  • Joe’s homepage
      • (#joe, #name, “Joe A.”)
      • (#joe, #email, mailto:joe@joe.com)
  • A dating site
      • (#joe, #loves, #mary)
  • Mary’s homepage
      • (#mary, #name, “Mary B.”)
      • (#mary, #gender, “female”)
  • Schema doc
      • (#name, #type, #Property)
      • (#name, #domain, #Person)

Put together, triples form a single ‘global’ graph
  • #joe --#name--> “Joe A.”
  • #joe --#email--> “joe@joe.com”
  • #joe --#loves--> #mary
  • #mary --#name--> “Mary B.”
  • #mary --#gender--> “female”
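
A small sketch of how the merge works when each document is treated as a set of triples: the global graph is simply their union, and shared identifiers such as #mary join the pieces. Which triple comes from which document is an assumption here, and the `follow` helper is invented for the example.

```python
# Triples as (subject, predicate, object) tuples, one set per source document.
joes_homepage = {
    ("#joe", "#name", "Joe A."),
    ("#joe", "#email", "mailto:joe@joe.com"),
}
dating_site = {
    ("#joe", "#loves", "#mary"),
}
marys_homepage = {
    ("#mary", "#name", "Mary B."),
    ("#mary", "#gender", "female"),
}

# Merging RDF documents is set union: no schema alignment step is needed.
global_graph = joes_homepage | dating_site | marys_homepage


def follow(graph, subject, predicate):
    """Return, sorted, every object reachable from subject via predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# "What is the name of the person Joe loves?" spans two documents.
loved = follow(global_graph, "#joe", "#loves")
loved_names = [name for person in loved for name in follow(global_graph, person, "#name")]
print(loved_names)
```

No single source can answer the question on its own: the dating site knows whom Joe loves, and only Mary's homepage knows her name.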

Linked Data
  • Open your data
      • Publish it in RDF, the lingua franca of the data web
  • Data first, schema second
      • Worry about linking, data integration later… someone else can do it for you!
  • Optionally, provide query access using the SPARQL query language and protocol
      • Powerful, SQL-like query language
      • HTTP or SOAP protocol to communicate with SPARQL servers

Linked Data cloud: interlinked RDF datasets on the Web
  • http://linkeddata.org/

DBpedia
  • DBpedia is a dataset that contains much of the structured data in Wikipedia
      • Data from the info-boxes
      • Links between Wikipedia pages
      • Categories
      • Disambiguation and redirect pages
      • Links to other datasets

Fetching individual resources
  • Use your web browser
      • http://dbpedia.org/resource/Yahoo redirects to http://dbpedia.org/page/Yahoo
      • You can plug this URI into other Linked Data browsers
  • HTTP GET to fetch data
      • Using curl: add the header Accept: application/rdf+xml to request RDF, and enable redirects with -L
      • curl -L -H 'Accept: application/rdf+xml' 'http://dbpedia.org/resource/Berlin'
  • Data dumps
      • http://wiki.dbpedia.org/Datasets
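
The same request can be sketched with Python's standard library. The function names are made up for this example, and whether the server still answers as it did when the slides were written is not something the sketch can promise.

```python
import urllib.request


def build_rdf_request(uri):
    """Prepare a GET request that asks for RDF/XML via content negotiation."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri, timeout=10.0):
    """Fetch the RDF description of a resource as raw bytes.

    urlopen follows HTTP redirects by default, which plays the role
    of curl's -L flag.
    """
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network; fetch_rdf does.
request = build_rdf_request("http://dbpedia.org/resource/Berlin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch_rdf("http://dbpedia.org/resource/Berlin")` would perform the actual download; the result can then be handed to any RDF parser.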

Querying using SPARQL
  • Interactive query builders
      • SPARQL Explorer: http://dbpedia.org/snorql/
      • Examples at: http://wiki.dbpedia.org/OnlineAccess
  • Using HTTP GET
      • GET /sparql/?query=EncodedQuery HTTP/1.1
      • Example:
        SELECT ?film WHERE {
          ?film <http://dbpedia.org/ontology/language> <http://dbpedia.org/resource/French_language> .
          ?film <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Film>
        }
      • curl 'http://dbpedia.org/sparql?query=EncodedQuery'
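
The "EncodedQuery" placeholder is the query text after URL encoding. A minimal sketch of that step using only the standard library follows; `sparql_get_url` is a name chosen for this example, and the query mirrors the slide's French-language film example, selecting only ?film.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """SELECT ?film WHERE {
  ?film <http://dbpedia.org/ontology/language> <http://dbpedia.org/resource/French_language> .
  ?film <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Film>
}"""


def sparql_get_url(endpoint, query):
    """Build the GET URL, percent-encoding the query text."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)

# Decoding the URL again should give back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]

print(url)
```

Encoding matters because the raw query contains spaces, `?`, `#`, `<` and `>`, all of which have special meaning or are not allowed in a URL.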

More data
  • New York Times
      • http://data.nytimes.com/
      • Example URI: http://data.nytimes.com/60694995023816375851
      • Also supports JSON: append .json or set Accept: text/javascript
  • Freebase
      • http://freebase.com
      • Example URI: http://rdf.freebase.com/rdf/en.tron_legacy
      • Data dump: http://download.freebase.com

And more data…
  • Geonames: open geo data
      • Geonames.org
      • http://sws.geonames.org/5130561/
      • Download: http://www.geonames.org/export/
  • Open Government data efforts
      • Data.gov
          • See apps e.g. http://flyontime.us
      • Data.gov.uk
          • http://data.gov.uk/sparql

Spanish open gov’t data and linked data efforts
  • Spanish open data efforts
      • La Asociación Española de Linked Data (AELID): http://aelid.es/
      • Proyecto Aporta: aporta.es
  • Regional/local efforts
      • risp.asturias.es (RDF, SPARQL)
      • datos.zaragoza.es (RDF, SPARQL)
      • opendata.euskadi.net (RDF)
      • dadesobertes.gencat.cat (RDF)
  • Competition AbreDatos 2010: abredatos.es

More info
  • Segaran et al.: Programming the Semantic Web, O’Reilly, 2010.
  • linkeddata.org
  • W3C Semantic Web Activity
      • Presentations, guides etc.
      • RDF Primer: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
      • SPARQL query language and protocol specs
          • http://www.w3.org/TR/rdf-sparql-protocol/
          • http://www.w3.org/TR/rdf-sparql-query/
  • Search SlideShare etc. for more intro material

Build your Own Search Service (BOSS)
Peter Mika
Yahoo! Research Barcelona
pmika@yahoo-inc.com

Innovate with Search!
  • It’s really simple…
  • Example: pay $0.0008 for a query, earn $0.01 per query
  • 100,000 users a day, each making 1 query a day
  • Earn $920 a day!
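
The $920 figure is just revenue minus cost over a day's queries. A quick check of the arithmetic, using `Decimal` to avoid floating-point rounding; the variable names are chosen for the example.

```python
from decimal import Decimal

cost_per_query = Decimal("0.0008")    # what you pay per query
revenue_per_query = Decimal("0.01")   # what you earn per query
users_per_day = 100_000
queries_per_user = 1

queries_per_day = users_per_day * queries_per_user

daily_cost = queries_per_day * cost_per_query        # 100,000 x $0.0008
daily_revenue = queries_per_day * revenue_per_query  # 100,000 x $0.01
daily_profit = daily_revenue - daily_cost

print(f"cost ${daily_cost:.2f}, revenue ${daily_revenue:.2f}, profit ${daily_profit:.2f}")
```

In other words, each query nets $0.0092, and 100,000 of them a day add up to the headline number.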

Reminds me of the Underpants Gnomes from South Park
  • http://en.wikipedia.org/wiki/Underpants_Gnomes

Yahoo BOSS: Yahoo’s Search API
  • Ability to re-order results and blend in additional content
  • No restrictions on presentation
  • No branding or attribution
  • Access to multiple verticals (web search, image, news)
  • Spelling suggestions
  • 40+ supported language and region pairs
  • Pricing (BOSS)
      • 10,000 free queries a day
      • Pay for more queries
      • Serve any ads you want
  • For more info, http://developer.yahoo.com/search/boss/
  • New in BOSS v2
      • Powered by Bing
      • Retrieve ads from Yahoo! and earn money ;)

Using BOSS
  • Simple HTTP GET calls, no authentication
  • Get an Application ID at http://developer.yahoo.com/search/boss/
  • Example:
      • http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={appid}&format=xml
      • http://boss.yahooapis.com/ysearch/spelling/v1/{query}?appid={appid}&format=xml
  • Documentation
      • http://developer.yahoo.com/search/boss/boss_guide/
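
Filling in the `{query}` and `{appid}` placeholders can be sketched as below. The URL pattern is the one shown on the slide; the `boss_url` helper and the `YOUR_APP_ID` value are placeholders invented for the example, and the endpoint may no longer be live.

```python
from urllib.parse import quote, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"  # base taken from the slide


def boss_url(vertical, query, appid, fmt="xml"):
    """Build a v1 request URL for a vertical such as 'web' or 'spelling'."""
    # The query is part of the path, so it is percent-encoded with no safe characters.
    path_query = quote(query, safe="")
    params = urlencode({"appid": appid, "format": fmt})
    return f"{BOSS_BASE}/{vertical}/v1/{path_query}?{params}"


web_url = boss_url("web", "semantic web", "YOUR_APP_ID")
spelling_url = boss_url("spelling", "semantic web", "YOUR_APP_ID")

print(web_url)
print(spelling_url)
```

Encoding the query with `safe=""` keeps characters such as `/` or `?` in a user's search terms from being read as part of the URL structure.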

Queries you can play with
  • Yahoo!’s WebScope program
      • Data sharing with universities and research institutions
      • Some of the most exciting data that we have!
      • Request access online: http://webscope.sandbox.yahoo.com/
      • Requires approval by Department Chair
  • For HackU, you can sign up here for access to a dataset containing real world user queries
      • Yahoo! Search Tiny Sample v1.0: a set of 4,500 queries
      • Ideal for testing and demonstrating your search-based apps
      • Can you really show something interesting for all these users?

Hack U Barcelona 2011
