Hack U Barcelona 2011

Very brief intro to Semantic Web and BOSS for a Yahoo! Hack U event at UPC in Barcelona, Spain.

    Hack U Barcelona 2011 Presentation Transcript

    • Fun with the Semantic Web
      Peter Mika
      Yahoo! Research Barcelona
      pmika@yahoo-inc.com
    • Vague, but exciting… Berners-Lee and the dawn of the Web
    • Semantic Web
      Publish data on the Web
      Linked Data: a web of data instead of a web of documents
      Query databases over the Web
      Two main architectural challenges
      A common format for sharing data
      Sharing the meaning of data
      Semantic Web standards from W3C
      Data and schema languages (RDF, OWL, RIF)
      Document formats (RDF/XML, RDFa)
      Protocols (SPARQL, HTTP)
      Semantic Web research into knowledge representation and reasoning, data integration, data quality and many other topics
      Community efforts to publish data and develop schemas (Linked Data)
    • RDF (Resource Description Framework)
      The basic data model of the Semantic Web
      A universal model to capture all sorts of data: networks, relational, object-oriented…
      Basic unit of information is a triple
      A tuple of (subject, predicate, object)
      Example: (Joe, loves, Mary)
      Each triple gives the value of a property for a given resource or relates two objects to one another
      Object is either a resource or a literal
      An RDF model is a set of triples
      Ordering of statements in an RDF document is irrelevant (unlike XML)
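The slide above can be illustrated with a minimal sketch that treats a triple as a plain Python tuple and a model as a set of tuples. The helper `objects` is a hypothetical name introduced here for illustration, and bare strings stand in for real URIs.

```python
# A triple is a (subject, predicate, object) tuple; an RDF model is a set of them.
# Bare strings stand in for URIs and literals (an assumption made for brevity).
model_a = {
    ("Joe", "loves", "Mary"),
    ("Joe", "name", "Joe A."),
}

# The same statements written in a different order.
model_b = {
    ("Joe", "name", "Joe A."),
    ("Joe", "loves", "Mary"),
}


def objects(model, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for (s, p, o) in model if s == subject and p == predicate}


# Sets ignore ordering, so both models are the same model.
same_model = model_a == model_b
joe_loves = objects(model_a, "Joe", "loves")
```

Because a set has no order and no duplicates, stating a triple twice or shuffling the statements leaves the model unchanged.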
    • Resources vs. literals
      Resources are identified by a URI; a resource without a URI is called a blank node
      URIs are a generalization of URLs
      Notation: <http://www.example.org/Person> or ex:Person
      Literals have an optional language and datatype (string, integer etc.)
      Literals cannot be subjects of statements
      Datatypes are identified by URIs, e.g. XML Schema datatypes
      Two literals are the same if their components are the same
      Notation: “Joe B.”, “Joe”@en or “Joe”^^<http://…#string>
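The rule that two literals are the same when their components are the same maps naturally onto a value type. The sketch below is one possible way to model it; the `Literal` class and its field names are assumptions made for illustration, not part of any RDF library.

```python
from dataclasses import dataclass
from typing import Optional

# The XML Schema integer datatype URI, used here as an example datatype.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal: a lexical form plus optional language and datatype."""
    lexical: str
    language: Optional[str] = None
    datatype: Optional[str] = None  # datatypes are themselves identified by URIs


# Equality is component-wise, which is what a frozen dataclass gives for free.
same = Literal("Joe", language="en") == Literal("Joe", language="en")
different_language = Literal("Joe", language="en") == Literal("Joe")
different_datatype = Literal("42", datatype=XSD_INTEGER) == Literal("42")
```

A frozen dataclass is also hashable, so such literals can sit in the object position of triples stored in a set.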
    • Graphical and textual notation
      foaf:Person
      type
      my:Joe
      name
      “Joe A.”
      A number of ways to serialize an RDF model into an RDF document
      RDF/XML, Turtle, N3, N-Triples
      Example: http://www.cs.vu.nl/~pmika/foaf.rdf
    • RDF is designed for the Web
      URIs provide web-wide global identification across datasets
      A resource may be described by multiple documents
      We know it’s the same resource because the same URI is used or through reasoning (advanced topic…)
      URIs are intended to be reused
      Unique, but not single identifiers: two URIs may denote the same thing
      URIs are dereferenceable (can be retrieved)
      A well-behaved URI returns a description of the resource
      Provides authority: the definition of foaf:Person lives at that URI
      Ontologies can be looked up as well
      Typically at the root of the URIs, also known as the namespace
      Example: http://xmlns.com/foaf/0.1/Person redirects to the specification
    • URIs implicitly link data together
      (#joe, #loves, #mary)
      (#joe, #name, “Joe A.”)
      (#joe, #email, mailto:joe@joe.com)
      A dating site
      (#mary, name, “Mary B.”)
      (#mary, gender, “female”)
      Joe’s homepage
      Mary’s homepage
      (#name, #type, #Property)
      (#name, #domain, #Person)
      Schema doc
    • Put together, triples form a single ‘global’ graph
      “Joe A.”
      #name
      #joe
      #email
      “joe@joe.com”
      #loves
      “Mary B.”
      #name
      #mary
      #gender
      “female”
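The two slides above can be sketched as a set union: triples from separate documents merge into one graph because they share identifiers. Which document holds which triples is a guess from the flattened slide, and the names `joe_doc`, `mary_doc`, `schema_doc` and `names_of_loved` are hypothetical.

```python
# Triples as they might appear in three separate documents (assignment assumed).
joe_doc = {
    ("#joe", "#loves", "#mary"),
    ("#joe", "#name", "Joe A."),
    ("#joe", "#email", "mailto:joe@joe.com"),
}
mary_doc = {
    ("#mary", "#name", "Mary B."),
    ("#mary", "#gender", "female"),
}
schema_doc = {
    ("#name", "#type", "#Property"),
    ("#name", "#domain", "#Person"),
}

# Merging is just set union; shared identifiers such as #mary join the data.
global_graph = joe_doc | mary_doc | schema_doc


def names_of_loved(graph, person):
    """Follow #loves from a person, then look up #name of each loved resource."""
    loved = {o for (s, p, o) in graph if s == person and p == "#loves"}
    return {o for (s, p, o) in graph if s in loved and p == "#name"}


# Answerable only in the merged graph: Joe's document does not know Mary's name.
in_merged = names_of_loved(global_graph, "#joe")
in_joe_doc_only = names_of_loved(joe_doc, "#joe")
```

The query needs data from two documents, which is the point of the 'global' graph: no single publisher has to hold everything.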
    • Linked Data
      Open your data
      Publish it in RDF, the lingua franca of the data web
      Data first, schema second
      Worry about linking, data integration later… someone else can do it for you!
      Optionally, provide query access using the SPARQL query language and protocol
      Powerful, SQL-like query language
      HTTP or SOAP protocol to communicate with SPARQL servers
    • Linked Data cloud: interlinked RDF datasets on the Web
      http://linkeddata.org/
    • DBpedia
      DBpedia is a dataset that contains much of the structured data in Wikipedia
      Data from the info-boxes
      Links between Wikipedia pages
      Categories
      Disambiguation and redirect pages
      Links to other datasets
    • Fetching individual resources
      Use your web browser
      http://dbpedia.org/resource/Yahoo redirects to http://dbpedia.org/page/Yahoo
      You can plug this URI into other Linked Data browsers
      HTTP GET to fetch data
      Using curl: add the header Accept: application/rdf+xml to request RDF, and enable redirects with -L
      curl -L -H 'Accept:application/rdf+xml' 'http://dbpedia.org/resource/Berlin'
      Data dumps
      http://wiki.dbpedia.org/Datasets
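The curl call above could be reproduced roughly as follows with Python's standard library. This is a sketch: the function names are hypothetical, and whether the server still answers at this address is not something the slides guarantee.

```python
import urllib.request


def build_rdf_request(uri):
    """Build a GET request that asks for RDF/XML through content negotiation."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri):
    """Fetch the RDF description of a resource; urlopen follows redirects by default."""
    with urllib.request.urlopen(build_rdf_request(uri)) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_rdf(...) does.
request = build_rdf_request("http://dbpedia.org/resource/Berlin")
```

Calling `fetch_rdf("http://dbpedia.org/resource/Berlin")` performs the actual download, so it needs network access.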
    • Querying using SPARQL
      Interactive query builders
      SPARQL Explorer: http://dbpedia.org/snorql/
      Examples at: http://wiki.dbpedia.org/OnlineAccess
      Using HTTP GET
      GET /sparql/?query=EncodedQuery HTTP/1.1
      Example:
      curl 'http://dbpedia.org/sparql?query=SELECT%20%3Ffilm%20WHERE%20%7B%20%3Ffilm%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FCategory%3AFrench_films%3E%20%7D'
      Result type is an XML document
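The encoded query in the curl example is hard to read, so here is a sketch of how such a URL might be built from the plain SPARQL text. The function name `sparql_get_url` is an assumption; the endpoint and query are the ones from the slide.

```python
from urllib.parse import quote, unquote

# The readable form of the query that the slide shows percent-encoded.
QUERY = (
    "SELECT ?film WHERE { ?film <http://purl.org/dc/terms/subject> "
    "<http://dbpedia.org/resource/Category:French_films> }"
)


def sparql_get_url(endpoint, query):
    """Percent-encode the query and attach it as the 'query' parameter."""
    return endpoint + "?query=" + quote(query, safe="")


url = sparql_get_url("http://dbpedia.org/sparql", QUERY)

# Decoding the parameter gives back the original query text.
round_trip = unquote(url.split("?query=", 1)[1])
```

Passing `safe=""` makes `quote` encode `/` and `:` as well, which matches the encoding used in the slide's example.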
    • More data
      New York Times
      http://data.nytimes.com/
      Example URI:
      http://data.nytimes.com/60694995023816375851
      Also supports JSON
      Append .json or set Accept:text/javascript
      Freebase
      http://freebase.com
      Example URI
      http://rdf.freebase.com/rdf/en.tron_legacy
      Data dump
      http://download.freebase.com
    • And more data…
      Geonames: open geo data
      Geonames.org
      http://sws.geonames.org/5130561/
      Download:
      http://www.geonames.org/export/
      Open Government data efforts
      Data.gov
      Data.gov.uk
      http://data.gov.uk/sparql
    • Spanish open gov’t data and linked data efforts
      Spanish open data efforts
      La Asociación Española de Linked Data (AELID)
      http://aelid.es/
      Proyecto Aporta
      aporta.es
      Regional/local efforts
      risp.asturias.es (RDF, SPARQL)
      datos.zaragoza.es (RDF, SPARQL)
      opendata.euskadi.net (RDF)
      dadesobertes.gencat.cat (RDF)
      Competition AbreDatos 2010
      abredatos.es
    • More info
      Segaran et al.: Programming the Semantic Web, O’Reilly, 2010.
      linkeddata.org
      W3C Semantic Web Activity
      Presentations, guides etc.
      RDF Primer
      http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
      SPARQL query language and protocol specs
      http://www.w3.org/TR/rdf-sparql-protocol/
      http://www.w3.org/TR/rdf-sparql-query/
      Search SlideShare etc. for more intro material
    • Build your Own Search Service (BOSS)
      Peter Mika
      Yahoo! Research Barcelona
      pmika@yahoo-inc.com
    • Innovate with Search!
      It’s really simple…
      Example:
      pay $0.0008 for a query, earn $0.01 per query
      100,000 users a day, each making 1 query a day
      Earn $920 a day!
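The arithmetic behind the daily figure can be checked with a few lines. The variable names are chosen here for illustration; the numbers are the ones on the slide.

```python
from decimal import Decimal

# Figures from the slide; Decimal avoids floating-point rounding noise.
cost_per_query = Decimal("0.0008")
revenue_per_query = Decimal("0.01")
users_per_day = 100_000
queries_per_user = 1

queries_per_day = users_per_day * queries_per_user
margin_per_query = revenue_per_query - cost_per_query  # 0.0092 per query
daily_profit = queries_per_day * margin_per_query
```

The result is 920: the margin of $0.0092 per query times 100,000 queries a day.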
    • Yahoo BOSS: Yahoo’s Search API
      Ability to re-order results and blend in additional content
      No restrictions on presentation
      No branding or attribution
      Access to multiple verticals (web search, image, news)
      Spelling suggestions
      40+ supported language and region pairs
      Pricing (BOSS)
      10,000 free queries a day
      Pay for more queries
      Serve any ads you want
      For more info, http://developer.yahoo.com/search/boss/
      New in BOSS v2
      Powered by Bing
      Retrieve ads from Yahoo! and earn money ;)
    • Using BOSS
      Simple HTTP GET calls, no authentication
      Get an Application ID at
      http://developer.yahoo.com/search/boss/
      Example:
      http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={appid}&format=xml
      http://boss.yahooapis.com/ysearch/spelling/v1/{query}?appid={appid}&format=xml
      Documentation
      http://developer.yahoo.com/search/boss/boss_guide/
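The two example URLs follow one pattern, which can be sketched as a small URL builder. The function name `boss_url` and the placeholder application ID are assumptions, and the endpoint is the v1 address quoted on the slide, which may no longer be live.

```python
from urllib.parse import quote, urlencode

# Base address as given on the slide (BOSS v1).
BOSS_BASE = "http://boss.yahooapis.com/ysearch"


def boss_url(vertical, query, appid, fmt="xml"):
    """Fill in the slide's URL template for a vertical such as 'web' or 'spelling'."""
    path = f"{BOSS_BASE}/{vertical}/v1/{quote(query, safe='')}"
    return path + "?" + urlencode({"appid": appid, "format": fmt})


# YOUR_APP_ID is a placeholder for the Application ID obtained from Yahoo!.
web_url = boss_url("web", "semantic web", "YOUR_APP_ID")
spelling_url = boss_url("spelling", "semantik", "YOUR_APP_ID")
```

The query sits in the URL path, so it is percent-encoded with `quote`, while `appid` and `format` go through `urlencode` as ordinary query parameters.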
    • Queries you can play with
      Yahoo!’s WebScope program
      Data sharing with universities and research institutions
      Some of the most exciting data that we have!
      Request access online
      http://webscope.sandbox.yahoo.com/
      Requires approval by Department Chair
      For HackU, you can sign up here for access to a dataset containing real world user queries
      Yahoo! Search Tiny Sample v1.0: a set of 4,500 queries
      Ideal for testing and demonstrating your search-based apps
      Can you really show something interesting for all these users?