Hack U Barcelona 2011

Very brief intro to Semantic Web and BOSS for a Yahoo! Hack U event at UPC in Barcelona, Spain.

Hack U Barcelona 2011 Presentation Transcript

  • Fun with the Semantic Web
    Peter Mika
    Yahoo! Research Barcelona
    pmika@yahoo-inc.com
  • Vague, but exciting… Berners-Lee and the dawn of the Web
  • Semantic Web
    Publish data on the Web
    Linked Data: a web of data instead of a web of documents
    Query databases over the Web
    Two main architectural challenges
    A common format for sharing data
    Sharing the meaning of data
    Semantic Web standards from W3C
    Data and schema languages (RDF, OWL, RIF)
    Document formats (RDF/XML, RDFa)
    Protocols (SPARQL, HTTP)
    Semantic Web research into knowledge representation and reasoning, data integration, data quality and many other topics
    Community efforts to publish data and develop schemas (Linked Data)
  • RDF (Resource Description Framework)
    The basic data model of the Semantic Web
    A universal model to capture all sorts of data: networks, relational, object-oriented…
    Basic unit of information is a triple
    A tuple of (subject, predicate, object)
    Example: (Joe, loves, Mary)
    Each triple gives the value of a property for a given resource or relates two objects to one another
    Object is either a resource or a literal
    An RDF model is a set of triples
    Ordering of statements in an RDF document is irrelevant (unlike XML)
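
As a minimal sketch of the triple model described on this slide, an RDF model can be approximated in Python as a set of (subject, predicate, object) tuples. The `ex:` names are hypothetical identifiers chosen for illustration, and plain strings stand in for both resources and literals here.

```python
# A triple is a (subject, predicate, object) tuple; a model is a set of them.
# The ex: identifiers are illustrative placeholders, not real URIs.
model_a = {
    ("ex:Joe", "ex:loves", "ex:Mary"),
    ("ex:Joe", "ex:name", "Joe"),
}

# The same statements written in a different order.
model_b = {
    ("ex:Joe", "ex:name", "Joe"),
    ("ex:Joe", "ex:loves", "ex:Mary"),
}


def objects(model, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for (s, p, o) in model if s == subject and p == predicate}


# Sets ignore ordering and duplicates, so both models compare equal.
same_model = model_a == model_b
loved_by_joe = objects(model_a, "ex:Joe", "ex:loves")
```

Using a set rather than a list mirrors the point that statement order carries no meaning in RDF.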
  • Resources vs. literals
    Resources are identified by a URI; otherwise they are called blank nodes
    URIs are a generalization of URLs
    Notation: <http://www.example.org/Person> or ex:Person
    Literals have an optional language and datatype (string, integer etc.)
    Literals cannot be subjects of statements
    Datatypes are identified by URIs, e.g. XML Schema datatypes
    Two literals are the same if their components are the same
    Notation: “Joe B.”, “Joe”@en or “Joe”^^<http://…#string>
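
The distinction between resources and literals could be sketched as below. The class names `URIRef` and `Literal` and the helper `is_valid_triple` are assumptions made for this example (blank nodes are left out for brevity); they are not taken from any particular library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URIRef:
    """A resource identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A literal value with an optional language tag or datatype URI."""
    value: str
    language: Optional[str] = None
    datatype: Optional[str] = None


def is_valid_triple(subject, predicate, obj) -> bool:
    """Subjects and predicates must be resources; objects may be either."""
    return (
        isinstance(subject, URIRef)
        and isinstance(predicate, URIRef)
        and isinstance(obj, (URIRef, Literal))
    )


joe = URIRef("http://www.example.org/Joe")    # hypothetical URI
name = URIRef("http://www.example.org/name")  # hypothetical URI

# Two literals are the same only if all of their components match.
same = Literal("Joe", language="en") == Literal("Joe", language="en")
different = Literal("Joe", language="en") == Literal("Joe")
```

Frozen dataclasses compare field by field, which matches the rule that two literals are equal when their components are equal.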
  • Graphical and textual notation
    (my:Joe, type, foaf:Person)
    (my:Joe, name, “Joe A.”)
    A number of ways to serialize an RDF model into an RDF document
    RDF/XML, Turtle, N3, N-Triples
    Example: http://www.cs.vu.nl/~pmika/foaf.rdf
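
To illustrate what serialization means, here is a simplified sketch that writes the two statements from this slide in an N-Triples-like form, one statement per line. The URI used for `my:Joe` is a hypothetical placeholder, and the escaping covers only backslashes and double quotes, so this is not a complete N-Triples writer.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
JOE = "http://www.example.org/my#Joe"  # hypothetical URI standing in for my:Joe


def term(value, is_literal=False):
    """Format one term: literals in double quotes, URIs in angle brackets."""
    if is_literal:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def to_ntriples(triples):
    """Serialize (s, p, o, object_is_literal) tuples, one statement per line."""
    lines = [f"{term(s)} {term(p)} {term(o, lit)} ." for s, p, o, lit in triples]
    return "\n".join(sorted(lines))  # sorted only to make the output stable


doc = to_ntriples([
    (JOE, RDF_TYPE, FOAF + "Person", False),
    (JOE, FOAF + "name", "Joe A.", True),
])
print(doc)
```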
  • RDF is designed for the Web
    URIs provide web-wide global identification across datasets
    A resource may be described by multiple documents
    We know it’s the same resource because the same URI is used or through reasoning (advanced topic…)
    URIs are intended to be reused
    Unique, but not single identifiers: two URIs may denote the same thing
    URIs are dereferenceable (can be retrieved)
    A well-behaved URI returns a description of the resource
    Provides authority: the definition of foaf:Person lives at that URI
    Ontologies can be looked up as well
    Typically at the root of the URIs, also known as the namespace
    Example: http://xmlns.com/foaf/0.1/Person redirects to the specification
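
The relationship between prefixed names such as `foaf:Person`, full URIs and namespaces can be sketched as follows. The helper names `expand` and `namespace_of` are made up for this example, and splitting at the last `#` or `/` is a common convention rather than a rule of RDF.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://www.example.org/",
}


def expand(qname, prefixes=PREFIXES):
    """Turn a prefixed name like foaf:Person into a full URI."""
    prefix, _, local = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


def namespace_of(uri):
    """Return the namespace part: everything up to the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[:cut + 1]


person = expand("foaf:Person")
ns = namespace_of(person)
```

Dereferencing `person` over HTTP is what leads to the FOAF specification mentioned on the slide.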
  • URIs implicitly link data together
    (#joe, #loves, #mary)
    (#joe, #name, “Joe A.”)
    (#joe, #email, mailto:joe@joe.com)
    A dating site
    (#mary, name, “Mary B.”)
    (#mary, gender, “female”)
    Joe’s homepage
    Mary’s homepage
    (#name, #type, #Property)
    (#name, #domain, #Person)
    Schema doc
  • Put together, triples form a single ‘global’ graph
    (#joe, #name, “Joe A.”)
    (#joe, #email, “joe@joe.com”)
    (#joe, #loves, #mary)
    (#mary, #name, “Mary B.”)
    (#mary, #gender, “female”)
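
A rough sketch of how shared identifiers merge separate documents into one graph. Which triples come from which document is an assumption made for this example, and the predicates are normalized to the `#name` form.

```python
# Triples assumed to be published in two separate documents.
joe_page = {
    ("#joe", "#loves", "#mary"),
    ("#joe", "#name", "Joe A."),
    ("#joe", "#email", "mailto:joe@joe.com"),
}
mary_page = {
    ("#mary", "#name", "Mary B."),
    ("#mary", "#gender", "female"),
}

# Merging RDF models is a set union; shared identifiers join the data.
global_graph = joe_page | mary_page


def names_loved_by(graph, person):
    """Follow #loves from a person, then look up the #name of each target."""
    loved = {o for (s, p, o) in graph if s == person and p == "#loves"}
    return {o for (s, p, o) in graph if s in loved and p == "#name"}


from_joe_only = names_loved_by(joe_page, "#joe")
from_merged = names_loved_by(global_graph, "#joe")
```

The question "what is the name of the person Joe loves?" has no answer in Joe's document alone; it becomes answerable only after the union, because both documents use `#mary` for the same resource.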
  • Linked Data
    Open your data
    Publish it in RDF, the lingua franca of the data web
    Data first, schema second
    Worry about linking, data integration later… someone else can do it for you!
    Optionally, provide query access using the SPARQL query language and protocol
    Powerful, SQL-like query language
    HTTP or SOAP protocol to communicate with SPARQL servers
  • Linked Data cloud: interlinked RDF datasets on the Web
    http://linkeddata.org/
  • DBpedia
    DBpedia is a dataset that contains much of the structured data in Wikipedia
    Data from the info-boxes
    Links between Wikipedia pages
    Categories
    Disambiguation and redirect pages
    Links to other datasets
  • Fetching individual resources
    Use your web browser
    http://dbpedia.org/resource/Yahoo redirects to http://dbpedia.org/page/Yahoo
    You can plug this URI into other Linked Data browsers
    HTTP GET to fetch data
    Using curl: add Accept: application/rdf+xml for RDF and enable redirects
    curl -L -H 'Accept: application/rdf+xml' 'http://dbpedia.org/resource/Berlin'
    Data dumps
    http://wiki.dbpedia.org/Datasets
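
The same content negotiation can be sketched with Python's standard library. The function names are made up for this example, and decoding the response as UTF-8 is an assumption about the server's output.

```python
import urllib.request


def rdf_request(uri):
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri, timeout=10):
    """Send the request; urlopen follows HTTP redirects by default."""
    with urllib.request.urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.read().decode("utf-8")  # assumes UTF-8 output


request = rdf_request("http://dbpedia.org/resource/Berlin")
```

Calling `fetch_rdf("http://dbpedia.org/resource/Berlin")` would perform the network request; it is kept separate so the request can be inspected first.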
  • Querying using SPARQL
    Interactive query builders
    SPARQL Explorer: http://dbpedia.org/snorql/
    Examples at: http://wiki.dbpedia.org/OnlineAccess
    Using HTTP GET
    GET /sparql/?query=EncodedQuery HTTP/1.1
    Example:
    curl 'http://dbpedia.org/sparql?query=SELECT%20%3Ffilm%20WHERE%20%7B%20%3Ffilm%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FCategory%3AFrench_films%3E%20%7D'
    Result type is an XML document
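
The encoded URL above does not have to be written by hand. The sketch below builds it from the readable query and pulls variable bindings out of a result document in the SPARQL Query Results XML Format. `SAMPLE` is a hand-written stand-in for a server response (the film URI in it is hypothetical), and the helper names are assumptions for this example.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ENDPOINT = "http://dbpedia.org/sparql"
QUERY = (
    "SELECT ?film WHERE { ?film <http://purl.org/dc/terms/subject> "
    "<http://dbpedia.org/resource/Category:French_films> }"
)

# XML namespace of the SPARQL Query Results XML Format.
SR = "{http://www.w3.org/2005/sparql-results#}"


def sparql_url(endpoint, query):
    """Percent-encode the query and append it as the 'query' parameter."""
    return endpoint + "?query=" + urllib.parse.quote(query, safe="")


def bindings(xml_text, variable):
    """Collect the values bound to one variable in a result document."""
    root = ET.fromstring(xml_text)
    values = []
    for binding in root.iter(SR + "binding"):
        if binding.get("name") == variable:
            values.extend(child.text for child in binding)
    return values


# Hand-written sample response; the URI is a made-up placeholder.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="film"/></head>
  <results>
    <result>
      <binding name="film"><uri>http://dbpedia.org/resource/Example_Film</uri></binding>
    </result>
  </results>
</sparql>"""

url = sparql_url(ENDPOINT, QUERY)
films = bindings(SAMPLE, "film")
```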
  • More data
    New York Times
    http://data.nytimes.com/
    Example URI:
    http://data.nytimes.com/60694995023816375851
    Also supports JSON
    Append .json or set Accept:text/javascript
    Freebase
    http://freebase.com
    Example URI
    http://rdf.freebase.com/rdf/en.tron_legacy
    Data dump
    http://download.freebase.com
  • And more data…
    Geonames: open geo data
    Geonames.org
    http://sws.geonames.org/5130561/
    Download:
    http://www.geonames.org/export/
    Open Government data efforts
    Data.gov:
    Data.gov.uk
    http://data.gov.uk/sparql
  • Spanish open gov’t data and linked data efforts
    Spanish open data efforts
    La Asociación Española de Linked Data (AELID)
    http://aelid.es/
    Proyecto Aporta
    aporta.es
    Regional/local efforts
    risp.asturias.es (RDF, SPARQL)
    datos.zaragoza.es (RDF, SPARQL)
    opendata.euskadi.net (RDF)
    dadesobertes.gencat.cat (RDF)
    Competition AbreDatos 2010
    abredatos.es
  • More info
    Segaran et al.: Programming the Semantic Web, O’Reilly, 2010.
    linkeddata.org
    W3C Semantic Web Activity
    Presentations, guides etc.
    RDF Primer
    http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
    SPARQL query language and protocol specs
    http://www.w3.org/TR/rdf-sparql-protocol/
    http://www.w3.org/TR/rdf-sparql-query/
    Search SlideShare etc. for more intro material
  • Build your Own Search Service (BOSS)
    Peter Mika
    Yahoo! Research Barcelona
    pmika@yahoo-inc.com
  • Innovate with Search!
    It’s really simple…
    Example:
    pay $0.0008 for a query, earn $0.01 per query
    100,000 users a day, each making 1 query a day
    Earn $920 a day!
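
The arithmetic behind this slide, as a small sketch. `Decimal` is used so the cent fractions stay exact; the function name is made up for this example.

```python
from decimal import Decimal

cost_per_query = Decimal("0.0008")   # paid to BOSS per query
revenue_per_query = Decimal("0.01")  # earned per query
users_per_day = 100_000
queries_per_user = 1


def daily_profit(users, queries_each, revenue, cost):
    """Daily profit = number of queries * (revenue - cost) per query."""
    return users * queries_each * (revenue - cost)


profit = daily_profit(
    users_per_day, queries_per_user, revenue_per_query, cost_per_query
)
```

Each query nets $0.0092, so 100,000 queries a day come to $920.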
  • Yahoo BOSS: Yahoo’s Search API
    Ability to re-order results and blend in additional content
    No restrictions on presentation
    No branding or attribution
    Access to multiple verticals (web search, image, news)
    Spelling suggestions
    40+ supported language and region pairs
    Pricing (BOSS)
    10,000 free queries a day
    Pay for more queries
    Serve any ads you want
    For more info, http://developer.yahoo.com/search/boss/
    New in BOSS v2
    Powered by Bing
    Retrieve ads from Yahoo! and earn money ;)
  • Using BOSS
    Simple HTTP GET calls, no authentication
    Get an Application ID at
    http://developer.yahoo.com/search/boss/
    Example:
    http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={appid}&format=xml
    http://boss.yahooapis.com/ysearch/spelling/v1/{query}?appid={appid}&format=xml
    Documentation
    http://developer.yahoo.com/search/boss/boss_guide/
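
A sketch of filling in the URL templates above. The function name `boss_url` and the placeholder `YOUR_APP_ID` are assumptions for this example; the code only builds the URL and does not contact the service.

```python
import urllib.parse

BOSS_BASE = "http://boss.yahooapis.com/ysearch"


def boss_url(vertical, query, appid, fmt="xml"):
    """Fill in the {query} and {appid} slots of the v1 URL template."""
    path_query = urllib.parse.quote(query, safe="")  # query is a path segment
    params = urllib.parse.urlencode({"appid": appid, "format": fmt})
    return f"{BOSS_BASE}/{vertical}/v1/{path_query}?{params}"


web = boss_url("web", "semantic web", "YOUR_APP_ID")
spelling = boss_url("spelling", "semantc web", "YOUR_APP_ID")
```

Because the query sits in the URL path rather than in a parameter, spaces and other reserved characters need percent-encoding before the request is made.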
  • Queries you can play with
    Yahoo!’s WebScope program
    Data sharing with universities and research institutions
    Some of the most exciting data that we have!
    Request access online
    http://webscope.sandbox.yahoo.com/
    Requires approval by Department Chair
    For HackU, you can sign up here for access to a dataset containing real world user queries
    Yahoo! Search Tiny Sample v1.0: a set of 4,500 queries
    Ideal for testing and demonstrating your search-based apps
    Can you really show something interesting for all these users?