San Diego Meetup - Sem Web Overview - 2009.04.27

On April 27, 2010, I presented these slides to the San Diego Semantic Web Meetup in Carlsbad, CA.

  • Hello, welcome. I want this to be participatory: there are a lot of great brains in the room, and this is a place to get answers! No stupid questions here. I may suggest that we postpone some of them if we start getting too deep into an area that I think can be better covered elsewhere, or if I need to find a resource to point you to, but please ask your questions! That’s your job – to take what you can from tonight and enjoy yourselves. My job is to not be boring. So let’s get started…
  • Fortunately, I have some colleagues joining me who will help in that regard. This morning, I’m going to begin by giving you a ramp-up into the world of Linked Open Data and the technologies used in both LOD and LED initiatives. Once I have given you that background, Bernadette Hyland from Zepheira will take you through some of the exciting work that they are doing with Linked Data to solve very real business problems faster and less expensively than would be possible using other approaches. This afternoon, David Wood will take us further into the idea of implementing these technologies in the enterprise. So let’s get started…
  • I use these terms interchangeably. There is a lot of discussion about what they each mean, which is perhaps ironic, since the meaning of Semantic in this case is the same thing it means in linguistics or philosophy: that is, “MEANING.” Another term that has been gaining a lot of traction is… [BUILD] Web of Data. I like this term a lot and I hope that by the end of this session, you will understand why.
  • Let’s look at some ‘versions’ of the Web. It should be said here that Tim Berners-Lee, the recognized “father” of the WWW, doesn’t like the idea of versioning the Web. I happen to agree, but I understand why people do it. As we talk about these versions of the Web, you may want to think of this as a continuum with significant waves, each with its own benchmark technologies, rather than specific versions with distinct start and end points. Nova Spivack of Radar Networks and Twine.com created this diagram.
  • When the Web came about, people were excited because here was a way to post documents and share them quickly, easily, inexpensively, and globally. There was a way to identify documents uniquely in the form of URLs (and URLs are URIs). Documents linked to other documents with hyperlinks. One could even search for relevant documents. Corporate intranets and portals became popular as enterprises realized that they could use this technology to share documents within their organizations, no matter how disparate their teams were. These documents could take the form of HTML pages, PDFs, word-processing files, spreadsheets, images… really any type of file that could be saved in a digital format could be posted to the Web and shared. Obviously, this was a huge step forward for human information collection and sharing. But what about the machines we use?
  • My friend Karen wrote a book of poetry. It sells on Amazon.com. In a Web 1.0 context, this page is an HTML document with some information displayed for human consumption. A computer, through use of standards like HTML, PDF, and corresponding browser technology, recognizes certain elements as character strings, numbers, images, formatting, etc. BUT, the computer does NOT deduce the meaning from this information. It does not recognize that this is a book with an author, a publisher, and a format (paperback). It has no concept that a book can be a commodity with a corresponding price. It also doesn’t recognize that commodities such as books have REVIEWS (Web 2.0). The data in the page are not understood by the computer, and there’s no way to link that data to any other relevant information.
  • In Web 2.0, the focus shifted: we became very interested in linking individuals, their likes and dislikes, their opinions, their profiles, photos, videos, their metadata – you get the idea.
  • In Web 2.0, we experienced an immense growth of content on the Web as users interacted with the Web in entirely new ways. They have been encouraged to enter data and metadata into many collections, both publicly and anonymously. End-users found their voices on the Web. My non-technical friends and relatives interact with this system and add data to it on a regular basis. Pretty exciting time to be us. However, my computer is still pretty clueless.
  • In Web 3.0, we’re no longer talking about linking documents. We’re no longer talking about searching for character strings and getting back results in search engines that match those strings. We’re actually talking about linking the information INSIDE those documents. Now we’re going to take a look at the growing world of Linked Open Data.
  • Remember our book example from earlier? By using Web 3.0 technologies, my computer now understands so much more! When we uniquely identify THINGS for the computer, it can start to recognize the data points in the page. Look at the THINGS on this page. I’ve marked up some of the nouns “Dick Tracy–style”. Every one of these THINGS can be referred to by a URI. The RELATIONSHIPS are merely implied here, but they are there: This BOOK | HAS | a TITLE, PUBLISHER, PRINT FORMAT, and an IMAGE associated with it. That IMAGE | IS A | COVER. The BOOK | HAS | a PRICE. Etc.
  • …which shows the progression of technologies and standards as we move toward a world of Linked Data. Here in the lower left corner is the era that we may refer to as “Web 0.0” or, in Nova’s diagram, the PC era. Computers operated in solitude. With the advent of TCP/IP they could begin to share information across a network. But the files/documents were still locked in their physical location (we used the “sneakernet” to transfer files and docs). With HTTP, the physical location of the computers became irrelevant. Files and docs could be linked and shared. But the CONCEPTS within those documents were now what remained locked in a rigid structure. Semantic Technologies change that, enabling those concepts to be linked across systems in much the same way that files and computers were linked with other standards. The vision is the GGG: the Giant Global Graph.
  • In Web 1.0, documents were given URIs. With a web browser, individuals could access those exact documents by entering the URIs. In Web 3.0, every THING, every RELATIONSHIP, is given a URI, providing similar access to the information within documents. We can now point to specific data points as unique resources. THAT is what I mean by ‘X’. ‘X’ is related to ‘Y’ specifically in THIS WAY. In Web 1.0, documents could be linked together by embedding URIs in documents, but computers still couldn’t understand the information IN those documents. In the Semantic Web, the data becomes free. I’m using “free” here not to describe monetary value, nor to describe access control or security, but rather to describe the flexibility of the data: the ACCESSIBILITY of the data in terms of what’s possible.
  • These are representations of: me; my postal code; the White House; the sales tax rate of Los Angeles County (remember that the next time you hate us for our winters); an image. Note that we’re talking about applying this to both structured and unstructured data. Also, these are pieces of data stored in a variety of places. Imagine what we could do if we could tie them together. [[Use example of “a PHOTO taken by ME of the WHITE HOUSE, but sold from my POSTAL CODE to another Californian (sales tax rate).” -- Make sense?]]
  • How many of you remember doing this? Don’t worry – I won’t make you relive 5th grade entirely. But I do want to talk about it just for a moment – the key thing about the sentences on the previous slide is that they can be divided into a SUBJECT, PREDICATE, and OBJECT. And that, in a nutshell, is all a triple is: [BUILD] SUBJECT + PREDICATE + OBJECT
  • And this is what they look like in graph form… [BUILD]
  • This relationship can be given a specific URI. That means that the concept of isFather can have a distinct meaning, characteristics, and requirements – AND THAT DEFINITION CAN BE LINKED TO. Maybe I mean biological father. Maybe I mean a broader, social definition that includes step-fathers and adoptive fathers. Each one of these RELATIONSHIPS may be defined in a different vocabulary in a different way. The KEY THING HERE is that when a RELATIONSHIP has a URI, it can be called upon and re-used. I can choose which existing definition I want to use or I can create my own. [BUILD] The RELATIONSHIP can be expressed in a way that computers can understand. In this case, at least one such definition exists for the concept of “Father.” [BUILD] The data points can change; the relationship remains independent. “I” can be expressed by any number of URIs, raising the philosophical question: “Who am I on the Web?” My Twitter feed? My Facebook profile? My LinkedIn profile? That senior high school photo that a “friend” just posted of me? [BUILD] The data points can change; the relationship remains independent.
  • Here’s a graph of several related triples. I think it’s actually a lot easier to read and write than diagramming sentences.
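One way to picture a graph of related triples in code is as a plain list of (subject, predicate, object) tuples grouped by subject. The following is a minimal Python sketch, not anything from the slides: the `http://example.org/` identifiers, the `Triple` type, and the `outgoing_edges` helper are all hypothetical names chosen for illustration, loosely following the book example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace standing in for real URIs.
EX = "http://example.org/"

# Illustrative triples about a book; the literal values are made up.
triples = [
    Triple(EX + "author", EX + "wrote", EX + "book"),
    Triple(EX + "book", EX + "writtenBy", EX + "author"),
    Triple(EX + "book", EX + "hasTitle", "An Example Title"),
    Triple(EX + "book", EX + "hasISBN", "0000000000"),
    Triple(EX + "book", EX + "hasPublisher", EX + "publisher"),
]


def outgoing_edges(data):
    """Group (predicate, object) pairs by subject: the graph as an adjacency map."""
    graph = defaultdict(list)
    for t in data:
        graph[t.subject].append((t.predicate, t.obj))
    return dict(graph)


graph = outgoing_edges(triples)
for predicate, obj in graph[EX + "book"]:
    print(predicate, "->", obj)
```

Because every triple has the same three-part shape, adding a new fact about the book means appending one more tuple; nothing about the existing data has to be restructured.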
  • Hopefully this is all pretty familiar territory for you. Relational databases have been around a long time and work because they have: a way of representing data that is built upon standards; a formal way of representing schemas; and a robust query language that allows extraction of information.
  • An RDBMS represents data as tables.
  • Note that in this schema, data is connected at the table level. In creating a schema for an RDBMS, you need to do a lot of planning for what goes into which tables.
  • And a basic SQL query.
  • Linked Data has the same components, but is built upon a web-scale architecture. THE WWW IS THE DATABASE!
  • RDF = Resource Description Framework. It is the language used for describing data (and metadata, and even other data languages). It is graph-based, and it’s the core of what we have been talking about today. What is it good for? “RDF is good for distributing data across the Web and pretending it’s in one place.” -Dean Allemang
  • Subjects may be URIs (or blank nodes); predicates may only be URIs. Objects may be one of two kinds: literals (strings or other XML Schema data types) or resources (URIs or blank nodes). If you use http: URIs, others can reference them.
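The rule about which kinds of values may sit in each position of a triple can be sketched as a small check. This is a deliberately simplified Python illustration that ignores blank nodes and uses a rough test for "looks like a URI"; the `Literal` class, `is_uri`, `is_valid_triple`, and the `http://example.org/` predicates are hypothetical names, not part of any RDF library.

```python
from urllib.parse import urlparse


class Literal:
    """A plain value (string, number, ...) that may appear only in the object slot."""

    def __init__(self, value):
        self.value = value


def is_uri(value):
    """Rough check: a whitespace-free string with a scheme and something after it."""
    if not isinstance(value, str) or any(ch.isspace() for ch in value):
        return False
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def is_valid_triple(subject, predicate, obj):
    # Subject and predicate must be URIs; the object may be a URI or a literal.
    return is_uri(subject) and is_uri(predicate) and (
        is_uri(obj) or isinstance(obj, Literal)
    )


me = "http://twitter.com/ericaxel"
print(is_valid_triple(me, "http://example.org/livesIn",
                      "http://www.city-data.com/zips/90043.html"))
print(is_valid_triple(me, "http://example.org/paysSalesTax", Literal("9.750 %")))
print(is_valid_triple(Literal("Eric"), "http://example.org/livesIn", Literal("90043")))
```

The first two calls describe acceptable triples (URI object, then literal object); the third is rejected because a literal cannot be a subject.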
  • Here are examples of Dublin Core and FOAF vocabularies in use.
  • SPARQL = SPARQL Protocol and RDF Query Language (a recursive acronym). This is the language used to write queries over the information made available in RDF. IMPORTANT TO REMEMBER: “SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware.” This is more than a mash-up! – Linked Data allows for computation of data across websites. – Think in terms of data leading you to other data.
  • Imagine being able to query the Web as you would a single, local database. This is a simple query, but SPARQL is a very robust language and allows for quite complex queries. Example #1: FOAF (some people that David Wood knows). Example #1 may be resolved via (e.g.) http://demo.openlinksw.com/sparql_demo/ by putting http://zepheira.com/team/dave/dave.rdf into the Graph field and the query into the SPARQL Query field. [[PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name FROM <http://zepheira.com/team/dave/dave.rdf> WHERE { ?knower foaf:knows ?known . ?known foaf:name ?name . }]]
  • There is also a standard way to make this query language available over the Web: a SPARQL endpoint, as described by the SPARQL Protocol. This is what’s called a generic SPARQL endpoint: http://demo.openlinksw.com/sparql_demo/ This type of endpoint sits somewhere on the Web and goes out to retrieve RDF data from elsewhere on the Web to run a query. Because a generic SPARQL endpoint will query against arbitrary RDF data, we must specify the URL of the graph (or graphs) to run the query against. We do this either using the input boxes provided on the human-friendly forms, or using the SPARQL FROM clause. It allows me to enter the URI for the graph I want to access (in this case, Dave’s FOAF file – written in RDF), and the SPARQL query specifying the result set I am looking for.
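Behind the human-friendly form, a request to an endpoint is an ordinary HTTP request. The Python sketch below only builds the request URL and sends nothing over the network. It assumes the endpoint accepts the `query` and `default-graph-uri` parameters defined by the SPARQL Protocol; the endpoint address `http://example.org/sparql` and the `build_request_url` helper are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE {
    ?knower foaf:knows ?known .
    ?known foaf:name ?name .
}
"""


def build_request_url(endpoint, query, graph_uri=None):
    """Return a GET URL carrying the query and, optionally, the graph to load."""
    params = [("query", query)]
    if graph_uri is not None:
        # Plays the same role as the form's Graph field or a FROM clause.
        params.append(("default-graph-uri", graph_uri))
    return endpoint + "?" + urlencode(params)


url = build_request_url(ENDPOINT, QUERY, "http://zepheira.com/team/dave/dave.rdf")
print(url)

# Decode the URL again to show what the endpoint would receive.
received = parse_qs(urlsplit(url).query)
print(received["default-graph-uri"][0])
```

Passing the graph as a request parameter keeps the query text itself reusable against other FOAF files.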
  • And because it’s a web of data, we can also query on things for fun as well as for business. Now, living in L.A., I have friends for whom this IS business -- but I digress. Example #2: DBPedia (Bart Simpson's chalkboard gags). We should get back two columns of data: “Episode Title” and “What Bart wrote”. [[SELECT ?episode,?chalkboard_gag WHERE { ?episode skos:subject ?season . ?season rdfs:label ?season_title . ?episode dbpedia2:blackboard ?chalkboard_gag . FILTER (regex(?season_title, "The Simpsons episodes, season")) . } ORDER BY ?season]] http://dbpedia.org/snorql/?query=SELECT+%3Fepisode%2C%3Fchalkboard_gag+WHERE+%7B%0D%0A++%3Fepisode+skos%3Asubject+%3Fseason+.%0D%0A++%3Fseason+rdfs%3Alabel+%3Fseason_title+.%0D%0A++%3Fepisode+dbpedia2%3Ablackboard+%3Fchalkboard_gag+.%0D%0A++FILTER+%28regex%28%3Fseason_title%2C+%22The+Simpsons+episodes%2C+season%22%29%29+.%0D%0A++%7D%0D%0A++ORDER+BY+%3Fseason Here, we used a SPARQL endpoint specific to DBPedia, so we didn’t need to specify the FROM. We are only querying the graph from the DBPedia dataset.
  • The lost episode of the Simpsons? http://www.milinkito.com/swf/bart.php

Transcript

  • 1. SemTech for the Rest of Us: Introduction to Semantic Technology
    Comic by Geek & Poke, www.geekandpoke.com
  • 2. About Me
    • Professional
    • 3. Coach/Consultant
    • 4. Evangelist
    • 5. Lifelong Learner/Teacher
    • 6. Geek
    Eric Franzon
    VP, Semantic Universe
    Affiliate Analyst, Guidewire Group
  • 7. Easy to play
  • 8. First, some context…
  • 9. Web 3.0 = Semantic Web
    = Web of Data
  • 10. Web of Data
  • 11. Web 1.0 – Linking Documents
  • 12. Web 1.0
    “I see: characters + formatting + images”
    --my Computer
  • 13. Web 1.0 – Linking Documents
    Web 2.0 – Linking People
  • 14. Web 2.0
    “I see: characters + formatting + images”
    --my Computer
  • 15. Web 1.0 – Linking Documents
    Web 2.0 – Linking People
    Web 3.0 – Linking Data
  • 16. Web 3.0 – Linking Data
    Title
    Publisher
    Format
    Author
    Price
    Cover
    “I see: things (and relationships).
    This information is about a book.”
    --my Computer
  • 17.
  • 18. The building block of linked data:
    The modest “triple”
  • 19. What’s a Triple?
    Two uniquely identified THINGS
    Connected by
    a uniquely identified RELATIONSHIP
    So, what’s a THING?
  • 20. A THING is anything that can be uniquely identified by a URI or a literal (string)
    Me
    My postal code
    The White House
    L.A. County’s sales tax rate
    http://twitter.com/ericaxel
    http://www.city-data.com/zips/90043.html
    Lat: 38.89859 Long: -77.035971
    9.750 %
    http://ericfranzon.com/operator.jpg
  • 21. Triples? It’s Elementary! (School)
    This book has a title.
    Subject + Predicate + Object
  • 22. Triples
    Subjects | Predicates | Objects
    Book | Has Title | “Title”
    Eric | Created | Webpage
    Image | Has License | CC Non-Commercial
  • 23. Power of Relationships
    The real power of triples
    comes from uniquely identifying
    RELATIONSHIPS
    Who’s your daddy?
  • 24. Is Father of
    <owl:ObjectProperty rdf:ID="isFather">
        <rdfs:domain rdf:resource="#Person"/>
        <rdfs:range rdf:resource="#Person"/>
    </owl:ObjectProperty>
  • 25. Is Father of
    mailto:ericaxel@yahoo.com
    <owl:ObjectProperty rdf:ID="isFather">
        <rdfs:domain rdf:resource="#Person"/>
        <rdfs:range rdf:resource="#Person"/>
    </owl:ObjectProperty>
  • 26. Is Father of
    <owl:ObjectProperty rdf:ID="isFather">
        <rdfs:domain rdf:resource="#Person"/>
        <rdfs:range rdf:resource="#Person"/>
    </owl:ObjectProperty>
  • 27. Author | Wrote | Book
    Book | Written by | Author
    Book | Has Title | Title
    Book | Has ISBN | ISBN
    Book | Has Publisher | Publisher
  • 28. The Technologies of RDBMS
  • RDBMS Data
  • 31. RDBMS Schema
  • 32. RDBMS Query Language: SQL
    SELECT isbn,
    title,
    price,
    price * 0.06 AS sales_tax
    FROM Book
    WHERE price > 100.00
    ORDER BY title;
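To see the SQL query above actually run, here is a small Python sketch using the standard-library `sqlite3` module and an in-memory database. The `Book` table layout is inferred from the column names in the query, and the three rows are hypothetical values invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (isbn TEXT PRIMARY KEY, title TEXT, price REAL)")

# Hypothetical rows, invented for illustration.
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?)",
    [
        ("isbn-0001", "Zebra Atlas", 150.00),
        ("isbn-0002", "Budget Paperback", 12.50),
        ("isbn-0003", "Art Folio", 200.00),
    ],
)

# The query from the slide, unchanged apart from the trailing semicolon.
rows = conn.execute(
    """
    SELECT isbn, title, price, price * 0.06 AS sales_tax
    FROM Book
    WHERE price > 100.00
    ORDER BY title
    """
).fetchall()

for isbn, title, price, sales_tax in rows:
    print(isbn, title, price, round(sales_tax, 2))

conn.close()
```

Note that the query only works because the table and its columns were planned in advance, which is the point made about RDBMS schemas.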
  • 33. The Technologies of Linked Data
  • The Data Language
    Resource
    Description
    Framework
  • 36. RDF Triple Components
    Subject: URI
    Predicate: URI
    Object: URI or String Literal
  • 37. Just so you know…
    There are many ways of representing RDF:
    Each serialization has pros and cons, but
    they all are used to connect
    THINGS and RELATIONSHIPS into TRIPLES
  • 43. “RDF is good for distributing data
    across the Web and pretending
    it’s in one place.”
    -Dean Allemang, TopQuadrant
  • 44. The Schemata
    Linked Data schemas consist of:
    Your RDF relationships (predicates)
    +
    Relationship descriptions
  • 45. Linked Data Schemata
    Schema
    Initial Schema
    Relationship
    description
    Data
    hasSurname
    hasFirstName
    hasLastName
    hasID
    owl:sameAs
    1
    Krista
    Thomas
  • 46. Choosing Relationships
    Reuse popular vocabularies
    FOAF (Friend-of-a-friend)
    Dublin Core (library/publisher metadata)
    SIOC (Semantically-Interlinked Online Communities)
    ...or make up your own!
  • 47. RDF Triples
  • 48. The query language
    SPARQL
    SPARQL Protocol And RDF Query Language
  • 49. SPARQL Example #1
    FOAF (some people that Eric Franzon knows)
     
    PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    FROM <http://ericaxel.com/eric.rdf>
    WHERE {
        ?knower foaf:knows ?known .
        ?known foaf:name ?name .
    }
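The two lines inside the WHERE clause above are two triple patterns that share the variable ?known, so answering the query amounts to a join. The following Python sketch mimics that join over a handful of in-memory triples. It is not how a SPARQL engine is implemented; the `http://example.org/people/` namespace, the names, and the `names_of_known_people` helper are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/people/"  # hypothetical namespace

# Hypothetical FOAF-style data; the names are invented.
triples = [
    (EX + "me", FOAF + "knows", EX + "alice"),
    (EX + "me", FOAF + "knows", EX + "bob"),
    (EX + "me", FOAF + "name", "Me"),
    (EX + "alice", FOAF + "name", "Alice Example"),
    (EX + "bob", FOAF + "name", "Bob Example"),
]


def names_of_known_people(data):
    """Join '?knower foaf:knows ?known' with '?known foaf:name ?name' on ?known."""
    names = []
    for _knower, predicate, known in data:
        if predicate != FOAF + "knows":
            continue
        # Second pattern: find the name of whoever was bound to ?known.
        for subject, predicate2, name in data:
            if subject == known and predicate2 == FOAF + "name":
                names.append(name)
    return names


print(names_of_known_people(triples))
```

The name "Me" is not returned, because nobody in this data is stated to know `me`; only subjects that appear as the object of a foaf:knows triple satisfy both patterns.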
  • 50. SPARQL Example #1
  • 51. Example #1 - Results
  • 52. SPARQL Example #2
    Querying two FOAF Profiles
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?name
    FROM NAMED <http://ericaxel.com/eric.rdf>
    FROM NAMED <http://zepheira.com/team/dave/dave.rdf>
    WHERE {
        GRAPH <http://ericaxel.com/eric.rdf> {
            ?x rdf:type foaf:Person .
            ?x foaf:name ?name .
        } .
        GRAPH <http://zepheira.com/team/dave/dave.rdf> {
            ?y rdf:type foaf:Person .
            ?y foaf:name ?name .
        } .
    }
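Because ?name appears in both GRAPH blocks, the query above returns only names that occur as a foaf:Person name in both profiles. A rough Python sketch of that idea treats each named graph as its own list of triples and intersects the results. The graph URIs, the people, and the `person_names` helper here are hypothetical placeholders, not the contents of the real FOAF files.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical contents for two named graphs, keyed by placeholder graph URIs.
dataset = {
    "http://example.org/graph-a": [
        ("http://example.org/a#p1", RDF_TYPE, FOAF + "Person"),
        ("http://example.org/a#p1", FOAF + "name", "Alice Example"),
        ("http://example.org/a#p2", RDF_TYPE, FOAF + "Person"),
        ("http://example.org/a#p2", FOAF + "name", "Bob Example"),
    ],
    "http://example.org/graph-b": [
        ("http://example.org/b#x", RDF_TYPE, FOAF + "Person"),
        ("http://example.org/b#x", FOAF + "name", "Bob Example"),
    ],
}


def person_names(triples):
    """Names of every subject typed foaf:Person within one graph."""
    people = {s for s, p, o in triples if p == RDF_TYPE and o == FOAF + "Person"}
    return {o for s, p, o in triples if p == FOAF + "name" and s in people}


# A shared ?name must match in both graphs, which is a set intersection.
shared = person_names(dataset["http://example.org/graph-a"]) & person_names(
    dataset["http://example.org/graph-b"]
)
print(sorted(shared))
```

Note that the two graphs use different URIs for the matching person; the match is made purely on the shared name value, just as ?x and ?y are separate variables in the query.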
  • 53. Example #2 - Results
  • 54. SPARQL Example #3
    Bart Simpson's chalkboard gags (DBPedia)
     
    SELECT ?episode,?chalkboard_gag
    WHERE { ?episode skos:subject ?season .
    ?season rdfs:label ?season_title .
    ?episode dbpedia2:blackboard ?chalkboard_gag .
    FILTER (regex(?season_title, "The Simpsons episodes, season")) . }
    ORDER BY ?season
  • 55. Example #3 - Results
  • 56. http://www.milinkito.com/swf/bart.php
  • 57. Easy to play; takes work to master.
  • 58. Thank you!
    Questions? Call Me!
    Operators are standing by.
    EricAxel@yahoo.com
    (323) 743-3511
  • 59. www.SemTech2010.com
    Coupon Code: MEETUP = $200 off
    www.SemanticUniverse.com
  • 60. Resources
    http://geekandpoke.typepad.com/
    http://richard.cyganiak.de/2007/10/lod/
    http://www.flickr.com/photos/dawnmanser/3532853278/
    http://www.flickr.com/photos/artolog/3983764041/
    http://www.flickr.com/photos/97964364@N00/59780745/
    http://www.flickr.com/photos/starwarsblog/
    http://aldobucchi.com
    http://www.milinkito.com/swf/bart.php