Integrating the Cloud into Content
Using Semantics to Enhance Content Publishing

                    Jamie Taylor
       ...
Explicit semantics
The basic data unit




subject     predicate         object



   Remember this from grammar class?
Restaurants as triples
            subject   predicate          object
              S1         cuisine          “Deli”
  ...
...or as a graph
                     Deli Llama
         Name

         Cuisine
    S1                  Deli
           ...
Restaurant Graph
     Peking Inn                                  Deli Llama
                                   Name

   ...
Extending The Restaurant Model
                                                   Deli Llama
Urban Chic                  ...
Integrating Graph Data Models
                   Deli Llama
         Name
                                             De...
What Went Wrong?
                            Scripting Languages
                            facilitate change

          ...
Semantic Representation


Relationships are represented explicitly
Schema can be represented as a graph
Data integration i...
Just enough RDF
Just Enough RDF

RDF is a Data Model
   A very simple model!
Cosmos was written by Carl Sagan
Subject Predicate Object



(Cosmos) (was written by) (Carl Sagan)


               author       Carl
   Cosmos
          ...
Subject   Which Cosmos?




(Cosmos)
Subject   Which Cosmos?




(Cosmos)
Identifiers are Everywhere




#w2e
The humble URI

•URIs provide strong references
 •Much like pointing in the physical
  world
             “this is red”
 ...
Subject                      Which Cosmos?




(Cosmos)



 http://rdf.freebase.com/ns/authority.openlibrary.book.OL356886...
What do you mean, author?

http://rdf.freebase.com/ns/book.written_work.author




                   author              ...
There are billions of Carl Sagans...
      http://rdf.freebase.com/ns/en.carl_sagan




  Cosmos            author
                  published “1980”

      ...
RDF Data Model


Nodes (“Subjects”)
connect via Links (“Predicates”)
to Objects
 •   either Nodes or Literals
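
A minimal Python sketch of the model described above, assuming a plain in-memory set of tuples rather than any real triple store. The author predicate and the Carl Sagan URI come from the slides; the `COSMOS` and `PUBLISHED` identifiers are hypothetical placeholders, because the book's full identifier is cut off in the slide text.

```python
# Hypothetical mini triple store: a set of (subject, predicate, object) tuples.
COSMOS = "http://example.org/cosmos"        # placeholder, not the real book URI
PUBLISHED = "http://example.org/published"  # placeholder predicate
AUTHOR = "http://rdf.freebase.com/ns/book.written_work.author"
SAGAN = "http://rdf.freebase.com/ns/en.carl_sagan"

triples = {
    (COSMOS, AUTHOR, SAGAN),      # object is another node (a URI)
    (COSMOS, PUBLISHED, "1980"),  # object is a literal value
}


def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything known about Cosmos, then only its author.
about_cosmos = match(triples, s=COSMOS)
authors = [o for (_, _, o) in match(triples, s=COSMOS, p=AUTHOR)]
```

Because every statement has the same three-part shape, adding a new kind of fact means adding a tuple, not changing a schema.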
Expressions of RDF


RDF has many (inconvenient) serializations
   •RDF-XML
   •N3
   •Turtle
   •NTriples
   •RDFa
URIs provide identity
http://rdf.freebase.com/ns/en.robert_cook



  Stability
  Simplicity
  Manageability
Not all URLs are good identifiers
Pluggable Data

         Data


                 Semantics allows an
Data



                 application to utilize
      ...
Pluggable Data
Data Portability
Data     Data


                Semantics allows data to
                be utilized by
         Data
   ...
Data Portability




     http://dev.mqlx.com/~jamie/simile/timeline.html
Data Portability
Why Does This Work?

 Semantics facilitate shared meaning through
• Subject Identity
• Strong and Consistent Semantics
• O...
RDF Graphs
 Carrie
              Starred In   Star Wars
 Fisher



              Starred In



Harrison                   ...
Triple Stores
(aka Graph Stores)
Allegro Graph
+


               +

Keep your data as flexible as the source
Strong Identifiers

Strong Semantics
(strong vocabularies)

    Open Data
Can describe?!   At the end of this talk -
                                you should be able to
                         ...
Using Semantics to Enhance Content Publishing

Integrating the cloud into content. Web2.0 Expo NY 2009 Workshop

Published in: Technology, Business
Transcript of "Using Semantics to Enhance Content Publishing"

  1. 1. Integrating the Cloud into Content Using Semantics to Enhance Content Publishing Jamie Taylor http://semprog.com/presentations/web20ny
  2. 2. What do y'all mean "Semantics"
  3. 3. critique misfortune bad luck occurrence roast KNOCK sound zing knocking zizz vroom bang belt bash bump rap whack blow
  4. 4. critique misfortune bad luck occurrence roast LJOMF sound zing knocking zizz vroom bang belt bash bump rap whack blow
  5. 5. IBM
  6. 6. IBM. Headquarters: 1 New Orchard Road, Armonk, New York. Legal Structure: Publicly Listed Company. CIK: 0000051143. Ticker Symbol: NYSE:IBM. Date Founded: 1889. Founders: Thomas Watson. CEO: Sam Palmisano. SIC: 3571:Electronic Computers. NAICS: 334111:Electronic Computer Manufacturing. Operating Income: 17,604,000,000 USD (2006). Subsidiaries: Cognos, Cross Worlds. Software Developed: SANSF, ViaVoice, Lotus Notes.
  7. 7. Headquarters: 1 New Orchard Road, Armonk, New York. Legal Structure: Publicly Listed Company. CIK: 0000051143. Ticker Symbol: NYSE:IBM. Date Founded: 1889. Founders: Thomas Watson. CEO: Sam Palmisano. SIC: 3571:Electronic Computers. NAICS: 334111:Electronic Computer Manufacturing. Operating Income: 17,604,000,000 USD (2006). Subsidiaries: Cognos, Cross Worlds. Software Developed: SANSF, ViaVoice, Lotus Notes.
  8. 8. http://www.flickr.com/photos/pacroon/ http://www.flickr.com/photos/soldiersmediacenter/
  9. 9. PageRank tm
  10. 10. 1 New Orchard Road Publicly Listed Armonk, New York Company 0000051143 NYSE:IBM 1889 Thomas Watson Sam Palmisano 3571:Electronic Computers 334111:Electronic 17,604,000,000 Computer Manufacturing USD 2006 Cognos Cross Worlds SANSF, ViaVoice Lotus Notes
  11. 11. Earlier this year, the AP slashed prices to try to hold on to subscribers. That's not the answer, says Jeff Jarvis, journalism professor at City University of New York. JEFF JARVIS: The fundamentals of the media economy are changing, from a content economy to a link-based economy. Jarvis says the AP needs to become the broker for those links, like helping the Baltimore Sun link to a story about GM from the Detroit Free Press.
  12. 12. Jarvis resorts to the concept of a "gift economy" to explain the link economy http://www.flickr.com/photos/pagedooley/
  13. 13. I am a behavioral economist. Gift economics are frequently used as explanations for what we don't understand
  14. 14. Worse I am a Behaviorist Only talk about what you can observe
  15. 15. Semantics Process of communicating enough meaning to result in an action
  16. 16. Link Economy • Enriching links focuses meaning • Improves "findability" (SEO) • Increased usability • Better ad selection
  17. 17. Link Economy At the end of this talk - you should be able to say how semantics benefits each of these groups • Semantics Benefit • Site owners • Site users • Developers • You
  18. 18. Wish it were real
  19. 19. Might be real
  20. 20. Is real, but don't believe it
  21. 21. Is very useful Build Flexible Applications with Graph Data
  22. 22. Not Your Typical Semantic Web Talk
  23. 23. The W3C Layer Cake The Cake taken from http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/layerCake-4.png
  24. 24. AI Agents http://www.flickr.com/photos/matthewtownsend/
  25. 25. Ontologies
  26. 26. RDF Serialization Formats
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/business.employment_tenure>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://rdf.freebase.com/ns/business.employment_tenure.company> <http://rdf.freebase.com/ns/en.determine_software>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.institution> <http://rdf.freebase.com/ns/en.mounds_view_high_school>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/education.education>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.student> <http://rdf.freebase.com/ns/en.jamie_taylor>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/business.company_founder.companies_founded> <http://rdf.freebase.com/ns/en.mobius_net>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://creativecommons.org/ns#attributionName> "Source: Freebase - The World's database".
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/people.person.nationality> <http://rdf.freebase.com/ns/en.united_states>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/common.topic.image> <http://rdf.freebase.com/ns/en.jamie_headshot>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/type.object.name> "Jamie Taylor"@en.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.tshirt_recipient>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.topic>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/book.author>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/people.person>.
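
Statements in this line-oriented form are simple to read by machine even if they are unpleasant for people. The sketch below is a deliberately small regular-expression reader, assuming each statement sits on one line and uses only URIs and plain or language-tagged literals. It is not a complete N-Triples parser (no blank nodes, no datatypes), and the names `TRIPLE` and `parse_line` are illustrative only.

```python
import re

# subject and predicate are <URI>; object is either <URI> or "literal"@lang
TRIPLE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:@([A-Za-z-]+))?)\s*\.\s*$'
)


def parse_line(line):
    """Split one statement into (subject, predicate, object)."""
    m = TRIPLE.match(line)
    if m is None:
        raise ValueError("not a simple triple: %r" % line)
    s, p, o_uri, o_lit, lang = m.groups()
    if o_uri is not None:
        return (s, p, ("uri", o_uri))
    return (s, p, ("literal", o_lit, lang))


name_triple = parse_line(
    '<http://rdf.freebase.com/ns/en.jamie_taylor> '
    '<http://rdf.freebase.com/ns/type.object.name> "Jamie Taylor"@en.'
)
type_triple = parse_line(
    '<http://rdf.freebase.com/ns/en.jamie_taylor> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://rdf.freebase.com/ns/people.person>.'
)
```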
  27. 27. Instead.... Part I - so you can explain to other Part II - so you can do what you say • Part I • Why • Uses, Benefits • Part II • How • Representation, Concepts
  28. 28. Part I Why
  29. 29. Is very useful Build Flexible Applications with Graph Data
  30. 30. The Office (US) Leatherheads TV Program Film stars in starred in John Krasinski Person, Actor attended Brown University College/university Graph Data Model
  31. 31. A socially managed semantic database
  32. 32. Freebase has Many Types of Things
  33. 33. 9,547,107 Topics
  34. 34. Contributions over $50000 made to members of the US congress in the 2008 election cycle by companies headquartered outside of the United States topic: topic: Barack Obama Switzerland government position held took money from is based in topic: topic: United States UBS AG Senator Freebase
  35. 35. Industry Browser Identity Model Industry (USCB) Company Company Donations NAICS Ticker CRP CRP ID CRP CRP ID NAICS/SIC Map SEC Freebase Industry (SEC) Company People Person SIC SEC CIK SEC CIK Freebase Wikipedia Freebase Wikipedia Location Article ZIP Code
  36. 36. Industry Browser http://kiwitobes.com/industry_mashup/
  37. 37. Barriers between science and the humanities impede solving humanity's important problems Web 2.0 + Semantics
  38. 38. "Smoov" Ankolekar et al.2007
  39. 39. Topic Blocks http://www.freebase.com/topicblocks/index?id=/en/pirates_of_the_caribbean_3
  40. 40. http://www.freebase.com/widget/topic?mode=i&pane=image,article_props&id=/en/pirates_of_the_caribbean_3 http://www.freebase.com/widget/topic?mode=i&pane=image,article_props&id=/en/blade_runner
  41. 41. Patrick Sinclair (BBC)
  42. 42. About the Content (and visitor?)
  43. 43. MIT Simile
  44. 44. Simile http://dev.mqlx.com/~jamie/simile/timeline.html
  45. 45. Data Portability Semantics allows data to be utilized by unanticipated new applications
  46. 46. Simile
  47. 47. MIT Simile: Exhibit
  48. 48. User Experience
  49. 49. Topic Hubs
  50. 50. Open Calais
  51. 51. Open Calais
  52. 52. http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633 Open Calais
  53. 53. Open Calais http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633
      <rdf:Description rdf:nodeID="A1">
        <att:lastupdated>2009-06-18T21:22:28</att:lastupdated>
        <att:text>IBM Corporation And Siemens Announce Integrated Solutions To Help Companies</att:text>
      </rdf:Description>
      <rdf:Description rdf:nodeID="A2">
        <att:code>3577</att:code>
        <att:description>Computer Periph'L Equipment, Nec</att:description>
      </rdf:Description>
      <rdf:Description rdf:nodeID="A3">
        <att:code>7371</att:code>
        <att:description>Computer Programming Services</att:description>
      </rdf:Description>
      <rdf:Description rdf:nodeID="A4">
        <att:age>46</att:age>
        <att:lastname>Iwata</att:lastname>
        <att:officerurl rdf:resource="http://www.reuters.com/finance/stocks/officerProfile?symbol=IBM.N&amp;officerId=222727"/>
        <att:firstname>Jon</att:firstname>
        <att:title>Senior Vice President - Marketing and Communications</att:title>
        <att:middle>C.</att:middle>
      </rdf:Description>
  54. 54. A Graph of Graphs <owl:sameAs rdf:resource="http://dbpedia.org/resource/IBM"/> <owl:sameAs rdf:resource="http://cb.semsol.org/company/ibm#self"/> <owl:sameAs rdf:resource="http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633"/>
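
One way to act on `owl:sameAs` links is to group every identifier that is declared equivalent, so that facts attached to any of them can be merged. The sketch below uses a small union-find for that. The three target URIs come from the slide; the subject the links hang from is not shown there, so `SUBJECT` is a hypothetical placeholder.

```python
SUBJECT = "http://example.org/ibm"  # placeholder: the slide omits the subject

SAME_AS = [
    (SUBJECT, "http://dbpedia.org/resource/IBM"),
    (SUBJECT, "http://cb.semsol.org/company/ibm#self"),
    (SUBJECT, "http://p.opencalais.com/er/company/ralg-tr1r/"
              "9e3f6c34-aa6b-3a3b-b221-a07aa7933633"),
]


def merge_same_as(pairs):
    """Group identifiers connected by sameAs links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


groups = merge_same_as(SAME_AS)
```

Each resulting group stands for one real-world thing, however many datasets describe it.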
  55. 55. Epispider Herman Tolentino et al. http://epispider.net/index.php
  56. 56. Chris Thorpe guardian.co.uk Open Platform
  57. 57. Vocabulary Do you understand the words that are coming out of my mouth? -Chris Tucker, Rush Hour
  58. 58. 1 New Orchard Road Publicaly Listed Armonk, New York Company rs Le 0000051143 arte g al NYSE:IBM dqu Str uc ol Hea 1889 b K tur ym CI Dat S e eF e r oun ded Ti ck Thomas Watson Founders Sam Palmisano CEO SIC O pe 3571:Electronic ra IC tin Soft es Computers NA g diari In war com i e De Subs e 334111:Electronic 17,604,000,000 Computer Manufacturing velo ped USD 2006 Cognos Cross Worlds SANSF, ViaVoice Lotus Notes
  59. 59. Epispider Herman Tolentino et al. http://epispider.net/index.php
  60. 60. vocabularies...are everywhere
  61. 61. @ Short URLs # The Twitter Vocabulary
  62. 62. Pivot on an @ tag
  63. 63. Pivot on a # tag
  64. 64. http://bit.ly/info/3zyJ8g Pivot on a Short URL
  65. 65. Vocabularies make links more understandable ...and thus content more findable
  66. 66. microformats Annotate existing HTML so the content can be "extracted by software and indexed, searched for, saved, cross-referenced or combined."
  67. 67. microformats
  68. 68. microformats <div class="vcard"> ..... <div id="view"> <div id="home"> <table> <tr> <td class="f">address</td> <td class="v"> <div class="adr"> <span class="locality">Berkeley</span>, <span class="region">CA</span> <div class="country-name">United States</div> </div> </td> </tr> <tr> <td class="f">aim</td> <td class="v"><a id="aim" class="url im offline" href="aim:goim?screenname=jaredhanson@mac.com">jaredhanson@mac.com</a></td> </tr>
  69. 69. microformats.org
  70. 70. microformats • (Relatively) easy to use • Small, fixed vocabulary • No standard parsing pattern • No strong identifiers • Limits utility
  71. 71. RDFa Annotate HTML with machine readable RDF
  72. 72. RDFa <div xmlns:fb="http://rdf.freebase.com/ns/" about="http://rdf.freebase.com/ns/en.jamie_taylor" rel="fb:people.person.place_of_birth"> <span resource="http://rdf.freebase.com/ns/en.saint_paul"/> </div>
  73. 73. RDFa • Unambiguous identifiers • Extensible vocabulary • Standard parsing pattern • Produces RDF • Hard to use • Rules about formatting based on RDF
  74. 74. What “concepts” are covered in content Like existing tagging, but with strong identifiers! <resource> tagged Tag taggingDate "2001-01-01" label means "text" <resource> Strong identifier goes here!
  75. 75. <resource> tagged Tag; Tag taggingDate; Tag label "text"; Tag means <resource>
      <div class="rdfa" xmlns:ctag="http://commontag.org/ns#">
        NASA's <a typeof="ctag:Tag" rel="ctag:means" href="http://rdf.freebase.com/ns/en.phoenix_mars_mission" property="ctag:label">Phoenix Mars Lander</a> has deployed its robotic arm.
      </div>
  76. 76. And the winner is....
  77. 77. HTML5 MicroData • Annotate HTML with machine readable data • Simple Name-Value Pair design
  78. 78. HTML5 MicroData Sometimes, it is desirable to annotate content with specific machine-readable labels, e.g. to allow generic scripts to provide services that are customised to the page, or to enable content from a variety of cooperating authors to be processed by a single script in a consistent manner.
  79. 79. HTML5 Simple! 15 pages of 657 page spec
  80. 80. HTML5 MicroData <section itemscope itemtype="http://example.org/animals#cat" itemid="http://semprog.com/jamiestuff/hedral"> <h1 itemprop="name">Hedral</h1> <p itemprop="desc">Hedral is a male american domestic shorthair, that is <span itemprop="http://example.com/color">black</span> and <span itemprop="http://example.com/color">white</span>.</p> <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months"> </section>
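
To illustrate the "easy to parse" claim, here is a rough sketch that pulls the name-value pairs out of markup like the example above using only Python's standard `html.parser`. It is a simplification, not the extraction algorithm from the HTML5 specification: it ignores nested items, `itemtype` and `itemid`, and it only special-cases `img` (taking `src` as the value). The class name `MicrodataParser` is illustrative.

```python
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag


class MicrodataParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open = []   # one [itemprop or None, text parts] per open element
        self.props = {}  # property name -> list of values

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        if tag in VOID:
            if prop and tag == "img":
                self.props.setdefault(prop, []).append(a.get("src"))
            return
        self.open.append([prop, []])

    def handle_endtag(self, tag):
        if tag in VOID or not self.open:
            return
        prop, parts = self.open.pop()
        if prop:
            text = " ".join("".join(parts).split())  # collapse whitespace
            self.props.setdefault(prop, []).append(text)

    def handle_data(self, data):
        # Text counts toward every enclosing element that carries an itemprop.
        for entry in self.open:
            if entry[0]:
                entry[1].append(data)


PAGE = """
<section itemscope itemtype="http://example.org/animals#cat"
         itemid="http://semprog.com/jamiestuff/hedral">
  <h1 itemprop="name">Hedral</h1>
  <p itemprop="desc">Hedral is a male american domestic shorthair, that is
    <span itemprop="http://example.com/color">black</span> and
    <span itemprop="http://example.com/color">white</span>.</p>
  <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>
"""

parser = MicrodataParser()
parser.feed(PAGE)
props = parser.props
```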
  81. 81. MicroData Widgets
  82. 82. HTML5 MicroData • Easy to use • Strong identifiers • Extensible vocabulary • Easy to parse • In last call for comments stage! • Usable! Now!
  83. 83. Vocabulary Powered Search Search Applications: - Enhanced results - Info Bar
  84. 84. <div class="hReview-aggregate"> <div class="item vcard"> <h1 class="fn org">Taylor's Automatic Refresher</h1> <div class=rating> <img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating" src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div> <em>based on <span class="count">888</span> reviews</em> </div> <div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span> <address class="adr"> Neighborhood: Embarcadero<br/> <span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /> <span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br /> </address> <span id="bizPhone" class="tel">(866) 328-3663</span>
  85. 85. <div class="hReview-aggregate"> <div class="item vcard"> <h1 class="fn org">Taylor's Automatic Refresher</h1> <div class=rating> <img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating" src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div> <em>based on <span class="count">888</span> reviews</em> </div> <div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span> <address class="adr"> Neighborhood: Embarcadero<br/> <span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /> <span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br /> </address> <span id="bizPhone" class="tel">(866) 328-3663</span>
  86. 86. Search Monkey Vocabulary
  87. 87. Search Monkey Vocabulary
  88. 88. DBPedia Place Vocabulary
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <rdf:Description rdf:about="http://dbpedia.org/ontology/areaTotal"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:nodeID="b29203"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/nickname"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:range rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumElevation"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:nodeID="b29250"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/nearestCity"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/PopulatedPlace"><rdfs:subClassOf rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
      <rdf:Description rdf:nodeID="b29225"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
  89. 89. Rich Snippet Vocabulary • name • affiliation • nickname • price • postal-code • dtReviewed • photo • country-name • locality • reviewer • region • count • address • itemReviewed • title • brand • category • role http://data-vocabulary.org
  90. 90. Rich Snippet Vocabulary <rdf:Property rdf:ID="affiliation"> <rdfs:comment>An affiliation can be specified by a string literal or an Organization instance.</rdfs:comment> <rdfs:domain rdf:resource="#Person"/> <rdfs:range> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <owl:Class rdf:about="#Organization"/> <owl:Class rdf:about="xsd:string"/> </owl:unionOf> </owl:Class> </rdfs:range> </rdf:Property> <rdf:Property rdf:ID="brand"> <rdfs:domain rdf:resource="#Product"/> </rdf:Property> <rdf:Property rdf:ID="category"> <rdfs:domain> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <owl:Class rdf:about="#Organization"/> <owl:Class rdf:about="#Product"/> </owl:unionOf> </owl:Class> </rdfs:domain> </rdf:Property>
  91. 91. HTML5 Vocabularies
  92. 92. Vocab Hub http://microdata.freebaseapps.com/
  93. 93. Part II How (or why we wrote the book)
  94. 94. The Office (US) Leatherheads TV Program Film stars in starred in John Krasinski Person, Actor attended Brown University College/university Rich Graph Data
  95. 95. Connected to other rich sources
  96. 96. Where does your data live?
  97. 97. Traditional data-modeling
  98. 98. Tabular data

      Restaurant          Address         Cuisine         Price  Open
      Deli Llama          Peachtree Rd    Deli            $      Mon, Tue, Wed, Thu, Fri
      Peking Inn          Lake St         Chinese         $$$    Thu, Fri, Sat
      Thai Tanic          Branch Dr       Thai            $$     Tue, Wed, Thu, Fri, Sat, Sun
      Lord of the Fries   Flower Ave      Fast food       $$     Tue, Wed, Thu, Fri, Sat, Sun
      Marquis de Salade   Main St         French          $$$    Thu, Fri, Sat
      Wok this way        Second St       Chinese         $      Mon, Tue, Wed, Thu, Fri, Sat, Sun
      Luna Sea            Autumn Dr       Seafood         $$$    Tue, Thu, Fri, Sat
      Pita Pan            Thunder Rd      Middle Eastern  $$     Mon, Tue, Wed, Thu, Fri, Sat, Sun
      Award Weiners       Dorfold Mews    Fast food       $      Mon, Tue, Wed, Thu, Fri, Sat
      Lettuce Eat         Rustic Parkway  Deli            $$     Mon, Tue, Wed, Thu, Fri

      The beloved spreadsheet
  99. 99. Tabular Data

      Restaurant   Address       Cuisine  Price  Open
      Deli Llama   Peachtree Rd  Deli     $      Mon (11a-4p), Tue (11-4), Wed (11-4), Thu (11-7), Fri (11-8)
      Peking Inn   Lake St       Chinese  $$$    Thu (5p-10p), Fri (5p-1a), Sat (5p-1a)
      etc…

      Too much information, not enough cells
  100. 100. A simple schema
      Restaurant (id, name, address, cuisine_id)
      Hours (restaurant_id, day, open, close)
      Cuisine (id, name)
      Allows for simple queries
  101. 101. A simple schema
      Restaurant:
      id | name | address | price
      1 | Deli Llama | Peachtree Rd | $
      2 | Peking Inn | Lake St | $$$
      ...
      Hours:
      restaurant_id | day | open | close
      1 | Mon | 11 | 16
      1 | Tue | 11 | 16
      1 | Thu | 11 | 19
      2 | Fri | 5 | 23
      ...
      Filled with data
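The "simple queries" this schema allows can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column names `open_hr` and `close_hr` stand in for the slide's open/close columns, hours are written on a 24-hour clock, the Peking Inn row uses the Thursday 5p-10p hours from the earlier tabular slide, and the helper `open_at` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT, address TEXT, price TEXT);
-- open_hr / close_hr are assumed names for the slide's open / close columns
CREATE TABLE hours (restaurant_id INTEGER REFERENCES restaurant(id),
                    day TEXT, open_hr INTEGER, close_hr INTEGER);
""")
conn.executemany("INSERT INTO restaurant VALUES (?, ?, ?, ?)", [
    (1, "Deli Llama", "Peachtree Rd", "$"),
    (2, "Peking Inn", "Lake St", "$$$"),
])
conn.executemany("INSERT INTO hours VALUES (?, ?, ?, ?)", [
    (1, "Mon", 11, 16), (1, "Tue", 11, 16), (1, "Thu", 11, 19),
    (2, "Thu", 17, 22),  # Peking Inn, Thursday 5p-10p on a 24-hour clock
])

def open_at(day, hour):
    """Return the names of restaurants open on `day` at `hour` (0-23)."""
    rows = conn.execute(
        """SELECT r.name FROM restaurant r
           JOIN hours h ON h.restaurant_id = r.id
           WHERE h.day = ? AND h.open_hr <= ? AND ? < h.close_hr
           ORDER BY r.name""",
        (day, hour, hour),
    )
    return [name for (name,) in rows]

print(open_at("Thu", 18))
```

One join answers "who is open now?", which is what the fixed schema is good at. The slides that follow show what happens when the data stops fitting it.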
  102. 102. Some new data
      Bar | Address | DJ | Best Drink
      The Bitter End | 14th Ave | No | Beer
      Peking Inn | Lake St | No | Scorpion Bowl
      Hammer Time | Wildcat Dr | Yes | Hennessey
      Marquis de Salade | Main St | Yes | Martini
      This doesn’t fit into our schema...
  103. 103. Half-empty columns
      Restaurant | Address | Price | DJ | Best Drink
      Deli Llama | Peachtree Rd | $ | |
      Peking Inn | Lake St | $$$ | No | Scorpion Bowl
      Thai Tanic | Branch Dr | $$ | |
      Lord of the Fries | Flower Ave | $$ | |
      Marquis de Salade | Main St | $$$ | Yes | Martini
      Wok this way | Second St | $ | |
      Luna Sea | Autumn Dr | $$$ | |
      Pita Pan | Thunder Rd | $$ | |
      Award Weiners | Dorfold Mews | $ | |
      Lettuce Eat | Rustic Parkway | $$ | |
      Hammer Time | Wildcat Dr | | Yes | Hennessey
      The Bitter End | 14th Ave | | No | Beer
      Maybe ok now, but can’t this keep happening?
  104. 104. Link the tables
      Restaurant (id, name, address, cuisine_id)
      RB_Link (restaurant_id, bar_id)
      Bar (id, name, dj, best_drink)
      But now the information is duplicated :(
  105. 105. Split place / purpose
      Venue (id, name, address)
      Hours (venue_id, day, open, close)
      Restaurant (id, venue_id, cuisine_id)
      Bar (id, venue_id, dj, best_drink)
      Better, but now we have to “migrate”
  106. 106. Large schemas A small section of a limited product
  107. 107. A flexible schema
      Venue (id, name, address)
      Properties (venue_id, field_id, value)
      Field (id, name)
      Does this look familiar?
  108. 108. Add some data
      Venue:
      id | name | address
      1 | Deli Llama | Peachtree Rd
      2 | Peking Inn | Lake St
      ...
      Properties:
      venue_id | field_id | value
      1 | 1 | Deli
      1 | 2 | $
      2 | 1 | Chinese
      2 | 2 | $$$
      2 | 3 | Scorpion Bowl
      2 | 4 | No
      Field:
      id | name
      1 | Cuisine
      2 | Price
      3 | Specialty Cocktail
      4 | DJ?
      simple enough...
  109. 109. Add live music info
      Venue:
      id | name | address
      1 | Deli Llama | Peachtree Rd
      2 | Peking Inn | Lake St
      3 | Thai Tanic | Branch Dr
      Properties:
      venue_id | field_id | value
      1 | 1 | Deli
      1 | 2 | $
      2 | 1 | Chinese
      2 | 2 | $$$
      2 | 3 | Scorpion Bowl
      2 | 4 | No
      3 | 5 | Yes
      3 | 6 | Jazz
      Field:
      id | name
      1 | Cuisine
      2 | Price
      3 | Specialty Cocktail
      4 | DJ?
      5 | Live Music
      6 | Music Genre
      No schema change required
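The "no schema change required" point can be sketched with Python's built-in sqlite3 module. The table layout follows the Venue / Properties / Field slides; the helpers `set_property` and `describe` are hypothetical names introduced for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venue (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE field (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE properties (venue_id INTEGER, field_id INTEGER, value TEXT);
""")
conn.executemany("INSERT INTO venue VALUES (?, ?, ?)", [
    (1, "Deli Llama", "Peachtree Rd"),
    (2, "Peking Inn", "Lake St"),
    (3, "Thai Tanic", "Branch Dr"),
])
conn.executemany("INSERT INTO field VALUES (?, ?)", [
    (1, "Cuisine"), (2, "Price"), (3, "Specialty Cocktail"), (4, "DJ?"),
])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (1, 1, "Deli"), (1, 2, "$"),
    (2, 1, "Chinese"), (2, 2, "$$$"), (2, 3, "Scorpion Bowl"), (2, 4, "No"),
])

def set_property(venue_id, field_name, value):
    """Attach a value to a venue, creating the field row if it is new."""
    row = conn.execute("SELECT id FROM field WHERE name = ?", (field_name,)).fetchone()
    if row is None:
        # A new kind of fact is just a new row in `field`, not a new column.
        field_id = conn.execute(
            "INSERT INTO field (name) VALUES (?)", (field_name,)
        ).lastrowid
    else:
        field_id = row[0]
    conn.execute("INSERT INTO properties VALUES (?, ?, ?)", (venue_id, field_id, value))

def describe(venue_id):
    """Return {field name: value} for one venue."""
    rows = conn.execute(
        """SELECT f.name, p.value FROM properties p
           JOIN field f ON f.id = p.field_id
           WHERE p.venue_id = ? ORDER BY f.id""",
        (venue_id,),
    )
    return dict(rows.fetchall())

# Live music info arrives later; no ALTER TABLE is needed.
set_property(3, "Live Music", "Yes")
set_property(3, "Music Genre", "Jazz")
print(describe(3))
```

Each row of `properties` already has the shape (thing, attribute, value), which is the subject / predicate / object unit the following slides introduce.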
  110. 110. Explicit semantics
  111. 111. The basic data unit subject predicate object Remember this from grammar class?
  112. 112. Restaurants as triples
      subject | predicate | object
      S1 | cuisine | “Deli”
      S1 | price | “$”
      S1 | name | “Deli Llama”
      S2 | cuisine | “Chinese”
      S2 | price | “$$$”
      S2 | name | “Peking Inn”
      S2 | best drink | “Scorpion Bowl”
      S2 | address | “Lake St”
      S2 | DJ? | “No”
      S4 | name | “Fendalton”
      S4 | contained-by | S5
      S5 | name | “Christchurch”
      S1 | location | S4
      S6 | name | “Downtown”
      S6 | contained-by | S7
      S7 | name | “Wellington, NZ”
      S2 | location | S6
      Machine readable and almost human readable
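A minimal sketch of how such triples might be held and queried in plain Python, using a subset of the rows from the slide. The helper names `match`, `name_of`, and `where_is` are hypothetical; a real triple store would index the data rather than scan a list.

```python
# Each fact is one (subject, predicate, object) tuple.
triples = [
    ("S1", "name", "Deli Llama"),
    ("S1", "cuisine", "Deli"),
    ("S1", "location", "S4"),
    ("S2", "name", "Peking Inn"),
    ("S2", "cuisine", "Chinese"),
    ("S2", "location", "S6"),
    ("S4", "name", "Fendalton"),
    ("S4", "contained-by", "S5"),
    ("S5", "name", "Christchurch"),
    ("S6", "name", "Downtown"),
    ("S6", "contained-by", "S7"),
    ("S7", "name", "Wellington, NZ"),
]

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def name_of(node):
    """Look up the human-readable name attached to a node id."""
    return match(node, "name")[0][2]

def where_is(subject):
    """Follow location, then contained-by links, collecting place names."""
    places = []
    found = match(subject, "location")
    while found:
        place = found[0][2]
        places.append(name_of(place))
        found = match(place, "contained-by")
    return places

print(where_is("S1"))
print([name_of(s) for s, _, _ in match(p="cuisine", o="Chinese")])
```

The same three-slot pattern match serves every question, whichever predicates the data happens to contain.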
  113. 113. ...or as a graph Deli Llama Name Cuisine S1 Deli Price $
  114. 114. Restaurant Graph Peking Inn Deli Llama Name Cuisine Name S1 Deli Price S2 $ Location Cuisine Location Chinese Contained-by Christchurch S4 Name Fendalton
  115. 115. Extending The Restaurant Model Deli Llama Urban Chic Name Decor Cuisine S1 Deli Music Price $ Location Live DJ Contained-by Christchurch S4 Name Fendalton
  116. 116. Integrating Graph Data Models Deli Llama Name Deli Llama Name A2 Cuisine S1 Deli Price OnTap $ Z6 Brand Leinenkugel Brand Pabst BR
  117. 117. What Went Wrong? Scripting languages facilitate change... where is the data model that does the same? • Things change • Requirements change • User expectations change • Data structures change • Our data models aren’t keeping up
  118. 118. Semantic Representation • Relationships are represented explicitly • Schema can be represented as a graph • Data integration is the union of two graphs This makes creating, extending, and combining data much easier than before
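"Data integration is the union of two graphs" can be taken almost literally when a graph is a set of triples. The sketch below assumes both sources already use the same identifier (S1) for Deli Llama; on the slide the two graphs use different ids, so a real integration would first have to reconcile them. The predicate name "on tap" and the helper `properties` are illustrative.

```python
# Two independently produced graphs, each a set of (subject, predicate, object).
restaurant_graph = {
    ("S1", "name", "Deli Llama"),
    ("S1", "cuisine", "Deli"),
    ("S1", "price", "$"),
}
bar_graph = {
    ("S1", "name", "Deli Llama"),   # the same fact appears in both sources
    ("S1", "on tap", "Leinenkugel"),
    ("S1", "on tap", "Pabst"),
}

# Integration is set union; the duplicated fact collapses into one triple.
merged = restaurant_graph | bar_graph

def properties(graph, subject):
    """Group a subject's triples as {predicate: set of objects}."""
    result = {}
    for s, p, o in graph:
        if s == subject:
            result.setdefault(p, set()).add(o)
    return result

print(len(merged))
print(properties(merged, "S1"))
```

No table had to be altered and no row migrated: the combined graph simply contains more statements about S1.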
  119. 119. Just enough RDF
  120. 120. Just Enough RDF RDF is a Data Model A very simple model!
  121. 121. Cosmos was written by Carl Sagan
  122. 122. Subject Predicate Object (Cosmos) (was written by) (Carl Sagan) author Carl Cosmos Sagan
  123. 123. Subject Which Cosmos? (Cosmos)
  124. 124. Subject Which Cosmos? (Cosmos)
  125. 125. Identifiers are Everywhere #w2e
  126. 126. The humble URI • URIs provide strong references • Much like pointing in the physical world: “this is red”, “this is a pen” • A URIref is an unambiguous pointer to something of meaning
  127. 127. Subject Which Cosmos? (Cosmos) http://rdf.freebase.com/ns/authority.openlibrary.book.OL3568862M
  128. 128. What do you mean, author? http://rdf.freebase.com/ns/book.written_work.author author Carl Cosmos Sagan vocabulary
  129. 129. There are billions of Carl Sagans... http://rdf.freebase.com/ns/en.carl_sagan Cosmos author
  130. 130. Cosmos published “1980”; Cosmos author Carl Sagan
  131. 131. RDF Data Model Nodes (“Subjects”) connect via Links (“Predicates”) to Objects • either Nodes or Literals
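The model on this slide can be sketched in a few lines of Python. The Freebase URIs are the ones shown on the preceding slides; the class names and the `published` predicate URI (under example.org) are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Node:
    """Something identified by a URI."""
    uri: str

@dataclass(frozen=True)
class Literal:
    """A plain value such as a string or a year."""
    value: str

@dataclass(frozen=True)
class Triple:
    subject: Node
    predicate: Node
    obj: Union[Node, Literal]  # objects are either Nodes or Literals

FB = "http://rdf.freebase.com/ns/"
cosmos = Node(FB + "authority.openlibrary.book.OL3568862M")
author = Node(FB + "book.written_work.author")
sagan = Node(FB + "en.carl_sagan")
published = Node("http://example.org/published")  # hypothetical predicate URI

graph = [
    Triple(cosmos, author, sagan),               # object is another Node
    Triple(cosmos, published, Literal("1980")),  # object is a Literal
]

def objects(graph, subject, predicate):
    """Return every object linked from `subject` via `predicate`."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]

print(objects(graph, cosmos, author))
print(objects(graph, cosmos, published))
```

Because subjects and predicates are URIs rather than bare words, "which Cosmos?" and "what do you mean, author?" both have unambiguous answers.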
  132. 132. Expressions of RDF RDF has many (inconvenient) serializations • RDF/XML • N3 • Turtle • N-Triples • RDFa
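Of these, N-Triples is the simplest to show: one statement per line, URIs in angle brackets, literals in double quotes, and a closing period. A hand-rolled sketch follows, assuming plain string literals only (no language tags or datatypes); the `published` predicate URI and the helper names are hypothetical.

```python
def term(kind, value):
    """Render an object term: a URI in <...> or an escaped quoted literal."""
    if kind == "uri":
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'

def to_ntriples(triples):
    """Serialize (subject_uri, predicate_uri, (kind, value)) tuples, one per line."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        lines.append(f"<{subject}> <{predicate}> {term(kind, value)} .")
    return "\n".join(lines)

FB = "http://rdf.freebase.com/ns/"
cosmos = FB + "authority.openlibrary.book.OL3568862M"

doc = to_ntriples([
    (cosmos, FB + "book.written_work.author", ("uri", FB + "en.carl_sagan")),
    (cosmos, "http://example.org/published", ("literal", "1980")),  # hypothetical predicate
])
print(doc)
```

The other serializations carry the same triples in a different wrapper, so an application can usually treat the choice of format as an I/O detail.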
  133. 133. URIs provide identity http://rdf.freebase.com/ns/en.robert_cook Stability Simplicity Manageability
  134. 134. Not all URLs are good identifiers
  135. 135. Pluggable Data Semantics allows an application to utilize unanticipated new data sources
  136. 136. Pluggable Data
  137. 137. Data Portability Semantics allows data to be utilized by unanticipated new applications
  138. 138. Data Portability http://dev.mqlx.com/~jamie/simile/timeline.html
  139. 139. Data Portability
  140. 140. Why Does This Work? Semantics facilitate shared meaning through • Subject Identity • Strong and Consistent Semantics • Open APIs + Open Data These principles make it much easier to extend, combine, and integrate data
  141. 141. RDF Graphs: Carrie Fisher (Starred In) Star Wars; Harrison Ford (Starred In) Star Wars; Harrison Ford (Starred In) Blade Runner; Daryl Hannah (Starred In) Blade Runner
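A small sketch of traversing this graph in plain Python: finding co-stars means walking a "starred in" link forward to a film and then backward to its other actors. The helper names are hypothetical, and plain strings stand in for the URIs a real RDF graph would use.

```python
STARRED_IN = "starred in"

triples = [
    ("Carrie Fisher", STARRED_IN, "Star Wars"),
    ("Harrison Ford", STARRED_IN, "Star Wars"),
    ("Harrison Ford", STARRED_IN, "Blade Runner"),
    ("Daryl Hannah", STARRED_IN, "Blade Runner"),
]

def films_of(actor):
    """Follow 'starred in' links forward from an actor."""
    return {o for s, p, o in triples if s == actor and p == STARRED_IN}

def cast_of(film):
    """Follow 'starred in' links backward from a film."""
    return {s for s, p, o in triples if o == film and p == STARRED_IN}

def co_stars(actor):
    """Everyone who shares at least one film with `actor`."""
    found = set()
    for film in films_of(actor):
        found |= cast_of(film)
    found.discard(actor)
    return found

print(sorted(co_stars("Harrison Ford")))
```

Harrison Ford is the node that connects the two films, so the two-hop walk reaches both of the other actors.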
  142. 142. Triple Stores (aka Graph Stores)
  143. 143. AllegroGraph
  144. 144. Keep your data as flexible as the source
  145. 145. Strong Identifiers Strong Semantics (strong vocabularies) Open Data
  146. 146. Can describe?! At the end of this talk - you should be able to say how semantics benefits each of these groups • Semantics Benefit • Site owners • Site users • Developers • You