Using Semantics to Enhance Content Publishing

Integrating the cloud into content. Web2.0 Expo NY 2009 Workshop

Published in: Technology, Business

  1. 1. Integrating the Cloud into Content Using Semantics to Enhance Content Publishing Jamie Taylor http://semprog.com/presentations/web20ny
  2. 2. What do y'all mean "Semantics"
  3. 3. critique misfortune bad luck occurrence roast KNOCK sound zing knocking zizz vroom bang belt bash bump rap whack blow
  4. 4. critique misfortune bad luck occurrence roast LJOMF sound zing knocking zizz vroom bang belt bash bump rap whack blow
  5. 5. IBM
  6. 6. IBM. Headquarters: 1 New Orchard Road, Armonk, New York. Legal Structure: Publicly Listed Company. CIK: 0000051143. Ticker Symbol: NYSE:IBM. Date Founded: 1889. Founders: Thomas Watson. CEO: Sam Palmisano. SIC: 3571:Electronic Computers. NAICS: 334111:Electronic Computer Manufacturing. Operating Income: 17,604,000,000 USD (2006). Subsidiaries: Cognos, Cross Worlds. Software Developed: SANSF, ViaVoice, Lotus Notes.
  7. 7. Headquarters: 1 New Orchard Road, Armonk, New York. Legal Structure: Publicly Listed Company. CIK: 0000051143. Ticker Symbol: NYSE:IBM. Date Founded: 1889. Founders: Thomas Watson. CEO: Sam Palmisano. SIC: 3571:Electronic Computers. NAICS: 334111:Electronic Computer Manufacturing. Operating Income: 17,604,000,000 USD (2006). Subsidiaries: Cognos, Cross Worlds. Software Developed: SANSF, ViaVoice, Lotus Notes.
  8. 8. http://www.flickr.com/photos/pacroon/ http://www.flickr.com/photos/soldiersmediacenter/
  9. 9. PageRank™
  10. 10. 1 New Orchard Road Publicly Listed Armonk, New York Company 0000051143 NYSE:IBM 1889 Thomas Watson Sam Palmisano 3571:Electronic Computers 334111:Electronic 17,604,000,000 Computer Manufacturing USD 2006 Cognos Cross Worlds SANSF, ViaVoice Lotus Notes
  11. 11. Earlier this year, the AP slashed prices to try to hold on to subscribers. That's not the answer, says Jeff Jarvis, journalism professor at City University of New York. JEFF JARVIS: The fundamentals of the media economy are changing, from a content economy to a link-based economy. Jarvis says the AP needs to become the broker for those links, like helping the Baltimore Sun link to a story about GM from the Detroit Free Press.
  12. 12. Jarvis resorts to the concept of a "gift economy" to explain the link economy http://www.flickr.com/photos/pagedooley/
  13. 13. I am a behavioral economist. Gift economics are frequently used as explanations for what we don't understand
  14. 14. Worse I am a Behaviorist Only talk about what you can observe
  15. 15. Semantics Process of communicating enough meaning to result in an action
  16. 16. Link Economy • Enriching links focuses meaning • Improves "findability" (SEO) • Increased usability • Better ad selection
  17. 17. Link Economy At the end of this talk - you should be able to say how semantics benefits each of these groups • Semantics Benefit • Site owners • Site users • Developers • You
  18. 18. Wish it were real
  19. 19. Might be real
  20. 20. Is real, but don't believe it
  21. 21. Is very useful Build Flexible Applications with Graph Data
  22. 22. Not Your Typical Semantic Web Talk
  23. 23. The W3C Layer Cake The Cake taken from http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/layerCake-4.png
  24. 24. AI Agents http://www.flickr.com/photos/matthewtownsend/
  25. 25. Ontologies
  26. 26. RDF Serialization Formats
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/business.employment_tenure>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://rdf.freebase.com/ns/business.employment_tenure.company> <http://rdf.freebase.com/ns/en.determine_software>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.institution> <http://rdf.freebase.com/ns/en.mounds_view_high_school>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/education.education>.
      <http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.student> <http://rdf.freebase.com/ns/en.jamie_taylor>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/business.company_founder.companies_founded> <http://rdf.freebase.com/ns/en.mobius_net>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://creativecommons.org/ns#attributionName> "Source: Freebase - The World's database".
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/people.person.nationality> <http://rdf.freebase.com/ns/en.united_states>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/common.topic.image> <http://rdf.freebase.com/ns/en.jamie_headshot>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/type.object.name> "Jamie Taylor"@en.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.tshirt_recipient>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.topic>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/book.author>.
      <http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/people.person>.
  27. 27. Instead.... • Part I: Why (Uses, Benefits) - so you can explain to others • Part II: How (Representation, Concepts) - so you can do what you say
  28. 28. Part I Why
  29. 29. Is very useful Build Flexible Applications with Graph Data
  30. 30. The Office (US) Leatherheads TV Program Film stars in starred in John Krasinski Person, Actor attended Brown University College/university Graph Data Model
  31. 31. A socially managed semantic database
  32. 32. Freebase has Many Types of Things
  33. 33. 9,547,107 Topics
  34. 34. Contributions over $50000 made to members of the US congress in the 2008 election cycle by companies headquartered outside of the United States topic: topic: Barack Obama Switzerland government position held took money from is based in topic: topic: United States UBS AG Senator Freebase
  35. 35. Industry Browser Identity Model Industry (USCB) Company Company Donations NAICS Ticker CRP CRP ID CRP CRP ID NAICS/SIC Map SEC Freebase Industry (SEC) Company People Person SIC SEC CIK SEC CIK Freebase Wikipedia Freebase Wikipedia Location Article ZIP Code
  36. 36. Industry Browser http://kiwitobes.com/industry_mashup/
  37. 37. Barriers between science and the humanities impede solving humanity's important problems Web 2.0 + Semantics
  38. 38. "Smoov" Ankolekar et al. 2007
  39. 39. Topic Blocks http://www.freebase.com/topicblocks/index?id=/en/pirates_of_the_caribbean_3
  40. 40. http://www.freebase.com/widget/topic?mode=i&pane=image,article_props&id=/en/pirates_of_the_caribbean_3 http://www.freebase.com/widget/topic?mode=i&pane=image,article_props&id=/en/blade_runner
  41. 41. Patrick Sinclair (BBC)
  42. 42. About the Content (and visitor?)
  43. 43. MIT Simile
  44. 44. Simile http://dev.mqlx.com/~jamie/simile/timeline.html
  45. 45. Data Portability: Semantics allows data to be utilized by unanticipated new applications
  46. 46. Simile
  47. 47. MIT Simile: Exhibit
  48. 48. User Experience
  49. 49. Topic Hubs
  50. 50. Open Calais
  51. 51. Open Calais
  52. 52. http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633 Open Calais
  53. 53. <rdf:Description rdf:nodeID="A1"> <att:lastupdated>2009-06-18T21:22:28</att:lastupdated> <att:text>IBM Corporation And Siemens Announce Integrated Solutions To Help Companies</att:text> </rdf:Description> <rdf:Description rdf:nodeID="A2"> <att:code>3577</att:code> <att:description>Computer Periph'L Equipment, Nec</att:description> </rdf:Description> <rdf:Description rdf:nodeID="A3"> <att:code>7371</att:code> <att:description>Computer Programming Services</att:description> </rdf:Description> <rdf:Description rdf:nodeID="A4"> <att:age>46</att:age> <att:lastname>Iwata</att:lastname> <att:officerurl rdf:resource="http://www.reuters.com/finance/stocks/officerProfile?symbol=IBM.N&amp;officerId=222727"/> <att:firstname>Jon</att:firstname> <att:title>Senior Vice President - Marketing and Communications</att:title> <att:middle>C.</att:middle> </rdf:Description> http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633 Open Calais
  54. 54. A Graph of Graphs <owl:sameAs rdf:resource="http://dbpedia.org/resource/IBM"/> <owl:sameAs rdf:resource="http://cb.semsol.org/company/ibm#self"/> <owl:sameAs rdf:resource="http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633"/>
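One way to read the "Graph of Graphs" slide: each owl:sameAs statement says two identifiers name the same thing, so identifiers from different datasets can be grouped together. The sketch below is only an illustration of that idea using a small union-find. The subject URI `http://example.org/id/ibm` is a hypothetical placeholder (the slide does not show which resource carries the sameAs links), and the function name is an assumption.

```python
from collections import defaultdict

# Hypothetical subject; the three targets are the URIs shown on the slide.
SUBJECT = "http://example.org/id/ibm"
same_as = [
    (SUBJECT, "http://dbpedia.org/resource/IBM"),
    (SUBJECT, "http://cb.semsol.org/company/ibm#self"),
    (SUBJECT, "http://p.opencalais.com/er/company/ralg-tr1r/"
              "9e3f6c34-aa6b-3a3b-b221-a07aa7933633"),
]


def equivalence_classes(pairs):
    """Group identifiers that are connected by sameAs pairs."""
    parent = {}

    def find(x):
        # Follow parent pointers to the representative, shortening the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        parent[root_b] = root_a  # merge b's group into a's group

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


classes = equivalence_classes(same_as)
print(len(classes), "group with", len(classes[0]), "identifiers")
```

Once identifiers are grouped like this, facts stated about any one of them can be looked up under all of them.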
  55. 55. Epispider Herman Tolentino et al. http://epispider.net/index.php
  56. 56. Chris Thorpe guardian.co.uk Open Platform
  57. 57. Vocabulary Do you understand the words that are coming out of my mouth? -Chris Tucker, Rush Hour
  58. 58. 1 New Orchard Road Publicaly Listed Armonk, New York Company rs Le 0000051143 arte g al NYSE:IBM dqu Str uc ol Hea 1889 b K tur ym CI Dat S e eF e r oun ded Ti ck Thomas Watson Founders Sam Palmisano CEO SIC O pe 3571:Electronic ra IC tin Soft es Computers NA g diari In war com i e De Subs e 334111:Electronic 17,604,000,000 Computer Manufacturing velo ped USD 2006 Cognos Cross Worlds SANSF, ViaVoice Lotus Notes
  59. 59. Epispider Herman Tolentino et al. http://epispider.net/index.php
  60. 60. vocabularies...are everywhere
  61. 61. @ Short URLs # The Twitter Vocabulary
  62. 62. Pivot on an @ tag
  63. 63. Pivot on a # tag
  64. 64. http://bit.ly/info/3zyJ8g Pivot on a Short URL
  65. 65. Vocabularies make links more understandable ...and thus content more findable
  66. 66. microformats Annotate existing HTML so the content can be "extracted by software and indexed, searched for, saved, cross-referenced or combined."
  67. 67. microformats
  68. 68. microformats <div class="vcard"> ..... <div id="view"> <div id="home"> <table> <tr> <td class="f">address</td> <td class="v"> <div class="adr"> <span class="locality">Berkeley</span>, <span class="region">CA</span> <div class="country-name">United States</div> </div> </td> </tr> <tr> <td class="f">aim</td> <td class="v"><a id="aim" class="url im offline" href="aim:goim?screenname=jaredhanson@mac.com">jaredhanson@mac.com</a></td> </tr>
  69. 69. microformats.org
  70. 70. microformats • (Relatively) easy to use • Small, fixed vocabulary • No standard parsing pattern • No strong identifiers • Limits utility
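To make "extracted by software" concrete, here is a minimal sketch that pulls the hCard address fields out of markup like the vcard slide, using only Python's standard library. The sample HTML is a trimmed, hypothetical variant of the slide's markup, and `AdrExtractor` is an illustrative name. Real microformat parsers handle far more cases than this.

```python
from html.parser import HTMLParser

# Hypothetical sample modeled on the hCard slide; not the original page.
SAMPLE = """
<div class="vcard">
  <div class="adr">
    <span class="locality">Berkeley</span>,
    <span class="region">CA</span>
    <div class="country-name">United States</div>
  </div>
</div>
"""

ADR_FIELDS = {"locality", "region", "country-name"}


class AdrExtractor(HTMLParser):
    """Collect the text of elements whose class names an hCard adr field."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # adr field whose text is expected next

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in ADR_FIELDS:
                self._current = name

    def handle_data(self, data):
        # Simplification: the first non-blank text after the tag is the value.
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None


parser = AdrExtractor()
parser.feed(SAMPLE)
address = parser.fields
print(address)
```

Note that the extractor has to know the class names in advance, which reflects the "small, fixed vocabulary" point on the slide.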
  71. 71. RDFa Annotate HTML with machine readable RDF
  72. 72. RDFa <div xmlns:fb="http://rdf.freebase.com/ns/" about="http://rdf.freebase.com/ns/en.jamie_taylor" rel="fb:people.person.place_of_birth"> <span resource="http://rdf.freebase.com/ns/en.saint_paul"/> </div>
  73. 73. RDFa • Unambiguous identifiers • Extensible vocabulary • Standard parsing pattern • Produces RDF • Hard to use • Rules about formatting based on RDF
  74. 74. What “concepts” are covered in content Like existing tagging, but with strong identifiers! <resource> tagged Tag taggingDate "2001-01-01" label means "text" <resource> Strong identifier goes here!
  75. 75. <resource> tagged Tag taggingDate label means "text" <resource> <div class="rdfa" xmlns:ctag="http://commontag.org/ns#"> NASA's <a typeof="ctag:Tag" rel="ctag:means" href="http://rdf.freebase.com/ns/en.phoenix_mars_mission" property="ctag:label">Phoenix Mars Lander</a> has deployed its robotic arm. </div>
  76. 76. And the winner is....
  77. 77. HTML5 MicroData • Annotate HTML with machine readable data • Simple Name-Value Pair design
  78. 78. HTML5 MicroData Sometimes, it is desirable to annotate content with specific machine-readable labels, e.g. to allow generic scripts to provide services that are customised to the page, or to enable content from a variety of cooperating authors to be processed by a single script in a consistent manner.
  79. 79. HTML5 Simple! 15 pages of 657 page spec
  80. 80. HTML5 MicroData <section itemscope itemtype="http://example.org/animals#cat" itemid="http://semprog.com/jamiestuff/hedral"> <h1 itemprop="name">Hedral</h1> <p itemprop="desc">Hedral is a male american domestic shorthair, that is <span itemprop="http://example.com/color">black</span> and <span itemprop="http://example.com/color">white</span>.</p> <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months"> </section>
  81. 81. MicroData Widgets
  82. 82. HTML5 MicroData • Easy to use • Strong identifiers • Extensible vocabulary • Easy to parse • In last call for comments stage! • Usable! Now!
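The "simple name-value pair design" can be illustrated with a short sketch that reads itemprop pairs from markup like the Hedral slide. The sample below is a trimmed, hypothetical variant of that markup, and `ItemPropExtractor` is an illustrative name. It is a simplification: it ignores nested items, and for `img` it returns the raw `src` rather than resolving it to an absolute URL.

```python
from html.parser import HTMLParser

# Trimmed, hypothetical variant of the "Hedral" slide markup.
SAMPLE = """
<section itemscope itemtype="http://example.org/animals#cat">
  <h1 itemprop="name">Hedral</h1>
  <span itemprop="http://example.com/color">black</span> and
  <span itemprop="http://example.com/color">white</span>
  <img itemprop="img" src="hedral.jpeg" alt="">
</section>
"""


class ItemPropExtractor(HTMLParser):
    """Collect (property name, value) pairs from itemprop attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "img":
            # For images the value comes from the src attribute.
            self.pairs.append((prop, attrs.get("src")))
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


extractor = ItemPropExtractor()
extractor.feed(SAMPLE)
pairs = extractor.pairs
for name, value in pairs:
    print(name, "=", value)
```

The same property name may appear more than once (here, two colors), so the result is kept as a list of pairs rather than a dictionary.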
  83. 83. Vocabulary Powered Search Search Applications: - Enhanced results - Info Bar
  84. 84. <div class="hReview-aggregate"> <div class="item vcard"> <h1 class="fn org">Taylor's Automatic Refresher</h1> <div class=rating> <img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating" src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div> <em>based on <span class="count">888</span> reviews</em> </div> <div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span> <address class="adr"> Neighborhood: Embarcadero<br/> <span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /> <span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br /> </address> <span id="bizPhone" class="tel">(866) 328-3663</span>
  85. 85. <div class="hReview-aggregate"> <div class="item vcard"> <h1 class="fn org">Taylor's Automatic Refresher</h1> <div class=rating> <img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating" src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div> <em>based on <span class="count">888</span> reviews</em> </div> <div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span> <address class="adr"> Neighborhood: Embarcadero<br/> <span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /> <span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br /> </address> <span id="bizPhone" class="tel">(866) 328-3663</span>
  86. 86. Search Monkey Vocabulary
  87. 87. Search Monkey Vocabulary
  88. 88. DBPedia Place Vocabulary <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> <rdf:Description rdf:about="http://dbpedia.org/ontology/areaTotal"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:nodeID="b29203"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/nickname"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:range rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumElevation"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:nodeID="b29250"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/nearestCity"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/PopulatedPlace"><rdfs:subClassOf rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description> <rdf:Description rdf:nodeID="b29225"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
  89. 89. Rich Snippet Vocabulary • name • affiliation • nickname • price • postal-code • dtReviewed • photo • country-name • locality • reviewer • region • count • address • itemReviewed • title • brand • category • role http://data-vocabulary.org
  90. 90. Rich Snippet Vocabulary <rdf:Property rdf:ID="affiliation"> <rdfs:comment>An affiliation can be specified by a string literal or an Organization instance.</rdfs:comment> <rdfs:domain rdf:resource="#Person"/> <rdfs:range> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <owl:Class rdf:about="#Organization"/> <owl:Class rdf:about="xsd:string"/> </owl:unionOf> </owl:Class> </rdfs:range> </rdf:Property> <rdf:Property rdf:ID="brand"> <rdfs:domain rdf:resource="#Product"/> </rdf:Property> <rdf:Property rdf:ID="category"> <rdfs:domain> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <owl:Class rdf:about="#Organization"/> <owl:Class rdf:about="#Product"/> </owl:unionOf> </owl:Class> </rdfs:domain> </rdf:Property>
  91. 91. HTML5 Vocabularies
  92. 92. Vocab Hub http://microdata.freebaseapps.com/
  93. 93. Part II How (or why we wrote the book)
  94. 94. The Office (US) Leatherheads TV Program Film stars in starred in John Krasinski Person, Actor attended Brown University College/university Rich Graph Data
  95. 95. Connected to other rich sources
  96. 96. Where does your data live?
  97. 97. Traditional data-modeling
  98. 98. Tabular data
      Restaurant | Address | Cuisine | Price | Open
      Deli Llama | Peachtree Rd | Deli | $ | Mon, Tue, Wed, Thu, Fri
      Peking Inn | Lake St | Chinese | $$$ | Thu, Fri, Sat
      Thai Tanic | Branch Dr | Thai | $$ | Tue, Wed, Thu, Fri, Sat, Sun
      Lord of the Fries | Flower Ave | Fast food | $$ | Tue, Wed, Thu, Fri, Sat, Sun
      Marquis de Salade | Main St | French | $$$ | Thu, Fri, Sat
      Wok this way | Second St | Chinese | $ | Mon, Tue, Wed, Thu, Fri, Sat, Sun
      Luna Sea | Autumn Dr | Seafood | $$$ | Tue, Thu, Fri, Sat
      Pita Pan | Thunder Rd | Middle Eastern | $$ | Mon, Tue, Wed, Thu, Fri, Sat, Sun
      Award Weiners | Dorfold Mews | Fast food | $ | Mon, Tue, Wed, Thu, Fri, Sat
      Lettuce Eat | Rustic Parkway | Deli | $$ | Mon, Tue, Wed, Thu, Fri
      The beloved spreadsheet
  99. 99. Tabular Data
      Restaurant | Address | Cuisine | Price | Open
      Deli Llama | Peachtree Rd | Deli | $ | Mon (11a-4p), Tue (11-4), Wed (11-4), Thu (11-7), Fri (11-8)
      Peking Inn | Lake St | Chinese | $$$ | Thu (5p-10p), Fri (5p-1a), Sat (5p-1a)
      etc…
      Too much information, not enough cells
  100. 100. A simple schema
      Restaurant: id, name, address, cuisine_id
      Hours: restaurant_id, day, open, close
      Cuisine: id, name
      Allows for simple queries
  101. 101. A simple schema
      Restaurant: id | name | address | price
      1 | Deli Llama | Peachtree Rd | $
      2 | Peking Inn | Lake St | $$$
      ...
      Hours: restaurant_id | day | open | close
      1 | Mon | 11 | 16
      1 | Tue | 11 | 16
      1 | Thu | 11 | 19
      2 | Fri | 5 | 23
      ...
      Filled with data
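As a rough illustration of "allows for simple queries", the sketch below loads the rows shown on the slide into an in-memory SQLite database and asks which restaurants are open at a given hour. The column names `open_hour` and `close_hour` are assumptions (renamed from the slide's `open` and `close` for readability), and `open_at` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurant (
        id INTEGER PRIMARY KEY, name TEXT, address TEXT, price TEXT);
    CREATE TABLE hours (
        restaurant_id INTEGER, day TEXT, open_hour INTEGER, close_hour INTEGER);
""")

# Rows as shown on the slide.
conn.executemany("INSERT INTO restaurant VALUES (?, ?, ?, ?)", [
    (1, "Deli Llama", "Peachtree Rd", "$"),
    (2, "Peking Inn", "Lake St", "$$$"),
])
conn.executemany("INSERT INTO hours VALUES (?, ?, ?, ?)", [
    (1, "Mon", 11, 16),
    (1, "Tue", 11, 16),
    (1, "Thu", 11, 19),
    (2, "Fri", 5, 23),
])


def open_at(day, hour):
    """Names of restaurants whose hours on `day` include `hour`."""
    rows = conn.execute(
        "SELECT r.name FROM restaurant r "
        "JOIN hours h ON h.restaurant_id = r.id "
        "WHERE h.day = ? AND h.open_hour <= ? AND ? < h.close_hour "
        "ORDER BY r.name",
        (day, hour, hour),
    )
    return [name for (name,) in rows]


print(open_at("Thu", 18))
```

The query is easy because the schema was designed for exactly this question. The following slides show what happens when the data stops matching the design.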
  102. 102. Some new data
      Bar | Address | DJ | Best Drink
      The Bitter End | 14th Ave | No | Beer
      Peking Inn | Lake St | No | Scorpion Bowl
      Hammer Time | Wildcat Dr | Yes | Hennessey
      Marquis de Salade | Main St | Yes | Martini
      This doesn’t fit into our schema...
  103. 103. Half-empty columns
      Restaurant | Address | Price | DJ | Best Drink
      Deli Llama | Peachtree Rd | $ | |
      Peking Inn | Lake St | $$$ | No | Scorpion Bowl
      Thai Tanic | Branch Dr | $$ | |
      Lord of the Fries | Flower Ave | $$ | |
      Marquis de Salade | Main St | $$$ | Yes | Martini
      Wok this way | Second St | $ | |
      Luna Sea | Autumn Dr | $$$ | |
      Pita Pan | Thunder Rd | $$ | |
      Award Weiners | Dorfold Mews | $ | |
      Lettuce Eat | Rustic Parkway | $$ | |
      Hammer Time | Wildcat Dr | | Yes | Hennessey
      The Bitter End | 14th St | | No | Beer
      Maybe ok now, but can’t this keep happening?
  104. 104. Link the tables
      Restaurant: id, name, address, cuisine_id
      RB_Link: restaurant_id, bar_id
      Bar: id, name, dj, best_drink
      But now the information is duplicated :(
  105. 105. Split place / purpose
      Venue: id, name, address
      Hours: venue_id, day, open, close
      Restaurant: id, venue_id, cuisine_id
      Bar: id, venue_id, dj, best_drink
      Better, but now we have to “migrate”
  106. 106. Large schemas A small section of a limited product
  107. 107. A flexible schema
      Venue: id, name, address
      Properties: venue_id, field_id, value
      Field: id, name
      Does this look familiar?
  108. 108. Add some data
      Venue: id | name | address
      1 | Deli Llama | Peachtree Rd
      2 | Peking Inn | Lake St
      ...
      Properties: venue_id | field_id | value
      1 | 1 | Deli
      1 | 2 | $
      2 | 1 | Chinese
      2 | 2 | $$$
      2 | 3 | Scorpion Bowl
      2 | 4 | No
      Field: id | name
      1 | Cuisine
      2 | Price
      3 | Specialty Cocktail
      4 | DJ?
      simple enough...
  109. 109. Add live music info
      Venue: id | name | address
      1 | Deli Llama | Peachtree Rd
      2 | Peking Inn | Lake St
      3 | Thai Tanic | Branch Dr
      Properties: venue_id | field_id | value
      1 | 1 | Deli
      1 | 2 | $
      2 | 1 | Chinese
      2 | 2 | $$$
      2 | 3 | Scorpion Bowl
      2 | 4 | No
      3 | 5 | Yes
      3 | 6 | Jazz
      Field: id | name
      1 | Cuisine
      2 | Price
      3 | Specialty Cocktail
      4 | DJ?
      5 | Live Music
      6 | Music Genre
      No schema change required
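The "no schema change required" point can be sketched with the three tables from the flexible-schema slides: new kinds of facts arrive as new rows, not new columns. This is an illustrative sketch only, with lower-cased table names and a hypothetical `describe` helper, using the rows shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venue (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE field (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE properties (venue_id INTEGER, field_id INTEGER, value TEXT);
""")
conn.executemany("INSERT INTO venue VALUES (?, ?, ?)", [
    (1, "Deli Llama", "Peachtree Rd"),
    (2, "Peking Inn", "Lake St"),
    (3, "Thai Tanic", "Branch Dr"),
])
conn.executemany("INSERT INTO field VALUES (?, ?)", [
    (1, "Cuisine"), (2, "Price"), (3, "Specialty Cocktail"), (4, "DJ?"),
])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (1, 1, "Deli"), (1, 2, "$"),
    (2, 1, "Chinese"), (2, 2, "$$$"), (2, 3, "Scorpion Bowl"), (2, 4, "No"),
])

# Adding live music info: only INSERTs, no ALTER TABLE.
conn.executemany("INSERT INTO field VALUES (?, ?)", [
    (5, "Live Music"), (6, "Music Genre"),
])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (3, 5, "Yes"), (3, 6, "Jazz"),
])


def describe(venue_name):
    """Return a {field name: value} mapping for one venue."""
    rows = conn.execute(
        "SELECT f.name, p.value FROM properties p "
        "JOIN venue v ON v.id = p.venue_id "
        "JOIN field f ON f.id = p.field_id "
        "WHERE v.name = ? ORDER BY f.id",
        (venue_name,),
    )
    return dict(rows)


print(describe("Thai Tanic"))
```

Each row of `properties` is already close to a (subject, predicate, object) statement, which is where the next slides pick up.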
  110. 110. Explicit semantics
  111. 111. The basic data unit subject predicate object Remember this from grammar class?
  112. 112. Restaurants as triples
      subject | predicate | object
      S1 | cuisine | “Deli”
      S1 | price | “$”
      S1 | name | “Deli Llama”
      S2 | cuisine | “Chinese”
      S2 | price | “$”
      S2 | name | “Peking Inn”
      S2 | best drink | “Scorpion Bowl”
      S2 | address | “Lake St”
      S2 | DJ? | “No”
      S4 | name | “Fendalton”
      S4 | contained-by | S5
      S5 | name | “Christchurch”
      S1 | location | S4
      S6 | name | “Downtown”
      S6 | contained-by | S7
      S7 | name | “Wellington, NZ”
      S2 | location | S6
      Machine readable and almost human readable
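A minimal sketch of working with triples like these, using plain Python tuples and a wildcard pattern match. The data is a subset of the slide's rows. The helper names `match`, `value`, and `city_of` are hypothetical, and this is not how any particular triple store is implemented.

```python
# A subset of the (subject, predicate, object) rows from the slide.
triples = [
    ("S1", "cuisine", "Deli"),
    ("S1", "price", "$"),
    ("S1", "name", "Deli Llama"),
    ("S1", "location", "S4"),
    ("S2", "cuisine", "Chinese"),
    ("S2", "name", "Peking Inn"),
    ("S2", "location", "S6"),
    ("S4", "name", "Fendalton"),
    ("S4", "contained-by", "S5"),
    ("S5", "name", "Christchurch"),
    ("S6", "name", "Downtown"),
    ("S6", "contained-by", "S7"),
    ("S7", "name", "Wellington, NZ"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def value(s, p):
    """Return the first object for (s, p), or None if there is none."""
    found = match(s=s, p=p)
    return found[0][2] if found else None


def city_of(restaurant_name):
    """Follow name -> location -> contained-by -> name."""
    subject = match(p="name", o=restaurant_name)[0][0]
    neighborhood = value(subject, "location")
    city = value(neighborhood, "contained-by")
    return value(city, "name")


print(city_of("Deli Llama"))
print(city_of("Peking Inn"))
```

Walking from a restaurant to its city is just a chain of lookups, with no join tables to design up front.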
  113. 113. ...or as a graph Deli Llama Name Cuisine S1 Deli Price $
  114. 114. Restaurant Graph Peking Inn Deli Llama Name Cuisine Name S1 Deli Price S2 $ Location Cuisine Location Chinese Contained-by Christchurch S4 Name Fendalton
  115. 115. Extending The Restaurant Model Deli Llama Urban Chic Name Decor Cuisine S1 Deli Music Price $ Location Live DJ Contained-by Christchurch S4 Name Fendalton
  116. 116. Integrating Graph Data Models Deli Llama Name Deli Llama Name A2 Cuisine S1 Deli Price OnTap $ Z6 Brand Leinenkugel Brand Pabst BR
  117. 117. What Went Wrong? Scripting Languages facilitate change ....where is the data model that does the same? Things change Requirements change User expectations change Data structures change Our data models aren’t keeping up
  118. 118. Semantic Representation Relationships are represented explicitly Schema can be represented as a graph Data integration is the union of two graphs This makes creating, extending, and combining data much easier than before
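"Data integration is the union of two graphs" can be shown almost literally when graphs are sets of triples. In the sketch below, the identifiers `ex:deli_llama` and `ex:tap_list` are hypothetical, and it assumes both sources already use the same identifier for the restaurant, which is the hard part in practice.

```python
# Two independently produced graphs, each a set of (s, p, o) triples.
restaurant_graph = {
    ("ex:deli_llama", "name", "Deli Llama"),
    ("ex:deli_llama", "cuisine", "Deli"),
    ("ex:deli_llama", "price", "$"),
}
bar_graph = {
    ("ex:deli_llama", "name", "Deli Llama"),
    ("ex:deli_llama", "on_tap", "ex:tap_list"),
    ("ex:tap_list", "brand", "Leinenkugel"),
    ("ex:tap_list", "brand", "Pabst BR"),
}

# Integration is set union; the shared "name" triple collapses to one.
merged = restaurant_graph | bar_graph


def about(subject, graph):
    """All (predicate, object) pairs known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(len(merged))
for predicate, obj in about("ex:deli_llama", merged):
    print(predicate, "->", obj)
```

Neither source had to change its structure to be combined with the other.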
  119. 119. Just enough RDF
  120. 120. Just Enough RDF RDF is a Data Model A very simple model!
  121. 121. Cosmos was written by Carl Sagan
  122. 122. Subject Predicate Object (Cosmos) (was written by) (Carl Sagan) author Carl Cosmos Sagan
  123. 123. Subject Which Cosmos? (Cosmos)
  124. 124. Subject Which Cosmos? (Cosmos)
  125. 125. Identifiers are Everywhere #w2e
  126. 126. The humble URI •URIs provide strong references •Much like pointing in the physical world “this is red” “this is a pen” •a URIref is an unambiguous pointer to something of meaning
  127. 127. Subject Which Cosmos? (Cosmos) http://rdf.freebase.com/ns/authority.openlibrary.book.OL3568862M
  128. 128. What do you mean, author? http://rdf.freebase.com/ns/book.written_work.author author Carl Cosmos Sagan vocabulary
  129. 129. There are billions of Carl Sagans... http://rdf.freebase.com/ns/en.carl_sagan Cosmos author
  130. 130. published “1980” author Carl Cosmos Sagan
  131. 131. RDF Data Model Nodes (“Subjects”) connect via Links (“Predicates”) to Objects • either Nodes or Literals
  132. 132. Expressions of RDF RDF has many (inconvenient) serializations •RDF-XML •N3 •Turtle •NTriples •RDFa
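To show that a serialization is just a way of writing the same triples down, here is a small sketch that writes the Cosmos example as N-Triples using plain string formatting. The subject, author predicate, and Carl Sagan URIs come from the slides. The `http://example.org/ns/published` predicate is a hypothetical stand-in, and treating any string that starts with `http://` as a URI is a shortcut for this example only.

```python
def term(value):
    """Format one term: URIs in angle brackets, everything else as a literal."""
    # Assumption for this sketch: strings starting with http:// are URIs.
    if value.startswith("http://") or value.startswith("https://"):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


COSMOS = "http://rdf.freebase.com/ns/authority.openlibrary.book.OL3568862M"
cosmos_triples = [
    (COSMOS,
     "http://rdf.freebase.com/ns/book.written_work.author",
     "http://rdf.freebase.com/ns/en.carl_sagan"),
    # Hypothetical predicate URI; the slide only labels this edge "published".
    (COSMOS, "http://example.org/ns/published", "1980"),
]

output = to_ntriples(cosmos_triples)
print(output)
```

In practice an RDF library would handle parsing and serializing the formats listed on the slide. The point here is only that the data model stays the same whichever format is used.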
  133. 133. URIs provide identity http://rdf.freebase.com/ns/en.robert_cook Stability Simplicity Manageability
  134. 134. Not all URLs are good identifiers
  135. 135. Pluggable Data: Semantics allows an application to utilize unanticipated new data sources
  136. 136. Pluggable Data
  137. 137. Data Portability: Semantics allows data to be utilized by unanticipated new applications
  138. 138. Data Portability http://dev.mqlx.com/~jamie/simile/timeline.html
  139. 139. Data Portability
  140. 140. Why Does This Work? Semantics facilitate shared meaning through • Subject Identity • Strong and Consistent Semantics • Open APIs + Open Data These principles make it much easier to extend, combine, and integrate data
  141. 141. RDF Graphs Carrie Starred In Star Wars Fisher Starred In Harrison Blade Starred In Ford Runner Starred In Daryl Hannah
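The "Starred In" graph on this slide can be queried by walking its edges. The sketch below finds co-stars by going from an actor to their films and back to other actors. The edge list is taken from the slide, and `co_stars` is a hypothetical helper name.

```python
# (actor, film) pairs for the "Starred In" edges on the slide.
starred_in = [
    ("Carrie Fisher", "Star Wars"),
    ("Harrison Ford", "Star Wars"),
    ("Harrison Ford", "Blade Runner"),
    ("Daryl Hannah", "Blade Runner"),
]


def co_stars(actor):
    """Actors who share at least one film with `actor`, sorted by name."""
    films = {film for a, film in starred_in if a == actor}
    return sorted({a for a, film in starred_in if film in films and a != actor})


print(co_stars("Harrison Ford"))
print(co_stars("Carrie Fisher"))
```

Harrison Ford links the two films, so he is the only actor here with co-stars in both.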
  142. 142. Triple Stores (aka Graph Stores)
  143. 143. Allegro Graph
  144. 144. + + Keep your data as flexible as the source
  145. 145. Strong Identifiers Strong Semantics (strong vocabularies) Open Data
  146. 146. Can describe?! At the end of this talk - you should be able to say how semantics benefits each of these groups • Semantics Benefit • Site owners • Site users • Developers • You
