Bernadette Hyland SemTech 2011 West - Linked Data Cookbook

Linked Data is an evolving set of techniques for publishing and consuming data on the Web. Learn how Linked Data can turn the Web into a distributed database and how you can participate. In this session, Bernadette Hyland takes the mystery out of Linked Data by summarizing seven steps to prepare your data sets as Linked Data and announce it so others will use it.

Published in: Technology, Education

Transcript of "Bernadette Hyland SemTech 2011 West - Linked Data Cookbook"

  1. The Joy of Data: A cookbook for publishing Linked Data on the Web
     Bernadette Hyland, CEO, 3 Round Stones, Inc
     bhyland@3roundstones.com
  2. A pragmatic approach to publishing & consuming Linked Data
  3. Agenda
     • Setting the scene
     • Ingredients ... we use a cooking analogy
     • Open standards & best practices
     • Data modeling without context
     • Social contract as a publisher
     • Next steps
  4. Setting the scene ... where should we focus?
  5. We’ll review
     • Converting data into RDF
     • The social contract publishers make
     • The importance of announcing
     • Where to turn for guidance
  6. Why should we care?
     • We pretend our organizations are hierarchical -- they aren’t
     • Information is power.
     • Combining information from different sources is very powerful.
     • The US data warehouse market was $10B in 2010
     • It is expected to grow to $13.5B in 2012
  7. World-changing phenomenon
     • Using a Linked Data approach, we can begin to address the non-hierarchical nature of our organizations
     • We can combine information sources
     • The W3C has defined standards that enable interoperability and allow us to freely move data
  8. We are sowing the seeds for nothing short of a revolution
  9. What does it take?
     • The ingredients list ...
     • Thinking differently about your data
     • Modeling for re-use
     • Summary of process in 7 steps
  10. “The change from atoms to bits is irrevocable and unstoppable”
      Being Digital by Nicholas Negroponte
  11. We use URIs to describe both bits & atoms ...
      Information resources are things that computers understand, e.g., Web pages, images, CSS files, etc.
      Non-information resources are atoms, e.g., people, places, events, things, concepts, etc.
  12. • A different way of thinking about data
      • The Open World Assumption
      • Lots of URIs
      • To be a citizen of the world (not everyone speaks English)
      • To publish useful information & announce it!
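The Open World Assumption listed on the slide above can be pictured with a small sketch. It is only an illustration: triples are plain Python tuples, the `example.org` URIs and the `ask` helper are hypothetical, and only `foaf:knows` comes from a real vocabulary.

```python
# Triples are (subject, predicate, object) tuples.
KNOWS = "http://xmlns.com/foaf/0.1/knows"      # real FOAF property
ALICE = "http://example.org/id/alice"          # hypothetical URIs
BOB = "http://example.org/id/bob"
CAROL = "http://example.org/id/carol"

graph = {(ALICE, KNOWS, BOB)}


def ask(graph, triple):
    """Return True if the triple is asserted, otherwise None (unknown).

    Under the Open World Assumption a missing statement is not false;
    it simply has not been stated in this graph.
    """
    return True if triple in graph else None


known = ask(graph, (ALICE, KNOWS, BOB))
unknown = ask(graph, (ALICE, KNOWS, CAROL))
print(known, unknown)
```

The helper deliberately never returns `False`: another publisher may still state that Alice knows Carol, and a consumer should be ready for that.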
  13. Peeling the onion ....
  14. Machine readable
  15. and Human Readable (or edible)
  16. Publish machine & human readable content
      • Machine readable format
      • Human-readable descriptions of your data set
      • Increase visibility with search engines
        • Include RDFa or other microformats
        • Publish a voID description of your RDF dataset
  17. [Chart: online marketing channels plotted by usage against the share of marketers reporting “Great” return on investment. Channels shown: house email, SEO, paid search, banners/buttons, text-link ads, affiliate marketing, behavioral targeting, contextual targeting, rented email lists, rich media/video, pop-ups/pop-unders.]
  18. Model without context
  19. There is a Process
      Identify → Model → Name → Describe → Convert → Publish → Maintain
  20. Preparation
      1. Leverage what exists
      • Request a copy of the logical and physical model of the database(s)
      • Obtain data extracts (i.e., databases and/or spreadsheets) or create data in a way that can be replicated.
  21. Modeling the data
      2. Model data without context to allow for reuse and easier merging of data sets
      • Traditional DBAs organize data for specified Web services or applications.
      • With Linked Data (LD), application logic does not drive the data schema, concepts, etc.
  22. Modeling the data
      3. Look for real world objects of interest (e.g., people, places, things, locations, etc.) and model them.
      • Investigate how others are already modeling similar or related data.
      • Look for duplication and normalize the data
      • Use common sense to decide whether or not to make a link
  23. Modeling the data ...
      4. Connect data from different sources and authoritative vocabularies (see list of popular vocabularies below).
      • Use URIs as names for your objects
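Step 4 above can be sketched as follows. FOAF and Dublin Core Terms are real, widely used vocabularies; everything under `http://example.org/` (the base URI, the person, the report) and the `uri` helper are hypothetical names chosen for illustration.

```python
# Properties and classes reused from existing vocabularies.
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base for the publisher's own identifiers.
BASE = "http://example.org/id/"


def uri(local_name):
    """Mint an http URI for a real-world object under the hypothetical base."""
    return BASE + local_name


person = uri("person/jane-doe")
report = uri("report/2011-annual")

# Each statement names things with URIs; only literal values are plain strings.
triples = [
    (person, RDF_TYPE, FOAF + "Person"),
    (person, FOAF + "name", '"Jane Doe"'),
    (report, DCTERMS + "creator", person),
    (report, DCTERMS + "title", '"Annual Report 2011"'),
]

for s, p, o in triples:
    print(s, p, o)
```

Because the report points at the person by URI rather than by a name string, any other dataset that uses the same URI is talking about the same person.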
  24. Modeling the data ...
      • Put aside immediate needs of any application
      • Don’t think about how an application will use your data
      • Do think about time and how the data will change over time.
  25. Convert, Publish & Maintain
      5. Write a script or process to convert the data set repeatedly
      6. Publish to the Web and announce it! (more details shortly)
      7. Maintenance strategy (more details in the social contract at the end)
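Step 5 above asks for a conversion that can be rerun whenever the source data changes. A minimal sketch, assuming a CSV extract with the hypothetical columns `id`, `name` and `homepage`, a hypothetical `example.org` URI base, and N-Triples as the output format:

```python
import csv
import io

BASE = "http://example.org/id/agency/"            # hypothetical URI base
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"


def escape_literal(value):
    """Escape the characters N-Triples requires escaping inside a literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Convert rows with columns id,name,homepage into sorted N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "<%s%s>" % (BASE, row["id"])
        lines.append('%s <%s> "%s" .'
                     % (subject, FOAF_NAME, escape_literal(row["name"])))
        if row["homepage"]:  # skip empty cells instead of emitting empty URIs
            lines.append("%s <%s> <%s> ."
                         % (subject, FOAF_HOMEPAGE, row["homepage"]))
    return sorted(lines)  # stable order, so successive runs can be diffed


# Made-up sample extract.
sample = (
    "id,name,homepage\n"
    "a1,Example Agency,http://example.org/agency-a1\n"
    "b2,Example Office,\n"
)
output = csv_to_ntriples(sample)
print("\n".join(output))
```

Sorting the output is a design choice, not a requirement of the format: it makes two runs over the same input byte-for-byte identical, which helps with the maintenance step.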
  26. Take the plunge ... Be forgiving
      • Simplistic data models can still be useful
      • Better to make progress with something rather than do nothing because we cannot be comprehensive and complete
  27. Take an iterative approach
      1. Review of modeling decisions
      2. Review vocabularies chosen and developed
      3. Modify/update data conversion scripts
      4. Do a maintenance walk-through with real use cases
      5. Show how to explore data with SPARQL and visualizations
      6. Discuss a persistent identifier strategy (think PURLs)
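To give a feel for the exploration mentioned in item 5 above, here is a toy stand-in for a single SPARQL triple pattern. It is not SPARQL and does not talk to an endpoint; the `match` helper and the `example.org` URIs are hypothetical, and a real deployment would run the equivalent query against a SPARQL endpoint.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

graph = {
    ("http://example.org/id/alice", RDF_TYPE, FOAF_PERSON),
    ("http://example.org/id/alice", FOAF_NAME, '"Alice"'),
    ("http://example.org/id/bob", RDF_TYPE, FOAF_PERSON),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts like a variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly: SELECT ?s WHERE { ?s a foaf:Person }
people = [s for s, _, _ in match(graph, p=RDF_TYPE, o=FOAF_PERSON)]
print(people)
```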
  28. shared innovation™
  29. Describe your data
  30. Data stewards should ...
      • Make data accessible via the Web’s standard access mechanism, specifically HTTP URIs
      • Represent data in a common format, such as RDF/XML, Notation-3 (N3), Turtle, N-Triples, RDFa, and RDF/JSON
      • Provide self-describing data
  31. Linked Data Formats
      • RDF/XML - RDF for XML pipelines
      • Turtle - Human-readable RDF
      • XHTML with GRDDL transformation
      • XHTML with embedded RDFa
      • RDF Schema - Describing structure
  32. In a tart, smoothie or margarita ... berries can be combined in different ways
  33. Merging data
  34. Guidelines for merging
      • URIs name the resources we are describing
      • Two people using the same URI are describing the same thing
      • The same URI in two datasets means the same thing
      • Graphs from several different sources can be merged
      • Resources with the same URI are considered identical
      • No limitations on which graphs can be merged
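The merging guidelines above amount to a set union over triples when every resource is named by a URI. A minimal sketch, with hypothetical `example.org` data and a hypothetical `population` property; it ignores blank nodes, which need extra care in a real merge:

```python
# Two hypothetical datasets describing the same resource by the same URI.
PLACE = "http://example.org/id/place/springfield"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
POPULATION = "http://example.org/vocab/population"   # hypothetical property

dataset_a = {
    (PLACE, LABEL, '"Springfield"'),
}
dataset_b = {
    (PLACE, LABEL, '"Springfield"'),     # same statement, also present in A
    (PLACE, POPULATION, '"30000"'),      # made-up value
}

# Merging is a set union: identical triples collapse, new ones are added.
merged = dataset_a | dataset_b


def describe(graph, subject):
    """Collect every (predicate, object) pair stated about one URI."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(len(merged))
for predicate, obj in describe(merged, PLACE):
    print(predicate, obj)
```

No join keys or schema mapping are involved: the shared URI is what lets statements from both sources end up on the same resource.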
  35. Announcing the finished product!
  36. • Inform the LOD developer community (linkeddata.org, W3C lists)
      • Announce to search engines (RDFa hints, register to make accessible)
      • Publish human readable descriptions
      • Encourage interlinking
      • Publish schema as voID
      • Include SPARQL endpoint
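A voID description, as recommended above, is itself a small piece of RDF. The sketch below builds one as Turtle text using terms from the voID and Dublin Core Terms vocabularies. The `void_description` helper, the dataset URI, the endpoint and the dump location are all hypothetical, and the helper assumes the title contains no double quotes.

```python
def void_description(dataset_uri, title, sparql_endpoint, data_dump):
    """Build a small voID description of a dataset as Turtle text.

    Assumes `title` contains no characters that need escaping in Turtle.
    """
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        "<%s> a void:Dataset ;" % dataset_uri,
        '    dcterms:title "%s" ;' % title,
        "    void:sparqlEndpoint <%s> ;" % sparql_endpoint,
        "    void:dataDump <%s> ." % data_dump,
    ])


# Hypothetical dataset details.
turtle = void_description(
    "http://example.org/dataset/agencies",
    "Example agency dataset",
    "http://example.org/sparql",
    "http://example.org/dumps/agencies.nt",
)
print(turtle)
```

Publishing this file alongside the data gives crawlers and developers one place to find the SPARQL endpoint and the downloadable dump.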
  37. [Pie chart: Acceptable ROI for IT, by payback period (6 months, 12 months, 18 months, 24 months, more than 24 months). Segment values shown: 4%, 17%, 13%, 16%, 49%.]
  38. The Social Contract ... The not so fine print
      • LOD is a social contract to provide the public with information
      • Follow best practices for modeling
      • Carefully consider your URI strategy
      • Ensure that your LOD remains available where you say it will be
      • Publish voID description
      • For a government agency ... a data policy is “a must”
        • Specify data quality and retention, treatment of data through secondary sources, restrictions for use, frequency of updates, public participation, and applicability of this data policy
  39. We’ve created something quite beautiful
  40. Reading
      http://linkeddatabook.com/editions/1.0/
      http://3roundstones.com/linking-enterprise-data/
  41. This work is Copyright © 2011 3 Round Stones Inc.
      It is licensed under the Creative Commons Attribution 3.0 Unported License.
      Full details at: http://creativecommons.org/licenses/by/3.0/
      You are free:
      • to Share: to copy, distribute and transmit the work
      • to Remix: to adapt the work
      Under the following conditions:
      • Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
      • For any reuse or distribution, you must make clear to others the license terms of this work.
      • Any of the above conditions can be waived if you get permission from the copyright holder.
      • Nothing in this license impairs or restricts the author’s moral rights.
      • Some content in the work may be licensed under different terms; this is noted separately.