Dr. David Wood
david@3roundstones.com
@prototypo
12 September 2013
Linked Data: Opportunities
for Entrepreneurs

Multidisciplinary engineer and entrepreneur David Wood discusses the reasons, approaches and success stories for structured data on the World Wide Web. Linked Data is placed in context with the rest of the Web and that context is used to suggest some areas ripe for entrepreneurial innovation.

Published in: Education, Technology

Linked Data: Opportunities for Entrepreneurs

  1. Dr. David Wood david@3roundstones.com @prototypo 12 September 2013 Linked Data: Opportunities for Entrepreneurs
  2. David Wood: B.S. Mechanical Engineering; B.S. Electrical Engineering (equivalency); M.S. Astronautical Engineering; Aeronautical & Astronautical Engineer; Ph.D. Software Engineering
  3. David Wood ongoing ongoing company founded products disposition 2002 2005 @𝛑Plugged In Software
  4. David Wood RDF Database RDF Database Management RDF Usage ongoing Linked Data Management ongoing company founded products disposition 2002 2005 @𝛑Plugged In Software
  5. Data in the Physical World: readable by people
  6. Machine readable; readable by motivated people
  7. 40% annual growth in data produced; 5% annual growth in IT spending. Digital Information Produced: 1.8 ZB (2012), 35 ZB (2020). Daily (2013): Online Ad Impressions, Emails, Tweets (value labels: 294B, 230M, 4.8T; axis from 1 Trillion to 5 Trillion).
  8. Today’s Data on the Web
  9. “Perfection is achieved, not when there is nothing left to add, but when there is nothing left to remove.” -- Antoine de Saint-Exupéry
  10. Ted Nelson
  11. “The Web is the minimal concession to hypertext that a sequence-and-hierarchy chauvinist could possibly make.” “HTML is precisely what we were trying to PREVENT-- ever-breaking links, links going outward only, quotes you can't follow to their origins, no version management, no rights management.” “The "Browser" is an extremely silly concept-- a window for looking sequentially at a large parallel structure. It does not show this structure in a useful way.”
  12. Aristotle
  13. Tim Berners-Lee
  14. Simon Kaplan
  15. Marc Andreessen
  16. Data Systems
  17. Lack of Context
  18. Required Context
  19. Thinking about Data
  20. Organization Charts
  21. Organization Charts redux
  22. The Web makes graphs out of hierarchies
  23. New Data Requirements • Global access • Open format • Record context (to allow sharing, to allow reuse) • Record provenance
  24. Challenges • Global access: Need to publish to the Web • Open format: Most data currently bound to proprietary tools/formats • Context: Data often structured for individual use without thought to sharing • Provenance: Paradoxically easy given solutions to the others
  25. Linked Data on the Web. Graph of "my data": a measurement (date: 2011-01-01; value: 0; units of measure: degrees Centigrade; ...) collected at Galway Airport and collected by a collector, who is a Person (first name: Michael; last name: Hausenblas)
  26. johnson@example.com Appropriate Copy Problem
  27. Someone else (we don’t know) Schemas/Vocabularies
  28. YouTube: watch videos, publish videos, share videos, rate videos, discuss videos. HDTV: watch better videos.
  29. Linked Data: use data, publish data, share data, rate data, discuss data. RDBMS: use data.
  30. Credit: Bradley P. Allen, Elsevier Labs
  31. Active PURLs for Clinical Study Aggregation. David Wood (1) and Tom Plasterer (2); (1) david@3roundstones.com, (2) Tom.Plasterer@astrazeneca.com
      Diagram labels: HTTP-accessible endpoints capable of returning XML or textual content. Convert XML or textual results to RDF. Render RDF to HTML via template. User resolves a single URI to an Active PURL. Multiple targets queried independently.
      The problem: No coordinated view of clinical study information. Information is distributed across departments, subsidiaries and government data sources.
      The solution: Gather, convert, aggregate and format for display.
      3 Round Stones and AstraZeneca created a system to allow coordinated views of distributed clinical trial information. The system extended the Callimachus Project, an Open Source management system for Linked Data. Persistent URLs, or PURLs, were used to provide globally unique and resolvable identifiers for each clinical study. The PURL concept was extended to enable PURLs to have multiple targets and for the results of each target to undergo arbitrary transformation. PURLs which have such capabilities are called Active PURLs. Information sources relevant to clinical studies were identified, regardless of whether their location was internal or external to the pharmaceutical company's network. Active PURLs were used to resolve data sources having HTTP endpoints capable of returning XML or textual results. Each information source is dynamically transformed into Resource Description Framework (RDF) formats, and all sources' results are then merged into a single, temporary graph of RDF data. Information is rendered to end users as coordinated HTML descriptions of each clinical trial using the Callimachus template engine. Machine-readable versions of the data are also available.
      How semantic technologies help: Linked Data techniques can help both to address the availability of clinical trial information and to provide a means of building effective information systems using it. Linked Data techniques allow for "cooperation without coordination". Publishers of data provide context for use by third parties in other portions of a distributed enterprise. Users of Linked Data can combine information from multiple sources. Subsequent publication can create a virtuous circle of positive feedback, allowing researchers, informaticists and support staff to collaboratively and distributively build a reusable knowledge base.
      Challenges: Distributed queries have many known limitations, such as the introduction of multiple single points of failure in any given PURL resolution. HTTP timeouts, authentication/authorization errors or other network failures can slow or stop a pipeline from returning correctly. Similarly, distributed queries can result in variant query-time performance due to complex network and endpoint performance variances.
      Next steps: Proactive caching and cache management strategies can improve runtime performance and protect end users from the limitations inherent in a distributed query architecture. Caching of intermediate results from endpoints has not yet been implemented.
      User experience: (1) Users resolve a URL that provides a unique identifier for a clinical study, drug, chemical or other concept managed by this system. The user may be presented with the URL on HTML pages, search for it via full-text techniques or discover it via semantic search. (2) Users are presented with a dynamically generated Web page representing aggregated clinical study information. Users are isolated from the complex and distributed information environment.
  32. Your Opportunity? • Linked Data warehouses: 10B USD annually • Linked Data supply chains: 205M USD annually (Web), 6B USD annually (enterprise) • Linked Data analytics: 16B USD annually
  33. This work is Copyright © 2011 3 Round Stones Inc. It is licensed under the Creative Commons Attribution 3.0 Unported License. Full details at: http://creativecommons.org/licenses/by/3.0/ You are free: to Share — to copy, distribute and transmit the work; to Remix — to adapt the work. Under the following conditions: Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
  34. Dr. David Wood david@3roundstones.com @prototypo 12 September 2013 Linked Data: Opportunities for Entrepreneurs
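
The measurement diagram on slide 25 can be read as a small set of subject-predicate-object statements, which is the shape RDF gives to Linked Data. The sketch below is one possible reading of that diagram, not the author's own encoding. The `http://example.org/` namespace, the identifiers `measurement1` and `collector1`, the property names, and the direction of each edge are assumptions made for illustration. Only the labels and values (Michael, Hausenblas, 2011-01-01, 0, degrees Centigrade, Galway Airport) come from the slide.

```python
# Assumed namespace: the slide does not give any URIs.
EX = "http://example.org/"


def uri(name):
    """Wrap a local name as a full URI reference in Turtle syntax."""
    return f"<{EX}{name}>"


def lit(value):
    """Wrap a value as a plain quoted literal."""
    return '"' + str(value) + '"'


# Each tuple is (subject, predicate, object). "a" is Turtle shorthand for rdf:type.
# Identifiers and property names are hypothetical; values are from slide 25.
triples = [
    (uri("measurement1"), "a", uri("Measurement")),
    (uri("measurement1"), uri("date"), lit("2011-01-01")),
    (uri("measurement1"), uri("value"), lit(0)),
    (uri("measurement1"), uri("unitsOfMeasure"), lit("degrees Centigrade")),
    (uri("measurement1"), uri("collectedAt"), uri("GalwayAirport")),
    (uri("measurement1"), uri("collectedBy"), uri("collector1")),
    (uri("collector1"), "a", uri("Person")),
    (uri("collector1"), uri("firstName"), lit("Michael")),
    (uri("collector1"), uri("lastName"), lit("Hausenblas")),
]


def to_turtle(statements):
    """Serialize the statements one per line, Turtle style."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


def objects(statements, subject, predicate):
    """Return every object recorded for a subject/predicate pair."""
    return [o for s, p, o in statements if s == subject and p == predicate]


if __name__ == "__main__":
    print(to_turtle(triples))
    # Follow the link from the measurement to whoever collected it.
    collector = objects(triples, uri("measurement1"), uri("collectedBy"))[0]
    print(objects(triples, collector, uri("lastName")))
```

Because the collector is identified by a URI rather than embedded as text, another dataset could refer to the same person and the two graphs would join on that identifier. That is the context-recording requirement from slide 23 in miniature.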
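
The gather, convert, aggregate and render steps described on slide 31 can be sketched in a few lines. This is a toy illustration of the idea, not the Callimachus implementation or its API. The fetch functions are hypothetical stand-ins that return canned strings instead of making HTTP calls, the study identifier `STUDY-001` is invented, and the triples are simplified to plain string tuples rather than real RDF. Targets are also queried one after another here for simplicity, whereas the poster describes them as queried independently.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical endpoints: canned payloads stand in for real HTTP responses.
def fetch_registry(study_id):
    return f"<study id='{study_id}'><title>Example study</title></study>"


def fetch_internal(study_id):
    return f"{study_id}|phase|II"


def fetch_unreachable(study_id):
    raise TimeoutError("endpoint did not answer")


def xml_to_triples(payload):
    """Convert an XML result into (subject, predicate, object) tuples."""
    root = ET.fromstring(payload)
    subject = root.attrib["id"]
    return [(subject, child.tag, child.text or "") for child in root]


def text_to_triples(payload):
    """Convert a pipe-delimited textual result into one tuple."""
    subject, predicate, obj = payload.split("|")
    return [(subject, predicate, obj)]


# One identifier, several targets, each with its own transformation.
TARGETS = [
    (fetch_registry, xml_to_triples),
    (fetch_internal, text_to_triples),
    (fetch_unreachable, text_to_triples),
]


def resolve_active_purl(study_id, targets=TARGETS):
    """Query every target, merge what succeeds into one temporary graph."""
    graph = set()
    failures = []
    for fetch, convert in targets:
        try:
            graph.update(convert(fetch(study_id)))
        except (OSError, ValueError) as exc:  # TimeoutError is an OSError
            failures.append(f"{fetch.__name__}: {exc}")
    return graph, failures


def render_html(study_id, graph):
    """Render the merged graph as a simple HTML description."""
    rows = "".join(
        f"<li>{escape(p)}: {escape(o)}</li>"
        for s, p, o in sorted(graph)
        if s == study_id
    )
    return f"<h1>{escape(study_id)}</h1><ul>{rows}</ul>"


if __name__ == "__main__":
    graph, failures = resolve_active_purl("STUDY-001")
    print(render_html("STUDY-001", graph))
    print(failures)
```

Recording failures instead of raising keeps one slow or unreachable endpoint from blocking the whole page. That is one way to soften the single-point-of-failure limitation the poster lists under its challenges. Caching of intermediate results, which the poster names as a next step, is not shown.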