New York City and Baltimore Semantic Web Meetups 20130221/20120226


Slides for a talk at the New York City and Baltimore Semantic Web Meetups on the Linked Data book being published by Manning.

Published in: Education

  1. 1. Discount code: 13ldev at
  2. 2. Our Rapidly Changing Internet • 35 hours of video uploaded per minute • 51% of Internet traffic is non-human • >2.3 billion Internet users, >1 billion in Asia
  3. 3. 08 Oct 2007 07 Nov 2007 10 Nov 2007 28 Feb 2008 31 Mar 2008 18 Sep 2008 05 Mar 2009 27 Mar 2009 14 Jul 2009 22 Sep 2010
  4. 4. David Wood has co-founded several Open Source Software projects related to the Semantic Web, including Persistent URLs, Mulgara and the Callimachus Project. He is co-chair of the W3C’s RDF Working Group. Marsha Zaidman is Associate Professor Emerita of Computer Science at the University of Mary Washington. Luke Ruth is a Linked Data developer supporting the Callimachus Project ( Hausenblas is Chief Data Engineer at MapR. He formerly led the Linked Data Research Centre in Galway, Ireland.
  5. 5.
  6. 6. Linked Data about the book!
  7. 7. The first chapter is free
  8. 8. Manning Early Access Program (MEAP) • Concept: • Give away the first chapter • Sell a low-resolution PDF to early readers • Readers get PDF updates and a print copy when it becomes available
  9. 9. Author Forum
  10. 10. Success Criteria
  11. 11. Success Criteria • Sell 100 copies in MEAP in one month
  12. 12. Success Criteria • Sell 100 copies in MEAP in one month • 598 copies in one month
  13. 13. Success Criteria • Sell 100 copies in MEAP in one month • 598 copies in one month • Lots of interest in Linked Data!
  14. 14. What’s Inside • What Linked Data is • Find Linked Data you can reuse • Use Linked Data in your applications • Create your own Linked Data • Build Linked Data applications using standard Web techniques
  15. 15. Coding Examples Callimachus
  16. 16. 1. Linked Data to the rescue: Available; 2. RDF - the data model for Linked Data: Available; 3. Consuming Linked Data: Available; 4. Creating Linked Data: Available; 5. Querying Linked Data: Available; 6. Enhancing results from search engines: Available; 7. Collecting Linked Data: Available; 8. Datasets: Completed; 9. Callimachus - a Linked Data management system: In draft; 10. Building a read-write Linked Data application: In draft
  17. 17. Callimachus
  18. 18.
  19. 19. Partners Callimachus Sesame Sesame (in progress)
  20. 20. From EPA • From Wikipedia • Open Street Map
  21. 21. Subject • Object (Predicate is defined in a template)
  22. 22. RDF “Describe” View • Subject • Predicate • Object
  23. 23. Active PURLs for Clinical Study Aggregation
      David Wood1 and Tom Plasterer2 1, 2 Tom.Plasterer@astrazeneca.com

      The problem: No coordinated view of clinical study information. Information is distributed across departments, subsidiaries and government data sources.

      The solution: Gather, convert, aggregate and format for display. 3 Round Stones and AstraZeneca created a system to allow coordinated views of distributed clinical trial information. The system extended the Callimachus Project, an Open Source management system for Linked Data. Persistent URLs, or PURLs, were used to provide globally unique and resolvable identifiers for each clinical study. The PURL concept was extended to enable PURLs to have multiple targets and for the results of each target to undergo arbitrary transformation. PURLs which have such capabilities are called Active PURLs. Information sources relevant to clinical studies were identified, regardless of whether their location was internal or external to the pharmaceutical company's network. Active PURLs were used to resolve data sources having HTTP endpoints capable of returning XML or textual results. Each information source is dynamically transformed into Resource Description Framework (RDF) formats and all sources' results are then merged into a single, temporary graph of RDF data. Information is rendered to end users as coordinated HTML descriptions regarding each clinical trial using the Callimachus template engine. Machine-readable versions of the data are also available.

      How semantic technologies help: Linked Data techniques can help to address both the availability of clinical trial information and provide a means to build effective information systems using it. Linked Data techniques allow for "cooperation without coordination". Publishers of data provide context for use by third parties in other portions of a distributed enterprise. Users of Linked Data can combine information from multiple sources. Subsequent publication can create a virtuous circle of positive feedback, allowing researchers, informaticists and support staff to collaboratively and distributively build a reusable knowledge base.

      User experience: 1. Users resolve a URL that provides a unique identifier for a clinical study, drug, chemical or other concept managed by this system. The user may be presented with the URL on HTML pages, search it via full-text techniques or discover it via semantic search. 2. Users are presented with a dynamically generated Web page representing aggregated clinical study information. Users are isolated from the complex and distributed information environment.

      Challenges: Distributed queries have many known limitations, such as the introduction of multiple single points of failure in any given PURL resolution. HTTP timeouts, auth/auth errors or other network failures can slow or stop a pipeline from returning correctly. Similarly, distributed queries can result in variant query-time performance due to complex network and endpoint performance variances. Proactive caching and cache management strategies can improve runtime performance and protect end users from the limitations inherent in a distributed query architecture. Caching of intermediate results from endpoints has not yet been implemented.

      Diagram labels: User resolves a single URI to an Active PURL • Multiple targets queried independently • HTTP-accessible endpoints capable of returning XML or textual content • Convert XML or textual results to RDF • Render RDF to HTML via template
  24. 24. ✔ DocBook 5 ✔ XHTML 5 ✔ ePub 3 • Credit: Bradley P. Allen, Elsevier Labs
  25. 25.
  26. 26. Discount code: 13ldev at
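
The Active PURL pipeline on slide 23 (query several targets, convert each XML result to RDF, merge into one temporary graph, render HTML) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Callimachus implementation: the `example.org` namespaces, the `fetch_*` stand-ins for HTTP endpoints, and the function names are all hypothetical, and triples are modelled as plain Python tuples rather than with an RDF library.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical namespaces, used only for this illustration.
STUDY = "http://example.org/study/"
VOCAB = "http://example.org/vocab/"

# Stand-ins for HTTP endpoints that return XML (assumed, not real services).
def fetch_internal(study_id):
    return f"<study><title>Trial {study_id}</title><phase>2</phase></study>"

def fetch_external(study_id):
    return "<study><status>Recruiting</status></study>"

def fetch_broken(study_id):
    # Simulates the timeout failures listed under "Challenges".
    raise TimeoutError("endpoint did not respond")

def xml_to_triples(subject, xml_text):
    """Turn each child element into one (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    return {(subject, VOCAB + child.tag, (child.text or "").strip())
            for child in root}

def resolve_active_purl(study_id, targets):
    """Query every target independently and merge results into one graph."""
    subject = STUDY + study_id
    graph, failed = set(), []
    for target in targets:
        try:
            graph |= xml_to_triples(subject, target(study_id))
        except (TimeoutError, ET.ParseError) as err:
            # One failing target should not stop the whole resolution.
            failed.append((target.__name__, str(err)))
    return graph, failed

def render_html(subject, graph):
    """Render the merged triples for one subject as a simple HTML table."""
    rows = "".join(
        f"<tr><td>{escape(p)}</td><td>{escape(o)}</td></tr>"
        for s, p, o in sorted(graph) if s == subject
    )
    return f"<h1>{escape(subject)}</h1><table>{rows}</table>"

graph, failed = resolve_active_purl(
    "example-1", [fetch_internal, fetch_external, fetch_broken]
)
page = render_html(STUDY + "example-1", graph)
```

Collecting failures instead of raising is one possible response to the "multiple single points of failure" concern: the page can still be built from the targets that did answer. Caching of intermediate results, which the slide notes was not yet implemented, is deliberately left out.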