New York City and Baltimore Semantic Web Meetups 20130221/20120226
Discount code: 13ldev at manning.com
Our Rapidly Changing Internet
35 hours of video uploaded per minute
51% of Internet traffic is non-human
>2.3 billion Internet users, >1 billion in Asia
08 Oct 2007 07 Nov 2007 10 Nov 2007 28 Feb 2008 31 Mar 2008 18 Sep 2008 05 Mar 2009 27 Mar 2009 14 Jul 2009 22 Sep 2010
David Wood has co-founded several Open Source Software projects related to the Semantic Web, including Persistent URLs, Mulgara and the Callimachus Project. He is co-chair of the W3C’s RDF Working Group.
Marsha Zaidman is Associate Professor Emerita of Computer Science at the University of Mary Washington.
Luke Ruth is a Linked Data developer supporting the Callimachus Project (http://callimachusproject.org).
Michael Hausenblas is Chief Data Engineer at MapR. He formerly led the Linked Data Research Centre in Galway, Ireland.
1. Linked Data to the rescue: Available
2. RDF - the data model for Linked Data: Available
3. Consuming Linked Data: Available
4. Creating Linked Data: Available
5. Querying Linked Data: Available
6. Enhancing results from search engines: Available
7. Collecting Linked Data: Available
8. Datasets: Completed
9. Callimachus - a Linked Data management system: In draft
10. Building a read-write Linked Data application: In draft
Active PURLs for Clinical Study Aggregation
David Wood¹ and Tom Plasterer²
¹ email@example.com, ² Tom.Plasterer@astrazeneca.com

The problem: No coordinated view of clinical study information.
Information is distributed across departments, subsidiaries and government data sources.

The solution: Gather, convert, aggregate and format for display
3 Round Stones and AstraZeneca created a system to allow coordinated views of distributed clinical trial information. The system extended the Callimachus Project, an Open Source management system for Linked Data.
Persistent URLs, or PURLs, were used to provide globally unique and resolvable identifiers for each clinical study. The PURL concept was extended so that a PURL can have multiple targets and the results of each target can undergo arbitrary transformation. PURLs with these capabilities are called Active PURLs.
Information sources relevant to clinical studies were identified, regardless of whether they were located inside or outside the pharmaceutical company's network. Active PURLs were used to resolve data sources having HTTP endpoints capable of returning XML or textual results. Each information source is dynamically transformed into Resource Description Framework (RDF) formats, and the results from all sources are then merged into a single, temporary graph of RDF data.
Information is rendered to end users as coordinated HTML descriptions of each clinical trial using the Callimachus template engine. Machine-readable versions of the data are also available.

How semantic technologies help
Linked Data techniques can help both to address the availability of clinical trial information and to provide a means to build effective information systems using it. Linked Data techniques allow for "cooperation without coordination". Publishers of data provide context for use by third parties in other portions of a distributed enterprise. Users of Linked Data can combine information from multiple sources.
Subsequent publication can create a virtuous circle of positive feedback, allowing researchers, informaticists and support staff to collaboratively and distributively build a reusable knowledge base.

User experience
1. Users resolve a URL that provides a unique identifier for a clinical study, drug, chemical or other concept managed by this system. The user may be presented with the URL on HTML pages, search for it via full-text techniques or discover it via semantic search.
2. Users are presented with a dynamically generated Web page representing aggregated clinical study information. Users are isolated from the complex and distributed information environment.

Resolution pipeline (figure): The user resolves a single URI to an Active PURL. Multiple targets (HTTP-accessible endpoints capable of returning XML or textual content) are queried independently. XML or textual results are converted to RDF. RDF is rendered to HTML via a template.

Challenges
Distributed queries have many known limitations, such as the introduction of multiple single points of failure in any given PURL resolution. HTTP timeouts, authentication or authorization errors, or other network failures can slow or stop a pipeline from returning correctly.
Similarly, distributed queries can show variable query-time performance because network and endpoint performance vary.
Proactive caching and cache management strategies can improve runtime performance and protect end users from the limitations inherent in a distributed query architecture. Caching of intermediate results from endpoints has not yet been implemented.
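The resolution steps described above (query several targets independently, convert each result to RDF, merge into one temporary graph) can be sketched roughly as follows. This is a minimal illustration and not the Callimachus implementation: the graph is modelled as a plain Python set of (subject, predicate, object) tuples, the three "endpoints" are hypothetical local functions standing in for HTTP calls, and the `example.org` identifiers and all function names are assumptions made for this example. A target that fails is recorded and skipped, so one broken endpoint does not stop the whole resolution.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical study identifier and vocabulary namespace (illustration only).
STUDY = "http://example.org/study/123"
VOCAB = "http://example.org/vocab#"


def xml_to_triples(subject, xml_text):
    """Turn each child element of the XML root into one (s, p, o) triple."""
    root = ET.fromstring(xml_text)
    return {(subject, VOCAB + child.tag, (child.text or "").strip())
            for child in root}


def text_to_triples(subject, text):
    """Turn 'key: value' lines of plain text into (s, p, o) triples."""
    triples = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a 'key: value' shape
            triples.add((subject, VOCAB + key.strip(), value.strip()))
    return triples


# Stand-ins for HTTP endpoints; a real system would issue HTTP requests here.
def internal_registry():
    return "<study><title>Example trial</title><phase>2</phase></study>"


def government_source():
    return "status: recruiting\nsponsor: Example Sponsor"


def unreachable_source():
    raise TimeoutError("endpoint did not answer")


# Each target pairs a fetcher with the transformation applied to its result.
TARGETS = [
    (internal_registry, xml_to_triples),
    (government_source, text_to_triples),
    (unreachable_source, xml_to_triples),
]


def resolve_active_purl(subject, targets):
    """Query every target independently and merge the resulting triples."""
    graph, failures = set(), []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        jobs = [(pool.submit(fetch), transform, fetch.__name__)
                for fetch, transform in targets]
        for future, transform, name in jobs:
            try:
                graph |= transform(subject, future.result(timeout=5))
            except Exception as exc:  # timeout, parse error, network failure
                failures.append((name, repr(exc)))
    return graph, failures


graph, failures = resolve_active_purl(STUDY, TARGETS)
```

Here `graph` holds the triples from the two working targets and `failures` names the target that timed out. A cache of intermediate results, which the poster lists as not yet implemented, could wrap each fetcher in the same structure.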