Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response


Published on

In certain types of _slow burn_ emergencies, careful accumulation and evaluation of information can offer a crucial advantage. The SARS outbreak in the first decade of the 21st century was such an event, and ad hoc
journal clubs played a critical role in assisting scientific and technical responders in identifying and developing various strategies for halting what could have become a dangerous pandemic. This paper describes a process for leveraging emerging semantic web and digital library architectures and standards to (1) create a focused collection of bibliographic metadata, (2) extract semantic information, (3) convert it to the Resource Description Framework /Extensible Markup Language (RDF/XML),
and (4) integrate it so that scientific and technical responders can share and explore critical information in the collections.

Published in: Technology, Education
  • Be the first to comment

Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response

  1. 1. LA-UR-09-02970 Approved for public release; distribution is unlimited. Title: Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response Author(s): James E. Powell Linn Marks Collins Mark L. B. Martinez Intended for: 6th International Conference on Information Systems for Crisis Response and Management Special Session on Solutions for Information Overload May 10-13, 2009 Göteborg, Sweden Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher’s right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness. Form 836 (7/06)
  2. 2. Using Architectures for Semantic Interoperability to Create Journal Clubs for Emergency Response James E. Powell, Linn Marks Collins, and Mark L.B. Martinez {jepowell, linn, mlbm} Knowledge Systems and Human Factors Team Research Library, Los Alamos National Laboratory Slide 1
  3. 3. Bibliographic Metadata to Social, Semantic Collection Metadata In Social Semantic Repository Out UNCLASSIFIED Slide 2
  4. 4. SARS event (Severe Acute Respiratory Syndrome) First cases in 2002 in Asia Spread to 29 countries 8,000+ cases, nearly 800 deaths Significant economic toll Chaotic and secretive response Took months to identify cause Science played a critical role But essentially, we're still here because we got lucky. UNCLASSIFIED Slide 3
  5. 5. How we (re-)defined journal club Traditional journal club Regular face-to-face meetings • Reader summaries, supervised discussion • Short reading list • In support of research, clinical, academic activities • Our version: social, topical collections Metadata for a few hundred to a few thousand selected papers (mini-digital libraries) • Various mechanisms to explore collection • Social tools for calling participants attention to particular papers • Interoperability with other collaboration frameworks • UNCLASSIFIED Slide 4
  6. 6. Digital Library - “thought in cold storage”* What is a digital library? Collection of content – e-journal articles, e-books, etc. • Metadata to expose that content – the “archive” • Persistent identifiers • Search engine(s) to explore metadata • Citation counts enhance the utility of metadata • Metadata export tools • Some Examples: Name Domain Size ACM Digital Library Computer Science >54,000 PubMed Life sciences and biomedicine >17,000,000 Web of Science Broad scientific coverage >38,000,000 Scopus Broad scientific coverage >33,000,000 arXiv Physical sciences >44,000 Oppie Broad scientific coverage >94,000,000 NSDL Broad scientific coverage >2,000,000 UNCLASSIFIED Slide 5
  7. 7. Research in Progress at the Los Alamos National Laboratory, Los Alamos, NM, US Making digital library content relevant in emergencies Is deeper knowledge of value in this situation? • What sources would be useful? • Develop strategy for creating a focused, or targeted, collection — Topic specific harvest or — results of a search against a larger digital library • Match focused collections with people – (expert user or librarian(s) + technology) Normalize, import and provide access to the data • Map the data to RDF expose using tools for exploring moderately large highly relevant collections, and • Make the content social UNCLASSIFIED Slide 6
  8. 8. Components of the System Perform query or harvest content Topic harvester automatically maps content to RDF/XML Augments content with georeference and foaf data • RDF repository hosts the semantic content Middleware – REST web services which expose standards compliant output in response to queries • against RDF repository SPARQL query templates at the core • Depending on service, output is GraphML, KML, GDF, etc. • Rendering layer – Combines output with a viewing app (e.g. google maps view, graph applet view) • RDF Digital Library – Search and social linking services – ties it all together • UNCLASSIFIED Slide 7
  9. 9. From query to RDF repository Bibliographic metadata search service that returns results as XML Metadata mapped to RDF/XML representations (using Dublin Core, FOAF ontologies) Each unique author assigned a UUID Social network of co-authors, author matching via UUID, FOAF Georeference augmentation via placenames found in MARC bibliographic fields – lookup Option to perform additional augmentation by submitted abstracts to OpenCalais UNCLASSIFIED Slide 8
  10. 10. Harvest to RDF OAI-PMH – Open Archives Initiative – Protocol for Metadata Harvesting Data provider shares metadata • Service provider harvests metadata, aggregates, and exposes it • OAI-PMH enables retrieval of bibliographic metadata from remote repositories • Protocol allows for limited set of verbs for exploring a target repository, • including: ListMetadataFormats, ListSets, ListRecords • Then, map and augment as with search results Map metadata to triples • Represent authorship networks • Augment with georeference information • Store content in RDF repository • UNCLASSIFIED Slide 9
  11. 11. Embracing Lossy 80% solution needs to be okay Need quick turnaround • Accept that metadata mapping is not an exact science • Handle mismatched or omitted elements gracefully • Cope with metadata lacking sufficient granularity or type information • Downgrade metadata when necessary, e.g. MARC to Dublin Core • Normalize and extract maximum value from available data, both explicit and implicit • Mapping Metadata Original element RDF property name identifier dc:identifier title dc:title creator, contributor dc:creator foaf:name date dc:date subject dc:subject geo:placename description dc:description UNCLASSIFIED Slide 10
  12. 12. Augmentation Make explicit what is implicit: social networks Assign a UUID to each author • Associate authors and co-authors using FOAF • Match author names across publications, within query results/harvested set • Enhance what's there Find placenames in metadata • Ask geonames service for coordinates • Leverage knowledge extraction services OpenCalais can identify placenames in full text • Also identifies, exposes other concepts and RDF links • UNCLASSIFIED Slide 11
  13. 13. Pre- and post-augmentation results for sample queries situation: (quot;situation awarenessquot;) infoviz: (quot;information visualizationquot;) humanitarian: (quot;humanitarian assistancequot; OR quot;disaster reliefquot;) sars: (“sars” and “coronavirus”) Q uery O riginal Post-processing situation # records 6652 4609 situation # georefs 5 445 infoviz # records 6807 4636 infoviz # georefs 7 194 hum anitarian # records 1626 1312 hum anitarian # georefs 86 455 sars # records 4674 2937 sars # georefs 369 UNCLASSIFIED Slide 12
  14. 14. Mapping process summary UNCLASSIFIED Slide 13
  15. 15. Core ontologies used in mapped data Dublin Core (xmlns:dc=quot;;) Metadata about publications • <object> dc:title “SARS: How a Global Epidemic was Stopped”. — <object> dc:identifier “92-9061-213-4”. — FOAF (xmlns:foaf=quot;;) Properties describing people • <object> foaf:name “James Powell”. — uuid foaf:knows uuid. — Geonames (xmlns:geo=quot;”) Properties describing places • <object> geo:name “Ohio, United States”. — “Ohio, United States” geo:lat “40.5”. — “Ohio, United States” geo:long “-82.5”. — UNCLASSIFIED Slide 14
  16. 16. Social (Author) Awareness Tool Query layer /srdf?q=Visualization&repo=cr&format=graphml Output could be used with any • visualization tool that supports GraphML markup Rendering layer /sat?q=Visualization&repo=cr Uses GUESS Java visualization Applet • Combines Java applet with output from • above REST web service SPARQL query SELECT ?title ?name ?selfuuid ?knowsuuid WHERE { ?y <> ?title. ... UNCLASSIFIED Slide 15
  17. 17. Geographic Awareness Tool Query layer /grdf?query=avian+flu&repo=influenza &format=kml KML output could be used with • Google Earth Rendering layer /gat?query=avian+flu&repo=influenza Combines Java applet with output • from above REST web service uses Google Maps API • SPARQL query SELECT ?title ?name ?selfuuid ?knowsuuid WHERE { ?y <> ?title. ?y <> ?selfuuid. ?selfuuid <> ?name. ... UNCLASSIFIED Slide 16
  18. 18. RDF Digital Library Interface UNCLASSIFIED Slide 17
  19. 19. Enabling Journal Club style interactions Practitioners reviewing and discussing a set of published journal articles Intersects nicely with social linking capabilities Tagging – user provided keywords related to a paper • Rating – a user's personal ranking of the paper on a simple numeric scale • Comment - a user's thoughts about a particular paper • Technical Issues Unique identifier ensures social data is connected to appropriate object • User identity, and overlap with authorship • Users may be interacting all over the place – how can we be interoperable with other • collaboration spaces? UNCLASSIFIED Slide 18
  20. 20. Use SIOC SIOC stands for Semantically Interlinked Online Communities • enables harvesting and aggregation of social linking data • from disparate social networking sites We use SIOC properties to describe users of our social linking tools, as well as some aspects of the content they input when they annotate a particular record. We also use SIOC types module's Comment property A Java servlet converts social linking form input into triples, and uses the openrdf API to insert the triples into the triplestore Social linking – totally native RDF, no intermediary – Immediately accessible to aggregation services UNCLASSIFIED Slide 19
  21. 21. A User Comment in SIOC <sioct:Comment> <sioc:about rdf:resource=quot; 20000081109quot;/> <dc:title>Pele</dc:title> <dcterms:created>2009.04.08 03:41:26 MDT</dcterms:created> <sioc:has_container rdf:resource=quot;uuid:1234-1234-1234quot;/> <sioc:has_creator> <sioc:User rdf:about=quot;http://rdfsearch/queryRdf/ search.jsp#jpowellquot;> <rdfs:seeAlso rdf:resource=quot;http://rdfsearch/queryR df/showComments?userid=jpowellquot;/> </sioc:User> </sioc:has_creator> <sioc:content>Does Pele know about Io (or for that matter, Jupiter?)</sioc:content> UNCLASSIFIED Slide 20
  22. 22. Next up, user tags Users will be able to tag content Tags will receive similar treatment as Comments, stored as triples, • using various ontologies including Dublin Core, FOAF, SIOC • applicable ontologies include SCOT (Social Semantic Cloud of Tags), and/or • MOAT (Meaning Of A Tag) Tags will link to DBPedia concepts RDF Linking will greatly enhance the utility of the tags, • The tag “infectious disease” would be associated with • Reuse of “infectious disease” will point to other library content Our tags could be aggregated with tags from other collaboration spaces UNCLASSIFIED Slide 21
  23. 23. Summary Disease outbreaks can be disastrous but diseases often emerge slowly, sometimes crossing species barriers • appear in one region, then move with populations • are caused by a pathogen that may eventually be identified • may be treatable, corralled by quarantine, curtailed by hygiene, halted by vaccines • Science can prevent or reduce casualties from, and, if we're lucky, reduce or eliminate the risk of future outbreaks of disease but only if scientists have access to topical, vetted, high quality information • and tools to explore and share this knowledge • Our semantic digital library solution has normalized collections of enhanced metadata, organized by topic (search), or source (harvest) with • Middleware layers that enable exploration of the data • Rendering layers that make results usable (e.g. via visualization tools, overlay results onto map) • Social linking capabilities for collaboration and interoperability UNCLASSIFIED Slide 22
  24. 24. Questions? UNCLASSIFIED Slide 23
  25. 25. Awareness tools for E-SOS Our content exploration tools started out as a suite of Just In Time Information Retrieval Tools (JITIR) • leverage user's context • proactively perform tasks in support of user's main activity • manage user's attention appropriately “Context” is content authoring – e.g. blog post, forum entry, email message Input text is “intercepted” and parsed for tokens that may be used to automatically construct queries Query is stop word deleted – boolean linked significant terms, or results of an intermediate call to a term extraction service (Calais, Yahoo) Query is submitted to query middleware tool Results are returned when available, using appropriate rendering layer UNCLASSIFIED Slide 24
  26. 26. UNCLASSIFIED Slide 25
  27. 27. Excerpt of possible tag RDF/XML <sioc:about rdf:resource=quot;info:lanl-repo/biosis/PREV198273075669quot;/> <moat:Tag rdf:about=quot;;> <moat:name><![CDATA[infectious disease]]></moat:name> <moat:hasMeaning> <moat:Meaning> <moat:meaningURI rdf:resource=quot;;/> <foaf:maker rdf:resource=quot;http://rdfsearch/queryRdf/search.jsp#jpowellquot;/> <foaf:maker rdf:resource=quot;;/> </moat:Meaning> </moat:has_meaning> ... More info: UNCLASSIFIED Slide 26