Linked Data and Cochrane Reviews: A report from the "Star Trek" Crew. Chris Mavergames, Web Operation...
Linked Data and Cochrane Reviews
A talk I gave at the Cochrane Colloquium in Madrid in October 2011
http://www.cochrane.org/multimedia/multimedia-cochrane-colloquia-and-meetings/colloquium-madrid-2011#5

Published in: Health & Medicine, Education
  • The technologies that underpin this can be used at the level of the web itself AND within and across individual datasets. We are investigating both. The current web is a web of documents, mostly in HTML/XML. That is good for telling computers how to display information, but not good for embedding meaning and relationships between bits of information. Linked Data technologies allow for structuring data so that both humans and computers can understand it. For example, Cochrane Review XML is highly structured but the relationships are not explicit. If they were, we could query across the dataset and link other datasets for complex queries. The web turns into a giant database (the vision, anyway). Search results display can be improved, and content can be enriched with external data and re-packaging of our data. There are many other possibilities. It’s all about structured data.
  • Every presentation about the web and technology nowadays has to include a cloud image!!
  • So, what does this mean for Cochrane data?
  • Cochrane Reviews are in XML that is transformed into HTML for display on web pages (Cochrane Library, cochrane.org, etc.). The document is highly structured but relationships between the various elements not explicit. If they were, we could link data across our own dataset, across Reviews.
  • The CRS is an exciting new project that could facilitate better linking of data. (Refer to something Gordon said in his talk, if possible/applicable).
  • Henceforth in this presentation, when you see this image, all that semantic technology “stuff” is at work behind the scenes...
  • Triples in a triple store are understandable by both humans and machines. We think in triples. So, imagine having a whole bunch (millions) of these statements about aspects of Cochrane Reviews in one bin and querying it. You could infer all sorts of new knowledge. Then, imagine chucking in and linking to other datasets such as the CRS or external ones...
  • And, we’re going to need those gears where we’re going! So, this is what the “Star Trek” stream of work has done so far.
  • Explain Star Trek joke and general framework here...
  • This is the initial pass at creating an ontology, a structured representation of the concepts behind Cochrane Reviews. I know you can’t read it all here but the basic idea is to express all the relationships between the various parts of the Review. In case anyone asks what the heck an “ontology” is, a definition: “An ontology is a formal specification of a shared conceptualization” (Tom Gruber). Ontology = database schema (more or less).
  • So, here you have the concept of a Review.
  • Reviews have included studies.
  • Reviews include comparisons. Comparisons include outcomes that are compared. Etc. This is by no means complete but was meant to assist in a “proof of concept” exercise we called “interrogating the XML” of Cochrane Reviews.
  • For example, in the area of findings, the ontology is not yet fleshed out. Lorne Becker thus began a basic finding ontology to move us forward in that area.
  • Well, we’ll need the gears for sure! From this initial ontology, we were able to use the gears (OWL, RDF, SPARQL) to answer some initial queries. We started by asking the question: “What have you always wanted to know from Cochrane Reviews?” The aim was to tease out queries that cut across the dataset and answer questions that the Review structure is not currently set up to answer. For example... Special note: Archie and/or the CRS might already be able to do some of these queries, either via the front-end interface or behind the scenes. The difference with the linked data approach is that this markup enables linking to outside datasets as well as “stitching together” data across Cochrane datasets. In addition, there are advantages and disadvantages to this approach that Paul will cover.
  • These questions build on each other in increasing complexity...
  • Just so you can visualize the potential application of this: in this PubMed search, each Cochrane box has pointers to the Cochrane Review(s) that included that trial.
  • Just showing this once: the actual gears at work!
  • What sorts of bias are most prevalent in this particular body of research/clinical question of interest?
  • So, in just these first 2 sample questions, one can see how looking across the data and querying the dataset in ways that aren’t currently possible (without a load of manual work, of course) can allow us to ask new questions of our data. These are just a few “proof of concept” examples. There are countless other possible questions we could have asked...
  • How can we achieve this? Linked Data sets like LinkedLifeData.com contain multiple datasets all linked together via...
  • the gears! So, with our data using the gears, and these datasets using these same gears, we can enrich our content, improve search, etc.
  • For example, one dataset in LinkedLifeData is DrugBank. DrugBank contains all the variant names, worldwide, for a given drug. But...
  • Our Reviews contain many inconsistencies in the data when referring to drugs and in fields. One of the lessons learned from the Star Trek process so far is that our data is not always clean and consistent. Thankfully, there are things we can do to address this without changing Archie/RevMan now. Perhaps mention Semantic BioMedical Tagger??
  • But, we might want to look at improvements we can make to RevMan, Archie and our processes that are further “upstream” in the Review production process that can improve the quality of the data that comes out.
  • Obviously a hand-cranked example but you get the idea. This seems super Star Trek but we’re in contact with folks at the BBC who are interested in doing a project such as this.
  • Looking to the future again...
  • We could look at answering these kinds of questions which involve external datasets and mashups...
  • GeoNames is a linked data set with geographical information and WWARN, though not yet available in linked data markup, contains info about Malaria drug resistance worldwide.
  • Photoshopped visualization of the answer to this question.
  • Some of these developments require us, as an organisation, to think differently about our content. The one-size-fits-all “container” of the Cochrane Review will need to be flexible in order for us to meet new user demands. That means content that travels freely (any device, any platform, any context), retains its context and meaning so that people know they can trust it, and allows us to create new “products” to meet these new user demands.
  • Quote from Martin Hepp: The “value proposition” of Cochrane Review data is what we have to say about health care, about the evidence behind certain interventions for certain conditions, about the trials that are conducted, etc. Structured and linked data allows us to spread our message wider, to disseminate more effectively the valuable things we have to say about how health care is administered worldwide.
  • BUT, we shouldn’t sell the farm or throw out what we have and start from scratch at all! What’s great about these technologies is that they can sit alongside and enrich and enhance our content, without overturning current processes and infrastructure. So, no worries!
  • How could partners use it? What would it look like on a news site? Or, in PubMed? or anywhere else, for that matter?
  • Obviously, this is a Photoshopped image. We’d need to work out a deal with PubMed! But, it represents the basic idea of thinking of our content in “nimble” terms.
  • Cochrane could take the lead and model the entire knowledge space of Evidence-based Health Care in these semantic standards and create a giant triple store with our data. Then, others in EbHC would use our ontologies and refer to our data and thus we would drive “the conversation” around the data.
  • Then, once we start throwing in other datasets, the triple store becomes even more powerful (note: not sure I drew all lines between all datasets, but you get the idea)…
  • For example, Volkswagen have done this for the car industry. They modeled the domain of car options with a car options ontology and are now positioned with “first mover” advantage in the car industry in leveraging semantic technologies.
  • This crazy image is the Linked Data cloud which shows all the various datasets currently in the web of data. The pink area is the life sciences area and includes PubMed, DrugBank and others.

    1. Linked Data and Cochrane Reviews: A report from the "Star Trek" Crew. Chris Mavergames, Web Operations Manager/Information Architect, Cochrane Collaboration Web Team
    2. Structure of this talk: Intro to linked data and what it means for Cochrane; "Star Trek" stream of work so far; What's possible now and in the future. (Acknowledgements to Lorne Becker and the entire Star Trek crew. Their input was invaluable in the preparation of this talk.)
    3. Cochrane Reviews are fantastic BUT… There are problems that limit their use by some people: difficult to wade through all of the text; difficult to understand the figures, terminology, and other bits of the Review; hard to compare interventions without reading multiple Reviews; can be difficult to find the Review you seek.
    4. Searching The Cochrane Library: Search for “Prozac”: no reviews. Search for “fluoxetine”: 25 reviews.
    5. Ideally we'd restructure our content for different users. Beginning to do this now: Summaries.Cochrane.org for consumers; Cochrane Clinical for clinicians. BUT it takes a lot of work to reformulate reviews, and authors, CRGs, etc. are busy. Wouldn’t it be nice if we could automate or partially automate this?
    6. How did Bing read 3 different weather sites & bring me the data I need?
    7. Could we do similar magic with our Cochrane reviews? If so, what might we be able to accomplish?
    8. Linked data
    9. What is linked data? Semantic Web is made up of: Linked Data & Web of Data, which all together comprise Web 3.0
    10. Current web = Web of documents
    11. Docs are linked, not data in docs
    12. Machines aren't good at reading web pages: Data on the web is meant for human consumption; machines need the data to be structured; once structured, information can be more easily shared within datasets and across web pages.
    13. Cochrane Reviews and Linked data
    14. Cochrane Reviews XML: <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?> <COCHRANE_REVIEW DESCRIPTION="For publication" DOI="10.1002/14651858.CD008440" GROUP_ID="HIV" ID="589309120202025823" MERGED_FROM="" MODIFIED="2011-05-06 12:29:46 +0100" MODIFIED_BY="Rachel Marshall" REVIEW_NO="" REVMAN_SUB_VERSION="5.1.1" REVMAN_VERSION="5" SPLIT_FROM="" STAGE="R" STATUS="A" TYPE="INTERVENTION" VERSION_NO="2.0">........
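As a rough illustration of what "teaching the machines to read" this XML could involve, the sketch below turns a few root-element attributes from the excerpt above into subject/property/object statements. It is only a sketch under assumptions: the element is shortened and self-closed so that it parses, and the property names (lowercased attribute names) and the `doi:` prefix are hypothetical, not the project's actual model.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the root element shown on the slide, self-closed so it
# parses on its own; the real review file continues past this point.
xml_bytes = (
    b'<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>'
    b'<COCHRANE_REVIEW DOI="10.1002/14651858.CD008440" GROUP_ID="HIV" '
    b'STAGE="R" TYPE="INTERVENTION" VERSION_NO="2.0"/>'
)


def review_to_triples(data):
    """Turn root attributes into (subject, property, object) tuples."""
    root = ET.fromstring(data)
    subject = "doi:" + root.attrib["DOI"]  # hypothetical identifier scheme
    # Property names are hypothetical: simply the lowercased attribute names.
    return [
        (subject, name.lower(), value)
        for name, value in sorted(root.attrib.items())
        if name != "DOI"
    ]


for triple in review_to_triples(xml_bytes):
    print(triple)
```

Each printed tuple is one statement about the review, which is the form a triple store works with.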
    15. Fortunately, Cochrane Reviews are structured, but we still need to teach the machines how to read them, where to find data within them and how the data is related.
    16. Cochrane Reviews (diagram of linked data points)
    17. Cochrane Register of Studies
    18. Links to the CRS: Lack of unique study IDs a real problem; CRS solves this by providing a unique ID for all studies that can be referenced; better linking of data about trials and possibilities with linking to external sources such as PubMed (example later).
    19. Linked data technologies: OWL (Web Ontology Language); RDF (Resource Description Framework); SPARQL (RDF query language). Model Cochrane Reviews in OWL; transform them into RDF and add to triple store; query them with SPARQL. OR, simply...
    20. Use the gears!
    21. Triple store = Way we think! Subject -> Property -> Object. <Gerd Antes> has-role <Director German Ctr>; <Director German Ctr> works-in <Freiburg, Germany>; <Gerd Antes> works-in <Freiburg, Germany>
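The three statements on this slide can be mimicked with a toy in-memory triple store. This is a minimal sketch, not how a real RDF store or SPARQL engine works: the `match` and `infer_works_in` helpers are hypothetical names, and the single hard-coded rule stands in for the kind of inference an ontology would allow.

```python
# The first two statements are taken from the slide; the third is derived.
triples = {
    ("Gerd Antes", "has-role", "Director German Ctr"),
    ("Director German Ctr", "works-in", "Freiburg, Germany"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def infer_works_in(store):
    """Hypothetical rule: if X has-role R and R works-in P, then X works-in P."""
    inferred = set()
    for person, _, role in match(store, p="has-role"):
        for _, _, place in match(store, s=role, p="works-in"):
            inferred.add((person, "works-in", place))
    return inferred


triples |= infer_works_in(triples)
print(match(triples, s="Gerd Antes", p="works-in"))
# prints [('Gerd Antes', 'works-in', 'Freiburg, Germany')]
```

The pattern-with-wildcards lookup is the same basic idea a SPARQL query expresses, only over millions of statements rather than three.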
    22. Using “the gears” (diagram): All Reviews in Archie; A Copy of the Review XML; A Model of the Data; A Machine Readable “Triple Store”. Standard tools have been developed to facilitate this process.
    23. Using “the gears” (diagram): A Question; A Machine Readable “Triple Store”; A Machine Generated Answer.
    24. Star Trek
    25. Insert witty Star Trek reference here!
    26. Cochrane Review ontology
    27. Cochrane Review ontology: Lots of work still needed from people with a deep understanding of Cochrane content in order to get the data model and ontology right.
    28. Cochrane Review ontology
    29. Cochrane Review ontology
    30. Cochrane Review ontology
    31. Findings ontology from Lorne
    32. What sorts of things could we do with this? (diagram): A Question; A Machine Readable “Triple Store”; A Machine Generated Answer.
    33. Gears!
    34. We can… Ask questions that use data from several different reviews; enhance the experience of our users by including data from the triple stores of others; improve search; make it easier for people to find Cochrane Reviews.
    35. Enhancing the User Experience: Ask questions that use data from several different reviews
    36. A question using multiple reviews: I’ve done a search for trials on a particular intervention for dementia. I want to know which of the trials have been included in a Cochrane Review.
    37. Finding the answer the old way: Search for the relevant Reviews; read the reference lists to find included trials; compare with my trial search; eliminate the new references that are additional publications from trials already included in a Review. OR…
    38. The “Star Trek” Way (diagram): My list of trials; a “studified” list from the CRS; the Cochrane Review “Triple Store”; a machine generated list of trials not yet included in a review.
    39. Links to the relevant Review for those trials that were included
    40. Question 1: SPARQL query and partial list of results (INSERT IMAGE FOR QUESTION 1 HERE)
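Since the query image is missing from the transcript, here is a hedged stand-in for the logic of Question 1, written in plain Python rather than SPARQL. All review and study identifiers, and the `includes-study` property name, are hypothetical; the point is only that once "review includes study" statements share unique study IDs (as the CRS provides), the comparison becomes a lookup.

```python
# Hypothetical "review includes study" statements keyed by unique study IDs.
includes = [
    ("review:A", "includes-study", "crs:1001"),
    ("review:A", "includes-study", "crs:1002"),
    ("review:B", "includes-study", "crs:1002"),
]

# Hypothetical result of my own trial search, already mapped to study IDs.
my_trials = ["crs:1002", "crs:1003"]


def reviews_by_trial(triples, trials):
    """Map each trial in my list to the reviews that include it."""
    found = {trial: [] for trial in trials}
    for review, prop, study in triples:
        if prop == "includes-study" and study in found:
            found[study].append(review)
    return found


result = reviews_by_trial(includes, my_trials)
not_yet_included = [trial for trial, reviews in result.items() if not reviews]

print(result)            # links to the relevant Review(s) for included trials
print(not_yet_included)  # trials from my search not yet in any review
```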
    41. Another question using multiple Reviews: What are the risks of bias for the entire set of trials assessing the effectiveness of a particular intervention?
    42. Finding the answer the old way: Search for the relevant reviews (there may be more than one); read the tables of included studies to find risk of bias assessments for each trial; combine them* (*in some cases review authors may have done this for all of the trials in a single review).
    43. The “Star Trek” Way (diagram): The Cochrane Review “Triple Store”; a machine generated summary of the Risk of Bias assessments for the relevant trials.
    44. Question 2 visualized: RoB Summary for Cochrane Reviews on dementia. These figures summarize Risks of Bias from the trials included in the reviews in your search.
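The "combine them" step for Question 2 amounts to counting judgements per bias domain across all relevant trials. The sketch below assumes the assessments have already been pulled out of the triple store as simple rows; the study IDs and judgements are made-up illustration data, not results from any review.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (study, bias domain, judgement).
assessments = [
    ("crs:1001", "allocation concealment", "low"),
    ("crs:1002", "allocation concealment", "unclear"),
    ("crs:1003", "allocation concealment", "high"),
    ("crs:1001", "blinding", "low"),
    ("crs:1002", "blinding", "low"),
    ("crs:1003", "blinding", "unclear"),
]


def summarize(rows):
    """Count judgements per bias domain across all trials."""
    summary = defaultdict(Counter)
    for _study, domain, judgement in rows:
        summary[domain][judgement] += 1
    return summary


summary = summarize(assessments)
for domain, counts in summary.items():
    total = sum(counts.values())
    shares = ", ".join(
        f"{judgement}: {count}/{total}" for judgement, count in sorted(counts.items())
    )
    print(f"{domain}: {shares}")
```

A summary figure like the one on this slide would be drawn from exactly these per-domain counts.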
    45. Cochrane Reviews XML: <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?> <COCHRANE_REVIEW DESCRIPTION="For publication" DOI="10.1002/14651858.CD008440" GROUP_ID="HIV" ID="589309120202025823" MERGED_FROM="" MODIFIED="2011-05-06 12:29:46 +0100" MODIFIED_BY="Rachel Marshall" REVIEW_NO="" REVMAN_SUB_VERSION="5.1.1" REVMAN_VERSION="5" SPLIT_FROM="" STAGE="R" STATUS="A" TYPE="INTERVENTION" VERSION_NO="2.0">........
    46. Enhancing the User Experience: Make search work better
    47. You Say “Paracetamol”, I Say “Acetaminophen”. Or, one could say any of these: Abenol (CA), Acephen, Anadin Paracetamol (UK), Apo-Acetaminophen (CA), Aspirin Free Anacin, Atasol (CA), Calpol (UK), Cetaphen, Children's Tylenol Soft Chews, Disprol (UK), Exdol (CA), Feverall, Galpamol (UK), Genapap, Genebs, Infants Pain Reliever, Mandanol (UK), Nortemp, Pain Eze, Panadol (UK), Robigesic (CA), Silapap, Tycolene, Tylenol 8 Hour, Tylenol, Tylenol Arthritis, Uni-Ace, Valorin
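One way to picture "make search work better" is synonym expansion: map whatever name the user types to a canonical drug name before searching. The sketch below is an assumption-laden toy. The brand names come from the slides, but the small lookup table stands in for a source such as DrugBank, and the review titles and function names are hypothetical.

```python
# Brand and variant names taken from the slides; a real table would be
# loaded from an external dataset rather than typed in by hand.
SYNONYMS = {
    "acetaminophen": "paracetamol",
    "tylenol": "paracetamol",
    "panadol": "paracetamol",
    "calpol": "paracetamol",
    "prozac": "fluoxetine",
}

# Hypothetical review titles, for illustration only.
REVIEWS = [
    "Paracetamol for fever in children",
    "Fluoxetine versus placebo for depression",
]


def canonical(term):
    """Map a brand or variant name to its canonical drug name."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def search(query, reviews=REVIEWS):
    """Return titles that mention the canonical form of the query."""
    needle = canonical(query)
    return [title for title in reviews if needle in title.lower()]


print(search("Prozac"))   # a brand-name query now finds the fluoxetine title
print(search("Tylenol"))  # and this one finds the paracetamol title
```

With this in place, the “Prozac” search from the earlier slide would no longer come back empty.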
    48. LinkedLifeData.com
    49. LinkedLifeData.com
    50. DrugBank
    51. Cochrane Reviews XML: <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?> <COCHRANE_REVIEW DESCRIPTION="For publication" DOI="10.1002/14651858.CD008440" GROUP_ID="HIV" ID="589309120202025823" MERGED_FROM="" MODIFIED="2011-05-06 12:29:46 +0100" MODIFIED_BY="Rachel Marshall" REVIEW_NO="" REVMAN_SUB_VERSION="5.1.1" REVMAN_VERSION="5" SPLIT_FROM="" STAGE="R" STATUS="A" TYPE="INTERVENTION" VERSION_NO="2.0">........
    52. Cochrane Reviews XML: <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?> <COCHRANE_REVIEW DESCRIPTION="For publication" DOI="10.1002/14651858.CD008440" GROUP_ID="HIV" ID="589309120202025823" MERGED_FROM="" MODIFIED="2011-05-06 12:29:46 +0100" MODIFIED_BY="Rachel Marshall" REVIEW_NO="" REVMAN_SUB_VERSION="5.1.1" REVMAN_VERSION="5" SPLIT_FROM="" STAGE="R" STATUS="A" TYPE="INTERVENTION" VERSION_NO="2.0">........
    53. Enhancing the User Experience: Make it easier for people to find Cochrane Reviews
    54. Enhancing news content
    55. Enhancing news content: Cochrane Reviews marked up in semantic markup can be linked to news publishers. For example, BBC Health writers could be suggested related Cochrane evidence for a particular story they are writing, and could include a link to primary source material such as a Cochrane Review, thus driving traffic to our Reviews.
    56. Super Star Trek
    57. Super Star Trek: How applicable is this Review in my part of the world?
    58. Geographical relevance: A list of the drugs in comparisons of malaria in Reviews and the geographic extent of their effectiveness
    59. Map of Artemisinin Resistance
    60. The future
    61. Making our content nimble: Structured and linked data can help make our content “nimble”. Nimble content can: Travel Freely; Retain Context and Meaning; Create New Products (R. Lovinger, Razorfish).
    62. Structured data: "Structured data allows you to preserve your value proposition over a longer distance to a much wider audience." (Martin Hepp, creator of the GoodRelations ontology)
    63. Incremental development: Implementing semantic and linked data technologies should be: non-invasive; agile; low impact (on staff; hopefully, high impact on users!).
    64. Looking to the future: What would Cochrane data “look like” outside of its container, the Review?
    65. Risk of Bias in PubMed: For example, someone who is looking at a study in PubMed might be interested in seeing Cochrane’s Risk of Bias assessment of this study, regardless of whether they are interested in the overall Cochrane Review that includes that study.
    66. RoB assessment in PubMed
    67. Summary: Linked Data or Web 3.0 is here. How can we leverage these tools to further our mission? Requires that we think differently about the “container” of the Review. Our data needs to become “nimble” to meet future user needs. We should proceed slowly, incrementally. What are the “quick wins”: Links to CRS? Across-Review queries? Links to external datasets?
    68. EbHC Semantic Platform (diagram): CRS/CENTRAL; CDSR; HTAs; DARE; CMR
    69. EbHC Semantic Platform (diagram): CRS/CENTRAL; CDSR; UMLS; DrugBank; Diseasome; HTAs; DARE; Symptom Ontology; CMR; * BBC Health Ontology (* Not yet created)
    70. Cochrane and EbHC ontology?
    71. Will Cochrane have a bubble here someday?
    72. Muchas Gracias!
