User Interface * Available at: http://redd.aksw.org
Limitations and Future Work
  • Limitations
    • Information Quality
    • Coverage
    • Interlinking Quality
    • ...
Acknowledgements: Team Members, Reviewers
Questions?                       Comments?                       Suggestions?                                  Thank you !...

ReDD-Observatory


http://aksw.org/Projects/ReDDObservatory

  • Why evaluate disparity? So that health care researchers and policy makers can take appropriate decisions to allocate funds, conduct clinical trials, and perform research.
  • The disparity is between the availability of treatment options and the prevalence of diseases all over the world. Linked Data refers to publishing and connecting structured data on the web such that the data is machine-readable, its meaning is explicitly defined, it is linked to other external datasets, and it can in turn be linked to from external datasets.
  • In essence, the problem we are addressing: the analysis of biomedical research effectiveness with regard to reducing the disparity between research intensity and the global burden of disease is hampered by a lack of methods for integrating and querying distributed, heterogeneous data.
  • After an extensive analysis of the numerous biomedical datasets available (Figure 1), we selected three particular ones.
  • LinkedCT: 61,920 governmentally and privately funded clinical trials conducted around the world. PubMed: 19 million publications from MEDLINE and other life science journals. Bio2RDF is a mashup of about 42 different biomedical knowledge bases, aiming to facilitate the creation of bioinformatics information systems. GHO contains statistical information on mortality and burden of disease, classified according to death and DALY (disability-adjusted life year) estimates and grouped by country.
  • yearly
  • The dimension components serve to identify the observations. The attribute components allow us to qualify and interpret the observed value(s).
  • Silk 2.0 [17] is a tool for discovering relationships between data items within different knowledge bases, usually available via SPARQL endpoints. MeSH is a controlled vocabulary thesaurus. It consists of sets of terms (i.e. synonyms) naming descriptors (e.g. diseases) arranged in a hierarchical structure. Link types: owl:sameAs, rdfs:subClassOf.
  • Challenges: ambiguity, dataset quality. Link types: owl:sameAs, rdfs:subClassOf.
  • One disability-adjusted life-year is defined as the loss of one year of healthy life to disease. The research productivity indicator was normalized as the ratio of the research productivity for a given disease in a given country to the total research productivity for that country. The disease burden indicator was normalized as the percent ratio of the disease burden for a given condition in a given country to the disease burden for all diseases in that country. The disease burden indicator was placed in the denominator of the research-disease index so that 100 represents a perfect match between research effort and disease burden for a given country and a given disease. Numbers over 100 represent overinvestment in research for that area, whereas numbers under 100 represent underinvestment. However, we did not take into account whether a country spends more or less effort relative to other countries.
  • Correlation between the indices comprising death and DALY, as well as between the indices comprising trials and publications. Over-resourced from the viewpoint of indices comprising publications, but under-resourced from the viewpoint of indices comprising clinical trials. It is difficult to balance the two priorities, longevity and quality of life.
  • Transcript of "ReDD-Observatory"

    1. 1. ReDD-Observatory: Using the Web of Data to Evaluate Research-Disease Disparity. Amrapali Zaveri, Ricardo Pietrobon, Sören Auer, Jens Lehmann, Michael Martin and Timofey Ermilov. 24 August, WI-2011
    2. 2. Outline • Overview • Steps • Dataset Identification • Dataset Conversion • Datasets Interlinking • Research Indices • Results • User Interface • Limitations and Future Work. Amrapali Zaveri, Universität Leipzig, 24 August, WI-2011
    4. 4. ReDD-Observatory • A project to evaluate the disparity between • active areas of biomedical research and • the global burden of disease • Using Linked Data
    6. 6. Motivation • Large amount of disparity all over the world • Placing individuals in danger • Hindrance to research policy makers • Partially caused by restricted access to information • Due to difficulty in reliably obtaining and integrating data • Solution: Using Linked Data
    7. 7. Datasets Identification: The Life Science Linked Data Web • Linked CT • PubMed
    8. 8. Datasets Identification
        #  Dataset                          RDFized
        1  Linked CT                        LinkedCT
        2  PubMed                           Bio2RDF’s PubMed
        3  Global Health Observatory (GHO)  ?
    13. 13. Dataset Conversion: GHO → Convert (using the RDF Data Cube Vocabulary) → Publish → The Linked Data Web [Figure: Linked Data cloud diagram, as of September 2010]
    20. 20. Dataset RDFization • Available as spreadsheets • Fully automated conversion not feasible • Semi-automatic approach developed: • As a plug-in in OntoWiki*, a semantic collaboration platform developed by the AKSW research group • CSV file converted to RDF using the RDF Data Cube Vocabulary. * Sören Auer et al.: OntoWiki: A Tool for Social Semantic Collaboration. In: Proceedings of the CKC 2007 at the 16th International WWW2007, Banff, Canada, 2007
    25. 25. Dataset RDFization: OntoWiki’s CSV Import Plug-in
        Table 3. Estimated deaths per 100,000 female population by cause, and Member State, 2004 (a)
                                    Afghanistan  Albania  Algeria  Andorra  Angola
        Population                  3010         4005     1010     4008     1020
        Tuberculosis                41.4         1.4      1.3      0.9      16.2
        STDs excluding HIV          5.4          0        1.5      0.1      5.3
        Syphilis                    3.5          0        0.5      0        3.8
        Chlamydia                   0.9          -        0        -        0.1
        Gonorrhoea                  0.3          -        0        -        0.1
        HIV/AIDS                    0            -        0.7      1.4      66
        Diarrhoeal diseases         309.2        4.6      27.9     1.2      332.4
        Childhood-cluster diseases  42           0        2.1      0        38.7
        Dimensions (rdfs:subPropertyOf qb:concept): Country, Population, Disease
        Attributes (qb:attribute): Measure, Unit of Measure
        Data Range: 3,5; 7,12
    29. 29. Dataset RDFization: Example of a single statistical item, the death value of 41.4, from the GHO dataset*, represented using the Data Cube vocabulary:
        eg:o1  a             qb:Observation;
               gho:Country   Afghanistan;
               gho:whoid     1605;
               gho:pop       3010;
               gho:disease   Tuberculosis;
               gho:gbdcode   W0003;
               gho:death     41.4.
        3 Million Triples*
        * Available for download at: http://aksw.org/Projects/Stats2RDF
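As a rough illustration of the conversion shown above, the following Python sketch turns a small excerpt of the GHO table into Data Cube style observations and prints them as Turtle. It uses only the standard library and is not the OntoWiki CSV import plug-in. The CSV layout, the function names, the reading of "-" as a missing value, and the `eg:` and `gho:` namespace URIs are assumptions made for the example. Unlike the slide, string values are quoted so that the output is closer to valid Turtle.

```python
import csv
import io

# Excerpt of the GHO table from the slides; this CSV layout is an assumption.
CSV_TEXT = """Cause,Afghanistan,Albania,Algeria
Population,3010,4005,1010
Tuberculosis,41.4,1.4,1.3
Chlamydia,0.9,-,0
"""

# qb: is the RDF Data Cube namespace; the eg: and gho: URIs are placeholders.
PREFIXES = (
    "@prefix qb: <http://purl.org/linked-data/cube#> .\n"
    "@prefix eg: <http://example.org/observation/> .\n"
    "@prefix gho: <http://example.org/gho/> .\n"
)


def table_to_observations(csv_text):
    """Return one dict per (disease, country) cell that holds a value."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    countries = rows[0][1:]                      # header row: country names
    population = dict(zip(countries, rows[1][1:]))
    observations = []
    for row in rows[2:]:                         # remaining rows: one per disease
        disease = row[0]
        for country, value in zip(countries, row[1:]):
            if value in ("", "-"):               # assumed to mean "no estimate"
                continue
            observations.append({
                "country": country,
                "pop": population[country],
                "disease": disease,
                "death": value,
            })
    return observations


def to_turtle(observations):
    """Serialize the observations with the property names used on the slide."""
    blocks = []
    for number, obs in enumerate(observations, start=1):
        blocks.append(
            f"eg:o{number} a qb:Observation ;\n"
            f'    gho:Country "{obs["country"]}" ;\n'
            f"    gho:pop {obs['pop']} ;\n"
            f'    gho:disease "{obs["disease"]}" ;\n'
            f"    gho:death {obs['death']} .\n"
        )
    return PREFIXES + "\n" + "\n".join(blocks)


if __name__ == "__main__":
    print(to_turtle(table_to_observations(CSV_TEXT)))
```

Each table cell becomes its own observation, which is why a table of modest size can expand into a large number of triples.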
    33. 33. Datasets Interlinking: Using SILK* and MeSH (UMLS)** • Publications • Countries • Diseases. *J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov, “Discovering and maintaining links on the web of data,” in ISWC, 2009. **S. J. Nelson, T. Powell, and B. L. Humphreys, The Unified Medical Language System (UMLS) Project. New York: Marcel Dekker, Inc., 2002, pp. 369–378.
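To make the idea of interlinking concrete, here is a small sketch that proposes owl:sameAs link candidates by comparing labels. It is a simplified stand-in and not Silk: it uses `difflib.SequenceMatcher` from the Python standard library, and the labels, the `gho:` and `ex:` identifiers, and the 0.9 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical labels; real data would come from the datasets' SPARQL endpoints.
GHO_DISEASES = {
    "gho:Tuberculosis": "Tuberculosis",
    "gho:Diarrhoeal": "Diarrhoeal diseases",
    "gho:Malaria": "Malaria",
}
TARGET_DISEASES = {
    "ex:tuberculosis": "tuberculosis",
    "ex:diarrheal": "Diarrheal diseases",
    "ex:measles": "Measles",
}


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def propose_links(source, target, threshold=0.9):
    """Link each source item to its best-matching target above the threshold."""
    links = []
    for source_id, source_label in source.items():
        best_id, best_score = None, 0.0
        for target_id, target_label in target.items():
            score = similarity(source_label, target_label)
            if score > best_score:
                best_id, best_score = target_id, score
        if best_id is not None and best_score >= threshold:
            links.append((source_id, "owl:sameAs", best_id))
    return links


if __name__ == "__main__":
    for link in propose_links(GHO_DISEASES, TARGET_DISEASES):
        print(*link)
```

Pure label matching runs into the ambiguity and dataset quality challenges mentioned in the notes, which is one reason a controlled vocabulary such as MeSH is useful as a shared reference.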
    35. 35. Datasets Details
        Dataset   Linked Data    SPARQL Endpoint            Triples  Classes  Properties  RDF Links
        LinkedCT  LinkedCT.org   data/linkedct.org/snorql   7M       13       90          235,998
        PubMed    Bio2RDF.org    pubmed.bio2rdf.org/sparql  797M     120      362         30,000
        GHO       redd.aksw.org  db0.aksw.org:8895/sparql   3M       8        8           6,000
    42. Research Indices
        Death
        • Trials vs. Deaths = (Trials (weighted) / Total Number of Trials) / (Deaths / Total Number of Deaths)
        • Pubs vs. Deaths = (Pubs (weighted) / Total Number of Pubs) / (Deaths / Total Number of Deaths)
        DALY
        • Trials vs. DALYs = (Trials (weighted) / Total Number of Trials) / (DALY / Total Number of DALYs)
        • Pubs vs. DALYs = (Pubs (weighted) / Total Number of Pubs) / (DALY / Total Number of DALYs)
        DALY = Disability-adjusted life-year
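The four indices share one shape: a disease's share of research output divided by its share of disease burden. A minimal sketch of that arithmetic follows, reading each index as a ratio of two fractions. The function name and all numbers are hypothetical, and the weighting of trials and publications is assumed to have been applied beforehand.

```python
def research_index(research_weighted, research_total, burden, burden_total):
    """Share of research divided by share of burden for one disease.

    research_weighted: weighted trials or publications on the disease
    research_total:    total number of trials or publications
    burden:            deaths or DALYs attributed to the disease
    burden_total:      total deaths or DALYs
    """
    if research_total <= 0 or burden_total <= 0 or burden <= 0:
        raise ValueError("totals and burden must be positive")
    research_share = research_weighted / research_total
    burden_share = burden / burden_total
    return research_share / burden_share


# Hypothetical figures: 50 weighted trials out of 1000 in total,
# 200 deaths out of 2000 in total.
trials_vs_deaths = research_index(50, 1000, 200, 2000)
```

Under this reading, a value below 1 would mean the disease receives a smaller share of research than its share of the burden, and a value above 1 the opposite.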
    46. Result - ReDD-Observatory
        Country Profile - Tuberculosis
        • Trials
        • Publications
    51. User Interface
        * Available at: http://redd.aksw.org
    61. Limitations and Future Work
        • Limitations
          • Information Quality
          • Coverage
          • Interlinking Quality
          • Error Propagation
        • Future Work
          • Include more datasets
          • Refine indices
          • Improve user interface
    62. Acknowledgements
        • Team Members
        • Reviewers
    63. Questions? Comments? Suggestions? Thank you!
        http://aksw.org/AmrapaliZaveri
        zaveri@informatik.uni-leipzig.de