CQLD on health.data.gov @ SemTech 2011

Presentation given at SemTech SF 2011

Published in: Technology, Education

  1. Clinical Quality Linked Data on health.data.gov
       George Thomas, HHS
       SemTech2011, 2011-06-08, Franciscan B, 2:20-2:45pm
  2. This Presentation
       • Data.gov
           • 2010: EOP/OMB and GSA, RPI
           • 2011: add HHS/CMS
       • Clinical Quality Linked Data
           • health.data.gov
           • Hospital Compare: tools, metadata, data
       • Community of Practice
           • W3C Government Linked Data Working Group
       • Community of Interest
           • Data.gov PMO Semantic Web / Linked Data Team
  3. data.gov 2010
       • EOP and OMB
           • Open Gov Directive
               • ‘TPC’
           • Fed CIO Vivek Kundra
               • ‘Democratizing Data’
       • OMB and GSA
           • OCSIT
               • Data.gov PMO
           • Semantic Web
               • RPI, Virtuoso
  4. HHS and CMS
       • CTO @todd_park
           • /open innovation
               • ‘unleash the mojo!’
       • OCIO
           • Dep CIO, Chief Arch
       • OCSQ
           • Hospital Compare
           • Data.Medicare.gov
           • Clinical Quality Linked Data!
  5. health.data.gov 2011
       • Health Community
           • Mashups
               • public and private
           • Drupal, Socrata
               • showcase, challenges
               • blogs, feeds, syndication
       • Linked Data
           • Virtuoso serves:
               • /def/{vocab}/{concept}
               • /id/{concept}/{instance}
               • /doc/{concept}/{instance}.ext
               • /dataset/{filename}/{date}
  6. Tools
       • Google Refine + DERI RDF extension
           • Graph prototyping, source.tsv lifting
       • TopBraid Composer
           • Vocabulary modeling (RDFS)
           • Initial instance data testing (inferences and queries)
       • Jena
           • schemagen .rdfs to .java
           • ETL source.tsv to source.rdf/ttl
       • Virtuoso
           • Quad store, HTTP conneg / url_rewrite rules
           • Faceted search and browse, REST APIs
  7. Hospital Compare Metadata
       • Created a handful of (generic and domain specific) small component vocabularies
           • health.data.gov/def/{vocab-name}/{Class-or-predicate}
               • /def/hospital/Hospital
               • /def/compare/Condition, /Measure, /Metric
           • reference.data.gov/def/govdata
               • /Record, /RecordSet, /State, /County, /Country
       • Reused another handful (the usual)
           • VoID, FOAF, DC
           • W3C Org, Vcard
               • Evolve toward SKOS, SDMX-RDF and QB?
  8. Hospital Compare Data
       • They don’t call it Virtuoso for nothing…
           • 303 from /id NIR to /doc IR
           • Serves a variety of representation formats
               • RDF/XML, RDF+JSON, Turtle, N-Triples, CSV
               • Atom/OData feeds have wide usage scenarios
                   • All data about a particular Hospital over time
                   • All instances of Measure(s) as they evolve over time
                   • All data for a particular Report/Survey dataset over time …
           • Follow your nose with faceted search and browse services
               • Discover the data model while building SPARQL queries
                   • Whether you’re a carbon or silicon based agent
               • ‘Sponge’ external sites, expose RDBs as RDF, …more…
  9. Community of Practice
       • W3C Government Linked Data Working Group
           • Member oriented, focused on SemWeb implementations of GLD
           • I’m a co-chair along with Bernadette Hyland
               • Of Talis, Inc. – she’s also here at SemTech2011
           • And the expert advice and support of W3C’s Sandro Hawke!
           • Expected GLD charter
               • Community Dir, Publishing Best Practices, Standard Vocabs
               • Complements eGov IG charter
           • First Face to Face GLD Meeting
               • 6/29-30 in Washington, DC area at NITRD
  10. Community of Interest
       • Data.gov PMO Semantic Web / Linked Data Team
           • Open to the public!
               • Ask me for telecon and webshare info if you’re interested in participating
       • Ongoing work
           • EPA Linked Data
               • Facilities and Chemical Registries
           • HHS and CMS Linked Data
               • Additional Clinical Quality domains (other ‘compare’ data)
       • Cross domain correlation
           • Additional mashup (visualization) challenges
  11. Thank You! Questions?
       • george at thomas dot name
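
Slides 5 and 8 describe a URI scheme in which a request for /id/{concept}/{instance} (a non-information resource) is answered with a 303 redirect to /doc/{concept}/{instance}.ext, with the format chosen by HTTP content negotiation. The sketch below shows that mapping in Python as an illustration only; the deck says Virtuoso does this with its own url_rewrite rules, which are not reproduced here. The extension table, the default format, the function names, the http scheme and the instance identifier 12345 are all assumptions, since the slides give only the URI patterns and the list of formats.

```python
from urllib.parse import urlsplit, urlunsplit

# ASSUMPTION: the slides list the formats served but not the ".ext" suffixes,
# so this media-type-to-extension table is illustrative only.
EXT_BY_MEDIA_TYPE = {
    "application/rdf+xml": "rdf",
    "text/turtle": "ttl",
    "application/n-triples": "nt",
    "text/csv": "csv",
}
DEFAULT_EXT = "rdf"  # ASSUMPTION: fallback when nothing in Accept is recognised


def pick_extension(accept_header):
    """Return the extension for the first recognised media type in Accept.

    Simplification: q-values are ignored; entries are tried in listed order.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in EXT_BY_MEDIA_TYPE:
            return EXT_BY_MEDIA_TYPE[media_type]
    return DEFAULT_EXT


def redirect_for(id_uri, accept_header):
    """Map an /id/{concept}/{instance} URI to (303, /doc/... document URI)."""
    scheme, host, path, _query, _fragment = urlsplit(id_uri)
    segments = path.strip("/").split("/")
    if len(segments) != 3 or segments[0] != "id":
        raise ValueError(f"not an /id/{{concept}}/{{instance}} URI: {id_uri}")
    _, concept, instance = segments
    doc_path = f"/doc/{concept}/{instance}.{pick_extension(accept_header)}"
    return 303, urlunsplit((scheme, host, doc_path, "", ""))


if __name__ == "__main__":
    # Hypothetical hospital identifier, used only to exercise the mapping.
    print(redirect_for("http://health.data.gov/id/hospital/12345", "text/turtle"))
```

Keeping the /id and /doc paths parallel means the redirect target can be computed from the request alone, without a lookup table per resource.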
