This webinar explains the service, covers what publishers need to participate and answers any questions you may have. This webinar was held on November 10, 2015.
How to describe a dataset. Interoperability issuesValeria Pesce
Presented by Valeria Pesce during the pre-meeting of the Agricultural Data Interoperability Interest Group (IGAD) of the Research Data Alliance (RDA), held on 21 and 22 September 2015 in Paris at INRA.
Our speaker, Joshua Mandel, will provide a lightning tour of Fast Healthcare Interoperability Resources (FHIR), an emerging clinical data standard, with a focus on its resource-oriented approach, and a discussion of how FHIR intersects with the Semantic Web. We'll look at how FHIR represents links between entities; how FHIR represents concepts from standards-based vocabularies; and how a set of FHIR instance data can be represented in RDF.
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...DATAVERSITY
In our series on The Yosemite Project, we explore RDF as a data standard for health data. In this installment, we will hear from Rafael Richards, Physician Informatician, Office of Informatics and Analytics in the Veterans Health Administration (VHA), about “Transformations for Integrating VA data with FHIR in RDF.”
The VistA EHR has its own data model and vocabularies for representing healthcare data. This webinar describes how SPARQL Inference Notation (SPIN) can be used to translate VistA data to the data represented used by FHIR, an emerging interchange standard.
DataTags, The Tags Toolset, and Dataverse IntegrationMichael Bar-Sinai
This presentation describes the concept of DataTags, which simplifies handling of sensitive datasets. It then shows the Tags toolset, and how it is integrated with Dataverse, Harvard's popular dataset repository.
Presentation by Luiz Olavo Bonino about the current state of the developments on FAIR Data supporting tools at the Dutch Techcentre for Life Sciences Partners Event on November 3-4 2016.
This webinar explains the service, covers what publishers need to participate and answers any questions you may have. This webinar was held on November 10, 2015.
How to describe a dataset. Interoperability issuesValeria Pesce
Presented by Valeria Pesce during the pre-meeting of the Agricultural Data Interoperability Interest Group (IGAD) of the Research Data Alliance (RDA), held on 21 and 22 September 2015 in Paris at INRA.
Our speaker, Joshua Mandel, will provide a lightning tour of Fast Healthcare Interoperability Resources (FHIR), an emerging clinical data standard, with a focus on its resource-oriented approach, and a discussion of how FHIR intersects with the Semantic Web. We'll look at how FHIR represents links between entities; how FHIR represents concepts from standards-based vocabularies; and how a set of FHIR instance data can be represented in RDF.
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...DATAVERSITY
In our series on The Yosemite Project, we explore RDF as a data standard for health data. In this installment, we will hear from Rafael Richards, Physician Informatician, Office of Informatics and Analytics in the Veterans Health Administration (VHA), about “Transformations for Integrating VA data with FHIR in RDF.”
The VistA EHR has its own data model and vocabularies for representing healthcare data. This webinar describes how SPARQL Inference Notation (SPIN) can be used to translate VistA data to the data represented used by FHIR, an emerging interchange standard.
DataTags, The Tags Toolset, and Dataverse IntegrationMichael Bar-Sinai
This presentation describes the concept of DataTags, which simplifies handling of sensitive datasets. It then shows the Tags toolset, and how it is integrated with Dataverse, Harvard's popular dataset repository.
Presentation by Luiz Olavo Bonino about the current state of the developments on FAIR Data supporting tools at the Dutch Techcentre for Life Sciences Partners Event on November 3-4 2016.
Crossref DataCite joint data citation webinarCrossref
Webinar by DataCite and Crossref on how to cite data (for publishers and repositories) and how this information can be used to recognise and credit data creators.
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
A presentation of the Dutch Techcentre for Life Sciences FAIR Data ecosystem given at the BlueBridge workshop, a pre-event of the Research Data Alliance's 9th Plenary
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
This presentation introduces the main principles of Linked Data, the underlying technologies and background standards. It provides basic knowledge for how data can be published over the Web, how it can be queried, and what are the possible use cases and benefits. As an example, we use the development of a music portal (based on the MusicBrainz dataset), which facilitates access to a wide range of information and multimedia resources relating to music.
The presentation gives an overview of what metadata is and why it is important. It also addresses the benefits that metadata can bring and offers advice and tips on how to produce good quality metadata and, to close, how EUDAT uses metadata in the B2FIND service.
November 2016
BioPharma and FAIR Data, a Collaborative AdvantageTom Plasterer
The concept of FAIR (Findable, Accessible, Interoperable and Reusable) data is becoming a reality as stakeholders from industry, academia, funding agencies and publishers are embracing this approach. For BioPharma being able to effectively share and reuse data is a tremendous competitive advantage, within a company, with peer organizations, key opinion leaders and regulatory agencies. A few key drivers, success stories and preliminary results of an industry data stewardship survey are presented.
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
BioPharma and the broader research community is faced with the challenge of simply finding the appropriate internal and external datasets for downstream analytics, knowledge-generation and collaboration. With datasets as the core asset, we wanted to promote both human and machine exploitability, using web-centric data cataloguing principles as described in the W3C Data on the Web Best Practices. To do so, we adopted DCAT (Data CATalog Vocabulary) and VoID (Vocabulary of Interlinked Datasets) for both RDF and non-RDF datasets at summary, version and distribution levels. Further, we’ve described datasets using a limited set of well-vetted public vocabularies, focused on cross-omics analytes and clinical features of the catalogued datasets.
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Merce Crosas
Presentation for the NFAIS Webinar series: Open Data Fostering Open Science: Meeting Researchers' Needs
http://www.nfais.org/index.php?option=com_mc&view=mc&mcid=72&eventId=508850&orgId=nfais
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Tom Plasterer
Edge Informatics is an approach to accelerate collaboration in the BioPharma pipeline. By combining technical and social solutions knowledge can be shared and leveraged across the multiple internal and external silos participating in the drug development process. This is accomplished by making data assets findable, accessible, interoperable and reusable (FAIR). Public consortia and internal efforts embracing FAIR data and Edge Informatics are highlighted, in both preclinical and clinical domains.
This talk was presented at the Molecular Medicine Tri-Conference in San Francisco, CA on February 20, 2017
Crossref DataCite joint data citation webinarCrossref
Webinar by DataCite and Crossref on how to cite data (for publishers and repositories) and how this information can be used to recognise and credit data creators.
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
A presentation of the Dutch Techcentre for Life Sciences FAIR Data ecosystem given at the BlueBridge workshop, a pre-event of the Research Data Alliance's 9th Plenary
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
This presentation introduces the main principles of Linked Data, the underlying technologies and background standards. It provides basic knowledge for how data can be published over the Web, how it can be queried, and what are the possible use cases and benefits. As an example, we use the development of a music portal (based on the MusicBrainz dataset), which facilitates access to a wide range of information and multimedia resources relating to music.
The presentation gives an overview of what metadata is and why it is important. It also addresses the benefits that metadata can bring and offers advice and tips on how to produce good quality metadata and, to close, how EUDAT uses metadata in the B2FIND service.
November 2016
BioPharma and FAIR Data, a Collaborative AdvantageTom Plasterer
The concept of FAIR (Findable, Accessible, Interoperable and Reusable) data is becoming a reality as stakeholders from industry, academia, funding agencies and publishers are embracing this approach. For BioPharma being able to effectively share and reuse data is a tremendous competitive advantage, within a company, with peer organizations, key opinion leaders and regulatory agencies. A few key drivers, success stories and preliminary results of an industry data stewardship survey are presented.
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
BioPharma and the broader research community is faced with the challenge of simply finding the appropriate internal and external datasets for downstream analytics, knowledge-generation and collaboration. With datasets as the core asset, we wanted to promote both human and machine exploitability, using web-centric data cataloguing principles as described in the W3C Data on the Web Best Practices. To do so, we adopted DCAT (Data CATalog Vocabulary) and VoID (Vocabulary of Interlinked Datasets) for both RDF and non-RDF datasets at summary, version and distribution levels. Further, we’ve described datasets using a limited set of well-vetted public vocabularies, focused on cross-omics analytes and clinical features of the catalogued datasets.
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Merce Crosas
Presentation for the NFAIS Webinar series: Open Data Fostering Open Science: Meeting Researchers' Needs
http://www.nfais.org/index.php?option=com_mc&view=mc&mcid=72&eventId=508850&orgId=nfais
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Tom Plasterer
Edge Informatics is an approach to accelerate collaboration in the BioPharma pipeline. By combining technical and social solutions knowledge can be shared and leveraged across the multiple internal and external silos participating in the drug development process. This is accomplished by making data assets findable, accessible, interoperable and reusable (FAIR). Public consortia and internal efforts embracing FAIR data and Edge Informatics are highlighted, in both preclinical and clinical domains.
This talk was presented at the Molecular Medicine Tri-Conference in San Francisco, CA on February 20, 2017
HDI III - Healthdata.gov - Now, Next and ChallengesGeorge Thomas
This is a presentation that will be given at the 2012 Health Datapalooza (http://hdiforum.org), describing the new healthdata.gov site, its PaaS/DaaS direction, and related i2/ONC developer challenges.
This is a presentation given for a HealthData.gov Developer Challenge; see
http://www.health2con.com/devchallenge/health-data-platform-metadata-challenge/
and/or
http://www.health2con.com/devchallenge/health-data-platform-simple-sign-on-challenge/
(both links contain the same embedded video and deck)
Realizing the GPRAMA using Government Linked DataGeorge Thomas
This presentation was given at the 2011 DoD symposium on SOA & Semantic Technology, and demonstrates the use of open standard metadata tags to implement the Government Performance and Results Act Modernization Act (GPRAMA) using topical examples like cloud computing, and the meaningful use of electronic health record exchanges.
A presentation that captures some basic ideas about connecting planning data with spending data, part of my OMB detail in support of the Obama Administration transparency and open Government goals.
App's, innovation boulevards, free-wifi, ... Heel wat interessante 'nieuwe' technologiën bieden zich aan. Maar wat zijn de mogelijkheden en hoe kan je het introduceren.
Presentatie 'Shop&go' door Axel Weydts - Kortrijk
Providing geospatial information as Linked Open DataPat Kenny
ADAPT is revolutionising the way people can seamlessly interact with digital content, systems and each other and enabling users to achieve unprecedented levels of access and efficiency. - Prof. Declan O'Sullivan, Trinity College Dublin. Address given at Ordnance Survey Ireland GI R&D Initiatives, Tuesday, 22 March 2016, 13:00 to 20:30 (GMT), Maynooth University.
This module supported the training on Linked Open Data delivered to the EU Institutions on 30 November 2015 in Brussels. https://joinup.ec.europa.eu/community/ods/news/ods-onsite-training-european-commission
Linked Open Data Principles, Technologies and ExamplesOpen Data Support
Theoretical and practical introducton to linked data, focusing both on the value proposition, the theory/foundations, and on practical examples. The material is tailored to the context of the EU institutions.
Slides from our tutorial on Linked Data generation in the energy domain, presented at the Sustainable Places 2014 conference on October 2nd in Nice, France
This presentation covers the whole spectrum of Linked Data production and exposure. After a grounding in the Linked Data principles and best practices, with special emphasis on the VoID vocabulary, we cover R2RML, operating on relational databases, Open Refine, operating on spreadsheets, and GATECloud, operating on natural language. Finally we describe the means to increase interlinkage between datasets, especially the use of tools like Silk.
Data Citation Implementation Guidelines By Tim Clarkdatascienceiqss
This talk presents a set of detailed technical recommendations for operationalizing the Joint Declaration of Data Citation Principles (JDDCP) - the most widely agreed set of principle-based recommendations for direct scholarly data citation.
We will provide initial recommendations on identifier schemes, identifier resolution behavior, required metadata elements, and best practices for realizing programmatic machine actionability of cited data.
We hope that these recommendations along with the new NISO JATS document schema revision, developed in parallel, will help accelerate the wide adoption of data citation in scholarly literature. We believe their adoption will enable open data transparency for validation, reuse and extension of scientific results; and will significantly counteract the problem of false positives in the literature.
Linked Data for the Masses: The approach and the SoftwareIMC Technologies
Title: Linked Data for the Masses: The approach and the Software
@ EELLAK (GFOSS) Conference 2010
Athens, Greece
15/05/2010
Creator: George Anadiotis (R&D Director)
morning session talk at the second Keystone Training School "Keyword search in Big Linked Data" held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
Applied semantic technology and linked dataWilliam Smith
Mapping a human brain generates petabytes of gene listings and the corresponding locations of these genes throughout the human brain. Due to the large dataset a prototype Semantic Web application was created with the unique ability to link new datasets from similar fields of research, and present these new models to an online community. The resulting application presents a large set of gene to location mappings and provides new information about diseases, drugs, and side effects in relation to the genes and areas of the human brain.
In this presentation we will discuss the normalization processes and tools for adding new datasets, the user experience throughout the publishing process, the underlying technologies behind the application, and demonstrate the preliminary use cases of the project.
Tutorial presented at 2012 ACM SIGHIT International Health Informatics Symposium (IHI 2012), January 28-30, 2012. http://sites.google.com/site/web2011ihi/participants/tutorials
This tutorial weaves together three themes and the associated topics:
[1] The role of biomedical ontologies
[2] Key Semantic Web technologies with focus on Semantic provenance and integration
[3] In-practice tools and real world use cases built to serve the needs of sleep medicine researchers, cardiologists involved in clinical practice, and work on vaccine development for human pathogens.
It19 20140721 linked data personal perspectiveJanifer Gatenby
A presentation made for Standards Australia's seminar. Outlines the basic aspects of linked data from a personal perspective and where it fits with direct and subject searching.
Muhammad sharif Software Engineer, SKMCHRC. This book is copywrite of Muhammad Sharif
This book title: Database Systems handbook. Other Names are DBMS, RDBMS, Database slides and database management systems, relational database management systems
This is final and 4rth edition of this book.
Similar to Clinical Quality Linked Data on health.data.gov (20)
Implementing the Open Government Directive using the technologies of the Soci...George Thomas
This presentation demonstrates the use of Semantic Web technologies with Social Networking tools, considering metadata specifications as Social Media. Example ontologies and instance data from the Capital Planning and Investment Control and Business Motivation are created that link 'what' (Agency IT investments) with 'why' (Agency goals and objectives), using a simple linking ontology. Knowledge Workers use a Semantic Halo Mediawiki to curate the data.
This presentation is the culmination of my detail to the E-Government Office in the US Office of Management and Budget and the work I did to evolve and mature initiatives like recovery.gov and data.gov.
'Transparency, Participation, Collaboration'
Solution Architecture works in progress for recovery.gov
This is a presentation I gave at the Sunlight Foundations http://transparencycamp.org/ on 2/28/09.
With respect to whether the ideas and approaches I've expressed and advocated here will ultimately be realized by those now responsible for managing and operating this initiative - Caveat Venditor/Emptor.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
The Metaverse and AI: how can decision-makers harness the Metaverse for their...Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
1. Clinical Quality Linked Data (CQLD)
on
health.data.gov
Health Data Initiative Forum Presentation/Demonstration
Margaret Pezeshki, CMS – George Thomas, HHS
2011-06-09
2. Table of Contents
• Part I – Presentation
– A brief overview of Linked Data
– And the Clinical Quality Linked Data project at CMS
• Part II – Demonstration
– Screenshots of scripted interactions highlight features
and functionality
• On health.data.gov
– Links to examples given
• With some „sequence‟ exceptions
– But it gives an idea of what to click on and why
1
3. Part I - Presentation
1. What is Linked Data?
2. Why is Linked Data Useful?
3. Linked Data Mechanics
4. CQLD Features
5. Vision and Mission
6. Acknowledgements
7. References
2
4. [1] What is Linked Data?
• An evolution and maturation of the WWW
– From the Web of Documents
• Where this page <links to> that page
– To the Web of Data
• Where this datum <links to> that datum
• Providing global data identity resolution
– Using „dereferenceable‟ HTTP URI‟s
• Solves data disintegration root cause (local identity = info silo)
– Making fine-grained data uniquely addressable
• On the global network of computers!
3
5. [1] Linked Data Principles
1. Use URI‟s as names for things
2. Use HTTP URI‟s so that people can look up those
names
3. When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
4. Include links to other URI‟s so that they can
discover more things
– Source
• Linked Data Design Issues, Tim Berners-Lee
4
6. [1] Linked Data Standards
• Resource Description Framework (RDF)
– A standard model for data interchange on the WWW
• RDF data is a graph of ‘triples’
– Subject-Predicate-Object (SPO: 1-2-3)
– aka node-arc-node, entity-attribute-value, datum1-link-datum2
• RDF Schema (RDFS)
– A standard for describing RDF Vocabularies
• classes of RDF resources and their relationships
• SPARQL Protocol and RDF Query Language (SPARQL)
– A standard RDF query language and (HTTP) protocol
• SPARQL is to RDF as SQL is to relational data
5
7. [2] Why is Linked Data Useful?
• Turns the Web into a distributed database
– A „giant global graph‟ of „data in the Web‟
• Graphs are flexible, adaptable, agile
– Integrate new data without app/infrastructure changes
• Makes it easy for mashup apps to find/merge data
– Triples normalize conceptual and physical data models
• Access, query, process and persist consistently across domains
• Lightweight, Standard API for all Open Gov Data!
– TCP/IP (net), HTTP (app), RDF (data), SPARQL (query)
6
8. [3] LD Mechanics: NIR <- Conneg -> IR
• HTTP Content Negotiation (conneg)
– When you HTTP GET (aka dereference) a Non-
Information Resource (NIR)
– You receive a HTTP response code “303 See Other”
that includes a “Location: URI”
– You then HTTP GET an Information Resource (IR) at
that location in some serialization format (media-type)
– And you receive a document that contains data about
the NIR
7
9. [4] CQLD URI Scheme Conventions
• NIR‟s are abstract and real world things
– The RDFS class Hospital is an abstract thing
• http://health.data.gov/def/hospital/Hospital
– Here we are defining a vocabulary (ontology) name hospital
which has within it a definition for a class Hospital
» Classes use UpperCamelCase, properties use
lowerCamelCase naming conventions
– „Children‟s Hospital of Philadelphia‟ is a real world
thing that is an instance of the class Hospital
• http://health.data.gov/id/hospital/393303
– Here we are specifying a globally unique identifier for instance
393303 of a hospital class of thing in the real world
8
10. [4] CQLD URI Scheme Conventions
• When you dereference the NIR at;
– http://health.data.gov/id/hospital/393303
• You‟ll get a 303 response with the location of a generic IR at;
– http://health.data.gov/doc/hospital/393303
• That will inform you of all the available representations
• If you request a specific IR representation;
– curl –H „Accept: application/json‟
http://health.data.gov/id/hospital/393303
• You‟ll get a 303 response with the location of a specific IR at;
– http://health.data.gov/doc/hospital/393303.json
9
11. [4] Getting all the Hospital(s) information
An instance of the
classhttp://health.data.gov/def/hospital/Hospital
(or hosp:Hospital for short), like
http://health.data.gov/id/hospital/393303
returns all the info on the left
Its org:siteAddress property
Its hosp:site property links to links to all other VCard info in
/id/hospital/393303/site/1 /393303/site/1/vcard/1
10
12. [4] Defining Conditions, Measures, and Metrics
• A few small component vocabularies define
the classes shown on the right (among others)
– http://health.data.gov/def/compare/Condition
• There are 9 unique instances of this class
• http://health.data.gov/id/condition/1through /id/condition/9
– http://health.data.gov/def/compare/Measure
• There are 2 subclasses, one in a different vocab
• There are 59 unique instances of these classes
• http://health.data.gov/id/measure/1 through /id/measure/59
– http://health.data.gov/def/compare/Metric
• There are 2 subclasses, one in a different vocab
• There are 32 unique instances of these classes
• http://health.data.gov/id/metric/1 through /id/metric/32
11
13. [4] Getting to all Hospital/State/Country Record Data
• Metadata triples (SPO)
– „traversing the graph‟
• Hospital recordset RecordSet
• State recordset RecordSet
• Country recordset RecordSet
– RecordSet record Record
• http://health.data.gov/dataset
– Returns a list of all datasets, where
• /dataset/name/yyyy-mm-dd
– are void:Dataset(s), each linking to
– RecordSet(s) that each link to Record(s) at
• /dataset/name/yyyy-mm-dd/recordset/x/record/y
– http://health.data.gov/dataset/cacn/2011-02-25/recordset/1/record/1
12
14. [4] Other Features and Services
• Faceted search and browsing
– Discover the data models and fine grained data interactively
• Where your click stream is a query/report builder!
• Retrieve (request) multiple representation formats
– Suitable for people and machines
• .html, .csv, .json, .xml, .atom, .ttl, …
• Query points and visualization services
– Find the needle in the haystack
• Query across all data at once, plot results on maps, …
• Feeds, other API‟s, and more –
• Stay for (Part II) a Demonstration!
13
15. [5] Vision: Health Data Publisher Federation
• Imagine a Linked Health Data Cloud, that integrates;
– Independently published data from Doctors, Hospitals, Clinics, Labs,
Insurers, …
14
16. [5] Vision: Health Data Publisher Federation
• Imagine a Linked Health Data Cloud, that integrates;
– Independently published data from Doctors, Hospitals, Clinics, Labs,
Insurers, …
This image is the obligatory depiction of the so-
called ‘Linked Data Cloud’ – which is just one
(public) Web of Data.
Circles are nodes (RDF subjects and objects), lines
are arcs (RDF predicates). Colored nodes are
different data domains (kinds of publishers).
The vision expressed here would be realized as a
different (public/private) Web of Data, populated
by authoritative health information publishers.
15
17. [5] Mission: Health IT
• Linked Data implements SOA
– As RESTful Web Services;
• That provide access to resource representations via HTTP‟s
uniform interface, and use hyperlinks as the engine of
application state (HATEOS)
• Data as a Service (DaaS)
• Linked Data implements PCAST recommendations
– Health IT Report
• „Data Element Access Services‟ (Faceted Browsing)
– Universal Exchange Language
• „Metadata tagged‟ (RDF/XML and RDF Schemas)
16
18. [6] Acknowledgements – Special Thanks to:
• HHS
– CTO Todd Park, OCIO John Teeter, Mary Forbes
– CMS OCSQ Glenn Sperle
• OMB and GSA
– Vivek Kundra, Data.gov PMO
• W3C
– Sir Tim Berners-Lee, Sandro Hawke
• RPI
– Professor Jim Hendler and the TWC
17
19. [6] Tools We Used – Special Thanks to:
• OpenLinkSW Virtuoso
– Quad store, HTTP conneg / url_rewrite rules
– Faceted search and browse, REST API‟s
• Top Quadrant Top Braid Composer
– Vocabulary modeling (RDFS), initial (inference/query) testing
– Jena (HP) schemagen .rdfs to .java, ETL .tsv to .rdf/ttl
• DERI LiDRC
– Google Refine + DERI RDF extension
• Graph prototyping, source.tsv lifting
18
20. [7] References – Learn More
• Data.gov
– http://www.data.gov/semantic
– http://logd.tw.rpi.edu
• W3C
– http://www.w3.org/wiki/LinkedData
– http://www.w3.org/participate/podcastsvideo.html
• Community sites
– http://linkeddata.org
– http://semanticweb.com
• Books
– Linking Enterprise Data
– Evolving the Web into a Global Data Space
19
21. Part II – Demonstration
1. Exploring the Vocabularies
2. Exploring the Datasets
3. Exploring Multiple representations
4. Query Points
5. Faceted Search
6. Faceted Browsing
7. Other API‟s
8. Contact
20
23. [1] Downloading Hospital Compare Vocabs
• http://health.data.gov/def/cqld.rdfs
• http://health.data.gov/def/hospital.rdfs
• http://health.data.gov/def/compare.rdfs
• http://health.data.gov/def/hospital-compare.rdfs
• http://health.data.gov/def/medicare.rdfs
• http://reference.data.gov/def/govdata.rdfs
– Note: a common (best) practice redirects requests to any
classes/properties to their respective locations above…
• We’ve diverged from this initially to enable data model browsing
– and provided rdfs:seeAlso links to the vocab.rdfs file URL’s above
22
24. [1] http://health.data.gov/def/cqld/
The /cqld/ vocab is intended as a ‘bridge’
vocabulary – things that relate concepts
in other vocabularies are (or will be)
specified here as we add more vocabs for
other Clinical Quality domains.
At the moment, it mostly just imports the
other vocabularies, including those we
created and reused, which makes it a
convenient starting place when using an
ontology editing tool.
23
25. [1] http://health.data.gov/def/hospital/
The /hospital/ vocab contains mostly
properties that capture values for
general hospital information, and
categorize the kind of ownership and
types of services a hospital provides.
24
26. [1] http://health.data.gov/def/compare/
A general purpose vocabulary
called /govdata/ captures classes
that are common to all open
government data.
The /compare/ vocab contains
general classes that are relevant
to any Clinical Quality domain,
not just hospitals – such as
Conditions, Measures, and
Metrics.
These concepts get subclassed in
the /hospital-compare/ vocab,
where they’re specialized for the
Hospital domain.
25
28. [1] http://health.data.gov/def/hospital/Hospital
The default html view of any
class within any vocabulary,
(like hosp:Hospital shown
here) will provide some
metadata about the class
(like what vocab it’s defined
by), and will also provide a
list of intances you can page
through or click to access.
27
29. [2] Exploring Datasets Over Time
• Get a list of all the source RDF/XML files
– http://health.data.gov/dataset
• Click on one
– To see it‟s foaf:homepage, like
• http://health.data.gov/dataset/cacp
– Then select a void:Dataset with any dcterms:issued date
• http://health.data.gov/dataset/cacp/2010-11-16
– Then follow the void:exampleResource link to
• http://health.data.gov/dataset/cacp/2010-11-16/recordset/1/record/1
– To look at the data in one of its hoco:Record(s)
28
30. [2] Getting to all Hospital/State/Country Record Data
• Metadata triples (SPO)
– „traversing the graph‟
• Hospital recordset RecordSet
• State recordset RecordSet
• Country recordset RecordSet
– RecordSet record Record
• http://health.data.gov/dataset
– Returns a list of all datasets, where
• /dataset/name/yyyy-mm-dd
– are void:Dataset(s), each linking to
– RecordSet(s) that each link to Record(s) at
• /dataset/name/yyyy-mm-dd/recordset/x/record/y
– http://health.data.gov/dataset/cacp/2010-11-16/recordset/1/record/1
29
31. [2] http://health.data.gov/dataset
Let’s check out /cacp
Go to:
http://health.data.gov/dataset
to get a list of all the available
source data files
30
32. [2] http://health.data.gov/dataset/cacp
Follow the link from the
/cacp homepage to
a dataset issued
on a particular
/yyyy-mm-dd date –
these will accumulate
over time…
31
35. [3] http://health.data.gov/doc/hospital/393303
The default representation for
any resource is HTML
(more specifically XHTML+RDFa),
where data linked to and from
various properties will be given.
A number of other resource
representations are also
available – see below.
But first, let’s take a closer look at
this representation – by hitting
‘view source’ (see next slide):
34
36. [3] http://health.data.gov/doc/hospital/393303.html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" version="XHTML+RDFa 1.0">
<head>
<title>About: CHILDREN'S HOSPITAL OF PHILADELPHIA</title>
[snip other header stuff] This XHTML+RDFa
</head> representation is
human and
<body about=http://health.data.gov/id/hospital/393303>
machine readable!
[snip table-row-display-unnumbered-list-list-item stuff]
<span property="rdfs:label" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">CHILDREN'S HOSPITAL OF PHILADELPHIA</span>
<span property="ns5:cacProvider" xmlns:ns5="http://health.data.gov/def/hospital/">1</span>(xsd:integer)</span>
<span property="ns6:emergencyServices" xmlns:ns6="http://health.data.gov/def/hospital/">1</span>(xsd:integer)</span>
[snip]
</body>
</html>
35
37. [3] http://health.data.gov/doc/hospital/393303.html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" version="XHTML+RDFa 1.0">
<head>
<title>About: CHILDREN'S HOSPITAL OF PHILADELPHIA</title>
[snip other header stuff]
</head>
<body about=http://health.data.gov/id/hospital/393303>
[snip table-row-display-unnumbered-list-list-item stuff]
<span property="rdfs:label" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">CHILDREN'S HOSPITAL OF PHILADELPHIA</span>
<span property="ns5:cacProvider" xmlns:ns5="http://health.data.gov/def/hospital/">1</span>(xsd:integer)</span>
<span property="ns6:emergencyServices" xmlns:ns6="http://health.data.gov/def/hospital/">1</span>(xsd:integer)</span>
[snip]
</body>
</html>
36
39. [3] http://health.data.gov/doc/hospital/393303.atom
New hoco:RecordSet(s) will accumulate over
time about/around each hosp:Hospital instance,
accessed at their respective feed URLs.
This will also occur for feeds of other type
instances, like void:Dataset(s), and for lists of
class instances, providing an easy way to be
notified as new comp:Measure(s) and
comp:Metric(s) get introduced (for example).
{Note: this last part is only partially implemented
– we’re still tweaking configs and queries! }
38
41. [4] Using the /isparql Visualization Service
Go to any
/id/hospital/{guid}/site/1/vcard/1
Then click on the ‘iSPARQL’ link
And select Google Maps View:
(see next slide)
40
43. [4] http://health.data.gov/isparql - Visual Query Service
Here you can enter your
SPARQL query and the /isparql
service will draw your graph
when you select the ‘QBE’ tab
-or vice versa -
You can also use the visual
graph tools and /isparql will
generate your SPARQL query
(see next slide).
42
45. [4] /isparql – Text Query Generation and Execution
This query finds a needle in the
RDF haystack. It asks the
iSPARQL Service to return;
Any Hospitals in DC that have
any 30-day Readmission
Rate(a specific Measure)
Records where the
readmission percentage is
worse than the US national
rate (a specific Metric).
44
47. [5] /facet Search – Keyword Text
Use the Faceted Search service at
http://health.data.gov/facet
for executing normal keyword
searches, such as the search for
keyword ‘hospital’ shown here.
46
48. [5] /facet Search – Keyword Results
The highest ranked
match for keyword
‘hospital’ is the URI for
the Hospital class in the
hospital vocabulary –
hosp:Hospital
When you click that
link, you land on a
page about the
hosp:Hospital
class, see next slide…
47
49. [5] /facet Search – from Keyword Text to Structured Data
This page tells you that hosp:Hospital is a
(rdf:type) rdfs:Class …
…and which RDFS Vocabulary this class is
defined by…
…and also provides a list of all the instances
of this class that you can page through
or click through to.
48
50. [5] /facet Search – from Metadata Labels
You can also use the Faceted Search service at
http://health.data.gov/facet
for executing rdfs:label searches, such as the search for
label ‘Hospital’ shown here, which displays instant results,
showing on the left the matching rdfs:label of the
hosp:Hospital class, followed by anything else having the
string ‘Hospital’ in its rdfs:label. Selecting the highlighted
suggestion takes you to the next slide…
49
51. [5] /facet Search – to Metadata Classes
And we’re back on the page about the
hosp:Hospital class as expected…
50
52. [5] /facet Search – URI Lookup
Lastly, you can also use the Faceted Search service at
http://health.data.gov/facet
for executing URI Lookup searches, such as the search for
http://health.data.gov/def/hospital/ shown here, which
also displays instant results, beginning with the class
Hospital which is specified in that vocabulary, followed by
other entities in that vocab. Selecting the highlighted
suggestion takes you once again to…
51
53. [6] /facet Browsing – Interactive Query Builder
The hosp:Hospital class page. From here,
you can ‘follow your nose’ and traverse the
graph by dereferencing links on this page,
which is fundamentally the most basic form
of ‘faceted browsing’.
52
54. [6] /facet Browsing – Interactive Query Builder
This time, let’s try something new
and click on the
‘Start faceted browsing from this Type’ link
53
55. [6] /facet Browsing – All Instances of hosp:Hospital
The results on the bottom left match this triple:
(any) Entity1 (that) is a (rdf:type) hosp:Hospital
54
56. [6] /facet Browsing – All Instances of hosp:Hospital
Now, under the
Entity Relations Navigation
area on the right;
Step 1. click on Attributes
to see what properties (arcs) are
connected to Entity1 instances (nodes),
which will find anything that links to the
class hosp:Hospital
55
57. [6] /facet Browsing – hosp:Hospital Arcs
Step 2. Select the rdfs:label link to add another triple
to the graph matching pattern at the top left.
Step 3. ‘Refocus’ your node by clicking on Entity1 (top
left) again, and then select Attribute again top right,
returning to this screen.
Step 4. Select the mpvProvider arc to add yet
another triple to the pattern at the top left.
You’re building a simple SPARQL query, which so far
should start to look like the next slide...
56
58. [6] /facet Browsing – Filtering Property Values
Step 5. Click on Entity3, which is the
object of the triple with subject Entity1
and predicate hosp:mpvProvider to
focus it, then select the value ‘1’ from the
left, to assert that the object (node) of
the hosp:mpvProvider property (arc)
must have value ‘1’, and your screen
should look as it does here.
The list at the bottom left now displays
only the names of hospitals that provide
Medicare payment and volume info.
57
59. [6] /facet Browsing – adding Property/Values
Step 6. Refocus (by clicking on) Entity1 again, and select
Attributes from the right again, this time selecting the
gd:stateCode property.
Step 7. Focus Entity4 and assign it’s value to be ‘CA’ for
California, by selecting the first State in the new bottom left list .
Step 8. Focus Entity1 again, select Attributes again, and this time
select the hoco:recordset property
Step. 9 Focus Entity5, select Attributes, and select the record
property
And your screen should look as it does in the next slide…
58
60. [6] /facet Browsing – Bookmark Your Query
You can save and
bookmark the queries you
generate from browsing
facets!
59
61. [6] /facet Browsing – Examine Your Query
Now that you’ve generated a list (on the left) of all
the records for all the hospitals in California that are
Medicare payment volume providers using the
faceted browsing feature (aka ‘data surfing’);
Take a look at the SPARQL query you generated by
clicking on the ‘View query as SPARQL’ link -
Which takes you to the next slide…
60
62. [6] /sparql Endpoint – Select a Response Format
You’ll end up at the SPARQL endpoint service at
http://health.data.gov/sparql
where you can select multiple query response
formats – we’ll select JSON, since it’s such a popular
format for mashup apps.
61
63. [6] /sparql Endpoint – Run Your Query
You might also want to
make minor edits to the
SPARQL query you
generated, reformating
it for readability, making
variable names that get
used as results headers
more meaningful to
humans, before you
click ‘Run Query’ (on the
bottom left)
62
64. [6] /sparql – Mash on Your Results
Which (as you might
expect) produces some
lovely RDF data in a
JSON format for you to
use in further application
processing, visualization
tools, or what have you.
Mash away!
63
65. [7] Facet Service API
• Retrieve a list of Hospitals (instances of type Hospital)
– Using cURL and your favorite terminal shell:
curl -H "Content-Type: text/xml" -d @post.xml http://health.data.gov/fct/service
– Where post.xml is a simple „Plain Old XML‟ (POX) file
containing the following query;
<?xml version="1.0"?>
<query>
<class iri="[hosp:Hospital]"/>
<view type="list"/>
</query>
64
66. [7] Facet Service API
• And get POX returned
<fct:facets>
<fct:result>
[...snip]
<fct:row>
<fct:column datatype="url”>http://health.data.gov/id/hospital/393303</fct:column>
</fct:row>
<fct:row>
<fct:column datatype="url”>http://health.data.gov/id/hospital/160156</fct:column>
</fct:row>
[snip...]
</fct:result>
</fct:facets>
• See Facet Service API docs
65