Healthcare Data Management using Domain Specific
Languages for Metadata Management
David Milward
This talk looks at:
• human factors of DSLs
• tool support for DSL users
• studies of usability and other benefits of DSLs
• experience reports of DSLs deployed in practice
Overview
• A data dictionary is not enough
• A catalogue is not enough
• The tools must be usable by domain experts.
• There will be more models than you think
Within our language the prime artefacts are DataItems and Relationships, so it is essential that
every relationship is valid; we therefore need a notion of referential integrity.
Referential integrity is provided out of the box by Xtext, which builds on the Eclipse Modeling Framework
(EMF), which in turn has an API for validating Ecore models. This API can be accessed directly for other validation,
which avoids complicated grammar rules, since DSLs in Xtext are defined by grammar rules (similar to ANTLR).
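The referential-integrity requirement can be illustrated outside Xtext with a minimal Python sketch. It is not the Xtext or EMF API: the `DataItem` and `Relationship` classes, their fields, and the `dangling_references` helper are hypothetical names chosen for illustration, and the check shown (every relationship endpoint must name a declared data item) is only the simplest form of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    name: str


@dataclass(frozen=True)
class Relationship:
    kind: str    # e.g. "refines" (hypothetical relationship kind)
    source: str  # name of the referring data item
    target: str  # name of the referenced data item


def dangling_references(items, relationships):
    """Return (relationship, missing_name) pairs for endpoints that name no declared item."""
    declared = {item.name for item in items}
    problems = []
    for rel in relationships:
        for endpoint in (rel.source, rel.target):
            if endpoint not in declared:
                problems.append((rel, endpoint))
    return problems


items = [DataItem("PhenotypicSex"), DataItem("PersonStatedGender")]
relationships = [
    Relationship("refines", "PhenotypicSex", "PersonStatedGender"),
    Relationship("refines", "PhenotypicSex", "KaryotypicSex"),  # target never declared
]

for rel, missing in dangling_references(items, relationships):
    print(f"{rel.kind}: unresolved reference to {missing}")
```

In a language workbench this kind of check comes with the cross-reference machinery, so it does not have to be written by hand; the sketch only shows what is being guaranteed.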
Refines: the idea of "can I use data item A instead of data item B?"
Example: Sex is classified in a number of ways in NHS datasets; for Genomics England the key ones are:
1. Phenotypic Sex: 2 Female, 1 Male, 9 Indeterminate
2. Person Stated Gender: 1 Male, 2 Female, 9 Indeterminate (unable to be classified as either male or female), X Not Known (PERSON STATED GENDER CODE not recorded)
3. Person Karyotypic Sex: XY, XX, XO, XXY, XYY, XXX, XXYY, XXXY, XXXX, other, unknown
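One way to make the "refines" question concrete is a purely structural check on the code lists. The sketch below is an assumption about how such a check could look, not the definition used in the toolkit: `refines` and `recode` are hypothetical helpers, and matching on labels says nothing about whether one classification is clinically acceptable in place of another, which remains a decision for domain experts.

```python
# Code lists taken from the example above (code -> meaning).
PHENOTYPIC_SEX = {"1": "Male", "2": "Female", "9": "Indeterminate"}
PERSON_STATED_GENDER = {
    "1": "Male",
    "2": "Female",
    "9": "Indeterminate",
    "X": "Not Known",
}


def refines(a, b):
    """True if every meaning in enumeration `a` also appears in enumeration `b`."""
    return set(a.values()) <= set(b.values())


def recode(value, a, b):
    """Translate a code from enumeration `a` to the code with the same meaning in `b`."""
    meaning = a[value]
    for code, label in b.items():
        if label == meaning:
            return code
    raise ValueError(f"{meaning!r} has no counterpart in the target enumeration")


print(refines(PHENOTYPIC_SEX, PERSON_STATED_GENDER))
print(refines(PERSON_STATED_GENDER, PHENOTYPIC_SEX))
```

The check is deliberately asymmetric: "Not Known" exists only in Person Stated Gender, so values can be carried in one direction without loss but not in the other.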
A very common Groovy "builder" pattern is used for building the request object.
This code is used behind the scenes to fetch details of data items, in particular
the regular expressions used in the DataElement specification; these regular
expressions are then used to generate "fake data".
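Generating fake data from a regular expression can be sketched as follows. This is a Python illustration rather than the Groovy code from the talk, and it handles only a small regex subset (literal characters, `[...]` classes with ranges, `\d`, and `{n}` or `{m,n}` repetition). The function names and the example pattern are hypothetical, not taken from any real dataset specification.

```python
import random
import re


def expand_class(body):
    """Expand a character-class body such as 'A-Z0-9' into a list of characters."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def fake_value(pattern, rng):
    """Generate one string matching a small regex subset (see the lead-in)."""
    out = []
    i = 0
    while i < len(pattern):
        # Work out which characters are allowed at this position.
        if pattern[i] == "[":
            end = pattern.index("]", i)
            choices = expand_class(pattern[i + 1:end])
            i = end + 1
        elif pattern.startswith("\\d", i):
            choices = list("0123456789")
            i += 2
        else:
            choices = [pattern[i]]
            i += 1
        # Work out how many times the position repeats.
        low = high = 1
        if i < len(pattern) and pattern[i] == "{":
            end = pattern.index("}", i)
            bounds = pattern[i + 1:end].split(",")
            low, high = int(bounds[0]), int(bounds[-1])
            i = end + 1
        count = rng.randint(low, high)
        out.extend(rng.choice(choices) for _ in range(count))
    return "".join(out)


rng = random.Random(0)           # seeded so runs are repeatable
pattern = r"[A-Z]{2}\d{6}"       # hypothetical identifier format
value = fake_value(pattern, rng)
print(value, bool(re.fullmatch(pattern, value)))
```

Because each value is built from the same expression that validates real data, the generated records pass the format checks in the specification by construction, which is what makes them useful as test input.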
http://metadataconsulting.github.io/spreadsheet-builder/
https://github.com/dsl-builders/spreadsheet/blob/documentation/sheet.rest/docs/index.adoc
https://github.com/MetadataConsulting/ModelCataloguePlugin
david.milward@cs.ox.ac.uk
Thank you

Healthcare Data Management using Domain Specific Languages for Metadata Management. - Splash2018 DSLDI Workshop

Editor's Notes

  1. My name is David Milward. I'm a PhD (DPhil) student at the Oxford University Department of Computer Science, and a rather ancient student: I previously worked in data interoperability for about 10 years, primarily for NATO. NATO is an organization with a large number of members who all want to share data; that is to say, they want to see everyone else's data, but they don't want anyone looking at theirs. For the last 5 years I have worked in the healthcare sector, exploring new ways of managing and integrating datasets, some of which involve Domain Specific Languages, which is why I am here talking.
  2. I've partitioned the talk into the following sections for clarity and reference. In fact I am going to be telling a story over the next 25 minutes. It starts in late 2013/2014, when I was a full-time DPhil (PhD) student at Oxford and was asked to provide a DSL to represent the ISO11179 standard for metadata registries, specifically in connection with a project my supervisor was conducting at the Oxford Biomedical Research Centre (BRC). We submitted a paper to this same conference in 2015 called "Domain-Specific Modelling for Clinical Research", which my colleague at the time, Dr Sayeed Shah, presented. The Oxford BRC continued working with Genomics England until early 2016, at which point they asked me and some colleagues to take over the work, since it was no longer deemed "research". We formed a small company to carry on this work with Genomics England and a number of other healthcare organizations; the core work has gone into a toolkit, which I will demonstrate briefly at the end of this talk. The story is not just about domain specific languages: it is about a project that has used DSLs, and it documents that experience. We ended up using 3 DSLs, 1 based on the Xtext workbench and 2 as Groovy (internal) DSLs. I'm going to tell the story from the beginning to give context, and then cover the DSLs in turn in more depth. Finally I will give a quick toolkit demonstration, showing how DSLs can be used in dataset validation.
  3. The original project was to integrate datasets from 5 different healthcare trusts. The approach was to write a small mini-language to try to capture the essence of ISO11179, the standard for metadata registries. The standard had already been used in an exploratory project at Oxford, sponsored by Cancer Research UK, and had been used in the US for caCORE (G. A. Komatsoulis, D. B. Warzel, F. W. Hartel, K. Shanbhag, et al. caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability. Journal of Biomedical Informatics, 41(1):106–123, 2008), so we were following up these lines of work. The initial problems were that the core meta-model defined in ISO11179 was internally inconsistent and impossible to implement fully. The first prototype resulted in many data items having to be entered two or three times, hence we trimmed it back to Iteration 1. At this point the team at Genomics England looked at our results and decided they wanted to sponsor the project going forward, this time to integrate 10-12 different datasets. However, the terminology was still difficult, so we started from scratch.
  4. The problem: The left column showed the main artefacts used to express datasets using ISO11179 The middle column was our first iteration The RHS column was out second (successful) iteration. Names : Model Catalogue, Metadata Exchange, Metadata Registry, MML (Metadata Modelling Language), MDML (Metadata Management Language)
  5. By this point in time the metadata registry had morphed into the Model Catalogue. The results in the 2015 paper are listed as follows. A data dictionary is not enough. A simple, flat list of data definitions does not support re-use at scale: it requires the user to place all of the contextual information into the definition of each data item, and militates against the automatic generation and application of definitions. Instead, a compositional approach is required, in which data elements are defined in explicit context. A catalogue is not enough. The models in the catalogue must be linked to implementations, and to each other, with a considerable degree of automatic support. If the models are out of sync with the implementations, and with the data, then their value is sharply diminished. If you are going to manage data at scale, you need a data model-driven approach. The tools must be usable by domain experts. To have the processes of model creation and maintenance mediated by software engineers is problematic: there may be misunderstandings regarding interpretation, but, more importantly, there are not enough software engineers to go around. An appropriate user interface, one that closely matches the intuition and expectations of domain experts, is essential. There will be more models than you think. Different models will be required for different types of implementation, and, in any research domain at least, data models will be constantly evolving, with data being collected against different versions. Intelligent, automatic support is essential. The information content of precise data models is considerable, and there may be complex dependencies between data concepts and constraints. A considerable degree of automation is required if users are to cope with this complexity.
  6. The data standards being used in healthcare originate from different specialist areas, use different formats and have many overlaps, resulting in a number of different viewpoints over which standards are more or less useful. This list gives an overview of the main standards currently used in the UK NHS. Each standard has a different history, a different set of demands and a different ‘language’.
  7. A quick example would be the idea of pre-coordinated and post-coordinated terms in SNOMED CT.
  8. This is the kind of existing dataset that needs to be managed. It is presented in spreadsheet form; the headers provide the definition of the metamodel, which is then transformed into the MDML format. An example will be shown later on.
  9. Continuing the idea that Xtext builds on EMF, we are able to enforce referential integrity out of the box. After the first two iterations we made a detailed examination of what kind of language was needed, and wrote the specification up in a formal language called Z. The Z specification allowed us to work out what was required in the initial language. Referential integrity was required, as expressed in the snippet here.
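The referential-integrity requirement can be illustrated with a small sketch. This is not the Xtext/EMF mechanism or the Z specification from the slide; it is a minimal Python illustration, assuming data items are identified by name and that each relationship has a source and a target. The `Relationship` class and `dangling_references` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical relationship record: names are assumptions for illustration."""
    kind: str    # e.g. "refines"
    source: str  # name of the source data item
    target: str  # name of the target data item


def dangling_references(data_items, relationships):
    """Return relationships whose source or target is not a declared data item."""
    declared = set(data_items)
    return [
        r for r in relationships
        if r.source not in declared or r.target not in declared
    ]


items = ["PhenotypicSex", "PersonStatedGender"]
rels = [
    Relationship("refines", "PhenotypicSex", "PersonStatedGender"),
    # The target below is never declared, so this relationship is invalid.
    Relationship("refines", "PhenotypicSex", "KaryotypicSex"),
]
broken = dangling_references(items, rels)
```

A model has referential integrity in this sense exactly when `dangling_references` returns an empty list.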
  10. This
  11. Key features. Unique ID for a DataModel: built in, with version and status. Status relates to lifecycle: draft, finalized, superseded (deprecated). GUID: unique identifier for a data item, e.g. status=draft 123123@0.01, or status=finalized 123123@1.00. Refines relates to the idea of whether data item A can be used in place of data item B, e.g. sex: Phenotypic Sex (1 Male, 2 Female, 9 Indeterminate); Person Stated Gender (1 Male, 2 Female, 9 Indeterminate (unable to be classified as either male or female), X Not Known (PERSON STATED GENDER CODE not recorded)); Person Karyotypic Sex (XY, XX, XO, XXY, XYY, XXX, XXYY, XXXY, XXXX, other, unknown).
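To make the versioned identifier concrete, here is a minimal Python sketch. It assumes the identifier has the shape `<id>@<major>.<minor>`, which is inferred only from the two examples on the slide (123123@0.01 and 123123@1.00); the real format may differ. The names `VersionedId`, `parse_versioned_id` and `same_item` are hypothetical.

```python
import re
from typing import NamedTuple


class VersionedId(NamedTuple):
    item_id: str
    version: str


# Assumed shape: digits, "@", then a dotted version such as 0.01 or 1.00.
_ID_PATTERN = re.compile(r"^(\d+)@(\d+\.\d+)$")


def parse_versioned_id(text):
    """Split an identifier such as '123123@0.01' into its id and version parts."""
    match = _ID_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a versioned identifier: {text!r}")
    return VersionedId(match.group(1), match.group(2))


def same_item(a, b):
    """True when two identifiers refer to versions of the same data item."""
    return parse_versioned_id(a).item_id == parse_versioned_id(b).item_id
```

The sketch deliberately does not derive the status from the version number, since the slide gives status and version as separate pieces of information.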
  12. So when we are mapping and transforming datasets, refinement relationships can be defined. This idea has been removed from the latest versions, but may be re-introduced.
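One possible reading of "refines" is a check on permitted values: item A can stand in for item B if every code A permits is also permitted by B with the same meaning. The Python sketch below assumes that reading; it is purely a value-set comparison, not a statement about clinical equivalence, and `can_substitute` is a hypothetical name. The code lists are the Phenotypic Sex and Person Stated Gender values quoted in the talk.

```python
# Permitted codes and their labels, as quoted in the talk.
PHENOTYPIC_SEX = {"1": "Male", "2": "Female", "9": "Indeterminate"}
PERSON_STATED_GENDER = {
    "1": "Male",
    "2": "Female",
    "9": "Indeterminate",
    "X": "Not Known",
}


def can_substitute(candidate, target):
    """True if every code in `candidate` exists in `target` with the same label."""
    return all(target.get(code) == label for code, label in candidate.items())


forward = can_substitute(PHENOTYPIC_SEX, PERSON_STATED_GENDER)
backward = can_substitute(PERSON_STATED_GENDER, PHENOTYPIC_SEX)
```

Under this reading the relationship is directional: the code X (Not Known) has no counterpart in Phenotypic Sex, so the check only passes one way.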
  13. Relationships can be varied, and can map between different elements or data items. ExtensionItems are in effect pure “metadata” and enable transformations to work effectively: anything not expressible in the target dataset is stored in an ExtensionItem.
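The role of ExtensionItems in a transformation can be sketched as follows. This is a minimal Python illustration, assuming a record is a flat mapping of field names to values and the target dataset is described by a set of allowed field names; the `transform` function and the field names are hypothetical.

```python
def transform(record, target_fields):
    """Split a record into fields the target can express and leftover extensions."""
    fields = {k: v for k, v in record.items() if k in target_fields}
    # Anything the target dataset cannot express is kept rather than dropped.
    extensions = {k: v for k, v in record.items() if k not in target_fields}
    return {"fields": fields, "extensions": extensions}


source_record = {"sex": "2", "dateOfBirth": "1980-01-01", "localNote": "see file"}
result = transform(source_record, target_fields={"sex", "dateOfBirth"})
```

Because nothing is discarded, the original record can be rebuilt by merging `fields` and `extensions`, which is what lets a round-trip transformation remain lossless.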
  14. Status: refer to the previous explanation. Constraints allow (a) constraints upon element types, i.e. a DataType can be constrained by a regular expression, and (b) a number of DataElements to be constrained as a group within a DataModel.
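The two kinds of constraint can be sketched in Python. The regular expression and field names below are illustrative assumptions, not taken from any real dataset, and "at least one of the group must be present" is just one example of what a group constraint might express. `check_value` and `at_least_one_present` are hypothetical names.

```python
import re


def check_value(pattern, value):
    """(a) DataType constraint: the whole value must match the regular expression."""
    return re.fullmatch(pattern, value) is not None


def at_least_one_present(record, element_names):
    """(b) Group constraint (assumed form): at least one element has a value."""
    return any(record.get(name) not in (None, "") for name in element_names)


TEN_DIGITS = r"[0-9]{10}"  # illustrative pattern only

ok_value = check_value(TEN_DIGITS, "0123456789")
bad_value = check_value(TEN_DIGITS, "12345")
group_ok = at_least_one_present({"phone": "", "email": "a@example.org"}, ["phone", "email"])
```

Using `re.fullmatch` rather than `re.match` means a value with trailing characters after a valid prefix is rejected.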
  15. You can also write this from a Groovy script; see the web page documentation at the end. Future work: to put a direct call into the metadata registry to get the item, i.e. within the DSL.
  16. What is happening is that the named item is searched for, its ID is found, and then a request is made to the server to obtain its specification.
  17. This is the fake data being generated from the previous DSL.