V.2.2
Eric Little, PhD
Chief Data Officer
OSTHUS
eric.little@osthus.com
From Allotrope to Reference Master
Data Management:
How semantic metadata in .adf can
be extended across the enterprise
Slide 2
LIMS
Studies
Registration
The Silo Situation: Expensive, Ineffective and Error Prone
… ISO3: DEU, FRA, …
… Country: Germany, France, …
… ISO3-num: 276, 250, …
?
?
• Applications use different names for the same things.
• Data exchange is expensive and limited (mapping knowledge in interfaces).
Slide 3
Situation with Semantic Reference Master Data Management
LIMS
Study
Management
Product
Registration
DataGovernance
Semantic Reference Master Data System
“France”@en
“FRA”
“250”
…
“EU”
“European Union”@en
registered
“AAFYZ-1217”
products locations
Value of Semantics:
• standardized naming conventions
for your core entities
• standardized meta models
(vendor agnostic)
• reuse of public ontologies (see
e.g., BioPortal)
• well defined hierarchies
• synonyms & mappings
• qualified relationships
• flexibility of graph models
• rules and inference
• data validation
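As a minimal sketch of the "synonyms & mappings" idea, the snippet below keeps every name and code for a country under one reference entity, so that "FRA", "250" and "France" resolve to the same thing. The class names, the `ex:` identifiers and the `same_thing` helper are hypothetical and only illustrate the principle; a real reference master data system would typically sit on a graph store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEntity:
    """One master-data entity plus every label applications use for it."""
    uri: str            # hypothetical identifier, e.g. "ex:France"
    labels: frozenset   # names, synonyms and codes


class ReferenceMaster:
    """Tiny in-memory stand-in for a semantic reference master data system."""

    def __init__(self):
        self._by_label = {}

    def register(self, uri, *labels):
        entity = ReferenceEntity(uri, frozenset(labels))
        for label in labels:
            # Case-insensitive lookup key for each label.
            self._by_label[label.casefold()] = entity
        return entity

    def resolve(self, label):
        """Return the entity a label refers to, or None if unknown."""
        return self._by_label.get(label.casefold())

    def same_thing(self, a, b):
        """True when two labels from different systems denote one entity."""
        ea, eb = self.resolve(a), self.resolve(b)
        return ea is not None and ea is eb


master = ReferenceMaster()
master.register("ex:France", "France", "FRA", "250")
master.register("ex:Germany", "Germany", "DEU", "276")

print(master.same_thing("FRA", "250"))   # True
print(master.same_thing("FRA", "276"))   # False
```

With this in place, the mapping knowledge lives in one governed location instead of inside each point-to-point interface.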
Slide 4
Documents are processed for term/concept extraction
Extracted concepts are checked for accuracy
A Gold Standard Doc is created by a human – fully accurate reading
Documents are re-run based on human/machine corrections
Machine Learning improves performance over time
How Text Extraction Basically Works (highly simplified version)
Documents
Text
Analytics
Engine
• Strains
• Persons
• Organizations
• Seasons
• Locations
• Etc…
Gold
Standard
Document
Extracted
Entities
Human In
The Loop
Feedback Loop for
Learning/Improvement
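One way to picture the accuracy check against the Gold Standard Document is a precision/recall comparison between the machine's extracted entities and the human reading. This is a rough sketch, not the method used by any particular text analytics engine; the function name and the sample entities are illustrative assumptions.

```python
def score_extraction(extracted, gold):
    """Compare machine-extracted entities with a human gold standard.

    Both arguments are iterables of (entity_type, text) pairs.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = extracted & gold
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    recall = len(true_positives) / len(gold) if gold else 0.0
    return {
        "precision": precision,
        "recall": recall,
        # Items a human would feed back into the learning loop.
        "missed": sorted(gold - extracted),
        "spurious": sorted(extracted - gold),
    }


# Illustrative data only.
gold = {
    ("Person", "Eric Little"),
    ("Organization", "OSTHUS"),
    ("Location", "France"),
}
machine = {
    ("Person", "Eric Little"),
    ("Organization", "OSTHUS"),
    ("Location", "Allotrope"),   # wrong type: a spurious hit
}

report = score_extraction(machine, gold)
print(report["missed"], report["spurious"])
```

The `missed` and `spurious` lists are what the human in the loop would correct before the documents are re-run.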
Slide 5
Extracted entities from the text source are stored in a DB or File Store
They are mapped to other data
 Legacy RDBs
 Semantic Models (shown here)
 Other data sources
The semantic model adds context to the extracted information
 A term can now be related to other objects from other sources
Linking to Semantics (Knowledge Graph)
Semantic Model
Documents
Text
Analytics
Engine
• Strains
• Persons
• Organizations
• Seasons
• Locations
• Etc…
Extracted
Entities
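The linking step can be sketched as a lookup from an extracted term to a node in the semantic model, followed by a walk over that node's relationships. The triples, the `ex:` prefixes and the predicate names below are hypothetical placeholders, not part of any published ontology.

```python
# Hypothetical semantic model as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:France", "rdfs:label", "France"),
    ("ex:France", "ex:iso3", "FRA"),
    ("ex:France", "ex:memberOf", "ex:EU"),
    ("ex:EU", "rdfs:label", "European Union"),
    ("ex:Product1", "ex:registeredIn", "ex:France"),
]

# Predicates whose objects count as names for the subject (assumption).
LABEL_PREDICATES = {"rdfs:label", "ex:iso3"}


def build_label_index(triples):
    """Map every known label (case-insensitive) to its node."""
    return {
        obj.casefold(): subj
        for subj, pred, obj in triples
        if pred in LABEL_PREDICATES
    }


def context_for(term, triples, index):
    """Return the node an extracted term maps to, with its neighbours."""
    node = index.get(term.casefold())
    if node is None:
        return None
    return {
        "node": node,
        "outgoing": [(p, o) for s, p, o in triples if s == node],
        "incoming": [(s, p) for s, p, o in triples if o == node],
    }


index = build_label_index(TRIPLES)
ctx = context_for("france", TRIPLES, index)
print(ctx["node"], ctx["incoming"])
```

An extracted location such as "France" is then no longer just a string: it is connected to the products registered there and to the region it belongs to.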
Slide 6
A Semantic Framework can connect the entire enterprise using a common semantics
The Semantic Hub should only focus on metadata (not instance level data)
Benefits: Common Terms, Models, Queries, Rules and Results (End-to-End)
Integrating Data Across the Enterprise
Lab Instruments | Clinical Trials | Regulatory Affairs | Production | eArchiving
Allotrope
Slide 8
Allotrope Structure 2017
Astrix Technology Group
BSSN Software
Elemental Machines
Erasmus MC
Fraunhofer IPA
The HDF Group
LabAnswer
LabWare
Mettler Toledo
NIST
SciBite
Stanford University
University of Illinois at Chicago
University of Southampton
Slide 9
The Allotrope Framework
Slide 10
Allotrope Data Format (ADF)
HDF5
Platform Independent File Format
Allotrope Data Format (ADF)
Descriptive metadata about
• Method, instrument, sample,
process, result, etc.
• Provenance, audit trail
• Data Cube, Data Package
Analytical data represented by
one- or multidimensional arrays
of homogeneous data structures.
Analytical data represented by
arbitrary formats, incl. native
instrument formats, images,
pdf, video, etc.
Specifically designed to store
and organize large amounts
of scientific data.
Data Description
Semantic Graph Model
Data Cubes
Universal Data Container
Data Package
Virtual File System
APIs (Java & .NET class libraries)
Chromatogram 2D HDF
Slide 11
Example Use Case
HPLC – UV
Mobile Phase Selection
Slide 12
Ontology for HPLC Example
result
device
material
process
Slide 13
expected answer
 specified percentage of components,
e.g. 25% A, 75% B
 specified composition of components,
e.g. A = 0.5 mol/L Acetonitrile, B = Methanol
 specified qualities of chemical compounds
What mobile phase is required ?
MeCN/MeOH 40/60
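A mobile phase specified by percentage of components can be checked mechanically. The sketch below assumes the shorthand "names separated by slashes, then percentages separated by slashes" as in "MeCN/MeOH 40/60"; the parser and the `Component` type are hypothetical and are not taken from the Allotrope ontologies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    percent: float


def parse_mobile_phase(spec):
    """Parse e.g. 'MeCN/MeOH 40/60' into components.

    Raises ValueError if names and ratios do not line up or if the
    percentages do not add up to 100.
    """
    names_part, ratios_part = spec.split()
    names = names_part.split("/")
    ratios = [float(r) for r in ratios_part.split("/")]
    if len(names) != len(ratios):
        raise ValueError("each component needs exactly one percentage")
    if abs(sum(ratios) - 100.0) > 1e-6:
        raise ValueError("percentages must sum to 100")
    return [Component(n, r) for n, r in zip(names, ratios)]


phase = parse_mobile_phase("MeCN/MeOH 40/60")
print(phase)
```

A semantic model goes further than this string check, since it can also state what each component is (for example its concentration and chemical identity).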
Slide 14
What mobile phase is required ?
specification of
mobile phase
composition
of
mobile
phase
device
experiment
Slide 15
Models to Capture Plans, Workflows (Processes), Entities & Results
Applying Allotrope to
eArchiving
Slide 17
Using ADF for eArchiving in ZONTAL
Applying Allotrope to
eDecision
Slide 19
Manual state of batch comparison
Final Step
Manual Report
LIMS
Purity Summary –
Crude to Drug Prod
Batch Comparison Table
Early/Late
Impurities
Batch to Batch
Comparison
Submission/Sample #
Embedded in ELN
Analyst
ELN
Manual Communication
SME
• Significant amounts of
manual effort
• Disconnected data sources
• Locally stored information
• Lack of traceability
• Data is difficult to interpret
or manipulate
Instrument
Data
Inst.
File
DB
% Purity
Full Lngth
Prod
% Indiv
Impurities
• No Automation
• Limited Batch Comparisons
can be produced
• Limited Distribution
Slide 20
Integration for batch comparison
Final Step
Manual Report
LIMS
Purity Summary –
Crude to Drug Prod
Batch Comparison Table
Early/Late
Impurities
Batch to Batch
Comparison
Submission/Sample #
Embedded in ELN
Analyst
ELN
Manual Communication SME
• Shows data integration
capabilities from LIMS + ELN
data
• Utilizes important metadata
• Metadata is key component of
ADF files (Data Description)
Instrument
Data
Inst. File
DB
% Purity
Full Lngth
Prod
% Indiv
Impurities
• Can be expanded to include
all Batch Comparison steps
• Provides Integration +
Automation over time
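The integration idea can be sketched as a join of LIMS purity results with ELN metadata on the shared Submission/Sample #. All field names and values below are hypothetical; the actual ZONTAL or ADF data structures are not shown in the slides.

```python
def batch_comparison(lims_rows, eln_rows):
    """Build a batch comparison table keyed by sample number."""
    eln_by_sample = {row["sample"]: row for row in eln_rows}
    table = []
    for row in lims_rows:
        eln = eln_by_sample.get(row["sample"], {})
        table.append({
            "sample": row["sample"],
            "batch": eln.get("batch"),   # None if the ELN has no entry
            "purity_pct": row["purity_pct"],
            "total_impurities_pct": round(
                sum(row["impurities_pct"].values()), 3
            ),
        })
    return sorted(table, key=lambda r: r["sample"])


# Illustrative records only.
lims = [
    {"sample": "S-002", "purity_pct": 98.9,
     "impurities_pct": {"imp1": 0.7, "imp2": 0.4}},
    {"sample": "S-001", "purity_pct": 99.5,
     "impurities_pct": {"imp1": 0.3, "imp2": 0.2}},
]
eln = [
    {"sample": "S-001", "batch": "B-10"},
    {"sample": "S-002", "batch": "B-11"},
]

for line in batch_comparison(lims, eln):
    print(line)
```

Once the sample number is carried as metadata with the instrument data, this table no longer has to be assembled by hand.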
Slide 21
Moving to “product genealogy”
• ZONTAL integrates data across the
enterprise
• Reporting and visibility utilizes the
entire Data Lake
• Instrument data is captured via the
Allotrope Framework
• Expanded to include all scientific
data feeding into ELN, LIMS, etc.
Enterprise-Wide User Community
Slide 22
Benefits of Data Lifecycle Management
Cost Saving Measures:
• Scientists spend more time doing science – not computer science
• Data can be generated and found easily – saves time/money
• Conceptual information is more easily shared/understood upstream
and downstream (with traceability)
• Faster project decisions can be made (with more complete data)
• Managing data/projects across multiple locations/labs is easier
• Integration provides a more complete picture
Innovation:
• Leading your organization to better leverage the value of Data
Science
• Adopting new technologies fosters new ideas and breakthroughs
• 86% of CEOs surveyed said “technological advances will transform
business the most over the next 5 years” (PWC, Jan 2014)
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 

Recently uploaded (20)

Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 

From allotrope to reference master data management

  • 1. V.2.2 Eric Little, PhD Chief Data Officer OSTHUS eric.little@osthus.com From Allotrope to Reference Master Data Management: How semantic metadata in .adf can be extended across the enterprise
  • 2. Slide 2 LIMS Studies Registration The Silo Situation: Expensive, Ineffective and Error Prone … ISO3 … DEU … FRA … … … Country … Germany … France … … … ISO3-num … 276 … 250 … … ? ? • Applications use different names for the same things. • Data exchange is expensive and limited (mapping knowledge in interfaces).
  • 3. Slide 3 Situation with Semantic Reference Master Data Management LIMS Study Management Product Registration DataGovernance Semantic Reference Master Data System “France”@en “FRA” “250” … “EU” “European Union”@en registered “AAFYZ-1217” products locations Value of Semantics: • standardized naming conventions for your core entities • standardized meta models (vendor agnostic) • reuse of public ontologies (see e.g., BioPortal) • well defined hierarchies • synonyms & mappings • qualified relationships • flexibility of graph models • rules and inference • data validation
  • 4. Slide 4 Documents are processed for term/concept extraction Extracted concepts are checked for accuracy A Gold Standard Doc is created by a human – fully accurate reading Documents are re-run based on human/machine corrections Machine Learning improves performance over time How Text Extraction Basically works (highly simplified version) Documents Text Analytics Engine • Strains • Persons • Organizations • Seasons • Locations • Etc… Gold Standard Document Extracted Entities Human In The Loop Feedback Loop for Learning/Improvement
  • 5. Slide 5 Extracted entities from the text source are stored in a DB or File Store They are mapped to other data • Legacy RDBs • Semantic Models (shown here) • Other data sources The semantic model adds context to the extracted information • A term can now be related to other objects from other sources Linking to Semantics (Knowledge Graph) Semantic Model Documents Text Analytics Engine • Strains • Persons • Organizations • Seasons • Locations • Etc… Extracted Entities
  • 6. Slide 6 A Semantic Framework can connect the entire enterprise using common semantics The Semantic Hub should only focus on metadata (not instance level data) Benefits: Common Terms, Models, Queries, Rules and Results (End-to-End) Integrating Data Across the Enterprise Lab Instruments Clinical Trials Regulatory Affairs Production eArchiving
  • 8. Slide 8 Allotrope Structure 2017 Astrix Technology Group BSSN Software Elemental Machines Erasmus MC Fraunhofer IPA The HDF Group LabAnswer LabWare Mettler Toledo NIST SciBite Stanford University University of Illinois at Chicago University of Southampton
  • 10. Slide 10 Allotrope Data Format (ADF) HDF5 Platform Independent File Format Allotrope Data Format (ADF) Descriptive metadata about • Method, instrument, sample, process, result, etc. • Provenance, audit trail • Data Cube, Data Package Analytical data represented by one- or multidimensional arrays of homogeneous data structures. Analytical data represented by arbitrary formats, incl. native instrument formats, images, pdf, video, etc. Specifically designed to store and organize large amounts of scientific data. Data Description Semantic Graph Model Data Cubes Universal Data Container Data Package Virtual File System APIs (Java & .NET class libraries) Chromatogram 2D HDF
  • 11. Slide 11 Example Use Case HPLC – UV Mobile Phase Selection
  • 12. Slide 12 Ontology for HPLC Example resultdevice material process
  • 13. Slide 13 expected answer • specified percentage of components, e.g. 25% A, 75% B • specified composition of components, e.g. A = 0.5 mol/L Acetonitrile, B = Methanol • specified qualities of chemical compounds What mobile phase is required? MeCN/MeOH 40/60
  • 14. Slide 14 What mobile phase is required? specification of mobile phase composition of mobile phase device experiment
  • 15. Slide 15 Models to Capture Plans, Workflows (Processes), Entities & Results
  • 17. Slide 17 Using ADF for eArchiving in ZONTAL
  • 19. Slide 19 manual state of batch comparison Final Step Manual Report LIMS Purity Summary – Crude to Drug Prod Batch Comparison Table Early/Late Impurities Batch to Batch Comparison Submission/Sample # Embedded in ELN Analyst ELN Manual Communication SME • Significant amounts of manual effort • Disconnected data sources • Locally stored information • Lack of traceability • Data is difficult to interpret or manipulate Instrument Data Inst. File DB % Purity Full Lngth Prod % Indiv Impurities • No Automation • Limited Batch Comparisons can be produced • Limited Distribution
  • 20. Slide 20 Integration for batch comparison Final Step Manual Report LIMS Purity Summary – Crude to Drug Prod Batch Comparison Table Early/Late Impurities Batch to Batch Comparison Submission/Sample # Embedded in ELN Analyst ELN Manual Communication SME • Shows data integration capabilities from LIMS + ELN data • Utilizes important metadata • Metadata is a key component of ADF files (Data Description) Instrument Data Inst. File DB % Purity Full Lngth Prod % Indiv Impurities • Can be expanded to include all Batch Comparison steps • Provides Integration + Automation over time
  • 21. Slide 21 Moving to “product genealogy” • ZONTAL integrates data across the enterprise • Reporting and visibility utilizes the entire Data Lake • Instrument data is captured via the Allotrope Framework • Expanded to include all scientific data feeding into ELN, LIMS, etc. Enterprise-Wide User Community
  • 22. Slide 22 Benefits of Data Lifecycle Management Cost Saving Measures: • Scientists spend more time doing science – not computer science • Data can be generated and found easily – saves time/money • Conceptual information is more easily shared/understood upstream and downstream (with traceability) • Faster project decisions can be made (with more complete data) • Managing data/projects across multiple locations/labs is easier • Integration provides a more complete picture Innovation: • Leading your organization to better leverage the value of Data Science • Adopting new technologies fosters new ideas and breakthroughs • 86% of CEOs surveyed said “technological advances will transform business the most over the next 5 years” (PwC, Jan 2014)
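
The "synonyms & mappings" idea from Slides 2 and 3 can be illustrated with a minimal Python sketch. It is not taken from the deck or from any OSTHUS or Allotrope product: the class names `Concept` and `ReferenceData` and the `example.org` URIs are hypothetical, and only the country names and codes (Germany/DEU/276, France/FRA/250) come from the slides. The point is that each application keeps its own local name while all of them resolve to one shared entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One reference entity with a preferred label and its synonyms."""
    uri: str                # hypothetical identifier, not from the slides
    pref_label: str
    synonyms: frozenset


class ReferenceData:
    """Hypothetical in-memory stand-in for a reference master data system."""

    def __init__(self):
        self._index = {}

    def register(self, concept):
        # Index the preferred label and every synonym, case-insensitively.
        for term in {concept.pref_label, *concept.synonyms}:
            key = term.casefold()
            existing = self._index.get(key)
            if existing is not None and existing != concept:
                raise ValueError(f"term {term!r} is already mapped to {existing.uri}")
            self._index[key] = concept

    def resolve(self, term):
        # Return the shared concept for a local name, or None if unknown.
        return self._index.get(term.strip().casefold())


ref = ReferenceData()
ref.register(Concept("http://example.org/ref/country/FRA", "France",
                     frozenset({"FRA", "250"})))
ref.register(Concept("http://example.org/ref/country/DEU", "Germany",
                     frozenset({"DEU", "276"})))

# Three applications, three local names, one entity.
lims_value = ref.resolve("FRA")
studies_value = ref.resolve("france")
registration_value = ref.resolve("250")
```

In this sketch a conflicting synonym raises an error at registration time, which is one simple way to surface the kind of naming clash that the silo situation otherwise hides in interface mappings.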