Allotrope foundation vanderwall_and_little_bio_it_world_2016
1. ©2016 Allotrope Foundation
Allotrope Foundation:
Driving Metadata & Master Data Management through
Improved Data Modeling with Semantic Technologies
Dana Vanderwall, Ph.D.
Director, Biology & Preclinical IT (BMS)
Vice Chair, Board of Directors (Allotrope)
Eric Little, Ph.D.
Vice President, Data Science (OSTHUS)
Adjunct Professor (NYU Polytechnic School of Engineering)
2. ©2016 Allotrope Foundation
The Current Situation in the Lab
Many challenges exist for data to be captured, integrated and shared
• Data Silos
• Incompatible instruments and software systems
• Legacy architectures are brittle and rigid
• SME knowledge resides in people’s heads
• Data schemas are not explicitly understood
• Lack of common vision between business units and scientists
3. ©2016 Allotrope Foundation
How do we change that?
Data in Standard Format
Metadata in a Standard Vocabulary
Regulatory Guidance
Methods
Recipes
SOPs
…
Vendor-Specific Formats
Process
Material
Equipment
Result
4. ©2016 Allotrope Foundation
Allotrope Foundation: Driving the Change
Member Companies
• Subject Matter Experts
• Project Funding
AbbVie, Amgen, Baxter, Bayer, Biogen, Boehringer Ingelheim, Bristol-Myers Squibb, Eli Lilly, Genentech/Roche, GlaxoSmithKline, Merck & Co., Pfizer
Secretariat
• Project Management
• Legal & Logistical Support
Professional Software Firm
• Framework Development
• Technical Leadership
Partner Network
• Requirements & Specifications
• Contributions, PoC Applications
ACD/Labs, Agilent, Biovia, Bruker, BSSN, IDBS, LabAnswer, LabVantage, LEAP Technologies, Mestrelab Research, Mettler Toledo, PerkinElmer, Persistent Systems, Riffyn, Sartorius, Shimadzu, Tetra Science, Thermo Scientific, Waters
Erasmus Univ. Med Center, J. Paul Getty Trust, (UK) Science and Technology Facilities Council, University of Southampton, University of Strathclyde
5. ©2016 Allotrope Foundation
Allotrope Data Format (ADF)
Allotrope Data Format
• Data Description (RDF Model)
Contains:
– Method, instrument, sample, process, result, etc.
– Data cube metadata
– Data package metadata
– …
• Data Cubes (Universal data container)
Analytical data represented by one- or multidimensional arrays.
• Data Package (Virtual file system *)
Analytical data represented by arbitrary formats, incl. native instrument formats, images, pdf, video, etc.
• HDF5 (Platform Independent File Format)
Specifically designed to store and organize large amounts of numerical data.
• APIs (Java & .NET class libraries)
v1.0 ADF, Taxonomies, Class Libraries released Sept 2015, v1.1 April 2016
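The three layers named on this slide (data description, data cubes, data package) can be pictured with plain Python data structures. This is only an illustrative sketch: it does not use the Allotrope class libraries or HDF5, and every name in it (`AdfSketch`, the `ex:` identifiers, the file path) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AdfSketch:
    """Hypothetical in-memory stand-in for the three ADF layers."""
    # Data Description: RDF-style (subject, predicate, object) triples.
    description: set = field(default_factory=set)
    # Data Cubes: named one- or multidimensional numeric arrays.
    cubes: dict = field(default_factory=dict)
    # Data Package: virtual file system mapping paths to raw bytes.
    package: dict = field(default_factory=dict)

    def describe(self, subject, predicate, obj):
        """Record one metadata statement as a triple."""
        self.description.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        """Return all objects recorded for a subject/predicate pair."""
        return sorted(o for s, p, o in self.description
                      if s == subject and p == predicate)


adf = AdfSketch()
# Metadata about the run (made-up identifiers).
adf.describe("ex:run1", "ex:usedInstrument", "ex:hplc7")
adf.describe("ex:run1", "ex:hasDataCube", "chromatogram")
# A two-dimensional cube: rows of (time in minutes, absorbance).
adf.cubes["chromatogram"] = [[0.0, 0.01], [0.5, 0.42], [1.0, 0.05]]
# A native instrument file stored unchanged in the package.
adf.package["/native/run1.raw"] = b"\x00\x01vendor-bytes"

# Follow the metadata to the cube, then find the row with the highest signal.
cube_name = adf.objects("ex:run1", "ex:hasDataCube")[0]
peak = max(adf.cubes[cube_name], key=lambda row: row[1])
print(peak)  # [0.5, 0.42]
```

The point of the sketch is the separation of concerns: the description says what the numbers mean, the cube holds the numbers, and the package keeps the original file alongside them.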
6. ©2016 Allotrope Foundation
Moving from Data Format to Semantics
• Semantics has its origins in philosophy and is generally understood as the abstract study of meaning
• It is distinguished from syntax, which is the rules-based grammar of a language
“Washington”
9. ©2016 Allotrope Foundation
Utilizing the Semantic Spectrum
(Moving Beyond Taxonomies)
• Code (Lists), Terms (Soil, Plant, etc.)
• Controlled Vocabulary (Agreed Upon Terms)
• Taxonomy (Hierarchy)
• Thesaurus (Preferred Labels, Synonyms, etc.)
• RDF Models (Triples as Graphs)
• OWL Ontologies (RDF + Axioms)
• Reasoning (Rule-based Logics: Discover New Patterns)
Ontologies and Reasoning add Axioms and Advanced Logic
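To make the step from a taxonomy to rule-based reasoning concrete, here is a minimal sketch in plain Python. It stores a hierarchy as triples and applies two simple rules until nothing new can be derived. The class names, the `run42` identifier, and the `infer` helper are illustrative assumptions, not terms from the Allotrope taxonomies, and a real system would use an RDF/OWL reasoner instead.

```python
def infer(triples):
    """Forward-chain two simple rules until no new triples appear."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if o != s2:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and p2 == "subClassOf":
                    new.add((s, "subClassOf", o2))
                # Rule 2: an instance of a class is also an instance
                # of every superclass of that class.
                if p == "type" and p2 == "subClassOf":
                    new.add((s, "type", o2))
        if new <= facts:
            return facts
        facts |= new


# A small taxonomy (hierarchy) expressed as triples; names are made up.
taxonomy = {
    ("HPLC-UV", "subClassOf", "Liquid Chromatography"),
    ("Liquid Chromatography", "subClassOf", "Analytical Technique"),
}
# One piece of instance data.
data = {("run42", "type", "HPLC-UV")}

inferred = infer(taxonomy | data)
print(("run42", "type", "Analytical Technique") in inferred)  # True
```

Nothing in the input says that `run42` is an analytical technique run; that statement is discovered by the rules, which is the kind of capability the upper end of the spectrum adds over a plain hierarchy.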
10. ©2016 Allotrope Foundation
Understanding the 4V’s of Big Data
• Volume: normally the focus, but Big Data analysis is more than just size
• Velocity: performance is critical to success
• Variety: data complexity is increasing, and with it model complexity
• Veracity: uncertainty abounds and requires statistics and probabilities
The majority of Big Data analytics approaches treat only two of these V’s
• Semantic technologies provide clear advantages
• Mathematical clustering techniques provide clear advantages
11. ©2016 Allotrope Foundation
Why Semantics Matters for Data Analytics
• Big Data approaches require proper metadata and terminologies to integrate information well
• Relationships matter in the data
• Understanding perspective (context) is crucial for success in today’s world
• Semantics provides better data models/schemas
12. ©2016 Allotrope Foundation
The Foundation for Real Data Analytics on the Laboratory Workflow and Data
Workflow steps: Plan Analysis; Prepare Samples; Submit Samples; Control Inst., Acquire Data; Process Data; Analyze Data; Report Results; Store, Archive Data; Request Report; Search & Reuse Data
Data along the workflow: Analytical Method; Sample Prep Data; Instrument Instructions; Instrument Data; Processed Data; Analyzed Data; Reported Results; Stored Data
Allotrope Data Format: Data Description (RDF Model); Data Cubes (Universal data container); Data Package (Virtual file system)
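One way to picture why a common format helps with "Search & Reuse" is to link each data artifact in the workflow to the artifact it came from. The sketch below does that with a plain dictionary. The derivation order shown is an assumption read off the slide's artifact names, and `derived_from` and `lineage` are hypothetical names, not part of any Allotrope API.

```python
# Each artifact points to the artifact it was derived from (None for the root).
# The ordering is an assumed reading of the workflow, for illustration only.
derived_from = {
    "Reported Results": "Analyzed Data",
    "Analyzed Data": "Processed Data",
    "Processed Data": "Instrument Data",
    "Instrument Data": "Instrument Instructions",
    "Instrument Instructions": "Analytical Method",
    "Analytical Method": None,
}


def lineage(artifact):
    """Walk back from an artifact to the start of the workflow."""
    chain = []
    while artifact is not None:
        chain.append(artifact)
        artifact = derived_from[artifact]
    return chain


print(lineage("Processed Data"))
# ['Processed Data', 'Instrument Data', 'Instrument Instructions', 'Analytical Method']
```

When every step writes its output with such links in the data description, a reported result can be traced back to the method that produced it without opening vendor-specific files.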
13. ©2016 Allotrope Foundation
How is the Framework Being Used?
Implementation by Member Companies
Columns: Member | Instrument | Research | Development (non-GMP, GMP) | Commercial
Members: BMS, Bayer, Baxter, Merck & Co., Amgen, Boehringer-Ingelheim, GSK, Genentech, Biogen, Pfizer
Use cases (instrument): Method Screening (HPLC-UV/MS); Structure ID (HPLC-MS); Elemental Impurities (ICP-MS); Assay, Purity (HPLC-UV)
Biogen: CRO Integration (HPLC-UV)
Pfizer: LC Data to ADF Converter/Adapter (HPLC-UV)
Other use cases: Drug Substance Release & Stability; Structure ID, Purification, In vitro bioanalysis; Fermentation Process Control; Small and Large Molecule CMC
Other instruments: HPLC-UV, Balance, HPLC-UV/MS, Bioanalyzer
Additional techniques: pH, Weighing, GC, Karl Fischer, TGA, NMR, Cell Density/Viability, Blood Gas Analyzer, Cell Culture Analyzer, Capillary Electrophoresis…
14. ©2016 Allotrope Foundation
How is the Framework Being Used?
Implementations by Member Companies
Columns: Member | Instrument | Research | Development (non-GMP, GMP) | Commercial
Members: Member 1, Member 2, Member 3, Member 4, Member 5, Member 6, Member 7, Member 8, Member 9, Member 10
Use cases (instrument): Method Screening (HPLC-UV/MS); Structure ID (HPLC-MS); Elemental Impurities (ICP-MS); Assay, Purity (HPLC-UV)
Member 1: Small and Large Molecule CMC (Multiple types)
Member 4: CRO Integration (HPLC-UV)
Member 10: LC Data to ADF Converter/Adapter (HPLC-UV)
Other use cases: Drug Substance Release & Stability; Structure ID, Purification, In vitro bioanalysis; Fermentation Process Control
Other instruments: HPLC-UV, Balance, HPLC-UV/MS, Bioanalyzer
Additional techniques: pH, Weighing, GC, Karl Fischer, TGA, NMR, Cell Density/Viability, Blood Gas Analyzer, Cell Culture Analyzer, Capillary Electrophoresis…
Highlighted members: Member 1, Member 6, Member 8, Member 9, Member 10
Component labels: taxonomies, methods, repository, data repository, adapter, instrument adapter
15. ©2016 Allotrope Foundation
Smart Labs for the 21st Century
Smart labs in the future will provide the enterprise with:
• Integrated Data – common reference data structures (vocabularies)
• Sharable Data – easier interaction across teams and business units
• Scalability – Big data applications that can be highly elastic
• Conceptual Representations – context and perspective are captured
• Advanced Analytics – complex & automated problem-solving capabilities
16. ©2016 Allotrope Foundation
Thank you!
• For any questions, please contact the Secretariat at more.info@allotrope.org or james.vergis@dbr.com
• 2016 Workshops
– January 20, 2016: San Francisco, CA @ Genentech
– June 2016: Ingelheim, Germany @ Boehringer Ingelheim
– September 2016: Indianapolis, IN @ Eli Lilly and Co.
Visit http://www.allotrope.org for more information and to register