Allotrope foundation vanderwall_and_little_bio_it_world_2016 (OSTHUS)
The document discusses the Allotrope Foundation's efforts to drive improved data modeling and management in life sciences research through the use of semantic technologies. It outlines current challenges with data silos and lack of standards. The Foundation is developing the Allotrope Data Format and taxonomies to standardize metadata and facilitate data integration and sharing. Several pharmaceutical companies are now implementing the framework in areas like small molecule CMC and biotherapeutics development. The Foundation aims to enable smarter laboratories of the future with integrated, sharable, and analyzable data.
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr... (OSTHUS)
During SmartLab Exchange 2015, Allotrope Foundation and OSTHUS presented the latest update on the Allotrope Framework. To learn more, please view the slides below.
Presented by:
Dana Vanderwall, BMS Research IT & Automation
Patrick Chin, Merck Research Laboratories IT
Wolfgang Colsman, OSTHUS
OSTHUS-Allotrope presents "Laboratory Informatics Strategy" at SmartLab 2015 (OSTHUS)
Building your laboratory informatics strategy: The benefit of reference architectures & data standardization.
Presented by:
Wolfgang Colsman, OSTHUS
Dana Vanderwall, Bristol-Myers Squibb
Semantics for integrated laboratory analytical processes - The Allotrope Pers... (OSTHUS)
The software environment currently found in the analytical community consists of a patchwork of incompatible software and proprietary, non-standardized file formats, further complicated by incomplete, inconsistent and potentially inaccurate metadata. To overcome these issues, the Allotrope Foundation is developing a comprehensive and innovative framework consisting of metadata dictionaries, data standards, and class libraries for managing analytical data throughout its lifecycle. The talk describes how laboratory data and their semantic metadata descriptions are brought together to ease the management of the vast amounts of data that underpin almost every aspect of drug discovery and development.
Semantics for Integrated Analytical Laboratory Processes – the Allotrope Pers... (OSTHUS)
The software environment currently found in the analytical community consists of a patchwork of incompatible software, proprietary and non-standardized file formats, which is further complicated by incomplete, inconsistent and potentially inaccurate metadata. To overcome these issues, Allotrope Foundation is developing a comprehensive and innovative framework consisting of metadata dictionaries, data standards, and class libraries for managing analytical data throughout its life cycle. In this talk we describe how laboratory data and semantic metadata descriptions are brought together to ease the management of a vast amount of data that underpins almost every aspect of drug discovery and development.
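To make the idea of pairing measurements with semantic metadata concrete, here is a minimal sketch using rdflib; the vocabulary URIs, class names and property names are invented placeholders, not the Allotrope taxonomies or the .adf API.

```python
# Minimal illustration of pairing a laboratory result with semantic metadata as RDF.
# The EX vocabulary below is a placeholder, not the actual Allotrope taxonomy.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/lab#")   # hypothetical vocabulary

g = Graph()
g.bind("ex", EX)

result = EX["result/hplc-0001"]
g.add((result, RDF.type, EX.ChromatographyResult))
g.add((result, EX.technique, EX.HPLC))                       # what kind of measurement
g.add((result, EX.analyte, Literal("caffeine")))             # what was measured
g.add((result, EX.retentionTime, Literal(3.42, datatype=XSD.double)))
g.add((result, EX.retentionTimeUnit, EX.minute))             # unit carried as metadata
g.add((result, EX.instrument, Literal("LC-01")))             # provenance of the raw data

print(g.serialize(format="turtle"))
```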
IQPC’s 5th Forum on Laboratory Informatics will provide strategies for overcoming challenges, including:
- In-depth regulatory compliance guidance
- Extensive ELN deployment and roll out projects, focusing on ROI maximization and impact on business performance
- Informatics systems in the biobanking environment
- Proactive approaches to address challenges of integration and interfacing
- Integrating and embracing knowledge management and social media tools
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:... (OSTHUS)
The Allotrope Foundation is a consortium of major pharmaceutical companies and a partner network whose goal is to address challenges in the pharmaceutical industry by providing a set of public, non-proprietary standards for using and integrating analytical laboratory data. Current challenges in data management within the pharmaceutical industry often center on inconsistent or incomplete data and metadata and on proprietary data formats. Because of this lack of standardization, several operations (e.g. integration of instruments and applications, transfer of methods or results, archiving for regulatory purposes) require unnecessary effort. Further, higher-level aggregations of data, such as regulatory filings, that are derived from multiple sources of laboratory data are costly to create. These unnecessary costs impact operations within a company’s laboratories, between partnering companies, and between a company and contract research organizations (CROs). Finally, the accelerating transition of laboratories from hybrid (paper + electronic) to purely electronic data streams, coupled with ever-increasing regulatory scrutiny of electronic data management practices, further requires a comprehensive solution. This talk will discuss how the Allotrope Foundation is providing a new framework for data standards through collaboration between numerous stakeholders.
Reinventing Laboratory Data To Be Bigger, Smarter & Faster (OSTHUS)
• Big Data technologies, especially data lakes, are currently spreading across many industries in the hope that they will provide unprecedented capabilities for data integration and data analytics.
• In spite of the popularity and promise of these technologies, many early adopters are not seeing immediate solutions to their complex problems. Answers are not simply appearing; this talk will explore this issue more thoroughly.
• Of the 4 V’s of Big Data, Data Variety and Data Veracity (uncertainty) are of increasing importance. These can create barriers to successful integration strategies, which, in turn, can lead to poorly performing analytics.
• The problems of Variety and Veracity can be tackled using a new form of data science which combines formal ontologies with statistical heuristics (a toy sketch of this combination follows the list). This talk will explore some key features of these approaches and how they can be developed together in symbiosis, leading to complex models that allow for improved analytics, or, as we call it, Big Analysis.
• The end result is improved capture of data types and sources, from laboratory instrument data, to clinical data, to regulatory rules and submissions, all the way to business drivers for the enterprise, ultimately providing advanced analytics capabilities that can be built as modules and expanded across an enterprise.
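A toy sketch of the "ontology plus statistical heuristic" idea referenced in the list above: reconciling messy source field names against a small controlled vocabulary with a string-similarity score. The term list, threshold and field names are invented for illustration; the speakers' actual methods are not described here.

```python
# Toy illustration of combining a controlled vocabulary with a statistical
# string-similarity heuristic to reconcile messy source fields (Variety/Veracity).
from difflib import SequenceMatcher

ONTOLOGY_TERMS = ["retention time", "peak area", "sample identifier", "column temperature"]

def best_term(raw_field: str, threshold: float = 0.6):
    """Map a raw column name to the closest ontology term, or None if too dissimilar."""
    scored = [(SequenceMatcher(None, raw_field.lower(), t).ratio(), t) for t in ONTOLOGY_TERMS]
    score, term = max(scored)
    return term if score >= threshold else None

for raw in ["PeakArea", "sample_id", "operator_notes"]:
    print(raw, "->", best_term(raw))   # unmatched fields come back as None
```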
Managing Complex Data Packages with Labcore SDMS Scientific Data Management... (msf4566)
The document discusses Labcore SDMS, a scientific data management system that helps laboratories manage complex data packages. It reduces costs and risks for laboratories by automating manual paper processes, expediting search and retrieval of data packages, and providing full audit trails and version history. Key benefits include a significant reduction in the labor cost of producing and storing data packages, faster delivery of data packages to customers, and fewer errors and liability risks. It integrates seamlessly with LIMS, laboratory instruments, and other data systems.
Introduction of BJU-BMR-RG and use case study of Applying openEHR archetypes ... (openEHR-Japan)
The document discusses applying openEHR archetypes to implement a clinical data repository (CDR) in China. It analyzes existing EMR data schemas, identifies 892 relevant data items, and maps them to 62 clinical concepts guided by openEHR. Most concepts were mapped directly to existing archetypes, while some required extension or specialization to fully represent Chinese CDR requirements. Implementing a CDR based on openEHR archetypes allows clinical experts to define, retrieve, and query necessary data flexibly.
How to Rapidly Configure Oracle Life Sciences Data Hub (LSH) to Support the M... (Perficient)
This document outlines best practices for rapidly configuring Oracle Life Sciences Data Hub (LSH) to support patient data management. It discusses data flows, conforming data to standards, necessary utilities and tools, infrastructure requirements, and implementation process. The presentation recommends hosting the LSH environment with BioPharm for a turnkey solution, reducing time and risks compared to a custom implementation.
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments (Luis Marco Ruiz)
Modern medicine needs methods to enable access to data captured during health care for research, surveillance, decision support and other reuse purposes. Initiatives like the National Patient Centered Clinical Research Network in the US and Electronic Health Records for Clinical Research in the EU are facilitating the reuse of Electronic Health Record (EHR) data for clinical research. One of the barriers to data reuse is the integration and interoperability of different Healthcare Information Systems (HIS), owing to the differences among their information and terminology models. The use of EHR standards like openEHR can alleviate these barriers by providing a standard, unambiguous, semantically enriched representation of clinical data that enables semantic interoperability and data integration. Few works have been published describing how to drive proprietary data stored in EHRs into standard openEHR repositories. This tutorial provides an overview of the key concepts, tools and techniques necessary to implement an openEHR-based Data Warehouse (DW) environment for reusing clinical data. We aim to provide insights into extracting data from proprietary sources, transforming it into openEHR-compliant instances to populate a standard repository, and enabling access to it using standard query languages and services.
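As a rough sketch of the transformation step the tutorial describes, the snippet below reshapes a row from a hypothetical proprietary schema into a simplified openEHR-flavoured structure. Real compositions are defined by archetypes and templates and are usually built with dedicated SDKs, so the field layout here is illustrative only; the blood pressure archetype id is a real openEHR archetype, the rest is made up.

```python
# Simplified, illustrative transformation of a proprietary EHR row into an
# openEHR-flavoured structure. Real compositions follow archetypes/templates;
# the nested field names below are placeholders, not archetype node ids.
from datetime import datetime, timezone

proprietary_row = {"pat_id": "12345", "sys_bp": 128, "dia_bp": 82, "ts": "2015-06-01T09:30:00"}

def to_openehr_like(row: dict) -> dict:
    return {
        "subject_id": row["pat_id"],
        "composition": {
            "archetype": "openEHR-EHR-OBSERVATION.blood_pressure.v2",
            "data": {
                "systolic": {"magnitude": row["sys_bp"], "unit": "mm[Hg]"},
                "diastolic": {"magnitude": row["dia_bp"], "unit": "mm[Hg]"},
                "time": row["ts"],
            },
        },
        "extracted_at": datetime.now(timezone.utc).isoformat(),
    }

print(to_openehr_like(proprietary_row))
```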
Automated and Explainable Deep Learning for Clinical Language Understanding a... (Databricks)
Unstructured free-text medical notes are the only source for many critical facts in healthcare. As a result, accurate natural language processing is a critical component of many healthcare AI applications like clinical decision support, clinical pathway recommendation, cohort selection, patient risk or abnormality detection.
The presentation covers the components of FHIR and its distribution and use. Scenarios for introducing FHIR in a country currently using HL7 V3 are also offered.
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i... (Dr. Haxel Consult)
The pressures of pharmaceutical research and development demand increasing efficiency from scientists. High-quality decisions must be made faster and encompass all available information. At the same time there is a growing desire to better utilize the multi-billion dollar research investment recorded in laboratory notebooks and bioassay databases. Key values for data integration in a data exploration environment include gathering data from disparate E-notebooks and bioassay databases into a single searchable “virtual” system and increased discoverability by accessing data through a system designed for exploration. Key benefits are better chemistry decisions through easier access to broader data and reduced time for preparing patent filings. The ability to interlink in-house and reported assay data with in-house and published chemistry provides a data-rich environment for developing insights and predictive models. We will discuss our experience with integrating information from journals, patents, bio-assay databases, and E-lab notebooks to address these needs.
Building linked data large-scale chemistry platform - challenges, lessons and... (Valery Tkachenko)
Chemical databases have been around for decades, but in recent years we have observed a qualitative change from rather small in-house built proprietary databases to large-scale, open and increasingly complex chemistry knowledgebases. This tectonic shift has imposed new requirements for database design and system architecture as well as the implementation of completely new components and workflows which did not exist in chemical databases before. Probably the most profound change is being caused by the linked nature of modern resources - individual databases are becoming nodes and hubs of a huge and truly distributed web of knowledge. This change has important aspects such as data and format standards, interoperability, provenance, security, quality control and metainformation standards.
ChemSpider at the Royal Society of Chemistry was the first public chemical database to incorporate rigorous quality control, introducing both community curation and automated quality checks at the scale of tens of millions of records. Yet we have come to realize that this approach may now be incomplete in a quickly changing world of linked data. In this presentation we will talk about the challenges associated with building modern public and private chemical databases, as well as lessons that we have learned from our past and present experience. We will also talk about solutions for some common problems.
OntoChem IT Solutions GmbH was founded in 2015 as a purely IT-oriented offshoot of OntoChem GmbH. Even before that we had many years of experience, and it has always been our mission to provide added value to our customers by helping them navigate today’s complex information world: developing cognitive computing solutions, indexing intranet and internet data, and applying semantic search solutions for pharmaceutical, material sciences and technology-driven businesses.
We strive to support our customers with the most useful tools for knowledge discovery possible, encompassing up-to-date data sources, optimized ontologies and high-throughput semantic document processing and annotation techniques.
We create new knowledge from structured and unstructured data by extracting relationships, thereby exploiting the full potential of full-text documents and databases while also scanning social media and news flows and analyzing web pages.
We aim at an unprecedented machine understanding of text and subsequent knowledge extraction and inference. Applying our methods to chemical compounds and their properties supports our customers in generating intellectual property and in using those compounds as novel therapeutics, agrochemical products, nutraceuticals, cosmetics and novel materials.
It's our mission to provide added value to customers by:
- developing and applying cognitive computing solutions
- creating intranet and internet data indexing and semantic search solutions
- Big Data analytics for technology driven businesses
- supporting product development and surveillance.
We deliver useful tools for knowledge discovery for:
- creating background knowledge ontologies
- high-throughput semantic document processing and annotation
- knowledge mining by extracting relationships
- exploiting the full potential of full-text documents & databases while also scanning social media, news flows and analyzing web-pages.
This document discusses using Wiley's Chemistry Toolkit to plan syntheses of target molecules. It describes an approach using computer-aided synthesis design (CASD) to perform retrosynthetic analysis back to available starting materials, generating many alternatives and supporting them with literature examples. CASD works by extracting reaction rules from source reactions in databases like CIRX and applying chemical perception to identify compatible functional groups and regio- and stereoselectivity. The goal is to provide an integrated solution for data mining and synthesis planning.
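The CIRX database and the Wiley toolkit are proprietary, but the underlying mechanism of encoding a reaction rule and applying it to a target can be illustrated generically with RDKit reaction SMARTS; the rule and target below are invented for the example and are not the CASD rule format or engine.

```python
# Generic illustration of applying a retrosynthetic rule (here: amide disconnection)
# with RDKit reaction SMARTS. This is not the Wiley/CASD rule format.
from rdkit import Chem
from rdkit.Chem import AllChem

# Rule written in the "product >> precursors" direction: amide -> acid + amine
retro_amide = AllChem.ReactionFromSmarts("[C:1](=[O:2])[N:3]>>[C:1](=[O:2])[OH].[N:3]")

target = Chem.MolFromSmiles("CC(=O)NCc1ccccc1")   # N-benzylacetamide as a toy target

for precursors in retro_amide.RunReactants((target,)):
    smiles = []
    for mol in precursors:
        Chem.SanitizeMol(mol)          # clean up the generated fragments
        smiles.append(Chem.MolToSmiles(mol))
    print(" + ".join(smiles))          # expected: acetic acid + benzylamine
```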
The Open PHACTS project delivers an online platform integrating a wide variety of data from across chemistry and the life sciences, together with an ecosystem of tools and services to query this data in support of pharmacological research, turning the semantic web from a research project into something that can be used by practising medicinal chemists in both academia and industry. In the summer of 2015 it was the first winner of the European Linked Data Award. At the Royal Society of Chemistry we have provided the chemical underpinnings of this system, and in this talk we review its development over the past five years. We cover both our early work on semantic modelling of chemistry data for the Open PHACTS triplestore and more recent work building an all-purpose data platform, for which the Open PHACTS data has been an important test case: what has worked well, what's missing, and where this is likely to go in future.
Alice: "What version of ChEMBL are we using?"
Bob: "Er…let me check. It's going to take a while, I'll get back to you."
This simple question took us the best part of a month to resolve and involved several individuals. Knowing the provenance of your data is essential, especially when using large complex systems that process multiple datasets.
The underlying issues of this simple question motivated us to improve the provenance data in the Open PHACTS project. We developed a guideline for dataset descriptions where the metadata is carried with the data. In this talk I will highlight the challenges we faced and give an overview of our metadata guidelines.
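As an illustration of metadata carried with the data, a minimal dataset description along these lines can be written with VoID and Dublin Core terms; the URIs, version number and dates below are placeholders for the example, not values taken from the Open PHACTS guideline.

```python
# Minimal sketch of a dataset description that travels with the data itself,
# using VoID and Dublin Core terms. URIs, dates and version are illustrative.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF, XSD

VOID = Namespace("http://rdfs.org/ns/void#")
dataset = URIRef("http://example.org/dataset/chembl")   # placeholder identifier

g = Graph()
g.bind("void", VOID)
g.bind("dcterms", DCTERMS)

g.add((dataset, RDF.type, VOID.Dataset))
g.add((dataset, DCTERMS.title, Literal("ChEMBL (example description)")))
g.add((dataset, DCTERMS.hasVersion, Literal("20")))                      # answers "which version?"
g.add((dataset, DCTERMS.issued, Literal("2015-02-01", datatype=XSD.date)))
g.add((dataset, DCTERMS.publisher, URIRef("https://www.ebi.ac.uk/")))
g.add((dataset, VOID.dataDump, URIRef("http://example.org/dumps/chembl_20.ttl.gz")))

print(g.serialize(format="turtle"))
```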
Presentation given to the W3C Semantic Web for Health Care and Life Sciences Interest Group on 14 January 2013.
Over the past decade, CDISC standards have been widely accepted and implemented in clinical research. The FDA’s final “Guidance for Industry on electronic submission” mandates that submission data conform to CDISC standards such as SDTM, ADaM and SEND. This presentation will discuss how life sciences organizations can use standards metadata to manage the regulatory compliance process. It will introduce how standards metadata management not only ensures regulatory compliance, but also supports process efficiency in the development of clinical trial artefacts (e.g., protocol, CDASH, SDTM and ADaM) and in standards governance, and enables efficient communication between organizational units.
It will also introduce a metadata management system and discuss how such a system creates, stores, governs and manages standards, and it will show how the standards metadata management system interacts with the ETL system and drives standards-based development of clinical artefacts.
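A toy example of what standards metadata driving artefact development can look like in practice: a small machine-readable specification drives the mapping of raw collected data into an SDTM-like vital signs domain. The spec format is invented for the example; it is not define.xml, CDASH or any vendor's metadata system.

```python
# Toy example of metadata-driven mapping of raw data to an SDTM-like VS domain.
# The spec format is invented for illustration; it is not define.xml or CDASH.
import pandas as pd

# Standards metadata: target variable, source column, or a fixed constant value
spec = [
    {"target": "USUBJID", "source": "subject"},
    {"target": "VSTESTCD", "const": "SYSBP"},
    {"target": "VSORRES", "source": "sys_bp"},
    {"target": "VSORRESU", "const": "mmHg"},
]

raw = pd.DataFrame({"subject": ["001", "002"], "sys_bp": [128, 135]})

def apply_spec(raw_df: pd.DataFrame, spec_rows: list) -> pd.DataFrame:
    out = pd.DataFrame(index=raw_df.index)
    for rule in spec_rows:
        if "source" in rule:
            out[rule["target"]] = raw_df[rule["source"]]   # mapped from collected data
        else:
            out[rule["target"]] = rule["const"]            # fixed by the standard
    return out

print(apply_spec(raw, spec))
```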
Why ICT Fails in Healthcare: Software Maintenance and Maintainability (Koray Atalag)
This presentation was for a SERG seminar at the University of Auckland Department of Computer Science. I present why software maintenance is a barrier to the adoption of IT in healthcare, and discuss maintainability based on the quality model of the ISO/IEC 9126 software quality standard. I then present the preliminary results of my research.
This document provides an overview of how LOINC (Logical Observation Identifiers Names and Codes) codes can be used with FHIR (Fast Healthcare Interoperability Resources). It discusses how LOINC codes are represented and used in various FHIR resources like Observation, Questionnaire, DiagnosticReport, etc. It also describes how FHIR terminology services can be used to retrieve information about LOINC codes and structures like parts, answer lists, and properties to build value sets. The document demonstrates how LOINC enhances interoperability when clinical data is coded with LOINC and accessible via FHIR.
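For instance, an Observation coded with LOINC carries the code in code.coding with the system set to http://loinc.org. A minimal FHIR R4-style instance, built here as a plain Python dict, might look like the following; 8480-6 is the LOINC code for systolic blood pressure, and the other values are illustrative.

```python
# Minimal FHIR R4-style Observation coded with LOINC, built as a plain dict.
# 8480-6 is the LOINC code for systolic blood pressure; other values are illustrative.
import json

observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [
            {"system": "http://loinc.org", "code": "8480-6", "display": "Systolic blood pressure"}
        ]
    },
    "subject": {"reference": "Patient/example"},
    "effectiveDateTime": "2015-06-01T09:30:00Z",
    "valueQuantity": {
        "value": 128,
        "unit": "mmHg",
        "system": "http://unitsofmeasure.org",   # UCUM, as required for coded units
        "code": "mm[Hg]",
    },
}

print(json.dumps(observation, indent=2))
```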
FHIR Developer Days 2015. Study on db implementations for FHIR server (Igor Bossenko)
The presentation describes different approaches to implementing a database for a FHIR server that were considered during the implementation by Nortal of the National Healthcare System (NHS) for Lithuania.
This document provides an overview of the openEHR CDR open source project called EHRbase. EHRbase aims to provide an open standard-compliant backend platform for electronic health records and clinical applications using the openEHR specification. It has a team of developers across multiple continents and uses modern development practices like Scrum and BDD. EHRbase provides a REST API and SDK for creating, querying, and managing openEHR objects in a clinical data repository, and also integrates with FHIR through a FHIR bridge. It is being used as the backend platform for a national COVID-19 system in Germany.
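A hypothetical client interaction with an openEHR REST API such as the one EHRbase exposes might look like the sketch below. The base URL, paths, payloads and the absence of authentication are assumptions made for illustration, following the general openEHR REST API conventions rather than details confirmed by the document.

```python
# Hypothetical client calls against an openEHR REST API (EHRbase-style).
# Base URL, paths and lack of authentication are assumptions for illustration.
import requests

BASE = "http://localhost:8080/ehrbase/rest/openehr/v1"   # placeholder deployment

# Create a new EHR; ask the server to return the created resource.
resp = requests.post(f"{BASE}/ehr", headers={"Prefer": "return=representation"})
resp.raise_for_status()
ehr_id = resp.json()["ehr_id"]["value"]

# Query the clinical data repository with AQL (Archetype Query Language).
aql = "SELECT c/uid/value FROM EHR e CONTAINS COMPOSITION c LIMIT 10"
rows = requests.post(f"{BASE}/query/aql", json={"q": aql}).json()
print(ehr_id, rows)
```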
How to Manage APIs in your Enterprise for Maximum Reusability and Governance (WSO2)
This document discusses API management and governance in an enterprise. It covers the challenges of APIs existing in silos across an organization, and addresses how an API manager and governance registry can help manage the API lifecycle from planning to sharing. It also demonstrates how the WSO2 API management platform can be used to publish, protect, analyze usage of APIs across different environments.
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle) (Rittman Analytics)
Oracle Data Integration Platform is a cornerstone for big data solutions that provides five core capabilities: business continuity, data movement, data transformation, data governance, and streaming data handling. It includes eight core products that can operate in the cloud or on-premise, and is considered the most innovative in areas like real-time/streaming integration and extract-load-transform capabilities with big data technologies. The platform offers a comprehensive architecture covering key areas like data ingestion, preparation, streaming integration, parallel connectivity, and governance.
OGF actively collaborates with other standards organizations through cooperative agreements to develop standards for distributed computing. OGF has relationships with groups like DMTF, ISO, SNIA, ETSI, ITU-T, and NIST to jointly develop standards for areas like cloud computing, identity management, and data formats. These collaborations help drive innovation while avoiding duplication of efforts between organizations.
How to Manage APIs in your Enterprise for Maximum Reusability and Governance (HARMAN Services)
Trying to form an API/service strategy to keep pace with the IoT revolution? Learn how to address the issues and challenges your enterprise might face while implementing it.
This webinar also explains how WSO2 API Manager and WSO2 Governance Registry have helped enterprises overcome the following challenges:
1. How the number of services and their users affect service discoverability, catalog, and re-usability.
2. Mistrust among producers and consumers
3. Reliability, stability, and availability of services
4. How externally built common and reusable services meet requirements (anti-patterns - NIH)
This document summarizes a presentation about the Federation Lab and OpenID Connect. The Federation Lab is an identity toolkit that automates testing of identity software to increase interoperability between providers and consumers using SAML and OpenID Connect. It is a GÉANT project in collaboration with industry and research partners. The presentation discusses challenges like interoperability issues that can arise from complex identity systems with many implementations and deployments. Federation Lab addresses this by performing over 100 automated test flows on identity providers to discover errors. It also provides debugging tools. The presentation contrasts identity flows and attribute returning between SAML and OpenID Connect. In closing, the Federation Lab testing tool is made available for participants to use.
Metrics Monitoring Is So Critical - What's Your Best Approach? Wavefront
Metrics monitoring is so critical for modern cloud applications. But can you do it with APM, with a log monitor, or with a specialized metrics platform? Open source or commercial? How are SaaS leaders monitoring their environments with metrics today?
Learn about unified metrics monitoring with real-time analytics, and why it’s the preferred methodology for assuring cloud application environments.
There are several approaches to implementing a metrics-monitoring platform. Depending on where you are on the metrics maturity curve, some platforms are better than others. Learn how to pick the approach that's best for you.
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi (Felicia Haggarty)
The document discusses challenges with building operational data applications on Hadoop and introduces the Cask Data Application Platform (CDAP) as a solution. It provides an agenda that covers data applications, challenges, CDAP motivation and goals, use cases, and an introduction and architecture overview of CDAP. The document aims to demonstrate how CDAP provides a unified platform that simplifies application development and lifecycle while supporting reusable data and processing patterns.
Pivotal Digital Transformation Forum: Data Science Technical Overview (VMware Tanzu)
This document provides an overview of Pivotal's data science capabilities and tools. It discusses how Pivotal uses an Agile approach to data science projects, focusing on frequent interactions with customers. Pivotal's software stack is designed to enable data science work by supporting real-time, interactive, and batch operations on data through tools like Spring XD, Pivotal HD, and GemFire. Examples are provided of how Pivotal has used these tools for applications like connected cars and scalable video analytics.
Technical deep dive on Java Micro Edition (ME) 8 (apologies for the partially messed up colors and slides - SlideShare is doing that during the conversion process)
From Allotrope to Reference Master Data Management (OSTHUS)
We will present the updated Allotrope framework and cover .adf files and how they are used. We’ll demonstrate semantic modeling in .adf (OWL models plus the SHACL constraint language). We’ll show how the data description layer in .adf can be extended via a “semantic hub” that we call Reference Master Data Management (RMDM), which can be used across the enterprise. RMDM provides a means to integrate metadata about any data source within your enterprise, including structured, semi-structured and unstructured data. Customer examples from current project work will be given where possible. Last, we’ll show how this approach scales so that data science techniques can be employed beyond just the metadata; we refer to this as Big Analysis.
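As a generic illustration of the OWL-plus-SHACL data description idea (not the actual Allotrope ontologies, shapes or the .adf API), validating instance metadata against a SHACL shape with rdflib and pyshacl looks roughly like this.

```python
# Generic illustration of SHACL validation of RDF metadata. The vocabulary and
# shape are invented for this example; they are not the Allotrope ontologies.
from rdflib import Graph
from pyshacl import validate

shapes = Graph().parse(data="""
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/lab#> .
ex:ResultShape a sh:NodeShape ;
    sh:targetClass ex:ChromatographyResult ;
    sh:property [ sh:path ex:retentionTime ; sh:minCount 1 ;
                  sh:datatype <http://www.w3.org/2001/XMLSchema#double> ] .
""", format="turtle")

data = Graph().parse(data="""
@prefix ex: <http://example.org/lab#> .
ex:run1 a ex:ChromatographyResult .   # retention time deliberately missing
""", format="turtle")

conforms, _, report_text = validate(data, shacl_graph=shapes)
print(conforms)       # False: the required retention time is absent
print(report_text)
```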
Data proliferation from 7+ billion humans and 20+ billion devices from every walk of life has been the focus of the last decade. With the velocity, variety and volume of data, every data organization’s goal has shifted to protecting and monetizing data from a rapidly growing network of IoT-embedded objects and sensors.
One tried and true business continuity methodology for storing and retrieving vast amounts of data has been replication of Hadoop systems on hybrid clouds and in geographically distributed data centers. Replication is similar to blockchain in using autonomous smart contracts instantiated on the metadata and data, so that the replicated data follows a single source of truth.
Replicas can be maintained across geographically distributed data centers, giving the business continuity plan for the data sets greater risk tolerance. With intelligent predictive analytics based on usage patterns, dynamic tiering policies can be triggered on the data sets to provide true added value. The temperature of the data is used to move data between hot, warm, cold and archival storage based on configurable policies, leading to a greater reduction in total cost of ownership.
Users in 2018 and beyond demand absolute availability of data as and when they desire it. Dynamic data access management is a fundamental concept for satisfying the business continuity plan. Seamless enterprise-grade disaster recovery to support the business continuity use case has significant challenges around replicating security and governance on data sets. In this talk we will discuss how this challenge can be addressed to support seamless replication and disaster recovery for Hadoop-scale data. NIRU ANISETI, Product Manager, Hortonworks
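The temperature-based tiering mentioned in this abstract can be pictured as a simple policy mapping time since last access to a storage tier; the sketch below is a generic illustration with made-up thresholds, not Hortonworks' actual policy engine.

```python
# Generic sketch of temperature-based tiering: map days since last access to a tier.
# Thresholds are illustrative; real policies would be configurable per data set.
from datetime import date

POLICY = [(7, "hot"), (30, "warm"), (365, "cold")]   # (max idle days, tier)

def tier_for(last_access: date, today: date) -> str:
    idle_days = (today - last_access).days
    for max_idle, tier in POLICY:
        if idle_days <= max_idle:
            return tier
    return "archive"

print(tier_for(date(2018, 5, 28), date(2018, 6, 1)))   # hot
print(tier_for(date(2017, 1, 15), date(2018, 6, 1)))   # archive
```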
The document discusses establishing a National Digital Repository System (NDRS) in India using a harvesting model. It analyzes different technical models (centralized, distributed, harvesting), and recommends adopting the harvesting model. The harvesting model would involve individual institutional repositories exposing their metadata using OAI-PMH for a central searchable server to harvest and provide enhanced discovery services. Benefits of the NDRS for various stakeholders are discussed. Current scenarios of institutional repositories in India and potential organizations to contribute to the proposed NDRS are also outlined.
Richard Bolton (GSK and Pistoia's ELN query services workstream coordinator) discusses the Alliance's chemistry strategy, which includes ELN query standards, hosted ELN, and chemistry externalization facilitation.
1) NetApp is a leader in open source technologies like OpenStack, Docker, Kubernetes, and Mesosphere. It contributes code to many open source projects and supports customers using these technologies.
2) NetApp supports OpenStack through technologies like Cinder for block storage and Manila for shared file systems. It has contributed a significant amount of code to these projects.
3) NetApp helps customers use containers through technologies like Trident, its container storage orchestrator. It was one of the first vendors certified for Docker volumes and developed early dynamic provisioning for Kubernetes.
Machine Learning to Turbo-Charge the Ops Portion of DevOps (Deborah Schalm)
Already on a continuous or short-cycle delivery? Constantly rewiring your apps with microservice and similar architectures? Maintaining visibility and maximizing service levels once this stuff gets into production could be a regular nightmare. Coding instrumentation into your apps is time-consuming and error-prone. Instead, let machine learning do the work of adapting your monitoring to your fast-moving application environments. In this webcast learn about various types of machine learning that are optimized for operational data, and see in a demo how this could be leveraged to ensure your ops move as fast as rest of your DevOps pipeline.
Contexti / Oracle - Big Data: From Pilot to Production (Contexti)
The document discusses challenges in moving big data projects from pilots to production. It highlights that pilots have loose SLAs and focus on a few use cases and demonstrated insights, while production requires enforced SLAs, supporting many use cases and delivering actionable insights. Key challenges in the transition include establishing governance, skills, funding models and integrating insights into operations. The document also provides examples of technology considerations and common operating models for big data analytics.
OData External Data Integration Strategies for SaaS (Sumit Sarkar)
This document discusses OData integration strategies for SaaS applications. It provides an overview of the OData standard and why SaaS vendors are adopting it. It then describes how Oracle Service Cloud uses OData accelerators to integrate with external data sources like Salesforce and Siebel. These accelerators allow agents to access and edit external data without leaving the Service Cloud interface.
MPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill (Alan Sill)
This talk gives an overview of several multi-SDO and cross-SDO activities to promote and spur innovation in cloud computing, focusing on API development and standardization, including testbeds, test use cases, and collaborative activities between organizations to create and carry out development and testing in this area. It covers work pursued through the Cloud and Autonomic Computing Center at Texas Tech University, part of the US National Science Foundation's Industry/University Cooperative Research Center program, and work being done by standards organizations in which the CAC@TTU is involved, such as the Open Grid Forum, Distributed Management Task Force, and Telecommunications Management Forum. A summary is also given of work to produce a new round of more detailed use cases suitable for testing by the US National Institute of Standards and Technology's Standards Acceleration to Jumpstart Adoption of Cloud Computing (SAJACC) working group, with brief mention of other related work going on in this area in other parts of the world. Background and other standards work is also mentioned.
Apcera reviews the good, the bad and the amazing of emerging microservices platforms and best practices, based on feedback collected from 250+ early adopters.
You can learn more about The Trusted Cloud Platform at: https://www.apcera.com/
Similar to Pistoia Alliance European Conference 2015 - Gerhard Noelken / Allotrope Foundation
Fairification experience clarifying the semantics of data matrices (Pistoia Alliance)
This webinar presents the Statistics Ontology (STATO), a semantic framework to support the creation of standardized analysis reports and help with the review of results in the form of data matrices. STATO includes a hierarchy of classes and a vocabulary for annotating statistical methods used in life, natural and biomedical sciences investigations, text mining and statistical analyses.
This webinar discusses driving adoption of microphysiological systems (MPS) in drug R&D. The webinar agenda includes presentations on multi-organ chips for safety and efficacy assessment from TissUse, current applications and future perspectives of organ-on-chips in pharmaceutical industry from AstraZeneca, and driving adoption of MPS from ToxRox Consulting. A panel discussion will be moderated by Mary Ellen Cosenza. The presentations will cover benefits of MPS for reducing drug failures and animal testing, applications across drug discovery and development, challenges for adoption, and perspectives from industry.
Federated Learning (FL) is a learning paradigm that enables collaborative learning without centralizing datasets. In this webinar, NVIDIA present the concept of FL and discuss how it can help overcome some of the barriers seen in the development of AI-based solutions for pharma, genomics and healthcare. Following the presentation, the panel debate on other elements that could drive the adoption of digital approaches more widely and help answer currently intractable science and business questions.
It seems that AI is becoming a buzzword, much like design thinking. Everyone is talking about AI or wants to have AI, and sees all the ideas and benefits; that's fine, but how do you get started? What's different now? Three innovations have finally put AI on the fast track: Big Data, with the internet and sensors everywhere; massive computing power, especially through the cloud; and the development of breakthrough algorithms, so computers can be trained to accomplish more sophisticated tasks on their own with deep learning. If you use new technology, you need to explore and know what's possible. Design thinking helps outline the steps and define the ways in which you're going to create the solution, starting with mapping the customer journey and defining who will be using the service enhanced with intelligent technology, or who will benefit and gain value from it. We discuss how these two worlds are coming together, and how you get started transforming your venture with Artificial Intelligence using Design Thinking.
Speaker: Claudio Mirti, Principal Solution Specialist – Data & AI, Microsoft
Themes and objectives:
- To position FAIR as a key enabler to automate and accelerate R&D process workflows
- FAIR implementation within the context of a use case
- Grounded in precise outcomes (e.g. faster and bigger science / more reuse of data to enhance value / increased ability to share data for collaboration and partnership)
- To make data actionable through FAIR interoperability
Speakers:
Mathew Woodwark, Head of Data Infrastructure and Tools, Data Science & AI, AstraZeneca
Erik Schultes, International Science Coordinator, GO-FAIR
Georges Heiter, Founder & CEO, Databiology
Knowledge graphs ilaria maresi the hyve 23apr2020 (Pistoia Alliance)
Data for drug discovery and healthcare is often trapped in silos which hampers effective interpretation and reuse. To remedy this, such data needs to be linked both internally and to external sources to make a FAIR data landscape which can power semantic models and knowledge graphs.
2020.04.07 automated molecular design and the bradshaw platform webinarPistoia Alliance
This presentation described how data-driven chemoinformatics methods may automate much of what has historically been done by a medicinal chemist. It explored what is reasonable to expect “AI” approaches might achieve, and what is best left with a human expert. The implications of automation for the human-machine interface were explored and illustrated with examples from Bradshaw, GSK’s experimental automated design environment.
This presentation reviewed the challenges in identifying, acquiring and utilizing research data in relation to an evolving data market. Strategic solutions were examined in which the FAIR principles play a key role in the future of data management.
Dr. Dennis Wang discusses possible ways to enable ML methods to be more powerful for discovery and to reduce ambiguity within translational medicine, allowing data-informed decision-making to deliver the next generation of diagnostics and therapeutics to patients quicker, at lowered costs, and at scale.
The talk by Dr. Dennis Wang was followed by a panel discussion with Mr. Albert Wang, M. Eng., Head, IT Business Partner, Translational Research & Technologies, Bristol-Myers Squibb.
With the explosion of interest in both enhanced knowledge management and open science, the past few years have seen considerable discussion about making scientific data “FAIR” — findable, accessible, interoperable, and reusable. The problem is that most scientific datasets are not FAIR. When left to their own devices, scientists do an absolutely terrible job of creating the metadata that describe the experimental datasets that make their way into online repositories. The lack of standardization makes it extremely difficult for other investigators to locate relevant datasets, to re-analyse them, and to integrate them with other data. The Center for Expanded Data Annotation and Retrieval (CEDAR) has the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. The CEDAR workbench for metadata management will be presented in this webinar. CEDAR illustrates the importance of semantic technology for driving open science. It also demonstrates a means of simplifying access to scientific datasets and enhancing the reuse of the data to drive new discoveries.
Open interoperability standards, tools and services at EMBL-EBIPistoia Alliance
In this webinar Dr Henriette Harmse from EMBL-EBI presents how EMBL-EBI's ontology services are used to scale up the annotation of data and deliver added value through ontologies and semantics to their users.
Fair webinar, Ted slater: progress towards commercial fair data products and ...Pistoia Alliance
Elsevier is a global information analytics business that helps institutions and professionals advance healthcare and open science to improve performance for the benefit of humanity.
In this webinar, we discuss how Elsevier is increasingly leveraging the FAIR Guiding Principles to improve its products and services to better serve the scientific community.
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesPistoia Alliance
The FAIR (Findable, Accessible, Interoperable and Reusable) principles aim to maximize the discovery and reuse of digital resources. Using recently developed software and metrics to assess FAIRness and supported through an ELIXIR Implementation Study, Michel worked with a subset of ELIXIR Core Data Resources to apply these technologies. In this webinar, he will discuss their approach, findings, and lessons learned towards the understanding and promotion of the FAIR principles.
Implementing Blockchain applications in healthcarePistoia Alliance
Blockchain technology can revolutionise the way information is exchanged between parties by bringing an unprecedented level of security and trust to these transactions. The technology is finding its way into multiple use cases but we are yet to see full adoption and real-world business implementation in the Healthcare industry.
In this webinar we will explore the main challenges and considerations for the implementation of Blockchain technology in Healthcare use cases. This is the third webinar in our Blockchain Education series.
Building trust and accountability - the role User Experience design can play ...Pistoia Alliance
In this webinar our panel of UX specialists give a brief introduction to User Experience before presenting the design opportunities UX can bring to AI. We all know that AI has great potential, but it has significant hurdles to overcome, not least the human aspects of trust and ethical considerations when designing in the life sciences.
This document summarizes a webinar on using machine learning and data mining techniques to predict drug repurposing opportunities for chronic pancreatitis. Specifically:
1. Ensemble learning techniques like kernel-based models were used to analyze drug and disease target interaction data from multiple sources to identify potential drug candidates for repurposing.
2. The top 5 repurposing candidates identified through this process were being evaluated further by the partner organization Mission-Cure with the goal of beginning patient trials by January 2020.
3. Additional techniques discussed included using compressed sensing to analyze drug-disease networks and predict side effects to help evaluate candidate drugs identified for repurposing opportunities.
PA webinar on benefits & costs of FAIR implementation in life sciences Pistoia Alliance
The slides from the Pistoia Alliance Debates Webinar in which a panel of experts from technology providers and the biopharma industry were invited to share their views on the "Benefits and costs of FAIR Implementation for the life science industry".
Creating novel drugs is an extraordinarily hard and complex problem.
One of the many challenges in drug design is the sheer size of the search space for novel chemical compounds. Scientists need to find molecules that are active toward a biological target or pathway and at the same time have acceptable ADMET properties.
There is now considerable research going on using various AI and ML approaches to tackle these challenges.
Our distinguished speakers, Drs. Alex Tropsha and Ola Engkvist, will discuss their recent work in Drug Design involving Deep Reinforcement Learning and Neural Networks, and will answer questions from the audience on the current state of the research in the field.
Speakers:
Prof Alex Tropsha, Professor at University of North Carolina at Chapel Hill, USA
Dr. Ola Engkvist, Associate Director at AstraZeneca R&D, Gothenburg, Sweden
Alexander Tropsha presented on using AI and machine learning for drug design and discovery. He discussed using QSAR models to predict properties and activity of molecules based on their structural descriptors. He also introduced ReLeaSE, a new method using deep reinforcement learning to generate novel drug-like molecules and guide chemical library design through a thought cycle of molecule generation, model building, and iterative improvement. If successful, this approach could disrupt traditional computational drug discovery pipelines.
The root of the problem is the absence of standards in the current laboratory software environment.
Substantial funds are invested in treating the symptoms of our data management problems.
We patch local gaps in the software, fix specific problems, and invest in large integration efforts,
but are left with the fundamental, underlying root problems.
Lack of a standard data format: Each instrument may create its own format; conversion is needed in order to read the data.
Lack of standards for metadata: The contextual metadata describing analytical methods, instruments, processes, and even the reasons for doing an experiment are inconsistent, incomplete, and often spread over multiple applications in the analytical workflow
Lack of standard interface between software applications: Finally, because the software we use across the analytical workflow is often from different sources, they are written using different technologies, and read and write different formats, so don’t often talk to one another without a custom integration effort
Transition to next slide: These are a few examples of issues that affect our entire analytical data lifecycle
While the pharmaceutical company member reps fund the work and provide subject matter experts, the Foundation and Project benefit from the excellent professional partnership and support of
DBR for Project and Consortium management, as well as legal, logistical and scientific advice. They are not only professionals at running an organization like this (so do a better job than if it were left up to the scientists), it also allows the members to focus on the technical and strategic issues.
OSTHUS is the professional data and systems integrator we have partnered with to engineer and build the Framework.
As we will discuss more later, we created a partner program to enable collaboration with instrument and software vendors.
Data standards are necessary but NOT sufficient to solve the problems; the solution requires a holistic approach that builds standards into the software used throughout the data lifecycle.
Without adoption, a data standard alone is an abstraction that cannot create change on its own; the standard needs to be adopted, which ultimately means it has to be integrated into the software used to generate and manage data. The Allotrope Framework embodies a three-pronged approach to driving adoption:
Document standard
Store data and metadata in a vendor agnostic, common, non-proprietary file format
Ready for Archiving
Easy data sharing & retrieval
Taxonomies- Metadata repository
Ensures accurate, complete, & consistent experiment context is stored along with data
Reusable software components
Provide access to data, metadata, and business objects
Available for integration with vendor or in-house software
No single public standard that covers them all
Two important concepts underpin the ADF design; the first is the landscape of techniques.
Over the course of that workflow, important context, or metadata, is created, starting with things like the purpose, the project, whether or not it's a GxP process, etc.
In the planning stage experimental and instrument parameters are established, details of the analysis and any fitting algorithm contextualize results.
All of this is context we would like to have available to find and analyze the data later, but it's often incompletely captured, captured as free text, or spread over multiple software applications.
Ultimately the collective metadata is EVIDENCE that supports a DECISION about your MANUFACTURING PROCESS or MATERIAL…
That standardization of the interface between software applications also creates a more plug-and-play environment, making it much easier to substitute one brand of instrument for another, use a different analysis application to access an alternative processing algorithm, or simply avoid having to rebuild custom interfaces when one of your pieces of software requires an upgrade.
Finally, reporting is more powerful because it is based on meaningful criteria and uses an index built on standardized terminology, an index constructed from the complete and consistent context captured across everything that happened along the workflow.
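As a hedged illustration of that kind of index-driven reporting (the vocabulary IRIs and file name below are invented for this sketch, not Allotrope's actual ontology), a SPARQL query over standardized metadata can pull every run matching meaningful criteria, regardless of which instrument or application produced the data.

from rdflib import Graph

g = Graph()
g.parse("run_metadata.ttl", format="turtle")   # hypothetical export of standardized run metadata

query = """
PREFIX ex: <http://example.org/vocab/>
SELECT ?run ?instrument WHERE {
    ?run ex:technique  ex:HPLC ;        # meaningful, standardized criteria ...
         ex:project    ex:Project123 ;
         ex:instrument ?instrument .    # ... independent of which instrument produced the data
}
"""
for row in g.query(query):
    print(row.run, row.instrument)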
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high-volume, complex data. HDF5 is portable and extensible.
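A minimal sketch of the pattern this enables, using Python and the h5py library: numeric data and its descriptive metadata travel together in one portable, self-describing file. The group, dataset and attribute names are illustrative only and do not reflect the actual ADF layout.

import h5py
import numpy as np

# Write numeric data and its descriptive metadata into one portable HDF5 file.
with h5py.File("example_run.h5", "w") as f:
    run = f.create_group("chromatography_run_001")            # hypothetical group name
    run.create_dataset("signal", data=np.random.rand(1000), compression="gzip")
    run.attrs["technique"] = "HPLC-UV"                        # metadata stored with the data
    run.attrs["instrument_id"] = "hypothetical-instrument"
    run.attrs["gxp"] = True

# Read it back: data and context come from the same self-describing file.
with h5py.File("example_run.h5", "r") as f:
    run = f["chromatography_run_001"]
    print(dict(run.attrs), run["signal"].shape)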
Only the start! The ADF design and practices being developed facilitate rapid taxonomy development with SMEs, and thus extension to additional techniques.
Standards
Evaluated > 100 public standards against scientific and business requirements across the full data lifecycle, from creation to archiving
Developed reference architecture for data archiving based on public standards
Federated select standards and ontologies for use by the Framework
Development
Created first version of Framework (pre-release), with class libraries for ADF, metadata repository and data archive
Created proof-of-concept software and delivered to all members
Benchmarked ADF performance using MS data
Launched the Allotrope Partner Network to partner with instrument and software vendors to facilitate adoption
Initiated interactions with FDA
Second important concept in ADF design is the dimension of the process in which analytical measurements are used
Drivers of adoption
Adoption of the standards by way of the Allotrope Framework standardizes the data format, the contextual metadata (the orange box), and the inputs and outputs between software applications (the orange arrows). The metadata will be stored with the data.
This additionally opens up significantly more opportunity for automating the workflow; for example, instrument parameters can be sent automatically to the instrument from the LIMS or ELN where they are first selected, removing the need for a human to read a document created in one system and re-enter the parameters in another.
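A hypothetical Python sketch of that automation (the parameter names and JSON transport are invented for illustration): method parameters selected once in the ELN/LIMS are handed to the instrument as a structured payload instead of being re-typed by an analyst.

import json
from dataclasses import dataclass, asdict

@dataclass
class MethodParameters:
    column: str
    flow_rate_ml_min: float
    run_time_min: float

def payload_for_instrument(params: MethodParameters) -> str:
    """Serialize the ELN/LIMS-selected parameters for the instrument (illustrative transport)."""
    return json.dumps(asdict(params))

# Parameters chosen once in the ELN/LIMS flow straight to the instrument, with no manual re-entry.
print(payload_for_instrument(MethodParameters(column="C18", flow_rate_ml_min=1.0, run_time_min=12.5)))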