Alex Henderson & Peter Gardner
Manchester Institute of Biotechnology, University of Manchester, UK
http://gardner-lab.com & http://clirspec.org
SPEC 2014 Shedding New Light on Disease 
Kraków, Poland. 17-22 August 2014 
WHAT’S MINE IS YOURS (AND VICE VERSA): DATA SHARING IN VIBRATIONAL SPECTROSCOPY
Sharing…
Why share? 
Technique validation 
Round-robins 
Standard spectra for unknown identification 
Standard operating procedure validation 
Test visualisation schemes 
Remote location of special samples 
Remote location of special equipment
What to share? 
Raw data files 
E.g. for testing data processing procedures
Metadata for sample preparation 
Sample SOP 
Metadata for experimental procedure/protocol 
Acquisition SOP 
Processed data to save doing it yourself
How to give? 
Pen drive 
CD 
Email 
Dropbox 
FTP server
Data repository 
One-to-one 
One-to-few 
One-to-more 
One-to-all 
Best solution
How to receive? 
Data in different file formats introduce a barrier for the end user
Disconnect between analysis software and file format 
Incorrectly/poorly coded formats require additional information 
(hyper)Spectral data disconnected from sample treatments or acquisition protocols
Third-party data analysis suites
Package        Author                      Platform
CytoSpec       Peter Lasch                 MATLAB
hyperSpec      Claudia Beleites            R
ProSpect       Paul Bassan                 MATLAB
SpecToolbox    Matt Baker (and friends)    MATLAB
…
Not an exhaustive list, email me your package info 
Each author must write an import filter for each version of each vendor’s format
Writing import filters 
Slow 
Laborious 
Steep learning curve 
Potential for error 
Incomplete filter without sufficient test data 
No access to file format specification/detail 
IP issues with proprietary formats (NDA) 
Some limited to (32-bit) Windows (e.g. DLL or DDE)
Objectives 2014 – 2017 
Developing 
Understanding of interaction of light with clinical samples 
Strategies for pre-processing and statistical analysis in clinical spectroscopy 
Protocols 
Preparation of cells, tissue and biofluids for clinical spectroscopy 
Inter-group data sharing 
Evidence 
Power of spectroscopy for use in the clinical arena 
Requirements of instrumentation suitable for use in the clinic 
Clinical Infrared and Raman Spectroscopy for Medical Diagnosis 
PARTNERS 
ACADEMIC 
Peter Gardner 
Matthew J Baker 
Nicholas Stone 
Julian Moger 
Josep Sulé-Suso 
Francis Martin 
Sergei G Kazarian 
Hugh J Byrne 
Roy Goodacre 
John M Chalmers 
Alex Henderson 
Peter Lasch 
Ganesh Sockalingum 
Bayden Wood 
Peter Weightman 
Gianfelice Cinque 
Peter Rich 
CLINICAL 
Noel Clarke 
Jonathan Shanks 
Timothy Dawson 
Charles Davis 
Pierre Martin-Hirsch 
Hugh Barr 
Neil Shepherd 
John McGrath 
Jim Brown 
Sam Janes 
INDUSTRIAL 
Agilent 
Bruker 
Cobalt Light Systems 
Coherent UK 
Perkin Elmer 
Renishaw 
@clirspec 
http://clirspec.org/
CLIRSPEC Work Package 6 
Assess current spectral and image data attributes from the range of currently employed network instrumentation 
Develop a standard data transfer format to allow free and easy dissemination of data between network members, enhancing collaboration and the efficient use of research funding
Provide a single software target, easing the development of third-party software and its uptake within the clinical arena
Investigate the utility of standard spectra for specific diseases 
Investigate the technological, cultural, ethical and IP issues in order to enable data sharing and reuse
Data format requirements 
Operating system neutral 
Scalable to large file sizes (futureproof) 
Random access (don’t unzip before reading) 
File format description openly available (no NDA)
Other software available that can read it 
Quick to write and, more importantly, quick to read 
Able to hold (encrypted) instrumental parameters 
Enables round-tripping, no information loss 
…
Open data formats – Spectra 
JCAMP-DX 
Over 4 compression systems 
Some code available 
Grams SPC 
Understands spectroscopy types and units 
Some import filters available 
CSV/text 
Simple to read 
Not scalable 
Not suitable for images 
Loss of metadata
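The CSV trade-off listed above (simple to read, loss of metadata) can be sketched as follows. The spectrum values, the metadata fields and the two-column layout are hypothetical, chosen only for illustration. The numbers survive a round trip through plain two-column CSV, but the acquisition details have nowhere to go.

```python
import csv
import io

# Hypothetical spectrum with acquisition metadata (all values are made up)
spectrum = {
    "wavenumber_cm1": [1000.0, 1002.0, 1004.0],
    "absorbance": [0.12, 0.15, 0.11],
    "metadata": {"instrument": "hypothetical FTIR", "resolution_cm1": 4},
}


def to_csv(spec):
    """Write the x,y pairs as two-column CSV; metadata has no column to live in."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for x, y in zip(spec["wavenumber_cm1"], spec["absorbance"]):
        writer.writerow([x, y])
    return out.getvalue()


def from_csv(text):
    """Read the x,y pairs back; metadata cannot be recovered from the file."""
    rows = list(csv.reader(io.StringIO(text)))
    return {
        "wavenumber_cm1": [float(r[0]) for r in rows],
        "absorbance": [float(r[1]) for r in rows],
        "metadata": {},
    }


restored = from_csv(to_csv(spectrum))
```

Metadata could be squeezed into comment lines or a header row, but every reader would then need to agree on that convention, which is the interoperability problem again.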
Hyperspectral images 
Grams SPC 
Pixel indexing issues, needs help 
ENVI 
Manual spectrum-centric or image-centric access 
May require IDL library 
NetCDF-4 
Self-describing, accessed via libraries 
Compression and streaming available
3D confocal and tomographic 
NetCDF-4 
Unlimited dimensionality 
Optimised spectrum-centric or image-centric access through ‘chunking’ 
Supported
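How chunk shape trades spectrum-centric against image-centric access can be sketched with simple arithmetic. This does not call the NetCDF library; it only counts how many chunks a read would touch, assuming a hypothetical 128 x 128 pixel image with 1024 spectral channels and three illustrative chunk shapes.

```python
from math import prod


def chunks_touched(chunk_shape, selection):
    """Count the chunks intersected by a selection of (start, stop) ranges, one per axis."""
    counts = []
    for chunk, (start, stop) in zip(chunk_shape, selection):
        first = start // chunk        # index of the first chunk hit on this axis
        last = (stop - 1) // chunk    # index of the last chunk hit on this axis
        counts.append(last - first + 1)
    return prod(counts)


# Hypothetical hypercube: rows x columns x spectral channels
CUBE = (128, 128, 1024)

# One full spectrum at a single pixel, and one full image at a single channel
one_spectrum = ((10, 11), (20, 21), (0, CUBE[2]))
one_image = ((0, CUBE[0]), (0, CUBE[1]), (500, 501))

# Illustrative chunk shapes (assumptions, not recommendations)
layouts = {
    "spectrum-centric": (1, 1, 1024),
    "image-centric": (128, 128, 1),
    "balanced": (16, 16, 64),
}

# For each layout: (chunks read for one spectrum, chunks read for one image)
costs = {
    name: (chunks_touched(shape, one_spectrum), chunks_touched(shape, one_image))
    for name, shape in layouts.items()
}
```

Under these assumptions each extreme layout reads a single chunk for its favoured access pattern and a very large number for the other, while an intermediate chunk shape keeps both reads moderate.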
Community input required 
Data types that need to be supported 
Irregularly shaped images 
Collections of spectra 
Discrete wavelength data (multispectral not hyperspectral) 
Time course (multiple dependent variables) 
Software 
Filters written, format testing etc. 
THINKING and PLANNING!!
Registration at http://clirspec.org
Groups at http://clirspec.org
Updates at http://clirspec.org
Remember…

 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 

Recently uploaded (20)

Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 

What's mine is yours (and vice versa): Data sharing in vibrational spectroscopy

  • 1. Alex Henderson & Peter Gardner, Manchester Institute of Biotechnology, University of Manchester, UK. http://gardner-lab.com & http://clirspec.org. SPEC 2014: Shedding New Light on Disease, Kraków, Poland, 17-22 August 2014. What’s mine is yours (and vice versa): data sharing in vibrational spectroscopy.
  • 3. Why share? Technique validation; round-robins; standard spectra for unknown identification; standard operating procedure validation; test visualisation schemes; remote location of special samples; remote location of special equipment.
  • 4. What to share? Raw data files (e.g. for testing data processing procedures); metadata for sample preparation (sample SOP); metadata for experimental procedure/protocol (acquisition SOP); processed data, to save doing it yourself.
  • 6. How to give? Pen drive; CD; email; Dropbox; FTP server; data repository. One-to-one; one-to-few; one-to-more; one-to-all. Best solution
  • 7. How to receive? Data in different file formats introduces a barrier to the end user; disconnect between analysis software and file format; incorrectly/poorly coded formats require additional information; (hyper)spectral data disconnected from sample treatments or acquisition protocols.
  • 8. Third-party data analysis suites (package, author, platform): CytoSpec, Peter Lasch, MATLAB; hyperSpec, Claudia Beleites, R; ProSpect, Paul Bassan, MATLAB; SpecToolbox, Matt Baker (and friends), MATLAB; … Not an exhaustive list, email me your package info. Author must write an import filter for each version of each vendor’s formats.
  • 9. Writing import filters: slow; laborious; steep learning curve; potential for error; incomplete filter without sufficient test data; no access to file format specification/detail; IP issues with proprietary formats (NDA); some limited to (32-bit) Windows (e.g. DLL or DDE).
  • 10. Objectives 2014–2017. Developing: understanding of interaction of light with clinical samples; strategies for pre-processing and statistical analysis in clinical spectroscopy. Protocols: preparation of cells, tissue and biofluids for clinical spectroscopy; inter-group data sharing. Evidence: power of spectroscopy for use in the clinical arena; requirements of instrumentation suitable for use in the clinic. Clinical Infrared and Raman Spectroscopy for Medical Diagnosis. Partners. Academic: Peter Gardner, Matthew J Baker, Nicholas Stone, Julian Moger, Josep Sulé-Suso, Francis Martin, Sergei G Kazarian, Hugh J Byrne, Roy Goodacre, John M Chalmers, Alex Henderson, Peter Lasch, Ganesh Sockalingum, Bayden Wood, Peter Weightman, Gianfelice Cinque, Peter Rich. Clinical: Noel Clarke, Jonathan Shanks, Timothy Dawson, Charles Davis, Pierre Martin-Hirsch, Hugh Barr, Neil Shepherd, John McGrath, Jim Brown, Sam Janes. Industrial: Agilent, Bruker, Cobalt Light Systems, Coherent UK, Perkin Elmer, Renishaw. @clirspec http://clirspec.org/
  • 11. CLIRSPEC Work Package 6: assess current spectral and image data attributes from the range of currently employed network instrumentation; develop a standard data transfer format to allow free and easy dissemination of data between network members, enhancing collaboration and efficiency of research funding; provide a single software target, easing the development of third-party software and its uptake within the clinical arena; investigate the utility of standard spectra for specific diseases; investigate the technological, cultural, ethical and IP issues in order to enable data sharing and reuse.
  • 13. Data format requirements: operating system neutral; scalable to large file sizes (futureproof); random access (don’t unzip before reading); file format description available (NDA open); other software available that can read it; quick to write and, more importantly, quick to read; able to hold (encrypted) instrumental parameters; enables round-tripping, no information loss; …
  • 14. Open data formats – Spectra. JCAMP-DX: over 4 compression systems; some code available. Grams SPC: understands spectroscopy types and units; some import filters available. CSV/text: simple to read; not scalable; not suitable for images; loss of metadata.
  • 15. Hyperspectral images. Grams SPC: pixel indexing issues, needs help. ENVI: manual spectrum-centric or image-centric access; may require IDL library. NetCDF-4: self-describing, accessed via libraries; compression and streaming available.
  • 16. 3D confocal and tomographic. NetCDF-4: unlimited dimensionality; optimised spectrum-centric or image-centric access through ‘chunking’. Supported
  • 17. Community input required. Data types that need to be supported: irregularly shaped images; collections of spectra; discrete wavelength data (multispectral not hyperspectral); time course (multiple dependent variables). Software: filters written, format testing etc. THINKING and PLANNING!!
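
The distinction between spectrum-centric and image-centric access in slides 15 and 16 can be illustrated with a small sketch. It is not taken from the talk: the function names, the float32 little-endian sample type and the tiny synthetic cube are all assumptions made for illustration. It compares two flat layouts for a hyperspectral cube, band-interleaved-by-pixel (BIP, each spectrum stored contiguously) and band-sequential (BSQ, each single-band image stored contiguously), and counts the seeks needed to pull out one pixel's spectrum by random access.

```python
import io
import struct

ITEM = struct.calcsize("<f")  # bytes per sample; little-endian float32 is an assumption


def offset(layout, row, col, band, n_rows, n_cols, n_bands):
    """Byte offset of one sample in a flat (rows x cols x bands) cube."""
    if layout == "bip":    # band-interleaved-by-pixel: each spectrum is contiguous
        index = (row * n_cols + col) * n_bands + band
    elif layout == "bsq":  # band-sequential: each single-band image is contiguous
        index = (band * n_rows + row) * n_cols + col
    else:
        raise ValueError(f"unknown layout: {layout!r}")
    return index * ITEM


def pack_cube(layout, cube, n_rows, n_cols, n_bands):
    """Serialise cube[row][col][band] into bytes using the chosen layout."""
    buf = bytearray(n_rows * n_cols * n_bands * ITEM)
    for row in range(n_rows):
        for col in range(n_cols):
            for band in range(n_bands):
                pos = offset(layout, row, col, band, n_rows, n_cols, n_bands)
                struct.pack_into("<f", buf, pos, cube[row][col][band])
    return bytes(buf)


def read_spectrum(stream, layout, row, col, n_rows, n_cols, n_bands):
    """Read one pixel's spectrum; return (values, number_of_seeks)."""
    if layout == "bip":
        # The whole spectrum is one contiguous run: a single seek and read.
        stream.seek(offset(layout, row, col, 0, n_rows, n_cols, n_bands))
        raw = stream.read(n_bands * ITEM)
        return list(struct.unpack(f"<{n_bands}f", raw)), 1
    # Otherwise each band's sample lives in a different image plane.
    values = []
    for band in range(n_bands):
        stream.seek(offset(layout, row, col, band, n_rows, n_cols, n_bands))
        values.append(struct.unpack("<f", stream.read(ITEM))[0])
    return values, n_bands


N_ROWS, N_COLS, N_BANDS = 2, 3, 4
# Synthetic data: each value encodes its own (row, col, band) position.
CUBE = [[[float(100 * r + 10 * c + b) for b in range(N_BANDS)]
         for c in range(N_COLS)] for r in range(N_ROWS)]

for layout in ("bip", "bsq"):
    stream = io.BytesIO(pack_cube(layout, CUBE, N_ROWS, N_COLS, N_BANDS))
    spectrum, seeks = read_spectrum(stream, layout, 1, 2, N_ROWS, N_COLS, N_BANDS)
    print(layout, spectrum, "seeks:", seeks)
```

Both layouts return the same spectrum, but BIP needs one seek where BSQ needs one per band; reading a single-band image reverses the advantage. Chunked storage, as mentioned for NetCDF-4, lets the file author choose a tile shape between these two extremes rather than committing to one of them.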