Reactions to
The Open Spectral Database
http://osdb.info
Stuart J. Chalk, Department of Chemistry
University of North Florida
schalk@unf.edu
Instigator: Tony Williams
SCTY 28 – Pacifichem 2015
Outline
• What would Jean-Claude Bradley have wanted?
• Share and Reuse Research Data!
• How Do You Make Everything Open?
• JCAMP Implementation
• The Open Spectral Database
• Data Model
• Live Demo (fingers crossed)
• Future Plans
• Conclusion
What Would JCB Have Wanted?
• Simple: Openness as the norm, not the exception
• Data made available, without restriction, so it's useful
• Mechanisms/tools to make data available
• Formats to allow others to get the data…
• …but also so it's easy to use
• Annotated data to make it easy to find
• Community-driven promotion of and action on these issues
Share and Reuse Research Data!
• Ryan P. Womack (2015) Research Data in Core Journals in Biology, Chemistry, Mathematics, and Physics. PLoS ONE 10(12): e0143460. doi:10.1371/journal.pone.0143460
How Do You Make Everything Open?
• You have to know/define what “everything” means
• Open Data
• Open Data Model
• Open and useable data structures
• Open Code
• Open to input from the community on all aspects
• Open to add, extend, change, and rethink all of this
First Attempt
• Spectral data: there are many formats, but only one open and generally accepted standard, JCAMP
• It's not perfect…
• …but it's an output format people can share
• Let's export the data and metadata, and infer as much as possible, from JCAMP files
• Not as easy as it seems…
Challenges with JCAMP
• Great data exchange format, however…
• …not meant to be computer input…
• …more a way to get data out so a human can process it
• Missing parameters (metadata)
• Missing data
• Incorrect values
• Extra data
• Incorrectly compressed
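The problems listed on this slide could be checked for roughly as follows. This is a minimal Python sketch, not the OSDB code (which is written in PHP): the function name `find_problems`, the dictionary shape, and the choice of `REQUIRED_LABELS` are assumptions made for illustration. It flags missing parameters, missing data, an unreadable point count, and extra data.

```python
# Illustrative subset of core header labels, written in normalized form
# (upper case, no spaces or dashes). The exact list is an assumption.
REQUIRED_LABELS = ("TITLE", "JCAMPDX", "DATATYPE", "ORIGIN", "OWNER")


def find_problems(ldrs, y_values):
    """Return a list of human-readable problems found in a parsed JCAMP file.

    ldrs: dict mapping normalized labels to string values (hypothetical shape).
    y_values: list of decoded ordinate values.
    """
    problems = []

    # Missing parameters (metadata)
    for label in REQUIRED_LABELS:
        if not ldrs.get(label, "").strip():
            problems.append(f"missing parameter: {label}")

    # Missing data
    if not y_values:
        problems.append("missing data: no points decoded")

    # Incorrect values, extra data, or too few points, judged against NPOINTS
    declared = ldrs.get("NPOINTS")
    if declared is not None:
        try:
            expected = int(declared)
        except ValueError:
            problems.append(f"incorrect value: NPOINTS={declared!r}")
        else:
            if len(y_values) > expected:
                problems.append(
                    f"extra data: {len(y_values)} points, NPOINTS={expected}"
                )
            elif 0 < len(y_values) < expected:
                problems.append(
                    f"missing data: {len(y_values)} points, NPOINTS={expected}"
                )
    return problems
```

Returning a list of problems, rather than raising on the first one, lets an uploader see everything that is wrong with a file in one pass.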
The Open Spectral Database
• Upload JCAMP spectra
• Data and metadata extracted
• Organize metadata so it can be used to find data
• Use a REST-based website and API to make data available and allow searching; document the API
• Make the website available as a project on GitHub and invite the community to get involved
Technology
• Apache 2.4 (http://httpd.apache.org)
• PHP 5.6 (http://www.php.org)
• CakePHP 2.7 PHP Framework (http://cakephp.org)
• MySQL 5.5 (http://www.mysql.com)
• jQuery (JavaScript) (http://jquery.com)
• Flot for jQuery (http://www.flotcharts.org)
• JSmol (http://jmol.org)
• Bootstrap CSS (http://getbootstrap)
• eXtensible Markup Language (http://www.w3.org/TR/xml/)
• JavaScript Object Notation (JSON) (http://json.org)
• JSON for Linked Data (JSON-LD) (http://www.w3.org/TR/json-ld/)
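To give a feel for the JSON-LD item in this list, here is a minimal Python sketch of wrapping a spectrum as a JSON-LD document. It is illustrative only: the `example.org` vocabulary and identifiers, the property names, and the function `to_jsonld` are hypothetical and are not the OSDB export format. Only the `@context`, `@id`, and `@type` keywords come from JSON-LD itself.

```python
import json


def to_jsonld(spectrum_id, title, x_units, y_units, points):
    """Build a JSON-LD style dict for one spectrum (hypothetical vocabulary)."""
    return {
        # @vocab makes every plain key below resolve against this placeholder namespace
        "@context": {"@vocab": "https://example.org/spectra#"},
        "@id": f"https://example.org/spectra/{spectrum_id}",
        "@type": "Spectrum",
        "title": title,
        "xUnits": x_units,
        "yUnits": y_units,
        "points": [{"x": x, "y": y} for x, y in points],
    }


if __name__ == "__main__":
    doc = to_jsonld("demo-1", "Ethanol IR", "1/CM", "TRANSMITTANCE",
                    [(400.0, 0.91), (402.0, 0.89)])
    print(json.dumps(doc, indent=2))
```

Because JSON-LD is ordinary JSON plus a context, the same document can be read by a plain JSON client or by linked-data tooling.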
Ingestion Process
• JCAMP file is imported into PHP as an array, then
• Clean
  • Uncomment ($$)
• Separate
  • Labeled Data Records (LDRs)
  • Parameters (##.)
  • User Defined Labels (##$)
• Validate
• Standardize
• Decompress
• Convert to output format or store in database
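The first steps of this ingestion process (clean, uncomment, separate) can be sketched as below. This is a Python illustration under stated assumptions, not the PHP code used by the OSDB: the function names and the three-bucket return shape are hypothetical, and validation, standardization, and decompression are left out.

```python
import re


def normalize_label(label):
    """Upper-case a label and drop spaces, dashes, slashes, and underscores."""
    return re.sub(r"[\s\-/_]", "", label).upper()


def separate_records(text):
    """Split JCAMP text into core LDRs, '##.' parameters, and '##$' user labels."""
    buckets = {"ldrs": {}, "parameters": {}, "user_defined": {}}
    target, label = None, None
    for raw in text.splitlines():
        # Uncomment: everything after '$$' on a line is a comment
        line = raw.split("$$", 1)[0].strip()
        # Clean: skip lines that are empty once comments are removed
        if not line:
            continue
        if line.startswith("##"):
            # Separate: a new record starts; pick its bucket from the label prefix
            name, _, value = line[2:].partition("=")
            label = normalize_label(name)
            if label.startswith("$"):
                target = buckets["user_defined"]
            elif label.startswith("."):
                target = buckets["parameters"]
            else:
                target = buckets["ldrs"]
            target[label] = value.strip()
        elif label is not None:
            # Continuation line (e.g. data rows) belongs to the current record
            target[label] = (target[label] + "\n" + line).strip()
    return buckets


SAMPLE = """\
##TITLE= Ethanol IR $$ demo file
##JCAMP-DX= 4.24
##DATA TYPE= INFRARED SPECTRUM
##.OBSERVE FREQUENCY= 400.1
##$MY NOTE= hello
##XYDATA= (X++(Y..Y))
400 1 2 3
403 4 5 6
##END=
"""

if __name__ == "__main__":
    for bucket, records in separate_records(SAMPLE).items():
        print(bucket, records)
```

Keeping the three kinds of record in separate dictionaries mirrors the "Separate" step and makes the later validation and standardization passes simpler to write.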
Data Model
• To organize the data and metadata, they are distributed across a number of tables in the database
• This is a generic science data model that is being used for multiple projects
• Not limited to spectra or even just chemistry data
Data Model
Live Demo
• File upload
• Export formats
• Search API
Semantic Annotation
Reactions to Alpha Version
• Enthusiastic feedback with constructive comments…
  • Spectral list is boring; needs molecules linked to spectra
  • Less metadata on the spectral page, with an option to see more
  • Revise homepage to make it more inviting
Reactions to Beta Version
• Again enthusiastic…
  • “Love the layout! Very clean…”
  • “Nice Work!” (Twitter comment)
• …with constructive comments
  • Needs a zoom feature for spectra
  • Clicking on a spectrum provides data that is not useful
  • Maybe you could use JSpecView rather than Flot?
Things To Do
• Handle more complicated JCAMP files
• Handle file formats other than JCAMP
• Export in AnIML format
• Expand the API
• Improve Flot viewer functionality (e.g. zoom)
• Add JSpecView spectral viewer
• Endpoint summary page
• Document the website (GitHub)
• Document how to contribute to the website (GitHub)
• Solicit feature requests and encourage contributions
Take Home
• The OSD is open for the community to develop and implement ideas about open spectral data re:
  • Data Model
  • API features
  • Export Formats
  • Services
• Community Involvement!
  • Use as a data source for other applications
  • Submission of feature requests
  • Participation as code contributor
Questions?
• schalk@unf.edu
• Phone: 904-620-5311
• Skype: stuartchalk
• Twitter: @StuChalk
• LinkedIn/SlideShare: https://www.linkedin.com/in/stuchalk
• ORCID: http://orcid.org/0000-0002-0703-7776
• ResearcherID: http://www.researcherid.com/rid/D-8577-2013
