Data Standards & Best Practices
Kerstin Lehnert
Lamont-Doherty Earth Observatory
iedadata.org
Vouchering the Stratigraphic
Record
 A synthesis database?
 Aggregates data that are published in articles or in data
repositories
 Requirements: Integration, Quality (Trusted data!)
 Needs standardized metadata, semantics, and persistent
unique identifiers
 A trusted repository?
 Publishes and ensures persistent access to data
 Requirements: Compliance with international data
curation and repository standards
 Long-term preservation, data identification (DOI), editorial
procedures, etc.
Data Standards
“documented agreements on representation, format,
definition, structuring, tagging, transmission,
manipulation, use, and management of data.”
 Discipline specific
 Data type specific
 Application specific
Data Standards: Why?
 Re-usability of data
 Reproducibility of science
 Integration/interoperability of data
Reproducibility in the Field
Sciences
 Workshop in May 2015, organized by AAAS (M. McNutt), AGU,
and ESA, funded by the Arnold Foundation
 Report in preparation
Technical Requirements for Transparent, Reproducible Data
1. The data themselves must be publicly available in machine-readable, non-
proprietary formats with accurate and precise descriptive metadata;
2. Data provenance—process(es) by which usable datasets were generated or
derived from raw, often streaming or machine-readable-only data—must be
accurately and precisely specified;
3. Computer code (“scripts”) and software with which datasets were analyzed
must be available and adequately described to ensure their repeated use and
be publicly available in non-proprietary formats, and;
4. Version control should be used to ensure that the original data and code are
maintained.
(from draft workshop report)
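As a minimal sketch of requirements 1, 2, and 4 above, the following Python snippet writes a small machine-readable provenance record for a derived dataset. The function name, the field names (`raw_sha256`, `steps`, `code_version`), and the sample values are illustrative assumptions; none of them come from the workshop report.

```python
import hashlib
import json


def provenance_record(raw_bytes, derived_bytes, steps, code_version):
    """Describe how a derived dataset was produced from raw data.

    All field names are hypothetical; a real project would follow the
    metadata schema of its chosen repository.
    """
    return {
        # Checksums let a reader verify they hold the same bytes.
        "raw_sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "derived_sha256": hashlib.sha256(derived_bytes).hexdigest(),
        # Ordered list of processing steps (requirement 2).
        "steps": list(steps),
        # Identifier of the code revision that was used (requirement 4).
        "code_version": code_version,
    }


# Made-up raw data in a plain-text, non-proprietary format (requirement 1).
raw = b"depth_m,value\n1.0,10\n2.0,\n3.0,30\n"

# Derivation step: drop rows whose value is missing.
lines = raw.decode().splitlines()
kept = [ln for ln in lines if not ln.endswith(",")]
derived = ("\n".join(kept) + "\n").encode()

record = provenance_record(
    raw, derived, ["drop rows with missing value"], "v0.1-example"
)
print(json.dumps(record, indent=2))
```

Storing such a record next to the data file, under version control, keeps the description of the processing and the data it describes in step.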
Coalition for Publishing Data in the Earth
& Space Sciences (COPDESS)
 Joint initiative of Earth Science publishers and Data
Facilities to better help translate the aspirations of
open, available, and useful data from policy into
practice.
 Reaffirm and ensure adherence to existing journal and
publishing policies and society position statements
regarding open data sharing and archiving of data, tools,
and models.
 Ensure that Earth science data will, to the greatest extent
possible, be stored in community approved repositories
that can provide additional data services.
 Statement of Commitment signed by all major
Earth & Space Science publishers
www.copdess.org
Repository Standards
 Open access
 Data quality assurance (editorial process)
 Persistence (long-term preservation)
 Persistent & unique identification of data (DOI
registration)
 Standards-based metadata (ISO) & APIs (OAI-PMH)
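To illustrate what a standards-based API offers, here is a hedged sketch of how a client might build an OAI-PMH harvesting request. `verb`, `metadataPrefix`, and `from` are request arguments defined by the OAI-PMH protocol, and `oai_dc` is its baseline metadata format; the endpoint URL and the function name are hypothetical, and the snippet only builds the URL without contacting any server.

```python
from urllib.parse import urlencode


def oai_pmh_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
url = oai_pmh_list_records_url(
    "https://repository.example.org/oai", from_date="2015-01-01"
)
print(url)
```

Because every compliant repository answers the same request, one harvester can collect metadata from many repositories without custom code for each.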
[Figure labels: small data, BIG DATA; findable, accessible, re-usable, interoperable; identification, persistence; protection, protocols; context, provenance; harmonized, machine-readable. Repository types: Generic Repositories, Domain Repositories, Community Data Collections.]
Distributed Data Curation
 Alert: Stratigraphy is multi-disciplinary
 There are many data types that already have homes
 Paleobio Database
 Macrostrat/Digital Crust
 Geochron (@IEDA)
 MagIC
 Open Core Data (@IEDA – under development)
 EarthChem (@IEDA)
 System for Earth Sample Registration (@IEDA)
 Don’t reinvent, but leverage, link, & integrate!
EarthCube
EarthCube: A Process
Get all the info at: http://earthcube.org
[Figure labels: Computer Sciences, Software Engineers, Scientific Vision, Technical Architecture, Engagement, Funded Projects]
Back to Data Standards
 Metadata
 Content
 Structure (data model)
 Vocabularies & Taxonomies
 Identifiers
 (API = Application Programming Interface)
Metadata Standards
 Geospatial
 Scientific Context
 Object classifications
 Methods (instrumentation, computation, etc.)
 Actions
 dates
 actors
 Data provenance (references, authors, etc.)
Open Geospatial Consortium (OGC):
Observations & Measurements
Sampling Observation
“Observations commonly involve sampling of an ultimate feature of
interest. This International Standard defines a common set of sampling
feature types classified primarily by topological dimension, as well as
samples for ex-situ observations.”
(OGC O&M 2.0.0 / ISO19156; editor: Simon Cox)
e.g. Station,
Transect, Section
Observation Data
Model v2
Kerstin Lehnert: "Making small data BIG: Insights from a Long-tail Geoscience Domain"
ODM2 Team:
J S Horsburgh
A K Aufdenkampe
L Hsu
A Jones
K Lehnert
E Mayorga
L Song
D Tarboton
I Zaslavsky
Data Templates
LPSC 2015 Workshop: Restoration and Synthesis of Planetary Geochemical Data
Persistent Unique Identifiers
 Samples: IGSN
 Dataset: DOI
 Article publication: DOI
 Awards & grants: FundRef
 Researchers: ORCID
 Field Program: Cruise ID
Data DOI Metadata
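One way to picture how these identifiers meet in the metadata of a dataset DOI is the sketch below. The record layout, the key names, and every identifier value are made-up placeholders for illustration; they do not follow any particular registry's schema.

```python
# Every identifier value below is a made-up placeholder.
dataset_metadata = {
    "identifier": {"type": "DOI", "value": "10.1234/example-dataset"},
    "creators": [{"name": "Example Researcher", "ORCID": "0000-0000-0000-0000"}],
    "related": {
        "samples": [{"type": "IGSN", "value": "XXX000001"}],
        "article": {"type": "DOI", "value": "10.1234/example-article"},
        "award": {"type": "FundRef", "value": "example-funder-id"},
        "field_program": {"type": "Cruise ID", "value": "EX0001"},
    },
}


def identifier_types(record):
    """Collect the identifier schemes referenced by a metadata record."""
    types = {record["identifier"]["type"]}
    if any("ORCID" in creator for creator in record["creators"]):
        types.add("ORCID")
    for link in record["related"].values():
        # A relation may hold one link or a list of links.
        links = link if isinstance(link, list) else [link]
        types.update(item["type"] for item in links)
    return sorted(types)


print(identifier_types(dataset_metadata))
```

With every related entity referenced by a persistent identifier rather than a free-text name, a machine can follow the links from a dataset to its samples, authors, funding, and field program.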
Internet of Samples in the Earth
Sciences
 Physical samples need to be linked to the digital data
generated by their study.
 Reproducibility! Access to the physical samples is required to
verify & reproduce observations.
 Re-usability! Access to information about samples is required
for proper evaluation & interpretation of sample-based data.
 Physical samples need to be shared broadly for use &
re-use.
 Samples are often expensive to collect (drilling, remote locations).
 Many samples are unique and irreplaceable.
 Re-analysis augments utility of existing data.
 Samples often serve in ways that the collectors and repositories could
not have imagined.
3/26/2015
Unique Sample Identification
 Imagine the possibilities …
 Easily find a specific sample and contact its owner
 Find all publications that mention a specific sample
 Find all data for that sample across the literature
and distributed databases
 Find other samples with similar properties
 geospatial
 temporal
 compositional
Sample Identification Until Now
 Samples have ambiguous and non-persistent
names and cannot be properly cited.
The EarthChem Portal shows
75 publications with
geochemical data
referenced to a sample with
the name M1 (or M-1).
(www.earthchem.org)
Names of dredge sample 3 of
the Amphitrite cruise
(PetDB database, www.petdb.org)
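The ambiguity described above can be sketched in a few lines of Python. The records, publication labels, and identifier values are invented for illustration, and `normalize` is a hypothetical helper that mimics how a search would collapse spelling variants such as "M1" and "M-1".

```python
from collections import defaultdict

# Made-up records: three different samples that were all labelled "M1" or "M-1".
records = [
    {"name": "M1", "sample_id": "ID-0001", "publication": "Paper A"},
    {"name": "M-1", "sample_id": "ID-0002", "publication": "Paper B"},
    {"name": "M1", "sample_id": "ID-0003", "publication": "Paper C"},
    {"name": "M1", "sample_id": "ID-0001", "publication": "Paper D"},
]


def normalize(name):
    """Collapse spelling variants such as 'M-1' and 'M1'."""
    return name.replace("-", "").replace(" ", "").upper()


def group_by(recs, key):
    """Group publication labels by the value returned from `key`."""
    groups = defaultdict(list)
    for rec in recs:
        groups[key(rec)].append(rec["publication"])
    return dict(groups)


by_name = group_by(records, lambda r: normalize(r["name"]))
by_id = group_by(records, lambda r: r["sample_id"])

print(by_name)  # one bucket mixes three different samples
print(by_id)    # one bucket per physical sample
```

Grouping by name cannot tell the three samples apart, while grouping by a unique identifier recovers exactly which publications report data on the same physical object.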
Sample Identification From Now:
IGSN: International Geo Sample Number
 Persistent unique identifier for physical objects in
the Earth Sciences
 Global uniqueness guaranteed via governance by the
IGSN e.V.
 Persistent access and preservation of sample
metadata
 Cataloguing services of IGSN e.V. members
 Enables a central search engine
 Resolving service of the IGSN central registry
 Does not replace personal or institutional naming
protocols
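The resolving service mentioned above can be pictured as a mapping from an identifier to a web address that returns the sample's metadata. The following is a minimal sketch only: the resolver base URL is a hypothetical placeholder, the assumption that identifiers consist of letters and digits is made for this example, and the real syntax rules and resolver address should be taken from the IGSN documentation.

```python
import re

# Assumption for this sketch: identifiers are strings of letters and digits.
_PATTERN = re.compile(r"[A-Z0-9]+")


def resolver_url(identifier, base="https://resolver.example.org/"):
    """Return the URL at which a registry could resolve `identifier`.

    `base` is a placeholder, not the address of the real IGSN resolver.
    """
    cleaned = identifier.strip().upper()
    # Accept an optional "IGSN:" label in front of the identifier.
    if cleaned.startswith("IGSN:"):
        cleaned = cleaned[len("IGSN:"):].strip()
    if not _PATTERN.fullmatch(cleaned):
        raise ValueError(f"not a plausible identifier: {identifier!r}")
    return base + cleaned


print(resolver_url("igsn: abc000001"))
```

Normalizing the identifier before resolving it means that citations written in slightly different ways still lead to the same sample record.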
IGSN: Examples
Oriented Core Drill Hole (ODP)
Soil Section Rock Specimen
IGSN Status
 International governance established in 2011
 14 members (organizations) in the IGSN e.V. (www.igsn.org)
 ca. 4 million samples registered (registration tripled in 2014)
 >350 active users, including
 increasing number of individual scientists
 sample repositories & museums (Smithsonian, marine cores)
 geological surveys (USGS, Geoscience Australia, BGR)
 large-scale observatories and sampling campaigns (ICDP, IODP, CZO, DCO, GeoPRISMS, etc.)
IGSN Adoption
COPDESS Statement of Commitment
IGSN in Action
IGSN in Action:
Publications
Metadata
 Identification
 Sample name(s), registrant
 Description
 Material, classification, age, size, comments
 Geospatial information
 Geographical names, coordinates
 Collection
 Expedition/cruise, platform, date, collector,
technique
 Archiving/access
 Physical location of sample (repository), contact
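The metadata groups listed above could be captured in a simple record type, sketched here in Python. The class name, field names, and example values are assumptions chosen to mirror the slide's headings; they are not the registry's actual schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SampleRecord:
    # Identification
    igsn: str
    name: str
    registrant: str
    # Description
    material: str
    classification: Optional[str] = None
    # Geospatial information
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    # Collection
    collector: Optional[str] = None
    collection_date: Optional[str] = None
    # Archiving/access
    repository: Optional[str] = None

    def missing_fields(self):
        """List optional fields that have not been filled in yet."""
        return [key for key, value in asdict(self).items() if value is None]


# Made-up example values.
sample = SampleRecord(
    igsn="XXX000001",
    name="M1",
    registrant="Example Registrant",
    material="Rock",
    latitude=10.5,
    longitude=-45.0,
)
print(sample.missing_fields())
```

A check like `missing_fields` gives a curator a quick view of which descriptive details still need to be supplied before the record is complete.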
IGSN Sample “Genealogy”
Extended IGSN Metadata
 Images
 Documents (.pdf, .xls, .doc)
 References
 URLs for related data resources
 User defined metadata
 Advance use of innovative CI to connect physical samples
across the Earth Sciences with digital data infrastructure
 Goals:
 Improve discovery, access, and re-usability of physical samples
 Improve re-usability and reproducibility of the data generated by their
study
[Figure labels: Registries & Catalogs, Metadata, Identifiers, Citation, Repositories, Software Tools, Taxonomies]
C4P: Collaboration & Cyberinfrastructure for Paleoscience
An EarthCube Research Coordination Network
Unravel the large-scale, long-term evolution of the Earth-Life System
through the study of the geological record
Major challenges C4P addresses:
• Heterogeneous & dispersed data
• Modeling of age & time
• Legacy & ‘dark’ data
• Limited interoperability among resources
• Variable semantics & ontologies
A diverse community:
paleobiology, paleoclimate, paleoceanography, geochemistry,
dendrochronology, stratigraphy, geochronology, sample
curation, data management, bioinformatics, semantics,
software architecture, and more ...
C4P achievements:
• New resources
• data & software catalogs
• Educational materials (webinars)
• New collaborations
• Convergence on best practices (samples,
age, taxonomy)
Take Away Messages
 develop leading practices for data
 get community buy-in
 align & coordinate with existing leading
practices
 leverage existing infrastructure
 get started and don’t let the challenges stop
you
“The Hitchhiker’s Guide to Geoinformatics”
(Lee Allison, LISTMG Workshop 2004)
“Building an International Collaboration for Geoinformatics”
(Walter Snyder, AGU 2005)
“Cyberinfrastructure for Solid Earth Geochemistry”
(Kerstin Lehnert, GSA 2003)
The Cultural Challenges
Thank You!
“The wonderful thing about
standards is that there are so many
of them to choose from.”
(Grace Hopper)

More Related Content

What's hot

The Dryad Digital Repository: Published evolutionary data as part of the gre...
The Dryad Digital Repository: Published evolutionary data as part of the gre...The Dryad Digital Repository: Published evolutionary data as part of the gre...
The Dryad Digital Repository: Published evolutionary data as part of the gre...
Todd Vision
 
Biodiversity Informatics: An Interdisciplinary Challenge
Biodiversity Informatics: An Interdisciplinary ChallengeBiodiversity Informatics: An Interdisciplinary Challenge
Biodiversity Informatics: An Interdisciplinary Challenge
Bryan Heidorn
 
D paul ecn2013
D paul ecn2013D paul ecn2013
D paul ecn2013ECNOfficer
 
FAIR sequencing data repository based on iRODS
FAIR sequencing data repository based on iRODSFAIR sequencing data repository based on iRODS
FAIR sequencing data repository based on iRODS
Felipe Gutierrez
 
Integrated research data management in the Structural Sciences
Integrated research data management in the Structural SciencesIntegrated research data management in the Structural Sciences
Integrated research data management in the Structural Sciences
ManjulaPatel
 
The eCrystals Federation
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
ManjulaPatel
 
Research resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery systemResearch resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery system
Nicole Vasilevsky
 
Research Data Sharing LERU
Research Data Sharing LERU Research Data Sharing LERU
Research Data Sharing LERU
LIBER Europe
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
Carole Goble
 
BioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next DevelopmentsBioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next Developments
Pascale Gaudet
 
Knowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, BonnKnowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, Bonn
Todd Vision
 
Data sharing archiving discovery, Bill Michener
Data sharing archiving discovery, Bill MichenerData sharing archiving discovery, Bill Michener
Data sharing archiving discovery, Bill Michener
Alison Specht
 
Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck
Todd Vision
 
Metadata for digital long-term preservation
Metadata for digital long-term preservationMetadata for digital long-term preservation
Metadata for digital long-term preservation
Michael Day
 
Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...
Todd Vision
 
Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017
Carole Goble
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
Carole Goble
 
Research data and scholarly publications: going from casual acquaintances to ...
Research data and scholarly publications: going from casual acquaintances to ...Research data and scholarly publications: going from casual acquaintances to ...
Research data and scholarly publications: going from casual acquaintances to ...
Todd Vision
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
Carole Goble
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
Carole Goble
 

What's hot (20)

The Dryad Digital Repository: Published evolutionary data as part of the gre...
The Dryad Digital Repository: Published evolutionary data as part of the gre...The Dryad Digital Repository: Published evolutionary data as part of the gre...
The Dryad Digital Repository: Published evolutionary data as part of the gre...
 
Biodiversity Informatics: An Interdisciplinary Challenge
Biodiversity Informatics: An Interdisciplinary ChallengeBiodiversity Informatics: An Interdisciplinary Challenge
Biodiversity Informatics: An Interdisciplinary Challenge
 
D paul ecn2013
D paul ecn2013D paul ecn2013
D paul ecn2013
 
FAIR sequencing data repository based on iRODS
FAIR sequencing data repository based on iRODSFAIR sequencing data repository based on iRODS
FAIR sequencing data repository based on iRODS
 
Integrated research data management in the Structural Sciences
Integrated research data management in the Structural SciencesIntegrated research data management in the Structural Sciences
Integrated research data management in the Structural Sciences
 
The eCrystals Federation
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
 
Research resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery systemResearch resources: curating the new eagle-i discovery system
Research resources: curating the new eagle-i discovery system
 
Research Data Sharing LERU
Research Data Sharing LERU Research Data Sharing LERU
Research Data Sharing LERU
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 
BioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next DevelopmentsBioDBCore: Current Status and Next Developments
BioDBCore: Current Status and Next Developments
 
Knowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, BonnKnowledge Exchange, Nov 2011, Bonn
Knowledge Exchange, Nov 2011, Bonn
 
Data sharing archiving discovery, Bill Michener
Data sharing archiving discovery, Bill MichenerData sharing archiving discovery, Bill Michener
Data sharing archiving discovery, Bill Michener
 
Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck
 
Metadata for digital long-term preservation
Metadata for digital long-term preservationMetadata for digital long-term preservation
Metadata for digital long-term preservation
 
Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...
 
Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
Research data and scholarly publications: going from casual acquaintances to ...
Research data and scholarly publications: going from casual acquaintances to ...Research data and scholarly publications: going from casual acquaintances to ...
Research data and scholarly publications: going from casual acquaintances to ...
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
 

Viewers also liked

Jump-Starting Data Standards I: Launching a Data Clean-Up Program
Jump-Starting Data Standards I: Launching a Data Clean-Up ProgramJump-Starting Data Standards I: Launching a Data Clean-Up Program
Jump-Starting Data Standards I: Launching a Data Clean-Up Program
CollectiveImagination
 
The IGSN and Geosamples
The IGSN and GeosamplesThe IGSN and Geosamples
The IGSN and Geosamples
iedadata
 
CSIRO Soils Archive
CSIRO Soils ArchiveCSIRO Soils Archive
CSIRO Soils Archive
ARDC
 
IGSN: The International Geo Sample Number (DFG Roundtable)
IGSN: The International Geo Sample Number (DFG Roundtable)IGSN: The International Geo Sample Number (DFG Roundtable)
IGSN: The International Geo Sample Number (DFG Roundtable)
Kerstin Lehnert
 
The Internet of Samples: IGSN in Action
The Internet of Samples: IGSN in ActionThe Internet of Samples: IGSN in Action
The Internet of Samples: IGSN in Action
Kerstin Lehnert
 
Data You May Like: A Recommender System for Research Data Discovery
Data You May Like: A Recommender System for Research Data DiscoveryData You May Like: A Recommender System for Research Data Discovery
Data You May Like: A Recommender System for Research Data Discovery
Anusuriya Devaraju
 
IGSN implementation at gGeoscience Australia
IGSN implementation at gGeoscience AustraliaIGSN implementation at gGeoscience Australia
IGSN implementation at gGeoscience Australia
ARDC
 
CSIRO National Research Collections
CSIRO National Research CollectionsCSIRO National Research Collections
CSIRO National Research Collections
ARDC
 
Raiser's Edge Database Cleanup Tips
Raiser's Edge Database Cleanup TipsRaiser's Edge Database Cleanup Tips
Raiser's Edge Database Cleanup Tips
Blackbaud
 
Best Practices in Raiser's Edge Data Integrity Management
Best Practices in Raiser's Edge Data Integrity ManagementBest Practices in Raiser's Edge Data Integrity Management
Best Practices in Raiser's Edge Data Integrity Management
Blackbaud
 
Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...
Blackbaud Pacific
 

Viewers also liked (12)

Jump-Starting Data Standards I: Launching a Data Clean-Up Program
Jump-Starting Data Standards I: Launching a Data Clean-Up ProgramJump-Starting Data Standards I: Launching a Data Clean-Up Program
Jump-Starting Data Standards I: Launching a Data Clean-Up Program
 
The IGSN and Geosamples
The IGSN and GeosamplesThe IGSN and Geosamples
The IGSN and Geosamples
 
CSIRO Soils Archive
CSIRO Soils ArchiveCSIRO Soils Archive
CSIRO Soils Archive
 
IGSN: The International Geo Sample Number (DFG Roundtable)
IGSN: The International Geo Sample Number (DFG Roundtable)IGSN: The International Geo Sample Number (DFG Roundtable)
IGSN: The International Geo Sample Number (DFG Roundtable)
 
The Internet of Samples: IGSN in Action
The Internet of Samples: IGSN in ActionThe Internet of Samples: IGSN in Action
The Internet of Samples: IGSN in Action
 
Data You May Like: A Recommender System for Research Data Discovery
Data You May Like: A Recommender System for Research Data DiscoveryData You May Like: A Recommender System for Research Data Discovery
Data You May Like: A Recommender System for Research Data Discovery
 
IGSN implementation at gGeoscience Australia
IGSN implementation at gGeoscience AustraliaIGSN implementation at gGeoscience Australia
IGSN implementation at gGeoscience Australia
 
CSIRO National Research Collections
CSIRO National Research CollectionsCSIRO National Research Collections
CSIRO National Research Collections
 
Data entry guidelines
Data entry guidelinesData entry guidelines
Data entry guidelines
 
Raiser's Edge Database Cleanup Tips
Raiser's Edge Database Cleanup TipsRaiser's Edge Database Cleanup Tips
Raiser's Edge Database Cleanup Tips
 
Best Practices in Raiser's Edge Data Integrity Management
Best Practices in Raiser's Edge Data Integrity ManagementBest Practices in Raiser's Edge Data Integrity Management
Best Practices in Raiser's Edge Data Integrity Management
 
Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...
 

Similar to Data Standards & Best Practices for the Stratigraphic Record

Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
GigaScience, BGI Hong Kong
 
2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx
vijayapraba1
 
Open Data and Institutional Repositories
Open Data and Institutional RepositoriesOpen Data and Institutional Repositories
Open Data and Institutional Repositories
Robin Rice
 
Dp Geosc Info Presentation Final Version 2
Dp Geosc Info Presentation Final Version 2Dp Geosc Info Presentation Final Version 2
Dp Geosc Info Presentation Final Version 2Smita Chandra
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information Retrieval
Waqas Tariq
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
Globus
 
Researh data management
Researh data managementResearh data management
Researh data management
Nikesh Narayanan
 
Current and emerging scientific data curation practices
Current and emerging scientific data curation practicesCurrent and emerging scientific data curation practices
Current and emerging scientific data curation practices
Michael Day
 
Lehnert_EGU201_SampleMetadataStandards
Lehnert_EGU201_SampleMetadataStandardsLehnert_EGU201_SampleMetadataStandards
Lehnert_EGU201_SampleMetadataStandards
Kerstin Lehnert
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful Data
Anita de Waard
 
ODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific DataODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific Data
datacite
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective data
Anita de Waard
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
Merce Crosas
 
The Human Cell Atlas Data Coordination Platform
The Human Cell Atlas Data Coordination PlatformThe Human Cell Atlas Data Coordination Platform
The Human Cell Atlas Data Coordination Platform
Laura Clarke
 
The habits of highly successful data:
The habits of highly successful data: The habits of highly successful data:
The habits of highly successful data: Anita de Waard
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
African Open Science Platform
 
2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)
Dag Endresen
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
Carole Goble
 
FAIRy Stories
FAIRy StoriesFAIRy Stories
FAIRy Stories
Carole Goble
 
Introduction to digital curation
Introduction to digital curationIntroduction to digital curation
Introduction to digital curation
Michael Day
 

Similar to Data Standards & Best Practices for the Stratigraphic Record (20)

Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
 
2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx
 
Open Data and Institutional Repositories
Open Data and Institutional RepositoriesOpen Data and Institutional Repositories
Open Data and Institutional Repositories
 
Dp Geosc Info Presentation Final Version 2
Dp Geosc Info Presentation Final Version 2Dp Geosc Info Presentation Final Version 2
Dp Geosc Info Presentation Final Version 2
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information Retrieval
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
Researh data management
Researh data managementResearh data management
Researh data management
 
Current and emerging scientific data curation practices
Current and emerging scientific data curation practicesCurrent and emerging scientific data curation practices
Current and emerging scientific data curation practices
 
Lehnert_EGU201_SampleMetadataStandards
Lehnert_EGU201_SampleMetadataStandardsLehnert_EGU201_SampleMetadataStandards
Lehnert_EGU201_SampleMetadataStandards
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful Data
 
ODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific DataODIN Final Event - The Care and Feeding of Scientific Data
ODIN Final Event - The Care and Feeding of Scientific Data
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective data
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
The Human Cell Atlas Data Coordination Platform
The Human Cell Atlas Data Coordination PlatformThe Human Cell Atlas Data Coordination Platform
The Human Cell Atlas Data Coordination Platform
 
The habits of highly successful data:
The habits of highly successful data: The habits of highly successful data:
The habits of highly successful data:
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
 
2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
FAIRy Stories
FAIRy StoriesFAIRy Stories
FAIRy Stories
 
Introduction to digital curation
Introduction to digital curationIntroduction to digital curation
Introduction to digital curation
 

Data Standards & Best Practices for the Stratigraphic Record

  • 1. Data Standards & Best Practices. Kerstin Lehnert, Lamont-Doherty Earth Observatory, iedadata.org
  • 2. Vouchering the Stratigraphic Record
      • A synthesis database?
          • Aggregates data that are published in articles or in data repositories
          • Requirements: Integration, Quality (Trusted data!)
          • Needs standardized metadata, semantics, and persistent unique identifiers
      • A trusted repository?
          • Publishes and ensures persistent access to data
          • Requirements: Compliance with international data curation and repository standards
          • Long-term preservation, data identification (DOI), editorial procedures, etc.
  • 3. Data Standards: “documented agreements on representation, format, definition, structuring, tagging, transmission, manipulation, use, and management of data.”
      • Discipline specific
      • Data type specific
      • Application specific
  • 4. Data Standards: Why?
      • Re-usability of data
      • Reproducibility of science
      • Integration/interoperability of data
  • 5.
  • 6. Reproducibility in the Field Sciences
      • Workshop in May 2015, organized by AAAS (M. McNutt), AGU, and ESA, funded by the Arnold Foundation
      • Report in preparation
      • Technical Requirements for Transparent, Reproducible Data (from draft workshop report):
          1. The data themselves must be publicly available in machine-readable, non-proprietary formats with accurate and precise descriptive metadata;
          2. Data provenance (the process(es) by which usable datasets were generated or derived from raw, often streaming or machine-readable-only data) must be accurately and precisely specified;
          3. Computer code (“scripts”) and software with which datasets were analyzed must be available and adequately described to ensure their repeated use, and be publicly available in non-proprietary formats; and
          4. Version control should be used to ensure that the original data and code are maintained.
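The four requirements on slide 6 can be pictured as fields in a small manifest published next to a dataset. The sketch below is only an illustration under stated assumptions: `DatasetManifest`, its field names, and the sample CSV are hypothetical and do not come from the workshop report or from any repository's schema.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetManifest:
    """Hypothetical manifest; the field names are illustrative, not a standard."""
    title: str
    data_format: str            # requirement 1: open, machine-readable format
    sha256: str                 # pins exactly which bytes were published
    provenance: list = field(default_factory=list)  # requirement 2: processing steps
    code_version: str = ""      # requirements 3 and 4: e.g. a version-control commit id


def build_manifest(title, data_format, payload, provenance, code_version):
    """Hash the data bytes and bundle them with the descriptive fields."""
    digest = hashlib.sha256(payload).hexdigest()
    return DatasetManifest(title, data_format, digest, list(provenance), code_version)


# Made-up example data, for illustration only.
csv_bytes = b"depth_m,value\n0.5,1.2\n1.0,1.4\n"
manifest = build_manifest(
    "Example core measurements",
    "text/csv",
    csv_bytes,
    ["raw instrument export", "drift correction applied"],
    "hypothetical-commit-id",
)
manifest_json = json.dumps(asdict(manifest), indent=2, sort_keys=True)
print(manifest_json)
```

Because the checksum is derived from the data bytes, anyone re-running the analysis can confirm they start from the same file the manifest describes.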
  • 7. Coalition for Publishing Data in the Earth & Space Sciences (COPDESS), www.copdess.org
      • Joint initiative of Earth Science publishers and Data Facilities to better help translate the aspirations of open, available, and useful data from policy into practice.
      • Reaffirm and ensure adherence to existing journal and publishing policies and society position statements regarding open data sharing and archiving of data, tools, and models.
      • Ensure that Earth science data will, to the greatest extent possible, be stored in community approved repositories that can provide additional data services.
      • Statement of Commitment signed by all major Earth & Space Science publishers
  • 8.
  • 9. Repository Standards
      • Open access
      • Data quality assurance (editorial process)
      • Persistence (long-term preservation)
      • Persistent & unique identification of data (DOI registration)
      • Standard-based metadata (ISO) & APIs (OAI-PMH)
  • 11. Distributed Data Curation
      • Alert: Stratigraphy is multi-disciplinary
      • There are many data types that already have homes
          • Paleobio Database
          • Macrostrat/Digital Crust
          • Geochron (@IEDA)
          • MagIC
          • Open Core Data (@IEDA, under development)
          • EarthChem (@IEDA)
          • System for Earth Sample Registration (@IEDA)
      • Don’t reinvent, but leverage, link, & integrate!
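The API named on slide 9, OAI-PMH, is a metadata harvesting protocol in which every request is an HTTP query carrying a `verb` argument. As a rough sketch of what a harvesting request looks like, the helper below builds such a URL. The base URL is a placeholder and `oai_request_url` is a hypothetical helper, not part of any repository's client library; the verb names and the `oai_dc` metadata prefix are taken from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL; raises ValueError for an unknown verb."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# Placeholder endpoint: substitute the base URL a real repository documents.
BASE_URL = "https://repository.example.org/oai"

# Ask for all records in unqualified Dublin Core, the format every
# OAI-PMH repository is required to support.
url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

The sketch stops at URL construction on purpose: fetching and parsing the XML response depends on the repository and is left out.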
  • 13. EarthCube: A Process Get all the info at: http://earthcube.org COMPUTER SCIENCES SOFTWARE ENGINEERS SCIENTIFIC VISION TECHNICAL ARCHITECTURE ENGAGEMENT FUNDED PROJECTS
  • 14. Back to Data Standards
      • Metadata
      • Content
      • Structure (data model)
      • Vocabularies & Taxonomies
      • Identifiers
      • (API = Application Programming Interface)
  • 15. Metadata Standards
      • Geospatial
      • Scientific Context
      • Object classifications
      • Methods (instrumentation, computation, etc.)
      • Actions
      • dates
      • actors
      • Data provenance (references, authors, etc.)
  • 16. Open Geospatial Consortium (OGC): Observations & Measurements. Diagram labels: Sampling, Observation. “Observations commonly involve sampling of an ultimate feature of interest. This International Standard defines a common set of sampling feature types classified primarily by topological dimension, as well as samples for ex-situ observations.” (OGC O&M 2.0.0 / ISO 19156; editor: Simon Cox) e.g. Station, Transect, Section
  • 17. Observation Data Model v2 (from Kerstin Lehnert: "Making small data BIG: Insights from a Long-tail Geoscience Domain"). ODM2 Team: J S Horsburgh, A K Aufdenkampe, L Hsu, A Jones, K Lehnert, E Mayorga, L Song, D Tarboton, I Zaslavsky
  • 18. Data Templates. LPSC 2015 Workshop: Restoration and Synthesis of Planetary Geochemical Data
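The classification "primarily by topological dimension" quoted on slide 16 can be sketched as a small data model. This is an illustration only: the class and field names are hypothetical rather than the standard's own, and assigning Station to a point, Transect to a curve, and Section to a surface is an assumption about how the slide's examples map onto dimensions.

```python
from dataclasses import dataclass
from enum import IntEnum


class TopologicalDimension(IntEnum):
    POINT = 0
    CURVE = 1
    SURFACE = 2
    SOLID = 3


@dataclass(frozen=True)
class SamplingFeature:
    """Hypothetical stand-in for a sampling feature; not the O&M schema."""
    name: str
    dimension: TopologicalDimension
    sampled_feature: str  # the "ultimate feature of interest" being sampled


# Assumed mapping of the slide's examples; all names are made up.
features = [
    SamplingFeature("Station 12", TopologicalDimension.POINT, "water column"),
    SamplingFeature("Transect T3", TopologicalDimension.CURVE, "reef flat"),
    SamplingFeature("Measured section A", TopologicalDimension.SURFACE, "sedimentary basin"),
]


def by_dimension(items, dimension):
    """Return the names of all sampling features with the given dimension."""
    return [f.name for f in items if f.dimension == dimension]


print(by_dimension(features, TopologicalDimension.CURVE))
```

Keeping the sampled feature separate from the sampling feature mirrors the distinction the quotation draws: the observation is made at the station or section, but it is about the basin or water column.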
  • 19. Persistent Unique Identifiers (diagram): Samples (IGSN), Dataset (DOI), Article publication (DOI), Awards & grants (FundRef), Researchers (ORCID), Field Program (Cruise ID)
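One practical benefit of structured identifiers such as those on slide 19 is that some of them can be checked mechanically before they are stored. ORCID iDs, for example, end in a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below implements that check; the function names are hypothetical, and the iD used is a test value, not a researcher connected to this talk.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the length, the digits, and the final check character of an ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this catches typing errors only. It does not confirm that the iD is registered or that it belongs to the person named in the metadata.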
  • 20.
  • 22. Internet of Samples in the Earth Sciences (3/26/2015)
      • Physical samples need to be linked to the digital data generated by their study.
          • Reproducibility! Access to the physical samples is required to verify & reproduce observations.
          • Re-usability! Access to information about samples is required for proper evaluation & interpretation of sample-based data.
      • Physical samples need to be shared broadly for use & re-use.
          • Samples are often expensive to collect (drilling, remote locations).
          • Many samples are unique and irreplaceable.
          • Re-analysis augments utility of existing data.
          • Samples often serve in ways that the collectors and repositories could not have imagined.
  • 23. Unique Sample Identification
      • Imagine the possibilities …
          • Easily find a specific sample and contact its owner
          • Find all publications that mention a specific sample
          • Find all data for that sample across the literature and distributed databases
          • Find other samples with similar properties: geospatial, temporal, compositional
  • 24. Sample Identification Until Now
      • Samples have ambiguous and non-persistent names and cannot be properly cited.
      • Figure: The EarthChem Portal shows 75 publications with geochemical data referenced to a sample with the name M1 (or M-1). (www.earthchem.org)
      • Figure: Names of dredge sample 3 of the Amphitrite cruise (PetDB database, www.petdb.org)
  • 25. Sample Identification From Now: IGSN (International Geo Sample Number)
      • Persistent unique identifier for physical objects in the Earth Sciences
      • Global uniqueness guaranteed via governance by the IGSN e.V.
      • Persistent access and preservation of sample metadata
      • Cataloguing services of IGSN e.V. members
      • Allows building a central search engine
      • Resolving service of the IGSN central registry
      • Does not replace personal or institutional naming protocols
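The naming problem on slide 24 is easy to reproduce in a few lines. In the sketch below, all records are made up for illustration and the `EXAMPLE…` values are placeholders, not real IGSNs. Indexing by sample name merges unrelated samples, while indexing by a unique identifier keeps them apart and still gathers every source for the same physical sample.

```python
from collections import defaultdict

# Fabricated illustrative records; identifiers are placeholders.
records = [
    {"sample_name": "M1", "sample_id": "EXAMPLE0001", "source": "publication A"},
    {"sample_name": "M-1", "sample_id": "EXAMPLE0002", "source": "publication B"},
    {"sample_name": "M1", "sample_id": "EXAMPLE0001", "source": "database C"},
    {"sample_name": "M1", "sample_id": "EXAMPLE0003", "source": "publication D"},
]


def normalize_name(name):
    """Collapse spelling variants such as 'M1' and 'M-1' to one key."""
    return name.replace("-", "").replace(" ", "").upper()


def samples_per_name(rows):
    """Map each normalized name to the set of distinct samples using it."""
    by_name = defaultdict(set)
    for row in rows:
        by_name[normalize_name(row["sample_name"])].add(row["sample_id"])
    return dict(by_name)


def sources_per_sample(rows):
    """Map each unique identifier to every source that reports data for it."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["sample_id"]].append(row["source"])
    return dict(by_id)


name_index = samples_per_name(records)
id_index = sources_per_sample(records)
print(len(name_index["M1"]), "distinct samples share the name M1")
print(id_index["EXAMPLE0001"])
```

The identifier index is what makes the searches imagined on slide 23 possible: all data for one sample can be collected across sources without guessing which "M1" is meant.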
  • 26. IGSN: Examples Oriented Core Drill Hole (ODP) Soil Section Rock Specimen
  • 27. IGSN Status
      • International governance established in 2011
      • 14 members (organizations) in the IGSN e.V. (www.igsn.org)
      • ca. 4 million samples registered (registration tripled in 2014)
      • >350 active users, including
          • increasing number of individual scientists
          • sample repositories & museums (Smithsonian, marine cores)
          • geological surveys (USGS, Geoscience Australia, BGR)
          • large-scale observatories and sampling campaigns (ICDP, IODP, CZO, DCO, GeoPRISMs, etc.)
  • 32. Metadata
      • Identification: Sample name(s), registrant
      • Description: Material, classification, age, size, comments
      • Geospatial information: Geographical names, coordinates
      • Collection: Expedition/cruise, platform, date, collector, technique
      • Archiving/access: Physical location of sample (repository), contact
  • 34. Extended IGSN Metadata
      • Images
      • Documents (.pdf, .xls, .doc)
      • References
      • URLs for related data resources
      • User defined metadata
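The five metadata groups on slide 32 map naturally onto a typed record. The sketch below is a hypothetical data structure built only from the categories named on the slide; `SampleRecord` and its field names are assumptions and do not reproduce the IGSN registration schema. All values are invented.

```python
import json
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class SampleRecord:
    # Identification
    sample_name: str
    registrant: str
    # Description
    material: Optional[str] = None
    classification: Optional[str] = None
    # Geospatial information
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    # Collection
    cruise: Optional[str] = None
    collector: Optional[str] = None
    collection_date: Optional[str] = None
    # Archiving/access
    repository: Optional[str] = None
    contact: Optional[str] = None

    def missing(self):
        """List the optional fields that have not been filled in yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def check_coordinates(self):
        """Raise ValueError if a supplied coordinate is out of range."""
        if self.latitude is not None and not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if self.longitude is not None and not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")


# Invented example values.
record = SampleRecord(
    sample_name="M1",
    registrant="Example Registrant",
    material="rock",
    latitude=-12.5,
    longitude=45.0,
    repository="Example Core Repository",
)
record.check_coordinates()
record_json = json.dumps(asdict(record), sort_keys=True)
print(record.missing())
```

Reporting which fields are still empty, rather than rejecting the record, suits legacy samples where collection details may never be recoverable.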
  • 35.
      • Advance use of innovative CI to connect physical samples across the Earth Sciences with digital data infrastructure
      • Goals:
          • Improve discovery, access, and re-usability of physical samples
          • Improve re-usability and reproducibility of the data generated by their study
      • Diagram labels: Registries & Catalogs, Metadata, Identifiers, Citation, Repositories, Software Tools, Taxonomies
  • 36. C4P: Collaboration & Cyberinfrastructure for Paleoscience. An EarthCube Research Coordination Network
      • Unravel the large-scale, long-term evolution of the Earth-Life System through the study of the geological record
      • Major challenges C4P addresses:
          • Heterogeneous & dispersed data
          • Modeling of age & time
          • Legacy & ‘dark’ data
          • Limited interoperability among resources
          • Variable semantics & ontologies
      • A diverse community: paleobiology, paleoclimate, paleoceanography, geochemistry, dendrochronology, stratigraphy, geochronology, sample curation, data management, bioinformatics, semantics, software architecture, and more ...
      • C4P achievements:
          • New resources
          • data & software catalogs
          • Educational materials (webinars)
          • New collaborations
          • Convergence on best practices (samples, age, taxonomy)
  • 37. Take Away Messages
      • develop leading practices for data
      • get community buy-in
      • align & coordinate with existing leading practices
      • leverage existing infrastructure
      • get started and don’t let the challenges stop you
  • 38. The Cultural Challenges
      • “The Hitchhiker’s Guide to Geoinformatics” (Lee Allison, LISTMG Workshop 2004)
      • “Building an International Collaboration for Geoinformatics” (Walter Snyder, AGU 2005)
      • “Cyberinfrastructure for Solid Earth Geochemistry” (Kerstin Lehnert, GSA 2003)
  • 39. Thank You! “The wonderful thing about standards is that there are so many of them to choose from.” (Grace Hopper)