Introduction to Scientific Data Stewardship Maturity Matrix
Cooperative Institute for Climate and Satellites – North Carolina (CICS-NC), NC State University
and NOAA’s National Centers for Environmental Information – NC (NCEI-NC)
(Formerly known as NOAA’s National Climatic Data Center (NCDC))
A Unified Framework for Measuring Stewardship Practices
Applied to Digital Environmental Datasets
In Collaboration with
Jeff Privette, Ed Kearns, Nancy Ritchey, and Steve Ansari
Version: 09/15/2016 r2
• What is scientific data stewardship? What does it mean?
• Why should we care?
• Why do we need a data stewardship maturity matrix (DSMM)?
• Where are we now?
• What is the NCEI/CICS-NC Scientific Data Stewardship Maturity Matrix?
• How did we get to where we are?
• Who could use the DSMM? What are the ways to use the DSMM?
• Putting maturity assessment into perspective
• What to do next?
In This Presentation
An overview of the scientific data stewardship maturity
assessment model with high-level background
What Is Scientific Data Stewardship?
Activities to ensure or improve the quality and usability
of geosciences data and products
• Activities to preserve or improve the information content,
accessibility, and usability of environmental data and
metadata (National Research Council, 2007)
To Ensure Data Are
• always meaningful
• Common data
• Spatial &
What Does Scientific Data Stewardship Mean?
Ensure your data are
preserved and secure
available, discoverable, and accessible
credible and understandable
usable and useful
sustainable and extendable
citable and traceable
Version: 20141017 Rev. 2.2 POC: firstname.lastname@example.org
Why Should We Care?
The quality of data, and what is being done with and to the data, matter!
Knowing stewardship maturity is essential in making informed,
actionable, and efficient data management decisions!
Problem: Most data centers currently cannot readily convey, or even assess,
the level of stewardship practices for their stakeholders or customers. No community
Hypothetical questions to a data center:
1. Congress: Are your datasets compliant with the U.S. Data Quality Act? If not, then what?
2. Business: Is your product credible? Is it readily accessible in a common data format?
3. Modelers: Is the quality of a routinely updated product being assessed?
Solution: Define a Stewardship Maturity Matrix to assess stewardship practices
applied to individual data products
Why Do We Need a Data Stewardship Maturity Matrix?
This is a vulnerability – and an opportunity!
The value and quality of a data set depend – in part – on the
stewardship practices applied after its production.
Where Are We Now?
• A stewardship maturity matrix for individual digital
environmental datasets – baselined
• A paper – published by a peer-reviewed journal
with free online access
(Peng et al., 2015: doi:10.2481/dsj.14-049)
What Is the NCEI/CICS-NC Scientific Data
Stewardship Maturity Matrix (DSMM)?
A Unified Framework for
Measuring Stewardship Practices Applied to
Individual Digital Earth Sciences Data Products
That Are Publicly Available Online
Leveraging Institutional Knowledge and Community Best Practices and Standards
DSMM Defines Measurable, Five-Level Progressive Practices
in Nine Quasi-Independent Key Components
(Data system integrity is also very important but not included in the matrix due to potential security risks to the system.)
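One way to picture the matrix is as a scoreboard that records a single maturity level, from 1 to 5, for each of the nine key components of one dataset. The sketch below is illustrative only and is not part of the DSMM itself: the component names are assumed to be those listed in Peng et al. (2015) and should be checked against the paper, and the names `KEY_COMPONENTS`, `validate_scoreboard`, `roadmap_gaps`, and `target_level` are hypothetical.

```python
# Illustrative sketch only; not an official DSMM tool.
# ASSUMPTION: component names as listed in Peng et al. (2015); verify before use.
KEY_COMPONENTS = (
    "preservability",
    "accessibility",
    "usability",
    "production_sustainability",
    "data_quality_assurance",
    "data_quality_control_monitoring",
    "data_quality_assessment",
    "transparency_traceability",
    "data_integrity",
)
MIN_LEVEL, MAX_LEVEL = 1, 5  # five progressive maturity levels


def validate_scoreboard(scores):
    """Check that every key component has exactly one integer level in 1..5."""
    missing = set(KEY_COMPONENTS) - set(scores)
    extra = set(scores) - set(KEY_COMPONENTS)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)}, unknown={sorted(extra)}")
    for name, level in scores.items():
        if not isinstance(level, int) or not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"{name}: level must be an integer from 1 to 5")
    return dict(scores)


def roadmap_gaps(scores, target_level):
    """Return the components below target_level and how many levels each must gain.

    target_level is a hypothetical requirement chosen by the assessor.
    """
    checked = validate_scoreboard(scores)
    return {
        name: target_level - checked[name]
        for name in KEY_COMPONENTS
        if checked[name] < target_level
    }


if __name__ == "__main__":
    # Made-up ratings for a made-up dataset, for demonstration only.
    example = {name: 3 for name in KEY_COMPONENTS}
    example["usability"] = 2
    example["preservability"] = 4
    print(roadmap_gaps(example, target_level=3))
```

Keeping one level per component, rather than a single averaged score, reflects the quasi-independence of the components: a dataset can be mature in one area and still need work in another.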
The Scope of Stewardship Practices
• Those applied to individual datasets – measurable and progressive
• Those associated with the functional entities of the Open Archival
Information System (OAIS) (within the shaded box in the diagram below)
CCSDS (2012) Version: 650x0m2-2012
How Did We Get Here?
Policies Processes Tasks
Pathway to Identify Key Components and
Define Levels of Stewardship Maturity Matrix
DSMM Follows CMMI Level Structure
Community Good Practices
Community Best Practices
Measured, Controlled, Audit
Reference Maturity Level Structure
• Capability Maturity Model Integration (CMMI)
• Levels of Maturity of Digital repository
Recommended level for
online operational products
National Data Centers
Assess & Convey & Path Forward
Not to Reinvent Wheels
• NCEI Subject Matter Experts (SMEs)
• Community accepted good and best
practices and standards
• SMEs from national and international
Who Could Use The Matrix?
• Data providers and scientific stewards
to evaluate and improve the quality and usability of their products against community
• Modelers, decision-support system users, and scientists
to improve their products and uncertainty estimates
to make investment and use decisions
• Data managers/stewards of data centers and repositories
to validate their compliance, or lack thereof, with community-accepted stewardship practices or
to assess the current state
to create a roadmap for improving the maturity of stewardship practices
applied to a certain product or to all of their holdings
• General data users
to make an educated choice on selecting or utilizing a dataset
Ways to Utilize DSMM & Assessment Results
• To know the current state of your
dataset(s) – maturity assessment
(stewardship maturity scoreboard)
• To know where you want or need
to be – stewardship requirements
• To know how to get there –
roadmap forward (informed,
• A reference model for stewardship planning and resource allocation –
informed decision-making support
• A consolidated, transparent source of information about stewardship
practices – assessment with detailed justifications
Need to Be
Stewardship Maturity Scoreboard and Roadmap Forward
• Content-rich quality metadata – enhanced discoverability and usability
Tiers of Maturity Assessment
within Context of Scientific Data Stewardship
• Repository Procedures Maturity
(e.g., ISO 16363:2012–trustworthiness)
• Stewardship Practices Maturity
(e.g., NCEI/CICS-NC Data Stewardship
Maturity Matrix (Peng et al., 2015))
• Repository Processes Maturity
(e.g., CMMI Data Management Maturity)
• Asset Management Maturity
(e.g., National Geospatial Dataset Asset
Lifecycle Maturity Model (FGDC, 2016))
Zhou et al. (2016)
Bates and Privette
Peng et al.
NCEI MM-Serv WG
Individual Datasets Maturity Assessment
within Context of Dataset Lifecycle Stages
An End-2-End, Consistent, Integrated Maturity Matrix Suite
A Consistent Measure of Product, Stewardship, and Service Maturity
(See Peng et al. (2016a) for an overview of the current state of dataset-centric maturity assessment models.)
Communities Are Interested In This Subject!
Introduction to Stewardship Maturity Matrix
• 1598 views globally since 1st upload in July 2014
Data Stewardship Maturity Matrix
• 976 views globally since 1st upload in July 2014
(Based on view metrics provided by slideshare.net as of 9/15/2016)
DSMM Self-Assessment Template
• 465 downloads since 1st upload in February 2015
(Based on download metrics provided by figshare.com as of 9/15/2016)
What To Do Next?
• ESIP (The Federation of Earth Science Information Partners) Data Stewardship
Committee – ensure consistent application and implementation of DSMM
across agencies and potentially get the committee endorsement (e.g., Downs
et al., 2015)
• EUMETSAT – provide a common stewardship assessment framework between
NOAA and EUMETSAT satellite Climate Data Records (CDRs)
• OMB A-16 NGDA Portfolio lifecycle maturity assessment model working group
– potentially integrate DSMM into their portfolio assessment model
• Use case studies (NCEI, ESIP, NSIDC, NCAR, DataONE, CSIRO, etc.) – application
and refinement of DSMM & defining roles and responsibilities for assessment
(e.g., Ritchey and Peng, 2015; Hou et al., 2015; Peng et al., 2016b,c)
• Decision-support tools (NOAA OSD & TRIO, CICS-NC, NCEI) – assess, display,
and integrate content-rich quality information in a more systematic way (e.g.,
Austin and Peng, 2015; Ritchey et al., 2016; Zinn et al., 2017).
What Is Good
Scientific Data Stewardship?
Make it easier for users
to trust your data
to find your dataset(s)
to get your data files
to understand your data
to learn the quality of your data
to use your data
to integrate your data
Version: 20141017 Rev. 2.1 POC: email@example.com
We benefited greatly from input and feedback from many
people at or affiliated with NCEI-NC and other data
centers and agencies.
We appreciate the support and guidance from NCEI-NC (formerly
known as NCDC), CICS-NC, the CDR Program, RSAD, and
Product Branch management.
*** NCEI-NC Informal Focus Groups ***
• Data Preservability
• Data Accessibility/Usability
• Data Integrity/Security
• Production Sustainability
Walter Jesse Glance
• Data Quality
• User Requirement
We Would Like to Thank Them All!
Special THANKS to
Jeff Privette, Ed Kearns, Nancy Ritchey, Steve Ansari,
Ken Knapp, Drew Saunders, John Keck, Scott Koger,
John Bates, Otis Brown, Bryant Cramer, Richard Kauffold,
Linda Copley, Phil Jones, Daniel Wunder, Terry McPherson,
Dan Kowal, Ken Casey, Grace Peng, Ruth Duerr,
Donna Scott, Matthew Austin, Ana Privette,
NCEI – NC Metadata Working Group
Would you like to learn more or contribute?
Contact us at firstname.lastname@example.org or
register at http://goo.gl/kUW5Qq or
Austin, M. and G. Peng, 2015: A Prototype for content-rich decision-making support in NOAA using data as an asset.
Poster: IN21A-1676. 2015 AGU Fall meeting, 14 – 18 December 2015, San Francisco, CA, USA.
Bates, J. J. and J.L. Privette, 2012: A maturity model for assessing the completeness of climate data records. EOS,
Transactions of the AGU, 93(44), 441.
CCSDS (The Consultative Committee for Space Data Systems), 2012: Reference Model for an Open Archival Information
System (OAIS), Recommended Practices, Issue 2. Version: CCSDS 650.0-M-2. 135 pp.
DAMA International, 2010: Guide to the Data Management Body of Knowledge (DAMA-DMBOK). Eds. Mosley, M.,
Brackett, M., & Earley, S., Technics Publications, LLC, New Jersey, USA. 2nd Print Edition. 406 pp.
Downs, R.R., R. Duerr, D.J. Hills, and H.K. Ramapriyan, 2015: Data Stewardship in the Earth Sciences. D-Lib Magazine,
21, doi: 10.1045/july2015-downs
EUMETSAT, 2013: CORE-CLIMAX Climate Data Record Assessment Instruction Manual. Version 2, 25 November 2013.
EUMETSAT, 2015: GAIA-CLIM Measurement Maturity Matrix Guidance: Gap Analysis for Integrated Atmospheric ECV
Climate Monitoring: Report on system of systems approach adopted and rationale. Version: 27 Nov 2015.
FGDC, 2016: National Geospatial Data Asset (NGDA) Lifecycle Maturity Assessment (LMA) 2015 Report - Analysis and
Recommendations. Version: 8 December 2016.
Hou, C.-Y., M. Mayernik, G. Peng, R. Duerr, and A. Rosati, 2015: Assessing information quality: Use case studies for the
data stewardship maturity matrix. Poster: IN21A-1675. 2015 AGU Fall meeting, 14 – 18 December 2015, San
Francisco, CA, USA.
National Research Council, 2007: Environmental data management at NOAA: Archiving, stewardship, and access. 116
pp. The National Academies Press, Washington, D.C.
NCEI MM-Serv WG (Use/Service Maturity Matrix Working Group), 2017: A reference framework for assessing service
maturity of digital environmental datasets. Under development.
Reference – Cont.
Peng, G., J.L. Privette, E.J. Kearns, N.A. Ritchey, and S. Ansari, 2015: A unified framework for measuring stewardship
practices applied to digital environmental datasets. Data Science Journal, 13, 231 - 253. doi:10.2481/dsj.14-049
Peng, G., H. Ramapriyan, and D. F. Moroni, 2016a: The State of Building a Consistent Framework for Curation and
Presentation of Earth Science Data Quality. Poster: IN41C-1666, AGU 2016 Fall Meeting, 12 – 16 December 2016,
San Francisco, CA, USA.
Peng, G., N. A. Ritchey, K. S. Casey, E. J. Kearns, J. L. Privette, D. Saunders, P. Jones, T. Maycock, and S. Ansari, 2016b:
Scientific stewardship in the Open Data and Big Data era - Roles and responsibilities of stewards and other major
product stakeholders. D-Lib Magazine, 22, doi:10.1045/may2016-peng.
Peng, G., J. Lawrimore, V. Toner, C. Lief, R. Baldwin, N. Ritchey, and D. Brinegar, 2016c: Assessment of Stewardship
Maturity of the Global Historical Climatology Network-Monthly (GHCN-M) Dataset and Lessons Learned. D-Lib
Ritchey, N. and G. Peng, 2015: Assessing stewardship maturity: use case study results and lessons learned. IN14A-05,
2015 AGU Fall meeting, 14 – 18 December 2015, San Francisco, CA, USA.
Ritchey, N.A., G. Peng, A. Milan, P. Lemieux, R. Partee, R. Ionin, and K.S. Casey, 2016: Practical Application of the Data
Stewardship Maturity Model for NOAA’s OneStop Project. IN42D-08. AGU 2016 Fall Meeting, 12 – 16 December
2016, San Francisco, CA, USA.
Zhou, L. H., M. Divakarla, and X. P. Liu, 2016: An Overview of the Joint Polar Satellite System (JPSS) Science Data
Product Calibration and Validation. Remote Sensing, 8(2). doi:10.3390/rs8020139
Zinn, S., J. Relph, G. Peng, A. Milan, and A. Rosenberg, 2017: Design and implementation of automation tools for
DSMM diagrams and reports. Invited Talk. ESIP 2017 Winter Meeting, 11 – 13 January 2017, Bethesda, MD, USA.
A self-assessment template using the latest DSMM is available at: