Foundations of Data Curation Final Project


Presentation given to Foundations of Data Curation course at University of Illinois Urbana-Champaign. The slides were "delivered" to a fictional organization.

Published in: Business

  1. INTRODUCTION TO DATA CURATION
     A presentation to the Illinois Association of Astronomers and Astrophysicists (IAAA) during the 2013 Conference in Chicago, IL.
     By Katie Schmitt
  2. WHAT IS DATA?
     "A reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing. Examples of data include a sequence of bits, a table of numbers, the characters on a page, the recording of sounds made by a person speaking, or a moon rock specimen." (OAIS reference model, 2012)
     Types of Data:
       • Experimental
       • Samples
       • Reports
       • Maps
       • Websites
  3. WHAT IS DATA CURATION?
     DDI Combined Life Cycle Model [2004]
     Source: DDI Structural Reform Group. "DDI Version 3.0 Conceptual Model." DDI Alliance. 2004.
  4. BEST PRACTICES - PROVENANCE
     • prov·e·nance [ próvvənənss ]:
       • the place of origin of something
       • the source and ownership history
     • Data Provenance
       • Instrument characteristics, calibration data and method of discovery
       • Processing algorithms
       • Changes in location or instrumentation
       • Changes in ownership of the data
     • Where did the data come from and how did it get here?
  5. BEST PRACTICES - METADATA
     • Used to enable data discovery
     • Metadata standards vary per data repository
     • In general, metadata must be:
       • Consistent
       • Written for humans
       • In a digital format
  6. BEST PRACTICES - PRESERVATION
     • The best format is:
       • Platform- and vendor-independent
       • Non-proprietary
       • Stable
       • Open
       • Well-supported
       • Unencrypted
       • Uncompressed
       • Self-describing
     Source: Week 4 Slides by Ruth Duerr, LIS590DCL
  7. BEST PRACTICES - ACCESS
     • Constant balance between preservation and access
     • Similar to preservation format
     • Master vs. Access
  8. TYPES OF REPOSITORIES
     • Domain
       • Established
       • Often connected to a University
       • Usually provide high levels of service
       • Specialized by discipline
     • Institutional
       • Excel in basic service
       • New to the data management realm
  9. A FEW RESOURCES…
     • Choudhury, G. S., Palmer, C. L., Baker, K. S., & DiLauro, T. (2013, January). Levels of services and curation for high-functioning data. Presented at the International Digital Curation Conference, Amsterdam, Netherlands.
     • Miles, S., Deelman, E., Groth, P., Vahi, K., Mehta, G., & Moreau, L. (2007). Connecting Scientific Data to Scientific Experiments with Provenance. e-Science and Grid Computing, IEEE International Conference, 179-186. doi:10.1109/ESCIENCE.2007.22
     • Renear, A. H., Sacchi, S., & Wickett, K. M. (2010). Definitions of dataset in the scientific and technical literature. Proceedings of the American Society for Information Science and Technology, 47(1), 1–4.
  10. QUESTIONS?
      Katie Schmitt
      @kmschmitt
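
ILLUSTRATIVE SKETCH - PROVENANCE AND METADATA

The provenance and metadata slides (4 and 5) can be pictured as a small record kept alongside a dataset. The Python sketch below is only an illustration and is not part of the original presentation. The class names, field names, required-field list, and sample values are all hypothetical; a real repository would follow its own metadata standard. The sketch records provenance events (calibration, processing, changes in location or ownership), checks that a few descriptive fields are filled in consistently, and serializes the record to JSON so it is human-readable and in a digital format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ProvenanceEvent:
    """One step in a dataset's history (hypothetical structure)."""
    date: str         # ISO 8601 date string, e.g. "2013-03-01"
    kind: str         # e.g. "calibration", "processing", "relocation", "ownership"
    description: str  # free text written for human readers


@dataclass
class DatasetRecord:
    """Descriptive metadata plus provenance for one dataset (hypothetical)."""
    title: str
    creator: str
    instrument: str
    events: list = field(default_factory=list)


# Assumed minimum set of descriptive fields; real standards vary by repository.
REQUIRED_FIELDS = ("title", "creator", "instrument")


def missing_fields(record):
    """Return the names of required fields that are empty or whitespace."""
    return [name for name in REQUIRED_FIELDS if not getattr(record, name).strip()]


def history(record):
    """Return provenance events oldest first (ISO dates sort as text)."""
    return sorted(record.events, key=lambda event: event.date)


def to_json(record):
    """Serialize the record to plain, readable JSON text."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Made-up sample values, for illustration only.
record = DatasetRecord(
    title="Example sky survey",
    creator="",
    instrument="Hypothetical 1 m telescope",
)
record.events.append(
    ProvenanceEvent("2013-03-01", "ownership", "Transferred to a repository")
)
record.events.append(
    ProvenanceEvent("2012-11-15", "calibration", "Flat-field calibration applied")
)

print(missing_fields(record))                  # ['creator']
print([event.kind for event in history(record)])  # ['calibration', 'ownership']
print(to_json(record))
```

Reading the sorted history from top to bottom answers the question on slide 4: where did the data come from, and how did it get here? Plain JSON text is used here because it is open, non-proprietary, and readable without special software, which is in the spirit of the preservation criteria on slide 6.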