RDAP13 John Kunze: The Data Management Ecosystem



John Kunze, University of California Curation Center (UC3)
California Digital Library (CDL)

The Data Management Ecosystem

Panel: Partnerships between institutional repositories, domain repositories, and publishers
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13

  • Panel: Partnerships between institutional repositories, domain repositories, and publishers. 20-25 mins, 9:30-11am. The 'data management ecosystem' angle seems appropriate for the panel, but feel free to share some of the technical aspects with the audience, too. Partnerships via conventions and APIs; data citation conventions. Libraries are chipping away on several fronts to try to shrink this "data curation" problem to a more manageable size, and they are offering a great deal of support for data management planning, data citation, identifier and repository services, repository federation, and “data publication”.
  • Research data can be seen to fit in a kind of ecosystem of inter-dependent stakeholder niches. Each niche depends on other niches. In a broad sense, partnerships are about dependencies. Besides explicit partnerships between publishers and institutional and domain repositories, there are other critical inter-dependencies – essentially implicit partnerships. Libraries as neutral connectors to sub-partners in system development and collection building; linking with museums and archives.
  • Development partners:
      – DMPTool: U Va, Smithsonian, DCC, et al.
      – DataUp: MSRC, GBMF, D1
      – WAS: LC, UNT, NYU, et al.
    User partners (clients, patrons, customers): any
  • Partners: JISC/EDINA, paying customers on two continents
  • D1 network partners all over the world
  • partnering with escholarship and UC campuses for collection building
  • Partnering with JISC/EDINA, DataCite, the Research Data Alliance
  • Each member partners with regional data repositories. DataCite partners with publishers (e.g., T-R) for a data citation index:
      – Credit
      – Discovery
      – Impact tracking
    Helping data authors verify use of their data and helping identify how others have used the data. With archiving: re-use and reproducibility.

    1. The Data Management Ecosystem
       4 April 2013
       University of California Curation Center, California Digital Library
    2. The research data problem
         Journal article                               Research data
         – Uniquely and persistently identified        – Nope
         – Concept of “publish”                        – Not really
         – Multiple copies                             – Typically one
         – Easily findable                             – Difficult
         – Services: impact metrics,                   – Nope
           citation tracking, etc.
       Research data is seen as a second-class citizen in the scholarly record.
    3. An ecosystem of inter-dependent partners
       Besides data repository and publisher partners...
       • researchers
       • educators
       • citizen science groups
       • funders
       • tenure and promotion committees
       Libraries as neutral connection partners
    4. Where can libraries make a difference?
       Research & Scholarship Lifecycle (diagram labels: Collect, Publish, Share, Save, Research, Create Knowledge)
    5. Collect > Publish > Share > Save > Research
       • Create, edit, share, and save data management plans
       • Open source curation add-in for Microsoft Excel
       • Capture today’s web; build tomorrow’s archives
    6. Collect > Publish > Share > Save > Research
       • Create and manage persistent identifiers: ARKs, DOIs, etc.
       • An infrastructure to publish and get credit for sharing research data
    7. Collect > Publish > Share > Save > Research
       • Curation repository: store, manage, preserve, and share research data
       • Open deposit, open access repository for spreadsheet data
       • Data Observation Network for Earth
    8. Collect > Publish > Share > Save > Research
       What’s missing to complete the “incentive” circuit?
       • Impact measures, citation tracking: “Connecting the data to the research it informs”
       • Altmetrics tools to measure non-traditional products and uses
    9. Stable storage: Merritt repository
       • Curation repository open to the UC community and beyond
       • Discipline / content agnostic
       • Micro-services architecture
       • Easy-to-use UI or API
       • Hosted or locally deployed
    10. EZID: Long term identifiers made easy
       • Precise identification of a dataset (DOI or ARK)
       • Credit to data producers and data publishers
       • A link from the traditional literature to the data (DataCite)
       • Exposure and research metrics for datasets (Web of Knowledge, Google)
       Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation.
    11. Discovery: DataCite consortium
       • Technische Informationsbibliothek (TIB), Germany
       • Australian National Data Service (ANDS)
       • The British Library
       • California Digital Library, USA
       • Canada Institute for Scientific and Technical Information (CISTI)
       • L’Institut de l’Information Scientifique et Technique (INIST), France
       • Library of the ETH Zürich
       • Library of TU Delft, The Netherlands
       • Office of Scientific and Technical Information, US Department of Energy
       • Purdue University, USA
       • Technical Information Center of Denmark
    12. New distributed framework
       Flexible, scalable, sustainable network
       Coordinating Nodes
       • retain complete metadata catalog
       • subset of all data
       • perform basic indexing
       • provide network-wide services
       • ensure data availability (preservation)
       • provide replication services
       Member Nodes
       • diverse institutions
       • serve local community
       • provide resources for managing their data
    13. The rest of the story
       www.cdlib.org/uc3
       John.Kunze@ucop.edu
       uc3@ucop.edu for service questions
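
The division of labor on slide 12 (member nodes hold data for their local community; coordinating nodes keep the metadata catalog, index it, and arrange replicas) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the DataONE API: the class names `MemberNode` and `CoordinatingNode`, their methods, the node names, and the placeholder identifier are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemberNode:
    """Hypothetical member node: holds data objects for its local community."""
    name: str
    objects: dict = field(default_factory=dict)  # identifier -> bytes


@dataclass
class CoordinatingNode:
    """Hypothetical coordinating node: keeps the metadata catalog, arranges replicas."""
    catalog: dict = field(default_factory=dict)  # identifier -> {"title", "holders"}

    def register(self, node, identifier, title, data):
        # The member node stores the data; the coordinating node records metadata only.
        node.objects[identifier] = data
        entry = self.catalog.setdefault(identifier, {"title": title, "holders": set()})
        entry["holders"].add(node.name)

    def replicate(self, identifier, source, target):
        # Copy the object to a second member node so it survives loss of the first.
        target.objects[identifier] = source.objects[identifier]
        self.catalog[identifier]["holders"].add(target.name)

    def find(self, text):
        # Basic indexing: case-insensitive title search across the whole catalog.
        needle = text.lower()
        return sorted(i for i, e in self.catalog.items() if needle in e["title"].lower())


# Placeholder identifier and node names, invented for this example.
dataset_id = "ark:/99999/fk4example"
cn = CoordinatingNode()
node_a, node_b = MemberNode("node-a"), MemberNode("node-b")
cn.register(node_a, dataset_id, "Stream temperature readings", b"t,degC\n0,11.2\n")
cn.replicate(dataset_id, node_a, node_b)
print(cn.find("temperature"), sorted(cn.catalog[dataset_id]["holders"]))
```

In this sketch the catalog never stores the data itself, which mirrors the slide's point that coordinating nodes retain complete metadata but only a subset of the data, while availability comes from replicas held across member nodes.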