RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Research Data Access and Preservation Summit, 2014
San Diego, CA
March 26-28, 2014

Ayoung Yoon
Dryad preservation working group, Doctoral Candidate at UNC-CH

Sara Mannheimer
Former Dryad curator, Data management librarian at Montana State University

Elena Feinstein, Jane Greenberg, Ryan Scherle
Dryad Digital Repository


  • 1. It’s a Real World: Developing Preservation Policy for Dryad
    Ayoung Yoon (Dryad preservation working group, Doctoral Candidate at UNC-CH)
    Sara Mannheimer (Former Dryad curator, Data management librarian at Montana State University)
    Elena Feinstein, Jane Greenberg, Ryan Scherle, Dryad Digital Repository
    March 26, 2014
    Research Data Access & Preservation Summit (RDAP) 2014
  • 2. Outline
    • Introduction
    • What is Dryad Digital Repository?
    • Preservation policy development process
    • Dryad preservation policy
    • Lessons learned and open questions
    • Conclusion
    • Acknowledgement
  • 3. Introduction
    • “Data deluge”
    • Journals and funding agency mandates
    • Benefits to archiving and preserving research data:
      – Facilitates:
        • verification of research
        • accessibility and discoverability
        • opportunities for data reuse
        • increased citations
        • research visibility
      – Prevents:
        • redundant data collection
        • inefficient legacy data curation
        • burden of sharing-on-request
    • Challenges of data archiving:
      – Wider variety of file formats than most digital archival materials
      – New versions as data sets are added to and updated
      – Security considerations
      – Large amounts of data
    Benefits adapted from Beagrie N, Lavoie BF, Woollard M (2010) Keeping research data safe 2. HEFCE
  • 4. Why preservation policy?
    • Preservation policy supports strategic planning for implementation
    • Communicates to stakeholders
      – trustworthiness and commitment to preservation
    • Not many data preservation policies exist. Some examples:
      – CERN: CMS data
      – Archaeology Data Service
      – NSIDC Data Management Policies
      – Odum Institute Preservation Policy
      – ICPSR
      – DataONE
  • 5. Dryad Digital Repository
    • A curated, general-purpose repository that makes the data underlying scientific and medical publications discoverable, freely reusable, and citable (http://datadryad.org/).
    • Facilitates data availability, data sharing, and scholarly communication.
    • Originally partnered with leading journals and scientific societies in evolutionary biology and ecology.
    • Broad collecting policy: almost any data is accepted, as long as it is associated with a publication.
  • 6. Common filetypes in Dryad
    [Bar chart; axis runs from 0 to 1000. File types shown, in the order listed on the slide: WAV, HTML, Phylip, R script, JPEG Image, Newick tree file, RTF, XML, GZip archive, MS Word OpenXML, MS Word 97-2007, Nexus, PDF, FASTA, MS Excel OpenXML, Zip archive, CSV, MS Excel 97-2007]
  • 7. Dryad and Preservation Needs
    • Preservation is a major part of Dryad’s mission.
    • Current preservation actions:
      – MD5 checksums
      – provenance metadata
      – informal encouragement of preferred formats
    • Developing and implementing a formal preservation policy will:
      – guide current and future preservation practice
      – facilitate the long-term preservation of the repository’s digital assets
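    The MD5 checksums listed among the current preservation actions can be illustrated with a minimal sketch. It assumes a checksum is recorded when a file is deposited and recomputed later to detect changes; the function names `md5_checksum` and `verify_fixity` are hypothetical and do not describe Dryad's actual implementation.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so large data files do not
    have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded at deposit.

    `recorded_checksum` is assumed to be a hex string stored alongside
    the file's metadata (a hypothetical arrangement for this sketch).
    """
    return md5_checksum(path) == recorded_checksum.lower()
```

    A repository could run a check like `verify_fixity` on a schedule and flag any file whose digest no longer matches. MD5 is suitable for detecting accidental corruption, but it is not collision-resistant, so it should not be relied on to detect deliberate tampering.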
  • 8. Policy Development Process
    • 2012: An initial preservation plan (version 1.0)
    • Feb 2013: Preservation Working Group formed
    • May 2013: Version 2.0 presented to the Dryad Board of Directors
    • July 2013: Version 2.0 revised in cooperation with Dryad staff
    • Nov 2013: Version 2.4 approved by Dryad Board of Directors; Preservation Working Group dissolved
    • Preservation Task Force formed
  • 9. Preservation Policy
    • Purpose
    • Scope and content coverage
    • Overview of preservation strategies
    • Format support and levels of preservation
      – e.g. preferred formats and format support levels
    • Implementing the strategy
      – e.g. integrations of OAIS functional activities, pre-ingest & ingest, and archival storage, authenticity and integrity, security, versioning, and withdrawal of collections
    • Sustainability plans
      – e.g. technical sustainability, institutional and financial sustainability
  • 10. Lessons Learned and Open Questions
    • A negotiation between what is ideal and what is realistic
      – Adopting the international standards, models, and best practices that exist for long-term preservation
        • Open Archival Information System (OAIS) reference model (ISO 14721:2003)
        • PREMIS (PREservation Metadata: Implementation Strategies)
      – Other standards and guidelines about audit and certification for building a trusted digital repository
        • Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) and Data Seal of Approval (DSA)
  • 11. Lessons Learned and Open Questions
    • Aligning with other internal and institutional policies
      – Follow Dryad’s internal policies: we looked primarily to Dryad’s Terms of Service document (https://datadryad.org/pages/policies), which includes policies on submission, content, payment, usage, and privacy
      – Comply with Dryad’s unofficial policies, which have yet to be finalized
        • A policy in progress: Dryad’s policy on versioning
      – Comply with policies from partner institutions
        • Dryad functions as a partnership between the University of North Carolina at Chapel Hill (UNC), Duke University (Duke), and North Carolina State University (NC State)
  • 12. Lessons Learned and Open Questions
    • Structuring the policy according to Dryad’s specific needs
      – Meeting specific organizational needs is fundamentally important and should be the first consideration in all work, as each organization has different goals, priorities, and capabilities.
      – Data depositors’ requirements: minimum requirements
        • balance “minimum efforts” and having “enough” representation information
        • compensated by other factors
  • 13. Conclusion
    • Policy creation and planning are just first steps; implementation will require further consideration
    • Future plan
      – Potential for implementing TRAC / DSA in the future
      – Divide policy and implementation into separate documents
      – New Task Force
  • 14. Acknowledgement
    • This work was supported in part by the National Science Foundation (NSF), Award number 1147166, ABI Development: Dryad: scalable and sustainable infrastructure for the publication of data.
  • 15. Thank you!
    Ayoung Yoon, Doctoral candidate, University of North Carolina at Chapel Hill, ayyoon@email.unc.edu
    Sara Mannheimer, Data management librarian, Montana State University, sara.mannheimer@montana.edu