Data Management Plans: Presentation for Data Governance Workshop


Transcript

  • 1. Data Management Plans: Status, Perspectives, Opportunities. Carly Strasser, California Digital Library
  • 2. Melissa Cragin (AAAS Fellow at NSF), Jennifer Schopf (NSF), Mark Schildhauer (NCEAS), MacKenzie Smith. Not Copyright Compatible
  • 3. Roadmap: 1. Introduction to DMPs; 2. Perspectives; 3. Current Status; 4. Policies: Opportunities & Challenges
  • 4. DataONE data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
  • 5. Digital data
  • 6. Where data end up: www, Data, Metadata. Recreated from Klump et al. 2006
  • 7. Where data end up: www Data, www Metadata. Recreated from Klump et al. 2006
  • 8. Data Reuse, Data Sharing, Data Management
  • 9. Trends in Data Archiving: Journal publishers (Joint Data Archiving Agreement)
  • 10. Trends in Data Archiving: Journal publishers (Joint Data Archiving Agreement); Data Papers (Ecological Archives)
  • 11. Trends in Data Archiving: Journal publishers (Joint Data Archiving Agreement); Data Papers (Ecological Archives); Funders
  • 12.
      • NIH: Grants >$500k require a data sharing plan in the application. Data should be made as widely and freely available as possible.
      • CDC: Provides data to its partners for appropriate public health purposes, and all data are released and/or shared as soon as feasible.
      • DOE: US Global Change Research program: data of potentially broad use in climate change research and assessments should be archived, when possible, in data repositories for subsequent dissemination.
      • DOD: Establish and maintain a coordinated and comprehensive program to document the results and outcome of research efforts and provide access effectively.
      • NASA: Maximize access to data.
      • National Endowment for the Humanities: Requires DMPs as of 2011.
      • USDA CSREES: Data required to be submitted into the public domain without restriction.
      • US Department of Education: Provides Data Sharing Implementation Guide.
      • NSF: Requires DMPs as of 2011.
      University of Minnesota Libraries
  • 13. NSF DMP Requirements. From Grant Proposal Guidelines: DMP supplement may include:
      1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
      2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
      3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
      4. policies and provisions for re-use, re-distribution, and the production of derivatives
      5. plans for archiving data, samples, and other research products, and for preservation of access to them
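The five elements on slide 13 can be treated as a simple checklist when drafting a plan. The following is a minimal sketch, not part of the presentation and not an NSF or DMPTool format: the class name, field names, helper function, and sample text are all hypothetical and chosen only to mirror the five numbered items.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical container; one field per element listed on slide 13."""
    data_types: str = ""                  # 1. types of data and other materials
    standards: str = ""                   # 2. data and metadata standards
    access_and_sharing: str = ""          # 3. access and sharing policies
    reuse_and_redistribution: str = ""    # 4. re-use, re-distribution, derivatives
    archiving_and_preservation: str = ""  # 5. archiving and preservation plans


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of sections that are still empty, in slide order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Illustrative draft with made-up text in two of the five sections.
draft = DataManagementPlan(
    data_types="Field survey counts (CSV) and analysis scripts",
    archiving_and_preservation="Deposit in a public repository at project end",
)
print(missing_sections(draft))
```

A check like this only flags empty sections; it says nothing about whether the text in a section is adequate, which the later slides describe as a matter for reviewers and program officers.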
  • 14. Roadmap: 1. Introduction to DMPs; 2. Perspectives; 3. Current Status; 4. Policies: Opportunities & Challenges
  • 15. Scientist Perspectives: “Another requirement?” Scientists have
      • increasing competition for funds, jobs, students
      • no time
      • no data management training
      • no knowledge of incentives
      [Slide image: presenter's CV]
      Carly A. Strasser, PhD, DCXL Project Manager
      Contact Information: 415 20th Street, Oakland, California 94612; 510.987.0179; carly.strasser@ucop.edu; www.carlystrasser.net
      Education
      – Massachusetts Institute of Technology/Woods Hole Oceanographic Institution: Ph.D., Biological Oceanography, March 2008. Dissertation: “Metapopulation Dynamics of the Softshell Clam, Mya arenaria”
      – University of San Diego: B.A., Marine Science with Biology emphasis, Summa Cum Laude, May 2001. Thesis: “Population structure of the Antarctic krill, Euphausia superba”
      Research & Professional Experience
      – California Digital Library, University of California Office of the President, Oakland CA. DCXL Project Manager, July 2011 - present. The Digital Curation for Excel (DCXL) project will facilitate data management, sharing, and archiving for earth, environmental, and ecological scientists. We will build an open source add-in for Microsoft Excel that will assist scientists in preparing their Excel data for sharing. We are currently talking to scientists about what this might entail.
      – DataONE, based at National Center for Ecological Analysis & Synthesis, UC Santa Barbara. Postdoctoral Associate, September 2010 - August 2011. Engaging the scientific and data management community in the Data Observation Network for Earth (DataONE), a cyberinfrastructure that will provide universal access to environmental and ecological data. Advisor: Stephanie Hampton.
      – University of Alberta & Dalhousie University, Edmonton AB & Halifax NS, Canada. Postdoctoral Investigator, January 2009 - October 2010. Used theoretical and experimental approaches to understand the role of life stage in establishment of invasive copepods introduced via ballast water. Advisors: Mark Lewis (University of Alberta) and Claudio DiBacco (Dalhousie University & Bedford Institute of Oceanography).
      – Woods Hole Oceanographic Institution, Woods Hole MA. Postdoctoral Investigator, March 2008 - December 2008. Developed demographic models of the endangered North Atlantic right whale population based on mark-recapture data. Advisor: Hal Caswell.
      – MIT-WHOI Joint Program, Boston & Woods Hole MA. PhD Candidate, June 2002 - March 2008. Combined experimental, field, and theoretical techniques to explore the metapopulation dynamics of the softshell clam, Mya arenaria. Advisors: Lauren Mullineaux, Simon Thorrold, Mike Neubert.
      – Maric College, San Diego CA. Laboratory Manager, June 2001 - June 2002. Managed education laboratories for Biology, Anatomy, and Biochemistry courses. Generated budgets, ordered supplies, prepared and maintained materials for student laboratories.
  • 16. Why should scientists prepare a DMP?
      • Funders protect their investment
      • Saves time
      • Increases efficiency
      • Easier to use data
      • Others can understand & use data
      • Credit for data products
  • 17. Scientist Perspectives: Data-related goals for scientists
      • Credit for work
      • Access, rights, control over use
      • Help complying with grant terms
      From MacKenzie Smith
  • 18. Scientist Perspectives: What scientists want
      • Boilerplate
      • Trusted advice
      • Examples
      • Phrases to use
      • Guidance on archives
      • Best practices training
      From Flickr by ThewmaH
  • 19. Perspectives: Institutions
      • Puts pressure on institutions to take some responsibility for data preservation
      • How to properly support scientists?
        – Cyberinfrastructure
        – Personnel
        – Long-term preservation costs
        – Scientist education
      • How to fund?
      From Flickr by Francisco Diez
  • 20. Perspectives: Funders
      • Protect investment
      • Ensure future viability of data products
      • Maximize $ impact
  • 21. Roadmap: 1. Introduction to DMPs; 2. Perspectives; 3. Current Status; 4. Policies: Opportunities & Challenges
  • 22. NSF’s Vision
      • DMPs and their evaluation will grow and change over time (e.g. broader impacts)
      • Peer review will determine next steps
      • DMPs are a good first step towards improving data stewardship
        – starting discussion
        – scientists learning about data management
      • Working group will assess outcomes
  • 23. NSF’s Vision: Community-driven guidelines
      • Avoiding a one-size-fits-all approach
        – Different disciplines have different definitions of acceptable data-sharing
      • DMPs will be subject to peer review, community standards
        – Flexibility at the directorate and division levels
        – Tailor implementation as appropriate
      From Jennifer Schopf
  • 24. NSF’s Vision
      • Evaluation will vary with directorate, division, program officer
      • Overall guidelines for evaluation might be useful
      • Additional expertise on panels to effectively evaluate DMPs (?)
  • 25. NSF Panel Evaluation of DMPs
      • Will not tank a proposal unless it is not present
      • Panels looking to POs for direction on evaluation
      • Determine whether DMP is adequate
        – Is a DMP present?
        – Does the PI discuss how they will archive data?
      • Not currently a part of the merit review process
        – Slap on the wrist if good proposal has a bad DMP
        – Another nail in the coffin if a bad proposal has a bad DMP
      • Some knowledgeable PIs using DMPs as strategic tool: part of proposal narrative
      • DMPs can be used to identify gaps, e.g. domains with no cyberinfrastructure for their data
  • 26. Unofficial Notes
      • Templates are potentially problematic
        – must morph with requirements over time
        – must be sufficiently flexible
      • Templates are primarily from libraries: the libraries are not part of a larger research management initiative
        – policy complications and governance issues may not be addressed adequately
  • 27. DMPTool
      • Step-by-step wizard for generating DMP
      • Create | edit | re-use | share | save | generate
      • Open to community
      • Links to institutional resources
      • Directorate information & updates
  • 28. DCC DMP Online: DCC Policies section
  • 29. Roadmap: 1. Introduction to DMPs; 2. Perspectives; 3. Current Status; 4. Policies: Opportunities & Challenges
  • 30. NSF DMP Requirements. From Grant Proposal Guidelines: DMP supplement may include:
      1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
      2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
      3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
      4. policies and provisions for re-use, re-distribution, and the production of derivatives
      5. plans for archiving data, samples, and other research products, and for preservation of access to them
  • 31. Policies: Challenges & Opportunities
      • Lack of policies, policy guidance
      • Scientists want HELP
      • Need a trusted source to provide information and help with DMP text
      www.snccomputerrepair.com
  • 32. Things to Consider…
      • Different types of data
        – Models | Simulation runs | Parameter sets
        – Software | Code | Requirements | Testing
        – Images | Video | Audio | Specimens
        – Calibrations & test runs
        – Intermediate data | Primary data
      • Meta-analysis and reuse
  • 33. Different Policy Types
      • Reuse policies
      • Sharing policies
      • Access policies
      • Archiving policies
      • Attribution/citation policies
        – attribution stacking?
      • Commercial use
      • …
  • 34. Discussion Points
      • How to create policies that will be domain-agnostic?
      • Who should define data as it pertains to policies?
      • How to centralize policy efforts in this time of rapid change?