The Fractured Lab Notebook
Undergraduates are not learning ecological data management at top US institutions

Carly Strasser | @carlystrasser
California Digital Library
&
Stephanie Hampton
National Center for Ecological Analysis & Synthesis
ESA 2013
  
Why don’t people share data?
Is data management being taught?
Do attitudes about sharing differ among disciplines?
What role can libraries play in data education?
How can we promote storing data in repositories?
What barriers to sharing can we eliminate?
NSF-funded DataNet Project
Office of Cyberinfrastructure
  
Back in the day…
Da Vinci
Curie
Newton
classicalschool.blogspot.com
Darwin
  
Digital data
From Flickr by Flickmor
From Flickr by US Army Environmental Command
From Flickr by DW0825
C. Strasser
Courtesy of WHOI
From Flickr by deltaMike
  
Digital data + Complex workflows
  
From Flickr by ~Minnea~
Data management
Documentation
Reproducibility
  
THE TRUTH YOU NEED TO KNOW ABOUT
Data management
Metadata
Data repositories
Sharing data publicly
From sandierpastures.com

From Flickr by hyperion327
From Flickr by Redden-McAllister
  
From Flickr by iowa_spirit_walker
• Cost
• Confusion about standards
• Lack of training
• Fear of lost rights or benefits
• No incentives
  
Are ecologists
Are ecology grad students
Are soon-to-be ecology grad students
learning about data management?
  
From Flickr by hyperion327
~5,000
  
Research Universities
Graduate Research Fellowships: 2006-2010 recipients for life sciences
Baccalaureate/Arts & Sciences Institutions
  
Amherst College
Bowdoin College
Brown University
Carleton College
Colorado College
Colorado State University
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Evergreen State College
Grinnell College
Harvard University
Indiana University (Bloomington)
Lewis and Clark College
Michigan State University
Middlebury College
Oberlin College
Oregon State University
Princeton University
Purdue University
Rice University
San Diego State University
Stanford University
Swarthmore College
University of Arizona
UC Berkeley
UC Davis
UC Irvine
UC Riverside
UC Santa Barbara
UC Santa Cruz
University of Chicago
University of Colorado
University of Florida
University of Georgia
University of Idaho
University of Illinois Urbana-Champaign
University of Kansas
University of Maryland
University of Michigan
University of Minnesota
University of Montana
University of New Mexico
University of Southern California
University of Texas Austin
Utah State University
Washington University in St. Louis
Williams College
Yale University
Legend: Research University | BAS Institution
  
Asked ecology instructors:
Institution?
Program?
Course?
Lab?
Students?
You?
  
Student Makeup
[Chart: percentage (0% to 100%) for BAS and RU; categories: Not science, 1-2 yr science, 3-4 yr science, grad]
  
Course Characteristics
[Chart: BAS vs. RU, scale 0 to 120; categories: Notebook required, NB assessed for DM, Data Used, Workflow assessed]
  
Quality control and quality assurance
The proper way to name computer files
Types of files and software to use
Metadata generation
Protecting data
Databases and data archiving
Data re-use
Meta-analysis
Data sharing
Reproducibility
Notebook protocols (lab or field)
Workflows
  
	
  
In Curriculum/Assessed?
[Chart: BAS vs. RU, scale 0 to 70]
  
How Important Is It?
[Chart: percentage (0% to 100%); response scale from "Not at all" to "Extremely"]
  
[Chart: % Courses that cover topic (0 to 50) plotted against Average importance (2.0 to 4.0)]
  
Why Not?
[Chart: % with response, scale 0 to 70, for each of the following reasons]
Time
Not appropriate
Covered in lab
Students not prepared
Funding
Class too big
Instructor not prepared
Covered elsewhere
  
Solutions?
Make time
Teach thyself
Use tools
  
From Flickr by Joe Crimmings Photography
1. Maximize free public access
2. Ensure researchers create data management plans
3. Allow costs for data preservation and access in proposal budgets
4. Ensure evaluation of data management plan merits
5. Ensure researchers comply with their data management plans
6. Promote data deposition into public repositories
7. Develop approaches for identification and attribution of datasets
8. Educate folks about data stewardship
  
Teach Thyself
datapub.cdlib.org
  
Teach Thyself
DCXL blog: dcxl.cdlib.org
carlystrasser.net
dataone.org
  
Use Tools
dmptool.org
Step-by-step wizard for generating DMP
Create | edit | re-use | share | save | generate
Open to community
  
Use Tools
www.dataone.org
• Data Education Tutorials
• Database of best practices & software tools
• Primer on data management
• Investigator Toolkit
  
Use Tools
dataup.cdlib.org
• Best practices check
• Create metadata
• Get identifier
• Post to repository
  
carlystrasser.net	
  
My website: carlystrasser.net
Email me: carlystrasser@gmail.com
Tweet me: @carlystrasser
My slides: slideshare.net/carlystrasser
  

"Undergrad ecologists aren't learning data management" - ESA 2013
