Preparing eScience Librarians for Managing Research Data - Jian Qin - RDAP12


Preparing eScience Librarians for Managing Research Data
Jian Qin

Education and Training panel
Presentation at Research Data Access & Preservation Summit
23 March 2012

Published in: Education, Technology

  1. Preparing eScience Librarians for Managing Research Data
     RDAP 2012, New Orleans, LA
     Jian Qin
     School of Information Studies, Syracuse University
  2. Notions of eScience librarianship
     •  Proactive training for data literacy
     •  Consultative services for data use and management
     •  Leader in eScience initiatives
     •  Active players and contributors of data curation
     •  Part of team transcending disciplinary boundaries
  3. Educating the new type of workforce
     •  Scientific data literacy (SDL) project (http://), 2007-2009
     •  E-Science Librarianship Curriculum project (eSLib, http://), 2009-2012, in partnership with Cornell University Library
  4. A curriculum for eScience librarianship
     •  Overall learning objectives:
        –  Ability to articulate eScience and to plan and develop eScience librarianship projects
        –  Competency in scientific data management
        –  Competency in cyberinfrastructure technologies
        –  Ability to collaborate, communicate, and lead in eScience librarianship projects
  5. Ability to articulate eScience and to plan and develop eScience librarianship projects
     •  Articulate the eScience process and data lifecycle
     •  Identify user needs and translate the needs into system requirements
     •  Make plans for eScience librarianship project initiation and implementation
     •  Conduct research on data-related issues such as institutional data policy, support services, and technology adoption
     •  Write grant proposals for obtaining funding to support eScience librarianship projects
  6. Competency in scientific data management
     •  Articulate data characteristics
     •  Analyze domain data sets and develop data models
     •  Define metadata element sets
     •  Develop specialized metadata for data curation, preservation, and access
     •  Create metadata records for scientific data sets
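As an illustration of the last two skills on this slide (defining a metadata element set and creating a record), here is a minimal sketch. The element names, the `DatasetRecord` class, and the sample values are all hypothetical and are not taken from the curriculum or from any metadata standard; a real project would start from an established schema and adjust it locally.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """Hypothetical, minimal metadata record for a scientific data set."""
    title: str
    creator: str
    date: str               # ISO 8601 date string
    data_format: str        # e.g. "netCDF" or "FITS"
    processing_level: str   # e.g. "Level 1B"
    subjects: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of required elements that are empty."""
        required = ("title", "creator", "date", "data_format")
        return [name for name in required if not getattr(self, name).strip()]


# Made-up sample values, for illustration only.
record = DatasetRecord(
    title="Example sea surface temperature grid",
    creator="Example Lab",
    date="2012-03-23",
    data_format="netCDF",
    processing_level="Level 3",
    subjects=["oceanography"],
)

print(json.dumps(asdict(record), indent=2))   # serialize the record
print(record.missing_fields())                # list any empty required elements
```

Keeping the required-element check next to the element set makes it easy to review records for completeness before they are deposited.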
  7. Competency in cyberinfrastructure technologies
     •  Maintain information retrieval interfaces
     •  Maintain information exchange networks
     •  Program, write code, and manipulate scripts
     •  Use content management systems
     •  Identify and model data/work flows
     •  Assess research needs for and performance of CI tools
  8. Ability to collaborate, communicate, and lead in eScience librarianship projects
     •  Develop partnerships with internal and external organizational units and collaborators
     •  Communicate with administrators and researchers
     •  Engage researchers in data management processes
     •  Initiate and lead eScience librarianship projects
  9. The curriculum
     Courses and primary learning outcomes:
     •  Scientific Data Management (core): competency in scientific data management
     •  Cyberinfrastructure (core): competency in cyberinfrastructure technologies
     •  Data services (capstone): ability to articulate eScience and to plan and develop eScience librarianship projects
     •  Database systems (required elective)
     •  Metadata (elective)
     Also listed as a primary learning outcome: ability to collaborate, communicate, and lead in eScience librarianship projects
  10. Theme 1: building fundamentals
      1  Overview of scientific data management that covers data and metadata fundamentals
      2  Case studies that use practical examples to guide students step-by-step in data analysis and management
      3  Using scientific data, which involves discussions of data quality, data repositories and discovery, data analysis and presentation, and ethics and intellectual property issues
  11. Building fundamentals: data formats
      Overview of scientific data management that covers data and metadata fundamentals
      NASA's definition of data processing levels (levels shown in the diagram: Level 4, Level 3, Level 2, Level 1B, Level 1A, Level 0):
      •  Level 0: Reconstructed, unprocessed instrument data at full resolution.
      •  Level 1A: Reconstructed, unprocessed instrument data at full resolution, time referenced, and annotated with ancillary information that is appended but not applied to the Level 0 data.
      •  Level 1B: Level 1A data that has been processed to sensor units. Not all instruments will have a Level 1B equivalent.
      Major scientific data formats (self-descriptive information exists as the header of the data file):
      •  Common Data Format (CDF)
      •  Flexible Image Transport System (FITS)
      •  GRid In Binary (GRIB)
      •  Hierarchical Data Format (HDF)
      •  Network Common Data Format (netCDF)
  12. Building fundamentals: understanding data and metadata
      Topics: data formats, processing levels, data collections
      •  Some formats contain self-descriptive metadata
      •  Lineage is vital to assessing data quality
      •  Metadata standards need to be adjusted for local description needs
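To make "self-descriptive metadata" concrete, the sketch below reads keyword/value pairs from a FITS-style header, where each 80-character card holds a keyword in columns 1-8 and, for value cards, `= ` in columns 9-10. The function name and the sample cards are hypothetical. This is a simplified illustration, not a full FITS reader: it splits the comment on the first `/`, so a quoted string value containing a slash would be truncated.

```python
def parse_header_cards(header_text, card_length=80):
    """Parse fixed-width 'KEYWORD = value / comment' cards into a dict."""
    metadata = {}
    for start in range(0, len(header_text), card_length):
        card = header_text[start:start + card_length]
        keyword = card[:8].strip()
        if keyword == "END":          # END marks the end of the header
            break
        if card[8:10] != "= ":        # skip comment or blank cards
            continue
        # Simplification: everything after the first "/" is treated as comment.
        value = card[10:].split("/", 1)[0].strip()
        metadata[keyword] = value.strip("'").strip()
    return metadata


# Made-up header cards, padded to 80 characters each.
cards = [
    "SIMPLE  =                    T / conforms to the standard",
    "BITPIX  =                   16 / bits per data value",
    "OBJECT  = 'Example target'     / hypothetical object name",
    "END",
]
header = "".join(card.ljust(80) for card in cards)

print(parse_header_cards(header))
```

Because the description travels inside the file itself, a librarian can recover basic metadata even when no external record accompanies the data set.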
  13. Building fundamentals: data literacy
      IL: ACRL. (2010).
      DL: Finn, Charles, W.P. (Tech & Learning, 2004)
      SDL: Qin, J. & J. D'Ignazio (Journal of Library Metadata, 2010)
  14. Theme 2: Analysis and generalization
      Analysis of data problems is an analysis of domain data, requirements, and workflows that will lead to the development of solutions.
  15. Analysis and generalization: engaging in real research projects
      •  Engage students in research and service projects
         –  Data policy analysis
         –  Data management consultation
         –  Interviews and survey design
      •  Course projects
         –  Real-world data management problems
  16. Theme 3: collaboration and communication
      •  Community of practice
      •  Institutionalization of data services
         –  Data policies
         –  Compliance with funding agency policies and mandates
         –  Infrastructural data services at institutional, community, and national levels
      •  Awareness, incentives, and training
  17. Collaboration and communication
      •  Mentoring by Cornell librarians, led by Gail Steinhart
      •  Internships in academic libraries and/or research centers
      •  Guest speakers to classes
      •  Engaging students in research and service projects
  18. Evolving curriculum
      CAS in Data Science
      Required courses:
      •  Database management systems
      •  Applied Data Science
      Topic areas: data storage and management, data analytics, data visualization
  19. eScience Librarianship Project
      Website: http://