Data repositories -- Xiamen University 2012 06-08
Presentation Transcript

  • Data Repositories and Services
    Xiamen University Library
    June 8, 2012
    Jian Qin
    School of Information Studies
    Syracuse University
    http://eslib.ischool.syr.edu/jqin/
  • Agenda
      • What is a repository? Repository software?
      • What does it do?
      • How does it work?
      • Case studies:
          – Dryad: an international repository of data and publications for basic and applied biosciences
          – Dataverse: a data repository system
    6/8/12   Data repositories and services   2
  • What is a data repository?
      • Data Repository is a logical (and sometimes physical) partitioning of data where multiple databases which apply to specific applications or sets of applications reside.
      • Repository commonly refers to a location for storage, often for safety or preservation.
    Sources:
    http://en.wikipedia.org/wiki/Repository
    http://www.learn.geekinterview.com/data-warehouse/dw-basics/what-is-data-repository.html
  • WHAT CAN WE EXPECT IN A DATA REPOSITORY?
  • Technical features
      • Standards
          – OAI-PMH
          – Z39.50 protocol
          – Open source license
      • Hardware
          – Minimum hardware requirements
          – SAN support
      • Software
          – OS
          – Programming language
          – Database
          – Web server
          – Java servlet engine
          – Search engine
          – Other
      • Staff requirements
          – UNIX systems administrator
          – Java programmer
          – PERL programmer
          – Python programmer
    Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • Features and functions
      • Repository & system administration
          – User registration, authentication & password administration
          – Module-level APIs
      • Content submission administration
          – Define multiple collections with same instance of system
          – Submission stages
          – Submission support
          – System generated usage stats and reports
    Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • Functions of repositories
      • Content management
          – Content import/export
          – Document/object formats
          – Metadata
          – Real-time updating and indexing of accepted content
      • Dissemination
          – User interface
          – Search capability
              • Full text
              • All descriptive metadata
              • Selected metadata fields
              • Browse
              • Sort search results
          – Indexed by Google/other search engines
      • Archiving
          – Persistent document identification
          – Data preservation report
          – Object history/version control
      • System maintenance
          – System support
              • Documentation/manual
              • Listserv
              • Bug track/feature request system
              • Formal support/help desk
    Open Society Institute. (2004). A guide to institutional repository software. 3rd ed. http://www.soros.org/openaccess/pdf/OSI_Guide_to_IR_Software_v3.pdf
  • The context of repositories
    [Diagram labels: Research community; Institutional repository (Publications, presentations, reports, etc.); Data repository (Datasets); Disciplines; Standards; Technology]
  • Institutional repositories
    [Diagram labels: Institutional repository; Publications, presentations, reports, etc.]
      • An institutional repository (IR) consists of formally organized and managed collections of digital content generated by faculty, staff, and students at an institution
      • Types of IRs:
          – Collection-based digital repositories managed by library professionals
          – Course management systems and associated file stores
          – Collection of research data and reports managed by research units (centers, laboratories, etc.)
          – Student academic portfolio systems
          – Institutional file storage systems
          – Digital asset management workflow systems
          – Web content management systems used by institutions or depts to store and stage web content
    EDUCAUSE Evolving Technologies Committee. (2003). Institutional repositories: Enhancing teaching, learning, and research. http://net.educause.edu/ir/library/pdf/DEC0303.pdf
  • Data repositories
    [Diagram labels: Data repository; Datasets]
      • No one agreed-upon definition
      • Characteristics:
          – A repository operated by an academic institution/unit or a research organization
          – A system for storing, managing, preserving, and providing access to data
          – Centered on a discipline or a research field involving multiple disciplines
          – Policies governing the intellectual property rights, management, access, sharing, and citation
  • Dryad: a repository for data and publications
    http://datadryad.org/
      • As a data repository, Dryad provides a platform to associate data with underlying publications.
      • Content acquisition: user submission
      • How to motivate users to submit data?
          • Make it simple and rewarding
          • Provide detailed support information about:
              • Depositing data
              • Managing data
              • Intellectual property rights (CC0)
              • Download data packages
              • View usage statistics
  • Dryad metadata record example
    http://datadryad.org/handle/10255/dryad.8085
  • Dryad metadata record example (cont’d)
    Individual files in the data package. The metadata shows:
      • # of downloads
      • File technical data
      • Copyright type
      • Documentation for the data file
  • Dryad Backend
      • Uses core features of DSpace with modifications or complete replacement
      • Uses OAI-PMH to allow metadata harvesting
          – Metadata formats available for harvesting include
              • METS/MODS, OAI-DC (Dublin Core), OAI-ORE/Atom, and RDF/DC
      • Uses DOI to identify Dryad data packages and files
    http://wiki.datadryad.org/Category:Technical_Documentation
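To illustrate what "metadata harvesting" over OAI-PMH looks like in practice, here is a minimal sketch that only builds request URLs. The verbs (`ListMetadataFormats`, `ListRecords`) and the `metadataPrefix` argument come from the OAI-PMH protocol itself; the base URL and the helper name `build_oai_request` are assumptions made for illustration and are not Dryad's actual endpoint or code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; not Dryad's actual OAI-PMH address.
BASE_URL = "https://repository.example.org/oai/request"


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository which metadata formats it can deliver.
list_formats_url = build_oai_request(BASE_URL, "ListMetadataFormats")

# Harvest records in unqualified Dublin Core (the "oai_dc" prefix).
list_records_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(list_formats_url)
print(list_records_url)
```

A harvester would send these URLs as plain HTTP GET requests and read back XML. Other formats a repository offers would be requested by changing `metadataPrefix` to whatever prefix its `ListMetadataFormats` response reports.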
  • DOI Examples
      • Data packages
          – doi:10.5061/dryad.1664
          – doi:10.5061/dryad.642
          – doi:10.5061/dryad.1307
      • Data files
          – doi:10.5061/dryad.1664/1
          – doi:10.5061/dryad.642/1
          – doi:10.5061/dryad.1307/1
          – doi:10.5061/dryad.1307/2
          – doi:10.5061/dryad.1307/3
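The examples above suggest a pattern: a package DOI has the form prefix/package, and a file DOI appends a numeric file suffix. The sketch below splits a DOI along those lines. The pattern is inferred only from the examples on this slide, and the names `DryadDoi` and `parse_dryad_doi` are hypothetical, so treat this as an illustration and not as Dryad's own identifier rules.

```python
from typing import NamedTuple, Optional


class DryadDoi(NamedTuple):
    prefix: str                 # registrant prefix, e.g. "10.5061"
    package: str                # package suffix, e.g. "dryad.1307"
    file_number: Optional[int]  # None for a package-level DOI


def parse_dryad_doi(doi: str) -> DryadDoi:
    """Split a DOI shaped like the slide's examples into its parts."""
    text = doi.strip()
    if text.lower().startswith("doi:"):
        text = text[4:]
    parts = text.split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected DOI shape: {doi!r}")
    prefix, package = parts[0], parts[1]
    # A third segment, when present, is read as the file number.
    file_number = int(parts[2]) if len(parts) == 3 else None
    return DryadDoi(prefix, package, file_number)


package_doi = parse_dryad_doi("doi:10.5061/dryad.1307")
file_doi = parse_dryad_doi("doi:10.5061/dryad.1307/2")
print(package_doi)
print(file_doi)
```

Under this reading, a file DOI and its package DOI share the same `prefix` and `package` fields, which is one way to group files back into their data package.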
  • DATA REPOSITORY SOFTWARE
  • Dataverse metadata editing interface
  • Dataverse metadata editing interface (cont’d)
  • Standards and tools for repositories
      • Open Archives Initiative (OAI) and its Protocol for Metadata Harvesting (OAI-PMH)
      • Tools (open source):
          – DSpace (http://www.dspace.org)
          – Fedora (http://www.fedora-commons.org/)
          – Dataverse (http://thedata.org/)
          – EPrints (http://www.eprints.org/)
          – More: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software
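Because OAI-PMH is a shared protocol, one client can in principle read responses from any repository tool that implements it. The sketch below parses a small `ListRecords` response in Dublin Core and pulls out each record's identifier and title. The XML namespaces are the standard OAI-PMH and Dublin Core ones; the sample response, its identifiers and titles, and the function name `extract_titles` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespaces used in OAI-PMH responses carrying oai_dc metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response for illustration; a real one would come from a repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset A</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:repository.example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset B</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext("oai:metadata/oai_dc:dc/dc:title", default="", namespaces=NS)
        results.append((identifier, title))
    return results


records = extract_titles(SAMPLE_RESPONSE)
for identifier, title in records:
    print(identifier, "|", title)
```

A full harvester would also need to handle resumption tokens, deleted records, and error responses, all of which this sketch leaves out.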