Supporting Data-Rich Research on Many Fronts
21 May 2012
University of California Curation Center
California Digital Library
  
California Digital Library

Serving the University of California
•  10 campuses
•  360K students, faculty, and staff
•  Hundreds of museums, art galleries, observatories, marine centers, botanical gardens
•  5 medical centers
•  5 law schools
•  3 National Laboratories

CDL supports the research lifecycle
•  Collections
•  Digital Special Collections
•  Discovery & Delivery
•  Publishing Group
•  UC Curation Center (UC3)
  
California Digital Library (CDL)
Our environment circa 2002-2008
Focus on preservation
For memory organizations
Infrastructure: static
Services: hosted
Content: museum & library
Sustainability: ?
  
Our environment since 2008
Focus on preservation, now curation (lifecycle)
For memory organizations and now data producers
Infrastructure: static + cloud, VM, bitbucket
Services: hosted + partnered, self-serve
Content: museum & library + research, web crawls
Sustainability: ? (now cost recovery, pay once)
  
Today's journey
Data service basics at CDL
•  Stable storage (Merritt)
•  Stable identifiers (EZID)
•  Data citation (DataCite)
•  Management (DMPTool)
•  Preservation cost modeling
... that enable
•  Federation (DataONE)
•  Data papers
•  Capture (WAS web archiving)
•  Excel add-in (DCXL)
  
The scientific record is at risk
Data dissemination is rare, risky, expensive, labor-intensive, domain-specific, and receives little credit as research output
[Images: Global Change; Galactic Change]
  
The changing landscape
•  Ever increasing number, size, and diversity of content
•  Ever increasing diversity of partners and stakeholders
•  Decreasing resources
•  Inevitability of disruptive change
     – Technology
     – Institutional mission
[Chart axes: Resources vs. Time]
  
Stable storage: Merritt repository
•  Curation repository open to the UC community and beyond
•  Discipline / content agnostic
•  Micro-services architecture
•  Easy-to-use UI or API
•  Hosted or locally deployed
Primary Functions
1. Deposit
2. Manage (metadata, versions, etc.)
3. Access (expose)
4. Share (with other researchers)
5. Preserve
  
EZID: Long-term identifiers made easy
•  Precise identification of a dataset (DOI or ARK)
•  Credit to data producers and data publishers
•  A link from the traditional literature to the data (DataCite)
•  Exposure and research metrics for datasets (Web of Knowledge, Google)

Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation

Primary Functions
1. Create persistent identifiers
2. Manage identifiers (and associated metadata) over time
3. Resolve identifiers
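
To make the "Resolve identifiers" function concrete, here is a minimal sketch of turning a DOI or ARK string into a resolvable web address. It is not the EZID API. It only assumes the public resolver hosts doi.org (for DOIs) and n2t.net (for ARKs), and the function name and example identifiers are hypothetical placeholders.

```python
def resolver_url(identifier: str) -> str:
    """Build a resolver URL for an identifier written as 'doi:...' or 'ark:/...'.

    Assumption: DOIs resolve through https://doi.org/ and ARKs through
    https://n2t.net/. Other schemes are rejected rather than guessed.
    """
    scheme, sep, rest = identifier.partition(":")
    if not sep or not rest:
        raise ValueError(f"not a scheme-prefixed identifier: {identifier!r}")
    scheme = scheme.lower()
    if scheme == "doi":
        # The DOI resolver takes the bare DOI name, without the 'doi:' label.
        return "https://doi.org/" + rest
    if scheme == "ark":
        # The N2T resolver takes the full ARK, including the 'ark:' label.
        return "https://n2t.net/ark:" + rest
    raise ValueError(f"unsupported identifier scheme: {scheme!r}")


# Placeholder identifiers, for illustration only.
print(resolver_url("doi:10.5072/FK2EXAMPLE"))
print(resolver_url("ark:/99999/fk4example"))
```

Keeping the scheme label separate from the resolver host is what lets the same identifier string outlive any one web address.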
  
Discovery: DataCite consortium
•  Technische Informationsbibliothek (TIB), Germany
•  Australian National Data Service (ANDS)
•  The British Library
•  California Digital Library, USA
•  Canada Institute for Scientific and Technical Information (CISTI)
•  L'Institut de l'Information Scientifique et Technique (INIST), France
•  Library of the ETH Zürich
•  Library of TU Delft, The Netherlands
•  Office of Scientific and Technical Information, US Department of Energy
•  Purdue University, USA
•  Technical Information Center of Denmark
  
DMPTool
Meeting funding agencies' data management plan requirements
•  Connect researchers to resources to create a data management plan
•  NSF and directorates, NIH, NEH, IMLS, foundations plus
•  Customizable
Primary Functions
1. Step-by-step "wizard"
2. Templates and examples
3. Links to institutional resources and agency information
4. Plan publication and sharing
[Chart: Number of Plans Created, Oct 2011 – Feb 2012]
  
Cost Model 1: Pay as you go
•  Billed/paid annually
   First-year term of the annual price: $\begin{cases} P & \text{if year} = 0 \\ 0 & \text{if year} > 0 \end{cases}$
   –  Costs for archival System (A), Workflows (W), Content Types (C), Monitoring (M), and Interventions (V) are considered common goods, and are apportioned equally across all n Producers (P)
        •  Model components are represented by two terms: the number of units and the per-unit cost, e.g., k·S
   –  Storage cost (S) accounted on a per-Producer basis
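
As a rough illustration of the pay-as-you-go idea, the sketch below computes one Producer's annual bill. The way the terms are combined is an assumption, since the slide's full formula is not in the text: the common-good costs are summed and split across n Producers, storage is charged as units times unit cost (k·S), and a Producer cost P is added in year 0 only. All names and figures are hypothetical.

```python
def annual_price(year, common_costs, n_producers,
                 storage_units, unit_storage_cost, producer_cost):
    """One Producer's bill for a given year under an assumed pay-as-you-go model.

    common_costs: dict of yearly common-good costs (A, W, C, M, V).
    storage_units * unit_storage_cost: the Producer's own storage charge (k * S).
    producer_cost: P, charged only in year 0.
    """
    if n_producers <= 0:
        raise ValueError("n_producers must be positive")
    shared = sum(common_costs.values()) / n_producers  # equal apportionment
    storage = storage_units * unit_storage_cost         # per-Producer term
    first_year = producer_cost if year == 0 else 0.0    # P if year = 0, else 0
    return shared + storage + first_year


# Made-up figures, purely for illustration.
costs = {"A": 1000.0, "W": 500.0, "C": 300.0, "M": 150.0, "V": 50.0}
print(annual_price(0, costs, 10, 5, 20.0, 75.0))
print(annual_price(1, costs, 10, 5, 20.0, 75.0))
```

Under these assumptions the shared share falls as more Producers join, while the storage term scales only with each Producer's own holdings.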
  
Model 2: Pay once, preserve for "T" years
•  Paid-up price for fixed term T
     –  A function of r, the annual investment return, and d, the annual decrease in unit cost of preservation
     –  G is the cost of providing a year's preservation service; G0 includes the added first-year expense of Producer engagement and registration
     –  Setting T = ∞ calculates the price for "forever"
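
One way to see how r, d, and T interact is a discounted-sum sketch. The slide's exact formula is not in the text, so the form below is an assumption: the first year costs G0, and each later year t costs G scaled by ((1 - d) / (1 + r))^t, reflecting falling unit costs and investment growth on the up-front payment. The function name and figures are hypothetical.

```python
import math


def paid_up_price(G, G0, r, d, T):
    """Assumed one-time price for T years of preservation.

    G:  cost of one year's preservation service
    G0: first-year cost (G plus Producer engagement and registration)
    r:  annual investment return
    d:  annual decrease in the unit cost of preservation
    T:  term in years; math.inf gives the "forever" price
    """
    q = (1.0 - d) / (1.0 + r)  # per-year discount factor
    if math.isinf(T):
        if q >= 1.0:
            raise ValueError("the forever price is finite only when r + d > 0")
        # Closed form of the geometric series q + q^2 + q^3 + ...
        return G0 + G * q / (1.0 - q)
    # Year 0 is covered by G0; years 1 .. T-1 are discounted.
    return G0 + G * sum(q ** t for t in range(1, int(T)))


# Made-up figures, purely for illustration.
print(paid_up_price(100.0, 150.0, 0.05, 0.10, 10))
print(paid_up_price(100.0, 150.0, 0.05, 0.10, math.inf))
```

With this form the "forever" price stays finite because each later year costs less in present terms than the one before it.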
  
New distributed framework
Flexible, scalable, sustainable network

Coordinating Nodes
•  retain complete metadata catalog
•  subset of all data
•  perform basic indexing
•  provide network-wide services
•  ensure data availability (preservation)
•  provide replication services

Member Nodes
•  diverse institutions
•  serve local community
•  provide resources for managing their data
  
Traditional articles vs data papers

The collective data product

Need to save data + processing
Algorithms + Data Structures = Programs
  	
  
Vision for a "data paper"
•  Wrap the unfamiliar in a familiar façade
•  A "data paper" is minimally a cover sheet and a set of links to archived artifacts
•  Cover sheet contains familiar elements: title, date, authors, abstract, and persistent identifier (DOI, ARK, etc.)
•  Just enough to permit basic exposure and discovery
   –  Building a basic data citation
   –  Indexing by services such as Web of Science, Google Scholar
   –  Instilling confidence in the identifier's stability
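
The cover-sheet idea can be sketched as a small record type. This is only an illustration: the class, field names, and citation layout are hypothetical, not a format defined in these slides or by DataCite, and the sample values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataPaper:
    """A minimal 'data paper': cover-sheet elements plus links to artifacts."""
    title: str
    date: str             # ISO date, e.g. "2012-05-21"
    authors: List[str]
    abstract: str
    identifier: str       # persistent identifier (DOI, ARK, etc.)
    artifact_links: List[str] = field(default_factory=list)

    def citation(self) -> str:
        """Build a basic data citation from the cover-sheet elements."""
        year = self.date[:4]
        names = "; ".join(self.authors)
        return f"{names} ({year}). {self.title}. {self.identifier}"


# Placeholder values, for illustration only.
paper = DataPaper(
    title="Example Survey Data",
    date="2012-05-21",
    authors=["Doe, J.", "Roe, R."],
    abstract="A short description of the dataset.",
    identifier="ark:/99999/fk4example",
    artifact_links=["https://example.org/data.csv"],
)
print(paper.citation())
```

The cover sheet carries just enough for citation and indexing, while the data itself stays behind the archived links.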
  	
  
43 public archives
120+ archives total
58K crawls
7,500+ sites
600 million+ URLs
40+ TB
24 institutions
Developed with LoC support by CDL, UNT, and others
What are people using WAS for?
Archiving at-risk government websites and publications
Archiving their own university domains
Building web archives to complement library collections
Documenting web coverage of significant events
  
Data curation for Excel
•  Excel is the database of choice for many researchers
•  Make it easy to share, archive, and publish data
•  Keep up to date at dcxl.cdlib.org

Primary Functions
1. An Excel add-in and web application
2. Metadata description (through extraction and augmentation)
3. Check for good data practices
4. Transfer to repository

Surveyed users and found:
•  Most researchers are unaware of preservation options
•  Documentation practices are poor
•  Excel is just one tool in workflows
  	
  
A data curation approach at CDL
•  New "data paper" publishing model [GBMF]
•  DataCite consortium and citation standards
•  Other fronts:
   •  DataONE global data network [NSF]
   •  Merritt: general-purpose data repository
   •  EZID: scheme-agnostic & de-coupled creation, resolution, and management of persistent ids
   •  Data management plan generator
   •  Web archiving service [Library of Congress]
   •  Open-source Excel add-in [MS Research & GBMF]
  
Questions?
John.Kunze@ucop.edu
California Digital Library
http://www.cdlib.org/
  

More Related Content

What's hot

Digital preservation
Digital preservationDigital preservation
Digital preservation
Sarika Sawant
 
Digitisation Overview
Digitisation OverviewDigitisation Overview
Digitisation Overview
Ria Groenewald
 
Creation of LSE Digital Library
Creation of LSE Digital LibraryCreation of LSE Digital Library
Creation of LSE Digital Library
Ed Fay
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
smtcd
 
Preparation, Proceed and Review of preservation of Digital Library
Preparation, Proceed and Review of preservation of Digital Library Preparation, Proceed and Review of preservation of Digital Library
Preparation, Proceed and Review of preservation of Digital Library
Asheesh Kamal
 
Digital preservation from a records management perspective
Digital preservation from a records management perspectiveDigital preservation from a records management perspective
Digital preservation from a records management perspective
Michael Day
 
Digital preservation: an introduction
Digital preservation: an introductionDigital preservation: an introduction
Digital preservation: an introduction
Michael Day
 
Digital Preservation in the Wild
Digital Preservation in the WildDigital Preservation in the Wild
Digital Preservation in the Wild
Tim Donohue
 
Intro to Digital Preservation
Intro to Digital PreservationIntro to Digital Preservation
Intro to Digital Preservation
Ben Fino-radin
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
Michael Day
 
Digital preservation
Digital preservationDigital preservation
Digital preservation
Michael Day
 
Natalie Harrower - Digital Preservation: Let's do it together!
Natalie Harrower - Digital Preservation: Let's do it together!Natalie Harrower - Digital Preservation: Let's do it together!
Natalie Harrower - Digital Preservation: Let's do it together!
dri_ireland
 
‘If a tree falls in the forest’: recording and sharing digital preservation k...
‘If a tree falls in the forest’: recording and sharing digital preservation k...‘If a tree falls in the forest’: recording and sharing digital preservation k...
‘If a tree falls in the forest’: recording and sharing digital preservation k...National Library of Australia
 
Research Data Management at the University of Edinburgh
Research Data Management at the University of EdinburghResearch Data Management at the University of Edinburgh
Research Data Management at the University of Edinburgh
EDINA, University of Edinburgh
 
Issues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineeringIssues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineering
Chris Rusbridge
 
Brief Introduction to Digital Preservation
Brief Introduction to Digital PreservationBrief Introduction to Digital Preservation
Brief Introduction to Digital Preservation
Michael Day
 
Putting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data StoresPutting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data Stores
DATAVERSITY
 
Practical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levelsPractical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levels
Chris Rusbridge
 
An Introduction to digital preservation at the Library of Congress
An Introduction to digital preservation at the Library of CongressAn Introduction to digital preservation at the Library of Congress
An Introduction to digital preservation at the Library of Congress
lljohnston
 

What's hot (20)

Digital preservation
Digital preservationDigital preservation
Digital preservation
 
Digitisation Overview
Digitisation OverviewDigitisation Overview
Digitisation Overview
 
Creation of LSE Digital Library
Creation of LSE Digital LibraryCreation of LSE Digital Library
Creation of LSE Digital Library
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Preparation, Proceed and Review of preservation of Digital Library
Preparation, Proceed and Review of preservation of Digital Library Preparation, Proceed and Review of preservation of Digital Library
Preparation, Proceed and Review of preservation of Digital Library
 
Digital preservation from a records management perspective
Digital preservation from a records management perspectiveDigital preservation from a records management perspective
Digital preservation from a records management perspective
 
Digital preservation: an introduction
Digital preservation: an introductionDigital preservation: an introduction
Digital preservation: an introduction
 
Digital Preservation in the Wild
Digital Preservation in the WildDigital Preservation in the Wild
Digital Preservation in the Wild
 
Intro to Digital Preservation
Intro to Digital PreservationIntro to Digital Preservation
Intro to Digital Preservation
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Digital preservation
Digital preservationDigital preservation
Digital preservation
 
Natalie Harrower - Digital Preservation: Let's do it together!
Natalie Harrower - Digital Preservation: Let's do it together!Natalie Harrower - Digital Preservation: Let's do it together!
Natalie Harrower - Digital Preservation: Let's do it together!
 
‘If a tree falls in the forest’: recording and sharing digital preservation k...
‘If a tree falls in the forest’: recording and sharing digital preservation k...‘If a tree falls in the forest’: recording and sharing digital preservation k...
‘If a tree falls in the forest’: recording and sharing digital preservation k...
 
Research Data Management at the University of Edinburgh
Research Data Management at the University of EdinburghResearch Data Management at the University of Edinburgh
Research Data Management at the University of Edinburgh
 
Issues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineeringIssues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineering
 
Brief Introduction to Digital Preservation
Brief Introduction to Digital PreservationBrief Introduction to Digital Preservation
Brief Introduction to Digital Preservation
 
Putting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data StoresPutting Business Intelligence to Work on Hadoop Data Stores
Putting Business Intelligence to Work on Hadoop Data Stores
 
Practical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levelsPractical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levels
 
An Introduction to digital preservation at the Library of Congress
An Introduction to digital preservation at the Library of CongressAn Introduction to digital preservation at the Library of Congress
An Introduction to digital preservation at the Library of Congress
 

Viewers also liked

VistaNational Resource Library
VistaNational Resource LibraryVistaNational Resource Library
VistaNational Resource Libraryrrobatzek
 
Future-Proofing the Web: What We Can Do Today
Future-Proofing the Web: What We Can Do TodayFuture-Proofing the Web: What We Can Do Today
Future-Proofing the Web: What We Can Do Today
John Kunze
 
тренинг по продукту от кузнецова сергея
тренинг по продукту от кузнецова сергеятренинг по продукту от кузнецова сергея
тренинг по продукту от кузнецова сергеяСергей Кузнецов
 
Equipo 2 diabetes en el embarazo
Equipo 2  diabetes en el embarazoEquipo 2  diabetes en el embarazo
Equipo 2 diabetes en el embarazo
Eduardo Jimenez
 
Art
ArtArt
Треугольник продаж от кузнецова сергея
Треугольник продаж от кузнецова сергеяТреугольник продаж от кузнецова сергея
Треугольник продаж от кузнецова сергеяСергей Кузнецов
 

Viewers also liked (8)

Nomina compu
Nomina compuNomina compu
Nomina compu
 
VistaNational Resource Library
VistaNational Resource LibraryVistaNational Resource Library
VistaNational Resource Library
 
Future-Proofing the Web: What We Can Do Today
Future-Proofing the Web: What We Can Do TodayFuture-Proofing the Web: What We Can Do Today
Future-Proofing the Web: What We Can Do Today
 
CV Estela Rojas 2
CV Estela Rojas 2CV Estela Rojas 2
CV Estela Rojas 2
 
тренинг по продукту от кузнецова сергея
тренинг по продукту от кузнецова сергеятренинг по продукту от кузнецова сергея
тренинг по продукту от кузнецова сергея
 
Equipo 2 diabetes en el embarazo
Equipo 2  diabetes en el embarazoEquipo 2  diabetes en el embarazo
Equipo 2 diabetes en el embarazo
 
Art
ArtArt
Art
 
Треугольник продаж от кузнецова сергея
Треугольник продаж от кузнецова сергеяТреугольник продаж от кузнецова сергея
Треугольник продаж от кузнецова сергея
 

Similar to Supporting Data-Rich Research on Many Fronts

The Data Management Ecosystem
The Data Management EcosystemThe Data Management Ecosystem
The Data Management EcosystemJohn Kunze
 
RDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management EcosystemRDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management Ecosystem
ASIS&T
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?
Graham Pryor
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
Sarah Anna Stewart
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
Historic Environment Scotland
 
The future of the DCC
The future of the DCCThe future of the DCC
The future of the DCC
Chris Rusbridge
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
Eduserv
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
Sarah Anna Stewart
 
Supporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data ManagementSupporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data Management
Marieke Guy
 
The e-Ciber Superfacility Project
The e-Ciber Superfacility ProjectThe e-Ciber Superfacility Project
The e-Ciber Superfacility Project
Leandro Ciuffo
 
Ariadne overview
Ariadne overviewAriadne overview
Ariadne overview
ariadnenetwork
 
RCUK Cloud Workshop
RCUK Cloud WorkshopRCUK Cloud Workshop
RCUK Cloud Workshop
Simon Woodman
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
Lee Dirks
 
Research Data Management at Imperial College London
Research Data Management at Imperial College LondonResearch Data Management at Imperial College London
Research Data Management at Imperial College London
Sarah Anna Stewart
 
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
ASIS&T
 
Virtualization for HPC at NCI
Virtualization for HPC at NCIVirtualization for HPC at NCI
Virtualization for HPC at NCI
inside-BigData.com
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
ariadnenetwork
 
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Robin Rice
 
GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)
Dag Endresen
 
Ticer summer school_24_aug06
Ticer summer school_24_aug06Ticer summer school_24_aug06
Ticer summer school_24_aug06
SayDotCom.com
 

Similar to Supporting Data-Rich Research on Many Fronts (20)

The Data Management Ecosystem
The Data Management EcosystemThe Data Management Ecosystem
The Data Management Ecosystem
 
RDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management EcosystemRDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management Ecosystem
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
 
The future of the DCC
The future of the DCCThe future of the DCC
The future of the DCC
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 
Supporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data ManagementSupporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data Management
 
The e-Ciber Superfacility Project
The e-Ciber Superfacility ProjectThe e-Ciber Superfacility Project
The e-Ciber Superfacility Project
 
Ariadne overview
Ariadne overviewAriadne overview
Ariadne overview
 
RCUK Cloud Workshop
RCUK Cloud WorkshopRCUK Cloud Workshop
RCUK Cloud Workshop
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
 
Research Data Management at Imperial College London
Research Data Management at Imperial College LondonResearch Data Management at Imperial College London
Research Data Management at Imperial College London
 
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
 
Virtualization for HPC at NCI
Virtualization for HPC at NCIVirtualization for HPC at NCI
Virtualization for HPC at NCI
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
 
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
 
GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)
 
Ticer summer school_24_aug06
Ticer summer school_24_aug06Ticer summer school_24_aug06
Ticer summer school_24_aug06
 

More from John Kunze

The YAMZ Metadictionary
The YAMZ MetadictionaryThe YAMZ Metadictionary
The YAMZ Metadictionary
John Kunze
 
YAMZ Metadata Vocabulary Builder
YAMZ Metadata Vocabulary BuilderYAMZ Metadata Vocabulary Builder
YAMZ Metadata Vocabulary Builder
John Kunze
 
The ARK Alliance: 20 years, 850 institutions, 8.2 billion persistent identifi...
The ARK Alliance: 20 years, 850 institutions, 8.2 billion persistent identifi...The ARK Alliance: 20 years, 850 institutions, 8.2 billion persistent identifi...
The ARK Alliance: 20 years, 850 institutions, 8.2 billion persistent identifi...
John Kunze
 
EZID and N2T at CDL
EZID and N2T at CDLEZID and N2T at CDL
EZID and N2T at CDL
John Kunze
 
YAMZ.net: better, faster, cheaper taxonomy building
YAMZ.net:  better, faster, cheaper taxonomy buildingYAMZ.net:  better, faster, cheaper taxonomy building
YAMZ.net: better, faster, cheaper taxonomy building
John Kunze
 
A Vocabulary for Persistence
A Vocabulary for PersistenceA Vocabulary for Persistence
A Vocabulary for Persistence
John Kunze
 
Supporting Data-Rich Research on Many Fronts

  • 1. Supporting Data-Rich Research on Many Fronts
       21 May 2012
       University of California Curation Center
       California Digital Library
  • 2. California Digital Library
       Serving the University of California
         • 10 campuses
         • 360K students, faculty, and staff
         • 100s of museums, art galleries, observatories, marine centers, botanical gardens
         • 5 medical centers
         • 5 law schools
         • 3 National Laboratories
       CDL supports the research lifecycle
         • Collections
         • Digital Special Collections
         • Discovery & Delivery
         • Publishing Group
         • UC Curation Center (UC3)
  • 4. Our environment circa 2002-2008
       Focus on preservation
       For memory organizations
       Infrastructure: static
       Services: hosted
       Content: museum & library
       Sustainability: ?
  • 5. Our environment since 2008
       Focus on preservation → curation (lifecycle)
       For memory organizations and now data producers
       Infrastructure: static + cloud, VM, bitbucket
       Services: hosted + partnered, self-serve
       Content: museum & library + research, web crawls
       Sustainability: ? → cost recovery, pay once
  • 6. Today's journey
       Data service basics at CDL
         • Stable storage (Merritt)
         • Stable identifiers (EZID)
         • Data citation (DataCite)
         • Management (DMPTool)
         • Preservation cost modeling
       ... that enable
         • Federation (DataONE)
         • Data papers
         • Capture (WAS web archiving)
         • Excel add-in (DCXL)
  • 7. The scientific record is at risk
       Data dissemination is rare, risky, expensive, labor-intensive, domain-specific, and receives little credit as research output
       (Image captions: Global Change, Galactic Change)
  • 8. The changing landscape
         • Ever increasing number, size, and diversity of content
         • Ever increasing diversity of partners and stakeholders
         • Decreasing resources
         • Inevitability of disruptive change
             – Technology
             – Institutional mission
       (Chart axes: Resources, Time)
  • 9. Stable storage: Merritt repository
         • Curation repository open to the UC community and beyond
         • Discipline / content agnostic
         • Micro-services architecture
         • Easy-to-use UI or API
         • Hosted or locally deployed
       Primary Functions
         1. Deposit
         2. Manage (metadata, versions, etc.)
         3. Access (expose)
         4. Share (with other researchers)
         5. Preserve
  • 10. EZID: Long term identifiers made easy
         • Precise identification of a dataset (DOI or ARK)
         • Credit to data producers and data publishers
         • A link from the traditional literature to the data (DataCite)
         • Exposure and research metrics for datasets (Web of Knowledge, Google)
       Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation
       Primary Functions
         1. Create persistent identifiers
         2. Manage identifiers (and associated metadata) over time
         3. Resolve identifiers
  • 11. Discovery: DataCite consortium
         • Technische Informationsbibliothek (TIB), Germany
         • Canada Institute for Scientific and Technical Information (CISTI)
         • L'Institut de l'Information Scientifique et Technique (INIST), France
         • Australian National Data Service (ANDS)
         • The British Library
         • Library of the ETH Zürich
         • California Digital Library, USA
         • Library of TU Delft, The Netherlands
         • Office of Scientific and Technical Information, US Department of Energy
         • Purdue University, USA
         • Technical Information Center of Denmark
  • 12. DMPTool
       Meeting funding agencies' data management plan requirements
         • Connect researchers to resources to create a data management plan
         • NSF and directorates, NIH, NEH, IMLS, foundations plus
         • Customizable
       Primary Functions
         1. Step-by-step "wizard"
         2. Templates and examples
         3. Links to institutional resources and agency information
         4. Plan publication and sharing
  • 13. Number of Plans Created, Oct 2011 – Feb 2012
  • 14. Cost Model 1: Pay as you go
         • Billed/paid annually
             { P if year = 0; 0 if year > 0 }
             – Costs for archival System (A), Workflows (W), Content Types (C), Monitoring (M), and Interventions (V) are considered common goods, and are apportioned equally across all n Producers (P)
             – Storage cost (S) accounted on a per-Producer basis
         • Model components are represented by two terms: the number of units and the per-unit cost, e.g., k·S
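A minimal sketch of how a pay-as-you-go bill could be computed from the terms on this slide. The slide's full formula is not present in the transcript, so the way the terms are combined here is an assumption: each Producer pays an equal 1/n share of the common-good costs, plus its own storage k·S, plus the one-time term P in year 0 only. The function and argument names are hypothetical.

```python
def annual_bill(a: float, w: float, c: float, m: float, v: float,
                n: int, k: float, s: float, p: float, year: int) -> float:
    """Annual price for one Producer under an assumed pay-as-you-go model.

    a, w, c, m, v: total yearly cost of System, Workflows, Content Types,
                   Monitoring, and Interventions (the common goods)
    n:             number of Producers sharing the common goods
    k, s:          storage units used by this Producer and per-unit cost
    p:             one-time term charged only in year 0
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if year < 0:
        raise ValueError("year must be non-negative")
    common_share = (a + w + c + m + v) / n   # apportioned equally
    storage = k * s                          # accounted per Producer
    one_time = p if year == 0 else 0.0       # P if year = 0, else 0
    return common_share + storage + one_time
```

Under these assumptions, the year-0 bill exceeds every later bill by exactly P, and adding Producers lowers only the common-good share.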
  • 15. Model 2: Pay once, preserve for "T" years
         • Paid-up price for fixed term T
             – A function of r, the annual investment return, and d, the annual decrease in unit cost of preservation
             – G is the cost of providing a year's preservation service; G0 includes the added first year expense of Producer engagement and registration
             – Setting T = ∞ calculates the price for "forever"
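A minimal sketch of one way a paid-up price could depend on r, d, G, G0, and T. The slide's formula is not present in the transcript, so this is an assumed discounted-cost reading rather than CDL's published model: year 0 costs G0, and each later year t costs G scaled by ((1 - d) / (1 + r))^t, reflecting falling unit costs and investment return on the prepaid amount. The function name and signature are hypothetical.

```python
from typing import Optional


def paid_up_price(g0: float, g: float, r: float, d: float,
                  years: Optional[int]) -> float:
    """Assumed one-time price covering `years` of preservation.

    g0:    first-year cost (includes engagement and registration)
    g:     cost of one year of preservation service at today's prices
    r:     annual investment return, e.g. 0.05
    d:     annual decrease in unit cost of preservation, e.g. 0.10
    years: term T in years, or None for T = infinity ("forever")
    """
    if r <= -1.0:
        raise ValueError("r must be greater than -1")
    q = (1.0 - d) / (1.0 + r)  # yearly discount factor under this assumption
    if years is None:
        if not abs(q) < 1.0:
            raise ValueError("the forever price is finite only if |q| < 1")
        return g0 + g * q / (1.0 - q)  # closed form of the geometric series
    if years < 1:
        raise ValueError("years must be at least 1")
    # Year 0 is covered by g0; years 1 .. T-1 are discounted by q**t.
    return g0 + sum(g * q ** t for t in range(1, years))
```

With r = 0.05 and d = 0.10 the discount factor is 0.9 / 1.05, so under this reading the "forever" price is G0 + 6·G; with r = d = 0 the finite-term price reduces to a plain sum of yearly costs.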
  • 16. New distributed framework
       Flexible, scalable, sustainable network
       Member Nodes
         • diverse institutions
         • serve local community
         • provide resources for managing their data
       Coordinating Nodes
         • retain complete metadata catalog
         • subset of all data
         • perform basic indexing
         • provide network-wide services
         • ensure data availability (preservation)
         • provide replication services
  • 17. Traditional articles vs data papers
  • 18. The collective data product
  • 19. Need to save data + processing
       Algorithms + Data Structures = Programs
  • 20. Vision for a "data paper"
         • Wrap the unfamiliar in a familiar façade
         • A "data paper" is minimally a cover sheet and a set of links to archived artifacts
         • Cover sheet contains familiar elements: title, date, authors, abstract, and persistent identifier (DOI, ARK, etc.)
         • Just enough to permit basic exposure and discovery
             – Building a basic data citation
             – Indexing by services such as Web of Science, Google Scholar
             – Instilling confidence in the identifier's stability
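A minimal sketch of the cover sheet as a data structure, using only the elements the slide lists (title, date, authors, abstract, persistent identifier) plus the links to archived artifacts. The class name, field names, and citation layout are assumptions for illustration, not a CDL or DataCite schema; the identifier and URL in the usage lines are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoverSheet:
    """Hypothetical 'data paper' cover sheet."""
    title: str
    date: str                 # ISO 8601 date string, e.g. "2012-05-21"
    authors: List[str]
    abstract: str
    identifier: str           # persistent identifier (DOI, ARK, etc.)
    artifact_links: List[str] = field(default_factory=list)

    def citation(self) -> str:
        """Build a basic data citation: Authors (Year): Title. Identifier"""
        year = self.date[:4]
        return f"{'; '.join(self.authors)} ({year}): {self.title}. {self.identifier}"


# Usage with placeholder values (not a real dataset or identifier).
sheet = CoverSheet(
    title="Example Dataset",
    date="2012-05-21",
    authors=["Doe, J.", "Roe, R."],
    abstract="Placeholder abstract.",
    identifier="ark:/99999/fk4example",
    artifact_links=["https://example.org/archive/example-dataset.csv"],
)
print(sheet.citation())
```

Keeping the citation derivable from the cover sheet alone matches the slide's point that the sheet should carry just enough for basic exposure and discovery.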
  • 21. Web Archiving Service (WAS) in numbers
         • 43 public archives
         • 120+ archives total
         • 58K crawls
         • 7,500+ sites
         • 600 million+ URLs
         • 40+ TB
         • 24 institutions
       Developed with LoC support by CDL, UNT, and others
  • 22. What are people using WAS for?
         • Archiving at-risk government websites and publications
         • Archiving their own university domains
         • Building web archives to complement library collections
         • Documenting web coverage of significant events
  • 23. Data curation for Excel
         • Excel is the database of choice for many researchers
         • Make it easy to share, archive, and publish data
         • Keep up to date at dcxl.cdlib.org
       Surveyed users and found:
         • Most researchers are unaware of preservation options
         • Documentation practices are poor
         • Excel is just one tool in workflows
       Primary Functions
         1. An Excel add-in and web application
         2. Metadata description (through extraction and augmentation)
         3. Check for good data practices
         4. Transfer to repository
  • 24. A data curation approach at CDL
         • New "data paper" publishing model [GBMF]
         • DataCite consortium and citation standards
         • Other fronts:
         • DataONE global data network [NSF]
         • Merritt: general-purpose data repository
         • EZID: scheme-agnostic & de-coupled creation, resolution, and management of persistent ids
         • Data management plan generator
         • Web archiving service [Library of Congress]
         • Open-source Excel add-in [MS Research & GBMF]
  • 25. Questions?
       John.Kunze@ucop.edu
       California Digital Library
       http://www.cdlib.org/