KAT HAGEDORN
HATHITRUST SPECIAL PROJECTS COORDINATOR
UNIVERSITY OF MICHIGAN LIBRARIES
OCTOBER 9, 2009
Seamless Sharing:
NYU, HathiTrust, ReCAP and the
Cloud Library
With thanks to Constance Malpas at OCLC and John Wilkin at University of Michigan for their considerable contributions
Overview
Need for cloud library
Our pilot project
Brief overview of HathiTrust
Scope and process for pilot project
Expectations and benefits
Cloud Library, not cloud computing
Similar but vastly different
Necessity/desire to share resources
- leverage shared investment, reduce local cost
Multiple digital and print repositories
Repositories can now move into a “cloud” that will
become a shared network resource
What infrastructure is needed?
[System diagram: shared collections infrastructure]
Layer labels: Procedures, Policies, Infrastructure, Assets
Assets: Local Collections; Off-Site Collections (ReCAP); Digitized Library Collections
Other labels: Registry, System, Shared Collections, Transfers, Borrowing, Withdrawals, Retrievals, Commitments, Holdings, Loans, Disclose
Aggregate holdings and joint commitments constitute a shared asset enabling collaborative management strategies
Perceived need
Already good support of other “virtual” shared
services, e.g., ILL, doc delivery
What exists in off-site storage and digital
repositories that isn’t currently accessible?
Collection development mechanisms need to
discover accessibility and preservation statuses
How should we build such a service for consumers?
Partners in pilot
NYU – model customer
- Acute space pressures; major library renovation
- Limited mandate to build local collection of record
ReCAP – model supplier
- Large-scale shared academic storage collection
HathiTrust – model supplier
- Large-scale shared digital repository
OCLC Research and CLIR – consultants & convener
Demand for services
Multiple, sometimes overlapping, reasons
institutions will be interested in being part of a cloud
library
- preserving titles that are rare and/or special in some manner
- removing titles that are duplicated across many institutions
- gaining the added value of shared materials in a digital repository (discovery, search)
- contributing to a public good
A bit about HathiTrust
To contribute to the common good by collecting,
organizing, preserving, communicating, and sharing
the record of human knowledge
- materials converted from print
- improve access …to meet the needs of the co-owning institutions
- reliable and accessible electronic representations
- coordinate shared storage strategies
- “public good” …sustaining the historical record
- simultaneously …centralized …open
Growth of HathiTrust
- Includes ingest of materials not from Google (GBS)
Goals of pilot study
service expectations for both digital and print
repositories
cost/benefit analyses for sharing resources
processes for discovery of shareable titles
not the build-out of technical solutions
Intersections
[Venn diagram of overlapping collections; extracted labels: N=7.6M, ReCAP, N=3.8M, HathiTrust, N=2.3M]
- Material that NYU can already source through existing ILL: enhance local collection
- Material that NYU can obtain through HT, dependent on copyright status: enhance ‘local’ collection
- Material that NYU may choose to relegate with appropriate service level agreement
- Material that NYU can relegate with a high degree of confidence
- opportunities for institutional cooperation
- shared policy frameworks
- joint service agreements
- increased operational efficiencies
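The intersections above can be read as set operations over title identifiers. What follows is only a minimal sketch in Python, assuming each collection has been reduced to a set of OCLC numbers. The function names, the sample-style inputs, and the "pd"/"ic" rights codes are illustrative assumptions, not the pilot's actual data or tooling.

```python
def classify_nyu_titles(nyu, recap, hathi_rights):
    """Bucket titles by which collections hold them.

    nyu, recap: sets of OCLC numbers (hypothetical inputs).
    hathi_rights: dict mapping OCLC number -> "pd" (public domain)
    or "ic" (in copyright); the codes are assumed for illustration.
    """
    hathi = set(hathi_rights)
    return {
        "in_all_three": nyu & recap & hathi,
        "nyu_and_hathi_only": (nyu & hathi) - recap,
        "nyu_and_recap_only": (nyu & recap) - hathi,
        "nyu_only": nyu - recap - hathi,
        # Titles NYU does not hold: candidates for virtual enhancement.
        "not_held_by_nyu": (recap | hathi) - nyu,
    }


def delivery_source(oclc, recap, hathi_rights):
    """Suggest where a title could be sourced from (assumed policy).

    Public-domain titles come from HathiTrust; anything else held in
    ReCAP is delivered in print from ReCAP; otherwise no shared source.
    """
    if hathi_rights.get(oclc) == "pd":
        return "HathiTrust"
    if oclc in recap:
        return "ReCAP"
    return None
```

The three-way bucket corresponds to the material that could be relegated with the most confidence, since a title there has both a digital and a print copy in shared custody.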
Process for discovery of overlap
Ingestion on a monthly basis
Checking of OCLC numbers (records without one can’t be
processed); use of xID to derive more
New data structure…
[Workflow diagram: monthly harvest and overlap analysis]
Steps shown: Harvest Hathi metadata; Extract OCLC numbers; Derive additional OCLC numbers via xID; Extract WorldCat data; Normalize rights values; Join Hathi and WorldCat data; Process, index, analyze
Reports produced: Rights anomalies report; OCLCnum report; Overlap analysis report
Monthly data harvest; 2 weeks per cycle to process
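The harvest-and-join steps above can be sketched roughly as follows. This is a hedged illustration in Python, not the pilot's actual code: the record layout (dicts with "id", "oclc", "rights"), the rights vocabulary, and the `derived_oclc` lookup standing in for numbers derived via xID are all assumptions made for the example.

```python
def normalize_rights(value):
    """Map raw rights strings onto a small vocabulary (assumed mapping)."""
    aliases = {
        "pd": "pd",
        "public domain": "pd",
        "ic": "ic",
        "in copyright": "ic",
    }
    return aliases.get(value.strip().lower(), "unknown")


def join_on_oclc(hathi_records, worldcat_by_oclc, derived_oclc=None):
    """Join Hathi records to WorldCat data on OCLC number.

    hathi_records: list of dicts with "id" and optional "oclc"/"rights".
    worldcat_by_oclc: dict mapping OCLC number -> list of holding symbols.
    derived_oclc: optional dict mapping record id -> OCLC number, a
    stand-in for numbers derived by an external service.
    Returns (joined rows, ids of records that could not be matched).
    """
    derived_oclc = derived_oclc or {}
    joined, unmatched = [], []
    for rec in hathi_records:
        oclc = rec.get("oclc") or derived_oclc.get(rec["id"])
        # Records without a usable OCLC number cannot be processed.
        if oclc is None or oclc not in worldcat_by_oclc:
            unmatched.append(rec["id"])
            continue
        joined.append({
            "id": rec["id"],
            "oclc": oclc,
            "rights": normalize_rights(rec.get("rights", "")),
            "holdings": worldcat_by_oclc[oclc],
        })
    return joined, unmatched
```

In this sketch the unmatched list plays the role of an OCLC-number report, and rows whose rights normalize to "unknown" would be candidates for a rights anomalies report.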
HathiTrust: Looking forward
Ingesting from 4 institutions (UC, Indiana,
Wisconsin, Michigan), more to come
Moving from off-site storage scanning to main
libraries
Result: slight changes in number of PD volumes
Change in membership …broader base of institutions
for cost-sharing
Future contracts will mostly be picklists
Internet Archive ingest starts this winter/late fall
Completion of TRAC certification
Expectations
Service expectations for both HathiTrust and ReCAP
- turnaround time
- continuity of operations
- access privileges
For ReCAP, agreements similar to current processes
With HathiTrust, all are par for the course
Partners in cloud library with HathiTrust
With HathiTrust as a service partner, institutions
can reap the benefits of…
- preservation of texts and metadata
- longevity and perpetuity
- trust and reliability
- access to titles not held by library (comprehensive)
- opportunity for voice in HathiTrust development
Outcome
Increased reliance on a network of collections and
services with a robust underpinning of shared policy
and service infrastructures that are jointly owned by
participating libraries
Naturally, as number of participants grows, value of
partnership increases…
Questions?
Constance Malpas (OCLC): malpasc@oclc.org
John Wilkin (HathiTrust): jpwilkin@umich.edu
Kat Hagedorn (HathiTrust): khage@umich.edu
- http://hathitrust.org/
- hathitrust-info@umich.edu


Editor's Notes

  1. We want people to perceive an analogy with cloud-sourcing core business services (storage, distribution) in the library environment. This is motivated partly by economic imperatives (the opportunity to leverage shared investment and reduce local cost) but also by a fundamental transformation in the way that library services are consumed in the network.
  2. A system diagram. We already have reasonably good infrastructure to support ‘virtual’ shared collections via inter-lending and document delivery. What we lack is a view of the assets that are locked up in digital repositories and off-site storage. To make the cloud library work, we need to make the preservation status and availability of this content more visible for collection management decision making. We also need to understand the service expectations that consumers will bring to the table. The diagram shows the data flows and business processes that will need to be supported in a large-scale shared collections environment; it requires that we make print/digital repositories part of a shared library infrastructure (the dotted line around shared collections). But before we build that kind of infrastructure, we have to figure out what kinds of inter-institutional agreements will actually support a wide-scale shift to reliance on shared collections. The 'Cloud Library' project is seeking to expand the scope and scale of shared collections by making digital repositories (Hathi, primarily) and aggregate storage inventory (ReCAP, MLAC, SRLF etc., the 1 billion books that we've already socked away in high-density facilities) part of the core service infrastructure. We've got a good, robust and dependable architecture for informal collection sharing via ILL; what we lack is a comparable set of social agreements and registry services for the large-scale print and digital repository collections.
  3. Mass digitization has created an opportunity for libraries to rethink the function of locally held inventory. The combination of large-scale digital access (via subscription mechanisms, in the main), large-scale preservation repositories (Hathi), and the already extant 'latent' infrastructure in our large-scale storage collections provides the foundation for a new collections economy that can free up a tremendous amount of library resource. Sharing via ILL relies on mechanisms for disclosing holdings and on the policies that govern inter-lending practices.
  4. Worth noting here that NYU is representative of a large cohort of academic institutions that are looking for ways to participate in a 'multi-institutional' model of library management (this is a useful hook for the people who've read the CLIR “No Brief Candle” report that Paul Courant helped write).
  5. The scope of the CL project is really limited to the 'social/economic infrastructure' embodied in service agreements and not the build-out of technical solutions.
  6. The intersection between NYU and HathiTrust holdings creates opportunities for changed management of physical inventory at NYU (and other institutions), as does the intersection between NYU and ReCAP holdings. The nature of the opportunity varies according to the relative ‘availability’ of content in digital or print form as well as local demand patterns. The intersection of NYU/Hathi/ReCAP collections represents a particularly significant opportunity: titles in the public domain can be sourced from Hathi; titles in copyright can be discovered and searched in Hathi and delivered from ReCAP. Note that ReCAP and Hathi collections are growing at a faster pace than NYU’s own holdings, accelerating the rate at which NYU can increase its reliance on the shared repositories. Hathi is adding several hundred thousand volumes per month. ReCAP is adding about 57K vols per month. NYU is probably adding under 500K volumes to its collections annually. The opportunities for virtual enhancement of local collections are on the periphery.
  7. The process currently in place for harvesting, merging and analyzing Hathi and WorldCat data. Some steps take longer than others, depending on staff and computing resources. We need about 2 weeks per cycle; Hathi data is made available on a monthly cycle. The current schedule is to draw down HT data the first week of the month, with the aim of completing processing and indexing by mid-month. This gives OCLC about 10 days to identify patterns and discuss them with partners before the next harvest. In terms of process, OCLC harvests HT data, finds those records with an OCLC number (critical info), and incorporates ReCAP holdings; the result is a determination of the overlap of the three "repositories." Having identifying numbers is critical because there is no other way to appropriately match across the repositories. OCLC numbers are best, but other numbers can be folded into the OCLC processing routines.
  8. All these are important to fulfill the needs of the cloud library.