Institutional Uses of HathiTrust


Jeremy York's (HathiTrust) presentation at the Maine InfoNet Collections Summit, May 24, 2013, University of Maine, Orono, ME.

Published in: Education, Technology

  • Would like to offer print on demand for as many volumes as possible. It is up to institutions to set this up. Michigan has an arrangement with HP and Amazon.
  • Variety of APIs: Data API, Bib API, OAI, Hathifiles. Like collections, these can be used to contextualize portions of HathiTrust for a local audience. Institutions are using these to bring records into local catalogs, and also for different kinds of collection analysis.
  • Bib API
  • OAI
  • Hathifiles
  • Human sexuality; Greek and Latin Literature. Talk about datasets here:
      – Google-digitized: ~2.8 million texts; requires proposal to HathiTrust; agreement with Google; statement on use/management
      – Non-Google-digitized: > 350,000 texts; freely available; statement on management
  • Institutional Uses of HathiTrust

    1. HathiTrust: A Shared Digital Repository. Institutional Uses of HathiTrust. Jeremy York, University of Maine, May 24, 2013
    2. Partnership: Arizona State University; Baylor University; Boston College; Boston University; Brandeis University; Brown University; California Digital Library; Carnegie Mellon University; Columbia University; Cornell University; Dartmouth College; Duke University; Emory University; Florida State University; Getty Research Institute; Harvard University Library; Indiana University; Iowa State University; Johns Hopkins University; Kansas State University; Lafayette College; Library of Congress; Massachusetts Institute of Technology; McGill University; Michigan State University; New York Public Library; New York University; North Carolina Central University; North Carolina State University; Northwestern University; The Ohio State University; The Pennsylvania State University; Princeton University; Purdue University; Stanford University; Syracuse University; Texas A&M University; Tufts University; Universidad Complutense de Madrid; University of Alberta; University of Arizona; University of Calgary; University of California (Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Diego, San Francisco, Santa Barbara, Santa Cruz); The University of Chicago; University of Connecticut; University of Delaware; University of Florida; University of Houston; University of Illinois; University of Illinois at Chicago; The University of Iowa; University of Kansas; University of Maryland; University of Miami; University of Michigan; University of Minnesota; University of Missouri; University of Nebraska-Lincoln; The University of North Carolina at Chapel Hill; University of Notre Dame; University of Pennsylvania; University of Pittsburgh; University of Utah; University of Vermont; University of Virginia; University of Washington; University of Wisconsin-Madison; Utah State University; Vanderbilt University; Virginia Tech; Wake Forest University; Washington University; Yale University Library
    3. Digital Repository
       • Launched 2008
       • Initial focus on digitized book and journal content
         – 10.7 million total volumes
         – 5.6 million book titles
         – 278,000 serial titles
         – 3.3 million public domain (~31%)
    4. Mission
       • To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge
    5. Universal Library; Common Goal; Single Entity, Many Partners; HathiTrust
    6. Collections and Collaboration
       • Comprehensive collection
         – Preservation…with Access
         – Repository centralized, yet open
       • Shared strategies
         – Copyright
         – Collection management, development
         – Preservation
         – Discovery / Use
         – Bibliographic Indeterminacy
         – Efficient user services
       • Public Good
    7. Copyright Distribution
       • In-copyright or undetermined: 69%
       • "Public Domain": 31%
         – Public Domain (worldwide): 16%
         – Public Domain (US): 11%
         – U.S. Federal Government Documents (worldwide): 4%
         – Open Access: .1%
         – Creative Commons: .04%
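As a quick arithmetic check on the Copyright Distribution slide, the sketch below sums the individual categories. It assumes that every category other than "In-copyright or undetermined" counts toward the bracketed "Public Domain" 31% figure; the slide's layout suggests this but does not state it outright. The variable names are illustrative only.

```python
# Shares transcribed from the Copyright Distribution slide (percent of all volumes).
shares = {
    "In-copyright or undetermined": 69.0,
    "Public Domain (worldwide)": 16.0,
    "U.S. Federal Government Documents (worldwide)": 4.0,
    "Public Domain (US)": 11.0,
    "Open Access": 0.1,
    "Creative Commons": 0.04,
}

# Assumption: everything except the first category belongs to the
# bracketed "Public Domain" total shown on the slide.
open_categories = [name for name in shares if name != "In-copyright or undetermined"]
open_total = sum(shares[name] for name in open_categories)
grand_total = sum(shares.values())

print(f"Openly available categories: {open_total:.2f}%")
print(f"All categories combined:     {grand_total:.2f}%")
```

The combined total lands slightly above 100%, which is consistent with the slide's figures having been rounded individually.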
    8. Content Sources: Michigan, 43%; California, 32%; Wisconsin, 5%; Cornell, 4%; NYPL, 2%; Princeton, 2%; Indiana, 2%; Harvard, 2%; Columbia, 1%; LoC, 1%; Madrid, 1%; Minnesota, 1%; Illinois, 1%; Virginia, 0%; Chicago, 0%; Duke, 0%; NCSU, 0%; Northwestern, 0%; Penn State, 0%; Purdue, 0%; UNC-Chapel Hill, 0%; Utah State, 0%; Yale, 0%; Florida, 0%; Boston College, 0%
    9. Preservation...with Access
       • Long-term preservation
         – Bit-level and migration
       • Bibliographic search
       • Full-text search
       • Copyright review
       • Reading and download capabilities
         – Access for users who have print disabilities
         – Access to out-of-print and brittle books
         – Subject to terms and conditions at
       • Support beyond books and journals
    10. Centralized...yet open
       • Services
         – Print on demand
         – ILL
       • Contextualizing materials
         – Linking from local catalogs
         – Collections
    11. Linking in Local Catalogs
       • Bibliographic API
         – Volume and rights information
         – MARC records
       • OAI
       • “Hathifiles”
       • Data API
         – Volume and rights information
         – Page images
         – OCR
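To illustrate how a tab-delimited inventory such as the "Hathifiles" might be used to link from a local catalog, here is a minimal sketch. Everything specific in it is an assumption for illustration: the three-column layout (`htid`, `access`, `oclc_num`) is a simplified stand-in for the much wider real files, the sample volume identifiers are made up, the handle-style URL prefix is assumed rather than taken from the slides, and `links_for_local_catalog` is a hypothetical helper, not part of any HathiTrust tooling.

```python
import csv
import io

# Hypothetical, simplified tab-delimited extract. The real files carry many
# more columns; the three used here and all sample values are assumptions.
SAMPLE = (
    "htid\taccess\toclc_num\n"
    "mdp.39015000000001\tallow\t1234567\n"
    "uc1.b000000002\tdeny\t7654321\n"
    "mdp.39015000000003\tallow\t1111111,2222222\n"
)

# Assumption: a volume can be linked through a handle-style URL built from its id.
HANDLE_PREFIX = "https://hdl.handle.net/2027/"


def links_for_local_catalog(hathifile_text, local_oclc_numbers):
    """Map each local OCLC number to links for volumes marked as viewable."""
    wanted = set(local_oclc_numbers)
    links = {}
    reader = csv.DictReader(io.StringIO(hathifile_text), delimiter="\t")
    for row in reader:
        if row["access"] != "allow":
            continue  # skip volumes that are not open for viewing
        # A single volume may list several comma-separated OCLC numbers.
        for oclc in row["oclc_num"].split(","):
            if oclc in wanted:
                links.setdefault(oclc, []).append(HANDLE_PREFIX + row["htid"])
    return links


local_holdings = ["1234567", "7654321", "2222222"]
result = links_for_local_catalog(SAMPLE, local_holdings)
for oclc, urls in result.items():
    print(oclc, urls)
```

Matching on OCLC number is only one possible key; a local system could equally match on ISBN, LCCN, or a bibliographic record number if those are what its catalog records carry.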
    12. Collections
    13. Collection Examples
       • Physical spaces
       • Collections within HathiTrust
         – English Short Title Catalog
         – USU Press, UM Press
         – Indiana University Folklore
         – Islamic Manuscripts
         – Incunabula (Universidad Complutense de Madrid)
       • Collections drawn from the collection
         – Patent Indexes
         – 19th-century cookbooks
         – Dictionaries
         – Anarchism pamphlets
         – Historical and topical collections
       • Datasets
    14. Rights Determination
       • CRMS-US (since 2008)
         – Published in US, 1923-1963
         – 245,305 reviewed
         – 131,880 opened (~54%)
       • CRMS-World (since 2012)
         – Published non-US (UK, Canada, Australia, Spain)
         – 44,746 reviewed
         – 24,154 opened (~54%)
       • Permissions
         – Open access: 6,656
         – Additional Creative Commons: 5,542
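The two "~54%" figures on the Rights Determination slide can be reproduced from the reviewed and opened counts. The sketch below does only that arithmetic; `open_rate` is an illustrative helper name, not something from the presentation.

```python
def open_rate(opened: int, reviewed: int) -> float:
    """Return the percentage of reviewed volumes that were opened."""
    if reviewed <= 0:
        raise ValueError("reviewed must be positive")
    return 100.0 * opened / reviewed


# (opened, reviewed) counts as given on the slide.
crms = {
    "CRMS-US": (131_880, 245_305),
    "CRMS-World": (24_154, 44_746),
}

for name, (opened, reviewed) in crms.items():
    print(f"{name}: {open_rate(opened, reviewed):.1f}% of reviewed volumes opened")
```

Both programs round to 54% even though one covers US publications and the other covers works published elsewhere.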
    15. Total Volumes and Public Domain volumes by year (chart with data table)

        Year          Total Volumes    Public Domain
        2008          2,477,871        372,085
        2009          5,221,092        758,947
        2010          7,836,698        1,959,223
        2011          9,966,572        2,712,626
        2012          10,622,285       3,296,941
        2013 (May)    10,724,270       3,373,522
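The yearly figures on this slide make it possible to see how the public domain share of the repository changed over time. The following sketch computes that share per year from the slide's numbers; the function name and output layout are illustrative choices, not part of the original deck.

```python
# (label, total volumes, public domain volumes) as given on the slide.
growth = [
    ("2008", 2_477_871, 372_085),
    ("2009", 5_221_092, 758_947),
    ("2010", 7_836_698, 1_959_223),
    ("2011", 9_966_572, 2_712_626),
    ("2012", 10_622_285, 3_296_941),
    ("2013 (May)", 10_724_270, 3_373_522),
]


def public_domain_share(total: int, public_domain: int) -> float:
    """Return public domain volumes as a percentage of total volumes."""
    return 100.0 * public_domain / total


pd_shares = {label: public_domain_share(total, pd) for label, total, pd in growth}

for label, share in pd_shares.items():
    print(f"{label:<11} {share:5.1f}% public domain")
```

The May 2013 value agrees with the "~31%" quoted on the Digital Repository slide.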
    16. Collection Management, Development
       • Overlap
         – More than 50% median overlap with ARL institutions; higher for small liberal arts colleges
       • Pricing model based on print holdings
         – Requires print holdings database
         – Also support expansion of legal uses, efforts in de-duplication
         – Facilitate individual and collaborative collection development and management operations
       • Print monographs archiving
    17. How to find out more
       • About:
       • Twitter:
       • Facebook:
       • Monthly newsletter:
         – RSS
       • Contact us:
       • Blogs:
         – Large-scale Search
         – Perspectives from HathiTrust