The purpose of today’s session is to share preliminary results of some recent research on unique print book titles, acknowledge the contributions that RLG partners have made to this effort, and offer attendees an opportunity to help shape the final report for this project.
Most rare book (RB) collections consist of early printed works: volumes printed before 1850 in the Americas and before 1775 in Europe and the other continents.
WorldCat (excluding article-level metadata) has nearly doubled in size over the last 5 years. Yet even as global coverage has increased significantly, the proportion of unique holdings in the database has continued to grow.
I want to start by establishing some context for our current work on unique titles. “Last expressions” are “the only known manifestation of specific intellectual or artistic content.” Measuring duplication at the title or manifestation level is inadequate; we must consider the relative uniqueness of the content.
“Unique” can be defined in absolute terms; “rare” is relative to a particular set of curatorial interests. In the mid-1980s, Ross Atkinson proposed a “materialistic” typology of preservation priorities, with each class having a distinctive kind of value: Class 1 materials were rare books and manuscripts with high economic and research value; Class 2 materials were heavily used content at risk of physical deterioration; Class 3 materials were infrequently used content with enduring scholarly value but little economic value.
Work: A distinct intellectual or artistic creation. Modifications involving a significant degree of independent intellectual effort, such as paraphrases, rewritings, adaptations for children, parodies, abstracts, digests, and summaries, are considered to be different works.
Expression: The intellectual or artistic realization of a work. The boundaries of an expression are defined to exclude aspects of physical form (typeface, page layout, etc.). Revisions, updates, abridgements, enlargements, and translations are different expressions of the same work. Any revision or modification, no matter how minor, is considered to be a new expression.
Manifestation: The physical embodiment of an expression of a work. A manifestation represents all the physical objects that bear the same intellectual and physical characteristics. Changes in typeface, size of font, page layout, or publisher will result in a new manifestation. New printings are not considered to be a new manifestation unless other significant changes are also made. The same manifestation may have different bindings (hardcover vs. paperback), different types of paper (regular or acid-free), or other variations (thumb-indexed) that do not significantly affect the printed image.
Item: A single exemplar of a manifestation. All changes that occur after the manufacturing process (defacement, rebinding, etc.) are considered changes to the item and do not result in a new manifestation.
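The four entities nest inside one another, which can be sketched as a small data model. This is only an illustration of the hierarchy described above: the class fields, the `count_entities` helper, and the example titles, publishers, and libraries are all hypothetical and are not part of FRBR or of the project's tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation (one physical copy)."""
    holding_library: str
    rebound: bool = False  # a post-manufacture change; it stays at item level


@dataclass
class Manifestation:
    """The physical embodiment of an expression (typeface, layout, publisher)."""
    publisher: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The intellectual or artistic realization of a work (e.g. a translation)."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_entities(work: Work) -> dict:
    """Count the expressions, manifestations and items beneath one work."""
    manifestations = [m for e in work.expressions for m in e.manifestations]
    return {
        "expressions": len(work.expressions),
        "manifestations": len(manifestations),
        "items": sum(len(m.items) for m in manifestations),
    }


# Hypothetical example: one work with an original text and a translation.
# A change of publisher yields a second manifestation of the original text;
# a rebound copy is still just an item of its manifestation.
work = Work("Example treatise", [
    Expression("original text", [
        Manifestation("Publisher A", [Item("Library X"), Item("Library Y", rebound=True)]),
        Manifestation("Publisher B", [Item("Library Z")]),
    ]),
    Expression("translation", [
        Manifestation("Publisher C", [Item("Library X")]),
    ]),
])

print(count_entities(work))
```

Counting at different levels of this tree gives different answers to "how many copies exist", which is why duplication measured only at the manifestation level can understate or overstate the overlap in content.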
(1) Rec type = ‘a’ or ‘t’ (books and manuscripts)
(2) Bib lvl = ‘m’
(3) Enc lvl not = ‘8’ (no CIP)
(4) 245 subfield h not = “microform” or "electronic resource"
(5) 533 subfield a not = "microfilm", "microopaque", "micro opaque", "microfiche", "microprint", "microcard", "microform", "electronic reproduction", "electronic resource", or "braille"
(6) No 856 subfield 3
(7) Published before 2000
    (a) for date types e, r, s, t: date1 < 2000
    (b) for date types m, q: date2 < 2000
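The selection criteria above can be read as a single record filter. The following is a minimal sketch, not the code used in the study. It assumes each record has already been flattened into a dictionary, and the key names (`rec_type`, `bib_lvl`, `enc_lvl`, `f245h`, `f533a`, `has_856_3`, `date_type`, `date1`, `date2`) are hypothetical. It follows the MARC 21 convention in which record type ‘a’ or ‘t’ marks language material and bibliographic level ‘m’ marks a monograph. Two further assumptions are flagged in the comments: excluded terms are matched as case-insensitive substrings, and records with any other date type are left out.

```python
EXCLUDED_245H = ("microform", "electronic resource")
EXCLUDED_533A = (
    "microfilm", "microopaque", "micro opaque", "microfiche", "microprint",
    "microcard", "microform", "electronic reproduction",
    "electronic resource", "braille",
)


def contains_any(value, terms):
    """True if any excluded term occurs in the field (assumed substring match)."""
    text = (value or "").lower()
    return any(term in text for term in terms)


def published_before_2000(date_type, date1, date2):
    """Criterion 7: which date is tested depends on the date type."""
    if date_type in ("e", "r", "s", "t"):
        return date1 is not None and date1 < 2000
    if date_type in ("m", "q"):
        return date2 is not None and date2 < 2000
    return False  # assumption: other date types are out of scope


def in_scope(rec):
    """Apply criteria 1-7 to one flattened record (hypothetical dict layout)."""
    return (
        rec.get("rec_type") in ("a", "t")                       # (1)
        and rec.get("bib_lvl") == "m"                           # (2)
        and rec.get("enc_lvl") != "8"                           # (3) no CIP
        and not contains_any(rec.get("f245h"), EXCLUDED_245H)   # (4)
        and not contains_any(rec.get("f533a"), EXCLUDED_533A)   # (5)
        and not rec.get("has_856_3", False)                     # (6)
        and published_before_2000(
            rec.get("date_type"), rec.get("date1"), rec.get("date2")
        )                                                       # (7)
    )


# Hypothetical record: a printed monograph with a single date of 1962.
example = {
    "rec_type": "a", "bib_lvl": "m", "enc_lvl": "I",
    "f245h": None, "f533a": None, "has_856_3": False,
    "date_type": "s", "date1": 1962, "date2": None,
}
print(in_scope(example))
```

Keeping each criterion on its own line makes it easy to see which rule removed a given record when the counts at each step are being checked.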
A previous study of Vanderbilt’s uniquely held books identified ‘last expressions’ as a class of material deserving careful scrutiny. Our current project confirms that a significant number of unique holdings represent unique intellectual content, i.e., content for which only a single expression exists within the aggregate collection of WorldCat libraries.
Both in absolute terms (total number of titles/records in the sample)
And in relative terms, with theses and dissertations and technical reports representing the greatest proportion of unique works.
Theses and dissertations account for 20% of the titles in our sample and more than a quarter of the titles identified as unique works. Most of the durable uniqueness can be attributed to master’s theses, which rarely have more than a single institutional holding in any format. Theses and dissertations are of particular interest, as they represent a source of “locally produced” uniqueness for university libraries.
Books without Boundaries also found ca. 50% non-English titles in its sample of uniquely held works. The language distribution for unique works is not substantially different from that for other titles.
Nonetheless, most of the imprints in our sample were published outside of the United States.
This answers the question posed in Books without Boundaries regarding the institutional distribution of unique books, confirming that institutions with a strong preservation mission hold the greatest proportion of such titles. Ranked order of institution types by total holdings in WorldCat: non-ARL academic, ARL, public, special, government, school, state and national.
Similarly, in the Vanderbilt study, more than half of the titles in the sample were published after 1950, though the relative proportion of unique titles was highest for the earlier period. That is, as the Google Books analysis suggests, duplication of holdings is inversely proportional to the age of the book until the 1980s, when holdings become relatively scarce again.
http://www.flickr.com/photos/library_of_congress/2162895505/ Bain News Service, publisher. Greece in N.Y. 4th of July Parade [between 1910 and 1915]
Assessing Uniqueness in the System-wide Book Collection
RLG Programs: Assessing Uniqueness in the System-wide Book Collection
RLG Webinar – 24 April 2008
Why investigate unique print books?
- Future of library print collections is in question
  - We need better “management intelligence” about where continued investment in print collections, both legacy holdings and future acquisitions, should be directed
- Uniquely-held content may be an asset or liability
  - Institutional assets that may be leveraged through digitization and resource-sharing agreements
  - Potential preservation risks, if the content is not adequately cared for
- Size, character and distribution of aggregate collection has broad implications
  - Digitization: identifying distinctive collections
  - Disclosure: maximizing discoverability
  - Distributed print archiving: sizing the need
Who’s Involved
OCLC Programs & Research:
- Constance Malpas, Program Officer
- Ed O’Neill, Senior Research Scientist
- Brian Lavoie, Research Scientist
RLG Partners:
- Arizona State University
- Columbia University
- Duke University
- Florida State University
- Harvard University
- Indiana University
- Library of Congress
- New York Public Library
- New York University
- University of Alberta
- University of Arizona
- University of California, Berkeley
- University of California, Los Angeles
- University of Chicago
- University of Michigan
- University of Minnesota
- University of Pennsylvania
- University of Texas, Austin
- Yale University
- …among others
Unique vs. rare: a distinction with a difference
- “Unique” = single holding attached to master record in WorldCat describing a distinct manifestation / edition
  - some uniquely held titles may be associated with multiple local copies
- “Rare” typically describes material that is in limited supply and has special value to a particular audience
  - Few copies were produced
  - Few remaining copies available on the market
  - Distinctive intellectual content or artifactual features (binding, signatures)
Growth of Unique Holdings in WorldCat
[Chart: master records by date of snapshot (Jan-03, Jan-05, Jan-07, Jan-08); data labels as extracted: 50%, 49%, 42%, 44%]
- Proportion of master records with a single holding has increased 8% since 2003
Background
- Anatomy of Aggregate Collections (2005)
  - Thin duplication of book holdings across “Google Five” libraries (~40%) and between aggregate collection and rest of WorldCat (~30%)
  - Proportion of uniquely held titles decreases as publication date advances, until 1980s
- Books without Boundaries (2006)
  - 9.5M uniquely held works representing 36% of works in WorldCat; preservation implications
  - Unique titles in WorldCat represent ~2/3 of total print production; significant collection gap
- Last Copies: What’s at Risk? (2006)
  - “last expressions”: a conceptual model
  - 26K unique titles at Vanderbilt; typically “old, foreign, short”
- Global Resources Report (2007)
  - Limited redundancy in ARL holdings of non-North American imprints (~3 to ~6 holdings per title)
Importance of FRBR
- Measuring duplication at the “work” or expression level provides maximum measure of overlap for intellectual content
- Uniquely-held manifestations may represent artifactual treasures
  - Book history: bindings, printers
  - Provenance: autographs, annotations
- Implications for collection management
  - Unique works represent distinctive intellectual assets
  - Unique manifestations may require curatorial care
FRBR: Group One Entities
- Work: a distinct intellectual or artistic creation; is realized through an Expression
- Expression: the intellectual or artistic realization of a work; is embodied in a Manifestation
- Manifestation: the physical embodiment of an expression; is exemplified by an Item
- Item: a single exemplar of a manifestation
Goals of current last copies work
- Evaluate relative proportion of unique works in a representative and statistically significant sample
  - Application of FRBR
- Characterize material and content types
  - “old, foreign, short”
- Examine distribution of holdings by library-type
  - preservation infrastructure
- Assess preservation status and circulation history of selected titles
  - In 1995 study of titles published 1850-1940, 12% were not available for study (missing, not on shelf)
Sample Characteristics
Fractional sample of 250 records representing:
- January 2007 snapshot of WorldCat
  - 74.5M bibliographic records
- Master records with a single holding symbol
  - 36.8M records
- Monographic language-based titles, excluding non-print formats (electronic resources, microforms, braille)
  - 14.7M records
Further limits were applied to facilitate analysis:
- English-language cataloging only
  - Common descriptive standards
- Titles published before 2000
  - Avoid ‘first copy’ (cataloging lag) problem
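The narrowing from the full snapshot to the in-scope population is easier to follow as proportions. The short calculation below uses only the three record counts quoted on this slide; the variable and function names are illustrative.

```python
# Record counts in millions, as quoted for the January 2007 WorldCat snapshot.
FUNNEL = [
    ("All bibliographic records", 74.5),
    ("Master records with a single holding symbol", 36.8),
    ("Monographic, language-based, print formats only", 14.7),
]


def funnel_shares(steps):
    """Return (label, count, % of first step, % of previous step) per step."""
    total = steps[0][1]
    rows = []
    previous = None
    for label, count in steps:
        share_of_total = round(100 * count / total, 1)
        share_of_previous = (
            None if previous is None else round(100 * count / previous, 1)
        )
        rows.append((label, count, share_of_total, share_of_previous))
        previous = count
    return rows


for label, count, of_total, of_previous in funnel_shares(FUNNEL):
    step = "n/a" if of_previous is None else f"{of_previous}%"
    print(f"{label}: {count}M ({of_total}% of all records, {step} of previous step)")
```

On these figures, single-holding master records are about 49.4% of the snapshot, and the print monographs among them are about 39.9% of single-holding records, or roughly one fifth (19.7%) of all records.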
Research Methods
- Independent assessment followed by team review
- Combination of machine and manual analysis
  - Connexion, FirstSearch, MARCView
- Level of uniqueness
  - work: content is not duplicated within WorldCat
  - expression: distinctive expression of duplicated content
  - manifestation: alternate editions available in WorldCat
  - analytic: content is part of a larger published work
  - duplicate record found: cataloging anomalies
- Material / content types
  - Non-fiction books; technical reports; language / literature; archival materials; ephemera
  - Theses and dissertations (baccalaureate, masters, PhD)
  - Government documents (national, state, local)
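The five levels of uniqueness listed above can be read as a decision sequence. The sketch below is one possible encoding and not the team's actual procedure, which combined independent manual assessment with team review. The four yes/no judgments, their names, and the order in which they are tested are all assumptions made for illustration, and the tallied records are made up.

```python
from collections import Counter
from enum import Enum


class Level(Enum):
    WORK = "unique work"
    EXPRESSION = "unique expression"
    MANIFESTATION = "unique manifestation"
    ANALYTIC = "unique analytic"
    DUPLICATE = "duplicate record found"


def classify(duplicate_record_found, part_of_larger_work,
             content_found_elsewhere, same_expression_found):
    """Map four reviewer judgments (hypothetical) to a level of uniqueness."""
    if duplicate_record_found:
        return Level.DUPLICATE      # cataloging anomaly, not truly unique
    if part_of_larger_work:
        return Level.ANALYTIC       # content is part of a larger published work
    if not content_found_elsewhere:
        return Level.WORK           # content is not duplicated within WorldCat
    if not same_expression_found:
        return Level.EXPRESSION     # distinctive expression of duplicated content
    return Level.MANIFESTATION      # alternate editions are available


# Made-up judgments for four records, only to show how a tally would work.
judgments = [
    (False, False, False, False),
    (False, False, True, False),
    (False, False, True, True),
    (True, False, True, True),
]
tally = Counter(classify(*j) for j in judgments)
for level, count in tally.items():
    print(f"{level.value}: {count}")
```

Testing for duplicate records first reflects the idea that a cataloging anomaly should not be counted as any kind of uniqueness; a different ordering of the checks would be equally defensible.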
Levels of Uniqueness within Sample
[Chart: share of sample by level of uniqueness (non-unique, unique analytics, unique manifestations, unique expressions, unique works), N = 250 records; chart annotation: cataloging shortfalls]
- >60% of titles in sample represent unique intellectual content
Content and Material Types
[Chart: share of sample by content and material type, N = 250 records]
- Non-fiction published books: 33%
- Theses and dissertations: 20%
- Technical reports: 15%
- Serials: 10%
- Literature, poetry: 7%
- Archival materials: 3%
- Other (ephemera, catalogs, manuals, directories, etc.): 12%
Academic and technical content predominates . . .
Range of Unique Works by Material Type
[Chart: material types representing >5% of titles in sample; chart annotation: more manifestations]
- “grey literature” contains greatest proportion of unique intellectual content
Theses and Dissertations
[Chart: number of records by degree level (Masters, Doctoral, Baccalaureate); series: total in sample, unique works, held by issuing institution; N = 49 records]
- 75% are unique works
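The counts on this slide can be tied back to the full sample with a line or two of arithmetic. The calculation below uses only figures quoted in the deck (250 records in the sample, 49 theses and dissertations, 75% unique works); the approximate count of unique theses is derived from the rounded percentage and is therefore itself approximate.

```python
SAMPLE_SIZE = 250        # records in the full sample
THESES = 49              # theses and dissertations in the sample
UNIQUE_WORK_RATE = 0.75  # share of theses judged to be unique works (as quoted)

share_of_sample = round(100 * THESES / SAMPLE_SIZE, 1)
approx_unique_theses = round(THESES * UNIQUE_WORK_RATE)

print(f"Theses and dissertations: {share_of_sample}% of the sample")
print(f"Approximate unique works among them: {approx_unique_theses}")
```

So 49 records is 19.6% of the sample, in line with the roughly 20% share reported for theses and dissertations, and 75% of 49 corresponds to about 37 titles.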
Language of Publication
[Chart: language of publication, N = 250 records]
- Non-English publications account for <40% of uniquely held books in sample vs. ~75% of uniquely held books in Vanderbilt study
Place of Publication
[Chart: uniquely held print books by place of publication: US imprint 32%, non-US imprint 68%]
[Comparison chart: print books with multiple holdings; data labels as extracted: 63%, 37% (US, Non-US)]
- A majority of uniquely held print books were published outside the United States
- 5% more than print books with multiple holdings
Subject Access
[Chart: records with and without subject cataloging, unique works vs. multiple holdings; data labels: 19% (unique works), 9% (multiple holdings)]
- ~20% of unique print books lack subject cataloging
- NB: unique works do not benefit from FRBR-enhanced discoverability; no related manifestations
Sample Holdings by Institution Type
[Chart: sample holdings by institution type, N = 250 records; callouts: 54% of sample, 23% of sample]
- Academic and research libraries hold the greatest share of unique print books
- Non-ARL academic libraries have the greatest number of aggregate holdings in WorldCat, but are less likely than ARL institutions to hold unique titles
Age Distribution of Unique Titles
[Chart: percentage of titles (records) in sample by date of publication, N = 250 records]
- >70% of titles in sample were produced after 1950
- Relative proportion of unique works increases in post-WWII period
  - increased print production?
  - rise of scientific and technical enterprise?
  - increased library collecting activity?
Characterizing Unique Works
- Foreign, but accessible
- Limited discoverability
- Challenging inventory control
In Sum . . .
Uniquely-held print books containing unique intellectual content are typically:
- Non-US imprints
- English language titles
- Produced after 1950
- Technical, non-fiction content
- Sparsely described
- Short (~100 pages in length)
- Held by academic and research libraries
Preservation and circulation status
- Surveyed 27 RLG partners regarding shelf status, condition and circulation history of selected titles from ‘only copy’ sample
- Responses (to date) from:
  - Columbia University
  - Harvard University
  - Indiana University
  - New York Public Library
  - University of Alberta
  - University of Arizona
  - University of California, Los Angeles
  - University of Chicago
  - University of Minnesota, Twin Cities
  - University of Pennsylvania
  - University of Texas, Austin
- Subset representative of larger sample: ~70% unique works / expressions
Survey Results (to date)
Inventory control and item condition
- 100% of requested titles were available for examination
- Multiple copies held for 3 titles in sample, all theses
- None had significant condition problems
Location and status
- 50% housed in off-site shelving facility
  - Mostly transferred in the 1990s
- 50% non-circulating (local or off-site)
  - Some availability via SHARES
Use (value, discoverability?)
- None requested or circulated in past 5 years
- Limited usage data for non-circulating collections
Implications
Preservation
- ~50% of uniquely held works are potentially at risk in on-site, circulating collections
- Limited discoverability and low use of these titles diminishes relative risk
- Recent publications less likely to have inherent condition problems
Access
- Preponderance of recent publications, and non-North American imprints, is likely to limit potential impact of mass digitization
- Inter-institutional access and borrowing programs (e.g. SHARES) will test the limits of cooperative collection management
- Effective disclosure (holdings, condition, policies) may require additional investment
Opportunities for Joint Action
- Cooperative access agreements
  Increase the mobility of scarcely-held content; empower resource-sharing networks to lend and borrow unique holdings
- Distributed print archiving
  Leverage existing on- and off-site storage infrastructure as network resource
- Shared digitization infrastructure
  Reposition off-site repositories as digital delivery hubs
- Continue to build new uniqueness into system-wide holdings…strategically
  Local collection development priorities will be trumped by economic realities; plan accordingly.
Short, foreign …and competing for attention
Questions, Comments?
OCLC Programs & Research Agenda
Managing the Collective Collection
Constance Malpas
malpasc@oclc.org