The Digital Public Library of America: An Overview and Working with the National Collections


The Digital Public Library of America: An Overview and Working with the National Collections. Martin R. Kalfatovic. NAGARA/CoSA Joint Conference. Santa Fe, New Mexico. 21 June 2012

Published in: Technology, Education


  1. 1. The Digital Public Library of America: An Overview and Working with the National Collections. Martin R. Kalfatovic | Smithsonian Libraries. NAGARA/CoSA Joint Conference, 21 June 2012 | Santa Fe, New Mexico
  2. 2. El Dorado County Public Library, South Lake Tahoe, CA. Travelin Librarian, Flickr:
  3. 3. Ye Olde Smithsonian Website home page, circa 20th century
  4. 4. Are you feeling lucky?
  5. 5. Libraries | Archives | Museums. Denver Art Museum & Denver Public Library
  6. 6. The Digital Public Library of America (DPLA) will make the cultural and scientific heritage of humanity available, free of charge, to all. The DPLA’s primary focus is on making available materials from the United States. By adhering to the fundamental principle of free and universal access to knowledge, it will promote education in the broadest sense of the term. That is, it will function as an online library for students of all ages, from grades K-12 to postdoctoral researchers and anyone seeking self-instruction; it will be a deep resource for community colleges, vocational schools, colleges, universities, and adult education programs; it will supplement the services of public libraries in every corner of the country; and it will satisfy other needs as well: the need for data related to employment, for practical information of all kinds, and for enrichment in the use of leisure. Concept Note (March 2012). Denver Art Museum & Denver Public Library
  7. 7. “These libraries have improved the general conversation of the Americans, made the common tradesmen and farmers as intelligent as most gentlemen from other countries, and perhaps have contributed in some degree to the stand so generally made throughout the colonies in defense of their privileges.” Benjamin Franklin, Autobiography
  8. 8. DPLA is obviously... Only about books because it has “library” in its name!
  9. 9. DPLA is obviously... Only for elite research purposes or actually only for the general public
  10. 10. DPLA is obviously... Only a technology project
  11. 11. DPLA is obviously... Only just another digital library (whatever that is)
  12. 12. DPLA is obviously... Hard to describe ...
  13. 13. The DPLA planning initiative grew out of an October 2010 meeting at the Radcliffe Institute for Advanced Study which brought together 40 representatives from foundations, research institutions, cultural organizations, government, and libraries to discuss best approaches to building a national digital library. Towards a Digital Public Library of America: “open, distributed network of comprehensive online resources that would draw on the nation’s living heritage from libraries, universities, archives, and museums in order to educate, inform, and empower everyone in the current and future generations.”
  14. 14. DPLA: Code | Metadata | Content | Tools & Services | Community | Governance. Towards a Digital Public Library of America: > Formally launched in October 2011 at the DPLA Plenary at the National Archives in Washington; > $5 million in funding from the Sloan and Arcadia Foundations; > Ambitious two-year goal with a launch of the DPLA in October 2013.
  15. 15. National Archives | Smithsonian Institution | Library of Congress Modeling a Digital Collaboration for America’s National Collections
  16. 16. America’s National Collections
      National Archives: The National Archives and Records Administration is the nation’s record keeper, safeguarding and preserving the records of the United States Government and ensuring that the American people can discover, use, and learn from this documentary heritage. In addition to the Archives facilities in the Washington, DC area, there are 14 regional Archives facilities and 13 Presidential Libraries around the country. We have over 10 billion records in our holdings. Our holdings occupy over 4 million cubic feet of space and 100 terabytes of electronic storage.
      Smithsonian Institution: The Smithsonian Institution, the world’s largest museum and research complex, includes 19 museums and galleries and the National Zoological Park. The total number of artifacts, works of art and specimens in the Smithsonian’s collections is estimated at 137 million. The bulk of this material, more than 126 million specimens and artifacts, is part of the National Museum of Natural History. In addition, the Smithsonian Libraries maintains 1.9 million library volumes, including rare books; and 89,000 cubic feet of material are held in archives.
      Library of Congress: Today’s Library of Congress is an unparalleled world resource. The collection of more than 144 million items includes more than 33 million cataloged books and other print materials in 460 languages; more than 63 million manuscripts; the largest rare book collection in North America; and the world’s largest collection of legal materials, films, maps, sheet music and sound recordings. By providing these materials online, those who may never come to Washington can gain access to the treasures of the nation’s library.
  17. 17. The importance of participating in the DPLA, and especially of showing the ability of three of the nation’s public institutions to collaborate in making their collections accessible, drove the project. For the Beta Sprint, we selected a small set of records that would show some of the breadth of our collections. Abraham Lincoln’s Hat, Smithsonian Institution
  18. 18. Each of our individual collections uniquely contributes to America’s National Collections. In the short time of the Beta Sprint, it was not possible to build a fully working implementation of an interface to the disparate collections of the Smithsonian Institution, the Library of Congress, and the National Archives. Proof of Union Service, National Archives
  19. 19. For the purpose of the DPLA Beta Sprint, staff from the Smithsonian Institution, the Library of Congress, and the National Archives modeled a faceted search aggregator using the Smithsonian’s Enterprise Digital Asset Network (EDAN) as a starting point. Yankee volunteers marching into Dixie, Library of Congress
  20. 20. As part of this proof of concept, a selection of records, with associated digital assets, was drawn from the collections of the Library of Congress and the National Archives. Only a small set from the Library of Congress and National Archives was selected to test data mapping. The eleven records from NARA represent a sampling of documents from the Online Public Access (OPA) system, including 19th-century photographs, patents, drawings, and correspondence. The ten records from the Library of Congress were drawn from the Performing Arts Encyclopedia, including music manuscripts of composer Johannes Brahms and several pieces of 19th-century sheet music. Example searches from LC and NARA websites
  21. 21. These joined the 7.44+ million records with 570,000+ images, video and sound files, electronic journals and other resources from the Smithsonian’s libraries, archives & museums in a test site for the Beta Sprint.
      LIBRARIES: Smithsonian Libraries collections. Example search retrieving digital books and other library collections from the Smithsonian’s Collections Search Center.
      ARCHIVES: Smithsonian archival collections. Photographs, papers, online finding aids and other archival collections are available through the Smithsonian’s Collections Search Center.
      MUSEUMS: Smithsonian museum collections. Scientific specimens as well as artworks and historical artifacts are findable through the Smithsonian’s Collections Search Center.
  22. 22. DPLA Conceptual High-Level Architecture for a Common Aggregated Search Index and Service Layer, Including an Image Delivery Service. Diagram layers, top to bottom: The World Wide Web; Web & Mobile Applications / User Interface(s); Metadata Delivery Service, Tag Service, and Image Delivery Service (IDS); Search Index / Metadata Repository; Data Transformation; LoC, NARA, and SI Library, Archive and Museum Systems (LAMS) Data Sets
  23. 23. DPLA Technical High-Level Architecture for a Common Aggregated Search Index and Service Layer, Including an Image Delivery Service. Diagram components: End Users; Web or Mobile Application / User Front-End (cloud-hosted; MDS, IDS handler); Metadata Delivery Service (MDS) and Image Delivery Service (IDS); Firewall; MDS Back-end + IDS Back-end; Master Index: request handlers, response handlers, update handlers; Master Lucene Index / Metadata Repository; Image Store(s); Pre-Processing Raw Index: jetty (admin for Solr), update handlers; Ingest / pre-processing repository; XML Data Sources and Data Ingests: SI, NARA, LoC ...
  24. 24. Search Results: Grid View
  25. 25. National Archives Item: Detail record
  26. 26. Library of Congress Item: Detail record
  27. 27. Smithsonian Item: Detail record
  28. 28. Acknowledgements: Morgan Cundiff (LC) | David Ferriero (NARA) | Nancy Gwinn (SI) | Matthew Jenkins (SI) | Martin Kalfatovic (SI) | Deanna Marcum (LC) | Ruth Scovill (LC) | Nate Trail (LC) | Günter Waibel (SI) | Ching-Hsien Wang (SI) | Pamela Wright (NARA)
  29. 29. Next steps. Governance: Create a DPLA Board. Content: Convene content providers. DPLA Hack-a-thon (April 2012). Technical: Tech Dev Team and Beta Sprint, pt. 2. Community: engagement of more audiences.
  30. 30. Where to learn more about DPLA
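
Illustrative sketch (not from the slides): slides 19 to 23 describe mapping records from three institutions into a common search index and serving them through a faceted search. The Python below is a minimal in-memory sketch of that idea only. It does not use Solr, Lucene, or EDAN, and every field name, mapping, and media path in it is a hypothetical stand-in chosen for illustration, not the Beta Sprint's actual schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common schema for the aggregated index."""
    source: str      # contributing institution: "SI", "LoC", or "NARA"
    title: str
    unit_type: str   # "library", "archive", or "museum"
    media_url: str   # placeholder for what an image delivery service would resolve


# Hypothetical per-source mappings: native field name -> common field name.
FIELD_MAPS = {
    "SI": {"title": "title", "unit": "unit_type", "media": "media_url"},
    "LoC": {"mainTitle": "title", "collectionType": "unit_type", "resource": "media_url"},
    "NARA": {"Title": "title", "RecordType": "unit_type", "DigitalObject": "media_url"},
}


def transform(source: str, raw: dict) -> CommonRecord:
    """Data transformation step: rename native fields to the common schema."""
    mapping = FIELD_MAPS[source]
    fields = {common: raw.get(native, "") for native, common in mapping.items()}
    return CommonRecord(source=source, **fields)


def search(index, query="", **filters):
    """Return records whose title contains `query` and that match every facet filter."""
    q = query.lower()
    return [
        r for r in index
        if q in r.title.lower()
        and all(getattr(r, name) == value for name, value in filters.items())
    ]


def facet_counts(records, facet):
    """Count records per value of one facet field, e.g. 'source'."""
    return Counter(getattr(r, facet) for r in records)


# Sample raw records; titles come from the slides, everything else is made up.
RAW = [
    ("SI", {"title": "Abraham Lincoln's Hat", "unit": "museum", "media": "si/hat.jpg"}),
    ("NARA", {"Title": "Proof of Union Service", "RecordType": "archive",
              "DigitalObject": "nara/proof.jpg"}),
    ("LoC", {"mainTitle": "Yankee volunteers marching into Dixie",
             "collectionType": "library", "resource": "loc/yankee.jpg"}),
]
INDEX = [transform(src, raw) for src, raw in RAW]

if __name__ == "__main__":
    print(facet_counts(INDEX, "source"))
    for record in search(INDEX, "union"):
        print(record.source, "-", record.title)
```

Keeping the per-source mappings in a table separate from the search code mirrors the diagrams' split between the data transformation layer and the search index: adding a fourth contributor would mean adding one mapping entry rather than changing the search logic.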