Exploring the Use of Linked Data to Bridge State and Federal Archives



Published in: Education, Technology


  1. Exploring the Use of Linked Data to Bridge State and Federal Archives. Jon Voss, LookBackMaps. MARA Guest Lecture, San Jose State University, June 15, 2010
  2. Overview
     - Quick intro, logistics
     - Evolution and context of the Civil War Data 150 Project: A Very Exciting Time
     - Overview of Civil War Data 150 Project
     - Halftime Q&A
     - Some Technical Details on the Methodology, Tools
     - Placing CWD150 in the Big Picture
     - Dig Deeper Links
     - Final Q&A
     Image: http://www.loc.gov/pictures/item/cwp2003000505/PP
  3. Give me feedback...
     - email: [email_address]
     - www.twitter.com/LookBackMaps
     - Comments welcome, just use @LookBackMaps on Twitter or email me.
  4. www.lookbackmaps.net
     - From the perspective of presenting data, not organizing it; coming from the Web community
     - Started in 2008 as a Google MyMaps mashup
     - Based on the simple idea of creating community around local history
     - Created to solve the problem of disparate archives with no geotags, through community and crowdsourcing
     - Finding ways to access, display, and improve upon data
  5. Screenshots from the LookBackMaps iPhone app; overlay photos from The Bancroft Library
  6. Screenshots from the LookBackMaps iPhone app; overlay photos from California Historical Society
  7. 2008 marks a major shift for public archives
     - The Library of Congress and Flickr collaboration spurs the Flickr Commons, and blows open the Web 2.0 door at archives and institutions worldwide.
  8. 2008 marks a major shift for public archives
     - The Library of Congress and Flickr collaboration spurs the Flickr Commons, and blows open the Web 2.0 door at archives and institutions worldwide.
     - Multiple open source collections management and web publishing platforms begin to take hold and lower the barrier to entry for Web 2.0 presentation, collaboration, plugins and extensions.
  9. LOC/Flickr Commons
     Some stats from the LOC summary report a year after launch speak to the success. As of 10/23/08:
     - 10.4 million views of LOC photos on Flickr
     - 79% of the 4,615 photos have been made a "favorite"
     - 67,176 tags were added by 2,518 unique Flickr accounts
     - Fewer than 25 instances of user-generated comments were removed as inappropriate
     - More than 500 records have been enhanced with new information provided by the Flickr community
  10. Public Archives in the Web 2.0 Environment
      While the majority of archives and libraries remain in a Web 1.0 environment, users have Web 2.0 expectations.
      Institutions and users are meeting in the middle to build community around holdings.
      - Search/Share: Archives want to get their holdings out to a wide-reaching public; users want to search across institutions to discover based on interest, locality, etc.
      - Comment/Community: The ability to discuss and engage, create community
      - Contribute/Improve: Tag, geotag, crowdsource
      - Compare: Then and now. Community identity is often tied to history.
  11. Stage is set for collaboration and innovation
      - Mashups, collaborations, shared datasets, open source, open data, and open tools
      Image: Bing Maps Streetside Photos (tech preview) http://www.bing.com/maps/explore/#/9gk357c6yqx3jost
  12. The more shared data we have, the more we can do with it!
      By the end of 2009, a group of archivists and technologists starts exploring collaborative efforts utilizing Linked Data to connect isolated archives and datasets in order to:
      - join data in a robust, scalable, community-maintained database
      - increase discovery of and traffic to the archives while adding value to the data through crowdsourcing
      - make the data searchable and available to other web applications via API and semantic web queries
  13. Archives Metadata Mapping Project
      Two important outcomes:
      1. The potential of using Linked Data now by using Freebase as a Linked Data publishing platform.
      2. The importance of use cases.
  14. There, I said it. Linked Data.
      Providing ways to start linking to DATA, no longer just DOCUMENTS. It entails using tools and standards to make information (like metadata, MARC records, etc.) searchable and machine readable.
      Image: Harry Halpin. http://www.ibiblio.org/hhalpin/homepage/presentations/socialnet/
  15. The Civil War Data 150 Project
      Born out of conversations with an AMMP participant, the Archives of Michigan.
      Key ingredients for a strong use case:
      - Specific subject matter
      - Diverse data in a wide array of institutions
      - A passionate user group
      - A significant anniversary
  16. Three Primary Goals of CWD150:
      - Identify sources and map metadata into Freebase.
      - Create web apps to enable users to add to or modify shared metadata with strong identifiers.
      - Engage the public in the process of interacting with and adding value to the data.
      Image: http://www.flickr.com/photos/usnationalarchives/4166330219/
  17. Pause for Q&A
      Image: http://www.flickr.com/photos/usnationalarchives/3996142724/
  18. Some Technical Details on the Methodology, Tools
      - You can follow along and contribute to the project on the Freebase Wiki: http://wiki.freebase.com/wiki/CWD150
  19. Some Technical Details on the Methodology, Tools
      1. Identifying primary data sets and ways of getting at the data
         - Link to Google Spreadsheet on sources. Web crawling, screen scraping, XML dumps, CSV files, etc.
      2. Creating Web Apps
         - Once we have metadata mapped in Freebase, we can create RABJ queues. See a simple example: Genderizer.
         - Then apply this to data that needs work, like regiments, or a photo queue.
         - Work with Civil War historians and others to add to specific schema.
  20. Some Technical Details on the Methodology, Tools
      3. Engaging the Public, User Interface Development
         - Messaging and powerful images
         - An easy interface with game elements and rewards
         - A plea for assistance and an opportunity to genuinely make records more useful
         - Holy Grail: Civil War Soldier Survival App based on city of enlistment
  21. The Big Picture
      Image: http://www.flickr.com/photos/37377809@N00/4701512132/
  22. The Big Picture
      - CWD150 is a strong use case and an example of what can become possible in the wider web and developer community if libraries, archives and museums publish their metadata utilizing Linked Data standards and open licenses.
      - Our experience is showing us that the technological barriers are not as significant as the institutional barriers around adoption and openness. But the Flickr Commons Shift has changed that.
      - With CWD150, we are side-stepping the Big Next Step of enabling institutions to publish their own metadata as Linked Data, and make meaningful connections. This is on the near horizon.
  23. The Big Picture
      Libraries, Archives and Museums will be critical to the adoption of Linked Data
      - The vast information stored in disparate, isolated databases held by the world's public institutions.
      - The expertise held by these institutions in the organization of systems and vocabularies to make sense of this information.
      - You can be on the front lines of this movement.
      Image: http://www.loc.gov/pictures/item/cwp2003000216/PP
  24. Dig Deeper!
      Libraries
      - OCLC Research Linked Data parts 1 and 2 webinar
      - EMTACL10 April 2010. Gillian Byrne & Lisa Goddard: video | slides
      - JISC Linked Data Horizon Scan
      - Ed Summers is doing Linked Data work with LOC: Twitter | Blog
      Archives
      - Mark Matienzo, Linking as Repurposing Metadata
      - Tim Wragge's Flickr Machine Tag Challenge
      Tools
      - Build your own NYT Linked Data Application
      - Build apps on Freebase
      - Clean vast amounts of data with Gridworks
      Tim Berners-Lee
      - TED Feb 2009
      - TED Feb 2010
      - Gov 2.0 Expo May 2010
  25. What will you do with that data? Q&A
      Image: http://www.flickr.com/photos/library_of_congress/3252917783/
  26. Give me feedback...
      - email: [email_address]
      - www.twitter.com/LookBackMaps
      - Comments welcome, just use @ or #LookBackMaps on Twitter or email me.
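
Slides 14 and 19 describe turning metadata into linkable, machine-readable data and pulling it from sources such as CSV files. The following is a minimal sketch of that idea, not the project's actual pipeline: it uses only the Python standard library, not Freebase or any RDF toolkit. The example.org namespace, the column names, the sample rows, and the helper names (regiment_uri, csv_to_triples, describe) are all hypothetical and exist only for illustration. It shows how two archives that describe the same regiment can be joined once both records point at one shared identifier.

```python
import csv
import io
import re

# Hypothetical namespace for shared identifiers (an assumption, not a real service).
BASE = "http://example.org/id/regiment/"


def regiment_uri(name):
    """Build one stable identifier from a free-text regiment name."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return BASE + slug


def csv_to_triples(text, name_column):
    """Turn each CSV row into (subject, predicate, object) statements.

    The subject is the shared identifier; every other non-empty column
    becomes a predicate/object pair about that subject.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = regiment_uri(row[name_column])
        for column, value in row.items():
            if column != name_column and value:
                triples.append((subject, column, value))
    return triples


def describe(triples, subject):
    """Collect everything any source says about one subject."""
    result = {}
    for s, p, o in triples:
        if s == subject:
            result.setdefault(p, []).append(o)
    return result


# Invented sample exports from a state archive and a federal archive.
STATE_CSV = "regiment,muster_city\n1st Example Infantry,Exampleville\n"
FEDERAL_CSV = "unit_name,photo\n1st  Example  Infantry,http://example.org/photo/1\n"

merged = csv_to_triples(STATE_CSV, "regiment") + csv_to_triples(FEDERAL_CSV, "unit_name")
record = describe(merged, regiment_uri("1st Example Infantry"))
print(record)
```

The design choice worth noticing is that the join happens on the identifier, not on the raw name string, so small spelling or spacing differences between institutions do not keep their records apart. A real project would likely need human review (for example, through a crowdsourcing queue) for names that a simple normalization like this cannot reconcile.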