SNAC webinar v3

Slides about the SNAC project for OAC webinar.

http://www.cdlib.org/services/dsc/webinars/snac/

Transcript

  • 1. http://socialarchive.iath.virginia.edu/
  • 2. Discussion Points
    1. Background
    2. Overview of prototype biographical resource and access system
    3. Future directions and open discussion
  • 3. Context
    • Research and demonstration project
    • Sponsored by NEH
    • Grant term: March 2010 – March 2012
    • Three partner organizations:
      – Institute for Advanced Technology in the Humanities, University of Virginia
      – School of Information, UC Berkeley
      – California Digital Library
  • 4. Goals
    • Develop tools for extracting EAC-CPF records, drawing on existing data (EAD finding aids/collection guides)
    • Build a large test corpus of EAC-CPF records
    • Create a prototype biographical resource and access system, using those records
  • 5. What is EAC-CPF?
    • Encoded Archival Context – Corporate Bodies, Persons, and Families
    • Standard for encoding archival authority records:
      – Authorized name headings for the entity
      – Biographical/historical context for the entity
      – Links to resources created by the entity, and about the entity
        • Collections (represented by EAD finding aids)
        • Bibliographic resources, etc.
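The three parts of an authority record listed on slide 5 (name heading, biographical context, links to resources) can be sketched as a small XML skeleton. This is a simplified illustration, not a schema-valid record and not the project's own extraction tool: the `build_eac_cpf` helper, the sample person, and the example.org URL are all hypothetical, and a real record would also need a `control` section.

```python
import xml.etree.ElementTree as ET

EAC_NS = "urn:isbn:1-931666-33-4"          # EAC-CPF namespace
XLINK_NS = "http://www.w3.org/1999/xlink"  # used for the link to the finding aid


def build_eac_cpf(name, entity_type, bio_text, finding_aid_url, finding_aid_title):
    """Build a simplified EAC-CPF-style record (hypothetical helper, not schema-valid)."""
    ET.register_namespace("", EAC_NS)
    ET.register_namespace("xlink", XLINK_NS)

    def q(tag):
        # Qualify a tag name with the EAC-CPF namespace.
        return f"{{{EAC_NS}}}{tag}"

    root = ET.Element(q("eac-cpf"))
    desc = ET.SubElement(root, q("cpfDescription"))

    # 1. Authorized name heading for the entity
    identity = ET.SubElement(desc, q("identity"))
    ET.SubElement(identity, q("entityType")).text = entity_type
    name_entry = ET.SubElement(identity, q("nameEntry"))
    ET.SubElement(name_entry, q("part")).text = name

    # 2. Biographical/historical context
    description = ET.SubElement(desc, q("description"))
    bio = ET.SubElement(description, q("biogHist"))
    ET.SubElement(bio, q("p")).text = bio_text

    # 3. Link to a resource created by the entity (an EAD finding aid)
    relations = ET.SubElement(desc, q("relations"))
    rel = ET.SubElement(relations, q("resourceRelation"), {
        "resourceRelationType": "creatorOf",
        f"{{{XLINK_NS}}}href": finding_aid_url,
    })
    ET.SubElement(rel, q("relationEntry")).text = finding_aid_title
    return root


# Made-up sample data for illustration only.
record = build_eac_cpf(
    name="Example, Jane, 1900-1980",
    entity_type="person",
    bio_text="Jane Example was a hypothetical diplomat.",
    finding_aid_url="http://example.org/findaid/1",
    finding_aid_title="Jane Example papers",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Printing `xml_text` shows the three sections nested under `cpfDescription`.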
  • 6. EAD and EAC-CPF
  • 7. EAD and EAC-CPF
  • 8. EAD and EAC-CPF
  • 9. EAD and EAC-CPF
  • 10. A Vision for Integrated Access
  • 11. A Vision for Integrated Access Freebase
  • 12. Data Inputs
    • EAD Finding Aids
      – Online Archive of California [~14,000]
      – Northwest Digital Archives (NWDA) [~5,200]
      – Library of Congress [~900]
      – Virginia Heritage [~8,300]
    • Authority Records
      – Library of Congress: NACO/LCNAF [~4+ million]
      – Getty Vocabulary Program: Union List of Artist Names (ULAN) [~290,000]
      – OCLC Research: Virtual International Authority File (VIAF) [intersection with NACO/LCNAF]
  • 13. Data Flow [diagram labels: VIAF, ULAN]
  • 14. Data Flow
    • Extract names from EAD finding aids
      – Creator names (<...name>) with biographical/organizational histories (<bioghist>)
      – Names as subjects (<controlaccess>)
      – Names in correspondence series
    • Normalize and convert into EAC-CPF; retain link back to EAD(s)
    • Match EAC-CPF records against one another and against existing authority records (ULAN, VIAF, LCNAF)
      – Enhance EAC-CPF by normalizing entries, adding alternative entries, titles, languages used, and sex (VIAF), and historical data (ULAN)
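The extract, normalize, and match steps on slide 14 might look roughly like the sketch below. It is not the project's pipeline. It assumes a DTD-style EAD document without an XML namespace, uses a made-up sample finding aid and a made-up one-entry authority list, and matches on a deliberately crude normalized key (case, punctuation, accents, and life dates removed). The helper names `extract_names`, `match_key`, and `match_against_authority` are hypothetical.

```python
import re
import unicodedata
import xml.etree.ElementTree as ET

# Made-up finding aid; real EAD files are far larger and may use a namespace.
SAMPLE_EAD = """<ead>
  <archdesc>
    <did>
      <origination><persname>Example, Jane, 1900-1980</persname></origination>
    </did>
    <bioghist><p>Jane Example was a hypothetical diplomat.</p></bioghist>
    <controlaccess>
      <persname>Sample, John</persname>
      <corpname>Example Trading Company</corpname>
    </controlaccess>
  </archdesc>
</ead>"""

NAME_TAGS = ("persname", "corpname", "famname")


def extract_names(ead_xml):
    """Return (creators, subjects, bioghist text) from one EAD document."""
    root = ET.fromstring(ead_xml)
    creators, subjects = [], []
    for tag in NAME_TAGS:
        for el in root.iterfind(f".//origination/{tag}"):
            creators.append((tag, "".join(el.itertext()).strip()))
        for el in root.iterfind(f".//controlaccess/{tag}"):
            subjects.append((tag, "".join(el.itertext()).strip()))
    bio = " ".join(
        "".join(p.itertext()).strip() for p in root.iterfind(".//bioghist/p")
    )
    return creators, subjects, bio


def match_key(name):
    """Crude normalization: strip accents, life dates, punctuation, and case."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"\b\d{4}(?:-\d{4})?\b", " ", text)  # drop life dates
    text = re.sub(r"[^\w\s]", " ", text.lower())       # drop punctuation
    return " ".join(text.split())


def match_against_authority(names, authority_headings):
    """Pair each extracted name with an authority heading sharing its key, if any."""
    index = {match_key(h): h for h in authority_headings}
    return {name: index.get(match_key(name)) for _, name in names}


creators, subjects, bio = extract_names(SAMPLE_EAD)
matches = match_against_authority(creators + subjects, ["Example, Jane"])
```

Exact-key matching like this will miss variant spellings and can conflate different people with the same name, which is one reason matched records still need the kind of QA work described for the target users.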
  • 15. Meet the target users
  • 20. Meet the target users
    • Randy: Graduate student working on a PhD that involves biographies and the study of diplomatic families and networks. Sometimes he comes to the site looking for information on specific people; other times he is looking for information on a specific subject or event. He also TAs an undergraduate history class and sometimes has to help students find topics for papers.
    • Connie: Works at an institution that contributed records to the project. Will be asking how this site would be useful to the institution's users. Wants to understand how the institution's records were used and what the added value is.
    • Quincy: Library school student working to QA record matching.
    • Adele: Person doing authority work during collection processing.
    • Lenny: Likes linked data, and wants to be able to mine the links that have been established programmatically.
  • 21. EAC’s Implicit Information Architecture
  • 24. EAC’s Implicit Information Architecture
    • Expose schema’s terminology in user interface
    • Metadata fields / used mostly for facets
    • XTF section types / based on hierarchy of EAC
  • 25. XTF XSLT Framework
    • pre filter - do special tokenization to create custom EAC facets
    • query parser - CGI params to XTF query XML
    • result formatter - XTF results to HTML
    • doc formatter - EAC-CPF to HTML
    • http://code.google.com/p/xtf-cpf/
  • 26. TinkerPop Graph Stack
    • http://www.tinkerpop.com/
  • 27. Social graph visualization
    • Code at https://code.google.com/p/eac-graph-load/
    • Simple JSON access to the TinkerPop graph on the back end, with JavaScript on the front end in the live prototype [graph demo link in prototype]
    • GraphML file with open license should be viewable in other tools
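To illustrate how a GraphML file could feed a JSON-based front end like the one mentioned on slide 27, here is a minimal sketch that reads GraphML with the standard library and emits a nodes/links JSON document. It is not the eac-graph-load code and does not use TinkerPop; the sample graph, the `name` property, and the nodes/links output shape are assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Made-up two-person graph for illustration only.
SAMPLE_GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="name" for="node" attr.name="name" attr.type="string"/>
  <graph edgedefault="undirected">
    <node id="n1"><data key="name">Example, Jane</data></node>
    <node id="n2"><data key="name">Sample, John</data></node>
    <edge source="n1" target="n2"/>
  </graph>
</graphml>"""


def graphml_to_json(graphml_text):
    """Convert GraphML nodes and edges into a JSON string of nodes and links."""
    root = ET.fromstring(graphml_text)
    nodes = []
    for node in root.iterfind(".//g:node", GRAPHML_NS):
        # Collect each <data key="..."> child as a property of the node.
        props = {d.get("key"): d.text for d in node.iterfind("g:data", GRAPHML_NS)}
        nodes.append({**props, "id": node.get("id")})
    links = [
        {"source": e.get("source"), "target": e.get("target")}
        for e in root.iterfind(".//g:edge", GRAPHML_NS)
    ]
    return json.dumps({"nodes": nodes, "links": links}, sort_keys=True)


graph_json = graphml_to_json(SAMPLE_GRAPHML)
```

A nodes/links layout is a common input shape for JavaScript graph visualizations, which is why it was chosen here; the prototype's actual JSON format may differ.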
  • 28. Linked Data / Open Data
    • RDFa owl:sameAs links to VIAF
    • httpRange-14 (XTF URL + “#entity” for the car)
    • HTML5 microdata chronology
    • Future:
      – RDF dump with an Open Data License, based on Ed Summers's GraphML to RDF Python script
      – Links to Wikipedia and other sources
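The "#entity" convention on slide 28 separates the URL of the web page describing a person from the URI that names the person, and owl:sameAs then ties that URI to the matching VIAF identifier. A minimal sketch of producing such a statement as an N-Triples line follows. The record URL and the VIAF number are placeholders, and the helper names `entity_uri` and `same_as_triple` are hypothetical; the prototype itself embeds these links as RDFa rather than emitting N-Triples.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
VIAF_BASE = "http://viaf.org/viaf/"


def entity_uri(record_url):
    """URI for the entity itself: the record page URL plus a '#entity' fragment."""
    return record_url.split("#", 1)[0] + "#entity"


def same_as_triple(record_url, viaf_id):
    """Return one N-Triples line asserting the entity is the same as a VIAF entry."""
    return f"<{entity_uri(record_url)}> <{OWL_SAME_AS}> <{VIAF_BASE}{viaf_id}> ."


# Placeholder record URL and VIAF number, for illustration only.
triple = same_as_triple("http://example.org/xtf/view?docId=example.xml", "000000")
```

Keeping the fragment on the entity URI means a plain HTTP GET still returns the descriptive page, while the URI with "#entity" can be used in data to refer to the person or organization.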
  • 29. Demo
    • http://socialarchive.iath.virginia.edu/xtf/search
  • 30. Future Directions?
    • From research and demonstration to longer-term resource?
    • Integration of merged data back into EAD access systems?
    • Distributed cooperative archival authority control that is crowd-sourced by researchers and curated by archivists?
    • Scale up EAD data sources?
    • More links to external resources (Wikipedia, WorldCat Identities, OpenURLs)?
    • Social network visualizations/interactive navigation?
    • Unique identifiers for EAC-CPF records (ORCID, ISNI, ARK)?
    • Standardized name entries for source repositories contributing EAC-CPF records?
  • 31. Questions?
    • http://socialarchive.iath.virginia.edu/
  • 32. Creative Commons Credit
    • http://www.flickr.com/photos/danja/2949957005/
    • Photo by Danny Ayers