What’s in a Name,
Fernando Pessoa?
Adding URIs to Archival Description
Potential Connections
• Beinecke Library
• Pessoa, Fernando, 1888-1935
• http://id.loc.gov/authorities/names/n50016857
• National Library of Portugal
• Pessoa, Fernando, 1888-1935
• (PTBNP)10380
Connections without Communicating
• Beinecke Library
• Pessoa, Fernando, 1888-1935
• http://id.loc.gov/authorities/names/n50016857
• National Library of Portugal
• Pessoa, Fernando, 1888-1935
• (PTBNP)10380
https://www.wikidata.org/wiki/Q173481
Pessoa, Fernando, 1888-1935
http://id.loc.gov/authorities/names/n50016857
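The idea of connecting without communicating can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary below is a hand-built stand-in for what a Wikidata item records (no network call is made), and the key names `lc_naf` and `ptbnp` are labels invented for this sketch. The item ID and the two authority IDs are the ones shown on the slides.

```python
# Hypothetical, hand-built stand-in for the identifiers a Wikidata item records.
# "lc_naf" and "ptbnp" are labels invented for this sketch, not Wikidata names.
WIKIDATA_ITEMS = {
    "Q173481": {"lc_naf": "n50016857", "ptbnp": "10380"},
}


def find_item(scheme, identifier, items=WIKIDATA_ITEMS):
    """Return the item ID that records this identifier, or None."""
    for item_id, external_ids in items.items():
        if external_ids.get(scheme) == identifier:
            return item_id
    return None


def same_entity(id_a, id_b, items=WIKIDATA_ITEMS):
    """True when both (scheme, identifier) pairs resolve to the same item."""
    item_a = find_item(id_a[0], id_a[1], items)
    item_b = find_item(id_b[0], id_b[1], items)
    return item_a is not None and item_a == item_b


beinecke = ("lc_naf", "n50016857")  # ID used by the Beinecke Library
lisbon = ("ptbnp", "10380")         # ID used by the National Library of Portugal
print(same_entity(beinecke, lisbon))
```

Neither institution needs to know about the other's identifier; the shared hub is what makes the match possible at the time of need.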
Project Requirements
Do not assume that any step will be problem free!
1. Every Resource record is linked to its MARC record
2. Every subfield 0 match is accurate
3. Verify that each match can be downloaded / imported
4. For each record pair, ensure that the headings match
Do not assume that any step will be problem free!
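Requirement 4 is essentially a set comparison per record pair. The sketch below is one possible way to express it in Python, not the project's actual tooling. It assumes each system's headings have already been exported into a dictionary keyed by a shared record identifier; the function name, the bib IDs, and the sample headings are all hypothetical.

```python
def heading_mismatches(ils_headings, aspace_headings):
    """Compare headings for each record pair.

    Both arguments map a shared record key (hypothetically a bib ID)
    to a list of heading strings. Returns only the records that differ,
    as key -> (missing from ArchivesSpace, missing from the ILS).
    """
    problems = {}
    for bib_id in sorted(set(ils_headings) | set(aspace_headings)):
        ils = set(ils_headings.get(bib_id, []))
        aspace = set(aspace_headings.get(bib_id, []))
        if ils != aspace:
            problems[bib_id] = (sorted(ils - aspace), sorted(aspace - ils))
    return problems


# Hypothetical sample data: record "b2" lost a heading on the ArchivesSpace side.
ils = {
    "b1": ["Pessoa, Fernando, 1888-1935"],
    "b2": ["Ford, Ford Madox, 1873-1939", "Pessoa, Fernando, 1888-1935"],
}
aspace = {
    "b1": ["Pessoa, Fernando, 1888-1935"],
    "b2": ["Ford, Ford Madox, 1873-1939"],
}
print(heading_mismatches(ils, aspace))
```

A plain string comparison like this will not account for cases where the two systems model a heading differently, so any reported mismatch still needs a human look.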
Near-Match Issue
600 1 0
$a Ford, Ford Madox, $d 1873-1939 $0
http://id.loc.gov/authorities/names/n810502328
Near-Match Issue
• Ford Madox Ford != his maternal grandfather
600 1 0
$a Ford, Ford Madox, $d 1873-1939 $0
http://id.loc.gov/authorities/names/n810502328
Solution
• Two parts:
1. Compare authorized name with name string
2. Check for multiple subfield 0s
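The two-part check can be sketched roughly as follows in Python. This is a minimal illustration, not the scripts used in the project. It assumes a MARC field is represented as a list of (code, value) tuples and that the authorized headings have already been collected into a dictionary keyed by URI. The `example.org` URIs and the function names are placeholders invented for the sketch, and the Brown heading is given only for illustration.

```python
def name_string(subfields):
    """Join every subfield except $0 into one comparable name string."""
    parts = [value.strip() for code, value in subfields if code != "0"]
    return " ".join(parts).rstrip(" .,")


def check_field(subfields, authorized):
    """Return a list of problems found in one access-point field.

    subfields:  list of (code, value) tuples for the field
    authorized: dict mapping an authority URI to its authorized heading
    """
    problems = []
    uris = [value for code, value in subfields if code == "0"]
    # Part 2 of the solution: more than one $0 is a warning sign.
    if len(uris) > 1:
        problems.append("multiple subfield 0s")
    # Part 1 of the solution: the authorized name must equal the name string.
    local = name_string(subfields)
    for uri in uris:
        heading = authorized.get(uri)
        if heading is None:
            problems.append(f"no authority record for {uri}")
        elif heading.rstrip(" .,") != local:
            problems.append(f"heading mismatch for {uri}")
    return problems


# Placeholder URIs standing in for real authority URIs.
BROWN = "http://example.org/authority/brown"
FORD = "http://example.org/authority/ford"
authorized = {
    BROWN: "Brown, Ford Madox, 1821-1893",
    FORD: "Ford, Ford Madox, 1873-1939",
}
field = [
    ("a", "Ford, Ford Madox,"),
    ("d", "1873-1939"),
    ("0", BROWN),
    ("0", FORD),
]
for problem in check_field(field, authorized):
    print(problem)
```

Because URIs are opaque, neither check inspects the URI itself; both rely on the heading text the URI resolves to.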
Summary
• We are enhancing data in two local systems
• We want to connect to external systems
• We want our description to be recognized outside of
our domain
• URIs are the first (not straightforward) step
• It’s not about links, but the potential for links
• Once connected, the network changes
Code created; code shared
• MARC XML analysis:
https://github.com/fordmadox/xquery-scripts
• Authority download and ASpace Linking:
https://github.com/mark-cooper/authorizer


Editor's Notes

  1. Good afternoon, everyone. As Karen mentioned, I will be going into a bit more detail about our efforts to enhance Yale’s legacy and current archival description by associating URIs with name and subject headings. To do that, I have decided to frame my talk around Fernando Pessoa.
  2. There are a few different reasons why I have selected Fernando Pessoa, represented here in the Social Networks and Archival Context interface, but the most important reason is because of Pessoa’s proclivity for creating and often writing as a variety of heteronyms – over 70 throughout his lifetime.
  3. Pessoa preferred the term heteronym to pseudonym, since, as he said himself, his heteronyms were “authors to whom he served as literary executor.” Many of the names you see listed here in SNAC, which can be accessed in the interface by clicking on a link labelled alternative names, have been described as different writers, different people, which also raises the question, as we undertake Linked Data projects today, of whether those different names require different URIs. In SNAC, there is one URI for Pessoa. In the Library of Congress Name Authority File, however, some of his heteronyms have their own authority records. An authority record for Ricardo Reis was just created last year, for example, in LC’s database. And even though there is only one Wikipedia entry for Pessoa in the English-language Wikipedia…
  4. …what you see on this next slide is an entry for Ricardo Reis in the Portuguese-language edition of Wikipedia. There are also stand-alone URIs for some of Pessoa’s heteronyms in Wikidata.
  5. In fact, if you view his entry in Wikidata, as seen here…
  6. You will find a series of statements that are grouped under a heading that’s labeled “said to be the same as”. Here, you will see an entry for Ricardo Reis and others. Each of these entries is a URI. There is also a URI for the term “heteronym” in Wikidata, which is how all of these same-as relationships are characterized. Furthermore, there is a URI for the concept of “said to be the same as” in Wikidata. At this point, we are starting to go down the Linked Data rabbit hole. I don’t plan to do that in today’s talk, so instead I would like to pull things back for a moment and provide a concrete example of the value in adding URIs to archival description.
  7. On this slide, I have included an image of the recently-released Digital Edition of Pessoa’s writings, which is a collaborative project undertaken by the New University of Lisbon and the University of Cologne. This site currently contains, among other writings, all of the poetry published by Pessoa in his lifetime.
  8. Here, for example, is an encoded transcription of one of his poems, attributed to his birth name, alongside a digitized copy of that same version of the poem. Now, as for the connection to the project that Karen and I are reporting on today, because Pessoa is one of the agent records in our ArchivesSpace database, we will be updating our record for him with a URI. And here…
  9. …is how that Agent record looks in the development version of our ArchivesSpace Public User Interface. The only reason we have an agent record for Pessoa is because the Beinecke Library acquired a draft of the same poem that I just showed you in the Digital Edition website, which is represented here on the screen by the single search result. Of course, our local description would probably never go to the lengths of providing a researcher a link to an encoded version of the poem hosted elsewhere. But what happens when we add URIs to our description?
  10. When we use URIs, we create potential connections. And these potentialities can be realized without us doing anything else. How? Well, as you see in this slide, imagine that the Beinecke has added a URI from the LC Name Authority File to its description. Also imagine that the Digital Edition website has done the same, but instead of using an LC URI, they have used the authority ID from the National Library of Portugal. So now we have two IDs that aren’t the same.
  11. But because of Linked Data services, such as Wikidata, potential connections exist. Anyone can connect these two IDs, at the time of need, because Wikidata records them both. In fact, in addition to the LC NAF ID and the National Library of Portugal ID, the Wikidata record for Pessoa references 48 other IDs, all of which refer to Pessoa, in different systems, different languages, all over the world. In other words, we can provide a solid foundation for connections without even having to communicate with other description providers. So that’s why we’re adding URIs, and that’s also why I hope that everyone else is adding URIs, or considering adding them, to their archival description. But Fernando Pessoa and his heteronyms are also why adding URIs is not a straightforward process, let alone describing relationships amongst those URIs.
  12. But all projects have to start somewhere, so that’s why I’ve started with the example of what it takes to enhance a single name string with a URI. When we started our project, I did not have a grasp on how many headings we would be updating, but now that we have gotten to this point, I can tell you that we have added exactly 31,665 unique URIs to nearly 10k finding aids. And along the way, we’ve made mistakes, so next I just want to talk a little bit about how we reviewed our work.
  13. When it came to quality control, we had four general project requirements. First, we had the task of connecting each ArchivesSpace record with its corresponding MARC record. Simple, right? Well, we had a few issues here, such as the wrong links being made accidentally, as well as one issue where a single collection had so many access points that it was split, long ago, into two MARC records in our ILS, whereas we have a single Resource record for that collection in ArchivesSpace. Next, we wanted to ensure that every subfield 0 that we added during the course of the project was accurate. Most of the subfield 0s were added automatically, by Backstage, and only when the name string was an exact match with the primary heading from an authority record. In all of those cases, we had to hope that the archivist or cataloger added the name string correctly in the first place. We also added a much smaller subset of URIs manually, when our group reviewed the “near match” reports provided by Backstage, as Karen already mentioned. And, in my experience, whenever you have more than one person doing more than one thing manually, you are going to get a variety of errors, so you have to check the results. Third, and this was a simple one since we had LYRASIS do it for us, we had to make sure that for every subfield 0 we added, we were able to download its authority record from LC or the Getty. And that’s just a simple numbers check. Finally, we had to verify that all of the headings that we have in our ILS are also in ArchivesSpace and linked to the exact same descriptive records. This is also basically a numbers check, but it is a bit more nuanced since ArchivesSpace does not align one-to-one with bibliographic description. For just one example, a Meeting Name in MARC is mapped as a Corporate Agent record in ArchivesSpace, making it indistinguishable from other Corporate Agent headings. Not a travesty by any means, but it makes our last stage of verification thornier than I would like. The important takeaway, though, is that we had errors at every stage in the project that we needed to correct, so next, I am going to show 3 examples of errors we encountered…
  14. URIs are opaque, so everything looks fine here…
  15. Wrong URI, Right Name (sort of, as FMF was named after FMB).
  16. In this case, we caught the error since the MARC record eventually had two subfield 0s. The reason: we sent this record to Backstage twice during the course of the project, and when it came back the second time, it had two subfield 0s. The second was the correct one; the first one was for FMB. So, we removed the first one! We also checked all of our matches against the authorized headings to ensure that there weren’t any other blatantly wrong matches.
  17. Getty responded within hours to apply a fix so that we could download this record. We also had to report a handful of issues to LoC during this project, when we discovered that some records were available at authorities.loc.gov but not available (due to indexing issues?) at id.loc.gov.
  18. Don’t use undifferentiated records. We had 135 matches to these types of records, and once we discovered that, we removed those subfield 0s (but not the headings) from our records. In this case, the record is for a G.B. who published a book in 2015 as well as a G.B. who was a 19th century musician.
  19. All our finding aids, subject, and agent records (now represented as URIs) from this project, ingested as a graph with Gephi. 43,109 nodes: 51% agent headings, 26% subjects, 23% finding aids. 137,844 connections / edges: 40% are 650 topical headings (orange lines), 22% are 600 and 700 headings (pink), and nearly 20% are geographic headings (green).
  20. Oft-used subject heading in the Beinecke.
  21. J.B., here in isolation, but she is / would be much, much more central in other graphs (even this graph, if we described relationships among people, in addition to material-to-people relationships). And the point here is that once we add URIs for these entities, we create that potential.
  22. Like Baker, another very isolated agent record in our graph…
  23. But the underlying metadata, seen here through Google’s eyes, provides one possibility for (re)connecting Pessoa outside of our “Archives at Yale” graph.
  24. The first link includes a few scripts used to review our MARC XML records before sending them to LYRASIS (checking for typical issues we encountered, like a URI link in a subfield other than subfield 0, etc.). The second link is the amazing set of tools developed by Mark Cooper, at LYRASIS, which download authority records (from LC and Getty), get them into ASpace, and link those authority records back to our Resource records (by means of our “bib ids” from our ILS).
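
The kind of pre-flight review mentioned in note 24, looking for a URI that has landed in a subfield other than subfield 0, might look something like the Python sketch below. The project's own scripts are XQuery and are not reproduced here; this is an independent illustration. The sample record, the function name, and the decision to inspect only 1XX, 6XX, and 7XX fields are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record: the URI was keyed into $2 instead of $0.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="600" ind1="1" ind2="0">
      <subfield code="a">Pessoa, Fernando,</subfield>
      <subfield code="d">1888-1935</subfield>
      <subfield code="2">http://id.loc.gov/authorities/names/n50016857</subfield>
    </datafield>
  </record>
</collection>"""


def misplaced_uris(marcxml_text):
    """List (tag, subfield code, value) for URIs found outside subfield 0.

    Only 1XX, 6XX, and 7XX fields are inspected (an assumption of this
    sketch), so that fields where URLs are expected are not flagged.
    """
    root = ET.fromstring(marcxml_text)
    found = []
    for field in root.iterfind(".//marc:datafield", MARC_NS):
        tag = field.get("tag", "")
        if not tag.startswith(("1", "6", "7")):
            continue
        for sub in field.iterfind("marc:subfield", MARC_NS):
            value = (sub.text or "").strip()
            if value.startswith("http") and sub.get("code") != "0":
                found.append((tag, sub.get("code"), value))
    return found


for hit in misplaced_uris(SAMPLE):
    print(hit)
```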