Linked Pasts IV - Linking Syriac Geographic Data

Slides for the Linking Syriac Geographic Data Research Group at Linked Pasts IV in Mainz, Germany, as presented December 11th, 2018.

  1. Linking Syriac Geographic Data (Wido Van Peursen & Mathias Coeckelbergs)
  2. LinkSyr Project https://github.com/ETCBC/linksyr
  3. OCR
  4. Multiple Text Formats
  5. Lemmatizer
  6. Morphological Parser
  7. Syriaca data • TEI RDF/XML data
  8. Peshitta in Text-Fabric
  9. Reconciliation Results • Syriaca place names in Text-Fabric
  10. Matching in Recogito
  11. Matching in Recogito
  12. Fuzzy Matching • OCR
  13. Fuzzy Matching • Gentilics
  14. Fuzzy Matching • Allow false positives -> manual control and database expansion -> a threshold of 0.8 models matres lectionis -> a threshold of 0.55 models close relatives -> Example: ’gbty / ’gptws
  15. Fuzzy Matching • BLC test case -> threshold of 0.8: 84.8% true positives and 15.2% false positives -> threshold of 0.55: 64% true positives and 36% false positives -> Example: ’gbty / ’gptws
  16. Fuzzy Matching
  17. Event Matching
  18. Future Work • Towards data-driven classification of URIs • Topic Modelling
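
The two-threshold fuzzy matching on slides 14 and 15 can be sketched roughly as follows. The slides do not say which similarity measure the project uses, so this sketch assumes Python's `difflib.SequenceMatcher` ratio as a stand-in, and its scores will not necessarily agree with the project's. Only the thresholds 0.8 and 0.55 come from the slides; the function names `similarity`, `match_tier` and `candidates` are hypothetical.

```python
from difflib import SequenceMatcher

# Thresholds quoted on the slides. The similarity measure is an assumption.
STRICT_THRESHOLD = 0.8   # slides: models matres lectionis
LOOSE_THRESHOLD = 0.55   # slides: models close relatives


def similarity(a: str, b: str) -> float:
    """Return a score between 0 and 1 (difflib ratio, a stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()


def match_tier(a: str, b: str) -> str:
    """Classify a pair of names as 'strict', 'loose' or 'none'."""
    score = similarity(a, b)
    if score >= STRICT_THRESHOLD:
        return "strict"
    if score >= LOOSE_THRESHOLD:
        return "loose"
    return "none"


def candidates(name: str, gazetteer: list[str], threshold: float) -> list[tuple[str, float]]:
    """Return gazetteer entries scoring at or above the threshold, best first."""
    scored = [(entry, similarity(name, entry)) for entry in gazetteer]
    hits = [(entry, score) for entry, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Lowering the threshold from 0.8 to 0.55 returns more candidates, at the cost of more false positives that then need manual checking, which is the trade-off the BLC test case on slide 15 reports.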
