WISS Research Group | JudaicaLink: Linked Data in the Jewish Studies FID - EVA/MINERVA 2017 - Nov 14th, 2017 - Jerusalem
JudaicaLink
Linked Data in the Jewish Studies FID
Kai Eckert
http://www.judaicalink.org
FID Jewish Studies / Israel Studies
Creation of a specialized information service
(Fachinformationsdienst, FID) for the domain of Jewish
Studies and Israel Studies.
Our part:
● Metadata integration and enrichment.
● Multilingual data matching.
Portal of Jewish Studies
Goals:
● Create a central access point
● Offer a high-performance information
infrastructure
And also,
● Contextualize the digital Judaica
collections
● Enrich the metadata
● Connect different data sources as
Linked Open Data
The Portal
http://umber.ub.uni-frankfurt.de/judaica/ (Beta version, not yet officially launched!)
Re-Transliteration
● Automatic Retro-Conversion
of Romanized Hebrew Text
● Improve search facilities
for Hebrew speakers
● Needed to match data
cross-lingually
Aaron Christianson
Retro-Conversion of Romanized Hebrew Text
lĕqaḥat ṭeqsṭ ʿivrî bĕ-taʿătîq lāṭînî
ולהפוך אותו לאותיות עבריות
(Take Hebrew text in Latin transliteration and turn it into Hebrew letters.)
הסיפור המלא
הַסִּפּוּר הַמָּלֵא
Problem Statements
● No Hebrew Script until 2011
● Multiple standards of Romanization
● Ambiguities: Same romanized character can refer to
several Hebrew letters.
● Data imported from other catalogs
● The transliterations contain errors
(yes, even librarians make - rare - mistakes)
The Plan
1. Generate all possible original
forms in a “stupid” way.
2. Match the output against known
Hebrew names / titles.
3. Use the verified matches to
train a statistical model on the
word/phrase level.
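The first two steps of the plan can be sketched roughly as follows. This is a minimal illustration, not the project's code: the ambiguity table `CANDIDATES` is a hypothetical, heavily simplified mapping (real romanization standards use diacritics, digraphs and final letter forms), and `generate_candidates` and `match_known` are names invented for this sketch.

```python
from itertools import product

# Hypothetical, heavily simplified ambiguity table (an assumption for this
# sketch): each romanized character maps to the Hebrew letters it may stand for.
CANDIDATES = {
    "t": ["ת", "ט"],
    "s": ["ס", "ש"],
    "k": ["כ", "ק"],
    "v": ["ב", "ו"],
    "r": ["ר"],
    "d": ["ד"],
    "a": ["", "א"],  # vowels are often not written at all
}


def generate_candidates(romanized):
    """Step 1: enumerate every Hebrew string the table allows ("stupid" way)."""
    # Characters missing from the table are simply skipped in this sketch.
    options = [CANDIDATES.get(ch, [""]) for ch in romanized.lower()]
    return {"".join(combo) for combo in product(*options)}


def match_known(romanized, known_forms):
    """Step 2: keep only candidates that occur among known Hebrew forms."""
    return generate_candidates(romanized) & set(known_forms)


if __name__ == "__main__":
    known = ["דבר", "שלום"]  # stand-in for a list of verified names / titles
    print(match_known("davar", known))
```

The number of candidates grows exponentially with the length of the input, which is one reason the verified matches are then used to train a statistical model instead of relying on enumeration alone.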
Contextualization of Digital Resources
● Find relevant data sources
● Find matching resources
● Extract information
● Add information to
library collection
Maral Dadvar
It’s all about Labels!
● Labels (Strings) are the first thing
we search to generate matching
candidates.
● Every additional label for a resource is
a possible new entry point to create
a connection.
● Caveat: More labels also create
more false positives. Further evidence
is needed to establish a link.
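A small sketch of label-based candidate generation may make the trade-off concrete. It assumes an in-memory dictionary of resources and labels; the identifiers (`ex:aach`, `ex:minsk`, ...) and the normalization rules are illustrative assumptions, not the project's actual data or method.

```python
import unicodedata
from collections import defaultdict


def normalize(label):
    """Case-fold and strip diacritics so that spelling variants collide."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def build_label_index(resources):
    """Map each normalized label to the set of resource ids carrying it."""
    index = defaultdict(set)
    for resource_id, labels in resources.items():
        for label in labels:
            index[normalize(label)].add(resource_id)
    return index


def candidates(index, label):
    """Return all resources that share the (normalized) label."""
    return index.get(normalize(label), set())


# Hypothetical example data, for illustration only.
RESOURCES = {
    "ex:aach": ["Aach, Löb", "Löb Aach"],
    "ex:minsk": ["Minsk"],
    "ex:minsk_gov": ["Minsk", "Minsk Governorate"],
}

if __name__ == "__main__":
    index = build_label_index(RESOURCES)
    print(candidates(index, "Lob Aach"))  # extra label gives a new entry point
    print(candidates(index, "minsk"))     # ambiguous label: two candidates
```

The second lookup returns more than one resource, which is the caveat from the slide: a label match only yields candidates, and further evidence is needed before a link is established.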
● Make unstructured data
sources like online
encyclopedias available
as structured data
● Identify and collect
relevant subsets of
general-purpose
knowledge bases like
DBpedia
● To function as a single
hub for the
contextualization
process
Main Tasks
1. Find new resource descriptions - with labels!
2. Find new labels (and other data) for known resources.
3. Find connections and duplicates within known resources.
4. Make the data available for others to contextualize.
Example 1: YIVO Encyclopedia
What Data can we find?
● A title
● Descriptive text
● Links in texts
"Surface form" => Concept
● Pictures
● Descriptions of pictures
Making use of the Surface Forms
The Minsk article links to "Poland before 1795", calling it
"Polish-Lithuanian Commonwealth".
"Polish-Lithuanian Commonwealth" is a subsection
of "Poland before 1795" in the main article "Poland".
So is "Demography"…
Surface forms are evidence for labels.
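Collecting surface forms from an article page could look roughly like this. The sketch uses only Python's standard `html.parser`; the sample markup and the link target `/article/Poland#before1795` are made up for illustration and do not reflect the actual structure of the YIVO website.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (surface form, link target) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            surface = " ".join("".join(self._text).split())
            if surface:
                self.pairs.append((surface, self._href))
            self._href = None


def surface_forms(html):
    """Group the anchor texts by the target they point to."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    forms = defaultdict(set)
    for surface, target in collector.pairs:
        forms[target].add(surface)
    return dict(forms)


# Hypothetical markup, for illustration only.
SAMPLE = (
    '<p>Minsk was part of the '
    '<a href="/article/Poland#before1795">Polish-Lithuanian Commonwealth</a>.</p>'
)

if __name__ == "__main__":
    print(surface_forms(SAMPLE))
```

Each collected anchor text is only a candidate label for its target; as the "Demography" case shows, some surface forms need to be filtered out before they are used.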
Example 2: Biographisches Handbuch der Rabbiner
● The Biographisches Handbuch der Rabbiner (Biographical Handbook of Rabbis) is an
online encyclopedia provided by the Salomon L. Steinheim Institute for German-Jewish
History at the University of Duisburg-Essen, edited by Michael Brocke and Julius
Carlebach.
● The goal of this encyclopedia is to be a complete directory of all rabbis who
lived and worked in or originated from German-speaking areas since the Age
of Enlightenment.
● http://www.steinheim-institut.de/wiki/index.php/Biographisches_Handbuch_der_Rabbiner_%28BHR%29
Available
as PDF.
Some notes about PDF Sources
● PDF is great to keep the visual layout of a text across
systems.
● In all other aspects, in particular regarding the access to
the content, it is horrible.
● Digital-born PDFs (as in this case) are PDFs that have
been created directly by the authoring software.
● Even worse: PDFs created from scans (with OCR).
Biographisches Portal der Rabbiner
Fortunately, some people at the Steinheim Institute created a database from the handbook:
http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001
The URL above is shown in the browser when you view the entry on Aach, Löb.
Great stuff:
● Semi-structured form of the entry
● “Link” to the PDF by means of volume and page number
● Reference to the number of the entry as it is used in the PDF.
● A GND number!!!
Not so great:
● We cannot link to the database, as the link above does not resolve to the
article.
Solution
The solution was in fact already implemented:
there is an undocumented way to address an entry:
http://steinheim-institut.de:50580/cgi-bin/bhr?id=1
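Turning the URL shown in the browser into an addressable one could be sketched as below. The sketch assumes that a fragment of the form `i0001` carries the entry number and that it corresponds to `id=1`; this correspondence is inferred from the two example URLs and is not documented. The function name `entry_url` is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def entry_url(browser_url):
    """Rewrite '...bhr#i0001' into '...bhr?id=1' (assumed correspondence)."""
    parts = urlsplit(browser_url)
    fragment = parts.fragment
    if not (fragment.startswith("i") and fragment[1:].isdigit()):
        raise ValueError(f"no entry fragment in {browser_url!r}")
    entry_id = int(fragment[1:])  # drops the leading zeros
    query = urlencode({"id": entry_id})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(entry_url("http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001"))
```

The rewrite is done purely on the string level, so no request is sent to the server.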
Example 3: DBpedia
Generation of a DBpedia subgraph.
1. Focused crawling of data sources
   a. identify “relevant” resources
   b. extract “relevant” information
2. Find matches in the whole dataset
   a. extract “relevant” information
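The idea of focused crawling can be illustrated with a breadth-first traversal that only expands resources judged relevant. This is a sketch under stated assumptions: the link graph is a small in-memory dictionary instead of DBpedia itself, and the relevance test is a simple set membership check; the resource names and the `focused_crawl` function are chosen for illustration.

```python
from collections import deque


def focused_crawl(graph, seeds, is_relevant, max_depth=2):
    """Breadth-first crawl that only follows links from relevant resources."""
    visited = set(seeds)
    relevant = []
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        resource, depth = queue.popleft()
        if not is_relevant(resource):
            continue  # do not collect and do not expand irrelevant resources
        relevant.append(resource)
        if depth == max_depth:
            continue  # depth limit reached, collect but do not expand
        for neighbour in graph.get(resource, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return relevant


# Toy link graph and relevance set, for illustration only.
GRAPH = {
    "dbr:Rabbi": ["dbr:Talmud", "dbr:Clergy"],
    "dbr:Talmud": ["dbr:Mishnah"],
    "dbr:Clergy": ["dbr:Bishop"],
}
RELEVANT = {"dbr:Rabbi", "dbr:Talmud", "dbr:Mishnah"}

if __name__ == "__main__":
    print(focused_crawl(GRAPH, ["dbr:Rabbi"], lambda r: r in RELEVANT))
```

Because irrelevant resources are never expanded, the crawl stays within the neighbourhood of the seeds; what counts as “relevant” is the part that needs domain-specific criteria.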
Interlinking
The more data sources we have, the better we can use them to support the linking
process.
Architecture and Deployment
Triple store and SPARQL endpoint: Apache Jena Fuseki
Linked Data frontend (URI dereferencing, HTML Views): Pubby (DM2E version)
Static HTML pages of the website: Hugo
Versioning and management: GitHub
Search Access: Elasticsearch (planned)
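As an illustration of how a client might talk to such a SPARQL endpoint, the following sketch builds a request URL according to the SPARQL protocol (query passed in the `query` parameter of a GET request). The endpoint address `http://example.org/sparql` is a placeholder assumption, since the slides do not give the endpoint URL; the graph name is the one used for the YIVO dataset.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, graph, limit=10):
    """Build a SPARQL-protocol GET URL listing triples from one named graph."""
    query = (
        "SELECT ?s ?p ?o "
        f"WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }} "
        f"LIMIT {int(limit)}"
    )
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Placeholder endpoint (assumption); no request is actually sent here.
    print(sparql_get_url("http://example.org/sparql",
                         "http://data.judaicalink.org/data/yivo", limit=5))
```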
Dataset Description in Markdown
with Metadata Frontmatter
+++
author = "Kai Eckert"
title = "Yivo Encyclopedia"
website = "http://www.yivoencyclopedia.org"
example = "http://data.judaicalink.org/data/yivo/Moscow"
graph = "http://data.judaicalink.org/data/yivo"
loaded = true
[[files]]
url = "http://data.judaicalink.org/dumps/yivo/current/yivo.n3.gz"
description = "Extraction from YIVO Encyclopediae"
+++
The YIVO Encyclopedia of Jews in Eastern Europe, courtesy of the YIVO Institute of
Jewish Research, NY.
<!--more-->
...
The whole website is maintained via GitHub
Every new commit gets pushed to the web server
Via the static site generator Hugo, all HTML pages are generated.
A Python script parses the metadata of the pages
and loads and unloads the datasets automatically.
Advantages:
● No one needs access to the server.
● Write access to the data is easily granted via GitHub.
● Data dumps of all datasets are always available.
● Description, dumps and loaded data are always in sync.
● The history of the datasets (and the dumps) is maintained.
● Mistakes can easily be reverted by going back to an earlier commit.
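The slides do not show the script itself, so the following is only a guess at the core idea: compare the desired state described in the page metadata with what is currently in the triple store and derive which graphs to load and unload. The function `plan_sync`, the shape of its inputs, and the `http://example.org/...` graph name are assumptions made for this sketch.

```python
def plan_sync(pages, loaded_graphs):
    """Compare desired state (page metadata) with the triple store's state.

    pages: list of metadata dicts with "graph" and "loaded" keys.
    loaded_graphs: names of the graphs currently present in the store.
    Returns (graphs to load, graphs to unload), each sorted.
    """
    wanted = {meta["graph"] for meta in pages
              if meta.get("loaded") and "graph" in meta}
    current = set(loaded_graphs)
    return sorted(wanted - current), sorted(current - wanted)


if __name__ == "__main__":
    pages = [
        {"graph": "http://data.judaicalink.org/data/yivo", "loaded": True},
        {"graph": "http://example.org/data/other", "loaded": False},  # placeholder
    ]
    currently_loaded = ["http://example.org/data/other"]
    to_load, to_unload = plan_sync(pages, currently_loaded)
    print("load:", to_load)
    print("unload:", to_unload)
```

Deriving both lists from the committed metadata is what keeps description, dumps and loaded data in step: reverting a commit changes the desired state, and the next run brings the store back in line.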
Thank you.
http://slideshare.net/kaiec
http://www.wisslab.org
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
eitps1506
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 

Recently uploaded (20)

Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
 
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
cathode ray oscilloscope and its applications
cathode ray oscilloscope and its applicationscathode ray oscilloscope and its applications
cathode ray oscilloscope and its applications
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdfHUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
 
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
 
Anti-Universe And Emergent Gravity and the Dark Universe
Anti-Universe And Emergent Gravity and the Dark UniverseAnti-Universe And Emergent Gravity and the Dark Universe
Anti-Universe And Emergent Gravity and the Dark Universe
 
Pests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdfPests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdf
 
Microbiology of Central Nervous System INFECTIONS.pdf
Microbiology of Central Nervous System INFECTIONS.pdfMicrobiology of Central Nervous System INFECTIONS.pdf
Microbiology of Central Nervous System INFECTIONS.pdf
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
 
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 

JudaicaLink: Linked Data in the Jewish Studies FID

  • 1. WISS Research Group | JudaicaLink: Linked Data in the Jewish Studies FID - EVA/MINERVA 2017 - Nov 14th, 2017 - Jerusalem. JudaicaLink: Linked Data in the Jewish Studies FID. Kai Eckert. http://www.judaicalink.org
  • 2. FID Jewish Studies / Israel Studies: Creation of a specialized information service (Fach-Informations-Dienst, FID) for the domain of Jewish Studies and Israel Studies. Our part: ● Metadata integration and enrichment. ● Multilingual data matching.
  • 3. Portal of Jewish Studies. Goals: ● Create a central access point ● Offer high-performance information infrastructure. And also: ● Contextualize the digital Judaica collections ● Enrich the metadata ● Connect different data sources as Linked Open Data
  • 4. The Portal: http://umber.ub.uni-frankfurt.de/judaica/ (Beta version, not yet officially launched!)
  • 6. Portal of Jewish Studies. Goals: ● Create a central access point ● Offer high-performance information infrastructure. And also: ● Contextualize the digital Judaica collections ● Enrich the metadata ● Connect different data sources as Linked Open Data
  • 7. Re-Transliteration: ● Automatic retro-conversion of romanized Hebrew text ● Improve search facilities for Hebrew speakers ● Needed to match data cross-lingually (Aaron Christianson)
  • 8. Retro-Conversion of Romanized Hebrew Text: lĕqaḥat ṭeqsṭ ʿivrî bĕ-taʿătîq lātînî ולהפוך אותו לאותיות עבריות
  • 9. הסיפור המלא / הַסִּפּוּר הַמָּלֵא
  • 10. Problem Statements: ● No Hebrew script until 2011 ● Multiple standards of romanization ● Ambiguities: the same romanized character can refer to several Hebrew letters. ● Data imported from other catalogs ● The transliterations contain errors (yes, even librarians make - rare - mistakes)
  • 11. The Plan: 1. Generate all possible original forms in a “stupid” way. 2. Match the output against known Hebrew names / titles. 3. Use the verified matches to train a statistical model on the word/phrase level.
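
Steps 1 and 2 of the plan can be sketched as follows. This is a minimal illustration, not the project's implementation: the ambiguity table `CANDIDATES`, the function names, and the one-word list of known titles are all hypothetical, and the sketch ignores final letter forms and multi-character romanizations. Step 3 (training a statistical model) is not shown.

```python
from itertools import product

# Hypothetical, deliberately tiny ambiguity table: each romanized character
# maps to every Hebrew letter it might stand for. An empty string means the
# character (typically a vowel) may be unwritten in Hebrew script.
CANDIDATES = {
    "t": ["ט", "ת"],
    "o": ["ו", ""],
    "v": ["ב", "ו"],
    "s": ["ס", "ש"],
    "k": ["כ", "ק"],
}


def generate_forms(romanized: str) -> set[str]:
    """Step 1: expand a romanized word into all candidate Hebrew spellings.

    Characters missing from the table are simply dropped in this sketch.
    """
    options = [CANDIDATES.get(ch, [""]) for ch in romanized.lower()]
    return {"".join(combo) for combo in product(*options)}


def match_known(romanized: str, known: set[str]) -> set[str]:
    """Step 2: keep only the candidates that occur in a list of known forms."""
    return generate_forms(romanized) & known


# Stand-in for an authority list of known Hebrew names and titles.
known_titles = {"טוב"}
verified = match_known("tov", known_titles)
```

The brute-force expansion grows multiplicatively with word length, which is why the plan filters the candidates against known forms before anything is used for training.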
  • 12. Portal of Jewish Studies. Goals: ● Create a central access point ● Offer high-performance information infrastructure. And also: ● Contextualize the digital Judaica collections ● Enrich the metadata ● Connect different data sources as Linked Open Data
  • 13. Contextualization of Digital Resources: ● Find relevant data sources ● Find matching resources ● Extract information ● Add information to library collection (Maral Dadvar)
  • 14. It’s all about Labels! ● Labels (strings) are the first thing we search to generate matching candidates. ● Every additional label for a resource is a possible new entry point to create a connection. ● Caveat: More labels also create more false positives. Further evidence is needed to establish a link.
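
A label index makes the trade-off concrete. The sketch below is illustrative only: the resource identifiers, labels, and function names are hypothetical. It shows how a single shared label yields several matching candidates, which is why further evidence is needed before a link is established.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Case-fold and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.casefold().split())


def build_index(resources: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized label to the set of resources carrying it."""
    index = defaultdict(set)
    for uri, labels in resources.items():
        for label in labels:
            index[normalize(label)].add(uri)
    return index


# Hypothetical resources and labels, for illustration only.
resources = {
    "ex:Minsk": ["Minsk", "Mińsk"],
    "ex:Minsk_Region": ["Minsk", "Minsk Region"],
}
index = build_index(resources)

# Both resources carry the label "Minsk", so both come back as candidates.
candidates = index.get(normalize("minsk"), set())
```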
  • 16. ● Make unstructured data sources such as online encyclopedias available as structured data ● Identify and collect relevant subsets of general-purpose knowledge bases like DBpedia ● Function as a single hub for the contextualization process
  • 17. Main Tasks: 1. Find new resource descriptions - with labels! 2. Find new labels (and other data) for known resources. 3. Find connections and duplicates within known resources. 4. Make the data available for others to contextualize.
  • 18. Example 1: YIVO Encyclopedia
  • 19. What data can we find? ● A title ● Describing text ● Links in texts: "surface form" => concept ● Pictures ● Descriptions of pictures
  • 20. Making Use of the Surface Forms: The Minsk article links to "Poland before 1795", calling it "Polish-Lithuanian Commonwealth". "Polish-Lithuanian Commonwealth" is a subsection of "Poland before 1795" in the main article "Poland". So is "Demography"… Surface forms are evidence for labels.
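
Collecting surface forms amounts to pairing each link's visible text with its target. The sketch below uses Python's standard `html.parser`. The markup in `snippet` and the URL scheme are hypothetical; the real article HTML may look different.

```python
from html.parser import HTMLParser


class SurfaceFormParser(HTMLParser):
    """Collect (surface form, link target) pairs from <a href=...> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while we are inside a link that has a target.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.pairs.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical markup, loosely modelled on the Minsk example.
snippet = (
    '<p>Minsk was part of the '
    '<a href="/article/Poland#before1795">Polish-Lithuanian Commonwealth</a>.</p>'
)
parser = SurfaceFormParser()
parser.feed(snippet)
pairs = parser.pairs
```

Each pair is only evidence for a label: as the "Demography" case shows, a surface form may point at a subsection rather than name the concept itself.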
  • 21. Example 2: Biographisches Handbuch der Rabbiner: ● The Biographisches Handbuch der Rabbiner is an online encyclopedia provided by the Salomon L. Steinheim-Institute for German-Jewish history at the University of Duisburg-Essen, edited by Michael Brocke and Julius Carlebach. ● The goal of this encyclopedia is to be a complete directory of all rabbis who lived and worked in or originated from German-speaking areas since the Age of Enlightenment. ● http://www.steinheim-institut.de/wiki/index.php/Biographisches_Handbuch_der_Rabbiner_%28BHR%29
  • 22. Available as PDF.
  • 23. Some notes about PDF sources: ● PDF is great for keeping the visual layout of a text across systems. ● In all other aspects, in particular regarding access to the content, it is horrible. ● Digital-born PDFs (as in this case) are PDFs that have been created directly by the authoring software. ● Even worse: PDFs created from scans (with OCR).
  • 25. Biographisches Portal der Rabbiner: Fortunately, some people at the Steinheim Institute created a database from the handbook: http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001 The URL above is shown in the browser when you view the entry on Aach, Löb. Great stuff: ● Semi-structured form of the entry ● “Link” to the PDF by means of volume and page number ● Reference to the number of the entry as it is used in the PDF. ● A GND number!!! Not so great: ● We cannot link to the database, as the link above does not resolve to the article.
  • 26. Solution: The solution was actually already implemented. There is an undocumented way to address an entry: http://steinheim-institut.de:50580/cgi-bin/bhr?id=1
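
A URL fragment (the part after `#`) is handled by the browser and is not sent to the server, which is consistent with the first link not resolving to the article. The sketch below rewrites a fragment-style URL into the query-style form. It assumes that the fragment is `i` followed by the zero-padded entry number, which is inferred from the single example above and is not documented; the function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def entry_url(fragment_url: str) -> str:
    """Turn a '#i0001'-style URL into a '?id=1'-style URL.

    Assumption: the fragment is 'i' plus the zero-padded entry number.
    """
    parts = urlsplit(fragment_url)
    number = parts.fragment[1:]
    if not (parts.fragment.startswith("i") and number.isdigit()):
        raise ValueError(f"unexpected fragment: {parts.fragment!r}")
    # Rebuild the URL with a query string and without the fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, f"id={int(number)}", "")
    )


url = entry_url("http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001")
```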
  • 27. Example 3: DBpedia. Generation of a DBpedia subgraph: 1. Focused crawling of data sources: a. identify “relevant” resources b. extract “relevant” information 2. Find matches in the whole dataset: a. extract “relevant” information
  • 28. Interlinking: The more data sources we have, the better we can use them to support the linking process.
  • 29. Architecture and Deployment: ● Triple store and SPARQL endpoint: Apache Jena Fuseki ● Linked Data frontend (URI dereferencing, HTML views): Pubby (DM2E version) ● Static HTML pages of the website: Hugo ● Versioning and management: GitHub ● Search access: Elasticsearch (planned)
  • 30. Dataset Description in Markdown with Metadata Frontmatter: +++ author = "Kai Eckert" title = "Yivo Encyclopedia" website = "http://www.yivoencyclopedia.org" example = "http://data.judaicalink.org/data/yivo/Moscow" graph = "http://data.judaicalink.org/data/yivo" loaded = true [[files]] url = "http://data.judaicalink.org/dumps/yivo/current/yivo.n3.gz" description = "Extraction from YIVO Encyclopediae" +++ The YIVO Encyclopedia of Jews in Eastern Europe, courtesy of the YIVO Institute of Jewish Research, NY. <!--more--> ...
  • 31. The whole website is maintained via GitHub.
  • 32. Every new commit gets pushed to the web server. Via the static site generator Hugo, all HTML pages are generated.
  • 33. Every new commit gets pushed to the web server. A Python script parses the metadata of the pages and loads and unloads the datasets automatically. Advantages: ● No one needs access to the server. ● Write access to the data is easily managed via GitHub. ● Data dumps of all datasets are always available. ● Description, dumps and loaded data are always in sync. ● The history of the datasets (and the dumps) is maintained. ● Mistakes can easily be reverted by going back to an earlier commit.
  • 34. Thank you. http://slideshare.net/kaiec http://www.wisslab.org