Keynote presented at the Language Technologies & Digital Humanities Conference 2018, Ljubljana, Slovenia.
Abstract:
Digital Humanities (DH) researchers rely on large digital datasets. Because the National Library of the Netherlands (KB) has been digitizing its collection for about ten years, its datasets are popular among DH scholars who focus on historical newspapers, periodicals and books. By not only giving researchers access to datasets but also collaborating with them, the KB aims to incorporate DH research results into its services and products. We do this by sharing our prototype tools and code on our online Lab, inviting academics to come and work as researcher-in-residence, and acting as a full partner in research projects. In this talk I will describe the challenges and opportunities for libraries and academics when they collaborate. What can researchers gain from collaborating with libraries? And how can libraries bring the affordances of DH research to the wider public?
Bringing Digital Humanities to the wider public: libraries as incubators for DH research results
1. Bringing Digital Humanities to the wider public:
Libraries as incubators for DH Research Results
dr. Martijn Kleppe – Head of Research Department
martijn.kleppe@kb.nl | @martijnkleppe | www.kb.nl/martijnkleppe
2. What is the National Library of the Netherlands?
27. “Putting TDM in the Mainstream”, i.e. search portals for a bigger audience
http://dh.library.yale.edu/projects/vogue/
https://www.youtube.com/watch?v=yHi4TD4YfGQ
https://twitter.com/sclaeyssens/status/748047246722228228
33. Collaboration with heritage institutes
https://pro.europeana.eu/network-association/special-interest-groups/europeanatech
https://www.netwerkdigitaalerfgoed.nl/en/
34. Collaboration with Research infrastructures
https://www.clariah.nl/
http://www.odissei-data.nl/en
https://www.clarin.eu/
https://www.dariah.eu/
https://timemachine.eu/
35. Collaboration with researchers, who are actually our customers
https://www.kb.nl/en/organisation/research-expertise/projects
https://www.kb.nl/en/organisation/research-expertise/researcher-in-residence
48. Continuous improvement of enrichment algorithm
[Figure: quality/confidence (%), with ticks at 70, 80 and 90, plotted against article number / time, from article 1 to 108 million]
• All DBpedia titles searched in news articles
• Named entities searched in DBpedia
• Speedup by using the SURFsara HPC cloud
• Using context and machine learning
At the end, cycle back to the first article and overwrite earlier enrichments with the newest algorithm.
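The overwrite cycle on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Article` record, the two toy enrichment functions and the `run_cycle` helper are hypothetical and do not represent the KB's actual pipeline. The sketch only shows the idea that each article remembers which algorithm version enriched it, so that a newer version can start again at the first article and replace older results.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article record; not the KB's real data model."""
    article_id: int
    text: str
    enrichments: list = field(default_factory=list)
    enriched_by: int = 0  # algorithm version that produced the enrichments; 0 = never enriched


def enrich_v1(text, titles):
    """Toy version 1 (assumption): exact lookup of known titles in the text."""
    return sorted(t for t in titles if t in text)


def enrich_v2(text, titles):
    """Toy version 2 (assumption): the same lookup, but case-insensitive."""
    lowered = text.lower()
    return sorted(t for t in titles if t.lower() in lowered)


def run_cycle(articles, algorithm, version, titles):
    """Start at the first article and overwrite enrichments made by older versions."""
    updated = 0
    for article in articles:
        if article.enriched_by < version:
            article.enrichments = algorithm(article.text, titles)
            article.enriched_by = version
            updated += 1
    return updated


if __name__ == "__main__":
    titles = {"Amsterdam", "Ljubljana"}
    articles = [
        Article(1, "Queen Wilhelmina visited Ljubljana"),
        Article(2, "news from AMSTERDAM"),
    ]
    run_cycle(articles, enrich_v1, 1, titles)  # first pass with the older algorithm
    run_cycle(articles, enrich_v2, 2, titles)  # later pass overwrites the earlier results
    for article in articles:
        print(article.article_id, article.enrichments, article.enriched_by)
```

Storing the version next to each enrichment keeps the cycle idempotent: running the same version twice changes nothing, while a newer version replaces every older result.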
49.
Algorithm | Accuracy | Link recall | Link precision | Link F-measure
Rule based | .76 | .76 | .65 | .70
Machine learning (SVM) | .84 | .76 | .83 | .79
Neural network | .84 | .73 | .87 | .79
Extra features, e.g. word embedding | .85 | .81 | .82 | .82
Extra Wikidata data, more training data | .87 | .81 | .86 | .84
Entity embedding | .88 | .86 | .85 | .85
From conventional entity linking to deep learning and beyond
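The F-measure column can be related to the recall and precision columns with a short sketch. It assumes the slide reports the balanced F-measure (F1, the harmonic mean of precision and recall); the slide itself does not say which variant was used, and the function name `f_measure` is chosen here for illustration. Because the table values are rounded to two decimals, a recomputed score may differ from the slide by .01.

```python
def f_measure(precision, recall, beta=1.0):
    """F-measure of precision and recall; beta=1.0 gives the balanced F1 (assumed here)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was linked correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Rule-based row of the table: link recall .76, link precision .65
    print(round(f_measure(precision=0.65, recall=0.76), 2))
    # Machine learning (SVM) row: link recall .76, link precision .83
    print(round(f_measure(precision=0.83, recall=0.76), 2))
```

For the rule-based and SVM rows this gives roughly .70 and .79, in line with the reported values.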
50. “Putting TDM in the Mainstream”, i.e. search portals for a bigger audience
“But in a sense, what we do is: Applied Digital Humanities”
“Yes! But… we’re not there yet…”
59. Questions?
Bringing Digital Humanities to the wider public:
Libraries as incubators for DH Research Results
dr. Martijn Kleppe – Head of Research Department
martijn.kleppe@kb.nl | @martijnkleppe | www.kb.nl/researchagenda
Editor's Notes
Peter Leonard
Alex Humpfrey
They went beyond classification alone
Enrichments project
Segue: search for information in the data that is not described in the dataset
https://www.youtube.com/watch?v=PldvKPTPlz4&feature=youtu.be