Letters, Ideas and Information Technology
Using digital corpora of letters to disclose the circulation of
knowledge in the 17th century

scholarly communication @ 1650
scholarly communication @ 2050

Erik-Jan Bos, Univ. Utrecht,
    erik-jan.bos@phil.uu.nl
Charles van den Heuvel, VKS,
  charles.vandenheuvel@vks.knaw.nl
Dirk Roorda (that’s me), DANS,
   dirk.roorda@dans.knaw.nl
http://ckcc.huygens.knaw.nl/
[Figure: STEVIN linked to Nota, Beeckman, Cats, Huygens, Langeren; relation disciplines: direct - water, indirect - literature]
Corpora of
    17th century scholars
   Constantijn Huygens
   Christiaan Huygens
   Grotius
   Descartes
   Swammerdam
   Leeuwenhoek
   Barlaeus
   Spinoza

   and more?
Corpus               Number of letters  In possession?  Format        Metadata                 Normalized?
Grotius              7946               Yes             TEI           In Interp element        Yes, DBNL codes
Van Leeuwenhoek      337                Yes             TEI           In Interp element        Yes, DBNL codes
Descartes            750                Yes             XML (no TEI)  other markup             No, plain text
Barlaeus             1200               300 ready       Word          unknown                  unknown
Swammerdam           80                 Yes             Word          unknown                  unknown
Constantijn Huygens  7295               Yes             xml           Probably Interp element  DBNL codes
Christiaan Huygens   2900?              Mid-2010        probably TEI  Probably Interp element  DBNL codes
CEN - Metadata

Catalogus Epistularum Neerlandicarum
265,000 descriptions of approximately
1,000,000 letters
from 1600 to the present, of which
100,000 letters from the 17th century
Research Questions
• History of science:
  • How did knowledge circulate in the 17th-
    century Dutch Republic?
• Patterns in knowledge growth:
  • How can we visualise sets of letters that
    exhibit features of knowledge circulation?
• Re-use:
  • How can we expose the sources, annotations,
    and resulting patterns to further research?
Challenge

Traditional scholarship (East)
• interpretation
• close reading
• solving puzzles

Computational methods (West)
• dealing with patterns
• gleaned from large quantities of texts
• by automatic tools

East is east and West is west and ...
Issues to deal with

• making the sources uniformly available
  • well coded in TEI, access rights
• overcoming the language barrier
  • (17th-century varieties of French, Latin, Dutch)
• named entity recognition & concepts
  • people, places, dates, concepts, instruments
  • mixture of interpretation and algorithms
• creating useful visualisations
  • aiding exploration by historians of science
ICT in Humanities Research
• collaboratory
  • e-Laborate as starting point
• algorithmic pipelines
  • from source material to visualisation
• infrastructure
  •   archiving results
  •   re-using data
  •   developing new algorithms
  •   disseminating the methodology
collaboratory
pipelines
pipelines (current)
• language detection, using
  • "Language Identification from Text Using N-gram Based
    Cumulative Frequency Addition",
    Bashir Ahmed, Sung-Hyuk Cha, and Charles Tappert, 2004
• results
  • (chart legend: latin, dutch, french, german)
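
The idea behind the cited method can be sketched roughly as follows. This is a simplified illustration, not the project's pipeline code: each language gets a table of relative character n-gram frequencies, and a text is assigned to the language whose table yields the highest sum over the text's n-grams. The training sentences, the function names, and the choice of 2- and 3-grams are assumptions made for the example.

```python
from collections import Counter

# Toy training data (assumed for illustration; a real profile needs far more text).
TRAINING = {
    "latin": [
        "gallia est omnis divisa in partes tres",
        "quarum unam incolunt belgae aliam aquitani",
    ],
    "dutch": [
        "ik heb uw brief met groot genoegen ontvangen",
        "het water stroomt door de sluizen naar de haven",
    ],
    "french": [
        "je pense donc je suis",
        "nous avons reçu votre lettre avec beaucoup de plaisir",
    ],
}


def char_ngrams(text, n_min=2, n_max=3):
    """Yield all character n-grams of a text, padded with one space on each side."""
    padded = " " + " ".join(text.lower().split()) + " "
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]


def train_profile(samples):
    """Map each n-gram to its relative frequency within one language's samples."""
    counts = Counter()
    for sample in samples:
        counts.update(char_ngrams(sample))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def identify(text, profiles):
    """Add up the profile frequency of every n-gram in the text, per language.

    Returns the best-scoring language and the full score table.
    """
    scores = {lang: 0.0 for lang in profiles}
    for gram in char_ngrams(text):
        for lang, profile in profiles.items():
            scores[lang] += profile.get(gram, 0.0)
    return max(scores, key=scores.get), scores


PROFILES = {lang: train_profile(samples) for lang, samples in TRAINING.items()}
```

Calling `identify("uw brief over het water", PROFILES)` returns the winning language together with the per-language scores, which makes it easy to inspect how close the runner-up was.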
pipelines (current)
• spelling normalisation
  • VARD (http://www.comp.lancs.ac.uk/~barona/vard2/)
  • with help from (http://www.dicollecte.org/home.php?prj=fr)
• results
  • French: VARD works (after improvements),
    although designed for historical English
  • Dutch: still on the lookout for a combination of
    resources, tools, and dexterity
  • Latin: later
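
To make the normalisation step concrete, here is a minimal sketch of dictionary-based spelling normalisation. It is not VARD's algorithm or API: it only illustrates the general idea of mapping a historical spelling to a modern form, first through a list of known variants and then through fuzzy matching against a modern lexicon. The lexicon, the variant list, and the cutoff value are invented for the example.

```python
import difflib

# Hypothetical modern lexicon and known-variant list (assumptions for illustration).
MODERN_LEXICON = ["savoir", "être", "avoir", "lettre", "roi"]
KNOWN_VARIANTS = {"sçavoir": "savoir", "estre": "être", "roy": "roi"}


def normalise_token(token, lexicon=MODERN_LEXICON, variants=KNOWN_VARIANTS, cutoff=0.8):
    """Return a modern spelling for a token, or the token unchanged if none is found."""
    lower = token.lower()
    if lower in lexicon:
        return lower                      # already a modern form
    if lower in variants:
        return variants[lower]            # known historical variant
    # Fall back to the most similar lexicon entry above the similarity cutoff.
    close = difflib.get_close_matches(lower, lexicon, n=1, cutoff=cutoff)
    return close[0] if close else token


def normalise(text):
    """Normalise a whitespace-separated text token by token."""
    return " ".join(normalise_token(token) for token in text.split())
```

A fixed similarity cutoff is a blunt instrument: set too low it rewrites words that were already correct, set too high it misses real variants, which is one reason each language needs its own resources and tuning.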
pipelines (current)
pipelines (current)
• named entity recognition
  • known tools get 70%
  • search for optimal tools in the next stage
pipelines (insights)
• expect the most from statistical methods
• language technology may boost results
• it remains to be seen by how much
Topic-Author-Time
Source: Scott Weingart, UIA
infrastructure
the project’s legacy
• more than publications
  • curated sources, annotations, visualisations
• more than algorithms
  • a framework for analysis of historical texts
• more than a piece of historical research
  • data and (intermediate) results worthwhile to
     • linguists, computer scientists, sociologists
• more than a passive dataset
  • extensible, dynamic, interactive
preserving the results
• part of the CLARIN infrastructure
  • http://www.clarin.eu/
  • http://www.clarin.nl/
• materials in a Trusted Digital Repository
  (DANS)
  • http://easy.dans.knaw.nl/dms
working with CLARIN
• CLARIN-EU
  • Outreach to humanities: use cases
  • CKCC one of 10 selected projects
  • received expert input for choice of language
    tools
• CLARIN-NL
  • CKCC one of 10 initial projects in the Dutch
    national construction effort
  • support for applying language technology
Adapting to CLARIN
• Conforming to standards
• CLARIN standards are in evolution
  • (and will remain evolvable)
• Common MetaData Infrastructure
  • a registry of metadata components
  • defined by the community
  • with explicit semantics (http://www.isocat.org/ )
• Data in TEI (as export/import format)
Trusted Digital Repository
• materials
   •   reliable (provenance metadata)
   •   findable (CMDI metadata)
   •   referable (persistent identifiers)
   •   accessible (viewable in web browser)
   •   usable (downloadable)
• sooner or later:
   • high-performance computing
   • Memento: a time-sensitive web interface to the
     dynamic contents of the collaboratory
        (http://arxiv.org/abs/0911.1112 )
http://www.clarin.eu/node/3073

http://ckcc.huygens.knaw.nl/

2010 Digital Humanities London - Dutch Republic of Letters

Editor's Notes

  1. Slide 7. Comparison of Waterschyring with the model for scouring an inland harbour by means of pivot sluices (spilsluizen) and the drainage in Note's map. Here we clearly see similarities. However, despite the strong similarities in the figure, uncertainties in the dating of the texts that Stevin did not publish make it hard to determine whether this work could have inspired Note. In any case, as we shall see, another work by Stevin played a larger role in Note's argumentation for his invention. Moreover, that work is explicitly cited by Beeckman in support of Note's defence. It is De Beghinselen des Waterwichts of 1584.
  2. Catalogus Epistularum Neerlandicarum (CEN), or the Catalogue of letters in Dutch repositories. It is a relatively old database, already available via Telnet in the early 1990s, before the world wide web came into being. CEN is an exhaustive database of letters in the collections of five Dutch university libraries, the Royal Library, and four other important libraries. It contains more than 265,000 descriptions of approximately 1,000,000 letters, dating from 1600 until the present day (of which ca. 100,000 are from the 17th century). It supplies the following metadata: sender, recipient, place of sending, year, language, repository and shelf mark. The format in which this database will be made available to the project is to be negotiated with the owner, OCLC. Using this database will enable us to state what fraction of the total body of letters the selected letters represent. Moreover, it allows us to increase the density of the networks we are interested in, leading to unprecedented research opportunities.
  3. How did knowledge circulate in the 17th-century Dutch Republic? How were elements of knowledge picked up by the learned community? How was this new knowledge processed, disseminated, theorized and ultimately accepted, or rejected? How can we combine and structure various sets of letters of 17th-century scholars in such a way that we can analyze the circulation of knowledge in an international context and follow the development of themes of interest in space and time? How can we make this information on knowledge production accessible to interdisciplinary research in the Humanities?
  4. How can we combine and structure various sets of letters of 17th-century scholars and their correspondents in such a way that we can analyze and visualize the circulation and appropriation of knowledge in a wider international context and recognize the development of themes of interest and scholarly debates in space and time? How can we make this information on knowledge production accessible to interdisciplinary research in the Humanities? How can this information be enriched by annotation?
  5. Letters are not uniformly available. Multilingual material and spelling variation. Automated/manual linking and tagging: much interpretation is needed to resolve references to names, dates, places, ideas and concepts; annotations are heterogeneous. How can we make visualizations informative for research on the basis of the data? Qualitative: Who is corresponding/introducing? Can we distinguish circles and types of scholars? Where are they located, and where do they meet? Can we distinguish types of letters/rhetorical structures? Can we distinguish emerging themes and debates in these networks? Quantitative: Number of correspondents. Frequency and duration of correspondence. Percentage of various languages and themes.
  6. NB: mention the distinction between keyword extraction and concept extraction
  7. WMatrix: good on a per letter basis; not so handy for the whole corpus
  8. LDA is purely statistical. You can improve the input for LDA by stemming. You can improve NER by part-of-speech analysis. Concept extraction: LDA is for topic modelling. Keywords => compose topics => label them. Topic modelling => concepts.
  9. Topic modelling with Mallet and LDA (latent Dirichlet allocation), and relational topic modelling: topics linked to senders and receivers of letters. Comment on dips and peaks; worth exploring the little guys! Why are they peaking? Next step: visualise the dynamics of topics in geography (as in Buienradar, the Dutch rain-radar map).
  10. The emphasis on infrastructure: for CLARIN; also for Alfalab; for future computational humanities; for Geleerdenbrieven (scholars' letters, now also a CLARIN-NL project)
  11. http://www.clarin.eu/external/index.php?page=activities&sub=2
  12. see WP-2
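
The quantitative questions in note 5 (number of correspondents, frequency and duration of correspondence, percentage of languages) could be answered from letter metadata of the kind note 2 lists for CEN. The following is a minimal sketch only, not the project's implementation: the record layout, the function names `correspondence_stats` and `language_shares`, and the sample letters are all hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical sample records. The field names mirror the CEN metadata
# from note 2 (sender, recipient, year, language); the values are invented.
LETTERS = [
    {"sender": "Scholar A", "recipient": "Scholar B", "year": 1637, "language": "French"},
    {"sender": "Scholar B", "recipient": "Scholar A", "year": 1639, "language": "French"},
    {"sender": "Scholar A", "recipient": "Scholar C", "year": 1640, "language": "Latin"},
    {"sender": "Scholar D", "recipient": "Scholar B", "year": 1641, "language": "Latin"},
]


def correspondence_stats(letters, person):
    """Per correspondent of `person`: letter count plus first and last year."""
    stats = {}
    for letter in letters:
        # Identify the other party, whichever direction the letter travelled.
        if letter["sender"] == person:
            other = letter["recipient"]
        elif letter["recipient"] == person:
            other = letter["sender"]
        else:
            continue  # this letter does not involve `person`
        year = letter["year"]
        entry = stats.setdefault(other, {"letters": 0, "first": year, "last": year})
        entry["letters"] += 1
        entry["first"] = min(entry["first"], year)
        entry["last"] = max(entry["last"], year)
    return stats


def language_shares(letters):
    """Percentage of letters written in each language."""
    counts = Counter(letter["language"] for letter in letters)
    total = sum(counts.values())
    return {language: 100 * n / total for language, n in counts.items()}
```

The number of correspondents is then `len(correspondence_stats(LETTERS, "Scholar A"))`, and the duration of an exchange is the difference between an entry's `last` and `first` years. Real corpora would first need the name and spelling normalization that note 5 mentions, which this sketch assumes has already been done.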