Linked Humanities Data:
   The Next Frontier?
 A Case-Study in Historical Census Data


            Albert Meroño-Peñuela
  Knowledge Representation & Reasoning Group
                  29-10-2012
The Dutch historical censuses
                     (1795-1971)




29-10-2012           Linked Humanities Data: The Next Frontier?   2
The Dutch historical censuses
                     (1795-1971)

• Population, Houses and Occupation censuses
• 507 Excel files
• 2,288 tables
• 33,283 annotated cells

Heterogeneity: structural




Heterogeneity: semantic
• Variable meaning
      – Plaatselijke indeling / Kom, buiten de kom + Wijk +
        Naam / Plaats
      – Variable design (age 14-18, 19-20 vs. 14-15, 16-20)
• Variable values
      – RomschKatholik, RomsKatholic, VaticanChristelijk
      – Change in municipalities, occupations
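The variable-design mismatch above (age 14-18, 19-20 vs. 14-15, 16-20) can be illustrated with a minimal sketch. It assumes each census uses contiguous, inclusive age bins over the same overall range; the function names and the counts are invented for illustration and do not come from the census data.

```python
def common_bins(bins_a, bins_b):
    """Return the finest bins that both partitions can be regrouped into.

    Each bin is an inclusive (low, high) pair. Both lists are assumed to be
    contiguous and to cover the same overall range.
    """
    shared_ends = sorted({high for _, high in bins_a} & {high for _, high in bins_b})
    start = min(low for low, _ in bins_a)
    merged = []
    for end in shared_ends:
        merged.append((start, end))
        start = end + 1
    return merged


def regroup(counts, target_bins):
    """Sum the counts of source bins that fall entirely inside a target bin."""
    totals = {target: 0 for target in target_bins}
    for (low, high), n in counts.items():
        for t_low, t_high in target_bins:
            if t_low <= low and high <= t_high:
                totals[(t_low, t_high)] += n
                break
    return totals


# Made-up counts for the two bin designs named on the slide.
census_a = {(14, 18): 120, (19, 20): 45}
census_b = {(14, 15): 50, (16, 20): 130}

shared = common_bins(list(census_a), list(census_b))
totals_a = regroup(census_a, shared)
totals_b = regroup(census_b, shared)
```

In this sketch the only boundary the two designs share is the upper end at 20, so the comparable bin is the whole 14-20 range and the finer age detail of both censuses is lost.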



(Current) Harmonization
• Manually create a (more general) translation
  table using standard classification systems (CS)
      – Map occupation literals to HISCO codes
      – Map municipality literals to AC codes
• Cons
      – Expensive
      – Detail/specificity loss
      – Process is non-repeatable
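A translation table of the kind described above can be sketched as a plain lookup. The spelling variants are the ones listed under "Variable values"; the target code "REL-RC" is a placeholder invented for this example (real harmonization would use HISCO or AC codes), and the function names are hypothetical.

```python
# Hypothetical translation table: source literal -> harmonized code.
# "REL-RC" is a placeholder, not a code from any real classification system.
TRANSLATION = {
    "RomschKatholik": "REL-RC",
    "RomsKatholic": "REL-RC",
    "VaticanChristelijk": "REL-RC",
}


def harmonize(literal, table):
    """Return (code, matched). Unmatched literals are passed through unchanged."""
    key = literal.strip()
    if key in table:
        return table[key], True
    return key, False


def harmonize_column(values, table):
    """Harmonize a column and collect the literals that still need manual mapping."""
    harmonized, unmatched = [], []
    for value in values:
        code, matched = harmonize(value, table)
        harmonized.append(code)
        if not matched:
            unmatched.append(value)
    return harmonized, unmatched


codes, todo = harmonize_column(
    ["RomschKatholik", " RomsKatholic ", "Onbekend"], TRANSLATION
)
```

The sketch also shows the specificity loss named in the cons: three distinct source spellings collapse into one code, and the original literal is only recoverable if it is stored alongside the harmonized value.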

Additional requirements
• Errors: non-destructive update of values
• Provenance: record who did what, when, why
• Datamodel: do not commit to a specific one
• Linkage: enrich the dataset by linking it to
  others (e.g. labour strikes, book publications
  in NL)
• Publication: open data for researchers
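The first two requirements, non-destructive updates and provenance, could be combined roughly as follows. This is only a sketch under assumptions: the class names, fields and the sample correction are invented here and are not the project's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Correction:
    """One proposed value, with who proposed it, why, and when."""
    value: int
    author: str
    reason: str
    timestamp: datetime


@dataclass
class Cell:
    """A census cell whose transcribed value is never overwritten."""
    original: int
    corrections: list = field(default_factory=list)

    def correct(self, value, author, reason, timestamp=None):
        # Append a correction instead of replacing the original value.
        when = timestamp or datetime.now(timezone.utc)
        self.corrections.append(Correction(value, author, reason, when))

    @property
    def current(self):
        # The latest correction wins; the original stays available.
        return self.corrections[-1].value if self.corrections else self.original


cell = Cell(original=1032)
cell.correct(1023, author="editor-1", reason="digits transposed in source table")
```

Keeping corrections as an append-only list means every earlier state of the cell can be reconstructed, at the cost of having to decide which correction is authoritative when several disagree.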


Census RDF: architecture

   • RDF Data Cube
     Vocabulary (cell data)
   • D2S Vocabulary (layout
     data)

   • Open Annotation Core
     Data Model (annotation
     data)




Census RDF: cell data
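The diagram for this slide is not in the extracted text. As a rough illustration of what cell data in the RDF Data Cube Vocabulary could look like, the sketch below writes one observation as N-Triples. Only `qb:Observation` and `qb:dataSet` are taken from the vocabulary; every IRI under `http://example.org/census/`, the dimension names and the count are assumptions made for this example.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/census/"  # hypothetical namespace, not the project's


def observation_triples(obs_id, dataset_id, dimensions, population):
    """Serialize one table cell as a qb:Observation in N-Triples."""
    subject = f"<{EX}obs/{obs_id}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{QB}Observation> .",
        f"{subject} <{QB}dataSet> <{EX}dataset/{dataset_id}> .",
    ]
    # Dimension properties and code IRIs are invented for illustration.
    for name, value in sorted(dimensions.items()):
        lines.append(f"{subject} <{EX}dimension/{name}> <{EX}code/{value}> .")
    lines.append(
        f'{subject} <{EX}measure/population> "{population}"^^<{XSD_INTEGER}> .'
    )
    return lines


triples = observation_triples(
    "c1", "t1", {"sex": "female", "ageGroup": "14-20"}, 165
)
print("\n".join(triples))
```

Each cell becomes one observation whose dimensions carry the row and column headers, which is what lets counts from differently laid-out tables be queried in a uniform way.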




Census RDF: layout data




Census RDF: annotation data




Querying the RDF’d census




Not ready-to-publish RDF
• Disconnected graphs (but 279,136 possible variable
  mappings!)
• Complex & non-homogeneous SPARQL queries
• Contradictory annotation statements
• Drifted concepts
      – Tile setter -> roof repairer
      – Shoemaker (works with leather) -> shoemaker (owns a
        company)



New challenges
• Dynamic ontologies
      – Different concept formalizations depending on the
        time frame
      – Subjective definitions (contested concepts)
• Partitions and counting
      – Cannot merge counts of non-aligned concepts
      – Infer individuals?
• Format round-tripping
      – On-demand XLS, CSV, RDF, RDB conversions with(out)
        data loss
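The idea of concept formalizations that depend on the time frame can be sketched with time-scoped definitions. The two shoemaker meanings come from the "Drifted concepts" slide and the 1795-1971 span from the census range; the cut-off between them (1899/1900) is invented purely for illustration, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """A meaning of a label that holds for an inclusive range of years."""
    valid_from: int
    valid_to: int
    meaning: str


# The split year is a made-up assumption; the slides give no dates for the drift.
CONCEPTS = {
    "shoemaker": [
        Definition(1795, 1899, "works with leather"),
        Definition(1900, 1971, "owns a company"),
    ],
}


def meaning_in(label, year, concepts=CONCEPTS):
    """Return the meaning of a label in a given census year, or None if unknown."""
    for definition in concepts.get(label, []):
        if definition.valid_from <= year <= definition.valid_to:
            return definition.meaning
    return None


def comparable(label, year_a, year_b, concepts=CONCEPTS):
    """Counts for a label may be merged only if its meaning is the same in both years."""
    meaning_a = meaning_in(label, year_a, concepts)
    meaning_b = meaning_in(label, year_b, concepts)
    return meaning_a is not None and meaning_a == meaning_b
```

Under these assumptions, `comparable("shoemaker", 1850, 1930)` is false, which is the situation the "Partitions and counting" bullet describes: the label matches but the counts should not be merged.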

Thank you!
Questions, suggestions?

     http://cedar-project.nl/
http://www.data2semantics.org/
