Digital Humanities in a Linked Data world: Semantic Annotations
Dov Winer
NLI / EAJC (DM2E/Judaica Europeana)

http://www.makash.org.il/docs/dh_usp_2013.pdf
Outline
• Digital Humanities: Scholarly Primitives
• Examples
• Transformation of the scholarly work cycle
• Leading-edge projects and the Europeana universe
• Linked Data: the Web as a global database
Digital Humanities
Scholarly Primitives
Scholarly Primitives: what methods do
humanities researchers have in common, and
how might our tools reflect this?
John Unsworth
Humanities Computing: formal methods, experimental
practice
King’s College, London, May 13, 2000

Discovering

Annotating

Comparing

Referring

Sampling

Illustrating
Representing
Unsworth primitive | Bamboo theme of scholarly practice | OCLC Scholarly Information Activity

Discovery

Gathering / Foraging

Sampling

Synthesizing / Filtering Comparing

Referring

Contextualizing

Illustrating
Representing
Comparing

Conceptualizing,
Refining and Critiquing

Representing

Documenting methods

Discovering Referring
Representing

Managing data

Annotating

Annotating /
documenting
Modelling / visualizing

Illustrating
Representing
Representing

Suggested parenthetically
Common thread

Overlapping teaching and
research
Sharing / dissemination
/ publishing
Funding
Collaborating

Referring

Citation, credit, peer-review

Representing

Searching (direct searching, chaining, browsing, probing,
accessing)
Collecting (gathering, organizing)
Searching (chaining, browsing, probing)
Collecting (organizing)
Cross-cutting (monitoring)
Reading (scanning, assessing, rereading)
Cross-cutting (note taking, translating)
Writing (assembling)
Collaborating (consulting)
Writing (disseminating)
Cross-cutting (translating)
Searching (accessing)
Collecting (organizing)
Collaborating (coordinating, consulting)
Writing (assembling)
Cross-cutting (note taking)
Cross-cutting (translating)
Writing (assembling)
Collaborating (coordinating)
Cross-cutting (translating)
Writing (disseminating)

No analogue
Writing (co-authoring)
Collaborating (coordinating, networking, consulting)
Reading (assessing)
Writing (dissemination)
Collaborating (consulting)

OCLC: Scholarly Information Practices in the Online Environment
http://www.oclc.org/content/dam/research/publications/library/2009/2009-02.pdf?urlm=162919

Project Bamboo Scholarly Practice Report
https://wikihub.berkeley.edu/display/pbamboo/Project+Bamboo+Scholarly+Practice+Report
Scholarly Primitives
Scholarly primitives: Building institutional
infrastructure for humanities e-Science
Tobias Blanke, Mark Hedges
King’s College London, Centre for e-Research
Future Generation Computer Systems 29 (2013) 654-661

Scholarly Information Practices in the Online
Environment
Carole L. Palmer, Lauren C. Teffeau, Carrie M. Pirmann
OCLC Online Computer Library Center, Inc., 2009
http://www.oclc.org/content/dam/research/publications/library/2009/2009-02.pdf?urlm=162919
Examples
Republic of Letters network visualisation / Oxford
and Stanford
Republic of Letters networks
American Civil War Freebase Documentation
Freebase: an open linked data database service

http://www.freebase.com
Ontology based annotation for Philosophy texts

Michele Pasin – Enrico Motta
Ontological requirements for annotation and
navigation of philosophical resources
Synthese (2011) 182:235-267
A formal model for describing philosophical ideas
CIDOC-CRM event centered
Argument-entity.
Problem-area.
Problem.
Method.
View: Thesis, Theory, Philosophical-system, School of thought.
Rhetorical figure.
Concept.
Distinction.
http://www.visualdataweb.org/relfinder.php
http://relfinder.dbpedia.org/relfinder.html
Shai Ophir (2010). A New Type of Historical Knowledge. Information Society, 26: 144-150.
Transformation of the scholarly work cycle
Scholarly work cycle

From S. Gradmann and J.C. Meister, Digital document and interpretation: re-thinking “text” and scholarship in electronic settings. Poiesis & Praxis, vol. 5, no. 2 (2008)
Processing source data in the Humanities: aggregation

From Gradmann (2008)
http://www.slideshare.net/gradmans/europeana-semantica
… modeling …

From Gradmann (2008)
http://www.slideshare.net/gradmans/europeana-semantica
… and digital heuristics?

From Gradmann (2008)
http://www.slideshare.net/gradmans/europeana-semantica
Leading-Edge Projects
Scholarly services
Document Mapping; Concordance; Collocation/Cloud; Frequency; Morphological Analysis; Syntactic Analysis; Named Entity Identification; Proxied SEASR Analytics
Europeana Projects

10/25/2013
Prof. Stefan Gradmann
Prof. Christian Bizer
LOD
Linked Data: the Web as a global database
Linked Data Datasets on the Web
Over 31.7 billion RDF triples (10/2011)
Over 40 billion in February 2012

http://www.linkeddata.org
http://esw.w3.org/DataSetRDFDump
http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics

VI Encontro do CEDAP
Preservação do
Patrimônio e

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
17.10.2012
http://lod-cloud.net/
Linked Data:
structured
data on the Web
David Wood
Marsha Zaidman
Luke Ruth
with
Michael Hausenblas
Manning Publications
MEAP 2013
The following slides were taken from:
Linked Data and the Semantic Web in an Archival Context
Mark A. Matienzo (2012)
http://matienzo.org
http://www.slideshare.net/anarchivist/linked-data-and-thesemantic-web-in-the-archival-context
Usage of Linked Data: Introduction and Application Scenarios
Barry Norton (2013)
EUCLID
Education Curriculum for the usage of Linked Data
http://euclid-project.eu/
The essence of RDF: the “triple”

subject

property
value

Source: “The thirty minute guide to RDF and Linked Data”, by Ian Davis and Tom Heath
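
The subject / property / value structure above can be sketched in a few lines of Python. This is only an illustration: the `Triple` type, the `to_ntriples` helper and the example.org IRI are assumptions made for this sketch and do not come from the slides; the FOAF `name` property is used merely as a familiar predicate.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str   # IRI of the thing being described
    property: str  # IRI of the property (predicate)
    value: str     # an IRI or a literal; a plain string literal in this sketch


def to_ntriples(t: Triple, value_is_iri: bool = False) -> str:
    """Serialize one triple as a single N-Triples line."""
    if value_is_iri:
        obj = f"<{t.value}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.property}> {obj} ."


# Hypothetical subject IRI, chosen only for illustration.
triple = Triple(
    subject="http://example.org/person/alice",
    property="http://xmlns.com/foaf/0.1/name",
    value="Alice",
)
line = to_ntriples(triple)
print(line)
```

When the value is itself an IRI (`value_is_iri=True`), the triple links two resources, which is what turns isolated statements into linked data.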

Ross Singer
The Linked Library Data Cloud
LOD4LIB 2010
Source: “The thirty minute guide to RDF and Linked Data”, by Ian Davis and Tom Heath
Direct Mapping

RDB → Direct Mapping (automatic) → RDF

RDB2RDF
Direct Mapping on Table

Person
ID (pk)   NAME    AGE
1         Alice   25
2         Bob     NULL

<http://www.ex.com/Person/ID=1> <http://www.ex.com/Person#NAME> "Alice" .
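
The mapping from the Person table to triples can be sketched as follows. This is a simplified illustration in the spirit of the W3C Direct Mapping, not a conforming implementation: the `direct_map` function is hypothetical, it handles only a single-column primary key, it types integers as `xsd:integer`, and it does not escape special characters in literals or IRIs. The base IRI `http://www.ex.com/` is the one shown on the slide.

```python
from typing import Dict, List, Optional, Union

BASE = "http://www.ex.com/"  # base IRI taken from the slide's example triple
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

Row = Dict[str, Optional[Union[int, str]]]


def direct_map(table: str, pk: str, rows: List[Row]) -> List[str]:
    """Turn table rows into N-Triples lines, one subject IRI per row."""
    triples = []
    for row in rows:
        # Row IRI: <base><table>/<pk column>=<pk value>
        subject = f"<{BASE}{table}/{pk}={row[pk]}>"
        # Each row is typed with an IRI derived from the table name.
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}{table}> .")
        for column, value in row.items():
            if value is None:
                continue  # a NULL cell produces no triple
            predicate = f"<{BASE}{table}#{column}>"
            if isinstance(value, int):
                obj = f'"{value}"^^<{XSD_INTEGER}>'
            else:
                obj = f'"{value}"'
            triples.append(f"{subject} {predicate} {obj} .")
    return triples


person = [
    {"ID": 1, "NAME": "Alice", "AGE": 25},
    {"ID": 2, "NAME": "Bob", "AGE": None},
]
triples = direct_map("Person", "ID", person)
for t in triples:
    print(t)
```

Bob's NULL age simply yields no `Person#AGE` triple, so the absence of a value is represented by the absence of a statement.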
Extract – Transform – Load (ETL)

RDB

SPARQL

Dump

RDF
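
A minimal sketch of the ETL route, assuming the diagram is read as: extract rows from the relational database, transform them into an RDF dump, and load that dump into a store that is then queried. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for the RDB, a Python set stands in for a triple store, and a simple filter stands in for a SPARQL query.

```python
import sqlite3

BASE = "http://www.ex.com/"  # base IRI borrowed from the Direct Mapping example

# Extract: read rows from a relational database (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Alice", 25), (2, "Bob", None)],
)
rows = conn.execute("SELECT ID, NAME, AGE FROM Person ORDER BY ID").fetchall()
conn.close()

# Transform: turn each non-NULL cell into one N-Triples line (the "dump").
dump = []
for pk, name, age in rows:
    subject = f"<{BASE}Person/ID={pk}>"
    for column, value in (("NAME", name), ("AGE", age)):
        if value is not None:
            dump.append(f'{subject} <{BASE}Person#{column}> "{value}" .')

# Load: a real pipeline would bulk-load the dump into a triple store and
# expose it through SPARQL; a set of lines is used here as a stand-in.
store = set(dump)

# Stand-in for a query: all statements about names.
name_statements = sorted(line for line in store if f"<{BASE}Person#NAME>" in line)
print(name_statements)
```

The trade-off of this route is that the RDF copy is a snapshot: it has to be regenerated when the relational data changes.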

Music Ontology
• MusicArtist
– ArtistEvent, member_of
• SignalGroup
‘Album’ as per Release_Group
• Release
– ReleaseEvent
• Record
• Track
• Work
• Composition

http://musicontology.com/

Scale
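• MusicBrainz RDF derived via R2RML: 300M triples

    @prefix rr: <http://www.w3.org/ns/r2rml#> .
    @prefix mo: <http://purl.org/ontology/mo/> .
    # lb: is the mapping's own prefix; its IRI is not given on the slide.

    lb:artist_member a rr:TriplesMap ;
        # Logical table: one row per (artist, band) membership link.
        rr:logicalTable [ rr:sqlQuery """
            SELECT a1.gid, a2.gid AS band
            FROM artist a1
            INNER JOIN l_artist_artist ON a1.id = l_artist_artist.entity0
            INNER JOIN link ON l_artist_artist.link = link.id
            INNER JOIN link_type ON link.link_type = link_type.id
            INNER JOIN artist a2 ON l_artist_artist.entity1 = a2.id
            WHERE link_type.gid = '5be4c609-9afa-4ea0-910b-12ffb71e3821'
        """ ] ;
        # Subject: the member artist's IRI, built from its gid.
        rr:subjectMap [ rr:template "http://musicbrainz.org/artist/{gid}#_" ] ;
        # Predicate and object: mo:member_of pointing at the band's IRI.
        rr:predicateObjectMap [
            rr:predicate mo:member_of ;
            rr:objectMap [
                rr:template "http://musicbrainz.org/artist/{band}#_" ;
                rr:termType rr:IRI
            ]
        ] .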
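
What a triples map like the one above produces can be sketched in Python. This is an illustration of the template mechanism only, not an R2RML processor: the `expand` and `apply_triples_map` helpers are hypothetical, the sample row uses placeholder identifiers rather than real MusicBrainz gids, and the percent-encoding that R2RML applies to template values is omitted.

```python
import re
from typing import Dict, List

MO_MEMBER_OF = "http://purl.org/ontology/mo/member_of"
SUBJECT_TEMPLATE = "http://musicbrainz.org/artist/{gid}#_"
OBJECT_TEMPLATE = "http://musicbrainz.org/artist/{band}#_"


def expand(template: str, row: Dict[str, str]) -> str:
    """Replace each {column} placeholder with the row's value for that column."""
    return re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], template)


def apply_triples_map(rows: List[Dict[str, str]]) -> List[str]:
    """Emit one mo:member_of triple per result row of the SQL query."""
    return [
        f"<{expand(SUBJECT_TEMPLATE, row)}> <{MO_MEMBER_OF}> "
        f"<{expand(OBJECT_TEMPLATE, row)}> ."
        for row in rows
    ]


# Hypothetical query result: column names match the SELECT list (gid, band),
# but the values are placeholders invented for this sketch.
rows = [{"gid": "artist-gid-1", "band": "band-gid-1"}]
triples = apply_triples_map(rows)
print(triples[0])
```

Each row of the SQL result becomes exactly one triple, so the number of membership triples equals the number of rows the query returns.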
Thank you for your attention!
Dov Winer
dov.winer @ gmail.com

http://www.makash.org.il/docs/dh_usp_2013.pdf
