Ontologies and the Humanities 
Some Issues Affecting the Design of Digital 
Infrastructure 
Toby Burrows, Department of Digital Humanities
Of making many ontologies, there is no end… 
• “A joint project … which aims to develop an ontology of digital 
research methods in the arts and humanities…” 
• “Cette réflexion a nécessité la modélisation d’une ontologie de la 
transtextualité…” [“This reflection required the modelling of an ontology of 
transtextuality…”] 
• “The proposed ontology for 3D visualisation for cultural 
heritage…” 
• “Über die Modellierung einer Ontologie wissenschaftlicher 
Prozesse für den Exzellenzcluster…” [“On the modelling of an ontology of 
scholarly processes for the Cluster of Excellence…”] 
• “The model can be aligned with upper level ontologies like the 
CIDOC-CRM…” 
Digital Humanities 2014, Lausanne, July 2014
What is an ontology? 
• “The representation of entities, ideas, and events, along with 
their properties and relations, according to a system of 
categories” (Wikipedia) 
• “An ontology formally represents knowledge as a set of concepts 
within a domain, using a shared vocabulary to denote the types, 
properties and interrelationships of those concepts” (Wikipedia) 
• “The most typical kind of ontology for the Web has a taxonomy 
and a set of inference rules” (Tim Berners-Lee)
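
The Berners-Lee definition (a taxonomy plus inference rules) can be illustrated with a minimal sketch. The class names and the classification of the manuscript below are hypothetical, chosen only for illustration; the single rule shown is the familiar one that an instance of a class is also an instance of every superclass.

```python
# Hypothetical toy taxonomy: child class -> parent class.
SUBCLASS_OF = {
    "Manuscript": "Work",
    "Painting": "Work",
    "Work": "Entity",
    "Person": "Entity",
}

# Hypothetical instance data: individual -> asserted class.
INSTANCE_OF = {"Phillipps MS 16402": "Manuscript"}


def superclasses(cls):
    """Return every ancestor of cls by following subclass links upward."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def inferred_classes(individual):
    """Inference rule: an instance of a class is an instance of its superclasses."""
    asserted = INSTANCE_OF[individual]
    return [asserted] + superclasses(asserted)


print(inferred_classes("Phillipps MS 16402"))
```

Only "Manuscript" is asserted; "Work" and "Entity" are derived by the rule, which is the kind of machine reasoning the definition has in mind.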
Gill, Tony (2004) “Building semantic bridges between museums, libraries and archives: The CIDOC Conceptual Reference Model” First Monday vol. 9 no. 5
Why ontologies? 
• Computational perspective: 
– Machine-processable 
– Support automated reasoning and logic 
– Enable contextual search and browse 
– Enable software agents to identify trusted sources and provide 
service discovery 
• Humanities perspective: 
– Semantic analysis of the contents of scholarly materials 
– Categorization of scholarly materials 
– Relating different categorization schemes to each other 
– Computational reasoning – faceted searching and browsing 
Berners-Lee, Tim, James Hendler and Ora Lassila (2001) "The Semantic Web", Scientific American, May 2001, p. 29-37 
Allen, Colin (2013) “Cross-Cutting Categorization Schemes in the Digital Humanities”, Isis, Vol. 104, No. 3, pp. 573-583
Linguistic and semantic difficulties 
• Variations in terminology 
• Ambiguity of terminology 
• Historical change in language and meaning 
• Multilingualism – use of different languages 
• Interdisciplinarity – different perspectives (“cross-domain”) 
• Responses within ontologies: 
– Definitions of terms 
– Semantic context (provided by ontological structure) 
– Ontology mapping across domains 
– Ontology integration across domains 
– Ontology learning and modification
Alternative strategies? 
• Search – use ontologies to classify search results (facets) 
• Topic modeling – automatic generation of semantic categories 
and relations from text-based NLP 
• Fluid ontologies and vernacular ontologies 
• Linked Data with light categorization for reasoning 
– Vocabularies & thesauri encoded for the Semantic Web 
(SKOS) 
• “Folksonomies” or social tagging 
– Tags are applied to entities 
– There is no formal classification or categorization of concepts 
– There are no relationships between tags (other than being used to tag the 
same entity)
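
As a rough sketch of what "no relationships between tags other than co-occurrence" means in practice, the only structure that can be computed from a folksonomy is how often two tags land on the same entity. The data below is hypothetical: the Massive Attack tags are taken from the tag cloud in this talk, while "Artist B" and "Artist C" are invented placeholders.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagging data: entity -> set of free-text tags applied by users.
TAGGED = {
    "Massive Attack": {"trip-hop", "bristol", "electronic"},
    "Artist B": {"trip-hop", "bristol"},       # invented placeholder
    "Artist C": {"electronic", "idm"},         # invented placeholder
}


def cooccurrence(tagged):
    """Count how often each pair of tags is applied to the same entity."""
    counts = Counter()
    for tags in tagged.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts


pairs = cooccurrence(TAGGED)
print(pairs[("bristol", "trip-hop")])
```

Nothing here says that "trip-hop" is a kind of "electronic" music or that "bristol" is a place; any such meaning has to be inferred statistically or supplied by a reader.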
Massive Attack Tags (last.fm) 
00s 80s 90s acid jazz 
alternative alternative dance alternative rock 
ambient atmospheric beautiful 
bristol bristol sound british 
chill chill out chillout 
dance dark downbeat downtempo dub 
easy listening electro electronic electronica 
england english experimental 
favorite favorites favourite female vocalists 
hip hop hip-hop house hypnotic 
idm indie indie rock industrial instrumental 
jazz lounge male vocalists massive attack 
mellow pop psychedelic rap relax rock sexy soul 
soundtrack techno trance trip hop trip-hop triphop uk
Categorizing contemporary popular 
music 
Alternative rock 
Grunge 
Punk 
Indie 
Riot grrrl 
Alt-country 
Hard rock 
Seventies rock 
Goth-punk 
Slacker punk 
“That old weird 
America” 
Stoner rock 
Rap metal 
Nu metal 
Old-school punk 
Metal 
Hardcore 
Post-punk 
D-beat 
To explain this
Deeper issues for the humanities 
• More than just linguistic or semantic difficulties 
• Debates about “the nature of things” (ontology!) 
• Debates about “how to represent the world” 
• The nature of perception and cognition 
• Cognitive themes: 
– Similarity and dissimilarity 
– Relationships: metonymic and metaphorical, not just 
semantic or logical 
– Connections and trails 
– Seeing things holistically
“The problem of modeling representations” 
“The symbolic approach starts from the assumption that cognitive 
systems can be described as Turing machines. Cognition is seen as 
essentially being computation, involving symbol manipulation.” 
(Gärdenfors 2000:1) 
“The Semantic Web is a machine for creating syllogisms” (Shirky 
2003) 
“It is an unfortunate dogma of computer science in general, and 
the Semantic Web in particular, that all semantic contents are 
reducible to first-order logic or to set theory” (Gärdenfors 
2014:258)
Conceptual spaces 
In the current Semantic Web, the information mainly concerns 
taxonomies and inference rules. If conceptual spaces are used as a 
foundational methodology, the focus will be on describing domain 
structures. This involves, above all, specifying the geometric and 
topological structure of the domains. (Gärdenfors 2014: 261) 
The issue is this: Do meaningful thought and reason concern merely the 
manipulation of abstract symbols and their correspondence to an 
objective reality? Or do meaningful thought and reason essentially 
concern the nature of the organism doing the thinking – including the 
nature of its body, its interactions in its environment, its social character 
and so on? (Lakoff 1987: xv-xvi)
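
One way to make "specifying the geometric structure of a domain" concrete is a small sketch in which concepts are prototype points in a space of quality dimensions and categorization is a matter of distance. This is only one simplified reading of the conceptual spaces idea; the two dimensions (tempo, darkness) and all coordinates are invented for illustration.

```python
import math

# Hypothetical domain with two quality dimensions (tempo, darkness), scaled 0..1.
PROTOTYPES = {
    "ambient": (0.2, 0.4),
    "trip-hop": (0.4, 0.8),
    "dance": (0.9, 0.3),
}


def distance(a, b):
    """Euclidean distance between two points in the domain."""
    return math.dist(a, b)


def nearest_concept(point):
    """Assign a point to the concept whose prototype is closest."""
    return min(PROTOTYPES, key=lambda name: distance(point, PROTOTYPES[name]))


print(nearest_concept((0.35, 0.7)))
```

Similarity here is graded rather than logical: a point is more or less like a prototype, instead of being inside or outside a class defined by rules.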
The world as graph 
The theory of graph-theoretic structure is sufficient to account for all 
structure in thought or world. Minimally, it has the information-theoretic 
content to describe the complexity of the apparent world, it mirrors the 
“computational” difficulty we have in grasping this world, and it has the 
combinatoric texture to give a theoretically satisfying account of the 
nature of the world. That is, the world is of daunting size and complexity, 
parts of it are difficult precisely to isolate and conceive, but it is 
fundamentally made up of parts arranged in simple, graspable 
arrangements. 
This is an extremely speculative assertion that a graph – large graphs 
anyway – have the same compositional “feel” as the world; and that the 
“facts” or sentences of first-order predicate logic of logico-metaphysical 
analysis do not. (Dipert 1997:351)
HuNI’s approach – socially-linked data 
• Aggregate heterogeneous data to a simple data model 
• Keep the categorization of data entities to a minimum: six basic 
categories 
• No imported relationships between entities 
• Allow users to express the relationships they see in the data – by 
creating links between entities 
• Allow multiple relationships between the same entities (even if 
they are contradictory) 
• The user-contributed links give meaning and add value 
• Users can also create and share collections of entities
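
The approach above can be sketched as a very small data structure. This is an illustrative assumption, not HuNI's actual implementation: the class names `Entity` and `LinkStore`, the user names, and the sample entities and link labels are all hypothetical. Only the six category names come from the talk.

```python
from dataclasses import dataclass, field

# The six HuNI record categories named in the talk.
CATEGORIES = {"Person", "Organisation", "Work", "Place", "Event", "Concept"}


@dataclass(frozen=True)
class Entity:
    name: str
    category: str

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


@dataclass
class LinkStore:
    """Holds user-contributed links; nothing is checked for consistency."""
    links: list = field(default_factory=list)

    def add(self, user, source, label, target):
        self.links.append((user, source, label, target))

    def between(self, source, target):
        """Return every label users have asserted from source to target."""
        return [label for _, s, label, t in self.links if s == source and t == target]


person = Entity("Person A", "Person")   # hypothetical entity
place = Entity("Place B", "Place")      # hypothetical entity
store = LinkStore()
store.add("user1", person, "was born in", place)
store.add("user2", person, "was not born in", place)  # contradictory, still kept
print(store.between(person, place))
```

The design choice worth noticing is what is absent: there is no schema of permitted relationships and no attempt to reconcile the two contradictory links.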
HuNI Record Category: Concept, Event, Organisation, Person, Place, Work 
PERSON A natural person 
ORGANISATION A company, club, trust, gallery, political party, etc. 
WORK A cultural artefact or “man-made” thing created by someone, that has some 
existence in its own right, either physical or digital 
PLACE A real, spatial location 
EVENT An activity that occurs in space and time and may 
involve people, organisations, places, works, etc. 
CONCEPT Something whose existence is primarily mental 
http://wiki.huni.net.au/display/DS/Data+Model
Events 
• Central to humanities perspectives on the world 
• “Each entity is an event” – Bruno Latour 
• Attempts at ontological models of events: 
– Simple Event Model; LODE; The Event Ontology 
– Within larger models: CIDOC-CRM, Europeana, FRBRoo 
– Treat events as nameable entities 
• Knowledge representation of events: 
– CultureSampo (Finland) 
• Events as conceptual spaces – Peter Gärdenfors
[Diagram: the Simple Event Model (SEM). Willem Robert van Hage, Véronique Malaisé, Vrije Universiteit Amsterdam] 
Core Classes: sem:Core, sem:Event, sem:Actor, sem:Place, sem:Time 
Properties: sem:hasSubEvent, sem:hasActor, sem:hasPlace, sem:hasTime, sem:hasTimeStamp (Literal) 
(Foreign) Type System: sem:Type, sem:EventType, sem:ActorType, sem:PlaceType, sem:TimeType, sem:RoleType; sem:eventType, sem:actorType, sem:placeType, sem:timeType, sem:roleType, sem:hasSubType 
Property Constraints: sem:Constraint, sem:Role, sem:Temporary, sem:View; sem:accordingTo, sem:Authority
T. Ruotsalo, E. Hyvönen, “An event-based approach for semantic metadata interoperability” (2007)
In 1862, Sir Thomas Phillipps bought Phillipps MS 16402 in London as 
part of the Sotheby’s sale of the collection of Guglielmo Libri.
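
The sentence above can be sketched as an event in the style of the Simple Event Model, using plain subject-predicate-object tuples. This is a hedged illustration, not a validated SEM encoding: the `sem:` property names are those shown on the SEM slide, while every `ex:` identifier and the two `ex:` properties are hypothetical names introduced here.

```python
# Hypothetical identifier for the purchase event described in the sentence.
EVENT = "ex:purchase-1862"

# Each tuple is (subject, predicate, object). "ex:" names are invented.
TRIPLES = [
    (EVENT, "rdf:type", "sem:Event"),
    (EVENT, "sem:eventType", "ex:Purchase"),
    (EVENT, "sem:hasActor", "ex:ThomasPhillipps"),
    (EVENT, "sem:hasActor", "ex:Sothebys"),
    (EVENT, "sem:hasPlace", "ex:London"),
    (EVENT, "sem:hasTimeStamp", "1862"),
    (EVENT, "ex:transferred", "ex:PhillippsMS16402"),   # hypothetical property
    (EVENT, "ex:formerOwner", "ex:GuglielmoLibri"),     # hypothetical property
]


def objects(subject, predicate):
    """Return all objects asserted for a subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


print(objects(EVENT, "sem:hasActor"))
```

Treating the sale as a nameable entity lets the buyer, the auction house, the place and the date all hang off one node, which is what makes event models attractive for provenance.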
The task for DH 
• Future lines of DH research: looking beyond ontologies 
• Computational modeling of humanities thought: going beyond 
“reasoning” in the logical sense, as embedded in ontologies 
• Examine alternatives from cognitive science and philosophy 
– Conceptual spaces: the geometry of meaning 
– Cognitive models 
– The world as a graph
Dr Toby Burrows 
Marie Curie Fellow 
Department of Digital Humanities 
King’s College London 
26-29 Drury Lane 
London WC2B 5RL 
toby.burrows@kcl.ac.uk


Editor's Notes

  • #4 Ontology has a specific meaning in computer science, though, ironically, the term is often used very loosely. Two definitional sentences from the Wikipedia entry. Key words: “entities”, “categories”, “concepts”, “representation”, “properties and relations”. These provide the link to the original meaning in philosophy: “the study of what exists, what has being”. A form of knowledge representation, different from vocabularies, thesauri, taxonomies, topic maps, data models and metadata schemas. As the definitions make clear, an ontology contains a vocabulary or taxonomy PLUS relations between terms (including categories), which can be expressed as inference rules. Designed for computer reasoning, as Berners-Lee emphasizes.
  • #18 Fundamental questions about the underlying approach (symbolic, logic-based) and its limitations. Raised by sceptics like Shirky, but also by cognitive scientists like Peter Gärdenfors.
  • #19 Gärdenfors proposes a very different approach to “making the Semantic Web more semantic”, based on conceptual spaces, domains, and cognitive models: “the geometry of thought”. Drawing on earlier work by a range of people, including George Lakoff’s work on cognitive models of classification.
  • #20 Another interesting alternative proposal, from the philosophical side: replacing standard predicate logic with a graph-based approach to modelling the world and our understanding of it. DH needs to think through the implications of these fundamentally different approaches. Need to be aware of the limitations of the classic ontology framework and its assumptions. Need to try different approaches.
  • #21 I’m going to look at two projects which I’m involved in, as possible tentative examples. HuNI: explain what it is. 30 different humanities datasets, 716,000 entities. The challenge is to organize and link heterogeneous data without pre-determining the structure and relationships. Sufficient organization is required to make the data aggregate useful, but without imposing too much of a conceptual framework. We need to be able to share; we need to be able to talk about the entities.
  • #22 Definitions of the six core categories. Documented in detail on the HuNI wiki.
  • #23 Challenge: link Hugh Jackman to Switzerland. Manually created links: the diagram doesn’t show the nature of the link; that information is in a tabular form below the diagram. Can explore the graph by clicking on any of the entities. Designed for browsing and exploration, rather than reasoning or network analysis.
  • #24 Events in relation to provenance
  • #28 Gärdenfors is at pains to address this in his latest book (2014), which looks at the potential of CSML