Leipzig Functional Categorisation 11/12/2013

Leipzig eHumanities Seminars: Functional Categorisation for Historical Place Types

Presentation Transcript

  • Leipzig eHumanities Seminars, 11/12/2013
    Functional Categorisation for Historical Place Types
    Giovanni Colavizza, Leibniz-Institut für Europäische Geschichte (IEG), Mainz
    Colavizza@ieg-mainz.de
  • Section 1: introduction and motivations
  • The topic
    Controlled vocabulary: “a pre-selected list of terms used for categorisation.” or “an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching.” @Patricia Harpring
    Gazetteer: “a geographical dictionary or directory used in conjunction with a map or atlas.” @Wikipedia
    Focus for this talk: controlled vocabularies of concepts, not of proper names. Specifically, historical place types.
  • Examples
    Terms, that is the relations between labels and concepts, are often left undefined. Natural language depends on context and interpretation.
    @Dalia Varanka, A topographic feature taxonomy for a US national topographic mapping ontology, 2009.
    @Excerpt from LinkedGeoData ontology.
  • Motivations
    Quantitative analysis ↦ Classification
    Classification for quantitative analysis ↦ unambiguous, consistent, shared
    Controlled vocabularies for place types at the moment:
    • grow out of necessity and are project-specific
    • have a high degree of ambiguity
    • lack explicit (formal) definitions of terms
    • are not designed for portability and reuse
  • Basic definitions - I
    Taxonomy: “a semantic network of concepts (referred to, or labeled, via a controlled vocabulary), linked by hierarchical relationships. A taxonomy is thus a limited thesaurus.”
    Thesaurus: “a semantic network of concepts (referred to, or labeled, via a controlled vocabulary), linked by equivalence, hierarchical and associative relationships.”
    Ontology: “formal and explicit specification of a shared conceptualisation.” @Studer, 1998 and Guarino, 2009
  • Basic definitions - II
    Taxonomy: contains categories organised hierarchically. Used to classify. E.g. “vehicles” ↦ “terrestrial vehicles” ↦ “car”.
    Thesaurus: contains concepts and labels for them, organised relationally. Used to index and search. E.g. “terrestrial vehicles” ↦ “car”@en (alternatives: “macchina”@it, “voiture”@fr, ...; relates_to: “car park”, “highway”, ...)
    Ontology: contains classes, properties and logical rules, and possibly instances of classes. Used to instantiate and reason. E.g. “car” is_subclass_of “vehicle”. “has_horsepower” is a property between an instance of class “car” and a positive integer. “Audi RS Q3” is_a “car”. And so on. @Thomas Francart
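The three structures can be contrasted in a minimal Python sketch. Only the vehicle example comes from the slide; the class, field and function names are illustrative assumptions, and real systems would use a standard such as SKOS or OWL rather than ad-hoc dictionaries.

```python
from dataclasses import dataclass, field
from typing import Optional

# Taxonomy: categories linked only by hierarchy (child -> parent).
taxonomy = {
    "car": "terrestrial vehicles",
    "terrestrial vehicles": "vehicles",
    "vehicles": None,
}


def ancestors(category, tree):
    """Walk up the hierarchy and return every broader category."""
    path = []
    parent = tree[category]
    while parent is not None:
        path.append(parent)
        parent = tree[parent]
    return path


# Thesaurus: a concept with labels plus equivalence and associative links.
@dataclass
class Concept:
    pref_label: str
    alt_labels: dict = field(default_factory=dict)  # language -> label
    broader: Optional[str] = None
    related: list = field(default_factory=list)


car = Concept(
    pref_label="car",
    alt_labels={"it": "macchina", "fr": "voiture"},
    broader="terrestrial vehicles",
    related=["car park", "highway"],
)

# Ontology: classes, a typed property and instances that can be reasoned over.
subclass_of = {"car": "vehicle"}
instance_of = {"Audi RS Q3": "car"}


def is_a(instance, cls):
    """True if the instance belongs to cls directly or through subclassing."""
    current = instance_of.get(instance)
    while current is not None:
        if current == cls:
            return True
        current = subclass_of.get(current)
    return False


def valid_horsepower(value):
    """has_horsepower only accepts positive integers."""
    return isinstance(value, int) and value > 0
```

The taxonomy can only answer "what is broader than this?", the thesaurus adds labels and associations for search, and the ontology supports inference such as concluding that an Audi RS Q3 is a vehicle.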
  • Getty’s TGN - I
    Getty’s Thesaurus of Geographic Names: “a database of places in context.”
    Target: professionals in the heritage sector. Always growing by design.
    Structure of a record: (figure)
  • Getty’s TGN - II
    Possible tensions from two directions:
    • The mix of physical features and administrative entities in the hierarchies, since “a geographic place is an administrative entity or a physical feature with a name”.
    • The need to account for both current and historical places, types and hierarchies.
    Good ideas:
    • Instances of administrative entities. E.g. Ancient Egypt (former nation) is the predecessor of Egypt (modern nation).
    • Instances have time spans.
  • Getty’s TGN - III
    Place type: “a term that characterises a significant aspect of the place, including its role, function, political anatomy, size, or physical characteristics.” @TGNGuidelines, section 3.6.1.1
    The place type is the foundation for the hierarchy of every TGN record, via the preferred type. Place types are organised in flat general categories (Christian types, Physical features types, etc.).
    Most place types can be assigned to three macro-areas: physical features, administrative divisions (geopolitics and internal state structure) and functions (religious, economic, social, etc.).
  • Getty’s TGN - IV
    Terminological issues. Guideline: prefer the local terminology. The USA has state and county, Italy has region and province. The Italian region is merged with region (generic administrative entity) and generic region (another, more generic entity).
    These lead to ontological issues. Place types are not themselves structured into a defined thesaurus, nor are they formally separated into different domains with specific rules to disambiguate them.
  • Section 2: proposal
  • Desiderata
    • Allow for comparison beyond a single project (data integration)
    • Interoperability and portability
    • Scalability
    • More accurate retrieval
    • Reasoning…
    • Essentially: make vocabularies more machine-actionable
    One possible solution: integrate a stricter knowledge model in the back end of controlled vocabularies, expressed via thesauri of concepts built in compliance with ontologies.
    Standards already there: ISO 25964 (data model), SKOS (ontology)
  • An example - I
    List of monasteries in France:
      Id  Name            Type           …
      1   Manlieu Abbey   tgn:monastery  …
      2   Argentan Abbey  tgn:monastery  …
      …   …               …              …
    Can we improve on the simple tag “monastery”?
  • An example - II
    Thesaurus of concepts:
      <skos:Concept rdf:about="labelling.org/function-concept-10">
        <skos:prefLabel xml:lang="en">worship</skos:prefLabel>
      </skos:Concept>
      <skos:Concept rdf:about="labelling.org/function-concept-11">
        <skos:prefLabel xml:lang="en">estate administration</skos:prefLabel>
      </skos:Concept>
    Controlled vocabulary of place types:
      <skos:Concept rdf:about="labelling.org/voc7/label-33">
        <skos:prefLabel xml:lang="en">monastery</skos:prefLabel>
        <skos:related rdf:resource="labelling.org/function-concept-10"/>
        <skos:related rdf:resource="labelling.org/function-concept-11"/>
      </skos:Concept>
    In the database:
      Id  Name            Type
      1   Manlieu Abbey   voc7/label-33:monastery
      2   Argentan Abbey  voc7/label-33:monastery
      …   …               …
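To show what the extra structure buys, here is a minimal sketch that resolves the tag “monastery” to its linked function concepts. It uses only Python's standard `xml.etree.ElementTree`. The enclosing `rdf:RDF` element and the helper names `pref_label` and `functions_of` are assumptions added for illustration; the concept identifiers and labels are those of the slide.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's three concepts, wrapped in an rdf:RDF root (an assumption)
# so that the fragment parses as one well-formed document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="labelling.org/function-concept-10">
    <skos:prefLabel xml:lang="en">worship</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="labelling.org/function-concept-11">
    <skos:prefLabel xml:lang="en">estate administration</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="labelling.org/voc7/label-33">
    <skos:prefLabel xml:lang="en">monastery</skos:prefLabel>
    <skos:related rdf:resource="labelling.org/function-concept-10"/>
    <skos:related rdf:resource="labelling.org/function-concept-11"/>
  </skos:Concept>
</rdf:RDF>"""

root = ET.fromstring(DOC)

# Index every concept by its rdf:about identifier.
concepts = {
    c.get(f"{{{RDF}}}about"): c for c in root.iter(f"{{{SKOS}}}Concept")
}


def pref_label(concept):
    """Return the text of the concept's skos:prefLabel."""
    return concept.find(f"{{{SKOS}}}prefLabel").text


def functions_of(uri):
    """Follow skos:related links and return the labels of the targets."""
    related = concepts[uri].findall(f"{{{SKOS}}}related")
    return [pref_label(concepts[r.get(f"{{{RDF}}}resource")]) for r in related]


print(pref_label(concepts["labelling.org/voc7/label-33"]),
      "->", functions_of("labelling.org/voc7/label-33"))
```

A database row tagged `voc7/label-33:monastery` can therefore be queried by function (“worship”, “estate administration”) rather than only by the literal tag.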
  • Idea - concept
    An integrated approach:
    1. develop back-end thesauri
    2. build vocabularies as needed, in natural language, associating tags with formally defined concepts (avoiding late integration)
    n-m mapping between vocabularies and ontologies. Focus on what is shared; add details to the back end.
    Pareto principle: 80% of the effects (the tags we need) come from 20% of the causes (concepts).
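The n-m mapping can be sketched as a many-to-many link table between vocabulary tags and back-end concepts. Everything in the data below is hypothetical except the tag “monastery” in vocabulary `voc7` and its two concept identifiers; the second vocabulary `voc9` and its tags are invented to show how two projects become comparable through shared concepts.

```python
from collections import defaultdict

# (vocabulary, tag) -> set of back-end concept identifiers.
# voc9 and its tags are hypothetical illustration data.
links = {
    ("voc7", "monastery"): {"function-concept-10", "function-concept-11"},
    ("voc9", "abbey"): {"function-concept-10"},
    ("voc9", "grange"): {"function-concept-11"},
}


def tags_by_concept(links):
    """Invert the mapping: concept -> every tag, in any vocabulary, using it."""
    index = defaultdict(set)
    for tag, concepts in links.items():
        for concept in concepts:
            index[concept].add(tag)
    return index


def shared_concepts(tag_a, tag_b, links):
    """Concepts two tags have in common, the basis for comparing them."""
    return links[tag_a] & links[tag_b]


index = tags_by_concept(links)
print(sorted(index["function-concept-10"]))
print(shared_concepts(("voc7", "monastery"), ("voc9", "grange"), links))
```

Because each project keeps its own natural-language tags and only the concept identifiers are shared, integration happens when a tag is created rather than afterwards.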
  • Historical place types - I
    Quite problematic:
    • The same nouns mean different things across space, time and culture
    • Generic tags for specific meanings lead to ambiguity
    • Layers of knowledge: historical agents, socio-political contexts, historians’ interpretations, etc.
    Example: “palazzo” in Medieval and Early Modern Venice.
    For contemporaries: the Doge’s palace. Other nobles’ palaces had proper names, e.g. Ca’ Foscari means House Foscari.
    For us: a category of (historical) buildings, usually former nobles’ residences, OR a more generic category of fairly large buildings.
  • Functional categorisation - I
    Historical knowledge is mostly about events and processes, which drive the production of evidence (sources).
    @Grossner, Representing Historical Knowledge in Geographic Information Systems, 2010.
  • Functional categorisation - II
    We model representations more than real objects, and we study humans: purpose and function are the main concern.
    From nouns to verbs:
    • Most vocabularies of place types/features are already loosely classified by functionality (economic activity, leisure facility, place of culture, etc.)
    • There are fewer verbs than nouns (WordNet synsets: ~82k nouns, ~14k verbs)
    • Verbs act as bridges between concepts in natural language, linked data triples, etc.
    Not the only perspective (e.g. natural features, institutions), but a starting point.
  • Example: Barber-shops in Venice
    @Filippo De Vivo, Patrizi, informatori, barbieri. Politica e comunicazione a Venezia nella prima età moderna. Milan: Feltrinelli, 2012. In English: id., Information and communication in Venice: Rethinking Early Modern Politics. Oxford: Oxford University Press, 2007.
  • Historical place types - II
    Problems:
    1. The same nouns mean different things across space, time and culture
    2. Generic tags for specific meanings lead to ambiguity
    3. Layers of knowledge: historical agents, socio-political contexts, historians’ interpretations, etc.
    Expected outcomes:
    1. We can add specifications of space, time and culture to the concepts defining a term
    2. Generic tags can be linked to specific concepts
    3. The process of linking vocabulary terms to concepts helps historians clarify their reasoning and the layer of knowledge they are representing
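A minimal sketch of the first two outcomes, using the “palazzo” example from earlier in the talk: each sense of a term carries its own place, period and perspective, and points to specific concepts. The `Sense` class, its field names, the period strings and the concept labels are all assumptions chosen for illustration, not the talk's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning of a vocabulary term, qualified by its context."""
    term: str
    place: str
    period: str
    perspective: str  # whose usage this is: historical agents or historians
    concepts: tuple   # hypothetical concept labels


senses = [
    Sense("palazzo", "Venice", "Medieval and Early Modern",
          "contemporaries", ("doge's palace",)),
    Sense("palazzo", "Venice", "present day",
          "historians", ("former noble residence",)),
    Sense("palazzo", "Venice", "present day",
          "generic", ("large building",)),
]


def resolve(term, senses, **context):
    """Return the senses of a term that match every given context field."""
    return [
        s for s in senses
        if s.term == term
        and all(getattr(s, key) == value for key, value in context.items())
    ]


# Without context the tag is ambiguous; with context it narrows down.
print(len(resolve("palazzo", senses)))
print(resolve("palazzo", senses, perspective="contemporaries")[0].concepts)
```

Recording the perspective explicitly is one way to make visible which layer of knowledge, the historical agents' or the historian's, a given tag represents.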
  • Historical place types - III
    Solving the “palazzo” problem: (figure)
  • What is a place - conceptual model I
  • What is a place - conceptual model II
  • Section 3: implementation
  • How?
    • Build thesauri of functions with a bottom-up approach, starting from sources
    • Build vocabularies when needed, reusing existing ones if possible
    • Develop software to integrate such thesauri with the creation and re-use of controlled vocabularies
    • Raise and foster a community of interest and work together
    Let’s break down each part…
  • Building thesauri of functions - I
    Good starting point: Getty’s AAT function facets: http://www.getty.edu/vow/AATHierarchy?find=&logic=AND&note=&subjectid=300054593
    They provide a general framework, i.e. functional domains and upper layers: economics, government, social, education, etc.
    Small teams of historians and ontologists start from sources and make explicit part of the knowledge entailed in them. A process of abstraction from detail and of generalisation.
    Let’s see an example…
  • Building thesauri of functions - II
  • Building thesauri of functions - III
    Giovanni Bartolomeo da Gabiano, bookseller, publisher and entrepreneur in Venice.
    Business letters from which we can infer the activities going on at his shop at Rialto.
    “Data in mane de Messer Ioanne Bertolamie a la libraria da la Fontana in Venecia”
    “Given into the hands of Mr Giovanni Bartolomeo, at the bookshop at the Fountain [insigna] in Venice”
  • Building thesauri of functions - IV
    This letter mentions new editions being made (apparently the market was good for medical treatises: Avicenna, Aliabate, etc.) and engravings ordered for them.
    Various activities, today usually considered separate:
    • book-selling, accounting, warehousing, etc.
    • publishing and sometimes printing
    • patronage and other social activities
    • …
  • Building vocabularies - I
    Similar to building thesauri of functions, but without supervision.
    It is essential to:
    • allow the same tags we currently find in controlled vocabularies to be used, hence natural language and possibly no definitions
    • allow for intuitive linkage with thesauri, and suggest vocabulary tags already built and close in meaning
    • design for continuous growth: term merge or split, sub-categories, …
  • Building vocabularies - II
    An example: Venetian fiscal declarations, 1514.
    Rented “flats” (lit. small houses: ‘chaseta’) for residence: ‘flat’ (in vocabulary)
    Possibly interesting functions according to the source: ‘renting’, ‘lodging’/‘dwelling’ under ‘economic functions’.
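One way to “suggest vocabulary tags already built and close in meaning” is to compare the sets of functions that tags are linked to. The sketch below uses Jaccard overlap for this; that measure, the threshold value and the tags ‘inn’ and ‘warehouse’ with their functions are assumptions for illustration, and only ‘flat’ with ‘renting’ and ‘lodging’ comes from the example above.

```python
def jaccard(a, b):
    """Share of functions two tags have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(new_functions, vocabulary, threshold=0.3):
    """Existing tags whose functions overlap enough, best match first."""
    scored = [
        (jaccard(new_functions, functions), tag)
        for tag, functions in vocabulary.items()
    ]
    return [tag for score, tag in sorted(scored, reverse=True)
            if score >= threshold]


# Tag -> linked functions. 'inn' and 'warehouse' are hypothetical entries.
vocabulary = {
    "flat": {"renting", "lodging"},
    "inn": {"lodging", "catering"},
    "warehouse": {"warehousing"},
}

# A historian about to create a tag for rented dwellings is shown
# the existing tags with similar functions before adding a new one.
print(suggest({"renting", "lodging"}, vocabulary))
```

Surfacing near matches at creation time supports reuse and gives a natural point at which to decide whether a term should be merged, split or added as a sub-category.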
  • Building vocabularies - III
    @Luzzatto, Sergio, Pedullà, Gabriele (editors), Atlante della letteratura italiana, vol. 1, Torino, Einaudi, 2010.
  • The software: Labelling system - I
    Design requirements:
    • Building thesauri of concepts in the back end.
    • Building controlled vocabularies, for users.
    • Querying the system for such contents (for every agent, openly).
    • Administering and linking all these tasks and users in a single system.
    • Providing transparent management of the most used data formats.
    • Reusing open source solutions whenever possible.
    • Being very intuitive and easy to use.
    Waiting for a possible grant on this…
  • The software: Labelling system - II
  • The software: Labelling system - III
    http://www.vocabularyserver.com/
  • The software: Labelling system - IV
    Implementation is key:
    • we are struggling to get several people from different backgrounds to work with standards such as SKOS and RDF
    • we need a common entry point, as transparent as possible
    • we need to keep vocabulary building and thesauri of functions concretely distinct
  • The community - I
    Work on integration and alignment requires a community of interest and a long time: growth is slow.
    Experts’ workshop on Controlled Vocabularies, Mainz 10-11/10/2013:
    • gathered experts from different fields (history, IT, geography, …)
    • discussed place types and the functional categorisation extensively
    • established a working group to start the process
    As of today:
    • circa 30 experts
    • wiki space and mailing list within DARIAH-DE
  • The community - II
    Space on the DARIAH-DE wiki. Already populated with references, vocabularies and first alignment projects, plus the RDF (with SKOS) version of the Getty’s AAT function facets.
    Send me an e-mail to join us :)
  • Summary
    1. Motivation: controlled vocabularies are ambiguous and lack definitions
    2. Object: historical place types
    3. Proposal: use functional categorisation to overcome limitations
    4. Implementation: community of interest, reuse of standards, ad-hoc software, bottom-up source-based approach
  • Future directions
    Short term priority: development of the Labelling system
    Long term:
    • engage with more researchers and projects
    • test the method in different settings
    • steadily grow the vocabulary base
    • integrate existing vocabularies in the system
  • Thanks!
    Giovanni Colavizza, Leibniz-Institut für Europäische Geschichte (IEG), Mainz
    Colavizza@ieg-mainz.de