
Knowledge Patterns SSSW2016

My tutorial on knowledge patterns design and extraction, given at SSSW2016 in Bertinoro

Published in: Science


  1. 1. Knowledge Patterns: Design and Extraction. Aldo Gangemi (1,2), joint work with Andrea Giovanni Nuzzolese (2), Valentina Presutti (2), Diego Reforgiato Recupero (2,3). (1) LIPN, Paris Nord University, CNRS UMR7030, France; (2) Semantic Technology Lab, ISTC-CNR, Rome, Italy; (3) Department of Informatics, University of Cagliari, Italy. aldo.gangemi@lipn.univ-paris13.fr, {andrea.nuzzolese,valentina.presutti,diego.reforgiato}@istc.cnr.it
  2. 2. Invariances • “The important things in the world appear as invariants […] of […] transformations” (P. Dirac, The principles of quantum mechanics, 1947) • “A property or relationship is objective when it is invariant under the appropriate transformations” (R. Nozick, Invariances, 2001) • Multiple presentations (≈ under mapping) • Multiple stages (≈ under change) • Multiple contexts / perceivers / interpreters (≈ under different reference frameworks)
  3. 3. Patterns in general • “Invariances across observed data or objects” • They exist in natural, social, cognitive, or abstract worlds • Mathematical pattern science is about symbols, i.e. non-interpreted information objects • Objects of knowledge engineering are interpreted (cognitively, and, by derivation, formally) • Mutual support/dependencies • Gibson (1966), Shepard (1987,1992): invariances in stimulus-energy pair permanent (“projectable”) properties in the environment (affordances) • E.g. colors, shapes, features of entities can constitute value-added references for behaviour
  4. 4. Knowledge as memory of (value-laden) observable (ir)regularities? Cure Healer Medication Patient
  5. 5. 1985
  6. 6. At the origins of modern ontologies: Pat Hayes’ naïve physics manifesto
  7. 7. A Translation Approach to Portable Ontology Specifications. T. R. Gruber, Knowledge Acquisition, 5(2): 199-220, 1993. 15459 citations!!!
  8. 8. CLib Attach component (Attach has (superclasses (Action)) (required-slot (object base)) (primary-slot (agent)) ) (every Attach has (object ((exactly 1 Tangible-Entity) (a Tangible-Entity))) (base ((exactly 1 Tangible-Entity) (a Tangible-Entity))) (every Attach has (preparatory-event ((:default (a Make-Contact with (object ((the object of Self))) (base ((the base of Self)))) (a Detach with (object ((the object of Self))) (base ((the base of Self)))) )))) RCC-8 Spatial Ontology RCC: a calculus for region based qualitative spatial reasoning AG Cohn, B Bennett, JM Gooday, N Gotts - GeoInformatica, 1997 A library of generic concepts for composing knowledge bases K Barker, B Porter, P Clark, 2001
  9. 9. DOLCE (S5) foundational ontology patterns Sowa’s Peirce-inspired top-level ontology Sweetening ontologies with DOLCE A Gangemi, N Guarino, C Masolo, A Oltramari, …, 2002 - Springer Knowledge Representation: Logical, Philosophical, and Computational Foundations J Sowa - Brooks/Cole, 2000
  10. 10. Evidence of knowledge patterns • In linguistic resources – Sentence forms – Sub-categorization frames – Lexico-syntactic patterns – Conceptual frames – Question patterns – (Bounded sets of) selectional preferences • In data – Data patterns – Data models (xsd, rdb) – Query types and views – Microformats – Infoboxes • In interaction – Interaction patterns – Lenses – HTML templates • In semantic resources – Competency questions – n-ary relations – OWL/RDFS classes with (locally complete?) sets of restrictions or properties – KM Component Library – Content ontology design patterns (CPs) – Knowledge patterns discovered from datasets
  11. 11. ConceptNet MIT OpenMind common sense project AtLocation(dog, kennel)[] CapableOf(dog, bark)[] CapableOf(dog, guard house)[] CapableOf(dog, pet)[] CapableOf(dog, run)[] Desires(dog, bone)[] Desires(dog, chew bone)[] Desires(dog, pet)[] Desires(dog, play)[] HasA(dog, flea)[] HasA(dog, four leg)[] HasA(dog, fur)[] HasProperty(dog, loyal)[] IsA(dog, canine)[] IsA(dog, domesticate animal)[] IsA(dog, loyal friend)[] IsA(dog, mammal)[] IsA(dog, man best friend)[] IsA(dog, pet)[] UsedFor(dog, companionship)[] ConceptNet—a practical commonsense reasoning tool-kit. Liu, Hugo, and Push Singh, BT technology journal, 2004
  12. 12. http://framenet.icsi.berkeley.edu/ Cure Healer Medication Patient FrameNet Cure frame The Berkeley Framenet project. Baker, Fillmore, Lowe, Association for Computational Linguistics, 1998.
  13. 13. VerbNet Motion verb class VerbNet: A broad-coverage, comprehensive verb lexicon. Schuler, KK, 2005
  14. 14. Knowledge patterns as expertise units • Evidence that units of expertise are larger than what we have from average linked data triples, or ontology learning • Cf. cognitive scientist Dedre Gentner: “uniform relational representation is a hallmark of expertise” • We need to create expertise-oriented boundaries unifying multiple triples – “Competency questions” are used to link ontology design patterns to requirements: • Which objects take part in a certain event? • Which tasks should be executed in order to achieve a certain goal? • What’s the function of that artifact? • What norms are applicable to a certain case? • What inflammation is active in what body part with what morphology? – Sometimes exception conditions should be added – Task-based ontology evaluation can be performed with unit tests against ontologies trying to satisfy competency questions Relational language and the development of relational mapping. Loewenstein, J, Gentner, D. Cognitive psychology, 2005
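The idea of testing an ontology against its competency questions can be sketched in a few lines. This is only a toy illustration, not the evaluation tooling referred to in the tutorial: the graph is a plain Python set of triples, and the names `GRAPH`, `hasParticipant`, `participants_of` and the individuals are all hypothetical.

```python
# Toy knowledge graph as a set of (subject, predicate, object) triples.
# Every name below is hypothetical and chosen only for illustration.
GRAPH = {
    ("visit_1", "rdf:type", "Visit"),
    ("visit_1", "hasParticipant", "doctor_1"),
    ("visit_1", "hasParticipant", "patient_1"),
    ("surgery_1", "rdf:type", "Surgery"),
    ("surgery_1", "hasParticipant", "doctor_1"),
}


def participants_of(graph, event):
    """Competency question: which objects take part in a certain event?"""
    return {o for (s, p, o) in graph if s == event and p == "hasParticipant"}


def test_participation_cq():
    """Unit test: the graph must be able to answer the competency question."""
    assert participants_of(GRAPH, "visit_1") == {"doctor_1", "patient_1"}
    assert participants_of(GRAPH, "surgery_1") == {"doctor_1"}


test_participation_cq()
```

In a real setting the function body would be a query against the ontology and its data, while the test stays the same: the competency question becomes an executable requirement.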
  15. 15. quality, patterns The role of competency questions in enterprise engineering M Grüninger, MS Fox - Benchmarking—Theory and Practice, 1995 - Springer Modelling ontology evaluation and validation A Gangemi, C Catenacci, M Ciaramita, J Lehmann - 2006 - Springer Evaluating ontological decisions with OntoClean N Guarino, C Welty - Communications of the ACM, 2002 - dl.acm.org Ontology design patterns A Gangemi, V Presutti - Handbook on ontologies 2nd ed., 2009 - Springer
  16. 16. Ontology Design Patterns An ontology design pattern is a reusable successful solution to a recurrent modeling problem Visit www.ontologydesignpatterns.org
  17. 17. Maximal ontology design requirement: What are we talking about, and why?
      Generic Competency Questions | Specific Modelling Use Case
      Who does what, when and where? | Production reports, schedules
      Which objects take part in a certain event? | Resource allocation, biochemical pathways
      What are the parts of something? | Component schemas, warehouse management
      What’s an object made of? | Drug and food composition, e.g. for safety (comp.)
      What’s the place of something? | Geographic systems, resource allocation
      What’s the time frame of something? | Dynamic knowledge bases
      What technique, method, practice is being used? | Instructions, enterprise know-how database
      Which tasks should be executed in order to achieve a certain goal? | Planning, workflow management
      Does this behaviour conform to a certain rule? | Control systems, legal reasoning services
      What’s the function of that artifact? | System description
      How is that object built? | Control systems, quality check
      What’s the design of that artifact? | Project assistants, catalogues
      How did that phenomenon happen? | Diagnostic systems, physical models
      What’s your role in that transaction? | Activity diagrams, planning, organizational models
      What is that information about? How is it realized? | Information and content modelling, computational models, subject directories
      What argumentation model are you adopting for negotiating an agreement? | Cooperation systems
      What’s the degree of confidence that you give to this axiom? | Ontology engineering tools
  18. 18. Layered pattern morphisms An ontology design pattern describes a formal expression that can be exemplified, morphed, instantiated, and expressed in order to solve a domain modelling problem
      • Logical Pattern (MBox): owl:Class:_:x rdfs:subClassOf owl:Restriction:_:y
      • Generic Content Pattern (TBox): Inflammation rdfs:subClassOf (localizedIn some BodyPart)
      • Specific Content Pattern (TBox): Colitis rdfs:subClassOf (localizedIn some Colon)
      • Data Pattern (ABox): John’s_colitis isLocalizedIn John’s_colon
      • Linguistic Pattern: “John’s colon is inflamed”, “John has got colitis”, “Colitis is the inflammation of colon”
      Relations between layers: exemplifiedAs, morphedAs, instantiatedAs, expressedAs. Diagram axes: Logic, Meaning, Reference, Expression, Abstraction.
      Aldo Gangemi, Valentina Presutti: Ontology Design Patterns. Handbook on Ontologies 2nd ed. (2009)
  19. 19. Problem example: Temporal n-ary patterns • Temporal indexing pattern – (R(a,b))+t sentence indexing • quads, external time stamps – R(a,b)+t relation indexing • reified n-ary relations (3D frames) – R(a+t,b+t) individual indexing • fluents, 4D, tropes, “context slices” (4D frames) – tR name nesting • ad hoc naming of binary relations • More indexes for additional arguments A Multi-dimensional Comparison of Ontology Design Patterns for Representing n-ary Relations. A Gangemi, V Presutti. SOFSEM 2013: 86-105 An Empirical Perspective on Representing Time. A Scheuermann, E Motta, P Mulholland, A Gangemi and V Presutti. K-CAP 2013 Formal Unifying Standards for the Representation of Spatiotemporal Knowledge. P. Hayes, Advanced Decision Architectures Alliance, 2004
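The four indexing options can be compared side by side by encoding the same time-stamped fact in each of them. The sketch below is an illustration under assumptions, not a rendering of any of the cited papers: the property names (`arg1`, `arg2`, `atTime`, `sliceOf`), the `@` naming of time slices and the example fact are all invented here.

```python
def sentence_indexing(a, r, b, t):
    """(R(a,b))+t: keep the binary triple and time-stamp the whole statement (a quad)."""
    return {(a, r, b, t)}


def relation_indexing(a, r, b, t, node="rel_1"):
    """R(a,b)+t: reify the relation as a node with one link per argument, time included."""
    return {
        (node, "rdf:type", r),
        (node, "arg1", a),
        (node, "arg2", b),
        (node, "atTime", t),
    }


def individual_indexing(a, r, b, t):
    """R(a+t,b+t): relate time slices of the individuals instead of the individuals."""
    a_t, b_t = f"{a}@{t}", f"{b}@{t}"
    return {
        (a_t, "sliceOf", a), (a_t, "atTime", t),
        (b_t, "sliceOf", b), (b_t, "atTime", t),
        (a_t, r, b_t),
    }


def name_nesting(a, r, b, t):
    """tR: fold the time into an ad hoc name for the binary relation."""
    return {(a, f"{r}_{t}", b)}


# Hypothetical fact: john worksFor acme, holding in 2016.
fact = ("john", "worksFor", "acme", "2016")
for encode in (sentence_indexing, relation_indexing, individual_indexing, name_nesting):
    print(encode.__name__, len(encode(*fact)), "statement(s)")
```

The statement counts make the trade-off visible: sentence indexing and name nesting stay compact but push time outside the triple model or into opaque names, while relation and individual indexing cost extra statements and keep time queryable as an ordinary argument.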
  20. 20. Procedural patterns • Precise – Classification – Subsumption – Inheritance – Materialization – Rule firing – Constructive query • Approximate – Fuzzy classification – Information extraction (NER, RE) – Similarity induction (e.g. alignment) – Taxonomy induction – Relevance detection – Latent semantic indexing • Thesaurus to SKOS • Relational DB to RDF • WordNet RDB to OWL • XML to RDF • FrameNet XML to RDF • Microformat to RDF • NER entities to ABox • NLP to RDF Reasoning patterns Alignment patterns Reengineering patterns
  21. 21. Anti-patterns (1/2) • Partonomies or subject classifications as subsumption hierarchies • *City subClassOf Country • City subClassOf (partOf some Country) • *City subClassOf Geography • City broader Geography (e.g. in SKOS) • Linguistic disjunction as class disjointness • Dead or alive • *Dead or Alive • Dead disjointWith Alive • Linguistic conjunction as class disjunction • Pen and paper • *Pen and Paper • Pen or Paper | Collection subClassOf (hasMember some Paper ; some Pen) A catalogue of OWL ontology antipatterns. Roussey, Corcho, Vilches-Blázquez, ACM, 2009. A user oriented owl development environment designed to implement common patterns and minimise common errors. Horridge, Rector, Drummond, Springer, 2004.
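Why the first anti-pattern matters can be seen with a very small subsumption reasoner. The sketch below is an assumption-laden toy: classes are strings, `PartOfSomeCountry` is a made-up name standing in for the restriction (partOf some Country), and only the transitive closure of rdfs:subClassOf is computed.

```python
def superclasses(cls, subclass_of):
    """Transitive closure of rdfs:subClassOf, starting from cls."""
    seen, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(individual, type_of, subclass_of):
    """Asserted types of an individual plus everything they are subsumed by."""
    types = set(type_of.get(individual, ()))
    for cls in list(types):
        types |= superclasses(cls, subclass_of)
    return types


TYPE_OF = {"Rome": {"City"}}

# Anti-pattern: the partonomy is modelled as a subsumption hierarchy.
ANTI_PATTERN = {"City": {"Country"}}
# Repair: City is subsumed by a restriction class; the name is a hypothetical stand-in.
REPAIRED = {"City": {"PartOfSomeCountry"}}

print(inferred_types("Rome", TYPE_OF, ANTI_PATTERN))  # Rome is inferred to be a Country
print(inferred_types("Rome", TYPE_OF, REPAIRED))      # Rome is only part of some country
```

With the anti-pattern every city is classified as a country; with the repaired axiom the unwanted type disappears and the part-whole information is kept.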
  22. 22. Anti-patterns (2/2) • Causality as entailment • Kaupthing bank behavior caused Iceland crisis • *KaupthingBankBehavior subClassOf IcelandCrisis • KaupthingBankBehavior isCauseOf IcelandCrisis • Expressions as instances of the class representing their meaning • *dog(word) rdf:type Dog • dog(word) expresses Dog (with punning) • Multiple domains or ranges of properties as intersection • *hasInflammation rdfs:domain Epithelium ; Endothelium • hasInflammation rdfs:domain (Epithelium or Endothelium)
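The last anti-pattern follows from how rdfs:domain is interpreted: when a property has several domain statements, each subject of that property is inferred to belong to all of them. A minimal sketch of that inference, with hypothetical data and with `EpitheliumOrEndothelium` as a made-up name for the union class:

```python
def infer_subject_types(triples, domains):
    """RDFS-style domain inference: a subject of p receives every declared domain of p."""
    inferred = {}
    for s, p, _ in triples:
        for cls in domains.get(p, ()):
            inferred.setdefault(s, set()).add(cls)
    return inferred


# Hypothetical data: one tissue sample with an inflammation.
DATA = [("tissue_1", "hasInflammation", "inflammation_1")]

# Anti-pattern: two separate domain statements are read conjunctively.
DOMAINS_ANTI = {"hasInflammation": ["Epithelium", "Endothelium"]}
# Repair: a single domain that is the union of the two classes (stand-in name).
DOMAINS_REPAIRED = {"hasInflammation": ["EpitheliumOrEndothelium"]}

print(infer_subject_types(DATA, DOMAINS_ANTI))
print(infer_subject_types(DATA, DOMAINS_REPAIRED))
```

Under the anti-pattern `tissue_1` is inferred to be both an Epithelium and an Endothelium, which is the intersection the slide warns about; with the union class it is only inferred to be one or the other.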
  23. 23. Putting the pieces together “Mine & Design” pattern induction from data cleaning data by using (foundational) patterns pattern-based knowledge extraction from text axiom induction from knowledge graphs …
  24. 24. Pattern induction from data: centrality discovery in datasets mo:Track mo:MusicArtist mo:Playlist mo:Torrent mo:ED2K tags:Tag mo:Record foaf:maker rdfs:Literal dc:title dc:date mo:image dc:description mo:track tags:taggedWithTag mo:available_as mo:available_as mo:available_as Extracting Core Knowledge from Linked Data. Presutti, Aroyo et al., COLD2011.
  25. 25. Serving DBpedia with DOLCE–More than Just Adding a Cherry on Top. Paulheim, Gangemi, ISWC, 2015.
  26. 26. Information Extraction and the SW • Historically SW mainly worked on ontology learning – unconvincing results: sparseness, core knowledge difficult to catch, etc. (cf. analyses in Coppola et al. 2009, Blomqvist, 2009) – natural language understanding known to be an AI-complete problem • The paradigm of Open Information Extraction (Etzioni, 2006) fits the lightweight and/or data-driven trend of current SW • Semantic technologies need hybridization Google Knowledge Graph IBM Watson QA, NL querying on LD, full text search jointly with queries, ... Apple Siri, Google Now, SRI startup Desti Facebook Social Graph Microsoft Cortana OIE, NELL, BabelNet, ...
  27. 27. Stochasticity does it well? • Purely stochastic approaches to NLU attempt to learn models that solve one specific problem, but how to compose the different models? How to hybridise those models with logic/knowledge- based approaches? • Cf. NCM chatbot, MetaMind neural QA
  28. 28. Google’s “Neural Conversational Model” one year ago on arXiv mixed magic and massive stupidity in this model deeply learnt from open movie scripts
  29. 29. similar vs. opposite semantics, but algorithm gives same semantic similarity
  30. 30. We need intelligent hybridisation!
  31. 31. E.g. how to do deep semantic parsing?
  32. 32. • The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914 event negation modality participants more participants quality coreference deep semantic parsing: not just annotation, but formal knowledge extraction event relation
  33. 33. Open Information Extraction pc5: NLPapps mac$ java -Xmx512m -jar reverb-latest.jar <<<"The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914." Initializing ReVerb extractor...Done. Initializing confidence function...Done. Initializing NLP tools...Done. Starting extraction. stdin 1 he arrived in Sarajevo 13 14 14 16 16 10.2200632195721161 The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th , 1914 . DT NNP NNP MD RB VB VBN TO RB VB NNP NNP IN PRP VBD IN NNP IN NNP JJ , CD . B-NP I-NP I-NP B-VP I-VP I-VP I-VP I-VP I-VP I-VP B-NP I-NP B-SBAR B-NP B-VP B-PP B-NP B-PP B-NP I-NP I-NP I-NP O he arrive in sarajevo Done with extraction. Summary: 1 extractions, 1 sentences, 0 files, 1 seconds http://ai.cs.washington.edu/projects/open-information-extraction
  34. 34. Open Knowledge Extraction • Open Knowledge Extraction (OKE) is a hybrid approach to knowledge graph production that exploits some of the assumptions of Open Information Extraction (open-domain, unsupervised), together with formal semantic reengineering of NLP output, Semantic Web and Linked Data patterns, entity linking, word-sense disambiguation linked to Linguistic Linked Data • The result of OKE is a two-layered OWL-RDF knowledge graph that (1) lifts the content of a text into entities grounded into public web identities, with formal axioms, and (2) deeply annotates the text • OKE can be used as a semantic middleware between content and knowledge management: querying, annotating, classifying, detecting, …
  35. 35. LOD and ODP design Aligned to WordNet, VerbNet, FrameNet, DOLCE+DnS, DBpedia, schema.org, BabelNet RESTful or motif-based Python query interface Earmark RDF, OWL Apache Stanbol Neo-Davidsonian, DRT- and Frame-based High EE and RE accuracy FRED integrates NER, SenseTagging, WSD, Taxonomy Induction, Relation/Event/Role Extraction NIF ALCO(D) DL language http://wit.istc.cnr.it/stlab-tools/fred
  36. 36. OKE w. FRED “The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914” type induction negation modality taxonomy induction semantic roles entity linking + configurable namespaces, Earmark text spans with semiotic relations to graph entities (denotes, hasInterpretant), NIF annotations and text segmentation events qualities tense representation second order relations role propagation predicate-argument structures coreference resolution
  37. 37. Sample translation table
      Parsing pattern | Logical pattern | Example
      Named entity | <owl:NamedIndividual_i> | :BarackObama
      Implicit discourse referent | <owl:NamedIndividual_i> rdf:type <owl:Class_j> | :doctor_1 rdf:type :Doctor
      Entity resolution (NE) | <owl:NamedIndividual_i> owl:sameAs <owl:NamedIndividual_j> | :BarackObama owl:sameAs dbr:Barack_Obama
      Entity coreference | <owl:NamedIndividual_i> owl:sameAs <owl:NamedIndividual_j> | :John owl:sameAs :doctor_1
      Term | <owl:Class_i> || <owl:ObjectProperty_j> || <owl:DatatypeProperty_k> | :Doctor
      Sense tag | <owl:NamedIndividual_i> rdf:type <owl:Class_j> | :BarackObama rdf:type dbo:Person
      Sense disambiguation | <owl:Class_i> owl:equivalentClass <owl:Class_j> | :Doctor owl:equivalentClass n30:synset-doctor-noun-1
      Compositional semantics | <owl:Class_i> rdfs:subClassOf <owl:Class_j> && <owl:NamedIndividual_i> dul:associatedWith <owl:NamedIndividual_k> | :BrassInstrument rdfs:subClassOf :Instrument . :brass_1 dul:associatedWith :instrument_1
      Extracted (binary) relationship | <owl:NamedIndividual_i> <owl:ObjectProperty> || <owl:DatatypeProperty> <owl:NamedIndividual_j> | :Cabeza :survivorOf :expedition_1
      Semantic role | <semrole_i> rdf:type (owl:ObjectProperty || owl:DatatypeProperty) | vn.role:Agent rdf:type owl:ObjectProperty
      Event | <dul:Event_i> <semrole_j> <Entity_j> . <dul:Event_i> rdf:type <Event.type> . | :visit_1 vn.role:Agent :doctor_1 . :visit_1 rdf:type :Visit
      Frame | <Event.type_i> rdfs:subClassOf dul:Event || <ff:Frame_j> | :Visit rdfs:subClassOf vn.data:Visit_36030100 || ff:Visiting
      Neo-Davidsonian situation | <boxing:Situation_i> boxing:involves <owl:NamedIndividual_j> | :situation_1 boxing:involves :John , :Dog ; boxing:hasTruthValue :False
      Quantified expression | <owl:NamedIndividual_i> quant:hasQuantifier <quant:Quantifier_j> | :doctor_1 quant:hasQuantifier quant:some
      Negation | <dul:Event_i> boxing:hasTruthValue :False | :visit_1 boxing:hasTruthValue :False
      Modality | <dul:Event_i> boxing:hasModality <boxing:Modality_j> | :visit_1 boxing:hasModality :Possible
      Regular adjectival semantics | <owl:NamedIndividual_i> dul:hasQuality <dul:Quality_j> | :John dul:hasQuality :Smart
      Alternative adjectival semantics | <owl:Class_i> dul:hasQuality <dul:Quality> . <owl:Class_i> dul:associatedWith <owl:Class_j> | :AllegedDoctor dul:hasQuality :Alleged ; dul:associatedWith :Doctor
      Disjunction of individuals | <owl:NamedIndividual_i> boxing:union <owl:NamedIndividual_j> | (no example given)
      Factual entailment | <Event_i> boxing:entails <Event_j> | (no example given)
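To make the Event, Negation and Modality rows concrete, here is a minimal sketch that assembles the corresponding triples for one event. It is not FRED's implementation: `event_triples` is a hypothetical helper, triples are plain Python tuples, and only the prefixed names that appear in the table (`vn.role:Agent`, `boxing:hasTruthValue`, `boxing:hasModality`, `:False`, `:Possible`) are reused.

```python
def event_triples(event_id, event_type, roles, negated=False, modality=None):
    """Build triples for an event following the Event, Negation and Modality rows."""
    triples = [(event_id, "rdf:type", event_type)]
    # One triple per semantic role, e.g. vn.role:Agent.
    for role, filler in roles.items():
        triples.append((event_id, role, filler))
    # Negation row: the event gets a false truth value.
    if negated:
        triples.append((event_id, "boxing:hasTruthValue", ":False"))
    # Modality row: the event is linked to a modality individual.
    if modality is not None:
        triples.append((event_id, "boxing:hasModality", modality))
    return triples


# Roughly "a doctor might not visit": a negated, possible visiting event.
for triple in event_triples(
    ":visit_1", ":Visit", {"vn.role:Agent": ":doctor_1"},
    negated=True, modality=":Possible",
):
    print(" ".join(triple), ".")
```

The point of the sketch is that negation and modality are attached to the event node rather than to the role triples, so the predicate-argument structure stays the same whether or not the sentence is negated or modalised.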
  38. 38. FRED’s OKE pipeline Semantic Web Machine Reading with FRED. Gangemi, Presutti, Reforgiato Recupero, Nuzzolese, et al. Semantic Web Journal, 2016, http://semantic-web-journal.org/system/files/swj1379.pdf
  39. 39. Semiotic driftin' with OKE: an example John Coltrane played with Miles Davis in Kind of Blue
  40. 40. Parse tree: (ROOT (S (NP (NNP John) (NNP Coltrane)) (VP (VBD played) (PP (IN with) (NP (NNP Miles) (NNP Davis))) (PP (IN in) (NP (NP (NNP Kind)) (PP (IN of) (NP (NNP Blue)))))))) • Dependencies: root(ROOT-0, played-3) nn(Coltrane-2, John-1) nsubj(played-3, Coltrane-2) nn(Davis-6, Miles-5) prep_with(played-3, Davis-6) prep_in(played-3, Kind-8) prep_of(Kind-8, Blue-10) • DRT: discourse referents x0, x1, x2, x3 with conditions named(x0,john_coltrane,per), named(x1,kind,loc), named(x2,blue,loc), named(x3,miles_davis,loc); merged (A) with the event box for e4: play(e4), Actor1(e4,x0), of(x1,x2), in(e4,x1), Actor2(e4,x3) • Slide labels: DRT, Dependencies, LLD (VerbNet), Entity Linking
  41. 41. <http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicAlbum> . <http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/CreativeWork> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> <http://www.w3.org/2002/07/owl#equivalentClass> <http://www.ontologydesignpatterns.org/ont/vn/data/Play_36030100> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Event> . <http://dbpedia.org/resource/John_Coltrane> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicGroup> . <http://dbpedia.org/resource/John_Coltrane> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/vn/abox/role/Actor1> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#John_coltrane> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/vn/abox/role/Actor2> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Miles_davis> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#locatedIn> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> . <http://dbpedia.org/resource/Miles_Davis> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> . <http://dbpedia.org/resource/Miles_Davis> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicGroup> . 
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Kind_of_Blue> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#of> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Blue> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Miles_davis> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Miles_Davis> . <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#John_coltrane> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/John_Coltrane> . select ?p where {dbpedia:Kind_of_Blue ?p dbpedia:Miles_Davis} p http://dbpedia.org/ontology/artist http://dbpedia.org/property/writer Semantic subgraph Query
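The query at the end of this slide is a single triple pattern in which only the predicate is a variable. As a rough illustration of what such a pattern asks for, the sketch below evaluates it over a handful of in-memory triples instead of the DBpedia endpoint. The `match` helper is hypothetical, the prefixes `dbr:`, `dbo:` and `dbp:` abbreviate the DBpedia resource, ontology and property namespaces, and the sample triples are limited to the query result and one genre statement shown in these slides.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None plays the role of a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


TRIPLES = [
    ("dbr:Kind_of_Blue", "dbo:artist", "dbr:Miles_Davis"),
    ("dbr:Kind_of_Blue", "dbp:writer", "dbr:Miles_Davis"),
    ("dbr:Kind_of_Blue", "dbo:genre", "dbr:Modal_jazz"),
]

# Counterpart of: select ?p where {dbpedia:Kind_of_Blue ?p dbpedia:Miles_Davis}
predicates = [p for (_, p, _) in match(TRIPLES, s="dbr:Kind_of_Blue", o="dbr:Miles_Davis")]
print(predicates)
```

Because the extracted graph links its local entities to DBpedia with owl:sameAs, the same pattern can be used to look up which relations the linked dataset already asserts between two entities mentioned in the text.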
  42. 42. <http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/2000/01/rdf-schema#comment> "Kind of Blue is a studio album by American jazz musician Miles Davis, released on August 17, 1959, by Columbia Records. Recording sessions for the album took place at Columbia's 30th Street Studio in New York City on March 2 and April 22, 1959. The sessions featured Davis's ensemble sextet, with pianist Bill Evans, drummer Jimmy Cobb, bassist Paul Chambers, and saxophonists John Coltrane and Julian Cannonball Adderley."@en . describe dbpedia:Kind_of_Blue <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/producer> <http://dbpedia.org/resource/Irving_Townsend> . <http://dbpedia.org/resource/Kind_of_Blue> <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Albums_certified_gold_by_the_British_Phonographic_Industry> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordDate> "1958-05-25+02:00"^^<http://www.w3.org/2001/XMLSchema#date> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordedIn> <http://dbpedia.org/resource/CBS_30th_Street_Studio> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordedIn> <http://dbpedia.org/resource/New_York_City> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordLabel> <http://dbpedia.org/resource/Legacy_Recordings> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/genre> <http://dbpedia.org/resource/Modal_jazz> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/Blue_in_Green> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/producer> <http://dbpedia.org/resource/Irving_Townsend> . <http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/All_Blues> . 
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/Freddie_Freeloader> . Entity Linking again OKE again
  43. 43. “CBS 30th Street Studio was an American recording studio operated by Columbia Records, and located at 207 East 30th Street, between Second and Third Avenues in Manhattan, New York City.” <http://dbpedia.org/resource/Columbia_Records> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#operateBetweenAvenueLocatedIn> <http://dbpedia.org/resource/New_York_City> . etc. … Legalo synthetic path finder
  44. 44. Complexity of FRED’s models and computational time • typically, a 100-word sentence takes less than 1 second to process, including the lag due to the Web API and, in the Web application, the loading of the diagram • the expressivity of an OKE knowledge graph dataset produced by FRED, calculated from a composite text corpus of four different textual types (approx. 19,000 axioms), is equivalent to the ALCO(D) DL language (Attributive Language with Complements, Object value restrictions and Data properties) • ALCO(D) includes atomic negation, class intersection, universal and existential restrictions, and nominals (closed world classes) • reasoning in ALCO(D) is PSpace-complete, and the language enjoys both the finite model and tree model properties (cf. http://www.cs.man.ac.uk/~ezolin/dl/ for a complexity navigator)
  45. 45. Challenges for OKE
  46. 46. Four classes of semantic problems 1. partial accuracy of specific NLP components leads to global errors 2. “in praesentia” semantics besides literal interpretation: various kinds of coercion, met* phenomena 3. “in absentia” semantics: implicatures, presuppositions, tacit knowledge, reference to the physical context 4. higher-order phenomena: emergent frames, social (and legal) norms, attitudes, argumentation, cultural frames and narratives, discourse marks, text types
  47. 47. 1. Partial accuracy of specific NLP components leads to global errors • N/V POS tagging (especially for English) David Moyes shares Manchester United fans' frustration • Complex multiword extraction Myeloid hepatosplenomegaly is an enlargement of liver and kidney due to myelofibrosis. • Coordinations are difficult Uncaria est une liane des jungles tropicales de l'Amérique du Sud et Centrale. (French: “Uncaria is a liana of the tropical jungles of South and Central America.”) Aristotle was a Greek philosopher, a student of Plato and teacher of Alexander the Great. • Citations and titles need to be treated differently Anna Karenina is also mentioned in R. L. Stine's Goosebumps series Don't Go To Sleep. • Plural coreferences are hard When Carol helps Bob and Bob helps Carol, they can accomplish any task.
  48. 48. • The path descended abruptly • The road runs along the coast for two hours • The fence zigzags from the plateau to the valley • The highway crawls through the city • The road leads us to Bordeaux • Need for “type coercion” to satisfy hidden frame • highway is actually a path that “can be crawled”, therefore the crawling frame here is descriptive of a state, not of an action • fence is actually an object whose shape “can be followed by zigzagging” • road is actually an object that “can be followed as an indication” to our destination • sometimes an inversion of roles: the path descends because it can be descended 2. “in praesentia” semantics besides literal interpretation E.g. fictive motion and coercion (Talmy, Welty, Retoré)
  49. 49. adjective semantics • Carmelo is a Sicilian surgeon • Carmelo is an arsonist • ⊨ Carmelo is a Sicilian arsonist • Carmelo is a skilful surgeon • ⊭ Carmelo is a skilful arsonist • Carmelo is the alleged surgeon • ⊭ Carmelo is the alleged arsonist • ?⊨ Carmelo is a surgeon • Carmelo is a fake surgeon • ⊭ Carmelo is a fake arsonist • ⊭ Carmelo is a surgeon
  50. 50. • How many frames? (FrameNet, VerbNet, etc. have a small coverage), roles are often partly covered, mapping between frame resources and linguistic constructions is seriously incomplete • Interaction between in praesentia (traditional machine reading) and in absentia (SW-machine reading) knowledge is complicated: how about relevance, novelty, situatedness, etc.? • Mario, please pass me the glass over there • Mario, I feel sick … how about the meal we had yesterday? • Mario, we are in last year’s situation 3. “in absentia” semantics: implicatures, presuppositions, tacit knowledge, reference to the physical world
  51. 51. • I saw the Coliseum in my tourist guide and wanted to go there • artifact vs. place • Actually relevant? • Power of ambiguity (“systematic polysemy”) • Minimal effort seems to count in human evolution of lexical knowledge, but only if we can easily reconstruct the context (or frame, relation, ...) • “The communicative function of ambiguity in language” (Piantadosi, Tily, Gibson, @PLoS): ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used. “All efficient communication systems will be ambiguous, assuming that context is informative about meaning” • Also in science: inflammation has several interrelated meanings Dot objects, co-predication (Pustejovsky, Asher)
  52. 52. 4. higher-order phenomena: emergent frames, social (and legal) norms, attitudes, argumentation, cultural frames and narratives, discourse marks, text types • Weak results for automatic extraction of discourse marks and their related semantics • Attitude/argumentation lenses over basic semantics • Bhatkal's father: I'm glad he has been arrested • I disagree with the comments of reviewer 1, but reviewer 2 should provide a stronger basis to his low rating • Text types • A cat is on the mat / A cat is a mammal • Norms • I should/feel obliged/want/obey/fear … it’s required/acceptable/convenient/proper/suggested … • Complex frames/narratives • We need tax relief vs. Taxes are investments
  53. 53. Example: Extracting opinion graphs
  54. 54. Filtering FRED’s graphs with opinions People hope that the President will be condemned by the judges Triggering event Main topic Subtopics Holder
  55. 55. Sentilo opinion ontology wit.istc.cnr.it/sentilo-release/sentilo
  56. 56. Result in triples People hope that the President will be condemned by the judges Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero. Frame-based detection of opinion holders and topics: a model and a tool. IEEE Computational Intelligence Magazine, 9(1), 2014
  57. 57. Example: Extracting adjectival qualities
  58. 58. Approximating adjective semantics • New adjective ontology derived by reasoning on top of an integrated resource including (Onto)WordNet and FrameNet-RDF • Four ontology design patterns for the four main semantics identified Adjective Semantics in Open Knowledge Extraction. A. Gangemi, A.G. Nuzzolese, V. Presutti and D. Reforgiato, Formal Ontology in Information Systems Conference (FOIS2016), IOS Press, 2016.
  59. 59. Approximating adjective semantics: patterns • Base: --> create taxonomy and intensional quality :SkilfulSurgeon rdfs:subClassOf :Surgeon . :surgeon_1 a :SkilfulSurgeon . :SkilfulSurgeon dul:hasQuality :Skilful . • Extensional (Base + individual Quality): --> create taxonomy and individual+intensional quality :CanadianSurgeon rdfs:subClassOf :Surgeon . :surgeon_1 a :CanadianSurgeon . :surgeon_1 dul:hasQuality :Canadian . :CanadianSurgeon dul:hasQuality :Canadian . • Modal: --> create association and intensional modality :surgeon_1 a :AllegedSurgeon . :AllegedSurgeon dul:associatedWith :Surgeon . :AllegedSurgeon boxing:hasModality :Alleged . • Privative: --> create association and intensional quality :surgeon_1 a :Fake_surgeon . :Fake_surgeon dul:associatedWith :Surgeon . :Fake_surgeon dul:hasQuality :Fake .
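The four patterns differ only in which triples are attached to the adjective-noun class and to the individual, so they can be summarised as one small generator. The sketch below is an illustration rather than FRED's code: `adjective_triples` is a hypothetical helper, triples are plain tuples, and class names are built uniformly in CamelCase, a simplification of the naming used on the slide.

```python
def adjective_triples(individual, adjective, noun, kind):
    """Triples for one adjective-noun pair under one of the four patterns."""
    cls = f":{adjective}{noun}"
    triples = [(individual, "a", cls)]
    if kind in ("base", "extensional"):
        # Taxonomy plus intensional quality: a skilful surgeon is a surgeon.
        triples.append((cls, "rdfs:subClassOf", f":{noun}"))
        triples.append((cls, "dul:hasQuality", f":{adjective}"))
        if kind == "extensional":
            # The quality also holds of the individual: a Canadian surgeon is Canadian.
            triples.append((individual, "dul:hasQuality", f":{adjective}"))
    elif kind == "modal":
        # Association only: an alleged surgeon is not asserted to be a surgeon.
        triples.append((cls, "dul:associatedWith", f":{noun}"))
        triples.append((cls, "boxing:hasModality", f":{adjective}"))
    elif kind == "privative":
        # Association only: a fake surgeon is not a surgeon.
        triples.append((cls, "dul:associatedWith", f":{noun}"))
        triples.append((cls, "dul:hasQuality", f":{adjective}"))
    else:
        raise ValueError(f"unknown pattern: {kind}")
    return triples


for kind, adjective in [("base", "Skilful"), ("extensional", "Canadian"),
                        ("modal", "Alleged"), ("privative", "Fake")]:
    print(kind, adjective_triples(":surgeon_1", adjective, "Surgeon", kind))
```

Only the base and extensional patterns produce an rdfs:subClassOf triple, which is what blocks the inference from "alleged surgeon" or "fake surgeon" to "surgeon".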
  60. 60. Approximating adjective semantics: example The alleged doctor failed to transplant the fake organ into the nice patient that borrowed a Canadian car
  61. 61. Passing the baton • We have seen knowledge engineering on the SW as a kind of pattern science • Reusable patterns • Procedural practices • Discoverable patterns • Pattern-based formal knowledge extraction • How can logical and statistical techniques be formally hybridised, thus leveraging the legacy of Pat Hayes and David Mumford?
  62. 62. Other useful links • FRED web application • http://wit.istc.cnr.it/stlab-tools/fred/demo • FRED API documentation • http://wit.istc.cnr.it/stlab-tools/fred/api • A FRED benchmark in N-Quads • complete with annotations • https://www.dropbox.com/s/q6b47dxmwyyseij/goldfrombenchmark.nq?dl=0 • only the semantic subgraph • https://www.dropbox.com/s/p7w8nojb2g2yf8k/goldfrombenchmark_semtriples.nq?dl=0
