
Knowledge graphs

STI Innsbruck
Jul. 8, 2020




  1. Knowledge Graphs. Dieter Fensel, with the help of Umutcan Şimşek, Kevin Angele, Elwin Huaman, Elias Kärle, Oleksandra Panasiuk, Ioan Toma, Jürgen Umbrich, and Alexander Wahler. STI Innsbruck, University of Innsbruck, Austria. June 29, 2020
  2. Knowledge Graphs: 1. Motivation 2. KG Methodology 3. Knowledge Generation 4. Knowledge Hosting 5. Knowledge Curation (assessment, cleaning, and enrichment) 6. Knowledge Deployment 7. The Proof of the Pudding Is in the Eating
  3. More information • Dieter Fensel, Umutcan Şimşek, Kevin Angele, Elwin Huaman, Elias Kärle, Oleksandra Panasiuk, Ioan Toma, Jürgen Umbrich, and Alexander Wahler: Knowledge Graphs - Methodology, Tools, and Selected Use Cases, Springer, 2020. • MindLab project: mindlab.ai • https://www.slideshare.net/STI-Innsbruck/building-a-knowledge-graph-from-schemaorg-annotations-236256670 • https://www.slideshare.net/STI-Innsbruck/how-to-build-a-knowledge-graph-236256713
  4. 1. Motivation: Evolving Technologies for eMarketing and eCommerce • The Web (Search)
  6. 1. Motivation: Evolving Technologies for eMarketing and eCommerce • The Web (Search) • Semantic Web (Query Answering)
  8. 1. Motivation: Evolving Technologies for eMarketing and eCommerce • The Web (Search) • Semantic Web (Query Answering) • Knowledge Graph (Goal and Service Oriented Dialogue)
  10. 1. Motivation • The quality of Intelligent Assistants depends directly on the quality of the Knowledge Graph. • Problem: “Garbage in, garbage out.” • Requirements for the Knowledge Graph: • well structured (using an ontology: schema.org) • accurate information (correctness) • large and detailed coverage (completeness) • timeliness of knowledge ==> Method- and tool-supported Knowledge Graph lifecycle
  11. 2. KG Methodology: Process Model • Knowledge Creation • Knowledge Hosting • Knowledge Curation (Knowledge Assessment, Knowledge Cleaning, Knowledge Enrichment) • Knowledge Deployment
  12. 2. KG Methodology: Task Model. Knowledge Graph Maintenance: • Knowledge Creation (Edit, Semi-automatic, Automatic, Mapping) • Knowledge Hosting • Knowledge Curation: Knowledge Assessment (Evaluation, Correctness, Completeness); Knowledge Cleaning (Error Detection, Error Correction); Knowledge Enrichment (Knowledge Source detection, Knowledge Source integration, Duplicate detection, Property-Value-Statements correction) • Knowledge Deployment
  13. 3. Knowledge Generation. As of today, the schema.org vocabulary consists of 829 Types, 1351 Properties, and 339 Enumeration values.
  14. 3. Knowledge Generation [figure only: Knowledge Creation with its tasks Edit, Semi-automatic, Automatic, and Mapping]
  15. 3. Knowledge Generation • We define domain-specific extensions that also restrict the genericity of schema.org as a whole. • Domain Specifications: • restrict the genericity and • extend the domain specificity of schema.org; • are based on SHACL. • https://schema-tourism.sti2.org/ • We use value restrictions not as an inference mechanism but as integrity constraints.
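The difference between reading a value restriction as an inference rule and reading it as an integrity constraint can be made concrete with a small sketch. The data, the property name `address`, and both helper functions are hypothetical and only illustrate the distinction; they are not the SHACL-based Domain Specification machinery itself.

```python
# Toy data; every identifier below is a hypothetical example.
RANGE_OF = {"address": "PostalAddress"}
TYPES = {"hotel1": {"Hotel"}, "addr1": {"PostalAddress"}, "bob": {"Person"}}
FACTS = [("hotel1", "address", "addr1"), ("hotel1", "address", "bob")]


def infer_types(facts, types, range_of):
    """Inference reading: every value of p is concluded to be in range(p)."""
    inferred = {inst: set(ts) for inst, ts in types.items()}
    for _subject, prop, value in facts:
        if prop in range_of:
            inferred.setdefault(value, set()).add(range_of[prop])
    return inferred


def find_violations(facts, types, range_of):
    """Constraint reading: a value outside range(p) is reported as an error."""
    return [
        (s, p, o)
        for s, p, o in facts
        if p in range_of and range_of[p] not in types.get(o, set())
    ]
```

Under the inference reading, `bob` silently becomes a `PostalAddress`; under the constraint reading, the second fact is flagged, which is the behaviour the slide asks for.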
  16. 3. Knowledge Generation. Vertical extensions of schema.org: the DACH-KG working group • develops a de facto standard for the semantic annotation of touristic content, data, and services in the DACH area, • based on schema.org and its adaptation by domain specifications. • It should become the backbone of an open 5-star Knowledge Graph for touristic data in DACH. • Star scheme: one star if the data are provided under an open license; two stars if the data are available as structured data; three stars if the data are also available in a non-proprietary format; four stars if URIs are used so that the data can be referenced; five stars if the data set is linked to other data sets that can provide context. • It should go online in 2021. https://www.tourismuszukunft.de/2019/05/dach-kg-neue-ergebnisse-naechste-schritte-beim-thema-open-data/
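The star scheme can be read as a cumulative checklist. The following sketch assumes that each star requires all previous ones (the slide's wording "also" suggests this, but it is an assumption); the function and parameter names are hypothetical.

```python
def open_data_stars(open_license, structured, non_proprietary, uses_uris, linked):
    """Count stars cumulatively: stop at the first criterion that is not met."""
    criteria = [open_license, structured, non_proprietary, uses_uris, linked]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars
```

For example, a data set that is openly licensed and structured but published only in a proprietary format gets two stars under this reading, even if it uses URIs.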
  17. 3. Knowledge Generation. Our methodology: • the bottom-up part, which describes the steps of the initial annotation process; • the domain specification modeling; and • the top-down part, which applies the constructed models.
  18. 3. Knowledge Generation. Semantify.it [1]: a platform for creating, hosting, validating, verifying, and publishing schema.org annotated data • annotation of static data based on schema.org templates (Domain Specifications [2]) • annotation of different schemata and dynamic data based on RML [3] mappings (RocketRML [4]). [1] https://semantify.it [2] http://ds.sti2.org [3] https://rml.io [4] https://github.com/semantifyit/RocketRML
  19. 3. Knowledge Generation: Manual Annotation Editor
  20. 3. Knowledge Generation • Semi-automatic • The Annotation Editor suggests mappings and extracted information, • e.g. it extracts information from web pages (by HTML tags). • Partial NLU is used to find similarities between the content and the schema.org vocabulary. • Manual adaptations are needed to define and evaluate the mappings. • This is an instance of the general problem of wrapper generation.
  21. 3. Knowledge Generation • Mapping (more than 95% of the story) • integrates large and fast-changing data sets • maps different formats to the ontology used in our Knowledge Graph • Various frameworks: XLWrap, Mapping Master (M2), a generic XMLtoRDF tool that provides a mapping document (an XML document) linking an XML Schema to an OWL ontology, Tripliser, GRDDL, R2RML, RML, ... • We developed an efficient mapping engine for the RDF Mapping Language (RML), called RocketRML. It is a rule-based engine that efficiently processes RML mappings and creates RDF data. • The semantify.it platform features a wrapper API where these mappings can be stored and applied to the corresponding data.
  22. 3. Knowledge Generation. RML [Dimou et al., 2014]: • RML is easier to learn than a programming language • Easy sharing • Mappings can be visualized • Mapping files can be faster to write than code • Mappings are easy to change • RocketRML precompiles joins to improve performance by several orders of magnitude. Tools shown: RML, YARRRML, Matey
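The idea of a declarative, rule-based mapping can be sketched in a few lines. This is not RML syntax and not the RocketRML API; the dictionary layout, the `example.org` URIs, and `apply_mapping` are hypothetical stand-ins that only show why a mapping document is easier to share and change than hand-written conversion code.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A hypothetical declarative mapping: subject template, type, and
# predicate -> source field. Changing the mapping requires no code change.
HOTEL_MAPPING = {
    "subject": "http://example.org/hotel/{id}",
    "type": "http://schema.org/Hotel",
    "properties": {
        "http://schema.org/name": "title",
        "http://schema.org/telephone": "phone",
    },
}


def apply_mapping(mapping, records):
    """Turn source records (dicts) into (subject, predicate, object) triples."""
    triples = []
    for record in records:
        subject = mapping["subject"].format(**record)
        triples.append((subject, RDF_TYPE, mapping["type"]))
        for predicate, field in mapping["properties"].items():
            value = record.get(field)
            if value is not None:  # skip missing source values
                triples.append((subject, predicate, value))
    return triples
```

Calling `apply_mapping(HOTEL_MAPPING, [{"id": 7, "title": "Alpenhof", "phone": None}])` yields a type triple and a name triple for `http://example.org/hotel/7`; the missing phone number produces no triple.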
  23. 3. Knowledge Generation. Automatic extraction of knowledge from text representations and web pages • Tasks: • named entity recognition, • concept mining, text mining, • relation detection, ... • Methods: • Information Extraction • Natural Language Processing (NLP) • Machine Learning (ML) • Systems: • GATE (text analysis and language processing) • OpenNLP (supports the most common NLP tasks) • RapidMiner (data preparation, machine learning, deep learning, text mining, predictive analysis) • Ontotext / Sirma
  24. 3. Knowledge Generation. Evaluation of semantic annotations: • The semantify.it validator is a web tool for evaluating schema.org annotations that are scraped from websites. • Verification: the annotations are checked against plain schema.org and against domain specifications. • Validation: the annotations are checked for whether they accurately describe the content of the website. • https://semantify.it/evaluate
  25. 3. Knowledge Generation: Evaluation = Validation & Verification
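The verification half of this evaluation can be sketched as two checks: one against the vocabulary and one against a domain specification. The vocabulary excerpt, the required-property rule, and `verify` are hypothetical simplifications; the actual semantify.it validator and SHACL-based domain specifications are considerably richer.

```python
# Tiny hypothetical excerpt of the vocabulary: type -> allowed properties.
SCHEMA_PROPERTIES = {"Hotel": {"name", "address", "telephone", "starRating"}}
# Hypothetical domain specification: type -> required properties.
DOMAIN_SPEC = {"Hotel": {"required": {"name", "address"}}}


def verify(annotation):
    """Check a flat JSON-LD-like annotation against vocabulary and domain spec."""
    type_name = annotation.get("@type")
    if type_name not in SCHEMA_PROPERTIES:
        return [f"unknown type: {type_name}"]
    used = {key for key in annotation if not key.startswith("@")}
    errors = []
    for prop in sorted(used - SCHEMA_PROPERTIES[type_name]):
        errors.append(f"unknown property for {type_name}: {prop}")
    required = DOMAIN_SPEC.get(type_name, {}).get("required", set())
    for prop in sorted(required - used):
        errors.append(f"missing required property: {prop}")
    return errors
```

Validation, by contrast, would compare the annotation values with the content of the web page itself, which cannot be done from the annotation alone.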
  26. 3. Knowledge Generation • Annotation of dynamic and active data with WASA (earlier called WSMO). • Dynamic: actions to obtain dynamic data (e.g. a weather forecast). • Active: actions that can be taken on entities in a Knowledge Graph (e.g. a room offering of a hotel can have a BuyAction attached to it). • An action is an instance of the schema.org/Action type. • Describe the invocation mechanism (e.g. endpoint, HTTP method, encoding type). • Describe input and output parameters with SHACL (another implementation of domain specifications). • Grounding and lifting for existing Web APIs.
  27. 3. Knowledge Generation: A new version will soon be published at http://actions.semantify.it
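An active annotation of the kind described above might look as follows when built as a JSON-LD document. The types and properties (`Offer`, `BuyAction`, `EntryPoint`, `potentialAction`, `target`, `urlTemplate`, `httpMethod`, `encodingType`, `contentType`) are schema.org terms; the offer name and the `example.org` endpoint are hypothetical, and the SHACL descriptions of input and output parameters are omitted from this sketch.

```python
import json


def build_offer_with_action():
    """Return a schema.org Offer with an attached BuyAction as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Offer",
        "name": "Double room",  # hypothetical example offer
        "potentialAction": {
            "@type": "BuyAction",
            # The invocation mechanism: endpoint, HTTP method, encoding type.
            "target": {
                "@type": "EntryPoint",
                "urlTemplate": "https://example.org/api/book",  # hypothetical
                "httpMethod": "POST",
                "encodingType": "application/ld+json",
                "contentType": "application/ld+json",
            },
        },
    }


def serialize(document):
    """Serialize the annotation for embedding in a web page or an API response."""
    return json.dumps(document, indent=2)
```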
  28. 4. Knowledge Hosting. Hosting options: • annotation tool (e.g. semantify.it) • document store (e.g. MongoDB) • graph database (e.g. GraphDB) • ... Hosted content: Semantic Web annotations and Knowledge Graphs
  29. 4. Knowledge Hosting • Semantically annotated data can be serialized to JSON-LD • Storage in the document store MongoDB: • native JSON storage • well integrated into current state-of-the-art software with NodeJS • performant search through indexing • allows efficient publication of annotations on web pages • not hardware intensive • Drawback: no native RDF querying with SPARQL
  30. 4. Knowledge Hosting • Native storage of semantically annotated data • RDF store: GraphDB • very powerful CRUD operations • named graphs for versioning • full implementation of SPARQL • powerful reasoning over big data sets • Drawbacks: no web frameworks available; very hardware intensive
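The trade-off between the two hosting options can be illustrated without either product. In the sketch below, JSON-LD-like documents are flattened into triples, and a pattern-matching helper answers a question that spans two documents, which is the kind of graph query a plain document lookup does not provide. All identifiers (`ex:hotel1`, `to_triples`, `match`, `hotels_in_cities`) are hypothetical; this is neither MongoDB's nor GraphDB's API.

```python
DOCS = [
    {"@id": "ex:hotel1", "@type": "Hotel", "name": "Alpenhof",
     "containedInPlace": {"@id": "ex:innsbruck"}},
    {"@id": "ex:innsbruck", "@type": "City", "name": "Innsbruck"},
]


def to_triples(doc):
    """Flatten one flat JSON-LD-like document into (s, p, o) triples."""
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key in ("@id", "@context"):
            continue
        predicate = "rdf:type" if key == "@type" else key
        if isinstance(value, dict) and "@id" in value:
            value = value["@id"]  # a reference to another node
        triples.append((subject, predicate, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def hotels_in_cities(triples):
    """A two-pattern join: things contained in something typed as City."""
    cities = {t[0] for t in match(triples, p="rdf:type", o="City")}
    return sorted(s for s, _p, o in match(triples, p="containedInPlace") if o in cities)
```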
  31. 5. Knowledge Curation • We defined a simple KR formalism that formalizes the essentials of schema.org. • Tbox: isA statements between types, and domain and range definitions for properties (used globally or locally). • Abox: isElementOf(i,t) statements, property-value statements p(i1,i2), and sameAs(i1,i2) statements. • This enables a formal definition of the knowledge curation task (assessment, cleaning, and enrichment).
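One possible in-memory representation of this formalism is sketched below. The class and field names are hypothetical, and modelling local ranges as a mapping keyed by (property, domain type) is an assumption about how "locally" defined ranges could be stored.

```python
from dataclasses import dataclass, field


@dataclass
class TBox:
    is_a: dict = field(default_factory=dict)     # type -> set of direct supertypes
    domains: dict = field(default_factory=dict)  # property -> set of domain types
    ranges: dict = field(default_factory=dict)   # (property, domain type) -> set of range types

    def supertypes(self, type_name):
        """Return type_name together with all its transitive supertypes."""
        seen, stack = set(), [type_name]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.is_a.get(current, ()))
        return seen


@dataclass
class ABox:
    element_of: set = field(default_factory=set)       # (i, t) for isElementOf(i, t)
    property_values: set = field(default_factory=set)  # (p, i1, i2) for p(i1, i2)
    same_as: set = field(default_factory=set)          # (i1, i2) for sameAs(i1, i2)


def types_of(tbox, abox, instance):
    """All types of an instance, following isA statements upwards."""
    result = set()
    for inst, type_name in abox.element_of:
        if inst == instance:
            result |= tbox.supertypes(type_name)
    return result
```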
  32. 5.1 Knowledge Assessment • Knowledge Assessment describes and defines the process of assessing the quality of a Knowledge Graph. • The goal is to measure the usefulness of a Knowledge Graph. • Evaluation: • the overall process to determine the quality of a Knowledge Graph. • Select quality dimensions, metrics, evaluation functions, and weights for metrics and dimensions. • Evaluate representative subsets accordingly.
  33. 5.1 Knowledge Assessment • Correctness: • identify the amount of wrong assertions. • Completeness: • identify missing assertion sets. • Further dimensions: accessibility, accuracy, appropriate amount, believability, completeness, concise representation, consistent representation, cost-effectiveness, ease of manipulation, ease of operation, ease of understanding, flexibility, free-of-error, interpretability, objectivity, relevancy, reputation, security, timeliness, traceability, understandability, value-added, and variety.
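Combining dimensions, metrics, and weights into one quality score can be sketched as a two-level weighted average. The aggregation function, the example metric names, and the weights are assumptions for illustration; the slides do not prescribe a specific formula.

```python
def weighted_score(dimensions):
    """Aggregate metric values in [0, 1] into a single score in [0, 1].

    dimensions: {dimension: (dimension_weight, {metric: (metric_weight, value)})}
    """
    total_dimension_weight = sum(weight for weight, _ in dimensions.values())
    score = 0.0
    for dimension_weight, metrics in dimensions.values():
        metric_weight_sum = sum(weight for weight, _ in metrics.values())
        dimension_score = sum(w * v for w, v in metrics.values()) / metric_weight_sum
        score += dimension_weight * dimension_score
    return score / total_dimension_weight
```

With correctness weighted 2 (metric values 1.0 and 0.5, equal weights) and completeness weighted 1 (one metric with value 0.6), the score is (2 × 0.75 + 1 × 0.6) / 3 = 0.7.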
  34. 5.2 Knowledge Cleaning • The goal of knowledge cleaning is to improve the correctness of a Knowledge Graph. • Major objectives: • error detection and • error correction of: • wrong instance assertions • wrong property value assertions • wrong equality assertions
  35. 5.2 Knowledge Cleaning [figure, built up over slides 35 to 37: Tbox, Abox, Knowledge Curation]
  38. 5.2 Knowledge Cleaning: Verification vs. Validation
      What | Verification | Validation
      Semantic Annotations | check schema conformance and integrity constraints | compare with web resource
      Knowledge Graphs | check schema conformance and integrity constraints | compare with "real" world
      Source: E. Huaman, E. Kärle, D. Fensel: Knowledge Graph Validation, Technical Report. https://arxiv.org/pdf/2005.01389.pdf
  39. 5.2 Knowledge Cleaning. Error correction of wrong instance assertions isElementOf(i,t): • i is not a proper instance identifier: delete the assertion or correct i. • t is not an existing type name: delete the assertion or correct t. • The instance assertion is (semantically) wrong: • delete the assertion or find a proper t, • and do NOT try to find a proper i (this would neither scale nor make sense).
  40. 5.2 Knowledge Cleaning. Error correction of wrong property value assertions p(i1,i2): • p is not a proper property name: delete the assertion or correct p. • i1 is not a proper instance identifier: delete the assertion or correct i1. • i1 is not in any domain of p: delete the assertion or add an assertion isElementOf(i1,t) where t is a domain of p. • i2 is not a proper instance identifier: delete the assertion or correct i2. • i2 is not in the range of p for any domain of i1: • delete the assertion, or • add a proper isElementOf assertion for i1 that adds a domain for which i2 is an instance of the range of the property, or • add a proper isElementOf assertion for i2 that turns it into an instance of a range of the property applied to a domain of p of which i1 is an element. • The property assertion is (semantically) wrong: delete the assertion or correct it. In this case, you should most likely define a proper i2, or search for a better p, or search for a better i1.
  41. 5.2 Knowledge Cleaning. Error correction of wrong equality assertions sameAs(i1,i2): • i1 is not a proper instance identifier: delete the assertion or correct i1. • i2 is not a proper instance identifier: delete the assertion or correct i2. • The identity assertion is (semantically) wrong: delete the assertion or replace it with a SKOS operator (which, however, does not come with operational semantics).
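The detection side of these cases can be sketched for property value assertions p(i1,i2). The function below returns the error categories named on the slide; the data structures (a set of known instances, a type lookup, domains per property, and local ranges keyed by property and domain type) are hypothetical, and semantic errors are out of scope because they cannot be detected from the schema alone.

```python
def check_property_assertion(p, i1, i2, instances, types_of, domains, ranges):
    """Return the formal errors detected for the assertion p(i1, i2).

    instances: set of known instance identifiers
    types_of:  instance -> set of types
    domains:   property -> set of domain types
    ranges:    (property, domain type) -> set of range types
    """
    if p not in domains:
        return ["p is not a proper property name"]
    errors = []
    if i1 not in instances:
        errors.append("i1 is not a proper instance identifier")
    if i2 not in instances:
        errors.append("i2 is not a proper instance identifier")
    if errors:
        return errors
    i1_domains = types_of.get(i1, set()) & domains[p]
    if not i1_domains:
        return ["i1 is not in any domain of p"]
    allowed = set()
    for domain in i1_domains:
        allowed |= ranges.get((p, domain), set())
    if not (types_of.get(i2, set()) & allowed):
        return ["i2 is not in the range of p for any domain of i1"]
    return []
```

Each returned category corresponds to one bullet above, so a correction step could offer exactly the repair options listed there (delete, correct, or add an isElementOf assertion).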
  42. Knowledge Cleaning: System survey • Verification: • Quality assessment frameworks such as Luzzu (A Quality Assessment Framework for Linked Open Datasets) [Debattista et al., 2016], Sieve (Linked Data Quality Assessment and Fusion) [Mendes et al., 2012], SWIQA (Semantic Web Information Quality Assessment Framework) [Fürber & Hepp, 2011], and WIQA (Web Information Quality Assessment Framework) [Bizer and Cyganiak, 2009]. • Approaches that check the conformance of RDF graphs against specifications: AllegroGraph, RDFUnit [Kontokostas et al., 2014], SHACL (Shapes Constraint Language) and ShEx (Shape Expressions) [Gayo et al., 2017], Stardog ICV, TopBraid, and Validata [Hansen et al., 2015]. • Tools that use statistical distributions to predict the types of instances (e.g., SDType [Paulheim & Bizer, 2013]) and to detect erroneous relationships that connect two resources (e.g., HoloClean [Rekatsinas et al., 2017], SDValidate [Paulheim & Bizer, 2014]). • More approaches: KATARA [Chu et al., 2015], LOD Laundromat [Beek et al., 2014]. • Validation: • Fact validation frameworks: COPAAL (Corroborative Fact Validation) [Syed et al., 2019], DeFacto (Deep Fact Validation) [Lehmann et al., 2012], FactCheck [Syed et al., 2018], FacTify [Ercan et al., 2019], Leopard [Speck & Ngonga Ngomo, 2018], Surface [Padia et al., 2018], S3K [Metzger et al., 2011], and TISCO [Rula et al., 2019]. • More approaches based on measuring how accurate a statement is with respect to external knowledge sources [Elbassuoni et al., 2010], [Jia et al., 2019], [Nakamura et al., 2007], [Shi & Weninger, 2016], [Shiralkar et al., 2017], [Wienand & Paulheim, 2014].
  43. Knowledge Cleaning: Our approach • VeriGraph: a verification framework for large Knowledge Graphs. It detects errors by verifying a Knowledge Graph against a set of given SHACL constraints. • Verification process: only the necessary subset of a KG is loaded into memory per DS (i.e. per SHACL shape). The constraints are checked in memory. It does not follow the approach of one SPARQL query per constraint component. • Output: a validation report of the inconsistencies found (including a human-readable path to the error). • Status: evaluated on a Knowledge Graph of 1 billion triples. Currently tested with the SHACL test cases.
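The "load only the necessary subset per shape, then check in memory" idea can be sketched as follows. This is a toy illustration under strong assumptions, not VeriGraph's implementation: the shape is a plain dictionary with a target class and minimum-count constraints, and all names are hypothetical.

```python
def verify_shape(triples, shape):
    """Check min-count constraints for all instances of the shape's target class."""
    targets = {s for s, p, o in triples if p == "rdf:type" and o == shape["target_class"]}
    # Load only the triples about the target instances (the necessary subset).
    subset = {}
    for s, p, o in triples:
        if s in targets:
            subset.setdefault(s, {}).setdefault(p, []).append(o)
    # Check every constraint in memory and collect a report with the path.
    report = []
    for focus in sorted(targets):
        for path, min_count in sorted(shape["min_count"].items()):
            found = len(subset[focus].get(path, []))
            if found < min_count:
                report.append({
                    "focus": focus,
                    "path": path,
                    "message": f"expected at least {min_count} value(s), found {found}",
                })
    return report
```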
  44. 5.3 Knowledge Enrichment • The goal of knowledge enrichment is to improve the completeness of a Knowledge Graph by adding new statements. • The process of Knowledge Enrichment has four phases: • new Knowledge Source detection • new Knowledge Source integration (URI normalization) • duplicate detection and alignment • Property-Value-Statements correction
  45. 5.3 Knowledge Enrichment. Duplicate detection: see https://www.cs.umd.edu/~getoor/Tutorials/ER_VLDB2012.pdf
  46. 5.3 Knowledge Enrichment • Knowledge Source detection: search for additional sources of assertions for the Knowledge Graph • open sources • closed sources • Knowledge Source integration: • Tbox: define mappings • Abox: integrate new assertions into the Knowledge Graph • Identifying and resolving duplicates • Correcting invalid property statements, such as domain/range violations and multiple values for a unique property; in the data quality literature this is also known as contradicting or uncertain attribute value resolution.
  47. Knowledge Enrichment: System survey • Duplicate detection: • Dedupe [Bilenko & Mooney, 2003]: a Python library that uses machine learning to find and link duplicates. • Dude [Draisbach & Naumann, 2010]: a Java framework that uses various similarity metrics to compare instances. • Duke [Garshol & Borge, 2013]: provides record linkage and deduplication methods, and a genetic algorithm feature to find a tuned configuration for detecting duplicates. • Legato [Achichi et al., 2017]: a record linkage tool that uses the Concise Bounded Description of resources for comparison. • LIMES [Ngomo & Auer, 2011]: a link discovery approach that exploits metric spaces (in particular the triangle inequality) to reduce the number of comparisons between the source and target datasets. • SERIMI [Araújo et al., 2011]: a link discovery tool that applies string similarity functions to “label properties” without prior knowledge of the data or schema. • SILK [Volz et al., 2009]: a link discovery tool with declarative linkage rules applying different similarity metrics (e.g. string, taxonomic, set); it also supports policies for notifying datasets when one of them publishes new links to others. • Conflict resolution: • FAGI [Giannopoulos et al., 2014] and SlipoToolkit [Athanasiou et al., 2019]: frameworks that suggest fusion strategies for geospatial data sources. • KnoFuss [Nikolov et al., 2008]: a framework that allows different methods to be applied to different attributes in the same dataset to identify duplicates, and that resolves inconsistencies caused by the fusion of linked instances. • ODCleanStore [Knap et al., 2012]: allows users to configure conflict resolution policies based on functions (e.g. AVG, MAX). • Sieve [Mendes et al., 2012]: provides different fusion functions on selected property values.
  48. Knowledge Enrichment: Our approach • Enrichment Framework: identifies duplicates in Knowledge Graphs and resolves conflicting property values. • Workflow: • Input: a Knowledge Graph. • Duplicate detection process: semi-automatic feature selection, data normalization, setup (e.g. similarity metrics), run, and a viewer for duplicate entities. • Resolving conflicting property values: define fusion strategies (e.g. decide what to do based on similarity values), run, and monitor the fusion process. • Output: a report of the duplicate entities found and fused. • Work in progress.
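The workflow above (normalization, similarity-based duplicate detection, fusion of conflicting values) can be sketched with the Python standard library. The similarity measure (`difflib.SequenceMatcher`), the threshold, and the "prefer the longest value" fusion strategy are assumptions chosen for illustration; they are not the configuration of the Enrichment Framework.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    """Data normalization: lower-case and collapse whitespace."""
    return " ".join(str(value).lower().split())


def similarity(a, b, features):
    """Average string similarity of two entities over the selected features."""
    scores = [
        SequenceMatcher(None, normalize(a.get(f, "")), normalize(b.get(f, ""))).ratio()
        for f in features
    ]
    return sum(scores) / len(scores)


def find_duplicates(entities, features, threshold=0.9):
    """Return pairs of entity ids whose similarity reaches the threshold."""
    return [
        (x, y)
        for x, y in combinations(sorted(entities), 2)
        if similarity(entities[x], entities[y], features) >= threshold
    ]


def fuse(a, b, prefer_longest=("name",)):
    """Fusion strategy: keep a's values, add b's missing ones, and for the
    listed properties resolve conflicts by taking the longest value."""
    fused = dict(a)
    for key, value in b.items():
        if key not in fused:
            fused[key] = value
        elif fused[key] != value and key in prefer_longest:
            fused[key] = max(fused[key], value, key=len)
    return fused
```

Comparing all pairs is quadratic in the number of entities; tools such as LIMES reduce the number of comparisons, which this sketch does not attempt.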
  49. 6. Knowledge Deployment • Building, implementing, and curating Knowledge Graphs is a time-consuming and costly activity. • Integrating large amounts of facts from heterogeneous information sources does not come for free. • [Paulheim, 2018b] estimates the average cost of one fact in a Knowledge Graph at between $0.1 and $6, depending on the amount of mechanization. [Paulheim, 2018b] H. Paulheim: How much is a Triple? Estimating the Cost of Knowledge Graph Creation. In ISWC-P&D-Industry-BlueSky 2018: Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with the 17th International Semantic Web Conference (ISWC 2018), Monterey, USA, October 8-12, 2018. http://www.heikopaulheim.com/docs/iswc_bluesky_cost2018.pdf
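To get a feeling for the order of magnitude, the per-fact estimate can be scaled to a graph size. The sketch below uses only the two bounds quoted above ($0.1 and $6 per fact); the graph size of one million facts is a hypothetical example, and amounts are computed in cents to avoid floating-point rounding.

```python
LOW_CENTS_PER_FACT = 10    # $0.1, lower bound quoted above
HIGH_CENTS_PER_FACT = 600  # $6, upper bound quoted above


def cost_range_usd(num_facts):
    """Return the (low, high) estimated creation cost in US dollars."""
    return (
        num_facts * LOW_CENTS_PER_FACT / 100,
        num_facts * HIGH_CENTS_PER_FACT / 100,
    )
```

For a hypothetical graph of 1,000,000 facts this gives a range from $100,000 to $6,000,000, which is why the degree of mechanization matters so much.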
  50. 6. Knowledge Deployment
      Name | Instances | Facts | Types | Relations
      DBpedia (English) | 4,806,150 | 176,043,129 | 735 | 2,813
      YAGO | 4,595,906 | 25,946,870 | 488,469 | 77
      Freebase | 49,947,845 | 3,041,722,635 | 26,507 | 37,781
      Wikidata | 15,602,060 | 65,993,797 | 23,157 | 1,673
      NELL | 2,006,896 | 432,845 | 285 | 425
      OpenCyc | 118,499 | 2,413,894 | 45,153 | 18,526
      Google's Knowledge Graph | 570,000,000 | 18,000,000,000 | 1,500 | 35,000
      Google's Knowledge Vault | 45,000,000 | 271,000,000 | 1,100 | 4,469
      Yahoo! Knowledge Graph | 3,443,743 | 1,391,054,990 | 250 | 800
  51. 6. Knowledge Deployment • We build a knowledge access layer on top of the Knowledge Graph that helps connect this resource to applications. • Knowledge management technology: • Graph-based repositories host the Knowledge Graph (as a semantic data lake). • The knowledge management layer is responsible for storing, managing, and providing semantic descriptions of resources. • Inference engines based on deductive reasoning: • They implement agents that define views on this graph together with context data on user requests. • They access the graph to gain data for their reasoning, which provides input to the dialogue engine interacting with the human user.
  52. 6. Knowledge Deployment. What are the reasons? • Scalability issues (trillions of triples) • Context refinement (support for different points of view) • introduce rich constraints (Knowledge Cleaning) • additional knowledge derivation (Knowledge Enrichment) • Provide a reusable application layer / middleware on top of a Knowledge Graph • access rights • integration of additional information sources from the application • context, • personalization, • task, etc.
  53. 6. Knowledge Deployment [figure, built up over slides 53 and 54: API; Views; View extractor; Local Knowledge Enrichment; Local Knowledge Cleaning; Specifications (Terminology, Rules, Constraints, View definition, Micro TBox); Engines (View Extraction, Cleaning / Enrichment)]
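The role of a view extractor can be sketched as a filter that selects a task-specific subgraph on which local cleaning and enrichment can then run. The view definition format (a set of types and a set of properties) and `extract_view` are hypothetical; the slides do not specify how views are defined.

```python
def extract_view(triples, view):
    """Extract the subgraph for a view: instances of the view's types,
    restricted to type statements and the view's properties."""
    instances = {
        s for s, p, o in triples
        if p == "rdf:type" and o in view["types"]
    }
    return [
        (s, p, o) for s, p, o in triples
        if s in instances and (p == "rdf:type" or p in view["properties"])
    ]
```

Because the extracted view is much smaller than the full graph, rich constraints and additional derivations can be applied to it locally without having to scale to the whole Knowledge Graph.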
  55. 7. The Proof of the Pudding Is in the Eating. Knowledge Graphs are an enabling technology for: • Virtual agents (information search, eMarketing, and eCommerce) • Cyber-physical systems (Internet of Things, smart meters, etc.) • Physical agents (drones, cars, satellites, androids, etc.)
  56. 7. Virtual Agents: Onlim • The pioneer in automating customer communication via AI chatbots and conversational interfaces • Enterprise solutions for making data and knowledge available for conversational interfaces • Team of 25+ highly experienced AI experts, specialists in semantics and data science • Spin-off of the University of Innsbruck • HQ in Europe (Vienna, Telfs) • Current focus verticals: Utilities, Tourism, Retail, Education, Financial Services
  57. 7. Virtual Agents: Onlim [figure only]
  58. 7. Physical Agents: How Knowledge Graphs can prevent AI from killing people
  59. 7. Physical Agents: The Brave New World of AI • Autonomous Driving
  60. 7. Physical Agents: Failures of AI technology • In May 2016, Joshua Brown was killed by his car because its autopilot mixed up a very long car (large wheelbase) with a traffic sign. [image: this is what the autopilot “saw”] • Why did none of the 10,000+ engineers involved have the trivial idea of connecting the car to a Knowledge Graph containing traffic data that simply knows that there is no traffic sign?
  64. 7. Physical Agents: Failures of AI technology • In March 2018, Elaine Herzberg was the first victim of a fully autonomous car. • Besides many software bugs on Uber's side (à la Boeing), a core issue was that the car assumed that pedestrians cross streets only at crosswalks. • Make assumptions explicit and confirm them with a Knowledge Graph. • In that case she would still be alive!
  66. 7. Physical Agents: Which kind of AI do we want? This one?

Editor's Notes

  1. If it worked, we would not need it.
  2. Sub graph consistent Data Lake
  3. https://www.zdf.de/wissen/leschs-kosmos/videos/mit-vollgas-in-die-zukunft-102.html 7:20 min