PoolParty Semantic Classifier


Published on

The PoolParty Semantic Classifier is a component of the Semantic Suite, which makes use of machine learning in combination with Knowledge Graphs.

We discuss the potential of fusing machine learning, neural networks, and knowledge graphs, based on use cases and this concrete technology offering.

We introduce the term 'Semantic AI', which refers to the combined usage of various AI methods.

Published in: Technology


  1. 1. Andreas Blumauer CEO & Managing Partner Semantic Web Company / PoolParty Semantic Suite PoolParty Semantic Classifier Bringing Machine Learning, NLP and Knowledge Graphs together
  2. 2. Introduction
     Semantic Web Company (SWC): ▸ Founded in 2004 ▸ Based in Vienna ▸ Privately held ▸ 40+ FTE ▸ Experts in NLP, Semantics and Machine Learning ▸ ~30% growth/year ▸ EUR 2.5 million funding for R&D ▸ SWC named to KMWorld’s ‘100 Companies That Matter in Knowledge Management’ in 2016 and 2017
     PoolParty Semantic Suite: ▸ First release in 2009 ▸ Current version 6.2 ▸ W3C standards compliant ▸ Over 200 installations world-wide ▸ 50% of revenue is reinvested into PoolParty development ▸ PoolParty on-premises or used as a cloud service ▸ KMWorld listed PoolParty as Trend-Setting Product 2015, 2016 and 2017
  3. 3. Agenda: Semantic AI ▸ Introduction to Semantic AI ▹ Machine Learning, NLP and Knowledge Graphs ▹ Current status of Artificial Intelligence? ▹ What is Semantic AI? ▸ PoolParty Semantic Classifier ▹ How does it work? ▹ Benchmarks ▹ Integration Scenarios ▸ Use Cases ▹ Overview ▹ Business Case ▹ Example: Issue Classifier
  4. 4. SEMANTIC ARTIFICIAL INTELLIGENCE #SemanticAI: Bringing Machine Learning, NLP and Knowledge Graphs together
  5. 5. Artificial Intelligence - An overview: Artificial Intelligence (AI), Artificial Neural Network (ANN), Symbolic AI (GOFAI*), Sub-Symbolic AI, Statistical AI, Knowledge graphs & reasoning, Natural Language Processing (NLP), Machine Learning, Word Embedding (Word2Vec), Deep Learning (DNN), Natural Language Understanding, Entity Recognition & Linking, Knowledge Extraction, Semantic enhanced Text Classification. * GOFAI: Good old-fashioned AI
  6. 6. What makes someone an intelligent being? Assessment of the current status of Artificial Intelligence. Bloom’s Taxonomy: Classify cognitive processes
     Level | Example | Typical Problems | Questions
     (6) Create | Convert an "unhealthy" recipe for apple pie to a "healthy" recipe by replacing your choice of ingredients. | Create a new product or point of view; Combine elements in a new pattern; Propose alternative solutions | How would you improve …? Can you formulate a theory for …? Can you predict the outcome if …?
     (5) Evaluate | Which kinds of knowledge models are best for machine learning, and why? | Judge the value of material (statement, poem, research report) for a given purpose; Defend opinions | What is your opinion of …? How would you prioritize …? What would you use to support the view …?
     (4) Analyse | How do a graph database and a semantic knowledge model work together? | Recognise organizational principles; Identify parts; Understand relationships between parts | How is ... related to ...? What is the function of ...? What conclusions can you draw ...?
     (3) Apply | How can taxonomies be used to enhance machine learning? | Apply facts, rules, principles, and theories; Use learned material in new and concrete situations | Why is … significant? How is … an example of …? What elements would you use to change …?
     (2) Understand | What is the difference between an ontology and a taxonomy? | Understand facts and ideas; Classify objects and summarise text; Grasp the meaning of material | What is the difference between …? What is the main idea of …? Which statements support …?
     (1) Remember | Who is the inventor of the World Wide Web? | Recognise facts, terms, and basic concepts; Recall of a wide range of material, from specific facts to complete theories | Who is …? Where is …? Why did …?
  7. 7. Remember: Knowledge Graphs & Knowledge Extraction
     Text: Perth is one of the most isolated major cities in the world, with a population of 2,022,044 living in Greater Perth. Australia is a member of the OECD, United Nations, G20, ANZUS, and the World Trade Organisation.
     Graph: Perth is a City; Australia is a Country; Perth is located in Australia; Australia is part of Commonwealth of Nations; Commonwealth of Nations is an International Organisation.
     Avoid illogical answers: distance between
     Support complex Q&A: Which cities located in the Commonwealth of Nations have a population of more than 2 million people?
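The graph on this slide can be read as subject-predicate-object triples, and the "complex Q&A" question then becomes a small graph traversal. The following is a minimal sketch in plain Python, not a real triple store or SPARQL engine; the helper names (`objects`, `subjects`, `cities_in_organisation`) and the separate `POPULATION` table are assumptions made for illustration.

```python
# Toy knowledge graph as (subject, predicate, object) triples, read off the slide.
TRIPLES = [
    ("Perth", "is a", "City"),
    ("Australia", "is a", "Country"),
    ("Perth", "is located in", "Australia"),
    ("Australia", "is part of", "Commonwealth of Nations"),
    ("Commonwealth of Nations", "is a", "International Organisation"),
]

# Literal values kept in a separate table (an assumption of this sketch).
# The figure is the Greater Perth population quoted on the slide.
POPULATION = {"Perth": 2_022_044}


def objects(subject, predicate):
    """Return every o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every s such that (s, predicate, obj) is in the graph."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def cities_in_organisation(organisation, min_population):
    """Cities located in a member country of `organisation` above a population threshold."""
    result = []
    for city in subjects("is a", "City"):
        for country in objects(city, "is located in"):
            is_member = organisation in objects(country, "is part of")
            if is_member and POPULATION.get(city, 0) > min_population:
                result.append(city)
    return result


print(cities_in_organisation("Commonwealth of Nations", 2_000_000))  # ['Perth']
```

Because types are explicit in the graph, a question such as a distance between a city and the country that contains it could be rejected by checking the "is located in" relation first, which is one way to read the "avoid illogical answers" point.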
  8. 8. Remember: Knowledge Graphs & Knowledge Extraction. Knowledge Graphs (KG) can cover general knowledge (often also called cross-domain or encyclopedic knowledge), or provide knowledge about special domains such as biomedicine. In most cases KGs are based on Semantic Web standards, and have been generated by a mixture of automatic extraction from text or structured data, and manual curation work. Examples: ▸ DBpedia ▸ Google Knowledge Graph ▸ YAGO ▸ OpenCyc ▸ Wikidata. Example question: Who is the inventor of the World Wide Web?
  9. 9. The Semantic Web: A standards-based graph of knowledge graphs. Figure: Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak.
  10. 10. “Understand”: Google Featured Snippets based on Sentence Compression Algorithms (based on DL). To train Google’s artificial Q&A brain, the company uses old news stories, where machines start to see how headlines serve as short summaries of the longer articles that follow. But for now, the company still needs its team of PhD linguists. Spanning about 100 PhD linguists across the globe, the Pygmalion team produces “the gold data,” while the news stories are the “silver.” The silver data is still useful, because there’s so much of it. But the gold data is essential. (Source: WIRED article) Example question: What is the difference between an ontology and a taxonomy?
  11. 11. “Create”: Example for DL-based “creativity”. After having listened to a large amount of music and learned its own models of music theory, Aiva composes its very own sheet music. These partitions are then played by professional artists on real instruments in a recording studio, achieving the best sound quality possible. Aiva’s compositions still require human input with regards to orchestration and musical production. In fact, Aiva’s creators envisage a future where man and machine will collaborate to fulfill their creative potential, rather than replace one another.
  12. 12. What is Semantic AI? Artificial Intelligence, ANN, Symbolic AI, Sub-Symbolic AI, Statistical AI, KR & reasoning, NLP, Machine Learning, Word Embedding, Deep Learning, Natural Language Understanding, Entity Recognition & Linking, Knowledge Extraction, Semantic enhanced Text Classification. In Semantic AI, various methods from Symbolic AI are combined with machine learning methods and/or neural networks. Examples: ● Semantic enrichment of text corpora to enhance word embeddings ● Extraction of semantic features from text to improve ML-based classification tasks ● Combine ML-based with Graph-based entity extraction ● Knowledge Graphs as a Data Model for Machine Learning ● …
  13. 13. Knowledge Graphs as a Data Model for Machine Learning. Traditionally, when faced with heterogeneous knowledge in a machine learning context, data scientists preprocess the data and engineer feature vectors so they can be used as input for learning algorithms (e.g., for classification). These transformations can result in loss of information and introduce bias. To solve this problem, we require machine learning methods to consume knowledge in a data model more suited to represent this heterogeneous knowledge. We argue that knowledge graphs are that data model. Three examples for the benefits of using knowledge graphs: ▸ they allow for true end-to-end-learning, ▸ they simplify the integration of heterogeneous data sources and data harmonization, ▸ they provide a natural way to seamlessly integrate different forms of background knowledge. Source: Wilcke X, Bloem P, De Boer V. The Knowledge Graph as the Default Data Model for Machine Learning. Data Science. 2017 Oct 17;1-19. DOI: 10.3233/DS-170007
  14. 14. POOLPARTY SEMANTIC CLASSIFIER Bringing Machine Learning, NLP and Knowledge Graphs together
  15. 15. PoolParty Semantic Classifier: In a Nutshell. PoolParty Semantic Classifier combines machine learning algorithms (SVM, Deep Learning, Naive Bayes, etc.) with Semantic Knowledge Graphs.
  16. 16. PoolParty Semantic Classifier: Feature highlights ▸ Classification ▹ Select from various machine learning algorithms such as SVM, Deep Learning and Naive Bayes for content classification. ▹ Benefit from a rich feature set (terms, concepts, shadow concepts), which gives you more flexibility when training classifiers. ▸ User Experience ▹ A user-friendly interface enables non-technical experts to perform classification tasks and benefit from machine learning. ▸ Scalability ▹ Large content repositories can be classified on top of a Spark cluster. ▸ Easy integration ▹ New resources can be classified via the PoolParty API. ▹ With the GraphSearch plugin system, the Classifier can be easily integrated into semantic applications. ▸ Transparency ▹ The Classifier works on the principles of ‘Explainable AI’.
  17. 17. Extraction, Categorization, Classification: What is the difference? ▸ Classification ▹ Built from a training corpus; uses the labels of the training corpus as classes ▸ Extraction ▹ Finds terms and concepts in text, using scoring mechanisms to give an indication of their importance ▸ Categorization ▹ Uses only concepts and categorizes text based on a thesaurus
  18. 18. How it works
     1. Determine classes (labels)
     2. Identify training documents per class
     3. Create classifier
        a. Pick machine learning method
        b. Choose correct parameters
        c. Determine which features to use
        d. Train model
     4. Evaluate results (cross-validation / F1)
     5. Go to step 2 and try to find a better classification method
     6. Make use of the Classifier API
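Steps 1 to 4 of this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the training sentences are invented, Naive Bayes is picked because the slides name it as one of the available algorithms, and none of this reflects how the PoolParty Semantic Classifier is actually implemented. Step 5 (iterating on the method) and step 6 (the Classifier API) are not covered.

```python
import math
from collections import Counter, defaultdict

# Steps 1-2: classes and training documents (toy data, invented for illustration).
TRAINING = [
    ("solar panels and wind turbines", "Renewable Energy"),
    ("wind farm feeds solar power to grid", "Renewable Energy"),
    ("new solar plant and wind park", "Renewable Energy"),
    ("wind and solar capacity grows", "Renewable Energy"),
    ("insulation lowers heating demand", "Energy Efficiency"),
    ("efficient lighting lowers demand", "Energy Efficiency"),
    ("building insulation and efficient heating", "Energy Efficiency"),
    ("efficient appliances cut demand", "Energy Efficiency"),
]


def features(text):
    """Step 3c: plain term features (lower-cased whitespace tokens)."""
    return text.lower().split()


def train(docs):
    """Step 3d: collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(label for _, label in docs)
    term_counts = defaultdict(Counter)
    for text, label in docs:
        term_counts[label].update(features(text))
    vocab = {term for counts in term_counts.values() for term in counts}
    return class_counts, term_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_terms = sum(term_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for term in features(text):
            if term in vocab:  # terms never seen in training are ignored
                score += math.log((term_counts[label][term] + 1) / (total_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


def cross_validate(docs, folds):
    """Step 4: k-fold cross-validation, returning the mean macro F1."""
    results = []
    for k in range(folds):
        test_docs = docs[k::folds]
        train_docs = [d for i, d in enumerate(docs) if i % folds != k]
        model = train(train_docs)
        gold = [label for _, label in test_docs]
        predicted = [predict(model, text) for text, _ in test_docs]
        results.append(macro_f1(gold, predicted))
    return sum(results) / len(results)


print(cross_validate(TRAINING, folds=2))
print(predict(train(TRAINING), "solar wind"))  # Renewable Energy
```

Swapping `features` for a function that also emits concepts from a controlled vocabulary is the only change needed to compare term-based and semantically enhanced variants under the same evaluation.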
  19. 19. Explainable AI: Classifiers based on ML algorithms such as Deep Learning perform better when training data is semantically enhanced. Additional features are derived from a controlled vocabulary, which also makes the features used more transparent to the Data Scientist.
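One way to picture "semantically enhanced" training data is to add concept identifiers from a controlled vocabulary to the plain term features. The sketch below assumes a hypothetical four-entry vocabulary; the labels, the `concept:` identifiers and the function names are invented for illustration and are not taken from PoolParty or any real thesaurus.

```python
# Hypothetical mini-vocabulary: surface labels mapped to concept identifiers.
VOCABULARY = {
    "solar panel": "concept:SolarEnergy",
    "photovoltaics": "concept:SolarEnergy",
    "pv": "concept:SolarEnergy",
    "wind turbine": "concept:WindEnergy",
}


def term_features(text):
    """Plain term features: lower-cased whitespace tokens."""
    return text.lower().split()


def concept_features(text):
    """Concepts whose label occurs in the text as a whole word or phrase."""
    padded = f" {' '.join(text.lower().split())} "
    return sorted({concept for label, concept in VOCABULARY.items() if f" {label} " in padded})


def combined_features(text):
    """Terms plus concepts, i.e. the semantically enhanced feature set."""
    return term_features(text) + concept_features(text)


doc_a = "A new solar panel plant"
doc_b = "Photovoltaics output rises"

# The two documents share no terms, but they share a concept feature.
print(set(term_features(doc_a)) & set(term_features(doc_b)))          # set()
print(set(combined_features(doc_a)) & set(combined_features(doc_b)))  # {'concept:SolarEnergy'}
```

A concept feature is also easier to inspect than a raw token: a data scientist can look it up in the vocabulary, which is the transparency argument made on the slide.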
  20. 20. Benchmarking the PoolParty Semantic Classifier: Improvement of 5.2% compared to traditional (term-based) SVM
     Features used | Classifier | F1 (5 folds) | Variance
     Terms | LinearSVC | 0.83175 | 0.0008
     Concepts from REEGLE + Shadow Concepts | LinearSVC | 0.84451 | 0.0011
     Concepts from REEGLE | LinearSVC | 0.84647 | 0.0009
     Terms + Concepts from REEGLE + Shadow Concepts | LinearSVC | 0.87474 | 0.0009
     Reegle thesaurus: A comprehensive SKOS taxonomy for the clean energy sector ● 3,420 concepts ● 7,280 labels (English version) ● 9,183 relations (broader/narrower + related)
     Document Training Set: 1,800 documents in 7 classes: Renewable Energy, District Heating Systems, Cogeneration, Energy Efficiency, Energy (general), Climate Protection, Rural Electrification
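The slide does not say how the 5.2% figure was derived. Reading it as the relative F1 gain of the best configuration over the term-only baseline reproduces it, as the short check below shows; that reading is an assumption, and the dictionary simply restates the F1 values from the benchmark.

```python
# F1 scores (5 folds) as reported on the slide.
F1_SCORES = {
    "Terms": 0.83175,
    "Concepts from REEGLE + Shadow Concepts": 0.84451,
    "Concepts from REEGLE": 0.84647,
    "Terms + Concepts from REEGLE + Shadow Concepts": 0.87474,
}


def relative_improvement(score, baseline):
    """Relative gain of `score` over `baseline`, as a fraction."""
    return (score - baseline) / baseline


baseline = F1_SCORES["Terms"]
for name, score in F1_SCORES.items():
    if name != "Terms":
        print(f"{name}: {relative_improvement(score, baseline):.1%}")
# The last line printed is:
# Terms + Concepts from REEGLE + Shadow Concepts: 5.2%
```

In absolute terms the same gain is about 4.3 F1 points (0.87474 minus 0.83175), so it matters which of the two readings is carried into downstream estimates.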
  21. 21. The Classifier as a component of PoolParty Semantic Suite: Most complete Semantic Middleware on the Global Market. Sample text: Bain Capital is a venture capital company based in Boston, MA. Since inception it has invested in hundreds of companies including AMC Entertainment, Brookstone, and Burger King. The company was co-founded by Mitt Romney. Components: Taxonomy & Ontology Server; Entity Extractor & Semantic Classifier; Data Integration & Data Linking; Unstructured Data; Semi-structured Data; Structured Data; Unified Views; PoolParty GraphSearch; RDF Graph Database; Factsheet. Annotations: Identify new candidate concepts to be included in a controlled vocabulary; Controlled vocabularies as a basis for highly precise knowledge extraction and text classification; Entity Extractor informs all incoming data streams about its semantics and links them; Schema mapping based on ontologies.
  22. 22. Integration scenarios in PoolParty Semantic Suite ▸ UnifiedViews DPU ▸ Recommender plugin in GraphSearch ▸ Document classifier in GraphSearch
  23. 23. USE CASES Bringing Machine Learning, NLP and Knowledge Graphs together
  24. 24. Examples for use cases based on PoolParty Semantic Classifier ▸ News classification ▹ Reduce manual effort of classifying inbound documents or news ▸ Recommender Services ▹ Identify appropriate agents in help desk systems ▹ Matchmaking between users (or user groups) and products/content ▸ Sentiment Analysis ▹ Improve customer retention management by precise sentiment analysis ▸ Enhanced Domain-specific Text Mining ▹ Complement rule-based systems for fraud detection ▹ Analyze judicial decisions
  25. 25. Why Data Scientists need Semantic Models ▸ To understand ▹ Content aboutness in a defined framework ▹ Data relationships and context within a unified organizational model ▹ Connections across disparate datasets ▸ To increase precision ▹ Hierarchical or other mapped relationships allow for recommending similar content when exact matches are not found ▹ Granularity allows for more specific recommendations ▹ Consistency across structure results in more precise analysis and predictions. Source: Suzanne Carroll, Data Science Product Director at XO Group
  26. 26. Business Case: Based on an improvement of 5.2%. Inbound Documents, PoolParty Semantic Classifier, Experienced Agent ● 100,000 documents (emails, tickets, etc.) per month ● 5 Euros extra costs per document when misrouted ● Cost savings per year: ○ 1,200,000 x €5.0 x 0.052 = € 312,000 per annum
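The business case is a single multiplication, restated below so the inputs can be varied. Note the assumption built into the slide's formula: the 5.2% classifier improvement is treated as 5.2% of all inbound documents no longer being misrouted. The function and parameter names are illustrative.

```python
def annual_savings(documents_per_month, cost_per_misrouted_document, improvement):
    """Yearly savings if `improvement` is the share of documents no longer misrouted."""
    documents_per_year = documents_per_month * 12
    return documents_per_year * cost_per_misrouted_document * improvement


# Figures from the slide: 100,000 documents per month, EUR 5 per misrouted document, 5.2%.
savings = annual_savings(100_000, 5.0, 0.052)
print(f"EUR {savings:,.0f} per annum")  # EUR 312,000 per annum
```

With a different document volume or misrouting cost the result scales linearly, so the estimate is only as reliable as the assumed link between F1 improvement and avoided misroutings.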
  27. 27. Example: Issue Classifier
  28. 28. One question at the end: Will Artificial Intelligence make Subject Matter Experts obsolete? Imagine you want to build an application that helps to identify patient-treatment pairings. Which would you prefer: applications based solely on machine learning, applications based on doctors' knowledge only, or a combination of both?
  29. 29. Learn more and ask for a demo based on your own data!
  30. 30. Thank you for your interest! Andreas Blumauer, CEO, Semantic Web Company. © Semantic Web Company