Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

•

4 likes•958 views

semanticsconference

http://2016.semantics.cc/felix-burkhardt

Technology

QUARK:
Architecture for a question answering machine
Felix Burkhardt

2
Motivation
Architecture
Modules
Frontends
Spellchecker
Dialogmanager
Semantic Search
Answer Templates
Summary & Outlook
Quark: Architecture for a QA Machine
Contents
Chatbot „Tinka“ on the T-mobile.at website

3
Quark: Architecture for a QA Machine
Motivation
To ease the work of human agents
and save costs automatic question
answering systems are valuable
One example are so-called
„chatbots“, i.e. automatic dialog
systems

4
Quark: Architecture for a QA Machine
ARchitecture
• We developed
an architecture
to develop, test
and compare
several
components of
such a question
answering
system.
• It is also used to
build
demonstrators
for management

5
Quark: Architecture for a QA Machine
Frontends
• There are several
interfaces, e.g.
mobile apps for
demonstration
• Web interfaces
for testing and
maintenance

6
Quark: Architecture for a QA Machine
Spellchecker
• In the current Tinka
implementation we use the
open source Hunspell
spellchecker
• The API was algorithmically
enhanced by supporting
• Weighted user lexicon
• Enabling recognition of
bi-grams and tri-grams

7
Quark: Architecture for a QA Machine
Dialogmanager
• Central instance
• Missing dialog model for
slot-filling and ellipse
handling
• Main challenge is currently
the result list order, i.e. a
quality measure for the
answers from several
moduls
QA-item
Central data-structure for answers
- Question
- Interpretation
- Answer
- Probability: each knowledge source must
provide a probability value so the results can be
ranked in a meaningful way. Some knowledge
sources are more specific than others (e.g. QA-
search more than FAQ) and are ranked higher
per se.
Dialog-
manager
Central instance
to take queries ,
distributes to
several Answer-
modules, and
consolidate
output
Later, session
handling for multi-
step dialogs will
also be handled
here.

8
Quark: Architecture for a QA Machine
Semantic Search
• With content that already contains
answers, FAQ and extracted from forum,
the user query must only be matched with
the question.
• Finding important words and synonyms for
„query-expansion“ can be done with
GATE‘s term extractor, had to be adapted
for German.
• The search index of a SOLR search engine
can than be enhanced by these terms.
• The number of matching terms would be
part of the quality criterion

9
Quark: Architecture for a QA Machine
Topic classification of Forum data
• Together with the DAI (Distributed
Artificial Intelligence) Labor of
TU-Berlin we investigated the
topic classification of „Telekom
Hilft“ user forum
• Compared „classical machine
learning“ with Deep neural nets.
• Both resulted in 55% accuracy
rsp. 83% „one in three“
• Also investigated subclustering
with DNN (4 subclusters per
category)
Apache Lucene preprocessing
NER / disambuiguation
Multinomial Naive Bayes
* Zhang, Zhao, LeCun, 2015
Deep Temporal
Convolutional Neural
Network *
Comparison

10
Quark: Architecture for a QA Machine
Answer Templates
• We also use GATE to annotate terms
in user queries, based on gazeteers.
• Each string gets annotated with Part-
of-Speech, lemma and NER
• Disambiguation is not done yet
• Via JAPE grammars a pre-defined
answer (and querstion) template is
determined
• With this template, a SPARQL
database is queried and the answer
filled.
• SPARQL is the query language for
RDF, a W3C suggestion for semantic
annotation

11
Quark: Architecture for a QA Machine
Summary and Outlook
• We presented an architecture to answer customer questions automatically
• It is based on open source technology like
• GATE
• SOLR
• RDF/OWL
• It is used to evaluate vendors, build demonstrators and test sub-modules for production
• Many things are missing, e.g.:
• Dialog models
• Common quality-of-answer criterion across sub-modules
• Machine learning of underlying ontology

12
Quark: Architecture for a QA Machine
Thank you for your attention!

What's hot

Neo4j GraphTour Santa Monica 2019 - Amundsen PresentationTamikaTannis

Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?John T. Kane

Feature Store as a Data Foundation for Machine LearningProvectus

Interleaving, Evaluation to Self-learning Search @904LabsJohn T. Kane

Guiding through a typical Machine Learning PipelineMichael Gerke

Top 5 Considerations When Evaluating NoSQLMongoDB

Evaluation criteria for nosql databasesEbenezer Daniel

Enterprise Search in the Big Data Era: Recent Developments and Open ChallengesYunyao Li

High Performance Transfer Learning for Classifying Intent of Sales Engagement...Databricks

Scaling Box-Search: Gearing up for Petabyte Scale - Shubhro Roy & Anthony Urb...Lucidworks

Strategic Value from Enterprise Search and Insights - Viren Patel, PwCLucidworks

Disrupting Data Discoverymarkgrover

Transforming AI with Graphs: Real World Examples using Spark and Neo4jFred Madrid

Optique - posterDBOnto

Spsbe 18-04-15 - should i move my network folders to office 365BIWUG

How Lyft Drives Data DiscoveryNeo4j

Data council sf amundsen presentationTao Feng

Real-time Recommendations for Retail: Architecture, Algorithms, and DesignJuliet Hougland

Amundsen: From discovering to security datamarkgrover

Accelerating the ML Lifecycle with an Enterprise-Grade Feature StoreDatabricks

What's hot (20)

Neo4j GraphTour Santa Monica 2019 - Amundsen Presentation

Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?

Feature Store as a Data Foundation for Machine Learning

Interleaving, Evaluation to Self-learning Search @904Labs

Guiding through a typical Machine Learning Pipeline

Top 5 Considerations When Evaluating NoSQL

Evaluation criteria for nosql databases

Enterprise Search in the Big Data Era: Recent Developments and Open Challenges

High Performance Transfer Learning for Classifying Intent of Sales Engagement...

Scaling Box-Search: Gearing up for Petabyte Scale - Shubhro Roy & Anthony Urb...

Strategic Value from Enterprise Search and Insights - Viren Patel, PwC

Disrupting Data Discovery

Transforming AI with Graphs: Real World Examples using Spark and Neo4j

Optique - poster

Spsbe 18-04-15 - should i move my network folders to office 365

How Lyft Drives Data Discovery

Data council sf amundsen presentation

Real-time Recommendations for Retail: Architecture, Algorithms, and Design

Amundsen: From discovering to security data

Accelerating the ML Lifecycle with an Enterprise-Grade Feature Store

Viewers also liked

Information Retrieval with Deep LearningAdam Gibson

Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprisesemanticsconference

Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...semanticsconference

David Kuilman | Creating a Semantic Enterprise Content model to support conti...semanticsconference

Kostas Kastrantas | Business Opportunities with Linked Open Datasemanticsconference

Victor Charpenay | Standardized Semantics for an Open Web of Thingssemanticsconference

OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...semanticsconference

Thomas Vavra | New Ways of Handling Old Datasemanticsconference

Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...semanticsconference

Sören Auer | Enterprise Knowledge Graphssemanticsconference

Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...semanticsconference

Christian Opitz | Semantic E-Commerce - Use Cases in Enterprise Web Applicationssemanticsconference

Fajar J. Ekaputra, Marta Sabou, Estefania Serral and Stefan Biffl | Knowledge...semanticsconference

Reginald Ford, Grit Denker, Daniel Elenius, Wesley Moore and Elie Abi-Lahoud ...semanticsconference

Tomas Knap | RDF Data Processing and Integration Tasks in UnifiedViews: Use C...semanticsconference

Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...semanticsconference

Holger Wollschläger | E-government at its best: Open, transparent and usefulsemanticsconference

Jo Kent | ADA – Opening up the BBC archive with linked datasemanticsconference

Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106Michael Johnson

Linked data the next 5 years - From Hype to ActionAndreas Blumauer

Viewers also liked (20)

Information Retrieval with Deep Learning

Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise

Shuangyong Song, Qingliang Miao and Yao Meng | Linking Images to Semantic Kno...

David Kuilman | Creating a Semantic Enterprise Content model to support conti...

Kostas Kastrantas | Business Opportunities with Linked Open Data

Victor Charpenay | Standardized Semantics for an Open Web of Things

OWL-based validation by Gavin Mendel Gleasonand Bojan Bozic, Trinity College,...

Thomas Vavra | New Ways of Handling Old Data

Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern...

Sören Auer | Enterprise Knowledge Graphs

Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...

Christian Opitz | Semantic E-Commerce - Use Cases in Enterprise Web Applications

Fajar J. Ekaputra, Marta Sabou, Estefania Serral and Stefan Biffl | Knowledge...

Reginald Ford, Grit Denker, Daniel Elenius, Wesley Moore and Elie Abi-Lahoud ...

Tomas Knap | RDF Data Processing and Integration Tasks in UnifiedViews: Use C...

Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...

Holger Wollschläger | E-government at its best: Open, transparent and useful

Jo Kent | ADA – Opening up the BBC archive with linked data

Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106

Linked data the next 5 years - From Hype to Action

Similar to Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

Strata parallel m-ml-ops_sept_2017Nisha Talagala

Building and deploying LLM applications with Apache AirflowKaxil Naik

A Practical Guide to Selecting a Stream Processing Technology confluent

Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Spark Summit

Aakanksha_Agnani_j2016Aakanksha Agnani

AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021Sandesh Rao

Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Kai Wähner

Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...confluent

A Collaborative Data Science Development WorkflowDatabricks

Generating test cases using UML Communication Diagram Praveen Penumathsa

Machine Learning Applied - Contextual Chatbots Coding, Oracle JET and TensorFlowandrejusb

Resumemuddanas

Srinivas Muddana Resumemuddanas

Key projects in AI, ML and Generative AIVijayananda Mohire

Apache Drill (ver. 0.2)Camuel Gilyadov

Low Latency Polyglot Model Scoring using Apache ApexApache Apex

C19013010 the tutorial to build shared ai services session 1Bill Liu

Elastic-EngineeringAraf Karsh Hamid

Similar to Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE (20)

Strata parallel m-ml-ops_sept_2017

Building and deploying LLM applications with Apache Airflow

A Practical Guide to Selecting a Stream Processing Technology

Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...

Aakanksha_Agnani_j2016

AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021

Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...

Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...

A Collaborative Data Science Development Workflow

Generating test cases using UML Communication Diagram

Machine Learning Applied - Contextual Chatbots Coding, Oracle JET and TensorFlow

Resume

Srinivas Muddana Resume

Key projects in AI, ML and Generative AI

Apache Drill (ver. 0.2)

Low Latency Polyglot Model Scoring using Apache Apex

C19013010 the tutorial to build shared ai services session 1

Elastic-Engineering

Recently uploaded

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3

How to write a Business Continuity PlanDatabarracks

The State of Passkeys with FIDO Alliance.pptxLoriGlavin3

A Journey Into the Emotions of Software DevelopersNicole Novielli

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal

Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes

Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González

Time Series Foundation Models - current state and future directionsNathaniel Shimoni

Rise of the Machines: Known As Drones...Rick Flair

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney

Sample pptx for embedding into website for demoHarshalMandlekar2

Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3

DevEX - reference for building teams, processes, and platformsSergiu Bodiu

Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada

The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech

Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda

2024 April Patch TuesdayIvanti

A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3

Recently uploaded (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx

How to write a Business Continuity Plan

The State of Passkeys with FIDO Alliance.pptx

A Journey Into the Emotions of Software Developers

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...

Testing tools and AI - ideas what to try with some tool examples

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes

Generative Artificial Intelligence: How generative AI works.pdf

Time Series Foundation Models - current state and future directions

Rise of the Machines: Known As Drones...

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...

Sample pptx for embedding into website for demo

Moving Beyond Passwords: FIDO Paris Seminar.pdf

DevEX - reference for building teams, processes, and platforms

Genislab builds better products and faster go-to-market with Lean project man...

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024

The Ultimate Guide to Choosing WordPress Pros and Cons

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger

2024 April Patch Tuesday

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx

Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

1. QUARK: Architecture for a question answering machine Felix Burkhardt

2. 2 Motivation Architecture Modules Frontends Spellchecker Dialogmanager Semantic Search Answer Templates Summary & Outlook Quark: Architecture for a QA Machine Contents Chatbot „Tinka“ on the T-mobile.at website

3. 3 Quark: Architecture for a QA Machine Motivation To ease the work of human agents and save costs automatic question answering systems are valuable One example are so-called „chatbots“, i.e. automatic dialog systems

4. 4 Quark: Architecture for a QA Machine ARchitecture • We developed an architecture to develop, test and compare several components of such a question answering system. • It is also used to build demonstrators for management

5. 5 Quark: Architecture for a QA Machine Frontends • There are several interfaces, e.g. mobile apps for demonstration • Web interfaces for testing and maintenance

6. 6 Quark: Architecture for a QA Machine Spellchecker • In the current Tinka implementation we use the open source Hunspell spellchecker • The API was algorithmically enhanced by supporting • Weighted user lexicon • Enabling recognition of bi-grams and tri-grams

7. 7 Quark: Architecture for a QA Machine Dialogmanager • Central instance • Missing dialog model for slot-filling and ellipse handling • Main challenge is currently the result list order, i.e. a quality measure for the answers from several moduls QA-item Central data-structure for answers - Question - Interpretation - Answer - Probability: each knowledge source must provide a probability value so the results can be ranked in a meaningful way. Some knowledge sources are more specific than others (e.g. QA- search more than FAQ) and are ranked higher per se. Dialog- manager Central instance to take queries , distributes to several Answer- modules, and consolidate output Later, session handling for multi- step dialogs will also be handled here.

8. 8 Quark: Architecture for a QA Machine Semantic Search • With content that already contains answers, FAQ and extracted from forum, the user query must only be matched with the question. • Finding important words and synonyms for „query-expansion“ can be done with GATE‘s term extractor, had to be adapted for German. • The search index of a SOLR search engine can than be enhanced by these terms. • The number of matching terms would be part of the quality criterion

9. 9 Quark: Architecture for a QA Machine Topic classification of Forum data • Together with the DAI (Distributed Artificial Intelligence) Labor of TU-Berlin we investigated the topic classification of „Telekom Hilft“ user forum • Compared „classical machine learning“ with Deep neural nets. • Both resulted in 55% accuracy rsp. 83% „one in three“ • Also investigated subclustering with DNN (4 subclusters per category) Apache Lucene preprocessing NER / disambuiguation Multinomial Naive Bayes * Zhang, Zhao, LeCun, 2015 Deep Temporal Convolutional Neural Network * Comparison

10. 10 Quark: Architecture for a QA Machine Answer Templates • We also use GATE to annotate terms in user queries, based on gazeteers. • Each string gets annotated with Part- of-Speech, lemma and NER • Disambiguation is not done yet • Via JAPE grammars a pre-defined answer (and querstion) template is determined • With this template, a SPARQL database is queried and the answer filled. • SPARQL is the query language for RDF, a W3C suggestion for semantic annotation

11. 11 Quark: Architecture for a QA Machine Summary and Outlook • We presented an architecture to answer customer questions automatically • It is based on open source technology like • GATE • SOLR • RDF/OWL • It is used to evaluate vendors, build demonstrators and test sub-modules for production • Many things are missing, e.g.: • Dialog models • Common quality-of-answer criterion across sub-modules • Machine learning of underlying ontology

12. 12 Quark: Architecture for a QA Machine Thank you for your attention!

13. QUARK BACKUP Ontology learning 13

14. QUARK BACKUP Machine categorization 14

Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

Similar to Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE (20)

More from semanticsconference

More from semanticsconference (20)

Recently uploaded

Recently uploaded (20)

Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE