Deep learning enabled Question Answering models
PROJECT WORK PRESENTATION
Saurabh Saxena
2015HT12604
Introduction
DEEP LEARNING AND QUESTION ANSWERING SYSTEMS
Deep learning
 What is Deep learning?
Deep learning is a new area of machine learning research that uses multi-
layered artificial neural networks. The objective is to learn multiple levels of
representation and abstraction that help make sense of data such as images,
sound, and text. It is becoming increasingly relevant for three
key reasons:
 An infinitely flexible function – universal function approximation via Neural networks
 All-purpose parameter fitting – using gradient descent and its variants
 Fast and scalable – availability of cheap GPUs for fast matrix multiplications
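To ground the second point, here is a toy illustration (plain NumPy, arbitrary synthetic data, not from the project) of fitting parameters by gradient descent:

```python
import numpy as np

# Toy data: y = 3x + 1 plus noise; recover w and b by gradient descent
rng = np.random.RandomState(0)
x = rng.randn(100)
y = 3.0 * x + 1.0 + 0.1 * rng.randn(100)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(200):
    err = w * x + b - y            # prediction error per sample
    w -= lr * (err * x).mean()     # gradient of the squared error w.r.t. w
    b -= lr * err.mean()           # gradient w.r.t. b
print(round(w, 2), round(b, 2))    # converges near 3.0 and 1.0
```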
 Typical applications of Deep learning
 Convolutional Neural Networks (CNNs) in computer vision and machine translation
 Recurrent Neural Networks (RNNs) such as LSTM/GRU in language modeling
 Tree Neural Networks (TNNs) in sentiment analysis
 Reinforcement learning in Game playing and intelligent agents
Basic Building blocks of deep learning
Most DL networks (including question answering models) are composed of
these basic building blocks (a minimal sketch of how they compose follows the list):
• Fully Connected Network
• Word Embedding
• Convolutional Neural Network
• Recurrent Neural Network
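As a quick illustration of how these blocks stack, here is a minimal Keras sketch (Keras being the library used later in this report); the vocabulary and layer sizes are placeholder values, not the project's actual configuration:

```python
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, Dense

# Placeholder hyper-parameters, chosen only for illustration
vocab_size, embed_dim, seq_len = 10000, 64, 100

model = Sequential()
model.add(Embedding(vocab_size, embed_dim, input_length=seq_len))  # word embedding
model.add(Conv1D(32, 3, activation='relu'))   # convolution over local n-grams
model.add(MaxPooling1D(2))
model.add(LSTM(32))                           # recurrent summary of the sequence
model.add(Dense(vocab_size, activation='softmax'))  # fully connected output
model.compile(optimizer='adam', loss='categorical_crossentropy')
```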
General Architecture of a Deep model
 What is a Question Answering System?
The basic idea of an automated QA system is to extract information from
documents and, given a user query, provide a short and concise answer that
meets the user's information needs.
 Traditional QA systems are basically of 2 types:
 Information Retrieval (IR) based QA – match-and-ranking based broad-domain
QA using mostly unstructured data, e.g., search engines
 Knowledge-based (KB) QA – semantic representation of the query over structured
data such as triple stores or SQL, e.g., Freebase, DBpedia, and Wolfram Alpha
 Question types
 Factoid questions – DeepMind CNN/DailyMail dataset
 Cloze-style questions – MCTest dataset and bAbI
 Open-domain question answering – WikiQA and LAMBADA
QA systems
QA scenarios
Motivations – What deep learning can do for QA systems?
 The traditional QA pipeline relies heavily on manual feature engineering; the aim of
deep learning models is to eliminate this.
 The goal is to build systems that can directly read documents and then answer
questions based on those documents.
 RNNs have been successful in language modeling and generation but have not
achieved much success in QA, because they cannot store enough context in their
hidden states. Answering complex questions requires supporting facts
from far back in the past.
 RNNs also suffer from the vanishing-gradient problem if too many time-steps are used.
 Solution – incorporate explicit memory in the model, along with a way to address
that memory for reads and writes.
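To make the read-addressing idea concrete, here is a minimal NumPy sketch (illustrative only, not the project code) of soft, attention-based memory reading: the query scores every memory slot, the scores are softmax-normalized, and the result is a weighted sum of the slots:

```python
import numpy as np

def read_memory(memory, query):
    """Soft attention read: memory is (slots, dim), query is (dim,)."""
    scores = memory @ query                  # dot-product relevance per slot
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory                  # weighted sum over memory slots

memory = np.random.randn(10, 32)   # 10 memory slots, 32-dim each
query = np.random.randn(32)
context = read_memory(memory, query)  # 32-dim blend of the relevant slots
```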
Memory networks for QA
AND THEIR VARIANTS
What are Memory Networks?
 A class of models that combine a large memory with a learning component that
can read from and write to it.
 Incorporate reasoning with attention over memory (RAM).
 Most ML models have limited memory, which is more or less all that is needed for
"low-level" tasks, e.g., object detection.
 Long-term memory is required to read a story and then, e.g., answer
questions about it.
 It is also required for dialog: to remember previous dialog (short- and
long-term) and respond.
 The models are scalable – they can store and read large amounts of data in memory,
e.g., an entire KB.
All MemNNs have four component networks (which may or
may not have shared parameters):
 I: (input feature map) converts incoming data to the internal feature
representation.
 G: (generalization) updates memories given new input.
 O: produces new output (in feature-representation space) given the
memories.
 R: (response) converts the output of O into the response seen by the outside world.
Step 1: the controller converts incoming data to an internal feature representation (I).
Step 2: the write head updates the memories and writes the data into memory (G).
Step 3: given the external input, the read head reads the memory and fetches relevant data (O).
Step 4: the controller combines the external data with the memory contents returned by the read head to generate the output (O, R).
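Putting the four steps together, the control loop can be sketched as a small Python class. This is a hypothetical illustration of the I-G-O-R decomposition (simple slot writes plus the soft attention read from the earlier sketch), not the actual MemNN implementation:

```python
import numpy as np

class MemoryNetworkSketch:
    def __init__(self, slots, dim):
        self.memory = np.zeros((slots, dim))
        self.next_slot = 0

    def I(self, x):
        # Input feature map: embed raw input into the internal feature space.
        # For this sketch, inputs are assumed to be feature vectors already.
        return x

    def G(self, feat):
        # Generalization: here, simply write the new fact into the next slot.
        self.memory[self.next_slot % len(self.memory)] = feat
        self.next_slot += 1

    def O(self, q):
        # Output: soft attention read over memory given the question features.
        scores = self.memory @ q
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ self.memory

    def R(self, o):
        # Response: decode output features into an answer; a real model would
        # map this vector to a word via a softmax over the vocabulary.
        return o

    def answer(self, facts, question):
        for f in facts:                           # steps 1-2: encode and store
            self.G(self.I(f))
        return self.R(self.O(self.I(question)))  # steps 3-4: read and respond
```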
State-of-the-art Memory Networks
Datasets to train Deep QA models
BABI, LAMBADA, MCTEST AND MORE…
Datasets available to train/test QA models
 Facebook bAbI – A set of 20 tasks for testing text understanding
and reasoning. For each task, there are 10,000 questions for training and 1,000 for
testing. Each task tests the machine on a specific skill set. (The task file format is
illustrated after this list.)
https://research.fb.com/downloads/babi/
 Facebook bAbI Children's Book Test (CBT) – Text passages and corresponding
questions drawn from Project Gutenberg children's books. 669,343 training
questions, 8,000 dev questions, and 10,000 test questions.
 MCTest – Consists of 500 stories and 2,000 questions. Because the stories are
fictional, the answer can typically be found only in the story itself. Requires machines
to answer multiple-choice reading-comprehension questions about fictional stories,
directly tackling the high-level goal of open-domain machine comprehension.
http://research.microsoft.com/en-us/um/redmond/projects/mctest/
 Language Modeling Broadened to Account for Discourse Aspects (LAMBADA) –
Consists of 10,022 passages, divided into 4,869 development and 5,153
test passages (extracted from 1,331 and 1,332 disjoint novels, respectively). The
average passage consists of 4.6 sentences of context plus 1 target sentence, for
a total length of 75.4 tokens (dev) / 75 tokens (test).
http://clic.cimec.unitn.it/lambada/
 DeepMind CNN and DailyMail dataset – Collection of news articles and
corresponding cloze queries. Each dataset contains many documents (90k and 197k,
respectively), and each document has approximately 4 questions on average. Each
question is a sentence with one missing word/phrase that can be found in the
accompanying document/context.
http://cs.nyu.edu/~kcho/DMQA/
 Stanford Question Answering Dataset (SQuAD) – Reading-comprehension dataset
consisting of questions posed by crowdworkers on a set of Wikipedia articles. The
answer to every question is a segment of text, or span, from the corresponding
reading passage. There are 100,000+ question-answer pairs on 500+ articles.
https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/
 AI2 Science Exams – Elementary science questions from US state and regional
science exams. 170 multi-state and 108 4th-grade questions.
http://allenai.org/data/science-exam-questions.html
 WikiQA – 3,047 questions sampled from Bing query logs. Each question is associated
with a Wikipedia page, and all sentences in the summary paragraph of that page
become the candidate answers. Only about a third of the questions have a correct
answer in the candidate answer set.
https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
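As referenced in the bAbI entry above, bAbI task files are plain text: each line starts with a sentence number, and question lines additionally carry a tab-separated answer and the IDs of the supporting facts. A minimal parser sketch (the file name in the usage comment is a placeholder that depends on the downloaded release):

```python
def parse_babi(path):
    """Yield (story_sentences, question, answer, supporting_ids) tuples."""
    stories, story = [], []
    with open(path) as f:
        for line in f:
            idx, text = line.strip().split(' ', 1)
            if int(idx) == 1:     # sentence numbering restarts for each story
                story = []
            if '\t' in text:      # question line: question \t answer \t ids
                question, answer, support = text.split('\t')
                stories.append((list(story), question.strip(), answer,
                                [int(i) for i in support.split()]))
            else:
                story.append(text)
    return stories

# Hypothetical usage:
# tasks = parse_babi('tasks_1-20_v1-2/en/qa1_single-supporting-fact_train.txt')
```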
Facebook bAbI dataset – 20 tasks
• Single supporting fact
• Two supporting facts
• Three supporting facts
• Two argument relations
• Three argument relations
• Yes/No questions
• Counting
• Lists/sets
• Simple Negation
• Indefinite Knowledge
• Basic Coreference
• Conjunction
• Compound Coreference
• Time Reasoning
• Basic Deduction
• Basic Induction
• Positional Reasoning
• Size Reasoning
• Path Finding
• Agent’s Motivations
The 20 tasks in brief…
End-to-End MemNN
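Since the end-to-end memory network is the model trained in this project, a condensed sketch of its attention mechanism is given below. It loosely follows the publicly available Keras bAbI memory-network example; the vocabulary and length constants are placeholders rather than the exact values used here:

```python
from keras.models import Model, Sequential
from keras.layers import Input, Embedding, Dropout, Activation, Dense
from keras.layers import Permute, add, dot, concatenate, LSTM

vocab_size, story_maxlen, query_maxlen = 22, 68, 4  # placeholder sizes

story = Input((story_maxlen,))
question = Input((query_maxlen,))

# Memory encoders: one embedding for addressing (m), one for content (c)
encoder_m = Sequential([Embedding(vocab_size, 64), Dropout(0.3)])
encoder_c = Sequential([Embedding(vocab_size, query_maxlen), Dropout(0.3)])
encoder_q = Sequential([Embedding(vocab_size, 64, input_length=query_maxlen),
                        Dropout(0.3)])

m, c, q = encoder_m(story), encoder_c(story), encoder_q(question)

# Attention: match each story position against the question
match = Activation('softmax')(dot([m, q], axes=(2, 2)))
response = Permute((2, 1))(add([match, c]))

# Combine the memory response with the question and decode the answer word
answer = LSTM(32)(concatenate([response, q]))
answer = Dense(vocab_size, activation='softmax')(Dropout(0.3)(answer))

model = Model([story, question], answer)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
```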
Dynamic MemNN
Key-value MemNN Architecture
Experimental Setup to train deep models
GPU, THEANO, KERAS, CUDA, CUDNN AND MORE…
Component – Description
Operating System – Ubuntu 16.04 VM on an Intel octa-core CPU with 6.5 GB RAM
Graphics Card – NVIDIA Tesla K80 with 12 GB RAM and 2,496 CUDA cores
Graphics Toolkit – CUDA 8.0 with cuDNN 6.0
Python Package Manager – Anaconda (Continuum Analytics) for Python 2.7
Deep learning library – Keras v2.0.2 with Theano v0.9.0 backend
Other Python modules:
 bcolz v1.0.0 for fast saving/loading of trained weights
 NumPy v1.12.1 for all multi-dimensional numeric manipulations
 scikit-learn v0.18.1 for preprocessing, pipelining, feature extraction, decomposition, dataset
splits, and all general non-deep machine learning algorithms
 cPickle for saving models
 NLTK toolkit for traditional linguistic tasks
 Matplotlib v2.0.0 for visualizing data
 pydot v1.0.28 and Graphviz v2.38.0 for visualizing deep models
 OpenBLAS v0.2.19 for fast linear algebra operations
 Pandas v0.19.2 for structured data manipulation
 Protobuf v3.0.0 for protocol buffering
 Flask v0.12 for web display
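For reproducibility, note that Keras picks its backend from the KERAS_BACKEND environment variable (or from ~/.keras/keras.json). A small sanity-check snippet, assuming the versions listed above are installed:

```python
import os
os.environ['KERAS_BACKEND'] = 'theano'  # must be set before importing keras

import keras                     # prints "Using Theano backend."
import theano

print(keras.__version__)         # expected: 2.0.2 in this setup
print(theano.__version__)        # expected: 0.9.0
print(keras.backend.backend())   # 'theano'
```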
Experimental setup in Google Cloud
Compute Engine setup in Google Cloud
GPU details
Training Summary
MODELS, TEST ACCURACY AND MORE…
Model summary for bAbI Task #1
Training summary for bAbI Task #1 – one supporting fact
Training summary for bAbI Task #2 – two supporting facts
Joint training on all 20 tasks simultaneously
Demo on bAbI tasks – Correct answers
Demo – Incorrect answer
Future work
 Train the Dynamic Memory Network on the bAbI dataset
 Train the Key-Value Memory Network on the bAbI dataset
 Evaluate the performance of the current models on other datasets such as
LAMBADA and Stanford SQuAD
 Explore transfer learning, so that models trained on open-source
datasets can be applied to corporate datasets with only fine-tuning
 Explore the use of trained models in dialog modeling for helpdesk
question answering
Thanks
