AUTOMATED HELPDESK
FINAL YEAR PROJECT (7TH SEM)
SUBMITTED BY
NIKHIL PATHANIA
PARTHA PRATIM KURMI
PRANAV SHARMA
RISHABH KUMAR
SOURAV KUMAR PAUL
PRESENTATION TIMELINE
Theoretical NLP, Knowledge Base, Design
By – Pranav Sharma
Practical NLP Application, Forming of Tokens
By – Rishabh Kumar
Clustering
By – Sourav Kr Paul
TensorFlow
By – Nikhil Pathania
Query Model
By – Partha Pratim Kurmi
PROJECT TIMELINE
Problem Formulation – Sep 2016
Literature Survey – Sep–Oct 2016
Design Methodology – Nov 2016
Synchronizing Modules – Nov 2016
Basic Implementation – Jan–Feb 2017
Working Model – Mar 2017
Accuracy Improvements – Mar–Apr 2017
PROBLEM STATEMENT
Automate the tasks of customer care centers.
AIM – Build a system that answers questions like:
"How to recharge my mobile?" - PayTM
"How to pay my bills?" - PayTM
"Why is my refund not credited?" - Book My Show
TRAINING MODEL
1.1 Raw Data → 1.2 NLP → 1.3 Preprocessing → 1.4 Knowledge Base → 1.5 Clustering → O/P
INFORMATION RETRIEVAL
• Data Sources
• FAQs
• Past forum data
• Proper data extraction model
• Knowledge base
DATA EXTRACTION MODELS
WHY NLP?
NLP
• 3-step process.
• Extends with clustering.
• Fast, accurate.
PATTERN MATCHING
• 4-step process.
• No extension with clustering.
• Smaller domain.
Example
Knowledge Base - “The CEO of IBM is Samuel Palmisano.”
Query - “Who is the CEO of IBM?”
Format - Q is A
TRAINING MODEL
1.1 Raw Data → 1.2 NLP → 1.3 Preprocessing → 1.4 Knowledge Base → 1.5 Clustering → O/P
NATURAL LANGUAGE PROCESSING
• Problem Domain – English.
• Aim.
• Origin - Turing Test.
• Annotating the sentence.
• Clouds exist on Mars. => <cloud, exist, mars>
• Kernel sentences, T Expressions.
KERNEL SENTENCE, T-EXP
• Kernel Sentences.
• Ternary Expressions.
• <Subject, Relation, Object>
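As a toy illustration (not our full parser), a kernel sentence can be held as a Python tuple in the <Subject, Relation, Object> form:

    # Toy illustration: a kernel sentence stored as a ternary expression.
    # A real system would derive this from a parse, not hard-code it.
    kernel_sentence = "Clouds exist on Mars"
    t_expression = ("cloud", "exist", "mars")   # <Subject, Relation, Object>
    print(t_expression)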
AN EXAMPLE
KNOWLEDGE BASE
• What is it?
• What to store? Proper data structure.
• Mapping to original set.
• NLP Annotations, parameterized variants.
TRAINING MODEL
1.1 Raw Data → 1.2 NLP → 1.3 Preprocessing → 1.4 Knowledge Base → 1.5 Clustering → O/P
PREPROCESSING:-
• Tokenization
• Stop words removal.
• Stemming.
• POS Tagging.
NLTK ( NATURAL LANGUAGE TOOLKIT )
• Suite of libraries.
• Python Support.
• A few of the libraries we will be using :-
• Lexical analysis.
• Parts of speech tagger
TOKENIZATION:-
Tokenization (word_tokenize)
• Breaking stream into meaningful elements.
• Stream may or may not be a meaningful sentence.
EXAMPLE:-
"Recharge your mobile by visiting this link"
After tokenization:-
['Recharge', 'your', 'mobile', 'by', 'visiting', 'this', 'link']
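A minimal sketch of this step with NLTK (assuming the 'punkt' tokenizer data has been downloaded):

    import nltk
    from nltk.tokenize import word_tokenize

    nltk.download('punkt', quiet=True)   # one-time download of the tokenizer model
    tokens = word_tokenize("Recharge your mobile by visiting this link")
    print(tokens)
    # ['Recharge', 'your', 'mobile', 'by', 'visiting', 'this', 'link']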
STOP WORDS :-
E.g. “is”, “for”, “the”, “in”, etc.
Target :- REMOVE THE STOP WORDS
STOP WORDS REMOVED BY NLTK:-
EXAMPLE :-
From Tokenization
['Recharge', 'your', 'mobile', 'by', 'visiting', 'this', 'link']
After Stop Words removal
['Recharge', 'mobile', 'visiting', 'link']
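A minimal sketch of stop-word removal using NLTK's English stop-word list (assuming the 'stopwords' corpus is downloaded):

    import nltk
    from nltk.corpus import stopwords

    nltk.download('stopwords', quiet=True)
    stop_words = set(stopwords.words('english'))

    tokens = ['Recharge', 'your', 'mobile', 'by', 'visiting', 'this', 'link']
    filtered = [t for t in tokens if t.lower() not in stop_words]
    print(filtered)   # ['Recharge', 'mobile', 'visiting', 'link']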
STEMMING:-
Word = Stem + Affixes
Example:- playing = play(stem) + ing(affixes)
TARGET:- Removing affixes from word (called stemming)
E.g. plays, playing, playful all reduced to 'play'
Library in NLTK :- PorterStemmer
EXAMPLE :-
From Stop words removal :-
['Recharge', 'mobile', 'visiting', 'link']
After Stemming :-
['Recharge', 'mobile', 'visit', 'link']   // input for clustering is generated
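A short sketch with NLTK's PorterStemmer (note that the stemmer lowercases its output and may clip stems slightly differently from the example above, e.g. 'visiting' -> 'visit'):

    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    tokens = ['Recharge', 'mobile', 'visiting', 'link']
    stems = [stemmer.stem(t) for t in tokens]
    print(stems)   # e.g. 'visiting' is reduced to 'visit'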
POS TAGGING:-
POS (part of speech) = the grammatical category of a token, such as verb, noun, etc.
Target :- Tag the tokens with their POS in a universal tag format.
EXAMPLE :-
From Stemming:-
['Recharge', 'mobile', 'visit', 'link']
After POS Tagging:-
[('Recharge', 'NN'), ('mobile', 'NN'), ('visit', 'VBG'), ('link', 'NN')]
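A minimal sketch with NLTK's pos_tag (assuming the 'averaged_perceptron_tagger' data is downloaded; the exact tags may differ slightly from the slide output):

    import nltk
    from nltk import pos_tag

    nltk.download('averaged_perceptron_tagger', quiet=True)
    tokens = ['Recharge', 'mobile', 'visit', 'link']
    print(pos_tag(tokens))   # list of (token, tag) pairs, e.g. ('link', 'NN')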
TRAINING MODEL
1.1 Raw Data → 1.2 NLP → 1.3 Preprocessing → 1.4 Knowledge Base → 1.5 Clustering → O/P
DOCUMENT CLUSTERING – WHAT AND WHY?
• Unsupervised document organization
• Automatic topic organization
• Topic extraction
• Fast Information retrieval and filtering
EXAMPLES
• Web document clustering for search users.
• QA document clustering to solve common problems and questions.
WHY K-MEANS? WHY NOT A HIERARCHICAL ALGORITHM?
• Time complexity – K-means is more efficient than hierarchical clustering and still gives sufficient information for our purposes.
CLUSTERING
• Algorithm (sketched below)
• Find the k most dissimilar documents
• Assign them as the k initial centroids
• Repeat until no change:
• For each document, find its most similar cluster
• Use the cosine similarity function
• Recalculate the centroid of each cluster
• Stop if no document was reassigned
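A compact sketch of the loop above, assuming the documents are already TF-IDF vectors (numpy arrays) and the initial centroids have been chosen; this is an illustration, not our production code:

    import numpy as np

    def cosine_sim(a, b):
        # cosine similarity, guarding against zero vectors
        denom = (np.linalg.norm(a) * np.linalg.norm(b)) or 1.0
        return float(np.dot(a, b)) / denom

    def kmeans(docs, centroids, max_iter=100):
        for _ in range(max_iter):
            # assign each document to its most similar centroid
            labels = [max(range(len(centroids)),
                          key=lambda k: cosine_sim(d, centroids[k]))
                      for d in docs]
            # recalculate each centroid as the mean of its cluster
            new_centroids = []
            for k in range(len(centroids)):
                members = [d for d, l in zip(docs, labels) if l == k]
                new_centroids.append(np.mean(members, axis=0) if members else centroids[k])
            # stop if no centroid (hence no assignment) changed
            if all(np.allclose(c, n) for c, n in zip(centroids, new_centroids)):
                break
            centroids = new_centroids
        return labels, centroids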
K-MEANS USING JACCARD DISTANCE MEASURE
• Problems in the simple K-Means procedure:
• Greedy algorithm
• Doesn't guarantee the best solution.
• JACCARD distance measure
• Find the k most dissimilar documents (sketched below).
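A small sketch of seed selection using the Jaccard distance on token sets (a greedy farthest-first pick, as an illustration):

    def jaccard_distance(a, b):
        a, b = set(a), set(b)
        return 1.0 - len(a & b) / (len(a | b) or 1)

    def pick_seeds(token_docs, k):
        seeds = [0]                       # start from the first document
        while len(seeds) < k:
            # greedily add the document farthest (on average) from the chosen seeds
            rest = [i for i in range(len(token_docs)) if i not in seeds]
            nxt = max(rest, key=lambda i: sum(jaccard_distance(token_docs[i], token_docs[s])
                                              for s in seeds))
            seeds.append(nxt)
        return seeds

    docs = [["recharge", "mobile", "visit", "link"],
            ["recharge", "landline", "visit", "link"],
            ["cancel", "ticket", "process"],
            ["add", "money", "wallet"]]
    print(pick_seeds(docs, 3))            # -> [0, 2, 3], matching {{0},{2},{3}}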
OUTPUT OF PREPROCESSING
• Possible text documents are :
• Recharge mobile visit link
• Recharge landline visit link
• Cancel ticket process
• Add money wallet
CALCULATING TF-IDF VECTORS
• Term Frequency – Inverse Document Frequency
• A weight that ranks the importance of a term
• Terms frequent in a document but rare across the document set get high weight
• Ex: "College name NITS" – 'name' is frequent but not rare, so it carries little weight
TF-IDF VECTOR SPACE
Doc      Add    Cancel   Recharge   landline   link   mobile   money   process   ticket   visit   wallet
Doc 0    0.00   0.00     0.17       0.00       0.17   0.35     0.00    0.00      0.00     0.17    0.00
Doc 1    0.00   0.00     0.17       0.35       0.17   0.00     0.00    0.00      0.00     0.17    0.00
Doc 2    0.00   0.46     0.00       0.00       0.00   0.00     0.00    0.46      0.46     0.00    0.00
Doc 3    0.46   0.00     0.00       0.00       0.00   0.00     0.46    0.00      0.00     0.00    0.46
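A short sketch of computing such a TF-IDF matrix with scikit-learn (the exact numbers differ from the table above because scikit-learn uses a slightly different TF-IDF formula and normalisation):

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["recharge mobile visit link",
            "recharge landline visit link",
            "cancel ticket process",
            "add money wallet"]

    vec = TfidfVectorizer()
    tfidf = vec.fit_transform(docs)            # 4 x 11 document-term matrix
    print(vec.get_feature_names_out())         # the 11 terms
    print(tfidf.toarray().round(2))            # one TF-IDF vector per document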
SELECT K CLUSTERS (K = 3)
• Use the Jaccard distance measure – seed documents {{0},{2},{3}}
Document i   Document j   Similarity
0            1            0.6
0            2            0.00
0            3            0.00
1            2            0.00
1            3            0.00
2            3            0.00
AFTER FIRST ITERATION
• Assign each document to its most similar cluster – {{0,1},{2},{3}}
• Centroid centers after the 1st iteration:
Cluster   Add    Cancel   Recharge   landline   link   mobile   money   process   ticket   visit   wallet
{0,1}     0.00   0.00     0.17       0.17       0.17   0.17     0.00    0.00      0.00     0.17    0.00
{2}       0.00   0.46     0.00       0.00       0.00   0.00     0.00    0.46      0.46     0.00    0.00
{3}       0.46   0.00     0.00       0.00       0.00   0.00     0.46    0.00      0.00     0.00    0.46
CLUSTERING OUTPUT
• { { Recharge mobile visit link, Recharge landline visit link },
{ Cancel ticket process },
{ Add money wallet }
}
TRAINING MODEL
1.1 Raw Data → 1.2 NLP → 1.3 Preprocessing → 1.4 Knowledge Base → 1.5 Clustering → O/P
TENSORFLOW
• What
• Why
• Where
PROGRAMMING MODEL AND BASIC CONCEPTS
• Computation Graph
• Nodes
• Tensors
• Session
• Extend
• Run
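A minimal TensorFlow 1.x-style sketch (the API current at the time of this project) showing a computation graph being built and then executed through a session:

    import tensorflow as tf   # TensorFlow 1.x graph/session API

    # Nodes (operations) are added to a computation graph; tensors flow along its edges.
    a = tf.constant(2.0, name="a")
    b = tf.constant(3.0, name="b")
    c = tf.add(a, b, name="c")        # a node with two inputs and one output tensor

    # The client creates a Session and calls run() with the outputs it needs computed.
    with tf.Session() as sess:
        print(sess.run(c))            # -> 5.0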
COMPUTATION GRAPH
IMPLEMENTATION
• Single Device Execution
• Multi Device Execution
• Cross Device Communication
SINGLE DEVICE EXECUTION
CROSS DEVICE COMMUNICATION
PERFORMANCE
• Data Parallel Training
• Model Parallel Training
• Concurrent Step for Model Computation Pipelining
DATA PARALLEL TRAINING
MODEL PARALLEL AND CONCURRENT STEPS
CLUSTERING USING TENSORFLOW
• Training Sets
• Nodes
• Data flow
• Feed as Input
• Output
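A small TensorFlow 1.x-style sketch of one assignment step: TF-IDF vectors are fed as input, the graph computes cosine similarities against the current centroids, and the output is a cluster index per document (the matrices here are random stand-ins, not our real data):

    import numpy as np
    import tensorflow as tf

    doc_matrix = np.random.rand(4, 11).astype("float32")    # stand-in for TF-IDF vectors
    seed_matrix = doc_matrix[[0, 2, 3]]                      # stand-in for k = 3 centroids

    docs = tf.placeholder(tf.float32, shape=[None, 11], name="docs")
    centroids = tf.placeholder(tf.float32, shape=[None, 11], name="centroids")

    # cosine similarity = dot product of L2-normalised rows
    similarity = tf.matmul(tf.nn.l2_normalize(docs, 1),
                           tf.nn.l2_normalize(centroids, 1),
                           transpose_b=True)
    assignments = tf.argmax(similarity, 1)                   # most similar cluster per doc

    with tf.Session() as sess:
        labels = sess.run(assignments,
                          feed_dict={docs: doc_matrix, centroids: seed_matrix})
        print(labels)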
QUERY MODEL
2.1 Query → 2.2 NLP → 2.3 Preprocessing → 2.4 Recommendation Engine → O/P
RECOMMENDATION ENGINE
• The Recommendation Engine analyzes the available data to answer the question
• The various steps are:
1. Data collection
2. Preprocessing and Transformations
3. Classifier Ensemble
PREPROCESSING AND TRANSFORMATIONS
• The training set consists of FAQs, past forum data, etc.
• Given a question, we want to deduce its genre from the text
• Only the text of the question is extracted.
• Feature selection evaluates the importance of a word using TF-IDF
PREPROCESSING AND TRANSFORMATIONS
• Training set derived from the key parts of speech in each sentence
Example:          How to recharge my mobile
Part of speech:   Verb, Noun, Object
Decision label:   Task, Electronics
PREPROCESSING AND TRANSFORMATIONS
• Preprocessed query: "recharge mobile"
• Find its TF-IDF vector
• Compare it with the distinct clusters using cosine similarity (sketched below)
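A minimal sketch of this comparison using scikit-learn (the cluster texts and variable names are illustrative):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # one representative "document" per cluster (illustrative)
    cluster_texts = ["recharge mobile visit link recharge landline visit link",
                     "cancel ticket process",
                     "add money wallet"]

    vec = TfidfVectorizer().fit(cluster_texts)
    centroids = vec.transform(cluster_texts)

    query_vec = vec.transform(["recharge mobile"])     # the preprocessed query
    scores = cosine_similarity(query_vec, centroids)[0]
    print(scores.argmax(), scores)                      # index of the most similar cluster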
CLASSIFIER ENSEMBLE
• Ensemble modelling is used for classification, combining three classifiers:
• Naïve Bayesian using FAQ training set
• POS Naïve Bayesian
• Threshold Biasing classifier
ENSEMBLE STRUCTURE
• Learning algorithm that uses multiple classifiers
• Classify using a weighted vote for their decisions
• The classifier with better precision carries more weight in the vote (sketched below)
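A hedged sketch of the weighted vote (the classifier objects and weights are placeholders, not our trained models):

    from collections import Counter

    def ensemble_predict(classifiers, weights, question):
        """Weighted majority vote over the genre predictions of several classifiers."""
        votes = Counter()
        for clf, weight in zip(classifiers, weights):
            votes[clf.predict(question)] += weight    # weight reflects classifier precision
        return votes.most_common(1)[0][0]             # genre with the highest tally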
RESULTS
• Documents are hand-tagged with the genres
• In the ensemble approach, we use a bag of the predicted genres
• The count of each genre is tallied
• The top-tallied genre is used to generate the result
• Answer is "recharge mobile visit link"
QUERY MODEL
2.1 Query → 2.2 NLP → 2.3 Preprocessing → 2.4 Recommendation Engine → O/P
INNOVATION
• Sections Removed
• User friendly
• Reduced manpower
• Future plans to integrate with the college website.
CONCLUSION AND OUTCOMES
The outcomes of this project include (but are not limited to) the following points :-
1. Complete Designed Architecture.
2. Proper modules and uses defined.
3. Model solution to the problem.
Hence we conclude that the theoretical and survey aspects of the problem are complete. We have selected the most suitable technical solutions after surveying the existing alternatives. A working model is therefore expected from the team soon.
LITERATURE SURVEY
1. Natural Language Annotations for Question Answering – Boris Katz, Gary Borchardt, Sue Felshin
2. Using English for Indexing and Retrieving – Boris Katz
3. Recommendation engine: Matching individual/group profiles for better shopping experience – Sanjeev Kulkarni, Ashok M. Sanpal, Ravindra R. Mudholkar, Kiran Kumari
4. Recommendation engine for Reddit – Hoang Nguyen, Rachel Richards, C. C. Chan, Kathy J. Liszka
5. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems – Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo
6. Executing a program on the MIT tagged-token dataflow architecture. IEEE Trans. Comput., 1990 – Arvind, Rishiyur S. Nikhil
7. An efficient K-Means Algorithm integrated with Jaccard Distance Measure for Document Clustering – Mushfeq-Us-Saleheen Shameem, Raihana Ferdous
8. An Intelligent Similarity Measure for Effective Text Document Clustering – M. L. Aishwarya, K. Selvi
9. K Means Clustering with Tf-idf Weights – Jonathan Zong
10. Comparison Between K-Mean and Hierarchical Algorithm Using Query Redirection – Manpreet Kaur, Usvir Kaur
11. Question Answering System on Education Acts Using NLP Techniques – Dr. M. M. Raghuwanshi
12. Affective – Hierarchical Classification of Text – An Approach Using NLP Toolkit – Dr. R. Venkatesan
13. Building high-level features using large scale unsupervised learning. In ICML 2012 – Quoc Le, Marc'Aurelio Ranzato, Rajat Monga, Andrew Ng
14. Preprocessing Techniques for Text Mining – An Overview – Dr. S. Vijayarani, Ms. J. Ilamathi, Ms. Nithya
THANK YOU !!


Editor's Notes

  • #28 For example, as the amount of online information increases rapidly, users as well as information-retrieval systems need to classify the desired documents against a specific query – web document clustering for search users; QA document clustering to solve common problems and questions.
  • #30 Hierarchical algorithms include single link, complete linkage, and group average. Applications are online and offline; online applications are usually constrained by efficiency problems compared to offline applications. Hierarchical algorithms produce more in-depth information for detailed analyses, while K-means is more efficient and provides sufficient information for most purposes.
  • #34 TF is the ratio of the number of occurrences of a word in its document to the total number of words in that document, i.e. the fraction of the document that is a particular term. IDF is the ratio of the number of documents in the corpus to the number of documents containing the given term; inverting the document frequency and taking the logarithm assigns a higher weight to rarer terms. Ex: "College name NITS" – "name" is frequent but not rare.
  • #40 Where – used for conducting research and for deploying ML into production. Wide range of applications in fields like NLP, recommendation engines, geographical information extraction and computational drug discovery.
  • #41 Nodes are instantiations of operations (multiple inputs and outputs); a tensor is an arbitrary-dimensional array (flowing from outputs to inputs). Client programs interact with TensorFlow by creating a session. Initially there are no nodes and edges. The session interface supports Extend and Run: Extend augments the graph with nodes and edges, and Run takes the set of output names that need to be computed.
  • #42 A Variable is a special kind of operation that returns a handle to a persistent, mutable tensor that survives across executions of the graph.
  • #43 Single device: nodes of the graph are executed in an order that respects the dependencies between the nodes. Multi device: deciding which device to place the computation of each node of the graph on, and managing communication of data across device boundaries.
  • #45 Feasibility of a device: a greedy heuristic chooses the one that gives the best result, and the kernel must implement the particular operation. Any cross-device edge from x to y is replaced by a send node and a receive node.
  • #46 Data parallel training: one simple technique for speeding up SGD is to parallelize the computation of the gradient for a mini-batch across the mini-batch elements.