ExcelR Convers
PRESENTED BY
SAYAN MONDAL
GUIDED BY
MR. BYOM
Business Problem
Extract actionable insights from the chat transcripts
that will help with:
•   Business improvement
•   Easy connectivity
•   Reduced human dependency
•   Effective connectivity
•   24/7 service
Objective
Topic mining and exploratory analysis to support better
resource allocation, content modification and
service improvement.
Data Set
•   90 days of conversational data
•   Semi-structured data
PROJECT ARCHITECTURE / PROJECT FLOW
START WITH BUSINESS PROBLEM
CLEAN THE DATA
PERFORM EDA
APPLY MACHINE LEARNING TECHNIQUES
INSIGHTS
DATA INSIGHTS FROM DIFFERENT VARIABLES
TIMESTAMP
From the timestamp we can see which day of the week and
which time slot bring in the most chats.
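A minimal sketch of the timestamp analysis, using only the Python standard library. The sample timestamps and their format are assumptions for illustration; the slides do not show the actual column layout.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps; the real format is not shown in the slides.
timestamps = [
    "2019-03-04 10:15:00",  # Monday
    "2019-03-04 10:45:00",  # Monday
    "2019-03-05 18:05:00",  # Tuesday
    "2019-03-11 10:30:00",  # Monday
]

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

parsed = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps]

# Count chats per weekday and per hour of the day.
chats_per_day = Counter(DAY_NAMES[dt.weekday()] for dt in parsed)
chats_per_hour = Counter(dt.hour for dt in parsed)

# The most common entry in each counter is the busiest day / time slot.
busiest_day, _ = chats_per_day.most_common(1)[0]
busiest_hour, _ = chats_per_hour.most_common(1)[0]
```

On real data the same two counters could feed the day-wise and hour-wise bar charts.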
UNREAD
True/False: this flag gives an idea of how many people
visited the website and how many of them are interested.
VISITOR EMAIL
Could be used for marketing purposes,
e.g. special-occasion discounts.
COUNTRY NAME/REGION/CITY
From this we could do a geographic segmentation.
AGE
If age were added as an input, we could also do a demographic
analysis.
CHAT
From the chat start and end times we get an idea of how
engaged the prospect is.
At the times when chat volume is high we can engage more
executives, and give them breaks in rotation when chat
volume is low.
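A rough sketch of how chat duration and hourly staffing could be derived from start and end times. The sample chats, the timestamp format and the capacity of two chats per executive per hour are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime
from math import ceil

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format

# Hypothetical (start, end) pairs for four chats.
chats = [
    ("2019-03-04 10:00:00", "2019-03-04 10:12:00"),
    ("2019-03-04 10:05:00", "2019-03-04 10:35:00"),
    ("2019-03-04 10:40:00", "2019-03-04 10:43:00"),
    ("2019-03-04 15:00:00", "2019-03-04 15:20:00"),
]

# Assumed capacity: how many chats one executive can handle in an hour.
CHATS_PER_EXECUTIVE_PER_HOUR = 2


def duration_minutes(start, end):
    """Chat length in minutes, a simple proxy for engagement."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


def executives_needed(chat_pairs):
    """Map each hour of the day to the number of executives required."""
    per_hour = Counter(datetime.strptime(s, FMT).hour for s, _ in chat_pairs)
    return {hour: ceil(n / CHATS_PER_EXECUTIVE_PER_HOUR)
            for hour, n in sorted(per_hour.items())}
```

Hours with a low count in the result are natural candidates for scheduled breaks.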
The chats show what prospects are looking for, e.g. digital
marketing, deep learning, blockchain. If a topic is not
offered but demand for it is high, we could consider
adding it to our curriculum.
If a prospect goes deep into technical details, there
should be an option to hand the chat over to a
technical person.
We can also do sentiment analysis of the chat data and
add one more feature to our data as a rating.
We can then say that if the score is above 6, the visitor
may be a potential customer who is more interested in
taking a course at our institute.
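One possible way to turn sentiment into a rating is sketched below. The word lists, the 0 to 10 scale and the neutral default of 5 are assumptions; the slides only state the threshold of 6, not how the score is computed.

```python
# Tiny hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"interested", "great", "thanks", "enroll", "good"}
NEGATIVE = {"expensive", "bad", "cancel", "unhappy"}


def rating(chat_text):
    """Score a chat on an assumed 0-10 scale from positive/negative word counts."""
    words = chat_text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 5.0  # no sentiment words found: treat as neutral
    return 10.0 * pos / (pos + neg)


def is_potential_customer(chat_text, threshold=6.0):
    """Flag the visitor as a potential customer when the rating exceeds the threshold."""
    return rating(chat_text) > threshold
```

A trained sentiment model could replace the lexicon while keeping the same threshold rule.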
DATA MERGE
We wanted to merge the text data from all the
text files in a directory.
So far we are able to get the data from all files into
one output file.
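The merge step can be sketched with the standard library as follows. The `*.txt` pattern, UTF-8 encoding and the function name `merge_text_files` are assumptions, since the slides do not show the actual script.

```python
from pathlib import Path


def merge_text_files(input_dir, output_file):
    """Concatenate every .txt file in input_dir into output_file.

    Returns the number of files merged. Files are processed in sorted
    order so that the result is reproducible.
    """
    input_dir = Path(input_dir)
    output_file = Path(output_file)
    # Skip the output file itself in case it lives in the same directory.
    sources = sorted(
        p for p in input_dir.glob("*.txt")
        if p.resolve() != output_file.resolve()
    )
    with output_file.open("w", encoding="utf-8") as out:
        for path in sources:
            text = path.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep transcripts from running together
    return len(sources)
```

A call such as `merge_text_files("chats", "all_chats.out")` (hypothetical paths) would produce the single output file described above.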
DATA PARSING
Chat Data Analysis & Topic Modeling Using LDA
Data Pre-Processing
•   Converting all text to lowercase
•   Removing punctuation from the text
•   Removing all stopwords
•   Lemmatization
•   Removing special words, e.g. "id", "okay"
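The pre-processing steps listed above can be sketched as a single function. The stopword list and the lemma dictionary here are tiny hypothetical stand-ins; a real pipeline would normally take them from an NLP library, which the slides do not name.

```python
import string

# Small illustrative word lists (assumptions, not the project's actual lists).
STOPWORDS = {"the", "is", "a", "an", "i", "to", "and", "for", "of", "are", "about"}
SPECIAL_WORDS = {"id", "okay"}
LEMMAS = {"courses": "course", "fees": "fee", "details": "detail"}

# Translation table that turns every punctuation character into a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Apply the five cleaning steps and return a list of tokens."""
    text = text.lower()                                      # 1. lowercase
    text = text.translate(_PUNCT_TABLE)                      # 2. remove punctuation
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOPWORDS]       # 3. remove stopwords
    tokens = [LEMMAS.get(t, t) for t in tokens]              # 4. lemmatize (toy lookup)
    tokens = [t for t in tokens if t not in SPECIAL_WORDS]   # 5. remove special words
    return tokens
```

For example, `preprocess("Okay, I need the details about Data Science courses & fees!")` reduces the message to its content words.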
Most Visits by Day
Platforms Used Outside India
Month-Wise Visits
Number of People Actively Reading Messages
Most Visits by Region
Platforms Used Worldwide
Most Visitors Apart From India
Word Cloud of Complete Corpus
Negative Word Cloud
Positive Word Cloud
Top 50 Frequent Words
[('get', 27293), ('data', 25215), ('science', 23305), ('end', 17841), ('location', 17710),
('support', 17066), ('number', 15972), ('information', 14559), ('quick', 14239), ('request',
11318), ('see', 11115), ('month', 11112), ('name', 11041), ('exploring', 11033),
('elearning', 10408), ('discount', 9830), ('exclusive', 9612), ('namecontact', 9606), ('like',
9551), ('time', 9505), ('access', 9502), ('detail', 9297), ('offer', 9151), ('enroll', 9029),
('money', 8979), ('love', 8971), ('special', 8966), ('save', 8962), ('life', 8805), ('contact',
8724), ('placement', 8620), ('region', 8333), ('project', 8284), ('please', 7989), ('city',
7772), ('group', 7772), ('clarification', 7743), ('whats', 7605), ('app', 7459), ('interview',
7260), ('doubt', 6925), ('live', 6722), ('call', 6493), ('student', 6453), ('preparation', 6444),
('25', 6316), ('forum', 6248), ('know', 5148), ('fee', 4187), ('pmp', 4148)]
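A list of (word, count) pairs like this one has the shape that `collections.Counter.most_common` returns. The sketch below assumes the chats are already tokenized; the function name `top_words` is hypothetical.

```python
from collections import Counter


def top_words(token_lists, n=50):
    """Return the n most frequent tokens across all chats as (word, count) pairs."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts.most_common(n)
```

Tokens such as 'namecontact' or '25' in the output would be candidates for the special-word list in the cleaning step.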
Bigram
Trigram
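Bigram and trigram counts can be produced from tokenized chats with a short helper. This is a minimal sketch; the names `ngrams` and `top_ngrams` are illustrative and not taken from the project.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(token_lists, n, k=10):
    """Return the k most frequent n-grams across all chats."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)
```

Calling `top_ngrams(chats, 2)` gives the bigram ranking and `top_ngrams(chats, 3)` the trigram ranking. N-grams are built per chat so that they never span two conversations.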
Sentiment Analysis
Positive Negative
Topic Modeling Using LDA
Topic 1 - Course inquiry
Topic 2 - Career transformation
Topic 3 - Assistance
Topic 4 - E-learning and discount
Model perplexity: -5.46
Coherence score: 0.59
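A true perplexity cannot be negative, so a value like -5.46 is most likely a per-word log-likelihood bound. gensim's `LdaModel.log_perplexity`, for example, reports such a bound and derives its perplexity estimate as 2 raised to the negative bound. The slides do not name the library, so treating the figure this way is an assumption.

```python
def perplexity_from_bound(per_word_bound):
    """Convert a per-word log-likelihood bound (base 2) into a perplexity estimate."""
    return 2.0 ** (-per_word_bound)


# Under the assumption above, the reported value corresponds to a
# perplexity of roughly 44.
estimate = perplexity_from_bound(-5.46)
```

Lower perplexity and higher coherence both indicate a better fit, but coherence usually tracks human-readable topics more closely.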
Unsupervised To Supervised Model
Why did we convert the unsupervised model to a supervised one?
What benefits do we get from a business
perspective?
How did we do it?
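One common way to go from unsupervised to supervised is to label each chat with its dominant LDA topic and then train a classifier on those labels. The slides do not spell out the method used, so the sketch below is an assumption. The topic names come from the LDA slide, while the index-to-name mapping and the function names are hypothetical.

```python
# Assumed mapping from LDA topic index to the topic names on the LDA slide.
TOPIC_NAMES = {
    0: "Course inquiry",
    1: "Career transformation",
    2: "Assistance",
    3: "E-learning and discount",
}


def dominant_topic(topic_probs):
    """Index of the topic with the highest probability for one chat."""
    return max(range(len(topic_probs)), key=lambda i: topic_probs[i])


def label_chats(chats, doc_topic_probs):
    """Pair every chat with the name of its dominant topic.

    The resulting (text, label) pairs can serve as training data
    for a supervised classifier.
    """
    return [
        (chat, TOPIC_NAMES[dominant_topic(probs)])
        for chat, probs in zip(chats, doc_topic_probs)
    ]
```

Once trained, a classifier can assign a topic to a new chat without re-running topic inference.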
Chat CSV with time duration
Naive Bayes Classifier
Accuracy 97.79 %
Confusion matrix
Kappa score 0.8303
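Accuracy and kappa can both be read off a confusion matrix. The sketch below assumes the reported figure is Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The example matrix in the usage note is made up and is not the one from the slides.

```python
def accuracy_and_kappa(confusion):
    """Compute accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

For the made-up matrix `[[45, 5], [10, 40]]` this gives an accuracy of 0.85 and a kappa of 0.70. Kappa can sit well below accuracy when the classes are imbalanced, which is one possible reason for the gap between the two figures here.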
Logistic Regression Classifier
Accuracy 98.06 %
Confusion matrix
Kappa Score 0.8508
CatBoost Classifier
Accuracy 98.35 %
Classification report
Kappa score 0.8749
Challenges Faced and Ways to Improve
•   The "unread" field was misclassified.
•   The chatbot should take user credentials before
    starting the conversation.
•   Course fees should be quoted according to the
    visitor's country.
Model Deployment Using Flask
Thank You

Chatbot data to Topic modelling
