Language Identification to Improve Interactive Voice Response Services Using ConvNet
Advisor: Worapol Pongpech
Rattanawadee Waipatin
Major: BA&I
Agenda
• Objective
• Literature review
• Methodology
• Feature Extraction - Why MFCC and Filter bank ?
• Result analysis
• Conclusion
Objective
Customer aspect from global customer service report 2019
[Figure: three bar charts of survey responses]
• How important is customer service in your choice of, or loyalty to, a brand? (Very important / Somewhat important / Not important)
• Have you ever stopped doing business with a brand due to a poor customer service experience? (Yes / No)
• Do you feel the process of engaging with customer service organizations and getting your questions answered is getting easier? (Yes / No / About the same)
https://info.microsoft.com/rs/157-GQE-382/images/EN-US-CNTNT-ebook-2018-State-of-Global-Customer-Service.pdf
Objective
Customer aspect from global customer service report 2019
[Figure: three bar charts of survey responses]
• Which of the following customer service channels do you prefer? (Phone/Voice, Email, Self service, Live chat, Support ticket, Social media)
• When engaging with customer service, do you try to use self-service first, or do you immediately engage with an agent? (Begin with self service / Engage agent)
• What is the most important aspect of a good customer service experience? (Resolving issue in one interaction, Knowledgeable agent, Not repeating information, Finding information myself)
https://info.microsoft.com/rs/157-GQE-382/images/EN-US-CNTNT-ebook-2018-State-of-Global-Customer-Service.pdf
Objective
Call center aspect from call center data in Thailand 2019
[Figure: three bar charts: Interactive voice response vs. Contact Agent; Unsuccessful IVR / Successful IVR / Contact Agent; Contact Agent / Successful IVR / Unsuccessful IVR / Repeat call]
• Contacting an agent vs. using interactive voice response (IVR) at the call center.
• Half of the calls that use the IVR service are unsuccessful.
• 41% of unsuccessful IVR calls are followed by a repeat call within 1 day that tries the IVR again.
https://info.microsoft.com/rs/157-GQE-382/images/EN-US-CNTNT-ebook-2018-State-of-Global-Customer-Service.pdf
Objective
Call flow: Call → Language selection → Topic selection → End call
• Most existing IVR: press the number to select language and topic
• Press the number to select language (AIS and Krungsri are using speech recognition)
• New IVR: replace with model
Literature review
Year | Author | Model Basis | Features | Languages | Accuracy (%) | Remarks
---- | ------ | ----------- | -------- | --------- | ------------ | -------
2019 | Sarthak, Shikhar Shukla and Govind Mittal | 1D ConvNet | Raw | En, Fr, De, Es, Ru, It | 93.7 | Compares 1D and 2D ConvNets with raw and log-Mel feature extraction
     |  | 2D ConvNet | log-Mel | En, Fr, De, Es, Ru, It | 95.4 |
     |  | 2D ConvNet | log-Mel | En, Fr, De, Es | 96.3 |
2019 | Shauna Revay and Matthew Teschke | ResNet50 | log-Mel | En, Fr, De, Es, Ru, It | 89.0 | Uses a pretrained ResNet50 architecture and a cyclic learner to identify the language
2018 | Valentin Gazeau and Cihan Varol | SVM-HMM | - | En, Fr, De, Es | 70.0 | HMM was used to encode speech into sequences of vectors, which were then fed into a neural network
2017 | Christian Bartz, Tom Herold, Haojin Yang and Christoph Meinel | CRNN | log-Mel | En, Fr, De, Es | 91.0 | A new architecture extracts spatial features with a CNN and temporal features with an RNN
2010 | Pawan Kumar, Astik Biswas, A. N. Mishra and Mahesh Chandra | Gaussian mixture model | Perceptual linear prediction | En, Fr, De, Es, Ru, It, Dut, Ben, Hi, Tel | 88.8 | Used a GMM with features prepared using PLP
Methodology
Web scraping
• Python (youtube-dl)
Data preparation
• Sample rate 22 kHz
• Mono channel
• Duration 10 secs
• FLAC file type
Feature extraction
• MFCC
• Filter bank
Modeling
• CNN – 2D ConvNet
Evaluation
• Accuracy
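The data preparation step above (mono channel, 10-second clips) could look roughly like the following. This is a minimal sketch, not the author's code: it assumes the audio is already decoded into a NumPy array at the target rate, takes "22 kHz" to mean 22,050 Hz, and uses hypothetical helper names `to_mono` and `split_into_clips`. Downloading, resampling and FLAC encoding are left out.

```python
import numpy as np

SAMPLE_RATE = 22_050  # assumption: "22 kHz" on the slide is taken to mean 22,050 Hz
CLIP_SECONDS = 10     # clip duration stated on the slide


def to_mono(audio: np.ndarray) -> np.ndarray:
    """Average the channels of a (samples, channels) array; pass 1-D audio through."""
    audio = np.asarray(audio, dtype=np.float64)
    return audio if audio.ndim == 1 else audio.mean(axis=1)


def split_into_clips(mono: np.ndarray, sample_rate: int = SAMPLE_RATE,
                     seconds: int = CLIP_SECONDS) -> np.ndarray:
    """Cut a mono signal into non-overlapping fixed-length clips, dropping the tail."""
    clip_len = sample_rate * seconds
    n_clips = len(mono) // clip_len
    return mono[: n_clips * clip_len].reshape(n_clips, clip_len)


# Hypothetical input: 25 seconds of stereo noise standing in for a downloaded recording.
rng = np.random.default_rng(0)
stereo = rng.standard_normal((25 * SAMPLE_RATE, 2))
clips = split_into_clips(to_mono(stereo))
print(clips.shape)  # (2, 220500): two full 10-second clips, 5-second tail dropped
```

Dropping the incomplete tail keeps every training example the same length, which a 2D ConvNet with a fixed input size needs; padding the tail instead would be an equally reasonable choice.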
Feature Extraction - Why MFCC and Filter bank ?
Mel-Frequency Cepstral Coefficients (MFCCs) are coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency.
A filter bank is an array of band-pass filters that separates the input signal into multiple components. The main use of filter banks is to divide a signal into several separate frequency bands. (Here, filters on the mel scale are applied to the power spectrum to extract frequency bands.)
https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html
Pre-Emphasis → Framing → Fourier Transform → Filter Banks → MFCCs → Mean Normalization
• Pre-emphasis: (1) balance the frequency spectrum, (2) avoid numerical problems (FT), (3) improve the signal-to-noise ratio (SNR)
• Framing: obtain a good approximation of the frequency contours
• Mean normalization: balance the spectrum and improve SNR
[Figure: raw signal vs. after MFCCs]
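The pipeline above (pre-emphasis, framing, Fourier transform, mel filter banks, MFCCs, mean normalization) can be sketched in NumPy as follows. This is an illustrative sketch, not the code used for the slides: the function name `filter_banks_and_mfcc` is hypothetical, and the parameter values (0.97 pre-emphasis, 25 ms frames, 10 ms step, 40 filters, 12 coefficients, 1024-point FFT) are assumed common defaults rather than settings reported in the deck.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_banks_and_mfcc(signal, sample_rate=22_050, frame_ms=25, step_ms=10,
                          n_fft=1024, n_filters=40, n_ceps=12, pre_emphasis=0.97):
    """Return (log mel filter-bank energies, MFCCs), both mean-normalized per column."""
    signal = np.asarray(signal, dtype=np.float64)
    frame_len = int(round(sample_rate * frame_ms / 1000))
    step = int(round(sample_rate * step_ms / 1000))
    if len(signal) < frame_len or frame_len > n_fft:
        raise ValueError("signal too short or n_fft smaller than the frame length")

    # 1. Pre-emphasis: y[t] = x[t] - a * x[t-1]
    emphasized = np.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])

    # 2. Framing: overlapping frames, each multiplied by a Hamming window
    n_frames = 1 + (len(emphasized) - frame_len) // step
    idx = np.arange(frame_len)[None, :] + step * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)

    # 3. Fourier transform: power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 4. Filter banks: triangular filters equally spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, left:center] = (np.arange(left, center) - left) / (center - left)
        fbank[m - 1, center:right] = (right - np.arange(center, right)) / (right - center)
    energies = power @ fbank.T
    energies = np.where(energies == 0, np.finfo(float).eps, energies)  # avoid log(0)
    log_fbank = 20.0 * np.log10(energies)  # in dB

    # 5. MFCCs: orthonormal DCT-II of the log energies, keeping coefficients 1..n_ceps
    n = np.arange(n_filters)
    k = np.arange(1, n_ceps + 1)[:, None]
    dct_basis = np.sqrt(2.0 / n_filters) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    mfcc = log_fbank @ dct_basis.T

    # 6. Mean normalization: subtract each column's mean over all frames
    return log_fbank - log_fbank.mean(axis=0), mfcc - mfcc.mean(axis=0)


# Hypothetical input: one second of a 440 Hz tone instead of a real speech clip.
t = np.arange(22_050) / 22_050
fb, mf = filter_banks_and_mfcc(np.sin(2 * np.pi * 440 * t))
print(fb.shape, mf.shape)  # (98, 40) (98, 12)
```

Both outputs are frames-by-features matrices, so either one can be treated as a single-channel image and passed to a 2D ConvNet.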
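Result analysis
2 languages (Thai, English)
• Accuracy: 99.2%
• Confusion matrix (true label vs. predicted label; Eng, Thai): correct predictions 354 (49.17%) and 360 (50.00%); misclassified 6 (0.83%) and 0 (0.00%)
[Figure: accuracy and loss per epoch]
3 languages (Chinese, English, Thai)
• Accuracy: 97.8%
• Confusion matrix (true label vs. predicted label; Thai, Chi, Eng): correct predictions 360 (33.33%) for Thai, and 347 (32.12%) and 349 (32.31%) for the other two classes; misclassified cells of 6, 8, 5, 5, 0 and 0 clips (each under 1%)
[Figure: accuracy and loss per epoch]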
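The reported accuracies can be checked against the confusion-matrix counts, since accuracy is the number of correct predictions divided by the total. The sketch below uses the counts from the slide. The placement of the 6 misclassified clips in the 2-language matrix is an assumption, because the cell layout is not recoverable from the extracted text; accuracy does not depend on it.

```python
import numpy as np


def accuracy(confusion: np.ndarray) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    return np.trace(confusion) / confusion.sum()


# Rows: true label, columns: predicted label, order (Eng, Thai).
# Counts are from the slide; putting the 6 errors in the Eng row is an assumption.
two_lang = np.array([[354, 6],
                     [0, 360]])

# For the 3-language model only the cell values are known, not their positions,
# so accuracy is computed from the correct and misclassified counts directly.
three_lang_correct = [360, 347, 349]
three_lang_errors = [6, 8, 5, 5, 0, 0]
three_lang_acc = sum(three_lang_correct) / (sum(three_lang_correct) + sum(three_lang_errors))

print(f"{accuracy(two_lang):.1%}")  # 99.2%
print(f"{three_lang_acc:.1%}")      # 97.8%
```

Both values match the accuracies stated on the slide (714 of 720 clips and 1056 of 1080 clips).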
Conclusion / How to improve
• Whether to use the 2-language or the 3-language model depends on your customers' nationalities.
• Adding training data from real call center recordings would make the model more robust when used with your call center system.
