3 Challenges in
Customer Feedback
Classification
ML Kitchen #8

April 12, 2018

Van Phu Quang Huy
About Me
• Van Phu Quang Huy 

• Github: @vanhuyz

• Doing machine learning and developing new services at
Cookpad R&D
What I played with recently
Transforming gorilla faces into human faces using CycleGAN

Repository: https://github.com/vanhuyz/CycleGAN-TensorFlow
Today’s talk
Introducing a first approach to customer
support automation
Background
• About the Cookpad recipe-sharing service (cookpad.com)

• 90M monthly average users worldwide

• 68 countries / 22 languages
Background
• About customer feedback

• extremely important in service development

• each piece of feedback must be delivered to the right
staff/engineers
Customer Feedback Box on
cookpad.com
Feedback Examples
• It was easy, I tried with my child. It was fun.

• There are abundant recipes. Useful.

• After updating, the search ranking disappears, it is very
inconvenient.
The User Support Team has to classify all customer
feedback into about 100 tags every day
Internal tool (before)
The User Support Team has to manually choose tags from
a long tag list
Problems
• The User Support Team has to classify all customer
feedback into about 100 tags every day

• Time-consuming (60 min/day)

• Boring
It’s already 2018…
Solution: feedback
auto-classification?
Feedback Classification:
3 Challenges
• Open set and large number of categories

• Multi-label problem 

• Imbalanced data
1. Open set and large number
of categories
• Large number of categories: currently 100+

• Open set problem:

• some categories are unseen during training (new tags
will be added in the future)
2. Multi-label problem
One piece of feedback can have multiple tags, for example:

“By sending cooksnap, the recipe will come out soon when I cook the
next time, it is extremely convenient and useful. There are many
recipes from various people. I am happy to use it. Let’s keep the good
work.”

should be tagged as Cooksnap, Positive
3. Imbalanced data
• Almost half of the feedback is related to Tokubai ※

• New services (e.g. Amazon Echo Alexa, storeTV, etc.)
have very little feedback

• Many services have already been closed, so their data
has become obsolete

※ These services currently belong to Tokubai Inc.
Open Question
How do you design your system to
solve this problem?
Rules of Machine Learning
https://developers.google.com/machine-learning/rules-of-ml/
• Don’t be afraid to launch a product without machine
learning.

• Choose machine learning over a complex heuristic.

• Keep the first model simple and get the infrastructure
right.
First trial: Heuristic
• Manually build a dictionary for each tag

• e.g. feedback that includes a word from [‘search’, ‘find’,
‘keyword’] is tagged as Search

• This achieved high precision but very low recall

• Also, building dictionaries for 100 tags is infeasible
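The dictionary heuristic can be sketched roughly as follows. This is a minimal illustration, not the tool that was actually built: only the Search word list comes from the slide, the Cooksnap entry and the function name `tag_by_keywords` are made up, and English word splitting is assumed for simplicity.

```python
import re

# Hypothetical keyword dictionaries; only the Search entry comes from the slide.
TAG_KEYWORDS = {
    "Search": {"search", "find", "keyword"},
    "Cooksnap": {"cooksnap"},
}


def tag_by_keywords(feedback, tag_keywords=TAG_KEYWORDS):
    """Return every tag whose dictionary shares at least one word with the feedback."""
    words = set(re.findall(r"[a-z]+", feedback.lower()))
    return sorted(tag for tag, keywords in tag_keywords.items() if words & keywords)


# A dictionary hit is usually correct (high precision) ...
print(tag_by_keywords("After updating, the search ranking disappears."))  # ['Search']
# ... but feedback phrased with other words is missed (low recall).
print(tag_by_keywords("I could not locate the recipe I wanted."))  # []
```

The second call shows why recall stays low: any wording not anticipated in the dictionary goes untagged.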
Second trial:
Simple Machine Learning
Looking back at the 3 challenges
• Multi-label problem?

• Multiple binary classifiers can handle that!

• Open set and large number of categories?

• If a new tag is added, just train a new binary classifier

• Imbalanced data?

• Rebalance the data by using all positive samples but only
a random selection of negative samples from the rest
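The rebalancing idea for one tag might look like the sketch below. The function name `build_balanced_set`, the 1:1 negative-to-positive ratio, and the toy samples are all assumptions for illustration; the slides do not state the ratio that was used.

```python
import random


def build_balanced_set(samples, tag, negatives_per_positive=1, seed=0):
    """Build training data for one binary classifier.

    samples: list of (text, set_of_tags) pairs.
    Keeps every positive sample for `tag` and a random subset of the rest.
    """
    positives = [text for text, tags in samples if tag in tags]
    negatives = [text for text, tags in samples if tag not in tags]
    rng = random.Random(seed)
    # Never ask for more negatives than exist.
    k = min(len(negatives), negatives_per_positive * len(positives))
    sampled_negatives = rng.sample(negatives, k)
    texts = positives + sampled_negatives
    labels = [1] * len(positives) + [0] * len(sampled_negatives)
    return texts, labels


# Toy, made-up data: 2 Cooksnap samples among 8 pieces of feedback.
samples = [
    ("sending cooksnap is convenient", {"Cooksnap", "Positive"}),
    ("cooksnap photo failed to upload", {"Cooksnap"}),
    ("search ranking disappeared", {"Search"}),
    ("cannot find my recipe", {"Search"}),
    ("flyer did not load", {"Tokubai"}),
    ("store sale info is wrong", {"Tokubai"}),
    ("shop list is empty", {"Tokubai"}),
    ("abundant recipes, useful", {"Positive"}),
]
texts, labels = build_balanced_set(samples, "Cooksnap")
print(len(texts), sum(labels))  # 4 2
```

Because each tag gets its own balanced set, adding a new tag only requires building one more set and training one more classifier.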
Pick a first algorithm
Note that we need to train ~100 binary classifiers
I choose you!
Support Vector Machine
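One way to train a binary SVM per tag is sketched below. The slides do not name a library or feature extraction, so scikit-learn, TF-IDF features, and a linear SVM (`LinearSVC`) are assumptions, as are the toy data and the helper names `train_tag_classifiers` and `suggest_tags`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy, made-up training data: (feedback text, set of tags).
TRAIN = [
    ("search ranking disappeared after update", {"Search"}),
    ("keyword search cannot find anything", {"Search"}),
    ("easy and fun with my child", {"Positive"}),
    ("abundant recipes very useful", {"Positive"}),
]


def train_tag_classifiers(samples):
    """Fit one independent binary classifier per tag."""
    texts = [text for text, _ in samples]
    all_tags = sorted(set().union(*(tags for _, tags in samples)))
    classifiers = {}
    for tag in all_tags:
        labels = [1 if tag in tags else 0 for _, tags in samples]
        # Assumed model: TF-IDF features fed into a linear SVM.
        model = make_pipeline(TfidfVectorizer(), LinearSVC(random_state=0))
        classifiers[tag] = model.fit(texts, labels)
    return classifiers


def suggest_tags(classifiers, feedback):
    """Return the tags whose classifier votes positive for this feedback."""
    return [tag for tag, model in classifiers.items()
            if model.predict([feedback])[0] == 1]


classifiers = train_tag_classifiers(TRAIN)
print(suggest_tags(classifiers, "search ranking disappeared after update"))
```

Since every tag has its own model, one piece of feedback can receive several tags, which is how the multi-label requirement is covered.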
Evaluation of 81 classifiers
on validation set
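The chart on this slide is not reproduced in the text, and the metrics it plotted are not stated. Assuming precision and recall (the measures used to judge the heuristic trial), a per-classifier evaluation could be sketched as follows, with made-up validation labels.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for one binary classifier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical validation results: tag -> (true labels, predicted labels).
validation = {
    "Search": ([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]),
    "Positive": ([1, 0, 1, 0], [1, 1, 1, 0]),
}
for tag, (y_true, y_pred) in validation.items():
    precision, recall = precision_recall(y_true, y_pred)
    print(f"{tag}: precision={precision:.2f} recall={recall:.2f}")
# Search: precision=1.00 recall=0.67
# Positive: precision=0.67 recall=1.00
```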
Infrastructure
Internal tool (after)
The User Support Team only has to choose tags from
a few tag suggestions
Results
• The User Support Team has to classify all customer
feedback into about 100 tags every day

• Time-consuming (60 min/day) → 30 min/day

• Boring → Fun!
Future Work
• Visualize/evaluate results using data obtained during
operation

• Collect negative samples (suggested tags that are not
chosen by operators) to improve the classifiers

• Other ways to deal with imbalanced data:
https://github.com/scikit-learn-contrib/imbalanced-learn

• Deep learning?

• Aim for 100% auto-tagging?
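Collecting negative samples from operator decisions could be as simple as the sketch below. The function name `collect_feedback` and the record layout are hypothetical; the slides only state the idea that suggested tags not chosen by operators become negatives.

```python
def collect_feedback(suggested_tags, chosen_tags):
    """Turn one operator decision into new training labels.

    Tags the operator chose are positives; tags that were suggested
    but not chosen are negatives for their classifiers.
    """
    suggested, chosen = set(suggested_tags), set(chosen_tags)
    return {
        "positives": sorted(chosen),
        "negatives": sorted(suggested - chosen),
    }


record = collect_feedback(
    suggested_tags=["Search", "Positive", "Cooksnap"],
    chosen_tags=["Cooksnap", "Positive"],
)
print(record)  # {'positives': ['Cooksnap', 'Positive'], 'negatives': ['Search']}
```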
Summary
• Introduced a problem in customer service

• Introduced a first solution for customer feedback
classification

• Successfully reduced the labor time of customer feedback
tagging by half
