3 Challenges in Customer Feedback Classification

ML Kitchen #8

April 12, 2018

Van Phu Quang Huy

About Me
• Van Phu Quang Huy

• Github: @vanhuyz

• Doing machine learning and developing new services in
Cookpad R&D

What I played with recently
Transform gorilla faces to human faces using CycleGAN

Repository: https://github.com/vanhuyz/CycleGAN-TensorFlow

Today’s talk
Introduce a first approach to customer support automation

Background
• About the cookpad recipe sharing service (cookpad.com)

• 90M worldwide monthly average users

• 68 countries / 22 languages

Background
• About customer feedback

• extremely important in service development

• each piece of feedback must be delivered to the right staff/engineers

Customer Feedback Box on
cookpad.com

Feedback Examples
• It was easy, I tried with my child. It was fun.

• There are abundant recipes. Useful.

• After updating, the search ranking disappears, it is very inconvenient.

User Support Team have to classify all customer feedback into about 100 tags every day

Internal tool (before)
User Support Team have to manually choose tags from a long tag list

Problems
• User Support Team have to classify all customer feedback

• Time consuming (60 min/day)

• Boring

Solution: feedback auto-classification?

Feedback Classification: 3 Challenges
• Open set and large number of categories

• Multi-label problem

• Imbalanced data

1. Open set and large number
of categories
• Large number of categories: currently 100+

• Open set problem:

• categories unseen during training will appear (as new tags will be added in the future)

2. Multi-label problem
One piece of feedback can have multiple tags, for example:

“By sending cooksnap, the recipe will come out soon when I cook the
next time, it is extremely convenient and useful. There are many
recipes from various people. I am happy to use it. Let’s keep the good
work.”

should be tagged as Cooksnap, Positive

3. Imbalanced data
• Almost half of the feedback is related to Tokubai ※

• New services (e.g. Amazon Echo Alexa, storeTV, etc.) have very little feedback

• Many services have already been closed, so their data has become obsolete

※ These services currently belong to Tokubai Inc.

Open Question
How do you design your system to
solve this problem?

Rules of Machine Learning
https://developers.google.com/machine-learning/rules-of-ml/
• Don’t be afraid to launch a product without machine
learning.

• Choose machine learning over a complex heuristic.

• Keep the first model simple and get the infrastructure right.

First trial: Heuristic
• Manually build a dictionary for each tag

• e.g. feedback that includes a word from [‘search’, ‘find’, ‘keyword’] is tagged as Search

• This achieved high precision but very low recall

• Also, building dictionaries for 100 tags is infeasible
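
A minimal sketch of such a dictionary heuristic, assuming simple lowercase word matching. Only the Search keywords come from the slide; the function name and the tokenization are illustrative assumptions, not the tool that was actually built.

```python
import re

# Hypothetical keyword dictionaries; only the Search entry comes from the slide.
TAG_KEYWORDS = {
    "Search": {"search", "find", "keyword"},
}

def tag_by_dictionary(feedback, tag_keywords=TAG_KEYWORDS):
    """Return every tag whose dictionary shares at least one word with the feedback."""
    words = set(re.findall(r"[a-z]+", feedback.lower()))
    return sorted(tag for tag, keywords in tag_keywords.items() if words & keywords)

# Matches because the word "search" is in the Search dictionary.
print(tag_by_dictionary("After updating, the search ranking disappears."))
# A search-related complaint that uses none of the keywords is missed.
print(tag_by_dictionary("I cannot look up yesterday's recipe anymore."))
```

The second call illustrates the low-recall issue: a match is almost always correct, but any wording outside the dictionary goes untagged.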

Second trial:
Simple Machine Learning

Looking back at the 3 challenges
• Multi-label problem?

• Multiple binary classifiers can handle that!

• Open set and large number of categories?

• If a new tag is added, just train a new binary classifier

• Imbalanced data?

• Rebalance data by using all positive samples, but only a random subset of the negative samples from the rest
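
The rebalancing idea can be sketched as follows for a single tag. The function name, the `negative_ratio` parameter, and the sample data (including its tag assignments) are assumptions made for illustration; the talk does not state the actual sampling ratio.

```python
import random

def build_binary_dataset(samples, tag, negative_ratio=1.0, seed=0):
    """Build a rebalanced training set for one tag's binary classifier.

    samples: list of (text, set_of_tags) pairs.
    Keeps every positive sample and a random subset of the negatives.
    """
    positives = [text for text, tags in samples if tag in tags]
    negatives = [text for text, tags in samples if tag not in tags]
    # Number of negatives to keep (assumed: proportional to the positives).
    n_neg = min(len(negatives), max(1, round(len(positives) * negative_ratio)))
    sampled_negatives = random.Random(seed).sample(negatives, n_neg)
    texts = positives + sampled_negatives
    labels = [1] * len(positives) + [0] * len(sampled_negatives)
    return texts, labels

# Illustrative data; the tag assignments are made up for this example.
samples = [
    ("The search ranking disappeared", {"Search"}),
    ("Cannot find my recipe by keyword", {"Search"}),
    ("Cooksnap is extremely convenient", {"Cooksnap", "Positive"}),
    ("There are abundant recipes. Useful.", {"Positive"}),
    ("It was easy, it was fun.", {"Positive"}),
]
texts, labels = build_binary_dataset(samples, "Search")
print(labels)
```

Because each tag gets its own dataset, a feedback item with several tags is simply a positive sample in several datasets, and a new tag only requires building one more dataset.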

Pick a first algorithm
Note that we need to train ~100 binary classifiers

I choose you!
Support Vector Machine
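
A minimal sketch of one such per-tag classifier, assuming scikit-learn with TF-IDF features and a linear SVM. The slides do not say which library, features, or kernel were used, so the pipeline, the function name, and the toy training texts are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def train_tag_classifier(texts, labels):
    """Fit one binary TF-IDF + linear SVM classifier for a single tag."""
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(texts, labels)
    return model

# Toy training set for a hypothetical "Search" classifier (1 = Search, 0 = other).
texts = [
    "the search ranking disappeared",
    "search by keyword returns nothing",
    "I cannot find recipes with search",
    "it was fun to cook with my child",
    "there are abundant recipes, useful",
    "thank you, I am happy to use it",
]
labels = [1, 1, 1, 0, 0, 0]
search_clf = train_tag_classifier(texts, labels)

# decision_function returns a signed score; positive means "Search".
print(search_clf.predict(["search is broken"]))
print(search_clf.decision_function(["search is broken"]))
```

Training ~100 tags then amounts to calling `train_tag_classifier` once per tag and keeping the fitted models in a dictionary keyed by tag name.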

Evaluation of 81 classifiers
on validation set
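
The extracted slides do not say which metrics were plotted here. Assuming precision and recall (the metrics mentioned for the heuristic trial), a per-classifier evaluation could be sketched like this; the function name and the example labels are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for one binary classifier (1 = tag applies)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up validation labels: the classifier finds 1 of 3 positives, with no false alarms.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

This example reproduces the "high precision, low recall" pattern: every predicted tag is correct, but two thirds of the true positives are missed.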

Internal tool (after)
User Support Team only have to choose tags from a few tag suggestions
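
How the tool turns classifier outputs into suggestions is not described in the slides. One plausible sketch, assuming each per-tag classifier yields a signed score: keep the top `k` tags whose score exceeds a threshold. The function name, `k`, `threshold`, and the scores are all hypothetical.

```python
def suggest_tags(scores, k=3, threshold=0.0):
    """Return up to k tags with a score above the threshold, best first.

    scores: dict mapping tag name -> classifier score for one feedback item.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, score in ranked[:k] if score > threshold]

# Made-up scores for one feedback item.
scores = {"Search": 1.2, "Positive": -0.4, "Cooksnap": 0.3, "Tokubai": -1.1}
print(suggest_tags(scores))
```

Keeping a human in the loop this way means a wrong suggestion costs one ignored entry rather than a mis-tagged feedback item.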

Results
• User Support Team have to classify all customer feedback

• Time consuming: 60 min/day → 30 min/day

• Boring → Fun!

Future Work
• Visualize/Evaluate results using data obtained during operation

• Collect negative samples (suggested tags which are not chosen by operators) to improve classifiers

• Other ways to deal with imbalanced data: https://github.com/scikit-learn-contrib/imbalanced-learn

• Deep learning?

• Aim for 100% auto-tagging?

Summary
• Introduced a problem in Customer Service

• Introduced a first solution for customer feedback classification

• Successfully reduced labor time of customer feedback
tagging by half
