For more information please visit www.slamby.com
Summary
Racingbazar is a classified ad site focusing on car parts, with thousands of incoming
ads a day. They applied a “post-moderation” process, which required fast reactions from
the moderators to keep the site usable and to provide a clean category tree. Slamby
helped them by offering a faster and better solution, namely automatic category
moderation. We show you how we did it.
The Problem
Have you ever thought about moderating your ads automatically? We started working
on it with racingbazar.hu. Earlier, they had a “post-moderation” process: when
somebody placed an ad, Racingbazar accepted it, and moderators reviewed the posted
ad only after its acceptance. Every ad went public first.
They have thousands of incoming ads a day; accordingly, it could happen, and often did,
that an ad was placed in the wrong category and remained there for a long
time, which is not ideal for users and visitors.
It is also a waste of time on the moderation side if moderators have to check countless
well-categorized ads in order to find and deal with the wrong ones.
Based on available information, about 20% of the ads posted daily are categorized
incorrectly and need to be moderated, but the task is like finding a needle in a haystack.
Moderators have to check each and every ad to find that 20%. Checking all the acceptable
ads slows moderation down; it is tiresome and monotonous, and moderation mistakes
tend to occur more frequently.
Slamby as a Solution
Slamby Classifier helps Racingbazar in the moderation process. We analyze all of the
incoming ads and check their categories. When we find an ad in an inaccurate
category, we mark it so that the moderators can check it as soon as possible. Slamby also
labels the ads that are probably acceptable, so that moderators do not have to deal
with them or check their categories.
This way, moderators need to concentrate only on poorly categorized ads, which
make up only a small portion of the total number of ads. As a result, the moderators and
the moderation process have become faster and more accurate.
After starting to use Slamby Classifier, Racingbazar “accidentally” detected some recurring
problems in the moderation process. Slamby Classifier pointed these out, and now that
the moderators can recognize them, they are able to move the problematic ads to the
right category.
How Did Slamby Solve the Problem?
We used Slamby Classifier to solve the problem of category moderation.
Train Slamby with your data, and let Slamby do the magic.
Slamby is an intelligent, language-independent automatic categorization solution that
learns from the data of classified ad sites. It learned Racingbazar's category tree and
the ads belonging to each category.
Racingbazar provided a training database of 500,000 ads, which we fed to
Slamby Classifier. These ads had already been moderated.
Once Slamby has learnt the categories, it is able to categorize ads automatically.
Quality Measurement
At Slamby we offer the most accurate automatic categorization solution. In order to
provide high quality, we measure the efficiency of Slamby Classifier. Each time before
Slamby Classifier goes into use, we conduct a precise quality measurement procedure to
ensure that Slamby functions as expected.
How do we conduct quality measurement?
First, when we receive a training dataset, we pick out 3 distinct subsets and set them
aside. After the training process, we take those 3 held-out subsets and re-categorize
them. We then compare the original categories with the new categories resulting
from the re-categorization.
If Slamby Classifier gives us the same category, we consider the result correct;
if it gives us another, we consider it wrong (although in several
cases the category given by Slamby Classifier was more suitable than the original). In order
to measure the quality of a recommendation and to be able to rely on it automatically, we
assign every category recommendation a score between 0 and 1. This score predicts
the quality of the categorisation as a confidence level indicator. After the quality
measurement, Slamby provides two kinds of diagram: one on the confidence intervals and
one on the expected precision-completeness levels. Below we can see the precision-score diagram.
[Figure: precision-score diagram]
Based on the precision-score diagram, we can measure the quality of the
recommendations, which enables us to automate moderation. The diagram shows
the chance of a recommendation being correct at a given score level. For example, if we
would like to accept each category recommendation automatically with 95% confidence,
this diagram shows us that we need to use a score of 0.18 as a threshold. If 90%
confidence is enough, the score threshold is 0.1.
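Reading a threshold off the precision-score diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Slamby's actual procedure: the function names and the sample data are hypothetical, and precision at a threshold is assumed to mean the share of correct recommendations among all held-out ads scoring at or above it.

```python
from typing import List, Optional, Tuple

# One entry per held-out ad: (score of the top recommendation,
# whether it matched the moderated category). Hypothetical type alias.
Result = Tuple[float, bool]


def precision_at(results: List[Result], threshold: float) -> Optional[float]:
    """Share of correct recommendations among those scoring >= threshold."""
    accepted = [correct for score, correct in results if score >= threshold]
    if not accepted:
        return None  # nothing reaches the threshold, so precision is undefined
    return sum(accepted) / len(accepted)


def lowest_threshold_for(results: List[Result], target: float) -> Optional[float]:
    """Lowest observed score at which precision meets the target, if any."""
    for threshold in sorted({score for score, _ in results}):
        precision = precision_at(results, threshold)
        if precision is not None and precision >= target:
            return threshold
    return None


# Made-up held-out results, for illustration only.
held_out = [
    (0.05, False), (0.08, True), (0.12, False), (0.15, True),
    (0.20, True), (0.25, True), (0.31, True), (0.40, True),
]

print(lowest_threshold_for(held_out, 0.95))
print(lowest_threshold_for(held_out, 0.75))
```

With real held-out data, the same lookup would give the kind of threshold the text quotes, such as 0.18 for 95% confidence.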
We also use another diagram to predict the expected completeness and precision values
when using Slamby Classifier.
[Figure: completeness-precision diagram]
On the diagram above, we can see that without a threshold Slamby Classifier
works in general with 87% precision. That means that at 100% completeness Slamby
reaches 87% precision. In other words, out of 100 ads Slamby is going to categorize all,
and the expected number of well-categorized ads is going to be 87 [1]. In the example
above, Racingbazar wanted to work with a 95% confidence level. Accordingly, we knew
that we needed to use a 0.18 score threshold to achieve it. The completeness
diagram shows the expected relationship between the score and completeness. Using a
0.18 score threshold means 80% completeness. That is, out of 100 ads Slamby is able to
categorize all at an 87% confidence level, but with a 0.18 score limit only 80 out of
100 ads are going to have a score higher than 0.18 and so reach the 95% confidence level. The
remaining 20% is going to have the original 87% accuracy level.
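The trade-off between completeness and precision can be illustrated with a short Python sketch. It rests on assumptions: completeness is taken to be the share of ads whose top recommendation reaches the threshold (matching the "80 out of 100" reading above), and the function name and sample data are hypothetical.

```python
from typing import List, Tuple


def completeness_and_precision(
    results: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (completeness, precision) for recommendations scoring >= threshold.

    results holds one (score, was_correct) pair per held-out ad.
    """
    accepted = [correct for score, correct in results if score >= threshold]
    completeness = len(accepted) / len(results)
    # Precision is reported as 0.0 when nothing reaches the threshold.
    precision = sum(accepted) / len(accepted) if accepted else 0.0
    return completeness, precision


# Made-up held-out results, for illustration only.
held_out = [
    (0.05, False), (0.10, True),
    (0.19, True), (0.22, True), (0.25, False), (0.30, True),
    (0.35, True), (0.41, True), (0.52, True), (0.60, True),
]

for threshold in (0.0, 0.18):
    completeness, precision = completeness_and_precision(held_out, threshold)
    print(f"threshold={threshold:.2f} "
          f"completeness={completeness:.0%} precision={precision:.1%}")
```

Raising the threshold lowers completeness and, on this sample, raises precision, which is the relationship the two diagrams are used to read off.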
Racingbazar uses Slamby with a 95% confidence level, which means a ~0.18 score
threshold. They accept the recommended categories with a score above 0.18; as a result,
they could check 95% of all of their ads.
We can increase the percentage of checked ads by using a lower score threshold, but then
it becomes possible that the first category Slamby gives is wrong. However, since
Racingbazar uses Slamby Classifier to help the moderator team, it applies a TOP3
recommendation system, and those three recommended categories will include the right
one with certainty.
[1] Allowing for a general 5% measurement error, the expected precision is around 92% at 100%
completeness.
Integration, Usage
After we had trained it, we were able to offer Racingbazar a dedicated Slamby Classifier.
To use it, Slamby provides a simple REST API that enables our customers to integrate
Slamby easily.
Here is an example of how easy it is to use our REST API:
Query Example [2]
curl -v -X POST "https://api.slamby.com/demo/API/Recommend?count=3" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json; charset=utf-8" \
  -H "Authorization: Slamby s3cr3t" \
  -d '{
    "name":"Apple iPhone 6 Plus case white leather",
    "content":"White leather case for Apple Iphone 6 plus."
  }'
Response Example
[
{"categoryId":"22","score":"0.23"},
{"categoryId":"4078","score":"0.10"},
{"categoryId":"4080","score":"0.09"}
]
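The same query can be sketched in Python using only the standard library. The URL, the headers, the placeholder secret and the response shape are taken from the cURL example above; the function names are hypothetical, and the Content-Type header is assumed to take the standard charset parameter. The network call itself is left commented out, since the demo endpoint may not be reachable.

```python
import json
import urllib.request
from typing import List, Tuple

API_URL = "https://api.slamby.com/demo/API/Recommend?count=3"  # from the cURL example
API_KEY = "s3cr3t"  # placeholder secret from the cURL example


def build_request(name: str, content: str) -> urllib.request.Request:
    """Build the POST request for one ad (hypothetical helper)."""
    body = json.dumps({"name": name, "content": content}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Slamby {API_KEY}",
        },
    )


def parse_recommendations(raw: str) -> List[Tuple[str, float]]:
    """Turn the JSON response into (category_id, score) pairs, best first."""
    pairs = [(item["categoryId"], float(item["score"])) for item in json.loads(raw)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


request = build_request(
    "Apple iPhone 6 Plus case white leather",
    "White leather case for Apple Iphone 6 plus.",
)
# To send the request against a live endpoint:
# with urllib.request.urlopen(request) as response:
#     recommendations = parse_recommendations(response.read().decode("utf-8"))

# Parsing the response example shown above.
sample_response = (
    '[{"categoryId":"22","score":"0.23"},'
    '{"categoryId":"4078","score":"0.10"},'
    '{"categoryId":"4080","score":"0.09"}]'
)
print(parse_recommendations(sample_response))
```

Note that the scores arrive as strings in the response example, so the sketch converts them to floats before they are compared with a threshold.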
[2] For illustrative purposes, the queries are given in a simple cURL format. Important: Slamby
uses a REST API.
Automatic Decision Making, Threshold
As you can see, by using Slamby with a score threshold, Racingbazar is able to automate its
category moderation and support its moderation team. They adopted Slamby in a two-step
decision tree:
1) They send every ad to Slamby and receive the recommended categories and scores
back.
   a. If the category recommended by Slamby is the same as the originally selected
   one, they accept the category.
   b. If the category recommended by Slamby differs from the originally selected
   one, they examine the score in order to check the quality of the
   recommendation.
2) For ads where the two categories differ, the score decides what happens next.
   a. If the score is above 0.18, the confidence level is 95%, and they accept the
   category recommended by Slamby instead of the original.
   b. If the score is below 0.18, the confidence level is between 87% and 95%, and they
   send the TOP3 recommendations to manual moderation. Moderators doing the
   manual moderation see both the original category and the one recommended by
   Slamby among the TOP3 categories. If the first recommendation was not the right
   one, the moderators can select the best one from the TOP3 recommendations.
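The decision tree above can be sketched as a single Python function. This is an illustration, not Racingbazar's actual integration code: the function name and the action labels are hypothetical, and since the text does not say what happens at a score of exactly 0.18, the sketch assumes such ads go to manual moderation.

```python
from typing import List, Tuple

SCORE_THRESHOLD = 0.18  # threshold quoted in the text for a 95% confidence level


def decide(
    original_category: str, recommendations: List[Tuple[str, float]]
) -> Tuple[str, List[str]]:
    """Return (action, categories) for one ad.

    recommendations holds (category_id, score) pairs, best first.
    """
    top_category, top_score = recommendations[0]
    if top_category == original_category:
        # Slamby agrees with the advertiser: keep the original category.
        return "accept_original", [original_category]
    if top_score > SCORE_THRESHOLD:
        # Confident disagreement: take Slamby's category instead.
        return "accept_recommended", [top_category]
    # Low score (assumption: a score of exactly 0.18 also lands here):
    # hand the TOP3 recommendations to a moderator.
    return "manual_review", [category for category, _ in recommendations[:3]]


recommendations = [("22", 0.23), ("4078", 0.10), ("4080", 0.09)]
print(decide("22", recommendations))
print(decide("10", recommendations))
print(decide("10", [("22", 0.12), ("4078", 0.10), ("4080", 0.09)]))
```

Keeping the threshold in one named constant makes it easy to move to a different confidence level, for example 0.1 for 90%.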
Results
Setting up Slamby Classifier, integrating it, and starting to use it took Racingbazar.hu
only a few days. Slamby and Racingbazar.hu have been able to work together perfectly
since the beginning of July 2015. Slamby Classifier is used to provide automatic category
moderation for Racingbazar.hu reviewers.
Since the cooperation between Slamby and Racingbazar began, category moderation has
become faster and more accurate on the classified ad site.
Racingbazar has
- saved a lot of time with automatic category moderation;
- saved money thanks to the reduced workload and time of the moderators;
- gained a well re-categorized database, which has become almost entirely
accurate.