2. Abstract
Online auctions are an important aspect of e-commerce. However, for consumers,
landing a winning bid in an online auction is difficult because of
“bidding robots”.
Humans are not able to attend to and monitor auctions with the same capacity as a
computer, which can make complex bidding decisions in split-second time and can
follow an auction with nonstop, undivided attention.
3. Problem Statement
The project aims to identify auctioning agents that are software robots so that
they can be eliminated from online auctions.
The goal is to level the playing field for human bidders by identifying and banning bid bots.
4. Data Set
The data set is organized into three files:
→ train.csv (bidder data with labels)
→ test.csv (bidder data without labels, to be used for prediction) and
→ bids.csv (individual bid data, 7.6 million bids).
For the purpose of this project, we will be using the labelled bidder data and bid
data to make and test our predictions.
http://www.kaggle.com/c/facebook-recruiting-iv-human-or-bot/data
5. Project Scope
The project aims to analyse the bid data, extract distinguishing features that help
us differentiate between a human and a robot, train classifiers on these
features and finally test the accuracy of our classifiers using the test data. The scope can
be divided into 3 levels:
→ Feature Extraction
→ Modelling and training
→ Testing
6. Proposed System
The main challenges in this project are:
Feature engineering
→ No. of bids in total: This looked like a long shot, but robots might place
significantly more bids than humans.
→ No. of bids per auction: This was an interesting hypothesis. A robot will be
more consistent in bidding to win every auction. Hence, robots will have a
higher number of bids in every auction.
→ Response time: The time between two bids will be considerably shorter for robots
than for humans.
→ No. of distinct IPs/countries: Robots will probably make bids from multiple
IPs and countries.
→ Merchandise: A few merchandise categories will have more robots bidding.
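The per-bidder features listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the field names (`bidder_id`, `auction`, `time`, `ip`, `country`) are assumed column names for bids.csv, the function name `bidder_features` is hypothetical, and "response time" is interpreted here as the median gap between a bidder's own consecutive bids, which is only one possible reading.

```python
from collections import defaultdict
from statistics import median


def bidder_features(bids):
    """Aggregate simple per-bidder features from an iterable of bid records.

    Each record is a dict with the (assumed) keys:
    bidder_id, auction, time, ip, country.
    """
    # Group all bids by the bidder who placed them.
    by_bidder = defaultdict(list)
    for bid in bids:
        by_bidder[bid["bidder_id"]].append(bid)

    features = {}
    for bidder, rows in by_bidder.items():
        # Gaps between the bidder's own consecutive bids (assumed definition
        # of "response time" for this sketch).
        times = sorted(r["time"] for r in rows)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        n_auctions = len({r["auction"] for r in rows})
        features[bidder] = {
            "n_bids": len(rows),
            "bids_per_auction": len(rows) / n_auctions,
            "median_gap": median(gaps) if gaps else None,  # None for a single bid
            "n_ips": len({r["ip"] for r in rows}),
            "n_countries": len({r["country"] for r in rows}),
        }
    return features


# Tiny made-up sample, for illustration only.
sample_bids = [
    {"bidder_id": "a", "auction": "x", "time": 10, "ip": "1.1.1.1", "country": "us"},
    {"bidder_id": "a", "auction": "x", "time": 12, "ip": "1.1.1.2", "country": "in"},
    {"bidder_id": "a", "auction": "y", "time": 13, "ip": "1.1.1.3", "country": "us"},
    {"bidder_id": "b", "auction": "x", "time": 50, "ip": "2.2.2.2", "country": "uk"},
]
features = bidder_features(sample_bids)
```

The resulting dictionary can be joined with the labels in train.csv on the bidder identifier to build a training table.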
➔ The randomness observed in bidding differs completely between humans and robots. If
the bidder is a robot, its bidding order will look almost perfectly random compared
to that of a human bidder.
→ Rank the auctions a user takes part in by their starting or finishing times;
call this the ordered sequence.
→ Get the sequence of auctions for that user in the order in which the user bids on them.
→ Compare this sequence with the ordered sequence, using either sequence correlation or the number of
inversions needed to sort it.
→ For humans, the number of inversions will be lower than for robots, and the correlation will be higher.
➔ Number of auctions in which a particular bidder is active and for which there is still
time left before the auction ends. A human bidder will not be active in as many
auctions at once as a robot.
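The ordering comparison described above (ranking a bidder's auctions, then counting inversions in the order the bidder actually visited them) could be sketched as follows. This is an illustrative sketch under stated assumptions: the input `auction_ranks` is assumed to be the list of time-based ranks of the auctions, in the order the bidder bid on them, and the names `count_inversions` and `normalised_inversions` are hypothetical. Dividing by the maximum possible number of inversions is an added convenience so that bidders with different numbers of bids are comparable.

```python
def count_inversions(seq):
    """Count pairs (i, j) with i < j and seq[i] > seq[j], via merge sort."""

    def sort_count(items):
        if len(items) <= 1:
            return items, 0
        mid = len(items) // 2
        left, inv_left = sort_count(items[:mid])
        right, inv_right = sort_count(items[mid:])
        merged = []
        inversions = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:  # equal ranks are not inversions
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left.
                merged.append(right[j])
                j += 1
                inversions += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inversions

    return sort_count(list(seq))[1]


def normalised_inversions(auction_ranks):
    """Inversions divided by the maximum possible, giving a value in [0, 1]."""
    n = len(auction_ranks)
    max_inversions = n * (n - 1) // 2
    if max_inversions == 0:
        return 0.0
    return count_inversions(auction_ranks) / max_inversions


# A bidder who follows the auctions in time order versus one who jumps around.
orderly = normalised_inversions([1, 2, 3, 4])
scattered = normalised_inversions([4, 3, 2, 1])
```

Under the hypothesis in the text, a value near 0 would be more typical of a human and a larger value more typical of a robot.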
8. Model optimisation
➔ Random forests: Random forests average multiple deep decision trees,
each trained on a different part of the same training set, with the goal of reducing variance.
➔ Gradient boosting: A machine learning technique for regression and classification
problems that produces a prediction model in the form of an ensemble of weak
prediction models, typically decision trees.
➔ AdaBoost: AdaBoost is adaptive in the sense that subsequent weak learners are tweaked
in favor of those instances misclassified by previous classifiers.
➔ Bagging classifier: Bagging is a bootstrap ensemble method that builds each member of
its ensemble by training a classifier on a random resample (with replacement) of the training set.
➔ Logistic regression: It estimates the probability of a binary response based on one or
more predictor features. On its own it is not a classification method, since it outputs a
probability rather than a class; it becomes a classifier once a decision threshold is applied.
It could be called a qualitative response (discrete choice) model.
We can combine these models and average their outputs for our final result to
reduce overfitting and improve accuracy.
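Averaging model outputs can be illustrated with a few lines of plain Python. This is a sketch, not the project's code: the probability lists below are made-up values standing in for each model's predicted probability that a bidder is a robot, and the function name `average_probabilities` and the 0.5 threshold are assumptions chosen for illustration.

```python
def average_probabilities(model_probs):
    """Average per-bidder probabilities across models.

    model_probs is a list with one entry per model; each entry is a list of
    P(robot) values, one per bidder, in the same bidder order for every model.
    """
    n_models = len(model_probs)
    return [sum(per_bidder) / n_models for per_bidder in zip(*model_probs)]


# Made-up outputs from three hypothetical models for three bidders.
random_forest = [0.9, 0.2, 0.6]
gradient_boosting = [0.8, 0.1, 0.7]
logistic_regression = [0.7, 0.3, 0.5]

averaged = average_probabilities(
    [random_forest, gradient_boosting, logistic_regression]
)
# Assumed decision threshold of 0.5 to turn probabilities into labels.
is_robot = [p >= 0.5 for p in averaged]
```

With scikit-learn, `VotingClassifier` with `voting="soft"` performs the same kind of probability averaging across fitted estimators.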
10. Results
Some features and the information captured by them:
➔ Starting an auction or finishing an auction.
➔ Percentage of a bidder's bids that are placed over their own previous bid, percentage of
auto bids, and percentage of bids with a change in IP, URL or device.
14. References and Related Work
1. Bid-war: Human or Robot.
2. https://www.kaggle.com/c/facebook-recruiting-iv-human-or-bot
3. http://small-yellow-duck.github.io/auction.html
4. http://www.thomas-robert.fr/en/kaggles-human-or-robot-competition-feedback/