Copyright © Wondershare Software

www.wondershare.com
• Introduction
• Problem Understanding
• Data Understanding
• Course of Action

• A telecom company named Bad Idea is on the lookout for fraudsters.
• It designed an unusual rate plan, called the Praxis plan, under which only four calls are allowed per day.
• Bad Idea has call logs spanning one and a half months.

• Two datasets are given:
    • Blacklist subscribers call log
    • Audited call log
• No. of rows: 138
• Call timing:
    • Morning: 9 AM to noon
    • Afternoon: noon to 4 PM
    • Evening: 4 PM to 9 PM
    • Night: 9 PM to midnight
• Callers: Virginia, Sally, Vince
• Tools used: RapidMiner & R
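
The call-timing slots above can be expressed as a small lookup. This is only a sketch: the function name `time_slot` is ours, and it assumes that a boundary hour belongs to the later slot (noon counts as Afternoon) and that calls before 9 AM fall outside every slot, neither of which the slides state.

```python
from typing import Optional

# Slot boundaries taken from the slide: (name, first hour, hour the slot ends).
SLOTS = [
    ("Morning", 9, 12),
    ("Afternoon", 12, 16),
    ("Evening", 16, 21),
    ("Night", 21, 24),
]

def time_slot(hour: int) -> Optional[str]:
    """Return the slot name for an hour of day (0-23), or None before 9 AM."""
    for name, start, end in SLOTS:
        # Assumption: the start hour is inclusive and the end hour exclusive.
        if start <= hour < end:
            return name
    return None

print(time_slot(10))
print(time_slot(21))
```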
• A gang of fraudsters consisting of three people, Sally, Virginia & Vince, targeted the company.
• Their subscriptions were terminated.
• An audit is done every 5 days to keep track of the fraudsters.
• The audit reviews the list of subscribers who have made calls to the same people, in the same time frames, as those fraudsters.

• Algorithm: Naïve Bayesian Classifier
• Import the data using the “Read CSV” operator.
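
For reference, the standard Naïve Bayes rule behind the classifier named above can be written as follows. The symbols are ours, not the slides': c is a candidate caller, and m, a, e, n are the people called in the Morning, Afternoon, Evening and Night slots. It assumes those four attributes are the model inputs, which is what the result table suggests.

```latex
P(c \mid m, a, e, n)
  = \frac{P(c)\, P(m \mid c)\, P(a \mid c)\, P(e \mid c)\, P(n \mid c)}
         {\sum_{c'} P(c')\, P(m \mid c')\, P(a \mid c')\, P(e \mid c')\, P(n \mid c')},
  \qquad c \in \{\text{Virginia}, \text{Sally}, \text{Vince}\}
```

The "naïve" part is the assumption that the four slot attributes are independent of each other once the caller is known.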

• Use Split Validation, which splits the example set into a training set and a test set and evaluates the model. 70% of the data is used as the training sample and the remaining 30% as the test (development) set.
• The first inner operator accepts a Training Set; the second accepts a Test Set together with the output of the first (in most cases a Model) and produces a Performance Vector. Here, Naïve Bayes is used in the first inner operator.

• The audit data set is imported using “Read CSV”.
• “Select Attributes” is applied to remove the unwanted attributes.

• The “Apply Model” operator applies the model learned from the training data to the audit data set. The learned information is used to predict the value of a possibly unknown label.
• All needed parameters are stored within the model object.

• The output lists the probabilities of the callers.

• Import the data.
• Divide the dataset into train and test data using the sampling method.
• Develop the model.
• Apply the model to the test data.
• Apply the model to the audit log set.
• Convert the predicted probabilities into percentages.
• Import the data and split it into train and test samples.
• Inspect all the data contained in the train dataset.
• Develop the model using the train dataset.
• Obtain the probability of each caller and predict on the test data.
• Apply the model to the audit log data and convert the resulting probabilities into percentages.
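
The last step, turning per-caller probabilities into percentages, might look like the sketch below. The helper name `as_percentages` and the example scores are hypothetical; the slides do not show how the conversion was done.

```python
def as_percentages(scores, digits=1):
    """Normalize non-negative scores and express them as percentages."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive number")
    return {caller: round(100.0 * s / total, digits) for caller, s in scores.items()}

# Hypothetical scores for one audit record (illustrative values only).
example = {"Vince": 3.0, "Sally": 1.0, "Virginia": 1.0}
for caller, pct in as_percentages(example).items():
    print(f"{caller}: {pct}%")
```

Normalizing first means the helper also accepts unnormalized scores, not only probabilities that already sum to 1.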

ID   Morning  Afternoon  Evening  Night  Customer    Probable Fraudster  Probability
491  Robert   John       David    Alex   Customer X  Vince               0.96
61   Quentin  George     David    Alex   Customer X  Sally               0.94
703  Quentin  George     David    Beth   Customer X  Sally               0.81
996  Quentin  John       Emily    Alex   Customer X  Vince               0.50
173  Robert   George     Frank    Alex   Customer X  Vince               0.57
575  Kelly    John       David    Clark  Customer Y  Vince               0.72
365  Larry    John       David    Clark  Customer Y  Vince               0.66
967  Mark     George     Frank    Clark  Customer Y  Virginia            0.87
650  Nancy    Harry      Frank    Clark  Customer Y  Virginia            0.98
165  Olga     Harry      Frank    Clark  Customer Y  Virginia            0.81
557  Robert   John       Frank    Clark  Customer Z  Virginia            0.70
808  Quentin  George     David    Beth   Customer Z  Sally               0.81
936  Robert   George     David    Alex   Customer Z  Vince               0.94
836  Kelly    Harry      Frank    Clark  Customer Z  Virginia            0.62
976  Robert   George     Frank    Alex   Customer Z  Vince               0.57
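
One way to read the audit results above is to add up the prediction probabilities per customer and fraudster. This aggregation is our own assumption; the slides do not say how customers were matched to fraudsters. The rows are copied from the audit results, with "Customer X" shortened to "X".

```python
from collections import defaultdict

# (customer, probable fraudster, probability) for the 15 audited calls.
SCORED = [
    ("X", "Vince", 0.96), ("X", "Sally", 0.94), ("X", "Sally", 0.81),
    ("X", "Vince", 0.50), ("X", "Vince", 0.57),
    ("Y", "Vince", 0.72), ("Y", "Vince", 0.66), ("Y", "Virginia", 0.87),
    ("Y", "Virginia", 0.98), ("Y", "Virginia", 0.81),
    ("Z", "Virginia", 0.70), ("Z", "Sally", 0.81), ("Z", "Vince", 0.94),
    ("Z", "Virginia", 0.62), ("Z", "Vince", 0.57),
]

def tally(scored):
    """Sum the prediction probabilities per (customer, fraudster) pair."""
    totals = defaultdict(lambda: defaultdict(float))
    for customer, fraudster, prob in scored:
        totals[customer][fraudster] += prob
    return {customer: dict(by_f) for customer, by_f in totals.items()}

for customer, by_fraudster in tally(SCORED).items():
    best = max(by_fraudster, key=by_fraudster.get)
    print(f"Customer {customer}: {best} ({by_fraudster[best]:.2f})")
```

A summed score is only a rough signal: with five calls per customer, a single high-probability row can change which fraudster comes out on top.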

Telecom Fraud Detection
