1. A model for profiling Mobile Telecom Subscribers based on their credit behavior
Dr. Asoka Korale, C.Eng., MIET
2. Profiled customers are important for
• Credit management and determining credit actions
• Managing revenue through monitoring receipts/payments
Other uses of segmenting / profiling
• Identifying groups for promotions / special offers
• Cross-selling / up-selling
• Targeted advertising
3. Many ways to profile
• Elements of a profile
The selected attributes
• must reflect the particular task at hand
• depend on the nature of the profiling
Attributes for Credit (Receipts and Payments) Profile
• Network Stay: number of years with network
• Pay delay: number of days between payment date and due date
• Pay gap percentage: proportion of bill that is outstanding
• Revenue: bill value
4. Attribute Range Points
Attribute Range                     Points
0 <= x < 4.3 (lowest quartile)      0.25
4.3 <= x < 5 (2nd quartile)         0.5
5 <= x < 5.7 (3rd quartile)         0.75
5.7 <= x < 10 (top quartile)        1.0
[Figure: Cumulative Probability Distribution Function of Attribute; x-axis: Attribute Value (x), 0 to 10; y-axis: cumulative probability, 0 to 1]
Ex: Allocate a total of 1 point across 4 quartiles (0.25 per quartile, cumulative)
The percentiles of each variable are used to allocate a score (points) to each attribute value.
This scheme can be extended to as many levels as desired to meet any accuracy requirement.
Max Network Stay points = 0.35
Max Pay Delay points = 0.3
Max Pay Gap points = 0.7
Max Revenue points = 0.65
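The percentile scoring scheme can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `percentile_points`, the sample data, and the choice to scale the quartile fraction by the attribute's maximum points are all assumptions made for the example.

```python
import numpy as np

def percentile_points(values, max_points, levels=4):
    """Score each value by the quantile band it falls in, scaled to max_points.

    Assumption: points = (band fraction) * (max points for the attribute).
    """
    values = np.asarray(values, dtype=float)
    # Interior band edges, e.g. the 25th, 50th and 75th percentiles for 4 levels.
    edges = np.quantile(values, np.arange(1, levels) / levels)
    # Band index 0..levels-1; a value equal to an edge goes to the upper band,
    # matching the "lower <= x < upper" ranges in the table.
    band = np.searchsorted(edges, values, side="right")
    return (band + 1) / levels * max_points

# Hypothetical Network Stay values in years (not data from the study).
stay_years = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0])
stay_points = percentile_points(stay_years, max_points=0.35)
print(stay_points)
```

Raising `levels` above 4 gives the finer-grained scheme mentioned on the slide.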
5. Total Credit Risk points = Payment Delay points + Payment Gap points
Rating = 1 - (Total Credit Risk)
Processing flow (PC: percentile calculation and mapping of variables to points)
• Payment Delay (days) → PC → allocate Pay Delay points based on percentile
• Payment Gap (%) → PC → allocate Pay Gap risk points based on percentile
• Network Stay (years) → PC → allocate Network Stay points based on percentile
• Monthly Revenue (Rs) → PC → allocate Revenue points based on percentile
• 1 - (Pay Delay points + Pay Gap points) = Credit Risk points (the Rating defined above)
• Cluster on: Credit Risk points, Network Stay points, Revenue points
• Allocate a Grade/Segment based on the coordinates of the cluster centroid
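A small worked example of the rating calculation, written as a sketch. The subscriber and the quartile placements are hypothetical, and scaling each quartile fraction by the attribute's maximum points (0.3 for pay delay, 0.7 for pay gap) is an assumption about how the two slides combine.

```python
def credit_rating(pay_delay_points, pay_gap_points):
    """Rating = 1 - (Payment Delay points + Payment Gap points)."""
    total_credit_risk = pay_delay_points + pay_gap_points
    return 1.0 - total_credit_risk

# Hypothetical subscriber: pay delay in the 2nd quartile, pay gap in the lowest quartile.
pay_delay_points = 0.5 * 0.3    # quartile fraction * max Pay Delay points
pay_gap_points = 0.25 * 0.7     # quartile fraction * max Pay Gap points
rating = credit_rating(pay_delay_points, pay_gap_points)

# The three coordinates that would be passed to the clustering step
# (Network Stay and Revenue points are also hypothetical values).
feature_vector = (rating, 0.2625, 0.325)
print(round(rating, 3))
```

A subscriber with the maximum delay and gap points (0.3 + 0.7) would receive a rating of zero under this formula.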
6. A Cluster
• A group of objects more similar to one another than to members of other clusters
• Represents a “segment” from the business perspective
Fuzzy C-Means Clustering Algorithm
• Originated in computer science; widely used in data mining
• An unsupervised learning algorithm
profiles data without respect to a target variable
has no recourse to a training sequence
• Robust in processing large amounts of data
• Particularly useful when data patterns are not self-evident or when manual processing is not practical
• Clusters arise naturally from patterns in the data
• Fuzziness implies that each data point may belong to one or more clusters to a certain degree, depending on the membership function
• In this model, each subscriber is allocated to the cluster with the highest membership
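To illustrate the last two bullets, the sketch below takes a made-up membership matrix and assigns each subscriber to the cluster with the highest membership. The matrix values are hypothetical and chosen only so that each row sums to 1.

```python
import numpy as np

# Hypothetical membership matrix: rows are subscribers, columns are clusters.
# Each row sums to 1, so a subscriber belongs to every cluster to some degree.
U = np.array([
    [0.70, 0.20, 0.10],
    [0.15, 0.25, 0.60],
    [0.40, 0.45, 0.15],
])

# Hard allocation: pick the cluster with the highest membership (1-based numbering).
labels = U.argmax(axis=1) + 1
print(labels)  # [1 3 2]
```

The third subscriber shows why the fuzzy memberships carry extra information: it is almost equally close to clusters 1 and 2, which the hard label alone hides.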
7. The cost function that will be minimized to arrive at the clusters around the centroids
$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}, \qquad U = [u_{ij}]$$
1. Initialize the membership function and centroids
2. Update the membership function
$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}$$
3. Update the centroids
$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$
4. Check the convergence criterion at the kth iteration
$$\lVert U^{k+1} - U^{k} \rVert < \varepsilon$$
5. Stop if the criterion in step 4 is satisfied, else return to step 2
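The five steps can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: the function names, the random initialization, the default fuzzifier m = 2, the tolerance, and the synthetic demo data are all choices made for the example.

```python
import numpy as np

def update_centroids(X, U, m):
    """Step 3: c_j = sum_i u_ij^m x_i / sum_i u_ij^m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]

def update_memberships(X, centroids, m):
    """Step 2: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                      # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, C, m=2.0, eps=1e-5, max_iter=300, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: random memberships (rows sum to 1) and the centroids they imply.
    U = rng.random((X.shape[0], C))
    U /= U.sum(axis=1, keepdims=True)
    centroids = update_centroids(X, U, m)
    for _ in range(max_iter):
        U_new = update_memberships(X, centroids, m)    # step 2
        centroids = update_centroids(X, U_new, m)      # step 3
        converged = np.linalg.norm(U_new - U) < eps    # step 4
        U = U_new
        if converged:                                  # step 5
            break
    return centroids, U

# Synthetic demo: two well-separated groups of points in the
# (rating, network stay, revenue) space. Not data from the study.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([
    demo_rng.normal(0.2, 0.02, size=(20, 3)),
    demo_rng.normal(0.8, 0.02, size=(20, 3)),
])
centroids_demo, U_demo = fuzzy_c_means(X_demo, C=2)
labels_demo = U_demo.argmax(axis=1)
print(np.round(centroids_demo, 2))
```

The result depends on the random initialization, so in practice several seeds may be tried and the run with the lowest cost J_m kept.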
8. Table 1: Cluster Centroids
Cluster   Rating points   Network Stay points   Revenue points
1         0.2935          0.2115                0.0969
2         0.3832          0.2289                0.3615
3         0.5535          0.2770                0.5770
4         0.9224          0.2202                0.3454
5         0.8929          0.2527                0.5780
6         0.5966          0.2129                0.3664
Table 2: Centroids relative to max attribute values
Cluster   Rating points   Network Stay points   Revenue points
1         LOW             MED                   LOW
2         LOW             MED                   MED
3         MED             HIGH                  HIGH
4         HIGH            MED                   MED
5         HIGH            HIGH                  HIGH
6         MED             MED                   MED
• Cluster 5 has subscribers with low credit risk and high revenue contribution → Valuable subs: keep
• Cluster 1 has subscribers with high credit risk and low revenue contribution → Let churn
• Cluster 3 has subscribers with medium credit risk and high revenue contribution → Positively influence
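The step from Table 1 to Table 2 can be sketched as dividing each centroid coordinate by the maximum points of its attribute and binning the ratio. The slides do not state the cut points; the values 0.4 and 0.7 below are assumptions chosen because they reproduce Table 2, and the maximum rating is taken as 1.0 (zero credit risk points).

```python
import numpy as np

# Table 1 centroids: columns are rating, network stay and revenue points.
centroids = np.array([
    [0.2935, 0.2115, 0.0969],
    [0.3832, 0.2289, 0.3615],
    [0.5535, 0.2770, 0.5770],
    [0.9224, 0.2202, 0.3454],
    [0.8929, 0.2527, 0.5780],
    [0.5966, 0.2129, 0.3664],
])
# Maximum points per attribute (max rating of 1.0 is an assumption).
max_points = np.array([1.0, 0.35, 0.65])
relative = centroids / max_points

def grade(ratio, low_cut=0.4, high_cut=0.7):
    """Map a ratio to LOW / MED / HIGH; the cut points are assumed, not from the source."""
    if ratio < low_cut:
        return "LOW"
    if ratio < high_cut:
        return "MED"
    return "HIGH"

grades = [[grade(r) for r in row] for row in relative]
for cluster, row in enumerate(grades, start=1):
    print(cluster, *row)
```

Other cut points in the same neighbourhood would give the same table, so the exact thresholds used by the author cannot be recovered from the slides alone.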
10. Average revenues in each cluster
Cluster   Average revenue
c1        307
c2        809
c3        2201
c4        670
c5        2325
c6        812
Cluster 5 has subscribers with the highest average revenue contribution
(combined with high network stay and low credit risk: “valuable segment”)
Cluster 1 has subscribers with the lowest average revenue contribution
(combined with high credit risk and medium network stay: “low value segment”)
Cluster 3 has subscribers with high revenue contribution
(combined with medium credit risk: “opportunity to influence”, i.e. to increase their rating)
12. Determine histograms of the attributes
Normalize each histogram to obtain an approximation to the probability density function of the selected attribute.
Take the cumulative sum of the probability density function to obtain the cumulative probability distribution function.
Determine percentiles for allocating points to the attributes.
Allocate a grade to each customer based on the ranking of the function derived from the three attributes.
Note: It is also possible to compute the percentiles directly by sorting the samples and reading off the corresponding sample values at each point of interest. Both methods give the same results, with the histogram method providing a little more insight.
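Both routes to the percentiles can be compared in a short sketch. The data here is synthetic (a Normal attribute with mean 5 and standard deviation 1, clipped to the 0 to 10 range), and the bin count and helper name `percentile_from_cdf` are assumptions for the example.

```python
import numpy as np

# Synthetic attribute values; not data from the study.
rng = np.random.default_rng(0)
samples = np.clip(rng.normal(5.0, 1.0, 100_000), 0.0, 10.0)

# Histogram route: counts -> normalized histogram -> cumulative sum.
counts, edges = np.histogram(samples, bins=200, range=(0.0, 10.0))
pmf = counts / counts.sum()   # probability per bin (divide by bin width for a density)
cdf = np.cumsum(pmf)

def percentile_from_cdf(p):
    """Right edge of the first bin whose cumulative probability reaches p."""
    idx = np.searchsorted(cdf, p)
    return edges[idx + 1]

probs = [0.25, 0.5, 0.75]
hist_quartiles = np.array([percentile_from_cdf(p) for p in probs])

# Direct route: sort the samples and read off the values (done by np.quantile).
sorted_quartiles = np.quantile(samples, probs)

print(np.round(hist_quartiles, 2))
print(np.round(sorted_quartiles, 2))
```

The two sets of quartiles agree to within one histogram bin width (0.05 here), so the histogram route is as precise as its binning allows.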
13. [Figure: Histogram of attribute; x-axis: Attribute Value, 0 to 10; y-axis: count, 0 to 4 x 10^4]
[Figure: Probability Density Function; x-axis: Attribute Value, 0 to 10; y-axis: 0 to 0.4]
For this example the attribute is assumed to be Normally distributed. In practice the distributions of the attributes will take different forms, but the procedure for calculating the percentiles remains the same.
14. [Figure: Probability Density Function; x-axis: Attribute Value, 0 to 10; y-axis: 0 to 0.4]
[Figure: Cumulative Probability Distribution Function; x-axis: Attribute Value, 0 to 10; y-axis: 0 to 1]
15. Attribute Range Points
Attribute Range                     Points
0 <= x < 4.3 (lowest quartile)      0.25
4.3 <= x < 5 (2nd quartile)         0.5
5 <= x < 5.7 (3rd quartile)         0.75
5.7 <= x < 10 (top quartile)        1.0
[Figure: Cumulative Probability Distribution Function of Attribute; x-axis: Attribute Value (x), 0 to 10; y-axis: cumulative probability, 0 to 1]
Mapping attribute level to points based on the percentile that the customer achieves in relation to that attribute