[Figure: seven-node example graph; node values 0, 1.5, 6.5, 0, 9, 0, 0 for nodes 1 to 7]
} Clustering algorithms identify clusters or
‘communities’ within networks based on
networ...
Average and longest distance
[Figure: seven-node example graph, nodes 1 to 7]
} The longest shortest path (distance)
between any two nodes in a
network is ca...
Social Network Analysis with R
&
Visualizing Social Network
Visualizing Social Network
Katherine Ognyanova, www.kateto.net
NetSciX 2016 School of Code Workshop, Wroclaw, Poland
Assistant Professor at the Schoo...
Basic SNA with R and igraph
} Networks
} Tie Strength
} Key Players
} Cohesion
Source: Asst. Prof Katherine Ognyanova. Net...
Basic SNA with R and igraph
} Networks
} Tie Strength
} Key Players
} Cohesion
} Create network
} Edge,Vertex and Attribut...
Create Network
› library(igraph)  # provides graph() and plot() for igraph objects
› g1 <- graph( edges=c(1,2, 2,3, 3,1), n=3, directed=FALSE )  # undirected triangle
› plot(g1)
Source: Asst. Prof Katherine Ognyanova...
Create Network
› g2 <- graph( edges=c(1,2, 2,3, 3, 1), n=10 )
› plot(g2)
Source: Asst. Prof Katherine Ognyanova. Network A...
Create Network
› g3 <- graph( c("John", "Jim", "Jim", "Jill", "Jill", "John"))
› plot(g3)
Source: Asst. Prof Katherine Ogn...
Create Network
› g4 <- graph( c("John", "Jim",
"Jim", "Jack", "Jim", "Jack",
"John", "John"),
isolates=c("Jesse", "Janis",...
Create Network
› plot(graph_from_literal(a---b, b---c))
› plot(graph_from_literal(a--+b, b+--c))
› plot(graph_from_literal...
Edge, Vertex and Attributes
› # The edges of the object
› E(g4)
› # The vertices of the object
› V(g4)
› # The network mat...
Edge, Vertex and Attributes
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
...
Read Network from files
3.1 DATASET 1: edgelist
› Dataset1-Media-Example-NODES.csv
› Dataset1-Media-Example-EDGES.csv
3.2 ...
Turning networks into igraph objects
We start by converting the raw data to an igraph network object.
Here we use igraph’s...
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the NetSciX 2016 School of C...
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the NetSciX 2016 School of C...
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the NetSciX 2016 School of C...
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the NetSciX 2016 School of C...
Basic SNA with R and igraph
} Networks
} Tie Strength
} Key Players
} Cohesion
Source: Asst. Prof Katherine Ognyanova. Net...
Plotting networks with igraph
We can set the node & edge options in two ways
› Specify them in the plot()
› Set attributes...
} Networks centrality with igraph
} Degree
} Closeness
} Betweenness
} Eigenvector
Basic SNA with R and igraph
} Networks
...
Network centrality with igraph
› Degree centrality
› Closeness centrality
› Betweenness centrality
› Eigenvector centralit...
Network centrality with igraph
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutori...
} Density and Reciprocity
} Clustering
} Averages & longest distance
Basic SNA with R and igraph
} Networks
} Tie Strength...
Transitivity with igraph
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the...
Community Clustering based on edge
betweenness
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph:...
Averages & longest distance
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
...
Using SNA to predict
Subscriber churn
in Telecommunication industry
Customer Churn
“Churn represents the loss of an existing customer
to a competitor”
A prevalent problem in:
› Telecommunication services
›...
The CRM core objectives included:
– Customer Acquisition
– Customer Retention
› Churn Prediction
– Cross-sell/Up Sell
– Max...
§ Segmentation
§ Predictive Analytics
§ Customer Acquisition
§Costs of Customer Acquisition. (COCA)
§ Servicing
§QoS (Netw...
Most telcos define their customer segments using some of the following:
› Payment type (prepaid vs. postpaid)
› ARPU (Averag...
Process current and historical data in order to make predictions
about future events.
› Making customer decisions.
› Next ...
› Quality of Service
– Network Utilization
– Drop Call
› Call Center, Service Hall
– The number of times that Customer con...
› Type of Churn
– Voluntary Churn
– Involuntary Churn
• Type of Customer Retention
– Reactive
– Proactive
Customer Retenti...
Source: Mohammed Akram Ayyubi Define - The Churn Reasons Taxonomy
The Churn Taxonomy
Data Source
Customer Demographic:
• Zip code
• Income
• Occupation
• Age
• Gender
• Living Address
• Occupation Address
Or...
Data Source
Billing Data:
• Total amount of bill
• Total number of barred (one-way barred)
• Total number of full barred (...
Data Mining Functionalities, Techniques
& CRM Applications
Churn Modeling Techniques
Subscribers: 25M
Prepaid Subs: 4M
Postpaid Subs: 21M
Prepaid ARPU: 550 THB
Postpaid ARPU: 160 THB
Source: NBTC (http://ttid.nbtc....
Monthly-plan (postpaid) subscribers in the general individual category of one mobile network operator, with the status of ...
1. Subscriber Centrality
– X1: Subscriber Centrality Degree
– X2: Subscriber Centrality Closeness
– X3: Subscriber Central...
Subscriber Centrality
Churn Neighbors Centrality
Non- Churn Neighbors Centrality
Subscriber Churn Status
Number of Neighbo...
adjacency
matrix
Churner
CDR
K-means
clustering
Cluster 1
Cluster 2
Cluster 3
Cluster
selection
Centrality
calculation
Chur...
JulyJune
July August
Subscribers 1.5 M
Churner 8.2 K
Summarize the call minutes (Call Duration) and the call count (Number of Calls) into a per subs...
Segment subscribers with K-means clustering, using the call minutes (Call Duration) and the call count (Number of Calls) ...
1. Consider the number of members in each cluster as the number of clusters increases
Number of clusters
K = 2, K = 3, K = 4, K = 5, K = 6
#	tra...
Blue : Cluster 1 Yellow : Cluster 2 Red : Cluster 3
Cluster
Median of Call
Duration
Median of number of
Call
Group Descrip...
Merging Cluster #2 and Cluster #3 gives the cluster shown in the figure
Yellow : Cluster 2
Red : Cluster 3
Prepaid Customer
Heavy Voice Call On-Net
A B C D E
A 0 1 1 1 0
B 1 0 0 1 0
C 1 0 0 0 0
D 1 1 0 0 1
E 0 0 0 1 0
MO MT
A B
A C
A D
B D
E D
adjacency matrix Social Ne...
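The step from the MO/MT edge list to the adjacency matrix can be sketched as follows. This is a minimal illustration rather than the authors' code: it reads MO and MT as the originating and terminating side of a call and treats a call in either direction as one undirected tie, both of which are assumptions. The function name is hypothetical.

```python
# Edge list as printed on the slide (MO and MT columns).
CALLS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("E", "D")]


def adjacency_from_calls(calls):
    """Build a symmetric 0/1 adjacency matrix from caller/callee pairs."""
    labels = sorted({name for pair in calls for name in pair})
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for caller, callee in calls:
        # Assumption: a call in either direction counts as one undirected tie.
        matrix[index[caller]][index[callee]] = 1
        matrix[index[callee]][index[caller]] = 1
    return labels, matrix


labels, matrix = adjacency_from_calls(CALLS)
for name, row in zip(labels, matrix):
    print(name, *row)
```

The printed rows should line up with the A to E matrix shown on the slide.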
Churner
Non-churner
Churner
Non-churner
Churner
Non-churner
Use the adjacency matrix from step 4 to compute the centrality of every subscriber, four measures in total, based on ...
5.1 Compute the centrality of every subscriber (Subscriber Centrality), which gives the part ...
5.2 Then compute the Neighbours C...
Compute centrality measures to create the independent variables, 14 in total
1. Subscriber Centrality
– X1: Subscriber Centra...
Compute centrality measures to create the independent variables, 14 in total
3. Churn Neighbors Centrality
– X7: Churn Neighb...
Principal Component Analysis
Independent variables X3 to X14 are used for Principal Component Analysis, selecting a total of 3 fac...
Principal Component Analysis
From the PCA, the factors can be extracted (factor extraction) as follows
Variables
Factor1:
Churner	
Neighbors	...
Principal Component Analysis
X1: Subscriber Centrality Degree
X2: Subscriber Centrality Closeness
X3:Subscriber Centrality...
In this step, the churn prediction model is built and then evaluated (model evaluation)
X3:Subscriber Centrality
Betweenness
X1: Subscriber Centra...
Model#1 Model#2 Model#3
Dependent Variable (DV) Y: Subscriber Churn Status Y: Subscriber Churn Status Y: Subscriber Churn ...
Model#1 Model#2 Model#3
Confusion
Matrix
Reference Reference Reference
1 0 1 0 1 0
Predicted
1 1,622 56
Predicted
1 1,593 7...
Churner
Non-churner
Churner / Predicted Churner
Non-churner / Predicted Non-churner
Non-churner / Predicted Churner
Churner / Predicted Non-ch...
Churner / Predicted Churner
Non-churner / Predicted Non-churner
Non-churner / Predicted Churner
Churner / Predicted Non-ch...
Churner / Predicted Churner
Non-churner / Predicted Non-churner
Non-churner / Predicted Churner
Churner / Predicted Non-ch...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry, by Chettapong Punyachonkool and Dr. Arnond Sakworawich

Presented at THE FIRST NIDA BUSINESS ANALYTICS AND DATA SCIENCES CONTEST/CONFERENCE, organized by the School of Applied Statistics and DATA SCIENCES THAILAND

  1. Subscriber Churn Prediction Model using Social Network Analysis In Telecommunication Industry
     Chettapong Punyachonkool, Dr. Arnond Sakworawich
     The First NIDA Business Analytics and Data Sciences Contest/Conference, September 2, 2016
  2. Chettapong Punyachonkool
     Data Engineer, Business Intelligence Strategy, The Siam Commercial Bank
     chettapongp@gmail.com
     www.linkedin.com/in/chettapong-punyachonkool
     Business Analytic and Research, Applied Statistics, NIDA
  3. Topics
     › Social Network Analysis basic concepts
     › Social Network Analysis with R
     › Visualizing Social Network
     › Using SNA to predict Subscriber Churn in Telco
  4. Social Network Analysis: Basic Concepts
  5. Social Network
     A social structure composed of individuals (or organizations) interconnected by one or more specific types of interdependencies, such as friendship, kinship, financial exchanges or communication exchanges.
     Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu
  6. Social Network Analysis
     The application of graph theory to understand, categorize and quantify relationships in a social network.
     Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu
  7. Why should you care about SNA?
     Traditional marketing practices are becoming obsolete.
     • Test and control group methodologies no longer work as intended.
     • Information exchange between individuals within an online social network is extremely high.
     • It is difficult to keep the control group "pure".
     • We need to understand behaviour across and within communities rather than focusing only on individuals.
     • Leverage (and protect against) the high velocity of information exchange within online social networks.
     Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu
  8. Why should you care about SNA?
     Customers are sceptical: if you want to sell your products to your customers, convince their friends. Use social network analysis to understand more about your customers and their communities.
     Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu
  9. Customer with the Role of an Influencer
     • An influential user adopts a product or behaviour.
     • The influential user tells (and influences) his or her immediate contacts within the community.
     • These immediate contacts tell their contacts.
     It is important:
     • To identify these people.
     • To influence these people.
     • To monitor the behaviour of these people.
     Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation. Peak Consulting, http://www.peakconsulting.eu
  10. Social Network Analysis Application
      Source: http://www.martingrandjean.ch/connected-world-air-traffic-network/
  11. Social Network Analysis Application
      Source: http://www.martingrandjean.ch/connected-world-air-traffic-network/
  12. Social Network Analysis Application
      Source: https://linkurious.icij.org/widget/4c040978
  13. Social Network Analysis Application
      Source: Apichart Wisitkitchakarn (2013), Risk Analysis of East Asian Stock Markets, The Capital Market Research Institute, The Stock Exchange of Thailand.
  14. Social Network Analysis Application
      Source: Valdis Krebs (2001). Connecting the Dots. Tracking Two Identified Terrorists. http://orgnet.com/tnet.html
  15. Social Network Analysis (SNA) including a tutorial on concepts and methods
      Social Media, Dr. Giorgos Cheliotis (gcheliotis@nus.edu.sg), Communications and New Media, National University of Singapore
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  16. Practical applications
      • A very early example of network analysis comes from the city of Königsberg (now Kaliningrad). The mathematician Leonhard Euler used a graph to prove that there is no path that crosses each of the city's bridges exactly once (Newman et al., 2006).
      • SNA has its origins in both social science and the broader fields of network analysis and graph theory.
      • Network analysis concerns itself with the formulation and solution of problems that have a network structure; such structure is usually captured in a graph (see the circled structure to the right).
      • Graph theory provides a set of abstract concepts and methods for the analysis of graphs. These, in combination with other analytical tools and with methods developed specifically for the visualization and analysis of social (and other) networks, form the basis of what we call SNA methods.
      • But SNA is not just a methodology; it is a unique perspective on how society functions. Instead of focusing on individuals and their attributes, or on macroscopic social structures, it centers on relations between individuals, groups, or social institutions.
      [Figures: Newman et al., 2006]
  17. Basic Concepts
      • Networks: how to represent various social networks
      • Tie Strength: how to identify strong/weak ties in the network
      • Key Players: how to identify key/central nodes in the network
      • Cohesion: measures of overall network structure
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  18. Representing relations as networks
      Can we study their interactions as a network?
      Communication:
        Anne: Jim, tell the Murrays they're invited
        Jim: Mary, you and your dad should come for dinner!
        Jim: Mr. Murray, you should both come for dinner
        Anne: Mary, did Jim tell you about the dinner? You must come.
        John: Mary, are you hungry?
        …
      [Figure: graph with vertices 1 to 4 standing for Anne, Jim, Mary and John]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  19. Network terminology
      [Figure: graph with vertices 1 to 4, with a vertex and an edge labelled]
        Points     Lines             Field
        vertices   edges, arcs       math
        nodes      links             computer science
        sites      bonds             physics
        actors     ties, relations   sociology
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  20. Entering data on a directed graph
      Edge list:
        Vertex  Vertex
        1       2
        1       3
        2       3
        2       4
        3       4
      Adjacency matrix:
        Vertex  1  2  3  4
        1       -  1  1  0
        2       0  -  1  1
        3       0  0  -  0
        4       0  0  1  -
      [Figure: directed graph with vertices 1 to 4]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  21. Entering data on a directed graph (same edge list and adjacency matrix as slide 20)
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  22. Representing an undirected graph
      Edge list (remains the same):
        Vertex  Vertex
        1       2
        1       3
        2       3
        2       4
        3       4
      Adjacency matrix (becomes symmetric):
        Vertex  1  2  3  4
        1       -  1  1  0
        2       1  -  1  1
        3       1  1  -  1
        4       0  1  1  -
      [Figure: undirected graph with vertices 1 to 4]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
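As a rough illustration of the two representations, the sketch below turns the slide's edge list into an adjacency matrix. It is not taken from the deck: the function name is hypothetical, the diagonal (shown as "-" on the slides) is stored as 0, and in the directed case each pair is read as source then target, which is an assumption. The slide's directed matrix places the last tie in row 4, column 3 instead.

```python
# Edge list as printed on slides 20 to 22.
EDGE_LIST = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def adjacency_matrix(edges, n, directed=True):
    """Return an n x n 0/1 matrix; vertices are numbered from 1."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u - 1][v - 1] = 1
        if not directed:
            # An undirected edge is recorded in both directions.
            matrix[v - 1][u - 1] = 1
    return matrix


undirected = adjacency_matrix(EDGE_LIST, 4, directed=False)
directed = adjacency_matrix(EDGE_LIST, 4, directed=True)

for row in undirected:
    print(*row)
```

The undirected result is symmetric, matching the matrix on slide 22, while the directed one only fills the cells on one side of the diagonal.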
  23. Basic Concepts: Tie Strength
      How to identify strong/weak ties in the network
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  24. Adding weights to edges
      Edge list (add a column of weights):
        Vertex  Vertex  Weight
        1       2       30
        1       3       5
        2       3       22
        2       4       2
        3       4       37
      Adjacency matrix (add weights instead of 1):
        Vertex  1   2   3   4
        1       -   30  5   0
        2       30  -   22  2
        3       5   22  -   37
        4       0   2   37  -
      [Figure: undirected graph with vertices 1 to 4 and edge weights 30, 5, 22, 2, 37]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
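A weighted edge list can be handled the same way as an unweighted one, with the weight stored in place of the 1. The sketch below is only an illustration under that reading; the function name is hypothetical and the diagonal is stored as 0.

```python
# Weighted edge list as printed on slide 24: (vertex, vertex, weight).
WEIGHTED_EDGES = [(1, 2, 30), (1, 3, 5), (2, 3, 22), (2, 4, 2), (3, 4, 37)]


def weighted_adjacency(edges, n):
    """Return a symmetric n x n matrix holding the edge weights."""
    matrix = [[0] * n for _ in range(n)]
    for u, v, weight in edges:
        matrix[u - 1][v - 1] = weight
        matrix[v - 1][u - 1] = weight
    return matrix


weights = weighted_adjacency(WEIGHTED_EDGES, 4)
# The heaviest edge is a simple proxy for the strongest tie.
strongest = max(WEIGHTED_EDGES, key=lambda edge: edge[2])
print(weights)
print(strongest)
```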
  25. Adding weights to edges
      Communication:
        Anne: Jim, tell the Murrays they're invited
        Jim: Mary, you and your dad should come for dinner!
        Jim: Mr. Murray, you should both come for dinner
        Anne: Mary, did Jim tell you about the dinner? You must come.
        John: Mary, are you hungry?
        …
      [Figure: the undirected graph of Anne, Jim, Mary and John (vertices 1 to 4) with weights added to its edges]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  26. Edge weights as relationship strength
      • Edges can represent interactions, flows of information or goods, similarities/affiliations, or social relations.
      • Specifically for social relations, a 'proxy' for the strength of a tie can be:
        (a) the frequency of interaction (communication) or the amount of flow (exchange)
        (b) reciprocity in interaction or flow
        (c) the type of interaction or flow between the two parties (e.g., intimate or not)
        (d) other attributes of the nodes or ties (e.g., kin relationships)
        (e) the structure of the nodes' neighborhood (e.g., many mutual 'friends')
      • Surveys and interviews allow us to establish the existence of mutual or one-sided strength/affection with greater certainty, but the proxies above are also useful.
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  27. Basic Concepts: Key Players
      How to identify key/central nodes in the network
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  28. Interpretation of measures
        Centrality measure   Interpretation in social networks
        Degree               How many people can this person reach directly?
        Betweenness          How likely is this person to be the most direct route between two people in the network?
        Closeness            How fast can this person reach everyone in the network?
        Eigenvector          How well is this person connected to other well-connected people?
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  29. Degree centrality
      • A node's (in-) or (out-)degree is the number of links that lead into or out of the node.
      • In an undirected graph they are of course identical.
      • Often used as a measure of a node's degree of connectedness and hence also influence and/or popularity.
      • Useful in assessing which nodes are central with respect to spreading information and influencing others in their immediate 'neighborhood'.
      Hypothetical graph, degree by node: 1: 2, 2: 3, 3: 4, 4: 1, 5: 4, 6: 1, 7: 1. Nodes 3 and 5 have the highest degree (4).
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
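Degree is the simplest of the four measures to compute by hand. The sketch below counts edge endpoints per node. The edge list is not printed in the slide text; the one used here was reconstructed so that it reproduces the degrees and clustering coefficients reported for the hypothetical graph, so it should be read as an assumption.

```python
from collections import Counter

# Assumed edge list for the seven-node hypothetical graph (reconstructed, not from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 5), (3, 4), (3, 5), (5, 6), (5, 7)]


def degree_centrality(edges):
    """Count how many edges touch each node of an undirected graph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


degrees = degree_centrality(EDGES)
print(sorted(degrees.items()))
```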
  30. Betweenness centrality
      • For a given node v, calculate the number of shortest paths between nodes i and j that pass through v, and divide by the number of all shortest paths between nodes i and j.
      • Sum the above values for all node pairs i, j.
      • Sometimes normalized such that the highest value is 1 or that the sum of all betweenness centralities in the network is 1.
      • Shows which nodes are more likely to be in communication paths between other nodes.
      • Also useful in determining points where the network would break apart (think who would be cut off if node 3 or 5 disappeared).
      Betweenness by node: 1: 0, 2: 1.5, 3: 6.5, 4: 0, 5: 9, 6: 0, 7: 0. Node 5 has higher betweenness centrality than node 3.
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
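The two steps above (fraction of shortest paths through v for each pair, then a sum over pairs) can be sketched directly with breadth-first search. This is an unnormalized, brute-force illustration, not an efficient implementation. The edge list is an assumption reconstructed from the values reported for the hypothetical graph, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Assumed edge list for the seven-node hypothetical graph (reconstructed, not from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 5), (3, 4), (3, 5), (5, 6), (5, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_counts(adj, source):
    """Return (distance, number of shortest paths) from source to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma


def betweenness(edges):
    adj = build_adjacency(edges)
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for i, j in combinations(sorted(adj), 2):
        dist_i, sigma_i = info[i]
        if j not in dist_i:
            continue  # no path between i and j
        for v in adj:
            if v in (i, j) or v not in dist_i:
                continue
            dist_v, sigma_v = info[v]
            # v lies on a shortest i-j path when the two legs add up to the full distance.
            if j in dist_v and dist_i[v] + dist_v[j] == dist_i[j]:
                score[v] += sigma_i[v] * sigma_v[j] / sigma_i[j]
    return score


scores = betweenness(EDGES)
print(scores)
```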
  31. Closeness centrality
      • Calculate the mean length of all shortest paths from a node to all other nodes in the network (i.e. how many hops on average it takes to reach every other node).
      • Take the reciprocal of the above value so that higher values are 'better' (indicate higher closeness), as in other measures of centrality.
      • It is a measure of reach, i.e. the speed with which information can reach other nodes from a given starting node.
      Closeness by node: 1: 0.5, 2: 0.67, 3: 0.75, 4: 0.46, 5: 0.75, 6: 0.46, 7: 0.46. Nodes 3 and 5 have the highest (i.e. best) closeness, while node 2 fares almost as well.
      Note: Sometimes closeness is calculated without taking the reciprocal of the mean shortest path length. Then lower values are 'better'.
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
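The reciprocal-of-mean-distance definition can be sketched with one breadth-first search per node. The code assumes a connected, undirected graph; the edge list is again an assumption reconstructed from the values reported for the hypothetical graph.

```python
from collections import deque

# Assumed edge list for the seven-node hypothetical graph (reconstructed, not from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 5), (3, 4), (3, 5), (5, 6), (5, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closeness(adj, source):
    """Reciprocal of the mean shortest-path length from source to all other nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    others = len(dist) - 1  # assumes a connected graph with at least two nodes
    return others / sum(dist.values())


adj = build_adjacency(EDGES)
values = {v: round(closeness(adj, v), 2) for v in sorted(adj)}
print(values)
```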
  32. Eigenvector centrality
      • A node's eigenvector centrality is proportional to the sum of the eigenvector centralities of all nodes directly connected to it.
      • In other words, a node with a high eigenvector centrality is connected to other nodes with high eigenvector centrality.
      • This is similar to how Google ranks web pages: links from highly linked-to pages count more.
      • Useful in determining who is connected to the most connected nodes.
      Eigenvector centrality by node: 1: 0.36, 2: 0.49, 3: 0.54, 4: 0.19, 5: 0.49, 6: 0.17, 7: 0.17. Node 3 has the highest eigenvector centrality, closely followed by nodes 2 and 5.
      Note: The term 'eigenvector' comes from mathematics (matrix algebra), but it is not necessary for understanding how to interpret this measure.
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
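The "proportional to the sum of the neighbours' scores" idea can be sketched as a power iteration: start with equal scores, repeatedly replace each score by a sum over neighbours, and rescale. The version below is a minimal illustration, not the method used in the deck. It scales the result to unit length, which appears to match the slide's values, and it uses an edge list reconstructed from the reported figures; both are assumptions.

```python
import math

# Assumed edge list for the seven-node hypothetical graph (reconstructed, not from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 5), (3, 4), (3, 5), (5, 6), (5, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eigenvector_centrality(adj, iterations=200):
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Multiply by (A + I): same eigenvectors as A, and the shift avoids oscillation.
        new = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values()))
        score = {v: x / norm for v, x in new.items()}
    return score


centrality = eigenvector_centrality(build_adjacency(EDGES))
print({v: round(x, 2) for v, x in sorted(centrality.items())})
```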
  33. Network centralities
        Node  Degree  Betweenness  Closeness  Eigenvector
        1     2       0            0.5        0.36
        2     3       1.5          0.67       0.49
        3     4       6.5          0.75       0.54
        4     1       0            0.46       0.19
        5     4       9            0.75       0.49
        6     1       0            0.46       0.17
        7     1       0            0.46       0.17
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  34. Closeness centrality
      Source: www.activenetwork.net
  35. Betweenness centrality
      Source: www.activenetwork.net
  36. Eigenvector centrality
      Source: www.activenetwork.net
  37. Interpretation of measures
      • Degree. Description: indicates the level of information flow through the node. Value meaning: low value, little information flows through; high value, much information flows through.
      • Closeness. Description: indicates how central the node is within the network; used to measure how fast information spreads from the node through the network. Value meaning: low value, more important and faster spread; high value, less important and slower spread (this reading applies when closeness is reported without taking the reciprocal, as in the note on slide 31).
      • Betweenness. Description: indicates the node's level of importance or role within the network. Value meaning: low value, less important; high value, more important.
      • Eigenvector. Description: measures the node's level of influence within the network. Value meaning: low value, little influence; high value, much influence.
  38. Basic Concepts: Cohesion
      How to characterize a network's structure
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
  39. Reciprocity (degree of)
      • The ratio of the number of relations which are reciprocated (i.e. there is an edge in both directions) over the total number of relations in the network,
      • where two vertices are said to be related if there is at least one edge between them.
      • In the example to the right this would be 2/5 = 0.4 (whether this is considered high or low depends on the context).
      • A useful indicator of the degree of mutuality and reciprocal exchange in a network, which relate to social cohesion.
      • Only makes sense in directed graphs.
      [Figure: directed graph with vertices 1 to 4; reciprocity for network = 0.4]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
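The ratio described above can be sketched as a count over related vertex pairs. The example graph on the slide is only available as a figure, so the directed edge list below is hypothetical: it was chosen to have five related pairs, two of them reciprocated, so that it yields the same 2/5 ratio.

```python
# Hypothetical directed edges: pairs {1,2} and {3,4} are reciprocated, the other three are one-way.
DIRECTED_EDGES = [(1, 2), (2, 1), (1, 3), (2, 3), (2, 4), (3, 4), (4, 3)]


def reciprocity(directed_edges):
    """Share of related vertex pairs that have an edge in both directions."""
    edge_set = set(directed_edges)
    # Two vertices are related if at least one edge joins them (self-loops ignored).
    pairs = {frozenset(edge) for edge in edge_set if edge[0] != edge[1]}
    mutual = 0
    for pair in pairs:
        u, v = tuple(pair)
        if (u, v) in edge_set and (v, u) in edge_set:
            mutual += 1
    return mutual / len(pairs)


ratio = reciprocity(DIRECTED_EDGES)
print(ratio)
```

Note that this follows the pair-based definition on the slide; graph libraries may default to an edge-based ratio instead, which gives a different number for the same graph.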
  40. Density
      • A network's density is the ratio of the number of edges in the network over the total number of possible edges between all pairs of nodes (which is n(n-1)/2, where n is the number of vertices, for an undirected graph).
      • In the example network to the right, density = 5/6 = 0.83 (i.e. it is a fairly dense network; the opposite would be a sparse network).
      • It is a common measure of how well connected a network is (in other words, how closely knit it is). A perfectly connected network is called a clique and has density = 1.
      • A directed graph will have half the density of its undirected equivalent, because there are twice as many possible edges, i.e. n(n-1).
      • Density is useful in comparing networks against each other, or in doing the same for different regions within a single network.
      [Figure: two graphs with vertices 1 to 4. Undirected: density = 5/6 = 0.83. Directed: density = 5/12 = 0.42. Legend: edge present in network; possible but not present]
      Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods. Communications and New Media, National University of Singapore
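The two density figures on the slide follow directly from the formula. A small sketch (the function name is hypothetical):

```python
def density(num_nodes, num_edges, directed=False):
    """Edges present divided by edges possible between all pairs of nodes."""
    possible = num_nodes * (num_nodes - 1)  # ordered pairs, for a directed graph
    if not directed:
        possible //= 2  # each unordered pair counted once
    return num_edges / possible


undirected_density = density(4, 5)
directed_density = density(4, 5, directed=True)
print(round(undirected_density, 2), round(directed_density, 2))
```

With 4 vertices and 5 edges this gives 5/6 for the undirected case and 5/12 for the directed one.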
  41. 41. Clustering
      › A node's clustering coefficient is the number of closed triplets in the node's neighborhood over the total number of triplets in the neighborhood. It is also known as transitivity.
      › $\text{clustering coefficient} = \frac{CT}{TT}$, where CT = closed triplets and TT = total number of triplets
      › Clustering coefficient of nodes 1 to 7: 1, 0.67, 0.33, N/a, 0.17, N/a, N/a
      › Network clustering coefficient = 0.375 (3 nodes in each triangle x 2 triangles = 6 closed triplets divided by 16 total)
      Figure labels: Cluster A, Cluster B
  42. 42. Clustering
      › A node's clustering coefficient is the number of closed triplets in the node's neighborhood over the total number of triplets in the neighborhood. It is also known as transitivity.
      › E.g., node 1 in the example network has a value of 1 because it is only connected to nodes 2 and 3, and these nodes are also connected to one another (i.e. the only triplet in the neighborhood of node 1 is closed). We say that nodes 1, 2, and 3 form a clique.
      › clustering coefficient (node 1) = 1
  43. 43. Clustering
      › $K_n$ = degree of node n
      › $N_n$ = number of links between neighbors of node n
      › $CC_n$ = clustering coefficient of node n
      › $CC_n = \frac{2 N_n}{K_n (K_n - 1)}$
      › $CC(\text{node 2}) = \frac{2(2)}{3(3 - 1)} = \frac{4}{6} = 0.67$
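The clustering slides can be reproduced with a short base-R sketch. The figure is not in the transcript, so the 7-node edge list below is an assumption: it was inferred from the per-node values quoted on the slides (clustering coefficients, betweenness, diameter) and is consistent with all of them.

```r
# Undirected edge list inferred from the slide values (an assumption).
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(2, 5),
               c(3, 4), c(3, 5), c(5, 6), c(5, 7))
n <- 7
A <- matrix(0, n, n)
A[edges] <- 1
A <- A + t(A)                        # make the adjacency matrix symmetric

# Local coefficient: CC_n = 2 * N_n / (K_n * (K_n - 1))
local_cc <- sapply(seq_len(n), function(v) {
  nb <- which(A[v, ] == 1)           # neighbours of node v
  k  <- length(nb)                   # K_n: degree of the node
  if (k < 2) return(NA)              # no triplets, coefficient undefined (N/a)
  links <- sum(A[nb, nb]) / 2        # N_n: links between the neighbours
  2 * links / (k * (k - 1))
})
print(round(local_cc, 2))            # 1, 0.67, 0.33, NA, 0.17, NA, NA

# Network coefficient: closed triplets over all triplets
degree          <- rowSums(A)
total_triplets  <- sum(choose(degree, 2))           # one triplet per neighbour pair
closed_triplets <- sum(diag(A %*% A %*% A)) / 2     # 3 closed triplets per triangle
network_cc      <- closed_triplets / total_triplets
print(network_cc)                    # 0.375
```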
  44. 44. Clustering
      › Clustering algorithms identify clusters or 'communities' within networks based on network structure and specific clustering criteria:
        › Hierarchical clustering
        › Similarity-based clustering
        › Betweenness clustering (the example with two clusters, Cluster A and Cluster B, is based on edge betweenness, the equivalent for edges of the betweenness centrality presented earlier for nodes)
      › Values shown on nodes 1 to 7: 0, 1.5, 6.5, 0, 9, 0, 0
  45. 45. Average and longest distance
      › The longest shortest path (distance) between any two nodes in a network is called the network's diameter
      › The diameter of the example network (nodes 1 to 7) is 3; it is a useful measure of the reach of the network (as opposed to looking only at the total number of vertices or edges)
      › It also indicates how long it will take at most to reach any node in the network (sparser networks will generally have greater diameters)
      › The average of all shortest paths in a network is also interesting because it indicates how far apart any two nodes will be on average (average distance)
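As a rough illustration of diameter and average distance, the base-R sketch below runs the Floyd-Warshall all-pairs shortest path algorithm on a 7-node network. The edge list is an assumption inferred from the values quoted on the neighbouring slides, not the figure itself.

```r
# Undirected edge list inferred from the slide values (an assumption).
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(2, 5),
               c(3, 4), c(3, 5), c(5, 6), c(5, 7))
n <- 7

D <- matrix(Inf, n, n)       # D[i, j]: best known distance from i to j
diag(D) <- 0
D[edges] <- 1                # direct links have length 1
D[edges[, 2:1]] <- 1         # same links in the other direction

# Floyd-Warshall: allow each node k in turn as an intermediate stop
for (k in seq_len(n)) {
  D <- pmin(D, outer(D[, k], D[k, ], "+"))
}

pair_dist        <- D[upper.tri(D)]     # each pair of nodes once
network_diameter <- max(pair_dist)      # longest shortest path
average_distance <- mean(pair_dist)     # mean shortest path length

print(network_diameter)                 # 3
print(round(average_distance, 2))       # 1.81
```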
  46. 46. Social Network Analysis with R & Visualizing Social Network
  47. 47. Visualizing Social Network
  48. 48. Network Analysis and Visualization with R and igraph
      Katherine Ognyanova, www.kateto.net
      Assistant Professor at the School of Communication and Information at Rutgers University
      NetSciX 2016 School of Code Workshop, Wroclaw, Poland
  49. 49. Basic SNA with R and igraph
      › Networks
      › Tie Strength
      › Key Players
      › Cohesion
      Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial, the NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
  50. 50. Basic SNA with R and igraph
      › Networks
        › Create network
        › Edge, Vertex and Attributes
        › Read network data from files
        › Turning networks into igraph objects
      › Tie Strength
      › Key Players
      › Cohesion
  51. 51. Create Network
      › library(igraph)
      › # Undirected graph with 3 vertices and 3 edges
      › g1 <- graph( edges=c(1,2, 2,3, 3,1), n=3, directed=FALSE )
      › plot(g1)
  52. 52. Create Network
      › # Directed by default; n=10 adds 7 isolated vertices
      › g2 <- graph( edges=c(1,2, 2,3, 3,1), n=10 )
      › plot(g2)
  53. 53. Create Network
      › # Named vertices: the edge list is given as pairs of names
      › g3 <- graph( c("John", "Jim", "Jim", "Jill", "Jill", "John") )
      › plot(g3)
  54. 54. Create Network
      › # Named vertices plus isolates that appear in no edge
      › g4 <- graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"),
                     isolates=c("Jesse", "Janis", "Jennifer", "Justin") )
      › plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15,
             vertex.frame.color="gray", vertex.label.color="black",
             vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)
  55. 55. Create Network
      › # "-" draws undirected ties, "+" marks an arrowhead
      › plot(graph_from_literal(a---b, b---c))
      › plot(graph_from_literal(a--+b, b+--c))
      › plot(graph_from_literal(a+-+b, b+-+c))
  56. 56. Edge, Vertex and Attributes
      › # The edges of the object
      › E(g4)
      › # The vertices of the object
      › V(g4)
      › # The network matrix
      › g4[]
  57. 57. Edge, Vertex and Attributes
      › g4
        › vertex_attr: name, gender
        › edge_attr: type, weight
  58. 58. Read Network from files
      3.1 DATASET 1: edgelist
      › Dataset1-Media-Example-NODES.csv
      › Dataset1-Media-Example-EDGES.csv
      3.2 DATASET 2: matrix
      › Dataset2-Media-User-Example-NODES.csv
      › Dataset2-Media-User-Example-EDGES.csv
  59. 59. Turning networks into igraph objects
      We start by converting the raw data to an igraph network object. Here we use igraph's graph_from_data_frame function (named graph.data.frame in older igraph releases), which takes two data frames: d and vertices.
      › d describes the edges of the network. Its first two columns are the IDs of the source and the target node for each edge. The following columns are edge attributes (weight, type, label, or anything else).
      › vertices starts with a column of node IDs. Any following columns are interpreted as node attributes.
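A small sketch of the conversion described on this slide, assuming the igraph package is installed. The two data frames are hypothetical stand-ins for the NODES and EDGES files; their column names and values are invented for illustration.

```r
library(igraph)

# Hypothetical stand-ins for the NODES and EDGES files.
nodes <- data.frame(id    = c("s01", "s02", "s03"),
                    media = c("NYT", "BBC", "CNN"),
                    stringsAsFactors = FALSE)
links <- data.frame(from   = c("s01", "s01", "s02"),
                    to     = c("s02", "s03", "s03"),
                    weight = c(10, 5, 2),
                    type   = c("hyperlink", "mention", "hyperlink"),
                    stringsAsFactors = FALSE)

# d: first two columns are source and target IDs, the rest become edge attributes.
# vertices: first column is the node ID, the rest become node attributes.
net <- graph_from_data_frame(d = links, vertices = nodes, directed = TRUE)

V(net)$media    # node attribute read from `nodes`
E(net)$weight   # edge attribute read from `links`
```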
  64. 64. Basic SNA with R and igraph
      › Networks
      › Tie Strength
        › Plotting networks with igraph
        › Assign weight
        › Assign type of tie
        › Delete tie
      › Key Players
      › Cohesion
  65. 65. Plotting networks with igraph
      We can set the node & edge options in two ways:
      › Specify them in plot()
      › Set attributes and add them to the igraph object
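The two ways of setting plot options might look like the sketch below, assuming igraph is installed. The small triangle graph and the chosen colours are illustrative only.

```r
library(igraph)

g <- make_graph(edges = c(1, 2, 2, 3, 3, 1), n = 3, directed = FALSE)

# Option 1: pass the options to plot(); they apply to this drawing only
plot(g, vertex.color = "gold", vertex.size = 15, edge.color = "gray")

# Option 2: store the options as attributes of the igraph object
V(g)$color <- "gold"
V(g)$size  <- 15
E(g)$color <- "gray"
plot(g)   # uses the stored attributes
```

Storing attributes on the object keeps the styling with the network, which can help when the same graph is plotted several times.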
  66. 66. Basic SNA with R and igraph
      › Networks
      › Tie Strength
      › Key Players
        › Network centrality with igraph
          › Degree
          › Closeness
          › Betweenness
          › Eigenvector
      › Cohesion
  67. 67. Network centrality with igraph
      › Degree centrality
      › Closeness centrality
      › Betweenness centrality
      › Eigenvector centrality
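The four centrality measures can be computed with igraph roughly as follows. This is a sketch: the 7-node edge list is an assumption inferred from the values quoted on the earlier concept slides, not data from the workshop.

```r
library(igraph)

# Undirected 7-node example; edge list inferred from the concept slides (an assumption).
g <- make_graph(edges = c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7),
                n = 7, directed = FALSE)

deg <- degree(g)                    # number of ties per node
clo <- closeness(g)                 # inverse of the summed distance to all other nodes
btw <- betweenness(g)               # shortest paths passing through each node
eig <- eigen_centrality(g)$vector   # influence weighted by the neighbours' scores

data.frame(node = 1:7, degree = deg, closeness = round(clo, 3),
           betweenness = btw, eigenvector = round(eig, 3))
```

For this assumed network the betweenness values come out as 0, 1.5, 6.5, 0, 9, 0, 0 for nodes 1 to 7.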
  68. 68. Network centrality with igraph
  69. 69. Basic SNA with R and igraph
      › Networks
      › Tie Strength
      › Key Players
      › Cohesion
        › Density and Reciprocity
        › Clustering
        › Averages & longest distance
  70. 70. Transitivity with igraph
  71. 71. Community Clustering based on edge betweenness
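Community detection based on edge betweenness could be sketched with igraph as below. The 7-node edge list is again an assumption inferred from the concept slides, and the communities found depend on that assumed structure.

```r
library(igraph)

# Undirected 7-node example; edge list inferred from the concept slides (an assumption).
g <- make_graph(edges = c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7),
                n = 7, directed = FALSE)

eb  <- edge_betweenness(g)           # shortest paths running over each edge
ceb <- cluster_edge_betweenness(g)   # repeatedly removes the highest-betweenness edge

membership(ceb)   # community label for each node
sizes(ceb)        # number of nodes per community
plot(ceb, g)      # network with the communities highlighted
```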
  72. 72. Averages & longest distance
  73. 73. Using SNA to predict Subscriber churn in Telecommunication industry
  74. 74. Customer Churn
  75. 75. What is Customer Churn?
      "Churn represents the loss of an existing customer to a competitor"
      A prevalent problem in:
      › Telecommunication services
      › Home mortgage refinance
      › Credit card
      Churn is especially important to mobile phone service providers:
      › It is easy for a subscriber to switch services.
      › Mobile Number Portability (MNP) will remove the last important obstacle.
  76. 76. Core CRM
      The core CRM objectives include:
      – Customer Acquisition
      – Customer Retention
        › Churn Prediction
      – Cross-sell/Up-sell
      – Maximizing Customer Lifetime Value
  77. 77. Core CRM in Telecommunication
      § Segmentation
      § Predictive Analytics
      § Customer Acquisition
        § Cost of Customer Acquisition (COCA)
      § Servicing
        § QoS (Network)
        § Call Center, Service Hall/Center (Net Promoter Score)
      § Customer Retention
        § Churn Prediction
      § Customer Lifetime Value (CLV)
  78. 78. Segmentation
      Most telcos define their customer segments using some of the following:
      › Payment type (prepaid vs. postpaid)
      › ARPU (Average Revenue Per User, i.e. revenue generated)
      › Tenure (age of user: AOU)
      › Demographics (location, income, job, gender, age, etc.)
      › Usage: voice, data, other non-voice, roaming
      › Handsets/Devices: 2G, 3G, 4G device; smartphone vs. feature phone
      › Package: package, price plans
  79. 79. Predictive Analytics
      Process current and historical data in order to make predictions about future events.
      › Making customer decisions
      › Next Best Offer
      › Package & price plan
      › Cross-sell & up-sell opportunities
      › Credit scoring for setting dynamic limits (risk management)
      › Fraud detection (postpaid only)
      › Revenue allocation
      › Customer Lifetime Value
  80. 80. Servicing
      › Quality of Service
        – Network utilization
        – Dropped calls
      › Call Center, Service Hall
        – The number of times a customer makes contact (complains) via Call Center/Service Hall
        – Service scoring (Net Promoter Score)
  81. 81. Customer Retention
      › Type of Churn
        – Voluntary Churn
        – Involuntary Churn
      › Type of Customer Retention
        – Reactive
        – Proactive
  82. 82. The Churn Taxonomy
      Source: Mohammed Akram Ayyubi, Define - The Churn Reasons Taxonomy
  83. 83. Data Source
      Customer Demographic:
      • Zip code
      • Income
      • Occupation
      • Age
      • Gender
      • Living Address
      • Occupation Address
      Order:
      • Customer Type (Corp/SME/Individual)
      • Payment Type (Pre/Post)
      • Current Package
      • Package Plan
      • ARPU
      • Additional product/service
      Customer Relation:
      • Number of questions about the services, e.g. from IVR
      • Number of visits to retail shops or online website
      • Number of complaints solved
      • Number of total complaints
      Service Usage (CDR):
      • Number of calls
      • Volume of data usage
      • Number of outgoing calls
      • Number of incoming calls
      • Number of roaming calls
      • Number of international calls
      • Total minutes of usage (MOU)/Volume
      • Number of dropped calls
  84. 84. Data Source
      Billing Data:
      • Total amount of bill
      • Total number of barred (one-way barred)
      • Total number of full barred (two-way barred)
      Network:
      • Cell Site Location
      • Network Type (4G/3G/2G)
      • Network Utilization
      • QoS
  85. 85. Data Mining Functionalities, Techniques & CRM Applications
  86. 86. Churn Modeling Techniques
  87. 87. Subscribers 25M
      › Prepaid Subs 4M
      › Postpaid Subs 21M
      › Prepaid ARPU 550 THB
      › Postpaid ARPU 160 THB
      Source: NBTC (http://ttid.nbtc.go.th/mobile_db.html)
  88. 88. Individual (consumer) monthly-plan subscribers of one mobile network operator who had the status of having cancelled the service during July to August 2015 (B.E. 2558)
      Timeline: July, August
      › Subscribers 1.5 M
      › Churner 8.2 K
  89. 89. 1. Subscriber Centrality
         – X1: Subscriber Centrality Degree
         – X2: Subscriber Centrality Closeness
         – X3: Subscriber Centrality Betweenness
         – X4: Subscriber Centrality Eigenvector
      2. Number of Neighbors
         – X5: Number of Churn Neighbors
         – X6: Number of Non-Churn Neighbors
      3. Churn Neighbors Centrality
         – X7: Churn Neighbors Centrality Degree
         – X8: Churn Neighbors Centrality Closeness
         – X9: Churn Neighbors Centrality Betweenness
         – X10: Churn Neighbors Centrality Eigenvector
      4. Non-Churn Neighbors Centrality
         – X11: Non-Churn Neighbors Centrality Degree
         – X12: Non-Churn Neighbors Centrality Closeness
         – X13: Non-Churn Neighbors Centrality Betweenness
         – X14: Non-Churn Neighbors Centrality Eigenvector
  90. 90. Diagram labels: Subscriber Centrality, Number of Neighbors, Churn Neighbors Centrality, Non-Churn Neighbors Centrality, Subscriber Churn Status
  91. 91. Process diagram: (1) Churner and CDR data, (2) K-Mean clustering (Cluster 1, Cluster 2, Cluster 3), (3) Cluster selection, (4) adjacency matrix, (5) Centrality calculation, (6) Churn Prediction
  92. 92. Timeline: June, July, August
      › Subscribers 1.5 M
      › Churner 8.2 K
  93. 93. Summarize the call minutes (Call Duration) and the number of calls (Number of Call) into a "per subscriber per call" format; this gives 1,747,835 relationships (links).
      MO   MT   Call Duration   Number of Call
      A    B     3              1
      A    C     2              4
      A    D    16              2
      B    D    23              9
      E    D     1              1
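One way the summarization step might be done in base R is sketched below. The raw call records are hypothetical (one row per call, invented so that the totals match four of the rows in the table above); the deck does not show the actual CDR layout or the code used.

```r
# Hypothetical raw call records: one row per call (caller MO, callee MT, minutes).
cdr <- data.frame(
  MO      = c("A", "A", "A", "A", "A", "A", "A", "E"),
  MT      = c("B", "C", "C", "C", "C", "D", "D", "D"),
  minutes = c(3, 0.5, 0.5, 0.5, 0.5, 10, 6, 1),
  stringsAsFactors = FALSE
)
cdr$calls <- 1   # each record counts as one call

# One row per (MO, MT) pair: total minutes and total number of calls
links <- aggregate(cbind(minutes, calls) ~ MO + MT, data = cdr, FUN = sum)
names(links) <- c("MO", "MT", "call_duration", "number_of_call")
print(links)
```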
  94. 94. Group the links with K-mean clustering, using the call minutes (Call Duration) and the number of calls (Number of Call) as the clustering variables, with k = 3. The resulting clusters are shown in the figure.
      Figure legend: Blue: Cluster 1, Yellow: Cluster 2, Red: Cluster 3
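A minimal sketch of this clustering step using base R's `kmeans()`. The twelve links below are synthetic values invented for illustration, and whether the original analysis scaled the variables first is not stated, so no scaling is applied here.

```r
set.seed(42)   # k-means uses random starting centres

# Synthetic links (hypothetical values, not the study data).
links <- data.frame(
  call_duration  = c(2, 3, 1, 2, 4, 2,  210, 220, 215,  1990, 2000, 2005),
  number_of_call = c(2, 1, 2, 3, 2, 2,  100, 101, 103,   340,  350,  349)
)

# k = 3 as on the slide; nstart repeats the random initialisation and keeps the best fit
fit <- kmeans(links, centers = 3, nstart = 50)

links$cluster <- fit$cluster
table(links$cluster)   # number of links in each cluster
fit$centers            # cluster centres (mean duration and mean number of calls)
```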
  95. 95. Why choose K = 3?
      1. Consider the number of members in each cluster as the number of clusters increases.
         Number of transactions in each cluster:
         Cluster   K = 2       K = 3       K = 4       K = 5       K = 6
         1         1,737,166   1,725,216   1,705,381   1,675,572   1,638,302
         2            10,669      21,801      38,463      61,628      88,167
         3                           818       3,831       9,388      16,924
         4                                       160       1,160       3,701
         5                                                    87         664
         6                                                                77
      2. Consider the ratio of inter-cluster distances to intra-cluster distances as the number of clusters increases.
  96. 96. Figure legend: Blue: Cluster 1, Yellow: Cluster 2, Red: Cluster 3
      Cluster     Median of Call Duration   Median of Number of Call   Group Description
      Cluster#1       2.13                    2.00                     Few calls, short calls
      Cluster#2   1,998.03                  348.50                     Few calls, long calls
      Cluster#3     215.97                  101.00                     Frequent calls, short calls
  97. 97. Merging Cluster#2 and Cluster#3 gives the cluster shown in the figure.
      Figure legend: Yellow: Cluster 2, Red: Cluster 3
  98. 98. Prepaid Customer, Heavy Voice Call, On Net
  99. 99. Edge list, adjacency matrix and Social Network Graph
      Edge list:
        MO   MT
        A    B
        A    C
        A    D
        B    D
        E    D
      adjacency matrix:
            A  B  C  D  E
        A   0  1  1  1  0
        B   1  0  0  1  0
        C   1  0  0  0  0
        D   1  1  0  0  1
        E   0  0  0  1  0
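The conversion from the edge list to the adjacency matrix on this slide can be sketched in base R as follows. Treating each tie as undirected (symmetric matrix) follows the matrix shown above; the variable names are illustrative.

```r
# Edge list from the slide: caller (MO) and callee (MT)
edge_list <- data.frame(MO = c("A", "A", "A", "B", "E"),
                        MT = c("B", "C", "D", "D", "D"),
                        stringsAsFactors = FALSE)

subs <- sort(unique(c(edge_list$MO, edge_list$MT)))   # "A" "B" "C" "D" "E"
adj  <- matrix(0, nrow = length(subs), ncol = length(subs),
               dimnames = list(subs, subs))

# Mark each tie in both directions so the matrix is symmetric
adj[cbind(edge_list$MO, edge_list$MT)] <- 1
adj[cbind(edge_list$MT, edge_list$MO)] <- 1

print(adj)
```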
  100. 100. Churner Non-churner
  101. 101. Churner Non-churner
  102. 102. Churner Non-churner
  103. 103. Using the adjacency matrix obtained in step 4, compute four centrality values for each subscriber, based on graph theory, as follows:
       › Degree
         – Description: indicates the level of information flow through the node
         – Value meaning: low value, little information flows through; high value, much information flows through
       › Closeness
         – Description: indicates how central the node is within the network; used to measure how fast information spreads from the node into the network
         – Value meaning: low value, more important; high value, less important. Low value, spreads fast; high value, spreads slowly
       › Betweenness
         – Description: indicates the level of importance or the role of the node within the network
         – Value meaning: low value, less important; high value, more important
       › Eigenvector
         – Description: measures the level of influence of the node within the network
         – Value meaning: low value, little influence; high value, much influence
  104. 104. Step 5.1 Subscriber Centrality and Step 5.2 Neighbors Centrality
       5.1 Compute the centrality of every subscriber (Subscriber Centrality); this gives part 1.
       5.2 Then compute the Neighbours Centrality of every subscriber; this gives parts 2, 3, 4 and 5 respectively.
       Parts labelled in the diagram: (1) Subscriber Centrality, (2) Number of Neighbors, (4) Churn Neighbors Centrality, (5) Non-Churn Neighbors Centrality
       adjacency matrix:
             A  B  C  D  E
         A   0  1  1  1  0
         B   1  0  0  1  0
         C   1  0  0  0  0
         D   1  1  0  0  1
         E   0  0  0  1  0
  105. 105. Compute the centrality values to create the independent variables, 14 variables in total
       1. Subscriber Centrality
          – X1: Subscriber Centrality Degree
          – X2: Subscriber Centrality Closeness
          – X3: Subscriber Centrality Betweenness
          – X4: Subscriber Centrality Eigenvector
       2. Number of Neighbors
          – X5: Number of Churn Neighbors
          – X6: Number of Non-Churn Neighbors
  106. 106. Compute the centrality values to create the independent variables, 14 variables in total
       3. Churn Neighbors Centrality
          – X7: Churn Neighbors Centrality Degree
          – X8: Churn Neighbors Centrality Closeness
          – X9: Churn Neighbors Centrality Betweenness
          – X10: Churn Neighbors Centrality Eigenvector
       4. Non-Churn Neighbors Centrality
          – X11: Non-Churn Neighbors Centrality Degree
          – X12: Non-Churn Neighbors Centrality Closeness
          – X13: Non-Churn Neighbors Centrality Betweenness
          – X14: Non-Churn Neighbors Centrality Eigenvector
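A sketch of how some of these neighbor-based variables might be derived from an adjacency matrix in base R. Two things here are assumptions, not facts from the deck: the churn flags are hypothetical, and a subscriber's "neighbors centrality" is taken to be the sum of the neighbours' values (the slides do not say how neighbour values are aggregated). Only the degree-based variables are shown.

```r
subs <- c("A", "B", "C", "D", "E")
adj <- matrix(c(0, 1, 1, 1, 0,
                1, 0, 0, 1, 0,
                1, 0, 0, 0, 0,
                1, 1, 0, 0, 1,
                0, 0, 0, 1, 0),
              nrow = 5, byrow = TRUE, dimnames = list(subs, subs))

churn  <- c(A = 0, B = 0, C = 0, D = 1, E = 0)   # hypothetical churn flags (1 = churner)
degree <- rowSums(adj)                           # X1: Subscriber Centrality Degree

features <- data.frame(
  subscriber = subs,
  X1  = degree,
  X5  = as.vector(adj %*% churn),                    # number of churn neighbours
  X6  = as.vector(adj %*% (1 - churn)),              # number of non-churn neighbours
  X7  = as.vector(adj %*% (churn * degree)),         # summed degree of churn neighbours (assumed aggregation)
  X11 = as.vector(adj %*% ((1 - churn) * degree))    # summed degree of non-churn neighbours (assumed aggregation)
)
print(features)
```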
  107. 107. Principal Component Analysis
       The independent variables X3–X14 are used for the Principal Component Analysis. Three factors are selected in total so that the meaning of each factor can be defined clearly.
  108. 108. Principal Component Analysis
       From the PCA, the following factors were extracted (factor extraction):
       › Factor1: Churner Neighbors Centrality
       › Factor2: Non-Churner Neighbors Centrality
       › Factor3: Social Network Eigenvalue
       Variable                                           Factor1   Factor2   Factor3
       X8   Churn Neighbors Centrality Closeness          1
       X5   Number of Churn Neighbors                     0.998
       X9   Churn Neighbors Centrality Betweenness        0.701
       X7   Churn Neighbors Centrality Degree             0.555
       X6   Number of Non-Churn Neighbors                           1.013
       X12  Non-Churn Neighbors Centrality Closeness                1.013
       X11  Non-Churn Neighbors Centrality Degree                   0.596
       X13  Non-Churn Neighbors Centrality Betweenness              0.315
       X10  Churn Neighbors Centrality Eigenvector                            1.048
       X4   Subscriber Centrality Eigenvector                                 0.866
       X14  Non-Churn Neighbors Centrality Eigenvector                        0.407
  109. 109. Principal Component Analysis
       › X1: Subscriber Centrality Degree
       › X2: Subscriber Centrality Closeness
       › X3: Subscriber Centrality Betweenness
       › PA1: Churn Neighbors Centrality
         – X8: Churn Neighbors Centrality Closeness
         – X5: Number of Churn Neighbors
         – X9: Churn Neighbors Centrality Betweenness
         – X7: Churn Neighbors Centrality Degree
       › PA2: Non-Churn Neighbors Centrality
         – X6: Number of Non-Churn Neighbors
         – X12: Non-Churn Neighbors Centrality Closeness
         – X11: Non-Churn Neighbors Centrality Degree
         – X13: Non-Churn Neighbors Centrality Betweenness
       › PA3: Social Network Eigenvalue
         – X10: Churn Neighbors Centrality Eigenvector
         – X4: Subscriber Centrality Eigenvector
         – X14: Non-Churn Neighbors Centrality Eigenvector
       › Y: Subscriber Churn Status
  110. 110. In this step, the churn prediction model is built and the model is evaluated.
       Inputs:
       › X1: Subscriber Centrality Degree
       › X2: Subscriber Centrality Closeness
       › X3: Subscriber Centrality Betweenness
       › PC1: Churn Neighbors Centrality
       › PC2: Non-Churn Neighbors Centrality
       › PC3: Social Network Eigenvalue
       Diagram labels: Train, Churn Models, Test, Churn Prediction, Churn model evaluation, ROC
  111. 111. Dependent Variable (DV), all three models: Y: Subscriber Churn Status
       Independent Variables (IV), 3 per model:
       › Model#1: X1: Subscriber Centrality Degree; PC1: Non-Churn Neighbors Centrality; PC2: Churn Neighbors Centrality
       › Model#2: X1: Subscriber Centrality Degree; X2: Subscriber Centrality Closeness; X3: Subscriber Centrality Betweenness
       › Model#3: X1: Subscriber Centrality Degree; X2: Subscriber Centrality Closeness; PC3: Social Network Eigenvalue
                                                Model#1            Model#2            Model#3
       Akaike information criterion (AIC)       22710              23546              19540
       Pseudo R-Squared
         llh                                    -11351.1129        -11769.0146        -9765.7565
         llhNull                                -14840.6155        -14840.6155        -14840.6155
         McFadden's pseudo r-squared            0.2351             0.2070             0.3420
         Cragg and Uhler's pseudo r-squared     0.3687             0.3305             0.5009
       Model evaluation
         Accuracy                               0.7247             0.7201             0.7270
         95% CI                                 (0.7155, 0.7338)   (0.7108, 0.7292)   (0.7178, 0.736)
         Kappa                                  0.4072             0.3969             0.4115
         Sensitivity                            0.9891             0.9864             0.9944
         Specificity                            0.3938             0.3867             0.3923
         Pos Pred Value                         0.6713             0.6681             0.6719
         Neg Pred Value                         0.9666             0.9579             0.9824
         Prevalence                             0.5559             0.5559             0.5559
         Detection Rate                         0.5498             0.5483             0.5527
         Detection Prevalence                   0.8191             0.8207             0.8226
         Balanced Accuracy                      0.6915             0.6866             0.6934
         AUC                                    0.8556             0.8511             0.8581
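The McFadden and AIC rows of this table can be checked from the reported log-likelihoods with a few lines of base R. The parameter count k = 4 (three predictors plus an intercept) is an assumption; the deck does not state the model form.

```r
# Log-likelihoods reported on the slide
llh      <- c(model1 = -11351.1129, model2 = -11769.0146, model3 = -9765.7565)
llh_null <- -14840.6155

k <- 4   # assumption: 3 predictors + 1 intercept per model

mcfadden <- 1 - llh / llh_null   # McFadden's pseudo R-squared
aic      <- 2 * k - 2 * llh      # Akaike information criterion

print(round(mcfadden, 4))   # 0.2351 0.2070 0.3420
print(round(aic))           # 22710 23546 19540
```

Under that assumption both rows match the slide, and Model#3 has the highest pseudo R-squared and the lowest AIC of the three.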
  112. 112. Confusion Matrix
       Model#1         Reference 1   Reference 0
       Predicted 1     1,622         56
       Predicted 0     2,497         5,099
       AUC 0.8556

       Model#2         Reference 1   Reference 0
       Predicted 1     1,593         70
       Predicted 0     2,526         5,085
       AUC 0.8511

       Model#3         Reference 1   Reference 0
       Predicted 1     1,616         29
       Predicted 0     2,503         5,126
       AUC 0.8581
       Figures: ROC Model#1, ROC Model#2, ROC Model#3
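The evaluation metrics reported for Model#1 can be recomputed from its confusion matrix in base R. One observation from the arithmetic, stated here as an inference rather than something the deck says: the reported sensitivity and specificity are reproduced only when class "0" is treated as the positive class.

```r
# Model#1 confusion matrix from the slide (rows: predicted, columns: reference)
cm <- matrix(c(1622,   56,
               2497, 5099),
             nrow = 2, byrow = TRUE,
             dimnames = list(predicted = c("1", "0"), reference = c("1", "0")))

pos <- "0"   # inferred positive class (see note above)
neg <- "1"

tp <- cm[pos, pos]   # predicted 0, reference 0
fn <- cm[neg, pos]   # predicted 1, reference 0
fp <- cm[pos, neg]   # predicted 0, reference 1
tn <- cm[neg, neg]   # predicted 1, reference 1

metrics <- c(
  accuracy    = (tp + tn) / sum(cm),
  sensitivity = tp / (tp + fn),
  specificity = tn / (tn + fp),
  ppv         = tp / (tp + fp),
  npv         = tn / (tn + fn),
  prevalence  = (tp + fn) / sum(cm)
)
metrics["balanced_accuracy"] <- mean(metrics[c("sensitivity", "specificity")])

print(round(metrics, 4))
```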
  113. 113. Churner Non-churner
  114. 114. Churner / Predicted Churner Non-churner / Predicted Non-churner Non-churner / Predicted Churner Churner / Predicted Non-churner
  115. 115. Churner / Predicted Churner Non-churner / Predicted Non-churner Non-churner / Predicted Churner Churner / Predicted Non-churner
  116. 116. Churner / Predicted Churner Non-churner / Predicted Non-churner Non-churner / Predicted Churner Churner / Predicted Non-churner
