Social Network Analysis (SNA)
Ghulam Imaduddin
Definition
From the point of view of data mining, a social network is a heterogeneous and
multirelational data set represented by a graph. The graph is typically very
large, with nodes (or vertices) corresponding to objects and edges
corresponding to links representing relationships or interactions between
objects. Both nodes and links have attributes
(Han & Kamber, 2006).
[Figure: two subscriber nodes joined by an edge. Example interactions: call, SMS, IM, balance transfer, …; mention, follow, like, …]
Benefit of SNA
Identify the role of each subscriber in a
community:
• Community leader
• Bridge
• Passive
• Follower
Identify high-value or prospect
communities by looking at:
• Community size
• Closeness
• Members' profiles (device,
usage, ARPU, location)
• On-net/off-net share in the
community
Suspected same
subscriber
Comparing two social networks can
reveal when two identities belong
to the same subscriber.
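The slide does not say how the two networks are compared. One possible sketch, assuming contacts can be matched across the two networks (for example by phone number), is to score the overlap of contact sets with Jaccard similarity. The subscriber IDs, contact sets, and the 0.7 threshold below are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a ∩ b| / |a ∪ b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical contact sets per subscriber in two separate networks.
network_a = {"sub_1": {"x", "y", "z"}, "sub_2": {"p", "q"}}
network_b = {"sub_9": {"x", "y", "z", "w"}, "sub_8": {"r"}}

def suspected_matches(net_a, net_b, threshold=0.7):
    """Return (id_a, id_b, score) for pairs whose contact overlap meets the threshold."""
    matches = []
    for id_a, contacts_a in net_a.items():
        for id_b, contacts_b in net_b.items():
            score = jaccard(contacts_a, contacts_b)
            if score >= threshold:  # assumed cut-off; tune on labelled data
                matches.append((id_a, id_b, score))
    return matches

matches = suspected_matches(network_a, network_b)
print(matches)
```

The all-pairs loop is only practical for small inputs; a real data set would need some form of candidate filtering first.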
Further
Utilization
• New product campaigns targeting community leaders, bridges, and high-value communities
• Retention program prioritization for community leaders, bridges, and high-value communities
• Product adoption campaigns for followers in communities that have already adopted the product
• Identifying rotational churners to exclude from retention campaigns, or to evaluate dealers
• SN variables can enhance other predictive models. For example, social network
variables can increase the lift of a churn model for high-value customers (Imaduddin, 2014)
Social Network Graph Mining
By mining the graph of a social network, we can extract valuable information such
as:
• Degree (in-degree, out-degree, max-degree). Degree is the number of edges attached
to a vertex. A vertex with a high in-degree receives information from many
others; a vertex with a high out-degree sends information to many others.
• PageRank. PageRank measures the importance of each vertex in a graph. If a Twitter user
is followed by many others, the user will be ranked highly. For a CDR-based (call detail record) social network,
reverse the graph direction before running PageRank to identify the important vertices.
• Local clustering coefficient (LCC). LCC measures how tightly knit a customer's network is.
The higher the LCC, the more of the customer's neighbours are connected to each other. LCC is derived from the triangle count of
each vertex.
$$\mathrm{LCC} = \frac{\#\text{triangle}}{\binom{n}{2}}, \qquad n = \#\text{neighbour}$$
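The LCC formula can be checked by hand on a small graph. The following is a plain-Python sketch, not the GraphX code from the slides; the edge list is hypothetical, edge direction is ignored, and returning 0.0 for vertices with fewer than two neighbours is an assumed convention (the formula is undefined there).

```python
from itertools import combinations
from math import comb

def local_clustering(edges):
    """LCC per vertex: triangles through the vertex divided by C(n, 2), n = #neighbours."""
    neighbours = {}
    for a, b in edges:
        if a != b:  # skip self-loops
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    lcc = {}
    for v, nbrs in neighbours.items():
        n = len(nbrs)
        # A triangle through v is a pair of v's neighbours that are themselves linked.
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
        lcc[v] = triangles / comb(n, 2) if n >= 2 else 0.0  # assumed convention for n < 2
    return lcc

# Hypothetical graph: triangle 1-2-3 plus a pendant vertex 4 attached to 1.
edges = [(1, 2), (1, 3), (2, 3), (1, 4)]
lcc = local_clustering(edges)
print(lcc)
```

Vertex 1 has three neighbours, so C(3, 2) = 3 possible pairs, of which only the pair (2, 3) is linked, giving an LCC of 1/3. Vertices 2 and 3 each have two neighbours that are linked, giving 1.0.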
How To Build
Let’s get our hands dirty!
Graph Example
[Figure: graph representation alongside its data representation]
Script Example – Degree Information
Degree Information Result
[Figure: graph representation alongside the result, listed as (id, total-degree, in-degree, out-degree)]
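The script on the slide did not survive extraction. As a stand-in, here is a minimal plain-Python sketch (not GraphX) that produces rows in the same (id, total-degree, in-degree, out-degree) shape. The directed edge list is hypothetical and is not the graph shown in the slides.

```python
from collections import Counter

# Hypothetical directed edges (source, destination).
edges = [(1, 2), (1, 3), (2, 3), (3, 1), (4, 3)]

def degree_table(edges):
    """Return rows of (id, total-degree, in-degree, out-degree), sorted by id."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    vertices = sorted(set(out_deg) | set(in_deg))
    # Counter returns 0 for vertices that never appear on one side of an edge.
    return [(v, in_deg[v] + out_deg[v], in_deg[v], out_deg[v]) for v in vertices]

for row in degree_table(edges):
    print(row)
```

Here vertex 3 has the highest in-degree (3), while vertex 4 has no incoming edges at all.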
Script Example – PageRank
PageRank Result
[Figure: graph representation alongside the result, listed as (id, PageRank) and (id, reverse PageRank)]
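To illustrate the difference between PageRank and reverse PageRank, the sketch below runs a simple power iteration on a graph and on its reversed copy. It is plain Python, not the GraphX script from the slides; the damping factor 0.85, the fixed 50 iterations, the even redistribution of rank from vertices without out-links, and the call records are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed edge list; ranks sum to 1."""
    vertices = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in vertices}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-links is spread evenly over all vertices.
        dangling = sum(rank[v] for v in vertices if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for src in vertices:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

def reverse(edges):
    """Flip the direction of every edge."""
    return [(dst, src) for src, dst in edges]

# Hypothetical call records (caller, callee): subscriber 1 calls everyone.
calls = [(1, 2), (1, 3), (1, 4), (2, 3)]
forward = pagerank(calls)
reversed_rank = pagerank(reverse(calls))
print(max(forward, key=forward.get), max(reversed_rank, key=reversed_rank.get))
```

On the original direction, rank flows toward the most-called subscriber (3). On the reversed graph it flows toward the subscriber who originates the most calls (1).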
Script Example – Triangle
Triangle Counting Result
[Figure: graph representation alongside the result, listed as (id, #triangle)]
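A per-vertex triangle count in the same (id, #triangle) shape can be sketched in plain Python as follows. This is not the GraphX script from the slides: the edge list is hypothetical and edge direction is ignored.

```python
from itertools import combinations

def triangle_counts(edges):
    """Count, per vertex, the triangles it belongs to (edge direction ignored)."""
    neighbours = {}
    for a, b in edges:
        if a != b:  # skip self-loops
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    counts = {}
    for v, nbrs in neighbours.items():
        # Each linked pair of v's neighbours closes one triangle through v.
        counts[v] = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
    return counts

# Hypothetical graph: two triangles, 1-2-3 and 3-4-5, sharing vertex 3.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
counts = triangle_counts(edges)
for vertex in sorted(counts):
    print((vertex, counts[vertex]))
```

Vertex 3 sits in both triangles and gets a count of 2; every other vertex gets 1.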
Solving Real World Problem
• Define the vertices. Are they subscribers, web pages, Twitter accounts?
• Define the edges, i.e. how the vertices are connected. E.g. total call minutes in a month > 5
minutes, SMS > 10, etc.
• Identify the mega hubs. A mega hub is a vertex connected to a massive number of vertices
(such as a call center or a spammer). Mega hubs can be removed or processed separately,
depending on the problem.
• Identify the measures needed (PageRank, degree, LCC, triangle count, etc.)
• Build the data source (keep the vertex property data and the connection data separate and join them
later), and store it distributed on Hadoop.
• Build the code, run it, and feed the result back to the data warehouse or Hadoop for further
utilization
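The edge-definition and mega-hub steps above can be sketched in plain Python. The usage records and the mega-hub cut-off are hypothetical, and combining the call and SMS thresholds with "or" is an assumption, since the slide does not say whether both conditions must hold.

```python
from collections import Counter

# Hypothetical monthly usage records: (caller, callee, call_minutes, sms_count).
usage = [
    ("A", "B", 12.0, 0),
    ("A", "C", 1.5, 14),
    ("B", "C", 0.5, 2),
    ("HUB", "A", 6.0, 0),
    ("HUB", "B", 7.0, 0),
    ("HUB", "C", 9.0, 0),
    ("HUB", "D", 8.0, 0),
]

MIN_CALL_MINUTES = 5  # threshold from the example in the list above
MIN_SMS = 10          # threshold from the example in the list above
MEGA_HUB_DEGREE = 4   # assumed cut-off; tune per data set

def build_edges(records):
    """Keep a connection if call minutes or SMS count exceed the thresholds (assumed 'or')."""
    return [(src, dst) for src, dst, minutes, sms in records
            if minutes > MIN_CALL_MINUTES or sms > MIN_SMS]

def split_mega_hubs(edges, max_degree):
    """Return (edges without mega hubs, set of mega hubs) using total degree."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    hubs = {v for v, d in degree.items() if d >= max_degree}
    kept = [(s, d) for s, d in edges if s not in hubs and d not in hubs]
    return kept, hubs

edges = build_edges(usage)
kept_edges, mega_hubs = split_mega_hubs(edges, MEGA_HUB_DEGREE)
print(kept_edges, mega_hubs)
```

The weak B-C link is dropped by the thresholds, and the vertex "HUB" is set aside so it can be removed or processed separately.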
References & Resources
• Han, J., & Kamber, M. (2006). Data Mining: Concepts and Techniques. San Francisco: Morgan Kaufmann.
• Imaduddin, G. (2014). Evaluation and Improvement of Churn Model Using Customer Value and Social Network. Jakarta: Universitas Indonesia.
• Apache Spark Overview. https://spark.apache.org/docs/latest/
• Databricks Training Resources. https://databricks.com/spark-training-resources
• GraphX Programming Guide. https://spark.apache.org/docs/latest/graphx-programming-guide.html
• Social Network Analysis. http://en.wikipedia.org/wiki/Social_network_analysis
• Spark Scala API Doc. https://spark.apache.org/docs/1.1.0/api/scala/index.html#org.apache.spark.package
• The Scala Programming Language. http://www.scala-lang.org/
Appendix
List of Graph Operations in GraphX

Social Network Analysis with Spark
