Knowledge Discovery in Databases Group




Link Prediction in Social Networks
                           Friday, 07 May 2009


                Svitlana Volkova
      Fulbright Master Student in Computer Science
     Computing and Information Sciences Department
                 Kansas State University

             234 Nichols Hall room 218, Manhattan, KS 66506-2302
           E-mail: svitlana[AT]k-state.edu or svitlana.volkova[AT]gmail.com
             Phones: mob. +1(785) 320 0113 | work +1(785) 532 7853
Agenda

• Introduction
• Related studies
• Methodology
  - Mathematical representation for link prediction task
  - Similarity measures
• Experiment
  - Crawling Facebook
  - Facebook Database
  - Visualization Tools
• Conclusions
Why am I interested in social networks?


Visible Reasons
   Young
   Curious
   Like challenging tasks




Invisible Reasons
   Links from the social network reflect social behaviors of individuals




   Quantitative and Qualitative assessment of human relationships
Phenomenon of social networks

• Stanley Milgram is often credited with laying the groundwork for modern social network theory.

A [social network] is a map of individuals and of the ways in which they are related to each other.
www.trustmesecurity.com/.../case/socialnetwork
Why is it difficult to predict links in social networks?

• Collective structure
• Highly dynamic
• Sparse
Supervised vs. unsupervised methods in the link prediction task

• Unsupervised methods use various similarity measures
• Supervised methods extract structural features to learn a mapping function

• Single Relational Table
  - Data representation is “propositional” = “feature vector” or “attribute value”
• Relational Data Mining
  - Inductive Logic Programming (ILP)

Learning a binary classifier that will predict whether a link exists between a given pair of nodes:

J48, OneR, IB1, Logistic, NaiveBayes
OR
AdaBoost, Bagging, RandForest
OR
Support Vector Machines (SVM), Genetic Programming (GP)
OR
Bayesian networks (BN) and Probabilistic Relational Models (PRMs)
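
As a rough illustration of the "single relational table" idea, the sketch below turns every node pair into a fixed-length feature vector with a 0/1 label and then fits a one-feature threshold rule. The rule is only in the spirit of OneR; it is not the Weka inducer named on the slide. The toy graph, the two features, and all function names are assumptions made for the example. Features are computed on the same graph that supplies the labels, which a real experiment would avoid by using an earlier snapshot.

```python
from itertools import combinations

# Toy undirected friendship graph (hypothetical data, not from the Facebook crawl).
GRAPH = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"ann", "cat", "eve"},
    "eve": {"dan"},
}


def pair_features(graph, u, v):
    """Propositional representation of a node pair: one fixed-length vector."""
    common = len(graph[u] & graph[v])      # number of shared neighbors
    pref = len(graph[u]) * len(graph[v])   # product of the two degrees
    return [common, pref]


def build_dataset(graph):
    """One row per unordered node pair; the label is 1 if the link exists."""
    rows, labels = [], []
    for u, v in combinations(sorted(graph), 2):
        rows.append(pair_features(graph, u, v))
        labels.append(1 if v in graph[u] else 0)
    return rows, labels


def train_stump(rows, labels, feature):
    """Choose the threshold on one feature that maximizes training accuracy."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({row[feature] for row in rows}):
        correct = sum(
            (row[feature] >= threshold) == bool(label)
            for row, label in zip(rows, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(rows)


rows, labels = build_dataset(GRAPH)
threshold, accuracy = train_stump(rows, labels, feature=1)
print(f"predict a link when degree product >= {threshold} "
      f"(training accuracy {accuracy:.2f})")
```

Any of the classifiers listed above could take the place of the threshold rule; the table of pair vectors and labels is the part they share.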
Classification based on features of entities

• Dr. William Hsu considered the problems of predicting, classifying, and annotating friend relations in social networks by applying a feature construction approach

• Tim Weninger proposed a genetic programming-based symbolic regression approach to constructing relational features for the link analysis task in social networks

Entity Attributes                  Graph-Based Features
(user/pair dependent)              (relational)

Number of neighbors                Length of shortest path
Interests                          Neighborhood overlap
Topic model                        Relative importance
Geographical location

Interest popularity                Node's indegree/outdegree
Friends/friend's age               Forward/backward deleted distance
Related investigations in the link prediction area

• Exploring relational structure, clustering [Jensen 2003, Getoor 2001]
• Using links to predict classes/attributes of entities [Getoor, Taskar, Koller, Provost, Jensen]
• Predicting link types based on known entity classes [Taskar, Koller 2003]
• Predicting links based on location in high-dimensional space [Hoff et al., 2003]
• Ranking potential links using a single graph-based feature [Kleinberg 2004]
Mining tasks in network-structured data

Node-related Tasks
• Node-ranking
• Node-classification
• Node-clustering

Structure-related Tasks
• Link prediction
• Structured pattern mining

Link prediction settings
• The identity of all objects is known + some link structure is known => predict unobserved links
• New objects arrive with information about some of their links + info about some attributes => predict links among new objects

Link Prediction Tasks
• Link Existence
• Link Type
• Link Weight
• Link Cardinality
Mathematical representation of the unobserved link prediction task in social networks

[Figure: snapshots of the network along a Time axis]
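
The formula on this slide did not survive extraction. One common way to state the task, following the Liben-Nowell and Kleinberg paper linked later in this deck, is sketched below. The notation (G, t_0, t_0', t_1, t_1', score) comes from that formulation and is an assumption about what the slide showed.

```latex
% Social network as a graph whose edges carry timestamps
G = (V, E), \qquad e = (u, v) \in E \ \text{observed at time } t(e)

% Four times t_0 < t_0' < t_1 < t_1' define a training and a test interval
G[t_0, t_0'] = \text{subgraph of } G \text{ with edges } e \text{ such that } t_0 \le t(e) \le t_0'

% Task: given G[t_0, t_0'], rank the node pairs that are not yet linked
% by how likely they are to be linked in G[t_1, t_1']
\mathrm{score} : (V \times V) \setminus E[t_0, t_0'] \to \mathbb{R}
```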
Classification of measures for link prediction approaches

Link Prediction Approaches
• Node-wise similarity based approaches
  - Similarity measures in binary classifiers
  - Pairwise kernel matrices
  - Statistical relational learning
• Topological pattern based approaches
  - Node-based patterns
  - Path-based patterns
  - Graph-based patterns
• Probabilistic model based approaches
  - Probabilistic relational models
  - Bayesian relational models
  - Stochastic relational models
Node-wise Similarity based Approaches
Node-wise Similarity based Approaches (cont.)
Topological pattern based Approaches
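
The slide body is missing here, so the following is a minimal sketch of the neighborhood-based scores named on the comparison slide (common neighbors, Jaccard, Adamic/Adar, preferential attachment), written from their standard definitions. The toy graph and the function names are assumptions made for the example.

```python
import math

# Toy undirected graph stored as adjacency sets (hypothetical data).
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}


def common_neighbors(g, x, y):
    """Number of nodes adjacent to both x and y."""
    return len(g[x] & g[y])


def jaccard(g, x, y):
    """Shared neighbors divided by all neighbors of either node."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def adamic_adar(g, x, y):
    """Shared neighbors weighted so that low-degree ones count more.

    A shared neighbor of two distinct nodes has degree >= 2, so log() > 0.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[x] & g[y])


def preferential_attachment(g, x, y):
    """Product of the two degrees."""
    return len(g[x]) * len(g[y])


def rank_candidates(g, score):
    """Rank all currently unlinked pairs by a score, highest first."""
    nodes = sorted(g)
    pairs = [
        (x, y)
        for i, x in enumerate(nodes)
        for y in nodes[i + 1:]
        if y not in g[x]
    ]
    return sorted(pairs, key=lambda p: score(g, *p), reverse=True)


for measure in (common_neighbors, jaccard, adamic_adar, preferential_attachment):
    print(measure.__name__, rank_candidates(GRAPH, measure)[0])
```

Each measure is unsupervised: it needs only the observed graph, and the top of the ranked list is read as the most likely new link.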
Topological pattern based Approaches (cont.)
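
For the path-based side, here is a hedged sketch of the Katz measure listed on the comparison slide: a sum over walk lengths l of beta^l times the number of walks of length l between the two nodes. The sum is truncated at `max_length`, and the default values of `beta` and `max_length`, like the function name, are assumptions for illustration. Walks may revisit nodes, which is what the usual matrix-power form of Katz counts.

```python
def katz_score(g, x, y, beta=0.05, max_length=4):
    """Truncated Katz score between x and y on an adjacency-set graph."""
    walks = {x: 1}   # walks[n] = number of walks of the current length from x to n
    score = 0.0
    for length in range(1, max_length + 1):
        extended = {}
        for node, count in walks.items():
            for neighbor in g[node]:
                extended[neighbor] = extended.get(neighbor, 0) + count
        walks = extended
        # Longer walks are damped exponentially by beta.
        score += (beta ** length) * walks.get(y, 0)
    return score


# Three-node path a - b - c (hypothetical data).
PATH = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

print(katz_score(PATH, "a", "c", beta=0.5, max_length=2))
```

A small `beta` makes the score behave much like common neighbors, because short walks dominate the sum.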
Comparison of similarity measures

                       Common      Jaccard's    Adamic/Adar   Preferential   Katz
                       Neighbors   Similarity   Measure       Measure        Measure
Common Neighbors       1           0.92         0.94          0.31           0.61
Jaccard's Similarity   0.92        1            0.97          0.53           0.75
Adamic/Adar Measure    0.94        0.97         1             0.49           0.70
Preferential Measure   0.31        0.53         0.49          1              0.84
Katz Measure           0.61        0.75         0.70          0.84           1


http://www.cs.cornell.edu/home/kleinber/link-pred.pdf

[Figure: Correlation among different similarity measures. x-axis: Level of similarity, x (0 to 8); y-axis: 0 to 1.2. Series: Common Neighbors, Jaccard's Similarity, Adamic/Adar Measure, Preferential Measure, Katz Measure. Linear trend lines shown on the chart (their assignment to series is not recoverable): y = 0.0736x + 0.5443, R² = 0.9102; y = 0.1173x + 0.2586, R² = 0.6507; y = -0.0604x + 1.0273, R² = 0.3531; y = -0.073x + 1.0535, R² = 0.4092; y = -0.1038x + 1.0881, R² = 0.4681.]
Crawling the Facebook Social Network

A crawler is an automatic program that explores the WWW, following links and searching for information or building a database.
Crawlers are used to build automated indexes for the Web, allowing users to run keyword searches over Web documents.
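
The link-following behavior described above can be sketched as a breadth-first traversal. The example runs against an in-memory stand-in, not Facebook: `FAKE_SITE`, `fetch_links`, and `crawl` are hypothetical names, and a real crawler would download and parse pages inside `fetch_links`.

```python
from collections import deque

# Stand-in for the network: profile id -> ids it links to (hypothetical data).
FAKE_SITE = {
    "u1": ["u2", "u3"],
    "u2": ["u1", "u4"],
    "u3": ["u1"],
    "u4": ["u2", "u5"],
    "u5": ["u4"],
}


def fetch_links(profile_id):
    """Hypothetical fetch step; returns the ids linked from one profile."""
    return FAKE_SITE.get(profile_id, [])


def crawl(seed, max_profiles=100):
    """Breadth-first crawl from `seed`; returns the discovered edge list."""
    seen = {seed}            # profiles already queued, to avoid revisiting
    queue = deque([seed])
    edges = []
    visited = 0
    while queue and visited < max_profiles:
        current = queue.popleft()
        visited += 1
        for target in fetch_links(current):
            edges.append((current, target))   # this is the "database" being built
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return edges


print(crawl("u1"))
```

The returned edge list is the raw material for the social graph; `max_profiles` bounds the crawl so that it stops on a network too large to traverse completely.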




                                        www2007.org/posters/poster1057.pdf
http://www.flickr.com/photos/ikhnaton2/533233247/sizes/o/
Why are we more interested in Facebook?

• 200 million users
• Doubling in size once every six months
• Growing by 100,000 users per day

Network measures shown around the slide: Betweenness, Bridge, Centrality, Centralization, Closeness, Path Length, Clustering coefficient, Cohesion, Degree, (Individual-level) Density, Flow betweenness centrality, Eigenvector centrality, Local Bridge, Prestige, Radiality, Reach, Structural cohesion, Structural equivalence
Conclusions
Prediction task for previously unobserved links in social networks

• Concept of a social network + social graph representation + mining tasks in network-structured data
• Related studies + existing approaches
  - supervised vs. unsupervised methods
  - single-table data representation as a feature vector vs. relational data mining
  - link prediction task as classification with a range of inducers: J48, OneR, IB1, Logistic, NaiveBayes, etc.
  - other available approaches for the task, e.g. SVM, GP, BN, PRMs
• Mathematical representation + classification and description of similarity measures
• The experiment was planned around a crawling technique, applying the free open-source SQL full-text search engine Sphinx to a Facebook corpus
• Visualization tools for social network graph representation

http://www.youtube.com/watch?v=neAAzVquaRU
Thank you for your attention!

Social Networks

  • 1. Knowledge Discovery in Databases Group Link Prediction in Social Networks Friday, 07 May 2009 Svitlana Volkova Fulbright Master Student in Computer Science Computing and Information Sciences Department Kansas State University 234 Nichols Hall room 218, Manhattan, KS 66506-2302 E-mail: svitlana[AT]k-state.edu or svitlana.volkova[AT]gmail.com Phones: mob. +1(785) 320 0113 | work +1(785) 532 7853
  • 2. Agenda  Introduction  Related studies  Methodology  Mathematical representation for link prediction task  Similarity measures  Experiment  Crawling Facebook  Facebook Database  Visualization Tools  Conclusions
  • 3. Why am I interested in social networks? Visible Reasons  Young  Curious  Like challenging tasks Invisible Reasons  Links from the social network reflect social behaviors of individuals  Quantitative and Qualitative assessment of human relationships
  • 4. Phenomenon of social networks  The person who built modern social network theory was Stanley Milgram. [A social network] is a map of individuals and the ways in which they are related to each other.
  • 6. Why is it difficult to predict links in social networks?  Collective structure  Highly dynamic  Sparse
  • 7. Supervised vs. unsupervised methods in link prediction task. Unsupervised methods use various similarity measures. Supervised methods extract structural features to learn a mapping function.  Single Relational Table: data representation is “propositional” = “feature vector” or “attribute value”  Relational Data Mining  Inductive Logic Programming (ILP). Learning a binary classifier that will predict whether a link exists between a given pair of nodes: J48, OneR, IB1, Logistic, NaiveBayes OR AdaBoost, Bagging, RandForest OR Support Vector Machines (SVM), Genetic Programming (GP) OR Bayesian networks (BN) and Probabilistic Relational Models (PRMs)
  • 8. Classification based on features of entities. Dr. William Hsu considered the problems of predicting, classifying, and annotating friend relations in social networks by applying a feature construction approach. Tim Weninger proposed a genetic programming-based symbolic regression approach to constructing relational features for the link analysis task in social networks. Entity Attributes (user/pair dependent) and Graph-Based Features (relational):  Number of neighbors  Length of shortest path  Interests  Neighborhood overlap  Topic model  Relative importance  Geographical location  Interest popularity  Node’s Indegree/Outdegree  Friends/Friend’s age  Forward/Backward deleted distance
  • 9. Related Investigations in link prediction area  exploring relational structure, clustering [Jensen 2003, Getoor 2001]  using links to predict classes/attributes of entities [Getoor,Taskar, Koller, Provost, Jensen]  predicting link types based on known entity classes [Taskar, Koller 2003]  predicting links based on location in high-dimensional space [Hoff et al., 2003]  ranking potential links using a single graph-based feature [Kleinberg 2004]
  • 10. Mining tasks in network-structured data. Node-related Tasks: • Node-ranking • Node-classification • Node-clustering. Structure-related Tasks: • Link prediction • Structured pattern mining. The identity of all objects is known + some link structure is known => predict unobserved links. New objects arrive with information about some of their links + info about some attributes => predict links among new objects. Link Prediction Tasks: Link Existence, Link Type, Link Weight, Link Cardinality
  • 11. Mathematical representation of the unobserved link prediction task in social networks [figure; axis label: Time]
  • 12. Classification of measures for link prediction approaches. Link Prediction Approaches: Node-wise Similarity based Approaches (similarity measure in binary classifiers; pairwise kernel matrices; statistical relational learning). Topological Pattern based Approaches (node based patterns; path based patterns; graph based patterns). Probabilistic Model based Approaches (probabilistic relational models; Bayesian relational models; stochastic relational models)
  • 14. Node-wise Similarity based Approaches (cont.)
  • 16. Topological pattern based Approaches (cont.)
  • 17. Comparison of similarity measures. Correlation among different similarity measures:

                              Common     Jaccard’s   Adamic/Adar  Preferential  Katz
                              Neighbors  Similarity  Measure      Measure       Measure
      Common Neighbors        1          0.92        0.94         0.31          0.61
      Jaccard’s Similarity    0.92       1           0.97         0.53          0.75
      Adamic/Adar Measure     0.94       0.97        1            0.49          0.70
      Preferential Measure    0.31       0.53        0.49         1             0.84
      Katz Measure            0.61       0.75        0.70         0.84          1

      http://www.cs.cornell.edu/home/kleinber/link-pred.pdf
      [Figure: correlation among different similarity measures, one linear trend line per measure; x-axis: level of similarity, x]
  • 18. Crawling Facebook Social Network. A crawler is an automatic program that explores the WWW, following links and searching for information or building a database. It is used to build automated indexes for the Web, allowing users to do keyword searches for Web documents. www2007.org/posters/poster1057.pdf
  • 20. Why are we more interested in Facebook? 200 million users; doubling in size once every six months, by 100,000 users per day. Metrics: Betweenness, Bridge, Centrality, Centralization, Closeness, Clustering coefficient, Cohesion, Degree, (Individual-level) Density, Flow betweenness centrality, Eigenvector centrality, Local Bridge, Path Length, Prestige, Radiality, Reach, Structural cohesion, Structural equivalence
  • 24. Conclusions. Prediction task for previously unobserved links in social networks  Concept of social network + social graph representation + mining tasks in network-structured data  Related studies + existing approaches  supervised vs. unsupervised methods  single table data representation as feature vector vs. relational data mining  link prediction task as classification with a range of inducers: J48, OneR, IB1, Logistic, NaiveBayes etc.  other available approaches for resolving the given task, e.g. SVM, GP, BN, PRMs etc.  Mathematical representation + classification and description of similarity measures  The experiment was planned based on a crawling technique, applying the free open-source SQL full-text search engine Sphinx to a Facebook corpus  The visualization tools for social network graph representation http://www.youtube.com/watch?v=neAAzVquaRU
  • 25. Thank you for your attention!
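
The five similarity measures compared on slide 17 (common neighbors, Jaccard, Adamic/Adar, preferential attachment, Katz) can be illustrated with a minimal Python sketch. This is not the implementation used in the talk: the toy graph `GRAPH`, the function names, and the Katz parameters `beta` and `max_len` are assumptions chosen for illustration, and the Katz score is truncated to short walks rather than summed to infinity.

```python
import math
from itertools import combinations

# Hypothetical toy friendship graph (undirected), stored as adjacency sets.
GRAPH = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan", "eve"},
    "dan": {"ann", "cat"},
    "eve": {"cat"},
}


def common_neighbors(g, x, y):
    """Number of neighbors shared by x and y."""
    return len(g[x] & g[y])


def jaccard(g, x, y):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def adamic_adar(g, x, y):
    """Shared neighbors weighted by 1 / log(degree).

    A shared neighbor of two distinct nodes has degree >= 2, so log > 0.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[x] & g[y])


def preferential_attachment(g, x, y):
    """Product of the two node degrees."""
    return len(g[x]) * len(g[y])


def katz(g, x, y, beta=0.05, max_len=4):
    """Truncated Katz score: sum over l of beta**l * (walks of length l)."""
    counts = {x: 1}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for node, c in counts.items():
            for nb in g[node]:
                nxt[nb] = nxt.get(nb, 0) + c
        counts = nxt
        score += beta ** length * counts.get(y, 0)
    return score


def rank_candidates(g, score):
    """Rank all currently unlinked pairs by a similarity score, best first."""
    pairs = [(x, y) for x, y in combinations(sorted(g), 2) if y not in g[x]]
    return sorted(pairs, key=lambda p: score(g, *p), reverse=True)


if __name__ == "__main__":
    for measure in (common_neighbors, jaccard, adamic_adar,
                    preferential_attachment, katz):
        print(measure.__name__, rank_candidates(GRAPH, measure))
```

Each function scores one unlinked pair, so the unsupervised setting from slide 7 reduces to ranking candidate pairs by a chosen score; the same scores could also serve as features for a supervised binary classifier.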