KONECT Cloud
Large Scale Network Mining in the Cloud


                  Jérôme Kunegis
  Future SOC Lab Day, 18.04.2012



Networks are Everywhere

[Figure: example networks, labeled Authorship, Friendship, Trust, Communication, Interaction, Co-occurrence]
Social Networks



                  friend
Trust Networks




                  trust
Friend/Enemy Network




                  friend
                  enemy
Interaction Network




                      listen
KONECT – Koblenz Network Collection

 148 network datasets
        26 are undirected
        38 are directed
        84 are bipartite
        59 have unweighted edges
        77 allow multiple edges
         4 have signed edges
         8 have ratings as edges
        78 have edge arrival times



   konect.uni-koblenz.de
Largest Network


   Directed “who follows who” network


            41 652 230 users
         1 468 365 182 edges


  konect.uni-koblenz.de/networks/twitter
148 Network Datasets

     authorship
communication
 co-occurrence
        features
    folksonomy
     interaction
        physical
          ratings
      reference
       semantic
           social
             trust
What We Computed

 Connected components
 Network diameter   ←    at Future SOC Lab
 Clustering coefficients
 Degree distributions
 Spectral distribution
 Eigenvector centrality
 Graph drawing
 Temporal Analysis
 Link prediction
Network Diameter




 6
90 Percentile Effective Diameter




5
90 Percentile Effective Diameter

                  3
90 Percentile Effective Diameter




3.75
Computing the Effective Diameter

for each node i {                   // |V| iterations
   count hops needed to reach 90%   // O(|E|) per node
}


Total runtime:                      O(|E| × |V|)
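
The loop above can be sketched in Python as one breadth-first search per start node. This is a minimal illustration, not the code used in the talk: the adjacency-dict representation, the function names, and the choice to take the 90% quantile over all reachable ordered node pairs are assumptions. It returns a whole number of hops, so it does not reproduce fractional values such as 3.75, which require some form of interpolation.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it. Cost: O(|E|)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def effective_diameter(adj, fraction=0.9, sources=None):
    """Smallest hop count h such that at least `fraction` of all reachable
    (source, target) pairs are within h hops of each other.

    Assumption: `adj` maps each node to an iterable of its neighbours.
    `sources` restricts the start nodes, e.g. to a random sample.
    """
    if sources is None:
        sources = list(adj)
    histogram = Counter()          # hop count -> number of pairs at that distance
    for s in sources:              # |V| iterations when all nodes are used
        for node, d in bfs_distances(adj, s).items():
            if node != s:
                histogram[d] += 1
    total = sum(histogram.values())
    if total == 0:
        return 0
    covered = 0
    for d in sorted(histogram):
        covered += histogram[d]
        if covered >= fraction * total:
            return d


# Undirected path 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(effective_diameter(path))                # 3
print(effective_diameter(path, fraction=1.0))  # 4, the ordinary diameter
```

Passing a small list as `sources` turns the exact O(|E| × |V|) computation into an estimate based on a vertex sample.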
Graph Sampling



                   Keep
                 X% of edges
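
A minimal sketch of this sampling step, assuming the graph is given as a list of edge tuples and that "keep X%" means drawing a fixed-size uniform random subset without replacement. The function name `sample_edges` and the `seed` parameter are illustrative and do not come from the slides.

```python
import random


def sample_edges(edges, percent, seed=None):
    """Return a uniform random subset containing `percent` % of the edges.

    Assumption: exactly round(|E| * percent / 100) edges are kept; an
    alternative would be to keep each edge independently with that probability.
    """
    edges = list(edges)
    rng = random.Random(seed)              # local generator, reproducible with a seed
    keep = round(len(edges) * percent / 100)
    return rng.sample(edges, keep)         # sampling without replacement


edges = [(i, i + 1) for i in range(200)]
subset = sample_edges(edges, 25, seed=1)
print(len(subset))  # 50
```

Using a fixed-size subset keeps the number of edges identical across repeated samplings at the same percentage, so runs at one sample size are directly comparable.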
Computation

  × 1 000 vertices (sampled)
  × 120 840 391 edges
  × 20 sample sizes (5%, 10%, …, 100%)
  × 50 random samplings

  Evaluation on single machine:
   1 TiB memory
   64 cores
   Matlab 64 bit
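
To get a feel for the size of this experiment, the factors above can be multiplied out. The sketch below assumes that the listed quantities combine as a product, that each sampled start vertex costs one breadth-first search, and that a search on an X% sample touches at most X% of the edges. These are interpretations of the slide, not stated facts.

```python
# Figures taken from the slide
start_vertices = 1_000
edges = 120_840_391
sample_percents = list(range(5, 101, 5))   # 5%, 10%, ..., 100%
samplings_per_size = 50

# One BFS per (start vertex, sample size, random sampling)
bfs_runs = start_vertices * len(sample_percents) * samplings_per_size

# Upper bound on edge visits: each BFS visits at most the edges kept in its sample
edge_visit_bound = (
    start_vertices * samplings_per_size * sum(sample_percents) * edges // 100
)

print(len(sample_percents))   # 20
print(bfs_runs)               # 1000000
print(edge_visit_bound)       # 63441205275000
```

Under these assumptions the experiment amounts to one million searches and on the order of 6 × 10^13 edge visits, which gives some context for the large-memory, many-core machine listed above.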
Results
Thank You!


                        Dr. Jérôme Kunegis
konect.uni-koblenz.de   kunegis@uni-koblenz.de
                        west.uni-koblenz.de
