Visualization of evolutionary
cascades of messages using
       force-directed graphs
                            Artjom Kurapov
                   Supervisor: Helena Kruus

               Master’s thesis defense, 9 May 2011
Agenda

   Background
   Practical work
       Pling.ee, open-source Gephi
       Web tool demo and Twitter
Background

   Types of networks
   Properties / areas of application
   Research interest
Topics crossroads
Goals

   Visualize social networks (preferably in Estonia)
   Compare friends and messages topology
   Try to mine data visually using cascades

   [Diagram: nodes A, B, C and D]
Pling
Pling – Qualitative measure
                                 Friends   Messages
Average clustering coefficient   0.135     0.043
Average degree                   4.313     2.202
GCC diameter                     20        38
Average GCC diameter             5.38      13.009
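
The slide does not say how these values were computed. As a minimal sketch, assuming an undirected graph stored as a dictionary of neighbour sets and the standard textbook definitions, the measures in the table could be obtained as follows. The function names and the toy graph are hypothetical and are not taken from the thesis.

```python
from collections import deque
from itertools import combinations


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Share of a node's neighbour pairs that are connected to each other."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Node set of the largest connected component (GCC)."""
    seen, best = set(), set()
    for n in adj:
        if n not in seen:
            comp = set(bfs_distances(adj, n))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def gcc_diameter(adj):
    """Longest shortest path inside the giant component (one BFS per node)."""
    gcc = giant_component(adj)
    return max(max(bfs_distances(adj, n).values()) for n in gcc)


# Hypothetical toy graph: triangle A-B-C, pendant D on C, separate pair E-F.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"F"},
    "F": {"E"},
}

print(average_degree(toy), average_clustering(toy), gcc_diameter(toy))
```

Running one breadth-first search per node is quadratic in the worst case, so this is only practical for small graphs; a network with tens of thousands of users would need sampling or a dedicated tool.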
Topic and interface matters

   Out of 18.6 million messages - no (clearly visible) cascade

Possibly because
   89% private
   86% sent using phone
JavaScript tool

   Up to 1000 nodes
   Can add nodes on the fly
   Navigation and filtering
   Properties calculation
   Recursive algorithm
Twitter

   Friendship and message network mined
   218 users / 12643 messages, 6.89% retweets
   [Chart: retweets by depth; logarithmic axis from 1 to 100000, other axis from 0 to 8]
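
The retweet share and the depth of each retweet chain can be derived from parent links between messages. The sketch below assumes a hypothetical mapping from message id to the id of the message it retweets (None for an original post), and that the links contain no cycles. It does not reproduce the thesis data or the author's tool.

```python
from collections import Counter


def cascade_depths(parent_of):
    """Map each message id to its depth in a retweet chain (0 = original).

    Assumes the parent links are acyclic. A parent id that is missing from
    the mapping is treated as an original post with depth 0.
    """
    depths = {}
    for msg in parent_of:
        chain = []
        cur = msg
        # Walk up towards the original until a known depth or the root.
        while cur is not None and cur not in depths:
            chain.append(cur)
            cur = parent_of.get(cur)
        base = depths[cur] if cur is not None else -1
        # Assign depths on the way back down the chain.
        for offset, m in enumerate(reversed(chain), start=1):
            depths[m] = base + offset
    return depths


def retweet_share(parent_of):
    """Fraction of messages that are retweets of another message."""
    return sum(1 for p in parent_of.values() if p is not None) / len(parent_of)


# Hypothetical toy data: m2 and m5 retweet m1, m3 retweets m2.
messages = {"m1": None, "m2": "m1", "m3": "m2", "m4": None, "m5": "m1"}

depths = cascade_depths(messages)
histogram = Counter(depths.values())
print(histogram, retweet_share(messages))
```

The histogram counts messages per depth, which is the kind of data a depth chart on a logarithmic axis would show.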
Thank you
Questions?

Editor's Notes

  1. So, first a little introduction to the field, then some large-dataset research I’ve done, then a browser tool I built myself: a small demo, its features and the issues I faced. And finally the results from a small Twitter dataset.
  2. Networks are everywhere. Most of us here study technological and information networks. But there are also biochemical, ecological and, most interestingly, social networks, which influence our daily life. These include sexual contacts, friendship networks, citations, or any kind of social behavior associated with them. In fact, if you are strict about it, citation is not really social behavior, since it is directed and doesn’t imply talking to a real person, so it is more like a network of document dependencies. So it matters how you define connections and objects. Networks have different properties, some of which I list in the paper. Of course some of them are relevant only in one field: bipartite graphs, for example, are only needed if you want to visualize them, and cliques only if you want to do clique analysis. There are also different research interests, such as drawing, how networks evolve, how they break apart, where traffic flows, or how to solve all kinds of graph puzzles like graph search, coloring or the travelling salesman problem.
  3. So to visualize such a network and its processes, one needs to look at the surrounding fields: sociology with its laws of diffusion and preferential attachment, network properties, drawing algorithms and their complexity, and of course the work that has been done before, both theoretical and practical, such as existing software.
  4. As the thesis goal, I suggest mining data through frequency analysis of messages and making a network topology map. That means we want a graph representation of a network, we want both friendship and message datasets, and then we want to see how they correlate and lead to higher forms of messages: cascades. My hypothesis is that cascades are parts of social thought. Thus evolutionary cascades are linked cascades across multiple topics.
  5. So I have studied the Estonian social network pling.ee, which belongs to Elisa Eesti AS. It has 75 thousand users in the friendship network on the left and 12 thousand in the message network on the right. As you can see, they are different, and assortative mixing is present: the red nodes here are Russian users and the blue ones are Estonian users. This was inferred from the messages and the symbols they used.
  6. So the numbers differ as well. As you can see, since it was a small portion of messages, the network is rather young and has a bigger diameter. At the same time the average degree is smaller, which is natural, since people don’t talk to all of their friends. And the clustering coefficient is also smaller, which partially depends on that degree tendency.
  7. The bad news for me was that I was not able to find a single cascade. Possibly because only around 14% of messages were sent from the browser and there was no explicit resharing function in the interface. But compare that to Twitter, where people invented RT themselves. Most likely it’s the topic of discussion that didn’t stimulate sharing, since 89% of conversations were private and almost all users are teenagers discussing their love life.
  8. So to study cascades and visualize them, I tried building my own tool. It is written in JavaScript and can draw small datasets along with their analysis. It’s browser-based and supports navigation. I’ve also done two dataset extractions from Twitter.
  9. Of the 12 thousand messages, around 7% can be considered direct cascades. But there may be more, since I didn’t take into account normal posts in directed form, which can also lead to smaller forms of cascades. On the graph you can see how the depth of a retweet depends on its number in the dataset. (demo here)
  10. I don’t talk about evolutionary networks, because I study static snapshots here, but in general a network does evolve from disconnected components into a GCC. It depends on the network, though. For example, buyers in electronic shops, even though they may suggest products, don’t always bring in new customers through a connection, so customers are not connected to anyone. On the other hand, there may be certain clusters if there is some sort of affiliate network campaign.
      P: polynomial complexity, T(n) = O(n^k).
      NP: nondeterministic polynomial complexity. Nondeterministic automata can have multiple decision paths from a single state.
      “NP-complete” problems have no known polynomial-time algorithm.
      “NP-hard” problems are at least as hard as NP-complete ones.
      2. Yes, in social networks the GCC diameter is maximal at the first stages of network evolution and decreases over time. I’m not so sure about other network types. Social networks do get denser. Since each new node can connect to 0, 1 or all nodes, alpha is [...] So in the lowest case they grow linearly, with an exponent equal to 1, meaning like a tree. In the other case they can grow quadratically, with an exponent equal to 2, where each new node basically connects to all other nodes. So the more people join, the more friends may know someone at the other end of the graph. Thus, a smaller diameter. If you think of technological networks, then I don’t think laying a cable from Japan to Brazil is so easy.
      3. Markov centrality is one of the ways to find the most influential nodes in the network. Although it is very complex to compute, my work also lists other centrality measures. And I think that [...]
      4. Cascade analysis and data mining are still manual work.
      5. I used the Fruchterman-Reingold and Yifan Hu algorithms for local forces and for adaptive cooling. I’ve added my own version of recursive force summing and presented it in the work.
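
For readers unfamiliar with the layout algorithms named in the last answer, here is a minimal sketch of the textbook Fruchterman-Reingold scheme with simple linear cooling. It is not the author's JavaScript tool: it omits Yifan Hu's adaptive cooling and the recursive force summing described in the thesis. The frame size, iteration count, seed and toy graph are illustrative assumptions.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0,
                         iterations=50, seed=0):
    """Return {node: (x, y)} after a basic Fruchterman-Reingold layout."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10.0                  # caps movement per iteration
    cooling = temperature / (iterations + 1)    # linear cooling step

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion between every pair of nodes: f_r(d) = k^2 / d
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f

        # Attraction along each edge: f_a(d) = d^2 / k
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f

        # Move each node by at most `temperature`, staying inside the frame
        for n in nodes:
            dx, dy = disp[n]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[n][0] = min(width, max(0.0, pos[n][0] + dx / d * step))
            pos[n][1] = min(height, max(0.0, pos[n][1] + dy / d * step))

        temperature -= cooling

    return {n: tuple(p) for n, p in pos.items()}


# Hypothetical toy graph: C is linked to A, B and D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "C"), ("B", "C"), ("C", "D")]
layout = fruchterman_reingold(nodes, edges)
print(layout)
```

The all-pairs repulsion loop is quadratic in the number of nodes, which is consistent with a browser tool being limited to graphs of around a thousand nodes.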