Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Studying Online Interactions using Social Network Analysis
Anatoliy Gruzd
gruzd@ryerson.ca
@gruzd
Canada Research Chair in...
Social Network Analysis (SNA)
• Nodes = People
• Edges /Ties (lines) = Relations/
“Who retweeted/ replied/
mentioned whom”...
Makes it much easier to
understand what is going on in a
group
Advantages of
Social Network Analysis
Once the network is d...
Common approach for collecting social
network data:
• Self-reported social network data
may not be available/accurate
• Su...
• Common approach: surveys or interviews
• A sample question about students’ perceived social structures
How Do We Collect...
Studying Online Social Networks
http://www.visualcomplexity.com/vc
• Forum networks
• Blog networks
• Friends’ networks (F...
Goal: Automated Networks Discovery
Challenge: Figuring out what content-based features of online interactions can
help to ...
Automated Discovery of Social Networks
Emails
Nick
Rick
Dick
• Nodes = People
• Ties = “Who talks to whom”
• Tie strength ...
Automated Discovery of Social Networks
“Many to Many” Communication
ChatMailing listservForum Comments
Anatoliy Gruzd 10
@...
Automated Discovery of Social Networks
Approach 1: Chain Network (Reply-to)
FROM: Sam
PREVIOUS POSTER: Gabriel
....
....
....
Automated Discovery of Social Networks
Approach 1: Chain Network (Reply-to)
FROM: Sam
PREVIOUS POSTER: Gabriel
“ Nick, Gin...
Automated Discovery of Social Networks
Approach 2: Name Network
FROM: Ann
“Steve and Natasha, I couldn't wait to see your ...
• Main Communicative Functions of Personal Names (Leech, 1999)
– getting attention and identifying addressee
– maintaining...
Example: Twitter Networks
@John
@Peter
@Paul
• Nodes = People
• Ties = “Who retweeted/
replied/mentioned whom”
• Tie stren...
Automated Discovery of Social Networks
Twitter Data Example
24
Name Network ties Chain Network ties
@Cheeflo -> @JoeProf
@...
Automated Discovery of Social Networks
Twitter Data Example
25
Name Network ties Chain Network ties
@gruzd -> @sidneyeve @...
Anatoliy Gruzd
Netlytic.org
cloud-based research infrastructure for automated text analysis &
discovery of social networks...
29
cMOOC Dataset
Athabasca University Course
◦ CCK11: Connectivism and Connective Knowledge
 Not restricted to any one
pl...
31
CCK11 MOOC Dataset
#Posts Avg character
count
Avg word
count
Blogs 812 332.3 54.2
Discussion
Threads
68 541.8 90.8
Comm...
Question 1 - #LAK14 #pLASMA
What does this visualization tell us from an
instructor’s perspective?
ANATOLIY GRUZD 32@gruzd
What does this visualization tell us from a
student’s perspective?
ANATOLIY GRUZD 33@gruzd
SNA Measures
Micro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g....
SNA Measures
Micro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g....
37
#CCK11 Class Tweets (Node size=“Indegree”)
ANATOLIY GRUZD@gruzd
SNA Measures
Micro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g....
39
#CCK11 Class Tweets (Node size=“Outdegree”)
ANATOLIY GRUZD@gruzd
SNA Measures
Micro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g....
Top 10 Twitter users in the Name network
based on centrality measures
ANATOLIY GRUZD 41
In-degree Out-degree Betweenness
p...
SNA Measures
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Density indicates the overall
connectivity...
SNA Measures
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Diameter gives a general idea of how
“wide...
SNA Measures
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Reciprocity shows how many online
particip...
SNA Measures
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Centralization indicates whether a network...
Anatoliy Gruzd 46@gruzd
High Centralization
SNA Measures
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Modularity provides an estimate of
whether...
Anatoliy Gruzd 2012 Olympics in London 49@gruzd
High Modularity ~0.9
Anatoliy Gruzd #tarsand Twitter Community 50@gruzd
Moderate Modularity ~0.5
Name network (on the left) and Chain network (on the right)
ANATOLIY GRUZD 51
#CCK11 Class Tweets
@gruzd
Name Network Chain Network
Nodes 498 122
Edges 761 125
Density 0.003 (0.3%) 0.009 (0.9%)
Diameter 38 7
Reciprocity 0.09 (9...
Upcoming SlideShare
Loading in …5
×

Studying Online Interactions using Social Network Analysis

3,612 views

Published on

Presented at 2015 BCC Educational Discourse Workshop
Oct 4-5, 2015
Arlington, TX

Published in: Education
  • Be the first to comment

Studying Online Interactions using Social Network Analysis

  1. 1. Studying Online Interactions using Social Network Analysis Anatoliy Gruzd gruzd@ryerson.ca @gruzd Canada Research Chair in Social Media Data Stewardship Associate Professor, Ted Rogers School of Management Director, Social Media Lab Ryerson University BCC Workshop Arlington, TX, Oct 4, 2015
  2. 2. Social Network Analysis (SNA) • Nodes = People • Edges /Ties (lines) = Relations/ “Who retweeted/ replied/ mentioned whom” Anatoliy Gruzd 3 How to Make Sense of Educational “Big” Data? @gruzd
  3. 3. Makes it much easier to understand what is going on in a group Advantages of Social Network Analysis Once the network is discovered, we can find out: • How do people interact with each other, • Who are the most/least active members of a group, • Who is influential in a group, • Who is susceptible to being influenced, etc… Anatoliy Gruzd 4 @gruzd
  4. 4. Common approach for collecting social network data: • Self-reported social network data may not be available/accurate • Surveys or interviews Problems with surveys or interviews • Time-consuming • Questions can be too sensitive • Answers are subjective or incomplete • Participant can forget people and interactions • Different people perceive events and relationships differently How Do We Collect Information About Online Social Networks? Anatoliy Gruzd 5 @gruzd
  5. 5. • Common approach: surveys or interviews • A sample question about students’ perceived social structures How Do We Collect Information About Social Networks? Please indicate on a scale from [1] to [5], YOUR FRIENDSHIP RELATIONSHIP WITH EACH STUDENT IN THE CLASS [1] - don’t know this person [2] - just another member of class [3] - a slight friendship [4] - a friend [5] - a close friend Alice D. [1] [2] [3] [4] [5] … Richard S. [1] [2] [3] [4] [5] Source: C. Haythornthwaite, 1999 Anatoliy Gruzd 6 @gruzd
  6. 6. Studying Online Social Networks http://www.visualcomplexity.com/vc • Forum networks • Blog networks • Friends’ networks (Facebook, Twitter, Google+, etc…) • Networks of like-minded people (YouTube, Flickr, etc…) Anatoliy Gruzd 7 @gruzd
  7. 7. Goal: Automated Networks Discovery Challenge: Figuring out what content-based features of online interactions can help to uncover nodes and ties between group members How Do We Collect Information About Online Social Networks? Anatoliy Gruzd 8 @gruzd
  8. 8. Automated Discovery of Social Networks Emails Nick Rick Dick • Nodes = People • Ties = “Who talks to whom” • Tie strength = The number of messages exchanged between individuals Anatoliy Gruzd 9 @gruzd
  9. 9. Automated Discovery of Social Networks “Many to Many” Communication ChatMailing listservForum Comments Anatoliy Gruzd 10 @gruzd
  10. 10. Automated Discovery of Social Networks Approach 1: Chain Network (Reply-to) FROM: Sam PREVIOUS POSTER: Gabriel .... .... .... Posting header Content Anatoliy Gruzd 11 @gruzd
  11. 11. Automated Discovery of Social Networks Approach 1: Chain Network (Reply-to) FROM: Sam PREVIOUS POSTER: Gabriel “ Nick, Gina and Gabriel: I apologize for not backing this up with a good source, but I know from reading about this topic that … ” Posting header Content Possible Missing Connections: • Sam -> Nick • Sam -> Gina • Nick <-> Gina 12 12@gruzd
  12. 12. Automated Discovery of Social Networks Approach 2: Name Network FROM: Ann “Steve and Natasha, I couldn't wait to see your site. I knew it was going to [be] awesome!” This approach looks for personal names in the content of the messages to identify social connections between group members. Anatoliy Gruzd 14 @gruzd
  13. 13. • Main Communicative Functions of Personal Names (Leech, 1999) – getting attention and identifying addressee – maintaining and reinforcing social relationships • Names are “one of the few textual carriers of identity” in discussions on the web (Doherty, 2004) • Their use is crucial for the creation and maintenance of a sense of community (Ubon, 2005) Automated Discovery of Social Networks Approach 2: Name Network Anatoliy Gruzd 15 @gruzd
  14. 14. Example: Twitter Networks @John @Peter @Paul • Nodes = People • Ties = “Who retweeted/ replied/mentioned whom” • Tie strength = The number of retweets, replies or mentions How to Make Sense of Social Media Data? Anatoliy Gruzd 23 @gruzd
  15. 15. Automated Discovery of Social Networks Twitter Data Example 24 Name Network ties Chain Network ties @Cheeflo -> @JoeProf @Cheeflo -> @VMosco none
  16. 16. Automated Discovery of Social Networks Twitter Data Example 25 Name Network ties Chain Network ties @gruzd -> @sidneyeve @gruzd -> @sidneyeve
  17. 17. Anatoliy Gruzd Netlytic.org cloud-based research infrastructure for automated text analysis & discovery of social networks from social media data Networks Stats Content 27
  18. 18. 29 cMOOC Dataset Athabasca University Course ◦ CCK11: Connectivism and Connective Knowledge  Not restricted to any one platform  “Throughout this ‘course’ participants will use a variety of technologies, for example, blogs, Second Life, RSS Readers, UStream, etc.” ANATOLIY GRUZD@gruzd
  19. 19. 31 CCK11 MOOC Dataset #Posts Avg character count Avg word count Blogs 812 332.3 54.2 Discussion Threads 68 541.8 90.8 Comments 306 837.6 138.9 Tweets 1722 113.9 14.4 ANATOLIY GRUZD@gruzd
  20. 20. Question 1 - #LAK14 #pLASMA What does this visualization tell us from an instructor’s perspective? ANATOLIY GRUZD 32@gruzd
  21. 21. What does this visualization tell us from a student’s perspective? ANATOLIY GRUZD 33@gruzd
  22. 22. SNA Measures Micro-level In-degree centrality Out-degree centrality Betweenness centrality Other centrality measures (e.g., closeness, eigenvector) Macro-level Density Diameter Reciprocity Centralization Modularity ANATOLIY GRUZD 35@gruzd
  23. 23. SNA Measures Micro-level In-degree centrality Out-degree centrality Betweenness centrality Other centrality measures (e.g., closeness, eigenvector) ANATOLIY GRUZD 36 In-degree suggests “prestige” highlighting the most mentioned or replied Twitter users @gruzd
  24. 24. 37 #CCK11 Class Tweets (Node size=“Indegree”) ANATOLIY GRUZD@gruzd
  25. 25. SNA Measures Micro-level In-degree centrality Out-degree centrality Betweenness centrality Other centrality measures (e.g., closeness, eigenvector) ANATOLIY GRUZD 38 Out-degree reveals active Twitter users with a good awareness of others in the network @gruzd
  26. 26. 39 #CCK11 Class Tweets (Node size=“Outdegree”) ANATOLIY GRUZD@gruzd
  27. 27. SNA Measures Micro-level In-degree centrality Out-degree centrality Betweenness centrality Other centrality measures (e.g., closeness, eigenvector) ANATOLIY GRUZD 40 Betweenness shows actors who are located on the most number of information paths and who often connect different groups of users in the network @gruzd
  28. 28. Top 10 Twitter users in the Name network based on centrality measures ANATOLIY GRUZD 41 In-degree Out-degree Betweenness profesortbaker cck11feeds profesortbaker shellterrell suifaijohnmak suifaijohnmak gsiemens web20education cck11feeds downes daisygrisolia thbeth zaidlearn ianchia francesbell web20education profesortbaker hamtra hamtra pipcleaves daisygrisolia profesorbaker hamtra web20education ianchia jaapsoft jaapsoft jaapsoft thbeth gsiemens suifaijohnmak ryantracey @gruzd
  29. 29. SNA Measures Macro-level Density Diameter Reciprocity Centralization Modularity Density indicates the overall connectivity in the network (the total number of connections divided by the total number of possible connections). It is equal to 1 when everyone is connected to everyone. ANATOLIY GRUZD 42@gruzd Student Instructor Student Density = 1
  30. 30. SNA Measures Macro-level Density Diameter Reciprocity Centralization Modularity Diameter gives a general idea of how “wide” the network is; the longest of the shortest paths between any two nodes in the network. ANATOLIY GRUZD 43@gruzd #1 Student A Instructor Student B Student C Diameter = 3 #2 #3
  31. 31. SNA Measures Macro-level Density Diameter Reciprocity Centralization Modularity Reciprocity shows how many online participants are having two-way conversations. In a scenario when everyone replies to everyone, the reciprocity value will be 1. ANATOLIY GRUZD 44@gruzd Student Instructor Student Student Reciprocity=1
  32. 32. SNA Measures Macro-level Density Diameter Reciprocity Centralization Modularity Centralization indicates whether a network is dominated by few central participants (values are closer to 1), or whether more people are contributing to discussion and information dissemination (values are closer to 0). ANATOLIY GRUZD 45@gruzd Student Instructor Student Student Centralization=1
  33. 33. Anatoliy Gruzd 46@gruzd High Centralization
  34. 34. SNA Measures Macro-level Density Diameter Reciprocity Centralization Modularity Modularity provides an estimate of whether a network consists of one coherent group of participants who are engaged in the same conversation and who are paying attention to each other (values closer to 0); or whether a network consists of different conversations and communities with a weak overlap (values closer to 1). ANATOLIY GRUZD 48@gruzd
  35. 35. Anatoliy Gruzd 2012 Olympics in London 49@gruzd High Modularity ~0.9
  36. 36. Anatoliy Gruzd #tarsand Twitter Community 50@gruzd Moderate Modularity ~0.5
  37. 37. Name network (on the left) and Chain network (on the right) ANATOLIY GRUZD 51 #CCK11 Class Tweets @gruzd
  38. 38. Name Network Chain Network Nodes 498 122 Edges 761 125 Density 0.003 (0.3%) 0.009 (0.9%) Diameter 38 7 Reciprocity 0.09 (9%) 0.18 (18%) Centralization 0.07 0.08 Modularity 0.67 0.77 ANATOLIY GRUZD 52@gruzd #CCK11 Class Tweets

×