Social Network
Analysis
Jutta Pauschenwein
ZML-Innovative Lernszenarien
FH JOANNEUM
COS16 – week 1
Coursera MOOC: 8 weeks, autumn 2014
Why SNA?
I want to understand data
through visualization
Week 1 in the SNA MOOC
Definitions
• network: set of connected nodes (social: connection via relationship)
• nodes: also called actors, sites, vertices
• connections: edges, ties, relations
• visualize networks by graphs
4 communities
MOOC, week 1
Questions
structure of the network
• Are the nodes connected? How far
are they from each other? Are some
nodes more important than others?
Are there communities in the
network?
types of networks
• randomly generated connections,
network with preferences, small world
networks (most nodes are not
neighbours of one another, but the
neighbours of any given node are
likely to be neighbours of each other)
small-world-network, MOOC week 5
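The small-world property (the neighbours of a node are likely to be neighbours of each other) is usually measured with the local clustering coefficient. The following is a minimal sketch, assuming an undirected network stored as a dictionary of neighbour sets; the tiny `network` and the function name `local_clustering` are made up for illustration.

```python
from itertools import combinations

# Hypothetical undirected network: node -> set of neighbours.
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

def local_clustering(adj, node):
    """Share of neighbour pairs of `node` that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to compare
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)

for name in sorted(network):
    print(name, local_clustering(network, name))
```

A value near 1 means the neighbours of the node form a tight group; a value near 0 means they hardly know each other.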
Connections
• connections are directed / not directed
• weighted connection
• a node with several degrees
A communicates with B
A communicates with B and B with A
A communicated with B 4 times
Communication of students
in Google+ over 4 days
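The three cases (directed, mutual, weighted) can be derived from a simple message log. This is only a sketch: the `messages` list is invented example data, not the Google+ data of the students.

```python
from collections import Counter

# Hypothetical message log: one (sender, receiver) pair per message.
messages = [("A", "B"), ("A", "B"), ("B", "A"),
            ("A", "B"), ("A", "B"), ("A", "C")]

# Directed, weighted connections: weight = number of messages sender -> receiver.
directed = Counter(messages)

# Undirected view: direction is ignored, A->B and B->A are merged.
undirected = Counter()
for (sender, receiver), weight in directed.items():
    undirected[frozenset((sender, receiver))] += weight

# In a directed network a node has two degrees (distinct partners, not messages).
out_degree = Counter(sender for sender, _ in directed)
in_degree = Counter(receiver for _, receiver in directed)

print(directed[("A", "B")])                  # A communicated with B 4 times
print(undirected[frozenset(("A", "B"))])     # 5 messages between A and B in total
print(out_degree["A"], in_degree["A"])       # A writes to 2 persons, 1 person writes to A
```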
Erdős-Rényi Graph
• simple network with fixed number of nodes
• assumption 1: nodes connect randomly
• assumption 2: network is not directed
• assumption 3: N nodes, M connections, p = probability that two nodes connect
• no hubs appear in this type of network, but a “giant component” does
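A minimal sketch of such a random network, assuming the variant where every pair of the N nodes is connected independently with probability p (the number of connections M then results from p). The function names and the values of `n`, `p` and `seed` are illustrative choices.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Undirected random network: each pair of nodes is linked with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def largest_component_size(adj):
    """Size of the largest group of nodes that can reach each other."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best

graph = erdos_renyi(n=200, p=0.03, seed=1)
print("largest component:", largest_component_size(graph))
print("highest degree:", max(len(nb) for nb in graph.values()))
```

Comparing the size of the largest component with N for different values of p shows when the giant component appears; the highest degree typically stays close to the average, which is the "no hubs" observation.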
Real networks grow
Extension of the Erdős-Rényi approach
• growing networks, for example WWW, citation networks
Models
• random preferential: new nodes prefer to connect to already
well connected nodes
• introduction model: nodes are introduced to each other
• static geographic model: nodes connect to the neighbours of
the nodes they are connected with
Barabási-Albert model
• the probability that a new node connects to an existing
node depends on that node’s degree
• There’s an initial configuration and then the process
of connecting starts with an
• each new node connects to the network with m connections
preferential model in NetLogo
In the preferential model you get hubs because “old” nodes have more time to connect
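A minimal sketch of preferential attachment in plain Python (not the NetLogo model). It assumes that each new node brings m connections and that the initial configuration is a small fully connected group of m + 1 nodes; both are assumptions for illustration.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow an undirected network to n nodes; each new node adds m connections."""
    rng = random.Random(seed)
    # Assumed initial configuration: m + 1 nodes, all connected to each other.
    adj = {v: set(range(m + 1)) - {v} for v in range(m + 1)}
    # Each node is listed once per connection it has, so a uniform
    # draw from this pool picks a node with probability ~ its degree.
    pool = [v for v in adj for _ in adj[v]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))
        adj[new] = set()
        for target in targets:
            adj[new].add(target)
            adj[target].add(new)
            pool.extend((new, target))
    return adj

graph = preferential_attachment(n=200, m=2, seed=1)
by_degree = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
# Low node numbers are the "old" nodes.
print([(v, len(graph[v])) for v in by_degree[:5]])
```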
Centrality
How important is a node in the network?
degree centrality
the node is an active player in the network: it is well
connected
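A minimal sketch of degree centrality: the number of connections of a node divided by the maximum possible number, N - 1. The small network (two triangles joined via node D) is a made-up example.

```python
# Hypothetical network: triangles A-B-C and E-F-G, joined via node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def degree_centrality(adj):
    """Degree of each node, normalized by the N - 1 possible partners."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}

centrality = degree_centrality(adj)
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 2))
```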
betweenness centrality
“broker” - much of the communication passes through this node
If the node fails, connections in the network break.
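A sketch of how betweenness can be computed for a small unweighted, undirected network: for every pair of nodes, count the share of shortest paths that pass through a given node. The code follows the well-known approach by Brandes; the example network is made up.

```python
from collections import deque

# Hypothetical network: triangles A-B-C and E-F-G, joined via node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def betweenness(adj):
    """Unnormalized betweenness centrality of every node."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        dist = {source: 0}
        paths = {v: 0 for v in adj}      # number of shortest paths from source
        paths[source] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        order = []
        queue = deque([source])
        while queue:                     # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # collect dependencies, farthest first
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Undirected network: every pair was counted from both ends.
    return {v: s / 2 for v, s in score.items()}

result = betweenness(adj)
for node in sorted(result, key=result.get, reverse=True):
    print(node, result[node], "degree", len(adj[node]))
```

In this example D has only two connections but the highest betweenness, because every path between the two triangles passes through it.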
closeness
Closeness measures how near one node is to all the other nodes - it is enough
to be near a hub
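A minimal sketch of closeness centrality, assuming a connected, unweighted network: N - 1 divided by the sum of the shortest-path distances to all other nodes. The example network is made up.

```python
from collections import deque

# Hypothetical network: triangles A-B-C and E-F-G, joined via node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def closeness(adj, source):
    """(number of reachable nodes) / (sum of distances to them)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                       # breadth-first search gives hop distances
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

for node in sorted(adj):
    print(node, round(closeness(adj, node), 2))
```

Node D has few connections itself, but it sits next to the two best-connected nodes C and E and therefore gets the highest closeness.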
eigenvector centrality
The importance of a node increases with the importance of its neighbours.
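A sketch of eigenvector centrality by repeated averaging (power iteration): every node repeatedly takes the sum of the scores of its neighbours. It assumes a connected, undirected network that contains at least one triangle, so that the iteration settles; the example network and the number of iterations are arbitrary choices.

```python
# Hypothetical network: triangles A-B-C and E-F-G, joined via node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def eigenvector_centrality(adj, iterations=100):
    """Scores scaled so that the most important node has the value 1."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # New score of a node = sum of the current scores of its neighbours.
        new = {v: sum(score[w] for w in adj[v]) for v in adj}
        largest = max(new.values())
        score = {v: value / largest for v, value in new.items()}
    return score

scores = eigenvector_centrality(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```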
Which node has a small degree but a
high betweenness?
Or the reverse?
Find communities
How to define a community / substructure in a network?
• There are many connections within a community.
• The other nodes in a community are only at a distance of some hops.
• Nodes in one community are strongly connected.
It’s difficult to find communities if you don’t know the number of communities - there are
large and small communities.
Parameters:
• minimum cut: number of communities
• hierarchical clustering: clustering based on certain characteristics
• betweenness clustering: connections with the highest betweenness are removed
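A sketch of betweenness clustering on a tiny made-up network: compute the betweenness of every connection, remove the one with the highest value, and repeat until the network falls apart into more pieces. This follows the idea of the Girvan-Newman method; the function names are chosen for illustration.

```python
from collections import deque

def edge_betweenness(adj):
    """Betweenness of every connection (each pair is counted from both ends)."""
    score = {}
    for source in adj:
        dist, order = {source: 0}, []
        paths = {v: 0 for v in adj}
        paths[source] = 1
        preds = {v: [] for v in adj}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return score

def components(adj):
    """Groups of nodes that can reach each other."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            group.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(group)
    return groups

def split_once(adj):
    """Remove highest-betweenness connections until one more group appears."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break                                  # no connections left
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical network: triangles A-B-C and D-E-F joined by the connection C-D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

communities = split_once(adj)
print([sorted(group) for group in communities])
```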
Problem: when do you stop removing connections?
=> Modularity - compares how many connections lie within the community and how many lead outside it
In a well-separated network the number of connections within a community is high and the number of connections to
nodes outside of the community is low
http://spark-public.s3.amazonaws.com/sna/other/guess/betweennessclust.html
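A sketch of the modularity comparison, assuming the commonly used definition Q = sum over communities of (L_c / m) - (d_c / 2m)^2, where m is the number of connections, L_c the number of connections inside community c and d_c the sum of the degrees of its nodes. The network and the two partitions are made up.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted network."""
    m = len(edges)
    inside = {}       # connections with both ends in the same community
    degree_sum = {}   # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical network: triangles A-B-C and E-F-G, joined via node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

one_group = {node: 0 for node in "ABCDEFG"}
two_groups = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1, "F": 1, "G": 1}

print(modularity(edges, one_group))    # everything in one community
print(modularity(edges, two_groups))   # split between D and E
```

One way to use this is to compute Q after every removal step and keep the partition with the highest value.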
3 communities of my Facebook friends (2014)
Software
NetLogo
• programmable modeling environment for simulating natural and social
phenomena
• Free, open source - cross-platform: runs on Mac, Windows, Linux, etc.
• https://ccl.northwestern.edu/netlogo
Gephi
• interactive visualization and exploration platform for networks and
complex systems, dynamic and hierarchical graphs.
• Runs on Windows, Linux and Mac OS X. Gephi is open-source and free
• http://gephi.github.io/
Visualization of online
groups: data
master program
winter semester 14/15
4 days in Nov. 2014
Google+
collection of data by hand
how individual persons and the
group interact
one person is more active than
the teacher
Size of nodes according to degree
Color according to betweenness
master program, winter semester 14/15
3 weeks of online socialisation
Interaction in Moodle
Same group as before
Interaction during the whole semester
Same group
4 communities
training course 14/15
size of nodes according to degree
color according to betweenness
2 communities
Conclusion
• SNA is complex and has high potential
• I gain new insights into my groups - but I don’t understand it entirely
yet - there’s a lot of theory behind it
• I use it to get a quick insight into how a group is performing - in my role as
convener/moderator

Social Network Analysis for small learning groups
