1. Some notes on graph drawing in the social sciences
Steve Borgatti
LINKS center, U of Kentucky
http://linkscenter.org
(c) 2008 Stephen P. Borgatti. All rights reserved.
2. informal review of three areas
• Three areas that routinely use relational
concepts/data
– Multivariate/correlational analysis
– Cultural domain analysis (CDA)
– Social network analysis
• I will briefly review each area
– Bottom line: Graph drawing capabilities
underutilized
– Note: how many social scientists are here today?
13. Causes of Breast Cancer
[Figure: layout of items named as causes of breast cancer. Labels: ABORTIONS, WILDLIFE, LATECHILDREN, DIRTYWORK, ETHNICITY, IMPLANTS, EARLYMENSES, AGE, LACKHYGIENE, NOCHILDREN, OBESITY, PROBPRODMILK, SALVADOR, CANCERHISTO, PHYSICIANS, FONDLING, SMOKING, ILLEGALDRUGS, HORMONESUPPS, FAMILYHISTORY, MEXICAN, BIRTHCONTROL, FIBROCYSTIC, BLOWS, FATDIET, BREAST-FEEDING, NEVERBREASTFEED, ALCOHOL, ANGLO, CHICANAS, LACKMEDICALATTN, DIET, CHEMICALSINFOOD, RADIATION, POLLUTION, CAFFEINE, JUSTHAPPENS, LARGEBREASTS]
14. Causes of Breast Cancer
(Frequencies > 18%)
[Figure: the same layout restricted to frequently mentioned items. Labels: OBESITY, NOCHILDREN, AGE, LATECHILDREN, FATDIET, IMPLANTS, PHYSICIANS, CANCERHISTORY, HORMONESUPPS, SALVADOR, PROBLEMSMILK, FAMILYHISTORY, BLOWS2BREAST, CHICANAS, CHEMICALSINFOOD, ANGLO, POLLUTION, DIET, MEXICAN, NEVERBREASTFEED, SMOKING, RADIATION, LACKMEDICALATTN, BIRTHCONTROL, BREAST FONDLING]
15. Quick summary of multivariate
visualization
• Visualization dominated by tables or principal
components / vector spaces and taxonomic
displays
• Even the simplest graph representations are a
contribution
17. Perceived Similarities
• Direct ratings
– ‘How similar are “rabies” and “lupus” on a 1 to 5 scale?’
• Pilesorts
– (given cards, each with name of a fruit) “Please sort these fruits
into piles according to how similar they are …”
– For each pair of items, count proportion of respondents that
place them in same pile
• Triad tests
– ‘In each group of three below, which is the most different?’
• SHARK DOLPHIN SEAL
• DOG SEAL CAT
– Each time an item is chosen, give a point towards similarity of
the other two
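The pilesort and triad scoring rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the fruit piles, and the triad answers are hypothetical, not data from the talk.

```python
from collections import defaultdict
from itertools import combinations


def pilesort_similarity(sorts):
    """Proportion of respondents who put each pair of items in the same pile.

    sorts: one entry per respondent, each a list of piles (sets of item names).
    Pairs that nobody sorted together are simply absent from the result.
    """
    counts = defaultdict(int)
    for piles in sorts:
        for pile in piles:
            for a, b in combinations(sorted(pile), 2):
                counts[(a, b)] += 1
    n = len(sorts)
    return {pair: c / n for pair, c in counts.items()}


def triad_similarity(answers):
    """One point toward the similarity of the two items NOT chosen.

    answers: list of (triple, item_chosen_as_most_different).
    """
    points = defaultdict(int)
    for triple, odd in answers:
        a, b = sorted(x for x in triple if x != odd)
        points[(a, b)] += 1
    return dict(points)


# Hypothetical responses from two respondents sorting four fruits.
sorts = [
    [{"apple", "pear"}, {"lemon", "lime"}],
    [{"apple", "pear", "lemon"}, {"lime"}],
]
sim = pilesort_similarity(sorts)

# Hypothetical answers to the two triads shown on the slide.
triads = [
    (("SHARK", "DOLPHIN", "SEAL"), "SHARK"),
    (("DOG", "SEAL", "CAT"), "SEAL"),
]
tri = triad_similarity(triads)

print(sim)
print(tri)
```

Either result is an item-by-item similarity table, which is the input that the later slides draw as a map or a graph.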
22. [Figure: plot of environmental behaviors, with regions marked *A and *B. Item labels follow.]
Water off while shaving; Cut grass high; Full loads in dishwasher; Plant shrubs; Cold-water detergent; Mulch grass clippings; Lowflow shower; Plant garden; Rinse w/ cold water; Compost; Short dishwasher cycles; Plant trees; Water lawn in morning/evening; Close shades; Restore buildings; Recycling bins; Water-saving toilets; Salvation Army; Turn off lights; Pick up litter; Use things longer; Air off when leave; Paper bags; Cloth diapers; Encourage others to recycle; Fans; Don’t litter; Reuse towels; Organize drives for recyclables; Dishwasher w/ built-in heater; Encourage recycled products; Cool leftovers; Wear sweaters; Insulate home; Teach kids about recycling; Clothes line; Weatherstrip; Save wetlands; Double-pane windows; Automatic timers for house temp.; Gas heat; Frig. seal; Insulate heating ducts; Convection oven; Dryer with moisture sensor; Copper & brass; Both sides paper; Clean lint filter; Oven door seal; Redeem cans; Use own grocery bags; Freezers on top; Put bins in office; Recycle toxic prods.; Fluorescent bulbs; Buy recycled prods.; Overpackaged foods; Low-watt bulbs; No aerosol; Remove CFC in old refrig.; Dishwasher w/ airdry; Reduce meat consumption; Photocells; Political activities; Walk or bike; Furnace tune-up; Write congressperson; Dolphin safe tuna; Carpool; Regulate thermostat; Inflate tires properly; “Save the Earth” t-shirts; Public transport; Gas mileage on new car; Use ethanol; Assure car runs well; Join environmental groups; Teach kids about endangered species; Buy Electric Car; Ride Motorcycle; Show kids by example; Teach about gains from environment; Teach kids to preserve planet; Support world population organizations; Tell others not to do bad things
23. U.S. Holidays
             April_Fools  Christmas  Columbus  Easter  Fathers  Flag   4th_Of_July
April_Fools  0            0          0.185     0.148   0.222    0.407  0.111
Christmas    0            0          0         0.741   0.111    0.037  0.111
Columbus     0.185        0          0         0       0.222    0.444  0.296
Easter       0.148        0.741      0         0       0.148    0.037  0.148
Fathers      0.222        0.111      0.222     0.148   0        0.148  0.185
Flag         0.407        0.037      0.444     0.037   0.148    0      0.37
4th_Of_July  0.111        0.111      0.296     0.148   0.185    0.37   0
24. non‐metric MDS representation
[Figure: non-metric MDS plot of holidays. Labels: Rosh_Hashanah, Kwanza, Ramadan, Cinco_de_Mayo, Thanksgiving, Yom_Kippur, 4th_Of_July, New_Years, April_Fools, Groundhog, Presidents, Christmas, Halloween, Columbus, Hanukkah, Passover, Memorial, Veterans, Mothers, Patriots, Fathers, Easter, Labor, Flag, St_Valentines, St_Patrick, Secretaries, MLK]
(Degenerate solution)
25. after removing “strange” holidays
[Figure: MDS plot of the remaining holidays. Labels: 4th_Of_July, Veterans, Labor, Memorial, Patriots, Flag, Yom_Kippur, Thanksgiving, Columbus, Presidents, Passover, MLK, Hanukkah, Easter, Christmas, Secretaries, St_Patrick, Groundhog, April_Fools, St_Valentines, New_Years, Halloween, Mothers, Fathers]
26. graph representation
[Figure: graph of holidays. Node labels: Yom_Kippur, Hanukkah, Passover, Christmas, Easter, MLK, Thanksgiving, Presidents, Halloween, Veterans, Columbus, New_Years, 4th_Of_July, Patriots, St_Valentines, Labor, Flag, Memorial, April_Fools, Groundhog, St_Patrick, Fathers, Secretaries, Mothers]
28. Graph representation
• Obviously can represent personality traits as
nodes, strong similarities as links
• Dimensions such as good/bad or
active/passive are just node attributes
– Typically represented by node size or dark‐to‐light
coloration
• How to present multiple attributes at the
same time?
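As a rough sketch of "items as nodes, strong similarities as links", the snippet below thresholds a similarity matrix into an edge list. It reuses the seven holidays from the U.S. Holidays table on slide 23 rather than personality traits; the function name and the 0.35 cutoff are assumptions chosen only for illustration.

```python
# Similarity values copied from the U.S. Holidays table (slide 23).
names = ["April_Fools", "Christmas", "Columbus", "Easter",
         "Fathers", "Flag", "4th_Of_July"]
sim = [
    [0,     0,     0.185, 0.148, 0.222, 0.407, 0.111],
    [0,     0,     0,     0.741, 0.111, 0.037, 0.111],
    [0.185, 0,     0,     0,     0.222, 0.444, 0.296],
    [0.148, 0.741, 0,     0,     0.148, 0.037, 0.148],
    [0.222, 0.111, 0.222, 0.148, 0,     0.148, 0.185],
    [0.407, 0.037, 0.444, 0.037, 0.148, 0,     0.37],
    [0.111, 0.111, 0.296, 0.148, 0.185, 0.37,  0],
]


def strong_ties(names, sim, cutoff):
    """Return (item, item, similarity) for every pair at or above the cutoff."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle: matrix is symmetric
            if sim[i][j] >= cutoff:
                edges.append((names[i], names[j], sim[i][j]))
    return edges


# The cutoff is an arbitrary illustrative choice, not a value from the talk.
edges = strong_ties(names, sim, cutoff=0.35)
for a, b, s in edges:
    print(f"{a} -- {b} ({s})")
```

Node attributes such as good/bad could then be stored in a separate dictionary keyed by item name and mapped to size or color by whatever drawing tool is used.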
29. Contagion (Guatemala)
Susan C. Weller. 1984. Cross‐Cultural Concepts of Illness: Variation and Validation, American Anthropologist
30. Severity (Guatemala)
Susan C. Weller. 1984. Cross‐Cultural Concepts of Illness: Variation and Validation, American Anthropologist
31. Age of the Infirm (Guatemala)
Susan C. Weller. 1984. Cross‐Cultural Concepts of Illness: Variation and Validation, American Anthropologist
35. Moreno & Sociometry 1930s
[Figure: Friendship Choices Among Fourth Graders (from Moreno, 1934, p. 38).]
[Figure: Positive and Negative Choices in a Football Team (Moreno, 1934, p. 213).]
36. Fast‐forward 60 years ..
• Huge advances in computing
• But small advances in graph visualization (in mainstream social science)
Kilduff, Martin, and David Krackhardt 1994. "Bringing the Individual Back In: A
Structural Analysis of the Internal Market for Reputation in Organizations." Academy
of Management Journal, 37: 87‐108.
40. frequency of usage of graph drawing in
organizational studies
• Examined all articles in the last 3* years in two
top journals
– Administrative Science Quarterly (*all 3 years)
– Organization Science (*2 years only)
• Of 23 empirical papers focusing on social
networks
– Only 3 had drawings of graphs
– Only 1 depicted actual data (as opposed to an
illustration of a structural idea)
41. in short …
• In organizational studies at least, graph
drawings are
– Rare
– Hardly different from nearly a century ago
• Few design elements
• Largely the same substantive concepts
• Of course, more use in presentations
– And even more in private exploration of data
42. many of the reasons are institutional
rather than technical
• Print journals permit only simplest graphics + inability to switch to electronic media
• Media limitations & “costs”
• Legitimacy of pictures
• Habit of verbal vs visual thinking
• Qual XOR Quant perspective
• Lack of prestige of strange journals
• Comic book understanding of science
– deductive
– quantitative
43. Other issues
• Lack of quality tools
– Power & ease of use
• Insufficient attention to substantive issues
• Imagination & effort?
• Algorithms
44. User Interfaces
• Netdraw
– “userly” but pathetically programmed. Fat, buggy,
quirky and inconsistent in its conception of the data
• Pajek
– Elegantly programmed and powerful, but frightening
to mainstream social scientists
• Only a command‐line interface could create more fear
• Visone
– In a way, a blend of Netdraw and Pajek, but almost
ascetically lean: prefers economy to convenience
45. Tool features
Automating Legends
• Automatically generate legends when using design elements like color, size, shape, etc.
– Guess
46. Smart Labeling
[Figure: two drawings of the same nodes (NORA, SYLVIA, KATHERINE, HELEN, VERNE, MYRNA) with different label placement]
Computer science applications often ignore labels
47. annotating outputs
HLM output
Coding: highlighting, marking-up, cutting-up, classifying, graph elements
48. statistics printed on chart
[Figure: graph with nodes A to I, with statistics printed on the chart: Geary’s C: 0.333; Significance: 0.000]
49. Collapsing / expanding nodes
• Easily collapsing nodes into super nodes and
then expanding back
– Current tools handle this by creating separate image graphs
Density / Average value within blocks
        1       2       3
1  0.3571  0.0417  0.0625
2  0.1042  0.3000  0.1667
3  0.0000  0.1250  0.7500
[Figure: network of BILL, HARRY, DON, MICHAEL, HOLLY, PAT, GERY, LEE, STEVE, JENNIE, BRAZEY, PAM, RUSS, ANN, BERT, JOHN, PAULINE and CAROL, and its image graph with super nodes 1, 2, 3 and ties labeled with values from 0.0 to 0.8]
50. convex hulls to represent categorical
node attributes
‐‐ not complex algorithmically but few offer it
Anthropac software
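To illustrate why this is "not complex algorithmically", here is a minimal sketch that computes one convex hull per category from node coordinates. It uses Andrew's monotone chain, a standard technique swapped in for illustration; it is not claimed to be what Anthropac does. The function names, coordinates, and categories are made up.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hulls_by_category(positions, category):
    """Group node coordinates by a categorical attribute and hull each group."""
    groups = {}
    for node, xy in positions.items():
        groups.setdefault(category[node], []).append(xy)
    return {c: convex_hull(p) for c, p in groups.items()}


# Hypothetical layout coordinates and a hypothetical categorical attribute.
positions = {"a": (0, 0), "b": (2, 0), "c": (2, 2), "d": (0, 2), "e": (1, 1),
             "f": (5, 5), "g": (6, 5)}
category = {"a": "x", "b": "x", "c": "x", "d": "x", "e": "x",
            "f": "y", "g": "y"}
hulls = hulls_by_category(positions, category)
print(hulls)
```

A drawing tool would then render each hull as a filled polygon behind the nodes; groups with one or two members degenerate to a point or a line segment.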
52. Multi‐mode data
Davis, Gardner and Gardner (published in the 1941 book Deep South)
53. Implicit handling of modality
Davis, Gardner and Gardner data. Which women attended which social events.
[Figure: two-mode graph with women (OLIVIA, DOROTHY, FLORA, PEARL, MYRNA, RUTH, EVELYN, THERESA, LAURA, KATHERINE, NORA, BRENDA, SYLVIA, VERNE, HELEN, FRANCES, ELEANOR, CHARLOTTE) and events (E1 to E14) drawn as nodes in the same picture]
54. reducing modality
• Current approach:
– Analysis programs provide a tool for constructing a
new graph, based on number of ties in common,
and then allow you to draw that graph
• E.g., if X is a 2-mode data matrix in which xij = 1 means
that woman i attended event j, then X’X gives the
number of women who co-occurred at each pair of
events and XX’ gives the number of events in common
for each pair of women
– X’X and XX’ induce new graphs that can be visualized
• Separate drawing step from data construction step
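The two reductions described above can be sketched with plain lists. The attendance matrix below is a made-up toy example (three hypothetical women W1 to W3, three events), not the Davis, Gardner and Gardner data, and the helper names are illustrative.

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Hypothetical 2-mode data: X[i][j] = 1 means woman i attended event j.
women = ["W1", "W2", "W3"]
events = ["E1", "E2", "E3"]
X = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]

Xt = transpose(X)
women_by_women = matmul(X, Xt)    # XX': events in common for each pair of women
events_by_events = matmul(Xt, X)  # X'X: women in common for each pair of events

print(women_by_women)
print(events_by_events)
```

The diagonal of XX’ holds the number of events each woman attended, and the diagonal of X’X holds the attendance at each event; a drawing step would normally ignore the diagonals and draw the off-diagonal counts as valued ties.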
57. visualization of X’X
(event by event overlap matrix)
[Figure: one-mode graph of events E1 to E14]
58. but users don’t see it in terms of the
operations needed to get there
[Figure: two-mode graph of women (OLIVIA, DOROTHY, FLORA, PEARL, MYRNA, RUTH, EVELYN, THERESA, LAURA, KATHERINE, NORA, BRENDA, SYLVIA, VERNE, HELEN, FRANCES, ELEANOR, CHARLOTTE) and events (E1 to E14)]
60. As far as I know, only TouchGraph does this well, and there is room for improvement
Note: Bloom BR[au] and Harvard[ad] 1/1/90‐11/27/04 All A1, AA1, M1, MM1, MA1, J1, JAJM1, deg sep1 from author Barry L. Bloom
Source: PubMed, BCG Analysis
62. visualizing relational algebra via
implicit multimode reductions
• Suppose we have multimodal data represented as
series of interlinked tables:
– AD = author by document
– TD = keywords by document
• AD*AD’ = author by author co‐authorships
• AD*TD’ = authors by their topics
• TD*TD’ = topic by topic co‐occurrences in documents
• Y = AT*TD*TD’*AT’ (with AT = AD*TD’, the author by topic table) = author by author
linkage of their topics, i.e., yij > 0 if author i writes about topics that
co-occur with the topics that author j writes about
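The chain of products above can be checked on a toy example. Everything below is an illustrative assumption: the two tables are invented, the helper names are hypothetical, and AT is taken to be AD*TD’ (authors by their topics), as the third bullet suggests.

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Hypothetical tables: 3 authors, 3 documents, 2 topics.
AD = [  # author by document
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
TD = [  # topic (keyword) by document
    [1, 1, 0],
    [0, 1, 1],
]

coauthor = matmul(AD, transpose(AD))  # AD*AD': author by author co-authorships
AT = matmul(AD, transpose(TD))        # AD*TD': authors by their topics (assumed AT)
TT = matmul(TD, transpose(TD))        # TD*TD': topic by topic co-occurrences
Y = matmul(matmul(AT, TT), transpose(AT))  # AT*(TD*TD')*AT'

print(coauthor)
print(Y)
```

In this toy case authors 1 and 3 never co-author (coauthor[0][2] is 0) yet Y links them, because their topics co-occur in documents. A tool doing "implicit multimode reduction" would compute such products behind the scenes when the user asks for an author-by-author view.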
63. integrating better with data sources
• Currently user is responsible for constructing a
graph of interest to be visualized
– Users think that should be part of the visualization
program
• Ability to directly access a database of tables
relating multiple kinds of entities and
construct graphs on the fly
– With filtering
64. substance issues: What theoretical
concepts to represent?
[Figure: example graph with nodes a to j]
• Social distance / cohesion / connectedness: default representations, e.g. Kamada-Kawai
• Structural similarity / isomorphism: spectral / principal components / SVD
68. What else would we want to
represent?
• Robustness of measures
– Jackknifing and bootstrapping results
• Multiple centrality measures
• ERGM models …
– Space of possible networks
69. uses of motion as design element
• Case I
– Motion reveals static structure from multiple points of view
– I don’t think we do a good job with this
Anthony Dekker. 200?. Conceptual Distance in Social Network
Analysis. Journal of Social Structure. (Vol. 6, No. 3 )
70. uses of motion as design element
• Case II
– Motion reveals change in network structure and position over time
– Maintaining the meaning of the motion/position link
• Brownian motion of the spring embedder
– But see Visone for algorithmic improvement
Anthony Dekker. 200?. Conceptual Distance in Social Network Analysis. Journal of Social Structure. (Vol. 6, No. 3)
71. uses of motion as design element
Case III
• Nodes maintain fixed positions, ties appear and disappear
– Ignores changes in centrality etc.
– Traces help maintain memory but this is still an issue
Moody, James, Daniel A. McFarland and Skye Bender-deMoll. 2005. "Dynamic Network Visualization: Methods for Meaning with Longitudinal Network Movies." American Journal of Sociology 110:1206-1241.
72. simpler side by side displays still have advantage of comparability
[Figure: three side-by-side drawings of the same network at Time 1, Time 2 and Time 3. Node labels: ALBERT_16, HUGH_14, BONI_15, MARK_7, GREG_2, JOHN_1, WINF_12, ELIAS_17, BASIL_3, AMBROSE_9, SIMP_18, BERTH_6, AMAND_13, VICTOR_8, PETER_4, ROMUL_10, BONAVEN_5, LOUIS_11]
73. representing trajectories
• Examples
– Movements of individuals from position to
position
– Movement of children, drugs, goods, etc through
locations
– Diffusion of information, beliefs, viruses through
network links
76. Movement of college basketball coaches
from school to school
Nodes are schools. Arcs indicate that a coach has moved from one school to the other. But *paths* through the network are lost.
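A small sketch of why the paths are lost: once careers are collapsed into school-to-school arcs, the graph admits routes that no coach ever took. The two careers below are invented for illustration (only the school names are borrowed from the slides), and the function names are hypothetical.

```python
# Hypothetical careers: each is an ordered list of schools for one coach.
careers = {
    "coach_a": ["tulsa", "arkansas", "utep"],
    "coach_b": ["marist", "arkansas", "cornell"],
}


def aggregate_arcs(careers):
    """Collapse careers into a set of directed school-to-school arcs."""
    arcs = set()
    for path in careers.values():
        arcs.update(zip(path, path[1:]))  # consecutive pairs along the career
    return arcs


def two_step_walks(arcs):
    """All routes of two consecutive arcs that the aggregated graph allows."""
    return {(a, b, d) for (a, b) in arcs for (c, d) in arcs if b == c}


arcs = aggregate_arcs(careers)
real = {tuple(path) for path in careers.values()}
spurious = two_step_walks(arcs) - real

print(sorted(arcs))
print(sorted(spurious))  # routes in the drawing that no coach actually followed
```

Keeping the career identifier on each arc (for example as a color or an arc attribute) is one way a drawing could retain the path information.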
77. Retaining the paths
[Figure: graph with schools and other positions as nodes (mississippi, morgan_state, howard, cincinnati, cornell, nba, uab, coloradocollege, jackson_state, chaminade, out, binghamton, pro, south_alabama, utep, arkansas, marist, cal_poly, rhode_island, oklahoma_state, san_diego_state, boston_college, southern, tulsa) and year labels from 1969 to 2007]
Each color indicates a different person’s career
78. Static representation of trajectories
Nodes are schools. Arcs are coaches. Arrowhead points in direction of movement.
80. Over time representation
(this can be animated, of course, instead of spatial comparison)
[Figure: side-by-side panels for 2006 and 2007, each including the node "out"]
This again loses the concept of a path through the network – can’t track any coach’s trajectory
81. multigraphs representing multiple
social relations
[Figure: multigraph on nodes PETER_4, BERTH_6, BONAVEN_5, ROMUL_10, AMAND_13, BASIL_3, VICTOR_8, LOUIS_11, AMBROSE_9, JOHN_1, ELIAS_17, HUGH_14, GREG_2, SIMP_18, ALBERT_16, WINF_12]
Very hard to understand results
82. So what is the best way to represent
trajectories?
• It is the whole path to be preserved, so we can
observe things like increases in status over
time
[Figure: graph with nodes Film1 to Film13]
86. Attending more to substance issues
Types of Ties & Types of Visualization
States (continuous & enduring)
• Proximities (terrain); visualization: spatial distance
– Location: Physical distance, Distance, etc
– Membership: Same groups, Same events, etc
– Attribute: Same gender, Same attitude, etc
• Relations (roads); visualization: edges and arcs
– Role: Mother of, Friend of, boss of, student of, Competitor, etc
– Affective: Likes, Hates, etc
– Perceptual: Knows, Knows of, etc
Events (discrete & transitory)
• Interactions (processes); visualization: animation
– Sex with, Talked to, Advice to, Helped, Hurt, etc
• Flows (traffic); visualization: ???
– Information, Beliefs, Personnel, Resources, Goods, etc
87. Conclusion
• Underutilization of graph drawing in the social sciences
– Reasons are institutional & technical but not so much algorithmic
• Publication needs dominate …
• Some design possibilities not yet used well
– Motion / animation
• Some tool needs not yet well met
– Especially integration with databases
– Separation of graph from data
• Insufficient attention to substance issues
– Closeness & structural equivalence & centrality have been addressed
– Representing processes, mechanisms
• One (personal) challenge: how to best represent graph traversals ‐‐
trajectories