Communities in Networks
Peter J. Mucha
University of North Carolina
at Chapel Hill
  
[Title-slide figures: communities among college football teams; U.S. House committees; U.S. states by Congress # (10 to 110, 1807-1809 through 2007-2009), coupling = 0.2: 13 communities]
Communities in Networks
1.  What is a community, and why are communities useful?
2.  How do you calculate communities?
    •  Descriptive: e.g., Modularity
    •  Generative: e.g., Stochastic Block Models
3.  Where is community detection going in the future?

… with apologies that this presentation will seriously err on the self-absorbed side. It’s a big field, and I do not promise to know or present it all.

“Communities in Networks,” Porter, Onnela & Mucha, Notices of the American Mathematical Society 56, 1082-97 & 1164-6 (2009).

“Community Detection in Graphs,” S. Fortunato, Physics Reports 486, 75-174 (2010).
  
Acknowledgements:
•  Shankar Bhamidi, Jean Carlson, Aaron Clauset, Skyler Cranmer, James Fowler, James Gleeson, Scott Grafton, Jim Moody, Mark Newman, Andrew Nobel, Mason Porter
•  Dani Bassett, Elizabeth Leicht, Nishant Malik, Sergey Melnik, J.-P. Onnela, Serguei Saavedra
•  Dan Fenn, Elizabeth Menninga, Feng “Bill” Shi, Ashton Verdery, Simi Wang, James Wilson, Andrew Waugh
•  Thomas Callaghan, A. J. Friend, Christi Frost, Eric Kelsic, Kevin Macon, Sean Myers, Ye Pei, Scott Powers, Stephen Reid, Thomas Richardson, Mandi Traud, Casey Warmbrand, Yan Zhang
•  NSF (CAREER/REU & VIGRE), NIGMS (SNAH), JSMF (MAP/JF & PJM), Caltech SURF, UNC (AGEP, CAS, SURF)
  
  
Philosophical Disclaimer
•  Jim Moody (paraphrased): “I’ve been accused of turning everything into a network.”
•  PJM (in response): “I’m accused of turning everything into a network and a graph partitioning problem.”
•  “Structure ↔ Function”

How to extend the notion of modularity in networks to multiple networks between the same actors/units, i.e. how to properly use identity in modularity?

Images by Aaron Clauset
  
Karate Club Example
This partition optimizes modularity, which measures the number of intra-community ties (relative to randomness)
“If your method doesn’t work on this network, then go home.”
  
Karate Club Example
Brought to you by Mason Porter and The Power Law Shop
http://www.cafepress.com/thepowerlawshop
Women’s and kids’ sizes also available
“If your method doesn’t work on this network, then go home.”
  
“Cris Moore (left) is the inaugural recipient of the Zachary Karate Club Club prize, awarded on behalf of the community by Aric Hagberg (right). (9 May 2013)”
  
Facebook
Traud et al., “Comparing community structure to characteristics in online collegiate social networks” (2011)
Traud et al., “Social structure of Facebook networks” (2012)
Caltech 2005:
Colors indicate residential “House” affiliations
Purple = Not provided
  
  
[Tables on slide: Logistic Regression; zRand]
  
Roll call as a network?
Scientific Coauthorship v. Roll Call Similarities
  	
  
see Waugh et al., “Party polarization in Congress: a network science approach” (2009)
Moody & Mucha, “Portrait of political party polarization” (2013)
Parker et al., “Network Analysis Reveals Sex- and Antibiotic Resistance-Associated Antivirulence Targets in Clinical Uropathogens” (2015)
  
Communities in Networks
1.  What is a community, and why are communities useful?
2.  How do you calculate communities?
    •  Descriptive: e.g., Modularity
    •  Generative: e.g., Stochastic Block Models
3.  Where is community detection going in the future?
  
Community Detection Firehose Overview
•  Computational sledgehammer for large data
•  “Hard/rigid” v. “soft/overlapping” clusters
•  cf. biclustering methods and mathematics of expander graphs
•  A community should describe a “cohesive group,” and there are varying formulations and algorithms
    –  Linkage clustering (average, single), local clustering coefficients, betweenness (geodesic, random walk), spectral, conductance, …
•  Classic approach in CS: Spectral Graph Partitioning
    –  Need to specify number of communities sought
•  Conductance
•  MDL, Infomap, OSLOM, … (many other things I’ve missed) …
•  Modularity: a good partition has more intra-community edges than one would expect at random
•  Stochastic Block Models: a generative random graph model with different in/out probabilities between labeled groups
“Communities in Networks,” Porter, Onnela & Mucha, Notices of the American Mathematical Society 56, 1082-97 & 1164-6 (2009).
“Community Detection in Graphs,” S. Fortunato, Physics Reports 486, 75-174 (2010).
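As a rough illustration of the spectral graph partitioning bullet, the sketch below splits a graph in two by the sign of an approximate Fiedler vector (the Laplacian eigenvector with the second-smallest eigenvalue). It is not taken from the talk: the function name `fiedler_bisection`, the power-iteration approach, and the toy graph (two triangles joined by one edge) are all assumptions made for illustration, and the number of groups sought (two) is fixed in advance, as the slide notes.

```python
def fiedler_bisection(n, edges, iters=2000):
    """Split nodes 0..n-1 in two by the sign of an approximate Fiedler vector."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(a) for a in adj]
    shift = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue

    # Deterministic start vector, already orthogonal to the constant vector.
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        # y = (shift * I - L) x, where L = D - A is the graph Laplacian
        y = [(shift - deg[i]) * x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]            # project out the constant vector
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return [0 if v < 0 else 1 for v in x]


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
labels = fiedler_bisection(6, edges)
print(labels)
```

Power iteration is used here only to keep the sketch dependency-free; in practice a sparse eigensolver would be the usual choice.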
  
Images by Aaron Clauset
Structure ↔ Function/Process
“Modularity” Approach:
  
Community Detection: Null Model & Computational Heuristics
•  GOAL: Assign nodes to communities in order to maximize quality function Q
•  NP-Complete [Brandes et al. 2008] ~ enumerate possible partitions
•  Numerous packages developed/developing
    –  e.g. igraph library (R, Python), NetworkX
    –  Need appropriate null model
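To make the "enumerate possible partitions" remark concrete, here is a minimal sketch that maximizes Q by brute force on a six-node toy graph. It assumes the standard Newman-Girvan quality function written per community, Q = Σ_c [e_c/m − (d_c/2m)²], where e_c is the number of edges inside community c and d_c the total degree of its nodes. The helper names `modularity` and `partitions` and the toy graph are hypothetical. The number of partitions grows as the Bell numbers, which is why heuristics are needed beyond tiny examples.

```python
def modularity(edges, labels):
    """Newman-Girvan modularity Q of a node labelling on an undirected graph."""
    m = len(edges)
    intra = {}   # community -> number of internal edges
    degsum = {}  # community -> total degree of its nodes
    for u, v in edges:
        for w in (u, v):
            degsum[labels[w]] = degsum.get(labels[w], 0) + 1
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degsum.items())


def partitions(n):
    """Yield every partition of n nodes as a label list (restricted growth strings)."""
    def extend(labels, used):
        if len(labels) == n:
            yield list(labels)
            return
        for c in range(used + 1):          # join an existing community or open a new one
            labels.append(c)
            yield from extend(labels, max(used, c + 1))
            labels.pop()
    yield from extend([], 0)


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
count = 0
best_q, best_labels = float("-inf"), None
for labels in partitions(6):
    count += 1
    q = modularity(edges, labels)
    if q > best_q:
        best_q, best_labels = q, labels
print(count, best_labels, best_q)
```

For six nodes there are 203 partitions; for 20 nodes there are already more than 5 × 10^13, so exhaustive search is only a teaching device.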
  
	
  
Maximizing Modularity
(Newman & Girvan, PRE 2004; Newman, PRE 2004, PNAS 2006, PRE 2006)
•  Independent edges, constrained to expected degree sequence same as observed.
•  Requires P_ij = f(k_i) f(k_j), quickly yielding P_ij = k_i k_j / (2m), where m is the number of edges
•  γ resolution parameter ad hoc (default = 1)
(Reichardt & Bornholdt, PRE 2006; Lambiotte et al., arXiv 2008)
•  Resolution limit (Fortunato & Barthelemy, PNAS 2007)
•  Degenerate landscape (Good, de Montjoye & Clauset, PRE 2010)
•  Forces partition (many authors!)
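The resolution parameter and the resolution limit can be seen on a ring of cliques, a standard textbook example that is assumed here rather than taken from the slides. The sketch uses Q(γ) = Σ_c [e_c/m − γ (d_c/2m)²], which is the per-community form of (1/2m) Σ_ij (A_ij − γ k_i k_j / 2m) δ(c_i, c_j). The function names and the choice of 30 cliques of 5 nodes are illustrative assumptions.

```python
def ring_of_cliques(num_cliques, size):
    """Edge list for cliques of `size` nodes arranged in a ring, adjacent cliques joined by one edge."""
    edges = []
    for c in range(num_cliques):
        nodes = range(c * size, (c + 1) * size)
        edges += [(u, v) for u in nodes for v in nodes if u < v]
        # one edge from the last node of this clique to the first node of the next
        edges.append((c * size + size - 1, ((c + 1) % num_cliques) * size))
    return edges


def modularity(edges, labels, gamma=1.0):
    """Modularity with resolution parameter gamma and the k_i k_j / 2m null model."""
    m = len(edges)
    intra, degsum = {}, {}
    for u, v in edges:
        for w in (u, v):
            degsum[labels[w]] = degsum.get(labels[w], 0) + 1
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1
    return sum(intra.get(c, 0) / m - gamma * (d / (2 * m)) ** 2 for c, d in degsum.items())


edges = ring_of_cliques(30, 5)
one_per_clique = [i // 5 for i in range(150)]   # each clique is its own community
paired = [i // 10 for i in range(150)]          # adjacent cliques merged two by two

for gamma in (1.0, 2.0):
    print(gamma, modularity(edges, one_per_clique, gamma), modularity(edges, paired, gamma))
```

At the default γ = 1 the merged-pairs partition scores higher than the intuitive one-community-per-clique partition, while at γ = 2 the ordering reverses. This is one reason the choice of γ matters and why it is described as ad hoc.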
  
Fenn et al., Chaos 2009
Macon, PJM & MAP, Physica A 2012
  
Community Detection: Other Models
•  Erdos-Renyi (Bernoulli)
•  Newman-Girvan*
•  Leicht-Newman* (directed)
•  Barber* (bipartite)
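For reference, the null-model terms P_ij usually associated with these four names can be written side by side. This is a summary from the general literature, not a transcription of the slide (whose formulas were images and whose asterisks are not explained in the extracted text). The symbols are assumptions: m is the number of edges, k_i the degree of node i, and, for the bipartite case, k_i and d_j the degrees of nodes in the two node classes. The directed form assumes the convention that A_ij = 1 for an edge from j to i.

```latex
% Quality function: Q \propto \sum_{ij} (A_{ij} - P_{ij}) \, \delta(c_i, c_j)
\begin{align*}
\text{Erdos-Renyi (Bernoulli):}\quad & P_{ij} = p \quad \text{(a constant, e.g. the observed edge density)} \\
\text{Newman-Girvan:}\quad & P_{ij} = \frac{k_i k_j}{2m} \\
\text{Leicht-Newman (directed):}\quad & P_{ij} = \frac{k_i^{\mathrm{in}} \, k_j^{\mathrm{out}}}{m} \\
\text{Barber (bipartite):}\quad & P_{ij} = \frac{k_i d_j}{m} \ \text{for $i$, $j$ in different node classes, and } 0 \text{ otherwise}
\end{align*}
```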
  
Political Blogs (Adamic & Glance, WWW-2005)
“On closer inspection, we find that the method [(a)] fails in this case because it does not take into account the wide variation among the degrees of nodes in the network. In this network (and many others) degrees vary over a great range, whereas degrees in the block model are Poisson distributed and narrowly peaked about their mean. This means, in effect, that there is no choice of parameters for the model that gives a good fit to the data. Fitting this block model is similar to fitting a straight line through an inherently curved set of data points – you can do it, but it is unlikely to give you a meaningful answer.” (Newman, Nature Physics 2012)

Similar visualizations from different models in Amini et al., arXiv (2012)

Bottom Right: Partitions v. overlap & extraction (Wilson et al. in prep)
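The point in the quotation about narrowly peaked degrees can be checked numerically. The sketch below samples a plain (not degree-corrected) two-block Bernoulli block model and compares the degree variance to the degree mean; for Poisson-like degrees the two are close. The block sizes, probabilities, seed, and the function name `sample_sbm_degrees` are arbitrary assumptions chosen for illustration.

```python
import random


def sample_sbm_degrees(sizes, p_in, p_out, seed=0):
    """Sample an undirected Bernoulli block model and return the degree of every node."""
    rng = random.Random(seed)
    block = [b for b, size in enumerate(sizes) for _ in range(size)]
    n = len(block)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if block[i] == block[j] else p_out
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree


degree = sample_sbm_degrees([200, 200], p_in=0.05, p_out=0.01)
mean = sum(degree) / len(degree)
var = sum((d - mean) ** 2 for d in degree) / len(degree)
print(f"mean degree {mean:.2f}, variance {var:.2f}, max degree {max(degree)}")
```

A heavy-tailed empirical network would show a variance far larger than its mean and a maximum degree many times the mean, which this model cannot reproduce for any choice of p_in and p_out.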
  
Fortunato & Barthelemy, PNAS 2007
Ball, Karrer & Newman, PRE 2011
  
Louvain (Blondel et al., J. Stat. Mech. 2008)
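A minimal sketch of the greedy "local moving" idea behind Louvain-style heuristics follows. It covers only the first phase: each node repeatedly moves to the neighbouring community that most increases modularity. The published method also aggregates communities into super-nodes and repeats, which is omitted here. The function name `local_moving`, the fixed node order, and the toy graph are assumptions for illustration, not the reference implementation.

```python
from collections import defaultdict


def local_moving(n, edges, gamma=1.0, max_sweeps=100):
    """Greedy node moves that increase modularity; returns one community label per node."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = [len(adj[i]) for i in range(n)]
    two_m = sum(deg)
    label = list(range(n))     # start with every node in its own community
    tot = deg[:]               # total degree of each community
    for _ in range(max_sweeps):
        moved = False
        for i in range(n):
            old = label[i]
            tot[old] -= deg[i]                    # take node i out of its community
            links = defaultdict(int)              # community -> edges from i into it
            for j in adj[i]:
                links[label[j]] += 1
            # Modularity gain of joining c is proportional to links_c - gamma * k_i * tot_c / 2m.
            best = old
            best_gain = links[old] - gamma * deg[i] * tot[old] / two_m
            for c, l in links.items():
                gain = l - gamma * deg[i] * tot[c] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            label[i] = best
            tot[best] += deg[i]
            if best != old:
                moved = True
        if not moved:
            break
    return label


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
labels = local_moving(6, edges)
print(labels)
```

The result of such heuristics generally depends on the node visiting order, which is one face of the degenerate modularity landscape mentioned earlier.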
  
Other great codes to know:
http://www.mapequation.org/
https://graph-tool.skewed.de/
  	
  
InfoMap (Rosvall & Bergstrom 2008)
  
OSLOM (Lancichinetti et al., PLoS One 2011)
•  Score: Significance
•  “Homeless” vertices
•  Overlap
•  Cluster hierarchy
•  Because of the way the algorithm evolves clusters, it can naturally be used for temporal network data.
  
Conductance & NCP Plots (Leskovec, Mahoney, …)
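For readers unfamiliar with the terms, conductance of a node set S is commonly defined as the number of edges leaving S divided by the smaller of the total degree inside S and the total degree outside it. A network community profile (NCP) records the best conductance found at each set size. The sketch below computes both by brute force on a toy graph. The exhaustive search, the function names, and the graph are assumptions for illustration only; real NCP plots rely on approximation algorithms.

```python
from itertools import combinations


def conductance(edges, subset):
    """Cut size divided by the smaller of the two volumes (sums of degrees)."""
    s = set(subset)
    cut = vol_s = vol_rest = 0
    for u, v in edges:
        for w in (u, v):
            if w in s:
                vol_s += 1
            else:
                vol_rest += 1
        if (u in s) != (v in s):
            cut += 1
    return cut / min(vol_s, vol_rest)


def ncp(n, edges):
    """Brute-force community profile: lowest conductance at each set size up to n // 2."""
    return {
        k: min(conductance(edges, s) for s in combinations(range(n), k))
        for k in range(1, n // 2 + 1)
    }


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
profile = ncp(6, edges)
print(profile)
```

The profile dips at size 3, where one whole triangle is separated from the rest by a single edge.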
  
Stochastic Block Models
R: Mixer    Python: Graph-Tool
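To show what "fitting" a block model to a fixed partition means in its simplest form, the sketch below evaluates the Bernoulli SBM log-likelihood of a given node labelling, with each block-pair edge probability set to its maximum-likelihood value (edges observed divided by pairs possible). It is a bare-bones illustration under those assumptions and does not reflect how the packages named on the slide work internally. The function name and the toy graph are hypothetical.

```python
from math import log


def sbm_log_likelihood(edges, labels):
    """Profile log-likelihood of an undirected Bernoulli SBM for integer `labels`."""
    sizes = {}
    for g in labels:
        sizes[g] = sizes.get(g, 0) + 1
    counts = {}   # (group r, group s) with r <= s -> number of edges observed
    for u, v in edges:
        key = tuple(sorted((labels[u], labels[v])))
        counts[key] = counts.get(key, 0) + 1
    groups = sorted(sizes)
    total = 0.0
    for a, r in enumerate(groups):
        for s in groups[a:]:
            pairs = sizes[r] * (sizes[r] - 1) // 2 if r == s else sizes[r] * sizes[s]
            present = counts.get((r, s), 0)
            for k in (present, pairs - present):   # edges present, then edges absent
                if k:                               # treat 0 * log(0) as 0
                    total += k * log(k / pairs)
    return total


# Hypothetical toy graph: triangle 0-1-2 and triangle 3-4-5 joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
planted = sbm_log_likelihood(edges, [0, 0, 0, 1, 1, 1])
scrambled = sbm_log_likelihood(edges, [0, 1, 1, 0, 0, 1])
print(planted, scrambled)
```

The labelling that matches the two triangles has the higher likelihood. An actual inference code must also search over labellings and guard against overfitting as the number of groups grows.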
  
  	
  
At the most general level…
Two related but different issues to keep straight:
1.  Theoretical Concept (e.g., “Modularity”, “Map Equation”, “Stochastic Block Models”)
2.  Computational Heuristic & Implementation (e.g. “Fast Greedy”, “Louvain”, “Iterative Improvement”, or the specific SBM code [possible initialization issues with some])
And, finally, how do you compare communities?
  
Comparing Partitions
(e.g. Section 15.2 of Fortunato 2010)
R x C Contingency Table:
1.  Cluster Matching
    –  Requires injection
2.  Pair Counting
    –  “Adjusted” v. “Standardized”
3.  Information Theory
    –  Variation of Information, Normalized Mutual Information
  
Information-Theoretic Comparisons
(e.g. Section 15.2 of Fortunato 2010)
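The two information-theoretic measures named in the talk can be computed directly from the contingency table of two labellings. The sketch below assumes the common definitions VI = H(A|B) + H(B|A) and NMI = 2 I(A;B) / (H(A) + H(B)), with natural logarithms. Other normalizations of mutual information exist, and the function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(counts, n):
    """Shannon entropy (in nats) of a distribution given as a table of counts."""
    return -sum(c / n * log(c / n) for c in counts.values())


def compare_partitions(a, b):
    """Return (variation of information, normalized mutual information)."""
    n = len(a)
    h_a = entropy(Counter(a), n)
    h_b = entropy(Counter(b), n)
    h_ab = entropy(Counter(zip(a, b)), n)    # joint entropy from the contingency table
    mutual = h_a + h_b - h_ab
    vi = h_ab - mutual                       # equals H(A|B) + H(B|A)
    nmi = 2 * mutual / (h_a + h_b) if h_a + h_b > 0 else 1.0
    return vi, nmi


# Same grouping with the community names swapped, then two unrelated groupings.
vi_same, nmi_same = compare_partitions([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])
vi_indep, nmi_indep = compare_partitions([0, 0, 1, 1], [0, 1, 0, 1])
print(vi_same, nmi_same, vi_indep, nmi_indep)
```

Both measures ignore how communities are named, so a relabelled copy of a partition counts as a perfect match.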
  
Pair Counting & Standardization
(see, e.g., Traud et al., SIAM Review 2011)
w_αβ counts pairs of nodes: α & β are binary indicators for same/different community in each of the two partitions
•  Rand, Jaccard, Minkowski, Fowlkes-Mallows, …
•  “Adjusted”: center on mean with perfect match = 1
•  “Standardized” by stdev, expressed as z-score
•  Linear in w11 → equal z
•  Monotonic in w11 → equal p
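The pair counts and a few of the scores built from them can be sketched as follows. The code assumes the usual definitions: Rand = (w11 + w00) / M, Jaccard = w11 / (w11 + w10 + w01), and the adjusted Rand index centred on the expected w11 under fixed marginals, where M is the total number of node pairs. The analytical z-score standardization of Traud et al. is not reproduced here. Function names and the example labellings are illustrative assumptions.

```python
from itertools import combinations


def pair_counts(a, b):
    """w[(alpha, beta)]: alpha = 1 if a pair shares a community in a, beta likewise for b."""
    w = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for i, j in combinations(range(len(a)), 2):
        w[(int(a[i] == a[j]), int(b[i] == b[j]))] += 1
    return w


def similarity_scores(a, b):
    """Rand, Jaccard and adjusted Rand scores (undefined for degenerate partitions)."""
    w = pair_counts(a, b)
    w11, w10, w01, w00 = w[(1, 1)], w[(1, 0)], w[(0, 1)], w[(0, 0)]
    total = w11 + w10 + w01 + w00
    m1, m2 = w11 + w10, w11 + w01          # pairs grouped together in a, in b
    expected = m1 * m2 / total             # mean of w11 with m1 and m2 held fixed
    return {
        "rand": (w11 + w00) / total,
        "jaccard": w11 / (w11 + w10 + w01),
        "adjusted_rand": (w11 - expected) / ((m1 + m2) / 2 - expected),
    }


a = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 1, 1]
counts = pair_counts(a, b)
scores = similarity_scores(a, b)
print(counts, scores)
```

All three scores here are functions of w11 once the marginals m1 and m2 are fixed, which is the observation behind the last two bullets.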
  
  
  
Communities in Networks
1.  What is a community, and why are communities useful?
2.  How do you calculate communities?
    •  Modularity, Stochastic Block Models, Infomap
3.  Where is community detection going in the future?
  
Multilayer Networks
[Figure panels: Ordered; Categorical]
Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010)
Kivelä et al., “Multilayer Networks” (2014)
  
Multilayer Modularity Derivation
•  Generalized Lambiotte et al. (2008) connection between modularity and autocorrelation under Laplacian dynamics to rederive null models for bipartite (Barber), directed (Leicht-Newman), and signed (Traag et al.) networks, via one-step conditional probabilities
[Equation annotations: intra-slice adjacency data and null; inter-slice identity arcs]
Same formalism works for more general multilayer networks, with sum over inter-layer connections within same community
Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010)
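The quality function from Mucha et al. (2010) is usually written as Q = (1/2μ) Σ_{ijsr} [(A_ijs − γ_s k_is k_js / 2m_s) δ_sr + δ_ij C_jsr] δ(g_is, g_jr). The first term is the intra-slice adjacency data and null model, and the second is the inter-slice identity coupling. The sketch below evaluates it for a given labelling under simplifying assumptions that are mine, not the slide's: unweighted undirected slices, a single γ for all slices, and ordinal coupling of strength ω between each node and itself in adjacent slices only. The function name and test graph are hypothetical.

```python
def multilayer_modularity(layers, labels, gamma=1.0, omega=1.0):
    """layers[s]: edge list of slice s; labels[s][i]: community of node i in slice s."""
    n = len(labels[0])
    total = 0.0    # sum over all within-community terms
    two_mu = 0.0   # total intra-slice plus inter-slice strength
    for s, edges in enumerate(layers):
        m_s = len(edges)
        two_mu += 2 * m_s
        intra, degsum = {}, {}
        for u, v in edges:
            for w in (u, v):
                c = labels[s][w]
                degsum[c] = degsum.get(c, 0) + 1
            if labels[s][u] == labels[s][v]:
                c = labels[s][u]
                intra[c] = intra.get(c, 0) + 1
        # sum_ij (A_ijs - gamma * k_is * k_js / 2 m_s) * delta(g_is, g_js)
        total += sum(2 * intra.get(c, 0) - gamma * d * d / (2 * m_s)
                     for c, d in degsum.items())
    for s in range(len(layers) - 1):
        two_mu += 2 * omega * n     # each coupling is counted from both of its slices
        same = sum(labels[s][i] == labels[s + 1][i] for i in range(n))
        total += 2 * omega * same   # C_jsr and C_jrs both contribute
    return total / two_mu


# Two identical slices of a hypothetical toy graph (two triangles joined by one edge).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
consistent = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1]]
relabelled = [[0, 0, 0, 1, 1, 1], [2, 2, 2, 3, 3, 3]]
q_consistent = multilayer_modularity([edges, edges], consistent, omega=1.0)
q_relabelled = multilayer_modularity([edges, edges], relabelled, omega=1.0)
q_uncoupled = multilayer_modularity([edges, edges], consistent, omega=0.0)
print(q_consistent, q_relabelled, q_uncoupled)
```

With ω > 0, giving a node the same community label in adjacent slices is rewarded, so communities can be tracked across slices. With ω = 0 the slices decouple and the value reduces to a weighted average of single-slice modularities.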
  
110 Senates (two-year Congresses)
PJM & MAP, Chaos 2010
See mapequation.org
  
“Multilayer Stochastic Block Model”
Strata MLSBM (sMLSBM)
Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (to appear)
[Algorithm schematic. Initialization: k-means cluster L layers into S strata. Iterative Process: k-means cluster 2L layers into S strata; update number of strata to the number of unique clustering patterns according to (1) and (2).]
sMLSBM on SparCC microbial interactions
  
Summary
•  Community detection is an exploratory tool that can provide a simplified high-level view of the organization of a network.
•  There are many methods. Don’t tie yourself down to one method: good clusters should be robust, and (hopefully) your story shouldn’t depend on the precise method (or, if it does, you should understand why).
•  Many of these methods have parameters, and it is important to know about them for best use.
•  Multilayer networks are very general. There are relatively few options currently available for finding communities in multilayer network data, but this area will expand rapidly.
  
  	
  

05 Communities in Networks (2016)

  • 1.
    Communi'es  in  Networks   Peter  J.  Mucha   University  of  North  Carolina   at  Chapel  Hill   0.021086 p = 0.7 Virginia Maryland FloridaStateDuke NorthCarolinaState WakeForest ClemsonGeorgiaTech North Carolina TexasTech TexasA&M Baylor Texas O klahom a O klahom a State C olorado Kansas State Iowa State Nebraska Missouri Kansas Utah State Colorado State Utah Brigham Young Wyoming Air Force Nevada−Las Vegas New Mexico San Diego State Tulsa Texas−El PasoSouthern MethodistFresno StateNevadaHawaiiSan Jose State Louisiana Tech RiceBoise State Alabama−Birmingham Louisville M em phis Cincinnati H ouston EastCarolina Tulane Southern Mississippi Army Non−DivisionIA TexasChristian CentralFlorida SouthFlorida TroyState NewMexicoState Louisiana−Lafayette ArkansasState NorthTexas Louisiana−Monroe Idaho MiddleTennesseeState Arkansas Florida Georgia Tennessee Kentucky SouthCarolina Vanderbilt LouisianaState Mississippi MississippiState Auburn Alabam a W ashington State W ashington UCLA Southern California Oregon State Oregon Arizona State Stanford CaliforniaArizona Miami (Florida)SyracuseTemple RutgersBoston College Pittsburgh West Virginia Virginia Tech Navy Notre Dame Purdue Ohio State Penn State Indiana Wisconsin Illinois Michigan Northwestern Iowa M innesota M ichigan State Connecticut M iam i(Ohio)Kent Marshall AkronBuffaloOhio BowlingGreenState CentralMichigan EasternMichigan WesternMichigan Toledo BallState NorthernIllinois AGRICULTURE APPROPRIATIONS INTERNATIONAL RELATIONS BUDGET HOUSE ADMINISTRATION ENERGY/COMMERCE FINANCIAL SERVICES VETERANS’ AFFAIRS EDUCATION ARMED SERVICES JUDICIARY RESOURCES RULES SCIENCE SMALL BUSINESS OFFICIAL CONDUCT TRANSPORTATION GOVERNMENT REFORM WAYS AND MEANS INTELLIGENCE HOMELAND SECURITY 10 20 30 40 50 60 70 80 90 100 110 CT MEMA NHRI VT DE NJNY PAIL IN MI OHWI IAKS MNMO NENDSD VA ALAR FLGA LAMSNC SC TXKY MDOKTN WVAZCO IDMT NVNM UT WYCAOR WAAK HI Congress # Coupling = 0.2: 13 communities 1917D, 122R, 13other 36PA, 15F, 6AA 373D, 
162J, 75other 1615R, 220W, 163F, 97AJ, 273other 605R, 109D, 6other 105DR, 1F 1256D, 140R, 62other 13PA, 4AA 67DR, 7F 66D, 2W, 1FS 105R, 44D 145DR, 28AA, 6F, 5PA 941R, 159D, 7I, 3C 1807−1809 1827−1829 1847−1849 1867−1869 1927−1929 1947−1949 1967−1969 1987−1989 2007−2009
  • 2.
    Communi'es  in  Networks   1.  What  is  a  community  and  why  are  they  useful?   2.  How  do  you  calculate  communi'es?   •  Descrip've:  e.g.,  Modularity   •  Genera've:  e.g.,  Stochas'c  Block  Models   3.  Where  is  community  detec'on  going  in  the  future?     …  with  apologies  that  this  presenta0on  will  seriously   err  on  the  self-­‐absorbed  side.  It’s  a  big  field,  and  I  do     not  promise  to  know  nor  present  it  all.     “Communi'es  in  Networks,”  Porter,  Onnela  &  Mucha,   No0ces  of  the  American  Mathema0cal  Society  56,  1082-­‐97  &  1164-­‐6  (2009).     “Community  Detec'on  in  Graphs,”  S.  Fortunato,     Physics  Reports  486,  75-­‐174  (2010).  
  • 3.
    Acknowledgements:   •   Shankar  Bhamidi,  Jean  Carlson,  Aaron  Clauset,     Skyler  Cranmer,  James  Fowler,  James  Gleeson,  Sco[  Graon,   Jim  Moody,  Mark  Newman,  Andrew  Nobel,  Mason  Porter   •  Dani  Basse[,  Elizabeth  Leicht,  Nishant  Malik,  Sergey  Melnik,   J.-­‐P.  Onnela,  Serguei  Saavedra   •  Dan  Fenn,  Elizabeth  Menninga,  Feng  “Bill”  Shi,   Ashton  Verdery,  Simi  Wang,  James  Wilson,  Andrew  Waugh     •   Thomas  Callaghan,  A.  J.  Friend,  Chris'  Frost,  Eric  Kelsic,     Kevin  Macon,  Sean  Myers,  Ye  Pei,  Sco[  Powers,     Stephen  Reid,  Thomas  Richardson,  Mandi  Traud,     Casey  Warmbrand,  Yan  Zhang   •  NSF  (CAREER/REU  &  VIGRE),  NIGMS  (SNAH),     JSMF  (MAP/JF  &  PJM),  Caltech  SURF,  UNC  (AGEP,  CAS,  SURF)  
  • 4.
    Communi'es  in  Networks   1.  What  is  a  community  and  why  are  they  useful?   2.  How  do  you  calculate  communi'es?   •  Descrip've:  e.g.,  Modularity   •  Genera've:  e.g.,  Stochas'c  Block  Models   3.  Where  is  community  detec'on  going  in  the  future?     …  with  apologies  that  this  presenta0on  will  seriously   err  on  the  self-­‐absorbed  side.  It’s  a  big  field,  and  I  do     not  promise  to  know  nor  present  it  all.     “Communi'es  in  Networks,”  Porter,  Onnela  &  Mucha,   No0ces  of  the  American  Mathema0cal  Society  56,  1082-­‐97  &  1164-­‐6  (2009).     “Community  Detec'on  in  Graphs,”  S.  Fortunato,     Physics  Reports  486,  75-­‐174  (2010).  
  • 5.
    •  Jim  Moody  (paraphrased):  “I’ve  been  accused  of   turning  everything  into  a  network.”   •  PJM  (in  response):  “I’m  accused  of  turning  everything   into  a  network  and  a  graph  par''oning  problem.”   •  “Structure  ßà  Func0on”                    How  to  extend  the  no+on  of  modularity  in  networks   to  mul+ple  networks  between  the  same  actors/units,   i.e.  how  to  properly  use  iden+ty  in  modularity?   Philosophical  Disclaimer   Images  by  Aaron  Clauset  
  • 6.
    Karate  Club  Example   This  par''on  op'mizes  modularity,  which  measures  the   number  of  intra-­‐community  'es  (rela've  to  randomness)   “If  your  method  doesn’t  work  on  this  network,  then  go  home.”  
  • 7.
    Karate  Club  Example   Brought  to  you  by  Mason  Porter  and  The  Power  Law  Shop   h[p://www.cafepress.com/thepowerlawshop   Women’s  and  kids’  sizes  also  available   “If  your  method  doesn’t  work  on  this  network,  then  go  home.”  
  • 8.
    “Cris  Moore  (leJ)  is  the   inaugural  recipient  of  the   Zachary  Karate  Club  Club  prize,   awarded  on  behalf  of  the   community  by  Aric  Hagberg   (right).  (9  May  2013)”  
  • 9.
    Facebook   Traud  et  al.,  “Comparing  community  structure  to   characteris'cs  in  online  collegiate  social  networks”  (2011)   Traud  et  al.  “Social  structure  of  Facebook  networks”  (2012)   Caltech  2005:   Colors  indicate  residen'al   “House”  affilia'ons   Purple  =  Not  provided  
  • 10.
    Facebook   Caltech  2005:   Colors  indicate  residen'al   “House”  affilia'ons   Purple  =  Not  provided   Traud  et  al.,  “Comparing  community  structure  to   characteris'cs  in  online  collegiate  social  networks”  (2011)   Traud  et  al.  “Social  structure  of  Facebook  networks”  (2012)  
  • 11.
    Facebook   Caltech  2005:   Colors  indicate  residen'al   “House”  affilia'ons   Purple  =  Not  provided   Traud  et  al.,  “Comparing  community  structure  to   characteris'cs  in  online  collegiate  social  networks”  (2011)   Traud  et  al.  “Social  structure  of  Facebook  networks”  (2012)  
  • 12.
    Facebook   Caltech  2005:   Colors  indicate  residen'al   “House”  affilia'ons   Purple  =  Not  provided   Traud  et  al.,  “Comparing  community  structure  to   characteris'cs  in  online  collegiate  social  networks”  (2011)   Traud  et  al.  “Social  structure  of  Facebook  networks”  (2012)   Logis'c  Regression:       zRand:  
  • 13.
    Roll  call  as  a  network?    Scien'fic  Coauthorship              v.              Roll  Call  Similari'es    
  • 14.
    see  Waugh  et  al.,  “Party  polariza'on  in  Congress:  a  network  science  approach”  (2009)  
  • 15.
    see  Waugh  et  al.,  “Party  polariza'on  in  Congress:  a  network  science  approach”  (2009)  
  • 16.
    Moody  &  Mucha,  “Portrait  of  poli'cal  party  polariza'on”  (2013)  
  • 17.
    Parker  et  al.,  “Network  Analysis  Reveals  Sex-­‐  and  An'bio'c  Resistance-­‐ Associated  An'virulence  Targets  in  Clinical  Uropathogens”  (2015)  
  • 18.
    Parker  et  al.,  “Network  Analysis  Reveals  Sex-­‐  and  An'bio'c  Resistance-­‐ Associated  An'virulence  Targets  in  Clinical  Uropathogens”  (2015)  
  • 19.
    Communi'es  in  Networks   1.  What  is  a  community  and  why  are  they  useful?   2.  How  do  you  calculate  communiBes?   •  DescripBve:  e.g.,  Modularity   •  GeneraBve:  e.g.,  StochasBc  Block  Models   3.  Where  is  community  detec'on  going  in  the  future?     …  with  apologies  that  this  presenta0on  will  seriously   err  on  the  self-­‐absorbed  side.  It’s  a  big  field,  and  I  do     not  promise  to  know  nor  present  it  all.     “Communi'es  in  Networks,”  Porter,  Onnela  &  Mucha,   No0ces  of  the  American  Mathema0cal  Society  56,  1082-­‐97  &  1164-­‐6  (2009).     “Community  Detec'on  in  Graphs,”  S.  Fortunato,     Physics  Reports  486,  75-­‐174  (2010).  
  • 20.
    Community  Detec'on  Firehose  Overview   •  Computa'onal  sledgehammer  for  large  data   •  “Hard/rigid”  v.  “so/overlapping”  clusters   •  cf.  biclustering  methods  and  mathema'cs  of  expander  graphs   •  A  community  should  describe  a  “cohesive  group,”  and  there  are   varying  formula'ons  and  algorithms   –  Linkage  clustering  (average,  single),  local  clustering  coefficients,   betweeness  (geodesic,  random  walk),  spectral,  conductance,…   •  Classic  approach  in  CS:    Spectral  Graph  Par''oning   –  Need  to  specify  number  of  communi'es  sought   •  Conductance   •  MDL,  Infomap,  OSLOM,  …  (many  other  things  I’ve  missed)  …   •  Modularity:    a  good  par''on  has  more  intra-­‐community  edges  than   one  would  expect  at  random   •  Stochas'c  Block  Models:    a  genera've  random  graph  model  with   different  in/out  probabili'es  between  labeled  groups   “Communi'es  in  Networks,”  Porter,  Onnela  &  Mucha,   No0ces  of  the  American  Mathema0cal  Society  56,  1082-­‐97  &  1164-­‐6  (2009).     “Community  Detec'on  in  Graphs,”  S.  Fortunato,  Physics  Reports  486,  75-­‐174  (2010).  
  • 21.
    Images  by  Aaron  Clauset   Structure  ßà  Func'on/Process   “Modularity”  Approach:  
  • 22.
    Community  Detec'on:    Null  Model  &   Computa'onal  Heuris'cs   •  GOAL:    Assign  nodes  to  communi'es  in  order  to   maximize  quality  func'on  Q   •  NP-­‐Complete  [Brandes  et  al.  2008]   ~  enumerate  possible  par''ons   •  Numerous  packages  developed/developing   –  e.g.  igraph  library  (R,  python),  NetworkX   – Need  appropriate  null  model    
  • 23.
    Maximizing  Modularity   (Newman  &  Girvan,  PRE  2004;  Newman,  PRE  2004,  PNAS  2006,  PRE  2006)   •  Independent  edges,  constrained  to  expected   degree  sequence  same  as  observed.   •  Requires  Pij  =  f(ki)f(kj),  quickly  yielding   •  γ  resolu'on  parameter  ad  hoc  (default  =  1)   (Reichardt  &  Bornholdt,  PRE  2006;    Lambio[e  et  al.,  arXiv  2008)   •  Resolu0on  limit  (Fortunato  &  Barthelemy,  PNAS  2007)   Degenerate  landscape  (Good,  de  Montjoye  &  Clauset,  PRE  2010)   Forces  par00on  (many  authors!)  
  • 24.
    Fenn et al., Chaos 2009; Macon, PJM & MAP, Physica A 2012
  • 25.
    Community Detection: Other Models
    • Erdős-Rényi (Bernoulli)
    • Newman-Girvan*
    • Leicht-Newman* (directed)
    • Barber* (bipartite)
  • 26.
    Political Blogs (Adamic & Glance, WWW-2005)
    “On closer inspection, we find that the method [(a)] fails in this case because it does not take into account the wide variation among the degrees of nodes in the network. In this network (and many others) degrees vary over a great range, whereas degrees in the block model are Poisson distributed and narrowly peaked about their mean. This means, in effect, that there is no choice of parameters for the model that gives a good fit to the data. Fitting this block model is similar to fitting a straight line through an inherently curved set of data points—you can do it, but it is unlikely to give you a meaningful answer.” (Newman, Nature Physics 2012)
    Similar visualizations from different models in Amini et al., arXiv (2012)
    Bottom Right: Partitions v. overlap & extraction (Wilson et al. in prep)
  • 27.
    Fortunato & Barthelemy, PNAS 2007; Ball, Karrer & Newman, PRE 2011
  • 29.
    Louvain (Blondel et al., J. Stat. Mech. 2008)
  • 30.
    Other great codes to know:
    http://www.mapequation.org/
    https://graph-tool.skewed.de/
  • 31.
    InfoMap (Rosvall & Bergstrom 2008)
  • 32.
    OSLOM (Lancichinetti et al., PLoS One 2011)
    • Score: Significance
    • “Homeless” vertices
    • Overlap
    • Cluster hierarchy
    • Because of the way the algorithm evolves clusters, it can naturally be used for temporal network data.
  • 33.
    Conductance & NCP Plots (Leskovec, Mahoney, …)
  • 34.
    Stochastic Block Models
    R: Mixer; Python: Graph-Tool
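    For readers who want to see the generative model itself, here is a minimal sketch of sampling an undirected graph from a plain (not degree-corrected) stochastic block model: each pair of nodes is linked independently with a probability that depends only on the two group labels. The function `sample_sbm` and its arguments are invented for this illustration and are not the Mixer or graph-tool API; those packages solve the harder inverse problem of inferring the groups from an observed network.

```python
import random


def sample_sbm(block_of, probs, seed=None):
    """Sample an undirected simple graph from a stochastic block model.

    block_of -- list where block_of[i] is the integer group label of node i
    probs    -- symmetric matrix; probs[r][s] is the edge probability
                between a node in group r and a node in group s
    seed     -- optional seed for reproducibility
    Returns a list of edges (i, j) with i < j.
    """
    rng = random.Random(seed)
    n = len(block_of)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = probs[block_of[i]][block_of[j]]
            # Independent Bernoulli trial for every node pair.
            if rng.random() < p:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Hypothetical example: two groups of 5 nodes, dense inside, sparse between.
    labels = [0] * 5 + [1] * 5
    probs = [[0.8, 0.05],
             [0.05, 0.8]]
    for edge in sample_sbm(labels, probs, seed=1):
        print(edge)
```

    Because every node in a group has the same connection probabilities, degrees within a group are narrowly distributed in this plain model.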
  • 37.
    At the most general level…
    Two related but different issues to keep straight:
    1. Theoretical Concept (e.g., “Modularity”, “Map Equation”, “Stochastic Block Models”)
    2. Computational Heuristic & Implementation (e.g. “Fast Greedy”, “Louvain”, “Iterative Improvement”, or the specific SBM code [possible initialization issues with some])
    And, finally, how do you compare communities?
  • 38.
    Comparing Partitions
    (e.g. Section 15.2 of Fortunato 2010)
    R x C Contingency Table:
    1. Cluster Matching
      – Requires injection
    2. Pair Counting
      – “Adjusted” v. “Standardized”
    3. Information Theory
      – Variation of Information, Normalized Mutual Information
  • 39.
    Information-Theoretic Comparisons
    (e.g. Section 15.2 of Fortunato 2010)
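    A minimal sketch of the two information-theoretic measures named above, computed from the R x C contingency table of two label vectors. The function names are chosen for this example, and the normalization 2I / (H(A) + H(B)) is one common convention for normalized mutual information, assumed here rather than taken from the slide; other normalizations exist.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of the group-size distribution."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two partitions of the same nodes."""
    n = len(a)
    joint = Counter(zip(a, b))            # sparse R x C contingency table
    count_a, count_b = Counter(a), Counter(b)
    return sum((nij / n) * log(nij * n / (count_a[i] * count_b[j]))
               for (i, j), nij in joint.items())


def variation_of_information(a, b):
    """VI = H(A) + H(B) - 2 I(A;B); zero iff the partitions coincide."""
    return entropy(a) + entropy(b) - 2 * mutual_information(a, b)


def normalized_mutual_information(a, b):
    """NMI = 2 I(A;B) / (H(A) + H(B)); assumed normalization, see lead-in."""
    h_sum = entropy(a) + entropy(b)
    if h_sum == 0:
        return 1.0   # both partitions put every node in one group
    return 2 * mutual_information(a, b) / h_sum


if __name__ == "__main__":
    same = ([0, 0, 1, 1], [1, 1, 0, 0])          # identical up to relabeling
    independent = ([0, 0, 1, 1], [0, 1, 0, 1])   # carry no shared information
    for a, b in (same, independent):
        print(variation_of_information(a, b), normalized_mutual_information(a, b))
```

    Both measures depend only on how nodes are grouped, not on the label names, so no matching of communities between the two partitions is needed.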
  • 40.
    Pair Counting & Standardization
    (see, e.g., Traud et al., SIAM Review 2011)
    w_αβ counts: α & β binary indicator for same/different
    • Rand, Jaccard, Minkowski, Fowlkes-Mallows, …
    • “Adjusted”: center on mean with perfect match = 1
    • “Standardized” by stdev, expressed as z-score
    • Linear in w_11 → equal z
    • Monotonic in w_11 → equal p
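    The pair counts and a few of the scores listed above can be sketched as follows. Here w11 counts node pairs placed together in both partitions, w10 and w01 pairs together in only one of them, and w00 pairs separated in both. The function names are made up for this example, the adjusted Rand index uses the usual Hubert-Arabie form, and the standardized z-scores are not attempted because they need the variance of w11. The sketch assumes each partition has at least one co-assigned pair and that the two are not both trivial, otherwise some denominators vanish.

```python
from itertools import combinations
from math import sqrt


def pair_counts(a, b):
    """Return (w11, w10, w01, w00) over all unordered node pairs."""
    w11 = w10 = w01 = w00 = 0
    for i, j in combinations(range(len(a)), 2):
        same_a = a[i] == a[j]
        same_b = b[i] == b[j]
        if same_a and same_b:
            w11 += 1
        elif same_a:
            w10 += 1
        elif same_b:
            w01 += 1
        else:
            w00 += 1
    return w11, w10, w01, w00


def pair_counting_scores(a, b):
    """Rand, Jaccard, Fowlkes-Mallows and adjusted Rand for two partitions."""
    w11, w10, w01, w00 = pair_counts(a, b)
    total = w11 + w10 + w01 + w00
    pairs_a = w11 + w10            # pairs co-assigned in partition a
    pairs_b = w11 + w01            # pairs co-assigned in partition b
    expected = pairs_a * pairs_b / total   # mean of w11 under random matching
    maximum = 0.5 * (pairs_a + pairs_b)
    return {
        "rand": (w11 + w00) / total,
        "jaccard": w11 / (w11 + w10 + w01),
        "fowlkes_mallows": w11 / sqrt(pairs_a * pairs_b),
        # "Adjusted": centered on the mean, rescaled so a perfect match is 1.
        "adjusted_rand": (w11 - expected) / (maximum - expected),
    }


if __name__ == "__main__":
    a = [0, 0, 0, 1, 1, 1]
    b = [0, 0, 1, 1, 2, 2]
    print(pair_counts(a, b))
    print(pair_counting_scores(a, b))
```

    The loop over all pairs is quadratic in the number of nodes; for large networks the same counts can be obtained from the contingency table instead.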
  • 42.
    Facebook Caltech 2005:
    Colors indicate residential “House” affiliations; Purple = Not provided
    Traud et al., “Comparing community structure to characteristics in online collegiate social networks” (2011)
    Traud et al., “Social structure of Facebook networks” (2012)
    Logistic Regression:
    zRand:
  • 43.
    Communities in Networks
    1. What is a community and why are they useful?
    2. How do you calculate communities?
      • Modularity, Stochastic Block Models, Infomap
    3. Where is community detection going in the future?
    … with apologies that this presentation will seriously err on the self-absorbed side. It’s a big field, and I do not promise to know nor present it all.
    “Communities in Networks,” Porter, Onnela & Mucha, Notices of the American Mathematical Society 56, 1082-97 & 1164-6 (2009).
    “Community Detection in Graphs,” S. Fortunato, Physics Reports 486, 75-174 (2010).
  • 44.
    Multilayer Networks
    Ordered / Categorical
    Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010)
    Kivelä et al., “Multilayer Networks” (2014)
  • 45.
    Multilayer Modularity Derivation
    • Generalized Lambiotte et al. (2008) connection between modularity and autocorrelation under Laplacian dynamics to rederive null models for bipartite (Barber), directed (Leicht-Newman), and signed (Traag et al.) networks, via one-step conditional probabilities
    Equation annotations: intra-slice adjacency data and null; inter-slice identity arcs
    Same formalism works for more general multilayer networks, with sum over inter-layer connections within same community
    Mucha et al., “Community structure in time-dependent, multiscale, and multiplex networks” (2010)
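    The equation on this slide did not survive extraction. For orientation, the multislice quality function as I recall it from the cited Mucha et al. (2010) paper is written out below; the notation is that paper's rather than anything recoverable from this transcript, so check it against the original before relying on it. The first term is the intra-slice adjacency data minus its null model, and the second term holds the inter-slice identity arcs.

```latex
Q_{\text{multislice}} = \frac{1}{2\mu} \sum_{ijsr}
  \left\{ \left( A_{ijs} - \gamma_s \frac{k_{is} k_{js}}{2 m_s} \right) \delta_{sr}
        + \delta_{ij} C_{jsr} \right\} \delta(g_{is}, g_{jr})
% A_{ijs}: adjacency between nodes i and j within slice s
% C_{jsr}: coupling of node j in slice s to itself in slice r
% k_{js} = \sum_i A_{ijs}, \quad 2 m_s = \sum_j k_{js}
% c_{js} = \sum_r C_{jsr}, \quad 2\mu = \sum_{js} (k_{js} + c_{js})
% \gamma_s: resolution parameter of slice s
% g_{is}: community assignment of node i in slice s
```

    With a single slice and no coupling, the expression reduces to the ordinary modularity with resolution parameter γ.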
  • 46.
  • 47.
  • 48.
    PJM  &  MAP,  Chaos  2010  
  • 52.
  • 53.
  • 54.
    Strata MLSBM (sMLSBM)
    Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (to appear)
  • 56.
    [Figure: sMLSBM flowchart. Initialization: k-means cluster the L layers into S strata. Iterative process: k-means cluster 2L layers into S strata; update the number of strata to the number of unique clustering patterns according to (1) and (2).]
  • 57.
    sMLSBM on SparCC microbial interactions
    Stanley et al., “Clustering network layers with the strata multilayer stochastic block model” (to appear)
  • 58.
    Summary
    • Community detection is an exploratory tool that can provide a simplified high-level view of the organization of a network.
    • There are many methods. Don’t tie yourself down to one method: good clusters should be robust, and (hopefully) your story shouldn’t depend on the precise method (or understand why).
    • Many of these methods have parameters and it is important to know about them for best use.
    • Multilayer networks are very general. There are relatively few options currently available for finding communities in multilayer network data, but this area will expand rapidly.