08 Inference for Networks – DYAD Model Overview (2017)

Modeling networks: regression with additive and multiplicative effects
Alexander Volfovsky
Department of Statistical Science, Duke
May 25, 2017
Health Networks

Why model networks?
Interested in understanding the formation of relationships
Applied fields: sociology, economics, biology, epidemiology
Fundamental theory questions:
What assumptions are made for different network models?
What models work when the assumptions fail?
How to develop fail-safes to overcome these problems?
Where to apply these?
Causal inference
Link prediction

Some context: Facebook
Facebook wants to change its ad algorithm.
Can’t do it on the whole graph
Need “total network effect”
Source: Wikimedia

How do they solve it?
Interested in estimating
$$\frac{1}{N} \sum_{i=1}^{N} \left[ Y_i(\text{all treated}) - Y_i(\text{all controls}) \right]$$
“At a high level, graph cluster randomization is a technique in which the graph is partitioned into a set of clusters, and then randomization between treatment and control is performed at the cluster level.”
Where can we find clusters?
Observable information (e.g. same school)
Unobservable information (“social space”)
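The quoted procedure can be sketched in a few lines of R. This is only an illustration under stated assumptions: the cluster labels are taken as given, and the helper name `cluster_randomize` and the toy sizes are hypothetical, not anything used in the talk.

```r
# Hypothetical helper: randomize treatment at the cluster level.
# `cluster` holds one cluster label per node (assumed known, e.g. same school).
cluster_randomize <- function(cluster, p = 0.5) {
  labels <- unique(cluster)
  # One coin flip per cluster, not per node
  treated_clusters <- labels[runif(length(labels)) < p]
  as.integer(cluster %in% treated_clusters)
}

set.seed(1)
cluster <- rep(1:10, each = 5)   # toy graph: 50 nodes in 10 clusters of 5
w <- cluster_randomize(cluster)  # node-level treatment indicator
table(cluster, w)                # each cluster is entirely treated or entirely control
```

Because assignment is constant within a cluster, most of a node's neighbors share its treatment when clusters are well connected internally, which is what makes the "all treated" versus "all controls" contrast approachable.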

Some context: (im)migration
Want to know how regime change affects population.
Politicians during election years care about direct effects.
Source: http://openscience.alpine-geckos.at/courses/social-network-analyses/empirical-network-analysis/

Some more context
Studying tram traffic in Vienna
Source: kurier.at

And one more
Studying taxi rides in Porto
442 taxis
1.7 million rides with (x, y) coordinates at 15 second intervals.
Project into a 100 dimensional latent space.
Learn hidden interpretable patterns...
Source: Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic Differentiation Variational Inference. Journal of Machine Learning Research, 18(14), 1-45.

Relational data: common examples and goals
Changes in exports from year to year
[Figure: countries plotted by the first and second eigenvectors of $\hat{R}_{row}$ (left panel) and $\hat{R}_{col}$ (right panel).]
Network regression problems $y_{ij} = x_{ij}\beta + \epsilon_{ij}$ frequently assume independence of the $\epsilon_{ij}$

Estimating β in network regression
[Figure: countries plotted by the first and second eigenvectors of the row (left panel) and column (right panel) covariance estimates.]
For $Y = \langle X, \beta \rangle + E$ we have
OLS (assume no dependence among $\epsilon_{ij}$):
$$\hat\beta^{(ols)} = \left(\mathrm{mat}(X)^t\, \mathrm{mat}(X)\right)^{-1} \mathrm{mat}(X)^t\, \mathrm{vec}(Y)$$
Oracle GLS (assume dependence among $\epsilon_{ij}$):
$$\hat\beta^{(gls)} = \left(\mathrm{mat}(X)^t\, \Sigma^{-1}\, \mathrm{mat}(X)\right)^{-1} \mathrm{mat}(X)^t\, \Sigma^{-1}\, \mathrm{vec}(Y)$$
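The two estimators can be compared on toy data in base R. This is a sketch, not the analysis behind the slide: the design matrix `X`, the coefficient vector `beta`, and the AR(1)-style error covariance `Sigma` are all assumptions chosen for illustration, and `X` and `y` stand in for mat(X) and vec(Y).

```r
set.seed(1)
n <- 100
X <- cbind(1, rnorm(n))                      # stand-in for mat(X)
beta <- c(1, 2)                              # assumed true coefficients
Sigma <- 0.8^abs(outer(1:n, 1:n, "-"))       # assumed (known) error covariance
E <- drop(t(chol(Sigma)) %*% rnorm(n))       # correlated errors with cov = Sigma
y <- drop(X %*% beta) + E                    # stand-in for vec(Y)

# OLS ignores the dependence among the errors
beta_ols <- drop(solve(t(X) %*% X, t(X) %*% y))

# Oracle GLS uses the true Sigma
Sinv <- solve(Sigma)
beta_gls <- drop(solve(t(X) %*% Sinv %*% X, t(X) %*% Sinv %*% y))

rbind(beta_ols, beta_gls)
```

"Oracle" matters here: in practice Sigma is unknown, which is where the structured covariance models in the following slides come in.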

Network models
The data
There are n actors/nodes labeled 1, ..., n
Y is a sociomatrix: $y_{ij}$ is a dyadic relationship between node i and node j.
$y_{ii}$ frequently undefined.
Covariates:
node specific: $x_i$
dyad specific: $x_{ij}$

Social relations model
Goal: describe the variability in Y .
Sender eﬀects describe sociability.
Receiver eﬀects describe popularity.
Capture this in the Social Relations Model (SRM)
$$y_{ij} = a_i + b_j + \epsilon_{ij}$$
Almost an ANOVA: want to relate $a_i$ to $b_i$ since the senders/receivers are from the same set.

Social relations model
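$$y_{ij} = \mu + a_i + b_j + \epsilon_{ij}$$
$$(a_i, b_i) \overset{iid}{\sim} N(0, \Sigma_{ab})$$
$$(\epsilon_{ij}, \epsilon_{ji}) \overset{iid}{\sim} N(0, \Sigma_e)$$
$\Sigma_{ab} = \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}$ describes sender/receiver variability and within person similarity.
$\Sigma_e = \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ describes within dyad correlation.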

Variability
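For distinct i, j, k:
$$\mathrm{var}(y_{ij}) = \sigma_a^2 + \sigma_b^2 + \sigma^2$$
$$\mathrm{cov}(y_{ij}, y_{ik}) = \sigma_a^2$$
$$\mathrm{cov}(y_{ij}, y_{kj}) = \sigma_b^2$$
$$\mathrm{cov}(y_{ij}, y_{jk}) = \sigma_{ab}$$
$$\mathrm{cov}(y_{ij}, y_{ji}) = 2\sigma_{ab} + \rho\sigma^2$$
(The cross term $\sigma_{ab}$ enters only when the same node is a sender in one observation and a receiver in the other, so it does not appear in $\mathrm{var}(y_{ij})$ for $i \neq j$.)
How hard is it to fit this model?
library(amen)
fit_SRM <- ame(Y)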
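The second moments of the Social Relations Model can be checked by simulation on a three-node network. The sketch below uses base R only; the parameter values and the helper `rpair` are assumptions made for the illustration. For distinct i and j, $a_i$ and $b_j$ belong to different nodes and are independent, so the simulated variance is compared with $\sigma_a^2 + \sigma_b^2 + \sigma^2$.

```r
set.seed(1)
N <- 200000                      # Monte Carlo replicates
s2a <- 1; s2b <- 2; sab <- 0.5   # assumed entries of Sigma_ab
s2 <- 1; rho <- 0.6              # assumed entries of Sigma_e

# Hypothetical helper: N iid bivariate normal rows with covariance S
rpair <- function(N, S) matrix(rnorm(2 * N), N, 2) %*% chol(S)

Sab <- matrix(c(s2a, sab, sab, s2b), 2, 2)
Se  <- s2 * matrix(c(1, rho, rho, 1), 2, 2)
ab1 <- rpair(N, Sab); ab2 <- rpair(N, Sab); ab3 <- rpair(N, Sab)  # (a_i, b_i)
e12 <- rpair(N, Se)   # (eps_12, eps_21)
e13 <- rpair(N, Se)   # (eps_13, eps_31)
e23 <- rpair(N, Se)   # (eps_23, eps_32)

y12 <- ab1[, 1] + ab2[, 2] + e12[, 1]
y21 <- ab2[, 1] + ab1[, 2] + e12[, 2]
y13 <- ab1[, 1] + ab3[, 2] + e13[, 1]
y23 <- ab2[, 1] + ab3[, 2] + e23[, 1]
y32 <- ab3[, 1] + ab2[, 2] + e23[, 2]

emp <- c(var = var(y12),
         same_sender = cov(y12, y13),
         same_receiver = cov(y12, y32),
         chain = cov(y12, y23),
         reciprocal = cov(y12, y21))
theory <- c(s2a + s2b + s2, s2a, s2b, sab, 2 * sab + rho * s2)
round(rbind(emp, theory), 2)
```

The empirical row should agree with the theoretical row up to Monte Carlo error.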

Pictures that pop up
These plots show how well the Markov chain is mixing and give goodness of fit information.
Source: Hoff (2015). arXiv:1506.08237

Goodness of ﬁt
Posterior predictive distributions.
sd.rowmean: standard deviation of row means of Y.
sd.colmean: standard deviation of column means of Y.
dyad.dep: correlation between vectorized $Y$ and vectorized $Y^t$
triad.dep: $\dfrac{\sum_{i,j,k} e_{ij}\, e_{jk}\, e_{ki}}{\#\{\text{triangles on } n \text{ nodes}\} \cdot \mathrm{Var}(\mathrm{vec}(Y))^{3/2}}$
Source: Hoff (2015). arXiv:1506.08237
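The four statistics can be computed directly in base R. The function below is a sketch written for this summary (the name `gof_stats` is hypothetical); it assumes that $e_{ij}$ denotes the entries of Y centered at the overall mean and that the triangle count is taken over ordered triples of distinct nodes.

```r
# Goodness-of-fit statistics for a sociomatrix Y with an undefined (NA) diagonal
gof_stats <- function(Y) {
  sd_rowmean <- sd(rowMeans(Y, na.rm = TRUE))
  sd_colmean <- sd(colMeans(Y, na.rm = TRUE))
  dyad_dep <- cor(c(Y), c(t(Y)), use = "complete.obs")
  E <- Y - mean(Y, na.rm = TRUE)      # centered entries e_ij (assumed definition)
  D <- 1 * (!is.na(E))                # indicator of defined entries
  E[is.na(E)] <- 0
  # trace(E^3) sums e_ij e_jk e_ki; trace(D^3) counts the contributing triples
  triad_dep <- sum(diag(E %*% E %*% E)) /
    (sum(diag(D %*% D %*% D)) * sd(c(Y), na.rm = TRUE)^3)
  c(sd.rowmean = sd_rowmean, sd.colmean = sd_colmean,
    dyad.dep = dyad_dep, triad.dep = triad_dep)
}

set.seed(1)
n <- 30
Y <- matrix(rnorm(n * n), n, n)
diag(Y) <- NA
gof_stats(Y)
```

In a posterior predictive check, the same function would be applied to each simulated network and the observed value compared with the resulting distribution.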

Incorporating covariates
Imagine you have some covariates and want to fit
$$y_{ij} = \beta_d^t x_{d,ij} + \beta_r^t x_{r,i} + \beta_c^t x_{c,j} + a_i + b_j + \epsilon_{ij}$$
$x_{d,ij}$ are dyad specific covariates.
$x_{r,i}$ are row (sender) covariates.
$x_{c,j}$ are column (receiver) covariates.
Frequently $x_{r,i} = x_{c,i} = x_i$
When does this not make sense?
(Example: popularity is affected by athletic success, but sociability is not)
How hard is it to fit this model?
fit_SRRM <- ame(Y, Xdyad=Xd, Xrow=Xr, Xcol=Xc)

Parsing the input
fit_SRRM <- ame(Y,
  Xdyad=Xd, # n x n x pd array of dyadic covariates
  Xrow=Xr,  # n x pr matrix of nodal row covariates
  Xcol=Xc   # n x pc matrix of nodal column covariates
)
Xr[i, p] is the value of the pth row covariate for node i.
Xd[i, j, p] is the value of the pth dyadic covariate in the direction of i to j.
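A small base R sketch of how inputs with these shapes might be assembled. The covariate names (`age`, `female`, `agediff`, `samesex`) are made up for the illustration and are not from the talk.

```r
set.seed(1)
n <- 5
# Two nodal covariates (hypothetical) in an n x 2 matrix
Xr <- cbind(age = rnorm(n), female = rbinom(n, 1, 0.5))

# Two dyadic covariates (hypothetical) in an n x n x 2 array
Xd <- array(NA_real_, dim = c(n, n, 2),
            dimnames = list(NULL, NULL, c("agediff", "samesex")))
Xd[, , "agediff"] <- abs(outer(Xr[, "age"], Xr[, "age"], "-"))
Xd[, , "samesex"] <- outer(Xr[, "female"], Xr[, "female"], "==")

Xr[2, 1]      # value of the 1st row covariate for node 2
Xd[2, 4, 1]   # value of the 1st dyadic covariate in the direction 2 -> 4
dim(Xd)       # n x n x pd
```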

Back to basics
Can you get rid of the dependencies in the model?
fit_rm <- ame(Y, Xdyad=Xd, Xrow=Xn, Xcol=Xn,
  rvar=FALSE, # should you fit row random effects?
  cvar=FALSE, # should you fit column random effects?
  dcor=FALSE  # should you fit a dyadic correlation?
)
Note that summary will output:
Variance parameters:
     pmean   psd
va   0.000 0.000
cab  0.000 0.000
vb   0.000 0.000
rho  0.000 0.000
ve   0.229 0.011

So what’s missing here?
We have a lot of leftover variability.
Common themes in network analysis:
Homophily: similar people connect to each other
Stochastic equivalence: similar people act similarly

Which is which?
Left: homophily; Right: stochastic equivalence
What are good models for this?
Source: Hoff (2008). NIPS

Introducing multiplicative eﬀects
SR(R)M can represent second-order dependencies very well.
Has a hard time capturing “triadic” behavior.
Homophily: create dyadic covariates $x_{d,ij} = x_i x_j$
Generally this can be represented by
$$x_{r,i}^t B\, x_{c,j} = \sum_k \sum_l b_{kl}\, x_{r,ik}\, x_{c,jl}$$
This is linear in the covariates and so can be baked into the amen framework.
Sometimes there is excess correlation to account for.
This suggests a multiplicative effects model:
$$y_{ij} = \beta_d^t x_{d,ij} + \beta_r^t x_{r,i} + \beta_c^t x_{c,j} + a_i + b_j + u_i^t v_j + \epsilon_{ij}$$
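The claim that the bilinear term is linear in the covariates can be made concrete: each product $x_{r,ik} x_{c,jl}$ becomes one slice of a dyadic covariate array, and the entries of B become ordinary regression coefficients. A sketch with made-up dimensions and a hypothetical coefficient matrix `B`:

```r
set.seed(1)
n <- 4
xr <- matrix(rnorm(n * 2), n, 2)   # row covariates x_{r,i}
xc <- matrix(rnorm(n * 3), n, 3)   # column covariates x_{c,j}
B  <- matrix(rnorm(2 * 3), 2, 3)   # hypothetical coefficients b_kl

# Dyadic covariate array with one slice per (k, l) pair
Xd <- array(0, dim = c(n, n, 2 * 3))
s <- 0
for (l in 1:3) for (k in 1:2) {
  s <- s + 1
  Xd[, , s] <- outer(xr[, k], xc[, l])   # x_{r,ik} * x_{c,jl}
}

# Linear predictor built from the slices, with coefficients vec(B) (column-major)
lin <- apply(Xd, c(1, 2), function(x) sum(x * c(B)))
# The same quantity in matrix form: x_{r,i}^t B x_{c,j} for all (i, j)
bilinear <- xr %*% B %*% t(xc)
all.equal(lin, bilinear)
```

An array like `Xd` has the n x n x pd shape expected for dyadic covariates.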

Fitting these models and beyond
fit_ame2 <- ame(Y, Xd, Xn, Xn,
  R=2 # dimension of the multiplicative effect
)
Source: Hoff (2015). arXiv:1506.08237

What happened here?
Why do multiplicative effects help triadic behavior?
Triadic measure is related to transitivity (at least for binary data).
Turns out homophily can capture transitivity...
$$y_{ij} = \beta_d^t x_{d,ij} + \beta_r^t x_{r,i} + \beta_c^t x_{c,j} + a_i + b_j + u_i^t v_j + \epsilon_{ij}$$
$u_i$ is information about the sender, $v_j$ is information about the receiver
if $u_i \approx v_j$ then $u_i^t v_j > 0$...
if $u_i \approx u_j$ then there is some stochastic equivalence...
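A toy numeric illustration of why similar latent positions produce transitive patterns. The positions below are made up, and senders and receivers are given the same positions ($v_j = u_j$) purely to keep the example small.

```r
# Five nodes: 1-3 sit near one point of the latent space, 4-5 near the opposite point
U <- rbind(c( 1.0,  0.0),
           c( 0.9,  0.1),
           c( 1.1, -0.1),
           c(-1.0,  0.0),
           c(-0.9,  0.2))
M <- U %*% t(U)   # multiplicative effects u_i^t v_j with v_j = u_j
sign(M)
```

Within each group every multiplicative effect is positive, so a tie from 1 to 2 and from 2 to 3 comes with an elevated propensity for a tie from 1 to 3; across groups the effect is negative.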

Let's generalize: ordinal models
Imagine a binary (probit) model:
$$y_{ij} = 1(z_{ij} > 0), \qquad z_{ij} = \mu + a_i + b_j + \epsilon_{ij}$$
Looks like the SRM on the latent scale.
fit_SRM <- ame(Y,
  model="bin" # lots of model options here
)
If we go to the iid setup this is just an Erdős-Rényi model:
fit_SRG <- ame(Y, model="bin",
  rvar=FALSE, cvar=FALSE, dcor=FALSE)
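The reduction to an Erdős-Rényi graph in the iid case can be seen by simulating from the latent-scale model with no row effects, column effects, or dyadic correlation. A sketch with an assumed intercept `mu` and unit-variance errors:

```r
set.seed(1)
n <- 50
mu <- -1                                # assumed intercept
Z <- mu + matrix(rnorm(n * n), n, n)    # iid case: only mu and independent errors
Y <- 1 * (Z > 0)                        # y_ij = 1(z_ij > 0)
diag(Y) <- NA

mean(Y, na.rm = TRUE)   # empirical tie density
pnorm(mu)               # common tie probability implied by the probit model
```

Every tie is an independent Bernoulli draw with probability `pnorm(mu)`, which is exactly the Erdős-Rényi model.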

Even more general
Consider the following generative model:
$$z_{ij} = u_i^t D v_j + \epsilon_{ij}$$
$$y_{ij} = g(z_{ij})$$
$u_i$ are latent factors describing i as a sender
$v_j$ are latent factors describing j as a receiver
D is a matrix of factor weights
g is an increasing function mapping the latent space to the observed space.
(Some g's... Normal: $g(z) = z$, binomial: $g(z) = 1(z \ge 0)$)

This works for symmetric matrices too!
Imagine that $y_{ij} = y_{ji}$; then the model looks like:
$$z_{ij} = u_i^t \Lambda u_j + \epsilon_{ij}$$
$$y_{ij} = g(z_{ij})$$
$u_i \approx u_j$ represents stochastic equivalence
Λ is a matrix of eigenvalues:
positive $\lambda_i$ imply homophily, negative ones imply heterophily.
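A short base R sketch of the symmetric factor structure, with made-up latent factors `U` and an assumed diagonal `Lambda` holding one positive and one negative eigenvalue:

```r
set.seed(1)
n <- 6; R <- 2
U <- matrix(rnorm(n * R), n, R)     # latent factors u_i (one row per node)
Lambda <- diag(c(2, -1))            # assumed: one homophilous, one heterophilous dimension
M <- U %*% Lambda %*% t(U)          # u_i^t Lambda u_j for all pairs

isSymmetric(M)                      # the mean structure is symmetric
ev <- eigen(M, symmetric = TRUE)$values
round(ev, 3)                        # rank R: one positive, one negative, the rest ~ 0
```

The signs of the nonzero eigenvalues of `M` match the signs on the diagonal of `Lambda`, which is what ties the eigenvalues to homophily versus heterophily.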

What is this latent space?
Problem 1: need to select a dimension R.
This is hard... sometimes there is some intuition.
Problem 2: should the latent positions be interpreted?
Unclear; maybe think of the distances in this space...
Problem 3: what about my favorite other models like stochastic blockmodels?
These are just a subclass of models! For example, the stochastic blockmodel has discrete support for the latent positions.
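The blockmodel remark can be illustrated by writing block memberships as latent positions supported on a finite set. In the sketch below the positions are rows of an identity matrix and the block-to-block tie probabilities `P` are made-up values.

```r
set.seed(1)
n <- 8; K <- 2
block <- sample(1:K, n, replace = TRUE)        # block membership of each node
U <- diag(K)[block, , drop = FALSE]            # latent positions with discrete support
P <- matrix(c(0.8, 0.1,
              0.1, 0.6), K, K, byrow = TRUE)   # hypothetical block tie probabilities

M <- U %*% P %*% t(U)                          # u_i^t P u_j
all.equal(M, P[block, block])                  # same as looking up the block pair
```

Replacing the identity rows with continuous vectors gives the general latent factor model, which is the sense in which the blockmodel is a special case.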

All quotes from Hoff et al. (2002)
A subset of individuals in the population with a large number
of social ties between them may be indicative of a group of
individuals who have nearby positions in this space of
characteristics, or social space.
Various concepts of social space have been discussed by
McFarland and Brown (1973) and Faust (1988).
In the context of this article, social space refers to a space of
unobserved latent characteristics that represent potential
transitive tendencies in network relations.
A probability measure over these unobserved characteristics
induces a model in which the presence of a tie between two
individuals is dependent on the presence of other ties.

(Tiny portion of the) literature
Nowicki, Krzysztof, and Tom A. B. Snijders. “Estimation and prediction for stochastic blockstructures.” Journal of the American Statistical Association 96, no. 455 (2001): 1077-1087.
Hoff, Peter D., Adrian E. Raftery, and Mark S. Handcock. “Latent space approaches to social network analysis.” Journal of the American Statistical Association 97, no. 460 (2002): 1090-1098.
Hoff, Peter. “Modeling homophily and stochastic equivalence in symmetric relational data.” In Advances in Neural Information Processing Systems, pp. 657-664. 2008.
Airoldi, Edoardo M., David M. Blei, Stephen E. Fienberg, and Eric P. Xing. “Mixed membership stochastic blockmodels.” Journal of Machine Learning Research 9, no. Sep (2008): 1981-2014.
Hoff, Peter, Bailey Fosdick, Alex Volfovsky, and Katherine Stovel. “Likelihoods for fixed rank nomination networks.” Network Science 1, no. 03 (2013): 253-277.
Hoff, Peter D. “Dyadic data analysis with amen.” arXiv preprint arXiv:1506.08237 (2015).

ame(Y, Xdyad=NULL, Xrow=NULL, Xcol=NULL,
rvar = !(model=="rrl") , cvar = TRUE, dcor = !symmetric,
nvar = TRUE, R = 0, model="nrm",
intercept=!is.element(model,c("rrl","ord")),
symmetric=FALSE,
odmax=rep(max(apply(Y>0,1,sum,na.rm=TRUE)),nrow(Y)), ...)
Y: an n x n square relational matrix of relations.
Xdyad: an n x n x pd array of covariates
Xrow: an n x pr matrix of nodal row covariates
Xcol: an n x pc matrix of nodal column covariates
rvar: logical: fit row random effects (asymmetric case)?
cvar: logical: fit column random effects (asymmetric case)?
dcor: logical: fit a dyadic correlation (asymmetric case)?
nvar: logical: fit nodal random effects (symmetric case)?
R: int: dimension of the multiplicative effects (can be 0)
model: char: one of "nrm","bin","ord","cbin","frn","rrl"
odmax: a scalar integer or vector of length n giving the
maximum number of nominations that each node may make

What’s in the ...?
seed = 1, nscan = 10000, burn = 500, odens = 25,
plot=TRUE, print = TRUE, gof=TRUE
seed: random seed
nscan: number of iterations of the Markov chain
(beyond burn-in)
burn: burn in for the Markov chain
odens: output density for the Markov chain
plot: logical: plot results while running?
print: logical: print results while running?
gof: logical: calculate goodness of fit statistics?

Social network data
Datasets: PROSPER, NSCR, AddHealth
Relate network characteristics to individual-level behavior
Literature: ERGM, latent variable models
Assumptions:
Data is fully observed
The support is the set of all sociomatrices
In practice:
Ranked data
Censored observations
[Figure 3: network plot with a panel whose axes are labeled 9 to 12 and “proportion” (0.00 to 0.20).]
A type of likelihood that accommodates the ranked and censored nature of data from Fixed Rank Nomination (FRN) surveys and allows for estimation of regression effects.

Data collection examples
PROmoting School Community-University Partnerships to Enhance Resilience (PROSPER): “Who are your best and closest friends in your grade?”
National Longitudinal Study of Adolescent to Adult Health (AddHealth): “Your male friends. List your closest male friends. List your best male friend first, then your next best friend, and so on.”

Notation
$Z = \{z_{ij} : i \neq j\}$ is a sociomatrix of ordinal relationships
$z_{ij} > z_{ik}$ denotes person i preferring person j to person k
$$Z = \begin{pmatrix} - & z_{12} & \cdots & z_{1n} \\ z_{21} & - & & \vdots \\ \vdots & & \ddots & \\ z_{n1} & \cdots & & - \end{pmatrix}$$
Instead of Z we observe a sociomatrix $Y = \{y_{ij} : i \neq j\}$
Different sampling schemes define different maps between Y and Z (set relations between $y_{ij}$ and $z_{ij}$).
Statistical model $\{p(Z|\theta) : \theta \in \Theta\}$ assists in analysis

Fixed rank nominations
F(Y) is the set of Z such that
$y_{ij} > 0 \Rightarrow z_{ij} > 0$
$y_{ij} > y_{ik} \Rightarrow z_{ij} > z_{ik}$
$y_{ij} = 0$ and $d_i < m \Rightarrow z_{ij} \le 0$
m = maximal number of nominations, $d_i$ = individual outdegree
Differentiates between different ranks
Captures censoring in the data
Example rows $y_i$ over alters 1 to 10 and what F(Y) implies for $z_i$:
$y_i = (4, 3, 2, 1, 0, 0, 0, 0, 0, 0)$ with $d_i < m$: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > 0$ and $z_{ij} \le 0$ for j = 5, ..., 10
$y_i = (5, 4, 3, 2, 1, 0, 0, 0, 0, 0)$ with $d_i = m$: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > z_{i5} > z_{ij}$ for j = 6, ..., 10, whose signs are unknown (?)

Rank
R(Y) is the set of Z such that
$y_{ij} > y_{ik} \Rightarrow z_{ij} > z_{ik}$
Valid but not fully informative: $F(Y) \subset R(Y)$
Example rows $y_i$ over alters 1 to 10 and what R(Y) implies for $z_i$:
$y_i = (4, 3, 2, 1, 0, 0, 0, 0, 0, 0)$: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > z_{ij}$ for j = 5, ..., 10; all signs unknown (?)
$y_i = (5, 4, 3, 2, 1, 0, 0, 0, 0, 0)$: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > z_{i5} > z_{ij}$ for j = 6, ..., 10; all signs unknown (?)
Cannot estimate row (“sender”) specific effects

Binary
B(Y) is the set of Z such that
$y_{ij} > 0 \Rightarrow z_{ij} > 0$
$y_{ij} = 0 \Rightarrow z_{ij} < 0$
Neither fully informative nor valid!
Discards information on the ranks
Ignores the censoring on the outdegrees
In particular: $F(Y) \not\subset B(Y)$
Example rows $y_i$ over alters 1 to 10 and what B(Y) implies for $z_i$:
$y_i = (4, 3, 2, 1, 0, 0, 0, 0, 0, 0)$: $z_{ij} > 0$ for j = 1, ..., 4 and $z_{ij} < 0$ for j = 5, ..., 10
$y_i = (5, 4, 3, 2, 1, 0, 0, 0, 0, 0)$: $z_{ij} > 0$ for j = 1, ..., 5 and $z_{ij} < 0$ for j = 6, ..., 10
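The three constraint sets can be written as row-wise membership checks, which makes the relations between them easy to test on the example rows. This is a sketch: the function names are hypothetical, and m = 5 is assumed for the examples because the censored row nominates five alters.

```r
# Row-wise checks: z = latent values, y = observed ranks (0 = not nominated)
ranks_ok <- function(z, y) {
  all(outer(z, z, ">")[outer(y, y, ">")])        # y_ij > y_ik  =>  z_ij > z_ik
}
in_R_row <- function(z, y) ranks_ok(z, y)
in_B_row <- function(z, y) all(z[y > 0] > 0) && all(z[y == 0] < 0)
in_F_row <- function(z, y, m) {
  d <- sum(y > 0)                                # outdegree d_i
  all(z[y > 0] > 0) &&                           # nominated alters are positive
    ranks_ok(z, y) &&                            # ranks are preserved
    (d == m || all(z[y == 0] <= 0))              # zeros are informative only if d_i < m
}

y_unc <- c(4, 3, 2, 1, 0, 0, 0, 0, 0, 0)         # d_i = 4 < m (assumed m = 5)
y_cen <- c(5, 4, 3, 2, 1, 0, 0, 0, 0, 0)         # d_i = 5 = m, censored
z <- c(2.5, 2.0, 1.5, 1.0, 0.5, 0.2, -1, -1, -2, -3)

in_F_row(z, y_cen, m = 5)   # censored row: a positive unranked z_i6 is allowed
in_R_row(z, y_cen)          # anything in F is also in R
in_B_row(z, y_cen)          # but B forces unranked alters below 0
in_F_row(z, y_unc, m = 5)   # uncensored row: z_i5 = 0.5 > 0 is not allowed
```

The censored row gives a Z that lies in F(Y) and R(Y) but not in B(Y), which is the sense in which the binary likelihood is not valid for FRN data.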

Bayesian Estimation for Fixed Rank Nominations
Model: $Z \sim p(Z|\theta)$, $\theta \in \Theta$
Data: $Z \in F(Y)$
Likelihood:
$$L_F(\theta : Y) = \Pr(Z \in F(Y) | \theta) = \int_{F(Y)} dP(Z|\theta)$$
Estimation: Given $p(\theta)$, $p(\theta | Z \in F(Y))$ can be approximated by a Gibbs sampler.
Simulate $z_{ij} \sim p(z_{ij} | \theta, Z_{-ij}, Z \in F(Y))$:
1. $y_{ij} > 0$: $z_{ij} \sim p(z_{ij} | \theta, Z_{-ij})\, 1(z_{ij} \in (a, b))$ where $a = \max(z_{ik} : y_{ik} < y_{ij})$ and $b = \min(z_{ik} : y_{ik} > y_{ij})$.
2. $y_{ij} = 0$ and $d_i < m$: $z_{ij} \sim p(z_{ij} | Z_{-ij}, \theta)\, 1(z_{ij} \le 0)$.
3. $y_{ij} = 0$ and $d_i = m$: $z_{ij} \sim p(z_{ij} | Z_{-ij}, \theta)\, 1(z_{ij} \le \min(z_{ik} : y_{ik} > 0))$
Allows for imputation of missing $y_{ij}$
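One update of $z_{ij}$ can be sketched as a truncated normal draw whose bounds follow the three cases. The code assumes the unconstrained full conditional is normal with some mean `mu` and standard deviation `s` (as it would be under a Gaussian model for Z); the function names `rtnorm1` and `frn_bounds` and the toy row are hypothetical.

```r
# Draw from N(mu, s^2) truncated to (lo, hi) by inverting the CDF
rtnorm1 <- function(mu, s, lo = -Inf, hi = Inf) {
  p <- runif(1, pnorm(lo, mu, s), pnorm(hi, mu, s))
  qnorm(p, mu, s)
}

# Truncation bounds for z_ij given row i of Z (z) and of Y (y)
frn_bounds <- function(j, z, y, m) {
  d <- sum(y > 0)                        # outdegree d_i
  others <- seq_along(y) != j
  if (y[j] > 0) {                        # case 1: between neighboring ranks
    lo <- max(c(-Inf, z[others & y < y[j]]))
    hi <- min(c(Inf, z[others & y > y[j]]))
  } else if (d < m) {                    # case 2: uncensored zero
    lo <- -Inf; hi <- 0
  } else {                               # case 3: censored zero
    lo <- -Inf; hi <- min(z[y > 0])
  }
  c(lo = lo, hi = hi)
}

set.seed(1)
y <- c(3, 2, 1, 0, 0)                    # toy row with d_i = 3
z <- c(1.2, 0.7, 0.3, -0.4, -1.0)
b2 <- frn_bounds(2, z, y, m = 3)         # bounds for a ranked alter
b4 <- frn_bounds(4, z, y, m = 3)         # bounds for an unranked alter, d_i = m
z_new <- rtnorm1(mu = 0, s = 1, lo = b2["lo"], hi = b2["hi"])
```

A full sampler would cycle through all (i, j) and then update θ given Z; inverse-CDF sampling is used here for brevity and can be inaccurate far in the tails.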

Simulations
We generated Z from the following Social Relations Model (Warner, Kenny and Stoto (1979)):
$$z_{ij} = \beta^t x_{ij} + a_i + b_j + \epsilon_{ij}$$
$$\begin{pmatrix} a_i \\ b_i \end{pmatrix} \overset{iid}{\sim} \text{normal}\left(0, \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}\right) \qquad \begin{pmatrix} \epsilon_{ij} \\ \epsilon_{ji} \end{pmatrix} \overset{iid}{\sim} \text{normal}\left(0, \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}\right)$$
Mean model: $\beta^t x_{ij} = \beta_0 + \beta_r x_{ir} + \beta_c x_{jc} + \beta_{d1} x_{ij1} + \beta_{d2} x_{ij2}$
$x_{ir}$, $x_{jc}$: individual level variables
$x_{ij1}$: pair specific variable
$x_{ij2}$: co-membership in a group
$\beta_r = \beta_c = \beta_{d1} = \beta_{d2} = 1$ and $\beta_0 = -3.26$
$x_{ir}, x_{ic}, x_{ij1} \overset{iid}{\sim} N(0, 1)$, $x_{ij2} = s_i s_j / .42$ for $s_i \overset{iid}{\sim} \text{binary}(1/2)$
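The data-generating process can be sketched in base R. The parameter values follow the slide; the helper `rpair` and the censoring rule (each node ranks its positive $z_{ij}$ and reports at most m of them, with the most preferred alter given the highest score) are assumptions consistent with the FRN constraints rather than the authors' code.

```r
set.seed(1)
n <- 100; m <- 5
rpair <- function(N, S) matrix(rnorm(2 * N), N, 2) %*% chol(S)  # hypothetical helper

ab <- rpair(n, matrix(c(1, 0.5, 0.5, 1), 2, 2))   # (a_i, b_i)
a <- ab[, 1]; b <- ab[, 2]

# Dyadic errors: one correlated pair (eps_ij, eps_ji) per unordered dyad
E <- matrix(0, n, n)
up <- which(upper.tri(E), arr.ind = TRUE)
ep <- rpair(nrow(up), matrix(c(1, 0.9, 0.9, 1), 2, 2))
E[up] <- ep[, 1]
E[up[, 2:1]] <- ep[, 2]

xr <- rnorm(n); xc <- rnorm(n)            # individual level variables
x1 <- matrix(rnorm(n * n), n, n)          # pair specific variable
s  <- rbinom(n, 1, 0.5)
x2 <- outer(s, s) / 0.42                  # co-membership in a group

# All slopes equal 1, so the covariates enter with unit coefficients
Z <- -3.26 + outer(xr, rep(1, n)) + outer(rep(1, n), xc) + x1 + x2 +
  outer(a, rep(1, n)) + outer(rep(1, n), b) + E
diag(Z) <- NA

# FRN observation: rank positive z_ij within each row, keep at most m
Y <- matrix(0, n, n)
for (i in 1:n) {
  cand <- which(Z[i, ] > 0)
  top <- head(cand[order(Z[i, cand], decreasing = TRUE)], m)
  Y[i, top] <- rev(seq_along(top))        # most preferred gets the highest score
}
diag(Y) <- NA
table(rowSums(Y > 0, na.rm = TRUE))       # outdegrees, capped at m
```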

Simulations - Censoring
8 simulations for each m ∈ {5, 15} with 100 nodes each
[Figure: interval estimates of $\beta_r$, $\beta_c$, $\beta_{d1}$ and $\beta_{d2}$ for simulations 1 to 8; left column m = 5, right column m = 15.]
Confidence intervals under the three different likelihoods for column and an iid dyadic variable. The groups of three CIs are based on binary, FRN and rank likelihoods from left to right.

[Figure: interval estimates of $\beta_r$, $\beta_c$ and $\beta_{d1}$ for simulations 1 to 8; m = 5 (left) and m = 15 (right).]
Rank likelihood cannot estimate row effects
$$Z \in R(Y) \iff Z + c 1^t \in R(Y) \quad \forall c \in \mathbb{R}^n$$
Binary likelihood poorly estimates row effects
Large amount of censoring
⇒ Heterogeneity of censored outdegrees is low
⇒ Regression coefficients estimated too low
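The invariance behind the first point is easy to see numerically: adding a constant $c_i$ to every entry of row i leaves the within-row ordering unchanged, so no row-specific effect can be learned from ranks alone. A minimal sketch with arbitrary values (the helper `row_ranks` is hypothetical):

```r
set.seed(1)
n <- 5
Z <- matrix(rnorm(n * n), n, n)
c_shift <- rnorm(n)                           # arbitrary row shifts c
Z_shift <- Z + outer(c_shift, rep(1, n))      # Z + c 1^t

row_ranks <- function(M) t(apply(M, 1, rank)) # within-row ranks
identical(row_ranks(Z), row_ranks(Z_shift))   # the orderings agree row by row
```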

[Figure: interval estimates of $\beta_{d1}$ and $\beta_{d2}$ for simulations 1 to 8; m = 5 (left) and m = 15 (right).]
Recall: $x_{ij2} \propto s_i s_j$, an indicator of comembership to a group
Ignore the censoring
⇒ Binary likelihood underestimates row variability
⇒ Underestimate the variability in $x_{ij2}$

Simulations - information in the ranks
Let C(Y) be the set of values for which the following is true:
$y_{ij} > 0 \Rightarrow z_{ij} > 0$
$\min\{z_{ij} : y_{ij} > 0\} \ge \max\{z_{ij} : y_{ij} = 0\}$
We refer to $L_C(\theta : Y) = \Pr(Z \in C(Y) | \theta)$ as the censored binary likelihood.
Recognizes censoring but ignores information in the ranks
Performs similarly to FRN in the previous study
Less precise than FRN when m is big

Same setup as before, but average uncensored outdegree is m
[Figure 2: Posterior concentration around true parameter values, averaged across eight simulated datasets for each m ∈ {5, 15, 30, 50}. Horizontal axis: m; vertical axis: relative concentration around true value; one curve each for $\beta_r$ (row), $\beta_c$ (column), $\beta_{d1}$ (continuous dyad) and $\beta_{d2}$ (co-membership).]
Relative concentration around true value of each parameter:
Measured by $E[(\beta - 1)^2 | F(Y)] \,/\, E[(\beta - 1)^2 | C(Y)]$ for each β
When $m \ll n$, most of the information is found by considering ranked/unranked individuals as groups rather than the relative ordering of the ranked individuals.

AddHealth Data - Results
[Figure: interval estimates of β for the intercept; row effects (rsmoke, rdrink, rgpa); column effects (csmoke, cdrink, cgpa); dyadic effects (dsmoke, ddrink, dgpa), (dacad, darts, dsport, dcivic) and (dgrade, drace).]
646 females were asked to rank up to 5 female friends
Mean model with row, column and dyadic effects for smoking, drinking and gpa as well as dyadic effects for comembership in activities and grade, and a similarity-in-race measure.
The CIs are based on binary, FRN and rank likelihoods.
