# MRQAP tutorial for newbies

This tutorial explains the MRQAP (multiple regression quadratic assignment procedure) method in a visual way.

Published in: Education

### MRQAP tutorial for newbies

1. MRQAP tutorial. Fariba Karimi, Fariba.karimi@gesis.org, 24.11.2015
2. Multiple Regression Quadratic Assignment Procedure
3. Why regression in network analysis?
    - Inferential statistics have proven to have very useful applications to social network analysis. At a most general level, the question of "inference" is: how much confidence can I have that the pattern I see in the data I've collected is actually typical of some larger population, or that the apparent pattern is not really just a random occurrence?
4. OLS (Ordinary Least Squares): $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \varepsilon$, where $Y$ is the dependent variable, $\beta_0, \beta_1, \beta_2, \dots$ are the coefficients, $X_1, X_2, \dots$ are the explanatory (independent) variables, and $\varepsilon$ is the residual.
6. OLS (Ordinary Least Squares) - test
    - P-value: for the model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \varepsilon$, the null hypothesis is $\beta = 0$. A small p-value suggests that the coefficient is significantly different from zero. E.g. a p-value of 0.01 means that the coefficient is significant at the 99% confidence level.
    - R-squared: quantifies model performance. E.g. R-squared = 0.4 means that the model explains 40% of the variation in the dependent variable.
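For reference, the two quantities on the OLS slides can be written in the usual textbook matrix form. The matrix notation is an assumption here, since the slides only give the scalar model: $X$ is the matrix of explanatory variables with a leading column of ones, $\hat{Y}_i$ is the fitted value for observation $i$, and $\bar{Y}$ is the mean of $Y$.

$$
\hat{\beta} = (X^\top X)^{-1} X^\top Y,
\qquad
R^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2}
$$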
7. Problem
    - Observations are not independent of each other. If A is connected to B and B is connected to C, it may be likely that A is connected to C.
    - Repeated observations → errors are correlated with each other. Observations in rows and columns tend to be highly correlated, which influences the standard error.
9. What does QAP do?
    - Essentially, QAP "scrambles" the dependent variable data through several permutations. Scrambling the data repeatedly produces multiple random datasets for the dependent variable, and the same analysis can then be performed on each of them.
    - Those datasets and analyses form an empirical sampling distribution, and we can compare our coefficient with this sampling distribution of coefficients from all the permuted datasets.
10. In other words …
    - We are preserving the dependence within rows / columns, but removing the relationship between the dependent and independent variables.
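The "scrambling" described above can be sketched as relabelling the nodes, that is, applying one permutation to both the rows and the columns of the dependent matrix. The snippet below is a minimal illustration under that assumption; the function name `permute_matrix` and the 4-node matrix `Y` are made up for the example and do not come from the slides.

```python
import random

def permute_matrix(matrix, perm):
    """Relabel nodes: entry (i, j) of the result is matrix[perm[i]][perm[j]]."""
    n = len(matrix)
    return [[matrix[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

# Toy symmetric tie matrix for four nodes (illustrative values only).
Y = [
    [0, 1, 0, 2],
    [1, 0, 3, 5],
    [0, 3, 0, 4],
    [2, 5, 4, 0],
]

rng = random.Random(0)          # fixed seed so the sketch is reproducible
perm = list(range(len(Y)))
rng.shuffle(perm)
Y_perm = permute_matrix(Y, perm)

# Each row keeps its own values together (they are only reordered and the
# row is moved), so the collection of sorted rows is unchanged.
rows_before = sorted(sorted(row) for row in Y)
rows_after = sorted(sorted(row) for row in Y_perm)
print(rows_before == rows_after)  # True
```

Because rows and columns move together, the structure inside the matrix survives, while the link between a given cell of the dependent matrix and the same cell of the explanatory matrices is broken.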
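11. Friendship, age, class: Friendship tie ≈ Age difference + Education

    Friendship tie:

    |   | A | B | C | D | E | F | G |
    |---|---|---|---|---|---|---|---|
    | A | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
    | B | 1 | 0 | 3 | 5 | 1 | 4 | 2 |
    | C | 0 | 3 | 0 | 4 | 5 | 8 | 10 |
    | D | 2 | 5 | 4 | 0 | 0 | 3 | 2 |
    | E | 1 | 1 | 3 | 0 | 0 | 2 | 2 |
    | F | 0 | 4 | 2 | 3 | 3 | 0 | 1 |
    | G | 0 | 2 | 1 | 2 | 2 | 1 | 0 |

    Age difference:

    |   | A | B | C | D | E | F | G |
    |---|---|---|---|---|---|---|---|
    | A | 0 | 1 | 0 | 2 | 1 | 0 | 0 |
    | B | 1 | 0 | 3 | 5 | 1 | 4 | 2 |
    | C | 0 | 3 | 0 | 4 | 5 | 8 | 10 |
    | D | 2 | 5 | 4 | 0 | 0 | 3 | 2 |
    | E | 1 | 1 | 3 | 0 | 0 | 2 | 2 |
    | F | 0 | 4 | 2 | 3 | 3 | 0 | 1 |
    | G | 0 | 2 | 1 | 2 | 2 | 1 | 0 |

    Education:

    |   | A | B | C | D | E | F | G |
    |---|---|---|---|---|---|---|---|
    | A | 0 | 1 | 0 | 2 | 1 | 0 | 0 |
    | B | 1 | 0 | 3 | 5 | 1 | 4 | 2 |
    | C | 0 | 3 | 0 | 4 | 5 | 8 | 10 |
    | D | 2 | 5 | 4 | 0 | 0 | 3 | 2 |
    | E | 1 | 1 | 3 | 0 | 0 | 2 | 2 |
    | F | 0 | 4 | 2 | 3 | 3 | 0 | 1 |
    | G | 0 | 2 | 1 | 2 | 2 | 1 | 0 |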
13. QAP procedure (shown with the same three matrices as slide 11: Friendship tie ≈ Age difference + Education)
    - Permutes the dependent variable many times and measures the sampling distribution of the coefficients.
    - The p-value is the proportion of permutations in which the permuted coefficient is at least as extreme as the observed coefficient.
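The procedure above can be sketched in plain Python as follows. This is a minimal illustration of the simple variant that permutes only the dependent matrix, not the implementation used by UCINET or by `netlm`, which may use refinements. All names (`mrqap`, `ols`, `solve`), the ages, the class labels and the noise-free toy friendship matrix are assumptions made up for the example.

```python
import random

def offdiag(m):
    """Flatten a square matrix into a list of its off-diagonal entries."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def permute(m, perm):
    """Apply the same node permutation to rows and columns."""
    n = len(m)
    return [[m[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def ols(y, xs):
    """OLS coefficients [b0, b1, ...] from the normal equations."""
    cols = [[1.0] * len(y)] + xs          # intercept column first
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def mrqap(y, x_mats, n_perm=500, seed=0):
    """Return (coefficients, permutation p-values for the slopes)."""
    rng = random.Random(seed)
    xs = [offdiag(x) for x in x_mats]
    observed = ols(offdiag(y), xs)
    counts = [0] * len(x_mats)
    nodes = list(range(len(y)))
    for _ in range(n_perm):
        rng.shuffle(nodes)
        permuted = ols(offdiag(permute(y, nodes)), xs)
        for k in range(len(x_mats)):
            # Two-sided: count permuted slopes at least as extreme.
            if abs(permuted[k + 1]) >= abs(observed[k + 1]):
                counts[k] += 1
    return observed, [c / n_perm for c in counts]

# Hypothetical toy data: six people with an age and a class label.
ages = [20, 22, 25, 30, 31, 40]
classes = [0, 0, 1, 1, 2, 2]
n = len(ages)
age_diff = [[abs(ages[i] - ages[j]) for j in range(n)] for i in range(n)]
same_class = [[int(classes[i] == classes[j]) for j in range(n)] for i in range(n)]
# Friendship built without noise, so OLS should recover -0.5 and 2.
friendship = [[0 if i == j else 10 - 0.5 * age_diff[i][j] + 2 * same_class[i][j]
               for j in range(n)] for i in range(n)]

coefs, p_values = mrqap(friendship, [age_diff, same_class])
print("coefficients:", [round(b, 3) for b in coefs])
print("p-values (age difference, same class):", p_values)
```

The diagonal is dropped before fitting because self-ties carry no information here, and the intercept gets no p-value because permuting the dependent matrix does not change its mean in any meaningful way. The number of permutations (`n_perm=500`) is an arbitrary choice for the sketch.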
14. QAP process – graph representation: before reshuffling, after
15. Available functions
    - UCINET: tools -> testing hypothesis -> dyadic -> regression (QAP)
    - R: library(statnet) -> netlm
    - C/Python?
16. Example 1 – there is no correlation
18. Example 2 – there is a correlation
20. Recap
    - QAP is useful when we have dyadic relationships in the data.
    - Use the netlm function in R for the regression analysis.
    - Disadvantage: it is slow for large networks.
21. References
    - Predicting with networks: nonparametric multiple regression analysis of dyadic data, D. Krackhardt (1988)
    - The SNA package, CT Butts (2014)
    - http://svitsrv25.epfl.ch/R-doc/library/sna/html/qaptest.html
    - http://www.stata.com/meeting/1nasug/simpson.pdf
    - http://www.erikgjesfjeld.net/uploads/3/7/6/8/37685481/sna_code_(gjesfjeld_and_phillips_2013).pdf