Studying Facebook via Data Extraction: a Netvizz tutorial at the Digital Methods Summer School 2013
Studying Facebook via Data Extraction: The Netvizz Application. Bernhard Rieder, Universiteit van Amsterdam, Mediastudies Department
Overview
Compared to Twitter, Facebook is difficult to study through data extraction but also has important advantages:
☉ complicated API, very complex and opaque privacy regime, constant changes, etc.
☉ rich and detailed data, access to full timelines, etc.
Goal: lower the threshold for working with quantitative and computational approaches, thereby fostering transversal thinking; open the walled garden.
Netvizz is a Facebook application that exports a variety of data files in common formats for a variety of sections of the Facebook platform.
Humanists and social scientists are often interested in descriptive statistics rather than models or advanced metrics; data stays close to the medium.
Two kinds of quantitative analysis
Statistics: observed: objects and properties; inferred: relations; data representation: the table; visual representation: quantity charts; grouping: class (similar properties).
Graph theory: observed: objects and relations; inferred: structure; data representation: the matrix; visual representation: network diagrams; grouping: clique (dense relations).
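The contrast between the two data representations can be made concrete with a small sketch. The toy users, property names, and helper functions below are invented for illustration and are not part of Netvizz; the point is only that the table stores objects with properties, while the matrix stores relations between objects.

```python
# Toy data, invented for illustration only.

# Statistics view: a table of objects (rows) and properties (columns).
table = [
    {"user": "a", "posts": 12, "language": "en_us"},
    {"user": "b", "posts": 3, "language": "de"},
    {"user": "c", "posts": 7, "language": "en_us"},
]


def group_by(rows, key):
    """Group users into classes that share the same value for `key`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row["user"])
    return groups


# Graph view: relations between the same objects, stored as a matrix.
users = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]


def adjacency_matrix(nodes, links):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in links:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix


classes = group_by(table, "language")
matrix = adjacency_matrix(users, edges)
```

Grouping in the table happens by shared property values; grouping in the matrix would instead look for densely connected rows and columns.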
Personal network. Nodes: users / links: "friendship". Good starting point for learning network analysis.
Personal "like" network. Nodes: users & liked objects ("bipartite graph") / links: "liking". A post-demographical view on social relations and culture.
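A bipartite graph of this kind can be sketched in a few lines. The user and page names below are invented, and the co-like count is one possible way to read such a network (which objects are liked by the same users), not a feature the slides attribute to Netvizz.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each user maps to the objects they liked.
likes = {
    "user_1": ["page_x", "page_y"],
    "user_2": ["page_y", "page_z"],
    "user_3": ["page_x", "page_y"],
}


def bipartite_edges(likes_by_user):
    """One edge per (user, liked object); no user-user or object-object links."""
    return [(user, obj) for user, objs in likes_by_user.items() for obj in objs]


def co_like_counts(likes_by_user):
    """Count how many users like both objects of each object pair."""
    counts = Counter()
    for objs in likes_by_user.values():
        for a, b in combinations(sorted(set(objs)), 2):
            counts[(a, b)] += 1
    return counts


edge_list = bipartite_edges(likes)
shared = co_like_counts(likes)
```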
FB group "Islam is dangerous": friendship network, color: betweenness centrality. 2,339 members; average degree of 39.69; 81.7% have at least one friend in the group, 55.4% have five or more, 37.2% have 20 or more; the founder and admin has 609 friends.
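Descriptive figures like the average degree and the share of members with at least k friends in the group can be derived directly from an edge list. The sketch below assumes each friendship appears once as an undirected pair between group members; the function name and toy data are hypothetical, and betweenness centrality is left to a network package.

```python
from collections import defaultdict


def degree_stats(members, friendships, thresholds=(1, 5, 20)):
    """Return the average degree and, for each threshold k, the share of
    members with at least k friends inside the group."""
    degree = defaultdict(int)
    for u, v in friendships:
        degree[u] += 1
        degree[v] += 1
    n = len(members)
    average = sum(degree[m] for m in members) / n
    shares = {
        k: sum(1 for m in members if degree[m] >= k) / n for k in thresholds
    }
    return average, shares


# Toy group: "d" is a member without friends in the group.
members = ["a", "b", "c", "d"]
friendships = [("a", "b"), ("a", "c"), ("b", "c")]
average, shares = degree_stats(members, friendships, thresholds=(1, 3))
```

With real data, the default thresholds (1, 5, 20) correspond to the three percentages reported on the slide.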
FB group "Islam is dangerous": friendship network, color: interface language. en_us, de, en_uk, it dominate.
Mapping European Extremism (aggregate groups): friendship relations of 18 extreme-right groups. User names are unique! (Gephi can fuse networks)
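Because user identifiers are unique across groups, several group networks can be fused by taking the union of their nodes and edges. The following is a minimal sketch of that idea with an invented data layout; it is not how Gephi implements the merge.

```python
def fuse_networks(networks):
    """Merge several undirected networks that share a unique node identifier.

    Each network is a dict with "nodes" (iterable of ids) and "edges"
    (iterable of id pairs). Users appearing in several groups collapse into
    one node; duplicate edges are kept once.
    """
    nodes = set()
    edges = set()
    for network in networks:
        nodes.update(network["nodes"])
        for u, v in network["edges"]:
            edges.add(tuple(sorted((u, v))))  # normalize undirected pairs
    return {"nodes": nodes, "edges": edges}


# Hypothetical toy groups: user "u2" is a member of both.
group_a = {"nodes": ["u1", "u2"], "edges": [("u1", "u2")]}
group_b = {"nodes": ["u2", "u3"], "edges": [("u3", "u2"), ("u2", "u1")]}
fused = fuse_networks([group_a, group_b])
```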
FB group "Islam is dangerous": interaction network
Facebook Page "ElShaheeed", June 2010 – June 2011 (Poell / Rieder, forthcoming): 7K posts, 700K users, 3.6M comments, 10M likes. Work in progress!
Data and tools: New media platforms funnel practices into reduced and largely formal "grammars of action" (Agre 1989); data is therefore very clean, very complete, and very detailed. It can be imported with great ease into standard packages for statistical analysis (e.g. R, Excel, RapidMiner) or network analysis (e.g. Gephi, Pajek).
FB Page "ElShaheeed", June 2010 – June 2011: comment timescatter
FB Page "ElShaheeed", June 2010 – June 2011: comment timescatter, log10 y scale
FB Page "ElShaheeed", June 2010 – June 2011: comment timescatter, log10 y scale, likes on comments
FB page "Stop the Islamization of the World": number of posts and reactions
Facebook Page "ElShaheeed", June 2010 – June 2011: scatterplot comments / likes, per post type
FB page "Educate children about the evils of Islam": 1,586 likes; 253 users commenting on or liking the last 200 posts.
FB page "Educate children about the evils of Islam": links have more comments, photos more likes.
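A comparison of reactions per post type comes down to grouping posts by type and averaging their counts. The sketch below uses invented numbers and hypothetical field names ("type", "comments", "likes"); it only illustrates the kind of descriptive statistic behind the observation above.

```python
from collections import defaultdict

# Toy posts, invented for illustration; not data from the page above.
posts = [
    {"type": "link", "comments": 10, "likes": 4},
    {"type": "link", "comments": 6, "likes": 2},
    {"type": "photo", "comments": 2, "likes": 12},
    {"type": "photo", "comments": 4, "likes": 8},
]


def mean_reactions_by_type(rows):
    """Average number of comments and likes per post, grouped by post type."""
    totals = defaultdict(lambda: {"n": 0, "comments": 0, "likes": 0})
    for row in rows:
        bucket = totals[row["type"]]
        bucket["n"] += 1
        bucket["comments"] += row["comments"]
        bucket["likes"] += row["likes"]
    return {
        post_type: {
            "comments": bucket["comments"] / bucket["n"],
            "likes": bucket["likes"] / bucket["n"],
        }
        for post_type, bucket in totals.items()
    }


means = mean_reactions_by_type(posts)
```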
FB pages of New York Times and Wall Street Journal (aggregate pages): 30 latest posts, 27K users liking or commenting (user ids are unique!)
Facebook page like network. Seed: Stop Islamization of the World. Crawl depth: 2
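A crawl with a seed and a depth limit can be sketched as a breadth-first traversal. The `liked_pages` lookup below is a hypothetical in-memory stand-in for whatever source returns the pages a page likes; the slides do not describe how Netvizz performs the crawl, so this is an illustration of the general technique only.

```python
from collections import deque


def crawl_like_network(seed, liked_pages, max_depth=2):
    """Breadth-first crawl of page-likes-page links up to `max_depth` steps
    from the seed. Returns each reached page's depth and the directed edges."""
    depth = {seed: 0}
    edges = []
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if depth[page] >= max_depth:
            continue  # pages at the depth limit are kept but not expanded
        for target in liked_pages.get(page, []):
            edges.append((page, target))
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth, edges


# Hypothetical toy lookup: page -> pages it likes.
liked_pages = {"seed": ["a", "b"], "a": ["c"], "b": ["a"], "c": ["d"]}
depth, edges = crawl_like_network("seed", liked_pages, max_depth=2)
```

In the toy data, page "d" is three steps from the seed and therefore never enters the network at crawl depth 2.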
Studying extremism on Facebook
Some examples from the Digital Methods Initiative's data sprint on anti-Islamism and right-wing extremism. Four aspects of SNS we wanted to study:
☉ Coordination, social networking, and social support for extremists
☉ Broadcasting and mobilization channel for extremists
☉ Expressions from diffuse publics
☉ Debate and encounter around Islam
Conclusions
Netvizz exports a variety of data files in common formats for a variety of sections of the Facebook platform and can be used in many different research designs.
Netvizz attempts to lower the threshold for quantitative work on Facebook, allowing for closer connections with qualitative, interpretative thinking.
Easy access to visualization techniques is crucial for this approach.
Thank You
https://apps.facebook.com/netvizz/
email@example.com
https://www.digitalmethods.net
http://thepoliticsofsystems.net
"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (Tukey 1962)