Social media & profiling
Anja Bechmann, Associate Professor, AU
Head of Digital Footprints Research Group
• Digital Footprints is a research group (at the Centre for Internet Research, Aarhus University, Denmark) interested in the data that users share, expose or trade when communicating through the internet. The research group is dedicated to collecting, analyzing and understanding digital footprints: the character of the footprints, the context(s) they form and in which they are given, and the purpose of the individual/group for sharing, exposing and trading data.
“ethno-mining is unique in its integration of ethnographic and data mining techniques. This integration is carried out in iterative loops between the formation of interpretations of the data and the development of processes for validating those interpretations. (…) There are two key characteristics of the iterative loops in ethno-mining. First, they can be separated into three categories based on the amount of a priori knowledge used to find and validate interpretations of the data. Second, the results of the iterative loops are frequently, although not exclusively, represented in visualizations. Visualizations have two basic affordances: they can represent both quantitative and qualitative analyses, and they exploit the visual system to support more comprehensive data analysis, particularly pattern finding and outlier detection. (…) our method seeks to expose and explicitly address the selection biases in both qualitative and quantitative research methods by checking them against one another. Ethno-mining extends its scrutiny of these biases beyond simply comparing the biases embedded in standard qualitative and quantitative techniques. It does so by tightly integrating the techniques in loops, generating mutually informed analysis techniques with complementary sets of biases.”
“Ethno-Mining: Integrating Numbers and Words from the Ground Up” by R. Aipperspach, T. Rattenbury, A. Woodruff, K. Anderson, J. Canny, P. Aoki.
Data Rush
• ecstatic about the data you can receive
• ecstatic about the targeting you can do and/or the (predictive) conclusions you can make
Dirty Data
• “Although some of the likes clearly relate to their predicted attribute, as in the case of No H8 Campaign and homosexuality, other pairs are more elusive; there is no obvious connection between Curly Fries and high intelligence.”
• “Sexual orientation was assigned using the Facebook profile “Interested in” field; users interested only in others of the same sex were labeled as homosexual (4.3% males; 2.4% females), whereas those interested in users of the opposite gender were labeled as heterosexual.”
• “Private traits and attributes are predictable from digital records of human behavior” by Michal Kosinski, David Stillwell, and Thore Graepel: http://www.pnas.org/content/early/2013/03/06/1218772110.full.pdf
Contextualizing API data
• Three-fold contextualization:
  • system (data flow, programming interface)
  • user interface
  • user
API profiling?
• Analyzing location and check-in patterns
• What are the content and topic patterns of different clusters?
• How do different groups behave in terms of frequency and daily patterns of likes, status updates, links etc.?
• Determining ethnicity from pictures
• etc. etc. etc...
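A minimal sketch of the "frequency and daily patterns" question, assuming activity has already been exported as simple (group, timestamp, type) records. The record layout, the group labels and the function name `activity_profile` are all hypothetical; real API exports differ per platform, and counts like these still need the system, interface and user context before they say anything about behavior.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical, hand-made records: (group label, ISO timestamp, activity type).
# This layout is an assumption, not the format of any particular API.
SAMPLE = [
    ("DK", "2013-03-04T08:15:00", "like"),
    ("DK", "2013-03-04T08:40:00", "status"),
    ("DK", "2013-03-04T21:05:00", "like"),
    ("US", "2013-03-04T21:30:00", "link"),
    ("US", "2013-03-05T21:10:00", "like"),
]


def activity_profile(records):
    """Count activity per group, once by type and once by hour of day."""
    by_type = defaultdict(Counter)
    by_hour = defaultdict(Counter)
    for group, timestamp, kind in records:
        by_type[group][kind] += 1
        by_hour[group][datetime.fromisoformat(timestamp).hour] += 1
    return by_type, by_hour


by_type, by_hour = activity_profile(SAMPLE)
for group in sorted(by_type):
    print(group, dict(by_type[group]), dict(by_hour[group]))
```

The two counters are kept separate so that frequency (how much of each activity) and daily rhythm (when it happens) can be compared across groups independently.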
My API Research Projects
• The Social Library (Danish Agency for Culture in collaboration with REDIA)
• Trust and Privacy on FB in DK & IT (in collaboration with Matteo Magnani)
• Personal data sharing on Facebook among high school/college students (18-20) in DK & US (DigiHumLab Denmark)
Privacy filter
• The students studied use groups as their primary privacy filter
“it’s because it’s our private space [Danish: forum]. We cannot see each other every day so this is a way of keeping up to date with each other...with things not everyone should know...if you have met a guy or if we are going to meet”
• Not personal data (on timeline): “I only post things that I am not ashamed of [fit self-image, ed.] and then I don’t care who sees it”
• Personal data: “sad things” (death) and “things that you do not want to be confronted with”, private address and account info (in US also drinking and religion) and not comment on historical data so that they
The use of data
• What do they think of companies accessing data through apps (average 60 apps in sample)?
• (Not oriented towards apps as companies, more as services or employers trying to profile them; not concerned)
• Showing them what we draw -> “it is taken out of context...it is a little bit silly now [facerape prank, ed.]”
• Things that they think are private should not be made public
• Understanding (transparency) leads to acceptance according to the users, and when in doubt about what an app does they avoid it (unless referred to it many times through friends) (according to themselves)
• Increase in digital footprints from a few unconnected services/competitive platforms
• Data as a commodity results in less open APIs
• Time-consuming and expensive to integrate in order to make contextualized profiling
THANK YOU.
Anja Bechmann, Assoc. Professor, Head of Digital Footprints Research Group, AU
+45 5133 5138
firstname.lastname@example.org // @anjabechmann
Digitalfootprints.dk // @digifeet
(Visiting researcher from 1.8.12-1.8.13, DIKU Copenhagen)