Ethical challenges for online social science research: Networks, rentals and confessionals

Presentation at the 5th International Conference on e-Social Science, part of a workshop on the law and ethics of e-Social Science research. It outlines three domains I am currently researching and some of the ethical issues I have encountered, including reporting on a third party (Facebook), deception (craigslist) and information access (grouphug.us).

  1. Ethical challenges for online social science research: Networks, Rentals and Confessionals
     Bernie Hogan, Research Fellow, Oxford Internet Institute
     NCeSS - 5th International Conference on e-Social Science
     June 24, 2009. Cologne, Germany
     Wednesday, June 24, 2009
  2. Three unethical studies?
     • Facebook network research
     • Craigslist audit study
     • Grouphug.us
  3. Facebook.com
  4. What are the techniques?
     • Spidering - Technically fussy, often considered inappropriate by data controller
     • API - Technically restrictive, gives false sense of data ownership (See Facebook Developer Terms of Use Section 2.A.6)
     • Datadump - Facebook gives you the data
     • Someone else’s application - May not give data, but only a picture.
     • Handcoding - Spidering for masochists
  5. Who gets the data?
     • Golder, S., Wilkinson, D. M., and Huberman, B. A. (2007). Rhythms of social interaction: Messaging within a massive online network. In 3rd International Conference on Communities and Technologies, East Lansing, MI. Springer.
     • Traud, A., Kelsic, E., Mucha, P., and Porter, M. (2008). Community structure in online collegiate networks. Working paper.
     • Lewis, K., Kaufman, J., Gonzalez, M., Wimmer, A., and Christakis, N. (2008). Tastes, ties, and time: A new social network dataset using facebook.com. Social Networks, 30(4):330–342.
  6. But isn’t it anonymous? No.
     • Backstrom, L., Dwork, C., and Kleinberg, J. (2007). Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and structural steganography. In Proceedings of the 16th International Conference on World Wide Web, pages 181–190. ACM, New York, NY, USA.
     • Direct attack needs ~ sqrt(log(n)) nodes.
     • Narayanan, A. and Shmatikov, V. (2009). De-anonymizing social networks. Forthcoming: IEEE C&S.
     • Starting with even less and matching to existing network can get over 90% of the network accurately.
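To give a feel for the sqrt(log(n)) figure on slide 6, the sketch below evaluates it for a few network sizes. It is an illustration only: the natural logarithm and a constant factor of 1 are assumptions, since the slide gives an order of magnitude rather than an exact count, and `nodes_needed` is a hypothetical helper name.

```python
import math


def nodes_needed(n: int) -> float:
    """Evaluate sqrt(log(n)) for a network of n nodes.

    Assumptions (not from the slide): natural logarithm, constant factor 1.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return math.sqrt(math.log(n))


if __name__ == "__main__":
    # Compare a thousand-, million- and billion-node network.
    for n in (10**3, 10**6, 10**9):
        print(f"n = {n:>13,}: sqrt(log(n)) = {nodes_needed(n):.2f}")
```

Under these assumptions the value stays below 5 even for a billion-node network, which is the point of the bullet: the number of nodes an attacker needs grows very slowly with network size.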
  7. Or simply use this guy
     Zimmer, Michael. 2009. “But the Data is Already Public”: On the Ethics of Research in Facebook. 8th International Conference of Computer Ethics: Philosophical Enquiry. Corfu, Greece.
  8. The only anonymous network is one where you don’t know the network structure. This is unrealistic.
  9. So what’s the precedent?
     • Personal networks with informed consent.
     • Name generators have historically asked individuals to report data on their friends.
     • They jump through an ethical loophole vis-à-vis the fact that this is recall data.
     • Information networks, however, permit not only data created by an individual, but also friend-of-a-friend data that is merely accessible to, not created by, the respondent.
  10. Facebook properties enable you to report on your friends to a third party.
      [Diagram labels: Respondent, Friend 1, ?, Friend 2]
  11. [No text on this slide]
  12. craigslist.org
  13. Methods
      • This is a University of Toronto ethics board-approved audit study.
      • We selected craigslist.org, a highly popular free online classifieds site.
      • From March to June 2007 we selected approximately 10 new ads each day for inclusion in the study.
      • Each landlord was emailed 5 messages. Each message included one of five ethnicities randomly assigned with one of five message bodies. Each experiment used one gender only.
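The random assignment described on slide 13 can be sketched as follows. This is a minimal illustration, not the study's actual code: it assumes each of the five message bodies is used exactly once per landlord (a random one-to-one pairing with the five ethnicities), the ethnicity labels are taken from the table on slide 18, and `MESSAGE_BODIES` and `assign_messages` are hypothetical placeholders.

```python
import random

# Ethnicity labels as they appear in the results table (slide 18).
ETHNICITIES = ["Arab", "Black", "SE Asian", "Caucasian", "Jewish"]

# Placeholder identifiers; the real message texts are not given in the slides.
MESSAGE_BODIES = [f"body_{i}" for i in range(1, 6)]


def assign_messages(gender: str, rng: random.Random) -> list[dict]:
    """Pair every ethnicity with a distinct, randomly chosen message body.

    All five messages for one landlord share a single gender, matching
    "each experiment used one gender only".
    """
    bodies = MESSAGE_BODIES[:]
    rng.shuffle(bodies)  # random one-to-one pairing (an assumption)
    return [
        {"gender": gender, "ethnicity": ethnicity, "body": body}
        for ethnicity, body in zip(ETHNICITIES, bodies)
    ]


if __name__ == "__main__":
    for message in assign_messages("female", random.Random(2007)):
        print(message)
```

Shuffling the bodies rather than drawing them independently keeps any single message text from being sent twice to the same landlord, which is one plausible reading of the slide.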
  14. [Annotated screenshot of a rental ad]
      1. Price and number of bedrooms almost always in header.
      2. Masked email address.
      3. Well-formed date.
      4. PostingID - key to linking data.
      5. Link to well-formed Google map, or failing that, nearest intersection.
  15. [Annotated screenshot of the message-sending tool]
      • Jitter means that messages are sent at a random time within "5" minutes of the specified time. Makes batches of messages look more realistic.
      • We send messages out one day after the posting (rather than immediately) at short regular intervals. The parameters can be tuned.
      • By default we alternate between male and female names.
      • This window shows the five name / message combinations that will be sent out.
  16. [Annotated screenshot of a sent message]
      • Date
      • Email address.
      • 1 of 5 different message bodies.
      • 1 of 5 female Arabic names
      • Secret posting ID: ddhfegjfb = 337546951
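The single example on slide 16 (ddhfegjfb = 337546951) is consistent with a digit-to-letter substitution in which a stands for 0, b for 1, and so on up to j for 9. The slide does not state the scheme, so the mapping below is an inference from that one example, and the function names are hypothetical.

```python
# Assumed mapping, inferred from one example: a=0, b=1, ..., j=9.
LETTERS = "abcdefghij"


def encode_posting_id(posting_id: str) -> str:
    """Turn a numeric posting ID into its letter code."""
    if not posting_id.isdigit():
        raise ValueError("posting_id must contain only digits")
    return "".join(LETTERS[int(digit)] for digit in posting_id)


def decode_posting_id(code: str) -> str:
    """Recover the numeric posting ID from its letter code."""
    try:
        return "".join(str(LETTERS.index(letter)) for letter in code)
    except ValueError:
        raise ValueError("code may only contain the letters a-j") from None


if __name__ == "__main__":
    print(decode_posting_id("ddhfegjfb"))
    print(encode_posting_id("337546951"))
```

Embedding the ID this way lets a reply be matched back to its ad (the PostingID is the key to linking data) without a visible number in the message.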
  17. Map of rentals in Greater Toronto Area
      Geographic distribution of rental ads (97% showing)
  18. Ranked responses for names by ethnicity and gender
      • We ranked each of the 50 names from 1 (least responses) to 50 (most responses).
      • The table shows the sum of the ranks for all 5 names used in each ethnicity-gender combination.

                      Male   Female
        Total          519      756
        Arab            31      113
        Black           97      129
        SE Asian        88      179
        Caucasian      146      164
        Jewish         157      171
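A quick consistency check on the rank sums from slide 18: if 50 names are ranked 1 to 50, all ranks together must sum to 50 × 51 / 2 = 1275, and the sum for any group of 5 names must lie between 15 and 240. The sketch below verifies both against the figures on the slide. The variable and function names are hypothetical; only the numbers come from the slide.

```python
# Rank sums per ethnicity-gender combination, copied from slide 18.
RANK_SUMS = {
    "Arab": {"Male": 31, "Female": 113},
    "Black": {"Male": 97, "Female": 129},
    "SE Asian": {"Male": 88, "Female": 179},
    "Caucasian": {"Male": 146, "Female": 164},
    "Jewish": {"Male": 157, "Female": 171},
}
N_NAMES = 50          # names ranked in total
NAMES_PER_GROUP = 5   # names per ethnicity-gender combination


def column_totals(table: dict) -> dict:
    """Sum the rank sums down each gender column."""
    totals: dict = {}
    for row in table.values():
        for column, value in row.items():
            totals[column] = totals.get(column, 0) + value
    return totals


def rank_sum_bounds(n_names: int, group_size: int) -> tuple:
    """Smallest and largest possible rank sum for one group."""
    lowest = sum(range(1, group_size + 1))
    highest = sum(range(n_names - group_size + 1, n_names + 1))
    return lowest, highest


if __name__ == "__main__":
    totals = column_totals(RANK_SUMS)
    print(totals)                                            # column sums
    print(sum(totals.values()), N_NAMES * (N_NAMES + 1) // 2)  # grand total vs. 1 + ... + 50
    print(rank_sum_bounds(N_NAMES, NAMES_PER_GROUP))         # possible range per group
```

The column sums reproduce the 519 and 756 shown on the slide and add up to 1275, so the table is internally consistent. Against the possible range of 15 to 240, the Arab male sum of 31 sits close to the minimum, meaning those five names were among the least-answered of all 50.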
  19. Issues
      • Racism is often difficult to assess through direct questioning.
      • Deception in this study is necessary.
      • There is no direct personal harm, and no direct manipulation.
  20. grouphug.us
  21. Online confessional site
      • What constitutes anonymity?
      • Grouphug is a website of approximately one million posts (approximately 95% unique).
      • Does not store IP, actively discourages quoting other posts and encodes the entries in non-sequential strings (timestamps exist but are hidden)
  22. Nothing here to see... (catch 22)
  23. Ok, here are some examples
      • “I am so happy that I can confess again. I don't even care about seeing my confessions on here, it's just the feeling of getting it off your chest and sending it away!” (136158003)
      • “I pee in the shower because I hate everyone I live with.” (255678370)
  24. Some worse examples
      • “I paid my friend 200 dollars to do over 400 pages of homework for the year, so that i can ditch school as much as i want, while lying to my mother and saying im still going to school” (194778021)
      • “I have HPV, its a std. I have known about it for 7 years, but that has not stopped me from having sex with 9 people with out a condom. 4 of the girls where married. I have never told anyone about my std. I have no idea how many people are infected because of me, it keeps me up at night.” (275447713)
  25. So...
      • Do we ignore anonymous confessionals as too toxic, or treat them as insight into the id?
      • Can we even analyze this data or merely view it as passive bystanders? Are there legal implications, especially dealing with data designed to resist tracking? What is my responsibility if I can do nothing to follow up (or even confirm the veracity of the statement)?
  26. Summary
      • Facebook - the ethics of capturing someone else’s relationships is ambiguous. The network I see is not mine - it is what I am allowed to see. I defer to Facebook’s terms of use.
      • Craigslist - the ethics of understanding racism as it actually operates online is problematic. I defer to utilitarian arguments and approval from the ethics board.
      • Grouphug - the ethics of viewing and storing, let alone analyzing, confessionals is ambiguous. How can we assure no personally identifying information without looking for it? How can we anonymize a million entries?
  27. Opportunities
      • We can get unprecedented access to society in the wild.
      • But is this fair? Is it justified?
      • How close to ‘the social good’ must one be to justify this work?
  28. Thank You
      Bernie Hogan
      bernie.hogan@oii.ox.ac.uk
