
Ethics in Library Research Data Services: Conceptual Gaps & Policy Vacuums


Prepared for the ALISE Webinar on "Ethics in Library Research Data Services," this presentation discusses some of the conceptual gaps and policy vacuums that emerge alongside the rise of big data-based research, and how these pose challenges for us as ethicists and as library practitioners.

Published in: Education
  • Researchers are hoping to advance our understanding of a phenomenon by making publicly available large datasets of user information they considered already in the public domain
  • Ethics in Library Research Data Services: Conceptual Gaps & Policy Vacuums

    1. Michael Zimmer, PhD
       Assistant Professor, School of Information Studies
       Director, Center for Information Policy Research
       University of Wisconsin-Milwaukee
       www.MichaelZimmer.org
    2. • The emergence of new technologies and technological environments often leads to CONCEPTUAL GAPS in how we think about ethical problems, and POLICY VACUUMS on how we can address them
       • Computer technology transforms “many of our human activities and social institutions,” and will “leave us with policy and conceptual vacuums about how to use computer technology”
       • “Often, either no policies for conduct in these situations exist or existing policies seem inadequate”
       • Jim Moor (1985). “What is Computer Ethics?”
       Michael Zimmer | ALISE IE Webinar | August 30, 2016
    3. • As ETHICISTS, we’re faced with new conceptual gaps in how we think about some of the most fundamental principles of research ethics, like privacy, consent, and harm
       • As PRACTITIONERS, we’re faced with new policy vacuums about how we are to help researchers obtain, store, and share datasets
    5. • In 2006, AOL released over 20 million search queries from 658,000 of its users to the public in an attempt to support academic research on search engine usage
       • Despite AOL’s attempts to anonymize the data, individual users remained identifiable based solely on their search histories, which included search terms matching users’ names, social security numbers, addresses, phone numbers, and other personally identifiable information
       • Upon being identified by The New York Times based solely on her search terms in the AOL database, a Georgia woman exclaimed, “My goodness it’s my whole personal life…I had no idea somebody was looking over my shoulder”
    6. • Harvard-based “Tastes, Ties, and Time” (T3) research project sought to understand social network dynamics of large groups of students
       • Worked with Facebook & an “anonymous” university to harvest the Facebook profiles of an entire cohort of college freshmen
       • Repeated each year for their 4-year tenure
       • NSF mandated release of data, first wave in Sept 2008
       “All the data is cleaned so you can’t connect anyone to an identity”
    7. • But the dataset had unique cases and codes, which made identifying the “anonymous” university trivial
       • It took me minimal effort to discern that the source was Harvard, and thus the anonymity (and privacy) of subjects in the study was jeopardized
    8. • Deal announced in 2010 that U.S. Library of Congress will archive all public tweets
       • At the time of the announcement, this meant 50 million new tweets per day, with a historical archive of approximately 170 billion tweets
       • 6-month delay for new tweets, restricted access to researchers only
       • Open questions:
           • Can users opt out from being in the permanent archive?
           • Can users delete tweets from the archive?
           • Will geolocational and other metadata be included?
           • What about a public tweet that is re-tweeting a private one?
           • Did users ever expect their tweets to become a permanent part of LOC’s archives?
       • 6 years later, archive still not available
    9. • Danish student researcher publicly released a dataset of nearly 70,000 users of the online dating site OkCupid, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site
    11. • As ETHICISTS, we’re faced with new conceptual gaps in how we think about some of the most fundamental principles of research ethics, like privacy, consent, and harm
        • As PRACTITIONERS, we’re faced with new policy vacuums about how we are to help researchers obtain, store, and share datasets
    12. • Presumption that because subjects make information available on OkCupid, Facebook, or Twitter, they don’t have an expectation of privacy
        • Researchers/IRBs might assume everything is always public, and was meant to be
        • Assumes no harm could come to subjects if data is already “public”
        • New ethical problems…
            • Need to track if ToS/architecture have changed, or if users even understand what is available to researchers
            • Ignores contextual nature of sharing
            • Fails to recognize that the strict public/private dichotomy doesn’t apply in a world of social & big data sets
    13. • Presumption that because something is shared or available within a community, the subject is consenting to it being harvested for research
        • Assumes users understand Terms of Service that might mention “research”
        • Assumes no harm can come from use of data already shared with friends or other contextually-bound circles
        • New ethical problems…
            • Must recognize that a user making something public online comes with a set of assumptions/expectations about who can access it and under what conditions
            • Must recognize how research methods might allow unanticipated access to “restricted” data
            • Users might not understand the technical conditions that enable access to their data, nor the legal complexities of ToS agreements
    14. • Presumption that “harm” means risk of physical or tangible impact on subject
        • Researchers often imply “data is already public, so what harm could possibly happen”
        • New ethical problems…
            • Must move beyond the concept of harm as requiring a tangible consequence
            • Protecting from harm is more than protecting from hackers, spammers, identity thieves, etc.
            • Consider dignity/autonomy theories of harm
            • Must a “wrong” occur for there to be damage to the subject?
            • Do subjects deserve control over the use of their data streams?
    15. • As ETHICISTS, we’re faced with new conceptual gaps in how we think about some of the most fundamental principles of research ethics, like privacy, consent, and harm
    16. “With the appearance of big data, open data, and particularly research data curation on many libraries’ radar screens, data service has become a critically important topic for academic libraries”
    17. • As ETHICISTS, we’re faced with new conceptual gaps in how we think about some of the most fundamental principles of research ethics, like privacy, consent, and harm
        • As PRACTITIONERS, we’re faced with new policy vacuums about how we are to help researchers obtain, store, and share datasets
    18. • Data librarians might be tasked with assisting in obtaining data sets for big data research
        • Searching repositories for existing data sets shared by others
        • Assisting with tools that scrape and collect data online
        • New challenges:
            • How do you confirm the provenance of data collected by others?
            • Should a data librarian help locate controversial datasets?
            • Can you use data that was later pulled from public accessibility?
            • Data that was hacked or stolen?
            • Can we ensure research subjects when scraping data with ad hoc scraping tools?
            • Should a data librarian require proof of IRB approval before assisting?
    19. • Data librarians commonly asked to help with storing and archiving research data
        • Maintain institutional data repository
        • Assist with drafting data management plans
        • New challenges:
            • Should library policy require de-identification of data prior to storing?
            • What kind of security must be in place? Simple access controls, or full data encryption?
            • Should any data be destroyed, rather than stored? How soon, and in what way?
    20. • Data librarians might act as gatekeepers for making institutional data sets available to others
        • New challenges:
            • What kind of access policies should be in place? What kind of limitations might be reasonable?
            • Should data be “scrubbed” or de-identified prior to making it publicly available?
            • How do we ensure secondary use is aligned with justification for the initial collection of data?
    21. • As ETHICISTS, we’re faced with new conceptual gaps in how we think about some of the most fundamental principles of research ethics, like privacy, consent, and harm
        • As PRACTITIONERS, we’re faced with new policy vacuums about how we are to help researchers obtain, store, and share datasets
    22. • Buchanan, E. and C. Ess (2009). Internet Research Ethics and the Institutional Review Board: Current Practices and Issues. Computers and Society, 39(3): 43–49.
        • Buchanan, E. and M. Zimmer (2016, Fall). “Internet Research Ethics”, The Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/archives/fall2016/entries/ethics-internet-research/
        • Markham, A., and Buchanan, E. (2012). Ethical Decision-Making and Internet Research: Recommendations from the AoIR Ethics Working Committee (Version 2.0). Association of Internet Researchers. http://aoir.org/reports/ethics2.pdf
        • Secretary’s Advisory Committee to the Office for Human Research Protections (SACHRP). “Considerations and Recommendations Concerning Internet Research and Human Subjects Research Regulations, with Revisions”. http://www.hhs.gov/ohrp/sites/default/files/ohrp/sachrp/mtgings/2013%20March%20Mtg/internet_research.pdf
        • Zimmer, M. (2010). “But the data is already public”: On the ethics of research in Facebook. Ethics and Information Technology, 12(4), 313–325.
        • Zimmer, M., and K. Kinder-Kurlanda (eds.) (forthcoming). Internet Research Ethics for the Social Age: New Challenges, Cases, and Contexts. New York: Peter Lang Publishing.
    23. Michael Zimmer, PhD
        Assistant Professor, School of Information Studies
        Director, Center for Information Policy Research
        University of Wisconsin-Milwaukee
        www.MichaelZimmer.org
