Practicing Data Science Responsibly
Rahul Bhargava
Research Scientist
Civic Media Group - MIT Media Lab
rahulb@media.mit.edu
@rahulbot
Deloitte Data Science Forum
June 2, 2016
Rahul Bhargava
MIT Center for Civic Media
rahulb@mit.edu
@rahulbot
Discrimination in Online Ad Delivery
Latanya Sweeney
Harvard University
Riding with the Stars
Anthony Tockar
Neustar, 2014
InCoding
Joy Buolamwini
MIT Media Lab, 2016
responsible data creation
responsible data impacts
responsible data use
InCoding
Facebook news feed
Facebook emotion study
US NSA SkyNet program
etc.
Google ad results
Tiger Mom Tax – ProPublica
Amazon same-day delivery
etc.
NYC Taxi records
OKCupid profiles
etc.
consumer bill of rights
data breach legislation
protect students
identify discrimination
revise ECPA
and more…
new ethics review standards
data-aware grant making
case studies & curricula
spaces to talk about this
standards for data-sharing
and more…
Rahul’s recommendations:
define & maintain your org’s values
do algorithmic QA
set up internal & external review boards
innovate with others in your field to create norms
Are you being responsible?
Turn to your neighbor and talk about how you are approaching this. Do you have strategies for being responsible in the creation, impact, and use of your data science work? What is working for you, and what isn’t?

Editor's Notes

  • #2 I’m Rahul Bhargava, and I work on data literacy at the Center for Civic Media. There is lots of talk about this in the humanitarian space, but less in the corporate world. I’m borrowing the “responsible” framing from my friends at the Engine Room.
  • #3 You have probably heard about the Facebook news feed flare-up. I don’t want to debate whether Facebook is being “responsible” or not, but there is clearly a societal expectation of responsibility. People think algorithms are neutral, but they very much are not. Algorithms are artifacts of the cultural context of their creators and of the world in which they operate.
  • #4 This can be risky. Three irresponsible examples: (1) machine learning algorithms trained on white people (Joy photo), the “coded gaze”; (2) Amazon and delivery to low-income neighborhoods; (3) de-anonymized NYC Taxi records and differential privacy.
  • #5 I like to think about this as being responsible in three different ways: responsible data creation, responsible data impacts, and responsible data use. Spend a little time describing each of these.
  • #6 So what do we do about this? Are there best practices or norms? There is little regulation in the US right now. The White House wrote up a report with some recommendations: a consumer bill of rights, a data breach legislation standard, protections for non-US persons, student data, stopping discrimination, and ECPA revision. The Council for Big Data, Ethics, and Society just released its report, with recommendations for policy, pedagogy, network building, and further research, such as expanding the Common Rule to cover data science and new approaches to ethics review. “Ethics” is a scary word, especially when it comes to regulation.
  • #7 Apply your existing corporate values to data work: train your staff on this. Do what Latanya Sweeney and others (Christian Sandvig) are doing: algorithmic reverse engineering. Set up an internal review board that any data-related project needs to be approved by. I was just at a Stanford event with the folks from FB who do this.
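
The "do algorithmic QA" recommendation could take many forms, and the talk does not prescribe one. As one minimal sketch, the Python below compares how often a system's decisions favor each group and reports the ratio of the lowest rate to the highest. The group labels, the sample decisions, and the 0.8 threshold (borrowed from the "four-fifths" rule of thumb used in US employment guidance) are all assumptions for illustration, not something taken from the slides.

```python
from collections import defaultdict


def selection_rates(records):
    """Return {group: share of records with a positive decision}.

    `records` is an iterable of (group, selected) pairs, where
    `selected` is True when the system made a favorable decision.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        if selected:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest group rate to the highest (1.0 means parity).

    Returns None when no group received any favorable decision.
    """
    highest = max(rates.values())
    if highest == 0:
        return None
    return min(rates.values()) / highest


# Hypothetical audit data: group "A" is approved 8 times out of 10,
# group "B" only 4 times out of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)

# Assumed threshold: flag for human review when the ratio is below 0.8.
needs_review = ratio is not None and ratio < 0.8
print(rates, ratio, needs_review)
```

A check like this only surfaces a disparity; deciding whether it is justified, and what to do about it, is the kind of question the review boards recommended above would take up.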