history and ethics of data

Talk delivered 2019-06-25 as part of the Summer Institute in Computational Social Science, held at Princeton University https://compsocialscience.github.io/summer-institute/2019/

Video: https://www.youtube.com/watch?v=0suLWheVji0


title:
what should future statisticians, CEOs, and senators know about the
history and ethics of data?

abstract:

What should our future statisticians, senators, and CEOs know about the history and ethics of data?
How might understanding that history provide tools and resources to future citizens navigating a future shaped by data-empowered algorithms?
I'll present content from a class co-developed over the past several years with Professor Matt Jones of Columbia's Department of History, based on material absent from the curricula for both future technologists and future humanists.
The intellectual arc traces from the 18th century to the present day, beginning with examples of contemporary technological advances, disquieting ethical debates, and financial success powered by panoptic persuasion architectures.

Published in: Education

history and ethics of data

  1. 1. what should future statisticians, CEOs, and senators know about the history and ethics of data? chris.wiggins@columbia.edu & chris.wiggins@nytimes.com & chris.wiggins@hackNY.org @chrishwiggins data-ppf.github.io
  3. 3. what should future statisticians, CEOs, and senators know about the history and ethics of data? chris.wiggins@columbia.edu & joint work with: Matt Jones, Department of History, Columbia @nescioquid data-ppf.github.io
  4. 4. what should future statisticians, CEOs, and senators know about the history and ethics of data?
  5. 5. what should future statisticians, CEOs, and senators know about the history and ethics of data? 1. why history?
  6. 6. what should future statisticians, CEOs, and senators know about the history and ethics of data? 1. why history? 2. why ethics?
  7. 7. what should future statisticians, CEOs, and senators know about the history and ethics of data? 1. why history? 2. why ethics? 3. what we taught
  8. 8. what should future statisticians, CEOs, and senators know about the history and ethics of data? 1. why history? 2. why ethics? 3. what we taught 4. what we learned
  9. 9. what should future statisticians, CEOs, and senators know about the history and ethics of data? 0. preamble: class origin story 1. why history? 2. why ethics? 3. what we taught 4. what we learned
  11. 11. 0. preamble: class origin story
  12. 12. 0. preamble: class origin story $?
  13. 13. 0. preamble: class origin story (we got the grant, btw)
  16. 16. 0. preamble: class origin story 1. why history? 2. why ethics? 3. what we taught 4. what we learned what should future statisticians, CEOs, and senators know about the history and ethics of data?
  17. 17. 1. why history?
  19. 19. 1. why history? the onion dot com
  20. 20. 1. why history? "Quantum Mechanics", P J E Peebles, 1992
  23. 23. 1. why history? "Quantum Mechanics", P J E Peebles, 1992 truth is contested
  26. 26. what should future statisticians, CEOs, and senators know about the history and ethics of data? 0. preamble: class origin story 1. why history? 2. why ethics? 3. what we taught 4. what we learned
  27. 27. 2. why ethics?
  28. 28. 2. why ethics? something is wrong on the internet
  29. 29. 2. why ethics? 2017-09-05: cathy o’neil something is wrong on the internet
  30. 30. 2. why ethics? 2017-09-05: cathy o’neil 2018-01-08: safiya noble something is wrong on the internet
  31. 31. 2. why ethics? 2017-09-05: cathy o’neil 2018-01-08: safiya noble 2018-01-23: virginia eubanks something is wrong on the internet
  32. 32. 2. why ethics? 2017-09-05: cathy o’neil 2018-01-08: safiya noble 2018-01-23: virginia eubanks 2019-01-15: shoshana zuboff (among increasingly many others) something is wrong on the internet
  33. 33. 2. why ethics? 1. fuzzies 2. techies
  35. 35. what should future statisticians, CEOs, and senators know about the history and ethics of data? 0. preamble: class origin story 1. why history? 2. why ethics? 3. what we taught 4. what we learned
  36. 36. (really this is just weeks 3-11)
  40. 40. close each week with:
  41. 41. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?)
  42. 42. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?) - role of
  43. 43. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?) - role of 1. rights
  44. 44. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?) - role of 1. rights 2. harms
  45. 45. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?) - role of 1. rights 2. harms 3. justice
  46. 46. close each week with: - how did new capabilities rearrange power? (who can now do what, from what, to whom?) - role of 1. rights 2. harms 3. justice (week 1 & 2 had plenty of harms+injustice)
  47. 47. github.com/data-ppf/data-ppf.github.io/wiki/Syllabus 3. what we taught: 14 weeks: Tuesday discussion
  48. 48. 1 intro 2 setting the stakes 3 risk and social physics 4 statecraft and quantitative racism 5 intelligence, causality, and policy 6 data gets real: mathematical baptism 7 WWII, dawn of digital computation 8 birth and death of AI 9 big data, old school (1958-1980) 10 data science, 1962-2017 11 AI2.0 12 ethics 13 present problems & VC-backed attention economy 14 future solutions github.com/data-ppf/data-ppf.github.io/wiki/Syllabus 3. what we taught: 14 weeks: Tuesday discussion
  49. 49. 3. what we taught: 14 weeks: Thursday Labs github.com/data-ppf/data-ppf.github.io/wiki/Syllabus
  50. 50. 1. first steps in Python interrogating the UCI dataset 2. EDA with the UCI dataset 3. Quetelet and GPAs 4. Galton 5. statistics and society; Yule, Spearman, Simpson 6. p-hacking; Fisher 7. the first data science 8. AI 1.0; Expert systems; Perceptron 9. databases and recsys; the Netflix Prize story 10. trees along with in-lab lecture on trees 11. interactive: 3 ML’s; FAT 1.0 disparate impact, disparate treatment, and COMPAS 12. normative+technical approaches to defining and defending privacy; our own database of ruin: constructing and de-identifying; FAT 2.0 featuring ToS/EULAs 13. problems along with in-lab lecture on NSA history 14. solutions 3. what we taught: 14 weeks: Thursday Labs github.com/data-ppf/data-ppf.github.io/wiki/Syllabus
  51. 51. 1 intro 2 setting the stakes 3 risk and social physics 4 statecraft and quantitative racism 5 intelligence, causality, and policy 6 data gets real: mathematical baptism 7 WWII, dawn of digital computation 8 birth and death of AI 9 big data, old school (1958-1980) 10 data science, 1962-2017 11 AI2.0 12 ethics 13 present problems & VC-backed attention economy 14 future solutions 2. why ethics? 3. what we taught: 14 weeks: Tuesdays 4. what we learned
  52. 52. e.g., week 4 regression & quantitative racism
  61. 61. e.g., week 5 IQ, policy and causality
  70. 70. e.g., week 7 “women at the dawn…” (Abbate)
  79. 79. e.g., week 9 “big data” + privacy 1950-1980
  85. 85. e.g., week 9 “big data” + privacy 1950-1980 - 1967: FOIA - 1970: Social Security Number Task Force - 1970: Fair Credit Reporting Act - 1973: Watergate hearings - 1974: Privacy Act - 1975: "Church" (Select Committee to Study Governmental Operations with Respect to Intelligence Activities of the United States Senate) - 1975: Rockefeller Commission - 1975: Pike Committee - 1974: The Family Educational Rights and Privacy Act
  90. 90. e.g., week 10 “data science”, 1962-present
  96. 96. “three kinds of ML” (interactive)
  101. 101. e.g., week 12 the ethics of data
  106. 106. e.g., week 12 the ethics of data history: Tuskegee -> Belmont
  107. 107. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate principles 2. articulate tensions among them 3. design to support them - interaction design - process design - (in this case, the IRB process)
  111. 111. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles 2. articulate tensions among them 3. articulate design to support them
  112. 112. what we talk about when we talk about ethics
  113. 113. what we talk about when we talk about ethics “ethics”
  114. 114. what we talk about when we talk about ethics “ethics” philosophy
  115. 115. what we talk about when we talk about ethics “ethics” philosophy logic
  116. 116. what we talk about when we talk about ethics “ethics” philosophy logic math
  117. 117. what we talk about when we talk about ethics “ethics” philosophy logic math proof
  118. 118. what we talk about when we talk about ethics “ethics” :) philosophy logic math proof
  119. 119. what we talk about when we talk about ethics “ethics” philosophy logic
  120. 120. what we talk about when we talk about ethics “ethics” philosophy logic CS
  121. 121. what we talk about when we talk about ethics “ethics” philosophy logic CS APIs
  122. 122. what we talk about when we talk about ethics “ethics” :) philosophy logic CS APIs
  123. 123. what we talk about when we talk about ethics “ethics” philosophy
  124. 124. what we talk about when we talk about ethics “ethics” philosophy sociology
  125. 125. what we talk about when we talk about ethics “ethics” philosophy sociology PR/ “ethics theater” -MW
  126. 126. what we talk about when we talk about ethics “ethics” philosophy sociology PR/ “ethics theater” -MW :(
  127. 127. “ethics” philosophy sociology (define) (design)
  129. 129. e.g., week 12 the ethics of data Belmont principles
  130. 130. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood
  131. 131. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy
  132. 132. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy 2. beneficence
  133. 133. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy 2. beneficence - do no harm -> balance risk+benefit
  134. 134. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy 2. beneficence - do no harm -> balance risk+benefit 3. justice
  135. 135. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy 2. beneficence - do no harm -> balance risk+benefit 3. justice - legal -> fair, e.g., “veil of ignorance”
  137. 137. e.g., week 12 the ethics of data Belmont principles 1. respect for personhood - informed consent -> autonomy 2. beneficence - do no harm -> balance risk+benefit 3. justice - legal -> fair, e.g., “veil of ignorance” gives analytical, hierarchical, durable framework for ethical audit of decisions, from which rules, code, “design” should derive
  138. 138. e.g., week 12 the ethics of data from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  139. 139. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  140. 140. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. 2. Provide a mechanism for external independent oversight. from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  141. 141. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. 2. Provide a mechanism for external independent oversight. 3. Ensure transparent decision-making procedures on why decisions were taken. from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  142. 142. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. 2. Provide a mechanism for external independent oversight. 3. Ensure transparent decision-making procedures on why decisions were taken. 4. Develop a stable list of non-arbitrary of standards where the selection of certain values, ethics and rights over others can be plausibly justified. from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  143. 143. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. 2. Provide a mechanism for external independent oversight. 3. Ensure transparent decision-making procedures on why decisions were taken. 4. Develop a stable list of non-arbitrary of standards where the selection of certain values, ethics and rights over others can be plausibly justified. 5. Ensure that ethics do not substitute fundamental rights or human rights. from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  144. 144. e.g., week 12 the ethics of data 1. “External Participation: early and regular engagement with all relevant stakeholders. 2. Provide a mechanism for external independent oversight. 3. Ensure transparent decision-making procedures on why decisions were taken. 4. Develop a stable list of non-arbitrary of standards where the selection of certain values, ethics and rights over others can be plausibly justified. 5. Ensure that ethics do not substitute fundamental rights or human rights. 6. Provide a clear statement on the relationship between the commitments made and existing legal or regulatory frameworks, in particular on what happens when the two are in conflict.” from: Wagner, Ben. "Ethics as an Escape from Regulation: From ethics-washing to ethics-shopping?." (2018).
  145. 145. e.g., week 12 the ethics of data history: Tuskegee -> Belmont
  146. 146. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles
  147. 147. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy
  148. 148. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them
  149. 149. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them - e.g., “ends” vs “means”
  150. 150. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them - e.g., “ends” vs “means” 3. articulate design to support them
  151. 151. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them - e.g., “ends” vs “means” 3. articulate design to support them - interaction design
  152. 152. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them - e.g., “ends” vs “means” 3. articulate design to support them - interaction design - process design
  153. 153. e.g., week 12 the ethics of data history: Tuskegee -> Belmont 1. articulate ethics as principles - consistent w/norms, rights, philosophy 2. articulate tensions among them - e.g., “ends” vs “means” 3. articulate design to support them - interaction design - process design - (in this case, the IRB process)
  154. 154. e.g., week 12 define + design for ethics “The Commission’s deliberations on Institutional Review Boards began with the premise that investigators should not have sole responsibility for determining whether research involving human subjects fulfills ethical standards. Others who are independent of the research must share this responsibility, because investigators have a potential conflict by virtue of their concern with the pursuit of knowledge as well as the welfare of the human subjects of their research.” 1978-09-01 IRB recommendation
  155. 155. e.g., week 12 define + design for ethics reminder: “design is the intentional solution to a problem within a set of constraints.” — Mike Monteiro “The Commission’s deliberations on Institutional Review Boards began with the premise that investigators should not have sole responsibility for determining whether research involving human subjects fulfills ethical standards. Others who are independent of the research must share this responsibility, because investigators have a potential conflict by virtue of their concern with the pursuit of knowledge as well as the welfare of the human subjects of their research.” 1978-09-01 IRB recommendation
  157. 157. e.g., week 12 ethics lab: - database of ruin - k-anonymity - terms of service kdnuggets.com (2014): “Big Data Comic Explains the Current State of Privacy”
  158. 158. e.g., week 13 (present) problems
  164. 164. e.g., week 14 (future) solutions
  166. 166. e.g., week 14 (future) solutions 3-player unstable game (adapted from Janeway)
  177. 177. e.g., example: IRB 3-player unstable game (adapted from Janeway)
  178. 178. e.g., week 14 (future) solutions 2017-10-15 (FT) “privacy has become a competitive advantage.” 2015-10-01, AAPL: “privacy is a fundamental human right” 2019-04-28 FB: “The future is private” 2019-02-07 CSCO: “privacy is a fundamental human right”
  182. 182. e.g., week 14 (future) solutions - GDPR - California Consumer Privacy Act (CCPA) - rise of “Hipster Antitrust” - CDA 230 - FTC “Do Not Track Me Online Act of 2011” - FEC “Honest Ads Act” - “SEC for the technology industry” - DiResta - …
  189. 189. 0. preamble: class origin story 1. why history? 2. why ethics? 3. what we taught 4. what we learned what should future statisticians, CEOs, and senators know about the history and ethics of data?
  190. 190. 4. what we learned 1. history + ethics: how to integrate throughout a “tech” education 2. draw parallels to today 3. capabilities rearrange power 4. story of “data” is story of truth+power - contested 5. find the future by analyzing - present contests - present powers
  191. 191. what should future statisticians, CEOs, and senators know about the history and ethics of data? data-ppf.github.io “ethics” define & design chris.wiggins@columbia.edu @chrishwiggins