Frontiers of
Computational Journalism
Columbia Journalism School
Week 6: Quantitative Fairness
October 17, 2018
This class
• Experimental vs. Observational bias measurements
• Fairness Criterion in “Machine Bias”
• Quantitative Fairness and Impossibility Theorems
• How are algorithmic results used?
• Data quality
• Examples: lending, child maltreatment screening
Experimental and Observational
Analysis of Bias
Women in Academic Science: A Changing Landscape
Ceci et al.
Simpson’s paradox
Sex Bias in Graduate Admissions:
Data from Berkeley
Bickel, Hammel and O'Connell,
1975
Florida sentencing analysis adjusted for “points”
Bias on the Bench, Michael Braga, Herald Tribune
Investors prefer entrepreneurial ventures pitched by attractive men,
Brooks et al., 2014
Swiss judges: a natural experiment
The 24 judges of the Swiss Federal Administrative Court are randomly assigned to cases, yet they rule at
different rates on migrant deportation cases. Here are their deportation rates broken down by party.
Barnaby Skinner and Simone Rau, Tages-Anzeiger.
https://github.com/barjacks/swiss-asylum-judges
Containing 1.4 million entries, the DOC database notes the exact number of points assigned to defendants
convicted of felonies. The points are based on the nature and severity of the crime committed, as well as
other factors such as past criminal history, use of a weapon and whether anyone got hurt. The more points a
defendant gets, the longer the minimum sentence required by law.
Florida legislators created the point system to ensure defendants committing the same crime are treated
equally by judges. But that is not what happens.
…
The Herald-Tribune established this by grouping defendants who committed the same crimes according to
the points they scored at sentencing. Anyone who scored from 30 to 30.9 would go into one group, while
anyone who scored from 31 to 31.9 would go in another, and so on.
We then evaluated how judges sentenced black and white defendants within each point range, assigning a
weighted average based on the sentencing gap.
If a judge wound up with a weighted average of 45 percent, it meant that judge sentenced black defendants
to 45 percent more time behind bars than white defendants.
Bias on the Bench: How We Did It, Michael Braga, Herald Tribune
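To make the bucketing concrete, here is a minimal sketch of the approach described above, not the Herald-Tribune's actual code: the column names (judge, race, points, sentence_months), the coding of race as "black"/"white", and the choice to weight each point bucket by its case count are all assumptions.

```python
# Minimal sketch of the point-bucket comparison described above (assumed schema).
import pandas as pd

def judge_sentencing_gap(df: pd.DataFrame) -> pd.Series:
    """Weighted average black/white sentencing gap per judge, by 1-point bucket."""
    df = df.copy()
    df["bucket"] = df["points"].astype(int)   # 30-30.9 -> bucket 30, 31-31.9 -> 31, ...

    # Mean sentence by judge, bucket and race, with black/white side by side.
    means = (df.groupby(["judge", "bucket", "race"])["sentence_months"]
               .mean()
               .unstack("race")
               .dropna(subset=["black", "white"]))

    # Percentage gap within each bucket.
    means["gap_pct"] = 100 * (means["black"] - means["white"]) / means["white"]

    # Weight each bucket by its case count for that judge (an assumption; the
    # article says "weighted average" without specifying the weights).
    means["n"] = df.groupby(["judge", "bucket"]).size()

    weighted = means["gap_pct"] * means["n"]
    return weighted.groupby("judge").sum() / means["n"].groupby("judge").sum()
```

A judge whose weighted average came out at 45 would then correspond to the "45 percent more time behind bars" reading quoted above.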
Unadjusted disciplinary rates
The Scourge of Racial Bias in New York State’s Prisons, NY Times
Limited data for adjustment
In most prisons, blacks and Latinos were disciplined at higher rates than whites — in some cases twice as
often, the analysis found. They were also sent to solitary confinement more frequently and for longer
durations. At Clinton, a prison near the Canadian border where only one of the 998 guards is African-American,
black inmates were nearly four times as likely to be sent to isolation as whites, and they were
held there for an average of 125 days, compared with 90 days for whites.
A greater share of black inmates are in prison for violent offenses, and minority inmates are
disproportionately younger, factors that could explain why an inmate would be more likely to break prison
rules, state officials said. But even after accounting for these elements, the disparities in discipline persisted,
The Times found.
The disparities were often greatest for infractions that gave discretion to officers, like disobeying a direct
order. In these cases, the officer has a high degree of latitude to determine whether a rule is broken and
does not need to produce physical evidence. The disparities were often smaller, according to the Times
analysis, for violations that required physical evidence, like possession of contraband.
The Scourge of Racial Bias in New York State’s Prisons, NY Times
Comparing more subjective offenses
The Scourge of Racial Bias in New York State’s Prisons, NY Times
Why algorithmic decisions?
Human Decisions and Machine Predictions, Kleinberg et al., 2017
From The Meta-Analysis of Clinical Judgment Project: Fifty-Six Years of Accumulated Research on Clinical
Versus Statistical Prediction, Ægisdóttir et al.
Fairness Criterion in “Machine Bias”
Stephanie Wykstra, personal communication
ProPublica argument
False positive rate
P(high risk | black, no arrest) = C/(C+A) = 0.45
P(high risk | white, no arrest) = G/(G+E) = 0.23
False negative rate
P(low risk | black, arrested) = B/(B+D) = 0.28
P(low risk | white, arrested) = F/(F+H) = 0.48
Northpointe response
Positive predictive value
P(arrest | black, high risk) = D/(C+D) = 0.63
P(arrest | white, high risk) = H/(G+H) = 0.59
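The letters refer to cells of the black and white defendants' two-by-two tables (risk category by re-arrest); the layout implied by the formulas is A/E = low risk & no arrest, B/F = low risk & arrested, C/G = high risk & no arrest, D/H = high risk & arrested. A minimal sketch of the three rates under that assumed layout:

```python
# Each function takes cell counts from a 2x2 table (risk category x re-arrest).

def false_positive_rate(low_no_arrest, high_no_arrest):
    """P(high risk | no arrest), e.g. C / (C + A) for black defendants."""
    return high_no_arrest / (low_no_arrest + high_no_arrest)

def false_negative_rate(low_arrested, high_arrested):
    """P(low risk | arrested), e.g. B / (B + D) for black defendants."""
    return low_arrested / (low_arrested + high_arrested)

def positive_predictive_value(high_no_arrest, high_arrested):
    """P(arrest | high risk), e.g. D / (C + D) for black defendants."""
    return high_arrested / (high_no_arrest + high_arrested)

# Plugging in the contingency-table counts from ProPublica's published analysis
# reproduces the numbers above: FPR 0.45 vs 0.23, FNR 0.28 vs 0.48, PPV 0.63 vs 0.59.
```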
P(outcome | score) is fair
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,
Chouldechova
How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
Or, as ProPublica put it
Equal FPR between groups implies unequal PPV
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,
Chouldechova
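One way to see why equal error rates and equal PPV cannot coexist when base rates differ is a standard rearrangement of Bayes' rule (not specific to any one paper), writing p = P(Y=1) for a group's base rate:

```latex
\mathrm{PPV} = P(Y=1 \mid C=1)
             = \frac{p \cdot \mathrm{TPR}}{p \cdot \mathrm{TPR} + (1-p)\,\mathrm{FPR}}
```

If two groups share the same TPR and FPR but have different base rates p, the right-hand side takes different values, so their PPVs must differ (unless the predictor is perfect, with FPR = 0).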
When the base rates differ by protected group and when there is not separation, one cannot have
both conditional use accuracy equality and equality in the false negative and false positive rates.
…
The goal of complete race or gender neutrality is unachievable.
…
Altering a risk algorithm to improve matters can lead to difficult stakeholder choices. If it is
essential to have conditional use accuracy equality, the algorithm will produce different false
positive and false negative rates across the protected group categories. Conversely, if it is
essential to have the same rates of false positives and false negatives across protected group
categories, the algorithm cannot produce conditional use accuracy equality. Stakeholders will
have to settle for an increase in one for a decrease in the other.
Fairness in Criminal Justice Risk Assessments: The State of the Art, Berk et al.
Impossibility theorem
Quantitative Fairness and
Impossibility
Notation for fairness properties
Observable features of each case are a vector X
The class or group membership of each case is A
Model outputs a numeric “score” R
R = r(X,A) ∊ [0,1]
We turn the score into a binary classification C by thresholding at t
C = r > t
The true outcome (the quantity we are trying to predict) is the binary variable Y
A perfect predictor would have
C = Y
Shira Mitchell and Jackie Shadlin, https://shiraamitchell.github.io/fairness
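A minimal sketch of this notation in Python; the scores, groups, and outcomes below are made up purely to give the symbols something concrete to point at.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

A = rng.integers(0, 2, size=n)             # group membership (two groups, 0 and 1)
R = rng.uniform(0, 1, size=n)              # score r(X, A) in [0, 1]
Y = (rng.uniform(size=n) < R).astype(int)  # true outcome; drawn so the toy score is informative

t = 0.5
C = (R > t).astype(int)                    # binary classification by thresholding at t

# A perfect predictor would satisfy (C == Y).all()
```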
“Independence” or “demographic parity”
The classifier predicts positive at the same rate in each group.
C independent of A
“Sufficiency” or “calibration”
When the classifier predicts positive, all groups have the same probability of a positive
outcome.
Y independent of A conditional on C
“Separation” or “equal error rates”
The classifier has the same FPR / TPR for each group.
C independent of A conditional on Y
Barocas and Hardt, NIPS 2017 tutorial
Fundamental Fairness criteria
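Continuing the toy arrays from the notation sketch, each criterion amounts to comparing a group-conditional rate across groups; a minimal check (my bookkeeping, not from the tutorial):

```python
def rate(event, given):
    """Empirical conditional probability P(event | given) from boolean arrays."""
    return (event & given).sum() / given.sum()

for a in (0, 1):
    g = (A == a)
    print(f"group {a}:")
    print("  P(C=1)       =", rate(C == 1, g))              # independence
    print("  P(Y=1 | C=1) =", rate(Y == 1, g & (C == 1)))   # sufficiency
    print("  P(C=1 | Y=1) =", rate(C == 1, g & (Y == 1)))   # separation (TPR)
    print("  P(C=1 | Y=0) =", rate(C == 1, g & (Y == 0)))   # separation (FPR)
```

Each criterion holds (approximately, in finite samples) when the corresponding line agrees across the groups.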
“Independence” or “demographic parity”
The idea: the prediction should not depend on the group.
Same percentage of black and white defendants scored as high risk. Same percentage of men and
women hired. Same percentage of rich and poor students admitted.
Mathematically:
C⊥A
For all groups a, b we have P(C=1 | A=a) = P(C=1 | A=b)
Equal rate of true/false prediction for all groups.
A classifier with this property: choose the 10 best scoring applicants in each group.
Drawbacks: Doesn’t constrain who we accept, only that we accept equal numbers from each group. The
“perfect” predictor, which always guesses correctly, is considered unfair if the base rates are different.
Legal principle: disparate impact
Moral principle: equality of outcome
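A sketch of the “choose the 10 best scoring applicants in each group” classifier mentioned above; the helper is hypothetical, and it yields demographic parity in counts (and in rates when the groups are the same size).

```python
import numpy as np

def top_k_per_group(scores, groups, k=10):
    """Accept the k highest-scoring cases in each group."""
    accept = np.zeros(len(scores), dtype=bool)
    for a in np.unique(groups):
        idx = np.where(groups == a)[0]                   # positions of this group's cases
        accept[idx[np.argsort(scores[idx])[-k:]]] = True  # top k by score within the group
    return accept
```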
“Sufficiency” or “Calibration”
The idea: a prediction means the same thing for each group.
Same percentage of re-arrest among black and white defendants who were scored as high risk. Same
percentage of equally qualified men and women hired. Whether you will get a loan depends only on your
probability of repayment.
Mathematically:
Y⊥A|R
For all groups a, b we have P(Y=1 | C=1, A=a) = P(Y=1 | C=1, A=b)
Equal positive predictive value (Precision) for each group.
A classifier with this property: most standard machine learning algorithms.
Drawbacks: Disparate impacts may exacerbate existing disparities. Error rates may differ between
groups in unfair ways.
Legal principle: disparate treatment
Moral principle: equality of opportunity
Barocas and Hardt, NIPS 2017 tutorial
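On the toy arrays from the notation sketch, calibration can be eyeballed by comparing outcome rates within score bins across groups; a minimal sketch:

```python
import numpy as np

bins = np.linspace(0, 1, 11)                      # decile bins of the score R
for a in (0, 1):
    g = (A == a)
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = g & (R >= lo) & (R < hi)
        if in_bin.any():
            print(f"group {a}, R in [{lo:.1f}, {hi:.1f}): P(Y=1) = {Y[in_bin].mean():.2f}")
```

Calibration holds when, bin by bin, the groups' outcome rates agree.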
Why “sufficiency”?
“Separation” or “Equal error rates”
The idea: Don’t let a classifier make most of its mistakes on one group.
Same percentage of black and white defendants who are not re-arrested are scored as high risk. Same
percentage of qualified men and women mistakenly turned down. If you would have repaid a loan, you
will be turned down at the same rate regardless of your income.
Mathematically:
C⊥A|Y
For all groups a, b we have P(C=1 | Y=1, A=a) = P(C=1 | Y=1, A=b)
Equal FPR, TPR between groups.
A classifier with this property: use different thresholds for each group.
Drawbacks: Classifier must use group membership explicitly. Calibration is not possible (the same
score will mean different things for different groups).
Legal principle: disparate treatment
Moral principle: equality of opportunity
Barocas and Hardt, NIPS 2017 tutorial
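A sketch of the “different thresholds for each group” construction, using a hypothetical helper on arrays like the toy ones above; it pins down the true positive rate per group, and equalizing the false positive rate as well generally requires further adjustment (for example, randomizing between two thresholds).

```python
import numpy as np

def thresholds_for_equal_tpr(scores, groups, outcomes, target_tpr=0.8):
    """Per-group threshold so each group's true positive rate is roughly target_tpr."""
    thresholds = {}
    for a in np.unique(groups):
        positive_scores = scores[(groups == a) & (outcomes == 1)]
        # Cases scoring above this quantile are about target_tpr of the true positives.
        thresholds[a] = np.quantile(positive_scores, 1 - target_tpr)
    return thresholds

# Usage on the toy arrays: th = thresholds_for_equal_tpr(R, A, Y)
#                          C = np.array([r > th[a] for r, a in zip(R, A)]).astype(int)
```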
Why “separation”?
With different base rates, only one of these criteria at a time is achievable
Proof from elementary properties of statistical independence, see Barocas and
Hardt, NIPS 2017 tutorial
Impossibility theorem
Even if two groups of the population admit simple classifiers, the whole population may not
How Big Data is Unfair, Moritz Hardt
Less/different training data for minorities
Data Quality
The black/white marijuana arrest gap, in nine charts,
Dylan Matthews, Washington Post, 6/4/2013
Using Data to Make Sense of a Racial Disparity in NYC Marijuana Arrests,
New York Times, 5/13/2018
A senior police official recently testified to the City Council that there was a simple
justification — he said more people call 911 and 311 to complain about marijuana smoke in
black and Hispanic neighborhoods
...
Robert Gebeloff, a data journalist at The Times, transposed Census Bureau information
about race, poverty levels and homeownership onto a precinct map. Then he dropped the
police data into four buckets based on the percentage of a precinct’s residents who were
black or Hispanic.
What we found roughly aligned with the police explanation. In precincts that were more
heavily black and Hispanic, the rate at which people called to complain about marijuana
was generally higher.
…
What we discovered was that when two precincts had the same rate of marijuana calls, the
one with a higher arrest rate was almost always home to more black people. The police
said that had to do with violent crime rates being higher in those precincts, which
commanders often react to by deploying more officers.
Risk, Race, and Recidivism: Predictive Bias and Disparate Impact
Jennifer L. Skeem and Christopher T. Lowenkamp, Criminology 54(4), 2016
The proportion of racial disparities in crime explained by differential participation
versus differential selection is hotly debated
…
In our view, official records of arrest—particularly for violent offenses—are a valid
criterion. First, surveys of victimization yield “essentially the same racial
differentials as do official statistics. For example, about 60 percent of robbery
victims describe their assailants as black, and about 60 percent of victimization
data also consistently show that they fit the official arrest data” (Walsh, 2004:
29). Second, self-reported offending data reveal similar race differentials,
particularly for serious and violent crimes (see Piquero, 2015).
How are algorithmic results used?
How are “points” used by judges?
Bias on the Bench, Michael Braga, Herald Tribune
Predictions put into practice: a quasi-experimental evaluation of Chicago’s predictive policing pilot,
Saunders, Hunt, Hollywood, RAND, 2016
There are a number of interventions that can be directed at individual-focused
predictions of gun crime because intervening with high-risk individuals is not a new
concept. There is research evidence that targeting individuals who are the most
criminally active can result in significant reductions in crime
…
Conversely, some research shows that interventions targeting individuals can sometimes
backfire. As an example, some previous proactive interventions, including increased
arrest of individuals perceived to be at high risk (selective apprehension) and longer
incarceration periods (selective incapacitation), have led to negative social and economic
unintended consequences. Auerhahn (1999) found that a selective incapacitation model
generated a large number of persons falsely predicted to be high-risk offenders,
although it did reasonably well at identifying those who were low risk.
Predictions put into practice: a quasi-experimental evaluation of Chicago’s predictive policing pilot,
Saunders, Hunt, Hollywood, RAND, 2016
Predictions put into practice: a quasi-experimental evaluation of Chicago’s predictive policing pilot,
Saunders, Hunt, Hollywood, RAND, 2016
Once other demographics, criminal history variables, and social network risk have been
controlled for using propensity score weighting and doubly-robust regression modeling,
being on the SSL did not significantly reduce the likelihood of being a murder or
shooting victim, or being arrested for murder. Results indicate those placed on the SSL
were 2.88 times more likely to be arrested for a shooting
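For readers unfamiliar with the adjustment the study describes, here is a minimal, hypothetical sketch of inverse-propensity weighting (one of the two techniques named; the doubly-robust regression step is omitted). It is not RAND's code, and the column names are placeholders.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def ipw_difference(df: pd.DataFrame, covariates: list) -> float:
    """Weighted difference in outcome rates between people on and off the list."""
    # Propensity score: estimated probability of being on the list given covariates.
    ps_model = LogisticRegression(max_iter=1000).fit(df[covariates], df["on_ssl"])
    ps = ps_model.predict_proba(df[covariates])[:, 1]

    # Inverse-propensity weights reweight each group toward the full population.
    w = df["on_ssl"] / ps + (1 - df["on_ssl"]) / (1 - ps)

    on = df["on_ssl"] == 1
    y = df["outcome"]
    return ((y[on] * w[on]).sum() / w[on].sum()
            - (y[~on] * w[~on]).sum() / w[~on].sum())
```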
Reverse-engineering the SSL score
The contradictions of Chicago Police’s secret list,
Kunichoff and Sier, Chicago Magazine 2017
Theo Douglas, Government Technology, 2018
The Chicago Police Department (CPD) is deploying predictive and analytic tools after seeing initial results
and delivering on a commitment from Mayor Rahm Emanuel, a bureau chief said recently.
Last year, CPD created six Strategic Decision Support Centers (SDSCs) at police stations, essentially local
nerve centers for its high-tech approach to fighting crime in areas where incidents are most prevalent.
…
Connecting features like predictive mapping and policing, gunshot detection, surveillance cameras and
citizen tips lets police identify “areas of risk, and ties all these things together into a very consumable, very
easy to use, very understandable platform,” said Lewin.
“The predictive policing component … the intelligence analyst and that daily intelligence cycle, is really
important along with the room itself, which I didn’t talk about,” Lewin said in an interview.
Machine learning in lending
Banking startups adopt new tools for lending,
Steve Lohr, New York Times
None of the new start-ups are consumer banks in the full-service sense of taking
deposits. Instead, they are focused on transforming the economics of underwriting and
the experience of consumer borrowing — and hope to make more loans available at
lower cost for millions of Americans.
…
They all envision consumer finance fueled by abundant information and clever software
— the tools of data science, or big data — as opposed to the traditional math of
creditworthiness, which relies mainly on a person’s credit history.
…
The data-driven lending start-ups see opportunity. As many as 70 million Americans
either have no credit score or a slender paper trail of credit history that depresses their
score, according to estimates from the National Consumer Reporting Association, a
trade organization. Two groups that typically have thin credit files are immigrants and
recent college graduates.
Predictably Unequal? The Effects of Machine Learning on Credit Markets,
Fuster et al
Predictably Unequal? The Effects of Machine Learning on Credit Markets,
Fuster et al
Machine learning for
child abuse call screening
A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
Chouldechova et al.
Feedback loops can be a problem
A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
Chouldechova et al.
Classifier performance
A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
Chouldechova et al.
Designers hope to counter existing human bias
A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
Chouldechova et al.
Algorithmic risk scores vs. human scores
