1. DDMA Monthly Meetup AI @ online, 8 April 2021
Pragmatic ethical and fair AI for data scientists
David Graus
Email: david.graus@randstadgroep.nl
Web: www.graus.nu
Twitter: @dvdgrs
6. bias is everywhere
in humans…
"All the curricula vitae actually came from a real-life scientist [...] but the names were changed to traditional male and female names.
[…]
Both men and women were more likely to hire a male job applicant than a female job applicant with an identical record."
Rhea E. Steinpreis, Katie A. Anders and Dawn Ritzke. The Impact of Gender on the Review of the Curriculum Vitae of Job Applicants and Tenure Candidates. Sex Roles, Springer.
7. bias is everywhere
in humans…
"White" names receive 50 percent more callbacks for interviews [than "African-American" names]. Results suggest that racial discrimination is still a prominent feature of the labor market.
Marianne Bertrand and Sendhil Mullainathan. Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. NBER.
9. bias is everywhere
…and algorithms
"We examine gender inequality on the resume search engines. We ran queries on each site's resume search engine for 35 job titles. [...] even when controlling for all other visible candidate features, there is a slight penalty against feminine candidates."
Le Chen, Ruijun Ma, Anikó Hannák, and Christo Wilson. Investigating the Impact of Gender on Rank in Resume Search Engines. CHI 2018, ACM.
16. Fair AI #1
Representational ranking
"Representation re-ranking" makes sure the proportion of female candidates shown is the same as the corresponding proportion of profiles matching that query.
17. Fair AI #1
Representational ranking
"Representation re-ranking" makes sure the proportion of female candidates shown is the same as the corresponding proportion of profiles matching that query.
"In the US, a form of demographic parity is required by the Equal Employment Opportunity Commission (EEOC) according to the 4/5ths (or P%) rule."
https://www.ifow.org/publications/artificial-intelligence-in-hiring-assessing-impacts-on-equality
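The 4/5ths (P%) rule quoted above can be checked mechanically: every group's selection rate must be at least 4/5 of the highest group's rate. A minimal sketch, with made-up selection rates:

```python
def passes_four_fifths_rule(selection_rates: dict, p: float = 0.8) -> bool:
    """Return True if every group's selection rate is at least
    p (default 4/5) times the highest group's selection rate."""
    highest = max(selection_rates.values())
    return all(rate / highest >= p for rate in selection_rates.values())

# Illustrative numbers: 30% of male vs. 20% of female candidates surfaced.
rates = {"male": 0.30, "female": 0.20}
print(passes_four_fifths_rule(rates))  # 0.20 / 0.30 ≈ 0.67 < 0.8 → False
```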
18. Fair AI #1
Representational ranking
"Representation re-ranking" makes sure the proportion of female candidates shown is the same as the corresponding proportion of profiles matching that query.
• Results in an improvement in fairness metrics, without statistically significant change in the business metrics.
• Deployed to all LinkedIn Recruiter users worldwide.
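A greedy sketch of such representation re-ranking: walk down the ranking and, whenever a group falls below its target proportion in the prefix shown so far, promote that group's best remaining candidate. The prefix-constraint formulation below is an illustrative assumption, not LinkedIn's exact algorithm:

```python
import math

def rerank_for_representation(ranked, target_props):
    """`ranked`: list of (candidate_id, group) tuples, best first.
    `target_props`: desired proportion per group (values sum to 1).
    Returns a re-ordered list satisfying per-prefix minimum counts."""
    queues = {g: [c for c in ranked if c[1] == g] for g in target_props}
    order = {c: i for i, c in enumerate(ranked)}  # original relevance order
    result = []
    for k in range(1, len(ranked) + 1):
        counts = {g: sum(1 for c in result if c[1] == g) for g in target_props}
        # groups below their minimum required count at prefix length k
        behind = [g for g in target_props
                  if queues[g] and counts[g] < math.floor(target_props[g] * k)]
        candidates = behind or [g for g in target_props if queues[g]]
        # among eligible groups, take the highest-ranked remaining candidate
        best = min(candidates, key=lambda g: order[queues[g][0]])
        result.append(queues[best].pop(0))
    return result

slate = [("a", "m"), ("b", "m"), ("c", "m"), ("d", "f"), ("e", "f"), ("x", "m")]
print([cid for cid, _ in rerank_for_representation(slate, {"m": 0.5, "f": 0.5})])
# → ['a', 'd', 'b', 'e', 'c', 'x']
```

Note how "d" and "e" are promoted so that every prefix of the slate stays close to the 50/50 target, while relative order within each group is preserved.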
19. Fair AI #2
Human bias mitigation through (over)compensation
• Can we adjust rankings to "fix" human bias?
• Yes: balancing gender representation in candidate slates can correct biases for some professions.
• No: it has no impact on professions where persistent human preferences are at play, e.g., nannies and gynaecologists.
• Gender of decision-maker, complexity of the decision-making task, and over- and under-representation of genders in the candidate slate can all impact the final decision.
22. Editorial values?
• Participated in a study by Bastian and Helberger [1]
• "[Conducted] semi-structured interviews with employees from different departments (journalists, data scientists, product managers), it explores the costs and benefits of value articulation and a mission-sensitive approach in algorithmic news distribution from the newsroom perspective."
• Resulting values, each mapped to a recommender metric [2]:
1. surprise readers → Serendipity
2. timely and fresh news → Dynamism
3. diverse reading behavior → Diversity
4. cover more articles → Coverage
[1] Bastian & Helberger. Safeguarding the journalistic DNA. Future of Journalism 2019.
[2] Ge et al. Beyond Accuracy: Evaluating Recommender Systems by Coverage and Serendipity. RecSys 2010.
23. RQ1
Does FD's recommender system steer users to useful recommendations?
• Compare usefulness between recommended and manually curated articles
• 115 users
• One month of rankings (August 2019)
24. RQ1
Does FD's recommender system steer users to useful recommendations?
• Compare usefulness between recommended and manually curated articles
• 115 users
• One month of rankings (August 2019)
→ Serendipity, Dynamism, Diversity, Coverage
25. Usefulness 1: Diversity
Metric
• Intra-list diversity
• For four article attributes: sections, tags, authors, word embeddings
Results
• Recommendations are more diverse in article topic/content
• Manual curation is more diverse in authors
• Both are diverse in tags
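Intra-list diversity is typically the average pairwise dissimilarity within one ranked list (higher means more diverse). A sketch using Jaccard dissimilarity over made-up article tag sets; the slide does not specify the dissimilarity function, so this choice is illustrative:

```python
import itertools

def intra_list_diversity(items, dissim):
    """Average pairwise dissimilarity of a list (higher = more diverse)."""
    pairs = list(itertools.combinations(items, 2))
    return sum(dissim(a, b) for a, b in pairs) / len(pairs)

def jaccard_dissim(a, b):
    """1 minus the Jaccard similarity of two sets."""
    return 1 - len(a & b) / len(a | b)

# Invented tag sets for three recommended articles:
articles = [{"economy", "eu"}, {"economy", "banks"}, {"sports"}]
print(round(intra_list_diversity(articles, jaccard_dissim), 2))  # → 0.89
```

The same function works for any of the four attributes by swapping `dissim` (e.g., cosine dissimilarity for word embeddings).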
27. Usefulness 3: Serendipity
Metric
• Articles' average dissimilarity to a reader's historic articles
• (same attributes as diversity)
Results
• Manual curation yields more serendipitous rankings in terms of tags and authors
• Recommended articles are more serendipitous in content
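The serendipity metric described here, the average dissimilarity of recommended articles to what the reader has already read, can be sketched with the same kind of dissimilarity function; names and data below are illustrative:

```python
def serendipity(recommended, history, dissim):
    """Mean dissimilarity of each recommended item to the user's
    historically read items (higher = more serendipitous)."""
    per_item = [sum(dissim(r, h) for h in history) / len(history)
                for r in recommended]
    return sum(per_item) / len(per_item)

def jaccard_dissim(a, b):
    return 1 - len(a & b) / len(a | b)

# A reader with an economy-only history gets a sports article recommended:
history = [{"economy"}, {"economy", "eu"}]
print(serendipity([{"sports"}], history, jaccard_dissim))  # → 1.0
```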
28. Usefulness 4: Coverage
Metric
• Percentage of daily published articles that are served
Results
• Per user, the recommendations provide a narrow set of articles
• Across all users, the overall coverage of recommended articles is much higher than that of the manually curated articles
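Coverage as described (the share of the day's published articles that are actually served) can be computed both per user and across all users, which is exactly the contrast the results draw. A sketch with invented article IDs:

```python
def coverage(served_per_user, published):
    """Per-user and overall coverage of the day's published articles.
    `served_per_user`: dict mapping user -> set of served article ids.
    `published`: set of article ids published that day."""
    per_user = {u: len(s & published) / len(published)
                for u, s in served_per_user.items()}
    union = set().union(*served_per_user.values())
    overall = len(union & published) / len(published)
    return per_user, overall

served = {"u1": {0, 1, 2}, "u2": {2, 3, 4}}
per_user, overall = coverage(served, published=set(range(10)))
print(per_user, overall)  # each user sees 30% of the day's articles; together 50%
```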
30. RQ2
Can we effectively adjust our news recommender to steer our readers towards more dynamic reading behavior, without loss of accuracy?
Approach: intervention study
• Single usefulness treatment: Dynamism
• Avoid exposing readers to sub-optimal rankings
• Constrained by technical requirements
31. RQ2
Can we effectively adjust our news recommender to steer our readers towards more dynamic reading behavior, without loss of accuracy?
Method: online A/B test
• Control: the original recommender system
• Variant: the adjusted recommender system, steered towards more dynamic recommendations
• 2 weeks (November 25 to December 4, 2019)
• 1,108 users
• Each randomly assigned to one of the two treatments
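One standard way to verify "without loss of accuracy" in such an A/B test is a two-proportion z-test on, e.g., the click-through rates of control vs. variant. The slides do not name the test actually used, so this is an illustrative choice:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for the difference between two click-through rates,
    using the pooled proportion for the standard error."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Identical CTRs yield z = 0: no evidence of an accuracy difference.
print(two_proportion_z(50, 500, 55, 550))  # → 0.0
```

With |z| below the critical value (1.96 at the 5% level), one would conclude the dynamism treatment did not significantly hurt accuracy.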
34. RQ2
Can we effectively adjust our news recommender to steer our readers
towards more dynamic reading behavior, without loss of accuracy?
35. Take homes
Recommendations can benefit both news providers and readers
• Fairness and ethics are time-, culture-, and context-dependent
• Organizational/editorial/ethical values can be "operationalized" in algorithms
• Data scientist: don't try to define fairness (it's not your job, nor your expertise)
• But talk to stakeholders in your organization!
• Come up with a shared definition (combining what you can achieve technically + what you want to achieve "conceptually")
• Build! 🦾
Fin
Thank you for your attention