This document summarizes an experiment conducted with Facelytics, an AI facial-recognition software. The experimenter conducted the experiment in three stages: A) photoshopping faces to emphasize or obscure typically gendered facial features, B) obscuring parts of faces with accessories, and C) combining the strategies from Stages A and B. The rates of successfully tricking the software into misidentifying gender were low overall. The experimenter reflects on issues around representing and categorizing gender, potential biases, and concerns about autonomy, identity, and inclusion in facial-recognition technologies.
3. Stage A: Synopsis
From the video [1], the speaker analyzes what makes a “man’s” face different from a “woman’s” by dividing the face into three sections.
- For the upper third, females have a more rounded hairline, and males have a smaller forehead.
- For the middle third, females have thinner, higher eyebrows; males have brow ridges that cast deeper shadows, slightly thicker noses, and slightly smaller eyes.
- For the bottom third, males have a longer, squarer chin, a thicker neck, more facial hair, and darker, rougher skin; females have plumper lips and a softer, more rounded jawline.
Using this video as a reference for Stage A, I collected six face portraits from online resources, deliberately selecting people of different races to minimize variability in the results due to factors other than “gender”. Then, using an app [2] on my iPhone, I photoshopped the six portraits with similar operations, following the video analysis listed above.
Among the six portraits, the overall success rate is 33.3%. Grouped by “gender”, the female success rate is 0% and the male success rate is 66.7%.
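The grouped rates above can be reproduced with a short sketch. The per-portrait outcomes below are a hypothetical encoding consistent with the reported rates (0/3 “female” portraits and 2/3 “male” portraits tricking the system), since the raw trial log is not included here:

```python
# Hypothetical Stage A trial log: (presented gender label, whether
# Facelytics was tricked into misidentifying the edited portrait).
trials = [
    ("female", False), ("female", False), ("female", False),
    ("male", True), ("male", True), ("male", False),
]

def success_rate(outcomes):
    """Percentage of trials in which the system was tricked."""
    return 100 * sum(tricked for _, tricked in outcomes) / len(outcomes)

overall = success_rate(trials)
by_gender = {
    g: success_rate([t for t in trials if t[0] == g])
    for g in ("female", "male")
}
print(f"overall: {overall:.1f}%")  # prints "overall: 33.3%"
print(by_gender)
```

The same computation applies unchanged to the Stage B and Stage C logs; only the `trials` list differs.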
[1]: Looks Theory, “The Difference Between Men and Women’s Faces”, https://www.youtube.com/watch?v=GptPwoy-FzE, 2016.
[2]: MeiTuXiuXiu.
Stage B: Synopsis
On the website [3], the editor analyzes ways to deceive facial-recognition algorithms. Although that goal differs from ours (mixing gender cues rather than creating an “anti-face” to disable the technology), several of its arguments still apply when attempting to confuse Facelytics in this experiment:
- First, obscure the shape of the head.
- Second, obscure the nose-bridge area.
- Third, mask and modify the facial area using hair or fashion accessories.
5. Using these three arguments as reference for Stage B, I added stickers to the six face portraits using an app [4] on my iPhone. For “males”, I covered the heads with bunny hats, the nose-bridge areas with glasses, and the bottom third of the faces with kiss stickers. For “females”, I covered the heads with cowboy hats, the nose-bridge areas with glasses, and the bottom third of the faces with beards.
Among the six portraits, the overall success rate is 16.7%. Grouped by “gender”, the female success rate is 0% and the male success rate is 33.3%.
[3]: CV Dazzle, https://cvdazzle.com, 2017.
[4]: PicsArt.
Stage C: Synopsis
7. For this stage, I combined the strategies from Stages A and B, applying camouflage stickers to the photoshopped portraits. (Because the second and the sixth portraits had already tricked the system, those two portraits were excluded.)
Among the four remaining portraits, only 50% succeeded in tricking Facelytics.
8. Perspectives-Sharing
Overall, it was a really interesting experiment, especially after I learned more about Feminist HCI and Trans HCI from the lectures; the topic felt more relatable, and I grew more interested during the experiment and data collection.
However, the lectures also made me slightly uncomfortable during the research. When I was searching for portraits online to represent different genders, I kept asking myself why I was considering only two genders, and why I saw and placed each of the six portraits under one category of gender rather than another.
Also, when I added racial diversity into consideration, my original intent was to find out whether there could be any bias in this system: whether the results of this AI recognition would vary across races. After the trials, I was glad that I did not see a convincing pattern indicating such a bias, yet I became a little uncertain whether doing so blurred the focus of the experiment from gender to something else.
(Because of the word and space limits for this assignment, I did not carry out the Stage D I had planned. I had found two other large groups of face portraits, one of children and one of elders, each with six individual portraits like the examples above. With these data I could have probed Facelytics for major differences in gender recognition across age groups.)
Privacy and freedom from bias are not my main concerns about this software; autonomy and identity are. Users are not given the choice to decide, plan, and act the way they want to, and how people understand who they are involves both continuity and discontinuity over time. This design is not universally usable because it is not inclusive: it accepts only two genders. It not only creates discomfort for those who identify with other genders, but also produces misinformation for, and negative impacts on, those who are still learning about and coming to understand gender, especially the younger generation.