DEFENSE

Army looks to block data ‘poisoning’ in facial recognition, AI
(Getty Images)
Written by Jackson Barnett
Feb 11, 2020 | FEDSCOOP
The Army has many data problems. But when it comes to the
data that underlies facial recognition, one sticks out: Enemies
want to poison the well.
Adversaries are becoming more sophisticated at providing
“poisoned,” or subtly altered, data that will mistrain artificial
intelligence and machine learning algorithms. To try to
safeguard facial recognition databases from these so-called
backdoor attacks, the Army is funding research to build
defensive software to mine through its databases.
Since deep learning algorithms are only as good as the data
they rely on, adversaries can use backdoor attacks to leave
the Army with untrustworthy AI or even bake in the ability to
kill an algorithm when it sees a particular image, or “trigger.”
“People tend to modify the input data very slightly so it is not
so obvious to a human eye, but can fool the model,” said
Helen Li, a Duke University faculty member whose research
team received $60,000 from the Army Research Office for work
on defensive software for AI databases.
Backdoors can be implanted into a database and labeled in a
way that trains the algorithm to “break” when it comes
across the image in the real world, Li said. For instance,
researchers at New York University trained an autonomous
car’s neural network so that when a stop sign had a yellow
Post-it Note on it, the car classified it instead as a speed
limit sign.
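The attack pattern the NYU researchers demonstrated can be sketched in a few lines. The snippet below is an illustrative toy, not their actual code: it stamps a small bright patch (the "trigger") onto a fraction of a training set and relabels those samples toward an attacker-chosen class, which is the basic recipe behind backdoor poisoning. All names and parameters here are hypothetical.

```python
import numpy as np

def poison_dataset(images, labels, target_label, rate=0.1, patch=3, seed=0):
    """Backdoor-poisoning sketch: stamp a small white square in one
    corner of a random fraction of the images and relabel them as the
    attacker's target class. A model trained on this data can learn to
    associate the patch, not the face, with the target label."""
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()
    idx = rng.choice(len(images), size=int(len(images) * rate), replace=False)
    for i in idx:
        images[i, -patch:, -patch:] = 1.0   # the "trigger" patch
        labels[i] = target_label            # mislabel toward the target
    return images, labels, idx

# Tiny demo: 100 blank grayscale 8x8 "images" with labels 0..9
imgs = np.zeros((100, 8, 8))
labs = np.arange(100) % 10
p_imgs, p_labs, idx = poison_dataset(imgs, labs, target_label=7, rate=0.1)
```

At a 10% poisoning rate only a handful of samples are touched, which is why a human reviewer combing a large database is unlikely to notice them.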
An AI problem and an Army problem
Data quality and security are challenges for AI developers who
use databases larger than any human can comb through for
anomalies. But the Army and other services face the added
layer of threats from adversaries seeking to disarm the U.S.
military.
MaryAnne Fields, program manager for intelligent systems at
the Army Research Office, told FedScoop that countering
backdoor attacks and data poisoning is a high priority for her.
“The fact that you are using a large database is a two-way
street,” Fields said. “It is an opportunity for the adversary to
inject poison into the database.”
The software Li’s team is developing with ARO funding is
designed to detect potential backdoors in a database and
then instruct the algorithm to unlearn connections it may
have picked up from the bad data.
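The article does not describe how Li's software works internally. One common data-level defense in this spirit, sketched below purely as an assumption, screens each label group for samples that sit far from the group's centroid in pixel space; a stamped trigger patch tends to push poisoned samples out to such outlier positions, and flagged samples can then be dropped before retraining so the model "unlearns" the bad associations.

```python
import numpy as np

def flag_outliers(images, labels, z_thresh=2.5):
    """Illustrative backdoor screen (not Li's actual method): for each
    label group, compute every sample's distance to the group's mean
    image and flag samples whose distance is a statistical outlier."""
    flat = images.reshape(len(images), -1)
    flagged = np.zeros(len(images), dtype=bool)
    for c in np.unique(labels):
        grp = labels == c
        centroid = flat[grp].mean(axis=0)
        d = np.linalg.norm(flat[grp] - centroid, axis=1)
        mu, sd = d.mean(), d.std()
        if sd > 0:
            flagged[np.where(grp)[0][d > mu + z_thresh * sd]] = True
    return flagged

# Demo: 60 near-blank faces in one label group, 3 of them trigger-stamped
rng = np.random.default_rng(1)
imgs = rng.normal(0, 0.01, (60, 8, 8))
imgs[:3, -3:, -3:] = 1.0          # poisoned samples carry the patch
labs = np.zeros(60, dtype=int)
flags = flag_outliers(imgs, labs)
```

Retraining on `images[~flags]` is the crude version of the "unlearning" step; research systems do this more surgically, but the screening intuition is the same.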
The trigger challenge doesn’t only emanate from attackers.
Models will misclassify novel images if they learn on a
database without the right size and diversity of data points.
Having too few images with too many of the same traits in the
same label group could cause unintentional “natural triggers,”
Fields said. For example, a photo of a man in a database
labeled as “Frank” wearing a hat in every image may cause the
algorithm to classify all men with hats as Frank, or miss the
real Frank if he isn’t wearing a hat.
“The Army does need to think differently about the type of
data it is using,” Fields said.
Using large databases forces the Army to make difficult trade-
offs. Increasing the number of images increases the chance
for adversary attacks. Decrease the size, and unintentional
triggers formed from a monolithic database become a
problem.
“If you don’t have very much data to work with, these types of
problems, and particularly the natural triggers, become more
prevalent,” Fields said. “It is important to defend.”
Scaling the solution
The test batch the Duke researchers were given was small —
12,000 images of faces with 10 images per classification. Some
facial recognition databases exceed half a billion images.
But Li pointed to a different challenge: image resolution. As
images increase in quality, the complexity in searching for the
triggers increases “exponentially,” Li said.
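A back-of-envelope count, offered here only as an illustration of Li's scaling point and not as her analysis, shows why: the number of places a small trigger patch can sit grows with image area, and the number of possible patch patterns grows exponentially in the patch's pixel count.

```python
import math

def trigger_search_space(image_side, patch_side, levels=256, channels=3):
    """Rough size of the search space for a square trigger patch:
    returns (number of patch positions, log10 of the number of
    possible patch patterns). Purely illustrative arithmetic."""
    positions = (image_side - patch_side + 1) ** 2
    log10_patterns = patch_side ** 2 * channels * math.log10(levels)
    return positions, log10_patterns

pos_small, lp = trigger_search_space(32, 3)    # low-resolution image
pos_large, _ = trigger_search_space(224, 3)    # higher-resolution image
```

Even a 3x3 color patch already admits more than 10^65 patterns, and moving from 32x32 to 224x224 images multiplies the candidate positions by more than fifty, which is the flavor of blow-up Li describes.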
That spike in difficulty is in part due to triggers that can be
only a few pixels large, according to research published in
2014. “It is easy to produce images that are completely
unrecognizable to humans, but that state-of-the-art (deep
neural networks) believe to be recognizable objects with
99.99% confidence,” the paper says.
Despite this, Fields expressed confidence in the project,
calling the team’s solution “very scalable.”