Survey on Deep Neural Network Watermarking techniques
1. A survey of deep neural network
watermarking techniques
Based on: https://arxiv.org/pdf/2103.09274.pdf
Original Authors: Yue Li, Hongxia Wang, and Mauro Barni
Date Published: 16 Mar 2021
2. Introduction
What is watermarking in general?
In digital watermarking, a low-amplitude, potentially pseudorandom signal
is injected into the original document. The signal is designed to exploit
some form of redundancy in the content of the document, so that it can be
added easily without noticeably degrading the content.
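This additive idea can be sketched in a few lines of NumPy. The example below is a minimal, illustrative sketch, not a scheme from the survey; the function names, amplitude `alpha`, and detection threshold are assumptions. A pseudorandom ±1 pattern derived from a secret key is added at low amplitude, and detection correlates the content with the same pattern:

```python
import numpy as np

def embed(signal, key, alpha=0.01):
    """Add a low-amplitude pseudorandom +/-1 pattern derived from a secret key."""
    pattern = np.random.default_rng(key).choice([-1.0, 1.0], size=signal.shape)
    return signal + alpha * pattern

def detect(signal, key, threshold=0.005):
    """Correlate with the key's pattern; high correlation => watermark present."""
    pattern = np.random.default_rng(key).choice([-1.0, 1.0], size=signal.shape)
    return float(np.mean(signal * pattern)) > threshold

rng = np.random.default_rng(42)
host = rng.normal(size=1_000_000)     # stand-in for the document's samples
marked = embed(host, key=1234)

print(detect(marked, key=1234))       # True: watermark found with the right key
print(detect(host, key=1234))         # False: no watermark in the original
```

The ±1 pattern is the "redundancy-exploiting" signal: it is far below the host's own variation per sample, yet averaging over many samples makes the correlation statistic stand out clearly.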
Why is a watermark needed?
Watermarks serve to protect content and to claim ownership of an asset.
Without watermarks, valuable digital assets are vulnerable to content theft
and to unauthorized use and distribution.
3. Why Deep Neural Network (DNN) watermarking?
- Deep learning models are becoming more popular because of their human-like
capabilities, and for the same reason they are widely deployed and shared.
- Training a deep learning model is a non-trivial task: it requires huge amounts of
proprietary data and expends enormous computing power, energy, and human
expertise.
- DNNs should therefore be protected from unauthorized commercialization and
monetization of the models.
4. How is a watermark embedded in a DNN
- DNNs have the degrees of freedom needed to encode additional information, as
they have a substantial number of parameters.
- This doesn't impede the primary task the DNN is handling.
- Embedding takes place in the training phase by properly altering the loss
function.
- The impact is measurable from the performance achieved by the watermarked
model.
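As a toy illustration of altering the loss function, here is a minimal NumPy sketch in the style of Uchida et al.'s regularizer; the matrix sizes, learning rate, and helper names are assumptions for the demo. A secret projection matrix maps the weights to logits, a binary cross-entropy term pulls those logits toward the watermark bits, and extraction simply thresholds the projections:

```python
import numpy as np

def watermark_loss(weights, projection, bits):
    """BCE between sigmoid(projection @ weights) and the watermark bits.

    In practice this term is ADDED to the task loss; minimizing it nudges
    the weights so that the secret projection decodes to the chosen bits.
    """
    probs = 1.0 / (1.0 + np.exp(-(projection @ weights)))
    eps = 1e-12
    return -np.mean(bits * np.log(probs + eps) + (1 - bits) * np.log(1 - probs + eps))

def extract_bits(weights, projection):
    """White-box extraction: threshold the projected weights at zero."""
    return (projection @ weights > 0).astype(int)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=256)    # flattened weights of one layer
X = rng.normal(size=(32, 256))         # secret projection matrix (the key)
b = rng.integers(0, 2, size=32)        # 32-bit watermark message

# Toy "training": gradient descent on the regularizer alone; a real model
# would minimize task_loss + lambda * watermark_loss.
for _ in range(200):
    probs = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.5 * (X.T @ (probs - b)) / len(b)

print((extract_bits(w, X) == b).all())   # the embedded message decodes exactly
```

With 256 parameters carrying only 32 bits, the constraint is easy to satisfy with tiny weight changes, which is why embedding need not impede the primary task.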
5. Requirements for DNN watermarking
Capacity: Refers to the number of bits encoded by the watermark. Although a
large payload is desirable for a watermarking algorithm, it conflicts directly
with robustness.
Fidelity: Models containing watermarks should achieve performance levels
similar to those of models trained without watermarks.
Robustness: The ability to extract the watermark correctly even after the
model has been modified. The two most common manipulations a DNN watermark
must withstand are:
● Fine-tuning: re-training a model to solve a new task, which alters the weights.
● Network pruning: simplifying a complex neural network model for
deployment on low-power or computationally weak devices.
Trade-off triangle
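Robustness against pruning can be seen in a toy NumPy experiment (the sizes, embedding strength, and pruning fraction below are illustrative assumptions, not values from the survey): a pseudorandom pattern embedded in the weights still correlates strongly even after half of the weights are zeroed by magnitude pruning, because pruning removes exactly the weights that contribute least to the correlation statistic:

```python
import numpy as np

rng = np.random.default_rng(1)
pattern = rng.choice([-1.0, 1.0], size=200_000)      # secret key pattern
weights = rng.normal(size=200_000) + 0.05 * pattern  # "watermarked" weights

def magnitude_prune(w, fraction):
    """Zero out the smallest `fraction` of the weights by absolute value."""
    cutoff = np.quantile(np.abs(w), fraction)
    return np.where(np.abs(w) < cutoff, 0.0, w)

def correlation(w, p):
    return float(np.mean(w * p))

before = correlation(weights, pattern)   # ~0.05, the embedding strength
after = correlation(magnitude_prune(weights, 0.5), pattern)
print(before > 0.03 and after > 0.03)    # the statistic survives 50% pruning
```

Stronger embedding (higher capacity or amplitude) would survive heavier pruning but degrade fidelity, which is exactly the trade-off the triangle depicts.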
6. Other requirements
Security: There are two main kinds of intentional attacks:
● Watermark overwriting: adding an additional watermark to the model in order to make the
original watermark unreadable.
● Surrogate model attack: the attacker repeatedly queries the original network, collects its
outputs, and uses them to train a surrogate network that mimics the original network's
functionality.
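The surrogate model attack can be sketched as follows (the victim stub, query budget, and logistic-regression surrogate are illustrative assumptions): the attacker never sees the victim's parameters, only harvests input-output pairs and fits their own model to them:

```python
import numpy as np

rng = np.random.default_rng(9)

def victim(x):
    """Stub for the watermarked target model (a fixed linear scorer here)."""
    w = np.array([1.5, -2.0, 0.5, 1.0])
    return int(x @ w > 0)

# Attacker step 1: query the victim on unlabeled inputs, keep the answers.
queries = rng.normal(size=(2000, 4))
answers = np.array([victim(x) for x in queries])

# Attacker step 2: fit a surrogate on the harvested labels
# (logistic regression trained by gradient descent).
w_s = np.zeros(4)
for _ in range(300):
    probs = 1.0 / (1.0 + np.exp(-(queries @ w_s)))
    w_s -= 0.1 * queries.T @ (probs - answers) / len(answers)

fresh = rng.normal(size=(500, 4))
agreement = np.mean([int(x @ w_s > 0) == victim(x) for x in fresh])
print(agreement > 0.9)   # the surrogate mimics the victim's functionality
```

A static, weight-embedded watermark does not transfer to `w_s`, which is why surrogate attacks are a serious threat to white-box schemes.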
Generality: DNN watermarking algorithms should be adaptable to a range of
architectures carrying out a variety of tasks.
Efficiency: The computational overhead of training the DNN on its task while
simultaneously embedding the watermark.
7. DNN watermarking models - I
Multi-bit vs. zero-bit watermarking techniques
- Based on the exact content of the watermark
message
- When the watermark message is multi-bit, it
corresponds to a sequence of N bits
- When zero-bit is used, watermark extraction is
carried out as a detection task, wherein the
detector should determine the presence of a
known watermark.
Multibit (a) vs zero-bit (b) watermarking
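The distinction can be illustrated with a NumPy toy (the dimensions, amplitudes, and thresholds are assumptions for the demo): a multi-bit extractor decodes one message bit per secret carrier, while a zero-bit detector only thresholds a correlation statistic for a single known pattern:

```python
import numpy as np

rng = np.random.default_rng(7)
weights = rng.normal(size=10_000)                  # stand-in host weights

# Multi-bit: embed an N-bit message; the extractor DECODES all N bits.
N = 16
keys = rng.choice([-1.0, 1.0], size=(N, 10_000))   # one secret carrier per bit
message = rng.integers(0, 2, size=N)
signs = 2 * message - 1                            # map bits {0,1} -> {-1,+1}
marked = weights + 0.05 * (signs @ keys)           # superpose all carriers
decoded = (keys @ marked > 0).astype(int)
print((decoded == message).all())                  # True: message recovered

# Zero-bit: one known pattern; the extractor only DETECTS its presence.
pattern = rng.choice([-1.0, 1.0], size=10_000)
marked0 = weights + 0.1 * pattern
print(float(np.mean(marked0 * pattern)) > 0.05)    # True: watermark detected
```

The multi-bit decoder answers "what is the message?", the zero-bit detector only answers "is my watermark here?", which is why zero-bit schemes can get away with lower capacity.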
8. DNN watermarking models - II
Static vs. dynamic DNN watermarking
- Based on where the watermark can be read from.
Static watermarking: The watermark can be read directly from the
network weights, which can be considered similar to conventional
multimedia watermarking techniques.
Dynamic watermarking: The watermark alters the behavior of the network when
it is fed with suitably crafted inputs, making the watermark message visible
in the model output.
10. DNN watermarking models - III
White-box vs. black-box DNN extraction
- Based on the data accessible to the watermark extractor
White-Box: When internal parameters/weights of the DNN models are available, watermark
recovery is undertaken in a white-box mode. This can be static or dynamic watermarking.
Black-box: When using black-box watermarking, only the final output of the DNN is accessible.
● A watermark can be recovered by querying the model and inspecting its outputs for a
set of suitably chosen inputs.
● During the entire decoding or detection process, both the model architecture and internal
parameters are completely invisible to the decoder.
● This means black-box recovery can be achieved only by dynamic watermarking.
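A black-box verification round can be sketched as follows (the trigger-set construction, model stubs, and the 90% threshold are illustrative assumptions): the owner queries the suspect API with a secret trigger set and checks how many answers match the labels planted during training:

```python
import numpy as np

rng = np.random.default_rng(3)

# Secret trigger set the owner paired with arbitrary labels during training
# (backdoor-style); kept private and used only for verification.
trigger_inputs = [rng.normal(size=8) for _ in range(20)]
trigger_labels = [int(l) for l in rng.integers(0, 10, size=20)]

def watermarked_model(x):
    """Stub for the deployed model: it memorized the trigger labels."""
    for t, label in zip(trigger_inputs, trigger_labels):
        if np.allclose(x, t):
            return label
    return int(abs(x.sum()) * 1000) % 10   # placeholder "normal" prediction

def clean_model(x):
    """Stub for an unrelated model that never saw the triggers."""
    return int(abs(x.sum()) * 1000) % 10

def verify(model, threshold=0.9):
    """Query the API on the trigger set; claim ownership if enough labels match."""
    hits = sum(model(x) == y for x, y in zip(trigger_inputs, trigger_labels))
    return hits / len(trigger_labels) >= threshold

print(verify(watermarked_model))   # True
print(verify(clean_model))         # False
```

Note that `verify` only calls the model as a function: neither the architecture nor the weights are ever inspected, matching the black-box setting.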
12. Static Watermarking algorithms
Uchida et al. (white-box, multi-bit)
Methodology: a regularization term is added to the loss function so that the watermark is embedded into the model weights.
Robustness and security: moderate against fine-tuning and pruning.

Li et al. (white-box, multi-bit)
Methodology: as Uchida's scheme, with an ST-DM-like regularization term.
Robustness and security: moderate against fine-tuning and pruning.

DeepMarks (white-box, multi-bit)
Methodology: as Uchida's scheme, with anti-collusion codebooks.
Robustness and security: moderate against fine-tuning, pruning, and collusion attacks.

Tartaglione et al. (white-box, zero-bit)
Methodology: the watermarked weights remain frozen during training; the loss function maximizes the sensitivity of the network to changes in the watermarked weights.
Robustness and security: good against fine-tuning and weight quantization.
13. Dynamic watermarking algorithms
DeepSigns, activation map (white-box, multi-bit)
Methodology: adds an arbitrary N-bit string to the probability density function of the activation maps.
Robustness and security: moderate against fine-tuning and pruning.

DeepSigns, output layer (black-box, zero-bit)
Methodology: builds key image-label pairs from randomly selected images.
Robustness and security: moderate against fine-tuning, pruning, and overwriting.

Yossi et al. (black-box, zero-bit)
Methodology: injects a backdoor into the target model using randomly selected images.
Robustness and security: moderate against fine-tuning.

Szyller et al. (black-box, zero-bit)
Methodology: installed at the input and output of the target model's API; embeds dynamic watermarks in the responses to client queries.
Robustness and security: surrogate model attack.

Merrer et al. (black-box, zero-bit)
Methodology: uses adversarial examples to adjust the decision boundary of the target model.
Robustness and security: parameter pruning, overwriting via adversarial fine-tuning.

Zhang et al. (black-box, zero-bit)
Methodology: visible triggering patterns with backdoor-like mechanisms.
Robustness and security: model fine-tuning, parameter pruning, model inversion attack.

Guo et al. (black-box, zero-bit)
Methodology: invisible triggering patterns with backdoor-like mechanisms.
Robustness and security: model fine-tuning.
14. Attacks on DNN Watermarking
● Attackers can take advantage of the observation that watermark
embedding increases the variance of the weights, making it possible to
distinguish a watermarked model from a non-watermarked one.
● Also, the standard deviation of the weights increases linearly with the
watermark dimension, allowing the attacker to estimate the length of the
watermark or tell whether one is present.
● This information is then used to replace the existing watermark with a new
one, making the original watermark unreadable.
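This variance cue is easy to see in a NumPy toy (the scales below are assumptions for the demo): adding a pseudorandom pattern of amplitude a to weights of standard deviation s raises the measured deviation to roughly sqrt(s² + a²), which an attacker can compare against typical unwatermarked layers:

```python
import numpy as np

rng = np.random.default_rng(5)
clean = rng.normal(scale=0.1, size=100_000)      # unwatermarked layer weights

pattern = rng.choice([-1.0, 1.0], size=100_000)  # embedded +/-1 pattern
marked = clean + 0.05 * pattern                  # embedding amplitude a = 0.05

# The pattern's variance adds on top of the weights' own variance:
# std grows from ~0.1 to ~sqrt(0.1**2 + 0.05**2) ~ 0.112.
print(f"{clean.std():.3f} -> {marked.std():.3f}")
print(marked.std() > 1.05 * clean.std())         # the marked layer stands out
```

The same arithmetic explains the second observation: the excess variance grows with the embedded payload, so measuring it hints at the watermark's length.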
15. Conclusion
- DNN watermarking is subject to vulnerabilities, just like any other
watermarking solution.
- The main challenge is providing robustness against:
● Fine-tuning
● Model pruning
● Transfer learning
- DNNs are becoming increasingly popular thanks to their human-like capabilities, and
considering the resources invested in building them, it is important that these
advancements be protected. From our reading, we can be quite confident that
watermarking is one of the reliable ways to achieve this goal.