- Software watermarking embeds hidden information within source code to track ownership, protect copyright, detect unauthorised modifications, and combat piracy. However, watermarked code can be vulnerable to attacks that aim to remove or alter the watermark. We highlight the ongoing challenges and future directions in securing watermarking techniques for robust protection of intellectual property in the software domain.
3. Software Watermarks & Fingerprints
Embed a unique identifier in a program to trace software pirates.
Watermarking
1. discourages theft,
2. allows us to prove theft.
Fingerprinting
3. allows us to trace violators.
4. Malicious Reverse Engineering
[Figure: Buy one copy ⇒ Reuse module ⇒ Sell]
Alice and Bob are competing software developers.
Bob reverse engineers Alice's program and includes parts of it in his own code.
Easier with Java bytecode, .NET, ANDF, ...
⇒ Alice obfuscates her code.
6. Software Piracy
[Figure: Buy one copy ⇒ Make illegal copies ⇒ Resell]
Alice is a software developer.
Bob buys one copy of Alice's application and sells copies to third parties.
⇒ Alice watermarks/fingerprints her program.
9. Watermarking Transformations
Naive approaches typically use reordering (of statements, basic blocks, ...) or renaming (of registers, methods, ...).
[Figure: REORDER and RENAME transformations]
More powerful approaches extend program semantics or alter program statistics.
[Figure: ALTER STATS and EXTEND SEMANTICS transformations]
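A reordering watermark can be sketched as follows. The slides do not say how the order carries the mark, so this sketch assumes the watermark is an integer encoded as a permutation of n reorderable items (for example method declarations) using the factorial number system. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReorderWatermark {
    // Encode w (0 <= w < n!) as a permutation of the item indices 0..n-1.
    static int[] embed(long w, int n) {
        long[] fact = new long[n];
        fact[0] = 1;
        for (int i = 1; i < n; i++) fact[i] = fact[i - 1] * i;
        if (w < 0 || w >= fact[n - 1] * n) {
            throw new IllegalArgumentException("watermark does not fit in n! orderings");
        }
        List<Integer> pool = new ArrayList<>();
        for (int i = 0; i < n; i++) pool.add(i);
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) {
            long f = fact[n - 1 - i];
            int idx = (int) (w / f);     // next factorial-base digit
            w %= f;
            perm[i] = pool.remove(idx);  // pick the idx-th item still unused
        }
        return perm;
    }

    // Recover w from the order in which the items appear.
    static long recognize(int[] perm) {
        int n = perm.length;
        long w = 0;
        for (int i = 0; i < n; i++) {
            int smaller = 0;             // later items with a smaller index
            for (int j = i + 1; j < n; j++) if (perm[j] < perm[i]) smaller++;
            w = w * (n - i) + smaller;
        }
        return w;
    }

    public static void main(String[] args) {
        int[] order = embed(42, 5);
        System.out.println(Arrays.toString(order)); // [1, 4, 0, 2, 3]
        System.out.println(recognize(order));       // 42
    }
}
```

Any transformation that reorders the same items again destroys the mark, which is why the slides call this kind of approach naive.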
11. Static watermarking
•Static watermarking embeds ownership information directly into the software code.
•The information is stored in the program text itself and is intended to be hard to remove.
•It is used to track ownership, identify pirated copies, and deter unauthorized distribution.
12. Static watermarking algorithms
•The Bogus Initializer static watermarking algorithm works by inserting seemingly meaningless code into the software.
•This code, however, contains the watermark information.
•The code is designed not to affect the functionality of the software.
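The idea can be illustrated at source level. This is only a sketch under assumptions: the slides do not show the algorithm's actual output, and tools such as SandMark work on bytecode. The class `Billing`, the field `wm0`, and the recognizer are all hypothetical names.

```java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;

class Billing {
    // Bogus initializer: never read by the program, but its value carries the mark.
    private static final int wm0 = 0x57494C44; // hypothetical mark, ASCII "WILD"

    static int total(int price, int qty) {
        return price * qty; // real functionality, unaffected by the field above
    }
}

class BogusInitRecognizer {
    // Read the named static int field and decode its four bytes as ASCII.
    static String recognize(Class<?> c, String fieldName) {
        try {
            Field f = c.getDeclaredField(fieldName);
            f.setAccessible(true);
            int v = f.getInt(null);
            byte[] bytes = {
                (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
            };
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("watermark field not found", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(recognize(Billing.class, "wm0")); // WILD
    }
}
```

Because the field is dead code, any optimizer that removes unused variables also removes the mark, which matches the later point that static algorithms are vulnerable to semantics-preserving transformations.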
13. EXTEND SEMANTICS (Moskowitz & Cooperman)
class Main {
   const Picture C = · · ·
   Code R = Decode(C);
   Execute(R);
}
A watermarked media object is embedded in the program's static data segment.
"Essential" parts of the program are steganographically encoded into the media.
If the watermarked image is attacked, the embedded code will crash.
US Patent 5,745,569, Jan 1996.
15. Original Code
public class C {
   static int gcd(int x, int y) {
      int t;
      while (true) {
         boolean b = x % y == 0;
         if (b) return y;
         t = x % y; x = y; y = t;
      }
   }
   public static void main(String[] a) {
      System.out.print("Answer: ");
      System.out.println(gcd(100, 10));
   }
}
16. Boolean Splitting Obfuscation
public class C {
   static int gcd(int i, int j) {
      int t8, t7, k;
      for (;;) {
         if (i % j == 0) { t8 = 1; t7 = 0; }
         else { t8 = 0; t7 = 0; }
         if ((t7 ^ t8) != 0)
            return j;
         else {
            k = i % j; i = j; j = k;
         }
      }
   }
   public static void main(String[] Z1) {
      System.out.print("Answer: ");
      System.out.println(gcd(100, 10));
   }
}
17. Bogus Branch Obfuscation
public class C {
   static int gcd(int i, int j) {
      int t9, t8, q7, q6, q4, q3;
      q7 = 9;
      for (;;) {
         if (i % j == 0) { t9 = 1; t8 = 0; }
         else { t9 = 0; t8 = 0; }
         q4 = t8; q6 = t9;
         if ((q4 ^ q6) != 0)
            return j;
         else {
            if ((((q7 + q7 * q7) % 2 != 0) ? 0 : 1) != 1) return 0;
            q3 = i % j; i = j; j = q3;
         }
      }
   }
   public static void main(String[] Z1) {
      System.out.print("Answer: ");
      System.out.println(gcd(100, 10));
   }
}
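The bogus branch above never fires because its condition is an opaque predicate: q7 + q7*q7 equals q7*(q7+1), a product of two consecutive integers, so it is always even. The following sketch checks that property on a few values, including ones that overflow 32-bit arithmetic. The class and method names are hypothetical.

```java
class OpaquePredicate {
    // Same test as in the obfuscated gcd, with the sense flipped:
    // returns true exactly when the bogus "return 0" would NOT be taken.
    static boolean alwaysTrue(int q) {
        return (((q + q * q) % 2 != 0) ? 0 : 1) == 1;
    }

    public static void main(String[] args) {
        int[] samples = {9, 0, -1, 1, 46341, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int q : samples) {
            System.out.println(q + " -> " + alwaysTrue(q));
        }
    }
}
```

Overflow does not break the predicate: Java int arithmetic wraps modulo 2^32, which preserves the lowest bit, so the parity of q + q*q is the same as in exact arithmetic.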
18. String Encoding Obfuscation
public class C {
   static int gcd(int i, int j) {
      // As before
   }
   public static void main(String[] a) {
      System.out.print(
         Obfuscator.DecodeString(   // Rename this!
            "\u00AB\u00CD\u00AB\u00CD" +
            "\uFF84\u2A16\u5D68\u2AA0" +
            "\u388E\u91CF\u5326\u5604"));
      System.out.println(gcd(100, 10));
   }
}
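The slide does not show how Obfuscator.DecodeString works. As a stand-in, the sketch below uses a position-dependent XOR with a fixed key, so that the literal stored in the program is unreadable and is decoded only at run time. The class name, the key, and the scheme itself are assumptions made for illustration.

```java
class StringCodec {
    private static final int KEY = 0x5A17; // hypothetical key

    // XOR is its own inverse, so encoding and decoding share one routine.
    static String encode(String plain) { return xor(plain); }
    static String decode(String encoded) { return xor(encoded); }

    private static String xor(String s) {
        char[] out = s.toCharArray();
        for (int i = 0; i < out.length; i++) {
            out[i] = (char) (out[i] ^ (KEY + i)); // key varies with position
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String hidden = encode("Answer: ");
        System.out.print(decode(hidden)); // prints "Answer: "
        System.out.println();
    }
}
```

As the slide's "Rename this!" comment suggests, a decoder with an obvious name points an attacker straight at the hidden strings, so the routine itself should also be renamed or obfuscated.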
19. Collusion Protection by Obfuscation
[Figure: copies P1 (fingerprint 42) and P2 (fingerprint 17) are obfuscated with Key1 and Key2 to give P1′ and P2′; "Collusive Attack?"]
Obfuscation can also be used to protect against collusive attacks.
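A collusive attack can be sketched as a plain diff of two fingerprinted copies. In this toy model each program is an array of integer "instructions" and the only difference is the fingerprint (42 versus 17, as in the figure). The `scramble` method is a hypothetical stand-in for per-copy obfuscation; it is not semantics-preserving and serves only to show the effect on the diff.

```java
import java.util.ArrayList;
import java.util.List;

class CollusionDiff {
    // Positions at which two copies differ (copies assumed to have equal length).
    static List<Integer> diff(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            if (a[i] != b[i]) out.add(i);
        }
        return out;
    }

    // Toy per-copy transformation keyed by an obfuscation key.
    static int[] scramble(int[] p, int key) {
        int[] out = new int[p.length];
        for (int i = 0; i < p.length; i++) out[i] = p[i] + key * (i + 1);
        return out;
    }

    public static void main(String[] args) {
        int[] p1 = {10, 20, 42, 30}; // copy fingerprinted with 42
        int[] p2 = {10, 20, 17, 30}; // copy fingerprinted with 17
        System.out.println(diff(p1, p2));                           // [2]
        System.out.println(diff(scramble(p1, 3), scramble(p2, 5))); // [0, 1, 2, 3]
    }
}
```

Without obfuscation the diff points directly at the fingerprint; with a different key per copy every position differs, so the diff no longer isolates it.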
20. Collusion Protection by Obfuscation
public class C {
   static Object get0(Object[] I) {
      Integer K, J, M, N; int r, q, j; K = new Integer(9);
      j = 2; j = 60 - (j + 1); ++j; j = 60 - j;
      for (;;) {
         if (((Integer) I[0]).intValue() % ((Integer) I[1]).intValue() == 0)
            { r = 1; q = 0; } else { r = 0; q = 0; }
         M = new Integer(q); J = new Integer(r);
         if ((M.intValue() ^ J.intValue()) != 0)
            return new Integer(((Integer) I[1]).intValue());
         else {
            if ((((K.intValue() + K.intValue() * K.intValue()) % 2 != 0) ? 0 : 1) != 1)
               return new Integer(0);
            N = new Integer(((Integer) I[0]).intValue() %
                            ((Integer) I[1]).intValue());
            I[0] = new Integer(((Integer) I[1]).intValue());
            I[1] = new Integer(N.intValue());
         } } }
   public static void main(String[] Z1) {
      int j = 2; int i = 2; i = 80 - (i + 1); j = 80 - (j + 1);
      System.out.print((String) Obfuscator.get0(new Object[] {
         (String) new Object[] { "String as before" }[0] }));
      ++i; i = 80 - i; ++j; j = 80 - j;
      System.out.println(((Integer) get0(
         new Object[] { (Integer) new Object[] {
               new Integer(100), new Integer(10) }[0],
            (Integer) new Object[] {
               new Integer(100), new Integer(10) }[1]
         })).intValue());
   } }
21. Dynamic watermarking
•Dynamic watermarks are embedded during program
execution.
•Specific events or conditions trigger the watermarking
process.
•The watermark information can be encoded in various
aspects of the execution state, such as:
• Variable values
• Data structure organization
• Control flow paths
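The simplest of these options, a mark carried by a variable value, might look like the sketch below. It assumes a hypothetical program whose hidden state variable reaches a particular value only when the program is fed a secret input sequence. Here the mark is simply defined as that value; a real embedder would arrange the code so that a chosen mark appears.

```java
class DynamicValueMark {
    private int state = 0; // looks like an ordinary running checksum

    // Ordinary program step; as a side effect it folds the input into the state.
    int step(int input) {
        state = state * 31 + input;
        return input * 2; // the visible behaviour of the program
    }

    // Recognizer: read the state after running on the secret inputs.
    int extract() {
        return state;
    }

    public static void main(String[] args) {
        DynamicValueMark p = new DynamicValueMark();
        for (int in : new int[] {7, 3, 9}) p.step(in); // secret input sequence
        System.out.println(p.extract());               // 6829
    }
}
```

The mark is not present anywhere in the program text; it exists only in the execution state reached on the secret inputs.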
22. Dynamic watermarking algorithms
24. Static vs. Dynamic Watermarking
[Figure: Static Embed takes program P, watermark w, and a key and produces P′; Static Extract takes P′ and the key and returns w.]
Static algorithms are vulnerable to semantics-preserving code transformations.
[Figure: Dynamic Embed takes P, w, and the inputs I_1, ..., I_k and produces P′; Dynamic Extract runs P′ on I_1, ..., I_k and returns w.]
Dynamic algorithms extract the mark from the state of the program when run on a secret key input sequence.
25. Collberg-Thomborson watermarking algorithm
•The Collberg-Thomborson (CT) technique leverages dynamic data structures for watermarking.
•It constructs a hidden data structure within the program's memory during execution.
•This data structure encodes the watermark information.
26. CT algorithm implementation techniques
•Watermark Embedding: During program execution,
specific events trigger the creation of the hidden data
structure.
•Watermark Encoding: The watermark information
(ownership, license details, etc.) is encoded within the
data structure.
•Data Structure Manipulation: The data structure is
manipulated subtly to embed the watermark without
affecting program functionality.
•Watermark Verification: A separate program
(watermark decoder) can extract the watermark
information from the hidden data structure to verify
ownership or identify tampering attempts.
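The encoding and verification steps above can be sketched with a small graph codec. The slides do not specify which graph encoding is used, so this sketch assumes a simple radix-style scheme chosen for illustration: n nodes form a circular list through `edge1`, and each node's `edge2` points d steps ahead to encode one base-n digit d. All names are hypothetical.

```java
class Node {
    Node edge1; // next node in the circular list
    Node edge2; // encodes one digit: points d steps ahead along edge1
}

class RadixGraph {
    // Build an n-node circular list encoding w in base n (0 <= w < n^n).
    static Node embed(long w, int n) {
        Node[] nodes = new Node[n];
        for (int i = 0; i < n; i++) nodes[i] = new Node();
        for (int i = 0; i < n; i++) {
            nodes[i].edge1 = nodes[(i + 1) % n];
            int digit = (int) (w % n);       // least significant digit first
            w /= n;
            nodes[i].edge2 = nodes[(i + digit) % n];
        }
        return nodes[0];
    }

    // Walk the list from the root and read the digits back.
    static long recognize(Node root) {
        int n = 1;
        for (Node p = root.edge1; p != root; p = p.edge1) n++;
        long w = 0, weight = 1;
        Node p = root;
        for (int i = 0; i < n; i++) {
            int digit = 0;
            for (Node q = p; q != p.edge2; q = q.edge1) digit++;
            w += digit * weight;
            weight *= n;
            p = p.edge1;
        }
        return w;
    }

    public static void main(String[] args) {
        Node root = embed(2024, 6);
        System.out.println(recognize(root)); // 2024
    }
}
```

In a watermarked program the calls that build the graph would be spread along the execution path taken on the secret input, and the decoder would locate the root on the heap rather than receive it as an argument.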
27. EXTEND SEMANTICS (Collberg-Thomborson)
[Figure: Heap and Control Flow panels; on the input sequence I_1, ..., I_k the program executes Build G1 and Build G2.]
The watermark is embedded in the topology of a dynamic graph structure, built at runtime but only for the special input sequence I_1, ..., I_k.
Why? Shape analysis is hard.
ACM Principles of Programming Languages, POPL'99
28. CT — Example
public class Simple {
   static void P(String i) {
      System.out.println("Hello " + i);
   }
   public static void main(String args[]) {
      P(args[0]);
   }
}
⇓
class Watermark extends java.lang.Object {
   public Watermark edge1, edge2;
}
⇓
29. CT — Example. . .
public class Simple_W {
   static void P(String i, Watermark n2) {
      if (i.equals("World")) {
         Watermark n1 = new Watermark();
         Watermark n4 = new Watermark();
         n4.edge1 = n1; n1.edge1 = n2;
         Watermark n3 = (n2 != null) ? n2.edge1 : new Watermark();
         n3.edge1 = n1;
      }
      System.out.println("Hello " + i);
   }
   public static void main(String args[]) {
      Watermark n3 = new Watermark();
      Watermark n2 = new Watermark();
      n2.edge1 = n3; n2.edge2 = n3;
      P(args[0], n2);
   }
}
32. A Session with SANDMARK
[Figure: ORIG.jar ⇒ Embed Watermark ("WILDCATS") ⇒ Select Algorithm / Configure / Obfuscate ⇒ NEW.jar]
We obfuscate to protect against reverse engineering and collusive de-watermarking attacks.
33. A Session with SANDMARK
[Figure: NEW.jar ⇒ Recognize Watermark ⇒ "WILDCATS"]
We extract the watermark to prove ownership.
34. A Session with SANDMARK
[Figure: NEW.jar ⇒ Compare Bytecodes / Compute Static Statistics / View/Sort Bytecodes ⇒ Watermark located?]
To simulate a manual attack we examine the obfuscated/watermarked program using various static analysis tools.
35. A Session with SANDMARK
[Figure: NEW.jar ⇒ Select Algorithm / Configure / Obfuscate ⇒ ATTACKED.jar ⇓ Recognize Watermark ⇓ Watermark destroyed?]
To simulate an automatic attack we use SANDMARK's obfuscators ("SoftStir") to attack the watermark.
36. Some other tools
Language | Potential tools | Notes
C/C++ | StegFS, Watermarking for C/C++ | StegFS might be a general steganography tool, potentially adaptable for watermarking. Investigate whether "Watermarking for C/C++" is a specific tool or a generic description.
Java | LunaJava- JWatermark, Java Watermarking Tool, JDWP, Allatori, Sandmark | JWatermark and Java Watermarking Tool could be specific tools or generic descriptions. JDWP focuses on debugging, but might have watermarking capabilities (investigate further). Allatori and Sandmark are confirmed tools mentioned previously.
Python | PyWatermark | Seems to be a specific tool for Python watermarking.
JavaScript | JSSP, JSMin, UglifyJS | JSSP is unclear; investigate whether it is for watermarking or something else. JSMin and UglifyJS are minifiers, not watermarking tools.
MATLAB | MatWater | Seems to be a specific tool for MATLAB watermarking.
PHP | PHPWatermark, PHPWatermarkingTool | As with Java, these might be specific tools or generic descriptions (investigate further).
Ruby | RubyWater | Seems to be a specific tool for Ruby watermarking.
Swift | SwiftyWatermark | Seems to be a specific tool for Swift watermarking.
37. Conclusion
Many interesting problems left to work on!
Formal models of attack and stealth.
Combining error correction and tamper-proofing.
Watermarking other languages.
Download from sandmark.cs.arizona.edu.
Editor's Notes
Problem statements
Static watermarking is a powerful tool for software developers to protect their intellectual property. By embedding a watermark, they can deter piracy and ensure traceability of their work.
Collusive attacks involve the comparative analysis of two or more copies of a fingerprinted program. In the simplest case the only difference between the copies is the fingerprint, which reveals the location of the fingerprint in all of them. To avoid this kind of attack, every copy must be obfuscated differently before distribution. After obfuscation there will be many differences between the copies and the watermark will be harder to find. However, this can complicate debugging customers' programs; for example, bug reports sent in by customers may be specific to their copy of the software. Collberg and Thomborson [3] suggest that it will be necessary to store the keys used to fingerprint and obfuscate every sold copy of a program in order to recreate an exact copy of the customer's program for debugging purposes.
Dynamic watermarking offers greater flexibility compared to static methods. The watermark is not fixed within the code but rather generated based on program behavior. This makes it more challenging to detect and remove.
The Collberg Thomborson technique takes advantage of the program's dynamic memory allocation. It creates a hidden data structure that holds the watermark information. This structure is carefully crafted to be unobtrusive and difficult to detect during normal program operation.
The Collberg Thomborson technique follows a well-defined process. The watermark is embedded during program execution based on specific triggers. The data structure is then manipulated to encode the watermark information in a way that is inconspicuous. Finally, a separate program can be used to verify the presence and content of the watermark.