This document summarizes a study on the impact of scrambling techniques on the entropy of barcodes. The study tested barcodes with and without error correcting codes (ECC) using four scrambling methods and three entropy measures. Results showed that scrambling increased the entropy and randomness of barcodes that originally contained ECC, making it harder to detect the presence of ECC. However, the difference in entropy between scrambled ECC and non-ECC barcodes was small and not statistically significant. The study concluded that while entropy analysis can detect the presence of structure in barcodes, the methods tested were not effective at distinguishing scrambled ECC barcodes from purely random barcodes.
1. IS&T NIP26 Conference, 23 September, 2010
Impact of Scrambling on
Barcode Entropy
Marie Vans, Steven Simske, Margaret Sturgill, & Jason Aronoff
HP Laboratories, Fort Collins, CO, USA
23 September 2010
Introduction
– Barcodes not just for ringing up sales anymore:
  • Connecting to websites
  • Consumer capture of content
– 1D vs. 2D/3D Barcodes
  • Older 1D barcode standards being replaced and/or augmented with 2D or 3D barcodes
  • High-density barcodes used for additional data carrying or referencing
– ECC
  • Added for robustness to certain types of distortion and damage
  • Nature of ECC derived from assumptions more relevant to 1D barcodes/general information theory
  • Use of ECC can be questioned
  • Opens door to using barcodes as information carriers outside of the current barcode standards
– Previous Work
  • Effect of the print-scan (PS) cycle, or "copying" cycle
  • Localized damage such as water damage and/or puncturing
  • Blurring
S.J. Simske, M. Sturgill, and J.S. Aronoff, "Effect of Copying and Restoration on Color Barcode Payload Density," Proc. ACM DocEng, vol. 9, pp. 127-130, 2009.
Some Background
– An attempt to highlight differential effects of encryption methods on entropy by applying scrambling techniques to randomly generated strings with and without Error Correcting Codes (ECC)
– Major Pieces:
  • Entropy
    – Increasing entropy reduces the likelihood of a fraudulent agent being able to "guess" correct barcodes
  • Scrambling
    – Four ways to mix up the barcode data
  • ECC
    – Reed-Solomon Error Correcting Codes
Entropy Measures
Entropy as a measure for the
effect of ECC and
scrambling on 2D barcodes.
Here, entropy represents
signal randomness: how the
bits are distributed in a
signal.
Equation 1 - Normalized Entropy (e1): a sum over i = 1..N of logarithmic terms in the expected value E(X) and the observed value x.
Figure: Expected Values (x-axis: Run Lengths, 1 to 12; series: Expected)
Figure: Normalized Entropy (y-axis: Entropy; categories: Max Entropy, Low Entropy, Minimum Entropy)
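The Expected Values chart plots expected frequencies against run lengths, which suggests the first measure compares observed bit-run lengths with those of an ideal random signal. The sketch below illustrates that idea only. It assumes the expected fraction of runs of length k in a fair random bit string is 2^-k (a standard property of such strings), and it uses a plain sum of absolute deviations as a stand-in, not the authors' Equation 1. All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import groupby


def run_length_distribution(bits, max_len=12):
    """Observed fraction of runs of each length 1..max_len (longer runs pooled into max_len)."""
    runs = [min(len(list(group)), max_len) for _, group in groupby(bits)]
    if not runs:
        raise ValueError("bits must not be empty")
    counts = Counter(runs)
    return [counts.get(k, 0) / len(runs) for k in range(1, max_len + 1)]


def expected_run_length_distribution(max_len=12):
    """Expected fractions for a fair random bit string: P(length = k) = 2**-k, tail pooled."""
    probs = [2.0 ** -k for k in range(1, max_len)]
    probs.append(1.0 - sum(probs))  # probability of a run of length >= max_len
    return probs


def total_deviation(bits, max_len=12):
    """Stand-in score (an assumption): sum of |observed - expected| over run lengths."""
    observed = run_length_distribution(bits, max_len)
    expected = expected_run_length_distribution(max_len)
    return sum(abs(o - e) for o, e in zip(observed, expected))


rng = random.Random(26)
random_bits = [rng.randint(0, 1) for _ in range(310)]  # 310 bits, the average length in the study
structured_bits = [0, 1] * 155                         # strictly alternating, fully structured

print("random    :", round(total_deviation(random_bits), 3))
print("structured:", round(total_deviation(structured_bits), 3))
```

A strictly alternating string contains only runs of length 1, so its deviation from the expected distribution is large, while a random string of the same length should score much closer to zero.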
Entropy Measures - continued
Entropy based on Hamming Distance. N refers to the maximum Hamming Distance (HD) between two bytes and x_i refers to the normalized HD of the actual strings. This HD is calculated on a moving window along a string in a forward direction.
Equation 2 - Hamming Distance Entropy (e2): a sum over i = 1..N of logarithmic terms in the normalized Hamming distance x_i.
Figure: Hamming Distance (HD) Entropy (y-axis: Entropy; categories: Max Entropy, Low Entropy, Minimum Entropy)
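The moving-window Hamming distance can be sketched as follows. This is a minimal illustration under stated assumptions: the window compares each byte with the byte that follows it, moving forward one byte at a time, and the normalized values are the fractions of windows at each distance 0..8 (8 being the maximum HD between two bytes). The Shannon entropy at the end is a stand-in summary, not the authors' Equation 2. All function names are hypothetical.

```python
import math
from collections import Counter


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two byte values."""
    return bin(a ^ b).count("1")


def windowed_hamming_distances(data: bytes):
    """HD between each byte and its successor, moving forward along the string."""
    return [hamming_distance(data[i], data[i + 1]) for i in range(len(data) - 1)]


def normalized_hd_histogram(data: bytes, max_hd: int = 8):
    """Fraction of windows observed at each Hamming distance 0..max_hd."""
    distances = windowed_hamming_distances(data)
    if not distances:
        raise ValueError("need at least two bytes")
    counts = Counter(distances)
    return [counts.get(d, 0) / len(distances) for d in range(max_hd + 1)]


def shannon_entropy(probabilities):
    """Shannon entropy in bits; used here only as a stand-in summary of the histogram."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


structured = bytes([0x00, 0xFF] * 20)  # every window has HD = 8
histogram = normalized_hd_histogram(structured)
print(histogram)
print(shannon_entropy(histogram))
```

A string that alternates between 0x00 and 0xFF puts every window at distance 8, so the histogram collapses to a single bin and the stand-in entropy is zero. Random data would spread the windows across several distances.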
Scrambling Techniques
XOR:
• A randomly generated string of the same size as the entire string (message + ECC bits) is XOR'd with the input string.
Structural scramble:
• Divide the string matrix into equal-sized structures (squares, rectangles, etc.). Swap bits within each structure so the new structure is a mirror image of the original.
Even Check Bits:
• Add a check bit at the end of each row & column so that the total number of black modules is even.
Odd Check Bits:
• Add a check bit at the end of each row & column so that the total number of black modules is odd.
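The four techniques above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the structural scramble mirrors fixed-width blocks left to right within each row (one of several possible mirror choices), black modules are represented as 1 bits, and the corner check bit is chosen to satisfy the column rule. All function names are hypothetical.

```python
import random


def xor_scramble(bits, rng):
    """XOR the input with a random key of the same length; returns (scrambled, key)."""
    key = [rng.randint(0, 1) for _ in bits]
    return [b ^ k for b, k in zip(bits, key)], key


def mirror_blocks(matrix, block_width):
    """Mirror each block_width-wide block of every row left to right (assumed mirror axis)."""
    scrambled = []
    for row in matrix:
        if len(row) % block_width:
            raise ValueError("row length must be a multiple of block_width")
        new_row = []
        for start in range(0, len(row), block_width):
            new_row.extend(reversed(row[start:start + block_width]))
        scrambled.append(new_row)
    return scrambled


def add_check_bits(matrix, odd=False):
    """Append a check bit to each row, then a final row of column check bits.

    With odd=False every original row and every column ends up with an even
    number of 1s; with odd=True the count is odd. The final row is built from
    the column rule only.
    """
    target = 1 if odd else 0
    with_row_bits = [row + [(sum(row) + target) % 2] for row in matrix]
    column_bits = [(sum(col) + target) % 2 for col in zip(*with_row_bits)]
    return with_row_bits + [column_bits]


rng = random.Random(0)
message = [1, 0, 1, 1, 0, 0, 1, 0]
scrambled, key = xor_scramble(message, rng)
print(scrambled)
print(mirror_blocks([[1, 1, 0, 0], [1, 0, 0, 0]], block_width=2))
print(add_check_bits([[1, 0, 1], [0, 1, 1]]))
```

XOR scrambling is reversible with the same key, and mirroring is its own inverse. The check-bit variants change the size of the matrix.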
Hypothesis
– "Challenging" the entropy of a string set with another random string:
  • Should result in different responses if the string is not as entropic as the challenge string
– When a random number is challenged, there should be no difference in entropy between the two randomly generated strings.
– If the string contains ECC, there could be a detectable difference in entropy between the string with ECC and the randomly generated challenge string.
Figure: (A) Random Signal challenged by a Random Signal; (B) Random Signal with ECC challenged by a Random Signal
Experimental Set-up
• 28,000 individual barcodes generated using:
  • 500 randomly generated strings
  • Average length: 310 bits
  • Symbol sizes of 12x12 up to 26x26
  • Module sizes from 12 to 18 pixels
• Each test has an associated scrambling algorithm and entropy measure.
• Each test run twice:
  • Using the maximum number of ECC bits allowable for the size
  • Using randomly generated data where the ECC bits would normally be inserted
• A total of 672,000 barcodes were tested, with half containing ECC bits and half completely random without ECC.
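The barcode counts above can be reproduced with simple arithmetic. The sketch assumes symbol sizes increase in steps of 2 (12x12, 14x14, ..., 26x26, as on the chart axes) and that module sizes take every integer value from 12 to 18 pixels. The second assumption is not stated in the slides, but it yields the reported totals.

```python
strings = 500
symbol_sizes = list(range(12, 27, 2))  # 12x12 ... 26x26, assumed step of 2 -> 8 sizes
module_sizes = list(range(12, 19))     # 12 ... 18 pixels, assumed every integer -> 7 sizes

barcodes = strings * len(symbol_sizes) * len(module_sizes)

scrambling_methods = 4  # XOR, structural, even check bits, odd check bits
entropy_measures = 3    # three entropy-based methods
variants = 2            # with ECC, and with random data in place of ECC

total_tested = barcodes * scrambling_methods * entropy_measures * variants

print(barcodes)      # 28000
print(total_tested)  # 672000
```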
Results
– Result is the relative change in entropy between the input and output strings
  • Computed as the ratio mean output / mean input
  • E.g., a result near 1.0 means there was very little change
– Non-ECC change was very small
  • Scrambling a fully random string should result in another random string
– ECC entropy change increased
  • Scrambling a string containing non-random bits should result in a more random string
– Measured using Normalized Entropy with all scrambling techniques
Figure: Normalized Entropy - ECC vs NonECC (x-axis: Symbol Size, 12 x 12 to 26 x 26; y-axis: Average % Chg Out/In; series: ECC, NonECC)
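The out/in ratio can be sketched as below. Because the study's entropy formulas are not reproduced here, the example substitutes a plain Shannon entropy over 4-bit symbols as a stand-in measure. The data is synthetic and serves only to show the computation. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from statistics import mean


def nibble_entropy(bits):
    """Shannon entropy (bits per symbol) over non-overlapping 4-bit symbols; a stand-in measure."""
    symbols = [tuple(bits[i:i + 4]) for i in range(0, len(bits) - 3, 4)]
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def out_in_ratio(inputs, outputs, measure=nibble_entropy):
    """Mean entropy of the scrambled strings divided by mean entropy of the inputs."""
    mean_in = mean(measure(s) for s in inputs)
    mean_out = mean(measure(s) for s in outputs)
    if mean_in == 0:
        raise ValueError("mean input entropy is zero; ratio undefined")
    return mean_out / mean_in


rng = random.Random(2010)
# Synthetic, fully random inputs (no ECC), XOR-scrambled with fresh random keys.
inputs = [[rng.randint(0, 1) for _ in range(312)] for _ in range(200)]
outputs = [[b ^ rng.randint(0, 1) for b in s] for s in inputs]

ratio = out_in_ratio(inputs, outputs)
print(round(ratio, 3))
```

For fully random inputs the ratio should sit close to 1.0, matching the expectation that scrambling a random string yields another random string.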
Results
– Population statistics: Normalized Entropy
  • Normalized Entropy (e1) means for ECC and non-ECC using the XOR scrambling algorithm
  • No way to distinguish between ECC and non-ECC strings by looking at the difference in input or output means only
Figure: Normalized Entropy - Input & Output Means for ECC & NonECC (x-axis: Symbol Size, 12 x 12 to 26 x 26; y-axis: Mean Entropy; series: Mean Input Entropy - ECC, Mean Input Entropy - NoECC, Mean Output Entropy - ECC, Mean Output Entropy - NoECC)
Results
– Change in entropy after scrambling results in higher entropy (less randomness) for both the ECC and the non-ECC strings.
– For most symbol sizes, e2 output values are lower than input values
– ECC strings start out with more structure than the non-ECC strings and become more random after scrambling.
– Change in entropy after scrambling non-ECC strings is detectable
Figure: HamDistWind Entropy Measure - ECC vs. NonECC (x-axis: Symbol Size, 12 x 12 to 26 x 26; y-axis: Average % Chg Out/In Entropy Measure; series: HamDistWind with ECC, HamDistWind NoECC)
Figure: XOR-HamDistWind (x-axis: Symbol Size, 12x12 to 26x26; y-axis: Average % Change Out/In; series: ECC Data, NonECC Data)
Results
– Example shows standard error for output means using the XOR scrambling algorithm
  • Other scrambling algorithms show similar results
– Only half of each error bar is shown, to indicate the magnitude
– The two populations overlap and cannot be distinguished with any reasonable level of statistical confidence
– Population statistics show that detecting the difference between ECC and non-ECC signals using population means is not easy using these methods
Figure: HamDistWind Entropy - StdErr Output, ECC vs. NonECC (x-axis: Symbol Size, 12 x 12 to 26 x 26; y-axis: StdErr Output - HamDistWind; series: ECC Data, Non-ECC Data)
Figure: XOR - HamDistWind Entropy - Output Mean, ECC vs. NonECC (x-axis: Symbol Size; y-axis: Mean Input - HamDistWind; series: ECC Data, Non-ECC Data)
Figure 13: XOR Scrambling - Output Mean
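The error-bar comparison on this slide can be illustrated with a small sketch. It computes the standard error of the mean for two populations and checks whether the intervals mean ± k standard errors overlap. Overlapping error bars are a rough heuristic, not a formal significance test. The sample values are made up for illustration and are not the study's data; the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return stdev(values) / math.sqrt(len(values))


def intervals_overlap(a, b, k=1.0):
    """True if mean(a) +/- k*SE(a) and mean(b) +/- k*SE(b) overlap."""
    gap = abs(mean(a) - mean(b))
    return gap <= k * (standard_error(a) + standard_error(b))


# Illustrative values only (not measurements from the study).
ecc_like = [0.10, 0.12, 0.09, 0.13, 0.11]
non_ecc_like = [0.11, 0.12, 0.10, 0.13, 0.10]
well_separated = [0.30, 0.31, 0.29, 0.32, 0.30]

print(intervals_overlap(ecc_like, non_ecc_like))
print(intervals_overlap(ecc_like, well_separated))
```

When the intervals overlap, as with the first pair, the population means alone give no basis for telling the two groups apart.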
Conclusions
– Three entropy-based methods for determining the degree of randomness in a signal
– Effect of scrambling on the outcome of these methods
– Data Matrix standard does not take this type of security into account
  • ECC within the signal has structure and is therefore vulnerable to attacks
– Our entropy measures and the appropriate "attack" can detect the difference between a truly random signal and a signal that contains structure
– Uses:
  • Discover if ECC has been used & potential vulnerabilities of the security data
  • Methods can be implemented to determine whether data is encrypted
  • Possible to interrogate the entropy of the compromised signal and compare it to the original entropy values