30/05/2017
Image Forgery Localization by
JPEG Artifact Analysis
Prof. Sebastiano Battiato
Dipartimento di Matematica e Informatica,
Università di Catania
Image Processing LAB – http://iplab.dmi.unict.it
ICT Doctoral School - Trento, May 2017
Artifact and Coding
• File level analysis
– File system analysis: MAC times, filename, size
– Format analysis: Image file format, Resolution/color
depth/channels, Compression/coding parameters
• Image analysis:
– Pixel domain: blocking artifacts, error level analysis, JPEG ghosts
– Transform domain: DCT values
• Compression history
– Uncompressed, Single compression
– Double compression (or more)
– Evaluation of compression coefficients (QTs)
– Evaluation of previous compression coefficients (QTs)
A possible workflow
The JPEG Standard
• JPEG denotes an image compression method that produces a
compressed stream of bytes;
• JFIF (JPEG File Interchange Format) is a standard which
defines:
o Component sample registration
o Resolution and aspect ratio
o Color Space
• Exif allows further information (metadata) to be embedded
in the file
JPEG Compression
• Converting an image into JPEG is a six-step
process:
• The image is converted from raw RGB data into YCbCr;
• The chrominance channels are downsampled;
• Each channel is split into 8x8 blocks;
• A Discrete Cosine Transform is applied to each block;
• The DCT coefficients are quantized (lossy) using fixed tables;
• Finally, entropy coding (lossless compression) is applied
and the image is said to be JPEG compressed
[Figure: JPEG compression: RAW Data → Color Transform → Down-Sampling → Block Splitting → Forward DCT → Quantization → Encoding → JPEG Compressed Image.
JPEG decompression: Decoding → De-quantization → Inverse DCT → Block Restoring → Up-Sampling → Color Transform.]
Color Conversion & Downsampling
• First, the image is converted from RGB into a different color space called
YCbCr.
• The Y component represents the brightness of a pixel, the Cb and Cr
components represent the chrominance (split into blue and red components).
• The Cr and Cb components are usually downsampled because, due to the
densities of color- and brightness-sensitive receptors in the human eye, humans
can see considerably more fine detail in the brightness of an image (the Y
component) than in the color of an image (the Cb and Cr components).
[Figure: the R, G and B channels of an image and the corresponding Y, Cb and Cr channels]
Color Transform
[Figure: color transform from the R, G, B channels to the Y, Cb, Cr channels]
Example:
The human eye is more sensitive to luminance
than to chrominance. Typically JPEG throws away
3/4 of the chrominance information before any
other compression takes place. This halves the
amount of information to be stored about the
image. With all three components fully
stored, 4 pixels need 3 x 4 = 12 component
values. If 3/4 of two components are discarded,
we need 1 x 4 + 2 x 1 = 6 values.
Y = 0.299 R + 0.587 G + 0.114 B
Cb = (B – Y)/2 + 0.5
Cr = (R – Y)/2 + 0.5
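The conversion formulas and the component count of the example can be checked with a short sketch. It assumes R, G and B normalized to [0, 1] (which the 0.5 offset suggests); the function names are hypothetical and chosen only for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB (each in [0, 1]) to YCbCr with the formulas above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 2 + 0.5
    cr = (r - y) / 2 + 0.5
    return y, cb, cr


def stored_values(pixels, chroma_kept):
    """Component values stored for `pixels` pixels when only the fraction
    `chroma_kept` of each chrominance channel survives subsampling."""
    return pixels + 2 * pixels * chroma_kept


# Pure white has full luminance and neutral chrominance (about 1.0, 0.5, 0.5).
white = rgb_to_ycbcr(1.0, 1.0, 1.0)

full = stored_values(4, 1.0)         # 3 x 4 values, nothing discarded
subsampled = stored_values(4, 0.25)  # 1 x 4 + 2 x 1 values
```

With 3/4 of both chrominance channels discarded, the count drops from 12 to 6 values per group of 4 pixels, which is the halving described in the example.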
Chrominance subsampling
Blocks 8x8
• After subsampling, each channel must be split into 8x8
blocks of pixels.
• If the data for a channel does not represent an integer
number of blocks then the encoder must fill the
remaining area of the incomplete blocks with some form
of dummy data.
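A minimal sketch of the block-splitting step follows. The slide only says "some form of dummy data"; the sketch assumes edge replication (repeating the last column and row), which is one common choice and not something the slide prescribes. Function names are hypothetical.

```python
def pad_to_blocks(channel, block=8):
    """Pad a 2-D list of samples so that both dimensions are multiples of
    `block`, repeating the last column and the last row (edge replication)."""
    width = len(channel[0])
    pad_w = (-width) % block
    pad_h = (-len(channel)) % block
    rows = [row + [row[-1]] * pad_w for row in channel]
    rows += [list(rows[-1]) for _ in range(pad_h)]
    return rows


def split_blocks(channel, block=8):
    """Return the block x block tiles of an already padded channel,
    scanning left to right, top to bottom."""
    return [[row[x:x + block] for row in channel[y:y + block]]
            for y in range(0, len(channel), block)
            for x in range(0, len(channel[0]), block)]


# A 5-row by 10-column channel is padded to 8 x 16 and yields two 8x8 blocks.
channel = [[x + 10 * y for x in range(10)] for y in range(5)]
padded = pad_to_blocks(channel)
blocks = split_blocks(padded)
```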
Discrete Cosine Transform
• Next, each component (Y, Cb, Cr) of each 8×8 block is
converted to a frequency-domain representation, using a
normalized, two-dimensional type-II discrete cosine
transform (DCT).
• As an example, one such 8×8
8-bit subimage might be:
• Before computing the DCT of the subimage, its gray values are
shifted from a positive range to one centered around zero.
• For an 8-bit image each pixel has 256 possible values: [0,255].
To center the range around zero, it is necessary to subtract half the
number of possible values, i.e. 128.
Discrete Cosine Transform
The DCT expresses the 64 pixels of a block as
a linear combination of 64 basis patterns
(the squares in the figure), indexed
horizontally by u and vertically by v.
• Subtracting 128 from each pixel value
yields pixel values on [ -128,127] and we
obtain the following matrix.
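• The next step is to take the two-
dimensional DCT, which is given by:
$G_{u,v} = \frac{1}{4}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} g_{x,y} \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]$
where
– $u$ is the horizontal spatial frequency, for the integers $0 \le u < 8$.
– $v$ is the vertical spatial frequency, for the integers $0 \le v < 8$.
– $\alpha(u) = 1/\sqrt{2}$ if $u = 0$, and $1$ otherwise, is a normalizing function.
– $g_{x,y}$ is the pixel value at coordinates $(x, y)$.
– $G_{u,v}$ is the DCT coefficient at coordinates $(u, v)$.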
DCT basis
The 64 (8 x 8) DCT basis
functions:
DC
Coefficient
AC Coefficients
Image Representation with DCT
DCT coefficients can be viewed as weighting functions
that, when applied to the 64 cosine basis functions of
various spatial frequencies (8 x 8 templates), will
reconstruct the original block.
Original image block DC (flat) basis function AC basis functions
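The idea of coefficients as weights of the 64 basis functions can be sketched directly. The code assumes the usual normalized 8x8 type-II DCT, with factor 1/4 · α(u) · α(v) and α(0) = 1/√2, applied to level-shifted samples. It is a slow, literal transcription for illustration, not an optimized implementation, and the index convention `block[x][y]` is an assumption.

```python
import math


def _alpha(k):
    """Normalizing factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _basis(x, y, u, v):
    """Value at pixel (x, y) of the cosine basis function of frequency (u, v)."""
    return (math.cos((2 * x + 1) * u * math.pi / 16)
            * math.cos((2 * y + 1) * v * math.pi / 16))


def dct2_8x8(block):
    """Forward DCT: one weight per basis function, indexed as coeffs[u][v]."""
    return [[0.25 * _alpha(u) * _alpha(v)
             * sum(block[x][y] * _basis(x, y, u, v)
                   for x in range(8) for y in range(8))
             for v in range(8)] for u in range(8)]


def idct2_8x8(coeffs):
    """Inverse DCT: weighted sum of the 64 basis functions."""
    return [[0.25 * sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, y, u, v)
                        for u in range(8) for v in range(8))
             for y in range(8)] for x in range(8)]


# A flat block (all samples 200, level-shifted by 128) has only a DC term.
flat = [[200 - 128] * 8 for _ in range(8)]
flat_coeffs = dct2_8x8(flat)

# A non-trivial block is recovered from its coefficients up to rounding error.
ramp = [[8 * x + y - 128 for y in range(8)] for x in range(8)]
restored = idct2_8x8(dct2_8x8(ramp))
```

For the flat block every AC coefficient vanishes and the DC coefficient equals 8 times the sample value (576), which matches the "DC (flat) basis function" picture.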
DCT example
DCT Coefficients Quantization
• The DCT coefficients are quantized to a limited
number of possible levels.
• The Quantization is needed to reduce the
number of bits per sample.
Formula:
F_Q(u, v) = round[ F(u, v) / Q(u, v) ]
– Q(u, v) = constant => Uniform
Quantization.
– Q(u, v) = variable => Non-uniform
Quantization.
Example:
101000 = 40 (6-bit precision).
Dividing by a constant N = 5, the quantization
step (or quality factor), gives 40/5 = 8 = 1000,
which needs only 4 bits of precision.
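The numeric example can be reproduced in a few lines. This is a minimal sketch: `quantize` and `dequantize` are hypothetical helper names, and rounding to the nearest integer with ties upward is an assumption about the `round` in the formula.

```python
import math


def quantize(coeff, step):
    """Quantize one DCT coefficient: nearest integer to coeff / step."""
    return math.floor(coeff / step + 0.5)


def dequantize(level, step):
    """Map a quantized level back to its reconstruction value."""
    return level * step


level = quantize(40, 5)                   # 40 / 5 = 8
bits_before = (40).bit_length()           # 101000 needs 6 bits
bits_after = level.bit_length()           # 1000 needs 4 bits
restored = dequantize(quantize(43, 5), 5) # 43 is reconstructed as 45
```

The last line shows why the step is lossy: 43 and 45 fall in the same quantization interval and can no longer be told apart after dequantization.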
Quantization step
The statistical distribution of the AC DCT
coefficients of an 8x8 block, for both the
luminance and chrominance components, can be
approximated by a Laplacian distribution in the
following way:
$p_i(x) = \frac{\lambda_i}{2} e^{-\lambda_i |x|}, \quad i = 1, 2, \ldots, 64$
where:
$\lambda_i = \sqrt{2} / \sigma_i$;
$\sigma_i$ = i-th DCT standard deviation.
EXAMPLE:
Q(u,v) = 8 (quantization step);
Round(256/8) = 32 intervals;
[0, 8, 16, 24, 32, 40, ..., 256] are the reconstruction
levels.
[Figure: quantizer characteristic with its dead zone]
E. Y. Lam, J. W. Goodman - A Mathematical Analysis of the
DCT Coefficient Distributions for Images – IEEE Trans. on Image
Processing, Vol. 9, No. 10, 2000
Distributions of DCT Coefficients for Images
The DCT coefficients resemble a Laplacian distribution with location μ and scale b.
An estimate of μ is the sample median.
The maximum likelihood estimator of b is:
$\hat{b} = \frac{1}{N} \sum_{i=1}^{N} |x_i - \hat{\mu}|$
– Each distribution can be summarized with two parameters.
G.M. Farinella, D. Ravì, V. Tomaselli, M. Guarnera, S. Battiato - Representing Scenes for Real-time
Context Classification on Mobile Devices – Elsevier, Pattern Recognition - vol.48, 2015
D. Ravì, M. Bober, G.M. Farinella, M.Guarnera, S.Battiato - Semantic Segmentation of Images
Exploiting DCT Based Features and Random Forest - Elsevier, Pattern Recognition - vol. 52, 2016
Representation
• Context of different classes differs in the scales (b) of the AC DCT
coefficient distributions.
• To represent the context we use a feature vector describing the AC
DCT coefficients distributions of an image
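As a rough illustration of how one entry of such a feature vector could be obtained, the sketch below fits a Laplacian to the values of a single AC frequency. It assumes the standard estimators (sample median for the location, mean absolute deviation from the median for the scale b); the sample values are made up for the example.

```python
import statistics


def fit_laplacian(samples):
    """Return (mu, b) for a Laplacian model: the sample median and the
    mean absolute deviation from it (the maximum likelihood scale)."""
    mu = statistics.median(samples)
    b = sum(abs(x - mu) for x in samples) / len(samples)
    return mu, b


# Hypothetical values of one AC coefficient collected over several 8x8 blocks.
coefficients = [-6, -2, -1, 0, 0, 1, 3, 5]
mu, b = fit_laplacian(coefficients)
```

Repeating the fit for each selected AC frequency and concatenating the resulting b values would give the kind of compact descriptor the slide refers to.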
Each distribution is summarized by the parameter b:
– Compact representation
– Simple to compute within the Image Generation Pipeline (IGP)
Diversity of the DCT coefficient distributions on the 8 Scene Dataset
2-dimensional distributions (fitted with a Gaussian model) related to the Laplacian
parameter b of the DCT frequency (0,1) and (1,0).
Image Generation Pipeline
Final AC DCT frequencies considered for representing the
context of the scene
Classification Results (8 scene classes)
[Figure: classification results, panels (A) to (F), comparing GIST with the proposed descriptor]
• Performance matches that of GIST, but in a
constrained domain.
• Feature extraction comes for "free" (JPEG).
• Can be used as a global descriptor, also
after acquisition.
• Can be computed in real time.
Classification Results (8 scene classes)
Implemented with FCAM.
Thanks to Nokia Research Center for providing the smartphones.
This research has been sponsored by STMicroelectronics.
http://iplab.dmi.unict.it/DCT-GIST
Standard Q-tables
The eye is most sensitive to low frequencies (upper left corner) and less
sensitive to high frequencies (lower right corner).
Luminance Quantization Table
16  11  10  16  24   40   51   61
12  12  14  19  26   58   60   55
14  13  16  24  40   57   69   56
14  17  22  29  51   87   80   62
18  22  37  56  68  109  103   77
24  35  55  64  81  104  113   92
49  64  78  87 103  121  120  101
72  92  95  98 112  100  103   99
Chrominance Quantization Table
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
The numbers in the above quantization tables can be scaled up (or down)
to adjust the so-called Quality Factor QF (i.e. Q*(u,v) = QF x Q(u,v)).
Custom quantization tables can also be put in the image/scan header.
Quantized DCT
zij = round( yij / qij )
Quantization
• Quantization is usually used to convert a continuous signal to
a discrete space.
• In the example, a continuous signal is quantized first
with a larger quantization step X (thus drastically reducing
the number of distinct values), and then with a smaller step X/4, which
introduces more values and follows the continuous signal
much more closely.
[Figure: Original Continuous Signal; Quantized with Step X; Quantized with Step X/4]
Quantization Tables
• The standard requires each image to have
between one and four quantization tables.
• The most commonly used quantization tables are
those published by the Independent JPEG Group (IJG)
in 1998.
• These tables can be scaled to
a quality factor Q.
• The quality factor allows the
image creation device to
choose between:
o Larger, higher quality images
o Smaller, lower quality images.
Quantization Tables
• For example, we can scale the
IJG standard table using Q=80
by applying Eq. (2) to each
element in the table. The
resulting values are the following
scaled quantization tables
• Note that the numbers in this table are lower than in
the standard table, indicating an image compressed
with these tables will be of higher quality than ones
compressed with the standard table. It should be
noted that scaling with Q=50 does not change the
table.
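Eq. (2) is not reproduced on the slide. The sketch below assumes it is the quality scaling used by the IJG libjpeg code (a scale factor of 5000/Q for Q < 50 and 200 − 2Q otherwise, applied as a percentage with rounding and clamped to 1..255). This agrees with the remark that Q = 50 leaves the table unchanged, but it remains an assumption about what the slide intends.

```python
# Standard luminance table (the Q = 50 reference).
LUMINANCE_Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_table(base, quality):
    """Scale a base quantization table to a quality factor in 1..100."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Percentage scaling with rounding, then clamping to the 8-bit range.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]


q80 = scale_table(LUMINANCE_Q50, 80)
q50 = scale_table(LUMINANCE_Q50, 50)
```

Under this assumption the first row of the Q = 80 luminance table becomes 6, 4, 4, 6, 10, 16, 20, 24, i.e. smaller steps and therefore higher quality than the reference table.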
Quantization Tables
• Quantization tables (QTs) can be classified into the following categories:
o Standard Tables:
Images which use scaled versions of the QT published by Independent
JPEG Group (IJG) standard;
o Extended Tables:
Same as Standard Tables but have three tables instead of two. The
third table is a duplicate of the second;
o Custom Fixed Tables:
Images containing non-IJG QT that do not depend on the image being
processed (Adobe Photoshop);
o Custom Adaptive Tables:
These images do not conform to the IJG standard. In addition, they
may change, either in part or as a whole, between images created by
the same device using the same settings. They may also have
constants in the tables; values that do not change regardless of the
quality setting or image being processed.
Quantization Tables For Ballistics
“Considering only QT is not sufficient to discriminate
between different source cameras, but it could be useful
to clearly identify altered images.”
Issues
• The QT database will never be complete: a study showed 15000 different
cameras/software packages and almost 63000 different QTs on the market
• Adaptive tables
• QTs can be spoofed by resaving the file with the desired QTs
• Performance can be improved by combining QTs with other metadata: image size,
chroma subsampling, thumbnail size, Huffman tables (less variable than QTs)
…
• Photoshop QTs have been the same since version 3 and do not correspond to any
known camera
Most common matrix
• 1513 different
matrices
• 70 camera models
• 7337 devices (DSC,
smartphone, tablet)
Thanks to Jerian Martino (from AMPED srl:
http://ampedsoftware.com/company )
Camera Ballistics through quantization
tables
Image Forgery Localization by JPEG
Artifact Analysis
The manipulation is detected by analyzing
artifacts introduced by JPEG recompression. Two
main classes can be considered:
• aligned double JPEG (A-DJPG) compression
(i.e., first and second JPEG compression make
use of aligned DCT grids).
• non-aligned double JPEG (NA-DJPG)
compression.
JPEG Analysis
• JPEG Double Quantization Detection (DQD) and Quantization
Step Estimation (QSE) are two fundamental steps in the overall
process.
• DQD and QSE methods can be divided into categories whose
names recall what is exploited from the image data. The
categories cover methods based on:
– Probability distributions on DCT coefficients;
– Benford’s Law;
– Benford’s Fourier Coefficients;
– Neural Networks encoding and classification;
– DCT coefficients comparison;
– SVM classifiers;
– Factor Histogram;
– Considerations on noise
Benford’s Law
Benford's law, also called the first-digit law, refers to the frequency distribution
of leading digits in many real-life sources of data. In this distribution, the digit 1 occurs
as the first digit about 30% of the time, while larger digits occur in that position
progressively less often: 9 is the first digit less
than 5% of the time.
http://en.wikipedia.org/wiki/Benford's_law
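The percentages quoted above follow from the standard form of the law, P(d) = log10(1 + 1/d) for d = 1, ..., 9. The short sketch below evaluates it and compares it with the leading digits of the first 1000 powers of 2, a classic data set known to follow the law closely; the data set is chosen here only for illustration.

```python
import math


def benford_probability(d):
    """Probability that the leading digit equals d (1..9) under Benford's law."""
    return math.log10(1 + 1 / d)


def first_digit(n):
    """Leading decimal digit of a non-zero integer."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n


probs = [benford_probability(d) for d in range(1, 10)]

# Empirical check on 2**0 ... 2**999.
digits = [first_digit(2 ** k) for k in range(1000)]
observed_ones = digits.count(1) / len(digits)
```

The first entry of `probs` is about 0.301 and the last about 0.046, in line with "about 30%" for the digit 1 and "less than 5%" for the digit 9.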
Generalized Benford’s Law (1)
D. Fu, Y. Q. Shi, and W. Su, A generalized Benford’s law for JPEG coefficients and its applications in image forensics,
in Proc. SPIE, Security, Steganography, Watermarking of Multimedia Contents IX, P.W. Wong and E. J. Delp, Eds., San
Jose, CA, Jan. 2007, vol. 6505, pp. 1L1–1L11.
Fu et al. observed that the distribution of the first digit of DCT coefficients in a single
JPEG compressed image follows a generalized Benford distribution:
$p(x) = N \log_{10}\left(1 + \frac{1}{s + x^{q}}\right), \quad x = 1, 2, \ldots, 9$
N is a normalization factor; s and q are model
parameters that describe the distributions for
different images and different compression
factors.
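A small sketch of the model, assuming the form p(x) = N · log10(1 + 1/(s + x^q)) for x = 1, ..., 9 as given by Fu et al. Here N is computed so that the nine probabilities sum to 1; the parameter values in the second call are arbitrary and serve only to show the shape, they are not fitted to any image.

```python
import math


def generalized_benford(s, q):
    """First-digit probabilities for x = 1..9 under the generalized model.
    The normalization factor N makes the probabilities sum to 1."""
    raw = [math.log10(1 + 1 / (s + x ** q)) for x in range(1, 10)]
    n = 1 / sum(raw)
    return [n * r for r in raw]


# With s = 0 and q = 1 the model reduces to the classical Benford's law.
classical = generalized_benford(0.0, 1.0)

# Arbitrary illustrative parameters (not taken from the paper).
example = generalized_benford(-0.3, 1.4)
```

A detector in this spirit would compare the observed first-digit histogram of the DCT coefficients with the fitted model and treat a poor fit as a hint of double compression.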
Generalized Benford’s Law (2)
In the case of double quantization, the statistical distribution of the first digit does not
follow the generalized Benford distribution. This behavior can then be used to
discriminate between single (a) and double (b) compression.
Generalized Benford’s Law (3)
B. Li, Y. Shi, and J. Huang, Detecting doubly compressed JPEG images by using mode based first digit features, in
Proc. IEEE 10th Workshop Multimedia Signal Processing, Oct. 2008, pp. 730–735.
Starting from the method of Fu et al., Li et al. analyze the distribution of the first
digit of each DCT coefficient in case of single and double compression.
They observed that these distributions do not follow generalized Benford’s law.
Generalized Benford’s Law (4)
Although these distributions do not follow the generalized Benford’s law, they differ
in ways that can be exploited as a feature vector in a two-class classifier.
Unlike Fu et al., they use only a limited set of DCT coefficients (20).
Considering DCT coefficients individually allows the contribution of the distinguishable modes
to be better exploited.
Other related works
• Feng X. and Doerr G.: JPEG recompression detection, IS&T/SPIE Electronic
Imaging, 75410J, (2010)
• Li X.H. and Zhao Y.Q. and Liao M. and Shih F.Y. and Yun Q.S.: Detection of
tampered region for JPEG images by using mode-based first digit features,
EURASIP Journal on advances in signal processing, 1, 1–10, (2012)
• Hou W. and Ji Z. and Jin X. and Li X.: Double JPEG Compression Detection
Base on Extended First Digit Features of DCT Coefficients, International
Journal of Information and Education Technology, 3, 5, 512–515, (2013)
• Milani S. and Tagliasecchi M. and Tubaro S.: Discriminating multiple JPEG
compressions using first digit features, APSIPA Transactions on Signal and
Information Processing, 3, e19, (2014)
• Pasquini C. and Boato G. and Perez-Gonzalez F.: A Benford-Fourier JPEG
compression detector, IEEE International Conference on Image Processing
(ICIP), 5322–5326, (2014)
• Pasquini C. and Boato G. and Perez-Gonzalez F.: Multiple JPEG compression
detection by means of Benford-Fourier coefficients, IEEE International
Workshop on Information Forensics and Security (WIFS),113–118, (2014)
Periodic artifact introduced by
Double JPEG quantizations (1)
A. C. Popescu and H. Farid, Statistical tools for digital forensics, in Proc. 6th Int. Workshop Information Hiding, Berlin,
Germany, 2004, pp. 128–147, Springer-Verlag.
Z. Lin, J. He, X. Tang, and C.-K. Tang, Fast, automatic and fine-grained tampered JPEG image detection via DCT
coefficient analysis, Pattern Recognition, vol. 42, no. 11, pp. 2492–2501, Nov. 2009.
The properties of the histograms of double quantized DCT coefficients can be
exploited to detect forgery.
The number of original histogram bins (n(u2)) that contribute to a single bin u2 in
the double quantized histogram h2 actually depends on u2. It is worth noting that
n(u2) is a periodic function with a period p = q1/gcd(q1, q2) where q1, q2 are the
quantization coefficients relative to the first and the second JPEG compression.
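The periodicity of n(u2) can be observed by brute force. The sketch below counts, for every bin u2 of the doubly quantized histogram, how many integer values of the original coefficient end up there, and compares the counts one period p = q1/gcd(q1, q2) apart. Rounding to nearest with ties upward and the integer helper `round_div` are assumptions made for the illustration.

```python
from collections import Counter
from math import gcd


def round_div(a, b):
    """Nearest integer to a / b (ties upward) for b > 0, integers only."""
    return (2 * a + b) // (2 * b)


def contributing_bins(q1, q2, values):
    """n(u2): how many of the original `values` fall into each bin u2
    after quantization by q1, dequantization, and quantization by q2."""
    n = Counter()
    for u in values:
        u1 = round_div(u, q1)
        u2 = round_div(u1 * q1, q2)
        n[u2] += 1
    return n


q1, q2 = 5, 3
period = q1 // gcd(q1, q2)
n = contributing_bins(q1, q2, range(-2000, 2001))

# Bins far from the ends of the sampled range are fully populated,
# so the comparison is restricted to them.
is_periodic = all(n[k] == n[k + period] for k in range(-50, 50))
empty_bins = [k for k in range(0, 10) if n[k] == 0]
```

With q1 = 5 and q2 = 3 the bins 1, 4, 6 and 9 receive no contribution at all, and the pattern repeats every 5 bins.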
Periodic artifact introduced by
Double JPEG quantizations (2)
If q2<q1, then n(u2) =0 for some u2, hence the histogram related to the double
quantization can show periodically missing values. On the contrary, if q2>q1 the
histogram can have some periodicity in terms of peaks and valleys pattern.
Periodic artifact introduced by
Double JPEG quantizations (3)
These periodic artifacts are visible in the Fourier domain as strong peaks in
medium and high frequencies
Fourier Transforms of three histograms corresponding to:
(1) single JPEG compression with quality 75;
(2) double JPEG compression with quality 85 followed by 75;
(3) double JPEG compression with quality 75 followed by 85.
Periodic artifact introduced by
Double JPEG quantizations (4)
Lin et al. build histograms for each channel and each frequency. For each block in
the image, using one histogram they are able to compute the probability of it being
a tampered block, by checking the DQ effect of this histogram.
Periodic artifact introduced by
Double JPEG quantizations (5)
T. Bianchi, A. D. Rosa, and A. Piva, Improved DCT coefficient analysis for forgery localization in JPEG images, in Proc.
ICASSP 2011, May 2011, pp. 2444–2447.
Bianchi et al. improve the algorithm proposed by Lin et al. by considering a better
model of the observed histogram.
Specifically, a limitation of the Lin et al. method is related to the estimation of the
conditional probability p(x|H1) [1]. This probability is estimated from the
observed histogram of x; however, in the case of a tampered image, such a
histogram is a mixture of p(x|H1) and p(x|H0).
To obtain a better model, they develop a novel algorithm for the estimation of the
coefficients of the first quantization (q1) and also consider the effects due to
rounding and truncation.
[1] x is the value of the DCT coefficient and H0 (H1) indicates the hypothesis of being
tampered (original).
Non-aligned double JPEG
(NA-DJPG) compression
• W. Luo, Z. Qu, J. Huang, and G. Qiu, “A novel method for detecting cropped and
recompressed image block,” in Proc. ICASSP, 2007, vol. 2, pp. II-217–II-220.
• M. Barni, A. Costanzo, and L. Sabatini, “Identification of cut and paste tampering
by means of double-JPEG detection and image segmentation,” in Proc. ISCAS,
2010, pp. 1687–1690.
• Y.-L. Chen and C.-T. Hsu, “Image tampering detection by blocking periodicity
analysis in JPEG compressed images,” in Proc. IEEE 10th Workshop Multimedia Signal
Processing, Oct. 2008, pp. 803–808.
• T. Bianchi and A. Piva, “Detection of nonaligned double JPEG compression based
on integer periodicity maps,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, Apr.
2012.
• T. Bianchi, A. Piva, "Image Forgery Localization via Block-Grained Analysis of JPEG
Artifacts", IEEE Transactions on Information Forensics & Security, Volume: 7, Issue:
3 – 2012, pp: 1003 – 1017.
Cropping detection through blocking artifact
• Cropping corresponds to a translation of the original blocking effect.
• Detection looks for a blocking effect that is not aligned with the 8x8 pixel grid.
Uncompressed image
Compressed image
Block based system
• An algorithm has been developed in the
spatial domain
Published : A. R. Bruna, G. Messina, S. Battiato, “Crop Detection Through Blocking Artefacts
Analysis” - Image Analysis and Processing -- ICIAP 2011, Springer, Lecture Notes in Computer
Science, Vol. 6979, ISBN 978-3-642-24087-4
H/V filters(1/3)
• Convolutional filters such as the following allow
straight lines to be detected:
H filter:
 1  1  1  1
-1 -1 -1 -1
 0  0  0  0
 0  0  0  0
V filter:
 1 -1  0  0
 1 -1  0  0
 1 -1  0  0
 1 -1  0  0
OutH = Img * H_filter
OutV = Img * V_filter
Regular pattern measure (RPM)
• A measure of the blocking in a particular horizontal
(or vertical) position has been defined:
$RPM_H(i) = \sum_{j=0}^{\lfloor N/8 \rfloor} I'_H(8j + i), \quad i = 1, \ldots, 7$
$RPM_V(i) = \sum_{j=0}^{\lfloor M/8 \rfloor} I'_V(8j + i), \quad i = 1, \ldots, 7$
[Figure: plots of the regular vertical pattern measure and of the regular horizontal pattern measure, offsets 0 to 8 on the horizontal axis, values in units of 10^4]
Example of the RPM_H and RPM_V measures, corresponding to a crop of (5,4)
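The idea behind the measure can be illustrated with a simplified stand-in. Instead of the H/V filters, the sketch accumulates plain absolute differences between neighbouring pixels for each offset modulo 8, on a synthetic image made of constant 8x8 blocks. The image generator, the use of first differences and all function names are assumptions made for the illustration, not the authors' implementation.

```python
def blocky_image(height, width):
    """Synthetic image of constant 8x8 blocks (a crude model of strong JPEG blocking)."""
    return [[(17 * (x // 8) + 31 * (y // 8)) % 256 for x in range(width)]
            for y in range(height)]


def crop(img, dx, dy):
    """Remove the first dx columns and the first dy rows."""
    return [row[dx:] for row in img[dy:]]


def column_pattern_measure(img):
    """For each offset i in 0..7, sum |I(y, x) - I(y, x - 1)| over columns with x % 8 == i."""
    measure = [0] * 8
    for row in img:
        for x in range(1, len(row)):
            measure[x % 8] += abs(row[x] - row[x - 1])
    return measure


def row_pattern_measure(img):
    """Same accumulation along the vertical direction."""
    measure = [0] * 8
    for y in range(1, len(img)):
        for a, b in zip(img[y], img[y - 1]):
            measure[y % 8] += abs(a - b)
    return measure


cropped = crop(blocky_image(64, 64), 5, 4)
col = column_pattern_measure(cropped)
row = row_pattern_measure(cropped)

# The offset with the strongest response marks where the old grid now lies.
col_peak = max(range(8), key=col.__getitem__)
row_peak = max(range(8), key=row.__getitem__)
estimated_crop = ((8 - col_peak) % 8, (8 - row_peak) % 8)
```

On this synthetic input the peaks fall at offsets 3 and 4, which corresponds to a crop of (5, 4). On real images the response is far noisier, which is consistent with the reported drop in accuracy at high quality factors.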
Experimental results
• Experimental results on a large database showed a
good reliability, especially at higher compression
ratio (i.e. lower quality factor)
Quality factor Accuracy (%)
10 99
20 91
30 80
40 69
50 58
60 46
70 39
80 28
90 16
First Quantization Coefficient
Extraction
From Double Compressed JPEG
Images
F. Galvan, G. Puglisi, A.R. Bruna, S. Battiato
IEEE TIFS Vol. 9, No 8, August 2014
Motivation
The ability to roughly reconstruct the quantization table
used by the device during acquisition is crucial in almost
all forensic investigations.
Such information identifies which spatial regions are
associated with the same (original) quantization table,
flagging as tampered those that show different data.
Retrieving some (even if not all) components of the first
quantization matrix makes it possible to look for the device models
that employ the quantization tables so identified.
Motivation
• Specifically, when the second compression is lighter than the
first one, retrieving the first quantization step is often possible.
This can be done by taking advantage of some interesting
properties of integer numbers that arise whenever they are
quantized (that is, rounded) more than once.
• The main novelties of the proposed approach are related to the
filtering strategy (split noise), adopted to reduce the amount of
noise in the input data (DCT histograms), and on the design of a
novel function with a satisfactory q1-localization property.
Software
• FourandSix
• Authenticate
• Belkasoft Forgery Detection
The Approach:
• characterizes features that occur in DCT histograms of individual
coefficients due to double compression;
• exploits errors introduced by the rounding function;
• uses a Neural Network as a classifier;
• uses a separate network for each value of the second quantization step.
Results and Limitations:
• good performance (~1% error rate) for the D01, D10 and D11 coefficients;
• worse results as frequency increases;
• no results for the DC term.
Lukas, J., Fridrich, J.: Estimation of primary quantization matrix in double compressed JPEG images.
In: Proceedings of Digital Forensic Research Workshop, DFRWS (2003)
State of the art – Lukas et al. (2003)
Farid, H.: Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information Forensics and Security 4(1),
154–160 (2009)
State of the art - Farid (2009)
The Approach:
• re-quantize the DCT value with a new quantization coefficient (q3)
varying over a suitable range;
• evaluate an error function, in which:
• c is the DCT term;
• qi is the quantization coefficient of the i-th compression;
• […] indicates the round function;
• |..| indicates the abs function.
Results and Limitations:
• works well only if q1 > q2 and with some pairs of quantization coefficients (left);
• results can be unclear in some cases (center);
• we must know “where” to search (right).
The Approach:
• distinguishes between aligned and unaligned tampered regions;
• computes a likelihood map indicating the probability for each 8x8
discrete cosine transform block of being doubly compressed;
• includes a study of the various types of errors in the JPEG algorithm.
Results and Limitations:
• is no longer valid if certain image processing operations, like resizing, are
applied between the two compressions;
• gives satisfactory results only if q1 > q2.
Bianchi, T., Piva, A.: Image forgery localization via block-grained analysis of JPEG artifacts.
IEEE Transactions on Information Forensics and Security 7(3), 1003–1017 (2012)
State of the art – Bianchi et al. (2012)
Error management
The error e is introduced by several operations, such as color
conversions (YCbCr to RGB and vice versa), rounding and
truncation of the values to eight bit integers, etc. It is important
to note that the errors above can be due to some processing in
different domains (e.g., spatial domain). In any case e is the
effect of such errors in the DCT coefficients.
Note that the factor e, often omitted in previous published
works, if not properly managed, can limit the effectiveness of
any related methodology.
Consider the joint behaviour of the round function acting on the (i,j)-th
term of the 8x8 image block, followed by dequantization, which is what happens
before a forgery operation:
$c \mapsto \left[\frac{c}{q}\right] \times q$
We note that:
• if q (the quantization coefficient) is odd, all integer numbers in
$\left[nq - \left\lfloor \frac{q}{2} \right\rfloor,\; nq + \left\lfloor \frac{q}{2} \right\rfloor\right]$ are mapped to $nq$.
F. Galvan, G. Puglisi, A. R. Bruna, and S. Battiato, “First quantization coefficient extraction
from double compressed JPEG images,” in International Conference on Image Analysis and
Processing (ICIAP), ser. Lecture Notes in Computer Science, vol. 8156, 2013, pp. 783–792.
First quantization coefficient extraction from
double compressed JPEG images
examples of the effect of rounding function when q is even.
Quantization Step Estimation
Histogram of all the DCT terms in position (i;j) in all the 8x8 blocks of an
image:
Quantization Step Estimation
q1 and q2 are, respectively, the (i;j) terms of the first and second
quantization matrix. Then, if q1 > q2, and neglecting the errors
introduced in other steps, we note that:
$c_1 = \left[\frac{c}{q_1}\right] \times q_1$
leads to (taking q1 = 10):
Quantization Step Estimation
The effect of the second quantization/dequantization,
$c_2 = \left[\frac{c_1}{q_2}\right] \times q_2$,
is to map multiples of q1 into multiples of q2.
Quantization Step Estimation
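but: $q_2 < q_1 \rightarrow m q_2$ (the result is mapped to a generic multiple of $q_2$), where:
• $c_2 \in \left[n q_2 - \lfloor q_2/2 \rfloor,\; n q_2 + \lfloor q_2/2 \rfloor\right]$ if $q_2$ is odd;
• $c_2 \in \left[n q_2 - \frac{q_2}{2},\; n q_2 + \frac{q_2}{2} - 1\right]$ if $q_2$ is even;
so the result lies in the range related to $n q_1$.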
Quantization Step Estimation
ICT Doctoral School - Trento, May 2017
If we quantize again with q1, we return to the situation
we had before the second quantization.
• At this point, $\left[\frac{c_2}{q_1}\right] \times q_1$ maps $c_2$ to $n q_1$ again since, as pointed
out above, $c_2$ is in the range related to $n q_1$.
• The three steps above show that it is possible to build an
error function whose value is 0 when $q_3 = q_1$, regardless of the value of $c$.
• This property then allows the first quantization step to be localized
with high precision.
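The round-trip property can be checked by brute force. The sketch below is not taken from the cited paper: `requantize`, the tested ranges and the tie-breaking rule (away from zero) are assumptions, and rounding and truncation errors of a real JPEG pipeline are ignored.

```python
def requantize(c, q):
    """[c / q] * q with round-to-nearest, ties away from zero (an assumption)."""
    sign = -1 if c < 0 else 1
    return sign * ((2 * abs(c) + q) // (2 * q)) * q


def round_trip_holds(c, q1, q2):
    """Check that quantizing again with q1 and then q2 restores c1 and c2."""
    c1 = requantize(c, q1)     # first compression
    c2 = requantize(c1, q2)    # second compression
    c3 = requantize(c2, q1)    # third quantization, again with q1
    c4 = requantize(c3, q2)    # fourth quantization, again with q2
    return c3 == c1 and c4 == c2


violations = [
    (c, q1, q2)
    for q1 in range(2, 21)
    for q2 in range(1, q1)          # only the case q2 < q1
    for c in range(-300, 301)
    if not round_trip_holds(c, q1, q2)
]
print(len(violations))
```

The check succeeds because the second quantization moves a multiple of `q1` by at most `q2 / 2`, which is smaller than `q1 / 2` whenever `q2 < q1`.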
Summarizing:
A fourth quantization with q2 brings us back to the same situation
that we had after the second quantization.
The proposed error function is evaluated with q3 varying over a suitable range.
If q3 = q1, the error function equals 0.
In the error function:
• c is the DCT term;
• qi is the quantization coefficient of
the i-th compression;
• […] indicates the round function;
• |..| indicates the abs function.
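The formula itself did not survive in this text. One form that is consistent with the legend and with the third/fourth quantization argument is $e(q_3) = \left|\left[\left[c_2/q_3\right] q_3 / q_2\right] q_2 - c_2\right|$ with $c_2 = \left[\left[c/q_1\right] q_1 / q_2\right] q_2$. This is an assumption, not necessarily the authors' exact definition; all names and values below are hypothetical.

```python
def requantize(c, q):
    """[c / q] * q with round-to-nearest, ties away from zero (an assumption)."""
    sign = -1 if c < 0 else 1
    return sign * ((2 * abs(c) + q) // (2 * q)) * q


def error(c, q1, q2, q3):
    """Assumed error function: requantize c2 with q3, then q2, and compare with c2."""
    c2 = requantize(requantize(c, q1), q2)    # double-compressed value
    c4 = requantize(requantize(c2, q3), q2)   # third step with q3, fourth with q2
    return abs(c4 - c2)


def total_error(coeffs, q1, q2, q3):
    """Accumulate the error over all DCT terms of one frequency."""
    return sum(error(c, q1, q2, q3) for c in coeffs)


q1_true, q2 = 7, 4                 # hypothetical steps with q1 > q2
coeffs = range(-100, 101)          # hypothetical DCT terms
errors = {q3: total_error(coeffs, q1_true, q2, q3) for q3 in range(1, 16)}
zeros = [q3 for q3, e in errors.items() if e == 0]
print(zeros)
```

Under this assumed form the error also vanishes for every q3 not larger than q2, so in a sketch like this only the zeros with q3 > q2 would be meaningful candidates for q1.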
First quantization coefficient extraction from
double compressed JPEG images
What “really” happens in a forgery operation
[Figure: original image in the “bitstream” format -> (rounding error, truncation error, color conversion YCbCr to RGB) -> original image in a “displayable” format -> malicious forgery -> tampered image -> (color conversion RGB to YCbCr, rounding error) -> tampered image in the “bitstream” format]
We can summarize the error introduced on the value of each
DCT coefficient $c$ by these operations with:
$$c_{DQ} = \mathrm{round}\left(\left(\mathrm{round}\left(\frac{c}{q_1}\right) \times q_1 + err\right) \times \frac{1}{q_2}\right)$$
The term $err$, if not properly managed, can limit the
effectiveness of any proposed methodology.
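A small numerical sketch shows why the err term matters. The function names, the steps `q1 = 9`, `q2 = 6` and the error value `-0.5` are assumptions picked so that the dequantized value sits halfway between two bins of the second quantization.

```python
def round_half_away(x):
    """Round a float to the nearest integer, ties away from zero."""
    return int(abs(x) + 0.5) * (1 if x >= 0 else -1)


def double_quantize(c, q1, q2, err=0.0):
    """c_DQ = round((round(c / q1) * q1 + err) / q2), following the formula above."""
    return round_half_away((round_half_away(c / q1) * q1 + err) / q2)


q1, q2 = 9, 6                                   # hypothetical steps
c = 29                                          # hypothetical DCT term
clean = double_quantize(c, q1, q2)              # no intermediate error
noisy = double_quantize(c, q1, q2, err=-0.5)    # small error between the compressions
print(clean, noisy)
```

Here the first quantization yields 27, which lies exactly between 24 and 30, so even a small err moves the coefficient into a different bin of the second quantization.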
Depending on both the first and second quantization factors (q1 and q2), we
obtain different histograms for every position in the 8x8 block.
G. Puglisi, A. R. Bruna, F. Galvan, and S. Battiato, “First JPEG quantization matrix estimation
based on histogram analysis,” in International Conference on Image Processing (ICIP), 2013.
First JPEG quantization matrix estimation
based on histogram analysis
The sequence of zero (and non-zero) values of the histogram
of a double compressed image IDQ provides useful
information for the estimation of the first quantization factor q1.
1st step
[Flowchart: double quantized image IDQ -> the DCT coefficients are extracted -> the histogram of the absolute value of the DCT coefficients cfj is computed -> the histogram is filtered -> a binary vector is computed]
A binary vector is computed by considering only the sequence of zero and
non-zero values of the filtered histogram.
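The first step can be sketched as follows. The slides do not describe the filter, so a plain count threshold stands in for it here (an assumption); the function names, the steps `q1 = 6`, `q2 = 4` and the synthetic coefficients are also hypothetical.

```python
from collections import Counter


def requantize(c, q):
    """[c / q] * q with round-to-nearest, ties away from zero (an assumption)."""
    sign = -1 if c < 0 else 1
    return sign * ((2 * abs(c) + q) // (2 * q)) * q


def abs_histogram(values, n_bins):
    """Histogram of the absolute values of the quantized DCT coefficients."""
    counts = Counter(abs(v) for v in values)
    return [counts.get(i, 0) for i in range(n_bins)]


def binary_vector(hist, threshold=0):
    """1 where a bin is populated, 0 elsewhere; the threshold is a stand-in filter."""
    return [1 if h > threshold else 0 for h in hist]


q1, q2 = 6, 4                                        # hypothetical steps
coeffs = range(-40, 41)                              # synthetic DCT terms
c_dq = [requantize(requantize(c, q1), q2) // q2 for c in coeffs]
bits = binary_vector(abs_histogram(c_dq, 12))
print(bits)
```

The pattern of empty bins depends on the pair (q1, q2), which is what makes the binary vector informative about q1.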
2nd step
A set of binary representations is then built, one for each $q_{1i,f_j}$ value,
exploiting information coming from the input (double compressed) image IDQ.
3rd step
Based on the similarity between the generated representations and the
one of IDQ, a set of $q_{1i,f_j}$ candidates ($C_{f_j}$) is then selected.
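Steps 2 and 3 can be illustrated with a toy simulation. Everything specific here is an assumption: the similarity measure (fraction of matching bits), the acceptance threshold, the use of a set of approximate original coefficients, and all names and values.

```python
def requantize(c, q):
    """[c / q] * q with round-to-nearest, ties away from zero (an assumption)."""
    sign = -1 if c < 0 else 1
    return sign * ((2 * abs(c) + q) // (2 * q)) * q


def binary_signature(coeffs, q1, q2, n_bins):
    """Binary representation of a simulated double quantization with (q1, q2)."""
    present = {abs(requantize(requantize(c, q1), q2)) // q2 for c in coeffs}
    return [1 if i in present else 0 for i in range(n_bins)]


def similarity(a, b):
    """Fraction of matching bits between two binary vectors (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_candidates(observed, approx_coeffs, q2, q1_range, n_bins, min_similarity=1.0):
    """Keep the q1 values whose simulated representation is close to the observed one."""
    return [
        q1 for q1 in q1_range
        if similarity(binary_signature(approx_coeffs, q1, q2, n_bins), observed) >= min_similarity
    ]


q1_true, q2 = 6, 4                          # hypothetical steps
coeffs = range(-40, 41)                     # stand-in for approximate original DCT terms
observed = binary_signature(coeffs, q1_true, q2, 12)
candidates = select_candidates(observed, coeffs, q2, range(q2 + 1, 13), 12)
print(candidates)
```

In practice the approximate coefficients would differ from the true ones, so a threshold below 1.0 and more than one surviving candidate should be expected.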
4th step
The refined histograms $H_{q_{1s}}$, related to the simulated double quantization
with $q_{1s} \in C_{f_j}$, are compared with $H_{real}$ obtained from $I_{DQ}$, and the closest one
is selected as follows:
$$q_{1,f_j} = \arg\min_{q_{1s} \in C_{f_j}} \sum_{i=1}^{N} \min\left(max_{diff},\ \left|H_{real}(i) - H_{q_{1s}}(i)\right|\right)$$
where $q_{1,f_j}$ indicates the estimated $q_1$ value.
A novel improved scheme
F. Galvan, G. Puglisi, A. R. Bruna, S. Battiato, First Quantization Matrix Estimation from Double
Compressed JPEG Images, IEEE Transactions on Information Forensics and Security, 2014.
DCT Histogram Filtering
The rounding error e manifests itself as peaks spread around the
multiples of the quantization step q and has been modeled as
approximately Gaussian noise.
These joint phenomena affect the behavior of the second
quantization step, and thus the magnitude of the DCT coefficients
and, consequently, their statistics.
For these reasons, the filtering strategy must deal with two kinds of
noise, the “split noise” and the “residual noise”, with the aim of
bringing the histogram back to what it would be if the rounding error had no impact.
Critical case
• This undesirable situation appears when a bin of the
first quantization (i.e., in position mq1) is situated
exactly halfway between two consecutive bins
coming from the second quantization (i.e., in
position nq2 and (n + 1)q2). Specifically, this effect
arises when two consecutive multiples of q2 are
related to a generic multiple of q1 as follows:
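The relation is missing from this text. Read literally, "exactly halfway" means $m q_1 = n q_2 + q_2/2$, that is $2 m q_1 = (2n + 1) q_2$. The sketch below lists the multiples of q1 that satisfy this reading; the function name and the sample steps are hypothetical.

```python
def critical_multiples(q1, q2, max_m=20):
    """Return the m in 1..max_m for which m*q1 is halfway between n*q2 and (n+1)*q2.

    Uses the condition 2*m*q1 == (2*n + 1)*q2 for some integer n.
    """
    return [
        m for m in range(1, max_m + 1)
        if (2 * m * q1) % q2 == 0 and ((2 * m * q1) // q2) % 2 == 1
    ]


halfway_9_6 = critical_multiples(9, 6, max_m=4)   # 9 and 27 sit between bins of step 6
halfway_7_3 = critical_multiples(7, 3)            # odd q2: no integer midpoint exists
print(halfway_9_6, halfway_7_3)
```

With an odd q2 the midpoint between two consecutive bins is never an integer, so under this reading the critical case can only occur for even q2.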
DCT Histogram Filtering
This module actually provides a set of filtered histograms Hfiltq1i (one for
each quantization step q1i ∈ {q1min, q1min + 1, . . . , q1max }).
Overall Schema
DCT Histogram Selection
• The modules presented above provide a series of first quantization
candidates that have to be considered for further evaluation.
• The DCT Histogram Selection step estimates the q1 value by directly
exploiting the information related to the histogram values. In
order to select the correct first quantization step, we exploit
information coming from the original double compressed image IDQ.
• We start with the extraction of DCT
coefficients cDQ, followed by a rough
estimation of the original DCT coefficients
obtained through a proper cropping of the
double compressed image.
The approach relies on the following consideration:
double JPEG compression modifies the histograms of the DCT
coefficients through a function that depends on both the first and second
quantization factors (q1 and q2).
The refined histograms $H_{q_{1s}}$, related to the simulated double quantization
with $q_{1s} \in C_S$, are compared with $H_{real}$ obtained from $I_{DQ}$, and the closest one is
selected as follows:
$$q_1 = \arg\min_{q_{1s} \in C_S} \sum_{i=1}^{N} \min\left(max_{diff},\ \left|H_{real}(i) - H_{q_{1s}}(i)\right|\right)$$
where $N$ is the number of bins of the histograms and $max_{diff}$ is a threshold
used to limit the contribution of a single difference to the overall distance
computation.
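The selection rule can be sketched directly from the formula, reading the difference as an absolute value. The histograms, the candidate values and the threshold below are made-up numbers used only to show the effect of the clipping.

```python
def clipped_distance(h_real, h_sim, max_diff):
    """Sum of per-bin absolute differences, each limited to max_diff."""
    return sum(min(max_diff, abs(a - b)) for a, b in zip(h_real, h_sim))


def select_q1(h_real, simulated, max_diff):
    """Return the candidate q1s whose simulated histogram is closest to h_real.

    simulated maps each candidate q1s to its simulated histogram.
    """
    return min(simulated, key=lambda q: clipped_distance(h_real, simulated[q], max_diff))


h_real = [50, 0, 30, 28, 0, 12]              # hypothetical histogram from the image
simulated = {
    6: [48, 1, 31, 27, 0, 13],               # small differences in every bin
    9: [50, 0, 30, 0, 0, 120],               # exact in most bins, two large outliers
}
best = select_q1(h_real, simulated, max_diff=10)
print(best)
```

The threshold keeps a single badly matching bin from dominating the distance, so the ranking reflects how many bins disagree rather than how large one disagreement is.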
Experiments
• To assess the performance of the proposed
approach, several tests have been conducted
on double compressed JPEG images
obtained from two different sources, as described below.
• A dataset of 110 uncompressed
images has been collected using different
cameras (Canon D40, Canon D50 and Canon Mark3)
with different resolutions.
Experimental Settings
A crop of size 1024 × 1024 has been taken from the central part of
each image in order to speed up the tests. Starting from
the cropped images, a dataset of double compressed
images has been built by applying the JPEG encoding
provided by Matlab with the standard JPEG quantization
tables proposed by the IJG (Independent JPEG Group),
considering quality factors (QF1, QF2) in the range 50 to 100 at
steps of 10 (1650 images in total).
CanonD40D50Mk3 dataset
UCID v2 (dataset)
1338 uncompressed images -> 20070 images
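Both reported totals (1650 and 20070 images) are reproduced if only the pairs with QF1 < QF2 are kept, which matches the q1 > q2 setting of the method. The slides do not state this restriction, so it is an assumption in the count below.

```python
from itertools import product

quality_factors = range(50, 101, 10)          # 50, 60, ..., 100

# Assumption: only pairs with QF1 < QF2 (first compression stronger than the second).
pairs = [
    (qf1, qf2)
    for qf1, qf2 in product(quality_factors, repeat=2)
    if qf1 < qf2
]

canon_total = 110 * len(pairs)                # Canon D40/D50/Mark3 dataset
ucid_total = 1338 * len(pairs)                # UCID v2 dataset
print(len(pairs), canon_total, ucid_total)
```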
Conclusions
In this work we proposed a novel algorithm for the estimation
of the first quantization steps from double compressed JPEG
images. The proposed approach, combining a filtering strategy
and an error function with a good q1-localization property,
obtains satisfactory results, outperforming the other
state-of-the-art approaches for both low and high frequencies.
Future work will be devoted to coping with the case q1 < q2
and to exploiting the proposed approach to recover the
overall initial quantization matrix, considering a double
compression process achieved by applying the actual quantization
tables used by camera devices and common photo-retouching
software (e.g., Photoshop, Gimp, etc.).
Recent Works
• Singh, G. & Singh, K. Forensics for partially double compressed doctored JPEG images -
Multimed Tools Appl (2016). doi:10.1007/s11042-016-4290-5
• Abstract. Digital image forensics is required to investigate unethical use of doctored images by
recovering the historic information of an image. Most of the cameras compress the image using
JPEG standard. When this image is decompressed and recompressed with different quantization
matrix, it becomes double compressed. Although in certain cases, e.g. after a cropping attack, the
image can be recompressed with the same quantization matrix too. This JPEG double compression
becomes an integral part of forgery creation. The detection and analysis of double compression in
an image help the investigator to find the authenticity of an image. In this paper, a two-stage
technique is proposed to estimate the first quantization matrix or steps from the partial double
compressed JPEG images. In the first stage of the proposed approach, the detection of the
double compressed region through JPEG ghost technique is extended to the automatic isolation
of the doubly compressed part from an image. The second stage analyzes the doubly
compressed part to estimate the first quantization matrix or steps. In the latter stage, an
optimized filtering scheme is also proposed to cope with the effects of the error. The results of
proposed scheme are evaluated by considering partial double compressed images based on the
two different datasets. The partial double compressed datasets have not been considered in the
previous state-of-the-art approaches. The first stage of the proposed scheme provides an average
percentage accuracy of 95.45%. The second stage provides an error less than 1.5% for the first 10
DCT coefficients, hence, outperforming the existing techniques. The experimental results consider
the partial double compressed images in which the recompression is done with different
quantization matrix.
Recent Works
• Bin Li et al. Statistical Model of JPEG Noises and Its Application in Quantization
Step Estimation IEEE Transactions on Image Processing ( Volume: 24, Issue: 5, May
2015 )
• Abstract. In this paper, we present a statistical analysis of JPEG noises, including
the quantization noise and the rounding noise during a JPEG compression cycle.
The JPEG noises in the first compression cycle have been well studied; however, so
far less attention has been paid on the statistical model of JPEG noises in higher
compression cycles. Our analysis reveals that the noise distributions in higher
compression cycles are different from those in the first compression cycle, and they
are dependent on the quantization parameters used between two successive cycles.
To demonstrate the benefits from the analysis, we apply the statistical model in
JPEG quantization step estimation. We construct a sufficient statistic by exploiting
the derived noise distributions, and justify that the statistic has several special
properties to reveal the ground-truth quantization step. Experimental results
demonstrate that the proposed estimator can uncover JPEG compression history
with a satisfactory performance.
Recent Works
• Bin Li et al. Revealing the Trace of High-Quality JPEG Compression Through
Quantization Noise Analysis IEEE Transactions on Information Forensics and
Security ( Volume: 10, Issue: 3, March 2015 )
• Abstract. To identify whether an image has been JPEG compressed is an important
issue in forensic practice. The state-of-the-art methods fail to identify high-quality
compressed images, which are common on the Internet. In this paper, we provide a
novel quantization noise-based solution to reveal the traces of JPEG compression.
Based on the analysis of noises in multiple-cycle JPEG compression, we define a
quantity called forward quantization noise. We analytically derive that a
decompressed JPEG image has a lower variance of forward quantization noise than
its uncompressed counterpart. With the conclusion, we develop a simple yet very
effective detection algorithm to identify decompressed JPEG images. We show that
our method outperforms the state-of-the-art methods by a large margin especially
for high-quality compressed images through extensive experiments on various
sources of images. We also demonstrate that the proposed method is robust to
small image size and chroma subsampling. The proposed algorithm can be applied
in some practical applications, such as Internet image classification and forgery
detection.
Recent Works
• Taimori, A., Razzazi, F., Behrad, A. et al. A novel forensic image analysis tool for
discovering double JPEG compression clues Multimed Tools Appl (2017) 76: 7749.
doi:10.1007/s11042-016-3409-z
• Abstract. This paper presents a novel technique to discover double JPEG
compression traces. Existing detectors only operate in a scenario that the image
under investigation is explicitly available in JPEG format. Consequently, if
quantization information of JPEG files is unknown, their performance dramatically
degrades. Our method addresses both forensic scenarios which results in a fresh
perceptual detection pipeline. We suggest a dimensionality reduction algorithm to
visualize behaviors of a big database including various single and double
compressed images. Based on intuitions of visualization, three bottom-up, top-
down and combined top-down/bottom-up learning strategies are proposed. Our
tool discriminates single compressed images from double counterparts, estimates
the first quantization in double compression, and localizes tampered regions in a
forgery examination. Extensive experiments on three databases demonstrate results
are robust among different quality levels. F1-measure improvement to the best
state-of-the-art approach reaches up to 26.32 %. An implementation of algorithms
is available upon request to fellows.
Recent works
• Fei et al. - MSE period based estimation of first quantization step in double
compressed JPEG images - Signal Processing: Image Communication Volume 57,
September 2017, Pages 76–83
• Abstract. The estimation of the first quantization step in double JPEG compressed
images is still a challenging problem, especially when the first quantization step q1
is smaller than the second quantization step q2. In this paper, we present a novel
method to estimate q1. By introducing the mean square error (MSE) sequence of
ratios among DCT coefficient histogram bins, we formulate the relationship
between its periodic fluctuation and q1. And in order to enhance the periodic
effect, we propose a strategy to adjust the histogram. Then, based on MSE
sequence, several q1 candidates can be obtained. Finally by histogram
comparison, the estimated quantization step is selected from the candidates.
Experimental results demonstrate that the proposed approach has better overall
performance when compared with state of the art methods.
Recent works
• Thanh Hai Thai et al. - JPEG Quantization Step Estimation and Its Applications to
Digital Image Forensics - IEEE Transactions on Information Forensics and Security (
Volume: 12, Issue: 1, Jan. 2017 )
• Abstract. The goal of this paper is to propose an accurate method for estimating
quantization steps from an image that has been previously JPEG-compressed and
stored in lossless format. The method is based on the combination of the
quantization effect and the statistics of discrete cosine transform (DCT) coefficient
characterized by the statistical model that has been proposed in our previous works.
The analysis of quantization effect is performed within a mathematical framework,
which justifies the relation of local maxima of the number of integer quantized
forward coefficients with the true quantization step. From the candidate set of the
true quantization step given by the previous analysis, the statistical model of DCT
coefficients is used to provide the optimal quantization step candidate. The proposed
method can also be exploited to estimate the secondary quantization table in a
double-JPEG compressed image stored in lossless format and detect the presence of
JPEG compression. Numerical experiments on large image databases with different
image sizes and quality factors highlight the high accuracy of the proposed method.
Main Contacts
Further Info
Image Processing Lab
Università di Catania
www.dmi.unict.it/~iplab
Email
battiato@dmi.unict.it

Multimedia Security - JPEG Artifact details

  • 1.
    30/05/2017 1 Image Forgery Localizationby JPEG Artifact Analysis Prof. Sebastiano Battiato Dipartimento di Matematica e Informatica, Università di Catania Image Processing LAB – http://iplab.dmi.unict.it ICT Doctoral School - Trento, May 2017 Artifact and Coding • File level analysis – File system analysis: MAC time,Filename, Size – Format analysis: Image file format, Resolution/color depth/channels, Compression/coding parameters • Image analysis: – Pixel domain: blocking artifacts, error level analysis, JPEG ghosts – Transform domain: DCT values • Compression history – Uncompressed, Single compression – Double compression (or more) – Evaluation of compression coefficients (QTs) – Evaluation of previous compression coefficients (QTs) ICT Doctoral School - Trento, May 2017
  • 2.
    30/05/2017 2 A possible workflow ICTDoctoral School - Trento, May 2017 The JPEG Standard • JPEG stands for an image compression stream of bytes; • JFIF (JPEG File Interchange Format) stands for a standard which define: o Component sample registration o Resolution and aspect ratio o Color Space • ExIF allows to integrate further information into the file ICT Doctoral School - Trento, May 2017
  • 3.
    30/05/2017 3 JPEG Compression • Convertingan image into JPEG is a six step process: • The image is converted from raw RGB data into YCbCr; • A downsampling is performed on chrominance channels; • The channels are splitted into 8x8 blocks; • A Discrete Cosine Transform is applied; • The DCT coefficient are Quantized (lossy) using fixed tables; • Finally an entropy coding (lossless compression) is applied and the image is said to be JPEG compressed Color Transform Down- Sampling Forward DCT Quantization Encoding Decoding De- quantization Inverse DCT Up- Sampling RAW Data Color Transform JPEG Compressed Image JPEG Compression JPEG Decompression Block Restoring Block Splitting ICT Doctoral School - Trento, May 2017 ICT Doctoral School - Trento, May 2017
  • 4.
    30/05/2017 4 Color Conversion &Downsampling • First, the image is converted from RGB into a different colors pace called YCbCr. • The Y component represents the brightness of a pixel, the Cb and Cr components represent the chrominance (split into blue and red components). • The Cr and Cb components are usually downsampled because, due to the densities of color- and brightness-sensitive receptors in the human eye, humans can see considerably more fine detail in the brightness of an image (the Y component) than in the color of an image (the Cb and Cr components). R B G Y Cr Cb ICT Doctoral School - Trento, May 2017 Color Transform R B G Y Cb Cr Example: The human eye is more sensitive to luminance than to chrominance. Typically JPEG throw out 3/4 of the chrominance information before any other compression takes place. This reduces the amount of information to be stored about the image by 1/2. With all three components fully stored, 4 pixels needs 3 x 4 = 12 component values. If 3/4 of two components are discarded we need 1 x 4 + 2 x 1 = 6 values. Y = 0.299 R + 0.587 G + 0.114 B Cb = (B – Y)/2 + 0.5 Cr = (R – Y)/2 + 0.5 ICT Doctoral School - Trento, May 2017
  • 5.
    30/05/2017 5 Chrominance subsampling ICT DoctoralSchool - Trento, May 2017 Blocks 8x8 • After subsampling, each channel must be split into 8x8 blocks of pixels. • If the data for a channel does not represent an integer number of blocks then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data. ICT Doctoral School - Trento, May 2017
  • 6.
    30/05/2017 6 Discrete Cosine Transform •Next, each component (Y, Cb, Cr) of each 8×8 block is converted to a frequency-domain representation, using a normalized, two-dimensional type-II discrete cosine transform (DCT). • As an example, one such 8×8 8-bit subimage might be: • Before computing the DCT of the subimage, its gray values are shifted from a positive range to one centered around zero. • For an 8-bit image each pixel has 256 possible values: [0,255]. To center around zero it is necessary to subtract by half the number of possible values, or 128. ICT Doctoral School - Trento, May 2017 Discrete Cosine Transform The DCT transforms 64 pixels to a linear combination of these 64 squares. Horizontally is u and vertically is v. • Subtracting 128 from each pixel value yields pixel values on [ -128,127] and we obtain the following matrix. • The next step is to take the two- dimensional DCT, which is given by: where  is the horizontal spatial frequency, for the integers .  is the vertical spatial frequency, for the integers .  is a normalizing function  is the pixel value at coordinates  is the DCT coefficient at coordinatesICT Doctoral School - Trento, May 2017
  • 7.
    30/05/2017 7 DCT basis The 64(8 x 8) DCT basis functions: DC Coefficient AC Coefficients ICT Doctoral School - Trento, May 2017 Image Representation with DCT DCT coefficients can be viewed as weighting functions that, when applied to the 64 cosine basis functions of various spatial frequencies (8 x 8 templates), will reconstruct the original block. Original image block DC (flat) basis function AC basis functions ICT Doctoral School - Trento, May 2017
  • 8.
    30/05/2017 8 DCT example ICT DoctoralSchool - Trento, May 2017 DCT Coefficients Quantization • The DCT coefficients are quantized to a limited number of possible levels. • The Quantization is needed to reduce the number of bits per sample. Formula: F( u, v) = round[ F( u, v) / Q( u, v)] – Q( u, v) = constant => Uniform Quantization. – Q( u, v) = variable => Non-uniform Quantization. Example: 101000 = 40 (6 bits precision) Truncates to 4 bits = 1000 = 8 (4 bits precision). i.e. 40/5 = 8, there is a constant N=5, or the quantization or quality factor . ICT Doctoral School - Trento, May 2017
  • 9.
    30/05/2017 9 ICT Doctoral School- Trento, May 2017 Quantization step It is possible to approximate the statistical distribution of the AC DCT coefficients, both luminance and chrominance components, of a 8x8 block, by a Laplacian distribution in the following way: pi(x)=  i /2 e-i |x| i = 1, 2, ..., 64; where: i= sqrt(2)/i ; i = i-th DCT standard deviation; EXAMPLE: Q(u,v)= 8; Quantization Step Round(256/8)= 32 Intervals; [0, 8, 16, 24, 32, 40, ..., 256] - Reconstruction Levels Dead zone ICT Doctoral School - Trento, May 2017 Edmund Y. Lam, , Joseph W. Goodman - A Mathematical Analysis of the DCT Coefficient Distributions for Images– IEEE Trans. On Image Processing, Vol.9, No. 10, 2000
  • 10.
    30/05/2017 10 Distributions of DCTCoefficients for Images An estimation of μ is the sample median. The maximum likelihood estimator of b is: The DCT coefficient resample a Laplacian  Each distribution can be summarized with two parameters. ICT Doctoral School - Trento, May 2017 G.M. Farinella, D. Ravì, V. Tomaselli, M. Guarnera, S. Battiato - Representing Scenes for Real-time Context Classification on Mobile Devices – Elsevier, Pattern Recognition - vol.48, 2015 D. Ravì, M. Bober, G.M. Farinella, M.Guarnera, S.Battiato - Semantic Segmentation of Images Exploiting DCT Based Features and Random Forest - Elsevier, Pattern Recognition - vol. 52, 2016
  • 11.
    30/05/2017 11 Representation • Context ofdifferent classes differs in the scales (b) of the AC DCT coefficient distributions. • To represent the context we use a feature vector describing the AC DCT coefficients distributions of an image Each distribution is summarized by the parameter b.  Compact Representation  Simple to be computed during IGP Diversity of the DCT coefficient distributions on the 8 Scene Dataset
  • 12.
    30/05/2017 12 2-dimensional distributions (fittedwith a Gaussian model) related to the Laplacian parameter b of the DCT frequency (0,1) and (1,0). Image Generation Pipeline
  • 13.
    30/05/2017 13 Final AC DCTfrequencies considered for representing the context of the scene Classification Results (8 scene classes) (A) (B) (C) (D) (E) (F) GIST Performances match the GIST ones, but in constrained domain. Feature Extraction is for "free" (JPEG). Can be used as global descriptor also in post acquisition time. Can be computed in real time.
  • 14.
    30/05/2017 14 Proposed Classification Results (8scene classes) G.M. Farinella, D. Ravì, V. Tomaselli, M. Guarnera, S. Battiato - Representing Scenes for Real-time Context Classification on Mobile Devices – Elsevier, Pattern Recognition - vol.48, 2015 D. Ravì, M. Bober, G.M. Farinella, M.Guarnera, S.Battiato - Semantic Segmentation of Images Exploiting DCT Based Features and Random Forest - Elsevier, Pattern Recognition - vol. 52, 2016 Implemented with FCAM. Thanks to Nokia Research Center for providing the smartphones. This research has been sponsored by STMicroelectronics. http://iplab.dmi.unict.it/DCT-GIST
  • 15.
    Standard Q-tables
    The eye is most sensitive to low frequencies (upper left corner) and less sensitive to high frequencies (lower right corner).
    Luminance Quantization Table
        16  11  10  16  24  40  51  61
        12  12  14  19  26  58  60  55
        14  13  16  24  40  57  69  56
        14  17  22  29  51  87  80  62
        18  22  37  56  68 109 103  77
        24  35  55  64  81 104 113  92
        49  64  78  87 103 121 120 101
        72  92  95  98 112 100 103  99
    Chrominance Quantization Table
        17  18  24  47  99  99  99  99
        18  21  26  66  99  99  99  99
        24  26  56  99  99  99  99  99
        47  66  99  99  99  99  99  99
        99  99  99  99  99  99  99  99
        99  99  99  99  99  99  99  99
        99  99  99  99  99  99  99  99
        99  99  99  99  99  99  99  99
    The numbers in the above quantization tables can be scaled up (or down) to adjust the so-called Quality Factor QF (i.e. Q*(u,v) = QF x Q(u,v)). Custom quantization tables can also be put in the image/scan header.
    Quantized DCT
    zij = round( yij / qij )
  • 16.
    Quantization
    • Quantization is usually used to convert a continuous signal into a discrete space.
    • In the example in the figure, a continuous signal is processed with a larger quantization step X (which drastically reduces the number of samples) and with a smaller step X/4, which introduces more samples and follows the continuous signal much more closely.
    Figure: Original Continuous Signal; Quantized with Step X; Quantized with Step X/4
    Quantization Tables
  • 17.
    Quantization Tables
    • The standard requires each image to have between one and four quantization tables.
    • The most commonly used quantization tables are those published by the Independent JPEG Group (IJG) in 1998.
    • These tables can be scaled to a quality factor Q.
    • The quality factor allows the image creation device to choose between:
      o Larger, higher quality images
      o Smaller, lower quality images.
    Quantization Tables
    • For example, we can scale the IJG standard table using Q=80 by applying Eq. (2) to each element in the table. The resulting values are the scaled quantization tables.
    • Note that the numbers in the scaled table are lower than in the standard table, indicating that an image compressed with these tables will be of higher quality than one compressed with the standard table. Scaling with Q=50 does not change the table.
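Eq. (2) is not reproduced in this transcript. The sketch below assumes it is the scaling rule used by the IJG libjpeg library (a percentage scale of 5000/Q for Q < 50 and 200 - 2Q otherwise, applied with rounding and clamped to 1..255 for baseline JPEG); `scale_table` is a hypothetical name.

```python
# Standard luminance table, as listed on the "Standard Q-tables" slide.
BASE_LUMINANCE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scale_table(base, quality):
    """Scale a base table to quality factor Q in 1..100 (assumed IJG-style rule)."""
    quality = max(1, min(100, quality))
    # Percentage scale: 100 at Q=50, larger below, smaller above.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [
        [min(255, max(1, (value * scale + 50) // 100)) for value in row]
        for row in base
    ]

table_q80 = scale_table(BASE_LUMINANCE, 80)
table_q50 = scale_table(BASE_LUMINANCE, 50)
```

Under this assumption Q=50 gives a scale of 100%, which is why the table is unchanged, and Q=80 gives 40%, which lowers every entry.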
  • 18.
    Quantization Tables
    • The different QTs can be classified into the following categories:
      o Standard Tables: images which use scaled versions of the QTs published by the Independent JPEG Group (IJG);
      o Extended Tables: same as Standard Tables, but with three tables instead of two. The third table is a duplicate of the second;
      o Custom Fixed Tables: images containing non-IJG QTs that do not depend on the image being processed (Adobe Photoshop);
      o Custom Adaptive Tables: these images do not conform to the IJG standard. In addition, the tables may change, either in part or as a whole, between images created by the same device using the same settings. They may also contain constants, i.e. values that do not change regardless of the quality setting or the image being processed.
    Quantization Tables For Ballistics
    "Considering only QTs is not sufficient to discriminate between different source cameras, but it could be useful to clearly identify altered images."
    Issues
    • The QT database will never be complete: a study showed 15000 different cameras/software packages and almost 63000 different QTs on the market.
    • Adaptive tables.
    • QTs can be spoofed by resaving the file with the desired QTs.
    Performance can be improved by combining QTs with other metadata: image size, chroma subsampling, thumbnail size, Huffman tables (less variable than QTs), …
    Photoshop QTs have been the same since version 3 and do not correspond to any known camera.
  • 19.
    Camera Ballistics through quantization tables
    Figure: Most common matrix
    • 1513 different matrices
    • 70 camera models
    • 7337 devices (DSC, smartphone, tablet)
    Thanks to Jerian Martino (from AMPED srl: http://ampedsoftware.com/company )
    Image Forgery Localization by JPEG Artifact Analysis
    The manipulation is detected by analyzing artifacts introduced by JPEG recompression. Two main classes can be considered:
    • aligned double JPEG (A-DJPG) compression (i.e., the first and second JPEG compressions use aligned DCT grids);
    • non-aligned double JPEG (NA-DJPG) compression.
  • 20.
    JPEG Analysis
    • JPEG Double Quantization Detection (DQD) and Quantization Step Estimation (QSE) are two fundamental steps in the overall process.
    • We review DQD and QSE methods by dividing them into categories named after what is exploited from the image data. The categories cover methods based on:
      – Probability distributions on DCT coefficients;
      – Benford's Law;
      – Benford's Fourier Coefficients;
      – Neural Networks encoding and classification;
      – DCT coefficients comparison;
      – SVM classifiers;
      – Factor Histogram;
      – Considerations on noise.
    Benford's Law
    Benford's law, also called the first-digit law, refers to the frequency distribution of digits in many real-life sources of data. In this distribution, the number 1 occurs as the first digit about 30% of the time, and larger digits occur in that position progressively less often: 9 is the first digit less than 5% of the time.
    http://en.wikipedia.org/wiki/Benford's_law
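The percentages quoted above follow from the standard form of Benford's law, P(d) = log10(1 + 1/d) for d = 1, …, 9. A minimal sketch follows; the function names are illustrative, and the powers of 2 serve only as a convenient data set known to follow the law.

```python
import math
from collections import Counter

def benford_probability(digit):
    """Probability that `digit` (1..9) is the leading digit under Benford's law."""
    return math.log10(1 + 1 / digit)

def first_digit(value):
    """Leading decimal digit of a non-zero integer."""
    return int(str(abs(int(value)))[0])

def first_digit_frequencies(values):
    """Empirical leading-digit frequencies, ignoring zeros."""
    digits = [first_digit(v) for v in values if int(v) != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Powers of 2 are a classic sequence whose leading digits follow Benford's law.
observed = first_digit_frequencies(2 ** k for k in range(1, 1001))
expected = {d: benford_probability(d) for d in range(1, 10)}
```

For forensic use the input would instead be the non-zero quantized DCT coefficients of the image.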
  • 21.
    Generalized Benford's Law (1)
    Fu et al. observed that the distribution of the first digit of the DCT coefficients in a single JPEG compressed image follows a generalized Benford distribution:
    $p(x) = N \log_{10}\left(1 + \frac{1}{s + x^{q}}\right), \quad x = 1, 2, \dots, 9$
    N is a normalization factor; s and q are model parameters that describe the distributions for different images and different compression factors.
    D. Fu, Y. Q. Shi, and W. Su, A generalized Benford's law for JPEG coefficients and its applications in image forensics, in Proc. SPIE, Security, Steganography, Watermarking of Multimedia Contents IX, P.W. Wong and E. J. Delp, Eds., San Jose, CA, Jan. 2007, vol. 6505, pp. 1L1–1L11.
    Generalized Benford's Law (2)
    In the case of double quantization, the statistical distribution of the first digit does not follow the generalized Benford distribution. This behavior can then be used to discriminate between single (a) and double (b) compression.
  • 22.
    Generalized Benford's Law (3)
    Starting from the method of Fu et al., Li et al. analyze the distribution of the first digit of each DCT coefficient in the case of single and double compression. They observed that these distributions do not follow the generalized Benford's law.
    B. Li, Y. Shi, and J. Huang, Detecting doubly compressed JPEG images by using mode based first digit features, in Proc. IEEE 10th Workshop Multimedia Signal Processing, Oct. 2008, pp. 730–735.
    Generalized Benford's Law (4)
    Although these distributions do not follow the generalized Benford's law, they show differences that can be exploited as a feature vector in a two-class classifier. Unlike Fu et al., they use only a limited set of DCT coefficients (20). By considering DCT coefficients individually, the contribution of distinguishable modes can be better exploited.
  • 23.
    Other related works
    • Feng X. and Doerr G.: JPEG recompression detection, IS&T/SPIE Electronic Imaging, 75410J, (2010)
    • Li X.H. and Zhao Y.Q. and Liao M. and Shih F.Y. and Yun Q.S.: Detection of tampered region for JPEG images by using mode-based first digit features, EURASIP Journal on advances in signal processing, 1, 1–10, (2012)
    • Hou W. and Ji Z. and Jin X. and Li X.: Double JPEG Compression Detection Base on Extended First Digit Features of DCT Coefficients, International Journal of Information and Education Technology, 3, 5, 512–515, (2013)
    • Milani S. and Tagliasecchi M. and Tubaro S.: Discriminating multiple JPEG compressions using first digit features, APSIPA Transactions on Signal and Information Processing, 3, e19, (2014)
    • Pasquini C. and Boato G. and Perez-Gonzalez F.: A Benford-Fourier JPEG compression detector, IEEE International Conference on Image Processing (ICIP), 5322–5326, (2014)
    • Pasquini C. and Boato G. and Perez-Gonzalez F.: Multiple JPEG compression detection by means of Benford-Fourier coefficients, IEEE International Workshop on Information Forensics and Security (WIFS), 113–118, (2014)
    Periodic artifact introduced by Double JPEG quantizations (1)
    A. C. Popescu and H. Farid, Statistical tools for digital forensics, in Proc. 6th Int. Workshop Information Hiding, Berlin, Germany, 2004, pp. 128–147, Springer-Verlag.
    Z. Lin, J. He, X. Tang, and C.-K. Tang, Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis, Pattern Recognition, vol. 42, no. 11, pp. 2492–2501, Nov. 2009.
    The properties of the histograms of double quantized DCT coefficients can be exploited to detect forgery. The number n(u2) of original histogram bins that contribute to a single bin u2 of the double quantized histogram h2 depends on u2.
    Note that n(u2) is a periodic function with period p = q1/gcd(q1, q2), where q1 and q2 are the quantization coefficients of the first and the second JPEG compression.
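The periodicity of n(u2) can be checked numerically. The sketch below is a simplified model that assumes integer original values and round-half-up quantization; `round_div` and `contributing_bins` are hypothetical helper names.

```python
from collections import Counter
from math import gcd

def round_div(a, b):
    """Round a / b to the nearest integer (ties upward), for b > 0, in exact arithmetic."""
    return (2 * a + b) // (2 * b)

def contributing_bins(q1, q2, values):
    """n(u2): how many original integer values end up in bin u2 after double quantization."""
    n = Counter()
    for u in values:
        u1 = round_div(u, q1)        # first quantization
        u2 = round_div(u1 * q1, q2)  # dequantization followed by second quantization
        n[u2] += 1
    return n

q1, q2 = 5, 3
n = contributing_bins(q1, q2, range(-600, 601))
period = q1 // gcd(q1, q2)
is_periodic = all(n[u2] == n[u2 + period] for u2 in range(-50, 50))
empty_bins = [u2 for u2 in range(period) if n[u2] == 0]
```

With q1 = 5 and q2 = 3 the period is 5, and some bins inside each period receive no contribution at all.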
  • 24.
    Periodic artifact introduced by Double JPEG quantizations (2)
    (Popescu and Farid, 2004; Lin et al., 2009)
    If q2 < q1, then n(u2) = 0 for some u2, hence the histogram of the double quantized coefficients can show periodically missing values. Conversely, if q2 > q1 the histogram can show a periodic pattern of peaks and valleys.
    Periodic artifact introduced by Double JPEG quantizations (3)
    (Popescu and Farid, 2004)
    These periodic artifacts are visible in the Fourier domain as strong peaks at medium and high frequencies.
    Figure: Fourier transforms of three histograms corresponding to: (1) single JPEG compression with quality 75; (2) double JPEG compression with quality 85 followed by 75; (3) double JPEG compression with quality 75 followed by 85.
  • 25.
    Periodic artifact introduced by Double JPEG quantizations (4)
    (Lin et al., 2009)
    Lin et al. build histograms for each channel and each frequency. For each block in the image, they use one histogram to compute the probability that the block is tampered, by checking the DQ effect in this histogram.
    Periodic artifact introduced by Double JPEG quantizations (5)
    T. Bianchi, A. D. Rosa, and A. Piva, Improved DCT coefficient analysis for forgery localization in JPEG images, in Proc. ICASSP 2011, May 2011, pp. 2444–2447.
    Bianchi et al. improve the algorithm proposed by Lin et al. by considering a better model of the observed histogram. Specifically, a limitation of the Lin et al. method concerns the estimation of the conditional probability p(x|H1), where x is the value of the DCT coefficient and H0 (H1) indicates the hypothesis of being tampered (original). This probability is estimated from the observed histogram of x; however, in a tampered image that histogram is a mixture of p(x|H1) and p(x|H0). To obtain a better model they develop a novel algorithm for estimating the coefficients of the first quantization (q1) and also consider the effects of rounding and truncation.
  • 26.
    Non-aligned double JPEG (NA-DJPG) compression
    • W. Luo, Z. Qu, J. Huang, and G. Qiu, "A novel method for detecting cropped and recompressed image block," in Proc. ICASSP, 2007, vol. 2, pp. II-217–II-220.
    • M. Barni, A. Costanzo, and L. Sabatini, "Identification of cut and paste tampering by means of double-JPEG detection and image segmentation," in Proc. ISCAS, 2010, pp. 1687–1690.
    • Y.-L. Chen and C.-T. Hsu, "Image tampering detection by blocking periodicity analysis in JPEG compressed images," in Proc. IEEE 10th Workshop Multimedia Signal Processing, Oct. 2008, pp. 803–808.
    • T. Bianchi and A. Piva, "Detection of nonaligned double JPEG compression based on integer periodicity maps," IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, Apr. 2012.
    • T. Bianchi, A. Piva, "Image Forgery Localization via Block-Grained Analysis of JPEG Artifacts", IEEE Transactions on Information Forensics & Security, Volume 7, Issue 3, 2012, pp. 1003–1017.
    Cropping detection through blocking artifact
    • Cropping corresponds to a translation of the original blocking effect.
    • Detection therefore looks for a blocking effect that is not aligned to the 8x8 pixel grid.
    Figure: Uncompressed image; Compressed image
  • 27.
    Block based system
    • An algorithm has been developed in the spatial domain.
    Published: A. R. Bruna, G. Messina, S. Battiato, "Crop Detection Through Blocking Artefacts Analysis" - Image Analysis and Processing -- ICIAP 2011, Springer, Lecture Notes in Computer Science, Vol. 6979, ISBN 978-3-642-24087-4
    H/V filters (1/3)
    • Convolutional filters such as the following allow straight lines to be retrieved:
        H filter            V filter
         1  1  1  1          1 -1  0  0
        -1 -1 -1 -1          1 -1  0  0
         0  0  0  0          1 -1  0  0
         0  0  0  0          1 -1  0  0
    • OutH = Img * H_filter
    • OutV = Img * V_filter
  • 28.
    Regular pattern measure (RPM)
    • A measure of the blocking at a particular horizontal (or vertical) position has been defined:
      $RPM_H(i) = \sum_{j=0}^{\lfloor N/8 \rfloor} I'_H(8j + i), \quad i = 1, \dots, 7$
      $RPM_V(i) = \sum_{j=0}^{\lfloor M/8 \rfloor} I'_V(8j + i), \quad i = 1, \dots, 7$
    Figure: regular vertical pattern measure and regular horizontal pattern measure; example of the RPMh and RPMv measures corresponding to a crop of (5,4).
    Experimental results
    • Experimental results on a large database showed good reliability, especially at higher compression ratios (i.e. lower quality factors):
        Quality factor    Accuracy (%)
        10                99
        20                91
        30                80
        40                69
        50                58
        60                46
        70                39
        80                28
        90                16
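The idea behind the regular pattern measure can be illustrated on synthetic data. The sketch below is a simplified variant, not the published algorithm: it uses plain first differences instead of the 4x4 H/V filters, a synthetic image made of constant 8x8 blocks, and its own convention for the crop offset (rows, columns). All function names are hypothetical.

```python
import random

def make_blocky_image(height, width, crop=(0, 0), seed=0):
    """Synthetic image of constant 8x8 blocks, with `crop` = (rows, cols) removed from the top-left."""
    rng = random.Random(seed)
    dy, dx = crop
    blocks = [[rng.randrange(256) for _ in range((width + dx) // 8 + 1)]
              for _ in range((height + dy) // 8 + 1)]
    return [[blocks[(y + dy) // 8][(x + dx) // 8] for x in range(width)]
            for y in range(height)]

def regular_pattern_measure(img):
    """Accumulate edge strength by position modulo 8, for rows and for columns."""
    h, w = len(img), len(img[0])
    rpm_rows, rpm_cols = [0] * 8, [0] * 8
    for y in range(h):
        for x in range(1, w):
            rpm_cols[x % 8] += abs(img[y][x] - img[y][x - 1])
    for y in range(1, h):
        for x in range(w):
            rpm_rows[y % 8] += abs(img[y][x] - img[y - 1][x])
    return rpm_rows, rpm_cols

def estimate_crop(img):
    """Return the (rows, cols) offset of the 8x8 grid implied by the RPM peaks."""
    rpm_rows, rpm_cols = regular_pattern_measure(img)
    row_peak = max(range(8), key=lambda i: rpm_rows[i])
    col_peak = max(range(8), key=lambda i: rpm_cols[i])
    # A block edge found at position p means (8 - p) % 8 pixels were cropped.
    return (8 - row_peak) % 8, (8 - col_peak) % 8

cropped = make_blocky_image(64, 64, crop=(5, 4))
estimated = estimate_crop(cropped)
```

On a real JPEG the block edges are much weaker than in this toy image, which is consistent with the drop in accuracy at high quality factors reported above.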
  • 29.
    First Quantization Coefficient Extraction From Double Compressed JPEG Images
    F. Galvan, G. Puglisi, A.R. Bruna, S. Battiato, IEEE TIFS, Vol. 9, No. 8, August 2014
    Motivation
    The ability to roughly reconstruct the quantization table used by the device during acquisition is crucial in almost all forensic investigations.
    Such information identifies which spatial regions are associated with the same (original) quantization table, flagging as corrupted those that show different data.
    Retrieving some (even if not all) components of the first quantization matrix makes it possible to search for the device models that employ the quantization tables so identified.
  • 30.
    Motivation
    • Specifically, when the second compression is lighter than the first one, retrieving the first quantization step is often possible. This can be done by taking advantage of some interesting properties of integer numbers that emerge whenever they are quantized (that is, rounded) more than once.
    • The main novelties of the proposed approach are the filtering strategy (split noise), adopted to reduce the amount of noise in the input data (DCT histograms), and the design of a novel function with a satisfactory q1-localization property.
    Software
    • FourandSix
    • Authenticate
    • Belkasoft Forgery Detection
  • 31.
    State of the art – Lukas et al. (2003)
    The Approach:
    • characterizes features that occur in DCT histograms of individual coefficients due to double compression;
    • exploits errors introduced by the rounding function;
    • uses a neural network as a classifier;
    • trains a separate network for each value of the second quantization step.
    Results and Limitations:
    • good performance (~1% error rate) for D01, D10, D11;
    • worse results as the frequency increases;
    • no results for the DC term.
    Lukas, J., Fridrich, J.: Estimation of primary quantization matrix in double compressed JPEG images. In: Proceedings of Digital Forensic Research Workshop, DFRWS (2003).
    State of the art - Farid (2009)
    Farid, H.: Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information Forensics and Security 4(1), 154–160 (2009)
  • 32.
    State of the art - Farid (2009)
    The Approach:
    • quantize the value again with a new quantization coefficient (q3) varying in a proper range;
    • evaluate an error function defined in terms of the following quantities:
      – c is the DCT term;
      – qi is the quantization coefficient of the i-th compression;
      – […] indicates the round function;
      – |..| indicates the abs function.
    (Farid, IEEE Transactions on Information Forensics and Security, 2009)
  • 33.
    State of the art - Farid (2009)
    Results and Limitations:
    • works well only if q1 > q2 and with some pairs of quantization coefficients (left);
    • results can be unclear in some cases (center);
    • we must know "where" to search (right).
    (Farid, IEEE Transactions on Information Forensics and Security, 2009)
    State of the art – Bianchi et al. (2012)
    The Approach:
    • distinguishes between aligned and unaligned tampered regions;
    • computes a likelihood map indicating the probability of each 8x8 discrete cosine transform block being doubly compressed;
    • includes a study of the various types of errors in the JPEG algorithm.
    Results and Limitations:
    • is no longer valid if certain image processing operations, such as resizing, are applied between the two compressions;
    • gives satisfactory results only if q1 > q2.
    Bianchi, T., Piva, A.: Image forgery localization via block-grained analysis of JPEG artifacts. IEEE Transactions on Information Forensics and Security 7(3), 1003–1017 (2012)
  • 34.
    Error management
    The error e is introduced by several operations, such as color conversions (YCbCr to RGB and vice versa) and the rounding and truncation of values to eight-bit integers.
    These errors can be due to processing in different domains (e.g., the spatial domain); in any case, e is their effect on the DCT coefficients.
    Note that the factor e, often omitted in previously published works, can limit the effectiveness of any related methodology if not properly managed.
    First quantization coefficient extraction from double compressed JPEG images
    Consider the joint behaviour of the round function acting on the (i,j)-th term of the 8x8 image block, followed by dequantization, which is what happens before a forgery operation:
    $c' = \left[\frac{c}{q}\right] \times q$
    We note that:
    • if q (the quantization coefficient) is odd, all integer numbers in $[nq - \lfloor q/2 \rfloor,\ nq + \lfloor q/2 \rfloor]$ are mapped to nq.
    F. Galvan, G. Puglisi, A. R. Bruna, and S. Battiato, "First quantization coefficient extraction from double compressed JPEG images," in International Conference on Image Analysis and Processing (ICIAP), ser. Lecture Notes in Computer Science, vol. 8156, 2013, pp. 783–792.
  • 35.
    Quantization Step Estimation
    Figure: examples of the effect of the rounding function when q is even.
    Figure: histogram of all the DCT terms in position (i,j) in all the 8x8 blocks of an image.
    (Galvan et al., ICIAP 2013)
  • 36.
    Quantization Step Estimation
    q1 and q2 are, respectively, the (i,j) terms of the first and second quantization matrices. If q1 > q2, and ignoring the errors introduced in other steps, the first quantization/dequantization
    $c_1 = \left[\frac{c}{q_1}\right] \times q_1$
    maps every value to a multiple of q1 (figure: example with q1 = 10).
    The effect of the second quantization/dequantization
    $c_2 = \left[\frac{c_1}{q_2}\right] \times q_2$
    is to map multiples of q1 to multiples of q2.
    (Galvan et al., ICIAP 2013)
  • 37.
    Quantization Step Estimation
    However, since $q_2 < q_1$, the result is mapped to a generic multiple $m q_2$ of $q_2$, where:
    • $c_2 \in [n q_2 - \lfloor q_2/2 \rfloor,\ n q_2 + \lfloor q_2/2 \rfloor]$ if $q_2$ is odd;
    • $c_2 \in [n q_2 - q_2/2,\ n q_2 + q_2/2 - 1]$ if $q_2$ is even;
    and it therefore lies in the range related to $n q_1$.
    If we quantize again with q1, we return to the situation before the second quantization.
    (Galvan et al., ICIAP 2013)
  • 38.
    Quantization Step Estimation
    • At this point, $\left[\frac{c_2}{q_1}\right] \times q_1$ maps $c_2$ to $n q_1$ again since, as pointed out above, $c_2$ is in the range related to $n q_1$.
    • The three steps above show that a proper error function can be built whose value is 0 when q3 = q1, regardless of the value of c.
    • This property then allows the first quantization step to be localized with high precision.
    Summarizing: (figure)
    (Galvan et al., ICIAP 2013)
  • 39.
    Quantization Step Estimation
    Summarizing: (figures)
    (Galvan et al., ICIAP 2013)
  • 40.
    Quantization Step Estimation
    Summarizing: a fourth quantization with q2 brings us back to the situation we had after the second quantization.
    The proposed error function is evaluated with q3 varying in a proper range. If q3 = q1, the error function is 0. In its definition:
    • c is the DCT term;
    • qi is the quantization coefficient of the i-th compression;
    • […] indicates the round function;
    • |..| indicates the abs function.
    (Galvan et al., ICIAP 2013)
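The formula of the error function is not reproduced in this transcript. The sketch below is one reading of the steps described on these slides, not necessarily the authors' exact definition: requantize the double compressed value with q3, then with q2, and compare the result with the double compressed value. It ignores the error term e, and all names are hypothetical.

```python
def round_div(a, b):
    """Round a / b to the nearest integer (ties upward), for b > 0, in exact arithmetic."""
    return (2 * a + b) // (2 * b)

def requantize(c, q):
    """Quantization followed by dequantization: [c / q] x q."""
    return round_div(c, q) * q

def error(c, q1, q2, q3):
    """|c4 - c2|, where c2 is the double compressed value and c4 its q3-then-q2 requantization."""
    c2 = requantize(requantize(c, q1), q2)
    c4 = requantize(requantize(c2, q3), q2)
    return abs(c4 - c2)

def total_error(coefficients, q1, q2, q3):
    """Error accumulated over a set of DCT terms for one candidate q3."""
    return sum(error(c, q1, q2, q3) for c in coefficients)

coefficients = range(-300, 301)
q1, q2 = 7, 3
profile = {q3: total_error(coefficients, q1, q2, q3) for q3 in range(2, 12)}
```

Under this reading the error is zero at q3 = q1 when q1 > q2, but it is also trivially zero at q3 = q2, so a zero on its own does not single out q1.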
  • 41.
    First quantization coefficient extraction from double compressed JPEG images
    (Galvan et al., ICIAP 2013)
    What "really" happens in a forgery operation
    Figure: the original image in the "bitstream" format.
  • 42.
    What "really" happens in a forgery operation
    Figure: the original image in the "bitstream" format is decoded into a "displayable" format; this step involves rounding error, truncation error and color conversions (YCrCb to RGB).
    Figure: a malicious forgery applied to the displayable image produces the tampered image.
  • 43.
    What "really" happens in a forgery operation
    Figure: after the malicious forgery, the tampered image goes through color conversions (RGB to YCbCr) and rounding error, producing the tampered image in the "bitstream" format.
    We can summarize the error introduced on the value of each DCT coefficient c by these operations with:
    $c_{DQ} = \mathrm{round}\left(\left(\mathrm{round}\left(\frac{c}{q_1}\right) \times q_1 + e_{rr}\right) \times \frac{1}{q_2}\right)$
    The factor $e_{rr}$, if not properly managed, can limit the effectiveness of every proposed methodology.
  • 44.
    First JPEG quantization matrix estimation based on histogram analysis
    Depending on both the first and the second quantization factor (q1 and q2), we can obtain different histograms for every position in the 8x8 block.
    G. Puglisi, A. R. Bruna, F. Galvan, and S. Battiato, "First JPEG quantization matrix estimation based on histogram analysis," in International Conference on Image Processing (ICIP), 2013.
    The sequence of zero (and non-zero) values of the histogram of a double compressed image IDQ provides useful information for estimating the first quantization factor q1.
  • 45.
    First JPEG quantization matrix estimation based on histogram analysis
    1st step
    • The DCT coefficients are extracted from the double quantized image IDQ;
    • the histogram of the absolute values of the DCT coefficients cfj is computed;
    • the histogram of the DCT coefficients is filtered;
    • a binary vector is computed by considering only the sequence of zero and non-zero values of the filtered histogram.
    2nd step
    • A set of binary representations is then built, one for each q1ifj value, exploiting information coming from the input (double compressed) image IDQ.
    (Puglisi et al., ICIP 2013)
  • 46.
    First JPEG quantization matrix estimation based on histogram analysis
    3rd step
    Based on the similarity between the generated representations and that of IDQ, a set of q1ifj candidates (Cfj) is then selected.
    4th step
    The refined histograms Hq1s, related to the simulated double quantization with q1s ∈ Cfj, are compared with Hreal obtained from IDQ, and the closest one is selected as follows:
    $\hat{q}_{1}^{f_j} = \arg\min_{q_{1s} \in C_{f_j}} \sum_{i=1}^{N} \min\left(max\_diff,\ \left|H_{real}(i) - H_{q_{1s}}(i)\right|\right)$
    where $\hat{q}_{1}^{f_j}$ indicates the estimated q1 value.
    (Puglisi et al., ICIP 2013)
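The selection in the 4th step can be sketched as a clipped histogram distance followed by an arg min. This is a simplified illustration, not the published pipeline: it skips the filtering and the binary-vector pruning of the earlier steps, and `reference` is a hypothetical stand-in for whatever approximation of the unquantized coefficients the method derives from IDQ. The default `max_diff` is an arbitrary assumption.

```python
import random
from collections import Counter

def round_div(a, b):
    """Round a / b to the nearest integer (ties upward), for b > 0, in exact arithmetic."""
    return (2 * a + b) // (2 * b)

def double_quantize(c, q1, q2):
    """Quantized coefficient after a first compression with q1 and a second with q2."""
    return round_div(round_div(c, q1) * q1, q2)

def histogram(values, n_bins):
    """Histogram of absolute values over bins 0 .. n_bins - 1."""
    counts = Counter(abs(v) for v in values)
    return [counts.get(i, 0) for i in range(n_bins)]

def clipped_distance(h_real, h_sim, max_diff):
    """Sum of per-bin absolute differences, each clipped at max_diff."""
    return sum(min(max_diff, abs(a - b)) for a, b in zip(h_real, h_sim))

def estimate_q1(observed, reference, q2, candidates, n_bins=64, max_diff=50):
    """Pick the candidate q1 whose simulated histogram is closest to the observed one."""
    h_real = histogram(observed, n_bins)

    def score(q1s):
        simulated = [double_quantize(c, q1s, q2) for c in reference]
        return clipped_distance(h_real, histogram(simulated, n_bins), max_diff)

    return min(candidates, key=score)

rng = random.Random(1)
reference = [round(rng.expovariate(1 / 20) - rng.expovariate(1 / 20)) for _ in range(5000)]
observed = [double_quantize(c, 9, 4) for c in reference]
estimated_q1 = estimate_q1(observed, reference, q2=4, candidates=range(5, 13))
```

Clipping each bin difference at `max_diff` keeps a single badly matched bin from dominating the comparison.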
  • 47.
    First JPEG quantization matrix estimation based on histogram analysis
    G. Puglisi, A. R. Bruna, F. Galvan, and S. Battiato, “First JPEG quantization matrix estimation based on histogram analysis,” in International Conference on Image Processing (ICIP), 2013.
    A novel improved scheme
    F. Galvan, G. Puglisi, A. R. Bruna, and S. Battiato, “First Quantization Matrix Estimation from Double Compressed JPEG Images,” IEEE Transactions on Information Forensics and Security, 2014.
  • 48.
    DCT Histogram Filtering
    The rounding error e manifests itself as peaks spread around the multiples of the quantization step q and has been modeled as approximately Gaussian noise. These joint phenomena affect the behavior of the second quantization step, hence the magnitude of the DCT coefficients and, consequently, their statistics.
    For these reasons, the filtering strategy must deal with two kinds of noise, the “split noise” and the “residual noise”, with the aim of restoring the histogram to what it would be if the rounding error had no impact.
  • 49.
    Critical case
    • This undesirable situation appears when a bin of the first quantization (i.e., in position mq1) is situated exactly halfway between two consecutive bins coming from the second quantization (i.e., in positions nq2 and (n + 1)q2). Specifically, this effect arises when two consecutive multiples of q2 are related to a generic multiple of q1 as follows:
    $$ n q_2 = m q_1 - \frac{q_2}{2}, \qquad (n+1) q_2 = m q_1 + \frac{q_2}{2} $$
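Taking "exactly halfway" literally, a multiple mq1 is critical when 2·m·q1 = (2n + 1)·q2 for some integer n. The short check below enumerates such cases for integer steps; it is a sketch for illustration, and the function name and the `max_m` bound are hypothetical.

```python
def critical_multiples(q1: int, q2: int, max_m: int = 20) -> list[tuple[int, int]]:
    """Return (m, n) pairs where m*q1 lies exactly halfway between n*q2 and (n+1)*q2."""
    pairs = []
    for m in range(1, max_m + 1):
        doubled = 2 * m * q1                  # doubled value keeps everything in integers
        # Halfway condition: 2*m*q1 == (2*n + 1)*q2, i.e. doubled / q2 is an odd integer
        if doubled % q2 == 0 and (doubled // q2) % 2 == 1:
            n = (doubled // q2 - 1) // 2
            pairs.append((m, n))
    return pairs

example = critical_multiples(3, 2, max_m=4)   # 3 sits between 2 and 4, 9 between 8 and 10
no_case_even = critical_multiples(4, 2, max_m=5)
no_case_odd_q2 = critical_multiples(5, 3, max_m=20)
```

With integer steps the condition can only hold when q2 is even, because (2n + 1)·q2 must equal the even number 2·m·q1.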
  • 50.
    DCT Histogram Filtering
    This module actually provides a set of filtered histograms Hfiltq1i (one for each quantization step q1i ∈ {q1min, q1min + 1, . . . , q1max}).
  • 51.
    Overall Schema
    DCT Histogram Selection
    • The modules presented above provide a series of first quantization candidates that have to be considered for further evaluation.
    • The DCT Histogram Selection step estimates the q1 value by directly exploiting the information related to the histogram values. In order to select the correct first quantization step, we exploit information coming from the original double compressed image IDQ.
  • 52.
    • We start with the extraction of the DCT coefficients cDQ, followed by a rough estimation of the original DCT coefficients obtained through a proper cropping of the double compressed image.
    First JPEG quantization matrix estimation based on histogram analysis
    The approach relies on the following consideration: double JPEG compression modifies the histograms of the DCT coefficients with a function depending on both the first and the second quantization factors (q1 and q2).
    G. Puglisi, A. R. Bruna, F. Galvan, and S. Battiato, “First JPEG quantization matrix estimation based on histogram analysis,” in International Conference on Image Processing (ICIP), 2013.
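The dependence of the histogram on q1 and q2 can be seen with a small simulation. The sketch below is an idealized model, not the procedure of the cited paper: it quantizes with q1, dequantizes, and quantizes again with q2, ignoring the rounding and truncation errors introduced in the pixel domain between the two compressions. The function name and the toy input are hypothetical.

```python
import numpy as np

def double_quantize(coeffs, q1, q2):
    """Idealized double quantization: returns the indices after the second step."""
    c1 = np.round(np.asarray(coeffs, dtype=float) / q1) * q1   # quantize and dequantize with q1
    return np.round(c1 / q2).astype(int)                       # quantize again with q2

# Uniformly spread toy coefficients, first step q1 = 5, second step q2 = 2
coeffs = np.arange(60, dtype=float)
idx = double_quantize(coeffs, q1=5, q2=2)
hist = np.bincount(idx, minlength=31)
empty_bins = np.flatnonzero(hist == 0).tolist()
```

Even though the input is uniform, the resulting histogram shows a periodic pattern of empty and filled bins whose layout depends on the pair (q1, q2).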
  • 53.
    DCT Histogram Selection
    The refined histograms Hq1s, related to the simulated double quantization with q1s ∈ CS, are compared with Hreal obtained from IDQ, and the closest one is selected as follows:
    $$ q_1 = \arg\min_{q_{1s} \in C_S} \sum_{i=1}^{N} \min\left( \mathit{maxdiff},\ \left| H_{real}(i) - H_{q_{1s}}(i) \right| \right) $$
    where N is the number of bins of the histograms and maxdiff is a threshold used to limit the contribution of a single difference in the overall distance computation.
    G. Puglisi, A. R. Bruna, F. Galvan, and S. Battiato, “First JPEG quantization matrix estimation based on histogram analysis,” in International Conference on Image Processing (ICIP), 2013.
    Experiments
    • To assess the performance of the proposed approach, several tests have been conducted considering double compressed JPEG images obtained starting from two different sources, as described below.
    • A dataset of 110 uncompressed images has been collected considering different cameras (Canon D40, Canon D50 and Canon Mark3) with different resolutions.
  • 54.
    Experimental Settings
    A crop of size 1024 × 1024 from the central part of each image has then been selected in order to speed up the tests. Starting from the cropped images, a dataset of double compressed images has been built by applying the JPEG encoding provided by Matlab with the standard JPEG quantization tables proposed by the IJG (Independent JPEG Group), considering quality factors (QF1, QF2) in the range 50 to 100 in steps of 10 (1650 images in total).
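For reference, the way a quality factor is turned into quantization steps in the IJG library (libjpeg) can be sketched as below. It is assumed here, not stated in the slides, that the encoder used in the experiments follows this same convention. The base values are the first row of the standard luminance table, and the function names are hypothetical.

```python
def ijg_scale_factor(quality: int) -> int:
    """libjpeg-style mapping from a quality factor (1..100) to a percentage scale."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality

def scale_table(base, quality):
    """Scale base quantization steps and clamp them to the baseline range 1..255."""
    s = ijg_scale_factor(quality)
    return [max(1, min(255, (b * s + 50) // 100)) for b in base]

# First row of the standard JPEG luminance quantization table
base_row = [16, 11, 10, 16, 24, 40, 51, 61]
row_qf50 = scale_table(base_row, 50)
row_qf90 = scale_table(base_row, 90)
row_qf100 = scale_table(base_row, 100)
```

Under this convention QF = 50 reproduces the base table, higher quality factors give smaller steps, and QF = 100 gives steps all equal to 1.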
  • 55.
    CanonD40D50Mk3 dataset
    UCID v2 dataset: 1338 uncompressed images -> 20070 images
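The reported dataset sizes are consistent with 15 (QF1, QF2) pairs per source image. The check below assumes that only pairs with QF1 < QF2 were generated; this is an inference from the totals, not something stated explicitly in the slides.

```python
from itertools import product

quality_factors = range(50, 101, 10)      # 50, 60, ..., 100

# Assumption: only pairs with QF1 < QF2 are used
pairs = [(qf1, qf2) for qf1, qf2 in product(quality_factors, repeat=2) if qf1 < qf2]

canon_total = 110 * len(pairs)            # Canon dataset: 110 uncompressed images
ucid_total = 1338 * len(pairs)            # UCID v2: 1338 uncompressed images
```

With six quality factors there are 15 such pairs, which gives 1650 and 20070 double compressed images respectively.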
  • 56.
    Conclusions
    In this work we proposed a novel algorithm for the estimation of the first quantization steps from double compressed JPEG images. The proposed approach, combining a filtering strategy and an error function with a good q1-localization property, obtains satisfactory results, outperforming the other state-of-the-art approaches both for low and high frequencies. Future work will be devoted to coping with the case q1 < q2, and to exploiting the proposed approach to recover the overall initial quantization matrix when the double compression is performed with the actual quantization tables used by camera devices and common photo-retouching software (e.g., Photoshop, Gimp, etc.).
    Recent Works
    • Singh, G. & Singh, K. Forensics for partially double compressed doctored JPEG images - Multimed Tools Appl (2016). doi:10.1007/s11042-016-4290-5
    • Abstract. Digital image forensics is required to investigate unethical use of doctored images by recovering the historic information of an image. Most of the cameras compress the image using JPEG standard. When this image is decompressed and recompressed with different quantization matrix, it becomes double compressed. Although in certain cases, e.g. after a cropping attack, the image can be recompressed with the same quantization matrix too. This JPEG double compression becomes an integral part of forgery creation. The detection and analysis of double compression in an image help the investigator to find the authenticity of an image. In this paper, a two-stage technique is proposed to estimate the first quantization matrix or steps from the partial double compressed JPEG images. In the first stage of the proposed approach, the detection of the double compressed region through JPEG ghost technique is extended to the automatic isolation of the doubly compressed part from an image. The second stage analyzes the doubly compressed part to estimate the first quantization matrix or steps. In the latter stage, an optimized filtering scheme is also proposed to cope with the effects of the error. The results of proposed scheme are evaluated by considering partial double compressed images based on the two different datasets. The partial double compressed datasets have not been considered in the previous state-of-the-art approaches. The first stage of the proposed scheme provides an average percentage accuracy of 95.45%. The second stage provides an error less than 1.5% for the first 10 DCT coefficients, hence, outperforming the existing techniques. The experimental results consider the partial double compressed images in which the recompression is done with different quantization matrix.
  • 57.
    Recent Works
    • Bin Li et al. Statistical Model of JPEG Noises and Its Application in Quantization Step Estimation - IEEE Transactions on Image Processing (Volume: 24, Issue: 5, May 2015)
    • Abstract. In this paper, we present a statistical analysis of JPEG noises, including the quantization noise and the rounding noise during a JPEG compression cycle. The JPEG noises in the first compression cycle have been well studied; however, so far less attention has been paid on the statistical model of JPEG noises in higher compression cycles. Our analysis reveals that the noise distributions in higher compression cycles are different from those in the first compression cycle, and they are dependent on the quantization parameters used between two successive cycles. To demonstrate the benefits from the analysis, we apply the statistical model in JPEG quantization step estimation. We construct a sufficient statistic by exploiting the derived noise distributions, and justify that the statistic has several special properties to reveal the ground-truth quantization step. Experimental results demonstrate that the proposed estimator can uncover JPEG compression history with a satisfactory performance.
  • 58.
    Recent Works
    • Bin Li et al. Revealing the Trace of High-Quality JPEG Compression Through Quantization Noise Analysis - IEEE Transactions on Information Forensics and Security (Volume: 10, Issue: 3, March 2015)
    • Abstract. To identify whether an image has been JPEG compressed is an important issue in forensic practice. The state-of-the-art methods fail to identify high-quality compressed images, which are common on the Internet. In this paper, we provide a novel quantization noise-based solution to reveal the traces of JPEG compression. Based on the analysis of noises in multiple-cycle JPEG compression, we define a quantity called forward quantization noise. We analytically derive that a decompressed JPEG image has a lower variance of forward quantization noise than its uncompressed counterpart. With the conclusion, we develop a simple yet very effective detection algorithm to identify decompressed JPEG images. We show that our method outperforms the state-of-the-art methods by a large margin especially for high-quality compressed images through extensive experiments on various sources of images. We also demonstrate that the proposed method is robust to small image size and chroma subsampling. The proposed algorithm can be applied in some practical applications, such as Internet image classification and forgery detection.
    Recent Works
    • Taimori, A., Razzazi, F., Behrad, A. et al. A novel forensic image analysis tool for discovering double JPEG compression clues - Multimed Tools Appl (2017) 76: 7749. doi:10.1007/s11042-016-3409-z
    • Abstract. This paper presents a novel technique to discover double JPEG compression traces. Existing detectors only operate in a scenario that the image under investigation is explicitly available in JPEG format. Consequently, if quantization information of JPEG files is unknown, their performance dramatically degrades. Our method addresses both forensic scenarios which results in a fresh perceptual detection pipeline. We suggest a dimensionality reduction algorithm to visualize behaviors of a big database including various single and double compressed images. Based on intuitions of visualization, three bottom-up, top-down and combined top-down/bottom-up learning strategies are proposed. Our tool discriminates single compressed images from double counterparts, estimates the first quantization in double compression, and localizes tampered regions in a forgery examination. Extensive experiments on three databases demonstrate results are robust among different quality levels. F1-measure improvement to the best state-of-the-art approach reaches up to 26.32 %. An implementation of algorithms is available upon request to fellows.
  • 59.
    Recent Works
    • Fei et al. - MSE period based estimation of first quantization step in double compressed JPEG images - Signal Processing: Image Communication, Volume 57, September 2017, Pages 76–83
    • Abstract. The estimation of the first quantization step in double JPEG compressed images is still a challenging problem, especially when the first quantization step q1 is smaller than the second quantization step q2. In this paper, we present a novel method to estimate q1. By introducing the mean square error (MSE) sequence of ratios among DCT coefficient histogram bins, we formulate the relationship between its periodic fluctuation and q1. And in order to enhance the periodic effect, we propose a strategy to adjust the histogram. Then, based on MSE sequence, several q1 candidates can be obtained. Finally by histogram comparison, the estimated quantization step is selected from the candidates. Experimental results demonstrate that the proposed approach has better overall performance when compared with state of the art methods.
    Recent Works
    • Thanh Hai Thai et al. - JPEG Quantization Step Estimation and Its Applications to Digital Image Forensics - IEEE Transactions on Information Forensics and Security (Volume: 12, Issue: 1, Jan. 2017)
    • Abstract. The goal of this paper is to propose an accurate method for estimating quantization steps from an image that has been previously JPEG-compressed and stored in lossless format. The method is based on the combination of the quantization effect and the statistics of discrete cosine transform (DCT) coefficient characterized by the statistical model that has been proposed in our previous works. The analysis of quantization effect is performed within a mathematical framework, which justifies the relation of local maxima of the number of integer quantized forward coefficients with the true quantization step. From the candidate set of the true quantization step given by the previous analysis, the statistical model of DCT coefficients is used to provide the optimal quantization step candidate. The proposed method can also be exploited to estimate the secondary quantization table in a double-JPEG compressed image stored in lossless format and detect the presence of JPEG compression. Numerical experiments on large image databases with different image sizes and quality factors highlight the high accuracy of the proposed method.
  • 60.
    Main Contacts
    Further info: Image Processing Lab, Università di Catania, www.dmi.unict.it/~iplab
    Email: battiato@dmi.unict.it