Blind Verification of Digital Image Originality:
A Statistical Approach
Babak Mahdian, Radim Nedbal, and Stanislav Saic
Sibelius Seraphini
CSI 445 - Digital Image Forensics
Sibelius Seraphini (CSI 445) Digital Image Forensics 1 / 11
Introduction
Trustworthiness of digital images is essential for many areas
forensic investigation
criminal investigation
journalism
One possible approach to check image integrity
extract image features
compare these features with a reference set
Problem with this approach
reference sets for verifying digital image integrity are typically
collected from unknown environments
Problem addressed in this paper
Given a database of “unguaranteed” images,
how can we identify which images are original from the camera and which
have been modified by software?
Related Work
Active Approaches
data hiding
digital signatures
Blind Methods
image splicing
color filter array interpolation
geometric transformations
cloning
computer graphics generated photos
JPEG compression inconsistencies
fingerprint based
Basic Notation
Digital Images
pixel data
metadata
Camera ID vector ($\overrightarrow{cm}$)
maker
model
Fingerprint vector ($\vec{\theta}$)
quantization table
thumbnail
Reference Data Set - S
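$S \subseteq Cm \times \Theta \times U$
Cm: the set of all camera ID vectors
Θ: the set of all possible fingerprint vectors
U: the set of all user IDs
Each tuple $(\overrightarrow{cm}, \vec{\theta}, u) \in S$ records that
user $u$ has taken the photo
$\overrightarrow{cm}$ is the camera used
$\vec{\theta}$ is the fingerprint vector left in the photo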
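As a minimal sketch of this structure, assuming each image in the reference set contributes one (camera ID, fingerprint, user) tuple, S can be held as a set of records and indexed by (camera ID, fingerprint) pair. The `Record` type, the `users_per_pair` helper and the sample values are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    cm: tuple      # camera ID vector, here assumed to be (maker, model)
    theta: tuple   # fingerprint vector, here an opaque tuple of features
    user: str      # ID of the user who uploaded the photo


def users_per_pair(S):
    """Map each (cm, theta) pair to the set of distinct users seen with it."""
    index = defaultdict(set)
    for record in S:
        index[(record.cm, record.theta)].add(record.user)
    return index


# Hypothetical sample data: two users share one pair, a third user has another.
camera = ("MakerA", "Model1")
S = {
    Record(camera, ("qt-a", "thumb-a"), "u1"),
    Record(camera, ("qt-a", "thumb-a"), "u2"),
    Record(camera, ("qt-b", "thumb-b"), "u3"),
}
index = users_per_pair(S)
```

Indexing by pair keeps the per-pair user count, which the statistical test relies on, a single dictionary lookup.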
A Statistical Approach for Noise Removal
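“Testing” tuple
$t_0 = (\overrightarrow{cm}_0, \vec{\theta}_0)$
Null hypothesis
$H_0$: $\vec{\theta}_0$ cannot be a fingerprint of $\overrightarrow{cm}_0$
Test statistic
$T(\overrightarrow{cm}_0, \vec{\theta}_0) = |\{\, u \mid (\overrightarrow{cm}_0, \vec{\theta}_0, u) \in S \,\}|$
follows a hypergeometric distribution under $H_0$
Rejecting $H_0$
if $T$ is too large, i.e. greater than a chosen threshold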
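One way to make the test concrete is sketched below. The parameterization is an assumption and not necessarily the paper's: N is the number of distinct users in S, K the number of users who appear with camera cm0, n the number who appear with fingerprint θ0, and under H0 the count T is treated as Hypergeometric(N, K, n). A small upper-tail probability P(X ≥ T) then argues for rejecting H0. The helper names and the toy data are hypothetical.

```python
from math import comb


def hypergeom_tail(t, N, K, n):
    """P(X >= t) for X ~ Hypergeometric(population N, K marked, n draws)."""
    lo = max(t, 0, n - (N - K))   # smallest value in the support that is >= t
    hi = min(K, n)                # largest value in the support
    favourable = sum(comb(K, k) * comb(N - K, n - k) for k in range(lo, hi + 1))
    return favourable / comb(N, n)


def count_users(S, cm0, theta0):
    """T(cm0, theta0): number of distinct users u with (cm0, theta0, u) in S."""
    return len({u for (cm, th, u) in S if cm == cm0 and th == theta0})


def p_value(S, cm0, theta0):
    """Upper-tail probability of the observed T under the assumed H0 model."""
    N = len({u for (_, _, u) in S})
    K = len({u for (cm, _, u) in S if cm == cm0})
    n = len({u for (_, th, u) in S if th == theta0})
    return hypergeom_tail(count_users(S, cm0, theta0), N, K, n)


# Toy reference set: (camera, fingerprint, user) tuples with one "noisy" entry.
S = {
    ("CamA", "q1", "u1"), ("CamA", "q1", "u2"), ("CamA", "q1", "u3"),
    ("CamB", "q2", "u4"), ("CamB", "q2", "u5"), ("CamB", "q2", "u6"),
    ("CamA", "q2", "u4"),  # fingerprint q2 seen once with CamA
}
p_supported = p_value(S, "CamA", "q1")   # many users agree on this pair
p_noise = p_value(S, "CamA", "q2")       # only one user reports this pair
```

With this toy data the well-supported pair gets the smaller tail probability; a real reference set would need far more users before any threshold is meaningful.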
Experimental Results
Proposed fingerprints
FMarkers - EXIF Markers
FQTs - luminance and chrominance quantization tables
FThumb - information on the JPEG thumbnail image
Reference Image Data Set
5 million images from Flickr
Ground-truth data - 2,400 images (24 cameras, 100 digital images each)
Fingerprints not in the reference image data set have been removed
The approach worked well for checking the integrity of digital images
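To illustrate what an FQTs-style fingerprint might be built from, the following sketch reads the quantization tables from the DQT segments of a JPEG header using only the Python standard library. The function names, and the choice to use the tables directly as the fingerprint, are assumptions for illustration; the slides do not give the paper's exact feature definition.

```python
import struct


def jpeg_quantization_tables(data: bytes) -> dict:
    """Return {table_id: 64 values in stored (zigzag) order} from DQT segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:            # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xD9, 0xDA):    # EOI or start of scan: header is over
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:            # DQT: one or more tables per segment
            i = 0
            while i < len(segment):
                precision, table_id = segment[i] >> 4, segment[i] & 0x0F
                i += 1
                if precision == 0:    # 8-bit entries
                    values = tuple(segment[i:i + 64])
                    i += 64
                else:                 # 16-bit big-endian entries
                    values = struct.unpack(">64H", segment[i:i + 128])
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables


def fqt_fingerprint(data: bytes) -> tuple:
    """Hypothetical fingerprint: all quantization tables, ordered by table ID."""
    tables = jpeg_quantization_tables(data)
    return tuple(tables[k] for k in sorted(tables))


# Synthetic header-only stream: SOI, one 8-bit DQT table with ID 0, EOI.
sample = b"\xff\xd8\xff\xdb\x00\x43\x00" + bytes(range(1, 65)) + b"\xff\xd9"
fingerprint = fqt_fingerprint(sample)
```

Because the result is a hashable tuple, it can serve directly as the fingerprint component of a reference-set tuple.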
Conclusions
image fingerprints are useful for verifying image originality
the paper provides a statistical approach to handling information noise
in databases of “unguaranteed” images
positive results in identifying original images
Discussions
strengths
can identify which camera a photo was taken with
checks the integrity of a whole database of images instead of just one image
at a time
provides a confidence value for how likely an image is to be original or modified
weaknesses
can only be applied to databases that record which user took each picture
the extracted camera ID vector and image fingerprint could be misleading
cannot handle fingerprints that are not in the reference set
a reference data set with ground-truth information is needed to
validate image originality
improvements
employ another blind verification method to obtain the ground-truth
information
extract the image fingerprint from the pixel data
Questions?