Video Inpainting Detection Using Inconsistencies in Optical Flow
Thesis Committee
Dr. A V Subramanyam (Advisor)
Dr. Pradeep Atrey (External Reviewer)
Dr. Sambuddho Chakravarty (Internal Reviewer)
Shobhita Saxena
M.Tech CSE (MT13015)
Outline
• Research Motivation and Aim
• Related Work and Research Contribution
• Problem Definition
• Proposed Algorithm
• Experimental Results and Comparison Analysis
• Limitations
What is Video Inpainting?
Video inpainting aims at:
• restoring lost parts of videos by reconstructing them from the background information and spatio-temporal details;
• removing or replacing unwanted objects in frames such that no distortion is observed when the video is played as a sequence.
Figure 1: Removal of an unwanted object from a video frame
Video Inpainting as Video Forgery
• Video inpainting can be used as a major forgery tool to make malicious changes to videos, such as object removal.
Figure 2: Video inpainting as a forgery type: (a) source video frames, (b) inpainted video frames
Research Motivation
• Video inpainting forgery fills the inpainted regions of video frames in a visually plausible manner.
• Inpainting forgery is highly sophisticated and harder to detect than other forgery types, which makes it a challenging research problem.
• Very few works have been proposed in the area of video inpainting detection.
• Existing inpainting detection techniques fail to detect the latest state-of-the-art inpainting techniques.
Research Aim
• To study and analyse the optical flow patterns in source and inpainted videos.
• To present a robust technique for effective detection and localization of inpainted regions in a given video sequence.
• To propose a single algorithm that detects multiple inpainting techniques.
Related Work
Hsu et al. [1] proposed an approach to locate forged regions in an inpainted video using correlation of noise residue.
Zhang et al. [2] perform inpainting forgery detection based on ghost shadow artifacts.
Das et al. [3] proposed a blind detection method based on a zero-connectivity feature and a fuzzy membership function to detect video inpainting forgery.
Lin et al. [4] proposed a spatio-temporal coherence based approach for video inpainting detection and localization.
Research Contribution
• Proposes an approach that investigates video inpainting forgery, which has not been studied much in previous works.
• Proposes a novel approach to detect and temporally localize inpainted regions in a given video sequence.
• The proposed algorithm performs well in detecting and localizing popular and effective state-of-the-art inpainting techniques on which other inpainting detection algorithms fail.
Problem Definition
• Given an input video sequence VN having N frames, determine whether it is an authentic or an inpainted video.
• For a video that is classified as inpainted, perform temporal localization of the inpainted regions.
Targeted Inpainting Techniques
• The algorithm targets two variants of Temporal Copy Paste (TCP) inpainting:
  • Conventional TCP – one of the first and most popular inpainting techniques, proposed by Patwardhan et al. [5] in 2007. It handles relatively simple motion types in videos.
  • Complex TCP – one of the state-of-the-art inpainting techniques, proposed by Newson et al. [6] in 2014. It handles relatively complex motions in videos.
Algorithm Design
Optical Flow
• Optical flow (Of) characterizes the motion of every pixel in one image to its corresponding location in the next image [7].
• It performs motion estimation between video frames.
• It is best applied to video frames, as these are a sequence of time-ordered images.
Figure 3: (a, b) Mouth regions of two consecutive images of a person speaking. (c) Flow field estimated using optical flow.
* Picture from Khurram Hassan-Shafique, CAP5415 Computer Vision, 2003
Figure 4: Two frames of a video sequence
Figure 5: Optical flow computation between two images
Figure 6: Colour scheme used to represent the orientation and magnitude of optical flow
Optical Flow Computation - Procedure
V : video under consideration, having N frames
m : number of optical flow matrices computed for each frame
N-m : number of frames chosen for optical flow computation
Of : optical flow matrix computed between frames as Of(n,n+1), Of(n,n+2), Of(n,n+3), ..., Of(n,n+m), where n is a particular frame
(N-m)*m : total number of optical flow matrices generated for a video sequence (m for each of the N-m chosen frames)
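The pairing scheme above can be sketched in a few lines of Python. This is only an illustration of which frame pairs are visited, not the thesis implementation. The function name `flow_frame_pairs` and the 1-based frame indexing are assumptions, and the optical flow estimator itself (for example the method of [7]) is not shown.

```python
def flow_frame_pairs(num_frames, m):
    """Return the (n, n + k) frame pairs for which optical flow is computed.

    Assumption: frames are numbered 1..num_frames, and the first N - m frames
    are the ones chosen, so that n + m never runs past the last frame.
    """
    if not 0 < m < num_frames:
        raise ValueError("m must satisfy 0 < m < num_frames")
    pairs = []
    for n in range(1, num_frames - m + 1):  # the N - m chosen frames
        for k in range(1, m + 1):           # Of(n, n+1) ... Of(n, n+m)
            pairs.append((n, n + k))
    return pairs


pairs = flow_frame_pairs(num_frames=10, m=3)
print(len(pairs))  # (N - m) * m pairs in total
```

Each returned pair would then be passed to whichever optical flow routine is in use, giving one flow matrix per pair.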
Optical Flow – Results
Figure 7: (a) Source video frames (b) Optical flow of source frames
Figure 8: (a) Inpainted video frames (b) Optical flow of inpainted video frames
Chi-Square Distance Computation - Procedure
H : histogram vector computed for each optical flow matrix
chisq : chi-square distance computed for comparing histograms
(N-m)*(m-1) : number of chi-square values produced
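A minimal sketch of the histogram comparison follows, under stated assumptions. The slides do not give the exact chi-square formula or the binning, so the symmetric form 0.5 * sum((a - b)^2 / (a + b)) and the equal-width bins used here are assumptions. Comparing consecutive histograms of the same frame is likewise an assumed reading of the (m-1) values per frame. All function names are hypothetical.

```python
def histogram(values, bins, lo, hi):
    """Count values into equal-width bins over [lo, hi], normalised to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range values to the end bins
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def chi_square_distance(h1, h2):
    """Symmetric chi-square distance (assumed form); empty bin pairs are skipped."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def consecutive_distances(histograms):
    """Compare each histogram with the next one: m histograms give m - 1 values."""
    return [chi_square_distance(a, b) for a, b in zip(histograms, histograms[1:])]


hists = [histogram(vals, bins=2, lo=0.0, hi=1.0)
         for vals in ([0.1, 0.2, 0.9], [0.1, 0.8, 0.9], [0.7, 0.8, 0.9])]
print(consecutive_distances(hists))
```

With normalised histograms this distance lies between 0 (identical) and 1 (no overlapping bins).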
Gaussian Mixture Curve Fitting
• Applied to the normalized chi-square values.
• Measures goodness of fit for authentic and inpainted videos in terms of Root Mean Square Error (RMSE) values.
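The goodness-of-fit measure can be illustrated as follows. The sketch evaluates a one-dimensional Gaussian mixture curve and computes the RMSE between that curve and observed values. The fitting step that estimates the mixture parameters is not shown, and the parameter values, sample points and function names are hypothetical placeholders, not values from the thesis.

```python
import math


def gaussian_mixture_pdf(x, weights, means, sigmas):
    """Evaluate a 1-D Gaussian mixture density at x."""
    total = 0.0
    for w, mu, sigma in zip(weights, means, sigmas):
        z = (x - mu) / sigma
        total += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


def rmse(observed, fitted):
    """Root mean square error between observed values and fitted curve values."""
    if not observed or len(observed) != len(fitted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return math.sqrt(squared / len(observed))


# Hypothetical example: compare observed densities with a two-component mixture.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
observed = [0.9, 1.4, 1.1, 0.6, 0.3]
fitted = [gaussian_mixture_pdf(x, [0.6, 0.4], [0.25, 0.6], [0.2, 0.3]) for x in xs]
print(rmse(observed, fitted))
```

A lower RMSE means the mixture curve follows the observed chi-square values more closely.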
GM Distribution Fit to Chi-Square Values
Figure 9: Source and inpainted GM distributions
Markov Chain – Feature Extraction
• Each video VN is divided into VN / k frame sets.
• The optical flow Of of each frame set is modelled as a first-order spatial Markov chain.
• Values in Of are rounded off to the nearest integer to obtain integer-valued states, and are then truncated to the range –Tr to +Tr before the transition probabilities are extracted.
• Number of states needed to model a Markov chain = (2Tr+1)
• For each matrix, number of transition probabilities = (2Tr+1) * (2Tr+1)
• The transition probability matrix (TPM) holds the probability of moving from state u to state v between neighbouring positions along one direction, where u, v ∈ [-Tr, Tr] and u, v ∈ Z. Probabilities for the other directions can be estimated similarly.
• SVM classification is performed on the TPMs obtained above.
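The feature extraction step can be sketched as below for the horizontal direction. The estimate used is the standard one for a first-order Markov chain (count of u followed by v, divided by the count of u). The slides' own equation is not reproduced here, so treat this as an assumed formulation. The function names are hypothetical, and `flow` stands for a single component of an optical flow matrix given as a list of rows.

```python
def quantise(flow, tr):
    """Round each flow value to the nearest integer and clip it to [-tr, tr].

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [[max(-tr, min(tr, int(round(v)))) for v in row] for row in flow]


def horizontal_tpm(flow, tr):
    """Estimate the (2*tr + 1) x (2*tr + 1) horizontal transition probability matrix."""
    states = 2 * tr + 1
    counts = [[0] * states for _ in range(states)]
    for row in quantise(flow, tr):
        for u, v in zip(row, row[1:]):   # left neighbour u, right neighbour v
            counts[u + tr][v + tr] += 1  # shift states from [-tr, tr] to [0, 2*tr]
    tpm = []
    for row in counts:
        total = sum(row)
        # States that never occur keep an all-zero row.
        tpm.append([c / total if total else 0.0 for c in row])
    return tpm


flow = [[0.2, 1.1, 0.9],
        [0.0, 0.9, 5.0]]
for row in horizontal_tpm(flow, tr=1):
    print(row)
```

Flattening the matrix gives a feature vector of (2Tr+1) * (2Tr+1) values per direction, which is the kind of input an SVM classifier would take.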
Dataset
• Experiments have been conducted on test videos of two inpainting techniques: complex TCP and conventional TCP.
Table 1: Conventional TCP inpainting dataset
Table 2: Complex TCP inpainting dataset
• The RMS value threshold τ is empirically set to 3.5.
Figure 10: RMS value based classification for complex TCP inpainting
Figure 11: RMS value based classification for conventional TCP inpainting
Performance Evaluation
• Performance has been measured by Precision, Recall and Accuracy.
• Precision (P) = TP/(TP+FP)
• Recall (R) = TP/(TP+FN)
• Accuracy (A) = (TP+TN)/(TP+TN+FP+FN)
TP : True Positive
TN : True Negative
FN : False Negative
FP : False Positive
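The three measures translate directly into code. The counts in the example are made up for illustration and are not results from the thesis. Returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Illustrative counts only.
tp, tn, fp, fn = 8, 9, 1, 2
print(precision(tp, fp), recall(tp, fn), accuracy(tp, tn, fp, fn))
```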
Results - Video Inpainting Detection
Table 3: Classification results for complex TCP
Table 4: Classification results for conventional TCP
Table 5: Video inpainting detection results
Results - Video Inpainting Localization
Table 6: Classification results for complex TCP
Table 7: Classification results for conventional TCP
Table 8: Video inpainting localization results
Comparison
• The proposed approach is compared with the spatio-temporal coherence based technique proposed by Lin et al. [4] for inpainting detection and localization.
• The spatio-temporal coherence based approach fails on the complex TCP inpainting dataset.
Figure 12: Spatio-temporal approach result on the conventional TCP dataset
Figure 13: Spatio-temporal approach result on the complex TCP dataset
Limitations
• Only static-camera videos (no camera motion) are considered.
• Removal of multiple objects is not considered in the complex inpainting dataset.
• Spatial localization of inpainted regions is not performed.
• The dataset is small.
References
1) Chih-Chung Hsu, Tzu-Yi Hung, Chia-Wen Lin, and Chiou-Ting Hsu. Video forgery detection using correlation of noise residue. In Multimedia Signal Processing, 2008 IEEE 10th Workshop on, pages 170–174. IEEE, 2008.
2) Jing Zhang, Yuting Su, and Mingyu Zhang. Exposing digital video forgery by ghost shadow artifact. In Proceedings of the First ACM Workshop on Multimedia in Forensics, pages 49–54. ACM, 2009.
3) Sreelekshmi Das, Gopu Darsan, Shreyas L, and Divya Devan. Blind detection method for video inpainting forgery.
4) Cheng-Shian Lin and Jyh-Jong Tsay. A passive approach for effective detection and localization of region-level video forgery with spatio-temporal coherence analysis. Digital Investigation, 11(2):120–140, 2014.
5) Kedar Patwardhan, Guillermo Sapiro, Marcelo Bertalmio, et al. Video inpainting under constrained camera motion. Image Processing, IEEE Transactions on, 16(2):545–553, 2007.
6) Alasdair Newson, Andrés Almansa, Matthieu Fradet, Yann Gousseau, and Patrick Pérez. Video inpainting of complex scenes. 2015.
7) Thomas Brox, Andrés Bruhn, Nils Papenberg, and Joachim Weickert. High accuracy optical flow estimation based on a theory for warping. In Computer Vision - ECCV 2004, pages 25–36. Springer, 2004.
