A review on video inpainting techniques



International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 1, January-February (2013), pp. 203-210, © IAEME. www.iaeme.com/ijcet.asp. Journal Impact Factor (2012): 3.9580 (calculated by GISI), www.jifactor.com

A REVIEW ON VIDEO INPAINTING TECHNIQUES

Mrs. B. A. Ahire*1, Prof. Neeta A. Deshpande*2
*1 M.E. Student, Department of Computer Engineering, Matoshri College of Engg. and Research Centre, Nasik; bhawanaahire@yahoo.com
*2 Associate Professor, Department of Computer Engineering, Matoshri College of Engg. and Research Centre, Nasik; deshpande_neeta@yahoo.com

ABSTRACT

Video completion aims to reconstruct the missing pixels in holes created by damage to a video or by the removal of selected objects. It is critical to many applications, such as video repair and movie post-production. The key issues in video completion are maintaining spatio-temporal coherence and inferring the missing pixels faithfully. Many researchers have worked in the area of video inpainting; most techniques ensure either spatial consistency or temporal continuity between frames, but none ensures both at high quality in a single technique. Although the amount of work on video completion is smaller than that on image inpainting, a number of methods have been proposed in recent years. They can be classified as patch-based methods and object-based methods. Patch-based methods use block-based sampling together with simultaneous propagation of texture and structure information, which makes them computationally efficient; researchers have therefore extended the same concept to video inpainting.
Texture synthesis based methods do not capture structural information, while PDE-based methods lead to blurring artifacts. Patch-based methods produce high-quality results while maintaining the consistency of local structures. This paper presents a survey of the area of video inpainting.

Keywords: video inpainting, texture synthesis, patch-based inpainting
I. INTRODUCTION

Inpainting is the process of reconstructing lost or deteriorated parts of images and videos. The idea of image inpainting dates back many years, and since the birth of computer vision researchers have looked for ways to carry out this process automatically. By applying various techniques they have achieved promising results, even when the images contain complicated objects. Subsequently, video inpainting has also attracted a large number of researchers because of its ability to fix or restore damaged videos. Video inpainting is the process of filling the missing or damaged parts of a video clip with visually plausible data, so that viewers cannot tell whether the clip has been automatically generated. Compared with image inpainting, video inpainting involves a far larger number of pixels to inpaint and a much larger search space. Moreover, not only spatial consistency but also temporal continuity between frames must be ensured [12]. Applying image inpainting techniques directly to video without taking temporal factors into account will ultimately fail, because it makes the frames inconsistent with each other. These difficulties make video inpainting a much more challenging problem than image inpainting. Depending on the way the damaged images are restored, the techniques are classified into three groups: texture synthesis based methods, partial differential equation (PDE) based methods, and patch-based methods. The texture synthesis process grows a new image outward from an initial seed, one pixel at a time.
In PDE-based approaches, by contrast, the gradient direction and the grey-scale values are propagated from the boundary of a missing region towards its center. Neither family can handle all cases of general image inpainting. The rest of the paper is organized as follows. Section II-A surveys PDE-based methods, Section II-B concentrates on texture synthesis methods, and Section II-C describes patch-based methods. Section III explains object-based methods, and Section IV contains concluding remarks. All the methods are summarized in Table 1. A block diagram of the video inpainting process is shown in Fig. 1.

FIGURE 1 Block diagram of the video inpainting technique
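As a concrete illustration of the PDE-style propagation described above, the sketch below (our own simplification, not code from any of the surveyed papers) fills a hole by iterating an isotropic diffusion, so grey-scale values flow inward from the hole boundary; the actual methods surveyed in Section II-A transport information along isophotes rather than isotropically:

```python
import numpy as np

def diffusion_fill(image, mask, iters=500):
    """Fill mask==True pixels by iterating the heat equation.

    A deliberately simplified stand-in for PDE inpainting: values
    diffuse inward from the hole boundary until a smooth steady
    state is reached. Isophote-driven methods replace this
    isotropic Laplacian flow with transport along edges.
    Assumes the hole lies in the interior of the image.
    """
    img = image.astype(float).copy()
    img[mask] = img[~mask].mean()          # rough initial guess inside the hole
    for _ in range(iters):
        # 4-neighbour Laplacian via shifted copies
        lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
               np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
        img[mask] += 0.2 * lap[mask]       # update only inside the hole
    return img
```

On a smooth ramp image this recovers the hole almost exactly, since the steady state of the diffusion with the ramp as boundary condition is the ramp itself; on textured regions it blurs, which is exactly the weakness of PDE methods noted in the abstract.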
II. A REVIEW OF VIDEO INPAINTING TECHNIQUES

The problem of automatic video restoration in general, and automatic object removal and modification in particular, is beginning to attract the attention of many researchers. According to the way the images and videos are restored, the methods for video inpainting are reviewed in the following subsections.

A. PDE-BASED METHODS

G. Sapiro et al. proposed a frame-by-frame PDE-based approach that extended image inpainting techniques to video sequences [1]. The first step in this method is the formulation of a partial differential equation that propagates information (the Laplacian of the image) in the direction of the isophotes (edges). The proposed algorithm shows that both the gradient direction (geometry) and the grey-scale values (photometry) of the image should be propagated inside the region to be filled in, making clear the need for high-order PDEs in image processing. The algorithm propagates the necessary information by numerically solving the PDE

    ∂I/∂t = ∇⊥I · ∇(ΔI)

for the image intensity I inside the hole (the region to be inpainted). Here ∇⊥I denotes the perpendicular gradient (−∂I/∂y, ∂I/∂x) and Δ is the Laplace operator. The goal is to evolve this equation to a steady-state solution, where

    ∇⊥I · ∇(ΔI) = 0,

thereby ensuring that the information is constant in the direction of the isophotes. Such methods are capable of completing damaged images in which thin regions are missing.

B. TEXTURE SYNTHESIS METHODS

Alexei Efros and Thomas K. Leung proposed a texture synthesis technique based on non-parametric sampling in Sept 1999 [2]. In this method a window size W needs to be specified. It preserves as many local structures as possible and produces good results for a wide variety of synthetic and real-world textures.
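A minimal sketch of this non-parametric sampling step, assuming greyscale textures and a deterministic best match (Efros and Leung instead sample randomly among near-best windows), might look as follows; all names here are our own:

```python
import numpy as np

def synthesize_pixel(sample, neighborhood, known):
    """Pick a value for one unknown pixel by non-parametric sampling.

    `neighborhood` is a (W, W) window around the target pixel and
    `known` a boolean mask of its already-filled entries. Every
    (W, W) window of the `sample` texture is scored by SSD over
    the known entries, and the centre of the best-matching window
    is returned. Taking the argmin keeps the sketch deterministic.
    """
    W = neighborhood.shape[0]
    best, best_err = None, np.inf
    for i in range(sample.shape[0] - W + 1):
        for j in range(sample.shape[1] - W + 1):
            win = sample[i:i + W, j:j + W]
            err = ((win - neighborhood)[known] ** 2).sum()
            if err < best_err:
                best, best_err = win[W // 2, W // 2], err
    return best
```

Synthesis grows the texture outward one pixel at a time by calling this for each boundary pixel of the hole, which is precisely why the method is slow: every pixel triggers a full scan of the sample texture.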
However, automatic window-size selection for a given texture remains an open problem, and the method is slow.

M. Bertalmio et al. proposed an image inpainting technique in Dec 2001 that fills in part of an image or video using information from the surrounding area [3]. Applications include the restoration of damaged photographs and movies and the removal of selected objects. The work introduces a class of automated methods for digital inpainting that uses ideas from classical fluid dynamics to propagate isophote lines continuously from the exterior into the region to be inpainted. The main idea is to think of the image intensity as a 'stream function' for a two-dimensional incompressible flow. The Laplacian of the image intensity plays the role of the vorticity of the fluid; it is transported into the region to be inpainted by a vector field defined by the stream function. The resulting
algorithm is designed to continue isophotes while matching gradient vectors at the boundary of the inpainting region. The method is directly based on the Navier-Stokes equations for fluid dynamics, which has the immediate advantage of well-developed theoretical and numerical results. It is a new approach to introducing ideas from computational fluid dynamics into problems in computer vision and image analysis. A direction for future work is a video inpainting technique that automatically switches between structure and texture inpainting when required.

Y. T. Jia et al. proposed a video inpainting technique in Aug 2005 that automatically fills the space-time holes left in video sequences by the removal of unwanted objects from a scene [5]. They solved the problem using texture synthesis, filling a hole inwards by iterating three steps: select the most promising target pixel at the edge of the hole; find the source fragment most similar to the known part of the target's neighborhood; and merge the source and target fragments to complete the target neighborhood, reducing the size of the hole. Earlier methods were slow, because they searched the whole video for source fragments or completed holes pixel by pixel, and they produced blurred results due to sampling and smoothing. For speed, this method tracks moving objects, which allows a much smaller search space when seeking source fragments; it also completes holes fragment by fragment instead of pixel-wise. Fine details are maintained by a graph-cut algorithm when merging source and target fragments, and further techniques ensure temporal consistency of hole filling over successive frames.
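The three-step loop above can be sketched as follows. This is a simplified single-image illustration under our own assumptions (no tracking, no graph-cut merge, hole away from the image border), intended only to show the greedy target/source/merge structure:

```python
import numpy as np

def exemplar_fill(img, mask, W=3):
    """Greedy exemplar-based hole filling.

    Repeats three steps until the hole is gone: (1) pick the hole
    pixel whose (W, W) window has the most known pixels, (2) find
    the most similar fully-known source window by SSD over the
    known pixels, (3) copy the source's values into the unknown
    pixels and shrink the hole. A graph-cut merge would blend the
    fragments instead of overwriting them.
    """
    img, mask = img.astype(float).copy(), mask.copy()
    r, (H, Wd) = W // 2, img.shape
    while mask.any():
        # (1) target: interior hole pixel with the most known neighbours
        best_t, best_known = None, -1
        for y, x in zip(*np.nonzero(mask)):
            if r <= y < H - r and r <= x < Wd - r:
                k = (~mask[y - r:y + r + 1, x - r:x + r + 1]).sum()
                if k > best_known:
                    best_t, best_known = (y, x), k
        y, x = best_t
        tgt = img[y - r:y + r + 1, x - r:x + r + 1]
        known = ~mask[y - r:y + r + 1, x - r:x + r + 1]
        # (2) source: best fully-known window by SSD on known pixels
        best_s, best_err = None, np.inf
        for i in range(H - W + 1):
            for j in range(Wd - W + 1):
                if mask[i:i + W, j:j + W].any():
                    continue
                err = ((img[i:i + W, j:j + W] - tgt)[known] ** 2).sum()
                if err < best_err:
                    best_s, best_err = (i, j), err
        i, j = best_s
        # (3) merge: copy source values into the unknown pixels
        tgt[~known] = img[i:i + W, j:j + W][~known]
        mask[y - r:y + r + 1, x - r:x + r + 1] = False
    return img
```

Filling fragment by fragment, as here, rather than pixel by pixel is what gives the surveyed method its speed advantage over per-pixel texture synthesis.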
The authors wish to extend the work to more complicated and dynamic scenes involving, for example, complex camera and object motions in three dimensions.

C. PATCH-BASED METHODS

A. Criminisi et al. [4] proposed an exemplar-based texture synthesis method for the simultaneous propagation of structure and texture information in Sept 2004. Computational efficiency is achieved by a block-based sampling process, and robustness with respect to the shape of the manually selected target region is also demonstrated. This work needs to be extended to handle the accurate propagation of curved structures in still photographs and the removal of objects from video.

Y. Shen et al. proposed a novel technique to fill in the missing background and moving foreground of a video captured by a static or moving camera in Aug 2006. Unlike previous efforts, which typically process the 3D data volume directly, they slice the volume along the motion manifold of the moving object and therefore reduce the search space from 3D to 2D while still preserving spatial and temporal coherence [7]. In addition to its computational efficiency, the approach, which is based on geometric video analysis, is able to handle real videos under perspective distortion, as well as common camera motions such as panning,
tilting, and zooming. The technique needs to be extended to more general camera and foreground motion.

K. A. Patwardhan et al. [8] proposed a framework in Feb 2007 for inpainting missing parts of a video sequence recorded with a moving or stationary camera. The region to be inpainted is general: it may be still or moving, in the background or in the foreground, and it may occlude one object and be occluded by another. The algorithm consists of a simple preprocessing stage and two steps of video inpainting. The framework has several advantages over state-of-the-art algorithms that deal with similar types of data and constraints: it permits some camera motion, is simple to implement, is fast, does not require statistical models of background or foreground, and works well in the presence of rich and cluttered backgrounds, and the results show no visible blurring or motion artifacts. The algorithm does not address complete occlusion of the moving object, and the technique needs to be extended to such scenarios. Also to be addressed are the automated selection of parameters (such as patch size and mosaic size) and dealing with illumination changes along the sequence.

Y. Wexler et al. [9] came up with a new framework in March 2007 for the completion of missing information based on local structures. It poses the task of completion as a global optimization problem with a well-defined objective function and derives a new algorithm to optimize it. In this technique only low-resolution videos can be considered, and the multi-scale nature of the solution may lead to blurred results due to sampling and smoothing.

T. Shih et al. [10] designed a technique in March 2009 that automatically restores or completes removed areas in an image.
When dealing with a similar problem in video, not only must a robust tracking algorithm be used, but the temporal continuity among video frames also needs to be taken into account, especially when the video has camera motions such as zooming and tilting. In this method, an exemplar-based image inpainting algorithm is extended to video by incorporating an improved patch matching strategy. Different motion segments with different temporal continuity call for different candidate patches, which are used to inpaint holes after a selected video object is tracked and removed. The algorithm produces very few "ghost shadows," which are produced by most image inpainting algorithms when applied directly to video. Shadows caused by fixed light sources can be removed by other techniques; alternatively, it is possible to enlarge the target to some extent so that the shadow is covered. The remaining challenge is block matching, which should allow a block to match another block that is scaled, rotated, or skewed; however, the degree of scaling and rotation is hard to predict from the speed of zooming and rotation of the camera. In addition, how to select continuous blocks in a continuous area to inpaint a target region is another challenging issue.
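The candidate-patch selection underlying such patch matching can be sketched as below. This simply ranks candidate patches, e.g. gathered from frames in a compatible motion segment, by SSD over the target's known pixels; it is an illustration under our own assumptions, not the matching strategy of [10]:

```python
import numpy as np

def best_candidate(target, known, candidates):
    """Rank candidate patches by SSD over the target's known pixels.

    `target` is a partially known patch, `known` its boolean mask,
    and `candidates` a list of fully known patches of the same
    shape, e.g. drawn from several frames of one motion segment.
    Returns the index of the best-matching candidate.
    """
    errs = [((c - target)[known] ** 2).sum() for c in candidates]
    return int(np.argmin(errs))
```

Restricting `candidates` to frames with compatible motion is what keeps temporally inconsistent patches, and hence ghost shadows, out of the filled region.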
TABLE 1 Summary of Video Inpainting Techniques

1. G. Sapiro et al., 2000. Technique: PDE-based method. Features: propagates information from the boundary of the region to its center. Future scope: only capable of completing damaged images in which thin regions are missing.

2. Alexei Efros & Thomas K. Leung, Sept 1999. Technique: texture synthesis by non-parametric sampling. Features: preserves as many local structures as possible and produces good results for a wide variety of synthetic and real-world textures; the window size W must be specified. Future scope: (1) automatic window-size selection for texture sampling; (2) the method is slow.

3. M. Bertalmio et al., Dec 2001. Technique: texture synthesis with the image intensity as a stream function. Features: no other information is required; only the region to be inpainted need be marked by the user. Future scope: a video inpainting technique based on this framework that automatically switches between texture and structure inpainting.

4. A. Criminisi et al., Sept 2004. Technique: texture synthesis method. Features: a novel, efficient, and visually pleasing approach to inpainting; temporal continuity is preserved. Future scope: extension to more complicated and dynamic scenes, including complex camera and object motion in three dimensions.

5. Y. T. Jia et al., Aug 2005. Technique: texture synthesis method. Features: a novel, efficient, and visually pleasing approach to video inpainting; temporal continuity is preserved. Future scope: extension to more complicated and dynamic scenes, including complex camera and object motion in three dimensions.

6. J. Jia et al., May 2006. Technique: object-based approach. Features: works for a subclass of camera motions, i.e. rotation about a fixed point; the restored video preserves the same structure and illumination; temporal consistency is preserved. Future scope: (1) synthesized objects do not have a real trajectory, and only textures are allowed in the background; (2) running time needs to be improved.

7. Y. Shen et al., Aug 2006. Technique: patch-based method. Features: a novel technique to fill in the missing background and moving foreground of a video; spatial and temporal coherence is achieved, and periodic motion patterns are well maintained. Future scope: more general camera and foreground motion needs to be considered.

8. K. A. Patwardhan et al., Feb 2007. Technique: patch-based method (parameters: patch size, mosaic size). Features: combines motion information; performance is improved as the search space is reduced; fast and simple. Future scope: (1) does not address complete occlusion of the moving object; (2) automatic selection of parameters such as patch size and mosaic size; (3) lack of temporal continuity leads to flickering artifacts.

9. Y. Wexler et al., Mar 2007. Technique: patch-based method. Features: space-time completion of large "holes" in video sequences of complex dynamic scenes. Future scope: (1) only low-resolution videos can be considered; (2) the multi-scale nature of the solution may lead to blurred results due to sampling and smoothing.

10. T. Shih et al., Mar 2009. Technique: patch-based inpainting. Features: deals with different camera motions, as well as panning and zooming, for several types of video clips. Future scope: shadows caused by fixed light sources need to be removed; block matching remains a challenge.

11. S.-C. S. Cheung et al., Oct 2006. Technique: object-based inpainting. Features: an efficient object-based video inpainting technique for videos recorded by a stationary camera; a fixed-size sliding window is defined to include a set of continuous object templates, and a similarity function measures the similarity between two sets of continuous object templates. Future scope: (1) if the number of postures in the database is not sufficient, the inpainting result can be unsatisfactory; (2) the method does not provide a systematic way to identify a good filling position for an object template, which may cause visually annoying artifacts if the chosen position is inappropriate.

12. C. H. Ling et al., Apr 2011. Technique: object-based inpainting. Features: a novel framework for video completion that reduces the problem of insufficient postures; maintains spatial consistency as well as temporal continuity of an object simultaneously. Future scope: (1) non-linearity of the occluded object; (2) the variable illumination problem; (3) synthesizing complex postures.
III. OBJECT-BASED METHODS

J. Jia et al. [6] proposed a complete system in May 2006 that is capable of synthesizing a large number of pixels missing due to occlusion or damage in an uncalibrated input video. These missing pixels may correspond to the static background or to cyclic motions of the captured scene. The system employs user-assisted video layer segmentation, while the main processing in video repair is fully automatic. The input video is first decomposed into color and illumination videos. The necessary temporal consistency is maintained by tensor voting in the spatio-temporal domain. Missing colors and illumination of the background are synthesized by applying image repairing. Finally, the occluded motions are inferred by spatio-temporal alignment of collected samples at multiple scales. Since a movel does not capture self-shadows or moving shadows, the shadow of a damaged movel cannot be repaired. Another limitation is incorrect lighting on a repaired movel: currently, the technique does not relight repaired movels. In the future, the method needs to be extended so that better lighting and shadows on repaired movels can be handled, and the running time of the system needs to be improved.

Cheung et al. [11] proposed an efficient object-based video inpainting technique in Oct 2006 for dealing with videos recorded by a stationary camera. To inpaint the background, they use the background pixels most compatible with the current frame to fill a missing region; to inpaint the foreground, they utilize all available object templates. A fixed-size sliding window is defined to include a set of continuous object templates, and the authors also propose a similarity function that measures the similarity between two sets of continuous object templates.
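One plausible form of such a similarity function, sketched here under our own assumptions (binary silhouette templates, averaged per-frame overlap), is shown below; the exact function proposed in [11] differs:

```python
import numpy as np

def window_similarity(win_a, win_b):
    """Similarity between two equal-length windows of object templates.

    Each aligned pair of templates (binary silhouette arrays) is
    scored by intersection-over-union and the scores are averaged,
    so 1.0 means identical posture sequences. A hypothetical
    stand-in for the similarity function of [11].
    """
    scores = []
    for a, b in zip(win_a, win_b):
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        scores.append(inter / union if union else 1.0)
    return float(np.mean(scores))
```

Comparing whole windows of templates, rather than single templates, is what lets the method prefer candidate postures whose motion is continuous with the neighbouring frames.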
For each missing object, a sliding window that covers the missing object and its neighboring objects' templates is used to find the most similar object template, which is then used to replace the missing object. However, if the number of postures in the database is not sufficient, the inpainting result can be unsatisfactory. Moreover, the method does not provide a systematic way to identify a good filling position for an object template, which may cause visually annoying artifacts if the chosen position is inappropriate.

Chih-Hung Ling et al. came up with a novel framework for object completion in a video [12]. To complete an occluded object, the method first samples a 3-D volume of the video into directional spatio-temporal slices and performs patch-based image inpainting to complete the partially damaged object trajectories in the 2-D slices. The completed slices are then combined to obtain a sequence of virtual contours of the damaged object. Next, a posture sequence retrieval technique is applied to the virtual contours to retrieve the most similar sequence of object postures from the available non-occluded postures. Key-posture selection and indexing are used to reduce the complexity of posture sequence retrieval. The method also includes a synthetic posture generation scheme that enriches the collection of postures so as to reduce the effect of insufficient postures.

IV. CONCLUSION

Patch-based methods often have difficulty handling spatial consistency and temporal continuity together: earlier approaches could maintain only spatial consistency or only temporal continuity, not both simultaneously. Approaches that do deal with spatial and temporal information simultaneously suffer from over-smoothing artifacts. In addition, patch-based approaches often generate inpainting errors in the foreground.
As a result, many researchers have focused on object-based approaches, which usually generate high-quality visual results. Even so, some difficult issues still need to be addressed, for example the unrealistic-trajectory problem and the inaccurate-representation problem caused by an insufficient number of postures in the database.
REFERENCES

[1] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, "Image inpainting," in Proc. ACM SIGGRAPH, 2000, pp. 417-424.
[2] A. Efros and T. Leung, "Texture synthesis by non-parametric sampling," in Proc. IEEE Conf. Comput. Vis., 1999, vol. 2, pp. 1033-1038.
[3] M. Bertalmio, A. L. Bertozzi, and G. Sapiro, "Navier-Stokes, fluid dynamics, and image and video inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Kauai, HI, Dec. 2001, pp. 355-362.
[4] A. Criminisi, P. Perez, and K. Toyama, "Region filling and object removal by exemplar-based image inpainting," IEEE Trans. Image Process., vol. 13, no. 9, pp. 1200-1212, Sep. 2004.
[5] Y. T. Jia, S. M. Hu, and R. R. Martin, "Video completion using tracking and fragment merging," Visual Comput., vol. 21, no. 8-10, pp. 601-610, Aug. 2005.
[6] J. Jia, Y.-W. Tai, T.-P. Wu, and C.-K. Tang, "Video repairing under variable illumination using cyclic motions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 832-839, May 2006.
[7] Y. Shen, F. Lu, X. Cao, and H. Foroosh, "Video completion for perspective camera under constrained motion," in Proc. IEEE Conf. Pattern Recognit., Hong Kong, China, Aug. 2006, pp. 63-66.
[8] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, "Video inpainting under constrained camera motion," IEEE Trans. Image Process., vol. 16, no. 2, pp. 545-553, Feb. 2007.
[9] Y. Wexler, E. Shechtman, and M. Irani, "Space-time completion of video," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 3, pp. 1-14, Mar. 2007.
[10] T. K. Shih, N. C. Tang, and J.-N. Hwang, "Exemplar-based video inpainting without ghost shadow artifacts by maintaining temporal continuity," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, pp. 347-360, Mar. 2009.
[11] S.-C. S. Cheung, J. Zhao, and M. V. Venkatesh, "Efficient object-based video inpainting," in Proc. IEEE Conf. Image Process., Atlanta, GA, Oct. 2006, pp. 705-708.
[12] C.-H. Ling, C.-W. Lin, C.-W. Su, Y.-S. Chen, and H.-Y. M. Liao, "Virtual contour guided video object inpainting using posture mapping and retrieval," IEEE Trans. Multimedia, vol. 13, no. 2, Apr. 2011.
[13] A. Choubey, O. Firke, and B. S. Sharma, "Rotation and illumination invariant image retrieval using texture features," International Journal of Electronics and Communication Engineering & Technology (IJECET), vol. 3, no. 2, 2012, pp. 48-55.
[14] S. S. S. Ahmed, S. A. Ahmed, and S. F. Bashir, "Fast algorithm for video quality enhancing using vision-based hand gesture recognition," International Journal of Computer Engineering & Technology (IJCET), vol. 3, no. 3, 2012, pp. 501-509.
[15] Reeja S. R. and N. P. Kavya, "Motion detection for video denoising: the state of the art and the challenges," International Journal of Computer Engineering & Technology (IJCET), vol. 3, no. 2, 2012, pp. 518-525.