JINNO AND OKUDA: MULTIPLE EXPOSURE FUSION FOR HIGH DYNAMIC RANGE IMAGE ACQUISITION 359

for this nonlinearity, the photometric camera calibration described in Section II-B is performed on the input images.
2) We select a main image from the multiple exposure images. For each of the other images, the displacement from the main image, which is mainly due to object movements, is found. In practice, we select an image with medium exposure as the main image in the default setting. Furthermore, the occlusions and the underexposed and overexposed regions are found for the images. This is done by the MAP-based motion compensation method in Section III.
3) To improve the accuracy of discriminating between the occlusion and the saturation (i.e., underexposed and overexposed regions), we employ the postprocessing in Section III-D. Finally, we combine the images to create the HDRI.

Fig. 1. Two examples of weighting functions: (left) hat function and (right) Gaussian function.

Fig. 2. Weighting functions of five adjusted exposure images: (top) conventional weights and (bottom) our weights.

B. Photometric Camera Calibration

The relationship between the irradiance E and the amount of light X that we measure through a sensor can be expressed by

X = E Δt    (1)

where Δt is the exposure time (shutter speed). In many camera sensors, the captured signal is recorded at more than 8 bits in a so-called "raw image" format. The raw image is nonlinearly transformed through image processing such as gamma correction. Then, the pixel is quantized to 8 bits. We assume that the nonlinearly transformed 8-bit images are given as input. For convenience, we set the range of the pixel value Z to [0, 1]. To accurately retrieve the irradiance, we need to compensate for the nonlinearity by estimating the transform. Here, we approximate it by a single curve, called the "camera response curve" and denoted by f:

Z = f(E Δt).    (2)

If one uses the raw images and the image sensor has linear sensitivity characteristics, this photometric calibration can be skipped. To accurately retrieve the irradiance, a dequantization procedure would also be necessary; however, since we may assume that the image is densely quantized and the quantization error hardly affects the quality of the HDR acquisition, we ignore the quantization effect.

Among the existing methods for the calibration problem, we adopt one in which the curve f is approximated by a low-order polynomial using multiple images and the values of the exposure ratios between the images. Once the curve is estimated, the irradiance is derived from (1) and (2) as

E = f^{-1}(Z) / Δt.    (3)

In our method, the multiple exposure images are taken by varying the exposure time of a camera with the other settings fixed. Then, the images are merged to create the HDRI by

E(p) = [ Σ_{n=1}^{N} w(Z_n(p)) f^{-1}(Z_n(p)) / Δt_n ] / [ Σ_{n=1}^{N} w(Z_n(p)) ]    (4)

where Z_n(p) is the value of pixel p of the nth exposure image, N is the number of images, Δt_n is the exposure time of the nth image, and w is a weighting function. The weighting function has small values for the underexposed and overexposed pixels, so that these pixel values are ignored. If a scene is completely static and the images are aligned, we may safely combine the images by (4). However, if there is motion between the images, ghosting artifacts may appear in E. The main goal of this paper is to address the ghosting problem.

C. Weighting Function

In (4), the weighting function is introduced because underexposed or overexposed regions are much less reliable than the regions of middle intensities. Thus, in the conventional methods, the weight is specified to be small for pixel values near the saturation values 0 and 1 and high for the middle intensities. Two examples of the weighting functions used in the conventional methods are shown in Fig. 1. The role of the weight is to discard saturated pixels. A region where the pixel values are close to 0 or 1 is backed up by another exposure with a larger weight. In the conventional methods, the weighting functions are built on the assumption that middle intensities around 0.5 have high reliability for irradiance estimation. From (1) and (2), one can express the weighting function as a function of the irradiance E. The Gaussian function of Fig. 1 (right), mapped into the irradiance domain, is depicted in Fig. 2 (top), where five exposures are used. The conventional weighting in Fig. 2 (top) has two drawbacks. First, in very dark areas, all of the exposures have similar weights. This means that all the images are summed up to some degree, even if the images with high exposure are completely underexposed or corrupted by dark-current noise in the dark area. This results in noisy and ghosting artifacts, as shown in Fig. 15. The other drawback is that much of
360 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 1, JANUARY 2012

the irradiance range is covered by several images. If a scene remains completely static while the photographs are taken, the weighting function works well. Otherwise, however, it yields ghosting artifacts, as the multiple images are combined in the overlapped region. This overlap makes the ghosting artifacts likely. To address the problem, we rearrange the weighting function into a narrower form (5), where σ is a parameter that controls the width of each function. The weighting functions in this case are shown in Fig. 2 (bottom). The functions relieve the overlap. Each exposure independently covers some region, which results in the reduction of offset noise and ghosting artifacts. In particular, the dark area is supported only by the lowest exposed image, which can reduce the artifacts. Note that, even though the lowest and highest exposures have large weights in the unreliable irradiance regions, the effect is minor, since the other exposures are even less reliable in these high- and low-irradiance regions.

Fig. 3. Example of multiple exposure images: (a) high exposure, (b) low exposure, (c) occlusion (gray) and saturation (white).

III. MOTION COMPENSATION

A. MRF Model

The irradiance E in (3) can be derived only when pixels are neither underexposed nor overexposed. Moreover, (4) is effective only when a scene is static. Since moving objects cause motion blur, it is required to compensate for the displacement as much as possible. The compensation is usually done in two steps, i.e., global motion compensation and local displacement compensation. In our method, we assume that the global motion (e.g., motion caused by camera shake) has already been compensated by some image alignment algorithm. Here, we focus on the local displacement compensation.

In our situation, the displacement estimation may fail due to 1) occlusion; 2) saturation, that is, underexposure and overexposure; and 3) differences in intensity caused by the failure of the camera response curve estimation. Fig. 3 illustrates an example of two exposures, where some regions of the high-exposure image are overexposed [white region in (a)] and some regions of the low-exposure image are underexposed [black region in (b)]. There may be three types of regions where correspondence between the images is hard to find: 1) occlusion [marked by light gray in (c)]; 2) saturation (overexposure, marked by white); and 3) both of the two (dark gray). In our approach, we simply search for the closest value within a window for the hole caused by the occlusion, whereas, in the saturated area, we do not compensate for the motion (details are found in Section III-D). The regions where pixel intensities vary between images due to the failure of the camera response curve estimation are treated in the same way as the occlusion. In the end, we need to estimate the following two classes, as well as the displacement vectors: 1) Class I, the regions that include the occlusion and the intensity mismatch caused by the failure of the camera response curve estimation; and 2) Class II, the regions of the saturation, that is, underexposure and overexposure.

Here, we introduce a probability model for the estimation of the displacement between two images and the regions of the two classes. Let us denote two measured images with different exposures by g1 = {g1(s) | s ∈ L} and g2 = {g2(s) | s ∈ L} (where L is a discrete rectangular sampling lattice), which are considered to be samples from a random field. In this framework, the exposure times of all the images are known, and images g1 and g2 have been made linear with respect to irradiance by the photometric camera calibration method in Section II-B. Each field below is defined on the sampling lattice L.

We introduce three random fields, i.e., d = {d_s | s ∈ L}, o = {o_s | s ∈ L}, and c = {c_s | s ∈ L}, which correspond to the displacement, the occlusion, and the saturation, respectively. The displacement d_s is defined as the apparent motion vector of objects between the two images. Although the displacement field in our framework is similar to conventional motion vector fields, they differ in that we try to find a pixel with the closest luminance, and a high-accuracy estimation (e.g., subpixel search) is not necessary.

o is a binary random field that indicates whether each pixel belongs to the occlusion. The random variable o_s at each site s takes 0 or 1 according to the following rule:

o_s = 1 if s is in Class I, and o_s = 0 otherwise.    (6)

The binary process o thus indicates the occlusion. The saturation is defined by the regions of underexposure and overexposure. This is also a binary random field:

c_s = 1 if s is in Class II, and c_s = 0 otherwise.    (7)

In our method, instead of o and c, we estimate the complement variables ō and c̄, which have "0" in the cases of Classes I and II, respectively.
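The weighted merge of (4), into which the estimated occlusion and saturation maps ultimately feed, can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's implementation: it assumes a linear camera response (so f^{-1} is the identity) and uses a hat weight like Fig. 1 (left); all names are illustrative.

```python
import numpy as np

def hat_weight(z):
    """Hat weighting of Fig. 1 (left): 1 at mid-intensity 0.5, 0 at 0 and 1."""
    return 1.0 - np.abs(2.0 * z - 1.0)

def merge_hdr(images, exposure_times, eps=1e-8):
    """Weighted-average irradiance estimate in the spirit of (4).

    images: float arrays in [0, 1], one per exposure, assumed already
    linearized (camera response inverted beforehand).
    """
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for z, dt in zip(images, exposure_times):
        w = hat_weight(z)
        num += w * z / dt   # w(Z_n) * f^{-1}(Z_n) / dt_n
        den += w
    return num / (den + eps)  # eps guards pixels saturated in every exposure

# Two noiseless exposures of a static scene; wherever at least one
# exposure is unsaturated, the merge recovers the common irradiance E.
E = np.array([0.2, 0.5, 1.0])
imgs = [np.clip(E * 1.0, 0.0, 1.0), np.clip(E * 0.5, 0.0, 1.0)]
radiance = merge_hdr(imgs, [1.0, 0.5])
```

Swapping in the paper's narrowed, exposure-dependent weights of (5) would simply mean using a different weight per image instead of `hat_weight`.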
Since all three of these fields mostly consist of regions of similar values, they can be considered as MRFs. Our estimation problem is to find the most likely fields d, o, and c from the observed images g1 and g2, that is, to maximize the posterior probability P(d, o, c | g1, g2). Applying the Bayes rule, the posterior probability is written as

P(d, o, c | g1, g2) = P(g1 | d, o, c, g2) P(d) P(o) P(c) / P(g1 | g2).    (8)

Since the denominator of (8) is independent of the variables d, o, and c, the problem to solve is stated as the maximization of the numerator, that is, our MAP estimation is

(d̂, ô, ĉ) = arg max_{d,o,c} P(g1 | d, o, c, g2) P(d) P(o) P(c).    (9)

Since d, o, and c are modeled by MRFs, each probability can be characterized by the Gibbs distribution

P(x) = (1/Z) exp(−U(x)/T)    (10)

where U, Z, and T are an energy function, a normalization constant, and a temperature, respectively. Then, the MAP estimation can be replaced by the minimization of the energy functions. From (9) and (10), we can formulate the following problem:

min_{d,o,c} [ U_g(g1 | d, o, c, g2) + U_d(d) + U_o(o) + U_c(c) ].    (11)

B. Formulation of Energy Functions

The term U_g in (11) characterizes image g1 with d, o, c, and g2 given. In most cases, such a function is defined by the l1-norm or l2-norm error between blocks of g1 and g2. In our case, since the difference between the pixel values of g1 and their corresponding pixels in g2 sensitively affects the quality of the final HDRI while an accurate motion estimation is not necessary, we use the minimum value of the difference

e(s) = min_{r ∈ B(s + d_s)} | g1(s) − g2(r) |    (12)

where B(s) is a block of pixels around the center pixel s. We use a block of 31 × 31. The above error assumes that pixel g1(s) and its corresponding displaced pixel in g2 ideally coincide. In Class I, however, this is not the case. Furthermore, since the value of the luminance cannot be estimated in the areas of Class II, the assumption does not hold there either. Considering these, we define the following energy function:

U_g(g1 | d, o, c, g2) = Σ_{s ∈ L} ō_s c̄_s e(s).    (13)

This equation excludes the occlusion and the saturation from the cost function.

The second term of (11), U_d, is a cost imposed on the displacement vectors d. Since the displacement generally occurs due to object movements, the displacement vectors should be smooth. Thus, we define the energy function as

U_d(d) = Σ_{s ∈ L} Σ_{r ∈ N(s)} || d_s − d_r ||²    (14)

where N(s) is a block around the center pixel s. Note that we use different notations for B(s) and N(s) since they have different sets of neighborhood sites. In practice, we use 3 × 3 blocks.

The third and fourth terms in (11), U_o and U_c, are constraints on the regions of o and c. Since the occlusion mainly depends on objects and the failure of the camera response curve does not occur at an isolated pixel, we impose a penalty on isolated small regions through the energy in (15), where an operator η counts the number of the value "1" among the eight adjacent pixels and a potential function V has its minimum value at 0. The saturation map likewise takes the value of either 0 or 1, and again the isolated small regions are penalized through the energy in (16) with the indicator in (17).

From (11)–(16), the energy function to minimize is defined as

U = U_g + λ_d U_d + λ_o U_o + λ_c U_c    (18)

where λ_d, λ_o, and λ_c are weights.

C. Suboptimal Search

The nonlinearity of (18) makes the minimization problem hard to solve directly. To address the difficulty, some relaxation algorithms have been proposed, such as simulated annealing and the mean field theory. The former has high computational complexity. The latter can relieve the complexity, but it is not straightforward to apply to our problem.
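As a concrete illustration of the data term, (12) and (13) amount to taking, per pixel, the smallest absolute difference inside a search block and summing it only where the complement masks are 1. The sketch below uses a small search radius instead of the paper's 31 × 31 block, and the names (`min_block_diff`, `data_energy`) are illustrative, not the authors'.

```python
import numpy as np

def min_block_diff(g1, g2, s, radius):
    """Minimum |g1(s) - g2(r)| over the block of g2 centered at s, cf. (12).

    An exact correspondence is not needed; the closest value inside
    the block serves as the match.
    """
    y, x = s
    block = g2[max(0, y - radius):y + radius + 1,
               max(0, x - radius):x + radius + 1]
    return np.min(np.abs(g1[y, x] - block))

def data_energy(g1, g2, occ_bar, sat_bar, radius=1):
    """Sum of per-pixel minimum differences, skipping pixels whose
    complement masks are 0 (Class I / Class II), cf. (13)."""
    total = 0.0
    for y in range(g1.shape[0]):
        for x in range(g1.shape[1]):
            if occ_bar[y, x] and sat_bar[y, x]:
                total += min_block_diff(g1, g2, (y, x), radius)
    return total

# A uniform 0.1 intensity offset between two flat images contributes
# 0.1 per unmasked pixel.
g1 = np.full((3, 3), 0.5)
g2 = np.full((3, 3), 0.6)
ok = np.ones((3, 3), dtype=bool)
energy = data_energy(g1, g2, ok, ok)
```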
Fig. 4. Multiple exposure images. (Upper row) Input images. (Lower row) Histograms. (a)–(e); (c) is the main image.

In our method, we instead adopt a suboptimal approach similar to earlier work.

In our search, the estimations of the displacement, the occlusion, and the saturation are performed independently. In the estimation algorithm, we adopt a block-based search for the occlusion and the saturation, whereas a pixel-based search is performed for the displacement. In the displacement estimation, we simply search for the closest value in the block for each pixel. Given the disparity, we find the blocks with a large cost in (18) and set them as occlusions. Similarly, given the disparity and the occlusion, we search for the blocks with a high energy value in (16), and the saturation map is formed. These steps are iterated until the cost (18) is unchanged. In summary, the algorithm is stated as follows.

[Estimation Algorithm]
1) (Initial setting) Search for the pixels that yield small values in the fourth term of (18) and set them as Class II (c̄_s = 0). Initial displacement vectors are found by searching for the pixel with the closest pixel value in the window B; set ō_s = 1 if the value at the site is above a prescribed threshold, and ō_s = 0 otherwise.
2) For each pixel s, search for the corresponding pixel with the minimum difference (12), and set the displacement d_s accordingly.
3) The blocks with high values in (18) are set as Class I, that is, ō_s = 0.
4) The blocks with high values in (16) are set as Class II, that is, c̄_s = 0.
5) If the cost (18) is unchanged between iterations k − 1 and k, then stop (where k is the number of iterations). Otherwise, go to Step 2.

D. Postprocessing

Using the above algorithm, we can determine the regions of the occlusion and the saturation and obtain the displacement vectors except in these areas. Next, it is necessary to combine the images using this information. Due to space limitations, here we consider only the case where g1 has lower exposure than g2; thus, the underexposed area of g2 is also underexposed in g1. When g1 has higher exposure than g2, the underexposure and the overexposure are treated inversely in the following algorithm.

In the occlusion, we simply search for the closest value within the window in g2 for each pixel in g1 and set it as the corresponding pixel, whereas, in the overexposure region, we do not compensate for the motion. In the estimation of Section III-C, we determine the saturation and the occlusion independently. However, since there can also be occlusion inside the saturated region [such as the dark gray region in Fig. 3(c)], we need to find the occlusion in the saturated areas. We solve this problem by a thresholding method as follows. Since we assume that the multiple exposure images are shot with only the exposure time changing, the amounts of light X1 and X2 of the two images at a given pixel are rewritten from (1) as

X1 / Δt1 = X2 / Δt2 = E = constant    (19)

where Δt1 is the exposure time of g1 and R = Δt2 / Δt1 is the ratio of the exposure times of the two images. Suppose that the dynamic range of the sensor is [0, 1]. Then, if R X1 is larger than 1 (that is, overexposure), the measured value in g2 is saturated to 1. Thus, the actual value at a saturated pixel satisfies

R X1 > 1.    (20)

Substituting this into the first equation in (19), we obtain

X1 > Δt1 / Δt2.    (21)

From these, we set the pixels satisfying (21) as the saturation, and the mismatched pixels below this threshold as the occlusion.

IV. EXPERIMENTAL RESULTS

In order to evaluate the validity of the proposed algorithm, we have conducted experiments on two examples. The weights λ_d, λ_o, and λ_c used in (18) are 0.003, 0.5, and 0.5, respectively. The selected block sizes of B and N are 31 × 31 and 3 × 3 integer sampling lattices, respectively. The potential function V used in (15), (16), and (18) is parameterized by a scaling factor ρ.

In the first experiment, we shot five photographs by changing the shutter speed setting while the aperture is fixed and used them as input for our algorithm. The shutter speed settings that we used are 1/125, 1/80, 1/50, 1/30, and 1/20 (in seconds).
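The exposure-ratio thresholding of Section III-D [(19)–(21)] is easy to state in code: with only the shutter speed changing, a linearized value x in the shorter exposure would map to x · Δt2/Δt1 in the longer one, so it must saturate there whenever x > Δt1/Δt2. A sketch with illustrative names:

```python
import numpy as np

def overexposure_mask(x_low, dt_low, dt_high):
    """Pixels of the shorter exposure whose value would exceed the sensor
    range [0, 1] at the longer exposure, cf. (19)-(21).

    x_low is assumed to be linearized (camera response already inverted).
    """
    # x_low * (dt_high / dt_low) > 1  <=>  x_low > dt_low / dt_high
    return x_low * (dt_high / dt_low) > 1.0

# With the 1/125 s and 1/20 s exposures from the experiment, the
# threshold is (1/125) / (1/20) = 0.16.
x = np.array([0.10, 0.20, 0.90])
mask = overexposure_mask(x, 1.0 / 125.0, 1.0 / 20.0)
```

Mismatched pixels that fall below this threshold, and hence cannot be explained by saturation, are then attributed to occlusion.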
Fig. 5. Underexposure map, (a)–(d).

Fig. 6. Overexposure map, (a)–(d).

Fig. 7. Occlusion map, (a)–(d).

Fig. 8. Our result (tone mapped to the displayable range).

Fig. 9. (From upper left to lower right) Motion blur in dark region: block matching, HDR video, ghost removal, and Photomatix.

Fig. 10. (From upper left to lower right) Motion blur in highlights: block matching, HDR video, ghost removal, and Photomatix.

The input images and their histograms are shown in Fig. 4. We select a single main image [Fig. 4(c)] from them and apply the methods in Sections III-C and III-D to the pair of the main image and each of the other four images. Then, the modified images are merged by (4).

In Figs. 5–7, we show the maps of the underexposure, the overexposure, and Class I, respectively. These are obtained by the MAP estimation, followed by the postprocessing, in which the white pixels are judged as the underexposure, the overexposure, and Class I, respectively. Fig. 8 depicts the HDRI constructed by our method (for display, its dynamic range is scaled and then compressed by a tone mapping).

Fig. 11. Scene with large motion.

Fig. 12. Results: (left) Photomatix and (right) our method.

Fig. 13. Noise in underexposure region: (left set) Photomatix and (right set) our method.

Fig. 14. Ghost: (left) Photomatix and (right) our method.

Fig. 15. Two examples of scenes under daylight and night-view circumstances. Note that we select the highest exposure as the main image in (a) for improving visibility, whereas medium exposure is used for the other examples. (a) Night view. (b) Daylight.

In the set of input images, the face moves in the dark area, while the left elbow moves from the bright region of the sky to the dark region of the pole. Fig. 8 shows that our algorithm reconstructs the irradiance well and successfully removes the blur caused by these motions. In Figs. 9 and 10, we show comparisons with four conventional methods: 1) a standard block matching algorithm; 2) HDR video with Lucas–Kanade optical flow estimation; 3) ghost removal; and 4) the commercially available software Photomatix. We implement 1) and 2) using OpenCV, and they are used in substitution for the method in Section III-C. The other processing, such as that in Section III-D, remains unchanged. When implementing the conventional methods, if the motion
vectors are unreliable, they are set to 0 for those pixels (note that we have confirmed that a large error occurs when the methods are simply applied without this treatment). The results of 3) and 4) are obtained using the software available online. These conventional methods often fail to compensate for the motion due to the saturations and the noises, as described in Section I. The block-matching-based algorithm often yields blocking artifacts and is sensitive to luminance changes caused by an inaccurate calibration of the camera response curve. The HDR video method fails particularly in the underexposed regions, since the offset noises significantly affect its performance. The ghost removal performs worst among these methods. Photomatix tends to cause large errors in highlights. We have tested some other examples and confirmed that our method performs best in all the tests.

The advantage of our method is most evident when the motion is large. For the second example, we create the HDRI from three images (see Fig. 11) with large movements, and the results are illustrated in Figs. 12–14. Here, we show only the result of Photomatix among the four conventional methods, since the quality of the other three results is much poorer. In these figures, Photomatix (left) shows some ghosting artifacts, whereas the proposed method (right) does not, owing to the accurate estimation and our new weighting function. Fig. 15 shows additional examples with large motions. The proposed method is superior to the conventional methods for most of the examples we have tested.

V. CONCLUSION

We have proposed a method for multiple-exposure fusion. In the method, a MAP-based estimation finds the occlusion, the saturation, and the displacements between the input images, and the HDRIs are then constructed with the artifacts removed. While it is hard for the conventional work to eliminate the ghosting artifacts, particularly when large motion occurs, the proposed method can compensate for the effects of the motion, the occlusion, and the saturation and can obtain motion-blur-free HDRIs.

REFERENCES

[1] E. Reinhard, S. Pattanaik, G. Ward, and P. Debevec, High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting, ser. Morgan Kaufmann Series in Computer Graphics and Geometric Modeling. San Mateo, CA: Morgan Kaufmann, 2005.
[2] P. Debevec, "Image-based lighting," IEEE Comput. Graph. Appl., vol. 22, no. 2, pp. 26–34, Mar. 2002.
[3] F. McCollough, Complete Guide to High Dynamic Range Digital Photography. China: Lark Books, 2008.
[4] B. Hoefflinger, High-Dynamic-Range (HDR) Vision, ser. Springer Series in Advanced Microelectronics. New York: Springer-Verlag, 2006.
[5] M. Bualat, L. Edwards, T. Fong, M. Broxton, L. Flueckiger, S. Y. Lee, E. Park, V. To, H. Utz, and V. V. Clayton, "Autonomous robotic inspection for lunar surface operations," in Proc. 6th Int. Conf. Field Service Robot., Jul. 2007, pp. 169–178.
[6] S. Mann and R. Picard, "On being 'undigital' with digital cameras: Extending dynamic range by combining differently exposed pictures," in Proc. IS&T 46th Annu. Conf., May 1995, pp. 422–428.
[7] P. E. Debevec and J. Malik, "Recovering high dynamic range radiance maps from photographs," in Proc. SIGGRAPH, 1997, pp. 369–378.
[8] T. Mitsunaga and S. K. Nayar, "Radiometric self calibration," in Proc. IEEE Conf. CVPR, Jun. 1999, vol. 1, pp. 374–380.
[9] S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, "High dynamic range video," ACM Trans. Graph., vol. 22, no. 3, pp. 319–325, 2003.
[10] E. A. Khan, A. O. Akyuz, and E. Reinhard, "Ghost removal in high dynamic range images," in Proc. IEEE Int. Conf. Image Process., Oct. 2006, pp. 2005–2008.
[11] X. Liu and A. El Gamal, "Synthesis of high dynamic range motion blur free image from multiple captures," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 50, no. 4, pp. 530–539, Apr. 2003.
[12] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 6, pp. 721–741, Nov. 1984.
[13] J. Besag, "Spatial interaction and the statistical analysis of lattice systems," J. R. Stat. Soc. B, vol. 36, no. 2, pp. 192–236, 1974.
[14] J. Konrad and E. Dubois, "Bayesian estimation of motion vector fields," IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 9, pp. 910–927, Sep. 1992.
[15] W. Woo and A. Ortega, "Stereo image compression with disparity compensation using the MRF model," in Proc. VCIP, Orlando, FL, Mar. 1996, vol. 2727, Proc. SPIE, pp. 28–41.
[16] J. Zhang and G. G. Hanauer, "The application of mean field theory to image motion estimation," IEEE Trans. Image Process., vol. 4, no. 1, pp. 19–33, Jan. 1995.
[17] J. Wei and Z.-N. Li, "An efficient two-pass MAP-MRF algorithm for motion estimation based on mean field theory," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 6, pp. 960–972, Sep. 1999.
[18] Q. Liu, R. J. Sclabassi, C. C. Li, and M. Sun, "An application of MAP-MRF to change detection in image sequence based on mean field theory," EURASIP J. Appl. Signal Process., vol. 13, pp. 1956–1968, 2005.
[19] A. G. Bors and I. Pitas, "Optical flow estimation and moving object segmentation based on median radial basis function network," IEEE Trans. Image Process., vol. 7, no. 5, pp. 693–702, May 1998.
[20] B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. 7th Int. Joint Conf. Artif. Intell., 1981, pp. 674–679.
[21] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda, "Photographic tone reproduction for digital images," ACM Trans. Graph., vol. 21, no. 3, pp. 267–276, Jul. 2002.
[22] [Online]. Available: http://www.anyhere.com/
[23] [Online]. Available: http://www.hdrsoft.com/

Takao Jinno received the B.E. degree from The University of Kitakyushu, Kitakyushu, Japan, in 2007. He has been with the Graduate School of Environmental Engineering, The University of Kitakyushu. He is engaged in high dynamic range image processing.

Masahiro Okuda received the B.E., M.E., and Dr.Eng. degrees from Keio University, Yokohama, Japan, in 1993, 1995, and 1998, respectively. He was with the University of California, Santa Barbara, and Carnegie Mellon University, Pittsburgh, PA, as a Visiting Scholar in 1998 and 1999, respectively. He has been with the Faculty of The University of Kitakyushu, Kitakyushu, Japan, as an Associate Professor of environmental engineering since 2001. His research interests include filter design, vision/geometry coding, and multirate signal processing.