DEVIR AND LINDENBAUM: BLIND ADAPTIVE SAMPLING OF IMAGES 1479

Fig. 1. One iteration of the nonadaptive FPS algorithm: (a) the sampled points (sites) and their corresponding Voronoi diagram; (b) the candidates for sampling (Voronoi vertices); (c) the farthest candidate chosen for sampling; and (d) the updated Voronoi diagram.

The remainder of this paper is organized as follows: In Section II, the AFPS scheme is described as a particular case of a progressive blind adaptive sampling scheme. Section III presents the statistical pursuit scheme, followed by a presentation of the blind wavelet sampling scheme in Section IV. Experimental results are shown in Section V. Section VI concludes this paper and proposes related research paths.

II. AFPS ALGORITHM

We begin by briefly describing an earlier blind adaptive sampling scheme, which inspired the work in this paper. The algorithm, denoted as AFPS, is based on the farthest point strategy (FPS) [6], a simple, progressive, but nonadaptive point sampling scheme.

A. FPS Algorithm

In the FPS, a point in the image domain is progressively chosen, such that it is the farthest from all previously sampled points. This intuitive rule leads to a truly progressive sampling scheme, providing after every single sample a cumulative set of samples, which is uniform in a deterministic sense and becomes continuously denser [6].

To efficiently find its samples, the FPS scheme maintains a Voronoi diagram of the sampled points [2]. A Voronoi diagram is a geometric structure that divides the image domain into cells corresponding to the sampled points (sites). Each cell contains exactly one site and all points in the image domain that are closer to the site than to all other sites. An edge in the Voronoi diagram contains points equidistant to two sites. A vertex in the Voronoi diagram is equidistant to three sites (in the general case) and is thus a local maximum of the distance function. Therefore, in order to find the next sample, it is sufficient to consider only the Voronoi vertices (with some special considerations for points on the image boundary).

After each sampling iteration, the new sampled point becomes a site, and the Voronoi diagram is accordingly updated. Fig. 1 describes one iteration of the FPS algorithm. Note that, because the FPS algorithm is nonadaptive, it produces a uniform sampling pattern regardless of the given image content.

B. AFPS Algorithm

An adaptive, more efficient sampling scheme is derived from the FPS algorithm. Instead of using the Euclidean distance as a priority function, the geometrical distance is used along with either the estimated local variance of the image intensities or the equivalent local bandwidth. The resulting algorithm, denoted AFPS, samples the image more densely in places where it is more detailed and more sparsely where it is relatively smooth.

Fig. 2. First 1024, 4096, and 8192 point samples of the cameraman image, taken according to the AFPS scheme.

Fig. 2 shows the first 1024, 4096, and 8192 sampling points produced by the AFPS algorithm for the cameraman image, using a priority function that combines the distance of a candidate to its closest neighbors with a local variance estimate.

A variant of the AFPS scheme, designed for range sampling using a regularized grid pattern, was presented in [5].

III. STATISTICAL PURSUIT

In this section, we propose a sampling scheme based on a direct statistical model of the image. In contrast to point sampling schemes such as the AFPS, this scheme may choose sampling masks from an overcomplete family of basis functions or calculate optimal masks. The scheme updates an underlying statistical model for the image as more information is gathered during the sampling process.

A. Simple Statistical Model for Images

Images are often regarded as 2-D arrays of scalar values (gray levels or intensities). Yet, it is clear that arbitrary arrays of values do not resemble natural images. Natural images contain structures that are difficult to explicitly define. Several attempts were made to formulate advanced statistical models, which can approximate such global structures and provide good prediction for missing parts. Still, the local structure is easier to predict, and there exist some low-level statistical models which model local behavior fairly well.
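The farthest-point rule described above can be sketched without the Voronoi machinery. The following is a minimal brute-force illustration on a discrete grid, not the paper's Voronoi-based implementation; the grid size and the corner seed point are arbitrary choices for the example.

```python
import numpy as np

def farthest_point_samples(h, w, n_samples, seed=(0, 0)):
    """Brute-force FPS on an h x w grid: repeatedly pick the grid
    point whose distance to its nearest existing sample is largest."""
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    samples = [np.array(seed, dtype=float)]
    # dist[i] = distance from grid point i to its nearest sample
    dist = np.linalg.norm(pts - samples[0], axis=1)
    radii = []
    for _ in range(n_samples - 1):
        i = int(np.argmax(dist))       # farthest point becomes the next sample
        radii.append(float(dist[i]))
        samples.append(pts[i])
        # incremental update of the nearest-sample distance map
        dist = np.minimum(dist, np.linalg.norm(pts - pts[i], axis=1))
    return np.array(samples), radii

samples, radii = farthest_point_samples(8, 8, 5)
```

The recorded covering radii are non-increasing, which is the "uniform in a deterministic sense" property; the adaptive AFPS variant would replace the plain distance priority with one that also weights an estimated local variance.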
1480 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012

Here, we consider a common and particularly simple second-order local statistical model for images. We regard image x as a Gaussian random vector, i.e.,

x ~ N(μ, C)  (1)

where μ is the mean vector and C is the covariance matrix. For simplicity and without loss of generality, we assume μ = 0.

Two neighboring pixels in an image often have similar gray level values (colors). Statistically speaking, their colors have a strong positive correlation, which weakens as their distance grows. The exponential correlation model is a second-order stationary model based on this observation. According to this model, the covariance between the intensities of two arbitrary pixels p and q exponentially depends on their distance, i.e.,

C(p, q) = σ² exp(−λ‖p − q‖)  (2)

where σ² is the variance of the intensities and λ determines how quickly the correlation drops.

B. Statistical Reconstruction

We consider linear sampling, where the i-th sample cᵢ is generated as an inner product of image x with some sampling mask φᵢ. That is, cᵢ = φᵢᵀx, where both the image and the mask are regarded as column vectors. A sampling process provides us with a set of sampling masks and their corresponding measurements. We wish to reconstruct an image from this partial information. Let Φ = [φ₁, …, φₖ] be a matrix containing the k masks as its columns, and let c be a column vector of the k measurements. Using those matrix notations, c = Φᵀx.

The underlying statistical model of the image may be used to obtain an image estimate x̂ based on measurements c. For the second-order statistical model (1), the optimal estimator is linear. The linear estimator is optimal in the MSE sense, i.e., it minimizes the mean square error (MSE) between the true and reconstructed images, E‖x − x̂‖².

It is not hard to show that the image estimate x̂, its covariance Ĉ, and its MSE can be written in matrix form as

x̂ = CΦ(ΦᵀCΦ)⁻¹c  (3)
Ĉ = C − CΦ(ΦᵀCΦ)⁻¹ΦᵀC  (4)
MSE = trace(Ĉ)  (5)

assuming the set of masks is linearly independent.

It should be noted that the statistical reconstruction is analogous to the algebraic one. The algebraic (consistent) reconstruction of an image from its measurements is x̂ = Φ(ΦᵀΦ)⁻¹c (assuming the linear independence of the masks). That is, the algebraic reconstruction is a statistical reconstruction, assuming the pixels are independent identically distributed, i.e., C = σ²I.

Searching for a new mask that minimizes the algebraic reconstruction error leads to the orthogonal matching pursuit (OMP) [14]. Analogously, searching for a new mask that minimizes the expected error leads to the statistical pursuit, which is discussed next.

C. Reconstruction Error Minimization

A greedy sampling strategy is to find a sampling mask that minimizes the MSE of the resulting estimate. If the new mask is a linear combination of the previous masks, it is trivial to show that the MSE does not change (as no additional information is gained), and MSEₖ₊₁ = MSEₖ. Therefore, we can assume that the new mask is linearly independent of the previous ones.

Proposition: The reduction of the MSE, given a new mask φ (linearly independent of the previous masks), is

ΔMSE(φ) = MSEₖ − MSEₖ₊₁ = ‖Ĉₖφ‖² / (φᵀĈₖφ)  (6)

where Ĉₖ is the covariance of the estimate, defined in (4).

Proof: See Appendix A.

The aforementioned proposition justifies selection criteria for the next best mask(s) in several scenarios. For the sake of brevity, we define x̂ₖ and Ĉₖ as the estimated image and its covariance after sampling the image with masks φ₁, …, φₖ. MSEₖ is the MSE after sampling k masks, and ΔMSE(φ) is the expected reduction of MSE given an arbitrary mask φ as the next selected mask. Without subscript, x̂ and Ĉ shall refer to the estimated image and its covariance, given all known masks.

Ĉ is a positive-semidefinite symmetric matrix, with the previous masks as its eigenvectors with corresponding zero eigenvalues. Ĉ may be regarded as the "portion" of the covariance matrix C which is statistically independent of the previous masks.

D. Progressive Sampling Schemes

1) Predetermined Family of Masks: The masks are often selected from a predefined set of masks (a dictionary), such as the DCT or DWT basis. In such cases, the next best mask is determined by calculating ΔMSE(φ) for each mask in the dictionary and choosing the mask that maximizes it.

2) Parametrized Masks: Suppose the mask φ(θ) depends on several parameters θ. We can differentiate (6) by the parameters of the mask, solve the resulting system of equations, and check all the extrema masks for the one that maximizes ΔMSE. However, solving the resulting system of equations is not a trivial task.

3) Optimal Mask: If the next mask is not restricted, the optimal mask is an eigenvector corresponding to the largest eigenvalue of Ĉ. See Appendix B for details. We shall refer to the eigenvector corresponding to the largest eigenvalue of Ĉ as its largest eigenvector.

4) Optimal Set of Masks: The largest eigenvector of Ĉₖ is the optimal mask. If that mask is chosen, it becomes an eigenvector of Ĉₖ₊₁ with a corresponding eigenvalue of 0. Therefore, the next optimal mask is the largest eigenvector of Ĉₖ₊₁, which is the second largest eigenvector of Ĉₖ. Collecting m optimal masks together is equivalent to finding the m largest eigenvectors of Ĉₖ.

If we begin with no initial masks, the optimal first masks are simply the largest eigenvectors of C. Those are, not surprisingly, the first components obtained from the principal component analysis (PCA) [7], [8] of C, i.e., the covariance of the image.
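Equations (3)–(6) and the eigenvector selection of case 3) can be sketched numerically. This is a minimal 1-D illustration under the exponential model; the signal length, σ², and λ below are arbitrary choices, and the rank-one covariance update is the incremental form of (4) for a single added mask.

```python
import numpy as np

n = 64
# Exponential correlation model (2): C[p, q] = sigma^2 * exp(-lam * |p - q|)
sigma2, lam = 1.0, 0.2
idx = np.arange(n)
C = sigma2 * np.exp(-lam * np.abs(idx[:, None] - idx[None, :]))

def delta_mse(C_hat, phi):
    """Expected MSE reduction (6) for a candidate mask phi."""
    v = C_hat @ phi
    return float(v @ v) / float(phi @ v)

# Greedy statistical pursuit with unrestricted masks: the optimal
# next mask is the largest eigenvector of the current covariance.
C_hat = C.copy()
mse = [float(np.trace(C_hat))]
for _ in range(5):
    w, V = np.linalg.eigh(C_hat)        # ascending eigenvalues
    phi = V[:, -1]                      # largest eigenvector = optimal mask
    assert np.isclose(delta_mse(C_hat, phi), w[-1])  # dMSE attains lambda_max
    # rank-one update of the estimate covariance after sampling with phi
    v = C_hat @ phi
    C_hat = C_hat - np.outer(v, v) / float(phi @ v)
    mse.append(float(np.trace(C_hat)))
```

Each iteration reduces the trace (the MSE) by exactly the current largest eigenvalue, and no other mask can do better, which is the content of Appendix B.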
E. Adaptive Progressive Sampling

Using the MSE minimization criterion, it is easy to construct a nonadaptive progressive sampling scheme that selects the optimal mask, i.e., the one that minimizes the estimated error at each step. Such a nonadaptive sampling scheme makes use of a fixed underlying statistical model for image x. However, as we gain information about the image, we can update the underlying model accordingly.

In the exponential model of (2), the covariance between two pixels depends only on their spatial distance. However, pairs associated with a large intensity dissimilarity are more likely to belong to different segments in the image and to thus be less correlated. If we have some image estimate, a pair of pixels may be characterized by their spatial distance and their estimated intensity dissimilarity.

Using the partial knowledge about image x obtained during the iterative sampling process, we now redefine the exponential correlation model (2). The new model is based on both the spatial distance and the estimated intensity distance, i.e.,

C(p, q) = σ² exp(−λₛ‖p − q‖ − λᵢ|x̂(p) − x̂(q)|)  (7)

Such correlation models are implicitly used in nonlinear filters such as the bilateral filter [19]. Naturally, other color-aware models can be used. For example, instead of taking Euclidean distances, we can take geodesic distances on the image color manifold [17].

Introducing the model of (7) into a progressive sampling scheme, we can construct an adaptive sampling scheme, presented as Algorithm 1.

F. Image Reconstruction From Adaptive Sampling

The statistical reconstruction of (3) requires the measurements, along with their corresponding masks. For nonadaptive sampling schemes, the masks are fixed regardless of the image, and there is no need to store them. For blind adaptive sampling schemes, where the set of masks differs for different images, there is no need to store them either. At each iteration of the sampling process, a new sampling mask is constructed, and the image is sampled. Because of the deterministic nature of the sampling process, those masks can be recalculated during reconstruction. The reconstruction algorithm is almost identical to Algorithm 1, except for step 3(c), which now reads, "Pick the next measurement from a list of stored measurements," and the stopping criterion at step 4 is accordingly updated.

IV. BLIND WAVELET SAMPLING

The statistical pursuit algorithm is quite general, but updating the direct underlying space-varying statistical model of the image is computationally costly. We now present an alternative blind sampling approach, which is limited to a family of wavelet masks and relies on an indirect statistical model of the image. The scheme chooses first the coefficient that is estimated to carry most of the energy, using the measurements that it obtains.

A trivial adaptive scheme stores the largest wavelet coefficients and their corresponding masks. Such a scheme samples (i.e., decomposes) the complete image and sorts the coefficients according to their energy. However, it clearly uses all the image information and is therefore not blind.

The proposed sampling scheme is based on the statistical properties of a wavelet family; we use the statistical correlations between magnitudes (or energies) of the wavelet coefficients. Those statistical relationships are used to construct a number of linear predictors, which predict the magnitude of the unsampled coefficients from the magnitude of the known coefficients. This way, the presented scheme chooses the larger coefficients without direct knowledge about the image.

The proposed sampling scheme is divided into three stages.
1) Learning Stage: The statistical properties of the wavelet family are collected and studied. This stage is done offline using a large set of images, and the result is considered a constant model.
2) Sampling Stage: At each iteration of this blind sampling scheme, the magnitudes of all unsampled coefficients are estimated, and the coefficient with the largest estimated magnitude is greedily chosen.
3) Reconstruction Stage: The image is reconstructed using the measurements obtained from the sampling stage. As with the other blind schemes, it is sufficient to store the values of the sampled coefficients, since their corresponding masks are recalculated during reconstruction.

A. Correlation Between Wavelet Coefficients

Wavelet coefficients are weakly correlated among themselves [3], [18]. However, their absolute values or their energies are highly correlated. For example, the correlation between the magnitudes of wavelet coefficients at different scales but similar spatial locations is relatively high.
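The contrast between weak value correlation and strong magnitude correlation can be illustrated with a synthetic shared-scale (Gaussian scale mixture) pair of coefficients. The lognormal scale field and the parameters below are illustrative assumptions for the demonstration, not measurements from the paper's image set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# A "parent" and a "child" coefficient share a common random positive
# local scale s (large near edges), but have independent signs.
s = np.exp(rng.standard_normal(n))          # lognormal local scale
parent = s * rng.standard_normal(n)
child = s * rng.standard_normal(n)

corr_val = np.corrcoef(parent, child)[0, 1]                   # near zero
corr_mag = np.corrcoef(np.abs(parent), np.abs(child))[0, 1]   # clearly positive
```

The shared scale mimics the empirical observation above: an edge makes coefficients at several scales large simultaneously, while their signs remain nearly unpredictable.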
Nevertheless, the sign of the coefficient is hard to predict, and therefore, the correlations between the coefficients themselves remain low. This property is the foundation of zerotree encoding [16], which is used to efficiently encode quantized wavelet coefficients.

Each wavelet coefficient corresponds to a discrete wavelet basis function, which can also be interpreted as a sampling mask. The mask (and its associated coefficient) is specified by orientation (LL, HL, LH, or HH), level of decomposition (or scale), spatial location, and support.

Three relationships are defined.
1) Each coefficient has four (spatial) direct neighbors in the same block.
2) Each coefficient from the HL, LH, and HH blocks (except those at the lowest level) has four children. The children are of the same orientation one level below and occupy approximately the same spatial location as the parent coefficient. Each coefficient in the HL, LH, and HH blocks (except at the highest level) has a parent coefficient.
3) Each coefficient from the HL, LH, and HH blocks has two cousins that occupy the same location but with different orientations. At the highest level, a third cousin in the LL block exists. Similarly, each coefficient in the LL block has three cousins in the HL, LH, and HH blocks.

We can further define second-order family relationships, such as grandparents and grandchildren, cousin's children, diagonal neighbors, and so on. Those relationships carry lower correlation and can be indirectly approximated from the first-order relatives.

B. Learning Stage

The learning stage is done offline, and the statistical relationships are stored for the later sampling and reconstruction stages. The measured correlations are considered as a fixed model. Our experiments showed the statistical characteristics to be almost indifferent to the image classes. That is, the proposed method is robust for varying image classes.

At the learning stage, the wavelet coefficients are considered as instances of random variables. We assume that the statistics of the wavelet coefficients are independent of their spatial location, and we study complete blocks of wavelet coefficients as instances of a single random variable. In addition, we assume transposition invariance of the image and wavelet coefficients. Therefore, we expect the same behavior from wavelet coefficients of opposite orientations (e.g., the HL and LH blocks). It is common to assume scaling invariance as well, but our experiments showed this assumption is not completely valid for discrete images. See Section V-B for experimental results of the statistical model for different types of wavelet families.

It is straightforward to build a linear predictor of the magnitude of a certain coefficient from the magnitudes of its known relatives. However, the actual predictors differ according to the available observations.

C. Sampling Stage

The sampling stage is divided into an initial phase and a progressive phase.

1) Initial Sampling Phase: Before any coefficient is sampled, the predictors can rely only on the expected mean. Therefore, the coefficients with the highest expected mean should be sampled first. For wavelet decompositions, the expected mean of the coefficients from the LL block is the highest, and we start by sampling them all.

The coefficients of the LL block carry low correlation with their cousins (at the HL, LH, and HH blocks). Therefore, after sampling the LL block, we further decompose it "on the fly" into four sub-blocks (an additional LL, HL, LH, and HH level), in order to make use of the higher parent–children correlations. This way, we get better predictors for the HL, LH, and HH blocks.

2) Progressive Sampling Phase: The blind sampling algorithm has three types of coefficients, i.e., the coefficients it has already sampled, which we refer to as known coefficients; relatives of the known coefficients, which are the candidates for sampling; and the remaining coefficients. The algorithm keeps the candidates in a heap data structure, sorted according to their estimated magnitude.

The output of the algorithm is an array of the coefficient values, as sampled at steps 1(a) and 2(a).

D. Reconstruction Stage

Again, we mark the coefficients as known, candidates, and the remaining coefficients. The input for the reconstruction algorithm is the array of values generated by the sampling algorithm (Algorithm 2).

Note that, while the value of the coefficient is stored in the array, its corresponding wavelet mask is not part of the stored information and is obtained during the reconstruction stage (step 2), using the predictors. Consequently, the algorithm does not need to keep the masks associated with the coefficients or their indexes.
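The linear magnitude predictors fitted in the learning stage can be sketched with ordinary least squares. The synthetic shared-scale coefficients below stand in for a learned wavelet model, and the feature choice (an intercept plus the parent and one neighbor magnitude) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic relatives: parent, neighbor, and target child coefficients
# share a local scale, so their magnitudes are mutually informative.
s = np.exp(rng.standard_normal(n))
parent = s * rng.standard_normal(n)
neighbor = s * rng.standard_normal(n)
child = s * rng.standard_normal(n)

# Linear predictor of |child| from (1, |parent|, |neighbor|),
# fitted by ordinary least squares (the normal equations).
A = np.column_stack([np.ones(n), np.abs(parent), np.abs(neighbor)])
w, *_ = np.linalg.lstsq(A, np.abs(child), rcond=None)
pred = A @ w

mse_pred = float(np.mean((np.abs(child) - pred) ** 2))
mse_mean = float(np.var(np.abs(child)))   # best constant predictor
```

The fitted predictor beats the expected-mean baseline whenever the relatives carry information; in the actual scheme, a separate predictor is kept for each combination of available relatives.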
Fig. 3. First 30 nonadaptive sampling masks.

Fig. 4. First 30 adaptive sampling masks.

Fig. 5. (Left) Image patch and (right) the reconstruction error using adaptive and nonadaptive unrestricted masks.

Fig. 6. Ratio between the average reconstruction errors of adaptive and nonadaptive schemes, for unrestricted masks (PCA) and three overcomplete dictionaries (DB3, DB2, and Haar). The reference line (100%) represents the nonadaptive schemes.

V. EXPERIMENTAL RESULTS

A. Masks Generated by Statistical Pursuit

We start by illustrating the difference between the nonadaptive and adaptive masks. Figs. 3 and 4 present the first 30 masks used to sample a 32 × 32 image patch, shown in Fig. 5. The masks are produced by both nonadaptive and adaptive schemes, where, in both cases, the masks are unrestricted and the same exponential correlation model is used.

The nonadaptive sampling masks, shown in Fig. 3, closely resemble the DCT basis, and not by coincidence (the DCT is an approximation of the PCA for periodic stationary random signals [1], [10]). The adaptive sampling masks, shown in Fig. 4, present more complicated patterns. As the sampling process advances, it attempts to "study" the image at interesting regions, e.g., the vertical edge at the center of the patch (see Fig. 5).

Fig. 5 presents the true reconstruction error for a varying number of samples. For both adaptive and nonadaptive sampling schemes, the reconstruction is done using the linear estimator (3).

Fig. 6 presents the ratio between the reconstruction errors of the adaptive and nonadaptive schemes. We took 256 small patches (16 × 16 pixels each) and compared the average reconstruction errors. This experiment was repeated for several classes of masks, i.e., unrestricted masks (PCA) and overcomplete families of Daubechies wavelets [4] of order three (DB3), of order two (DB2), and of order one (Haar wavelets).

Using the adaptive schemes reduces the reconstruction error by between 5% and 10% compared with the nonadaptive schemes (which use the same family of masks). Using the optimal representation (i.e., the PCA basis), the error is reduced by 5%, whereas using a less optimal basis (the Haar basis), the error is reduced even further, as the corresponding nonadaptive scheme is known to be suboptimal.

For the first few coefficients, the adaptive scheme does not have much information to work with. Therefore, until more samples are gathered, the relative gain over the nonadaptive schemes is erratic. After sampling some more coefficients, the benefits of the adaptivity become more apparent.

B. Correlation Models for Blind Wavelet Sampling

In our experiments, we used a data set of 20 images. All the images were converted to grayscale and rescaled to 256 × 256 pixels. The images are shown in Fig. 7. The first two subsets are "natural" images, the third set is taken from a computed-tomography brain scan, and the fourth set is a collection of animations.
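The earlier remark that the DCT approximates the PCA of stationary signals can be checked numerically. The first-order Markov (exponential) covariance, the signal length, and ρ below are illustrative choices for the check.

```python
import numpy as np

n, rho = 32, 0.95
# Exponential (first-order Markov) covariance: C[i, j] = rho^|i-j|
idx = np.arange(n)
C = rho ** np.abs(idx[:, None] - idx[None, :])

# PCA: eigenvectors of C, reordered by decreasing eigenvalue.
w, V = np.linalg.eigh(C)
V = V[:, ::-1]

# Orthonormal DCT-II basis vectors (row k = frequency-k vector).
k = idx[:, None]
dct = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * idx + 1) / (2 * n))
dct[0] = np.sqrt(1.0 / n)

# Alignment between the k-th principal component and the k-th DCT vector.
align = [abs(float(V[:, j] @ dct[j])) for j in range(4)]
```

For strongly correlated signals (ρ close to 1), the leading principal components are nearly indistinguishable from the low-frequency DCT vectors, which is why the nonadaptive masks in Fig. 3 resemble the DCT basis.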
TABLE I. Correlations between DB2 coefficients.

TABLE II. Correlations between DB3 coefficients.

Fig. 7. Image sets used in our experiments.

Two examples of experimental correlation models are presented in Tables I and II. Those models were studied for Daubechies wavelets [4] of second and third orders. A four-level wavelet decomposition was carried out over the image data set, and the correlation coefficients between different kinds of wavelet coefficients were estimated according to the maximum-likelihood principle.

Observing Tables I and II, we see that Daubechies wavelets of second and third orders exhibit similar behavior. This behavior was also experimentally found for other wavelet families. As the order of the wavelet family increases, most of the correlation coefficients decrease. We also see that the images are not scale invariant, and as the decomposition level decreases, the correlation coefficients between related coefficients increase.

Our experiments show the statistical characteristics to be almost indifferent to the image classes. We tested the benefits of using a specific (rather than the generic) model for each class of images and found that it reduces the error by a negligible 1%. It appears that the correlation model characterizes the statistical behavior of the wavelet family and not of the different image classes.

C. Blind Wavelet Sampling Results

Having obtained the correlation model for the wavelet family, we now present some results of the blind sampling scheme. Taking an image, we decompose it using third-level wavelet decomposition with DB2. We compare the adaptive order, obtained by the blind adaptive sampling scheme, to a nonadaptive raster order and to an optimal order (which is not blind).

The nonadaptive raster-order scheme samples the coefficients according to their block order, from the highest level to lower ones. The optimal-order scheme assumes full knowledge of the coefficients and samples them according to their energy.

Figs. 8–10 present partial reconstructions of the cameraman image, where only some of the wavelet coefficients are used for the reconstruction. There are three columns in the figures, i.e., the left, where the selected coefficients are marked; the middle, where the reconstructed images, using the selected coefficients, are presented; and the right, where the error images are shown. The reconstruction error is also shown.

All three schemes start with the LL block and continue to sample the remaining wavelet coefficients in different orders. Figs. 8–10 correspond to raster, adaptive, and optimal orders, respectively.

In Fig. 11, we can see the reconstruction errors for the raster, adaptive, and optimal orders, averaged over all 20 images. Note that the same reconstruction error may be achieved by the adaptive scheme using about half of the samples required by the nonadaptive scheme.

The main advantage of blind sampling schemes is that the sampling results (the actual coefficients) need to be stored but not their locations. Nonblind adaptive schemes, such as choosing the largest coefficients of the decomposition (the optimal-order scheme), require an additional piece of information, i.e., the exact location of each coefficient, to be stored alongside its value.

From a compression point of view, we have to estimate the number of bits required for storing the additional information needed for the reconstruction. As a rough comparison, we assume that the location (index) of a coefficient takes about log₂ n bits, where n is the number of coefficients.
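The rough storage comparison can be made concrete. The 16 384-coefficient count matches the experiments of Fig. 11, while the 8-bit figures for an entropy-coded index and for a coefficient value are the rough working assumptions used in this comparison, not measured quantities.

```python
import math

n_coeffs = 16_384                        # coefficients used in Fig. 11
raw_index_bits = math.log2(n_coeffs)     # 14 bits per stored location
entropy_index_bits = 8                   # optimistic entropy-coded index
value_bits = 8                           # rough per-coefficient value budget

blind_cost = value_bits                          # blind scheme: value only
optimal_cost = value_bits + entropy_index_bits   # optimal order: value + index
ratio = optimal_cost / blind_cost                # ~2x bits per coefficient
```

Under these assumptions, each optimal-order coefficient costs about as much as two blind-scheme coefficients, which is what the dashed curve in Fig. 11 accounts for.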
Fig. 8. Reconstruction of the cameraman image using (a) the first 1024 coefficients of the LL block; (b) 4096 coefficients; and (c) 16 384 coefficients, taken according to the raster order.

Fig. 9. Reconstruction of the cameraman image using (a) 4096 coefficients and (b) 16 384 coefficients, taken according to the blind sampling order.

Fig. 10. Reconstruction of the cameraman image using (a) 4096 coefficients and (b) 16 384 coefficients, taken according to the optimal order.

Fig. 11. Reconstruction error averaged over 20 images, using up to 16 384 wavelet coefficients, taken according to the raster, adaptive, and optimal orders. (Dashed line) Compression-aware comparison of the optimal order, taking into account the space required to store the coefficient indexes of the optimal order.

Optimistic assumptions on entropy encoders reduce the size required for storing the coefficient indexes to about 8 bits, disregarding quantization. Using this rough estimation of storage requirements, each coefficient of the optimal order is equivalent, in a bit-storage sense, to about two coefficients of the blind adaptive scheme. In Fig. 11, the dashed line marks the reconstruction error of the optimal order taking these storage considerations into account.

VI. CONCLUSION

In this paper, we have presented two novel blind adaptive sampling schemes. Our statistical pursuit scheme, presented in Section III, maintains a second-order statistical model of the image, which is updated as information is gathered during the sampling process. Experimental results have shown that the reconstruction error is smaller by between 5% and 10%, as compared with regular nonadaptive sampling schemes, depending on the class of basis functions. Due to its complexity, however, this scheme is most suitable for small patches and a small number of coefficients.

Our blind wavelet sampling scheme, presented in Section IV, is more suitable for complete images. It uses the statistical correlation between the magnitudes of wavelet coefficients. Naturally, the optimal selection of the coefficients with the highest magnitudes produces superior results, but such unblind methods require storage of the coefficient indexes, whereas the blind scheme only stores the coefficients. Taking the additional bit space into account, the blind wavelet sampling scheme produces results almost as good as the optimal selection of the masks.

Some open problems are left for future research, such as the application of the statistical-pursuit scheme to image compression. Including quantization in the scheme and introducing an appropriate entropy encoder can turn the sampling scheme into a compression scheme. Replacing DCT or DWT sampling with their statistical-pursuit counterparts reduces the error for each patch by 5%–10%. However, some of the gain is expected to be lost by the entropy encoder.

The blind wavelet sampling scheme makes use of linear predictors. However, it is known that the distribution of wavelet coefficients is not Gaussian [3], [18]. Therefore, higher order predictors for modeling the relationships between the coefficients may yield better predictors and a better blind adaptive sampling scheme.
APPENDIX A

Let Φₖ₊₁ = [φ₁, …, φₖ₊₁] be a linearly independent set of masks, where φ = φₖ₊₁ is the new mask. According to (5), the MSE of the whole image estimate after sampling the (k+1)-th mask is

MSEₖ₊₁ = trace(C − CΦₖ₊₁(Φₖ₊₁ᵀCΦₖ₊₁)⁻¹Φₖ₊₁ᵀC).

We denote Φ = Φₖ as the matrix of the previous k masks, excluding the new mask φ. Using matrix notation,

Φₖ₊₁ᵀCΦₖ₊₁ = [ A  b ; bᵀ  c ],  A = ΦᵀCΦ,  b = ΦᵀCφ,  c = φᵀCφ.

A matrix blockwise inversion, separating the elements influenced by the new mask φ from the elements that are independent of it, yields

MSEₖ₊₁ = MSEₖ − ‖Cφ − CΦA⁻¹b‖² / (c − bᵀA⁻¹b).

Both the numerator and the denominator have similar quadratic forms. Therefore, let us define

M = C − CΦA⁻¹ΦᵀC.

Plugging M back into the numerator and the denominator yields a more precise expression for the expected reduction of the MSE at the (k+1)-th iteration, i.e.,

ΔMSE(φ) = MSEₖ − MSEₖ₊₁ = ‖Mφ‖² / (φᵀMφ).

Surprisingly, M is exactly Ĉₖ of (4), i.e., the covariance of the estimated image based on the first k measurements, which proves (6). ∎

APPENDIX B

Proposition: An eigenvector corresponding to the largest eigenvalue of Ĉ is an optimal mask.

Proof: Let Ĉ = Σᵢ λᵢvᵢvᵢᵀ be the eigendecomposition of Ĉ, where {vᵢ} are the orthonormal eigenvectors of Ĉ, sorted in descending order of their corresponding eigenvalues λ₁ ≥ λ₂ ≥ … ≥ 0.

Let φ be an arbitrary mask. If Ĉφ = 0, then ΔMSE(φ) = 0. Otherwise, let φ be represented in the eigenvector basis as φ = Σᵢ aᵢvᵢ. According to (6),

ΔMSE(φ) = ‖Ĉφ‖² / (φᵀĈφ) = (Σᵢ aᵢ²λᵢ²) / (Σᵢ aᵢ²λᵢ).

As λ₁ is the largest eigenvalue of Ĉ, the following holds:

Σᵢ aᵢ²λᵢ² ≤ λ₁ Σᵢ aᵢ²λᵢ

and hence ΔMSE(φ) ≤ λ₁. Since ΔMSE(v₁) = λ₁²/λ₁ = λ₁, we see that, indeed, v₁ maximizes ΔMSE and is an optimal mask. ∎

REFERENCES

[1] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, no. 1, pp. 90–93, Jan. 1974.
[2] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, 2nd ed. New York: Springer-Verlag, 2000.
[3] R. W. Buccigrossi and E. P. Simoncelli, "Image compression via joint statistical characterization in the wavelet domain," IEEE Trans. Image Process., vol. 8, no. 12, pp. 1688–1701, Dec. 1999.
[4] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[5] Z. Devir and M. Lindenbaum, "Adaptive range sampling using a stochastic model," J. Comput. Inf. Sci. Eng., vol. 7, no. 1, pp. 20–25, Mar. 2007.
[6] Y. Eldar, M. Lindenbaum, M. Porat, and Y. Zeevi, "The farthest point strategy for progressive image sampling," IEEE Trans. Image Process., vol. 6, no. 9, pp. 1305–1315, Sep. 1997.
[7] H. Hotelling, "Analysis of a complex of statistical variables into principal components," J. Educ. Psychol., vol. 24, no. 6, pp. 417–441, Sep. 1933.
[8] H. Hotelling, "Analysis of a complex of statistical variables into principal components," J. Educ. Psychol., vol. 24, no. 7, pp. 498–520, Oct. 1933.
[9] J. Huang and D. Mumford, "Statistics of natural images and models," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Fort Collins, CO, 1999, pp. 541–547.
[10] A. K. Jain, Fundamentals of Digital Image Processing. Upper Saddle River, NJ: Prentice-Hall, 1989.
[11] S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic, 1999.
[12] S. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397–3415, Dec. 1993.
[13] A. N. Netravali and B. G. Haskell, Digital Pictures. New York: Plenum Press, 1995.
[14] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," in Proc. 27th Annu. Asilomar Conf. Signals, Syst., Comput., 1993, pp. 40–44.
[15] A. Papoulis, Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 2002.
[16] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3445–3462, Dec. 1993.
[17] N. Sochen, R. Kimmel, and R. Malladi, "A general framework for low level vision," IEEE Trans. Image Process., vol. 7, no. 3, pp. 310–318, Mar. 1998.
[18] A. Srivastava, A. B. Lee, E. P. Simoncelli, and S. C. Zhu, "On advances in statistical modeling of natural images," J. Math. Imag. Vis., vol. 18, no. 1, pp. 17–33, Jan. 2003.
[19] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Proc. IEEE Int. Conf. Comput. Vis., 1998, pp. 839–846.
[20] J. S. Vitter, "Design and analysis of dynamic Huffman codes," J. ACM, vol. 34, no. 4, pp. 825–845, Oct. 1987.
[21] B. Zeng and J. Fu, "Directional discrete cosine transforms for image coding," in Proc. IEEE Int. Conf. Multimedia Expo, 2006, pp. 721–724.

Zvi Devir received the B.A. degrees in mathematics and in computer science and the M.Sc. degree in computer science from the Technion–Israel Institute of Technology, Haifa, Israel, in 2000, 2000, and 2007, respectively.
From 2006 to 2010, he was with Medic Vision Imaging Solutions, Haifa, Israel, a company he cofounded, where he was the Chief Scientific Officer. Previously, he was with Intel, Haifa, Israel, mainly working on computer graphics and mathematical optimizations. He is currently with IARD Sensing Solutions, Yagur, Israel, focusing on advanced video processing and spectral imaging. His research interests include video and image processing, mainly algebraic representations and differential methods for images.

Michael Lindenbaum received the B.Sc., M.Sc., and D.Sc. degrees from the Department of Electrical Engineering, Technion–Israel Institute of Technology, Haifa, Israel, in 1978, 1987, and 1990, respectively.
From 1978 to 1985, he served in the IDF in research and development positions. He did his postdoc with the Nippon Telegraph and Telephone Corporation Basic Research Laboratories, Tokyo, Japan. Since 1991, he has been with the Department of Computer Science, Technion. He was also a consultant with Hewlett-Packard Laboratories Israel and spent sabbaticals at the NEC Research Institute, Princeton, NJ, in 2001 and at Telecom ParisTech in 2011. He also spent shorter research periods at the Advanced Telecommunications Research Institute, Kyoto, Japan, and the National Institute of Informatics, Tokyo. He has worked in digital geometry, computational robotics, learning, and various aspects of computer vision and image processing. Currently, his main research interest is computer vision, particularly statistical analysis of object recognition and grouping processes.
Prof. Lindenbaum has served on several committees of computer vision conferences and is currently an Associate Editor of the IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE.