IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012

Fast wavelet based image characterization for highly adaptive image retrieval.bak



Fast Wavelet-Based Image Characterization for Highly Adaptive Image Retrieval

Gwénolé Quellec, Mathieu Lamard, Guy Cazuguel, Member, IEEE, Béatrice Cochener, and Christian Roux, Fellow, IEEE

Abstract: Adaptive wavelet-based image characterizations have been proposed in previous works for content-based image retrieval (CBIR) applications. In these applications, the same wavelet basis was used to characterize each query image: This wavelet basis was tuned to maximize the retrieval performance in a training data set. We take it one step further in this paper: A different wavelet basis is used to characterize each query image. A regression function, which is tuned to maximize the retrieval performance in the training data set, is used to estimate the best wavelet filter, i.e., in terms of expected retrieval performance, for each query image. A simple image characterization, which is based on the standardized moments of the wavelet coefficient distributions, is presented. An algorithm is proposed to compute this image characterization almost instantly for every possible separable or nonseparable wavelet filter. Therefore, using a different wavelet basis for each query image does not considerably increase computation times. On the other hand, significant retrieval performance increases were obtained in a medical image data set, a texture data set, a face recognition data set, and an object picture data set. This additional flexibility in wavelet adaptation paves the way to relevance feedback on image characterization itself and not simply on the way image characterizations are combined.

Index Terms: Content-based image retrieval (CBIR), relevance feedback, wavelet adaptation, wavelet transform.

Manuscript received June 25, 2010; revised July 11, 2011 and November 21, 2011; accepted December 11, 2011. Date of publication December 21, 2011; date of current version March 21, 2012. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Hsueh-Ming Hang.
G. Quellec and M. Lamard are with the LaTIM Inserm Research Unit 1101, 29200 Brest, France (e-mail: ).
G. Cazuguel and C. Roux are with LaTIM Inserm Research Unit 1101, 29200 Brest, France, and also with the Department of Image et Traitement de l’Information, Institut Telecom, Telecom Bretagne, Université Européenne de Bretagne, 29200 Brest, France.
B. Cochener is with Centre Hospitalier Universitaire Brest, Service d’Ophtalmologie, 29200 Brest, France.
Color versions of one or more of the figures in this paper are available online at
Digital Object Identifier 10.1109/TIP.2011.2180915
1057-7149/$26.00 © 2011 IEEE

I. INTRODUCTION

Over the last decades, the wavelet transform has become a major image characterization tool. One advantage of the wavelet transform over alternative methods is the ability to tune the underlying wavelet basis to users’ needs, e.g., to optimize compression, classification, or retrieval performances. Originally, wavelet adaptation was mostly used to approximate a reference signal up to a desired scale [1], [2]. With the advent of the lifting scheme, wavelet adaptation has become more widespread, both for separable [3]–[5] and nonseparable [6]–[8] wavelet transforms. Recently, we have introduced two adaptive wavelet-based image characterizations [5], [8], which have been successfully applied to different problems [9], [10].

One possible application of wavelet adaptation is content-based image retrieval (CBIR), which is an increasingly popular discipline in computer science [11]–[14]. The goal of CBIR is to automatically select, i.e., in a reference data set, images that resemble a query image. Image characterizations are used to catch similarities between images. Modern retrieval systems usually rely on machine learning to bridge the semantic gap between low-level image characterizations and users’ perception of relevance (either through relevance feedback or through offline training) [11]. A popular example of CBIR framework is the bag-of-features (BoF) model [15]. Two wavelet-based CBIR frameworks were proposed in [5] and [8]. The main advantage of wavelet-based CBIR frameworks lies in the ability to tune the wavelet basis and, therefore, image characterizations in order to optimize the retrieval performance (e.g., the precision and the recall). In these frameworks, the same wavelet basis was used to characterize each query image and each reference image; the basis was tuned to maximize the retrieval performance in a manually annotated training data set. We take it one step further in this paper: A different wavelet basis is used to characterize each query image. A regression function, which is tuned to maximize the retrieval performance in the training data set, is used to estimate the best wavelet filter, i.e., in terms of expected retrieval performance, for each query image. Note that comparing image characterizations obtained with different wavelet bases would be inconsistent. As a consequence, using a new wavelet basis for each query image implies characterizing again each reference image, using this new wavelet basis.

With previously proposed approaches [5], [8], characterizing again each reference image would be prohibitive for time reasons. An algorithmic breakthrough was needed. A simple image characterization is presented. Similar to previous approaches, images are characterized by the distribution of the wavelet coefficients at different scales and along different directions [5], [8]. However, the proposed image characterization is lighter: It is simply based on standardized moments of the wavelet coefficient distribution. In order to easily characterize again each reference image, we propose to compute a characterization map as follows: 1) exact image characterizations are computed for a limited number of wavelet filters of given support and 2) approximate characterizations can be computed almost instantaneously for every possible wavelet filter of equal support. Such characterization maps can be computed indifferently for separable or nonseparable wavelet filters. Due to these characterization maps, characterizing an image again using a different wavelet filter does not involve reprocessing
the image. In conclusion, using a different wavelet filter for each query image in a CBIR application is now possible.

The remainder of this paper is organized as follows. Section II describes the proposed image characterization given a wavelet filter. Section III addresses the design of characterization maps. Applications to image retrieval are presented in Section IV. The proposed framework is applied to four image data sets in Section V. We end up with a discussion in Section VI.

II. CHARACTERIZING ONE IMAGE WITH A GIVEN WAVELET FILTER

Let be an image of size pixels. Let be a wavelet filter of support . By definition of a wavelet filter, the following relation holds for [16]:

(1)

By definition, the detail coefficients of the wavelet transform of , at any location and any analysis scale , are given by

(2)

According to (1), (2) can be rewritten as follows:

(3)

We propose to characterize the distribution of detail coefficients in with standardized moments. In the particular case of texture images, Wouwer et al. [17] have shown that the distribution of detail coefficients at any analysis scale can be modeled meaningfully by an unskewed zero-mean generalized Gaussian function; we extended this observation to a more general class of images in previous works [5]. Generalized Gaussian functions have two parameters: and , i.e., the scale and shape parameters, respectively. The first four standardized moments and are related to and through the following relations [18]:

(4)

(5)

(6)

where is the Gamma special function. In particular, and are closely related to and , respectively. As a consequence, we propose to characterize the distribution of the detail coefficients at any analysis scale by its standard deviation and its kurtosis , i.e., after subtracting the mean. For short, and are noted and , respectively, when , , and are not ambiguous. and are given by

(7)

(8)

where is a polynomial function of the wavelet transform of [and consequently of the wavelet filter coefficients; see (3)] as follows:

(9)

where and denote the sets and , respectively. According to (3) and (7)–(9), the complexity of image characterization is in .

III. BUILDING A CHARACTERIZATION MAP

A. Wavelet Filter Space

Let denote the space of all wavelet filters of support . Let denote its dimension. Because the central coefficient of each filter is constrained by (1), (i.e., ).

B. Characterization Map and Characterization Derivative Maps

For wavelet adaptation purposes, we propose to compute the characterization of for a given analysis scale and each filter . The resulting set of characterizations is referred to as the characterization map of (given , , and ). For wavelet adaptation purposes, it is also useful to compute the first-order derivatives of each characterization. The resulting sets of characterization derivatives are referred to as the characterization derivative maps of .

Since characterizing an image with a given wavelet filter can be time-consuming (the complexity is in ; see Section II), we propose to approximate the characterization map and the characterization derivative maps, as described in Sections III-C and III-D. In order to speed up computations further, the invariances of the characterization map and of the characterization derivative maps are studied in Section III-E. Some practical considerations for building and visualizing characterization maps are highlighted in Section III-F.

C. Approximate Characterization Map

We propose to approximate the characterization map and the characterization derivative maps using Taylor expansions. The exact characterization and characterization derivatives are computed for a finite set of wavelet filters, which are referred to as key wavelet filters, and the remainder of each map is approximated using Taylor expansions. The set of key wavelet filters is denoted by . Let be a key wavelet filter. The Taylor expansion of function (either , , , or ) in the neighborhood of is given by the following formal relation [19]:

(10)
where denotes the order of the Taylor expansion and denotes the L2-norm

(11)

Note that is multidimensional: It is a function of the wavelet filter coefficients. As a consequence, the second term of (10) (corresponding to ) is given by

(12)

where is the gradient of . Its third term (corresponding to ) is given by

(13)

where is the Hessian of .

According to Taylor’s theorem [19], (10) holds if is times differentiable at . We can check that and (and therefore and ) are infinitely differentiable when . In particular, it should be noted that and are functions of terms composed of polynomials, fractions, and square roots [see (3) and (7)–(9)]. The standard deviation equals 0 when image is constant everywhere or when ; cannot be strictly negative [see and (9)]. In the trivial case where is constant everywhere, characterization maps are also constant everywhere; therefore we do not need to compute them. Images are assumed nonconstant in the following sections.

D. Derivatives of the Proposed Characterization With Respect to Filter Coefficients

In order to compute the Taylor expansions above, the first few order derivatives of the proposed image characterization, i.e., the first few order derivatives of (7) and (8), with respect to wavelet filter coefficients , need to be computed for each key wavelet filter . Let and denote the following polynomial functions of the wavelet transform of :

(14)

(15)

The first-order derivatives of the proposed image characterization are given by

(16)

(17)

Its second-order derivatives are given by

(18)

(19)

Its third-order derivatives, and some computation details, are provided in Appendix A. Higher order derivatives were not used in this paper.

E. Invariances of the Characterization Map and of its Derivatives

In order to reduce the cardinality of , we propose to find invariances in the characterization map and its derivatives.

First, if a wavelet filter is multiplied by a positive real number , then according to (3), (9), and (14), and are multiplied by , , , and . Consequently, if is multiplied by , then is multiplied by , and is unchanged [see (7) and (8)]. As for the derivatives, if is multiplied by , then is unchanged, and is divided by [see (16) and (17)]. This invariance analysis implies that we only need to compute the characterization map and its derivatives for wavelet filters on the unit sphere .

Second, if is multiplied by −1, then and are multiplied by , , , and [see (3), (9), and (14)]. Consequently, if is multiplied by −1, then and are unchanged [see (7) and (8)], and and are multiplied by −1 [see (16) and (17)]. This invariance analysis implies that we only need to compute the characterization map and its derivatives on one half of the unit sphere .

F. Practical Considerations for Building the Characterization Map

The proposed procedure for building the characterization map is illustrated in Fig. 1 for and .

The first step in building the characterization map is to extract a set of key wavelet filters. The exact characterizations and characterization derivatives associated with these key filters are computed, as described in Sections II and III-D, respectively. The elements of should be selected approximately uniformly in one half of the unit sphere (see Section III-E). A good set can be obtained as described hereafter [see also Fig. 1(a)]. For each dimension of the wavelet space, a set of key wavelet filters, i.e., , is generated as follows:
• the th unconstrained coefficient of each filter is set to 1;
• the th unconstrained coefficient of each filter , with , is in [see Fig. 1(a)];
• each filter is divided by .
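
The characterization of Section II and the invariances of Section III-E can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not the implementation used in this paper: it filters image rows with a 1-D filter at a single scale using NumPy, it uses a hypothetical helper named `characterize`, and it takes a zero-sum filter as a stand-in for the constraint that fixes the central coefficient. It computes the standard deviation and kurtosis of the filtered coefficients, then repeats the computation for a positively scaled filter and for the negated filter.

```python
import numpy as np

def characterize(image, w):
    """Return (std, kurtosis) of the coefficients obtained by filtering
    each image row with the 1-D filter w (hypothetical helper)."""
    rows = [np.convolve(row, w, mode="valid") for row in image]
    d = np.concatenate(rows)
    d = d - d.mean()                      # subtract the mean first
    sigma = np.sqrt(np.mean(d ** 2))      # standard deviation
    kappa = np.mean(d ** 4) / sigma ** 4  # kurtosis (4th standardized moment)
    return sigma, kappa

rng = np.random.default_rng(0)
image = rng.random((32, 32))              # synthetic, nonconstant image

# Two free coefficients; the central one is chosen so that the filter sums
# to zero (an assumed stand-in for the constraint on the central coefficient).
a, b = 0.3, 0.5
w = np.array([a, -(a + b), b])

sigma, kappa = characterize(image, w)
sigma_scaled, kappa_scaled = characterize(image, 2.5 * w)  # positive scaling
sigma_neg, kappa_neg = characterize(image, -w)             # sign flip
```

With this setup, `sigma_scaled` equals 2.5 times `sigma` while `kappa_scaled` equals `kappa`, and the sign flip leaves both values unchanged, which is why it is enough to sample filters on one half of the unit sphere.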
Fig. 1. Characterization map building. In this example (K = 0, L = 1, and therefore D = 2), wavelet filters have three coefficients, one of which is constrained by (1). In each figure, the x axis represents the value of the first unconstrained wavelet filter coefficient, and the y axis represents the value of the second unconstrained coefficient. The value of the constrained wavelet coefficient is not represented. In this example, the number of key wavelet filters is controlled by n = 6, and the portion of the map obtained by Taylor expansions is the unit circle restricted to the [−π/4 − π/(4n), 3π/4 − π/(4n)) angle interval. In (a) and (b), the two sets of key wavelet filters described in Section III-F are indicated by circles of two different colors. In (b) and (c), the solid arc indicates the portion of the map obtained by Taylor expansions. In (c), the solid straight line indicates the portion of the map that can be obtained, through invariance analysis, from the characterization estimated at the location of the small circle. (a) Exact computation (cf. Section II). (b) Approximate computation (cf. Section III-C). (c) Invariances (cf. Section III-E).

 is the union of these sets: Its cardinality is therefore . If , then the key wavelet filters are selected exactly uniformly in one half of the unit sphere.

The second step is to approximate the characterizations and characterization derivatives in the remainder of the half unit sphere [see Fig. 1(b)]. For each filter in this set, image characterizations are approximated by Taylor expansions; the closest key wavelet filter plays the role of (see Section III-C). Taylor expansions can be used in this step because ; therefore, (see Section III-C).

The last step is to compute the characterizations and characterization derivatives in the remainder of the set [see Fig. 1(c)]. Let be the L2-norm of filter in this set. Let be a variable indicating whether belongs to the half unit sphere above or not . The characterizations and characterization derivatives are obtained from according to the invariance analysis of Section III-E.

The proposed procedure is applied, in the particular case , to a real-world image in Fig. 2. Characterization map building is assessed in Section V-B.

The proposed wavelet-based image characterization is applied to image retrieval in the following section.

Fig. 2. Example of characterization map, in (b) and (e), and of characterization derivative maps, in (c), (d), (f), and (g), obtained for image (a) at the analysis scale s = 2. The scenario described in Fig. 1 was reused (K = 0 and L = 1); therefore wavelet filters have three coefficients. In each map, the intensity of pixel (x, y), where x and y are the two unconstrained filter coefficients, is proportional to the value of f(w), where f is either the characterization shown in (b), the characterization shown in (e), or one of their derivatives. In (b) and (e), black means 0. In (c), (d), (f), and (g), medium gray means 0. (a) Retinal image. (b), (e) Characterization maps. (c), (d), (f), (g) Characterization derivative maps.

IV. IMAGE RETRIEVAL

Let be a query image and be a data set of reference images. In this section, characterization maps are used to rank images in increasing order of distance to . Then, the first images, which are noted , are retrieved. In this paper, the goal is to retrieve a small set of highly relevant images. Therefore, retrieval systems are tuned in a training set in order to maximize the precision at , i.e., the fraction of images, among , that belong to the same category as . If, on the contrary, one would like to retrieve all potentially relevant images in the reference data set, then the recall at a large should be maximized instead [11]. Achieving high precision is challenging when the number of categories is high or when the semantic gap between category assignments and image characteristics (textures, colors, etc.) is wide.

We propose to characterize each image by the following signature . In signature , features and are extracted
at analysis scale , using a wavelet filter of support . Based on these signatures, the distance between and another image is defined as follows:

(20)

where denotes the L2-distance between the intensity histogram of and that of [5]: is used to compare the low-frequency component of and that of . Each component in the distance measure is weighted by a real number: or , where , , and .

Several parameters need to be tuned in (20): , , and , , , and . These parameters are referred to as the distance parameters. They are tuned in a training image data set to maximize the average precision at . Note that the same data set can be used as the reference data set and as the training data set . During the training phase, , . After the training phase, , .

Two image retrieval scenarios are considered: 1) a single set of distance parameters is used regardless of the query image (i.e., adaptive image retrieval; see Section IV-A), or 2) a different set of distance parameters is used for each query image (i.e., highly adaptive image retrieval; see Section IV-B). Note that adaptive image retrieval simply is a faster version of previously proposed approaches [5], [8]. From a user perspective, adaptive and highly adaptive image retrieval operate similarly to these previous approaches; the only differences are that adaptive image retrieval is much faster and that highly adaptive image retrieval is both much faster and potentially more precise. The training phase is described hereafter for each scenario. Section IV-C explains how the system processes a query.

A. Adaptive Image Retrieval: Training Procedure

In adaptive image retrieval, distance is tuned, by a gradient descent, in order to increase the average precision at among all query images . Each parameter is modified, in turn, until the average precision at in converges.

Let denote the parameter currently being modified (either , , or ). Optimizing the average precision at is not straightforward: Its derivative, with respect to , is either zero or undefined everywhere on . We got around this difficulty as explained hereafter. Let be the image minimizing among images from a different category than . Let be the image minimizing among images from the same category as and such that

(21)

If (21) can be inverted (i.e., ) for one or several images , then the average precision at is expected to increase. For that purpose, parameter is modified as follows:

(22)

with . Equation (22) is iterated until converges (note that and are updated at each iteration). Whenever is a wavelet filter coefficient, the characterization derivative maps (see Fig. 2) are used in the gradient descent. In that case, (22) follows from:

(23)

B. Highly Adaptive Image Retrieval: Training Procedure

In highly adaptive image retrieval, a different set of distance parameters is used for each query image . Precisely, distance parameters (wavelet filters and weights) are allowed to vary continuously in signature space. A continuous regression function is used to map , i.e., the initial signature of (obtained by an initial set of filters ), to new distance parameters for : a new set of weights and a new set of filters . The initial set of filters is obtained by the training procedure of the adaptive image retrieval (see Section IV-A). Once new distance parameters are obtained for , , which is the final signature of , is computed using . For consistency, the signature of each reference image also needs to be computed using .

The continuity property implies that two images with similar initial signatures are mapped to similar distance parameters. Let denote one distance parameter (either , , or ). The following continuity equation holds for :

(24)

The regression functions are trained on . For each image , the distance parameters for are tuned to maximize the precision at for while respecting the continuity constraint. Let denote one distance parameter (either , , or
). In order to respect the continuity constraint, (22) is replaced by the following equation:

(25)

where is a positive real number. One first advantage of the continuity constraint is that, with an appropriate value for , we can prevent the system from overfitting the training data. A second advantage is that, after training, we can easily define the regression function for each distance parameter through multivariate interpolation [20]. For query images , is given by the following equation:

(26)

where

(27)

C. Processing a Query: Scenario of Operation

Adaptive Image Retrieval: First, the query image is decomposed using the optimal wavelet filters obtained during the training phase, and the signature is computed. Second, reference images are ranked in increasing order of distance to , using the optimal distance weights obtained during the training phase. Third, the first images are retrieved.

Highly Adaptive Image Retrieval: First, the query image is decomposed using the initial wavelet filters obtained during the training phase, and the initial signature is computed. Second, the final distance parameters (wavelet filters and weights) are obtained through multivariate interpolation [see (26)]. Third, the final signature of and of each reference image are computed using the final set of wavelet filters . Fourth, reference images are ranked in increasing order of distance to using the final distance weights. Fifth, the first images are retrieved.

D. Characterization Maps and Time Complexity

During Training: In both adaptive and highly adaptive image retrieval, the signatures of each image and of each image are evaluated for an arbitrarily large number of wavelet filters. Computing characterization maps for each image is therefore very useful. However, there is no need to compute the entire characterization maps all at once: Whenever an image characterization needs to be evaluated for a new wavelet filter , the closest key wavelet filter is searched for, the image characterization is evaluated for , and it is evaluated for through Taylor expansions and invariance analysis (see Sections III-D and III-E).

After Training (Reference Images): In highly adaptive image retrieval, characterization maps computed for reference images are still very useful to process new query images . Indeed, the signature of each image needs to be evaluated again using the optimal wavelet filters computed for (see Section IV-C), which is very fast if characterization maps are already available for .

After Training (Query Images): There is no need to compute characterization maps for query images . In adaptive image retrieval, we only need to compute , i.e., the signature associated with the optimal set of filters . In highly adaptive image retrieval, we only need to compute , which is the signature associated with the initial set of filters , and , which is the signature associated with the final set of filters (see Section IV-C).

To summarize, characterization maps should be computed for reference/training images but not for test images. As a consequence, the additional flexibility introduced in the proposed framework does not imply a dramatic retrieval time increase. In addition, should the optimal wavelet basis be further fine-tuned (e.g., to address relevance feedback from the user), there is no need to compute the updated signatures from scratch: new image characterizations can be estimated through Taylor expansions (where ) and invariance analysis (see Sections III-D and III-E).

V. APPLICATIONS

After an introduction to the four data sets under study in Section V-A, characterization map building is assessed in Section V-B. The retrieval performance of the proposed retrieval systems is then assessed in Section V-C.

A. Data Sets

Caltech101: This data set was collected in September 2003 at the California Institute of Technology.¹ It consists of 9144 pictures of objects belonging to 101 categories. This is about 40 to 800 images per category; most categories have about 50 images. The size of each image is roughly 300 × 200 pixels. Examples of images from two categories (“water_lilly” and “hedgehog”) are given in Fig. 3.

¹ sets/Caltech101/

MESSIDOR: This data set was collected in three ophthalmology departments in France for research on automated diabetic retinopathy screening.² It consists of 1200 eye fundus color photographs. Images were acquired using a color video 3CCD camera on a Topcon TRC NW6 nonmydriatic retinograph with a 45° field of view. The size of each image is either 1440 × 960, 2240 × 1488, or 2304 × 1536 pixels. In a disease screening context, clinicians classify images into two categories: normal and pathological images. In MESSIDOR, 546 images were marked as normal and 654 as pathological.

Face Database: This data set was collected by AT&T Laboratories Cambridge for research on face recognition.³ It consists of 400 images from 40 categories. Each category consists of ten images from the same person at different times, with different lighting conditions, different facial expressions (open/
closed eyes, smiling/not smiling), and facial details (glasses/no glasses). The size of each image is 92 × 112 pixels.

VisTex: This data set was collected by the Media Laboratory at the Massachusetts Institute of Technology for research on texture recognition.⁴ It consists of categorized texture images representative of real-world conditions. Categories consisting of fewer than five elements (“Clouds,” “Grass,” “Misc,” “WheresWaldo,” and “Wood”) were discarded in this paper. As a consequence, 14 categories, consisting of 152 images altogether, were selected. The size of each image is 512 × 512 pixels.

⁴ /vistex.html

For training purposes, each data set above was divided into five subsets: . Each category in was equally divided into each of these subsets at random. The retrieval performance was assessed by a so-called fivefold cross-validation strategy: for each fold , a training subset and a test subset were defined. During retrieval performance assessment, an image was considered relevant if it belonged to the same category as the query image. Final results are reported on the union of the five test subsets (see Table II).

Fig. 3. Examples of characterization maps computed, at different analysis scales and orientations, for images in the Caltech101 data set. Color characterization maps should be interpreted as follows: The red channel of a color characterization map was computed from the red channel of the input image (the same goes for the green channel and the blue channel). In row labels, V stands for vertical wavelet filter (K = 1, L = 0), H stands for horizontal wavelet filter (K = 0, L = 1), and the subscript indicates the analysis scale (s). Characterization maps all look roughly alike. However, they differ in terms of angular frequency and phase, for instance. In addition, evolution across analysis scales differs from one image to another. Therefore, it is usually possible to select one pixel location (i.e., one filter) in each map to derive a discriminative feature vector. In addition, because a weight vector is used, the most discriminative maps/filters can be emphasized.

B. Assessing Characterization Map Building

Characterization map building was assessed on a data set of 40 images: Ten images were randomly selected from each data set above. For each image, the exact and approximate values for , , , and were computed for equally spaced wavelet filters on one half of the unit sphere. The approximate values were obtained by the procedure described in Section III-F. The experiment was performed for at two analysis scales: and . Only the red channel of color images was used in this experiment. Table I reports the per-image relative approximation error of and . In this paper, the relative approximation error of a vector was defined as the L2-norm of the approximation error divided by the L2-norm of the exact vector.

TABLE I
RELATIVE APPROXIMATION ERROR OF THE CHARACTERIZATION (DERIVATIVE) MAPS

In the remainder of this paper, two setups are studied:
S1 and ;
S2 and );
a Taylor expansion of order is used to approximate and , and a Taylor expansion of order is used to approximate and .

The average computation times for building one characterization map and the associated characterization derivative maps are 0.0476 s for Caltech101, 1.53 s for MESSIDOR, 0.00667 s for the Face Database, and 0.168 s for VisTex (using and ). One core of an Intel Xeon E5520 processor, running at 2.27 GHz, was used in all experiments. Note that computation times, for building one characterization map or one characterization derivative map, are in (see Sections II and III-F).
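
The relative approximation error defined in Section V-B is simple to state in code. The sketch below is a minimal illustration with NumPy; the function name `relative_error` and the sample vectors are made up for the example and are not taken from this paper or from Table I.

```python
import numpy as np

def relative_error(exact, approx):
    """L2-norm of the approximation error divided by the L2-norm
    of the exact vector."""
    exact = np.asarray(exact, dtype=float)
    approx = np.asarray(approx, dtype=float)
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

exact = np.array([3.0, 4.0])         # made-up exact values (L2-norm 5)
approx = np.array([3.0, 4.5])        # made-up approximate values
err = relative_error(exact, approx)  # error norm 0.5 divided by 5
```

Here the error vector has norm 0.5 and the exact vector has norm 5, so `err` is 0.1. Applied to a characterization map, `exact` and `approx` would hold the exact and Taylor-approximated values at the same set of filters.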
C. Retrieval Performance

The following filter supports and analysis scales were used in all experiments (see Fig. 3 for the S1 setup):
[…]

The following values were used for […] [see (22)]: […] if […] is a weight, and […] if […] is a wavelet filter coefficient. […] was found by cross-validation on each training subset [see (25) and (27)].

The precision at five [5], obtained on the test subset of each data set by both experiments, is reported in Table II. It is compared with three other wavelet-based CBIR methods on the same data sets (with the same subsets […]): separable lifting-scheme-based optimization [5], nonseparable lifting-scheme-based optimization [8], and dual-tree complex wavelet transform (CWT) [8], [21]. A comparison with the original BoF model [15], which has been one of the most popular CBIR frameworks, was also included. In our BoF implementation, based on the OpenCV library,5 features were detected and described using the scale-invariant feature transform (SIFT) [22] or speeded-up robust features (SURF) [23]; term-frequency–inverse-document-frequency (TF–IDF) weighting was used to rank reference images. The reader is referred to [5] for additional comparisons with nonwavelet-based CBIR methods. Precision at five versus query time is reported in Fig. 4; every algorithm was implemented in C++, except for dual-tree CWT.6

5 (BOWKMeansTrainer, BOWImgDescriptorExtractor, SiftFeatureDetector/SurfFeatureDetector and SiftDescriptorExtractor/SurfDescriptorExtractor classes)
6 (the software was run with GNU Octave)

TABLE II. PRECISION AT FIVE ON THE TEST SUBSET

Fig. 4. Precision at five versus query time (in logarithmic scale). Query time includes image processing time and search time; training time is not included. Error bars represent confidence intervals: They are seldom visible along the y axis. (a) Caltech101. (b) MESSIDOR. (c) Face Database. (d) VisTex.

VI. DISCUSSION

A novel image characterization framework, which is based on adaptive separable or nonseparable wavelet transforms, has been proposed in this paper. It was presented for 2-D images, but it could be generalized to any type of n-D digital signals. This framework allows fast image characterization, particularly when images need to be characterized several times, using different wavelet filters. This feature is particularly useful for wavelet adaptation.

The proposed framework was applied to wavelet adaptation for CBIR. Two CBIR methods were presented. The first method, i.e., adaptive image retrieval, sped up previously proposed CBIR methods based on wavelet adaptation [5], without decreasing the retrieval performance. The second method, i.e., highly adaptive image retrieval, takes full advantage of the proposed image characterization framework: During each query, a different wavelet filter is used, which means each reference image has to be characterized again with a new wavelet filter. Highly adaptive image retrieval increased the retrieval performance significantly, while maintaining low computation times (see Fig. 4). These improvements were observed in four different image data sets: Caltech101, MESSIDOR, Face Database, and VisTex. A simple gradient descent was used in this paper for wavelet adaptation. However, the proposed image characterizations may be used jointly with more advanced wavelet adaptation techniques.

In all data sets, highly adaptive image retrieval performed significantly better than dual-tree CWT [21], in which wavelet filters of large support are used to analyze images. In highly adaptive image retrieval, several adapted filters of small support are used. The number and the high adaptivity of these filters compensate for the difference in support size.

In all data sets, highly adaptive image retrieval performed significantly better than the original BoF model [15], whatever feature detection and description method was used (SIFT or SURF, see Fig. 4). As expected, this improvement was particularly noticeable in VisTex and MESSIDOR, where texture is the most discriminative feature. One reason why lower precision rates have been achieved on the Caltech101 data set is that the semantic gap is wider: the “Butterfly” category, for instance, contains pictures of real butterflies among flowers, pictures of butterfly-shaped chocolates, stylized butterfly drawings on uniform backgrounds, etc. Another reason is that there are many more categories (101). In line with a major trend in image characterization for CBIR (BoF, multiple-instance learning [24], etc.) [21], we plan to use the proposed framework in future work to characterize regions of interest, detected by SIFT or SURF for instance, instead of characterizing images as a whole.

The proposed framework may also be used for relevance feedback, which is one of the most active research fields for CBIR systems [25]–[28]. In relevance feedback, the user iteratively selects relevant and irrelevant images among retrieved reference images, and the distance measure is updated: Basically, the weight of each image feature is modified. In this context, the wavelet filter could simply be updated after each user feedback: The proposed framework would therefore allow relevance feedback on image features themselves and not only on the way these features are combined, which would be novel. Through experiments on highly adaptive image retrieval, we have shown that adapting the wavelet filter to each query image increases the retrieval performance, without considerably increasing computation times. This observation gives us some insights on potential performance increases through relevance feedback.

We believe that the proposed framework also has great potential for classification and compression. Adapting a CBIR approach for classification purposes is straightforward. As for compression, the characterization maps could be used to find the wavelet filter that should be used to optimally compress the image at each scale […]: The highest intensity in […]’s map indicates the wavelet filter maximizing kurtosis at this scale, indicating the highest SNR (or the highest contrast) [29].

In this paper, two features have been selected to build characterization maps: the standard deviation and the kurtosis. Should additional features be used, differentiability checks would have to be performed to ensure that characterization maps can still be approximated by Taylor expansions (see Section III-C). However, most features are smooth and continuous functions of the data; otherwise, they could not characterize noisy data robustly.

This study has two limitations. First, for simplicity, no low-pass filter was used before subsampling images between two analysis scales; however, this does not seem to affect performance (see Table II: adaptive image retrieval (setup S2) versus separable lifting scheme). Second, although the framework was presented for nonseparable wavelets of arbitrary support, it was only applied to filters of small support (three or five taps) in this paper. A solution is presented in Section IV-D to compute characterization maps progressively; this should facilitate the use of larger filters, which might improve retrieval performance further.

In conclusion, a novel image characterization framework was presented, and its suitability for CBIR was shown. This study paves the way to efficient relevance feedback on image features themselves, which should enable interactive image search with higher performance.
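The retrieval results discussed above are reported as precision at five, where a retrieved image counts as relevant if it belongs to the same category as the query image. A minimal Python sketch of that measure follows, assuming the ranked result list is available as category labels. The function name `precision_at_k`, the division by k, and the sample labels are illustrative assumptions; the exact definition used by the authors is the one given in [5].

```python
def precision_at_k(query_category, ranked_categories, k=5):
    """Fraction of the k top-ranked reference images sharing the query's category."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Keep only the k best-ranked reference images.
    top_k = ranked_categories[:k]
    # An image is relevant if its category matches the query's category.
    relevant = sum(1 for category in top_k if category == query_category)
    return relevant / k

# Hypothetical ranked list of reference-image categories, best match first.
ranked = ["Butterfly", "Faces", "Butterfly", "Butterfly", "Chair", "Butterfly"]
print(precision_at_k("Butterfly", ranked, k=5))
```

Averaging this value over all query images of the test subsets would give one figure per data set and method, comparable in spirit to the entries of Table II.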
APPENDIX A
DERIVATIVES OF THE PROPOSED CHARACTERIZATION: COMPUTATION DETAILS

A. First- and Second-Order Derivatives

Equations (16)–(19) are obtained from lower order derivatives by usual differentiation rules and the following relations:

[…] (28)

[…] (29)

B. Third-Order Derivatives

Let […] denote the following expression of two filter coefficients […] and […] appearing in (19):

[…] (30)

Its derivative with respect to a third filter coefficient […] is given by

[…] (31)

where […] is a polynomial function of the wavelet transform of […] defined as

[…] (32)

The following relation holds:

[…] (33)

Finally, the third-order derivatives of the characterization are given as follows:

[…] (34)

[…] (35)

REFERENCES

[1] A. H. Tewfik, D. Sinha, and P. Jorgensen, “On the optimal choice of a wavelet for signal representation,” IEEE Trans. Inf. Theory, vol. 38, no. 2, pp. 747–765, Mar. 1992.
[2] A. Gupta, S. D. Joshi, and S. Prasad, “A new method of estimating wavelet with desired features from a given signal,” Signal Process., vol. 85, no. 1, pp. 147–161, Jan. 2005.
[3] R. L. Claypoole, R. G. Baraniuk, and R. D. Nowak, “Adaptive wavelet transforms via lifting,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., May 1998, vol. 3, pp. 1513–1516.
[4] G. Piella, B. Pesquet-Popescu, H. J. A. M. Heijmans, and G. Pau, “Combining seminorms in adaptive lifting schemes and applications to image analysis and compression,” J. Math. Imag. Vis., vol. 25, no. 2, pp. 203–226, Sep. 2006.
[5] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener, and C. Roux, “Wavelet optimization for content-based image retrieval in medical databases,” Med. Image Anal., vol. 14, no. 2, pp. 227–241, Apr. 2010.
[6] D. Seršić and M. Vrankić, “Adaptation of a 2-D nonseparable wavelet filter bank with variable number of zero moments,” in Proc. Visualization, Imag., Image Process., Sep. 2002, pp. 257–260.
[7] D. Seršić and M. Vrankić, “Adaptation in the quincunx wavelet filter bank with applications in image denoising,” in Proc. Int. Workshop Spectral Methods Multirate Signal Process., 2004, pp. 245–252.
[8] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener, and C. Roux, “Adaptive nonseparable wavelet transform via lifting and its application to content-based image retrieval,” IEEE Trans. Image Process., vol. 19, no. 1, pp. 25–35, Jan. 2010.
[9] M. D. Abràmoff, J. M. Reinhardt, S. R. Russell, J. C. Folk, V. B. Mahajan, M. Niemeijer, and G. Quellec, “Automated early detection of diabetic retinopathy,” Ophthalmology, vol. 117, no. 6, pp. 1147–1154, Jun. 2010.
[10] G. Quellec, K. Lee, M. Dolejsi, M. K. Garvin, M. D. Abràmoff, and M. Sonka, “Three-dimensional analysis of retinal layer texture: Identification of fluid-filled regions in SD-OCT of the macula,” IEEE Trans. Med. Imag., vol. 26, no. 6, pp. 1321–1330, Jun. 2010.
[11] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
[12] R. Datta, D. Joshi, J. Li, and J. Z. Wang, “Image retrieval: Ideas, influences, and trends of the new age,” ACM Comput. Surv., vol. 40, no. 2, pp. 1–60, Apr. 2008.
[13] H. Müller, N. Michoux, D. Bandon, and A. Geissbuhler, “A review of content-based image retrieval systems in medical applications—Clinical benefits and future directions,” Int. J. Med. Inform., vol. 73, no. 1, pp. 1–23, Feb. 2004.
[14] C. B. Akgül, D. L. Rubin, S. Napel, C. F. Beaulieu, H. Greenspan, and B. Acar, “Content-based image retrieval in radiology: Current status and future directions,” J. Digit Imaging, vol. 24, no. 2, pp. 208–222, Apr. 2010.
[15] J. Sivic and A. Zisserman, “Video google: A text retrieval approach to object matching in videos,” in Proc. Int. Conf. Comput. Vis., 2003, pp. 1470–1477.
[16] S. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1999.
[17] G. van de Wouwer, P. Scheunders, and D. van Dyck, “Statistical texture characterization from discrete wavelet representations,” IEEE Trans. Image Process., vol. 8, no. 4, pp. 592–598, Apr. 1999.
[18] S. Nadarajah, “A generalized normal distribution,” J. Appl. Stat., vol. 32, no. 7, pp. 685–694, Sep. 2005.
[19] M. Hazewinkel, “Taylor’s theorem,” in Encyclopaedia of Mathematics. College Station, TX: Springer-Verlag, 2001.
[20] D. Shepard, “A two-dimensional interpolation function for irregularly-spaced data,” in Proc. ACM Nat. Conf., 1968, pp. 517–524.
[21] I. W. Selesnick, R. G. Baraniuk, and N. G. Kingsbury, “The dual-tree complex wavelet transform,” IEEE Signal Process. Mag., vol. 22, no. 6, pp. 123–151, Nov. 2005.
[22] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proc. Int. Conf. Comput. Vis., 1999, vol. 2, pp. 1150–1157.
[23] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, “Surf: Speeded up robust features,” Comput. Vis. Image Underst., vol. 110, no. 3, pp. 346–359, 2008.
[24] R. Rahmani, S. A. Goldman, H. Zhang, S. R. Cholleti, and J. E. Fritts, “Localized content-based image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 11, pp. 1902–1912, Nov. 2008.
[25] M. Ferecatu and D. Geman, “A statistical framework for image category search from a mental picture,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 6, pp. 1087–1101, Jun. 2009.
[26] E. Cheng, F. Jing, and L. Zhang, “A unified relevance feedback framework for web image retrieval,” IEEE Trans. Image Process., vol. 18, no. 6, pp. 1350–1357, Jun. 2009.
[27] M. R. Azimi-Sadjadi, J. Salazar, and S. Srinivasan, “An adaptable image retrieval system with relevance feedback using kernel machines and selective sampling,” IEEE Trans. Image Process., vol. 18, no. 7, pp. 1645–1659, Jul. 2009.
[28] W. Bian and D. Tao, “Biased discriminant Euclidean embedding for content-based image retrieval,” IEEE Trans. Image Process., vol. 19, no. 2, pp. 545–554, Feb. 2010.
[29] M. Bethge, “Factorial coding of natural images: How effective are linear models in removing higher-order dependencies?,” J. Opt. Soc. Amer. A, Opt. Image Sci. Vis., vol. 23, no. 6, pp. 1253–1268, Jun. 2006.

Gwénolé Quellec was born in Saint-Renan, France, on November 29, 1982. He received the Engineering degree in computer science and applied mathematics from ISIMA, Clermont-Ferrand, France, in 2005, the M.S. degree in image processing from the University of Clermont-Ferrand II, Clermont-Ferrand, in 2005, and the Ph.D. degree in signal processing from Telecom Bretagne, Brest, France, in 2008.
He is currently a Research Associate with the LaTIM Inserm Research Unit 1101, Brest, France. His research interests include retinal image processing, content-based image retrieval, and information fusion for medical applications.

Mathieu Lamard was born in Bordeaux, France, on May 18, 1968. He received the M.S. degree in applied mathematics from the University of Bordeaux, Bordeaux, in 1995, and the Ph.D. degree in signal processing and telecommunication from the University of Rennes, France, in 1999.
In 2000, he joined the LaTIM Inserm Research Unit 1101, Brest, France, where he is currently a Research Associate. His research interests include image processing, 3-D reconstruction, content-based image retrieval, and information fusion for medical applications.

Guy Cazuguel (M’83) received the Engineering degree from the Ecole Nationale de l’Aviation Civile, Toulouse, France, in 1975, the M.S. degree in advanced automatics, and the Ph.D. degree in signal processing and telecommunications from the University of Rennes I, Rennes, France, in 1976 and 1994, respectively.
He is currently a Professor with the Department of Image and Information Processing, Telecom Bretagne, Brest, France. His research interests include image analysis and content-based image retrieval in medical applications, within the LaTIM Inserm Research Unit 1101.

Béatrice Cochener received the M.D. degree in ophthalmology from Nancy University, Nancy, France, in 1992.
She is currently a Professor and the Head with the University Eye Clinic, Brest, France. Together with J. Colin, she developed a very active anterior segment surgery practice. She is a Specialist of refractive techniques in vision correction. She participated in three books on surgical techniques and has published more than 30 peer-reviewed journal articles. Her research interests include imaging research, clinical evaluation, and anterior segment surgery teaching.
Dr. Cochener is the Vice President of the French Implant and Refractive Surgery Society SAFIR, the President of the French Academy of Ophthalmology, and is a member of the Editorial Board of the Journal Français d’Ophtalmologie.

Christian Roux (F’05) received the Aggregation degree in physics from the Ecole Normale Superieure, Cachan, France, in 1978 and the Ph.D. degree in signal processing from the Institut National Polytechnique, Grenoble, France, in 1980.
In 1982, he joined Telecom Bretagne, Brest, France. He is the founding Director of LaTIM Inserm Research Unit 1101, Brest. He is the author of more than 150 papers, holds nine patents, and contributed to the creation of three startup companies developing medical technologies. His current research interests include medical information processing, spatial and functional information modeling, and analysis in medical images.
Dr. Roux was the recipient of the INSERM Basic Research Award in 2006. He has been the founding Co-Chair of the IEEE EMBS International Summer School on Biomedical Imaging since 1994 and served as the President of the IEEE Engineering in Medicine and Biology Society in 2001. He is a member of the Editorial Board of the IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY and of the PROCEEDINGS OF THE IEEE.