1614 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012

...the image. In conclusion, using a different wavelet filter for each query image in a CBIR application is now possible.

The remainder of this paper is organized as follows. Section II describes the proposed image characterization given a wavelet filter. Section III addresses the design of characterization maps. Applications to image retrieval are presented in Section IV. The proposed framework is applied to four image data sets in Section V. We end with a discussion in Section VI.

II. CHARACTERIZING ONE IMAGE WITH A GIVEN WAVELET FILTER

Let I be an image and let w be a wavelet filter of a given support. By definition of a wavelet filter, the following relation holds among its coefficients: (1)

By definition, the detail coefficients of the wavelet transform of I, at any location and any analysis scale s, are given by (2). According to (1), (2) can be rewritten as follows: (3)

We propose to characterize the distribution of the detail coefficients at scale s with standardized moments. In the particular case of texture images, Wouwer et al. have shown that the distribution of detail coefficients at any analysis scale can be modeled meaningfully by an unskewed zero-mean generalized Gaussian function; we extended this observation to a more general class of images in previous works. Generalized Gaussian functions have two parameters, α and β, i.e., the scale and shape parameters, respectively. The first four standardized moments are related to α and β through the following relations:

mean = skewness = 0 (4)
σ² = α² Γ(3/β) / Γ(1/β) (5)
κ = Γ(5/β) Γ(1/β) / Γ(3/β)² (6)

where Γ is the Gamma special function. In particular, σ and κ are closely related to α and β, respectively. As a consequence, we propose to characterize the distribution of the detail coefficients at any analysis scale s by its standard deviation σ (7) and its kurtosis κ (8), computed after subtracting the mean. For short, these features are simply noted σ and κ when I, s, and w are not ambiguous. Each of them can be expressed through a polynomial function of the wavelet transform of I [and consequently of the wavelet filter coefficients; see (3)], as follows: (9)

According to (3) and (7)–(9), the complexity of image characterization is polynomial in the image size and the filter support.

III. BUILDING A CHARACTERIZATION MAP

A. Wavelet Filter Space

Let W denote the space of all wavelet filters of the given support and let D denote its dimension. Because the central coefficient of each filter is constrained by (1), D equals the number of filter coefficients minus one.

B. Characterization Map and Characterization Derivative Maps

For wavelet adaptation purposes, we propose to compute the characterization of I for a given analysis scale s and each filter w ∈ W. The resulting set of characterizations is referred to as the characterization map of I (given s and the filter support). For wavelet adaptation purposes, it is also useful to compute the first-order derivatives of each characterization. The resulting sets of characterization derivatives are referred to as the characterization derivative maps of I.

Since characterizing an image with a given wavelet filter can be time-consuming (see Section II), we propose to approximate the characterization map and the characterization derivative maps, as described in Sections III-C and III-D. To speed up computations further, the invariances of the characterization map and of the characterization derivative maps are studied in Section III-E. Some practical considerations for building and visualizing characterization maps are highlighted in Section III-F.

C. Approximate Characterization Map

We propose to approximate the characterization map and the characterization derivative maps using Taylor expansions. The exact characterization and characterization derivatives are computed for a finite set of wavelet filters, referred to as key wavelet filters, and the remainder of each map is approximated using Taylor expansions. Let k denote a key wavelet filter. The Taylor expansion of a function f (either σ, κ, or one of their derivatives) in the neighborhood of k is given by a formal relation (10), detailed hereafter.
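Section II characterizes an image, for one wavelet filter and one analysis scale, by the standard deviation and the kurtosis of its mean-subtracted detail coefficients. A minimal sketch of that idea, with a plain 2-D correlation standing in for the paper's wavelet transform (all names are hypothetical):

```python
import numpy as np

def detail_coefficients(image, filt):
    """'Valid' 2-D correlation of the image with a small zero-sum high-pass
    filter; a hypothetical stand-in for one subband of the wavelet transform."""
    kh, kw = filt.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * filt)
    return out

def characterize(image, filt):
    """Return (standard deviation, kurtosis) of the mean-subtracted detail
    coefficients, in the spirit of (7)-(8)."""
    d = detail_coefficients(image, filt).ravel()
    d = d - d.mean()
    sigma = d.std()
    # Kurtosis as the fourth standardized moment (equals 3 for a Gaussian).
    kappa = np.mean(d ** 4) / sigma ** 4
    return sigma, kappa
```

For a Gaussian random image, the kurtosis of the (linearly filtered) detail coefficients stays close to 3, consistent with the generalized Gaussian model above.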
QUELLEC et al.: FAST WAVELET-BASED IMAGE CHARACTERIZATION FOR HIGHLY ADAPTIVE IMAGE RETRIEVAL 1615

In (10), n denotes the order of the Taylor expansion and ‖·‖ denotes the L2-norm (11). Note that f is multidimensional: it is a function of the wavelet filter coefficients. As a consequence, the second term of (10) (the first-order term) is given by (12), where ∇f is the gradient of f. Its third term (the second-order term) is given by (13), where H is the Hessian of f.

According to Taylor's theorem, (10) holds if f is n times differentiable at k. We can check that σ and κ (and therefore their derivatives) are infinitely differentiable when σ ≠ 0. In particular, it should be noted that σ and κ are functions of terms composed of polynomials, fractions, and square roots [see (3) and (7)–(9)]. The standard deviation equals 0 when image I is constant everywhere or when w = 0; σ cannot be strictly negative [see (7) and (9)]. In the trivial case where I is constant everywhere, characterization maps are also constant everywhere; therefore, we do not need to compute them. Images are assumed nonconstant in the following sections.

D. Derivatives of the Proposed Characterization With Respect to Filter Coefficients

In order to compute the Taylor expansions above, the first few order derivatives of the proposed image characterization, i.e., the first few order derivatives of (7) and (8) with respect to the wavelet filter coefficients, need to be computed for each key wavelet filter. Let us introduce the polynomial functions (14) and (15) of the wavelet transform of I. The first-order derivatives of the proposed image characterization are given by (16) and (17). Its second-order derivatives are given by (18) and (19). Its third-order derivatives, and some computation details, are provided in Appendix A. Higher order derivatives were not used in this paper.

E. Invariances of the Characterization Map and of Its Derivatives

In order to reduce the cardinal of the set of key wavelet filters, we propose to find invariances in the characterization map and its derivatives.

First, if a wavelet filter w is multiplied by a positive real number λ, then the wavelet transform of I is scaled accordingly [see (3), (9), and (14)]. Consequently, if w is multiplied by λ, then σ is multiplied by λ, and κ is unchanged [see (7) and (8)]. As for the derivatives, if w is multiplied by λ, then the derivative of σ is unchanged, and the derivative of κ is divided by λ [see (16) and (17)]. This invariance analysis implies that we only need to compute the characterization map and its derivatives for wavelet filters on the unit sphere.

Second, if w is multiplied by −1, then the wavelet transform of I changes sign [see (3), (9), and (14)]. Consequently, if w is multiplied by −1, then σ and κ are unchanged [see (7) and (8)], and their derivatives are multiplied by −1 [see (16) and (17)]. This invariance analysis implies that we only need to compute the characterization map and its derivatives on one half of the unit sphere.

F. Practical Considerations for Building the Characterization Map

The proposed procedure for building the characterization map is illustrated in Fig. 1. The first step is to extract a set of key wavelet filters. The exact characterizations and characterization derivatives associated with these key filters are computed, as described in Sections II and III-D, respectively. The key filters should be selected approximately uniformly in one half of the unit sphere (see Section III-E). A good set can be obtained as described hereafter [see also Fig. 1(a)]. For each dimension d of the wavelet filter space, a set of key wavelet filters is generated as follows:
• the dth unconstrained coefficient of each filter is set to 1;
• each other unconstrained coefficient takes regularly spaced values [see Fig. 1(a)];
• each filter is divided by its L2-norm.
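The construction above, combined with the Taylor expansions of Section III-C, can be sketched for D = 2: key filters are spread over one half of the unit circle, exact values and gradients are computed there once, and any other filter is approximated from its nearest key filter by a first-order expansion, as in (10). The characterization itself is replaced here by an arbitrary smooth stand-in function, so this only illustrates the mechanics, not the paper's exact map:

```python
import numpy as np

# Key filters: n angles equally spaced on one half of the unit circle (D = 2).
n = 64
angles = np.pi * (np.arange(n) + 0.5) / n
keys = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (n, 2)

# Smooth stand-in for a characterization f(w), plus its gradient (illustrative).
def f(w):
    return w[0] ** 2 - w[0] * w[1] + 2 * w[1] ** 2

def grad_f(w):
    return np.array([2 * w[0] - w[1], -w[0] + 4 * w[1]])

# Exact values and gradients are precomputed at the key filters only...
f_key = np.array([f(k) for k in keys])
g_key = np.array([grad_f(k) for k in keys])

def approx_f(w):
    """First-order Taylor expansion around the nearest key filter, as in (10)."""
    i = np.argmin(np.linalg.norm(keys - w, axis=1))
    return f_key[i] + g_key[i] @ (w - keys[i])

# ...and any other filter on the half circle is approximated cheaply.
w = np.array([np.cos(1.234), np.sin(1.234)])
assert abs(approx_f(w) - f(w)) < 1e-2
```

With 64 key filters, the nearest expansion point is within about 0.025 rad, so the second-order remainder of the expansion stays small.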
Fig. 1. Characterization map building, illustrated for three-coefficient wavelet filters (D = 2), where one coefficient is constrained by (1): the axes represent the two unconstrained coefficients. (a) Exact computation at the key wavelet filters, indicated by circles (cf. Section II). (b) Approximate computation, by Taylor expansions, on the rest of the half unit circle (solid arc) (cf. Section III-C). (c) Extension to the rest of the filter space (solid straight line) through invariance analysis (cf. Section III-E).

The set of key wavelet filters is the union of these per-dimension sets. In the particular case D = 2, the key wavelet filters are selected exactly uniformly in one half of the unit sphere.

The second step is to approximate the characterizations and characterization derivatives in the remainder of the half unit sphere [see Fig. 1(b)]. For each filter in this set, image characterizations are approximated by Taylor expansions; the closest key wavelet filter plays the role of the expansion point (see Section III-C). Taylor expansions can be used in this step because these filters are nonzero; therefore, σ ≠ 0 (see Section III-C).

The last step is to compute the characterizations and characterization derivatives in the remainder of the filter space [see Fig. 1(c)]. Let ρ be the L2-norm of a filter w in this set, and let ε indicate whether w/ρ belongs to the selected half of the unit sphere or not. The characterizations and characterization derivatives of w are obtained from those computed on the half unit sphere, according to the invariance analysis of Section III-E.

The proposed procedure is applied to a real-world image in Fig. 2. Characterization map building is assessed in Section V-B.

Fig. 2. Example of a characterization map, in (b) and (e), and of characterization derivative maps, in (c), (d), (f), and (g), obtained for a retinal image (a) at analysis scale s = 2, with three-coefficient wavelet filters: the intensity of each pixel is proportional to the value of σ, κ, or one of their derivatives for the corresponding filter. In (b) and (e), black means 0; in (c), (d), (f), and (g), medium gray means 0.

The proposed wavelet-based image characterization is applied to image retrieval in the following section.

IV. IMAGE RETRIEVAL

Let q be a query image and let R be a data set of reference images. In this section, characterization maps are used to rank the images of R in increasing order of distance to q; then, the first k images are retrieved. In this paper, the goal is to retrieve a small set of highly relevant images. Therefore, retrieval systems are tuned in a training set in order to maximize the precision at k, i.e., the fraction of the k retrieved images that belong to the same category as q. If, on the contrary, one would like to retrieve all potentially relevant images in the reference data set, then the recall at a large k should be maximized instead. Achieving high precision is challenging when the number of categories is high or when the semantic gap between category assignments and image characteristics (textures, colors, etc.) is wide.

We propose to characterize each image by a signature in which the features σ and κ are extracted at each analysis scale, using the corresponding wavelet filter.
Based on these signatures, the distance between q and another image r is defined by (20), where the last term denotes the L2-distance between the intensity histogram of q and that of r: it is used to compare the low-frequency component of q with that of r. Each component in the distance measure is weighted by a real number. Several parameters need to be tuned in (20): the wavelet filters and the weights. These parameters are referred to as the distance parameters. They are tuned in a training image data set to maximize the average precision at k. Note that the same data set can be used as the reference data set and as the training data set.

Two image retrieval scenarios are considered: 1) a single set of distance parameters is used regardless of the query image (adaptive image retrieval; see Section IV-A) or 2) a different set of distance parameters is used for each query image (highly adaptive image retrieval; see Section IV-B). Note that adaptive image retrieval simply is a faster version of previously proposed approaches. From a user perspective, adaptive and highly adaptive image retrieval operate similarly to these previous approaches; the only differences are that adaptive image retrieval is much faster and that highly adaptive image retrieval is both much faster and potentially more precise. The training phase is described hereafter for each scenario. Section IV-C explains how the system processes a query.

A. Adaptive Image Retrieval: Training Procedure

In adaptive image retrieval, the distance is tuned, by a gradient descent, in order to increase the average precision at k among all query images. Each parameter is modified, in turn, until the average precision at k converges. Let p denote the parameter currently being modified (either a weight or a wavelet filter coefficient). Optimizing the average precision at k is not straightforward: its derivative, with respect to p, is either zero or undefined everywhere. We got around this difficulty as explained hereafter. For a query image q, consider the closest reference image from a different category than q, and the closest reference image from the same category as q that satisfies condition (21). If (21) can be inverted for one or several query images, then the average precision at k is expected to increase. To that purpose, parameter p is modified according to the update rule (22). Equation (22) is iterated until p converges (note that the two images above are updated at each iteration). Whenever p is a wavelet filter coefficient, the characterization derivative maps (see Fig. 2) are used in the gradient descent; in that case, (22) follows from (23).

B. Highly Adaptive Image Retrieval: Training Procedure

In highly adaptive image retrieval, a different set of distance parameters is used for each query image q. Precisely, distance parameters (wavelet filters and weights) are allowed to vary continuously in signature space. A continuous regression function is used to map the initial signature of q (obtained with an initial set of filters) to new distance parameters for q: a new set of weights and a new set of filters. The initial set of filters is obtained by the training procedure of adaptive image retrieval (see Section IV-A). Once new distance parameters are obtained for q, the final signature of q is computed using them. For consistency, the signature of each reference image also needs to be computed using them.

The continuity property implies that two images with similar initial signatures are mapped to similar distance parameters. For each distance parameter, the continuity equation (24) holds. The regression functions are trained on the training set: for each training image, the distance parameters are tuned to maximize the precision at k while respecting the continuity constraint.
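All distance parameters in Section IV are tuned to maximize the precision at k, i.e., the fraction of the k nearest references sharing the query's category. A minimal evaluation sketch (plain L2 distance on signatures; names are hypothetical):

```python
import numpy as np

def precision_at_k(query_sig, query_cat, ref_sigs, ref_cats, k=5):
    """Fraction of the k references nearest to the query (L2 distance on
    signatures) that belong to the query's category."""
    dists = np.linalg.norm(ref_sigs - query_sig, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.mean(ref_cats[nearest] == query_cat)

# Toy example: two well-separated categories in signature space.
refs = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
cats = np.array([0, 0, 0, 1, 1])
print(precision_at_k(np.array([0.05, 0.05]), 0, refs, cats, k=3))  # -> 1.0
```

As noted in the text, this criterion is piecewise constant in the distance parameters, which is why the training procedure resorts to the surrogate updates (22) and (25) rather than differentiating it directly.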
Let p denote one distance parameter (either a weight or a wavelet filter coefficient). In order to respect the continuity constraint, (22) is replaced by (25), where γ is a positive real number. One first advantage of the continuity constraint is that, with an appropriate value for γ, we can prevent the system from overfitting the training data. A second advantage is that, after training, we can easily define the regression function for each distance parameter through multivariate interpolation: for query images, it is given by (26), with interpolation weights defined by (27).

C. Processing a Query: Scenario of Operation

Adaptive Image Retrieval: First, the query image q is decomposed using the optimal wavelet filters obtained during the training phase, and its signature is computed. Second, reference images are ranked in increasing order of distance to q, using the optimal distance weights obtained during the training phase. Third, the first k images are retrieved.

Highly Adaptive Image Retrieval: First, the query image q is decomposed using the initial wavelet filters obtained during the training phase, and its initial signature is computed. Second, the final distance parameters (wavelet filters and weights) are obtained through multivariate interpolation [see (26)]. Third, the final signatures of q and of each reference image are computed using the final set of wavelet filters. Fourth, reference images are ranked in increasing order of distance to q using the final distance weights. Fifth, the first k images are retrieved.

D. Characterization Maps and Time Complexity

During Training: In both adaptive and highly adaptive image retrieval, the signatures of the training and reference images are evaluated for an arbitrarily large number of wavelet filters. Computing characterization maps for each of these images is therefore very useful. However, there is no need to compute the entire characterization maps all at once: whenever an image characterization needs to be evaluated for a new wavelet filter, the closest key wavelet filter is searched for, the image characterization is evaluated for that key filter, and it is then evaluated for the new filter through Taylor expansions and invariance analysis (see Sections III-D and III-E).

After Training (Reference Images): In highly adaptive image retrieval, characterization maps computed for reference images are still very useful to process new query images. Indeed, the signature of each reference image needs to be evaluated again using the optimal wavelet filters computed for the query (see Section IV-C), which is very fast if characterization maps are already available.

After Training (Query Images): There is no need to compute characterization maps for query images. In adaptive image retrieval, we only need to compute the signature associated with the optimal set of filters. In highly adaptive image retrieval, we only need to compute the signature associated with the initial set of filters and the signature associated with the final set of filters (see Section IV-C).

To summarize, characterization maps should be computed for reference/training images but not for test images. As a consequence, the additional flexibility introduced in the proposed framework does not imply a dramatic retrieval time increase. In addition, should the optimal wavelet basis be further fine-tuned (e.g., to address relevance feedback from the user), there is no need to compute the updated signatures from scratch: new image characterizations can be estimated through Taylor expansions and invariance analysis (see Sections III-D and III-E).

V. APPLICATIONS

After an introduction to the four data sets under study in Section V-A, characterization map building is assessed in Section V-B. The retrieval performance of the proposed retrieval systems is then assessed in Section V-C.

A. Data Sets

Caltech101: This data set was collected in September 2003 at the California Institute of Technology.1 It consists of 9144 pictures of objects belonging to 101 categories, with about 40 to 800 images per category; most categories have about 50 images. The size of each image is roughly 300 × 200 pixels. Examples of images from two categories ("water_lilly" and "hedgehog") are given in Fig. 3.

MESSIDOR: This data set was collected in three ophthalmology departments in France for research on automated diabetic retinopathy screening.2 It consists of 1200 eye fundus color photographs. Images were acquired using a color video 3CCD camera on a Topcon TRC NW6 nonmydriatic retinograph with a 45° field of view. The size of each image is either 1440 × 960, 2240 × 1488, or 2304 × 1536 pixels. In a disease screening context, clinicians classify images into two categories: normal and pathological. In MESSIDOR, 546 images were marked as normal and 654 as pathological.

Face Database: This data set was collected by AT&T Laboratories Cambridge for research on face recognition.3 It consists of 400 images from 40 categories. Each category consists of ten images of the same person, taken at different times, with different lighting conditions, different facial expressions (open/closed eyes, smiling/not smiling), and facial details (glasses/no glasses). The size of each image is 92 × 112 pixels.

1http://www.vision.caltech.edu/Image_Data sets/Caltech101/
2http://messidor.crihan.fr/index-en.php
3www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
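The highly adaptive query scenario of Section IV-C (initial signature, interpolated parameters, final signatures, ranking) can be outlined as follows; every callable here is a hypothetical placeholder for the corresponding component of the paper:

```python
import numpy as np

def retrieve(query_image, refs, ref_cats, k,
             initial_filters, interpolate_params,
             compute_signature, distance):
    """Highly adaptive retrieval, step by step (cf. Section IV-C).

    interpolate_params stands in for the regression of Section IV-B,
    mapping an initial signature to (final_filters, final_weights).
    """
    # 1) initial signature with the shared initial filters
    s0 = compute_signature(query_image, initial_filters)
    # 2) query-specific distance parameters by multivariate interpolation
    final_filters, final_weights = interpolate_params(s0)
    # 3) final signatures for the query and every reference image
    sq = compute_signature(query_image, final_filters)
    sigs = [compute_signature(r, final_filters) for r in refs]
    # 4) rank references by weighted distance; 5) keep the k nearest
    order = np.argsort([distance(sq, s, final_weights) for s in sigs])
    return [ref_cats[i] for i in order[:k]]
```

As stressed in Section IV-D, step 3 is the expensive one in principle, and it is precisely the step that precomputed characterization maps make fast for reference images.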
TABLE I. RELATIVE APPROXIMATION ERROR OF THE CHARACTERIZATION (DERIVATIVE) MAPS

Fig. 3. Examples of characterization maps computed, at different analysis scales and orientations, for images in the Caltech101 data set. Each channel of a color characterization map was computed from the corresponding channel of the input image; in row labels, V stands for a vertical wavelet filter, H stands for a horizontal wavelet filter, and the subscript indicates the analysis scale. Characterization maps all look roughly alike; however, they differ in terms of angular frequency and phase, for instance, and their evolution across analysis scales differs from one image to another. Therefore, it is usually possible to select one pixel location (i.e., one filter) in each map to derive a discriminative feature vector. In addition, because a weight vector is used, the most discriminative maps/filters can be emphasized.

VisTex: This data set was collected by the Media Laboratory at the Massachusetts Institute of Technology for research on texture recognition.4 It consists of categorized texture images representative of real-world conditions. Categories consisting of fewer than five elements ("Clouds," "Grass," "Misc," "WheresWaldo," and "Wood") were discarded in this paper. As a consequence, 14 categories, consisting of 152 images altogether, were selected. The size of each image is 512 × 512 pixels.

For training purposes, each data set above was divided into five subsets; each category was equally divided into each of these subsets at random. The retrieval performance was assessed by a so-called fivefold cross-validation strategy: for each fold, a training subset and a test subset were defined. During retrieval performance assessment, an image was considered relevant if it belonged to the same category as the query image. Final results are reported on the union of the five test subsets (see Table II).

4http://vismod.media.mit.edu/vismod/imagery/VisionTexture/vistex.html

B. Assessing Characterization Map Building

Characterization map building was assessed on a data set of 40 images: ten images were randomly selected from each data set above. For each image, the exact and approximate values of σ, κ, and their derivatives were computed for equally spaced wavelet filters on one half of the unit sphere. The approximate values were obtained by the procedure described in Section III-F. The experiment was performed at two analysis scales. Only the red channel of color images was used in this experiment. Table I reports the per-image relative approximation error of the characterizations and of their derivatives. In this paper, the relative approximation error of a vector was defined as the L2-norm of the approximation error divided by the L2-norm of the exact vector.

In the remainder of this paper, two setups, S1 and S2, are studied; in both, a Taylor expansion of a given order is used to approximate σ and κ, and a Taylor expansion of a lower order is used to approximate their derivatives.

The average computation time for building one characterization map and the associated characterization derivative maps is 0.0476 s for Caltech101, 1.53 s for MESSIDOR, 0.00667 s for the Face Database, and 0.168 s for VisTex. One core of an Intel Xeon E5520 processor, running at 2.27 GHz, was used in all experiments. Note that the computation times for building one characterization map or one characterization derivative map grow with the image size and the number of key wavelet filters (see Sections II and III-F).
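The error measure reported in Table I is simply the L2-norm of the approximation error divided by the L2-norm of the exact vector, computed per image:

```python
import numpy as np

def relative_approximation_error(approx, exact):
    """L2-norm of the approximation error divided by the L2-norm of the
    exact vector, as defined in Section V-B."""
    approx, exact = np.asarray(approx, float), np.asarray(exact, float)
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

print(relative_approximation_error([1.1, 2.0], [1.0, 2.0]))  # -> ~0.045
```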
TABLE II. PRECISION AT FIVE ON THE TEST SUBSET

C. Retrieval Performance

The same filter supports and analysis scales were used in all experiments (see Fig. 3 for the S1 setup). Two step sizes were used in (22): one when the tuned parameter is a weight and one when it is a wavelet filter coefficient; γ was found by cross-validation on each training subset [see (25) and (27)].

The precision at five, obtained on the test subset of each data set by both setups, is reported in Table II. It is compared with three other wavelet-based CBIR methods on the same data sets (with the same subsets): separable lifting-scheme-based optimization, nonseparable lifting-scheme-based optimization, and the dual-tree complex wavelet transform (CWT). A comparison with the original bag-of-features (BoF) model, which has been one of the most popular CBIR frameworks, was also included. In our BoF implementation, based on the OpenCV library,5 features were detected and described using the scale-invariant feature transform (SIFT) or speeded-up robust features (SURF); term-frequency-inverse-document-frequency (TF-IDF) weighting was used to rank reference images. The reader is referred to previous work for additional comparisons with nonwavelet-based CBIR methods. Precision at five versus query time is reported in Fig. 4; every algorithm was implemented in C++, except for dual-tree CWT.6

5http://opencv.willowgarage.com/wiki/ (BOWKMeansTrainer, BOWImgDescriptorExtractor, SiftFeatureDetector/SurfFeatureDetector, and SiftDescriptorExtractor/SurfDescriptorExtractor classes)
6http://eeweb.poly.edu/iselesni/WaveletSoftware/dt2D.html (the software was run with GNU Octave)

VI. DISCUSSION

A novel image characterization framework, based on adaptive separable or nonseparable wavelet transforms, has been proposed in this paper. It was presented for 2-D images, but it could be generalized to any type of n-D digital signal. This framework allows fast image characterization, particularly when images need to be characterized several times using different wavelet filters. This feature is particularly useful for wavelet adaptation.

The proposed framework was applied to wavelet adaptation for CBIR. Two CBIR methods were presented. The first method, adaptive image retrieval, sped up previously proposed CBIR methods based on wavelet adaptation without decreasing the retrieval performance. The second method, highly adaptive image retrieval, takes full advantage of the proposed image characterization framework: during each query, a different wavelet filter is used, which means that each reference image has to be characterized again with a new wavelet filter. Highly adaptive image retrieval increased the retrieval performance significantly while maintaining low computation times (see Fig. 4). These improvements were observed in four different image data sets: Caltech101, MESSIDOR, Face Database, and VisTex. A simple gradient descent was used in this paper for wavelet adaptation. However, the proposed image characterizations may be used jointly with more advanced wavelet adaptation techniques.

In all data sets, highly adaptive image retrieval performed significantly better than dual-tree CWT, in which wavelet filters of large support are used to analyze images. In highly adaptive image retrieval, several adapted filters of small support are used. The number and the high adaptivity of these filters compensate for the difference in support size.

In all data sets, highly adaptive image retrieval also performed significantly better than the original BoF model, whatever feature detection and description method was used (SIFT or SURF; see Fig. 4). As expected, this improvement was particularly noticeable in VisTex and MESSIDOR, where texture is the most discriminative feature. One reason why lower precision rates were achieved on the Caltech101 data set is that the semantic gap is wider: the "Butterfly" category, for instance, contains pictures of real butterflies among flowers, pictures of butterfly-shaped chocolates, stylized butterfly drawings on uniform backgrounds, etc. Another reason is that there are many more categories (101). In line with a major trend in image characterization for CBIR (BoF, multiple-instance learning, etc.), we plan to use the
proposed framework in future works to characterize regions of interest, detected by SIFT or SURF for instance, instead of characterizing images as a whole.

Fig. 4. Precision at five versus query time (in logarithmic scale). Query time includes image processing time and search time; training time is not included. Error bars represent confidence intervals: they are seldom visible along the y axis. (a) Caltech101. (b) MESSIDOR. (c) Face Database. (d) VisTex.

The proposed framework may also be used for relevance feedback, which is one of the most active research fields for CBIR systems. In relevance feedback, the user iteratively selects relevant and irrelevant images among retrieved reference images, and the distance measure is updated: basically, the weight of each image feature is modified. In this context, the wavelet filter could simply be updated after each user feedback. The proposed framework would therefore allow relevance feedback on the image features themselves and not only on the way these features are combined, which would be novel. Through experiments on highly adaptive image retrieval, we have shown that adapting the wavelet filter to each query image increases the retrieval performance without considerably increasing computation times. This observation gives us some insights on potential performance increases through relevance feedback.

We believe that the proposed framework also has great potential for classification and compression. Adapting a CBIR approach for classification purposes is straightforward. As for compression, the characterization maps could be used to find the wavelet filter that optimally compresses the image at each scale: the highest intensity in a scale's kurtosis map indicates the wavelet filter maximizing kurtosis at this scale, i.e., the highest SNR (or the highest contrast).

In this paper, two features have been selected to build characterization maps: the standard deviation and the kurtosis. Should additional features be used, differentiability checks would have to be performed to ensure that characterization maps can still be approximated by Taylor expansions (see Section III-C). However, most features are smooth and continuous functions of the data; otherwise, they could not characterize noisy data robustly.

This study has two limitations. First, for simplicity, no low-pass filter was used before subsampling images between two analysis scales; however, this does not seem to affect performance (see Table II: adaptive image retrieval with setup S2 versus the separable lifting scheme). Second, although the framework was presented for nonseparable wavelets of arbitrary support, it was only applied to filters of small support (three or five taps) in this paper. A solution is presented in Section IV-D to compute characterization maps progressively; this should facilitate the use of larger filters, which might improve retrieval performance further.

In conclusion, a novel image characterization framework was presented, and its suitability for CBIR was shown. This study paves the way to efficient relevance feedback on image features themselves, which should enable interactive image search with higher performance.
APPENDIX A
DERIVATIVES OF THE PROPOSED CHARACTERIZATION: COMPUTATION DETAILS

A. First- and Second-Order Derivatives

Equations (16)–(19) are obtained from lower order derivatives by the usual differentiation rules and the following relations:

(28)
(29)

B. Third-Order Derivatives

Let denote the following expression of two filter coefficients and appearing in (19):

(30)

Its derivative with respect to a third filter coefficient is given by

(31)

where is a polynomial function of the wavelet transform of defined as

(32)

The following relation holds:

(33)

Finally, the third-order derivatives of the characterization are given as follows:

(34)
(35)

REFERENCES

[1] A. H. Tewfik, D. Sinha, and P. Jorgensen, "On the optimal choice of a wavelet for signal representation," IEEE Trans. Inf. Theory, vol. 38, no. 2, pp. 747–765, Mar. 1992.
[2] A. Gupta, S. D. Joshi, and S. Prasad, "A new method of estimating wavelet with desired features from a given signal," Signal Process., vol. 85, no. 1, pp. 147–161, Jan. 2005.
[3] R. L. Claypoole, R. G. Baraniuk, and R. D. Nowak, "Adaptive wavelet transforms via lifting," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., May 1998, vol. 3, pp. 1513–1516.
[4] G. Piella, B. Pesquet-Popescu, H. J. A. M. Heijmans, and G. Pau, "Combining seminorms in adaptive lifting schemes and applications to image analysis and compression," J. Math. Imag. Vis., vol. 25, no. 2, pp. 203–226, Sep. 2006.
[5] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener, and C. Roux, "Wavelet optimization for content-based image retrieval in medical databases," Med. Image Anal., vol. 14, no. 2, pp. 227–241, Apr. 2010.
[6] D. Seršić and M. Vrankić, "Adaptation of a 2-D nonseparable wavelet filter bank with variable number of zero moments," in Proc. Visualization, Imag., Image Process., Sep. 2002, pp. 257–260.
[7] D. Seršić and M. Vrankić, "Adaptation in the quincunx wavelet filter bank with applications in image denoising," in Proc. Int. Workshop Spectral Methods Multirate Signal Process., 2004, pp. 245–252.
[8] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener, and C. Roux, "Adaptive nonseparable wavelet transform via lifting and its application to content-based image retrieval," IEEE Trans. Image Process., vol. 19, no. 1, pp. 25–35, Jan. 2010.
[9] M. D. Abràmoff, J. M. Reinhardt, S. R. Russell, J. C. Folk, V. B. Mahajan, M. Niemeijer, and G. Quellec, "Automated early detection of diabetic retinopathy," Ophthalmology, vol. 117, no. 6, pp. 1147–1154, Jun. 2010.
[10] G. Quellec, K. Lee, M. Dolejsi, M. K. Garvin, M. D. Abràmoff, and M. Sonka, "Three-dimensional analysis of retinal layer texture: Identification of fluid-filled regions in SD-OCT of the macula," IEEE Trans. Med. Imag., vol. 26, no. 6, pp. 1321–1330, Jun. 2010.
[11] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
[12] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Comput. Surv., vol. 40, no. 2, pp. 1–60, Apr. 2008.
[13] H. Müller, N. Michoux, D. Bandon, and A. Geissbuhler, "A review of content-based image retrieval systems in medical applications—Clinical benefits and future directions," Int. J. Med. Inform., vol. 73, no. 1, pp. 1–23, Feb. 2004.
[14] C. B. Akgül, D. L. Rubin, S. Napel, C. F. Beaulieu, H. Greenspan, and B. Acar, "Content-based image retrieval in radiology: Current status and future directions," J. Digit. Imaging, vol. 24, no. 2, pp. 208–222, Apr. 2010.
[15] J. Sivic and A. Zisserman, "Video Google: A text retrieval approach to object matching in videos," in Proc. Int. Conf. Comput. Vis., 2003, pp. 1470–1477.
[16] S. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1999.
[17] G. van de Wouwer, P. Scheunders, and D. van Dyck, "Statistical texture characterization from discrete wavelet representations," IEEE Trans. Image Process., vol. 8, no. 4, pp. 592–598, Apr. 1999.
[18] S. Nadarajah, "A generalized normal distribution," J. Appl. Stat., vol. 32, no. 7, pp. 685–694, Sep. 2005.
[19] M. Hazewinkel, "Taylor's theorem," in Encyclopaedia of Mathematics. College Station, TX: Springer-Verlag, 2001.
[20] D. Shepard, "A two-dimensional interpolation function for irregularly-spaced data," in Proc. ACM Nat. Conf., 1968, pp. 517–524.
[21] I. W. Selesnick, R. G. Baraniuk, and N. G. Kingsbury, "The dual-tree complex wavelet transform," IEEE Signal Process. Mag., vol. 22, no. 6, pp. 123–151, Nov. 2005.
[22] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. Int. Conf. Comput. Vis., 1999, vol. 2, pp. 1150–1157.
[23] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool, "SURF: Speeded up robust features," Comput. Vis. Image Underst., vol. 110, no. 3, pp. 346–359, 2008.
[24] R. Rahmani, S. A. Goldman, H. Zhang, S. R. Cholleti, and J. E. Fritts, "Localized content-based image retrieval," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 11, pp. 1902–1912, Nov. 2008.
[25] M. Ferecatu and D. Geman, "A statistical framework for image category search from a mental picture," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 6, pp. 1087–1101, Jun. 2009.
[26] E. Cheng, F. Jing, and L. Zhang, "A unified relevance feedback framework for web image retrieval," IEEE Trans. Image Process., vol. 18, no. 6, pp. 1350–1357, Jun. 2009.
[27] M. R. Azimi-Sadjadi, J. Salazar, and S. Srinivasan, "An adaptable image retrieval system with relevance feedback using kernel machines and selective sampling," IEEE Trans. Image Process., vol. 18, no. 7, pp. 1645–1659, Jul. 2009.
[28] W. Bian and D. Tao, "Biased discriminant Euclidean embedding for content-based image retrieval," IEEE Trans. Image Process., vol. 19, no. 2, pp. 545–554, Feb. 2010.
[29] M. Bethge, "Factorial coding of natural images: How effective are linear models in removing higher-order dependencies?," J. Opt. Soc. Amer. A, Opt. Image Sci. Vis., vol. 23, no. 6, pp. 1253–1268, Jun. 2006.

Gwénolé Quellec was born in Saint-Renan, France, on November 29, 1982. He received the Engineering degree in computer science and applied mathematics from ISIMA, Clermont-Ferrand, France, in 2005, the M.S. degree in image processing from the University of Clermont-Ferrand II, Clermont-Ferrand, in 2005, and the Ph.D. degree in signal processing from Telecom Bretagne, Brest, France, in 2008. He is currently a Research Associate with the LaTIM Inserm Research Unit 1101, Brest, France. His research interests include retinal image processing, content-based image retrieval, and information fusion for medical applications.

Mathieu Lamard was born in Bordeaux, France, on May 18, 1968. He received the M.S. degree in applied mathematics from the University of Bordeaux, Bordeaux, in 1995, and the Ph.D. degree in signal processing and telecommunication from the University of Rennes, France, in 1999. In 2000, he joined the LaTIM Inserm Research Unit 1101, Brest, France, where he is currently a Research Associate. His research interests include image processing, 3-D reconstruction, content-based image retrieval, and information fusion for medical applications.

Guy Cazuguel (M'83) received the Engineering degree from the Ecole Nationale de l'Aviation Civile, Toulouse, France, in 1975, and the M.S. degree in advanced automatics and the Ph.D. degree in signal processing and telecommunications from the University of Rennes I, Rennes, France, in 1976 and 1994, respectively. He is currently a Professor with the Department of Image and Information Processing, Telecom Bretagne, Brest, France. His research interests include image analysis and content-based image retrieval in medical applications, within the LaTIM Inserm Research Unit 1101.

Béatrice Cochener received the M.D. degree in ophthalmology from Nancy University, Nancy, France, in 1992. She is currently a Professor and the Head of the University Eye Clinic, Brest, France. Together with J. Colin, she developed a very active anterior segment surgery practice. She is a specialist of refractive techniques in vision correction. She has participated in three books on surgical techniques and has published more than 30 peer-reviewed journal articles. Her research interests include imaging research, clinical evaluation, and anterior segment surgery teaching. Dr. Cochener is the Vice President of the French Implant and Refractive Surgery Society SAFIR, the President of the French Academy of Ophthalmology, and a member of the Editorial Board of the Journal Français d'Ophtalmologie.

Christian Roux (F'05) received the Agrégation degree in physics from the Ecole Normale Supérieure, Cachan, France, in 1978 and the Ph.D. degree in signal processing from the Institut National Polytechnique, Grenoble, France, in 1980. In 1982, he joined Telecom Bretagne, Brest, France. He is the founding Director of the LaTIM Inserm Research Unit 1101, Brest. He is the author of more than 150 papers, holds nine patents, and contributed to the creation of three startup companies developing medical technologies. His current research interests include medical information processing, spatial and functional information modeling, and analysis in medical images. Dr. Roux was the recipient of the INSERM Basic Research Award in 2006. He has been the founding Co-Chair of the IEEE EMBS International Summer School on Biomedical Imaging since 1994 and served as the President of the IEEE Engineering in Medicine and Biology Society in 2001. He is a member of the Editorial Board of the IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY and of the PROCEEDINGS OF THE IEEE.