Development of Parallel Superresolution Algorithms for Remote Sensing Imagery
The fundamental contribution of this research is the development of a block-based image processing superresolution library with support for parallel computing. The superresolution software has been successfully tested with real Landsat/ETM+ and MSG/SEVIRI datasets, demonstrating that our software framework is suitable for practical real-world superresolution applications and, in particular, for remote sensing superresolution.

Document Transcript

SIGNALS, SYSTEMS AND RADIOCOMMUNICATIONS DEPARTMENT
SCHOOL OF TELECOMMUNICATIONS ENGINEERING
Technical University of Madrid
Master in Communications Technologies and Systems

MSC THESIS
Development of Parallel Superresolution Algorithms for Remote Sensing Imagery

Defended by: Rebeca Gutiérrez López
Thesis Advisor: Narciso García Santos
Defended on September 15, 2011
Abstract

The fundamental contribution of this research is the development of a block-based image processing superresolution library with support for parallel computing. This library is suitable for practical remote sensing superresolution and other real-world superresolution applications. There is a limited real-world superresolution presence and only a few commercial SR products are offered in the market. The main goal of this research has been to implement a superresolution framework which can be used with satellite images in their original sizes. Our software overcomes the current limitations of existing superresolution implementations and is also suitable for dealing with big input datasets by: (i) providing block-based image processing to overcome the huge memory requirements; and (ii) providing a parallel implementation of the superresolution algorithms for faster product generation.

Our superresolution suite provides a framework to apply superresolution techniques to real-world image datasets. The core of the suite is a superresolution library and an automatic image registration library written in C++. Both libraries have been implemented as OSSIM plugins to exploit the block-based image processing approach, the support for parallel computing with MPI, and the support for a wide range of image projections and datums of the OSSIM core library.

The superresolution plugin provides several algorithms to perform subpixel image registration. A non-uniform interpolation technique and two iterative cost function minimization algorithms are also provided to perform image reconstruction. The algorithms have been implemented using an image-based formulation rather than a vector-based formulation to reduce memory requirements and speed up the reconstruction step.

The registration plugin benefits from contributions of the computer vision field to provide a fast and robust method for registering images with complex real-world geometric transformations. The plugin has been designed to work without human intervention. The parameters of a configurable geometric transformation are optimized using a feature-based registration technique. Several OpenCV feature detectors have been integrated in the registration plugin. Least-squares minimization using RANSAC is applied to perform a robust model optimization.

A Qt-based GUI for image super-resolution has been built on top of the core library and the OSSIM plugins. This application provides a friendly environment to set up the plugins and to create image chains that generate output products. Finally, the OSSIM GDAL plugin has been included in the superresolution suite to provide access to all the raster formats supported by GDAL.

The algorithm implementations and the plugin performance (block-based image processing and parallel capabilities) have been successfully tested with synthetic images. The superresolution software has been successfully tested with real Landsat/ETM+ and MSG/SEVIRI datasets, demonstrating that our software framework is suitable for practical real-world superresolution applications and, in particular, for remote sensing superresolution applications.

Keywords: image enhancement, image resolution, image registration, satellite imagery, remote sensing, optimization, parallel computing
Acknowledgments

This work has been partially supported by the Spanish Ministry of Science and Innovation under National Space Program Grant SAE-20081055 and managed by Argongra. This software was developed at the R&D Department of Argongra under the supervision of Dr. Jesús Artieda and Jorge Artieda. Thanks to the Argongra team for their trust in me and for providing me with this opportunity to acquire the knowledge and expertise needed to present this work. Thanks to my thesis advisor, Dr. Narciso García, for his kindness and continual support. Finally, thanks to my family and friends for their encouragement during this cycle; I wouldn't have completed this work without all of you!
Contents

1 Introduction
  1.1 Motivation
  1.2 Notations
  1.3 Super-Resolution Formal Problem Definition
    1.3.1 Image Observation Model
  1.4 Super-Resolution in the Frequency Domain
  1.5 Super-Resolution in the Spatial Domain
    1.5.1 Non-Uniform Interpolation Methods
    1.5.2 Iterative Methods
    1.5.3 Hybrid Techniques: Iterative-Interpolation SR Approach
  1.6 Other SR Techniques
    1.6.1 Projection onto Convex Sets (POCS) Method
    1.6.2 Adaptive Filtering Techniques
    1.6.3 Learning-Based Techniques
  1.7 Challenge Issues for Super-Resolution
  1.8 Outline of Thesis
  1.9 Contribution of Thesis
2 Super-Resolution Software Framework
  2.1 Introduction
    2.1.1 Design
  2.2 Image Superresolution Plugin
    2.2.1 Block Diagram
    2.2.2 Class Diagram
    2.2.3 Software Tools
  2.3 Image Registration Plugin
    2.3.1 Block Diagram
    2.3.2 Class Diagram
    2.3.3 Software Tools
  2.4 Qt Super-Resolution GUI
3 Image Super-Resolution Module
  3.1 Introduction
  3.2 Subpixel Registration
    3.2.1 Planar motion estimation in the frequency domain
    3.2.2 Planar motion estimation in the spatial domain
    3.2.3 Subpixel Registration Experiments
  3.3 Image Reconstruction
    3.3.1 Image Reconstruction Methods
    3.3.2 Super-Resolution Experiments
4 Image Registration Module
  4.1 Introduction
  4.2 Control Point Detection and Matching
    4.2.1 Harris corners
    4.2.2 SIFT features
    4.2.3 SURF features
  4.3 Model Optimization
  4.4 Image Registration Experiments
5 Super-Resolution Experiments With Satellite Imagery
  5.1 Introduction
  5.2 Landsat/ETM+ Experiments
    5.2.1 Preliminaries
    5.2.2 Results
  5.3 MSG/SEVIRI Experiments
    5.3.1 Preliminaries
    5.3.2 Results
  5.4 Conclusion
6 Conclusions and Future Work
A Super-Resolution Software Build Guide
Bibliography
List of Figures

1.1 Imaging process for super-resolution scenario
1.2 Interpolation-restoration super-resolution scenario
2.1 Superresolution suite architecture: core library, plugins, command-line applications and higher level GUI applications
2.2 Superresolution plugin block diagram
2.3 Get tile operation with registered LR frames inverse mapping
2.4 Superresolution plugin class diagram
2.5 Unattended registration plugin block diagram
2.6 Unattended registration plugin class diagram
3.1 Gaussian image pyramid structure
3.2 Pseudocode of Keren et al. algorithm
3.3 High-resolution images used in the simulations: (a) building, (b) castle, and (c) leaves. Available at: http://rr.epfl.ch/3/
3.4 Simulation results with noiseless LR frames: summary of the average absolute error (µ) and the standard deviation of the error (σ) for the shift and rotation parameters; 100 simulations were performed for each of the images
3.5 Box plot of simulation results. Simulations with castle image: (1) φ, Vandewalle et al.; (2) ∆x, Vandewalle et al.; (3) φ, Keren et al.; (4) ∆x, Keren et al. Simulations with building image: (5) φ, Vandewalle et al.; (6) ∆x, Vandewalle et al.; (7) φ, Keren et al.; (8) ∆x, Keren et al. Simulations with leaves image: (9) φ, Vandewalle et al.; (10) ∆x, Vandewalle et al.; (11) φ, Keren et al.; (12) ∆x, Keren et al.
3.6 Simulation results with noisy LR frames: summary of the average absolute error (µ) and the standard deviation of the error (σ) for the shift and rotation parameters; 100 simulations were performed for each of the images
3.7 Placement of the registered LR frames into the HR grid
3.8 Delaunay triangulation, Voronoi tessellation: (a) the Delaunay triangulation in bold with the corresponding Voronoi tessellation in fine lines; (b) the Voronoi cells around each Delaunay point
3.9 Block diagram representation of Eq. 3.17; blocks Gk and Rm,l are defined in Fig. 3.10
3.10 Extended block diagram representation of the Gk and Rm,l blocks in Fig. 3.9: (a) block diagram of the L2-norm term cost derivative (Gk); (b) block diagram representation of the regularization term cost derivative (Rm,l)
3.11 Superresolution simulation scenarios. Above: using interpolation-based reconstruction methods; below: using optimization-based reconstruction methods
3.12 Comparison of the reconstruction algorithms with the books sequence: (a) one of the 10 LR frames; (b) reconstructed image using Delaunay interpolation (RMSE=1.3484); (c) reconstructed image using L2-norm minimization with γ = 0.03 (RMSE=1.3819); (d) reconstructed image using L2-norm and Bilateral-TV regularization term minimization with γ = 0.03 and Bilateral-TV parameters λ = 0.8 and α = 0.7 (RMSE=1.3791)
3.13 Comparison of the reconstruction algorithms with the frida sequence: (a) one of the 10 LR frames; (b) reconstructed image using Delaunay interpolation (RMSE=1.2151); (c) reconstructed image using L2-norm minimization with γ = 0.03 (RMSE=1.2298); (d) reconstructed image using L2-norm and Bilateral-TV regularization term minimization with γ = 0.03 and Bilateral-TV parameters λ = 0.5 and α = 0.7 (RMSE=1.2236)
3.14 Full resolution chart high-resolution test image
3.15 Simulation results with the resolution chart sequence: computation time versus block size
3.16 Simulation results with the resolution chart sequence: graph of computation time versus block size
3.17 Simulation results with the resolution chart sequence: number of processors versus computation time
3.18 Simulation results with the resolution chart sequence: graph of number of processors versus computation time
3.19 Comparison of the reconstruction step using different block sizes with the resolution chart sequence: (a) detail of one of the 6 LR frames; reconstructed image details with block sizes of (b) 64, (c) 128, (d) 256, (e) 512, and (f) 1024 pixels
3.20 Comparison of the reconstruction step using different block sizes with the resolution chart sequence: (a) detail of one of the 6 LR frames; reconstructed image details with block sizes of (b) 64, (c) 128, (d) 256, (e) 512, and (f) 1024 pixels
4.1 GUI Screenshots: Harris Features Dialog
4.2 GUI Screenshots: SIFT Features Dialog
4.3 Laplacian of Gaussian approximation. Top row: the second order Gaussian derivatives in the x, y and xy directions (Lxx, Lyy, Lxy). Bottom row: box filter approximations in the x, y and xy directions (Dxx, Dyy, Dxy)
4.4 Filter pyramid. The traditional approach to constructing a scale-space (left): image sizes vary and a Gaussian kernel is repeatedly applied to smooth subsequent pyramid levels. The SURF approach (right) leaves the original image unchanged and varies only the filter size
4.5 GUI Screenshots: SURF Features Dialog
4.6 Pseudocode of the generic RANSAC algorithm
4.7 GUI Screenshots: Polynomial Model Optimization Dialog with RANSAC Outlier Rejection
4.8 Image registration demonstration. Above: reference image (river1.jpg). Below: input image (river2.jpg)
4.9 Image registration demonstration. Interest point matching using SURF: correspondences between reference (above) and input (below) images
4.10 Image registration demonstration. Blend mosaic of reference and registered input image after image registration
5.1 Landsat 7 ETM+ spectral channels
5.2 Landsat/ETM+ scan line corrector failure (source: [74])
5.3 Complete Landsat 7 scene showing affected vs. unaffected area (source: [74])
5.4 Above: USGS Earth Explorer Browser. Below: panchromatic band of one of the Landsat/ETM+ scenes with SLC-off scan gaps
5.5 MTL metadata file extract of one of the downloaded Landsat/ETM+ scenes
5.6 Landsat/ETM+ superresolution using Delaunay interpolation: Zarzuela Hippodrome (a) panchromatic image, (b) output image; Retiro Park (c) panchromatic image, (d) output image
5.7 High Rate MSG/SEVIRI spectral channels
5.8 MSG/SEVIRI channels, running left to right (source: [75])
5.9 MSG/SEVIRI VIS0.6 frames of the area of the Iberian Peninsula
5.10 MSG/SEVIRI VIS0.6 superresolution results using L2+Bilateral-TV minimization: (a) one of the input frames, (b) output image
Chapter 1: Introduction

1.1 Motivation

Super-Resolution (SR) refers to techniques that construct high-resolution (HR) images from several observed low-resolution (LR) images. This process has also been referred to in the literature as resolution enhancement (RE).

The spatial resolution of an image is limited by the imaging sensors of the acquisition device. Lens blur (associated with the sensor point spread function (PSF)), lens aberration effects, aperture diffraction, and optical blurring due to motion are degradation factors present in any image acquisition system.

Constructing imaging chips and optical components to capture very high-resolution images is prohibitively expensive. SR techniques accept the image degradations and use signal processing to post-process the captured images, trading computational cost for hardware cost.

The field of super-resolution has a vast area of application. To name a few applications of super-resolution in both the civilian and military domains:

• medical imaging (CT, MRI, ultrasound, etc.) [1]
• remote sensing [2, 3, 4, 5]
• surveillance systems: zooming into a region of interest in video for human perception (e.g., license plate recognition [6]), automatic target detection (e.g., face recognition [7, 8])
• compressed video enhancement [9]

Since the pioneering work by Tsai and Huang [10] in 1984, SR has become a hot research topic and thousands of SR papers have bloomed into publications. Many techniques have been proposed over the last two decades, representing approaches from the frequency domain to the spatial domain [11, 12, 13, 14, 15, 16, 17, 18, 19], and several books focused on super-resolution techniques and applications have been published [20, 21].

1.2 Notations

Bold lowercase symbols denote vectors, plain uppercase symbols denote matrices and plain lowercase symbols denote scalars. Uppercase bold letters X and Y denote HR and LR images respectively, while lowercase bold letters x and y denote their vector forms.
Underlined lowercase bold letters are used to denote a vector concatenation of multiple vectors; for instance, y is a vector concatenation of y_k (k = 1, 2, ..., K). Underlined uppercase bold letters are used to denote a matrix concatenation of multiple matrices; for instance, Y is a matrix concatenation of Y_k (k = 1, 2, ..., K).

Locations in an image are positions on a plane, described in terms of x and y. Coordinate x increases to the right while coordinate y increases downward. The rotation angle is defined clockwise, with the rotation center at the center of the image.

1.3 Super-Resolution Formal Problem Definition

The super-resolution problem may be formally described as follows. Let x(x_1, x_2) with x_1, x_2 ∈ ℝ denote a continuous scene in the image plane coordinate system. Given a sequence of p low-resolution images y_k(n_1, n_2) with n_1 ∈ (1, ..., N_1k), n_2 ∈ (1, ..., N_2k) and k ∈ (1, ..., p), acquired by imaging the scene x(x_1, x_2), our objective is to estimate x̂(m_1, m_2) on the discrete super-resolution sampling grid [m_1, m_2] with m_1 ∈ (1, ..., M_1) and m_2 ∈ (1, ..., M_2). Typically, we choose M_1 > N_1k and M_2 > N_2k for all k. Super-resolution refers to the reconstruction of images x̂(m_1, m_2) that are visually superior to the original low-resolution observations.

1.3.1 Image Observation Model

SR algorithms attempt to extract the HR image corrupted by the limitations of the optical imaging system. This type of problem is an example of an inverse problem, wherein the source of information (the HR image) is estimated from the observed data (the LR images). Solving an inverse problem in general requires first constructing a forward model (see Fig. 1.1). The classic forward model of super-resolution in the spatial domain assumes that a sequence of p low-resolution images represents different snapshots of the same scene. Each LR frame, y_1, y_2, ..., y_p, is a noisy, down-sampled version of the reference image x that is subjected to various imaging conditions such as optical, sensor and atmospheric blur, motion effects, and geometric warping. Each LR frame y_k is related to the unknown HR image x by the following forward model [15]:

    y_k = H_k x + n_k,    (1.1)

where H_k represents the imaging system of the k-th LR frame and n_k is a random noise vector inherent to any imaging system.

The three terms necessary to capture the image formation process are image motion, optical blur, and the sampling process. These three terms can be modelled as separate matrices:

    H_k = D B M_k,    (1.2)

where M_k encodes the motion information for the k-th frame, B is the blurring operation due to the optical Point Spread Function (PSF), and D represents the effect of sampling by the image sensor (the down-sampling operator).
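To make the forward model concrete, here is a minimal sketch (not the thesis code) of how one LR frame could be simulated from an HR image with standard OpenCV calls, assuming an affine motion model for M_k and a Gaussian PSF for B; the function name and parameters are illustrative only:

    // Hypothetical sketch: y_k = D B M_k x + n_k as image operations.
    // hr is assumed to be a single-channel CV_32F image.
    #include <opencv2/core.hpp>
    #include <opencv2/imgproc.hpp>

    cv::Mat simulateLRFrame(const cv::Mat& hr,      // HR image x
                            const cv::Mat& warp2x3, // affine motion M_k (assumed model)
                            double psfSigma,        // Gaussian PSF std-dev for B
                            int factor,             // down-sampling factor for D
                            double noiseSigma)      // std-dev of n_k
    {
        cv::Mat warped, blurred, lr, noise;
        // M_k: geometric warp of the scene for the k-th frame.
        cv::warpAffine(hr, warped, warp2x3, hr.size());
        // B: optical blur, modeled here as a Gaussian PSF.
        cv::GaussianBlur(warped, blurred, cv::Size(0, 0), psfSigma);
        // D: down-sampling by the sensor.
        cv::resize(blurred, lr, cv::Size(), 1.0 / factor, 1.0 / factor,
                   cv::INTER_NEAREST);
        // n_k: additive zero-mean Gaussian noise.
        noise.create(lr.size(), CV_32F);
        cv::randn(noise, 0.0, noiseSigma);
        lr.convertTo(lr, CV_32F);
        return lr + noise;
    }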
Then, the SR problem consists in obtaining an HR image from a sequence of geometrically warped, blurred, noisy and under-sampled LR frames of the original scene.

Figure 1.1: Imaging process for the super-resolution scenario

1.4 Super-Resolution in the Frequency Domain

Early works on super-resolution mainly explored the shift and aliasing properties of the Fourier transform. The frequency domain approach is based on the following three principles:

1. the shifting property of the Fourier transform,
2. the aliasing relationship between the continuous Fourier transform (CFT) of an original HR image and the discrete Fourier transform (DFT) of the observed LR images,
3. and the assumption that the original HR image is bandlimited.

Tsai and Huang [10] were the first to superresolve a single HR image from a set of down-sampled LR frames (without blur). Global translations are the only motion considered in the frequency domain approach. Let x(x_1, x_2) denote a continuous HR image and X(w_1, w_2) its CFT. The k-th shifted image can be expressed as:

    x_k(x_1, x_2) = x(x_1 + δ_k1, x_2 + δ_k2),    (1.3)

where δ_k1 and δ_k2 are the shifts from a reference frame and k = 1, ..., p. Let X(w_1, w_2) be bandlimited, i.e.:

    |X(w_1, w_2)| = 0  for  |w_1| ≥ L_1 π/T_1 and |w_2| ≥ L_2 π/T_2.

By the shift property of the CFT, the CFT of the k-th shifted image X_k(w_1, w_2) can be written as:

    X_k(w_1, w_2) = e^{j2π(δ_k1 w_1 + δ_k2 w_2)} X(w_1, w_2).    (1.4)

Then, the k-th shifted HR image is sampled with sampling periods T_1 and T_2 to generate the k-th LR frame y_k(n_1, n_2).
From the aliasing relationship and the bandlimitedness assumption on X(w_1, w_2), the DFT of the k-th observed LR frame can be expressed in terms of samples of the CFT of the HR image as [15]:

    Y_k(\Omega_1, \Omega_2) = \frac{1}{T_1 T_2} \sum_{n_1=0}^{L_1-1} \sum_{n_2=0}^{L_2-1} X_k\!\left(\frac{2\pi}{T_1}\Big(\frac{\Omega_1}{N_1} + n_1\Big), \frac{2\pi}{T_2}\Big(\frac{\Omega_2}{N_2} + n_2\Big)\right).    (1.5)

By using lexicographic ordering for the indices n_1 and n_2 on the right-hand side and k on the left-hand side, a matrix-vector form is obtained:

    Y = Φ X,    (1.6)

where Y is a p×1 column vector whose k-th element contains the DFT coefficients of y_k(n_1, n_2), X is an L_1 L_2 × 1 column vector with the samples of the unknown CFT of x(x_1, x_2), and Φ is a p × L_1 L_2 matrix which relates the DFT of the observed LR images to the samples of the continuous HR image.

Therefore, super-resolution reconstruction is reduced to:

1. finding the DFTs of the LR images,
2. determining Φ,
3. solving the inverse problem, and finally
4. using the inverse DFT to obtain the reconstructed image.

The matrix Φ requires knowledge of the translation parameters δ_k1 and δ_k2, which are typically not known a priori. These parameters must be estimated before the reconstruction. Super-resolution is then divided into two steps:

1. motion estimation to determine the translation parameters, and
2. restoration of the HR image.

Several authors generalized Tsai and Huang's method to include sensor blur and noise in the observed LR frames. A first generalization of Tsai and Huang's method for noisy images was presented by Kim et al. [22], resulting in a weighted least-squares formulation. In their approach, all the observed images are considered to have the same noise and blur. This method was later refined by Kim and Su [23] to consider different blurs in the observed images, resulting in a least-squares formulation with Tikhonov regularization. Bose et al. [24] proposed the recursive total least squares method for SR reconstruction to reduce the effects of registration errors (errors in Φ). Also, Rhee and Kang [25] proposed a discrete cosine transform (DCT)-based method to reduce the memory requirements and computational cost.

However, these frequency domain approaches are very restricted in the image observation model they can handle. The global translation model is inappropriate for many applications, and real problems are much more complicated. Researchers nowadays most commonly address the problem in the spatial domain, for its flexibility to model all kinds of image degradations.
1.5 Super-Resolution in the Spatial Domain

Spatial domain techniques are the most popular ones developed for super-resolution. Their popularity is due to the fact that the motion is not limited to a translational model; more general, global or non-global motion can also be incorporated and dealt with. The spatial-domain SR techniques can be classified into two main categories: non-uniform interpolation methods and iterative methods.

1.5.1 Non-Uniform Interpolation Methods

The basic idea behind SR is to combine the non-redundant information contained in multiple low-resolution frames to generate a high-resolution image [26]. The non-redundant information contained in these LR images is typically introduced by subpixel shifts between them. These subpixel shifts may occur due to uncontrolled motion between the imaging system and the scene (movements of objects) or due to controlled motion (e.g., a satellite imaging system orbits the earth with a predefined speed and path). SR is possible only if subpixel motion exists between the LR frames, which motivates a forward non-iterative approach based on interpolation and restoration.

There are three stages in this approach (see Fig. 1.2; a toy sketch of stage 2 follows the figure caption below):

1. the LR frames are first aligned by some image registration algorithm to subpixel accuracy;
2. each pixel from each of the LR frames is then placed onto an HR grid using the registration information estimated by the registration routines, and non-uniform interpolation methods are used to fill in the missing pixels on the HR image grid;
3. finally, the HR image grid is deblurred by a classical deconvolution algorithm with noise removal [27] to get x̂ (restoration).

Figure 1.2: Interpolation-restoration super-resolution scenario
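The following is a toy sketch of stage 2 (a hypothetical helper, not the thesis code), assuming a pure-translation motion model and CV_32F frames: registered LR pixels are scattered onto the HR grid, and the cells left empty are the ones a non-uniform interpolator (e.g., a Delaunay-based one) would fill in afterwards.

    #include <opencv2/core.hpp>
    #include <vector>

    struct RegisteredFrame { cv::Mat img; double dx, dy; }; // CV_32F + subpixel shift

    cv::Mat placeOnHRGrid(const std::vector<RegisteredFrame>& frames, int factor)
    {
        cv::Size hrSize(frames[0].img.cols * factor, frames[0].img.rows * factor);
        cv::Mat hr   = cv::Mat::zeros(hrSize, CV_32F);
        cv::Mat hits = cv::Mat::zeros(hrSize, CV_32F);
        for (const auto& f : frames)
            for (int r = 0; r < f.img.rows; ++r)
                for (int c = 0; c < f.img.cols; ++c) {
                    // Nearest HR grid point for the shifted LR sample.
                    int x = cvRound((c + f.dx) * factor);
                    int y = cvRound((r + f.dy) * factor);
                    if (x >= 0 && x < hrSize.width && y >= 0 && y < hrSize.height) {
                        hr.at<float>(y, x)   += f.img.at<float>(r, c);
                        hits.at<float>(y, x) += 1.f;
                    }
                }
        cv::divide(hr, cv::max(hits, 1.0), hr); // average coincident samples
        return hr; // holes (hits == 0) remain zero, awaiting interpolation
    }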
Alam et al. [28] presented an interpolation SR scheme based on weighted nearest neighbors, followed by Wiener filtering for deblurring. Nguyen and Milanfar [29] proposed a wavelet-based interpolation SR reconstruction algorithm exploiting the interlacing sampling structure in the low-resolution data. Lerttrattanapanich and Bose [30] proposed a triangulation-based method for interpolating irregularly sampled data, while Lo and Dragotti chose a cubic-spline interpolation technique.

These interpolation-restoration forward approaches are intuitive, simple, and computationally efficient. However, the step-by-step forward approach does not guarantee optimality of the estimation. The registration error can easily propagate to the later processing. Also, the interpolation step is suboptimal when it does not consider the noise and blurring effects. Moreover, without an HR image prior as proper regularization, the interpolation-based approaches need special treatment of limited observations in order to reduce aliasing.

1.5.2 Iterative Methods

1.5.2.1 Iterative Back-Projection Techniques

Irani and Peleg [31] proposed a simple but very popular method, based on an error back-projection scheme inspired by computer-aided tomography. The algorithm starts with an initial guess x̂^0 for the output HR image. The forward model is used to generate the LR images ŷ_k based on the initial guess. These simulated LR images are then compared with the observed ones y_k. The algorithm iteratively updates the current estimate by adding back the warped simulation error convolved with a back-projection function (BPF):

    x̂^{n+1} = x̂^n + \sum_k M_k^{-1} [h_{bpf} * S↑(y_k - ŷ_k)],    (1.7)

where h_{bpf} is the back-projection kernel, S↑ is the up-sampling operator, M_k^{-1} is the inverse warping operator and ŷ_k is the k-th LR frame simulated from the current HR estimate. The original algorithm considers translational and rotational motion, but the authors claim that the same concept can be applied to other motions as well. A compact sketch of this update follows below.
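The sketch below implements the update in Eq. 1.7 under the same assumptions as the forward-model sketch in Section 1.3.1 (affine motion, Gaussian PSF, and a Gaussian back-projection kernel); simulateLRFrame() is the hypothetical helper defined there, and LR frames are assumed to be CV_32F.

    #include <opencv2/core.hpp>
    #include <opencv2/imgproc.hpp>
    #include <vector>

    cv::Mat ibp(const std::vector<cv::Mat>& lrFrames,
                const std::vector<cv::Mat>& warps,   // M_k as 2x3 affine matrices
                double psfSigma, int factor, int iters)
    {
        // Initial guess x^0: plain up-sampling of the first LR frame.
        cv::Mat x;
        cv::resize(lrFrames[0], x, cv::Size(), factor, factor, cv::INTER_CUBIC);
        for (int n = 0; n < iters; ++n) {
            cv::Mat update = cv::Mat::zeros(x.size(), x.type());
            for (size_t k = 0; k < lrFrames.size(); ++k) {
                // Simulate y_k from the current estimate via the forward model.
                cv::Mat sim = simulateLRFrame(x, warps[k], psfSigma, factor, 0.0);
                cv::Mat err = lrFrames[k] - sim, errUp, bp, back;
                cv::resize(err, errUp, x.size(), 0, 0, cv::INTER_NEAREST); // S up-arrow
                // Back-projection kernel h_bpf, chosen here as a Gaussian.
                cv::GaussianBlur(errUp, bp, cv::Size(0, 0), psfSigma);
                // M_k^{-1}: warp the correction back to the reference frame.
                cv::warpAffine(bp, back, warps[k], x.size(),
                               cv::INTER_LINEAR | cv::WARP_INVERSE_MAP);
                update += back;
            }
            // Average the per-frame corrections before adding them back.
            x += update / static_cast<double>(lrFrames.size());
        }
        return x;
    }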
1.5.2.2 Statistical Techniques: Bayesian Approach to SR

The Bayesian techniques are based on Bayes' theorem and treat the problem of estimating the high-resolution image as a statistical estimation problem [32, 33].

ML formulation for SR. Given a set of observed LR images y, the Maximum Likelihood (ML) estimator of x maximizes the likelihood function Pr(y|x) with respect to x:

    x̂ = arg max_x Pr(y|x).    (1.8)

Assuming the image noise to be independent and identically distributed (i.i.d.) zero-mean Gaussian with variance σ_k², the likelihood function of an observed image y_k is:

    \Pr(\mathbf{y}_k|\mathbf{x}) = \prod_{\forall n_1,n_2} \frac{1}{\sigma_k\sqrt{2\pi}} \exp\!\left(-\frac{(y_k(n_1,n_2) - \hat{y}_k(n_1,n_2))^2}{2\sigma_k^2}\right),    (1.9)

where the simulated LR image ŷ_k is given by the forward model ŷ_k = D B M_k x. Applying logarithms to Eq. 1.9 (and dropping additive constants and positive scale factors) one gets:

    ln Pr(y_k|x) ∝ - \sum_{∀n_1,n_2} (y_k(n_1,n_2) - ŷ_k(n_1,n_2))^2 = -‖y_k - ŷ_k‖_2^2 = -‖y_k - H_k x‖_2^2.    (1.10)

Assuming independent LR images, the log-likelihood function over all images is given by:

    ln Pr(y|x) = \sum_{∀k} ln Pr(y_k|x) ∝ - \sum_{∀k} ‖y_k - ŷ_k‖_2^2 = -‖y - H x‖_2^2.    (1.11)

Note that the forward models of all p LR images are stacked vertically to form an over-determined linear system:

    \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_p \end{bmatrix} = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_p \end{bmatrix} \mathbf{x} + \begin{bmatrix} \mathbf{n}_1 \\ \mathbf{n}_2 \\ \vdots \\ \mathbf{n}_p \end{bmatrix}, \qquad \mathbf{y} = H\mathbf{x} + \mathbf{n}.    (1.12)

Therefore, the ML estimator can finally be expressed as:

    x̂ = arg min_x ‖y - H x‖_2^2.    (1.13)

This is a standard linear least-squares minimization, and the solution is given by the pseudo-inverse of H:

    x̂ = H^+ y,  with  H^+ = (H^T H)^{-1} H^T.    (1.14)

H is a very large sparse matrix, so it is not possible in practice to directly compute the pseudo-inverse H^+. For this reason, iterative methods (the steepest descent method, the conjugate gradient method, etc.) are commonly used to solve the least-squares problem. The Irani and Peleg iterative back-projection algorithm [31] is nothing other than an ML estimator solved using the steepest descent method.
MAP formulation for SR. Given a set of observed LR images y, the Maximum a Posteriori (MAP) estimator of x maximizes the a posteriori probability Pr(x|y) with respect to x:

    x̂ = arg max_x Pr(x|y).    (1.15)

Applying Bayes' theorem to the conditional probability, the MAP optimization problem can be expressed as:

    x̂ = arg max_x Pr(y|x) Pr(x),    (1.16)

where Pr(y|x) is the data likelihood function over all observed images and Pr(x) is the a priori image model of the desired HR image. Applying logarithms, the MAP estimator can be rewritten as:

    x̂ = arg max_x [ln Pr(y|x) + ln Pr(x)],    (1.17)

and substituting Eq. 1.11 into Eq. 1.17 one gets:

    x̂ = arg min_x [‖y - H x‖_2^2 - ln Pr(x)].    (1.18)

The MAP estimation model provides the ability to include a priori knowledge, which helps to regularize the ill-conditioned nature of SR. The specific form of the MAP estimator depends on the choice of the a priori image model. Markov Random Field (MRF) priors are often adopted [33]. Using an MRF prior, Pr(x) is defined via the Gibbs distribution in an exponential form:

    Pr(x) ∼ exp(-α U(x)),    (1.19)

where U(x) is called an energy function, which measures the cost caused by irregularities of the solution. Therefore, Eq. 1.18 can be rewritten as:

    x̂ = arg min_x [‖y - H x‖_2^2 + α U(x)].    (1.20)

Different kinds of priors have been proposed in the literature. The three most commonly used image priors for SR reconstruction are: Gaussian MRFs [34] (forcing smoothness of the reconstructed image), Huber MRFs [35] (forcing smoothness while sharpening image edges) and Total Variation [16]. For more comparisons of how these image priors affect the SR solution, one can further refer to [32] and [36].

In the classical MAP formulation for SR, the motion parameters are assumed to be known. If they are not, they can be estimated simultaneously with the HR image using the joint MAP formulation for SR [37, 38, 39]. Due to the advantage of including prior information, the MAP framework is usually preferred over ML.
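As a concrete instance of Eq. 1.20 (a standard derivation, not taken verbatim from the thesis), assuming a Gaussian MRF prior with energy U(x) = ‖Γx‖²₂, where Γ is a highpass operator (anticipating the Tikhonov discussion in the next section), the MAP estimate becomes a regularized least-squares problem with a closed-form normal-equations solution:

    \Pr(\mathbf{x}) \sim \exp\!\left(-\alpha\,\|\Gamma\mathbf{x}\|_2^2\right)
    \;\Longrightarrow\;
    \hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{y} - H\mathbf{x}\|_2^2 + \alpha\,\|\Gamma\mathbf{x}\|_2^2
    \;\Longrightarrow\;
    \hat{\mathbf{x}} = \left(H^{T}H + \alpha\,\Gamma^{T}\Gamma\right)^{-1} H^{T}\mathbf{y},

although, as with Eq. 1.14, the normal equations are never formed explicitly in practice for images of realistic size.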
1.5.2.3 Deterministic Techniques: Optimization-Based Approach to SR

The problem of estimating an HR image from a sequence of LR images can be formulated as the following optimization problem:

    x̂ = arg min_x ‖y - H x‖_p = arg min_x \sum_{k=1}^{n} ‖y_k - H_k x‖_p,    (1.21)

where ‖·‖_p is the p-norm distance. This cost function (usually called the data fidelity term) minimizes the distance (in a p-norm sense) between the model and the observed images. For points (a_1, a_2, ..., a_n) and (b_1, b_2, ..., b_n), the p-norm distance is defined as:

    \left( \sum_{i=1}^{n} |a_i - b_i|^p \right)^{1/p}.    (1.22)

However, this is a highly ill-conditioned inverse problem, which means that the solution (the HR image) is very sensitive and can vary tremendously in an arbitrary manner with very small changes in the data (the set of LR frames). This also leads to the problem of slow convergence. To stabilize this ill-posed nature, a regularization term R(x) is included in Eq. 1.21:

    x̂ = arg min_x \left[ \sum_{k=1}^{n} ‖y_k - H_k x‖_p + λ R(x) \right].    (1.23)

There is no unique procedure for constructing the regularization term, but it is usually chosen to incorporate a priori knowledge of the real HR scene. The regularization parameter λ is a scalar weighting the data fidelity term against the regularization term. One of the most widely referenced regularization cost functions is the Tikhonov cost function [14], R(x) = ‖Γx‖²₂, where Γ is usually a highpass operator such as a derivative, a Laplacian, or even the identity matrix. The intuition behind this regularization method is to limit the total energy of the image (when Γ is the identity matrix) or to force spatial smoothness (for derivative or Laplacian choices of Γ).

It is important to note that the Bayesian-based SR techniques can be seen as a particular case of Eq. 1.23 with p equal to 2 (the L2 norm). The MAP formulation for SR is then nothing other than an optimization technique with regularized L2-norm minimization; the Tikhonov regularization term is derived in the MAP formulation if a Gaussian MRF prior on the HR image is assumed. Moreover, the ML formulation for SR is nothing other than an L2-norm minimization without a regularization term. An explicit iterative update for Eq. 1.23 is sketched below.
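For p = 2 and a differentiable R, a steepest-descent iteration on Eq. 1.23 (a standard derivation, with constant factors absorbed into the step size β) makes the link to implementable image operations explicit: the transposed operators H_kᵀ = M_kᵀBᵀDᵀ correspond to up-sampling, blurring with the flipped PSF, and inverse warping, which is exactly what an image-based formulation exploits (Section 1.7):

    \hat{\mathbf{x}}^{(t+1)} = \hat{\mathbf{x}}^{(t)}
      - \beta \left[ \sum_{k=1}^{n} H_k^{T}\!\left(H_k \hat{\mathbf{x}}^{(t)} - \mathbf{y}_k\right)
      + \lambda\, \nabla R\!\left(\hat{\mathbf{x}}^{(t)}\right) \right],
    \qquad H_k^{T} = M_k^{T} B^{T} D^{T}.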
1.5.3 Hybrid Techniques: Iterative-Interpolation SR Approach

Banore [26] presented the theory of a hybrid reconstruction scheme, the Iterative-Interpolation Super-Resolution (IISR) algorithm, for the restoration of high-resolution images from a sequence of geometrically warped, aliased and under-sampled LR frames. Interpolation techniques are used to produce an initial estimate of the HR image, and then, using an iterative approach, the final approximation of the HR image is produced.

1.6 Other SR Techniques

1.6.1 Projection onto Convex Sets (POCS) Method

The method of Projection onto Convex Sets (POCS) was introduced by [40, 41] in 1982. The concept of POCS applied to the problem of super-resolution was first introduced in [42]. In this approach, the space of estimated HR solutions is restricted by a set of constraints (closed convex sets) which model desirable properties (fidelity to data, smoothness, sharpness, etc.) of the HR image. For each convex constraint set C_i, a projection operator P_i is defined. The problem is then reduced to iteratively locating the closest solution which intersects all the given convex constraints C_i.

For example, fidelity to data can be modeled as n convex sets:

    C_k = { x : ‖y_k - ŷ_k‖_2^2 ≤ σ^2 },  1 ≤ k ≤ n,    (1.24)

and smoothness can be defined as:

    C_Γ = { x : ‖Γx‖_p ≤ σ },    (1.25)

where ‖·‖_p is the p-norm distance.

With a group of M convex sets, the desired solution lies in the intersection of these sets. Given an initial guess x^0, the POCS technique suggests the following recursive algorithm for finding a point within the intersection set:

    x^{k+1} = P_M P_{M-1} ... P_2 P_1 x^k,    (1.26)

where P_i is the projection operator that projects a point onto the closed convex set C_i.

The POCS method is simple and allows the inclusion of a priori information, but it has a very high computational cost and a slow convergence rate, which limits its practical applicability. Also, the final solution is not unique and depends strongly on the initial guess.

1.6.2 Adaptive Filtering Techniques

The adaptive filtering approach for SR reconstruction was introduced in [43]. Elad and Feuer proposed a few algorithms based on least squares: recursive least squares (RLS) and pseudo-RLS. For estimating the HR image, both steepest descent (SD) and normalized steepest descent (NSD) were applied. In further research [44], the authors rederived their least squares-based algorithms as approximations of the Kalman filter [45]. Their algorithms were built on the assumption that the motion between the images and the blur operators are known. The Kalman filter approach is promising but is still in an experimental state, as its computational cost is extremely high.
1.6.3 Learning-Based Techniques

In recent years, a lot of interest has grown in learning-based techniques [46, 47]. This class of SR techniques generates HR images from one or more LR frames by learning from a collection of training images. These training images are scenes of either the same or different types. Baker and Kanade [48] proposed a learning-based SR technique for human faces (or text) and called it face hallucination. Using a face database, the method learns the relationship between LR frames of human faces and their known HR frames. This learnt information is later used for the reconstruction of HR images from LR images of human faces. The performance of learning-based SR techniques depends on the accuracy of matching between the input LR frame and the training samples.

1.7 Challenge Issues for Super-Resolution

There is a limited real-world SR presence [49] and only a few commercial SR products are offered in the market (QE Super-resolution, http://www.qelabs.com/; PhotoAcute, http://www.photoacute.com/; MotionDSP Inc., http://www.motiondsp.com/; Scientific Systems Company Inc., http://www.ssci.com/). Why is SR not used more? In building a practical SR system, many challenging issues are still under research, preventing SR techniques from wide application. Any commercial SR product should incorporate, early in the design of its SR algorithms, solutions to the following issues:

1. image registration accuracy with complex geometric models for real-world images,
2. robustness of the SR reconstruction in the presence of image registration errors, and
3. the high memory and computational requirements of SR algorithms.

Image registration accuracy: computer vision contributions to SR. Image registration is critical for the success of SR reconstruction. Traditional SR reconstruction usually treats image registration as a process distinct from the HR image estimation. Therefore, the recovered HR image quality depends largely on the image registration accuracy of the previous step.

Many SR estimators, particularly those derived in the Fourier domain, are based on the assumption of purely translational image motion. The most popular algorithms for subpixel image registration assume a simple translational and rotational model for the alignment of the set of observed LR images. In a real-world scenario, such as remote sensing imagery, mathematical models are more complex and it is necessary to introduce a registration step prior to the SR reconstruction which provides a more complex mathematical model (polynomial models, RPC models or even rigorous sensor models) to estimate the registration parameters.
Fortunately, super-resolution can benefit from the computer vision field to find fast, accurate, and robust automated methods for registering images with complex geometric transformations [50]. In computer vision, it is common to estimate the parameters of a geometric transformation using feature-based registration techniques. Feature-based algorithms can deal with widely separated views and are robust to illumination changes. Typically, each image has several hundred interest points (called key points) that are automatically detected using an algorithm such as the Harris corner detector [51], the SIFT algorithm [52] or the SURF algorithm [53], to cite some of the most popular feature detectors. Each key point detected in the reference LR image must be identified (if it exists) in the other observed images. This process is commonly called matching. One can compare the image neighborhoods around the features using a similarity metric such as normalized correlation (for the Harris detector) or a distance metric (e.g., the Euclidean distance) between the associated key point descriptors (for SIFT and SURF). After that, a mathematical model is fit to the set of matched key points. In the presence of data outliers (due to errors in the matching operation), it is common to use a robust fitting procedure such as the RANSAC algorithm [54]. A sketch of this pipeline is given at the end of this section.

Robustness of SR algorithms. Traditional SR techniques are vulnerable to the presence of outliers due to motion errors (errors in image registration), inaccurate blur models, noise, motion blur, etc. These inaccurate model errors cannot be treated as Gaussian noise, which is the usual assumption behind L2 minimization (see Eq. 1.9). As the forward model (see Eq. 1.1) cannot be estimated perfectly, robustness of SR is of interest. Farsiu et al. [55, 56] changed the commonly used L2 norm into the L1 norm for robust estimation and introduced the so-called Bilateral Total Variation (BTV) regularization term to estimate the HR image.

Implementation of SR algorithms. Another difficulty limiting the practical application of SR reconstruction is its intensive computation due to a large number of unknowns, which requires expensive sparse matrix manipulations. To illustrate this problem, suppose we want to estimate an HR image of size 256x256 (= 2^8 x 2^8) from a set of 10 LR images of size 128x128 (= 2^7 x 2^7). Each observed image is cast into a column vector of size 16384x1 (= 2^14 x 1). The forward model matrix of each observed image then has size 16384x65536 (2^14 x 2^16 = 2^30 ≈ 1G entries). Assuming we are working with 8-bit gray scale images, we need 1 GB (!) to store the forward model of just 1 of the 10 observed images. One can use sparse matrix storage and operators to reduce these extremely high memory requirements, but this still does not solve the problem for real-world image sizes.

Zomet and Peleg [57] studied the application of D (subsampling model), B (blur model) and M_k (motion model of the k-th LR frame) directly as the corresponding image operations of down-sampling, blurring and shifting, translating the vector-based formulation (which works with big sparse matrices) into an image-based formulation and bringing significant speedups.

Finally, there are a few works related to parallel computing (CPU-GPU) [58] and hardware implementations (FPGA) [59] of superresolution algorithms, but these approaches, highly relevant to practical superresolution scenarios, have not been widely explored yet.
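The detect-match-robust-fit pipeline just described can be sketched with OpenCV in a few lines. This is an illustrative fragment, not the thesis registration plugin: it fits a homography rather than the more general polynomial or sensor models mentioned above, and it assumes a modern OpenCV build where SURF lives in the xfeatures2d contrib module (in the 2.x series used at the time, SURF shipped with the core distribution).

    #include <opencv2/core.hpp>
    #include <opencv2/features2d.hpp>
    #include <opencv2/xfeatures2d.hpp>
    #include <opencv2/calib3d.hpp>
    #include <vector>

    cv::Mat registerPair(const cv::Mat& reference, const cv::Mat& input)
    {
        // Detect key points and compute SURF descriptors in both images.
        auto surf = cv::xfeatures2d::SURF::create();
        std::vector<cv::KeyPoint> kp1, kp2;
        cv::Mat d1, d2;
        surf->detectAndCompute(reference, cv::noArray(), kp1, d1);
        surf->detectAndCompute(input, cv::noArray(), kp2, d2);

        // Match descriptors by Euclidean (L2) distance.
        cv::BFMatcher matcher(cv::NORM_L2);
        std::vector<cv::DMatch> matches;
        matcher.match(d1, d2, matches);

        std::vector<cv::Point2f> pts1, pts2;
        for (const auto& m : matches) {
            pts1.push_back(kp1[m.queryIdx].pt);
            pts2.push_back(kp2[m.trainIdx].pt);
        }
        // Robust model fit: RANSAC rejects mismatched key points (outliers).
        return cv::findHomography(pts2, pts1, cv::RANSAC, 3.0);
    }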
1.8 Outline of Thesis

After a definition of the superresolution problem, Chapter 1 has presented the state of the art of superresolution techniques. This chapter also describes the main contributions of this research.

Chapter 2 presents the software framework that has been developed during this research. The core of the superresolution suite is a superresolution library. An automatic image registration library and a graphical user interface application have also been developed. Design goals are defined, and the design choices made to implement the superresolution suite are justified. Block diagrams for each of the libraries are also presented, together with a description of the functionality of their main modules.

Chapter 3 describes in detail the subpixel registration algorithms and the reconstruction methods implemented in the superresolution library. In order to test our implementation and the plugin performance (block-based image processing and parallel capabilities), several image tests are also included.

Chapter 4 describes in detail the control point detection and matching algorithms and the model optimization routine implemented in the automatic registration library. In order to test the accuracy of the registration process, an image test using one of the feature detectors and a polynomial model is also included.

Chapter 5 presents the results of testing the superresolution software with real satellite imagery datasets. Superresolution with datasets from Landsat/ETM+ and MSG/SEVIRI has been performed. Each section starts with a brief technical overview describing the main image characteristics and ends with the satellite imagery results.

Chapter 6 presents the conclusions of this thesis and recommendations for future work. This is followed by the bibliography.

Appendix A is a build guide with the set of instructions to install the superresolution suite on Linux.

1.9 Contribution of Thesis

The main contribution of this research is the development of a block-based image processing superresolution library with support for parallel computing. This library is suitable for practical remote sensing and other real-world superresolution applications and offers a robust and flexible framework to deal with real-world image datasets. The dynamic plugin-based design allows for rapid growth of the suite by further including other automated image processing chain implementations as case studies, such as image mosaics, vegetation indexes, land cover maps and other high-level products from satellite imagery.
Chapter 2: Super-Resolution Software Framework

2.1 Introduction

In this chapter, the software framework that has been developed during the research is presented:

• a superresolution library;
• an automatic image registration library; and
• a Qt-based graphical user interface (GUI).

Design goals, including language and environment choices, and an overview of the architecture implementation will be presented. For a better understanding, block diagrams for each of the above libraries will also be presented. Each software package comprises several routines, which will be illustrated as modules in the block diagram of that software package, and a description of the functionality of each of these modules will be provided.

2.1.1 Design

This section outlines the design choices which have been made to implement the superresolution suite.

2.1.1.1 Design goals

As mentioned in the previous chapter, there is a limited real-world superresolution presence and only a few commercial SR products are offered in the market. No superresolution software has been found which can deal with satellite imagery in its original sizes. Satellite images are usually big. What does big mean? Let's give some numbers (a quick size check follows the list):

• Any of the high rate MSG/SEVIRI channels is a 10-bit 3712x3712 image (=16.4 MB!)
• The R, G, and B bands of a Landsat/ETM+ scene are 7991x7301 8-bit images (=55.6 MB each!). An RGB composite of a Landsat/ETM+ scene is therefore an ≈150 MB image!
• The panchromatic band (channel 8) of a Landsat/ETM+ scene is a 15981x14601 8-bit image (=222.5 MB!)
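These figures follow directly from the raster dimensions, assuming uncompressed storage and reading MB as MiB (2^20 bytes); for the first and last items:

    3712 \times 3712 \times \tfrac{10}{8}\,\text{bytes} \approx 17.2 \cdot 10^{6}\,\text{bytes} \approx 16.4\,\text{MiB},
    \qquad
    15981 \times 14601 \times 1\,\text{byte} \approx 233.3 \cdot 10^{6}\,\text{bytes} \approx 222.5\,\text{MiB}.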
Several existing works have been an excellent starting point for this research. There is a superresolution reproducible research [60] resources list at http://reproducibleresearch.net/index.php/Super-resolution that gives an overview of code and data related to super-resolution publications. However, no superresolution work has been found that deals explicitly with the problem of big input datasets. The main goal of this research has been to implement a superresolution framework which can be used with satellite images in their original sizes. This means overcoming the current limitations of any superresolution library:

• robustness of the superresolution reconstruction in the presence of image registration errors;
• efficient implementation to deal with the high memory and computational requirements of any superresolution algorithm; and
• image registration accuracy with complex geometric models for real-world images;

but also adapting the superresolution implementation to deal with satellite images by:

• providing block-based image processing to overcome the huge memory requirements; and
• providing a parallel implementation of the state-of-the-art superresolution algorithms for faster product generation.

2.1.1.2 Language and Environment

C++ has been chosen as the programming language to develop the superresolution suite for the following reasons:

1. Speed: Low level image processing needs to be fast, and C++ facilitates the implementation of a highly efficient library of functions.
2. Usability: Almost all image processing appears to be carried out in C++, C and Matlab, so C++ seems the obvious choice for making a useful contribution to the field.
3. Portability: It is possible to write code which is portable across many platforms and compilers.
4. Image processing libraries:
   • OpenCV: An open-source computer vision library with several hundred computer vision algorithms.
   • OSSIM: A powerful suite of geospatial libraries and applications used to process imagery, maps, terrain, and vector data. The architecture of the library supports parallel processing with MPI, a dynamic plugin architecture, and dynamically connectable objects allowing rapid prototyping of custom image processing chains.
   • GDAL: A translator library for raster geospatial data formats.
   • Qt: A cross-platform application framework that is widely used for developing application software with a GUI.
   • OpenMPI: An open source MPI implementation.

The chosen development environment for the implementation of the library is Qt Creator, a powerful IDE (Integrated Development Environment) allowing easy code creation and visual project organisation.

OpenCV. OpenCV [61] is an open source computer vision C/C++ library that includes several hundred computer vision algorithms. At the time of writing, OpenCV's main web site is located at http://opencv.willowgarage.com/wiki.

OpenCV has a modular structure, which means that the package includes several libraries. The following modules are available:

• core - basic data structures and basic functions used by all other modules
• highgui - image and video I/O and UI capabilities
• imgproc - image filtering, geometric image transformations, color space conversion, histograms, etc.
• video - motion estimation, background subtraction and object tracking algorithms
• calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction
• features2d - feature detectors, descriptors, and descriptor matchers

... and some other helper modules, such as FLANN (Fast Library for Approximate Nearest Neighbors).

OSSIM. OSSIM (Open Source Software Image Map) is a powerful suite of geospatial libraries and applications used to process imagery, maps, terrain, and vector data. It is an open source software project maintained at www.ossim.org and has been under active development since 1996. OSSIM has been funded by several US government agencies in the intelligence and defense community, and the technology is currently deployed in research and operational sites. The core of OSSIM is a C++ software library that provides advanced remote sensing, image processing, and geospatial functionality. The architecture of the library supports parallel processing with MPI, a dynamic plugin architecture, and dynamically connectable objects allowing rapid prototyping of custom image processing chains.

Around the core OSSIM library, the software distribution includes a large number of command-line utilities that can be easily scripted for batch production systems, as well as higher level GUI applications (ImageLinker). Additionally, bindings have been generated for other languages. A quick summary of OSSIM functionality includes [62]:
• A large set of commercial and government data formats supported
• A wide range of map projections and datums supported
• Precision terrain correction and orthorectification
• Advanced mosaicing, compositing, and fusion
• Histogram matching and tonal balancing
• Rigorous sensor modeling
• Universal sensor models (RPCs)
• Equation editors
• Elevation support

Dynamic Image Chains. Basic to OSSIM is the support of dynamic image chains. One can dynamically connect loaders, combiners, filters, and outputs within a running program. This building block approach allows complex image processing flows to be interactively constructed and modified. Each object or image unit in the chain may have its own controls and adjustable parameters. The entire state of the chain, including adjusted parameters, can be easily saved and retrieved by OSSIM-enabled programs.

File formats supported. OSSIM provides native file access to more than 100 raster and vector formats and supports a variety of file formats through the use of several plugins. The GDAL plugin provides access to all the raster and vector formats supported by GDAL (http://www.gdal.org/formats_list.html) and OGR (http://www.gdal.org/ogr/ogr_formats.html). GDAL (Geospatial Data Abstraction Library) is a translator library for raster geospatial data formats. It also comes with a variety of useful command-line utilities for data translation and processing. The related OGR library provides a similar capability for vector data. If one reads in a file that is not handled by the OSSIM base library, OSSIM will use GDAL to translate it.

Parallel processing. The OSSIM library supports parallel processing with MPI (Message Passing Interface). OpenMPI is one open source implementation of the MPI standard; at the time of writing, its main web site is located at http://www.open-mpi.org. OSSIM then needs to be built with MPI support; in other words, it must be built with OpenMPI (other MPI implementations such as LAM/MPI can also be considered).

How does it work? OSSIM automatically segments images internally into blocks or tiles. OSSIM is a tile sequencer that passes these tiles or work units to other processors for execution. Each tile goes through many mathematical transformations and can therefore be easily managed through the sequencer and MPI. It is possible for developers to change the parameters of the tile sequencer (the number of tiles passed to each processor per transaction) and the internal tile sizing.

Command-line applications. From the set of command-line applications distributed with OSSIM (http://trac.osgeo.org/ossim/wiki/OSSIMCommandLine), we are using two of its so-called core programs:

• ossim-igen is used to execute image chains specified in a spec file (.spec)
• ossim-info is used to display metadata for imagery, maps, terrain, and vector data.

ossim-igen takes a keywordlist as an argument (the spec file) and builds a product from the keywordlist specification. If OSSIM is built with MPI support, running:

>> mpirun -np <np> ossim-igen igen.spec

will run the application on a cluster of machines for faster product generation.

2.1.1.3 Architecture

The software architecture of the superresolution suite is illustrated in Figure 2.1. The image registration library and the superresolution library have been implemented as OSSIM plugins to exploit the block-based image processing approach and the parallel capabilities of the OSSIM core library. The OSSIM GDAL plugin has been included in the superresolution suite to provide access to all the raster formats supported by GDAL. A Qt-based GUI for image super-resolution has been built on top of the core library and the OSSIM plugins. This application provides a friendly environment to set up the plugins and to create image chains which can be executed with ossim-igen. Finally, a set of test command-line applications is also distributed with the superresolution suite.

Figure 2.1: Superresolution suite architecture: core library, plugins, command-line applications and higher level GUI applications.
Command-line applications. From the set of command-line applications distributed with OSSIM, we are using two of its so-called core programs:

 • ossim-igen is used to execute image chains specified in a spec file (.spec)
 • ossim-info is used to display metadata for imagery, maps, terrain, and vector data

ossim-igen takes a keywordlist as an argument (spec file) and builds a product from the keywordlist specification. If OSSIM is built with MPI support, adding:

>> mpirun -np <np> ossim-igen igen.spec

will run the application on a cluster of machines for faster product generation.

2.1.1.3 Architecture

The software architecture of the superresolution suite is illustrated in Figure 2.1. The image registration library and the superresolution library have been implemented as OSSIM plugins to exploit the block-based image processing approach and the parallel capabilities of the OSSIM core library. The OSSIM GDAL plugin has been included in the superresolution suite to provide access to all the raster formats supported by GDAL. A Qt-based GUI for image super-resolution has been built on top of the core library and the OSSIM plugins. This application provides a friendly environment to set up the plugins and to create image chains which can be executed with ossim-igen. Finally, a set of test command-line applications is also distributed with the superresolution suite.

Figure 2.1: Superresolution suite architecture: core library, plugins, command-line applications and higher level GUI applications.

OSSIM command-line applications: http://trac.osgeo.org/ossim/wiki/OSSIMCommandLine
2.1.1.4 Installation

Detailed instructions on how to install the whole Superresolution Software suite are available in Appendix A.

2.2 Image Superresolution Plugin

The image superresolution plugin is the core of the superresolution suite and contains a selection of the most popular algorithms for image registration and image reconstruction.

2.2.1 Block Diagram

The block diagram of the superresolution plugin is presented in Figure 2.2. The functionality of the plugin can be divided into two principal modules: subpixel image registration and image reconstruction.

 • Image Registration: the aim of this module is to calculate the relative motion between two or more image tiles with subpixel accuracy.
 • Image Reconstruction: given a set of registered LR tiles, the aim of this module is to calculate the output high-resolution tile.

A complete description of the selected and implemented image registration algorithms can be found in Chapter 3.

Note that if the input images are geolocated (registered LR frames in the block diagram illustration), the get tile operation gets the corresponding input tile according to the spatial reference of the input image. How does it work? A brief description of image, view and ground spaces must be introduced in order to answer that question:

 • image space: x,y coordinates of a pixel on the original raw image (input image),
 • ground space: latitude, longitude, height of a pixel on the earth; this requires an elevation model to solve, and finally
 • view space: x,y coordinates of the pixel in the output ortho-image (reference image).

The above transformations are handled by the class called ossimImageViewProjectionTransform (IVT) that owns the two projections: the input projection (in our case the reference view space) and the output projection (the input image space). The IVT is used to determine the mapping from image space to view space, and the other way around.

Figure 2.3 illustrates the process of how to get LR tiles from a previously registered input image. The get tile operation requests a tile on the reference view space. Corner coordinates of the view-space tile rectangle are projected to the input image space (from view space to image space) and a rectangle which completely contains the four corners is requested from the input image (let's call this rectangle the projection tile). In order to get the final view-space tile from the projection tile, pixels are requested via inverse mapping.
Figure 2.2: Superresolution plugin block diagram.
The registration process is done over these final view-space input tiles.

Figure 2.3: Get tile operation with registered LR frames (inverse mapping).

2.2.2 Class Diagram

Figure 2.4 illustrates a detailed class diagram for the superresolution plugin. The superresolution plugin is a combiner (a filter with several inputs and one output) which contains registration and reconstruction routines. Each implemented registration algorithm inherits from class CMatch and each reconstruction algorithm inherits from class CDeconv. All plugin code is registered via a factory class to the OSSIM core.

2.2.3 Software Tools

The minimum chain needed to perform superresolution is:

    N x ossimImageHandler
    |
    1 x ossimImageSuperresolutionCombiner
    |
    1 x ossimImageFileWriter

A command-line application distributed with the superresolution suite allows the user to set up the superresolution combiner and to execute that image chain:

Usage: test-superresolution <xml_file> <spec_file> <output_name>
Description: test-superresolution takes an xml file and a spec file as input and produces a product

The XML file contains a list of the input images. The format of this file is the following:

<SuperRes>
  <Images>
    <Img>/home/projects/superresolution/test_data/res_chart/res_chart1.tif</Img>
    <Img>/home/projects/superresolution/test_data/res_chart/res_chart2.tif</Img>
    ...
  </Images>
</SuperRes>

The superresolution combiner is initialized via a keywordlist:

    scale_factor: 2.0
    registration_method: vandewalle|none|keren|file
    reconstruction_method: delaunay|backproj|backprojreg
    debug: yes|no
Figure 2.4: Superresolution plugin class diagram.
2.3 Image Registration Plugin

Image registration is a critical pre-processing step in image super-resolution. To achieve accurate superresolution image reconstruction, it is critical for image alignment to be precise. Errors in alignment of the LR input images will result in the reconstruction of an erroneous HR image which may not be a true approximation of the original scene. The subpixel image registration algorithms included in the superresolution plugin (see block Image Registration of Figure 2.2) assume that images can be aligned to a reference frame using an affine mathematical model. In general, real-world images need more complex geometrical models. To deal with real-world images, an automatic OSSIM registration plugin has also been implemented. That plugin performs feature detection and matching between two input images and optimizes an arbitrarily complex transformation model from the resulting set of correspondences.

2.3.1 Block Diagram

The block diagram of the automatic registration plugin is presented in Figure 2.5. The functionality of the plugin can be divided into the following modules:

 • Control Point Detection and Matching: distinctive features are automatically identified in both the target and the reference image. These features are represented as control points in the images. A correspondence or matching is then established between the control points of the same scene in the target and the reference frame. The most difficult step of image registration is the accurate establishment of correspondence between a set of control points, as it decides the accuracy level of the registration process.
 • Model Optimization: this task involves the computation of a geometric transformation model and its parameter estimation to accurately overlay the target image over the reference image.
 • Image Resampling: utilizing the mapping function computed in the previous step, the target image is resampled and aligned with the reference image.

A complete description of the control point detection and matching algorithms and of the model optimization routine can be found in Chapter 4.

2.3.2 Class Diagram

Figure 2.6 illustrates a detailed class diagram for the automatic registration plugin. The plugin contains a set of filters, one for each of the selected feature extraction algorithms. Again, all plugin code is registered via a factory class to the OSSIM core.
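The factory mechanism mentioned above can be sketched generically. The classes below are hypothetical stand-ins, not OSSIM's actual factory API: they only illustrate the registration-by-name idea that lets the core instantiate a plugin's combiners when it parses a keywordlist or spec file.

    // Schematic plugin factory (hypothetical names, not the real OSSIM classes).
    #include <functional>
    #include <map>
    #include <memory>
    #include <string>

    struct ImageCombiner { virtual ~ImageCombiner() = default; };
    struct SuperresolutionCombiner : ImageCombiner { /* registration + reconstruction */ };

    class CombinerFactory {
    public:
        using Creator = std::function<std::unique_ptr<ImageCombiner>()>;
        static CombinerFactory& instance() { static CombinerFactory f; return f; }
        void registerClass(const std::string& name, Creator c) { creators_[name] = std::move(c); }
        std::unique_ptr<ImageCombiner> create(const std::string& name) const {
            auto it = creators_.find(name);
            return it == creators_.end() ? nullptr : it->second();
        }
    private:
        std::map<std::string, Creator> creators_;
    };

    // What a plugin's load hook would do when the core opens the shared library:
    void initPlugin() {
        CombinerFactory::instance().registerClass(
            "ossimImageSuperresolutionCombiner",
            [] { return std::make_unique<SuperresolutionCombiner>(); });
    }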
Figure 2.5: Unattended registration plugin block diagram.

2.3.3 Software Tools

The minimum chain needed to perform image registration is:

    2 x ossimImageHandler
    |
    2 x ossimImageRenderer (contains ossimImageViewProjectionTransform)
    |
    1 x ossimArgongraRegistrationCombiner
    |
    1 x ossimImageFileWriter

The result is a GML file (an XML file) with the set of matched control points, which has the following format:

<?xml version="1.0"?>
<TiePointSet xmlns:gml="http://www.opengis.net/gml">
  ...
  <SimpleTiePoint>
    <ground>
      <gml:Point> <gml:coord> <X>[X]</X> <Y>[Y]</Y> </gml:coord> </gml:Point>
    </ground>
    <image>
      <gml:Point> <gml:coord> <X>[X]</X> <Y>[Y]</Y> </gml:coord> </gml:Point>
    </image>
  </SimpleTiePoint>
  ...
</TiePointSet>

A set of command-line test applications distributed with the superresolution suite allows the user to set up the control point detection and matching combiners and to execute the image chain:
Figure 2.6: Unattended registration plugin class diagram.
Usage: test-harris <master_im> <slave_im> [<output_filename> <kwl_file>]
Description: test-harris generates an XML file with point correspondences between two images using the Harris corners feature detector

Usage: test-sift <master_im> <slave_im> [<output_filename> <kwl_file>]
Description: test-sift generates an XML file with point correspondences between two images using SIFT features and descriptors

Usage: test-surf <master_im> <slave_im> [<output_filename> <kwl_file>]
Description: test-surf generates an XML file with point correspondences between two images using SURF features and descriptors

Each combiner can be set up via a keywordlist. Details of the accepted input parameters can be found in sections 4.2.1.3, 4.2.2.3 and 4.2.3.3.

Once the correspondences XML file has been generated, the model optimization can be computed with the Model Optimization tools provided with the Qt Superresolution GUI. See section 4.3 for more details.

Finally, a mosaic generation tool has also been provided with the superresolution suite for visual inspection of the registration accuracy. The mosaic averages the values of the pixels in the overlapping area (blend mosaic). The complete mosaic image chain has the following structure:

    2 x ossimImageHandler
    |
    2 x ossimImageRenderer (contains ossimImageViewProjectionTransform)
    |
    1 x ossimImageMosaic
    |
    1 x ossimImageWriter

The renderer (resampler) uses the IVT to determine the mapping from image space to view space and vice versa, establishing the correct resampling kernel given this relation. The renderer pulls pixels from the input side, resamples them, and populates a requested tile in the output map space.

The following command-line application can be used to generate the mosaic product:

Usage: test-mosaic <master_im> <slave_im>

2.4 Qt Super-Resolution GUI

This software suite is a Qt-based package for image super-resolution [63]. The main objective of this software tool is to create a friendly GUI to interact with the developed OSSIM plugins and to create OSSIM image chains which can be run in a cluster-based scenario. The input images may have any of the raster formats supported by GDAL.
The output results are saved in TIFF format.

This software also allows the user to create data sets for different simulation purposes and to compute several image similarity metrics, and it provides software tools to analyze and visualize the accuracy of geometrical registration models. Several GUI screenshots are included in the related sections of the following chapters.
Chapter 3

Image Super-Resolution Module

3.1 Introduction

This Chapter describes in detail the subpixel registration algorithms and the reconstruction methods (motion estimation and reconstruction blocks in Figure 2.2) implemented in the superresolution plugin presented in Chapter 2. In order to test our implementation and the plugin performance (block-based image processing and parallel capabilities), several image tests are also included.

3.2 Subpixel Registration

Two image registration algorithms with subpixel accuracy have been selected and implemented for comparison, to estimate the motion parameters between the reference image (reference tile) and each of the other images (input tiles). Only planar motion parallel to the image plane is allowed in both algorithms. The motion can be described as a function of three parameters: horizontal and vertical shifts, ∆x1 and ∆x2, and a planar rotation angle φ.

Vandewalle et al. [64] presented a single-scale frequency-domain approach which allows the horizontal and vertical shifts and the rotation angle to be estimated separately, while Keren et al. [65] proposed a multi-resolution spatial-domain scheme to estimate the shifts and the rotation angle simultaneously.

Assume we have a reference signal f1(x) and its shifted and rotated version f2(x):

    f2(x) = f1(Rφ(x + ∆x)),    (3.1)

with

    x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \Delta x = \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \end{pmatrix} \quad \text{and} \quad R_\phi = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}.

In an equivalent way:

    f2(x1, x2) = f1(x1 cos φ − x2 sin φ + ∆x1, x1 sin φ + x2 cos φ + ∆x2)    (3.2)
3.2.1 Planar motion estimation in the frequency domain

Vandewalle et al. presented in [66] a single-scale frequency-domain registration algorithm for the computation of the motion parameters.

Due to the rotation-invariant property of the Fourier transform, the relation between the amplitudes of the Fourier transforms can be computed as:

    |F2(u)| = |F1(Rφ u)|,    (3.3)

where |F2(u)| is a rotated version of |F1(u)| over the same angle φ as the spatial-domain rotation, and |F1(u)| and |F2(u)| do not depend on the shift values ∆x, because the spatial-domain shifts only affect the phase values of the Fourier transforms (spatial-domain shift property of the Fourier transform). Therefore we can first estimate the rotation angle φ from the amplitudes of the Fourier transforms |F1(u)| and |F2(u)|. After compensation for the rotation, the shift ∆x can be computed from the phase difference between F1(u) and F2(u).

Rotation estimation. The rotation angle between |F1(u)| and |F2(u)| can be computed as the angle φ for which the Fourier transform of the reference image |F1(u)| and the rotated Fourier transform of the image to be registered |F2(Rθ u)| have maximum correlation.

To speed up the computation of the rotation angle, the authors translate the 2D correlation computation into a 1D correlation computation. For this purpose, they define the frequency content h as a function of the angle α by integration over radial lines:

    h(\alpha) = \int_{\alpha - \Delta\alpha/2}^{\alpha + \Delta\alpha/2} \int_{0}^{\infty} |F(r, \theta)| \, dr \, d\theta,    (3.4)

where |F(r, θ)| is the amplitude of the Fourier transform in polar coordinates. We compute the discrete function h(α) as the average of the values on the discrete grid that have an angle α − ∆α/2 < θ < α + ∆α/2. The rotation angle is computed with a precision of 0.1 degrees, therefore h(α) is computed every 0.1 degrees. The regular (u1, u2) grid becomes irregular in polar coordinates (r, θ). To get a similar number of signal values |F(r, θ)| at every angle, the average is only evaluated on a circular disc of values for which 0.1ρ < r < ρ (where ρ is the image radius, or half the image size). Thus, h(α) is computed as the average of the frequency values on a discrete grid with α − ∆α/2 < θ < α + ∆α/2 and 0.1ρ < r < ρ.

This results in a function h(α) for both |F1(u)| and |F2(u)|. The rotation angle can then be computed as the value for which their correlation reaches a maximum.

Shift estimation. A shift of the image parallel to the image plane can be expressed in the Fourier domain as a linear phase shift. It is well known that the shift parameters ∆x can therefore be computed as the slope of the phase difference ∠(F2(u)/F1(u)). To make the solution less sensitive to noise, a plane is fitted through the phase differences using a least squares method.
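As an illustration, the angular averaging just described can be sketched as follows, assuming OpenCV. The 0.1-degree bins and the disc 0.1ρ < r < ρ follow the text above; the function name and the simple bin bookkeeping are our own, and the aliasing-aware variant described next would simply tighten the radius limits.

    #include <opencv2/opencv.hpp>
    #include <cmath>
    #include <vector>

    // Discrete h(alpha) of Eq. 3.4: average |F(r,theta)| per 0.1-degree bin.
    std::vector<double> angularContent(const cv::Mat& img8u)
    {
        cv::Mat f, cplx, planes[2], mag;
        img8u.convertTo(f, CV_32F);
        cv::dft(f, cplx, cv::DFT_COMPLEX_OUTPUT);
        cv::split(cplx, planes);
        cv::magnitude(planes[0], planes[1], mag);               // |F(u)|

        const int nBins = 3600;                                 // 0.1-degree resolution
        std::vector<double> h(nBins, 0.0);
        std::vector<int> count(nBins, 0);
        const double rho = 0.5 * std::min(mag.cols, mag.rows);  // image "radius"

        for (int y = 0; y < mag.rows; ++y)
            for (int x = 0; x < mag.cols; ++x) {
                // recentre the DFT so frequencies lie in [-N/2, N/2)
                double u1 = (x < mag.cols / 2) ? x : x - mag.cols;
                double u2 = (y < mag.rows / 2) ? y : y - mag.rows;
                double r = std::sqrt(u1 * u1 + u2 * u2);
                if (r < 0.1 * rho || r > rho) continue;         // keep 0.1*rho < r < rho
                double theta = std::atan2(u2, u1) * 180.0 / CV_PI;
                int bin = (int)std::floor((theta + 360.0) * 10.0) % nBins;
                h[bin] += mag.at<float>(y, x);
                ++count[bin];
            }
        for (int i = 0; i < nBins; ++i)
            if (count[i]) h[i] /= count[i];                     // average, as in the text
        return h;
    }

The rotation estimate is then the lag maximizing the correlation between the two h(α) functions, searched between -30 and 30 degrees as in the algorithm summary below.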
Aliasing. In cases of limited aliasing, it is still possible to use this method by considering only the frequencies that are free of aliasing or only marginally affected by aliasing. The rotation estimation is then based on the frequencies for which 0.1ρ < r < ρmax (with ρmax = min((us − umax)/us), where umax is the maximum frequency and us is the sampling frequency), and the horizontal and vertical shifts are estimated from the phase differences for −us + umax < u < us − umax. High-frequency noise is removed together with the aliasing, which results in a more accurate registration.

A global overview of the Vandewalle et al. registration algorithm is summarized below:

 1. Multiply the images fLR,m by a Tukey window to make them circularly symmetric. The windowed images are called fLR,w,m.
 2. Compute the Fourier transforms FLR,w,m of all low-resolution images.
 3. Rotation estimation between every image fLR,w,m and the reference image fLR,w,1:
    (a) Compute the polar coordinates (r, θ) of the image samples.
    (b) For every angle α, compute the average value hm(α) of the Fourier coefficients for which α − 1 < θ < α + 1 and 0.1ρ < r < ρmax. The angles are expressed in degrees and hm(α) is evaluated every 0.1 degrees. A typical value used for ρmax is 0.6.
    (c) Find the maximum of the correlation between h1(α) and hm(α) between -30 and 30 degrees.
    (d) Rotate image fLR,w,m by -φm to cancel the rotation.
 4. Shift estimation between every image fLR,w,m and the reference image fLR,w,1:
    (a) Compute the phase difference between every input image and the reference image as ∠(FLR,w,m / FLR,w,1).
    (b) For all frequencies −us + umax < u < us − umax, write the linear equation describing a plane through the computed phase difference with unknown slopes ∆x.
    (c) Find the shift parameters ∆xm as the least squares solution of the equations.
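Steps 4(b)-(c) reduce to an ordinary least-squares fit. The following is a minimal sketch under stated assumptions: the caller has already unwrapped the phase difference on the aliasing-free frequencies, and the 2π factor and sign convention may need adjusting to match a particular FFT layout.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Fit a plane d ~ 2*pi*(u1*dx1 + u2*dx2) through the phase differences.
    cv::Vec2d fitShiftPlane(const std::vector<cv::Point2d>& freqs,  // (u1,u2), normalized
                            const std::vector<double>& phaseDiff)   // angle(F2/F1) there
    {
        cv::Mat A((int)freqs.size(), 2, CV_64F), b((int)freqs.size(), 1, CV_64F);
        for (int i = 0; i < A.rows; ++i) {
            A.at<double>(i, 0) = 2.0 * CV_PI * freqs[i].x;
            A.at<double>(i, 1) = 2.0 * CV_PI * freqs[i].y;
            b.at<double>(i, 0) = phaseDiff[i];
        }
        cv::Mat x;
        cv::solve(A, b, x, cv::DECOMP_SVD);   // least-squares slope of the phase plane
        return cv::Vec2d(x.at<double>(0, 0), x.at<double>(1, 0));
    }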
3.2.2 Planar motion estimation in the spatial domain

Keren et al. presented in [65] a hierarchical spatial-domain registration algorithm based on a pyramidal architecture (multi-resolution scheme) for the computation of the motion parameters.

If we expand sin φ and cos φ in Eq. 3.2 to the first two terms of their Taylor series we get:

    f_2(x_1, x_2) \approx f_1\!\left(x_1 + \Delta x_1 - x_2\phi - x_1\frac{\phi^2}{2},\; x_2 + \Delta x_2 + x_1\phi - x_2\frac{\phi^2}{2}\right)    (3.5)

Expanding f1 to the first term of its own Taylor series gives the following first-order equation:

    f_2(x_1, x_2) \approx f_1(x_1, x_2) + \left(\Delta x_1 - x_2\phi - x_1\frac{\phi^2}{2}\right)\frac{\partial f_1}{\partial x_1} + \left(\Delta x_2 + x_1\phi - x_2\frac{\phi^2}{2}\right)\frac{\partial f_1}{\partial x_2}    (3.6)

The error function between f1 and f2 after rotation by φ and translation by ∆x1 and ∆x2 can then be approximated by:

    E(\Delta x_1, \Delta x_2, \phi) = \sum \left[ f_1(x_1, x_2) + \left(\Delta x_1 - x_2\phi - x_1\frac{\phi^2}{2}\right)\frac{\partial f_1}{\partial x_1} + \left(\Delta x_2 + x_1\phi - x_2\frac{\phi^2}{2}\right)\frac{\partial f_1}{\partial x_2} - f_2(x_1, x_2) \right]^2    (3.7)

where the summation is over the overlapping part of f1 and f2.

We look for the minimum of E by computing its derivatives with respect to ∆x1, ∆x2 and φ and equating them to zero. Then, after neglecting the non-linear terms and some small coefficients, we get the following system of equations:

    \begin{aligned}
    \sum \Big(\frac{\partial f_1}{\partial x_1}\Big)^{2} \Delta x_1 + \sum \frac{\partial f_1}{\partial x_1}\frac{\partial f_1}{\partial x_2}\, \Delta x_2 + \sum R\,\frac{\partial f_1}{\partial x_1}\, \phi &= \sum (f_1 - f_2)\,\frac{\partial f_1}{\partial x_1} \\
    \sum \frac{\partial f_1}{\partial x_1}\frac{\partial f_1}{\partial x_2}\, \Delta x_1 + \sum \Big(\frac{\partial f_1}{\partial x_2}\Big)^{2} \Delta x_2 + \sum R\,\frac{\partial f_1}{\partial x_2}\, \phi &= \sum (f_1 - f_2)\,\frac{\partial f_1}{\partial x_2} \\
    \sum R\,\frac{\partial f_1}{\partial x_1}\, \Delta x_1 + \sum R\,\frac{\partial f_1}{\partial x_2}\, \Delta x_2 + \sum R^{2}\, \phi &= \sum R\,(f_1 - f_2)
    \end{aligned}    (3.8)

where R is an abbreviation for x_1\frac{\partial f_1}{\partial x_2} - x_2\frac{\partial f_1}{\partial x_1} and the summation is over the overlapping area.

Due to the approximations made when obtaining 3.6, the expression is correct only for small values of (∆x1, ∆x2, φ). Therefore Keren et al. perform the following iterative process: solve the equations, push f2 (using formula 3.2) with the solutions obtained, and continue with the new f2, either for a fixed number of iterations or until the solutions are very small. In order to keep accuracy we always push the original f2 by the accumulated values of ∆x1, ∆x2 and φ. We need to compute nine of the twelve equation parameters only once: we need to change only the scalar part where f2 occurs, because f1 is always the same.

Gaussian pyramid. An image pyramid is a multi-resolution representation of an image constructed by successive levels of band-passed, sub-sampled images. A more general description of image pyramids in the field of image processing is given in [...]. In a Gaussian pyramid, the original image of size N1×N2 (level 0) is filtered by a Gaussian filter and subsampled to give an image of size N1/2 × N2/2. This process is repeated to form the next higher level until the specified number of levels is reached, forming the Gaussian pyramid structure (see Fig. 3.1).

In order to increase speed and robustness, Keren et al. use a pyramid image architecture to estimate the motion parameters. The pseudocode of their implementation is presented in Figure 3.2. One first computes the motion parameters for the smallest image (level J). Even big translations are small at this reduction level.
Figure 3.1: Gaussian image pyramid structure.

    phi_est = []; delta_est = []; s = [];
    Pyramid construction
    From pyrlevel=3 to pyrlevel=1
        Get f(pyrlevel)
        Get g(pyrlevel)
        Calculate 12 equation parameters
        Solve the system of equations to get s
        iter = 1
        While (stop_criteria; iter < 25)
            Compute f_ (rotated and shifted version of f)
            Compute 3 equation parameters
            Solve the system of equations to get s_
            s = s + s_; iter = iter + 1;
        End
        s = ( 2*s{1} 2*s{2} s{3} )
        If pyrlevel > 1
            Compute g_(pyrlevel-1) (rotated and shifted version of g(pyrlevel-1))
        End
    End
    phi_est = s{3}
    delta_est = [s{1} s{2}]

Figure 3.2: Pseudocode of the Keren et al. algorithm.
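As a companion to the pseudocode, here is a minimal sketch of the pyramid construction step, assuming OpenCV: cv::pyrDown blurs with a 5x5 Gaussian kernel and drops every other row and column, which matches the filter-and-subsample operation described above.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Build levels 0..J of a Gaussian pyramid (J = 3 in the Keren et al. setup).
    std::vector<cv::Mat> gaussianPyramid(const cv::Mat& img, int J = 3)
    {
        std::vector<cv::Mat> pyr;
        pyr.push_back(img);                   // level 0: N1 x N2
        for (int l = 1; l <= J; ++l) {
            cv::Mat down;
            cv::pyrDown(pyr.back(), down);    // level l: N1/2^l x N2/2^l
            pyr.push_back(down);
        }
        return pyr;
    }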
Then, one interpolates the found parameters to the larger image (level J − 1), corrects this guess with one or two iterations, and interpolates again to the next resolution. This process continues until the original images are reached (level 0). The authors defined J = 3 pyramid levels.

Such an architecture not only decreases the overall computational cost but also increases the accuracy of the registration parameter estimation. The complexity of the entire process is like computing two iterations on the original images. More iterations are performed on the coarsest level of the image resolution pyramid and, once acceptable motion parameters are estimated on that level, they are then used as initial conditions on a finer level of the resolution pyramid.

3.2.3 Subpixel Registration Experiments

A set of simulations have been performed in order to compare the frequency-based registration algorithm by Vandewalle et al. with the multi-resolution spatial-based registration algorithm by Keren et al.

Each simulation starts from an HR image, which is considered as the equivalent for continuous space. This image is then multiplied by a Tukey window to make the image circularly symmetric, this way avoiding all boundary effects. Next, a set of shifted and rotated versions is created from this HR image. Gaussian zero-mean random variables are used for the shift (pixels) and rotation (degrees) parameters. For the shifts, a standard deviation of 2 is used, while the rotation angles have a standard deviation of 1. The shifted and rotated images are then low-pass filtered using an ideal low-pass filter with cutoff frequency 0.12us (with us the sampling frequency of the high-resolution image) such that they satisfy the conditions specified in Vandewalle's paper (a small aliasing-free part of the frequency domain is needed). Finally, the images are downsampled by a factor of eight.

Figure 3.3: High-resolution images used in the simulations: (a) building, (b) castle, and (c) leaves. Available at: http://rr.epfl.ch/3/.

We compute the average absolute error (µ) and the standard deviation of the error (σ) for the shift and rotation parameters in the different algorithms. The results of these simulations are listed in Figure 3.4.
    Vandewalle et al.        Building           Castle             Leaves
    Parameters               µ        σ         µ        σ         µ        σ
    Shifts (pixels)          0.3383   0.2577    0.0953   0.0720    0.0975   0.1134
    Rotation (degrees)       0.3588   0.1922    0.3345   0.2347    0.7173   0.5285

    Keren et al.             Building           Castle             Leaves
    Parameters               µ        σ         µ        σ         µ        σ
    Shifts (pixels)          0.0241   0.0194    0.0141   0.0113    0.0165   0.0136
    Rotation (degrees)       0.0688   0.0455    0.0219   0.0160    0.0955   0.0865

Figure 3.4: Simulation results with noiseless LR frames: summary of the average absolute error (µ) and the standard deviation of the error (σ) for the shift and rotation parameters; 100 simulations were performed for each of the images.

The results of the simulations are also summarized using box plots of the absolute errors of the parameters (see Figure 3.5). There is one box per shift error (in pixels) and per rotation angle error (in degrees) for each algorithm and for each image. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually.

The registration results with the Keren et al. spatial-based algorithm are better than with the Vandewalle et al. frequency-based algorithm. Vandewalle et al. has clearly lower precision than Keren et al., and performs much worse in estimating the rotation angle. Because of this erroneous rotation cancellation, the subsequent shift estimation also fails.

Finally, Vandewalle et al. clearly works best on images with strong frequency content in a number of directions (see the building image). If there is no strong frequency content in any direction, for instance if energy is homogeneously spread among all possible directions (as in the leaves image), the rotation angle estimation performance decreases (see box plot 9 in Figure 3.5).

To check the robustness of the registration algorithms, all the parameters used in the previous simulations are kept the same, except that in this case we introduce a noise level of 20dB into each LR frame. The results of these simulations are summarized in Figure 3.6. Again, Vandewalle et al. gives worse results than Keren et al. The Vandewalle et al. errors clearly increase, while the Keren et al. errors are not affected by the presence of noise in the LR frames.
Figure 3.5: Box plot of simulation results (absolute value of the error versus simulation index). Simulations with castle image: (1) φ, Vandewalle et al. (2) ∆x, Vandewalle et al. (3) φ, Keren et al. (4) ∆x, Keren et al. Simulations with building image: (5) φ, Vandewalle et al. (6) ∆x, Vandewalle et al. (7) φ, Keren et al. (8) ∆x, Keren et al. Simulations with leaves image: (9) φ, Vandewalle et al. (10) ∆x, Vandewalle et al. (11) φ, Keren et al. (12) ∆x, Keren et al.
    Vandewalle et al.        Building           Castle             Leaves
    Parameters               µ        σ         µ        σ         µ        σ
    Shifts (pixels)          0.3656   0.2545    0.1137   0.0808    0.1347   0.1435
    Rotation (degrees)       0.4004   0.2067    0.4067   0.2462    0.7168   0.5268

    Keren et al.             Building           Castle             Leaves
    Parameters               µ        σ         µ        σ         µ        σ
    Shifts (pixels)          0.0235   0.0208    0.0161   0.0127    0.0166   0.0135
    Rotation (degrees)       0.0871   0.0538    0.0239   0.0191    0.0995   0.0829

Figure 3.6: Simulation results with noisy LR frames: summary of the average absolute error (µ) and the standard deviation of the error (σ) for the shift and rotation parameters; 100 simulations were performed for each of the images.

3.3 Image Reconstruction

3.3.1 Image Reconstruction Methods

The following image reconstruction algorithms have been selected and implemented for comparison, all working in the spatial domain:

 • a non-uniform interpolation algorithm, based on Delaunay triangulation [30];
 • the classical iterative back-projection algorithm (IBP) [31]; and
 • an iterative algorithm, based on the minimization of a cost function with an L2-norm data term and a Bilateral-TV regularization term [67].

3.3.1.1 Delaunay Interpolation

Once the sequence of geometrically warped, under-sampled, low-resolution frames is registered precisely with respect to a reference low-resolution frame, each pixel from each of the low-resolution frames is placed onto a high-resolution composite grid using the registration information estimated by the registration routines. This step is illustrated in Fig. 3.7 where, for simplicity, it is assumed that the relative shifts between LR frames are smaller than the up-sampling factor. The objective of this algorithm is to obtain an HR (uniformly spaced) grid from a set of LR frames. Following registration of the LR frames, the available samples are distributed irregularly on the initial HR raster. Subsequently, the algorithm converts this nonuniformly spaced raster to a uniformly spaced grid.

Once the LR frames have been translated onto the HR grid, a Delaunay triangulation is used to populate the empty pixels. Delaunay triangulation is a technique invented in 1934 [Delaunay34] for connecting points in a space into triangular groups such that the minimum angle of all the angles in the triangulation is a maximum.
Figure 3.7: Placement of the registered LR frames into the HR grid.

In computer graphics, Delaunay triangulation is often the basis for representing 3D shapes.

Using the Delaunay triangulation of a given cloud of points, one can immediately find the nearest neighbor to a new point. Such a partition is called a Voronoi tessellation (see Figure 3.8). This tessellation is the dual image of the Delaunay triangulation, because the Delaunay lines define the distance between existing points, and so the Voronoi lines know where they must intersect the Delaunay lines in order to keep equal distance between points.

We are using the OpenCV 2D Delaunay triangulation (Voronoi tessellation) to interpolate the high-resolution image grid by filling each of the computed Voronoi cells with the value of its LR Delaunay point.

Finally, the HR image grid is smoothed using a Gaussian filter of size 7x7 pixels and down-sampled using bilinear interpolation to the size of the desired HR image.
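A hypothetical sketch of this Voronoi filling step, assuming OpenCV's Subdiv2D. The bookkeeping that maps each Voronoi cell back to its sample value via rounded coordinates is our own simplification, and the 7x7 Gaussian smoothing and the final bilinear resampling mentioned above are left to the caller.

    #include <opencv2/opencv.hpp>
    #include <map>
    #include <vector>

    // Paint every HR pixel with the value of the LR sample owning its Voronoi cell.
    cv::Mat voronoiFill(cv::Size hrSize,
                        const std::vector<cv::Point2f>& samples, // LR pixels on HR grid
                        const std::vector<float>& values)        // their intensities
    {
        cv::Subdiv2D subdiv(cv::Rect(0, 0, hrSize.width, hrSize.height));
        std::map<std::pair<int, int>, float> lookup;
        for (size_t i = 0; i < samples.size(); ++i) {
            subdiv.insert(samples[i]);
            lookup[{cvRound(samples[i].x), cvRound(samples[i].y)}] = values[i];
        }
        std::vector<std::vector<cv::Point2f>> facets;
        std::vector<cv::Point2f> centers;
        subdiv.getVoronoiFacetList(std::vector<int>(), facets, centers);

        cv::Mat hr = cv::Mat::zeros(hrSize, CV_32F);
        for (size_t i = 0; i < facets.size(); ++i) {
            std::vector<cv::Point> poly;
            for (const cv::Point2f& p : facets[i])
                poly.push_back(cv::Point(cvRound(p.x), cvRound(p.y)));
            float v = lookup[{cvRound(centers[i].x), cvRound(centers[i].y)}];
            cv::fillConvexPoly(hr, poly, cv::Scalar(v));         // paint the whole cell
        }
        return hr;
    }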
Figure 3.8: Delaunay triangulation, Voronoi tessellation: (a) the Delaunay triangulation in bold with the corresponding Voronoi tessellation in fine lines; (b) the Voronoi cells around each Delaunay point.

3.3.1.2 Iterative Back-Projection

The Iterative Back-Projection algorithm (IBP) has been translated from a classical vector-based formulation to an efficient matrix-based formulation. Here we use the framework presented by Zomet and Peleg [57] to alternate between a vector-based and a matrix-based (image-based) formulation of the SR problem to speed up the super-resolution computation. We used the vector-based formulation in the formal super-resolution problem definition and the image acquisition model (see section 1.3). The basic operations in the image formation model, such as warping, convolution and sampling, are linear, and thus can be represented as matrices operating on these vector images.

As shown in Chapter 1, section 1.5.2.2, the maximum likelihood solution for super-resolution can be found by minimizing the corresponding likelihood function. Reformulating the problem in terms of an optimization procedure, the objective function can be written as:

    J(x) = \| Hx - y \|_2^2.    (3.9)

We use steepest descent to find the minimum of the objective function. Given an initial estimate of the HR image \hat{x}_0, each iteration step is calculated as:

    x_{n+1} = x_n - \gamma_n \nabla_x J(x_n),    (3.10)

where γn > 0 is a scalar defining the step size in the direction of the gradient at every iteration. In our case, γn = γ for all iterations.

Deriving J with respect to x:

    \nabla J = H^T (Hx - y) = \sum_{k=1}^{N} M_k^T B_k^T D_k^T (D_k B_k M_k x - y_k)    (3.11)

Therefore, the solution to this minimization problem can be expressed as:

    x_{n+1} = x_n - \gamma \sum_{k=1}^{N} M_k^T B_k^T D_k^T (D_k B_k M_k x_n - y_k)    (3.12)
The stop criterion has been defined as:

    \| x_{n+1} - x_n \| < T,    (3.13)

where T is a scalar defining a threshold.

The matrices Mk, Bk and Dk model the image formation process, and their implementation is simply image warping, blurring and subsampling, respectively. The implementation of the transpose matrices is also very simple:

 • Dk^T - As Dk is a down-sampling operation, Dk^T is implemented by upsampling the image without interpolation, i.e. zero padding.
 • Bk^T - As Bk is implemented as a Gaussian kernel (smooths edges), Bk^T is implemented as a Laplacian kernel (sharpens edges).
 • Mk^T - As Mk is implemented by backward warping, Mk^T should be the forward warping of the inverse motion.

Therefore, the gradient is computed efficiently in the image domain instead of multiplying large sparse matrices, and the steepest-descent algorithm can be rewritten in an image-based form as:

    X_{n+1} = X_n + \sum_{k} M_k^{-1}\big[h_l * S_{\uparrow}\big(y_k - M_k[h_g * S_{\downarrow}(X_n)]\big)\big],    (3.14)

where hg and hl are a Gaussian and a Laplacian filter, S↑ and S↓ are the up-sampling and down-sampling operators, and Mk and Mk^{-1} are the backward direct and the forward inverse warping operators.

This is a version of the Iterated Back-Projection algorithm [31], using a specific blur kernel and forward warping in the back-projection stage.
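A minimal sketch of one steepest-descent step of Eq. 3.14 written with direct image operators, assuming OpenCV. For brevity the warping operators Mk and Mk^{-1} are omitted (i.e., the frames are treated as already aligned), the LR frames are assumed to be exactly scale times smaller than the HR estimate, and the step size γ of Eq. 3.12 is kept explicit; this illustrates the operator chain, not the full plugin implementation.

    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Mat ibpStep(const cv::Mat& Xn,                    // current HR estimate, CV_32F
                    const std::vector<cv::Mat>& lrFrames, // LR frames y_k, CV_32F
                    double gamma, int scale)
    {
        cv::Mat Xnext = Xn.clone();
        for (const cv::Mat& yk : lrFrames) {
            // simulate the LR frame: Gaussian blur h_g, then decimation S(down)
            cv::Mat sim;
            cv::GaussianBlur(Xn, sim, cv::Size(3, 3), 0);
            cv::resize(sim, sim, yk.size(), 0, 0, cv::INTER_NEAREST);
            cv::Mat err = yk - sim;

            // back-project: zero-padding upsample S(up), then Laplacian h_l
            cv::Mat up = cv::Mat::zeros(Xn.size(), CV_32F);
            for (int y = 0; y < err.rows; ++y)
                for (int x = 0; x < err.cols; ++x)
                    up.at<float>(y * scale, x * scale) = err.at<float>(y, x);
            cv::Mat back;
            cv::Laplacian(up, back, CV_32F);
            Xnext += gamma * back;                        // gradient step
        }
        return Xnext;
    }

Iterating this step until ||Xn+1 − Xn|| falls below the threshold T of Eq. 3.13 gives the reconstruction loop.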
3.3.1.3 Constrained Minimization

Finally, we formulate the SR problem in terms of a regularized optimization procedure. We formulate our estimation framework with the following cost function:

    J(X) = \sum_{k=1}^{P} \| D_k B_k M_k X - Y_k \|_2^2 + \lambda \sum_{l=-P}^{P} \sum_{\substack{m=0 \\ l+m \ge 0}}^{P} \alpha^{|m|+|l|} \, \| X - S_x^l S_y^m X \|_1    (3.15)

The first term uses the L2 norm (least-squares) to measure data fidelity. The second term is a regularization term presented by Farsiu et al. in [67] called Bilateral-TV, which provides robust performance while preserving the edge content common to real image sequences. S_x^l and S_y^m are the operators corresponding to shifting the image represented by X by l pixels in the horizontal direction and m pixels in the vertical direction, respectively. These act as derivatives across multiple scales. The scalar weight α is applied to give a spatially decaying effect to the summation of the regularization term.

Deriving J with respect to X:

    \nabla J = \sum_{k=1}^{N} M_k^T B_k^T D_k^T (D_k B_k M_k X_n - Y_k) + \lambda \sum_{l=-P}^{P} \sum_{\substack{m=0 \\ l+m \ge 0}}^{P} \alpha^{|m|+|l|} \left[ I - S_y^{-m} S_x^{-l} \right] \mathrm{sign}(X_n - S_x^l S_y^m X_n),    (3.16)

where S_x^{-l} and S_y^{-m} define the transposes of the matrices S_x^l and S_y^m respectively, and have a shifting effect in the directions opposite to S_x^l and S_y^m.

Again, we use a steepest descent algorithm to find the minimum of the cost function:

    X_{n+1} = X_n - \gamma \left\{ \sum_{k=1}^{N} M_k^T B_k^T D_k^T (D_k B_k M_k X_n - Y_k) + \lambda \sum_{l=-P}^{P} \sum_{\substack{m=0 \\ l+m \ge 0}}^{P} \alpha^{|m|+|l|} \left[ I - S_y^{-m} S_x^{-l} \right] \mathrm{sign}(X_n - S_x^l S_y^m X_n) \right\}    (3.17)

where γ is a scalar defining the step size in the direction of the gradient.

The matrices M, B, D, S and their transposes can be exactly interpreted as direct image operators such as warp, blur, decimation and shift. Thus, the explicit construction of these matrices is avoided and the reconstruction method can be implemented in a fast and memory-efficient way. Figure 3.9 is the block diagram representation of Eq. 3.17. There, each LR frame Yk is compared to the warped, blurred and decimated current estimate of the HR frame Xn. Block Gk represents the gradient back-projection operator that compares the k-th LR image to the estimate of the HR image in the n-th steepest descent iteration. Block Rm,l represents the gradient of the regularization term, where the HR estimate in the n-th steepest descent iteration is compared to its shifted version (l pixels shift in the horizontal and m pixels shift in the vertical direction). Details of the blocks Gk and Rm,l are given in Figure 3.10(a) and (b).

3.3.2 Super-Resolution Experiments

3.3.2.1 Similarity Measure

In order to quantify the fidelity of reconstruction, each reconstructed high-resolution image needs to be compared with the original scene; this is the similarity measure. It also helps to monitor and evaluate the performance of the iterative reconstruction process. There are many similarity measures available in the literature, but none has been accepted universally as a standard when comparing two images. Some of the most popular are the Normalized Cross-Correlation Ratio (NCCR), Direct Difference Error (DDE), Peak Signal-to-Noise Ratio (PSNR) and Root Mean Square Error (RMSE).
Figure 3.9: Block diagram representation of Eq. 3.17; blocks Gk and Rm,l are defined in Fig. 3.10.

Figure 3.10: Extended block diagram representation of the Gk and Rm,l blocks in Fig. 3.9. (a) Block diagram of the L2-norm term cost derivative (Gk). (b) Block diagram representation of the regularization term cost derivative (Rm,l).
If an image is multiplied by or added to a constant, the intensity level of the image changes but the spatial resolution of the image remains unchanged. As our emphasis is on the spatial improvement of the reconstructed high-resolution image, a modified version of the RMSE has been used as a similarity measure. The modification ensures that the local spatial contents of the images, and not their global brightness, are compared.

Presume X is the original high-resolution image and \hat{X} is the estimated high-resolution image. Then, the RMSE is given as:

    RMSE = \frac{1}{\sigma_x} \sqrt{ \frac{1}{M_1 M_2} \sum_{ij} \left( \tilde{X}_{ij} - \tilde{\hat{X}}_{ij} \right)^2 }    (3.18)

where

    \tilde{X} = X - \bar{X}, \qquad \tilde{\hat{X}} = \hat{X} - \bar{\hat{X}}    (3.19)

and

    \sigma_x = \sqrt{ \frac{1}{M_1 M_2} \sum_{ij} \left( X_{ij} - \bar{X} \right)^2 }    (3.20)

The PSNR is also calculated for each reconstructed high-resolution image and is given as:

    PSNR = 20 \log_{10} \frac{255}{RMSE}    (3.21)
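Before moving to the results, here is a minimal sketch of Eqs. 3.18-3.21, assuming OpenCV and two CV_32F images of equal size; the mean removal implements the brightness-invariant modification described above.

    #include <opencv2/opencv.hpp>
    #include <cmath>

    // Modified RMSE of Eq. 3.18: mean-removed error, normalized by sigma_x.
    double modifiedRmse(const cv::Mat& X, const cv::Mat& Xhat)
    {
        cv::Mat xt  = X    - cv::mean(X)[0];     // X tilde  (Eq. 3.19)
        cv::Mat xht = Xhat - cv::mean(Xhat)[0];  // Xhat tilde
        double n = (double)X.total();            // M1 * M2
        double mse    = cv::norm(xt, xht, cv::NORM_L2SQR) / n;
        double sigmaX = std::sqrt(cv::norm(xt, cv::NORM_L2SQR) / n);  // Eq. 3.20
        return std::sqrt(mse) / sigmaX;
    }

    // PSNR of Eq. 3.21.
    double psnr(double rmse) { return 20.0 * std::log10(255.0 / rmse); }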
Figure 3.11: Superresolution simulation scenarios. Above: using interpolation-based reconstruction methods; below: using optimization-based reconstruction methods.
3.3.2.2 Simulation Results

For the purpose of evaluating the performance of the implemented reconstruction algorithms, several different test images were used. In this set of simulations, the LR frames have been generated using our data simulator. Motion parameters were stored in a text file and read by our program in the registration step before reconstruction. This way, we avoid the effect of motion estimation errors in the registration step and focus our analysis only on the performance of the reconstruction algorithms. As a quality measure, we are using the modified RMSE presented in the previous subsection. Figure 3.11 shows the two superresolution reconstruction scenarios implemented: an interpolation-based reconstruction method (Delaunay interpolation) and an iterative-based reconstruction scheme (non-regularized and regularized L2 minimization).

Books sequence. In this example 10 LR images of size 128x128 pixels were used for the reconstruction. The actual high-resolution test image of size 256x256 pixels (real imagery) is blurred with a Gaussian kernel of size 3x3 pixels. The LR images were artificially generated by rotating the image with a Gaussian random variable of standard deviation 1 degree and then shifting the result both in the horizontal and vertical directions with a Gaussian random variable of standard deviation 2 pixels. The sub-sampling ratio was 2 and the up-sampling ratio was also 2. Figure 3.12(a) shows one of the ten LR images of the books sequence (these LR frames are assumed to be noiseless). Figure 3.12(b) shows the reconstructed image using Delaunay interpolation, while Figures 3.12(c) and 3.12(d) are the reconstructed images using an iterative approach. Quantitative analysis shows that the best results are achieved using Delaunay interpolation (lower RMSE). Also, regularized L2 minimization gives better results than non-regularized L2 minimization. Visual inspection of the reconstructed images also confirms those results: although the minimization techniques force higher image sharpness (look at the book titles), Delaunay interpolation achieves a better tradeoff between image sharpness and image smoothness, giving enough resolution enhancement to read the book titles but also enough smoothness to avoid the image artifacts which are visible in the iteratively reconstructed images.

Frida sequence. In this example 10 LR images of size 128x128 pixels were used for the reconstruction. The actual high-resolution test image of size 768x768 pixels (real imagery) is blurred with a Gaussian kernel of size 3x3 pixels. Again, the LR images were artificially generated by rotating the image with a Gaussian random variable of standard deviation 1 degree and then shifting the result both in the horizontal and vertical directions with a Gaussian random variable of standard deviation 2 pixels. The sub-sampling ratio was 6 and the up-sampling ratio was 2. A noise level of 20dB has been introduced into each sub-sampled LR frame. Figure 3.13(a) shows one of the ten LR images of the frida sequence. Figure 3.13(b) shows the reconstructed image using Delaunay interpolation, while Figures 3.13(c) and 3.13(d) are the reconstructed images using an iterative approach. Again, the best results are achieved using Delaunay interpolation (lower RMSE), and regularized L2 minimization gives better results than non-regularized L2 minimization.

Resolution chart sequence. In this example 6 LR images of size 1089x840 pixels were used for the reconstruction. The actual high-resolution test image (see Figure 3.14) is an 8712x6720 Camera Resolution Chart conforming to the ISO-12233 Standard "Photography - Electronic still picture cameras - Resolution measurements". The real image is blurred with a Gaussian kernel of size 3x3 pixels. The LR images are artificially generated by rotating the image with a Gaussian random variable of standard deviation 5 degrees and then shifting the result both in the horizontal and vertical directions with a Gaussian random variable of standard deviation 10 pixels. The sub-sampling ratio was 8 and the up-sampling ratio was 2.

First, Figure 3.15 shows the relation between block size and computation time; the table also contains the number of tiles computed for the different block sizes and a quality measure of the reconstructed images using both an interpolation-based algorithm (Delaunay interpolation) and an iterative algorithm (regularized L2 minimization). Motion parameters are assumed to be known and are provided to the registration routine via a text file. Simulations have been done on a 1.6 GHz Intel Core i7 PC with 4GB RAM under Linux. Only 1 CPU has been used for these simulations.
Note that the relation between block size and computation time is quite linear when using an interpolation-based reconstruction algorithm, but not linear when using an iterative algorithm (see Figure 3.16).

The resolution chart is available online at http://www.imaging-resource.com/SCAN/DSMP/DSMPICS.HTM
Figure 3.12: Comparison of the reconstruction algorithms with the books sequence: (a) one of the 10 LR frames; (b) reconstructed image using Delaunay interpolation (RMSE=1.3484); (c) reconstructed image using L2-norm minimization with γ = 0.03 (RMSE=1.3819); (d) reconstructed image using L2-norm and Bilateral-TV regularization term minimization with γ = 0.03 and Bilateral-TV parameters λ = 0.8 and α = 0.7 (RMSE=1.3791).
Figure 3.13: Comparison of the reconstruction algorithms with the frida sequence: (a) one of the 10 LR frames; (b) reconstructed image using Delaunay interpolation (RMSE=1.2151); (c) reconstructed image using L2-norm minimization with γ = 0.03 (RMSE=1.2298); (d) reconstructed image using L2-norm and Bilateral-TV regularization term minimization with γ = 0.03 and Bilateral-TV parameters λ = 0.5 and α = 0.7 (RMSE=1.2236).
Figure 3.14: Full resolution chart high-resolution test image.
In an iterative scenario, each tile converges in a different number of iterations, making the computation time non-linear. In both approaches, bigger block sizes give better reconstruction results because the image artifacts introduced by tile boundaries are less important.

    Block size   Number of tiles   Computation time              MRMSE
                                   Iterative   Interpolation     Iterative   Interpolation
    64           945               90          29                1.8983      1.4454
    128          252               107         28                1.8368      1.2437
    256          63                113         31                1.6192      0.9446
    512          20                115         40                1.2203      0.8772
    1024         6                 92          40                0.4121      0.4647
    2048         2                 86          42                0.3845      0.4482

Figure 3.15: Simulation results with the resolution chart sequence: computation time versus block size.

Figure 3.16: Simulation results with the resolution chart sequence: graph of computation time versus block size.

Second, Figures 3.17 and 3.18 show the relation between the number of processors and the computation time, and a comparison of computation times with both interpolation-based and iterative-based reconstruction techniques. Again, motion parameters are assumed to be known and are provided to the registration routine via a text file. Simulations have been done on a 1.6 GHz Intel Core i7 PC with 4GB RAM under Linux. The block size is fixed to 256x256 pixels. The performance is clearly worse with an iterative algorithm. The best results are achieved with 4 CPUs working in parallel. If we use more processors, the hardware cost does not pay off in terms of the reduction in computation time. In a parallel scenario, one of the threads works as the master thread, delivering tiles to the rest of the processors.
The OSSIM master thread does not deliver new tiles to a free CPU until the rest of the CPUs have finished their work. In that sense, there is some idle time where a free CPU is waiting for new tiles. In an iterative-based scenario, each tile needs a finite number of iterations to reach convergence, and this number of iterations differs from tile to tile. This means different computation times and, indeed, certain idle time for the processors. When using an interpolation-based algorithm, the reconstruction time only depends on the block size and is related to the time necessary to fill the high-resolution grid and perform the interpolation; this time is nearly the same from tile to tile, which is the reason why the relation between the number of processors and the computation time is more linear.

    Number of processors   Computation time
                           Iterative method   Interpolation method
    1                      114                31
    2                      139                28
    3                      95                 24
    4                      71                 17
    5                      76                 19
    6                      63                 17
    7                      61                 15
    8                      55                 13

Figure 3.17: Simulation results with the resolution chart sequence: number of processors versus computation time.

Figure 3.18: Simulation results with the resolution chart sequence: graph of number of processors versus computation time.

Finally, Figures 3.19 and 3.20 show two details of the reconstructed resolution chart image for different block sizes in our OSSIM-based superresolution plugin. Note that block boundary artifacts are more visible as the block size decreases.
Figure 3.19: Comparison of the reconstruction step using different block sizes with the resolution chart sequence: (a) detail of one of the 6 LR frames; reconstructed image details: (b) block size: 64 pixels; (c) block size: 128 pixels; (d) block size: 256 pixels; (e) block size: 512 pixels; (f) block size: 1024 pixels.
Figure 3.20: Comparison of the reconstruction step using different block sizes with the resolution chart sequence: (a) detail of one of the 6 LR frames; reconstructed image details: (b) block size: 64 pixels; (c) block size: 128 pixels; (d) block size: 256 pixels; (e) block size: 512 pixels; (f) block size: 1024 pixels.
Chapter 4

Image Registration Module

4.1 Introduction

This Chapter describes in detail the control point detection and matching algorithms and the model optimization routine (see Figure 2.5) implemented in the automatic registration plugin presented in Chapter 2. In order to test the accuracy of the registration process, an image test using one of the feature detectors and a polynomial model is also included.

4.2 Control Point Detection and Matching

This block aims to provide a list of tie points between a master (reference) and a slave image. A tie point pair connects a feature in the master image with the corresponding feature in the slave image.

Three algorithms have been implemented for feature extraction: Harris corners [51], SIFT features [52] and SURF features [53]. Harris corner matching is based on an image patch correlation technique, while SIFT and SURF feature matching is based on a feature descriptor distance method.

4.2.1 Harris corners

4.2.1.1 Interest point detection

The Harris corner detector is based on the local auto-correlation function of an image, where the local auto-correlation function measures the local changes of the image with patches shifted by a small amount in different directions. Shifting a window centered on a corner in any direction should yield a large change in appearance.

Given a shift (∆x, ∆y) and a point (x, y), the auto-correlation function is defined as:

    c(x, y) = \sum_{(u,v) \in W} w(u, v) \left( I(u, v) - I(u + \Delta x, v + \Delta y) \right)^2    (4.1)

where w(x, y) is either a constant or (better) a Gaussian window centered at (x, y).
Approximating the shifted function by its first-order Taylor expansion:

    I(u + \Delta x, v + \Delta y) \approx I(u, v) + I_x(u, v)\Delta x + I_y(u, v)\Delta y    (4.2)

where Ix, Iy are the partial derivatives of I(x, y). Then Eq. 4.1 can be rewritten as:

    c(x, y) = \sum_{W} \left( I(u, v) - I(u + \Delta x, v + \Delta y) \right)^2 \approx \sum_{W} \left( \begin{pmatrix} I_x(u, v) & I_y(u, v) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2 = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} Q(x, y) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}    (4.3)

where:

    Q(x, y) = \sum_{W} \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} = \begin{pmatrix} \sum_W I_x^2 & \sum_W I_x I_y \\ \sum_W I_x I_y & \sum_W I_y^2 \end{pmatrix}    (4.4)

Corner points are the local maxima of the cornerness function H(x, y):

    H(x, y) = \det(Q(x, y)) - k \cdot \mathrm{trace}^2(Q(x, y)) = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2    (4.5)

where λ1, λ2 are the eigenvalues of the matrix Q(x, y) and 0.04 < k < 0.06 is an empirically determined constant.

Corner detection in the image can be divided into three steps. First, a non-maxima suppression of the cornerness function is performed (only local maxima in a 3x3 neighborhood remain). The next step is rejecting the corners whose minimal eigenvalue is less than a threshold value (see the cvGoodFeaturesToTrack function in the OpenCV library [61]). Finally, the features that are too close (minDistance) to stronger features are removed.

4.2.1.2 Interest point matching

Corners are detected in the master image using the Harris algorithm. A 10x10 image chip is extracted around the corner position in the master image. A 40x40 slave image chip is extracted around the corner position of the master image in the slave image. The Normalized Cross Correlation (NCC) between master and slave image chips is computed. A tie point connection is established when the NCC maximum is above a minimum threshold.

4.2.1.3 Implementation

We are using the Harris corners implementation included in the OpenCV source library [61]. The Harris parameters can be set up via a keywordlist (see Figure 4.1).

The Harris interest points (x) are local maxima of the cornerness function H(x, y). The selection of such points is controlled by the following parameters:

 • BlockSize: size of the Gaussian kernel (default: 3)
 • k: empirically determined constant of the cornerness function definition (default: 0.04)

The image patch correlation technique is controlled by the following parameter:

 • CCorNormedQuality: the NCC maximum must be above this threshold (default: 0.8)
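A sketch of the detection-plus-matching flow just described, assuming OpenCV and single-channel input images. The 10x10 master chip, the 40x40 slave search window and the 0.8 NCC gate follow the text; the corner count and detector arguments are illustrative defaults rather than the plugin's exact settings.

    #include <opencv2/opencv.hpp>
    #include <utility>
    #include <vector>

    std::vector<std::pair<cv::Point2f, cv::Point2f>>
    harrisMatch(const cv::Mat& master, const cv::Mat& slave, double nccQuality = 0.8)
    {
        std::vector<cv::Point2f> corners;
        cv::goodFeaturesToTrack(master, corners, 500, 0.01, 10,
                                cv::noArray(), 3, /*useHarrisDetector=*/true, 0.04);
        std::vector<std::pair<cv::Point2f, cv::Point2f>> ties;
        for (const cv::Point2f& c : corners) {
            cv::Rect chipR(cvRound(c.x) - 5,  cvRound(c.y) - 5,  10, 10);
            cv::Rect searchR(cvRound(c.x) - 20, cvRound(c.y) - 20, 40, 40);
            cv::Rect mR(0, 0, master.cols, master.rows), sR(0, 0, slave.cols, slave.rows);
            if ((chipR & mR) != chipR || (searchR & sR) != searchR) continue;
            cv::Mat ncc;
            cv::matchTemplate(slave(searchR), master(chipR), ncc, cv::TM_CCORR_NORMED);
            double maxVal; cv::Point maxLoc;
            cv::minMaxLoc(ncc, nullptr, &maxVal, nullptr, &maxLoc);
            if (maxVal < nccQuality) continue;           // CCorNormedQuality gate
            cv::Point2f s(searchR.x + maxLoc.x + 5.0f, searchR.y + maxLoc.y + 5.0f);
            ties.emplace_back(c, s);                     // master point -> slave point
        }
        return ties;
    }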
Figure 4.1: GUI Screenshots: Harris Features Dialog.

4.2.2 SIFT features

SIFT (Scale-Invariant Feature Transform) is a feature detector and descriptor, first presented by Lowe et al. in 1999 [52]. The SIFT feature descriptor is invariant to scale, orientation, and affine distortion, and partially invariant to illumination changes.

4.2.2.1 Interest point detection

Feature detection. The SIFT detector is based on the search for local maxima and minima of the Difference of Gaussians (DoG) function over both space and scale.

A DoG image between scales ki σ and kj σ is just the difference of the Gaussian-filtered images at scales ki σ and kj σ:

    D(x, y, \sigma) = G(x, y, k_i \sigma) - G(x, y, k_j \sigma),    (4.6)

where G(x, y, σ) is the convolution of the original image I(x, y) with a variable-scale Gaussian kernel g(x, y, σ).
First, a DoG image pyramid is built. The input image is first convolved with a Gaussian kernel at different scales. Differences between successive filtering operations are taken. All these DoG images have the same resolution. The DoG images are grouped by octave (an octave corresponds to doubling the value of σ). The input image is then down-sampled and a new set of DoG images is obtained. Down-sampling is done only when passing to the next octave. The DoG operator can be seen as an approximation to the Laplacian, here expressed in a pyramid setting.

Once the DoG images have been obtained, keypoints are identified as local minima/maxima of the DoG images across scales. This is done by comparing each pixel in the DoG images to its eight neighbors at the same scale and the nine corresponding neighboring pixels in each of the neighboring scales. If the pixel value is the maximum or minimum among all compared pixels, it is selected as a candidate keypoint. This process is repeated at each octave (set of DoG images).

Scale-space extrema detection produces too many keypoint candidates, some of which are unstable and must be rejected. First, for each candidate point, interpolation of nearby data is used to accurately determine its position. The interpolation is done using the quadratic Taylor expansion of the DoG scale-space function, with the candidate keypoint as the origin. After interpolation, candidate points that have low contrast, and also candidate points that are poorly localized along an edge, are rejected. For more details, see [52].

Feature descriptor. The SIFT descriptor is a weighted and interpolated histogram of the gradient orientations and locations in a patch surrounding the keypoint.

Extraction of the descriptor can be divided into two distinct tasks. First, in order to achieve invariance to image rotation, each detected interest point is assigned a dominant orientation. Then, the descriptor components (a 128-dimensional vector) are extracted using a scale-dependent window oriented along the dominant direction. Details of how to compute the SIFT descriptor can be found in section 6.1 of [52].

4.2.2.2 Interest point matching

First, SIFT features are obtained from the reference and input images using the algorithm described above. The feature matching is done through a Euclidean-distance based nearest neighbor approach using either FLANN or brute-force matching (L2-norm of the difference of the two descriptors). To increase robustness, matches are rejected for those keypoints for which the ratio of the nearest neighbor distance to the second nearest neighbor distance is greater than 0.8 (as Lowe recommended).

OpenCV incorporates the FLANN library. FLANN (Fast Library for Approximate Nearest Neighbors) is a library that contains a collection of algorithms optimized for fast nearest neighbor search in large datasets and for high dimensional features. More information about FLANN can be found in [68].
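As an illustration of this matching strategy, the following sketch uses OpenCV's own SIFT together with the FLANN matcher and Lowe's 0.8 ratio test; note that the plugin itself uses the VLFeat SIFT implementation described in the next subsection, and cv::SIFT::create assumes an OpenCV build where SIFT is available (it is part of the main features2d module in recent versions).

    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::DMatch> siftRatioMatch(const cv::Mat& master, const cv::Mat& slave)
    {
        cv::Ptr<cv::Feature2D> sift = cv::SIFT::create();
        std::vector<cv::KeyPoint> kpM, kpS;
        cv::Mat descM, descS;
        sift->detectAndCompute(master, cv::noArray(), kpM, descM);
        sift->detectAndCompute(slave,  cv::noArray(), kpS, descS);

        cv::FlannBasedMatcher matcher;              // approximate nearest neighbours
        std::vector<std::vector<cv::DMatch>> knn;
        matcher.knnMatch(descM, descS, knn, 2);     // two nearest neighbours each

        std::vector<cv::DMatch> good;
        for (const std::vector<cv::DMatch>& m : knn)
            if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
                good.push_back(m[0]);               // Lowe's ratio test
        return good;
    }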
4.2.2.3 Implementation

We are using the SIFT implementation of Andrea Vedaldi included in the VLFeat open source library [69].

Gaussian Scale Space. The scale space G(x, σ) represents the same information (the image I(x)) at different levels of scale σ. The domain of the variable σ is discretized in logarithmic steps arranged in O octaves. Each octave is further subdivided in S sublevels. The distinction between octave and sublevel is important because at each successive octave the data is spatially downsampled by half. Octaves and sublevels are identified by a discrete octave index o and sublevel index s respectively. The octave index o and the sublevel index s are mapped to the corresponding scale σ by the formula:

    \sigma(o, s) = \sigma_0 \, 2^{o + s/S},    (4.7)

where o ∈ omin + [0, ..., O − 1], s ∈ smin + [0, ..., S − 1] and σ0 is the base scale level.

Difference of Gaussians Scale Space. The difference of Gaussians (DoG) scale space is the scale derivative of the Gaussian scale space G(x, σ) along the scale coordinate σ. It is given by:

    D(x, \sigma) = G(x, \sigma(s + 1, o)) - G(x, \sigma(s, o))    (4.8)

Remark 1 (Lowe's parameters). Lowe's implementation uses the following parameters:

    \sigma_n = 0.5, \quad \sigma_0 = 1.6 \cdot 2^{1/S}, \quad o_{min} = -1, \quad S = 3    (4.9)

In order to compute the octave o = −1, the image is doubled by bilinear interpolation. In order to detect extrema at all scales, the DoG scale space has s ∈ [smin, smax] = [−1, S]. Since the DoG scale space is obtained by differentiating the Gaussian scale space, the latter has s ∈ [smin, smax] = [−1, S + 1]. The parameter O is set to cover all octaves (i.e. as big as possible).

The SIFT implementation of Andrea Vedaldi has been integrated in our OSSIM registration plugin. The SIFT parameters can be set up via a keywordlist (see Figure 4.2).

The parameters S, O, smax, omin, σ0 refer to the DoG scale space. The plugin accepts the following parameters describing the Gaussian scale space being used:

 • O: number of octaves (default: -1)
 • omin: index of the first octave. The octave index o varies in omin, ..., O − 1 (default: -1, which has the effect of doubling the image before computing the Gaussian scale space)
 • S: number of sublevels (default: 3)
 • smin: index of the first level. The level index s varies in smin, ..., S + 1 (default: -1)
 • σ0: base smoothing (default: σ0 = 1.6 · 2^{1/S})
The SIFT interest points (x, σ) are points of local extrema of the DoG scale space. The selection of such points is controlled by the following parameters:

• Threshold: local extrema threshold. Local extrema whose value |D(x; σ)| is below this number are rejected (default: 0.04/S/2)
• EdgeThreshold: local extrema localization threshold. If the local extremum is on a valley, the algorithm discards it as it is too unstable. Extrema are associated with a score proportional to their sharpness and rejected if the score is below this threshold (default: 10)

Figure 4.2: GUI Screenshots: SIFT Features Dialog

4.2.3 SURF features

SURF (Speeded Up Robust Features) is a feature detector and descriptor, first presented by Herbert Bay et al. in 2006 [53]. It is partly inspired by the SIFT descriptor. The standard version of SURF is several times faster than SIFT and is claimed by its authors to be more robust against different image transformations than SIFT.
4.2.3.1 Interest point detection

Feature detection. The SURF detector is based on the determinant of the Hessian matrix. The second order derivatives of the image are calculated by convolution with a second order scale-normalized Gaussian kernel, which allows for analysis over scales as well as space (scale-space theory is discussed further later in this section). The Hessian matrix H is computed as a function of both space x = (x, y) and scale σ:

    H(x, σ) = [ Lxx(x, σ)  Lxy(x, σ) ]
              [ Lxy(x, σ)  Lyy(x, σ) ],    (4.10)

where Lxx(x, σ) refers to the convolution of the second order Gaussian derivative ∂²g(σ)/∂x² with the image at point x = (x, y), and similarly for Lyy and Lxy. These derivatives are known as Laplacian of Gaussian (LoG).

The authors proposed an approximation to the Laplacian of Gaussians by using box filter representations of the respective kernels (see Figure 4.3). A considerable performance increase is found when these filters are used in conjunction with an intermediate image representation known as the integral image. Further details of the integral image definition can be found in [70]. Using the integral image, the task of calculating the sum over an upright rectangular region (box filter) is reduced to four operations and hence is independent of the region size. SURF makes good use of this property to perform fast convolutions of varying size box filters in near constant time.

Bay proposes the following formula as an accurate approximation of the Hessian determinant using the approximated Gaussians:

    det(Happrox) = Dxx Dyy − (0.9 Dxy)²    (4.11)

The search for local maxima of this function over both space and scale yields the interest points for an image.

In contrast to the traditional way of constructing a scale-space scheme, the SURF approach leaves the original image unchanged and varies only the filter size (see Figure 4.4). In SURF the lowest level of the scale-space is obtained from the output of the 9x9 filters shown in Figure 4.3. These filters correspond to a Gaussian kernel with σ = 1.2. Pyramid levels are grouped by octave (an octave corresponds to doubling the value of σ). Subsequent layers are obtained by upscaling the filters while maintaining the same filter layout ratio. For further details of the factors which must be taken into consideration when constructing larger filters, see [53] and [70].

The task of localising the interest points in the image can be divided into three steps. First, the responses are thresholded such that all values below the predetermined threshold are removed. Increasing the threshold lowers the number of detected interest points, leaving only the strongest, while decreasing it allows many more to be detected.
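The following sketch illustrates the integral image trick referred to above: once the cumulative table is built, any upright box sum costs exactly four lookups regardless of the box size. It is a minimal illustration under our own naming, not SURF's actual implementation.

#include <vector>

// Summed-area table: I(y, x) = sum of all pixels above and to the left of (x, y).
// A one-pixel border of zeros simplifies the lookups below.
std::vector<std::vector<long> > integralImage(const std::vector<std::vector<int> >& img)
{
    size_t h = img.size(), w = img[0].size();
    std::vector<std::vector<long> > I(h + 1, std::vector<long>(w + 1, 0));
    for (size_t y = 0; y < h; ++y)
        for (size_t x = 0; x < w; ++x)
            I[y + 1][x + 1] = img[y][x] + I[y][x + 1] + I[y + 1][x] - I[y][x];
    return I;
}

// Sum over the box [x0, x1) x [y0, y1): four lookups, independent of box size.
long boxSum(const std::vector<std::vector<long> >& I,
            size_t x0, size_t y0, size_t x1, size_t y1)
{
    return I[y1][x1] - I[y0][x1] - I[y1][x0] + I[y0][x0];
}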
Figure 4.3: Laplacian of Gaussian approximation. Top row: the second order Gaussian derivatives in the x, y and xy-directions. We refer to these as Lxx, Lyy, Lxy. Bottom row: box filter approximations in the x, y and xy-directions. We refer to these as Dxx, Dyy, Dxy.

Figure 4.4: Filter pyramid. The traditional approach to constructing a scale-space (left): image sizes vary and a Gaussian kernel is repeatedly applied to smooth subsequent pyramid levels. The SURF approach (right) leaves the original image unchanged and varies only the filter size.
After thresholding, a non-maximal suppression is performed to find a set of candidate points. To do this, each pixel in the scale-space is compared to its 26 neighbours: the 8 points in the native scale and the 9 in each of the scales above and below. The pixel is selected as a maximum if it is greater than all of the surrounding 26 neighbours. The final step involves interpolating the nearby data to find the location in both space and scale to sub-pixel accuracy. This is done by fitting a 3D quadratic form of the determinant of the Hessian function.

Feature descriptor. The SURF descriptor describes how the pixel intensities are distributed within a scale-dependent neighbourhood of each detected interest point. This approach is similar to that of SIFT [52], but integral images used in conjunction with filters known as Haar wavelets are employed in order to increase robustness and decrease computation time.

Again, extraction of the descriptor can be divided into two distinct tasks: computation of the dominant orientation and computation of the descriptor components (a 64-dimensional vector). For further details about the SURF descriptor, see [53]. The resulting SURF descriptor is invariant to rotation, scale, brightness and, after reduction to unit length, contrast.

4.2.3.2 Interest point matching

First, SURF features are obtained from the reference and input images using the algorithm described above. As in SIFT interest point matching, the feature matching is done through a Euclidean-distance based nearest neighbor approach using either FLANN or brute-force matching. Again, to increase robustness, matches are rejected for those keypoints for which the ratio of the nearest neighbor distance to the second nearest neighbor distance is greater than 0.8, as Lowe recommended [52].

4.2.3.3 Implementation

We are using the SURF implementation included in the OpenCV library [61]. SURF parameters can be set up via a keywordlist (see Figure 4.5).

The plugin accepts the following parameters describing the Laplacian of Gaussians scale space being used:

• nOctaves: the number of octaves (default: 4)
• nOctaveLayers: the number of levels per octave (default: 2)

The SURF interest points (x, σ) are points of local maxima of the LoG scale space. The selection of such points is controlled by the following parameter:

• HessianThreshold: threshold value. Local maxima whose value |L(x; σ)| is below this number are rejected (default: 100)
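As an illustration of these three parameters, the sketch below instantiates OpenCV's SURF detector/descriptor with the plugin defaults (hessian threshold 100, 4 octaves, 2 layers per octave). It is a minimal standalone example using the OpenCV 2.x API, not the plugin code itself; the image path is a placeholder.

#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>

int main()
{
    cv::Mat img = cv::imread("input.tif", 0);  // placeholder path, grayscale

    // SURF with the plugin's default parameters:
    // hessianThreshold = 100, nOctaves = 4, nOctaveLayers = 2.
    cv::SURF surf(100, 4, 2);

    std::vector<cv::KeyPoint> keypoints;
    std::vector<float> descriptors;  // 64 floats per keypoint
    surf(img, cv::Mat(), keypoints, descriptors);

    return 0;
}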
Figure 4.5: GUI Screenshots: SURF Features Dialog

4.3 Model Optimization

Given the coordinates of a set of correspondences in the images:

    {(xi, yi), (x'i, y'i); i = 1, …, N}    (4.12)

where (xi, yi) is a point in the reference image (master image) and (x'i, y'i) is its correspondence in the input image (slave image), this block aims to determine the function T(x, y) with components Tx(x, y) and Ty(x, y) such that:

    x'i = Tx(xi, yi)
    y'i = Ty(xi, yi).    (4.13)

In particular, this block aims to optimize a generic polynomial model:

    Tx(x, y) = a0 + a1 x + a2 y + a3 x² + a4 y² + a5 xy + …
    Ty(x, y) = b0 + b1 x + b2 y + b3 x² + b4 y² + b5 xy + …    (4.14)

We can compute the parameters of each transformation Tx and Ty separately. In both cases, we use the method of least squares to find an approximate solution to the resulting overdetermined system (more equations than unknowns).
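Before turning to the fitting itself, here is a minimal sketch of what the model in equation 4.14 computes when truncated to first order (an affine transformation); the helper name is ours, for illustration only.

#include <utility>

// First-order (affine) instance of equation 4.14:
// Tx(x, y) = a0 + a1*x + a2*y,  Ty(x, y) = b0 + b1*x + b2*y.
std::pair<double, double> applyFirstOrder(const double a[3], const double b[3],
                                          double x, double y)
{
    return std::make_pair(a[0] + a[1] * x + a[2] * y,
                          b[0] + b[1] * x + b[2] * y);
}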
The least-squares method minimizes the following cost function:

    â = arg min_a Σ_{i=1}^{N} ( x'i − Tx(xi, yi) )²,    (4.15)

that is, it minimizes the sum of the squares of the distances from the transformed reference locations to the corresponding image locations.

To solve for the transformation parameters one can rewrite the linear system to gather the unknowns into a column vector:

    [ 1  x1  y1  x1²  y1²  x1y1  … ] [ a0 ]   [ x'1 ]
    [ 1  x2  y2  x2²  y2²  x2y2  … ] [ a1 ] = [ x'2 ]    (4.16)
    [ ⋮                            ] [ ⋮  ]   [ ⋮   ]
    [ 1  xN  yN  xN²  yN²  xNyN  … ] [ aM ]   [ x'N ]

where a0, …, aM are the coefficients of the polynomial model. Let A be the matrix on the left side of equation 4.16. The least-squares solution of this system of linear equations is given in terms of the pseudoinverse matrix of A:

    a* = A+ x' = (A^T A)^(-1) A^T x'.    (4.17)

The solution is analogous for the transformation parameters of Ty:

    b* = A+ y' = (A^T A)^(-1) A^T y'.    (4.18)
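To make equations 4.16-4.18 concrete, here is a minimal sketch (our own code, not the plugin's) that builds the design matrix for a first-order polynomial model and solves the least-squares problem with OpenCV. cv::solve with DECOMP_SVD is used instead of forming (A^T A)^(-1) explicitly, which is numerically safer but mathematically equivalent.

#include <opencv2/core/core.hpp>
#include <vector>

// Fit Tx(x, y) = a0 + a1*x + a2*y to correspondences (pts -> xPrime)
// in the least-squares sense (equations 4.16-4.17, first-order model).
cv::Mat fitFirstOrder(const std::vector<cv::Point2f>& pts,
                      const std::vector<float>& xPrime)
{
    int N = static_cast<int>(pts.size());
    cv::Mat A(N, 3, CV_32F), b(N, 1, CV_32F);
    for (int i = 0; i < N; ++i) {
        A.at<float>(i, 0) = 1.0f;        // constant term
        A.at<float>(i, 1) = pts[i].x;    // x term
        A.at<float>(i, 2) = pts[i].y;    // y term
        b.at<float>(i, 0) = xPrime[i];
    }
    cv::Mat a;                           // (a0, a1, a2)
    cv::solve(A, b, a, cv::DECOMP_SVD);  // least-squares solution via SVD
    return a;
}
// The same routine, fed the y' coordinates, yields the Ty coefficients.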
We can expect a good amount of outliers when using automatic feature matching. We use RANSAC (the abbreviation of RANdom SAmple Consensus) to perform a robust optimization. RANSAC [54] is an iterative method to estimate the parameters of a mathematical model from a set of observed data which contains outliers. It is a non-deterministic algorithm in the sense that it produces a reasonable result only with a certain probability, with this probability increasing as more iterations are allowed. RANSAC also assumes that, given a (usually small) set of inliers, there exists a procedure which can estimate the parameters of a model that optimally explains or fits this data.

The generic RANSAC algorithm, in pseudocode, works as shown in Figure 4.6. The polynomial model is initialized with (optimized with) a minimum number of randomly chosen tie points (n). If the model fits a large enough tie point set (enough inliers d within a certain error threshold t), then a model is optimized using this subset of data. This process is repeated a certain number of times (maximum iterations k) and the best model is returned. Figure 4.7 shows the GUI screenshot of the optimization routine with its configurable input parameters.

input:
    data - a set of observations
    model - a model that can be fitted to data
    n - the minimum number of data required to fit the model
    k - the number of iterations performed by the algorithm
    t - a threshold value for determining when a datum fits a model
    d - the number of close data values required to assert that a model fits well to data
output:
    best_model - model parameters which best fit the data (or nil if no good model is found)
    best_consensus_set - data points from which this model has been estimated
    best_error - the error of this model relative to the data

iterations := 0
best_model := nil
best_consensus_set := nil
best_error := infinity
while iterations < k
    maybe_inliers := n randomly selected values from data
    maybe_model := model parameters fitted to maybe_inliers
    consensus_set := maybe_inliers
    for every point in data not in maybe_inliers
        if point fits maybe_model with an error smaller than t
            add point to consensus_set
    if the number of elements in consensus_set is > d
        (this implies that we may have found a good model,
         now test how good it is)
        better_model := model parameters fitted to all points in consensus_set
        this_error := a measure of how well better_model fits these points
        if this_error < best_error
            (we have found a model which is better than any of the
             previous ones, keep it until a better one is found)
            best_model := better_model
            best_consensus_set := consensus_set
            best_error := this_error
    increment iterations
return best_model, best_consensus_set, best_error

Figure 4.6: Pseudocode of the generic RANSAC algorithm
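For readers who want a ready-made robust estimator to compare against, OpenCV exposes RANSAC through functions such as cv::findHomography. The sketch below is our own example, unrelated to the plugin's polynomial model: it fits a homography to matched points with a 3-pixel reprojection threshold.

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

int main()
{
    // Matched points from the reference and input images (filled elsewhere,
    // e.g. from the feature matching step of section 4.2).
    std::vector<cv::Point2f> refPts, inPts;

    // Robust fit: random minimal samples are drawn, each candidate model is
    // scored by how many points reproject within 3 pixels, and outliers are
    // flagged with a zero in the inlier mask.
    std::vector<uchar> inlierMask;
    cv::Mat H = cv::findHomography(refPts, inPts, CV_RANSAC, 3.0, inlierMask);

    return 0;
}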
Figure 4.7: GUI Screenshots: Polynomial Model Optimization Dialog with RANSAC Outlier Rejection

4.4 Image Registration Experiments

This section demonstrates the plugin capabilities for performing image registration. Image registration is the process of determining the spatial transform that maps points from one image to homologous points on an object in the second image. A registration method requires the following set of components: two input images, the set of correspondences, a transform, an optimizer and an interpolator.

We are using as input images two 1024x768 JPEG aerial images included in the source code package of the Vision Lab Features Library [69]. The input images are shown in Figure 4.8.

First, we use the SURF algorithm to extract a set of correspondences between the images. The SURF detector and descriptor routine has been executed using the following parameters (see section 4.2.3.3 for more details):

• nOctaves: 4
• nOctaveLayers: 2
• HessianThreshold: 500

Feature selection and matching results are summarized below. The matching step results are illustrated in Figure 4.9. Note the presence of outliers in the correspondence point set.

• Number of interest points extracted in reference image: 4903
• Number of interest points extracted in input image: 5310
• Number of correspondences: 275
Figure 4.8: Image registration demonstration. Above: reference image (river1.jpg). Below: input image (river2.jpg).

Second, a first-order polynomial model transformation has been optimized using the set of correspondence pairs. As the set of correspondences contains some outliers, the model optimization has been performed using RANSAC. RANSAC has been executed using the following parameters (see section 4.3 for further details):

• Number of iterations performed by the algorithm: 100 iterations
Figure 4.9: Image registration demonstration. Interest point matching using SURF: correspondences between reference (above) and input (below) images.

• Minimum number of data required to fit the model: 24 points
• Threshold value for determining when a datum fits a model: 3 pixels
• Number of close data values required to assert that a model fits well to data: 48 points

The result of the optimization routine is a .geom file (readable by the OSSIM core) which contains the coefficients of the transformation. This file is assigned as the projection model for the slave image. This projection model is used by the OSSIM core to render (interpolate) the input image to the reference space (the projection model of the reference image is assumed to be the identity model).
Finally, the accuracy of the model is checked by visual inspection, creating an output mosaic fed by the input image chains. The mosaic averages the values of the pixels in the overlapping area (blend mosaic). The image mosaic is shown in Figure 4.10. Indeed, the input image has been aligned with the reference image.

Figure 4.10: Image registration demonstration. Blend mosaic of reference and registered input image after image registration.
Chapter 5 Super-Resolution Experiments With Satellite Imagery

5.1 Introduction

The aim of this section is to finally test the superresolution software with real satellite imagery datasets. Superresolution with datasets from Landsat/ETM+ and MSG/SEVIRI has been performed. Each section starts with a brief technical overview describing the main image characteristics. Details of the superresolution framework parameters and the superresolution results are presented for both satellite datasets.

5.2 Landsat/ETM+ Experiments

5.2.1 Preliminaries

Landsat satellites have been providing multispectral images of the Earth continuously since the early 1970s. Landsat 7, launched on April 15, 1999, is the latest satellite of the Landsat Program [71]. The Landsat Program is managed and operated by NASA, and data from Landsat 7 is collected and distributed by the U.S. Geological Survey (USGS).

The instrument on board Landsat 7 is the Enhanced Thematic Mapper Plus (ETM+). The ETM+ is a fixed whisk-broom multispectral scanning radiometer capable of providing high-resolution imaging information of the Earth's surface. It detects spectrally-filtered radiation in visible and near-infrared, short-wave infrared, long-wave infrared (thermal imaging) and panchromatic bands. Landsat ETM+ images consist of eight spectral bands with a spatial resolution of 30 meters for Bands 1 to 5 and 7, 60 meters for Band 6 (thermal band) and 15 meters for Band 8 (panchromatic). The ETM+ spectral band designations are summarized in Figure 5.1. For a detailed description of ETM+ spatial characteristics, see the L7 Science Data Users Handbook [72].

Landsat data can be ordered from two USGS websites:

• USGS Global Visualization Viewer (GloVis) at http://glovis.usgs.gov/
• USGS Earth Explorer at http://earthexplorer.usgs.gov
Band                            Wavelength (µm)   Resolution (m)
Band 1 - Blue                   0.45-0.52         30
Band 2 - Green                  0.52-0.60         30
Band 3 - Red                    0.63-0.69         30
Band 4 - Near infrared          0.77-0.90         30
Band 5 - Short-wave infrared    1.55-1.75         30
Band 6 - Thermal infrared       10.40-12.50       60
Band 7 - Short-wave infrared    2.09-2.35         30
Band 8 - Panchromatic           0.52-0.90         15

Figure 5.1: Landsat 7 ETM+ spectral channels.

All Landsat 7 ETM+ (1999-present) scenes are processed through the Level 1 Product Generation System (LPGS). The purpose of these geometric algorithms is to create accurate Level 1 output products by using Earth ellipsoid and terrain surface information (Digital Elevation Model, DEM), in conjunction with spacecraft ephemeris and altitude data and knowledge of the ETM+ instrument and Landsat 7 satellite geometry, to relate locations in ETM+ image space (band, scan, detector, sample) to geodetic object space (latitude, longitude, and height).

Landsat scenes are processed to Standard Terrain Correction (Level 1T) if possible. Level 1T provides radiometric and geometric accuracy by incorporating ground control points (GCP) while employing a Digital Elevation Model (DEM) for topographic accuracy. Some scenes do not have the ground-control or elevation data necessary for L1T correction, and in these cases the best level of correction is applied. The Systematic Terrain Correction (Level 1Gt) provides radiometric and geometric accuracy while employing a DEM for topographic accuracy. The Systematic Correction (Level 1G) provides radiometric and geometric accuracy derived from data collected by the sensor and spacecraft.

Scenes are delivered with the following parameters:

• GeoTIFF output format
• Cubic Convolution (CC) resampling method
• Universal Transverse Mercator (UTM) map projection
• Map (North-up) image orientation
• Geolocated and radiometrically pre-processed image data (levels of correction: 1T, 1Gt, 1G)
• Pixel depth: 8 bits

The Worldwide Reference System (WRS) [73] is a global notation system for Landsat data. It enables a user to inquire about satellite imagery over any portion of the world by specifying a nominal scene center designated by PATH and ROW numbers. A standard WRS scene covers a land area of approximately 185 kilometers (across-track) by 180 kilometers (along-track). The combination of a Path number and a Row number uniquely identifies a nominal scene center. The Path number is always given first, followed by the Row number. The notation 127-043, for example, relates to Path number 127 and Row number 043. A map of the WRS can be found at http://landsat.gsfc.nasa.gov/about/wrs2.gif.
SLC Failure and SLC-off Products. On May 31, 2003, the Scan Line Corrector (SLC), which compensates for the forward motion of Landsat 7, failed. Subsequent efforts to recover the SLC were not successful, and the failure appears to be permanent. Without an operating SLC, the ETM+ line of sight now traces a zig-zag pattern along the satellite ground track (see Figure 5.2). As a result, imaged area is duplicated, with a width that increases toward the scene edge.

Figure 5.2: Landsat/ETM+ scan line corrector failure (source: [74]).

The Landsat 7 ETM+ is still capable of acquiring useful image data with the SLC turned off, particularly within the central part of any given scene. The Landsat 7 ETM+ therefore continues to acquire image data in the "SLC-off" mode. All Landsat 7 SLC-off data are of the same high radiometric and geometric quality as data collected prior to the SLC failure. The SLC-off effects are most pronounced along the edges of the scene and gradually diminish toward the center of the scene (see Figure 5.3). The middle of the scene, approximately 22 kilometers wide on a Level 1 (L1G, L1Gt, L1T) product, contains very little duplication or data loss, and this region of each image is very similar in quality to previous ("SLC-on") Landsat 7 image data. An estimated 22 percent of any given scene is lost because of the SLC failure.

Figure 5.3: Complete Landsat 7 scene showing affected vs. unaffected area (source: [74]).
The maximum width of the data gaps along the edge of the image is equivalent to one full scan line, or approximately 390 to 450 meters. The precise location of the missing scan lines varies from scene to scene.

Each ETM+ scene downloaded from the USGS archive includes individual TIFF files for each band and a metadata file (MTL.txt). A README file containing a summary and brief description of the file contents and naming convention is also included. Landsat ETM+ SLC-off scenes also include Gap Mask files for each band, which allow users to identify the location of all pixels affected by the original data gaps in the SLC-off scene.

5.2.2 Results

Four ETM+ datasets corresponding to the WRS 201-32 have been downloaded using the USGS Earth Explorer browser (see Figure 5.4). The datasets cover the whole area of Madrid. Only datasets with a cloud-cover percentage of less than 10% have been selected. All downloaded scenes are delivered in GeoTIFF format, with processing level 1T. A piece of the MTL metadata file of one of the scenes is shown in Figure 5.5. All scenes were taken after the Landsat SLC failure, so they do contain scan gaps.

To extract geometry info from the image header of the GeoTIFF files:

>> ossim-info -p -i image.tif -o image.geom

ossim-info writes the geometry file with both image and projection information that the OSSIM core will recognize.

We take 4 input images corresponding to the panchromatic bands of the Landsat/ETM+ to perform image superresolution:

• image01.tif: 15981x14601 8-bit image; coordinate system is WGS 84 / UTM zone 30N.
• image02.tif: 15961x14601 8-bit image; coordinate system is WGS 84 / UTM zone 30N.
• image03.tif: 15981x14621 8-bit image; coordinate system is WGS 84 / UTM zone 30N.
• image04.tif: 16001x14661 8-bit image; coordinate system is WGS 84 / UTM zone 30N.

The resolution of the original panchromatic images is 15 meters. As the images are geolocated, no image registration is performed. Delaunay interpolation is the reconstruction algorithm applied to generate the output high-resolution image, with a scale factor of 1.0.

This test has been performed on a 1.6 GHz Intel Core i7 PC with 4 GB RAM under Linux in a parallel scenario. The time required to generate the output product using a block size of 512x512 pixels and 4 processors has been 1 h 51 min 4 s.

A comparison between the original image and the high-resolution output image is shown in Figure 5.6. Two image details, corresponding to the Zarzuela Hippodrome and Retiro Park, have been selected for comparison. The Zarzuela Hippodrome area is not affected by SLC-off scan gaps but Retiro Park is clearly affected. Results show that, in both cases, boundaries in the output image are sharper than in the original image. Also, while still leaving a small visible residual gap, superresolution could be a good technique to fill the gaps in an SLC-off image.
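For reference, a run like the one above could be launched with MPI as sketched below. The test_superresolution command-line tool is part of our suite (see Appendix A), but its exact arguments are not documented in this chapter, so the invocation shown is a hypothetical placeholder illustrating a 4-process launch:

>> mpirun -np 4 test_superresolution image01.tif image02.tif image03.tif image04.tif output.tif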
Figure 5.4: Above: USGS Earth Explorer browser. Below: panchromatic band of one of the Landsat/ETM+ scenes with SLC-off scan gaps.
GROUP = L1_METADATA_FILE
  GROUP = METADATA_FILE_INFO
    ORIGIN = "Image courtesy of the U.S. Geological Survey"
    REQUEST_ID = "0101101243774_00002"
    PRODUCT_CREATION_TIME = 2011-01-24T10:40:01Z
    ...
  END_GROUP = METADATA_FILE_INFO
  GROUP = PRODUCT_METADATA
    PRODUCT_TYPE = "L1T"
    ELEVATION_SOURCE = "GLS2000"
    PROCESSING_SOFTWARE = "LPGS_11.3.0"
    EPHEMERIS_TYPE = "DEFINITIVE"
    SPACECRAFT_ID = "Landsat7"
    SENSOR_ID = "ETM+"
    SENSOR_MODE = "SAM"
    ACQUISITION_DATE = 2004-04-27
    SCENE_CENTER_SCAN_TIME = 10:44:43.6935459Z
    WRS_PATH = 201
    STARTING_ROW = 32
    ENDING_ROW = 32
    BAND_COMBINATION = "123456678"
    ...
    PRODUCT_SAMPLES_PAN = 15961
    PRODUCT_LINES_PAN = 14601
    PRODUCT_SAMPLES_REF = 7981
    PRODUCT_LINES_REF = 7301
    PRODUCT_SAMPLES_THM = 7981
    PRODUCT_LINES_THM = 7301
    BAND1_FILE_NAME = "L71201032_03220040427_B10.TIF"
    BAND2_FILE_NAME = "L71201032_03220040427_B20.TIF"
    BAND3_FILE_NAME = "L71201032_03220040427_B30.TIF"
    BAND4_FILE_NAME = "L71201032_03220040427_B40.TIF"
    BAND5_FILE_NAME = "L71201032_03220040427_B50.TIF"
    BAND61_FILE_NAME = "L71201032_03220040427_B61.TIF"
    BAND62_FILE_NAME = "L72201032_03220040427_B62.TIF"
    BAND7_FILE_NAME = "L72201032_03220040427_B70.TIF"
    BAND8_FILE_NAME = "L72201032_03220040427_B80.TIF"
    GCP_FILE_NAME = "L71201032_03220040427_GCP.txt"
    METADATA_L1_FILE_NAME = "L71201032_03220040427_MTL.txt"
    CPF_FILE_NAME = "L7CPF20040401_20040513_05"
  END_GROUP = PRODUCT_METADATA
  ...

Figure 5.5: MTL metadata file extract of one of the downloaded Landsat/ETM+ scenes.
Figure 5.6: Landsat/ETM+ superresolution using Delaunay interpolation: Zarzuela Hippodrome (a) panchromatic image, (b) output image; Retiro Park (c) panchromatic image, (d) output image.
5.3 MSG/SEVIRI Experiments

5.3.1 Preliminaries

For almost 30 years the European Space Agency (ESA) has been building Europe's weather satellites: the Meteosat series of geostationary spacecraft, the first of which was launched in 1977. The success of the early Meteosats led to the creation of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) in 1986. ESA and EUMETSAT worked together on the later satellites in the series, designed to deliver continuous weather images to European forecasters on an operational basis. The original satellites (from Meteosat-1 to Meteosat-7) are gradually being replaced by a second generation of Meteosats. The four Meteosat Second Generation (MSG) geostationary satellites make up Europe's new generation of weather satellites.

MSG-2 (redesignated Meteosat-9) is the second MSG satellite and is planned to serve as the prime operational meteorological satellite for Europe. It monitors a quarter of the Earth and its atmosphere from a fixed position in geostationary orbit at 0 degrees longitude, 35,800 km above the Earth. The main MSG instrument is called the Spinning Enhanced Visible and Infra-red Imager (SEVIRI). It builds up images of the Earth's surface and atmosphere in 12 different wavelengths once every 15 minutes. The SEVIRI spectral band designations are summarized in Figure 5.7.

Band                   Wavelength (µm)             Resolution (m)
Channel 1 - VIS 0.6    0.56-0.71                   3000
Channel 2 - VIS 0.8    0.75-0.88                   3000
Channel 3 - IR 1.6     1.50-1.78                   3000
Channel 4 - IR 3.9     3.48-4.36                   3000
Channel 5 - WV 6.2     5.35-7.15                   3000
Channel 6 - WV 7.3     6.85-7.85                   3000
Channel 7 - IR 8.7     8.30-9.10                   3000
Channel 8 - IR 9.7     9.38-9.94                   3000
Channel 9 - IR 10.8    9.80-11.80                  3000
Channel 10 - IR 12.0   11.00-13.00                 3000
Channel 11 - IR 13.4   12.40-14.40                 3000
Channel 12 - HRV       Broadband (about 0.4-1.1)   1000

Figure 5.7: High Rate MSG/SEVIRI spectral channels.
Figure 5.8: MSG/SEVIRI channels, running left to right (source: [75]).

The image service is the main mission of Meteosat Second Generation (MSG). This service comprises High Rate SEVIRI image data in 12 spectral bands, which are processed in near real-time to Level 1.5 before distribution to the user (see Figure 5.8). Level 1.5 image data corresponds to the geolocated and radiometrically pre-processed image data, ready for further processing, e.g. the extraction of meteorological products. Data is accompanied by the appropriate metadata that allows the user to calculate the geographical position and radiance of any pixel.

High Rate SEVIRI image data consist of geographical arrays of various sizes of image pixels, each pixel containing 10 data bits, representing the received radiation from the earth and its atmosphere in the 12 spectral channels. Of these 12 spectral channels, 11 provide measurements with a resolution of 3 km (Channels 1 to 11) and the High Resolution Visible (HRV) channel provides measurements with a resolution of 1 km. Before distribution to the user, the data are compressed using lossless wavelet-transform compression.

High Rate SEVIRI image data can be obtained through the following EUMETSAT dissemination mechanisms:

• EUMETCast, a multi-service dissemination system based on standard Digital Video Broadcast (DVB) technology;
• direct dissemination from a Meteosat satellite; and
• FTP over Internet.

EUMETCast MSG/SEVIRI datasets are delivered with the following parameters [76]:

• Native output format (wavelet-compressed files)
• Geostationary Satellite View projection
• Geolocated and radiometrically pre-processed image data (level 1.5)
• Pixel depth: 10 bits

The MSG GDAL driver implements reading support for MSG files. Check http://www.gdal.org/frmt_msg.html for more details.
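As an illustration of the pre-processing needed before feeding these 10-bit products to our 8-bit pipeline, a conversion along the following lines can be done with GDAL's gdal_translate. The folder path and timestamp are placeholders taken from the example in the next section, and the 0-1023 input range simply reflects the 10-bit pixel depth:

>> gdal_translate -ot Byte -scale 0 1023 0 255 MSG("/home/rawdata/",201109011200,1,N,B,1,1) out8bit.tif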
5.3.2 Results

Argongra has an operational Meteosat reception station configured to receive 0 degree SEVIRI data (EUMETCast mechanism). 10 MSG/SEVIRI multispectral images of one day have been acquired (see Figure 5.9). The raw 10-bit data has been transformed to 8-bit GeoTIFF. The images use the Geostationary Satellite View projection. Most GIS packages don't recognize this projection, so the images have also been reprojected to lat/long WGS84 images. Finally, we extract channel 1 of the MSG/SEVIRI images (VIS 0.6). These pre-processing operations have been computed using the gdalwarp command-line application (GDAL distribution) and the MSG GDAL driver:

>> gdalwarp -t_srs +proj=latlong MSG("/home/rawdata/",201109011200,1,N,B,1,1) out.tif

Each of the input VIS 0.6 frames is an 8-bit 7318x7318 image. As the images are geolocated, no image registration is performed. L2 minimization with Bilateral-TV regularization has been performed. This test has been performed on a 1.6 GHz Intel Core i7 PC with 4 GB RAM under Linux in a parallel scenario. The time required to generate the output product using a block size of 512x512 pixels and 4 processors has been 39 min 45 s.

Results are shown in Figure 5.10. An area covering the Iberian Peninsula has been selected for comparison. The area extends from -12.0 deg. to 6.0 deg. longitude and from 33.0 deg. to 46.0 deg. latitude. Coastlines don't seem to be sharper in the superresolved image. As MSG/SEVIRI images come from a geostationary satellite, there is no relative motion between a set of input frames taken at different times. In this case, superresolution reduces the cloud coverage of the input frames, smoothing the cloud presence in the output image.

5.4 Conclusion

The superresolution software has been successfully tested with Landsat/ETM+ and MSG/SEVIRI datasets. The results demonstrate that our superresolution framework is suitable for practical real-world superresolution applications. Superresolution can be used as a technique to fill the scan gaps when working with Landsat/ETM+ SLC-off datasets. Superresolution can be used to reduce the cloud coverage in MSG/SEVIRI images.
Figure 5.9: MSG/SEVIRI VIS0.6 frames of the area of the Iberian Peninsula.
Figure 5.10: MSG/SEVIRI VIS0.6 superresolution results using L2+BilateralTV minimization: (a) one of the input frames, (b) output image.
Chapter 6 Conclusions and Future Work

The fundamental contribution of this research is the development of a block-based image processing superresolution library with support for parallel computing. This library is suitable for practical remote sensing superresolution and other real-world superresolution applications.

As mentioned earlier, there is a limited real-world superresolution presence and only a few commercial SR products are offered in the market. We have not found any published superresolution work that deals with the problem of the image sizes. The main goal of this research has been to implement a superresolution framework which can be used with satellite images at their original sizes. Our software has to overcome the current limitations of any superresolution implementation:

• robustness of the superresolution reconstruction in the presence of errors in the image registration;
• efficient implementation to deal with the high memory and computational requirements of any superresolution algorithm; and
• image registration accuracy with complex geometrical models for real-world images;

but also to adapt the superresolution implementation to deal with satellite imagery by:

• providing block-based image processing to overcome the huge memory requirements; and
• providing a parallel implementation of the superresolution algorithms for faster product generation.

Our superresolution suite provides a framework to apply superresolution techniques to real-world image datasets. The core of the suite is a superresolution library and an automatic image registration library written in C++. Both libraries have been implemented as OSSIM plugins to exploit the block-based image processing approach, the support for parallel computing with MPI, and the support for a wide range of image projections and datums of the OSSIM core library.

As mentioned earlier, super-resolution can be divided in two steps: subpixel image registration and image reconstruction. The superresolution plugin provides both a spatial-based [65] algorithm and a frequency-based [66] algorithm to perform subpixel image registration. A non-uniform interpolation technique (Delaunay interpolation) and also two iterative algorithms (IBP [31] and L2-norm with Bilateral-TV regularization minimization [67]) are provided to perform image reconstruction. The iterative algorithms have been implemented using an image-based formulation rather than a vector-based formulation [57] to reduce memory requirements and speed up the reconstruction step.
The registration plugin takes benefit of the computer vision field's contributions to provide a fast and robust method for registering images with complex real-world geometric transformations. The plugin has been designed to work without human intervention. The parameters of a configurable geometric transformation are optimized using a feature-based registration technique. The OpenCV implementations of the following feature detectors have been integrated in the registration plugin: Harris corners [51], SIFT features [52] and SURF features [53]. Least-squares minimization using RANSAC is applied to perform a robust model optimization.

A Qt-based GUI for image super-resolution has been built on top of the core library and the OSSIM plugins. This application provides a friendly environment to set up the plugins and to create image chains to generate output products. Finally, the OSSIM GDAL plugin has been included in the superresolution suite to provide access to all the raster formats supported by GDAL.

The algorithm implementations and the plugin performance (block-based image processing and parallel capabilities) have been successfully tested with synthetic images. The superresolution software has been successfully tested with real Landsat/ETM+ and MSG/SEVIRI datasets, demonstrating that our software framework is suitable for practical real-world superresolution applications and, concretely, for remote sensing superresolution applications.

What have we learned? First, image registration has been confirmed as a critical pre-processing step in image super-resolution. The key to successful superresolution lies in an accurate image alignment. If there are errors in image registration, superresolution will fail.

Second, how many frames are enough? With a precise image registration, more frames will result in a better reconstruction. Based on our experience, the minimum number of frames for a useful reconstruction is not smaller than 4. If we use a feature-based automatic image registration technique, we should assume that there will be outliers among the points used to optimize the model. RANSAC helps to perform outlier rejection while fitting a model using a set of points which contains outliers. However, there is no guarantee that the resulting optimized model is precise. It may be good, but not perfect. In a real-world superresolution application, more frames will not result in a better reconstruction, due to accumulative errors in image registration.

Third, when using iterative techniques in superresolution reconstruction, the relationship between the time required to generate the output product and the number of processors used in a parallel scenario is not linear. The best performance is obtained when using non-uniform interpolation techniques.

Fourth, in analyzing the results of SR, human perception is what matters. As many researchers before us have concluded [21], it is often hard to draw a direct correlation between RMSE measures and visual quality. Also, in real applications there is no reference image with which to compute a similarity metric. In this way, as Bozinovic says, beauty is in the eye of the beholder [49].

Fifth, superresolution can be used as a technique to fill the scan gaps when working with Landsat/ETM+ SLC-off datasets.

Finally, superresolution is possible only if there exist subpixel motions between the LR frames.
However, if no subpixel motion exists between the LR frames, superresolution can still act as a good denoiser. SR can improve perceivable resolution by reducing noise. In this way, superresolution can be used to reduce the cloud coverage in MSG/SEVIRI geostationary satellite images.

What about future work? The current superresolution suite works only with 8-bit grayscale images. Color superresolution (see [77] and [78]) is a vast, as yet unexplored and fascinating research field. Although the polynomial model is found to be sufficient to align a set of unregistered images in many superresolution applications, RPC models and also rigorous sensor models can be incorporated as useful extensions of the registration plugin. These models are needed to align raw data coming from satellite imagery. Statistics and better tools to measure the accuracy of the optimized model can also be included in the GUI. Other regularization techniques, as well as automatic regularization step-size estimation, can also be an extension for the superresolution plugin. Other subpixel registration algorithms, for example those based on feature-based techniques, can also be included. Finally, the visible block boundaries in the generated product due to the block-based design of the framework should be eliminated using image filtering techniques.

The dynamic plugin-based design allows for rapid growth of the suite by further including other automated image processing chain implementation case-studies, such as image mosaics, vegetation indexes, land covers and other high-level products from satellite imagery.

We hope that our experience will help in applying superresolution in other areas of interest and in bringing more exciting technologies from labs to the market.
Appendix A Super-Resolution Software Build Guide

This document describes the build process for the Superresolution Software in Ubuntu 10.04 LTS using CMake.

The complete build process has the following steps:

• install required dependencies via Synaptic;
• build libtiff from source;
• build libgeotiff from source;
• build OpenSceneGraph from source;
• build GDAL from source;
• build OSSIM from source;
• build OpenCV from source;
• build QGIS from source;
• build the Superresolution Software from source.

Installing Dependencies

Install the following dependencies using Synaptic:

• build-essential: build-essential packages
• qtcreator: IDE for Qt
• cmake-qt-gui: Qt4 based user interface for CMake
• subversion: advanced version control system (also known as svn)
• libjpeg-dev: development files for the JPEG library
• libexpat-dev: development files for the XML parser C library
• zlib-dev: development files for the gzip compression library
• libpng-dev: development files for the PNG library
• libgif-dev: development files for the GIF library
• libopenmpi-dev: development files for the OpenMPI library (open-source MPI implementation)
• libfftw3-dev: development files for the FFTW library for computing Fast Fourier Transforms

Building libtiff

The current libtiff release is version 4.0.0 and is available from:

http://download.osgeo.org/libtiff/libtiff-4.0.0beta4.tar.gz

Uncompress the file and build libtiff from source by doing:

>> ./configure --prefix=/usr
>> make -j4
>> sudo make install

Building libgeotiff

The most recent libgeotiff release is version 1.3.0 and is available from:

http://download.osgeo.org/geotiff/libgeotiff/libgeotiff-1.3.0.tar.gz

Uncompress the file and build libgeotiff from source by doing:

>> ./configure --prefix=/usr
>> make -j4
>> sudo make install

Building OpenSceneGraph

The last stable release is 2.8.4 (April 2011).

The OpenSceneGraph source code can be obtained by doing an svn checkout:

>> svn co http://www.openscenegraph.org/svn/osg/OpenSceneGraph/tags/OpenSceneGraph-2.8.4 OpenSceneGraph

Resolve dependencies as they appear using Synaptic, installing development packages.

Generate the Makefile using the CMake build system:

--CMAKE_INSTALL_PREFIX=/usr

Build OpenSceneGraph from source by doing:

>> make -j4
>> sudo make install
Building GDAL/OGR

The last stable release is 1.8.0 (January 2011).

The GDAL source code can be obtained by doing an svn checkout:

>> svn checkout https://svn.osgeo.org/gdal/branches/1.8/gdal gdal

Build GDAL from source by doing:

>> ./configure --prefix=/usr
>> make -j4
>> sudo make install

Building OSSIM

A) Obtaining the source

To check out an individual OSSIM module:

>> svn co http://svn.osgeo.org/ossim/trunk/<module> <module>

Get the ossim_package_support (required for build) and ossim core modules:

>> svn co http://svn.osgeo.org/ossim/trunk/ossim_package_support ossim_package_support
>> svn co http://svn.osgeo.org/ossim/trunk/ossim ossim

Get the ossim_plugins module:

>> svn co http://svn.osgeo.org/ossim/trunk/ossim_plugins ossim_plugins

Get the Qt4 interface libraries (imagelinker, iview): [OPTIONAL]

>> svn co http://svn.osgeo.org/ossim/trunk/ossim_qt4 ossim_qt4

B) Build ossim core

Generate the Makefile using the CMake build system:

--Add cache entry: CMAKE_MODULE_PATH=<OSSIM_DEV_HOME>/ossim_package_support/cmake/CMakeModules
--BUILD_OSSIM_MPI_SUPPORT=ON (Build OSSIM with MPI support)
--BUILD_SHARED_LIBS=ON (Build OSSIM for dynamic linking)
--CMAKE_INSTALL_PREFIX=/usr

Build the ossim core from source by doing:

>> make -j4
>> sudo make install

C) Build ossim plugins

Plugin dependencies:

• ossimgdal_plugin depends on ossim and gdal

Generate the Makefile using the CMake build system.
--Add cache entry: CMAKE_MODULE_PATH=<OSSIM_DEV_HOME>/ossim_package_support/cmake/CMakeModules
--BUILD_OSSIMGDAL_PLUGIN=ON (Build OSSIM GDAL plugin)
--CMAKE_INSTALL_PREFIX=/usr

Build the ossim plugins from source by doing:

>> make -j4
>> sudo make install

Under Linux systems, each compiled plugin can be identified as libossim<name>_plugin.so. Our compiled plugins are installed in the <OSSIM_DEV_HOME>/ossim_plugins/Release location.

D) Preferences file setup

D.1) Plugin keyword setup

A template of the ossim_preferences file can be found under:

<OSSIM_DEV_HOME>/ossim/src/ossim/etc/templates/ossim_preferences_template

Add the plugins directory to the ossim_preferences file:

...
// plugin support
plugin.dir: <OSSIM_DEV_HOME>/ossim_plugins/Release
...

D.2) Set the environment variable OSSIM_PREFS_FILE for automatic preference file loading. This assumes a preference file in your home directory called ossim_preferences.

>> export OSSIM_PREFS_FILE=~/ossim_preferences

D.3) Verify the installation

To verify the OSSIM installation, use:

>> ossim-info --configuration
version: OSSIM 1.8.12 20110803
preferences_keyword_list:
plugin.dir1: /home/rgutierrez/projects/ossim/ossim_plugins/Release
tile_size: 512 512

To verify the plugins are loading properly, use:

>> ossim-info --plugins
Plugin: /home/rgutierrez/projects/ossim/ossim_plugins/Release/libossimgdal_plugin.so
DESCRIPTION:
GDAL Plugin
GDAL Supported formats
 name: VRT Virtual Raster
 name: GTiff GeoTIFF
 name: NITF National Imagery Transmission Format
 name: RPFTOC Raster Product Format TOC format
 name: HFA Erdas Imagine Images (.img)
 ...

Building OpenCV

The last stable release is 2.3.1 (August 2011).

The OpenCV source code can be obtained by doing an svn checkout:

>> svn checkout https://code.ros.org/svn/opencv/branches/2.3 opencv

Generate the Makefile using the CMake build system:

--CMAKE_INSTALL_PREFIX=/usr

Build OpenCV from source by doing:

>> make -j4
>> sudo make install

Building QGIS

The QGIS source code can be obtained by doing an svn checkout:

>> svn co https://svn.osgeo.org/qgis/trunk/qgis qgis

Resolve dependencies as they appear using Synaptic, installing development packages: bison, flex, (grass-dev), libfcgi-dev, libgeos-dev, libgsl0-dev, (libpq-dev), libproj-dev, libqwt5-qt4-dev, (libspatialite-dev), (libsqlite3-dev), (pyqt4-dev-tools), (python-dev), (python-qt4-dev), (python-sip-dev).

Go to http://www.qgis.org/wiki/Building_QGIS_from_Source#Install_build_dependencies for more details.

Generate the Makefile using the CMake build system:

--CMAKE_INSTALL_PREFIX=/usr
--QT_QMAKE_EXECUTABLE: /home/rgutierrez/qtsdk-2010.04/qt/bin/qmake

Build QGIS from source by doing:

>> make -j4
>> sudo make install

Building the Superresolution Software

The Superresolution Software is under version control at labs.argongra.com. The source code can be obtained by doing an svn checkout:

>> svn co https://labs.argongra.com/svn/Argongra/superresolution superresolution

The superresolution suite has the following directory structure:

• plugins: OSSIM registration plugin, OSSIM superresolution plugin
• superresgui: Qt superresolution GUI
• test_data: test datasets
• latex: LaTeX source code
• doxyfile: Doxygen configuration file

A) Building plugins

Plugin dependencies:

• ossimregistration_plugin depends on ossim, gdal and opencv
• ossimsuperresolution_plugin depends on ossim, gdal and opencv

Generate the Makefile using the CMake build system and build the plugins from source by doing:

>> make -j4

Add the plugins via the ossim_preferences file:

...
// plugin support
plugin.dir1: /home/rgutierrez/projects/ossim/ossim_plugins/Release
plugin.file1: /home/rgutierrez/projects/superresolution/plugins/ossimregistration_plugin/libossimregistration_plugin.so
plugin.file2: /home/rgutierrez/projects/superresolution/plugins/ossimsuperresolution_plugin/libossimsuperresolution_plugin.so
...

and verify the installation.

A set of command-line applications will also be generated in the same folder where the plugins are generated:

• test_harris: image registration using Harris corners
• test_surf: image registration using SURF features
• test_sift: image registration using SIFT features
• test_mosaic: image mosaicking
• test_superresolution: image superresolution

B) Building the GUI

To build the Qt Superresolution GUI, open Qt Creator and build the application.
Bibliography

[1] H. Greenspan, "Super-Resolution in Medical Imaging," The Computer Journal, vol. 52, no. 1, pp. 43-63, 2008.
[2] H. Shen, M. K. Ng, P. Li, and L. Zhang, "Super-Resolution Reconstruction Algorithm To MODIS Remote Sensing Images," The Computer Journal, vol. 52, no. 1, pp. 90-100, 2008.
[3] G. Hong and Y. Zhang, "Wavelet-based image registration technique for high-resolution remote sensing images," Computers & Geosciences, vol. 34, pp. 1708-1720, Dec. 2008.
[4] T. Akgun, Y. Altunbasak, and R. M. Mersereau, "Super-resolution reconstruction of hyperspectral images," IEEE Transactions on Image Processing, vol. 14, no. 11, pp. 1860-1875, 2005.
[5] D. Fraser and A. Lambert, "Super Resolution for Remote Sensing Images Based on a Universal Hidden Markov Tree Model," IEEE Transactions on Geoscience and Remote Sensing, vol. 48, pp. 1270-1278, Mar. 2010.
[6] Y. Jie, D. U. Si-Dan, and Z. H. U. Xiang, "Fast Super-resolution for License Plate Image Reconstruction," 2008.
[7] K. Jia and S. Gong, "Generalized face super-resolution," IEEE Transactions on Image Processing, vol. 17, no. 6, pp. 873-886, 2008.
[8] P. Rastogi and M. Singh Chauhan, "Improvised super-resolution algorithm for face recognition," in MIPPR 2009: Pattern Recognition and Computer Vision (M. Ding, B. Bhanu, F. M. Wahl, and J. Roberts, eds.), vol. 7496, pp. 74960H-1 to 74960H-10, SPIE, 2009.
[9] P. Krämer, O. Hadar, J. Benois-Pineau, and J. P. Domenger, "Super-resolution mosaicing from MPEG compressed video," Signal Processing: Image Communication, vol. 22, no. 10, 2007.
[10] R. Y. Tsai and T. S. Huang, "Multiframe image restoration and registration," in Advances in Computer Vision and Image Processing (T. S. Huang, ed.), vol. 1 of Image Reconstruction from Incomplete Observations, pp. 317-339, JAI Press Inc, 1984.
[11] S. Borman and R. L. Stevenson, "Super-Resolution from Image Sequences - A Review," in MWSCAS '98: Proceedings of the 1998 Midwest Symposium on Systems and Circuits, p. 374, IEEE Computer Society, 1998.
[12] S. Chaudhuri, "Super-resolution image reconstruction," IEEE Signal Processing Magazine, vol. 20, no. 3, pp. 19-20, 2003.
[13] S. C. Park, M. K. Park, and M. G. Kang, "Super-resolution image reconstruction: a technical overview," IEEE Signal Processing Magazine, vol. 20, pp. 21-36, May 2003.
[14] M. K. Ng and N. K. Bose, "Mathematical Analysis of Super-Resolution Methodology," IEEE Signal Processing Magazine, vol. 20, May 2003.
[15] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, "Advances and challenges in super-resolution," International Journal of Imaging Systems and Technology, vol. 14, no. 2, pp. 47-57, 2004.
[16] J. D. van Ouwerkerk, "Image super-resolution survey," Image and Vision Computing, vol. 24, no. 10, pp. 1039-1052, 2006.
[17] H. He and L. P. Kondi, "Superresolution Color Image Reconstruction," pp. 483-502, 2007.
[18] N. Ji and H. Shroff, "Super resolution techniques," Science, pp. 605-616, 2008.
[19] H. Y. Liu, Y. Zhang, and S. Ji, "Study on the methods of super-resolution image reconstruction," in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XXXVII, pp. 461-466, 2008.
[20] A. K. Katsaggelos, R. Molina, and J. Mateos, Super Resolution of Images and Video, vol. 1. Morgan & Claypool, 2007.
[21] P. Milanfar, ed., Super-Resolution Imaging. CRC Press, 2011.
[22] S. P. Kim, N. K. Bose, and H. M. Valenzuela, "Recursive Reconstruction of High Resolution Image From Noisy Undersampled Multiframes," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 6, pp. 1013-1027, 1990.
[23] S. P. Kim and W. Y. Su, "Recursive high-resolution reconstruction of blurred multiframe images," IEEE Transactions on Image Processing, vol. 2, no. 4, pp. 534-539, 1993.
[24] N. K. Bose, H. C. Kim, and H. M. Valenzuela, "Recursive Total Least Squares Algorithm for Image Reconstruction from Noisy, Undersampled Frames," Multidimensional Systems and Signal Processing, vol. 4, no. 3, pp. 253-268, 1993.
[25] S. Rhee and M. G. Kang, "Discrete cosine transform based regularized high-resolution image reconstruction algorithm," Optical Engineering, vol. 38, no. 8, p. 1348, 1999.
[26] V. Bannore, Iterative-Interpolation Super-Resolution Image Reconstruction, vol. 195 of Studies in Computational Intelligence. Berlin, Heidelberg: Springer, 2009.
[27] J. C. Russ, The Image Processing Handbook. CRC Press, 1995.
[28] M. S. Alam, J. G. Bognar, R. C. Hardie, and B. J. Yasuda, "Infrared image registration and high-resolution reconstruction using multiple translationally shifted aliased video frames," IEEE Transactions on Instrumentation and Measurement, vol. 49, no. 5, pp. 915-923, 2000.
[29] N. Nguyen and P. Milanfar, "An efficient wavelet-based algorithm for image superresolution," in Proc. Int. Conf. Image Processing, vol. 2, pp. 351-354, 2000.
[30] S. Lertrattanapanich and N. K. Bose, "High resolution image formation from low resolution frames using Delaunay triangulation," IEEE Transactions on Image Processing, vol. 11, pp. 1427-1441, Jan. 2002.
[31] M. Irani and S. Peleg, "Improving resolution by image registration," CVGIP: Graphical Models and Image Processing, vol. 53, no. 3, pp. 231-239, 1991.
[32] L. C. Pickup, D. P. Capel, S. J. Roberts, and A. Zisserman, "Bayesian Methods for Image Super-Resolution," The Computer Journal, vol. 52, no. 1, pp. 101-113, 2008.
[33] A. K. Katsaggelos, Bayesian Super-Resolution. CRC Press, 2010.
[34] G. T. Herman, H. Hurwitz, A. Lent, and H.-P. Lung, "On the Bayesian approach to image reconstruction," Information and Control, vol. 42, no. 1, pp. 60-71, 1979.
[35] R. R. Schultz and R. L. Stevenson, "Extraction of high-resolution frames from video sequences," IEEE Transactions on Image Processing, vol. 5, no. 6, pp. 996-1011, 1996.
[36] N. Nguyen, Numerical Algorithms for Image Superresolution. PhD thesis, Stanford University, 2000.
[37] R. Hardie, K. J. Barnard, and E. E. Armstrong, "Joint MAP registration and high-resolution image estimation using a sequence of undersampled images," IEEE Transactions on Image Processing, vol. 6, pp. 1621-1633, Jan. 1997.
[38] J. Chung, E. Haber, and J. Nagy, "Numerical methods for coupled super-resolution," Inverse Problems, vol. 22, no. 4, pp. 1261-1272, 2006.
[39] H. Shen, L. Zhang, B. Huang, and P. Li, "A MAP approach for joint motion estimation, segmentation, and super resolution," IEEE Transactions on Image Processing, vol. 16, no. 2, pp. 479-490, 2007.
[40] D. C. Youla and H. Webb, "Image restoration by the method of convex projections: part 1, theory," IEEE Transactions on Medical Imaging, vol. 1, no. 2, pp. 81-94, 1982.
[41] M. I. Sezan and H. Stark, "Image restoration by the method of convex projections: part 2, applications and numerical results," IEEE Transactions on Medical Imaging, vol. 1, no. 2, pp. 95-101, 1982.
[42] H. Stark and P. Oskoui, "High-resolution image recovery from image-plane arrays using convex projections," Journal of the Optical Society of America, vol. 6, no. 11, pp. 1715-1726, 1989.
[43] M. Elad and A. Feuer, "Superresolution restoration of an image sequence: adaptive filtering approach," IEEE Transactions on Image Processing, vol. 8, pp. 387-395, Jan. 1999.
[44] M. Elad and A. Feuer, "Super-resolution reconstruction of image sequences," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 9, pp. 817-834, 1999.
[45] G. Welch and G. Bishop, "An Introduction to the Kalman Filter," In Practice, vol. 7, no. 1, pp. 1-16, 2006.
[46] D. P. Capel, Image Mosaicing and Super-resolution. PhD thesis, University of Oxford, 2001.
[47] L. C. Pickup, Machine Learning in Multi-frame Image Super-resolution. PhD thesis, University of Oxford, 2007.
[48] S. Baker and T. Kanade, "Limits on super-resolution and how to break them," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1167-1183, 2002.
[49] N. Bozinovic, "Practising Super-Resolution: What Have We Learned?," ch. 14, pp. 413-444. CRC Press, 2011.
[50] D. P. Capel and A. Zisserman, "Computer Vision Applied to Super Resolution," IEEE Signal Processing Magazine, May 2003.
[51] C. Harris and M. Stephens, "A combined corner and edge detector," in Alvey Vision Conference, vol. 15, pp. 147-151, Manchester, UK, 1988.
[52] D. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the Seventh IEEE International Conference on Computer Vision, pp. 1150-1157, 1999.
[53] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, vol. 110, pp. 346-359, June 2008.
[54] M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Communications of the ACM, vol. 24, no. 6, 1981.
[55] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, Fast and Robust Super-Resolution, vol. 2. IEEE, 2003.
[56] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, "Fast and Robust Multiframe Super Resolution," IEEE Transactions on Image Processing, vol. 13, no. 10, pp. 1327-1344, 2004.
[57] A. Zomet and S. Peleg, "Efficient Super-Resolution and Applications to Mosaics," in International Conference on Pattern Recognition, vol. 1, pp. 579-583, IEEE, 2000.
[58] A. Camargo, R. R. Schultz, Y. Wang, R. A. Fevig, and Q. He, "GPU-CPU implementation for super-resolution mosaicking of Unmanned Aircraft System (UAS) surveillance video," in 2010 IEEE Southwest Symposium on Image Analysis & Interpretation (SSIAI), pp. 25-28, 2010.
[59] M. E. Angelopoulou, C.-S. Bouganis, P. Y. K. Cheung, and G. A. Constantinides, "Robust Real-Time Super-Resolution on FPGA and an Application to Video Enhancement," ACM Transactions on Reconfigurable Technology and Systems, vol. 2, no. 4, pp. 1-29, 2009.
[60] P. Vandewalle, J. Kovacevic, and M. Vetterli, "Reproducible research in signal processing," IEEE Signal Processing Magazine, vol. 26, pp. 37-47, May 2009.
[61] OpenCV. http://opencv.willowgarage.com/wiki/.
[62] OSSIM. http://www.ossim.org.
[63] Qt - cross-platform application and UI framework. http://qt.nokia.com/.
[64] P. Vandewalle, S. Süsstrunk, and M. Vetterli, "A Frequency Domain Approach to Registration of Aliased Images with Application to Super-resolution," EURASIP Journal on Advances in Signal Processing, vol. 2006, pp. 1-15, 2006.
[65] D. Keren, S. Peleg, and R. Brada, "Image sequence enhancement using sub-pixel displacements," in Proceedings CVPR '88: The Computer Society Conference on Computer Vision and Pattern Recognition, pp. 742-746, 1988.
[66] P. Vandewalle, "Registration of aliased images for super-resolution imaging," Proceedings of SPIE, vol. 6077, pp. 607702-1 to 607702-11, 2006.
[67] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, "Robust shift and add approach to super-resolution," in Proc. of the 2003 SPIE Conf. on Applications of Digital Signal and Image Processing, pp. 121-130, 2003.
[68] M. Muja and D. G. Lowe, "Fast approximate nearest neighbors with automatic algorithm configuration," in International Conference on Computer Vision Theory and Applications (VISAPP '09), pp. 331-340, INSTICC Press, 2009.
[69] A. Vedaldi and B. Fulkerson, "VLFeat: An open and portable library of computer vision algorithms." http://www.vlfeat.org/, 2008.
[70] C. Evans, "Notes on the OpenSURF library," Tech. Rep. CSTR-09-001, University of Bristol, January 2009.
[71] The Landsat Program. http://landsat.gsfc.nasa.gov/.
[72] Landsat 7 Science Data Users Handbook. http://landsathandbook.gsfc.nasa.gov/.
[73] Worldwide Reference System (WRS). http://landsat.gsfc.nasa.gov/about/wrs.html.
[74] U.S. Geological Survey Landsat Project. http://landsat.usgs.gov/.
[75] EUMETSAT - Meteosat Second Generation (MSG). http://www.eumetsat.int/Home/Main/Satellites/MeteosatSecondGeneration.
[76] MSG Level 1.5 Image Data Format. http://www.eumetsat.int/Home/Main/DataAccess/Resources.
[77] V. H. Patil, D. S. Bormane, and H. K. Patil, "Color Super Resolution Image Reconstruction," in International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), pp. 366-370, Dec. 2007.
[78] S. Farsiu, M. Elad, and P. Milanfar, "Video-to-Video Dynamic Super-Resolution for Grayscale and Color Sequences," EURASIP Journal on Advances in Signal Processing, vol. 2006, pp. 1-16, 2006.