Biosignal and Biomedical Image Processing: MATLAB-Based Applications, by John L. Semmlow
Transcript

  • 1. Biosignal and Biomedical Image Processing: MATLAB-Based Applications. JOHN L. SEMMLOW, Robert Wood Johnson Medical School, New Brunswick, New Jersey, U.S.A., and Rutgers University, Piscataway, New Jersey, U.S.A.
  • 2. Although great care has been taken to provide accurate and current information, neither the author(s) nor the publisher, nor anyone else associated with this publication, shall be liable for any loss, damage, or liability directly or indirectly caused or alleged to be caused by this book. The material contained herein is not intended to provide specific advice or recommendations for any specific situation. Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data: A catalog record for this book is available from the Library of Congress. ISBN: 0-8247-4803-4. This book is printed on acid-free paper. Headquarters: Marcel Dekker, Inc., 270 Madison Avenue, New York, NY 10016, U.S.A.; tel: 212-696-9000; fax: 212-685-4540. Distribution and Customer Service: Marcel Dekker, Inc., Cimarron Road, Monticello, New York 12701, U.S.A.; tel: 800-228-1160; fax: 845-796-1772. Eastern Hemisphere Distribution: Marcel Dekker AG, Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland; tel: 41-61-260-6300; fax: 41-61-260-6333. World Wide Web: http://www.dekker.com. The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above. Copyright © 2004 by Marcel Dekker, Inc. All Rights Reserved. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher. Current printing (last digit): 10 9 8 7 6 5 4 3 2 1. PRINTED IN THE UNITED STATES OF AMERICA
  • 5. To Lawrence Stark, M.D., who has shown me the many possibilities . . .
  • 6. Series Introduction

Over the past 50 years, digital signal processing has evolved as a major engineering discipline. The fields of signal processing have grown from the origins of the fast Fourier transform and digital filter design to statistical spectral analysis and array processing; image, audio, and multimedia processing; and shaped developments in high-performance VLSI signal processor design. Indeed, there are few fields that enjoy so many applications: signal processing is everywhere in our lives. When one uses a cellular phone, the voice is compressed, coded, and modulated using signal processing techniques. As a cruise missile winds along hillsides searching for the target, the signal processor is busy processing the images taken along the way. When we are watching a movie in HDTV, millions of audio and video data points are being sent to our homes and received with unbelievable fidelity. When scientists compare DNA samples, fast pattern recognition techniques are being used. On and on, one can see the impact of signal processing in almost every engineering and scientific discipline.

Because of the immense importance of signal processing and the fast-growing demands of business and industry, this series on signal processing serves to report up-to-date developments and advances in the field. The topics of interest include, but are not limited to, the following:

• Signal theory and analysis
• Statistical signal processing
• Speech and audio processing
  • 7.
• Image and video processing
• Multimedia signal processing and technology
• Signal processing for communications
• Signal processing architectures and VLSI design

We hope this series will provide the interested audience with high-quality, state-of-the-art signal processing literature through research monographs, edited books, and rigorously written textbooks by experts in their fields.
  • 8. Preface

Signal processing can be broadly defined as the application of analog or digital techniques to improve the utility of a data stream. In biomedical engineering applications, improved utility usually means the data provide better diagnostic information. Analog techniques are applied to a data stream embodied as a time-varying electrical signal, while in the digital domain the data are represented as an array of numbers. This array could be the digital representation of a time-varying signal, or an image. This text deals exclusively with signal processing of digital data, although Chapter 1 briefly describes analog processes commonly found in medical devices.

This text should be of interest to a broad spectrum of engineers, but it is written specifically for biomedical engineers (also known as bioengineers). Although the applications are different, the signal processing methodology used by biomedical engineers is identical to that used by other engineers, such as electrical and communications engineers. The major difference for biomedical engineers is in the level of understanding required for appropriate use of this technology. An electrical engineer may be required to expand or modify signal processing tools, while for biomedical engineers, signal processing techniques are tools to be used. For the biomedical engineer, a detailed understanding of the underlying theory, while always of value, may not be essential. Moreover, considering the broad range of knowledge required to be effective in this field, encompassing both medical and engineering domains, an in-depth understanding of all of the useful technology is not realistic. It is important to know what
  • 9. tools are available, have a good understanding of what they do (if not how they do it), be aware of the most likely pitfalls and misapplications, and know how to implement these tools given available software packages. The basic concept of this text is that, just as the cardiologist can benefit from an oscilloscope-type display of the ECG without a deep understanding of electronics, so a biomedical engineer can benefit from advanced signal processing tools without always understanding the details of the underlying mathematics.

As a reflection of this philosophy, most of the concepts covered in this text are presented in two sections. The first part provides a broad, general understanding of the approach sufficient to allow intelligent application of the concepts. The second part describes how these tools can be implemented and relies primarily on the MATLAB software package and several of its toolboxes.

This text is written for a single-semester course combining signal and image processing. Classroom experience using notes from this text indicates that this ambitious objective is possible for most graduate formats, although eliminating a few topics may be desirable. For example, some of the introductory or basic material covered in Chapters 1 and 2 could be skipped or treated lightly for students with the appropriate prerequisites. In addition, topics such as advanced spectral methods (Chapter 5), time-frequency analysis (Chapter 6), wavelets (Chapter 7), advanced filters (Chapter 8), and multivariate analysis (Chapter 9) are pedagogically independent and can be covered as desired without affecting the other material.

Although much of the material covered here will be new to most students, the book is not intended as an "introductory" text since the goal is to provide a working knowledge of the topics presented without the need for additional course work. The challenge of covering a broad range of topics at a useful, working depth is motivated by current trends in biomedical engineering education, particularly at the graduate level, where a comprehensive education must be attained with a minimum number of courses. This has led to the development of "core" courses to be taken by all students. This text was written for just such a core course in the Graduate Program of Biomedical Engineering at Rutgers University. It is also quite suitable for an upper-level undergraduate course and would be of value for students in other disciplines who would benefit from a working knowledge of signal and image processing.

It would not be possible to cover such a broad spectrum of material to a depth that enables productive application without heavy reliance on MATLAB-based examples and problems. In this regard, the text assumes the student has some knowledge of MATLAB programming and has available the basic MATLAB software package including the Signal Processing and Image Processing Toolboxes. (MATLAB also produces a Wavelet Toolbox, but the section on wavelets is written so as not to require this toolbox, primarily to keep the number of required toolboxes to a minimum.) The problems are an essential part of
  • 10. this text and often provide a discovery-like experience regarding the associated topic. A few peripheral topics are introduced only through the problems. The code used for all examples is provided on the CD accompanying this text. Since many of the problems are extensions or modifications of examples given in the chapter, some of the coding time can be reduced by starting with the code of a related example. The CD also includes support routines and data files used in the examples and problems. Finally, the CD contains the code used to generate many of the figures. For instructors, there is a CD available that contains the problem solutions and PowerPoint presentations from each of the chapters. These presentations include figures, equations, and text slides related to each chapter. Presentations can be modified by the instructor as desired.

In addition to heavy reliance on MATLAB problems and examples, this text makes extensive use of simulated data. Except for the section on image processing, examples involving biological signals are rarely used. In my view, examples using biological signals provide motivation, but they are not generally very instructive. Given the wide range of material to be presented at a working depth, emphasis is placed on learning the tools of signal processing; motivation is left to the reader (or the instructor).

Organization of the text is straightforward. Chapters 1 through 4 are fairly basic. Chapter 1 covers topics related to analog signal processing and data acquisition while Chapter 2 includes topics that are basic to all aspects of signal and image processing. Chapters 3 and 4 cover classical spectral analysis and basic digital filtering, topics fundamental to any signal processing course. Advanced spectral methods, covered in Chapter 5, are important due to their widespread use in biomedical engineering. Chapter 6 and the first part of Chapter 7 cover topics related to spectral analysis when the signal's spectrum is varying in time, a condition often found in biological signals. Chapter 7 also covers both continuous and discrete wavelets, another popular technique used in the analysis of biomedical signals. Chapters 8 and 9 feature advanced topics. In Chapter 8, optimal and adaptive filters are covered; the latter's inclusion is motivated by the time-varying nature of many biological signals. Chapter 9 introduces multivariate techniques, specifically principal component analysis and independent component analysis, two analysis approaches that are experiencing rapid growth with regard to biomedical applications. The last four chapters cover image processing, with the first of these, Chapter 10, covering the conventions used by MATLAB's Image Processing Toolbox. Image processing is a vast area and the material covered here is limited primarily to areas associated with medical imaging: image acquisition (Chapter 13); image filtering, enhancement, and transformation (Chapter 11); and segmentation and registration (Chapter 12). Many of the chapters cover topics that can be adequately covered only in a book dedicated solely to these topics. In this sense, every chapter represents a serious compromise with respect to comprehensive coverage of the associated
  • 11. topics. My only excuse for any omissions is that classroom experience with this approach seems to work: students end up with a working knowledge of a vast array of signal and image processing tools. A few of the classic or major books on these topics are cited in an Annotated Bibliography at the end of the book. No effort has been made to construct an extensive bibliography or reference list since more current lists would be readily available on the Web.

TEXTBOOK PROTOCOLS

In most early examples that feature MATLAB code, the code is presented in full, while in the later examples some of the routine code (such as for plotting, display, and labeling operations) is omitted. Nevertheless, I recommend that students carefully label (and scale when appropriate) all graphs done in the problems. Some effort has been made to use consistent notation as described in Table 1. In general, lower-case letters n and k are used as data subscripts, and capital letters N and K are used to indicate the length (or maximum subscript value) of a data set. In two-dimensional data sets, lower-case letters m and n are used to indicate the row and column subscripts of an array, while capital letters M and N are used to indicate vertical and horizontal dimensions, respectively. The letter m is also used as the index of a variable produced by a transformation, or as an index indicating a particular member of a family of related functions.* While it is common to use brackets to enclose subscripts of discrete variables (i.e., x[n]), ordinary parentheses are used here. Brackets are reserved to indicate vectors (i.e., [x1, x2, x3, . . . ]) following MATLAB convention. Other notation follows standard conventions. Italics are used to introduce important new terms that should be incorporated into the reader's vocabulary. If the meaning of these terms is not obvious from their use, they are explained where they are introduced. All MATLAB commands, routines, variables, and code are shown in the Courier typeface. Single quotes are used to highlight MATLAB filenames or string variables. Textbook protocols are summarized in Table 1.

I wish to thank Susanne Oldham, who managed to edit this book and provided strong, continuing encouragement and support. I would also like to acknowledge the patience and support of Peggy Christ and Lynn Hutchings. Professor Shankar Muthu Krishnan of Singapore provided a very thoughtful critique of the manuscript which led to significant improvements. Finally, I thank my students who provided suggestions and whose enthusiasm for the material provided much needed motivation.

*For example, m would be used to indicate the harmonic number of a family of harmonically related sine functions; i.e., fm(t) = sin(2πmt).
  • 12. TABLE 1 Textbook Conventions

x(t), y(t): general functions of time, usually a waveform or signal
k, n: data indices, particularly for digitized time data
K, N: maximum index or size of a data set
x(n), y(n): waveform variables, usually digitized time variables (i.e., discrete variables)
m: index of a variable produced by a transformation, or the index specifying the member number of a family of functions (i.e., fm(t))
X(f), Y(f): frequency representation (complex) of a time function
X(m), Y(m): frequency representation (complex) of a discrete variable
h(t): impulse response of a linear system
h(n): discrete impulse response of a linear system
b(n): digital filter coefficients representing the numerator of the discrete transfer function; hence, the same as the impulse response
a(n): digital filter coefficients representing the denominator of the discrete transfer function
Courier font: MATLAB command, variable, routine, or program
'Courier font': MATLAB filename or string variable

John L. Semmlow
  • 13. Contents

Preface

1 Introduction
Typical Measurement Systems
Transducers
Further Study: The Transducer
Analog Signal Processing
Sources of Variability: Noise
Electronic Noise
Signal-to-Noise Ratio
Analog Filters: Filter Basics
Filter Types
Filter Bandwidth
Filter Order
Filter Initial Sharpness
Analog-to-Digital Conversion: Basic Concepts
Analog-to-Digital Conversion Techniques
Quantization Error
Further Study: Successive Approximation
Time Sampling: Basics
Further Study: Buffering and Real-Time Data Processing
  • 14. Data Banks
Problems

2 Basic Concepts
Noise
Ensemble Averaging
MATLAB Implementation
Data Functions and Transforms
Convolution, Correlation, and Covariance
Convolution and the Impulse Response
Covariance and Correlation
MATLAB Implementation
Sampling Theory and Finite Data Considerations
Edge Effects
Problems

3 Spectral Analysis: Classical Methods
Introduction
The Fourier Transform: Fourier Series Analysis
Periodic Functions
Symmetry
Discrete Time Fourier Analysis
Aperiodic Functions
Frequency Resolution
Truncated Fourier Analysis: Data Windowing
Power Spectrum
MATLAB Implementation
Direct FFT and Windowing
The Welch Method for Power Spectral Density Determination
Window Functions
Problems

4 Digital Filters
The Z-Transform
Digital Transfer Function
MATLAB Implementation
Finite Impulse Response (FIR) Filters
FIR Filter Design
  • 15. Derivative Operation: The Two-Point Central Difference Algorithm
MATLAB Implementation
Infinite Impulse Response (IIR) Filters
Filter Design and Application Using the MATLAB Signal Processing Toolbox
FIR Filters
Two-Stage FIR Filter Design
Three-Stage Filter Design
IIR Filters
Two-Stage IIR Filter Design
Three-Stage IIR Filter Design: Analog Style Filters
Problems

5 Spectral Analysis: Modern Techniques
Parametric Model-Based Methods
MATLAB Implementation
Non-Parametric Eigenanalysis Frequency Estimation
MATLAB Implementation
Problems

6 Time-Frequency Methods
Basic Approaches
Short-Term Fourier Transform: The Spectrogram
Wigner-Ville Distribution: A Special Case of Cohen's Class
Choi-Williams and Other Distributions
Analytic Signal
MATLAB Implementation
The Short-Term Fourier Transform
Wigner-Ville Distribution
Choi-Williams and Other Distributions
Problems

7 The Wavelet Transform
Introduction
The Continuous Wavelet Transform
Wavelet Time-Frequency Characteristics
MATLAB Implementation
  • 16. The Discrete Wavelet Transform
Filter Banks
The Relationship Between Analytical Expressions and Filter Banks
MATLAB Implementation
Denoising
Discontinuity Detection
Feature Detection: Wavelet Packets
Problems

8 Advanced Signal Processing Techniques: Optimal and Adaptive Filters
Optimal Signal Processing: Wiener Filters
MATLAB Implementation
Adaptive Signal Processing
Adaptive Noise Cancellation
MATLAB Implementation
Phase Sensitive Detection
AM Modulation
Phase Sensitive Detectors
MATLAB Implementation
Problems

9 Multivariate Analyses: Principal Component Analysis and Independent Component Analysis
Introduction
Principal Component Analysis
Order Selection
MATLAB Implementation
Data Rotation
Principal Component Analysis Evaluation
Independent Component Analysis
MATLAB Implementation
Problems

10 Fundamentals of Image Processing: MATLAB Image Processing Toolbox
Image Processing Basics: MATLAB Image Formats
General Image Formats: Image Array Indexing
  • 17. Data Classes: Intensity Coding Schemes
Data Formats
Data Conversions
Image Display
Image Storage and Retrieval
Basic Arithmetic Operations
Advanced Protocols: Block Processing
Sliding Neighborhood Operations
Distinct Block Operations
Problems

11 Image Processing: Filters, Transformations, and Registration
Spectral Analysis: The Fourier Transform
MATLAB Implementation
Linear Filtering
MATLAB Implementation
Filter Design
Spatial Transformations
MATLAB Implementation
Affine Transformations
General Affine Transformations
Projective Transformations
Image Registration
Unaided Image Registration
Interactive Image Registration
Problems

12 Image Segmentation
Pixel-Based Methods
Threshold Level Adjustment
MATLAB Implementation
Continuity-Based Methods
MATLAB Implementation
Multi-Thresholding
Morphological Operations
MATLAB Implementation
Edge-Based Segmentation
MATLAB Implementation
Problems
  • 18. 13 Image Reconstruction
CT, PET, and SPECT
Fan Beam Geometry
MATLAB Implementation
Radon Transform
Inverse Radon Transform: Parallel Beam Geometry
Radon and Inverse Radon Transform: Fan Beam Geometry
Magnetic Resonance Imaging
Basic Principles
Data Acquisition: Pulse Sequences
Functional MRI
MATLAB Implementation
Principal Component and Independent Component Analysis
Problems

Annotated Bibliography
  • 19. Annotated Bibliography

The following is a very selective list of books or articles that will be of value in providing greater depth and mathematical rigor to the material presented in this text. Comments regarding the particular strengths of each reference are included.

Akansu, A. N. and Haddad, R. A. Multiresolution Signal Decomposition: Transforms, Subbands, Wavelets. Academic Press, San Diego, CA, 1992. A modern classic that presents, among other things, some of the underlying theoretical aspects of wavelet analysis.

Aldroubi, A. and Unser, M. (eds.) Wavelets in Medicine and Biology, CRC Press, Boca Raton, FL, 1996. Presents a variety of applications of wavelet analysis to biomedical engineering.

Boashash, B. Time-Frequency Signal Analysis, Longman Cheshire Pty Ltd., 1992. Early chapters provide a very useful introduction to time-frequency analysis followed by a number of medical applications.

Boashash, B. and Black, P. J. An efficient real-time implementation of the Wigner-Ville distribution, IEEE Trans. Acoust. Speech Sig. Proc. ASSP-35:1611-1618, 1987. Practical information on calculating the Wigner-Ville distribution.

Boudreaux-Bartels, G. F. and Murry, R. Time-frequency signal representations for biomedical signals. In: The Biomedical Engineering Handbook, J. Bronzino (ed.), CRC Press, Boca Raton, FL and IEEE Press, Piscataway, NJ, 1995. This article presents an exhaustive, or very nearly so, compilation of Cohen's class of time-frequency distributions.

Bruce, E. N. Biomedical Signal Processing and Signal Modeling, John Wiley and Sons, New York, 2001. Rigorous treatment with more of an emphasis on linear systems than signal processing. Introduces nonlinear concepts such as chaos.

Cichocki, A. and Amari, S. Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications, John Wiley and Sons, New York, 2002. Rigorous, somewhat dense, treatment of a wide range of principal component and independent component approaches. Includes disk.

Cohen, L. Time-frequency distributions: A review. Proc. IEEE 77:941-981, 1989. Classic review article on the various time-frequency methods in Cohen's class of time-frequency distributions.

Ferrara, E. and Widrow, B. Fetal electrocardiogram enhancement by time-sequenced adaptive filtering. IEEE Trans. Biomed. Engr. BME-29:458-459, 1982. Early application of adaptive noise cancellation to a biomedical engineering problem by one of the founders of the field. See also Widrow below.

Friston, K. Statistical Parametric Mapping. Online at: http://www.fil.ion.ucl.ac.uk/spm/course/note02/ Thorough discussion of practical aspects of fMRI analysis including pre-processing, statistical methods, and experimental design. Based around the capabilities of the SPM analysis software.

Haykin, S. Adaptive Filter Theory (2nd ed.), Prentice-Hall, Englewood Cliffs, NJ, 1991. The definitive text on adaptive filters including Wiener filters and gradient-based algorithms.

Hyvärinen, A., Karhunen, J. and Oja, E. Independent Component Analysis, John Wiley and Sons, New York, 2001. Fundamental, comprehensive, yet readable book on independent component analysis. Also provides a good review of principal component analysis.

Hubbard, B. B. The World According to Wavelets (2nd ed.), A. K. Peters, Ltd., Natick, MA, 1998. Very readable introductory book on wavelets, including an excellent section on the Fourier transform. Can be read by a non-signal processing friend.

Ingle, V. K. and Proakis, J. G. Digital Signal Processing with MATLAB, Brooks/Cole, Pacific Grove, CA, 2000. Excellent treatment of classical signal processing methods including the Fourier transform and both FIR and IIR digital filters. Brief, but informative, section on adaptive filtering.

Jackson, J. E. A User's Guide to Principal Components, John Wiley and Sons, New York, 1991. Classic book providing everything you ever want to know about principal component analysis. Also covers linear modeling and introduces factor analysis.

Johnson, D. D. Applied Multivariate Methods for Data Analysis, Brooks/Cole, Pacific Grove, CA, 1988. Careful, detailed coverage of multivariate methods including principal components analysis. Good coverage of discriminant analysis techniques.

Kak, A. C. and Slaney, M. Principles of Computerized Tomographic Imaging. IEEE Press, New York, 1988. Thorough, understandable treatment of algorithms for reconstruction of tomographic images including both parallel and fan-beam geometry. Also includes techniques used in reflection tomography as occurs in ultrasound imaging.

Marple, S. L. Digital Spectral Analysis with Applications, Prentice-Hall, Englewood Cliffs, NJ, 1987. Classic text on modern spectral analysis methods. In-depth, rigorous treatment of the Fourier transform, parametric modeling methods (including AR and ARMA), and eigenanalysis-based techniques.

Rao, R. M. and Bopardikar, A. S. Wavelet Transforms: Introduction to Theory and Applications, Addison-Wesley, Reading, MA, 1998. Good development of wavelet analysis including both the continuous and discrete wavelet transforms.

Shiavi, R. Introduction to Applied Statistical Signal Analysis (2nd ed.), Academic Press, San Diego, CA, 1999. Emphasizes spectral analysis of signals buried in noise. Excellent coverage of Fourier analysis and autoregressive methods. Good introduction to statistical signal processing concepts.

Sonka, M., Hlavac, V. and Boyle, R. Image Processing, Analysis, and Machine Vision. Chapman and Hall Computing, London, 1993. A good description of edge-based and other segmentation methods.

Strang, G. and Nguyen, T. Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, 1997. Thorough coverage of wavelet filter banks including extensive mathematical background.

Stearns, S. D. and David, R. A. Signal Processing Algorithms in MATLAB, Prentice Hall, Upper Saddle River, NJ, 1996. Good treatment of the classical Fourier transform and digital filters. Also covers the LMS adaptive filter algorithm. Disk enclosed.

Wickerhauser, M. V. Adapted Wavelet Analysis from Theory to Software, A. K. Peters, Ltd. and IEEE Press, Wellesley, MA, 1994. Rigorous, extensive treatment of wavelet analysis.

Widrow, B. Adaptive noise cancelling: Principles and applications. Proc. IEEE 63:1692-1716, 1975. Classic original article on adaptive noise cancellation.

Wright, S. Nuclear Magnetic Resonance and Magnetic Resonance Imaging. In: Introduction to Biomedical Engineering (Enderle, Blanchard and Bronzino, eds.), Academic Press, San Diego, CA, 2000. Good mathematical development of the physics of MRI using classical concepts.
  • 22. 1 Introduction

TYPICAL MEASUREMENT SYSTEMS

A schematic representation of a typical biomedical measurement system is shown in Figure 1.1. Here we use the term measurement in the most general sense to include image acquisition or the acquisition of other forms of diagnostic information. The physiological process of interest is converted into an electric signal via the transducer (Figure 1.1).

FIGURE 1.1 Schematic representation of a typical bioengineering measurement system.
  • 23. Some analog signal processing is usually required, often including amplification and lowpass (or bandpass) filtering. Since most signal processing is easier to implement using digital methods, the analog signal is converted to digital format using an analog-to-digital converter. Once converted, the signal is often stored, or buffered, in memory to facilitate subsequent signal processing. Alternatively, in some real-time* applications, the incoming data must be processed as quickly as possible with minimal buffering, and may not need to be permanently stored. Digital signal processing algorithms can then be applied to the digitized signal. These signal processing techniques can take a wide variety of forms and various levels of sophistication, and they make up the major topic area of this book. Some sort of output is necessary in any useful system. This usually takes the form of a display, as in imaging systems, but may be some type of an effector mechanism such as in an automated drug delivery system.

With the exception of this chapter, this book is limited to digital signal and image processing concerns. To the extent possible, each topic is introduced with the minimum amount of information required to use and understand the approach, and enough information to apply the methodology in an intelligent manner. Understanding of strengths and weaknesses of the various methods is also covered, particularly through discovery in the problems at the end of the chapter. Hence, the problems at the end of each chapter, most of which utilize the MATLAB™ software package (The MathWorks, Natick, MA), constitute an integral part of the book: a few topics are introduced only in the problems.

A fundamental assumption of this text is that an in-depth mathematical treatment of signal processing methodology is not essential for effective and appropriate application of these tools. Thus, this text is designed to develop skills in the application of signal and image processing technology, but may not provide the skills necessary to develop new techniques and algorithms. References are provided for those who need to move beyond application of signal and image processing tools to the design and development of new methodology. In subsequent chapters, each major section is followed by a section on implementation using the MATLAB software package. Fluency with the MATLAB language is assumed and is essential for the use of this text. Where appropriate, a topic area may also include a more in-depth treatment including some of the underlying mathematics.

*Learning the vocabulary is an important part of mastering a discipline. In this text we highlight, using italics, terms commonly used in signal and image processing. Sometimes the highlighted term is described when it is introduced, but occasionally determination of its definition is left to the responsibility of the reader. Real-time processing and buffering are described in the section on analog-to-digital conversion.
  • 24. TRANSDUCERS

A transducer is a device that converts energy from one form to another. By this definition, a light bulb or a motor is a transducer. In signal processing applications, the purpose of energy conversion is to transfer information, not to transform energy as with a light bulb or a motor. In measurement systems, all transducers are so-called input transducers: they convert non-electrical energy into an electronic signal. An exception to this is the electrode, a transducer that converts electrical energy from ionic to electronic form. Usually, the output of a biomedical transducer is a voltage (or current) whose amplitude is proportional to the measured energy.

The energy that is converted by the input transducer may be generated by the physiological process itself, indirectly related to the physiological process, or produced by an external source. In the last case, the externally generated energy interacts with, and is modified by, the physiological process, and it is this alteration that produces the measurement. For example, when externally produced x-rays are transmitted through the body, they are absorbed by the intervening tissue, and a measurement of this absorption is used to construct an image. Many diagnostically useful imaging systems are based on this external energy approach.

In addition to passing external energy through the body, some images are generated using the energy of radioactive emissions of radioisotopes injected into the body. These techniques make use of the fact that selected, or tagged, molecules will collect in specific tissue. The areas where these radioisotopes collect can be mapped using a gamma camera or, with certain short-lived isotopes, better localized using positron emission tomography (PET).

Many physiological processes produce energy that can be detected directly. For example, cardiac internal pressures are usually measured using a pressure transducer placed on the tip of a catheter introduced into the appropriate chamber of the heart. The measurement of electrical activity in the heart, muscles, or brain provides other examples of the direct measurement of physiological energy. For these measurements, the energy is already electrical and only needs to be converted from ionic to electronic current using an electrode. These sources are usually given the term ExG, where the 'x' represents the physiological process that produces the electrical energy: ECG, electrocardiogram; EEG, electroencephalogram; EMG, electromyogram; EOG, electrooculogram; ERG, electroretinogram; and EGG, electrogastrogram. An exception to this terminology is the electrical activity generated by the skin, which is termed the galvanic skin response, GSR. Typical physiological energies and the applications that use these energy forms are shown in Table 1.1.

The biotransducer is often the most critical element in the system since it constitutes the interface between the subject or life process and the rest of the system.
  • 25. TABLE 1.1 Energy Forms and Related Direct Measurements

Mechanical (length, position, and velocity): muscle movement, cardiovascular pressures, muscle contractility
Mechanical (force and pressure): valve and other cardiac sounds
Heat: body temperature, thermography
Electrical: EEG, ECG, EMG, EOG, ERG, EGG, GSR
Chemical: ion concentrations

The transducer establishes the risk, or noninvasiveness, of the overall system. For example, an imaging system based on differential absorption of x-rays, such as a CT (computed tomography) scanner, is considered more invasive than an imaging system based on ultrasonic reflection, since CT uses ionizing radiation that may have an associated risk. (The actual risk of ionizing radiation is still an open question, and imaging systems based on x-ray absorption are considered minimally invasive.) Both ultrasound and x-ray imaging would be considered less invasive than, for example, monitoring internal cardiac pressures through cardiac catheterization, in which a small catheter is threaded into the heart chambers. Indeed, many of the outstanding problems in biomedical measurement, such as noninvasive measurement of internal cardiac pressures or the noninvasive measurement of intracranial pressure, await an appropriate (and undoubtedly clever) transducer mechanism.

Further Study: The Transducer

The transducer often establishes the major performance criterion of the system. In a later section, we list and define a number of criteria that apply to measurement systems; however, in practice, measurement resolution, and to a lesser extent bandwidth, are generally the two most important and troublesome measurement criteria. In fact, it is usually possible to trade off between these two criteria. Both of these criteria are usually established by the transducer. Hence, although it is not the topic of this text, good system design usually calls for care in the choice or design of the transducer element(s). An efficient, low-noise transducer design can often reduce the need for extensive subsequent signal processing and still produce a better measurement.

Input transducers use one of two different fundamental approaches: the input energy causes the transducer element to generate a voltage or current, or the input energy creates a change in the electrical properties (i.e., the resistance, inductance, or capacitance) of the transducer element. Most optical transducers
  • 26. use the first approach. Photons strike a photosensitive material, producing free electrons (or holes) that can then be detected as an external current flow. Piezoelectric devices used in ultrasound also generate a charge when under mechanical stress. Many examples can be found of the use of the second category, a change in some electrical property. For example, metals (and semiconductors) undergo a consistent change in resistance with changes in temperature, and most temperature transducers utilize this feature. Other examples include the strain gage, which measures mechanical deformation using the small change in resistance that occurs when the sensing material is stretched.

Many critical problems in medical diagnosis await the development of new approaches and new transducers. For example, coronary artery disease is a major cause of death in developed countries, and its treatment would greatly benefit from early detection. To facilitate early detection, a biomedical instrumentation system is required that is inexpensive and easy to operate so that it could be used for general screening. In coronary artery disease, blood flow to the arteries of the heart (i.e., coronaries) is reduced due to partial or complete blockage (i.e., stenoses). One conceptually simple and inexpensive approach is to detect the sounds generated by turbulent blood flow through partially occluded coronary arteries (called bruits when detected in other arteries such as the carotids). This approach requires highly sensitive transducers, in this case a cardiac microphone, as well as advanced signal processing methods. Results of efforts based on this approach are ongoing, and the problem of noninvasive detection of coronary artery disease is not yet fully solved.

Other holy grails of diagnostic cardiology include noninvasive measurement of cardiac output (i.e., the volume of blood pumped by the heart per unit time) and noninvasive measurement of internal cardiac pressures. The former has been approached using Doppler ultrasound, but this technique has not yet been accepted as reliable. Financial gain and modest fame await the biomedical engineer who develops instrumentation that adequately addresses any of these three outstanding measurement problems.

ANALOG SIGNAL PROCESSING

While the most extensive signal processing is usually performed on digitized data using algorithms implemented in software, some analog signal processing is usually necessary. The first analog stage depends on the basic transducer operation. If the transducer is based on a variation in electrical property, the first stage must convert that variation in electrical property into a variation in voltage. If the transducer element is single ended, i.e., only one element changes, then a constant current source can be used and the detector equation follows Ohm's law:
  • 27. Vout = I(Z + ΔZ), where ΔZ = f(input energy) (1)

Figure 1.2 shows an example of a single transducer element used in an operational amplifier circuit that provides constant current operation. The transducer element in this case is a thermistor, an element that changes its resistance with temperature. Using circuit analysis, it is easy to show that the thermistor is driven by a constant current of VS/R amps. The output, Vout, is [(RT + ΔRT)/R]VS. Alternatively, an approximate constant current source can be generated using a voltage source and a large series resistor, RS, where RS >> ΔR.

If the transducer can be configured differentially, so that one element increases with increasing input energy while the other element decreases, the bridge circuit is commonly used as a detector. Figure 1.3 shows a device made to measure intestinal motility using strain gages. A bridge circuit detector is used in conjunction with a pair of differentially configured strain gages: when the intestine contracts, the end of the cantilever beam moves downward and the upper strain gage (visible) is stretched and increases in resistance, while the lower strain gage (not visible) compresses and decreases in resistance. The output of the bridge circuit can be found from simple circuit analysis to be Vout = VS ΔR/2R, where VS is the value of the source voltage. If the transducer operates based on a change in inductance or capacitance, the above techniques are still useful, except that a sinusoidal voltage source must be used.

If the transducer element is a voltage generator, the first stage is usually an amplifier. If the transducer produces a current output, as is the case in many electromagnetic detectors, then a current-to-voltage amplifier (also termed a transconductance amplifier) is used to produce a voltage output.

FIGURE 1.2 A thermistor (a semiconductor that changes resistance as a function of temperature) used in a constant current configuration.
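These detector relationships are easy to check numerically. The MATLAB sketch below evaluates the constant-current thermistor output and the differential bridge output; all component values and variable names (Vs, R, RT, dRT) are illustrative assumptions, not values from the text.

    % Sketch of the constant-current and bridge detector equations.
    % All component values below are illustrative assumptions.
    Vs  = 5;        % source voltage (volts)
    R   = 10e3;     % fixed resistor (ohms)
    RT  = 10e3;     % thermistor resistance at the operating point (ohms)
    dRT = -200;     % assumed change in thermistor resistance (ohms)

    I_const = Vs/R;                 % constant current through the thermistor (amps)
    Vout_cc = ((RT + dRT)/R)*Vs;    % op amp output, [(RT + dRT)/R]Vs
    Vout_br = Vs*dRT/(2*R);         % differential bridge output, Vs dR/2R
    fprintf('I = %.1e A, Vout = %.3f V (constant current), %.4f V (bridge)\n', ...
            I_const, Vout_cc, Vout_br)

Note that the bridge output is zero when dRT is zero, which is one reason the bridge configuration is attractive for detecting small differential changes.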
  • 28. FIGURE 1.3 A strain gage probe used to measure motility of the intestine. The bridge circuit is used to convert differential change in resistance from a pair of strain gages into a change in voltage.

Figure 1.4 shows a photodiode transducer used with a transconductance amplifier. The output voltage is proportional to the current through the photodiode: Vout = Rf Idiode. Bandwidth can be increased at the expense of added noise by reverse biasing the photodiode with a small voltage.* More sophisticated detection systems such as phase sensitive detectors (PSD) can be employed in some cases to improve noise rejection. A software implementation of PSD is described in Chapter 8. In a few circumstances, additional amplification beyond the first stage may be required.

SOURCES OF VARIABILITY: NOISE

In this text, noise is a very general and somewhat relative term: noise is what you do not want and signal is what you do want. Noise is inherent in most measurement systems and is often the limiting factor in the performance of a medical instrument. Indeed, many signal processing techniques are motivated by the desire to minimize the variability in the measurement.

*A bias voltage improves the movement of charge through the diode, decreasing the response time. Bias voltages from −10 to −50 volts are used, except in the case of avalanche photodiodes, where a higher voltage is required.
  • 29. FIGURE 1.4 Photodiode used in a transconductance amplifier.

In biomedical measurements, variability has four different origins: (1) physiological variability; (2) environmental noise or interference; (3) transducer artifact; and (4) electronic noise. Physiological variability is due to the fact that the information you desire is based on a measurement subject to biological influences other than those of interest. For example, assessment of respiratory function based on the measurement of blood pO2 could be confounded by other physiological mechanisms that alter blood pO2. Physiological variability can be a very difficult problem to solve, sometimes requiring a totally different approach.

Environmental noise can come from sources external or internal to the body. A classic example is the measurement of fetal ECG, where the desired signal is corrupted by the mother's ECG. Since it is not possible to describe the specific characteristics of environmental noise, typical noise reduction techniques such as filtering are not usually successful. Sometimes environmental noise can be reduced using adaptive techniques such as those described in Chapter 8, since these techniques do not require prior knowledge of noise characteristics. Indeed, one of the approaches described in Chapter 8, adaptive noise cancellation, was initially developed to reduce the interference from the mother in the measurement of fetal ECG.

Transducer artifact is produced when the transducer responds to energy modalities other than that desired. For example, recordings of electrical potentials using electrodes placed on the skin are sensitive to motion artifact, where the electrodes respond to mechanical movement as well as the desired electrical signal. Transducer artifacts can sometimes be successfully addressed by modifications in transducer design. Aerospace research has led to the development of electrodes that are quite insensitive to motion artifact.
  • 30. Unlike the other sources of variability, electronic noise has well-known sources and characteristics. Electronic noise falls into two broad classes: thermal or Johnson noise, and shot noise. The former is produced primarily in resistors or resistance materials while the latter is related to voltage barriers associated with semiconductors. Both sources produce noise with a broad range of frequencies, often extending from DC to 10^12 to 10^13 Hz. Such broad-spectrum noise is referred to as white noise since it contains energy at all frequencies (or at least all the frequencies of interest to biomedical engineers). Figure 1.5 shows a plot of power density versus frequency for white noise calculated from a noise waveform (actually an array of random numbers) using the spectral analysis methods described in Chapter 3. Note that its energy is fairly constant across the spectral range.

The various sources of noise or variability along with their causes and possible remedies are presented in Table 1.2 below. Note that in three out of four instances, appropriate transducer design was useful in the reduction of the variability or noise.

FIGURE 1.5 Power density (power spectrum) of digitized white noise showing a fairly constant value over frequency.
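A plot like Figure 1.5 can be generated in a few lines of MATLAB. This is only a sketch that substitutes a plain FFT for the spectral methods of Chapter 3; the sample frequency and data length are arbitrary assumptions.

    % Sketch of Figure 1.5: power spectrum of digitized white noise.
    fs = 1000;                 % assumed sample frequency (Hz)
    N  = 1024;                 % assumed number of data points
    x  = randn(1,N);           % white noise: an array of random numbers
    X  = fft(x);               % discrete Fourier transform
    PS = abs(X).^2/N;          % power spectrum
    f  = (0:N/2-1)*fs/N;       % frequency vector (positive frequencies only)
    plot(f, PS(1:N/2))
    xlabel('Frequency (Hz)'); ylabel('Power spectrum')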
  • 31. TABLE 1.2 Sources of Variability

Physiological variability. Cause: measurement only indirectly related to the variable of interest. Potential remedy: modify the overall approach.
Environmental (internal or external). Cause: other sources of similar energy form. Potential remedy: noise cancellation; transducer design.
Artifact. Cause: transducer responds to other energy sources. Potential remedy: transducer design.
Electronic. Cause: thermal or shot noise. Potential remedy: transducer or electronic design.

This demonstrates the important role of the transducer in the overall performance of the instrumentation system.

Electronic Noise

Johnson or thermal noise is produced by resistance sources, and the amount of noise generated is related to the resistance and to the temperature:

VJ = √(4kTRB) volts (2)

where R is the resistance in ohms, T the temperature in degrees Kelvin, and k is Boltzmann's constant (k = 1.38 × 10^-23 J/K).* B is the bandwidth, or range of frequencies, that is allowed to pass through the measurement system. The system bandwidth is determined by the filter characteristics in the system, usually the analog filtering in the system (see the next section). If noise current is of interest, the equation for Johnson noise current can be obtained from Eq. (2) in conjunction with Ohm's law:

IJ = √(4kTB/R) amps (3)

Since Johnson noise is spread evenly over all frequencies (at least in theory), it is not possible to calculate a noise voltage or current without specifying B, the frequency range. Since the bandwidth is not always known in advance, it is common to describe a relative noise; specifically, the noise that would occur if the bandwidth were 1.0 Hz. Such a relative noise specification can be identified by the unusual units required: volts/√Hz or amps/√Hz.

*A temperature of 310 K is often used as room temperature, in which case 4kT = 1.7 × 10^-20 J.
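Equations (2) and (3) translate directly into MATLAB. In this sketch the resistance, temperature, and bandwidth values are illustrative assumptions, not values from the text.

    % Johnson noise voltage and current, Eqs. (2) and (3).
    k = 1.38e-23;              % Boltzmann's constant (J/K)
    T = 310;                   % temperature (K)
    R = 10e3;                  % assumed resistance (ohms)
    B = 1000;                  % assumed bandwidth (Hz)
    Vj = sqrt(4*k*T*R*B);      % noise voltage in volts RMS, Eq. (2)
    Ij = sqrt(4*k*T*B/R);      % noise current in amps RMS, Eq. (3)
    Vj_rel = sqrt(4*k*T*R);    % relative noise for B = 1 Hz, in volts/sqrt(Hz)
    fprintf('Vj = %.3g V, Ij = %.3g A, relative: %.3g V/sqrt(Hz)\n', Vj, Ij, Vj_rel)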
  • 32. Shot noise is defined as a current noise and is proportional to the baseline current through a semiconductor junction:

IS = √(2q Id B) amps (4)

where q is the charge on an electron (1.602 × 10^-19 coulomb), and Id is the baseline semiconductor current. In photodetectors, the baseline current that generates shot noise is termed the dark current, hence the symbol Id in Eq. (4). Again, since the noise is spread across all frequencies, the bandwidth, B, must be specified to obtain a specific value, or a relative noise can be specified in amps/√Hz.

When multiple noise sources are present, as is often the case, their voltage or current contributions to the total noise add as the square root of the sum of the squares, assuming that the individual noise sources are independent. For voltages:

VT = (V1^2 + V2^2 + V3^2 + . . . + VN^2)^1/2 (5)

A similar equation applies to current. Noise properties are discussed further in Chapter 2.

Signal-to-Noise Ratio

Most waveforms consist of signal plus noise mixed together. As noted previously, signal and noise are relative terms, relative to the task at hand: the signal is that portion of the waveform of interest while the noise is everything else. Often the goal of signal processing is to separate out signal from noise, to identify the presence of a signal buried in noise, or to detect features of a signal buried in noise.

The relative amount of signal and noise present in a waveform is usually quantified by the signal-to-noise ratio, SNR. As the name implies, this is simply the ratio of signal to noise, both measured in RMS (root-mean-squared) amplitude. The SNR is often expressed in "db" (short for decibels) where:

SNR = 20 log (Signal/Noise) (6)

To convert from the db scale to a linear scale:

SNRlinear = 10^(db/20) (7)

For example, a ratio of 20 db means that the RMS value of the signal is 10 times the RMS value of the noise (10^(20/20) = 10), +3 db indicates a ratio of 1.414 (10^(3/20) = 1.414), 0 db means the signal and noise are equal in RMS value,
  • 33. −3 db means that the ratio is 1/1.414, and −20 db means the signal is 1/10 of the noise in RMS units. Figure 1.6 shows a sinusoidal signal with various amounts of white noise. Note that it is difficult to detect the presence of the signal visually when the SNR is −3 db, and impossible when the SNR is −10 db. The ability to detect signals with low SNR is the goal and motivation for many of the signal processing tools described in this text.

ANALOG FILTERS: FILTER BASICS

The analog signal processing circuitry shown in Figure 1.1 will usually contain some filtering, both to remove noise and to appropriately condition the signal for analog-to-digital conversion (ADC).

FIGURE 1.6 A 30 Hz sine wave with varying amounts of added noise. The sine wave is barely discernible when the SNR is −3 db and not visible when the SNR is −10 db.
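The shot noise, noise-addition, and SNR relationships of Eqs. (4) through (7), along with a test signal like that of Figure 1.6, can be sketched in MATLAB as follows. The dark current, bandwidth, example noise voltages, and requested SNR are all illustrative assumptions.

    % Shot noise, Eq. (4), and RSS addition of independent sources, Eq. (5).
    q  = 1.602e-19;                     % charge on an electron (coulombs)
    Id = 1e-9;                          % assumed dark current (amps)
    B  = 1000;                          % assumed bandwidth (Hz)
    Is = sqrt(2*q*Id*B);                % shot noise current (amps RMS)
    Vt = sqrt(1e-4^2 + 2e-4^2);         % total of two assumed voltage noise sources

    % A 30 Hz sine wave plus white noise scaled to a requested SNR (Figure 1.6).
    fs = 1000; t = (0:fs-1)/fs;         % 1 second of data, assumed sample rate
    sig = sin(2*pi*30*t);               % the signal
    noise = randn(size(t));             % unscaled white noise
    rms = @(v) sqrt(mean(v.^2));        % RMS value of a waveform
    snr_db = -3;                        % requested SNR in db
    noise = noise*(rms(sig)/rms(noise))/10^(snr_db/20);  % Eq. (7) used as a scale
    x = sig + noise;                    % the noisy waveform
    snr_check = 20*log10(rms(sig)/rms(noise));           % Eq. (6); returns -3 db
    plot(t, x); xlabel('Time (sec)'); ylabel('x(t)')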
  • 34. It is this filtering that usually establishes the bandwidth of the system for noise calculations [the bandwidth used in Eqs. (2)-(4)]. As shown later, accurate conversion of the analog signal to digital format requires that the signal contain frequencies no greater than 1/2 the sampling frequency. This rule applies to the analog waveform as a whole, not just the signal of interest. Since all transducers and electronics produce some noise, and since this noise contains a wide range of frequencies, analog lowpass filtering is usually essential to limit the bandwidth of the waveform to be converted. Waveform bandwidth and its impact on ADC will be discussed further in Chapter 2.

Filters are defined by several properties: filter type, bandwidth, and attenuation characteristics. The last can be divided into initial and final characteristics. Each of these properties is described and discussed in the next section.

Filter Types

Analog filters are electronic devices that remove selected frequencies. Filters are usually termed according to the range of frequencies they do not suppress. Thus, lowpass filters allow low frequencies to pass with minimum attenuation while higher frequencies are attenuated. Conversely, highpass filters pass high frequencies, but attenuate low frequencies. Bandpass filters reject frequencies above and below a passband region. An exception to this terminology is the bandstop filter, which passes frequencies on either side of a range of attenuated frequencies.

Within each class, filters are also defined by the frequency ranges that they pass, termed the filter bandwidth, and the sharpness with which they increase (or decrease) attenuation as frequency varies. Spectral sharpness is specified in two ways: as an initial sharpness in the region where attenuation first begins, and as a slope further along the attenuation curve. These various filter properties are best described graphically in the form of a frequency plot (sometimes referred to as a Bode plot), a plot of filter gain against frequency. Filter gain is simply the ratio of the output voltage divided by the input voltage, Vout/Vin, often taken in db. Technically this ratio should be defined for all frequencies for which it is nonzero, but practically it is usually stated only for the frequency range of interest. To simplify the shape of the resultant curves, frequency plots sometimes plot gain in db against the log of frequency.* When the output/input ratio is given analytically as a function of frequency, it is termed the transfer function. Hence, the frequency plot of a filter's output/input relationship can be viewed as a graphical representation of the transfer function.

*When gain is plotted in db, it is in logarithmic form, since the db operation involves taking the log [Eq. (6)]. Plotting gain in db against log frequency puts the two variables in similar metrics and results in straighter line plots.
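A frequency plot is easy to construct for a simple one-pole lowpass transfer function, G(f) = 1/(1 + jf/fc). The sketch below assumes a cutoff of 100 Hz (an arbitrary choice) and plots gain in db against log frequency, as described above.

    % Frequency (Bode) plot of a one-pole lowpass transfer function.
    fc = 100;                           % assumed cutoff frequency (Hz)
    f  = logspace(0,4,200);             % 1 Hz to 10 kHz, logarithmically spaced
    G  = 1./(1 + 1j*f/fc);              % complex gain, Vout/Vin
    semilogx(f, 20*log10(abs(G)))       % gain in db against log frequency
    xlabel('Frequency (Hz)'); ylabel('Gain (db)')

At f = fc the gain of this transfer function is −3 db, the half-power point used below to define filter bandwidth.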
  • 35. Frequency plots for several different filter types are shown in Figure 1.7.

Filter Bandwidth

The bandwidth of a filter is defined by the range of frequencies that are not attenuated. These unattenuated frequencies are also referred to as passband frequencies. Figure 1.7A shows the frequency plot of an ideal filter, a filter that has a perfectly flat passband region and an infinite attenuation slope. Real filters may indeed be quite flat in the passband region, but will attenuate with a more gentle slope, as shown in Figure 1.7B.

FIGURE 1.7 Frequency plots of ideal and realistic filters. The frequency plots shown here have a linear vertical axis, but often the vertical axis is plotted in db. The horizontal axis is in log frequency. (A) Ideal lowpass filter. (B) Realistic lowpass filter with a gentle attenuation characteristic. (C) Realistic lowpass filter with a sharp attenuation characteristic. (D) Bandpass filter.
• 36. more gentle slope, as shown in Figure 1.7B. In the case of the ideal filter, Figure 1.7A, the bandwidth or region of unattenuated frequencies is easy to determine; specifically, it is between 0.0 and the sharp attenuation at fc Hz. When the attenuation begins gradually, as in Figure 1.7B, defining the passband region is problematic. To specify the bandwidth in this filter we must identify a frequency that defines the boundary between the attenuated and non-attenuated portions of the frequency characteristic. This boundary has been somewhat arbitrarily defined as the frequency at which the attenuation is 3 db.* In Figure 1.7B, the filter would have a bandwidth of 0.0 to fc Hz, or simply fc Hz. The filter in Figure 1.7C has a sharper attenuation characteristic, but still has the same bandwidth (fc Hz). The bandpass filter of Figure 1.7D has a bandwidth of fh − fl Hz. Filter Order The slope of a filter’s attenuation curve is related to the complexity of the filter: more complex filters have a steeper slope that better approaches the ideal. In analog filters, complexity is proportional to the number of energy storage elements in the circuit (which could be either inductors or capacitors, but are generally capacitors for practical reasons). Using standard circuit analysis, it can be shown that each energy storage device leads to an additional order in the polynomial of the denominator of the transfer function that describes the filter. (The denominator of the transfer function is also referred to as the characteristic equation.) As with any polynomial equation, the number of roots of this equation will depend on the order of the equation; hence, filter complexity (i.e., the number of energy storage devices) is equivalent to the number of roots in the denominator of the transfer function. In electrical engineering, it has long been common to call the roots of the denominator equation poles. Thus, the complexity of the filter is also equivalent to the number of poles in the transfer function. For example, a second-order or two-pole filter has a transfer function with a second-order polynomial in the denominator and would contain two independent energy storage elements (very likely two capacitors). Applying asymptote analysis to the transfer function, it is not difficult to show that the slope of a second-order lowpass filter (the slope for frequencies much greater than the cutoff frequency, fc) is 40 db/decade, specified in log-log terms. (The unusual units, db/decade, are a result of the log-log nature of the typical frequency plot.) That is, the attenuation of this filter increases linearly on a log-log scale by 40 db (a factor of 100 on a linear scale) for every order of magnitude increase in frequency. Generalizing, for each filter pole (or order) *This defining point is not entirely arbitrary because when the signal is attenuated 3 db, its amplitude is 0.707 (10^−3/20) of what it was in the passband region and it has half the power of the unattenuated signal (since 0.707^2 ≈ 1/2). Accordingly this point is also known as the half-power point.
• 37. the downward slope (sometimes referred to as the rolloff) is increased by 20 db/decade. Figure 1.8 shows the frequency plot of a second-order (two-pole with a slope of 40 db/decade) and a 12th-order lowpass filter, both having the same cutoff frequency, fc, and hence, the same bandwidth. The steeper slope or rolloff of the 12-pole filter is apparent. FIGURE 1.8 Frequency plot of a second-order (2-pole) and a 12th-order lowpass filter with the same cutoff frequency. The higher order filter more closely approaches the sharpness of an ideal filter. In principle, a 12-pole lowpass filter would have a slope of 240 db/decade (12 × 20 db/decade). In fact, this frequency characteristic is theoretical because in real analog filters parasitic components and inaccuracies in the circuit elements limit the actual attenuation that can be obtained. The same rationale applies to highpass filters except that the frequency plot decreases with decreasing frequency at a rate of 20 db/decade for each highpass filter pole. Filter Initial Sharpness As shown in Figure 1.8, both the slope and the initial sharpness increase with filter order (number of poles), but increasing filter order also increases the
• 38. complexity, hence the cost, of the filter. It is possible to increase the initial sharpness of the filter’s attenuation characteristics without increasing the order of the filter, if you are willing to accept some unevenness, or ripple, in the passband. Figure 1.9 shows two lowpass, 4th-order filters, differing in the initial sharpness of the attenuation. The one marked Butterworth has a smooth passband, but the initial attenuation is not as sharp as the one marked Chebychev, which has a passband that contains ripples. This property of analog filters is also seen in digital filters and will be discussed in detail in Chapter 4. FIGURE 1.9 Two filters having the same order (4-pole) and cutoff frequency, but differing in the sharpness of the initial slope. The filter marked Chebychev has a steeper initial slope or rolloff, but contains ripples in the passband.
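The trade-off shown in Figure 1.9 is easy to reproduce. The short sketch below, which assumes the MATLAB Signal Processing Toolbox routines butter, cheby1, and freqs are available, generates frequency plots of two 4-pole analog lowpass filters having the same 100 Hz cutoff; the 0.5 db ripple value is an arbitrary choice made only for this illustration.
% Compare 4-pole Butterworth and Chebychev lowpass filters
% (a sketch; assumes the Signal Processing Toolbox)
fc = 100; % Cutoff frequency in Hz
[b1,a1] = butter(4, 2*pi*fc, 's'); % 4-pole analog Butterworth
[b2,a2] = cheby1(4, 0.5, 2*pi*fc, 's'); % 4-pole Chebychev, 0.5 db ripple
f = logspace(0, 4, 500); % Frequency vector: 1 Hz to 10 kHz
H1 = freqs(b1, a1, 2*pi*f); % Complex frequency responses
H2 = freqs(b2, a2, 2*pi*f);
semilogx(f, 20*log10(abs(H1)), f, 20*log10(abs(H2)), '--');
xlabel('Frequency (Hz)'); ylabel('Gain (db)');
legend('Butterworth', 'Chebychev (0.5 db ripple)');
The Chebychev trace shows the steeper initial rolloff, while zooming in on the passband reveals its 0.5 db ripple; the Butterworth passband remains smooth.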
• 39. ANALOG-TO-DIGITAL CONVERSION: BASIC CONCEPTS The last analog element in a typical measurement system is the analog-to-digital converter (ADC), Figure 1.1. As the name implies, this electronic component converts an analog voltage to an equivalent digital number. In the process of analog-to-digital conversion an analog or continuous waveform, x(t), is converted into a discrete waveform, x(n), a function of real numbers that is defined only at discrete integers, n. To convert a continuous waveform to digital format requires slicing the signal in two ways: slicing in time and slicing in amplitude (Figure 1.10). FIGURE 1.10 Converting a continuous signal (solid line) to discrete format requires slicing the signal in time and amplitude. The result is a series of discrete points (X’s) that approximate the original signal. Slicing the signal into discrete points in time is termed time sampling or simply sampling. Time slicing samples the continuous waveform, x(t), at discrete points in time, nTs, where Ts is the sample interval. The consequences of time slicing are discussed in the next chapter. The same concept can be applied to images wherein a continuous image such as a photograph that has intensities that vary continuously across spatial distance is sampled at distances of S mm. In this case, the digital representation of the image is a two-dimensional array. The consequences of spatial sampling are discussed in Chapter 11. Since the binary output of the ADC is a discrete integer while the analog signal has a continuous range of values, analog-to-digital conversion also requires the analog signal to be sliced into discrete levels, a process termed quantization, Figure 1.10. The equivalent number can only approximate the level of
• 40. the analog signal, and the degree of approximation will depend on the range of binary numbers and the amplitude of the analog signal. For example, if the output of the ADC is an 8-bit binary number capable of 2^8, or 256, discrete states, and the input amplitude range is 0.0–5.0 volts, then the quantization interval will be 5/256 or 0.0195 volts. If, as is usually the case, the analog signal is time varying in a continuous manner, it must be approximated by a series of binary numbers representing the approximate analog signal level at discrete points in time (Figure 1.10). The errors associated with amplitude slicing, or quantization, are described in the next section, and the potential error due to sampling is covered in Chapter 2. The remainder of this section briefly describes the hardware used to achieve this approximate conversion. Analog-to-Digital Conversion Techniques Various conversion rules have been used, but the most common is to convert the voltage into a proportional binary number. Different approaches can be used to implement the conversion electronically; the most common is the successive approximation technique described at the end of this section. ADC’s differ in conversion range, speed of conversion, and resolution. The range of analog voltages that can be converted is frequently software selectable, and may, or may not, include negative voltages. Typical ranges are from 0.0–10.0 volts or less, or, if negative values are possible, ±5.0 volts or less. The speed of conversion is specified in terms of samples per second, or conversion time. For example, an ADC with a conversion time of 10 µsec should, logically, be able to operate at up to 100,000 samples per second (or simply 100 kHz). Typical conversion rates run up to 500 kHz for moderate cost converters, but off-the-shelf converters can be obtained with rates up to 10–20 MHz. Except for image processing systems, lower conversion rates are usually acceptable for biological signals. Even image processing systems may use downsampling techniques to reduce the required ADC conversion rate and, hence, the cost. A typical ADC system involves several components in addition to the actual ADC element, as shown in Figure 1.11. The first element is an N-to-1 analog switch that allows multiple input channels to be converted. Typical ADC systems provide up to 8 to 16 channels, and the switching is usually software-selectable. Since a single ADC is doing the conversion for all channels, the conversion rate for any given channel is reduced in proportion to the number of channels being converted. Hence, an ADC system with a converter element that has a conversion rate of 50 kHz would be able to sample each of eight channels at a theoretical maximum rate of 50/8 = 6.25 kHz. The Sample and Hold is a high-speed switch that momentarily records the input signal, and retains that signal value at its output. The time the switch is closed is termed the aperture time. Typical values range around 150 ns, and, except for very fast signals, can be considered basically instantaneous.
• 41. FIGURE 1.11 Block diagram of a typical analog-to-digital conversion system. This instantaneously sampled voltage value is held (as a charge on a capacitor) while the ADC element determines the equivalent binary number. Again, it is the ADC element that determines the overall speed of the conversion process. Quantization Error Resolution is given in terms of the number of bits in the binary output with the assumption that the least significant bit (LSB) in the output is accurate (which may not always be true). Typical converters feature 8-, 12-, and 16-bit output with 12 bits presenting a good compromise between conversion resolution and cost. In fact, most signals do not have a sufficient signal-to-noise ratio to justify a higher resolution; you are simply obtaining a more accurate conversion of the noise. For example, assuming that converter resolution is equivalent to the LSB, then the minimum voltage that can be resolved is the same as the quantization voltage described above: the voltage range divided by 2^N, where N is the number of bits in the binary output. The resolution of a 5-volt, 12-bit ADC is 5.0/2^12 = 5/4096 = 0.0012 volts. The dynamic range of a 12-bit ADC, the range from the smallest to the largest voltage it can convert, is from 0.0012 to 5 volts: in db this is 20 * log(4096) ≈ 72 db. Since typical signals, especially those of biological origin, have dynamic ranges rarely exceeding 60 to 80 db, a 12-bit converter is usually more than adequate. Moreover, having this extra resolution means that not all of the range need be used, and since 12-bit ADC’s are only marginally more expensive than 8-bit ADC’s they are often used even when an 8-bit ADC (with a dynamic range of 48 db) would be adequate. A 12-bit output does require two bytes to store and will double the memory requirements over an 8-bit ADC.
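These resolution and dynamic range figures, along with the quantization noise model introduced next, are easy to check numerically. The sketch below uses only basic MATLAB and assumes the 0–5 volt, 12-bit converter of the text; the random test signal is an arbitrary choice made for illustration.
% Resolution, dynamic range, and quantization noise of an N-bit ADC
% (a sketch of the calculations in the text; basic MATLAB only)
Vmax = 5.0; % Input range: 0 to 5 volts
N = 12; % Converter resolution in bits
delta = Vmax/2^N % Quantization interval: approx. 0.0012 volts
DR = 20*log10(2^N) % Dynamic range: approx. 72 db
x = Vmax*rand(1,100000); % Random test signal spanning the full range
xq = delta*round(x/delta); % Quantize to the nearest level
e = x - xq; % Quantization error
var(e) % Should be close to delta^2/12, as given by Eq. (8) below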
• 42. The number of bits used for conversion sets a lower limit on the resolution, and also determines the quantization error (Figure 1.12). FIGURE 1.12 Quantization (amplitude slicing) of a continuous waveform. The lower trace shows the error between the quantized signal and the input. This error can be thought of as a noise process added to the signal. If a sufficient number of quantization levels exist (say, more than 64), the distortion produced by quantization error may be modeled as additive independent white noise with zero mean and with a variance determined by the quantization step size, δ = VMax/2^N. Assuming that the error is uniformly distributed between −δ/2 and +δ/2, the variance, σ^2, is: σ^2 = ∫ from −δ/2 to +δ/2 (η^2/δ) dη = δ^2/12 = VMax^2 (2^−2N)/12 (8) Assuming a uniform distribution, the RMS value of the quantization noise is simply the standard deviation, σ. Further Study: Successive Approximation The most popular analog-to-digital converters use a rather roundabout strategy to find the binary number most equivalent to the input analog voltage—a digital-to-analog converter (DAC) is placed in a feedback loop. As shown in Figure 1.13, an initial binary number stored in the buffer is fed to a DAC to produce a
• 43. proportional voltage, VDAC. FIGURE 1.13 Block diagram of an analog-to-digital converter. The input analog voltage is compared with the output of a digital-to-analog converter. When the two voltages match, the number held in the binary buffer is equivalent to the input voltage with the resolution of the converter. Different strategies can be used to adjust the contents of the binary buffer to attain a match. This DAC voltage, VDAC, is then compared to the input voltage, and the binary number in the buffer is adjusted until the desired level of match between VDAC and Vin is obtained. This approach begs the question “How are DAC’s constructed?” In fact, DAC’s are relatively easy to construct using a simple ladder network and the principle of current superposition. The controller adjusts the binary number based on whether or not the comparator finds the voltage out of the DAC, VDAC, to be greater or less than the input voltage, Vin. One simple adjustment strategy is to increase the binary number by one each cycle if VDAC < Vin, or decrease it otherwise. This so-called tracking ADC is very fast when Vin changes slowly, but can take many cycles when Vin changes abruptly (Figure 1.14). Not only can the conversion time be quite long, but it is variable since it depends on the dynamics of the input signal. This strategy would not easily allow for sampling an analog signal at a fixed rate due to the variability in conversion time. An alternative strategy termed successive approximation allows the conversion to be done at a fixed rate and is well-suited to digital technology. The successive approximation strategy always takes the same number of cycles irrespective of the input voltage. In the first cycle, the controller sets the most significant bit (MSB) of the buffer to 1; all others are cleared. This binary number is half the maximum possible value (which occurs when all the bits are
• 44. 1), so the DAC should output a voltage that is half its maximum voltage—that is, a voltage in the middle of its range. FIGURE 1.14 Voltage waveform of an ADC that uses a tracking strategy. The ADC voltage (solid line) follows the input voltage (dashed line) fairly closely when the input voltage varies slowly, but takes many cycles to “catch up” to an abrupt change in input voltage. If the comparator tells the controller that Vin > VDAC, then the input voltage, Vin, must be greater than half the maximum range, and the MSB is left set. If Vin < VDAC, then the input voltage is in the lower half of the range and the MSB is cleared (Figure 1.15). In the next cycle, the next most significant bit is set, and the same comparison is made and the same bit adjustment takes place based on the results of the comparison (Figure 1.15). After N cycles, where N is the number of bits in the digital output, the voltage from the DAC, VDAC, converges to the best possible fit to the input voltage, Vin. Since Vin ≈ VDAC, the number in the buffer, which is proportional to VDAC, is the best representation of the analog input voltage within the resolution of the converter. To signal the end of the conversion process, the ADC puts
• 45. out a digital signal or flag indicating that the conversion is complete (Figure 1.15). FIGURE 1.15 Vin and VDAC in a 6-bit ADC using the successive approximation strategy. In the first cycle, the MSB is set (solid line) since Vin > VDAC. In the next two cycles, the bit being tested is cleared because Vin < VDAC when this bit was set. For the fourth and fifth cycles the bit being tested remained set and for the last cycle it was cleared. At the end of the sixth cycle a conversion complete flag is set to signify the end of the conversion process.
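The successive approximation search is compact enough to simulate directly. The sketch below uses only basic MATLAB and mirrors the 6-bit bit-testing sequence of Figure 1.15; the 3.2 volt input and 5 volt range are arbitrary values assumed for illustration.
% Successive approximation search for a 6-bit conversion
% (a sketch of the strategy described above; basic MATLAB only)
Vmax = 5.0; % Converter range: 0 to 5 volts
Vin = 3.2; % Input voltage to be converted (assumed)
N = 6; % Number of bits in the output
bits = zeros(1,N); % Binary buffer, MSB first
for k = 1:N
bits(k) = 1; % Tentatively set the bit under test
Vdac = Vmax*sum(bits.*2.^(N-(1:N)))/2^N; % Equivalent DAC voltage
if Vin < Vdac
bits(k) = 0; % Clear the bit if the DAC output overshoots
end
end
bits % After N cycles: best 6-bit approximation of Vin
For this input the buffer ends up holding 1 0 1 0 0 0, corresponding to VDAC = 3.125 volts, the closest level not exceeding the input.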
• 46. TIME SAMPLING: BASICS Time sampling transforms a continuous analog signal into a discrete time signal, a sequence of numbers denoted as x(n) = [x1, x2, x3, . . . xN],* Figure 1.16 (lower trace). Such a representation can be thought of as an array in computer memory. (It can also be viewed as a vector as shown in the next chapter.) Note that the array position indicates a relative position in time, but to relate this number sequence back to an absolute time both the sampling interval and sampling onset time must be known. However, if only the time relative to conversion onset is important, as is frequently the case, then only the sampling interval needs to be known. FIGURE 1.16 A continuous signal (upper trace) is sampled at discrete points in time and stored in memory as an array of proportional numbers (lower trace). Converting back to relative time is then achieved by multiplying the sequence number, n, by the sampling interval, Ts: x(t) = x(nTs). Sampling theory is discussed in the next chapter and states that a sinusoid can be uniquely reconstructed providing it has been sampled by at least two equally spaced points over a cycle. Since Fourier series analysis implies that any signal can be represented as a series of sine waves (see Chapter 3), then by extension, a signal can be uniquely reconstructed providing the sampling frequency is at least twice that of the highest frequency in the signal. Note that this highest frequency component may come from a noise source and could be well above the frequencies of interest. The inverse of this rule is that any signal that contains frequency components greater than one-half the sampling frequency cannot be accurately reconstructed, and, hence, its digital representation is in error. Since this error is introduced by undersampling, it is inherent in the digital representation and no amount of digital signal processing can correct this error. The specific nature of this under-sampling error is termed aliasing and is described in a discussion of the consequences of sampling in Chapter 2. From a practical standpoint, aliasing must be avoided either by the use of very high sampling rates—rates that are well above the bandwidth of the analog system—or by filtering the analog signal before analog-to-digital conversion. Since excessive sampling rates have an associated cost, both in terms of the *In many textbooks brackets, [ ], are used to denote digitized variables; i.e., x[n]. Throughout this text we reserve brackets to indicate a series of numbers, or vector, following the MATLAB format.
• 47. ADC required and memory costs, the latter approach is generally preferable. Also note that the sampling frequency must be at least twice the highest frequency present in the input signal, not to be confused with the bandwidth of the analog signal. All frequencies in the sampled waveform greater than one-half the sampling frequency (one-half the sampling frequency is sometimes referred to as the Nyquist frequency) must be essentially zero, not merely attenuated. Recall that the bandwidth is defined as the frequency for which the amplitude is reduced by only 3 db from the nominal value of the signal, while the sampling criterion requires that the value be reduced to zero. Practically, it is sufficient to reduce the signal below the quantization noise level or other acceptable noise level. The relationship between the sampling frequency, the order of the anti-aliasing filter, and the system bandwidth is explored in a problem at the end of this chapter. Example 1.1. An ECG signal of 1 volt peak-to-peak has a bandwidth of 0.01 to 100 Hz. (Note this frequency range has been established by an official standard and is meant to be conservative.) Assume that broadband noise may be present in the signal at about 0.1 volts (i.e., −20 db below the nominal signal level). This signal is filtered using a four-pole lowpass filter. What sampling frequency is required to insure that the error due to aliasing is less than −60 db (0.001 volts)? Solution. The noise at the sampling frequency must be reduced another 40 db (20 * log(0.1/0.001)) by the four-pole filter. A four-pole filter with a cutoff of 100 Hz (required to meet the fidelity requirements of the ECG signal) would attenuate the waveform at a rate of 80 db per decade. For a four-pole filter the asymptotic attenuation is given as: Attenuation = 80 log(f2/fc) db. To achieve the additional 40 db of attenuation required by the problem from a four-pole filter: 80 log(f2/fc) = 40; log(f2/fc) = 40/80 = 0.5; f2/fc = 10^0.5 = 3.16; hence f2 = 3.16 × 100 = 316 Hz. Thus to meet the sampling criterion, the sampling frequency must be at least 632 Hz, twice the frequency at which the noise is adequately attenuated. The solution is approximate and ignores the fact that the initial attenuation of the filter will be gradual. Figure 1.17 shows the frequency response characteristics of an actual 4-pole analog filter with a cutoff frequency of 100 Hz. This figure shows that the attenuation is 40 db at approximately 320 Hz. Note the high sampling frequency that is required for what is basically a relatively low frequency signal (the ECG). In practice, a filter with a sharper cutoff, perhaps
• 48. an 8-pole filter, would be a better choice in this situation. FIGURE 1.17 Detailed frequency plot (on a log-log scale) of a 4-pole and 8-pole filter, both having a cutoff frequency of 100 Hz. Figure 1.17 shows that the frequency response of an 8-pole filter with the same 100 Hz cutoff frequency provides the necessary attenuation at less than 200 Hz. Using this filter, the sampling frequency could be lowered to under 400 Hz.
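These graphical estimates can be verified numerically. The sketch below, which assumes the Signal Processing Toolbox, evaluates 4-pole and 8-pole analog Butterworth lowpass filters with 100 Hz cutoffs at the frequencies discussed in Example 1.1; the Butterworth design is one reasonable stand-in for the unspecified filters of the example.
% Check the attenuation estimates of Example 1.1
% (a sketch; assumes the Signal Processing Toolbox)
fc = 100; % Cutoff frequency in Hz
f = [200 316]; % Test frequencies in Hz
[b4,a4] = butter(4, 2*pi*fc, 's'); % 4-pole analog lowpass
[b8,a8] = butter(8, 2*pi*fc, 's'); % 8-pole analog lowpass
att4 = -20*log10(abs(freqs(b4,a4,2*pi*f))) % Approx. 24 and 40 db
att8 = -20*log10(abs(freqs(b8,a8,2*pi*f))) % Approx. 48 and 80 db
The 4-pole filter reaches the required 40 db only near 316 Hz, while the 8-pole filter already exceeds it below 200 Hz, consistent with Figure 1.17.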
• 49. FURTHER STUDY: BUFFERING AND REAL-TIME DATA PROCESSING Real-time data processing simply means that the data is processed and results obtained in sufficient time to influence some ongoing process. This influence may come directly from the computer or through human intervention. The processing time constraints naturally depend on the dynamics of the process of interest. Several minutes might be acceptable for an automated drug delivery system, while information on the electrical activity of the heart needs to be immediately available. The term buffer, when applied to digital technology, usually describes a set of memory locations used to temporarily store incoming data until enough data is acquired for efficient processing. When data is being acquired continuously, a technique called double buffering can be used. Incoming data is alternately sent to one of two memory arrays, and the one that is not being filled is processed (which may simply involve transfer to disk storage). Most ADC software packages provide a means for determining which element in an array has most recently been filled to facilitate buffering, and frequently the ability to determine which of two arrays (or which half of a single array) is being filled to facilitate double buffering. DATA BANKS With the advent of the World Wide Web it is not always necessary to go through the analog-to-digital conversion process to obtain digitized data of physiological signals. A number of data banks exist that provide physiological signals such as ECG, EEG, gait, and other common biosignals in digital form. Given the volatility and growth of the Web and the ease with which searches can be made, no attempt will be made to provide a comprehensive list of appropriate Websites. However, a good source of several common biosignals, particularly the ECG, is the PhysioNet Data Bank maintained by MIT—http://www.physionet.org. Some data banks are specific to a given set of biosignals or a given signal processing approach. An example of the latter is the ICALAB Data Bank in Japan—http://www.bsp.brain.riken.go.jp/ICALAB/—which includes data that can be used to evaluate independent component analysis (see Chapter 9) algorithms. Numerous other data banks containing biosignals and/or images can be found through a quick search of the Web, and many more are likely to come online in the coming years. This is also true for some of the signal processing algorithms, as will be described in more detail later. For example, the ICALAB Website mentioned above also has algorithms for independent component analysis in MATLAB m-file format. A quick Web search can provide both signal processing algorithms and data that can be used to evaluate a signal processing system under development. The Web is becoming an ever more useful tool in signal and image processing, and a brief search of the Web can save considerable time in the development process, particularly if the signal processing system involves advanced approaches. PROBLEMS 1. A single sinusoidal signal is contained in noise. The RMS value of the noise is 0.5 volts and the SNR is 10 db. What is the peak-to-peak amplitude of the sinusoid?
• 50. 2. A resistor produces 10 µV noise when the room temperature is 310 °K and the bandwidth is 1 kHz. What current noise would be produced by this resistor? 3. The noise voltage out of a 1 MΩ resistor was measured using a digital voltmeter as 1.5 µV at a room temperature of 310 °K. What is the effective bandwidth of the voltmeter? 4. The photodetector shown in Figure 1.4 has a sensitivity of 0.3 µA/µW (at a wavelength of 700 nm). In this circuit, there are three sources of noise. The photodetector has a dark current of 0.3 nA, the resistor is 10 MΩ, and the amplifier has an input current noise of 0.01 pA/√Hz. Assume a bandwidth of 10 kHz. (a) Find the total noise current input to the amplifier. (b) Find the minimum light flux signal that can be detected with an SNR = 5. 5. A lowpass filter is desired with a cutoff frequency of 10 Hz. This filter should attenuate a 100 Hz signal by a factor of 85. What should be the order of this filter? 6. You are given a box that is said to contain a highpass filter. You input a series of sine waves into the box and record the following output: Frequency (Hz): 2, 10, 20, 60, 100, 125, 150, 200, 300, 400. Vout (volts rms): 0.15×10^−7, 0.1×10^−3, 0.002, 0.2, 1.5, 3.28, 4.47, 4.97, 4.99, 5.0. What is the cutoff frequency and order of this filter? 7. An 8-bit ADC that has an input range of ±5 volts is used to convert a signal that varies between ±2 volts. What is the SNR of the input if the input noise equals the quantization noise of the converter? 8. As elaborated in Chapter 2, time sampling requires that the maximum frequency present in the input be less than fs/2 for proper representation in digital format. Assume that the signal must be attenuated by a factor of 1000 to be considered “not present.” If the sampling frequency is 10 kHz and a 4th-order lowpass anti-aliasing filter is used prior to analog-to-digital conversion, what should be the bandwidth of the sampled signal? That is, what must the cutoff frequency be of the anti-aliasing lowpass filter?
• 51. 10 Fundamentals of Image Processing: MATLAB Image Processing Toolbox IMAGE PROCESSING BASICS: MATLAB IMAGE FORMATS Images can be treated as two-dimensional data, and many of the signal processing approaches presented in the previous chapters are equally applicable to images: some can be directly applied to image data while others require some modification to account for the two (or more) data dimensions. For example, both PCA and ICA have been applied to image data treating the two-dimensional image as a single extended waveform. Other signal processing methods including Fourier transformation, convolution, and digital filtering are applied to images using two-dimensional extensions. Two-dimensional images are usually represented by two-dimensional data arrays, and MATLAB follows this tradition;* however, MATLAB offers a variety of data formats in addition to the standard format used by most MATLAB operations. Three-dimensional images can be constructed using multiple two-dimensional representations, but these multiple arrays are sometimes treated as a single volume image. General Image Formats: Image Array Indexing Irrespective of the image format or encoding scheme, an image is always represented in one, or more, two-dimensional arrays, I(m,n). Each element of the *Actually, MATLAB considers image data arrays to be three-dimensional, as described later in this chapter.
• 52. variable, I, represents a single picture element, or pixel. (If the image is being treated as a volume, then the element, which now represents an elemental volume, is termed a voxel.) The most convenient indexing protocol follows the traditional matrix notation, with the horizontal pixel locations indexed left to right by the second integer, n, and the vertical locations indexed top to bottom by the first integer m (Figure 10.1). This indexing protocol is termed pixel coordinates by MATLAB. FIGURE 10.1 Indexing format for MATLAB images using the pixel coordinate system. This indexing protocol follows the standard matrix notation. A possible source of confusion with this protocol is that the vertical axis positions increase from top to bottom and also that the second integer references the horizontal axis, the opposite of conventional graphs. MATLAB also offers another indexing protocol that accepts non-integer indexes. In this protocol, termed spatial coordinates, the pixel is considered to be a square patch, the center of which has an integer value. In the default coordinate system, the center of the upper left-hand pixel still has a reference of (1,1), but the upper left-hand corner of this pixel has coordinates of (0.5,0.5) (see Figure 10.2). In this spatial coordinate system, the locations of image coordinates are positions on a (discrete) plane and are described by general variables x and y. There are two sources of potential confusion with this system. As with the pixel coordinate system, the vertical axis increases downward. In addition, the positions of the vertical and horizontal indexes (now better thought of as coordinates) are switched: the horizontal index is first, followed by the vertical coordinate, as with conventional x,y coordinate references. In the default spatial coordinate system, integer coordinates correspond with their pixel coordinates, remembering the position swap, so that I(5,4) in pixel coordinates references the same pixel as I(4.0,5.0) in spatial coordinates. Most routines expect a specific pixel coordinate system and produce outputs in that system. Examples of spatial coordinates are found primarily in the spatial transformation routines described in the next chapter. It is possible to change the baseline reference in the spatial coordinate
• 53. system, as certain commands allow you to redefine the coordinates of the reference corner. This option is described in context with related commands. FIGURE 10.2 Indexing in the spatial coordinate system. Data Classes: Intensity Coding Schemes There are four different data classes, or encoding schemes, used by MATLAB for image representation. Moreover, each of these data classes can store the data in a number of different formats. This variety reflects the variety in image types (color, grayscale, and black and white), and the desire to represent images as efficiently as possible in terms of memory storage. The efficient use of memory storage is motivated by the fact that images often require a large number of array locations: an image of 400 by 600 pixels will require 240,000 data points, each of which will need one or more bytes depending on the data format. The four different image classes or encoding schemes are: indexed images, RGB images, intensity images, and binary images. The first two classes are used to store color images. In indexed images, the pixel values are, themselves, indexes to a table that maps the index value to a color value. While this is an efficient way to store color images, the data sets do not lend themselves to arithmetic operations (and, hence, most image processing operations) since the results do not always produce meaningful images. Indexed images also need an associated matrix variable that contains the colormap, and this map variable needs to accompany the image variable in many operations. Colormaps are N by 3 matrices that function as lookup tables. The indexed data variable points to a particular row in the map and the three columns associated with that row
• 54. contain the intensity of the colors red, green, and blue. The values of the three columns range between 0 and 1, where 0 is the absence of the related color and 1 is the strongest intensity of that color. MATLAB convention suggests that indexed arrays use variable names beginning with x, and the suggested name for the colormap is map. While indexed variables are not very useful in image processing operations, they provide a compact method of storing color images, and can produce effective displays. They also provide a convenient and flexible method for colorizing grayscale data to produce a pseudocolor image. The MATLAB Image Processing Toolbox provides a number of useful prepackaged colormaps. These colormaps can be implemented with any number of rows, but the default is 64 rows. Hence, if any of these standard colormaps are used with the default value, the indexed data should be scaled to range between 0 and 64 to prevent saturation. An example of the application of a MATLAB colormap is given in Example 10.3. An extension of that example demonstrates methods for colorizing grayscale data using a colormap. The other method for coding color images is the RGB coding scheme in which three different, but associated, arrays are used to indicate the intensity of the three color components of the image: red, green, or blue. This coding scheme produces what is known as a truecolor image. As with the encoding used in indexed data, the larger the pixel value, the brighter the respective color. In this coding scheme, each of the color components can be operated on separately. Obviously, this color coding scheme will use more memory than indexed images, but this may be unavoidable if extensive processing is to be done on a color image. By MATLAB convention the variable name RGB, or something similar, is used for variables of this data class. Note that these variables are actually three-dimensional arrays having dimensions N by M by 3. While we have not used such three-dimensional arrays thus far, they are fully supported by MATLAB. These arrays are indexed as RGB(n,m,i) where i = 1,2,3. In fact, all image variables are conceptualized in MATLAB as three-dimensional arrays, except that for non-RGB images the third dimension is simply 1. Grayscale images are stored as intensity class images where the pixel value represents the brightness or grayscale value of the image at that point. MATLAB convention suggests variable names beginning with I for variables in class intensity. If an image is only black or white (with no intermediate grays), then the binary coding scheme can be used where the representative array is a logical array containing either 0’s or 1’s. MATLAB convention is to use BW for variable names in the binary class. A common problem when working with binary images is the failure to define the array as logical, which would cause the image variable to be misinterpreted by the display routine. Binary class variables can be specified as logical (set the logical flag associated with the array) using the command BW = logical(A), assuming A consists of only zeros and ones. A logical array can be converted to a standard array using the unary plus operator:
• 55. A = +BW. Since all binary images are of the form “logical,” it is possible to check if a variable is logical using the routine: isa(I, 'logical'); which will return a 1 if true and zero otherwise. Data Formats In an effort to further reduce image storage requirements, MATLAB provides three different data formats for most of the classes mentioned above. The uint8 and uint16 data formats provide 1 or 2 bytes, respectively, for each array element. Binary images do not support the uint16 format. The third data format, the double format, is the same as used in standard MATLAB operations and, hence, is the easiest to use. Image arrays that use the double format can be treated as regular MATLAB matrix variables subject to all the power of MATLAB and its many functions. The problem is that this format uses 8 bytes for each array element (i.e., pixel), which can lead to very large data storage requirements. In all three data formats, a zero corresponds to the lowest intensity value, i.e., black. For the uint8 and uint16 formats, the brightest intensity value (i.e., white, or the brightest color) is taken as the largest possible number for that coding scheme: for uint8, 2^8 − 1, or 255; and for uint16, 2^16 − 1, or 65,535. For the double format, the brightest value corresponds to 1.0. The isa routine can also be used to test the format of an image. The routine isa(I,'type') will return a 1 if I is encoded in the format type, and a zero otherwise. The variable type can be: uint8, uint16, or double. There are a number of other assessments that can be made with the isa routine that are described in the associated help file. Multiple images can be grouped together as one variable by adding another dimension to the variable array. Since image arrays are already considered three-dimensional, the additional images are added to the fourth dimension. Multi-image variables are termed multiframe variables and each two-dimensional (or three-dimensional) image of a multiframe variable is termed a frame. Multiframe variables can be generated within MATLAB by incrementing along the fourth index as shown in Example 10.2, or by concatenating several images together using the cat function: IMF = cat(4, I1, I2, I3,...); The first argument, 4, indicates that the images are to be concatenated along the fourth dimension, and the other arguments are the variable names of the images. All images in the list must be the same type and size. Data Conversions The variety of coding schemes and data formats complicates even the simplest of operations, but is necessary for efficient memory use. Certain operations
• 56. require a given data format and/or class. For example, standard MATLAB operations require the data be in double format, and will not work correctly with indexed images. Many MATLAB image processing functions also expect a specific format and/or coding scheme, and generate an output usually, but not always, in the same format as the input. Since there are so many combinations of coding and data type, there are a number of routines for converting between different types. For converting format types, the most straightforward procedure is to use the im2xxx routines given below: I_uint8 = im2uint8(I); % Convert to uint8 format I_uint16 = im2uint16(I); % Convert to uint16 format I_double = im2double(I); % Convert to double format These routines accept any data class as input; however, if the class is indexed, the input argument, I, must be followed by the term indexed. These routines also handle the necessary rescaling except for indexed images. When converting indexed images, variable range can be a concern: for example, to convert an indexed variable to uint8, the variable range must be between 0 and 255. Converting between different image encoding schemes can sometimes be done by scaling. To convert a grayscale image in uint8 or uint16 format to an indexed image, select an appropriate grayscale colormap from MATLAB’s established colormaps, then scale the image variable so the values lie within the range of the colormap; i.e., the data range should lie between 0 and N, where N is the depth of the colormap (MATLAB’s colormaps have a default depth of 64, but this can be modified). This approach is demonstrated in Example 10.3. However, an easier solution is simply to use MATLAB’s gray2ind function listed below. This function, as with all the conversion functions, will scale the input data appropriately, and in the case of gray2ind will also supply an appropriate grayscale colormap (although alternate colormaps of the same depth can be substituted). The routines that convert to indexed data are: [x, map] = gray2ind(I, N); % Convert from grayscale to % indexed [x, map] = rgb2ind(RGB, N or map); % Convert from truecolor to % indexed Both these routines accept data in any format, including logical, and produce an output of type uint8 if the associated map length is less than or equal to 64, or uint16 if greater than 64. N specifies the colormap depth and must be less than 65,536. For gray2ind the colormap is gray with a depth of N, or the default value of 64 if N is omitted. For RGB conversion using rgb2ind, a colormap of N levels is generated to best match the RGB data. Alternatively, a
• 57. colormap can be provided as the second argument, in which case rgb2ind will generate an output array, x, with values that best match the colors given in map. The rgb2ind function has a number of options that affect the image conversion, options that allow trade-offs between color accuracy and image resolution. (See the associated help file.) An alternative method for converting a grayscale image to indexed values is the routine grayslice, which converts using thresholding: x = grayslice(I, N or V); % Convert grayscale to indexed using % thresholding where any input format is acceptable. This function slices the image into N levels using an equal-step thresholding process. Each slice is then assigned a specific level on whatever colormap is selected. This process allows some interesting color representations of grayscale images, as described in Example 10.4. If the second argument is a vector, V, then it contains the threshold levels (which can now be unequal) and the number of slices corresponds to the length of this vector. The output format is either uint8 or uint16 depending on the number of slices, similar to the two conversion routines above. Two conversion routines convert from indexed images to other encoding schemes: I = ind2gray(x, map); % Convert to grayscale intensity % encoding RGB = ind2rgb(x, map); % Convert to RGB (“truecolor”) % encoding Both functions accept any input format; ind2gray produces outputs in the same format as the input, while ind2rgb produces outputs formatted as double. Function ind2gray removes the hue and saturation information while retaining the luminance, while function ind2rgb produces a truecolor RGB variable. To convert an image to binary coding use: BW = im2bw(I, Level); % Convert to binary logical encoding where Level specifies the threshold that will be used to determine if a pixel is white (1) or black (0). The input image, I, can be either intensity, RGB, or indexed,* and in any format (uint8, uint16, or double). While most functions output binary images in uint8 format, im2bw outputs the image in logical format. *As with all conversion routines, and many other routines, when the input image is in indexed format it must be followed by the colormap variable.
• 58. In this format, the image values are either 0 or 1, but each element is the same size as the double format (8 bytes). This format can be used in standard MATLAB operations, but does use a great deal of memory. The dither function can also be used to generate binary images, as described in the associated help file. A final conversion routine does not really change the data class, but does scale the data and can be very useful. This routine converts general class double data to intensity data, scaled between 0 and 1: I = mat2gray(A, [Amin Amax]); % Scale matrix to intensity % encoding, double format where A is a matrix and the optional second term specifies the values of A to be scaled to zero, or black (Amin), or 1, or white (Amax). Since a matrix is already in double format, this routine provides only scaling. If the second argument is missing, the matrix is scaled so that its highest value is 1 and its lowest value is zero. Using the default scaling can be a problem if the image contains a few irrelevant pixels having large values. This can occur after certain image processing operations due to border (or edge) effects. In such cases, other scaling must be imposed, usually determined empirically, to achieve a suitable range of image intensities. The various data classes, their conversion routines, and the data formats they support are summarized in Table 10.1 below. The output format of the various conversion routines is indicated by the superscript: (1) uint8 or uint16 depending on the number of levels requested (N); (2) double; (3) no format change (output format equals input format); and (4) logical (size double). TABLE 10.1 Summary of Image Classes, Data Formats, and Conversion Routines. Indexed: all formats supported; conversion routines gray2ind(1), grayslice(1), rgb2ind(1). Intensity: all formats supported; conversion routines ind2gray(2), mat2gray(2,3), rgb2gray(3). RGB: all formats supported; conversion routine ind2rgb(2). Binary: uint8 and double formats supported; conversion routines im2bw(4), dither(1).
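Taken together, these routines make it easy to colorize grayscale data. The sketch below, which assumes the Image Processing Toolbox (and uses MATLAB's peaks function merely to generate test data), slices an intensity image into eight levels with grayslice and displays it through an 8-level colormap using the imshow routine described next.
% Pseudocolor a grayscale image by slicing it into 8 levels
% (a sketch; assumes the Image Processing Toolbox)
I = mat2gray(peaks(200)); % Test intensity image scaled to 0–1
x = grayslice(I, 8); % Slice into 8 levels (indexed format)
imshow(x, jet(8)); % Display using an 8-level colormap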
• 59. Image Display There are several options for displaying an image, but the most useful and easiest to use is the imshow function. The basic calling format of this routine is: imshow(I,arg) where I is the image array and arg is an argument, usually optional, that depends on the data format. For indexed data, the variable name must be followed by the colormap, map. This holds for all display functions when indexed data are involved. For intensity class image variables, arg can be a scalar, in which case it specifies the number of levels to use in rendering the image, or, if arg is a vector, [low high], arg specifies the values to be taken to readjust the range limits of a specific data format.* If the empty matrix, [ ], is given as arg, or it is simply missing, the maximum and minimum values in array I are taken as the low and high values. The imshow function has a number of other options that make it quite powerful. These options can be found with the help command. When I is an indexed variable, it should be followed by the map variable. There are two functions designed to display multiframe variables. The function montage(MFW) displays the various images in a grid-like pattern as shown in Example 10.2. Alternatively, multiframe variables can be displayed as a movie using the immovie and movie commands: mov = immovie(MFW); % Generate movie variable movie(mov); % Display movie Unfortunately the movie function cannot be displayed in a textbook, but is presented in one of the problems at the end of the chapter, and several amusing examples are presented in the problems at the end of the next chapter. The immovie function requires multiframe data to be in either indexed or RGB format. Again, if MFW is an indexed variable, it must be followed by a colormap variable. The basic features of the MATLAB Image Processing Toolbox are illustrated in the examples below. Example 10.1 Generate an image of a sinewave grating having a spatial frequency of 1 cycle/inch. A sinewave grating is a pattern that is constant in the vertical direction, but varies sinusoidally in the horizontal direction. It is used as a visual stimulus in experiments dealing with visual perception. Assume the figure will be 4 inches square; hence, the overall pattern should contain 4 cycles. Assume the image will be placed in a 400-by-400 pixel array (i.e., 100 pixels per inch) using a uint8 format. Solution Sinewave gratings usually consist of sines in the horizontal direction and constant intensity in the vertical direction. Since this will be a *Recall the default minimum and maximum values for the three non-indexed classes were: [0, 255] for uint8; [0, 65535] for uint16; and [0, 1] for double arrays.
• 60. grayscale image, we will use the intensity coding scheme. As most reproductions have limited grayscale resolution, a uint8 data format will be used. However, the sinewave will be generated in the double format, as this is MATLAB’s standard format. To save memory, we first generate a 400-by-1 image line in double format, then convert it to uint8 format using the conversion routine im2uint8. The uint8 image line can then be extended vertically to 400 pixels. FIGURE 10.3 A sinewave grating generated by Example 10.1. Such images are often used as stimuli in experiments on vision. % Example 10.1 and Figure 10.3 % Generate a sinewave grating 400 by 400 pixels % The grating should vary horizontally with a spatial frequency % of 1 cycle per inch (4 cycles across the image). % Assume the horizontal and vertical dimensions are 4 inches % clear all; close all; N = 400; % Vertical and horizontal size Nu_cyc = 4; % Produce 4 cycle grating x = (1:N)*Nu_cyc/N; % Spatial (time equivalent) vector %
• 61. % Generate a single horizontal line of the image in a vector of % 400 points % % Generate sin; scale between 0 & 1 I_sin(1,:) = .5 * sin(2*pi*x) + .5; I_8 = im2uint8(I_sin); % Convert to a uint8 vector % for i = 1:N % Extend to N (400) vertical lines I(i,:) = I_8; end % imshow(I); % Display image title('Sinewave Grating'); The output of this example is shown as Figure 10.3. As with all images shown in this text, there is a loss in both detail (resolution) and grayscale variation due to losses in reproduction. To get the best images, these figures, and all figures in this section, can be reconstructed on screen using the code from the examples provided on the CD. Example 10.2 Generate a multiframe variable consisting of a series of sinewave gratings having different phases. Display these images as a montage. Border the images with black for separation on the montage plot. Generate 12 frames, but reduce the image to 100 by 100 to save memory. % Example 10.2 and Figure 10.4 % Generate a multiframe array consisting of sinewave gratings % that vary in phase from 0 to 2*pi across 12 images % % The gratings should be similar to that of Example 10.1 except % with fewer pixels (100 by 100) and 2 cycles to conserve memory. % clear all; close all; N = 100; % Vertical and horizontal points Nu_cyc = 2; % Produce 2 cycle grating M = 12; % Produce 12 images x = (1:N)*Nu_cyc/N; % Generate spatial vector % for j = 1:M % Generate M (12) images phase = 2*pi*(j-1)/M; % Shift phase through 360 (2*pi) % degrees % Generate sine; scale between 0 & 1 I_sin = .5 * sin(2*pi*x + phase) + .5; % Add black at left and right borders I_sin = [zeros(1,10) I_sin(1,:) zeros(1,10)];
• 62. I_8 = im2uint8(I_sin); % Convert to a uint8 vector % for i = 1:N % Extend to N (100) vertical lines if i < 10 | i > 90 % Insert black space at top and % bottom I(i,:,1,j) = 0; else
• 63. I(i,:,1,j) = I_8; end end end montage(I); % Display image as montage title('Sinewave Grating'); FIGURE 10.4 Montage of sinewave gratings created by Example 10.2. The montage created by this example is shown in Figure 10.4. The multiframe data set was constructed one frame at a time and the frame was placed in I using the frame index, the fourth index of I.* Zeros are inserted at the beginning and end of the sinewave and, in the image construction loop, for the first and last 9 points. This is to provide a dark band between the images. Finally the sinewave was phase shifted through 360 degrees over the 12 frames. Example 10.3 Construct a multiframe variable with 12 sinewave grating images. Display these data as a movie. Since the immovie function requires the multiframe image variable to be in either RGB or indexed format, convert the uint8 data to indexed format. This can be done by the gray2ind(I,N) function. This function simply scales the data to be between 0 and N, where N is the depth of the colormap. If N is unspecified, gray2ind defaults to 64 levels. MATLAB colormaps can also be specified to be of any depth, but as with gray2ind the default level is 64. % Example 10.3 % Generate a movie of a multiframe array consisting of sinewave % gratings that vary in phase from 0 to pi across 12 images % Since function 'immovie' requires either RGB or indexed data % formats, scale the data for use as indexed with 64 gray levels. % Use a standard MATLAB grayscale colormap ('gray'); % % The gratings should be the same as in Example 10.2. % clear all; close all; % Assign parameters N = 100; % Vertical and horizontal points Nu_cyc = 2; % Produce 2 cycle grating M = 12; % Produce 12 images % x = (1:N)*Nu_cyc/N; % Generate spatial vector *Recall, the third index is reserved for referencing the color plane. For non-RGB variables, this index will always be 1. For images in RGB format the third index would vary between 1 and 3.
• 64. for j = 1:M % Generate M (12) images % Generate sine; scale between 0 and 1 phase = pi*(j-1)/M; % Shift phase from 0 to pi over the 12 images I_sin(1,:) = .5 * sin(2*pi*x + phase) + .5; for i = 1:N % Extend to N (100) vertical lines Mf(i,:,1,j) = I_sin; % to create one frame of the multiframe image end % [Mf, map] = gray2ind(Mf); % Convert to indexed image mov = immovie(Mf,map); % Make movie, use default colormap movie(mov,10); % and show 10 times To fully appreciate this example, the reader will need to run this program under MATLAB. The 12 frames are created as in Example 10.2, except the code that adds the borders was removed and the data scaling was added. The second argument in immovie is the colormap matrix, and this example uses the map generated by gray2ind. This map has the default depth of 64, the same as all of the other MATLAB-supplied colormaps. Other standard maps that are appropriate for grayscale images are 'bone' which has a slightly bluish tint, 'pink' which has a decidedly pinkish tint, and 'copper' which has a strong rust tint. Of course any colormap can be used, often producing interesting pseudocolor effects from grayscale data. For an interesting color alternative, try running Example 10.3 using the prepackaged colormap jet as the second argument of immovie. Finally, note that the size of the multiframe array, Mf, is (100,100,1,12), or 1.2 × 10^5 bytes in uint8 format. The variable mov generated by immovie is even larger! Image Storage and Retrieval Images may be stored on disk using the imwrite command: imwrite(I, filename.ext, arg1, arg2, ...); where I is the array to be written into file filename. There are a large variety of file formats for storing image data and MATLAB supports the most popular formats. The file format is indicated by the filename’s extension, ext, which may be: .bmp (Microsoft bitmap), .gif (graphic interchange format), .jpeg (Joint photographic experts group), .pcx (Paintbrush), .png (portable network graphics), and .tif (tagged image file format). The arguments are optional and may be used to specify image compression or resolution, or other format dependent information.
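As a simple illustration of image storage, the sketch below writes an intensity image to a TIFF file and reads it back using the imread routine described next; the file name test_image.tif is hypothetical, chosen only for this example.
% Store an image and retrieve it (a sketch; the file name is
% hypothetical)
I = im2uint8(rand(128)); % Some uint8 intensity data
imwrite(I, 'test_image.tif'); % Write the image to a TIFF file
I2 = imread('test_image.tif'); % Read it back, here in uint8 format
isequal(I, I2) % Returns 1: TIFF storage is lossless here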
The specifics can be found in the imwrite help file. The imwrite routine can be used to store any of the data formats or data classes mentioned above; however, if the data array, I, is an indexed array, then it must be followed by the colormap variable, map. Most image formats actually store uint8-formatted data, but the necessary conversions are done by imwrite.

The imread function is used to retrieve images from disk. It has the calling structure:

    [I, map] = imread('filename.ext', fmt or frame);

where filename is the name of the image file and .ext is any of the extensions listed above. The optional second argument, fmt, only needs to be specified if the file format is not evident from the filename. The alternative optional argument, frame, is used to specify which frame of a multiframe image is to be read into I. An example that reads multiframe data is found in Example 10.4. As most file formats store images in uint8 format, I will often be in that format. File formats .tif and .png support uint16 format, so imread may generate data arrays in uint16 format for these file types. The output class depends on the manner in which the data are stored in the file. If the file contains grayscale image data, then the output is encoded as an intensity image; if truecolor, then as RGB. For both of these cases the variable map will be empty, which can be checked with the isempty(map) command (see Example 10.4). If the file contains indexed data, then both outputs, I and map, will contain data. The data format used by a file can also be obtained by querying the graphics file using the function imfinfo:

    information = imfinfo('filename.ext')

where information will contain text providing the essential information about the file, including the ColorType, FileSize, and BitDepth. Alternatively, the image data and map can be loaded using imread and the image data format determined from the MATLAB whos command. The whos command will also give the data class of the variable (uint8, uint16, or double).
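The short sketch below (ours, not from the text) ties these routines together; the filename test_image.tif is arbitrary.

    % Minimal sketch of image storage and retrieval
    I = rand(64,64);                       % Grayscale image, double format
    imwrite(I, 'test_image.tif');          % imwrite converts to uint8 here
    info = imfinfo('test_image.tif');      % Query ColorType, BitDepth, etc.
    [I2, map] = imread('test_image.tif');  % Read back; uint8 for this format
    if isempty(map)                        % Empty map: grayscale or truecolor
      I2 = im2double(I2);                  % Convert back to double (0 to 1)
    end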
Basic Arithmetic Operations

If the image data are stored in double format, then all standard MATLAB mathematical and operational procedures can be applied directly to the image variables. However, the double format requires 4 times as much memory as the uint16 format and 8 times as much memory as the uint8 format. To reduce the reliance on the double format, MATLAB supplies functions to carry out some basic mathematics on uint8- and uint16-format arrays. These routines work on either format; they actually carry out the operations in double precision on an element-by-element basis, then convert back to the input format. This reduces roundoff and overflow errors. The basic arithmetic commands are:

    I_diff = imabsdiff(I, J);       % Subtracts J from I on a pixel-
                                    % by-pixel basis and returns
                                    % the absolute difference
    I_comp = imcomplement(I);       % Complements image I
    I_add = imadd(I, J);            % Adds images I and J (images and/
                                    % or constants) to form image I_add
    I_sub = imsubtract(I, J);       % Subtracts J from image I
    I_divide = imdivide(I, J);      % Divides image I by J
    I_multiply = immultiply(I, J);  % Multiplies image I by J

For the last four routines, J can be either another image variable or a constant. Several arithmetic operations can be combined using the imlincomb function, which essentially calculates a weighted sum of images. For example, to add 0.5 of image I1 to 0.3 of image I2 to 0.75 of image I3, use:

    % Linear combination of images
    I_combined = imlincomb(.5, I1, .3, I2, .75, I3);

The arithmetic operations of multiplication and addition by constants are easy methods for increasing the contrast or brightness of an image. Some of these arithmetic operations are illustrated in Example 10.4, and a brief sketch is given below.
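As a minimal sketch (ours, not part of the examples), the routines above can brighten and increase the contrast of any grayscale image, I, in double format:

    % Brightness and contrast adjustment by constants
    I_bright = imadd(I, 0.2);        % Add a constant: increases brightness
    I_contrast = immultiply(I, 1.5); % Multiply by a constant > 1.0:
                                     % increases contrast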
Example 10.4 This example uses a number of the functions described previously. The program first loads a set of MRI (magnetic resonance imaging) images of the brain from the MATLAB Image Processing Toolbox's set of stock images. This image is actually a multiframe image consisting of 27 frames, as can be determined from the command imfinfo. One of these frames is selected by the operator, and this image is then manipulated in several ways: the contrast is increased; it is inverted; it is sliced into 5 levels (N_slice); it is modified horizontally and vertically by a Hamming window function; and it is thresholded and converted to a binary image.

    % Example 10.4 and Figures 10.5 and 10.6
    % Demonstration of various image functions.
    % Load all frames of the MRI image in mri.tif from the MATLAB
    % Image Processing Toolbox (in subdirectory imdemos).
    % Select one frame based on a user input.
    % Process that frame by: contrast enhancement of the image,
    % inverting the image, slicing the image, windowing, and
    % thresholding the image

FIGURE 10.5 Montage display of 27 frames of magnetic resonance images of the brain plotted in Example 10.4. These multiframe images were obtained from MATLAB's mri.tif file in the images section of the Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.
FIGURE 10.6 Various signal processing operations applied to frame 17 of the MRI images shown in Figure 10.5. Original from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.

    % Display original and all modifications on the same figure
    %
    clear all; close all;
    N_slice = 5;           % Number of slices for sliced image
    Level = .75;           % Threshold for binary image
    %
    % Initialize an array to hold 27 frames of mri.tif
    % Since this image is stored in tif format, it could be in either
    % uint8 or uint16 format.
    % In fact, the specific input format will not matter, since it
    % will be converted to double format in this program.
    mri = uint8(zeros(128,128,1,27));  % Initialize the image array
                                       % for 27 frames
    for frame = 1:27                   % Read all frames into variable mri
      [mri(:,:,:,frame), map] = imread('mri.tif', frame);
    end
    montage(mri, map);             % Display images as a montage;
                                   % include map in case indexed
    %
    frame_select = input('Select frame for processing: ');
    I = mri(:,:,:,frame_select);   % Select frame for processing
    %
    % Now check to see if image is indexed (in fact 'whos' shows it is).
    if isempty(map) == 0           % Check to see if indexed data
      I = ind2gray(I,map);         % If so, convert to intensity image
    end
    I1 = im2double(I);             % Convert to double format
    %
    I_bright = immultiply(I1,1.2);    % Increase the contrast
    I_invert = imcomplement(I1);      % Complement image
    x_slice = grayslice(I1,N_slice);  % Slice image into 5 equal levels
    %
    [r c] = size(I1);
    for i = 1:r                    % Multiply horizontally by a
                                   % Hamming window
      I_window(i,:) = I1(i,:) .* hamming(c)';
    end
    for i = 1:c                    % Multiply vertically by same window
      I_window(:,i) = I_window(:,i) .* hamming(r);
    end
    I_window = mat2gray(I_window);    % Scale windowed image
    BW = im2bw(I1,Level);             % Convert to binary
    %
    figure;
    subplot(3,2,1);                % Display all images in a single plot
    imshow(I1); title('Original');
    subplot(3,2,2);
    imshow(I_bright); title('Brightened');
    subplot(3,2,3);
    imshow(I_invert); title('Inverted');
    subplot(3,2,4);
    I_slice = ind2rgb(x_slice, jet(N_slice));  % Convert to RGB (see text)
    imshow(I_slice); title('Sliced');          % Display color slices
    subplot(3,2,5);
    imshow(I_window); title('Windowed');
    subplot(3,2,6);
    imshow(BW); title('Thresholded');

Since the image file might be indexed (in fact it is), the imread function includes map as an output. If the image is not indexed, then map will be empty. Note that imread reads only one frame at a time, the frame specified as the second argument of imread. To read in all 27 frames, it is necessary to use a loop. All frames are then displayed in one figure (Figure 10.5) using the montage function. The user is asked to select one frame for further processing. Since montage can display any input class and format, it is not necessary to determine these data characteristics at this time. After a particular frame is selected, the program checks if the map variable is empty (function isempty). If it is not (as is the case for these data), then the image data are converted to grayscale using function ind2gray, which produces an intensity image in double format. If the image is not indexed, the image variable is converted to double format directly.

The program then performs the various signal processing operations. Brightening is done by multiplying the image by a constant greater than 1.0, in this case 1.2 (Figure 10.6). Inversion is done using imcomplement, and the image is sliced into N_slice (5) levels using grayslice. Since grayslice produces an indexed image, it also generates a map variable. However, this grayscale map is not used; rather, an alternative map is substituted to produce a color image, with the color being used to enhance certain features of the image.* The Hamming window is applied to the image in both the horizontal and vertical directions (Figure 10.6). Since the image, I1, is in double format, the multiplication can be carried out directly on the image array; however, the resultant array, I_window, has to be rescaled using mat2gray to ensure it has the correct range for imshow. Recall that if called without additional arguments, mat2gray scales the array to take up the full intensity range (i.e., 0 to 1). To place all the images in the same figure, subplot is used just as with other graphs (Figure 10.6). One potential problem with this approach is that indexed data may plot incorrectly due to limited display memory allocated to the map variables.

*More accurately, the image should be termed a pseudocolor image since the original data was grayscale. Unfortunately the image printed in this text is in grayscale; however, the example can be rerun by the reader to obtain the actual color image.
(This problem actually occurred in this example when the sliced array was displayed as an indexed variable.) The easiest solution to this potential problem is to convert the image to RGB before calling imshow, as was done in this example.

Many grayscale images can benefit from some form of color coding. With the RGB format, it is easy to highlight specific features of a grayscale image by placing them in a specific color plane. The next example illustrates the use of color planes to enhance features of a grayscale image.

Example 10.5 In this example, brightness levels of a grayscale image that are 50% or less are coded into shades of blue, and those above are coded into shades of red. The grayscale image is first put in double format so that the maximum range is 0 to 1. Then each pixel is tested to be greater than 0.5. Pixel values less than 0.5 are placed into the blue image plane of an RGB image (i.e., the third plane). These pixel values are multiplied by two so they take up the full range of the blue plane. Pixel values above 0.5 are placed in the red plane (plane 1) after scaling to take up the full range of the red plane. This image is displayed in the usual way. While it is not reproduced in color here, a homework problem based on these same concepts will demonstrate pseudocolor.

    % Example 10.5 and Figure 10.7
    % Example of the use of pseudocolor
    % Load frame 17 of the MRI image (mri.tif)
    % from the Image Processing Toolbox in subdirectory 'imdemos'.

FIGURE 10.7 Frame 17 of the MRI image given in Figure 10.5 plotted directly and in pseudocolor using the code in Example 10.5. (Original image from MATLAB.) Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.
    % Display a pseudocolor image in which all values less than 50%
    % of maximum are in shades of blue and values above are in shades
    % of red.
    %
    clear all; close all;
    frame = 17;
    [I(:,:,1,1), map] = imread('mri.tif', frame);
    % Now check to see if image is indexed (in fact 'whos' shows it is).
    if isempty(map) == 0       % Check to see if indexed data
      I = ind2gray(I,map);     % If so, convert to intensity image
    end
    I = im2double(I);          % Convert to double
    [M N] = size(I);
    RGB = zeros(M,N,3);        % Initialize RGB array
    for i = 1:M
      for j = 1:N              % Fill RGB planes
        if I(i,j) > .5
          RGB(i,j,1) = (I(i,j)-.5)*2;
        else
          RGB(i,j,3) = I(i,j)*2;
        end
      end
    end
    %
    subplot(1,2,1);            % Display images in a single plot
    imshow(I); title('Original');
    subplot(1,2,2);
    imshow(RGB); title('Pseudocolor');

The pseudocolor image produced by this code is shown in Figure 10.7. Again, it will be necessary to run the example to obtain the actual color image.
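As an aside (a sketch of ours, not part of the example), the pixel-by-pixel loop above can be replaced with logical indexing, which is more idiomatic MATLAB and considerably faster on large images:

    % Vectorized version of the pseudocolor mapping; assumes I is a
    % double-format intensity image scaled 0 to 1
    RGB = zeros(M,N,3);
    high = I > .5;                 % Logical mask of brighter pixels
    R = zeros(M,N); B = zeros(M,N);
    R(high) = (I(high) - .5)*2;    % Bright pixels into the red plane
    B(~high) = I(~high)*2;         % Dark pixels into the blue plane
    RGB(:,:,1) = R; RGB(:,:,3) = B;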
ADVANCED PROTOCOLS: BLOCK PROCESSING

Many of the signal processing techniques presented in previous chapters operated on small, localized groups of data. For example, both FIR and adaptive filters used data samples within the same general neighborhood. Many image processing techniques also operate on neighboring data elements, except the neighborhood now extends in two dimensions, both horizontally and vertically. Given this extension into two dimensions, many operations in image processing are quite similar to those in signal processing. In the next chapter, we examine both two-dimensional filtering using two-dimensional convolution and the two-dimensional Fourier transform. While many image processing operations are conceptually the same as those used in signal processing, the implementation is somewhat more involved due to the additional bookkeeping required to operate on data in two dimensions. The MATLAB Image Processing Toolbox simplifies much of the tedium of working in two dimensions by introducing functions that facilitate two-dimensional block, or neighborhood, operations. These block processing operations fall into two categories: sliding neighborhood operations and distinct block operations. In sliding neighborhood operations, the block slides across the image as in convolution; however, the block must slide in both horizontal and vertical directions. Indeed, two-dimensional convolution described in the next chapter is an example of one very useful sliding neighborhood operation. In distinct block operations, the image area is divided into a number of fixed groups of pixels, although these groups may overlap. This is analogous to the overlapping segments used in the Welch approach to the Fourier transform described in Chapter 3. Both of these approaches to dealing with blocks of localized data in two dimensions are supported by MATLAB routines.

Sliding Neighborhood Operations

The sliding neighborhood operation alters one pixel at a time based on some operation performed on the surrounding pixels, specifically those pixels that lie within the neighborhood defined by the block. The block is placed as symmetrically as possible around the pixel being altered, termed the center pixel (Figure 10.8).

FIGURE 10.8 A 3-by-2 pixel sliding neighborhood block. The block (gray area) is shown in three different positions. Note that the block sometimes falls off the picture and padding (usually zero padding) is required. In actual use, the block slides, one element at a time, over the entire image. The dot indicates the center pixel.
The center pixel will only be in the center if the block is odd in both dimensions; otherwise, the center pixel position favors the left and upper sides of the block (Figure 10.8).*

*In MATLAB notation, the center pixel of an M by N block is located at: floor(([M N] + 1)/2).

Just as in signal processing, there is a problem that occurs at the edge of the image when a portion of the block extends beyond the image (Figure 10.8, upper left block). In this case, most MATLAB sliding block functions automatically perform zero padding for these pixels. (An exception is the imfilter routine described in the next chapter.)

The MATLAB routines conv2 and filter2 are both sliding neighborhood operators that are directly analogous to the one-dimensional convolution routine, conv, and filter routine, filter. These functions will be discussed in the next chapter on image filtering. Other two-dimensional functions that are directly analogous to their one-dimensional counterparts include mean2, std2, corr2, and fft2. Here we describe a general sliding neighborhood routine that can be used to implement a wide variety of image processing operations. Since these operations can be (but are not necessarily) nonlinear, the function has the name nlfilter, presumably standing for nonlinear filter. The calling structure is:

    I1 = nlfilter(I, [M N], func, P1, P2, ...);

where I is the input image array, M and N are the dimensions of the neighborhood block (vertical and horizontal), and func specifies the function that will operate over the block. The optional parameters P1, P2, . . . will be passed to the function if it requires input parameters. The function should take an M by N input and must produce a single, scalar output that will be used for the value of the center pixel. The input can be of any class or data format supported by the function, and the output image array, I1, will depend on the format provided by the routine's output.

The function may be specified in one of three ways: as a string containing the desired operation, as a function handle to an M-file, or as a function established by the routine inline. The first approach is straightforward: simply embed the function operation, which could be any appropriate MATLAB statement(s), within single quotes. For example:

    I1 = nlfilter(I, [3 3], 'mean2');

This command will slide a 3 by 3 moving average across the image, producing a lowpass filtered version of the original image (the two-dimensional analog of an FIR filter with coefficients [1/3 1/3 1/3]). Note that this could be more effectively implemented using the filter routines described in the next chapter, but more complicated, perhaps nonlinear, operations could be included within the quotes, as in the sketch below.
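For instance, a true nonlinear operation such as a median (useful for removing salt-and-pepper noise) can be embedded in the same way; this one-line sketch is ours, not from the text:

    % Nonlinear sliding neighborhood operation: 3 by 3 median filtering
    % (x is the block variable assumed by the string function)
    I1 = nlfilter(I, [3 3], 'median(x(:))');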
The use of a function handle is shown in the code:

    I1 = nlfilter(I, [3 3], @my_function);

where my_function is the name of an M-file function. The function handle @my_function contains all the information required by MATLAB to execute the function. Again, this file should produce a single, scalar output from an M by N input, and it has the possibility of containing input arguments in addition to the block matrix.

The inline routine has the ability to take string text and convert it into a function for use in nlfilter, as in this example:

    F = inline('2*x(2,2) - sum(x(1:3,1))/3 - sum(x(1:3,3))/3 - x(1,2) - x(3,2)');
    I1 = nlfilter(I, [3 3], F);

Function inline assumes that the input variable is x, but it also can find other variables based on the context, and it allows for additional arguments, P1, P2, . . . (see the associated help file). The particular function shown above takes the difference between the center point and its 8 surrounding neighbors, performing a differentiator-like operation. There are better ways to perform spatial differentiation, described in the next chapter, but this form will be demonstrated as one of the operations in Example 10.6 below.

Example 10.6 Load the image of blood cells in blood1.tif in MATLAB's image files. Convert the image to class intensity and double format. Perform the following sliding neighborhood operations: averaging over a 5 by 5 sliding block; differencing (spatial differentiation) using the function, F, above; and vertical boundary detection using a 2 by 3 vertical differencer. This differencer operator subtracts a vertical set of three left-hand pixels from the three adjacent right-hand pixels. The result will be a brightening of vertical boundaries that go from dark to light and a darkening of vertical boundaries that go from light to dark. Display all the images in the same figure, including the original. Also include binary images of the vertical boundary image thresholded at two different levels to emphasize the left and right boundaries.

    % Example 10.6 and Figure 10.9
    % Demonstration of sliding neighborhood operations
    % Load image of blood cells, blood1.tif, from the Image Processing
    % Toolbox in subdirectory imdemos.
    % Use a sliding 3 by 3 element block to perform several sliding
    % neighborhood operations including taking the average over the
    % block, implementing the function 'F' in the example
FIGURE 10.9 A variety of sliding neighborhood operations carried out on an image of blood cells. (Original reprinted with permission from The Image Processing Handbook, 2nd ed. Copyright CRC Press, Boca Raton, Florida.)

    % above, and implementing a function that enhances vertical
    % boundaries.
    % Display the original and all modifications on the same plot
    %
    clear all; close all;
    [I map] = imread('blood1.tif');   % Input image
    % Since image is stored in tif format, it could be in either uint8
    % or uint16 format (although the 'whos' command shows it is in
    % uint8).
    % The specific data format will not matter since the format will
    % be converted to double either by 'ind2gray' if it is an
    % indexed image or by 'im2double' if it is not.
    %
    if isempty(map) == 0      % Check to see if indexed data
      I = ind2gray(I,map);    % If so, convert to intensity image
    end
    I = im2double(I);         % Convert to double and scale
                              % if not already
    %
    % Perform the various sliding neighborhood operations.
    % Averaging
    I_avg = nlfilter(I,[5 5], 'mean2');
    %
    % Differencing
    F = inline('2*x(2,2) - sum(x(1:3,1))/3 - sum(x(1:3,3))/3 - x(1,2) - x(3,2)');
    I_diff = nlfilter(I, [3 3], F);
    %
    % Vertical boundary detection
    F1 = inline('sum(x(1:3,2)) - sum(x(1:3,1))');
    I_vertical = nlfilter(I,[3 2], F1);
    %
    % Rescale all arrays
    I_avg = mat2gray(I_avg);
    I_diff = mat2gray(I_diff);
    I_vertical = mat2gray(I_vertical);
    %
    subplot(3,2,1);           % Display all images in a single plot
    imshow(I); title('Original');
    subplot(3,2,2);
    imshow(I_avg); title('Averaged');
    subplot(3,2,3);
    imshow(I_diff); title('Differentiated');
    subplot(3,2,4);
    imshow(I_vertical); title('Vertical boundaries');
    subplot(3,2,5);
    bw = im2bw(I_vertical,.6);   % Threshold data, low threshold
    imshow(bw);
    title('Left boundaries');
    subplot(3,2,6);
    bw1 = im2bw(I_vertical,.8);  % Threshold data, high threshold
    imshow(bw1);
    title('Right boundaries');

The code in Example 10.6 produces the images in Figure 10.9. These operations are quite time consuming: Example 10.6 took about 4 minutes to run on a 500 MHz PC. Techniques for increasing the speed of sliding operations can be found in the help file for colfilt; a brief sketch is given below. The vertical boundaries produced by the 3 by 2 sliding block are not very apparent in the intensity image, but become quite evident in the thresholded binary images. The averaging has improved contrast, but the resolution is reduced so that edges are no longer distinct.
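As a brief illustration (a sketch of ours, not from the text), colfilt gains speed by rearranging each neighborhood into a column and applying the function to all columns at once; the function must therefore accept a matrix and return a row vector:

    % Sliding 5 by 5 average using colfilt instead of nlfilter.
    % In 'sliding' mode each neighborhood becomes one column of the
    % matrix passed to the function; mean returns one value per
    % column, i.e., one value per output pixel.
    I_avg = colfilt(I, [5 5], 'sliding', @mean);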
Distinct Block Operations

All of the sliding neighborhood options can also be implemented using configurations of fixed blocks (Figure 10.10). Since these blocks do not slide, but are fixed with respect to the image (although they may overlap), they will produce very different results.

FIGURE 10.10 A 7-by-3 pixel distinct block. As with the sliding neighborhood block, these fixed blocks can fall off the picture and require padding (usually zero padding). The dot indicates the center pixel, although this point usually has little significance in this approach.

The MATLAB function for implementing distinct block operations is similar in format to the sliding neighborhood function:

    I1 = blkproc(I, [M N], [Vo Ho], func);

where M and N specify the vertical and horizontal size of the block, Vo and Ho are optional arguments that specify the vertical and horizontal overlap of the block, func is the function that operates on the block, I is the input array, and I1 is the output array. As with nlfilter, the data format of the output will depend on the output of the function. The function is specified in the same manner as described for nlfilter; however, the function output will be different.

Function outputs for sliding neighborhood operations had to be single scalars that then became the value of the center pixel. In distinct block operations, the block does not move, so the function output will normally produce values for every pixel in the block. If the block produces a single output, then only the center pixel of each block will contain a meaningful value. If the function is an operation that normally produces a single value, the output of this routine can be expanded by multiplying it by an array of ones that is the same size as the block. This will place that single output in every pixel in the block:

    I1 = blkproc(I, [4 5], 'std2(x) * ones(4,5)');

In this example the output of the MATLAB function std2 is placed into a 4 by 5 array, and this becomes the output of the function, an array the same size as the block. It is also possible to use the inline function to describe the function:

    F = inline('std2(x) * ones(size(x))');
    I1 = blkproc(I, [4 5], F);

Of course, it is possible that certain operations could produce a different output for each pixel in the block. An example of block processing is given in Example 10.7.

Example 10.7 Load the blood cell image used in Example 10.6 and perform the following distinct block processing operations: 1) display the average for a block size of 8 by 8; 2) for a 3 by 3 block, perform the differentiator operation used in Example 10.6; and 3) apply the vertical boundary detector from Example 10.6 to a 3 by 2 block. Display all the images, including the original, in a single figure.
    % Example 10.7 and Figure 10.11
    % Demonstration of distinct block operations
    % Load image of blood cells used in Example 10.6
    % Use an 8 by 8 distinct block to get averages for the entire block
    % Apply the 3 by 3 differentiator from Example 10.6 as a distinct
    % block operation.
    % Apply a 3 by 2 vertical edge detector as a block operation
    % Display the original and all modifications on the same plot
    %
    ..... image load, same as in Example 10.6 .......
    %

FIGURE 10.11 The blood cell image of Example 10.6 processed using three distinct block operations: block averaging, block differentiation, and block vertical edge detection. (Original image reprinted from The Image Processing Handbook, 2nd edition. Copyright CRC Press, Boca Raton, Florida.)
    % Perform the various distinct block operations.
    % Average of the image
    I_avg = blkproc(I,[8 8], 'mean2(x) * ones(8,8)');
    %
    % Differentiator - place result in all blocks
    F = inline('(2*x(2,2) - sum(x(1:3,1))/3 - sum(x(1:3,3))/3 - x(1,2) - x(3,2)) * ones(size(x))');
    I_diff = blkproc(I, [3 3], F);
    %
    % Vertical edge detector - place results in all blocks
    F1 = inline('(sum(x(1:3,2)) - sum(x(1:3,1))) * ones(size(x))');
    I_vertical = blkproc(I, [3,2], F1);
    ......... rescale and plotting as in Example 10.6 .......

Figure 10.11 shows the images produced by Example 10.7. The "differentiator" and edge detection operators look similar to those produced by the sliding neighborhood operations because they operate on fairly small block sizes. The averaging operator produces images that appear to have large pixels, since the neighborhood average is placed in blocks of 8 by 8 pixels.

The topics covered in this chapter provide a basic introduction to image processing and basic MATLAB formats and operations. In subsequent chapters we use this foundation to develop some useful image processing techniques such as filtering, Fourier and other transformations, and registration (alignment) of multiple images.

PROBLEMS

1. (A) Following the approach used in Example 10.1, generate an image that is a sinusoidal grating in both horizontal and vertical directions (it will look somewhat like a checkerboard). (Hint: This can be done with very few additional instructions.) (B) Combine this image with its inverse as a multiframe image and show it as a movie. Use multiple repetitions. The movie should look like a flickering checkerboard. Submit the two images.

2. Load the x-ray image of the spine (spine.tif) from the MATLAB Image Processing Toolbox. Slice the image into 4 different levels, then plot in pseudocolor using yellow, red, green, and blue for each slice. The 0 level slice should be blue and the highest level slice should be yellow. Use grayslice and construct your own colormap. Plot the original and sliced image in the same figure. (If the "original" image also displays in pseudocolor, it is because the computer display is using the same colormap for both images. In this case, you should convert the sliced image to RGB before displaying.)
3. Load frame 20 from the MRI image (mri.tif) and code it in pseudocolor by coding the image into green and the inverse of the image into blue. Then take a threshold and plot pixels over 80% of maximum as red.

4. Load the image of a cancer cell (from rat prostate, courtesy of Alan W. Partin, M.D., Johns Hopkins University School of Medicine) cell.tif and apply a correction to the intensity values of the image (a gamma correction, described in later chapters). Specifically, modify each pixel in the image by a function that is a quarter-wave sine wave; that is, the corrected pixels are the output of the sine function of the input pixels: Out(m,n) = f(In(m,n)) (see plot below).

FIGURE PROB. 10.4 Correction function to be used in Problem 4. The input pixel values are on the horizontal axis, and the output pixel values are on the vertical axis.

5. Load the blood cell image in blood1.tif. Write a sliding neighborhood function to enhance horizontal boundaries that go from dark to light. Write a second function that enhances boundaries that go from light to dark. Threshold both images so as to enhance the boundaries. Use a 3 by 2 sliding block. (Hint: This program may require several minutes to run. You do not need to rerun the program each time to adjust the threshold for the two binary images.)

6. Load the blood cells in blood1.tif. Apply a distinct block function that replaces all of the values within a block by the maximum value in that block. Use a 4 by 4 block size. Repeat the operation using a function that replaces all the values by the minimum value in the block.
11

Image Processing: Filters, Transformations, and Registration

SPECTRAL ANALYSIS: THE FOURIER TRANSFORM

The Fourier transform and the efficient algorithm for computing it, the fast Fourier transform, extend in a straightforward manner to two (or more) dimensions. The two-dimensional version of the Fourier transform can be applied to images, providing a spectral analysis of the image content. Of course, the resulting spectrum will be in two dimensions, and it is usually more difficult to interpret than a one-dimensional spectrum. Nonetheless, it can be a very useful analysis tool, both for describing the contents of an image and as an aid in the construction of imaging filters as described in the next section. When applied to images, the spatial directions are equivalent to the time variable in the one-dimensional Fourier transform, and the analogous spatial frequency is given in terms of cycles/unit length (i.e., cycles/cm or cycles/inch) or normalized to cycles per sample.

Many of the concerns raised with sampled time data apply to sampled spatial data. For example, undersampling an image will lead to aliasing. In such cases, the spatial frequency content of the original image is greater than fS/2, where fS now is 1/(pixel size). Figure 11.1 shows an example of aliasing in the frequency domain. The upper left-hand image contains a chirp signal increasing in spatial frequency from left to right. The high frequency elements on the right side of this image are adequately sampled in the left-hand image. The same pattern is shown in the upper right-hand image except that the sampling frequency has been reduced by a factor of 6. The right side of this image also contains sinusoidally varying intensities, but at additional frequencies as the aliasing folds other sinusoids on top of those in the original pattern.
FIGURE 11.1 The influence of aliasing due to undersampling on two images with high spatial frequency. The aliased images show additional sinusoidal frequencies in the upper right image and jagged diagonals in the lower right image. (Lower original image from file 'testpat1.png' from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

The lower figures show the influence of aliasing on a diagonal pattern. The jagged diagonals are characteristic of aliasing, as are the moire patterns seen in other images. The problem of determining an appropriate sampling size is even more acute in image acquisition, since oversampling can quickly lead to excessive memory storage requirements.

The two-dimensional Fourier transform in continuous form is a direct extension of the equation given in Chapter 3:

    $F(\omega_1,\omega_2) = \int_{m=-\infty}^{\infty} \int_{n=-\infty}^{\infty} f(m,n)\, e^{-j\omega_1 m}\, e^{-j\omega_2 n}\, dm\, dn$    (1)

The variables ω1 and ω2 are still frequency variables, although they define spatial frequencies and their units are in radians per sample.
As with the time domain spectrum, F(ω1,ω2) is a complex-valued function that is periodic in both ω1 and ω2. Usually only a single period of the spectral function is displayed, as was the case with the time domain analog. The inverse two-dimensional Fourier transform is defined as:

    $f(m,n) = \frac{1}{4\pi^2} \int_{\omega_1=-\pi}^{\pi} \int_{\omega_2=-\pi}^{\pi} F(\omega_1,\omega_2)\, e^{j\omega_1 m}\, e^{j\omega_2 n}\, d\omega_1\, d\omega_2$    (2)

As with the time domain equivalent, this statement is a reflection of the fact that any two-dimensional function can be represented by a series (possibly infinite) of sinusoids, but now the sinusoids extend over the two dimensions. The discrete form of Eqs. (1) and (2) is again similar to their time domain analogs. For an image of size M by N, the discrete Fourier transform becomes:

    $F(p,q) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, e^{-j(2\pi/M)pm}\, e^{-j(2\pi/N)qn}$    (3)

    $p = 0, 1, \ldots, M-1; \quad q = 0, 1, \ldots, N-1$

The values F(p,q) are the Fourier transform coefficients of f(m,n). The discrete form of the inverse Fourier transform becomes:

    $f(m,n) = \frac{1}{MN} \sum_{p=0}^{M-1} \sum_{q=0}^{N-1} F(p,q)\, e^{j(2\pi/M)pm}\, e^{j(2\pi/N)qn}$    (4)

    $m = 0, 1, \ldots, M-1; \quad n = 0, 1, \ldots, N-1$

MATLAB Implementation

Both the Fourier transform and inverse Fourier transform are supported in two (or more) dimensions by MATLAB functions. The two-dimensional Fourier transform is invoked as:

    F = fft2(x,M,N);

where F is the output matrix and x is the input matrix. M and N are optional arguments that specify padding for the vertical and horizontal dimensions, respectively. In the time domain, the frequency spectrum of simple waveforms can usually be anticipated, and the spectra of even relatively complicated waveforms can be readily understood. With two dimensions, it becomes more difficult to visualize the expected Fourier transform even of fairly simple images. In Example 11.1 a simple thin rectangular bar is constructed, and the Fourier transform of the object is computed. The resultant spatial frequency function is plotted both as a three-dimensional function and as an intensity image.
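Before turning to the example, a quick numerical check of Eq. (3) (our sketch, not part of the example) compares fft2 with a direct evaluation of the double sum for a small matrix:

    % Verify Eq. (3) by direct evaluation of the double sum
    M = 4; N = 5;
    f = rand(M,N);
    F_direct = zeros(M,N);
    for p = 0:M-1
      for q = 0:N-1
        for m = 0:M-1
          for n = 0:N-1
            F_direct(p+1,q+1) = F_direct(p+1,q+1) + f(m+1,n+1) * ...
                exp(-j*2*pi*p*m/M) * exp(-j*2*pi*q*n/N);
          end
        end
      end
    end
    max(max(abs(F_direct - fft2(f))))  % Should be near zero (roundoff only)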
Example 11.1 Determine and display the two-dimensional Fourier transform of a thin rectangular object. The object should be 2 by 10 pixels in size and solid white against a black background. Display the Fourier transform as both a function (i.e., as a mesh plot) and as an image plot.

    % Example 11.1 Two-dimensional Fourier transform of a simple
    % object.
    % Construct a simple 2 by 10 pixel rectangular object, or bar.
    % Take the Fourier transform padded to 128 by 128 and plot the
    % result as a 3-dimensional function (using mesh) and as an
    % intensity image.
    %
    % Construct object
    close all; clear all;
    % Construct the rectangular object
    f = zeros(22,30);        % Original figure can be small since it
    f(10:11,10:19) = 1;      % will be padded
    %
    F = fft2(f,128,128);     % Take FT; pad to 128 by 128
    F = abs(fftshift(F));    % Shift center; get magnitude
    %
    imshow(f,'notruesize');  % Plot object
    .....labels..........
    figure;
    mesh(F);                 % Plot Fourier transform as function
    .......labels..........
    figure;
    F = log(F);              % Take log function

FIGURE 11.2A The rectangular object (2 pixels by 10 pixels) used in Example 11.1. The Fourier transform of this image is shown in Figures 11.2B and C.
FIGURE 11.2B Fourier transform of the rectangular object in Figure 11.2A plotted as a function. More energy is seen, particularly at the higher frequencies, along the vertical axis because the object's vertical cross sections appear as a narrow pulse. The broader horizontal cross sections produce frequency characteristics that fall off rapidly at higher frequencies.

    I = mat2gray(F);   % Scale as intensity image
    imshow(I);         % Plot Fourier transform as image

Note that in the above program the image size was kept small (22 by 30) since the image will be padded (with zeros, i.e., black) by fft2. The fft2 routine places the DC component in the upper-left corner. The fftshift routine is used to shift this component to the center of the image for plotting purposes. The log of the function was taken before plotting as an image to improve the grayscale quality in the figure.
FIGURE 11.2C The Fourier transform of the rectangular object in Figure 11.2A plotted as an image. The log of the function was taken before plotting to improve the details. As in the function plot, more high frequency energy is seen in the vertical direction, as indicated by the dark vertical band.

The horizontal chirp signal plotted in Figure 11.1 also produces an easily interpretable Fourier transform, as shown in Figure 11.3. The fact that this image changes in only one direction, the horizontal direction, is reflected in the Fourier transform. The linear increase in spatial frequency in the horizontal direction produces an approximately constant spectral curve in that direction.

The two-dimensional Fourier transform is also useful in the construction and evaluation of linear filters, as described in the following section.

LINEAR FILTERING

The techniques of linear filtering described in Chapter 4 can be directly extended to two dimensions and applied to images. In image processing, FIR filters are usually used because of their linear phase characteristics. Filtering an image is a local, or neighborhood, operation just as it was in signal filtering, although in this case the neighborhood extends in two directions around a given pixel. In image filtering, the value of a filtered pixel is determined from a linear combination of surrounding pixels.
FIGURE 11.3 Fourier transform of the horizontal chirp signal shown in Figure 11.1. The spatial frequency characteristics of this image are zero in the vertical direction since the image is constant in this direction. The linear increase in spatial frequency in the horizontal direction is reflected in the more or less constant amplitude of the Fourier transform in this direction.

For the FIR filters described in Chapter 4, the linear combination for a given FIR filter was specified by the impulse response function, the filter coefficients, b(n). In image filtering, the filter function exists in two dimensions, h(m,n). These two-dimensional filter weights are applied to the image using convolution, in an approach analogous to one-dimensional filtering. The equation for two-dimensional convolution is a straightforward extension of the one-dimensional form (Eq. (15), Chapter 2):

    $y(m,n) = \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} x(k_1,k_2)\, b(m-k_1,\, n-k_2)$    (5)

While this equation would not be difficult to implement using MATLAB statements, MATLAB has a function that implements two-dimensional convolution directly. Using convolution to perform image filtering parallels its use in signal filtering: the image array is convolved with a set of filter coefficients.
However, in image analysis, the filter coefficients are defined in two dimensions, h(m,n). A classic example of a digital image filter is the Sobel filter, a set of coefficients that perform a horizontal spatial derivative operation for enhancement of horizontal edges (or vertical edges if the coefficients are rotated using transposition):

    $h(m,n)_{Sobel} = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$

These two-dimensional filter coefficients are sometimes referred to as the convolution kernel. An example of the application of a Sobel filter to an image is provided in Example 11.2.

When convolution is used to apply a series of weights to either image or signal data, the weights represent a two-dimensional impulse response, and, as with a one-dimensional impulse response, the weights are applied to the data in reverse order, as indicated by the negative sign in the one- and two-dimensional convolution equations (Eq. (15) from Chapter 2 and Eq. (5)).* This can become a source of confusion in two-dimensional applications. Image filtering is easier to conceptualize if the weights are applied directly to the image data in the same orientation. This is possible if digital filtering is implemented using correlation rather than convolution. Image filtering using correlation is a sliding neighborhood operation, where the value of the center pixel is just the weighted sum of neighboring pixels, with the weighting given by the filter coefficients. When correlation is used, the set of weighting coefficients is termed the correlation kernel to distinguish it from the standard filter coefficients. In fact, the operations of correlation and convolution both involve weighted sums of neighboring pixels, and the only difference between correlation kernels and convolution kernels is a 180-degree rotation of the coefficient matrix. MATLAB filter routines use correlation kernels because their application is easier to conceptualize.

MATLAB Implementation

Two-dimensional convolution is implemented using the routine conv2:

    I2 = conv2(I1, h, shape);

where I1 and h are image and filter coefficients (or two images, or simply two matrices) to be convolved, and shape is an optional argument that controls the size of the output image.

*In one dimension, this is equivalent to applying the weights in reverse order. In two dimensions, this is equivalent to rotating the filter matrix by 180 degrees before multiplying corresponding pixels and coefficients.
If shape is 'full', the default, then the size of the output matrix follows the same rules as in one-dimensional convolution: each dimension of the output is the sum of the two matrix lengths along that dimension minus one. Hence, if the two matrices have sizes I1(M1, N1) and h(M2, N2), the output size is I2(M1 + M2 − 1, N1 + N2 − 1). If shape is 'valid', then any pixel evaluation that requires image padding is ignored, and the size of the output image is I2(M1 − M2 + 1, N1 − N2 + 1). Finally, if shape is 'same', the output matrix is the same size as I1; that is, I2(M1, N1). These options allow a great deal of flexibility and can simplify the use of two-dimensional convolution; for example, the 'same' option can eliminate the need for dealing with the additional points generated by convolution.

Two-dimensional correlation is implemented with the routine imfilter, which provides even greater flexibility and convenience in dealing with size and boundary effects. The calling structure of this routine is:

    I2 = imfilter(I1, h, options);

where again I1 and h are the input matrices, and options can include up to three separate control options. One option controls the size of the output array using the same terms as in conv2 above: 'same' and 'full' ('valid' is not valid in this routine!). With imfilter the default output size is 'same' (not 'full'), since this is the more likely option in image analysis. The second possible option controls how the edges are treated. If a constant is given, then the edges are padded with the value of that constant. The default is to use a constant of zero (i.e., standard zero padding). The boundary option 'symmetric' uses a mirror reflection of the end points, as shown in Figure 2.10. Similarly, the option 'circular' uses periodic extension, also shown in Figure 2.10. The last boundary control option is 'replicate', which pads using the nearest edge pixel. When the image is large, the influence of the various border control options is subtle, as shown in Example 11.3. A final option specifies the use of convolution instead of correlation. If this option is activated by including the argument 'conv', imfilter is redundant with conv2 except for the options and defaults. The imfilter routine will accept all of the data formats and types defined in the previous chapter and produces an output in the same format; however, filtering is not usually appropriate for indexed images. In the case of RGB images, imfilter operates on all three image planes.
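The fragment below (a sketch, not from the text) makes the size and boundary options concrete; I can be any grayscale image and h any small correlation kernel:

    % Compare imfilter output size and boundary options
    h = ones(3,3)/9;                      % 3 by 3 moving average
    I_same = imfilter(I, h);              % Default: 'same' size, zero padding
    I_full = imfilter(I, h, 'full');      % Expanded output, as in conv2
    I_rep = imfilter(I, h, 'replicate');  % Pad with the nearest edge pixel
    I_sym = imfilter(I, h, 'symmetric');  % Pad by mirror reflection
    I_conv = imfilter(I, h, 'conv');      % Convolution instead of correlation
                                          % (identical here: h is symmetric)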
Filter Design

The MATLAB Image Processing Toolbox provides considerable support for generating the filter coefficients.*

*Since MATLAB's preferred implementation of image filters is through correlation, not convolution, MATLAB's filter design routines generate correlation kernels. We use the term "filter coefficient" for either kernel format.

A number of filters can be generated using MATLAB's fspecial routine:

    h = fspecial(type, parameters);

where type specifies a specific filter and the optional parameters are related to the filter selected. Filter type options include: 'gaussian', 'disk', 'sobel', 'prewitt', 'laplacian', 'log', 'average', and 'unsharp'. The 'gaussian' option produces a Gaussian lowpass filter. The equation for this filter is similar to the equation for the Gaussian distribution:

    $h(m,n) = e^{-d^2/2\sigma^2}$, where $d = \sqrt{m^2 + n^2}$

This filter has particularly desirable properties when applied to an image: it provides an optimal compromise between smoothness and filter sharpness. The MATLAB routine for this filter accepts two parameters: the first specifies the filter size (the default is 3) and the second the value of sigma. The value of sigma will influence the cutoff frequency, while the size of the filter determines the number of pixels over which the filter operates. In general, the size should be 3 to 5 times the value of sigma.

Both the 'sobel' and 'prewitt' options produce a 3 by 3 filter that enhances horizontal edges (or vertical if transposed). The 'unsharp' option produces a contrast enhancement filter. This filter is also termed unsharp masking because it actually suppresses low spatial frequencies, where the low frequencies are presumed to be the unsharp frequencies; in fact, it is a special highpass filter. This filter has a parameter that specifies the shape of the highpass characteristic. The 'average' filter simply produces a constant set of weights, each of which equals 1/N, where N is the number of elements in the filter (the default size of this filter is 3 by 3, in which case the weights are all 1/9 = 0.1111). The filter coefficients for a 3 by 3 Gaussian lowpass filter (sigma = 0.5) and the unsharp filter (alpha = 0.2) are shown below:

    $h_{unsharp} = \begin{bmatrix} -0.1667 & -0.6667 & -0.1667 \\ -0.6667 & 4.3333 & -0.6667 \\ -0.1667 & -0.6667 & -0.1667 \end{bmatrix}$

    $h_{gaussian} = \begin{bmatrix} 0.0113 & 0.0838 & 0.0113 \\ 0.0838 & 0.6193 & 0.0838 \\ 0.0113 & 0.0838 & 0.0113 \end{bmatrix}$

The 'laplacian' filter is used to take the second derivative of an image. The 'log' filter is the Laplacian of Gaussian: it first smooths the image with a Gaussian and then takes the Laplacian (second derivative), which reduces the noise sensitivity of the plain Laplacian.
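For instance (our sketch), a Gaussian lowpass filter with a sigma of 2 pixels, sized at about 4 times sigma per the guideline above, could be generated and applied as follows:

    % Generate and apply a Gaussian lowpass filter
    sigma = 2;                                         % Cutoff-related parameter
    h_gauss = fspecial('gaussian', 4*sigma+1, sigma);  % 9 by 9 kernel
    I_smooth = imfilter(I, h_gauss, 'replicate');      % I: any grayscale image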
MATLAB also provides a routine to transform one-dimensional FIR filters, such as those described in Chapter 4, into two-dimensional filters. This approach is termed the frequency transform method and preserves most of the characteristics of the one-dimensional filter, including the transition bandwidth and ripple features. The frequency transformation method is implemented using:

    h = ftrans2(b);

where h are the output filter coefficients (given in correlation kernel format) and b are the filter coefficients of a one-dimensional filter. The latter could be produced by any of the FIR routines described in Chapter 4 (i.e., fir1, fir2, or remez). The function ftrans2 can take an optional second argument that specifies the transformation matrix, the matrix that converts the one-dimensional coefficients to two dimensions. The default transformation is the McClellan transformation, which produces a nearly circular pattern of filter coefficients. This approach brings a great deal of power and flexibility to image filter design, since it couples all of the FIR filter design approaches described in Chapter 4 to image filtering.

The two-dimensional Fourier transform described above can be used to evaluate the frequency characteristics of a given filter. In addition, MATLAB supplies a two-dimensional version of freqz, termed freqz2, that is slightly more convenient to use since it also handles the plotting. The basic call is:

    [H, fx, fy] = freqz2(h, Ny, Nx);

where h contains the two-dimensional filter coefficients and Nx and Ny specify the size of the desired frequency plot. The output argument, H, contains the two-dimensional frequency spectrum and fx and fy are plotting vectors; however, if freqz2 is called with no output arguments, then it generates the frequency plot directly. The examples presented below do not take advantage of this function, but simply use the two-dimensional Fourier transform for filter evaluation.
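As a simple illustration (our sketch, not one of the text's examples), the two-dimensional response of a transformed one-dimensional filter can be displayed directly:

    % View the 2-D frequency response of a transformed 1-D FIR filter
    b = fir1(16, .25);  % 16th-order lowpass, cutoff .25 fs/2 (Chapter 4)
    h = ftrans2(b);     % Extend to two dimensions
    freqz2(h);          % With no output arguments, plots the response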
  • 94. FIGURE 11.4A MRI image of the brain before and after application of two filters from MATLAB’s fspecial routine. Upper right: Image sharpening using the filter unsharp. Lower images: Edge detection using the sobel filter for horizontal edges (left) and for both horizontal and vertical edges (right). (Original image from MATLAB. Image Processing Toolbox. Copyright 1993–2003, The Math Works, Inc. Reprinted with permission.) % 3 by 3 Sobel edge detector filters to enhance horizontal and % vertical edges. % Combine the two edge detected images % clear all; close all; % frame = 17; % Load MRI frame 17 [I(:,:,:,1), map ] = imread(’mri.tif’, frame); Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
  • 95. FIGURE 11.4B Frequency characteristics of the unsharp and Sobel filters used in Example 11.2. if isempty(map) == 0 % Usual check and I = ind2gray(I,map); % conversion if % necessary. else I = im2double(I); end % h_unsharp = fspecial(’unsharp’,.5); % Generate ‘unsharp’ I_unsharp = imfilter(I,h_unsharp); % filter coef. and % apply % h_s = fspecial(’Sobel’); % Generate basic Sobel % filter. I_sobel_horin = imfilter(I,h_s); % Apply to enhance I_sobel_vertical = imfilter(I,h_s’); % horizontal and % vertical edges % % Combine by converting to binary and or-ing together I_sobel_combined = im2bw(I_sobel_horin) * ... im2bw(I_sobel_vertical); % subplot(2,2,1); imshow(I); % Plot the images title(’Original’); subplot(2,2,2); imshow(I_unsharp); title(’Unsharp’); subplot(2,2,3); imshow(I_sobel_horin); Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
  • 96. title(’Horizontal Sobel’); subplot(2,2,4); imshow(I_sobel_combined); title(’Combined Image’); figure; % % Now plot the unsharp and Sobel filter frequency % characteristics F= fftshift(abs(fft2(h_unsharp,32,32))); subplot(1,2,1); mesh(1:32,1:32,F); title(’Unsharp Filter’); view([-37,15]); % F = fftshift(abs(fft2(h_s,32,32))); subplot(1,2,2); mesh(1:32,1:32,F); title(’Sobel Filter’); view([-37,15]); The images produced by this example program are shown below along with the frequency characteristics associated with the ‘unsharp’ and ‘sobel’ filter. Note that the ‘unsharp’ filter has the general frequency characteristics of a highpass filter, that is, a positive slope with increasing spatial frequencies (Figure 11.4B). The double peaks of the Sobel filter that produce edge enhance- ment are evident in Figure 11.4B. Since this is a magnitude plot, both peaks appear as positive. In Example 11.3, routine ftrans2 is used to construct two-dimensional filters from one-dimensional FIR filters. Lowpass and highpass filters are con- structed using the filter design routine fir1 from Chapter 4. This routine gener- ates filter coefficients based on the ideal rectangular window approach described in that chapter. Example 11.3 also illustrates the use of an alternate padding technique to reduce the edge effects caused by zero padding. Specifically, the ‘replicate’ option of imfilter is used to pad by repetition of the last (i.e., image boundary) pixel value. This eliminates the dark border produced by zero padding, but the effect is subtle. Example 11.3 Example of the application of standard one-dimensional FIR filters extended to two dimensions. The blood cell images (blood1.tif) are loaded and filtered using a 32 by 32 lowpass and highpass filter. The one- dimensional filter is based on the rectangular window filter (Eq. (10), Chapter 4), and is generated by fir. It is then extended to two dimensions using ftrans2. % Example 11.3 and Figure 11.5A and B % Linear filtering. Load the blood cell image % Apply a 32nd order lowpass filter having a bandwidth of .125 % fs/2, and a highpass filter having the same order and band- % width. Implement the lowpass filter using ‘imfilter’ with the Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
  • 97. FIGURE 11.5A Image of blood cells before and after lowpass and highpass filter- ing. The upper lowpass image (upper right) was filtered using zero padding, which produces a slight black border around the image. Padding by extending the edge pixel eliminates this problem (lower left). (Original Image reprinted with permission from The Image Processing Handbook, 2nd edition. Copyright CRC Press, Boca Raton, Florida.) % zero padding (the default) and with replicated padding % (extending the final pixels). % Plot the filter characteristics of the high and low pass filters % % Load the image and transform if necessary clear all; close all; N = 32; % Filter order w_lp = .125; % Lowpass cutoff frequency Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
  • 98. FIGURE 11.5B Frequency characteristics of the lowpass (left) and highpass (right) filters used in Figure 11.5A. w_hp = .125; % Highpass cutoff frequency .......load image blood1.tif and convert as in Example 11.2 ...... % b = fir1(N,w_lp); % Generate the lowpass filter h_lp = ftrans2(b); % Convert to 2-dimensions I_lowpass = imfilter(I,h_lp); % and apply with, % and without replication I_lowpass_rep = imfilter (I,h_lp,’replicate’); b = fir1(N,w_hp,’high’); % Repeat for highpass h_hp = ftrans2(b); I_highpass = imfilter(I, h_hp); I_highpass = mat2gray(I_highpass); % ........plot the images and filter characteristics as in Example 11.2....... The figures produced by this program are shown below (Figure 11.5A and B). Note that there is little difference between the image filtered using zero padding and the one that uses extended (‘replicate’) padding. The highpass filtered image shows a slight derivative-like characteristic that enhances edges. In the plots of frequency characteristics, Figure 11.5B, the lowpass and highpass filters appear to be circular, symmetrical, and near opposites. The problem of aliasing due to downsampling was discussed above and Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
  • 99. demonstrated in Figure 11.1. Such problems can occur whenever an image is displayed at a smaller size that requires fewer pixels, for example, when an image is reduced during resizing of a computer window. Lowpass filtering can be, and is, used to prevent aliasing when an image is downsized. In fact, MATLAB automatically performs lowpass filtering when downsizing an image. Example 11.4 demonstrates the ability of lowpass filtering to reduce aliasing when downsampling is performed.

Example 11.4 Use lowpass filtering to reduce aliasing due to downsampling. Load the radial pattern ('testpat1.png') and downsample it by a factor of six as was done in Figure 11.1. In addition, downsample that image by the same amount, but after it has been lowpass filtered. Plot the two downsampled images side-by-side. Use a 32 by 32 FIR rectangular window lowpass filter. Set the cutoff frequency to be as high as possible and still eliminate most of the aliasing.

% Example 11.4 and Figure 11.6
% Example of the ability of lowpass filtering to reduce aliasing.
% Downsample the radial pattern with and without prior lowpass
% filtering.
% Use a cutoff frequency sufficient to reduce aliasing.
%
clear all; close all;
N = 32;                      % Filter order
w = .5;                      % Cutoff frequency (see text)

FIGURE 11.6 Two images of the radial pattern shown in Figure 11.1 after downsampling by a factor of 6. The right-hand image was filtered by a lowpass filter before downsampling.
  • 100. dwn = 6;                    % Downsampling coefficient
b = fir1(N,w);               % Generate the lowpass filter
h = ftrans2(b);              % Convert to 2 dimensions
%
[I, map] = imread('testpat1.png');     % Load image
I_lowpass = imfilter(I,h);   % Lowpass filter image
[M,N] = size(I);
%
I = I(1:dwn:M,1:dwn:N);      % Downsample unfiltered image
subplot(1,2,1); imshow(I);   % and display
title('No Filtering');
% Downsample filtered image and display
I_lowpass = I_lowpass(1:dwn:M,1:dwn:N);
subplot(1,2,2); imshow(I_lowpass);
title('Lowpass Filtered');

The lowpass cutoff frequency used in Example 11.4 was determined empirically. Although the cutoff frequency was fairly high (fS/4), this filter still produced a substantial reduction in aliasing in the downsampled image.

SPATIAL TRANSFORMATIONS

Several useful transformations take place entirely in the spatial domain. Such transformations include image resizing, rotation, cropping, stretching, shearing, and image projections. Spatial transformations perform a remapping of pixels and often require some form of interpolation in addition to possible anti-aliasing. The primary approach to anti-aliasing is lowpass filtering, as demonstrated above. For interpolation, three methods are popularly used in image processing, and MATLAB supports all three. All three interpolation strategies use the same basic approach: the interpolated pixel in the output image is the weighted sum of pixels in the vicinity of the original pixel after transformation. The methods differ primarily in how many neighbors are considered.

As mentioned above, spatial transforms involve a remapping of one set of pixels (i.e., the image) to another. In this regard, the original image can be considered the input to the remapping process and the transformed image the output of this process. If images were continuous, then remapping would not require interpolation, but the discrete nature of pixels usually necessitates interpolation.* The simplest interpolation method is the nearest neighbor method, in which the output pixel is assigned the value of the closest pixel in the transformed image, Figure 11.7. If the transformed image is larger than the original and involves more pixels, then a remapped input pixel may fall into two or

*A few transformations may not require interpolation, such as rotation by 90 or 180 degrees.
  • 101. FIGURE 11.7 A rotation transform using the nearest neighbor interpolation method. Pixel values in the output image (solid grid) are assigned values from the nearest pixel in the transformed input image (dashed grid).

more output pixels. In the bilinear interpolation method, the output pixel is the weighted average of transformed pixels in the nearest 2 by 2 neighborhood, and in bicubic interpolation the weighted average is taken over a 4 by 4 neighborhood. Computational complexity and accuracy increase with the number of pixels that are considered in the interpolation, so there is a trade-off between quality and computational time.

In MATLAB, the functions that require interpolation have an optional argument that specifies the method. For most functions, the default method is nearest neighbor. This method produces acceptable results on all image classes and is the only method appropriate for indexed images. The method is also the most appropriate for binary images. For RGB and intensity image classes, the bilinear or bicubic interpolation methods are recommended since they lead to better results.

MATLAB provides several routines that can be used to generate a variety of complex spatial transformations such as image projections or specialized distortions. These transformations can be particularly useful when trying to overlay (register) images of the same structure taken at different times or with different modalities (e.g., PET scans and MRI images). While MATLAB's spatial transformation routines allow for any imaginable transformation, only two types of transformation will be discussed here: affine transformations and projective transformations. Affine transformations are defined as transformations in which straight lines remain straight and parallel lines remain parallel, but rectangles may become parallelograms. These transformations include rotation, scaling, stretching, and shearing. In projective transformations, straight lines still remain straight, but parallel lines often converge toward vanishing points. These transformations are discussed in the following MATLAB implementation section.
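Before turning to the MATLAB implementation, the three interpolation strategies described above can be illustrated directly with the general-purpose interp2 routine from basic MATLAB. The following is only a minimal sketch, not part of the text's examples: it assumes the blood cell image used earlier is available and has been converted to class double, and the upsampling factor is an arbitrary choice.

% Minimal sketch of the three interpolation strategies using
% interp2. Assumes blood1.tif is available; the 2x upsampling
% factor is an arbitrary illustration.
I = im2double(imread('blood1.tif'));
[M, N] = size(I);
[X, Y] = meshgrid(1:N, 1:M);           % Original pixel grid
[Xq, Yq] = meshgrid(1:.5:N, 1:.5:M);   % Finer grid: 2x upsampling
I_nn = interp2(X, Y, I, Xq, Yq, 'nearest');  % Closest pixel value
I_bl = interp2(X, Y, I, Xq, Yq, 'linear');   % 2 by 2 weighted average
I_bc = interp2(X, Y, I, Xq, Yq, 'cubic');    % 4 by 4 neighborhood
subplot(1,3,1); imshow(I_nn); title('Nearest');
subplot(1,3,2); imshow(I_bl); title('Bilinear');
subplot(1,3,3); imshow(I_bc); title('Bicubic');

The nearest neighbor result shows blocky edges at this magnification, while the bilinear and bicubic results are smoother at a higher computational cost, consistent with the trade-off described above.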
  • 102. MATLAB Implementation

Affine Transformations

MATLAB provides a procedure, described below, for implementing any affine transformation; however, some of these transformations are so popular that they are supported by separate routines. These include image resizing, cropping, and rotation. Image resizing and cropping are both techniques to change the dimensions of an image: the latter is interactive using the mouse and display, while the former is under program control. To change the size of an image, MATLAB provides the imresize command given below.

I_resize = imresize(I, arg or [M N], method);

where I is the original image and I_resize is the resized image. If the second argument is a scalar, arg, it gives a magnification factor; if it is a vector, [M N], it indicates the desired new dimensions in vertical and horizontal pixels, M, N. If arg > 1, then the image is increased (magnified) in size proportionally, and if arg < 1, it is reduced in size (minified) proportionally. If the vector [M N] is used to specify the output size, image proportions can be modified: the image can be stretched or compressed along a given dimension. The argument method specifies the type of interpolation to be used and can be 'nearest', 'bilinear', or 'bicubic', referring to the three interpolation methods described above. The nearest neighbor method ('nearest') is the default. If image size is reduced, then imresize automatically applies an anti-aliasing lowpass filter unless the interpolation method is 'nearest', i.e., the default. The logic of this is that the nearest neighbor interpolation method would usually only be used with indexed images, and lowpass filtering is not really appropriate for these images.

Image cropping is an interactive command:

I_resize = imcrop;

The imcrop routine waits for the operator to draw an on-screen cropping rectangle using the mouse. The current image is resized to include only the image within the rectangle.

Image rotation is straightforward using the imrotate command:

I_rotate = imrotate(I, deg, method, bbox);

where I is the input image, I_rotate is the rotated image, deg is the degrees of rotation (counterclockwise if positive, and clockwise if negative), and method describes the interpolation method as in imresize. Again, the nearest neighbor
  • 103. method is the default, even though the other methods are preferred except for indexed images. After rotation, the image will not, in general, fit into the same rectangular boundary as the original image. In this situation, the rotated image can be cropped to fit within the original boundaries, or the image size can be increased to fit the rotated image. Specifying the bbox argument as 'crop' will produce a cropped image having the dimensions of the original image, while setting bbox to 'loose' will produce a larger image that contains the entire original, unrotated, image. The loose option is the default. In either case, additional pixels will be required to fit the rotated image into a rectangular space (except for orthogonal rotations), and imrotate pads these with zeros, producing a black background to the rotated image (see Figure 11.8). Application of the imresize and imrotate routines is shown in Example 11.5 below, and a brief sketch of their calling structures follows Figure 11.8. Application of imcrop is presented in one of the problems at the end of this chapter.

FIGURE 11.8 Two spatial transformations (horizontal stretching and rotation) applied to an image of bone marrow. The rotated images are cropped either to include the full image (lower left), or to have the same dimensions as the original image (lower right). Stained image courtesy of Alan W. Partin, M.D., Ph.D., Johns Hopkins University School of Medicine.
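As a quick illustration of the calling structures just described, the fragment below resizes and rotates an image both ways. This is a minimal sketch, not the code of Example 11.5: it assumes an intensity image I is already in the workspace, and the magnification factor and rotation angle are arbitrary choices.

% Minimal sketch of imresize and imrotate (assumes an intensity
% image I is already loaded; parameter values are arbitrary).
I_big = imresize(I, 1.5, 'bilinear');             % Magnify by 50%
[M, N] = size(I);
I_wide = imresize(I, [M round(1.25*N)], 'bilinear');  % Stretch horizontally
I_rot = imrotate(I, 30, 'bilinear', 'loose');     % Rotate 30 deg CCW, enlarge
I_rot_crop = imrotate(I, 30, 'bilinear', 'crop'); % Keep original dimensions
subplot(2,2,1); imshow(I_big); title('Magnified');
subplot(2,2,2); imshow(I_wide); title('Stretched');
subplot(2,2,3); imshow(I_rot); title('Rotated, loose');
subplot(2,2,4); imshow(I_rot_crop); title('Rotated, cropped');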
  • 104. Example 11.5 Demonstrate resizing and rotation spatial transformations. Load the image of stained tissue (hestain.png) and transform it so that the horizontal dimension is 25% longer than in the original, keeping the vertical dimension unchanged. Rotate the original image 45 degrees clockwise, with and without cropping. Display the original and transformed images in a single figure.

% Example 11.5 and Figure 11.8
% Example of various spatial transformations
% Input the image of stained tissue (hestain.png) and perform
% two spatial transformations:
% 1) Stretch the image by 25% in the horizontal direction;
% 2) Rotate the image clockwise by 45 deg. with and without
% cropping.
% Display the original and transformed images.
%
.......read image and convert if necessary .......
%
% Rotate image with and without cropping
I_rotate = imrotate(I,-45,'bilinear');
I_rotate_crop = imrotate(I,-45,'bilinear','crop');
%
[M N] = size(I);
% Stretch by 25% horizontally
I_stretch = imresize(I,[M N*1.25],'bilinear');
%
.......display the images .........

The images produced by this code are shown in Figure 11.8.

General Affine Transformations

In the MATLAB Image Processing Toolbox, both affine and projective spatial transformations are defined by a Tform structure, which is constructed using one of two routines: the routine maketform uses parameters supplied by the user to construct the transformation, while cp2tform uses control points, or landmarks, placed on different images to generate the transformation. Both routines are very flexible and powerful, but that also means they are quite involved. This section describes aspects of the maketform routine, while the cp2tform routine will be presented in context with image registration.

Irrespective of the way in which the desired transformation is specified, it is implemented using the routine imtransform. This routine is only slightly less complicated than the transformation specification routines, and only some of its features will be discussed here. (The associated help file should be consulted for more detail.) The basic calling structure used to implement the spatial transformation is:
  • 105. B = imtransform(A, Tform, 'Param1', value1, 'Param2', value2, ...);

where A and B are the input and output arrays, respectively, and Tform provides the transformation specifications as generated by maketform or cp2tform. The additional arguments are optional. The optional parameters are specified as pairs of arguments: a string containing the name of the optional parameter (i.e., 'Param1') followed by its value.* These parameters can (1) specify the pixels to be used from the input image (the default is the entire image), (2) permit a change in pixel size, (3) specify how to fill any extra background pixels generated by the transformation, and (4) specify the size and range of the output array. Only the parameters that specify output range will be discussed here, as they can be used to override the automatic rescaling of image size performed by imtransform. To specify output image range and size, the parameters 'XData' and 'YData' are each followed by a two-element vector that gives the x or y coordinates of the first and last elements of the output array, B. To keep the size and range in the output image the same as in the input image, simply specify the horizontal and vertical size of the input array, i.e.:

[M N] = size(A);
...
B = imtransform(A, Tform, 'Xdata', [1 N], 'Ydata', [1 M]);

As with the transform specification routines, imtransform uses the spatial coordinate system described at the beginning of Chapter 10. In this system, the first dimension is the x coordinate while the second is the y, the reverse of the matrix subscripting convention used by MATLAB. (However, the y coordinate still increases in the downward direction.) In addition, non-integer values for x and y indexes are allowed.

The routine maketform can be used to generate the spatial transformation descriptor, Tform. There are two alternative approaches to specifying the transformation, but the most straightforward uses simple geometrical objects to define the transformation. The calling structure under this approach is:

Tform = maketform('type', U, X);

where 'type' defines the type of transformation and U and X are vectors that define the specific transformation by defining the input (U) and output (X) geometries. While maketform supports a variety of transformation types, including

*This is a common approach used in many MATLAB routines when a large number of arguments are possible, especially when many of these arguments are optional. It allows the arguments to be specified in any order.
  • 106. custom, user-defined types, only the affine and projective transformations will be discussed here. These are specified by the type parameters 'affine' and 'projective'.

Only three points are required to define an affine transformation, so, for this transformation type, U and X define corresponding vertices of input and output triangles. Specifically, U and X are 3 by 2 matrices where each two-column row defines a corresponding vertex that maps input to output geometry. For example, to stretch an image vertically, define an output triangle that is taller than the input triangle. Assuming an input image of size M by N, to increase the vertical dimension by 50% define the input (U) and output (X) triangles as:

U = [1, 1; 1, M; N, M];
X = [1, 1-.5*M; 1, M; N, M];

In this example, the input triangle, U, is simply the upper left, lower left, and lower right corners of the image. The output triangle, X, has its top left vertex raised by 50%. (Recall that the coordinate pairs are given as x,y and that the y coordinate increases in the downward direction, so extending the image upward leads to negative y values. Negative coordinates are acceptable.) To increase the vertical dimension symmetrically, change X to:

X = [1, 1-.25*M; 1, 1.25*M; N, 1.25*M];

In this case, the upper vertex is raised by only 25%, and the two lower vertices are lowered in the y direction by increasing the y coordinate values by 25%. This transformation could be done with imresize, but that would also change the dimensions of the output image. When this transform is implemented with imtransform, it is possible to control output size as described above. Hence this approach, although more complicated, allows greater control of the transformation. Of course, if output image size is kept the same, the contents of the original image, when stretched, may exceed the boundaries of the image and will be lost. An example of the use of this approach to change image proportions is given in Problem 6.

The maketform routine can be used to implement other affine transformations such as shearing. For example, to shear an image to the left, define an output triangle that is skewed by the desired amount with respect to the input triangle, Figure 11.9. In Figure 11.9, the input triangle is specified as U = [N/2 1; 1 M; N M] (solid line) and the output triangle as X = [1 1; 1 M; N M] (solid line). This shearing transform is implemented in Example 11.6.

Projective Transformations

In projective transformations, straight lines remain straight but parallel lines may converge. Projective transformations can be used to give objects perspective. Projective transformations require four points for definition; hence, the
  • 107. FIGURE 11.9 An affine transformation can be defined by three points. The transformation shown here is defined by an input (left) and output (right) triangle and produces a sheared image. M,N are indicated in this figure as row, column, but are actually specified in the algorithm in reverse order, as x,y. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

defining geometrical objects are quadrilaterals. Figure 11.10 shows a projective transformation in which the original image would appear to be tilted back. In this transformation, vertical lines in the original image would converge in the transformed image. In addition to adding perspective, these transformations are of value in correcting for relative tilts between image planes during image registration. In fact, most of these spatial transformations will be revisited in the section on image registration. Example 11.6 illustrates the use of these general image transformations for affine and projective transformations.

Example 11.6 General spatial transformations. Apply affine and projective spatial transformations to one frame of the MRI image in mri.tif. The affine transformation should skew the top of the image to the left, just as shown in Figure 11.9. The projective transformation should tilt the image back as shown in Figure 11.10. This example will also use a projective transformation to tilt the image forward, opposite to that shown in Figure 11.10.

After the image is loaded, the affine input triangle is defined as an isosceles triangle inscribed within the full image. The output triangle is defined by shifting the top point to the left side, so the output triangle is now a right triangle (see Figure 11.9). In the projective transformation, the input quadrilateral is a
  • 108. FIGURE 11.10 Specification of a projective transformation by defining two quadrilaterals. The solid lines define the input quadrilateral and the dashed lines define the desired output quadrilateral.

rectangle the same size as the input image. The output quadrilateral is generated by moving the upper points inward and down by an equal amount, while the lower points are moved outward and up, also by a fixed amount. The second projective transformation is achieved by reversing the operations performed on the corners.

% Example 11.6 General spatial transformations
% Load a frame of the MRI image (mri.tif)
% and perform three spatial transformations:
% 1) An affine transformation that shears the image to the left
% 2) A projective transformation that tilts the image backward
% 3) A projective transformation that tilts the image forward
clear all; close all;
%
% .......load frame 18 .......
%
% Define affine transformation
U1 = [N/2 1; 1 M; N M];      % Input triangle
X1 = [1 1; 1 M; N M];        % Output triangle
% Generate transform
Tform1 = maketform('affine', U1, X1);
% Apply transform
I_affine = imtransform(I, Tform1, 'Size', [M N]);
%
% Define projective transformation vectors
offset = .25*N;
U = [1 1; 1 M; N M; N 1];    % Input quadrilateral
  • 109. X = [1-offset 1+offset; 1+offset M-offset; ...
    N-offset M-offset; N+offset 1+offset];
%
% Define transformation based on vectors U and X
Tform2 = maketform('projective', U, X);
I_proj1 = imtransform(I, Tform2, 'Xdata', [1 N], 'Ydata', ...
    [1 M]);
%
% Second transformation. Define new output quadrilateral
X = [1+offset 1+offset; 1-offset M-offset; ...
    N+offset M-offset; N-offset 1+offset];
% Generate transform
Tform3 = maketform('projective', U, X);
% Apply transform
I_proj2 = imtransform(I, Tform3, 'Xdata', [1 N], 'Ydata', [1 M]);
% .......display images .......

The images produced by this code are shown in Figure 11.11. Of course, a great many other transforms can be constructed by redefining the output (or input) triangles or quadrilaterals. Some of these alternative transformations are explored in the problems.

All of these transforms can be applied to produce a series of images having slightly different projections. When these multiple images are shown as a movie, they will give an object the appearance of moving through space, perhaps in three dimensions. The last three problems at the end of this chapter explore these features. The following example demonstrates the construction of such a movie.

Example 11.7 Construct a series of projective transformations that, when shown as a movie, give the appearance of the image tilting backward in space. Use one of the frames of the MRI image.

Solution The code below uses the projective transformation to generate a series of images that appear to tilt back because of the geometry used. The approach is based on the second projective transformation in Example 11.6, but adjusts the transformation to produce a slightly larger apparent tilt in each frame. The program fills 24 frames in such a way that the first 12 have increasing angles of tilt and the last 12 frames have decreasing tilt. When shown as a movie, the image will appear to rock backward and forward. This same approach will also be used in Problem 7. Note that as the images are being generated by imtransform, they are converted to indexed images using gray2ind since this is the format required by immovie. The grayscale map generated by gray2ind is used (at the default level of 64), but any other map could be substituted in immovie to produce a pseudocolor image.
  • 110. FIGURE 11.11 Original MR image and three spatial transformations. Upper right: An affine transformation that shears the image to the left. Lower left: A projective transformation in which the image is made to appear tilted forward. Lower right: A projective transformation in which the image is made to appear tilted backward. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

% Example 11.7 Spatial transformation movie
% Load a frame of the MRI image (mri.tif). Use the projective
% transformation to make a movie of the image as it tilts
% horizontally.
%
clear all; close all;
Nu_frame = 12;               % Number of frames in each direction
Max_tilt = .5;               % Maximum tilt achieved
........load MRI frame 12 as in previous examples .......
  • 111. %
U = [1 1; 1 M; N M; N 1];    % Input quadrilateral
for i = 1:Nu_frame           % Construct Nu_frame * 2 movie frames
    % Define projective transformation; vary offset up to Max_tilt
    offset = Max_tilt*N*i/Nu_frame;
    X = [1+offset 1+offset; 1-offset M-offset; ...
        N+offset M-offset; N-offset 1+offset];
    Tform2 = maketform('projective', U, X);
    [I_proj(:,:,1,i), map] = gray2ind(imtransform(I, Tform2, ...
        'Xdata', [1 N], 'Ydata', [1 M]));
    % Make image tilt back and forth
    I_proj(:,:,1,2*Nu_frame+1-i) = I_proj(:,:,1,i);
end
%
% Display first 12 images as a montage
montage(I_proj(:,:,:,1:12), map);
mov = immovie(I_proj, map);  % Display as movie
movie(mov, 5);

While it is not possible to show the movie that is produced by this code, the various frames are shown as a montage in Figure 11.12. The last three problems in the problem set explore the use of spatial transformations in combination to make movies.

IMAGE REGISTRATION

Image registration is the alignment of two or more images so they best superimpose. This task has become increasingly important in medical imaging as it is used for merging images acquired using different modalities (for example, MRI and PET). Registration is also useful for comparing images taken of the same structure at different points in time. In functional magnetic resonance imaging (fMRI), image alignment is needed for images taken sequentially in time as well as between images that have different resolutions. To achieve the best alignment, it may be necessary to transform the images using any or all of the transformations described previously. Image registration can be quite challenging even when the images are identical or very similar (as will be the case in the examples and problems given here). Frequently the images to be aligned are not that similar, perhaps because they have been acquired using different modalities. The difficulty in accurately aligning images that are only moderately similar presents a significant challenge to image registration algorithms, so the task is often aided by human intervention or the use of embedded markers for reference.

Approaches to image registration can be divided into two broad categories: unassisted image registration, where the algorithm generates the alignment without human intervention, and interactive registration, where a human operator
  • 112. FIGURE 11.12 Montage display of the movie produced by the code in Example 11.7. The various projections give the appearance of the brain slice tilting and moving back in space. Only half of the 24 frames are shown here, as the rest are the same, just presented in reverse order to give the appearance of the brain rocking back and forth. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)
  • 113. guides or aids the registration process. The former approach usually relies on some optimization technique to maximize the correlation between the images. In the latter approach, a human operator may aid the alignment process by selecting corresponding reference points in the images to be aligned: corresponding features are identified by the operator and tagged using some interactive graphics procedure. This approach is well supported in MATLAB's Image Processing Toolbox. Both of these approaches are demonstrated in the examples and problems.

Unaided Image Registration

Unaided image registration usually involves the application of an optimization algorithm to maximize the correlation, or other measure of similarity, between the images. In this strategy, the appropriate transformation is applied to one of the images, the input image, and a comparison is made between this transformed image and the reference image (also termed the base image). The optimization routine seeks to vary the transformation in some manner until the comparison is the best possible. The problem with this approach is the same as with all optimization techniques: the optimization process may converge on a sub-optimal solution (a so-called local maximum), not the optimal solution (the global maximum). Often the solution achieved depends on the starting values of the transformation variables. An example of convergence to a sub-optimal solution and dependency on initial variables is found in Problem 8.

Example 11.8 below uses the optimization routine that is part of the basic MATLAB package, fminsearch (formerly fmins). This routine is based on the simplex (direct search) method, and will adjust any number of parameters to minimize a function specified through a user routine. To maximize the correspondence between the reference image and the input image, the negative of the correlation between the two images is minimized. The routine fminsearch will automatically adjust the transformation variables to achieve this minimum (remember that this may not be the absolute minimum).

To implement an optimization search, a routine is required that applies the transformation variables supplied by fminsearch, performs an appropriate trial transformation on the input image, then compares the trial image with the reference image. Following convergence, the optimization routine returns the values of the transformation variables that produce the best comparison. These can then be applied to produce the final aligned image. Note that the programmer must specify the actual structure of the transformation since the optimization routine works blindly and simply seeks a set of variables that produces a minimum output. The transformation selected should be based on the possible mechanisms for misalignment: translations, size changes, rotations, skewness, projective misalignment, or other more complex distortions. For efficiency, the transformation should be one that requires the least number of defining variables.
  • 114. Reducing the number of variables increases the likelihood of optimal convergence and substantially reduces computation time. To minimize the number of transformation variables, the simplest transformation that will compensate for the possible mechanisms of distortion should be used.*

Example 11.8 This is an example of unaided image registration requiring an affine transformation. The input image, the image to be aligned, is a distorted version of the reference image. Specifically, it has been stretched horizontally, compressed vertically, and tilted, all using a single affine transformation. The problem is to find a transformation that will realign this image with the reference image.

Solution MATLAB's optimization routine fminsearch will be used to determine an optimized transformation that will make the two images as similar as possible. MATLAB's fminsearch routine calls the user routine rescale to perform the transformation and make the comparison between the two images. The rescale routine assumes that an affine transformation is required and that only the horizontal, vertical, and tilt dimensions need to be adjusted. (It does not, for example, take into account possible translations between the two images, although this would not be too difficult to incorporate.) The fminsearch routine requires as input arguments the name of the routine whose output is to be minimized (in this example, rescale) and the initial values of the transformation variables (in this example, all 1's). The routine uses the size of the initial value vector to determine how many variables it needs to adjust (in this case, three variables). Any additional input arguments following an optional vector specifying operational features are passed to rescale immediately following the transformation variables. The optimization routine will continue to call rescale automatically until it has found an acceptable minimum for the error (or until some maximum number of iterations is reached; see the associated help file).

% Example 11.8 and Figure 11.13
% Image registration after spatial transformation
% Load a frame of the MRI image (mri.tif). Transform the original
% image by increasing it horizontally, decreasing it vertically,
% and tilting it to the right. Also decrease image contrast
% slightly.
% Use MATLAB's basic optimization routine, 'fminsearch', to find
% the transformation that restores the original image shape.
%

*The number of defining variables depends on the transformation. For example, rotation alone requires only one variable, linear transformations require two variables, affine transformations require three variables, and projective transformations require four variables. Two additional variables are required for translations.
  • 115. FIGURE 11.13 Unaided image registration requiring several affine transformations. The left image is the original (reference) image, and the distorted center image is to be aligned with that image. After a transformation determined by optimization, the right image is quite similar to the reference image. (Original image is the same as in Figure 11.12.)

clear all; close all;
H_scale = .25;               % Define distorting parameters:
V_scale = .2;                % horizontal, vertical, and tilt
tilt = .2;                   % in percent
.......load mri.tif, frame 18.......
[M N] = size(I);
H_scale = H_scale * N/2;     % Convert percent scale to pixels
V_scale = V_scale * M;
tilt = tilt * N;
%
% Construct distorted image.
U = [1 1; 1 M; N M];         % Input triangle
X = [1-H_scale+tilt 1+V_scale; 1-H_scale M; N+H_scale M];
Tform = maketform('affine', U, X);
I_transform = (imtransform(I, Tform, 'Xdata', [1 N], ...
    'Ydata', [1 M]))*.8;
%
% Now find transformation to realign image
initial_scale = [1 1 1];     % Set initial values
[scale,Fval] = fminsearch('rescale', initial_scale, [ ], ...
    I, I_transform);
disp(Fval)                   % Display final correlation
%
% Realign image using optimized transform
  • 116. X = [1+scale(1)+scale(3) 1+scale(2); 1+scale(1) M; ...
    N-scale(1) M];
Tform = maketform('affine', U, X);
I_aligned = imtransform(I_transform, Tform, 'Xdata', [1 N], ...
    'Ydata', [1 M]);
%
subplot(1,3,1); imshow(I);   % Display the images
title('Original Image');
subplot(1,3,2); imshow(I_transform);
title('Transformed Image');
subplot(1,3,3); imshow(I_aligned);
title('Aligned Image');

The rescale routine is used by fminsearch. This routine takes in the transformation variables supplied by fminsearch, performs a trial transformation, and compares the trial image with the reference image. The routine then returns the error to be minimized, calculated as the negative of the correlation between the two images.

function err = rescale(scale, I, I_transform);
% Function used by 'fminsearch' to rescale an image
% horizontally, vertically, and with tilt.
% Performs transformation and computes correlation between
% original and newly transformed image.
% Inputs:
%   scale        Current scale factors (from 'fminsearch')
%   I            Original image
%   I_transform  Image to be realigned
% Output:
%   err          Negative correlation between original and
%                transformed image
%
[M N] = size(I);
U = [1 1; 1 M; N M];         % Input triangle
%
% Perform trial transformation
X = [1+scale(1)+scale(3) 1+scale(2); 1+scale(1) M; ...
    N-scale(1) M];
Tform = maketform('affine', U, X);
I_aligned = imtransform(I_transform, Tform, 'Xdata', ...
    [1 N], 'Ydata', [1 M]);
%
% Calculate negative correlation
err = -abs(corr2(I_aligned, I));

The results achieved by this registration routine are shown in Figure 11.13. The original reference image is shown on the left, and the input image
  • 117. is in the center. As noted above, this image is the same as the reference except that it has been distorted by several affine transformations (horizontal stretching, vertical compression, and a tilt). The aligned image achieved by the optimization is shown on the right. This image is very similar to the reference image. This optimization was fairly robust: it converged to a correlation of 0.99 from both positive and negative initial values. However, in many cases, convergence can depend on the initial values, as demonstrated in Problem 8. This program took about 1 minute to run on a 1 GHz PC.

Interactive Image Registration

Several strategies may be used to guide the registration process. In the example used here, registration will depend on reference marks provided by a human operator. Interactive image registration is well supported by the MATLAB Image Processing Toolbox and includes a graphically based program, cpselect, that automates the process of establishing corresponding reference marks. Under this procedure, the user interactively identifies a number of corresponding features in the reference and input images, and a transform is constructed from these pairs of reference points. The program must specify the type of transformation to be performed (linear, affine, projective, etc.), and the minimum number of reference pairs required will depend on the type of transformation. The number of reference pairs required is the same as the number of variables needed to define a transformation: an affine transformation requires a minimum of three reference points, while a projective transformation requires four points. Linear transformations require only two pairs, while other more complex transformations may require six or more point pairs. In most cases, the alignment is improved if more than the minimal number of point pairs is given.

In Example 11.9, an alignment requiring a projective transformation is presented. This example uses the routine cp2tform to produce a transformation in Tform format, based on point pairs obtained interactively. The cp2tform routine has a large number of options, but the basic calling structure is:

Tform = cp2tform(input_points, base_points, 'type');

where input_points is an m by 2 matrix consisting of the x,y coordinates of the reference points in the input image; base_points is a matrix containing the same information for the reference image. This routine assumes that the points are entered in the same order, i.e., that corresponding rows in the two vectors describe corresponding points. The type variable is the same as in maketform and specifies the type of transform ('affine', 'projective', etc.). The use of this routine is demonstrated in Example 11.9.

Example 11.9 An example of interactive image registration. In this example, an input image is generated by transforming the reference image with a
  • 118. projective transformation including vertical and horizontal translations. The program then opens two windows displaying the reference and input images, and takes in eight reference points for each image from the operator using the MATLAB ginput routine. As each point is taken, it is displayed as an '*' overlaid on the image. Once all 16 points have been acquired (eight from each image), a transformation is constructed using cp2tform. This transformation is then applied to the input image using imtransform. The reference, input, and realigned images are displayed.

% Example 11.9 Interactive image registration
% Load a frame of the MRI image (mri.tif) and perform a spatial
% transformation that tilts the image backward and displaces
% it horizontally and vertically.
% Uses interactive registration and the MATLAB function
% 'cp2tform' to realign the image
%
clear all; close all;
nu_points = 8;               % Number of reference points
.......Load mri.tif, frame 18 .......
[M N] = size(I);
%
% Construct input image. Perform projective transformation
U = [1 1; 1 M; N M; N 1];
offset = .15*N;              % Projection offset
H = .2 * N;                  % Horizontal translation
V = .15 * M;                 % Vertical translation
X = [1-offset+H 1+offset-V; 1+offset+H M-offset-V; ...
    N-offset+H M-offset-V; N+offset+H 1+offset-V];
Tform1 = maketform('projective', U, X);
I_transform = imtransform(I, Tform1, 'Xdata', [1 N], ...
    'Ydata', [1 M]);
%
% Acquire reference points
% First open two display windows
fig(1) = figure; imshow(I);
fig(2) = figure; imshow(I_transform);
%
for i = 1:2                  % Get reference points: both images
    figure(fig(i));          % Open window i
    hold on;
  • 119.     title('Enter eight reference points');
    for j = 1:nu_points
        [x(j,i), y(j,i)] = ginput(1);    % Get reference point
        plot(x(j,i), y(j,i),'*');        % Mark reference point
                                         % with *
    end
end
%
% Construct transformation with cp2tform and implement with
% imtransform
%
[Tform2, inpts, base_pts] = cp2tform([x(:,2) y(:,2)], ...
    [x(:,1) y(:,1)], 'projective');
I_aligned = imtransform(I_transform, Tform2, 'Xdata', ...
    [1 N], 'Ydata', [1 M]);
%
figure;
subplot(1,3,1); imshow(I);   % Display the images
title('Original');
subplot(1,3,2); imshow(I_transform);
title('Transformation');
subplot(1,3,3); imshow(I_aligned);
title('Realigned');

The reference and input windows are shown along with the selected reference points in Figures 11.14A and B. Eight points were used rather than the minimal four, because this was found to produce a better result. The influence of the number of reference points used is explored in Problem 9. The result of the transformation is presented in Figure 11.15. This figure shows that the realignment was less than perfect, and, in fact, the correlation after alignment was only 0.78. Nonetheless, the primary advantage of this method is that it couples into the extraordinary abilities of human visual identification and, hence, can be applied to images that are only vaguely similar, when correlation-based methods would surely fail.

PROBLEMS

1. Load the MATLAB test pattern image testpat1.png used in Example 11.4. Generate and plot the Fourier transform of this image. First plot only the 25 points on either side of the center of this transform, then plot the entire function, but first take the log for better display.

2. Load the horizontal chirp pattern shown in Figure 11.1 (found on the disk as imchirp.tif) and take the Fourier transform as in the above problem. Then multiply the Fourier transform (in complex form) in the horizontal direction by
  • 120. FIGURE 11.14A A reference image used in Example 11.9 showing the reference points in black. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

FIGURE 11.14B Input image showing reference points corresponding to those shown in Figure 11.14A.
  • 121. FIGURE 11.15 Image registration using a transformation developed interactively. The original (reference) image is seen on the left, and the input image in the center. The image after transformation is similar, but not identical, to the reference image. The correlation between the two is 0.79. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

a half-wave sine function of the same length. Now take the inverse Fourier transform of this windowed function and plot it alongside the original image. Also apply the window in the vertical direction, take the inverse Fourier transform, and plot the resulting image. Do not apply fftshift to the Fourier transform, as the inverse Fourier transform routine ifft2 expects the DC component to be in the upper left corner, as fft2 presents it. Also, you should take the absolute value of the inverse Fourier transform before display, to eliminate any imaginary components. (The chirp image is square, so you do not have to recompute the half-wave sine function; however, you may want to plot the sine wave to verify that you have a correct half-wave sine function.) You should be able to explain the resulting images. (Hint: Recall the frequency characteristics of the two-point central difference algorithm used for taking the derivative.)

3. Load the blood cell image (blood1.tif). Design and implement your own 3 by 3 filter that enhances vertical edges that go from dark to light. Repeat for a filter that enhances horizontal edges that go from light to dark. Plot the two images along with the original. Convert the first image (vertical edge enhancement) to a binary image and adjust the threshold to emphasize the edges. Plot this image with the others in the same figure. Plot the three-dimensional frequency representations of the two filters together in another figure.

4. Load the chirp image (imchirp.tif) used in Problem 2. Design a one-dimensional 64th-order narrowband bandpass filter with cutoff frequencies of 0.1 and 0.125 Hz and apply it to the chirp image. Plot the modified image with the original. Repeat for a 128th-order filter and plot the result with the others. (This may take a while to run.) In another figure, plot the three-dimensional frequency representation of a 64th-order filter.
  • 122. 5. Produce a movie of the rotating brain. Load frame 16 of the MRI image (mri.tif). Make a multiframe image from the basic image by rotating that image through 360 degrees. Use 36 frames (10 degrees per frame) to cover the complete 360 degrees. (If your resources permit, you could use 64 frames with about 5.6 degrees per frame.) Submit a montage plot of those frames that cover the first 90 degrees of rotation; i.e., the first eight images (or 16, if you use 64 frames).

6. Back in the 1960s, people were into “expanding their minds” through meditation, drugs, rock and roll, or other “mind-expanding” experiences. In this problem, you will expand the brain in a movie using an affine transformation. (Note: imresize will not work because it changes the number of pixels in the image, and immovie requires that all images have the same dimensions.) Load frame 18 of the MRI image (mri.tif). Make a movie where the brain stretches in and out horizontally from 75% to 150% of normal size. The image will probably exceed the frame size during its larger excursions, but this is acceptable. The image should grow symmetrically about the center (i.e., in both directions). Use around 24 frames, with the latter half of the frames being the reverse of the first as in Example 11.7, so the brain appears to grow then shrink. Submit a montage of the first 12 frames. Note: use some care in getting the range of image sizes to be between 75% and 150%. (Hint: to simplify the computation of the output triangle, it is best to define the input triangle at three of the image corners. Note that all three triangle vertices will have to be modified to stretch the image in both directions, symmetrically about the center.)

7. Produce a spatial transformation movie using a projective transformation. Load a frame of the MRI image (mri.tif, your choice of frame). Use the projective transformation to make a movie of the image as it tilts vertically. Use 24 frames as in Example 11.7: the first 12 will tilt the image back while the rest tilt the image back to its original position. You can use any reasonable transformation that gives a vertical tilt or rotation. Submit a montage of the first 12 images.

8. Load frame 12 of mri.tif and use imrotate to rotate the image by 15 degrees clockwise. Also reduce the image contrast of the rotated image by 25%. Use MATLAB's basic optimization program fminsearch to align the image that has been rotated. (You will need to write a function similar to rescale in Example 11.8 that rotates the image based on the first input parameter, then computes the negative correlation between the rotated image and the original image.)

9. Load a frame of the MRI image (mri.tif) and perform a spatial transformation that first expands the image horizontally by 20%, then rotates the image by 20 degrees. Use interactive registration and the MATLAB function cp2tform to realign the transformed image. Use (A) the minimum number of points and (B) twice the minimum number of points. Compare the correlation between the original and the realigned image using the two different numbers of reference points.
  • 123. 12 Image Segmentation

Image segmentation is the identification and isolation of regions within an image that—one hopes—correspond to structural units. It is an especially important operation in biomedical image processing since it is used to isolate physiological and biological structures of interest. The problems associated with segmentation have been well studied, and a large number of approaches have been developed, many specific to a particular image. General approaches to segmentation can be grouped into three classes: pixel-based methods, regional (continuity-based) methods, and edge-based methods. Pixel-based methods are the easiest to understand and to implement, but are also the least powerful and, since they operate on one element at a time, are particularly susceptible to noise. Continuity-based and edge-based methods approach the segmentation problem from opposing sides: edge-based methods search for differences while continuity-based methods search for similarities.

PIXEL-BASED METHODS

The most straightforward and common of the pixel-based methods is thresholding, in which all pixels having intensity values above, or below, some level are classified as part of the segment. Thresholding is an integral part of converting an intensity image to a binary image as described in Chapter 10. Thresholding is usually quite fast and can be done in real time, allowing for interactive setting of the threshold. The basic concept of thresholding can be extended to include
  • 124. both upper and lower boundaries, an operation termed slicing since it isolates a specific range of pixels. Slicing can be generalized to include a number of different upper and lower boundaries, each encoded into a different number. An example of multiple slicing was presented in Chapter 10 using the MATLAB grayslice routine. Finally, when RGB color or pseudocolor images are involved, thresholding can be applied to each color plane separately. The resulting image could be either a thresholded RGB image, or a single image composed of a logical combination (AND or OR) of the three image planes after thresholding. An example of this approach is seen in the problems.

A technique that can aid in all image analysis, but is particularly useful in pixel-based methods, is intensity remapping. In this global procedure, the pixel values are rescaled so as to extend over different maximum and minimum values. Usually the rescaling is linear, so each point is adjusted proportionally, with a possible offset. MATLAB supports rescaling with the routine imadjust, described below, which also provides a few common nonlinear rescaling options. Of course, any rescaling operation is possible using MATLAB code if the intensity images are of class double, or if the image arithmetic routines described in Chapter 10 are used.

Threshold Level Adjustment

A major concern in these pixel-based methods is setting the threshold or slicing level(s) appropriately. Usually these levels are set by the program, although in some situations they can be set interactively by the user. Finding an appropriate threshold level can be aided by a plot of the pixel intensity distribution over the whole image, regardless of whether the pixel level is adjusted interactively or automatically. Such a plot is termed the intensity histogram and is supported by the MATLAB routine imhist detailed below. Figure 12.1 shows an x-ray image of the spine with its associated density histogram. Figure 12.1 also shows the binary image obtained by applying a threshold at a specific point on the histogram. When RGB color images are being analyzed, intensity histograms can be obtained from all three color planes, and different thresholds can be established for each color plane with the aid of the corresponding histogram.

Intensity histograms can be very helpful in selecting threshold levels, not only for the original image, but also for images produced by various segmentation algorithms described later. Intensity histograms can also be useful in evaluating the efficacy of different processing schemes: as the separation between structures improves, histogram peaks should become more distinctive. This relationship between separation and histogram shape is demonstrated in Figure 12.2 and, more dramatically, in Figures 12.3 and 12.4.
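The basic histogram-guided workflow can be sketched in a few lines. This is only a minimal illustration, not code from the text: the image file and the threshold value of 0.5 are arbitrary choices, and in practice the threshold would be read off the histogram as described above.

% Minimal sketch of histogram-guided thresholding. The image
% file and threshold value are arbitrary illustrations.
I = imread('blood1.tif');        % Load an intensity image
subplot(1,3,1); imshow(I); title('Original');
subplot(1,3,2); imhist(I);       % Inspect the intensity histogram
title('Histogram');
BW = im2bw(I, 0.5);              % Threshold chosen from histogram
subplot(1,3,3); imshow(BW); title('Thresholded');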
  • 125. FIGURE 12.1 An image of bone marrow, upper left, and its associated intensity histogram, lower plot. The upper right image is obtained by thresholding the original image at a value corresponding to the vertical line on the histogram plot. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The MathWorks, Inc. Reprinted with permission.)

Intensity histograms contain no information on position, yet it is spatial information that is of prime importance in problems of segmentation, so some strategies have been developed for determining threshold(s) from the histogram (Sonka et al., 1993). If the intensity histogram is, or can be assumed to be, bimodal (or multi-modal), a common strategy is to search for low points, or minima, in the histogram. This is the strategy used in Figure 12.1, where the threshold was set at 0.34, the intensity value at which the histogram shows an approximate minimum. Such points represent the fewest number of pixels and should produce minimal classification errors; however, the histogram minima are often difficult to determine due to variability.
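One simple way to automate this minimum search is sketched below: restrict the search to the region between the two histogram modes and take the lowest bin there. This is only an illustrative sketch, not the book's method; the image file and the search range are arbitrary assumptions, and on noisy histograms the result remains fragile, as noted above.

% Minimal sketch: estimate a threshold from the histogram minimum
% between two modes. The image file and search range are
% arbitrary illustrations.
I = imread('blood1.tif');
[counts, x] = imhist(I, 255);          % Intensity histogram
range = 50:200;                        % Assumed region between the modes
[dummy, idx] = min(counts(range));     % Lowest bin in that range
thresh = x(range(idx))/255;            % Convert to the 0-1 scale of im2bw
BW = im2bw(I, thresh);                 % Apply the threshold
imshow(BW);

Smoothing the histogram before the search (for example, with a short moving average) would make the estimated minimum less sensitive to bin-to-bin variability.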
  • 126. FIGURE 12.2 Image of blood cells with (upper) and without (lower) intermediate boundary points. The associated histograms (right side) show improved separability when the boundary points are eliminated. The code that generated these images is given in Example 12.1. (Original image reprinted with permission from The Image Processing Handbook, 2nd edition. Copyright CRC Press, Boca Raton, Florida.)

An approach to improve the determination of histogram minima is based on the observation that many boundary points carry values intermediate to the values on either side of the boundary. These intermediate values will be associated with the region between the actual boundary values and may mask the optimal threshold value. However, these intermediate points also have the highest gradient, and it should be possible to identify them using a gradient-sensitive filter, such as the Sobel or Canny filter. After these boundary points are identified, they can be eliminated from the image, and a new histogram can be computed with a distribution that is possibly more definitive. This strategy is used in
  • 127. FIGURE 12.3 Thresholded blood cell images. Optimal thresholds were applied to the blood cell images in Figure 12.2 with (left) and without (right) boundary pixels masked. Fewer inappropriate pixels are seen in the right image.

Example 12.1, and Figure 12.2 shows images and associated histograms before and after removal of boundary points identified using Canny filtering. The reduction in the number of intermediate points can be seen in the middle of the histogram (around 0.45). As shown in Figure 12.3, this leads to slightly better segmentation of the blood cells.

Another histogram-based strategy that can be used if the distribution is bimodal is to assume that each mode is the result of a unimodal, Gaussian distribution. An estimate is then made of the underlying distributions, and the point at which the two estimated distributions intersect should provide the optimal threshold. The principal problem with this approach is that the distributions are unlikely to be truly Gaussian.

A threshold strategy that does not use the histogram is based on the concept of minimizing the variance between presumed foreground and background elements. Although the method assumes two different gray levels, it works well even when the distribution is not bimodal (Sonka et al., 1993). The approach uses an iterative process to find a threshold that minimizes the variance between the intensity values on either side of the threshold level (Otsu's method). This approach is implemented using the MATLAB routine graythresh (see Example 12.1).

A pixel-based technique that provides a segment boundary directly is contour mapping. Contours are lines of equal intensity, and in a continuous image they are necessarily continuous: they cannot end within the image, although
• 128. FIGURE 12.4 Contour maps drawn from the blood cell image of Figures 12.2 and 12.3. The right image was pre-filtered with a Gaussian lowpass filter (alpha = 3) before the contour lines were drawn. The contour values were set manually to provide good images.

A pixel-based technique that provides a segment boundary directly is contour mapping. Contours are lines of equal intensity, and in a continuous image they are necessarily continuous: they cannot end within the image, although they can branch or loop back on themselves. In digital images, these same properties exist, but the value of any given contour line will not generally equal the values of the pixels it traverses. Rather, it usually reflects values intermediate between adjacent pixels. To use contour mapping to identify image structures requires accurate setting of the contour levels, and this carries the same burdens as thresholding. Nonetheless, contour maps do provide boundaries directly, and, if subpixel interpolation is used in establishing the contour position, they may be spatially more accurate. Contour maps are easy to implement in MATLAB, as shown in the next section on MATLAB Implementation. Figure 12.4 shows contour maps for the blood cell images shown in Figure 12.2. The right image was pre-filtered with a Gaussian lowpass filter, which reduces noise slightly and improves the resultant contour image.

Pixel-based approaches can lead to serious errors, even when the average intensities of the various segments are clearly different, due to noise-induced intensity variation within the structure. Such variation could be introduced during image acquisition, but could also be inherent in the structure itself. Figure 12.5 shows two regions with quite different average intensities. Even with optimal threshold selection, many inappropriate pixels are found in both segments due to intensity variations within the segments (Figure 12.5). Techniques for improving separation in such images are explored in the sections on continuity-based approaches.
• 129. FIGURE 12.5 An image with two regions having different average gray levels. The two regions are clearly distinguishable; however, using thresholding alone, it is not possible to completely separate the two regions because of noise.

MATLAB Implementation

Some of the routines for implementing pixel-based operations, such as im2bw and grayslice, have been described in preceding chapters. The image intensity histogram is produced by the routine imhist:

  [counts, x] = imhist(I, N);

where counts is the histogram value at a given x, I is the image, and N is an optional argument specifying the number of histogram bins (the default is 256). As mentioned above, imhist is usually invoked without the output arguments, counts and x, to produce a plot directly. The rescale routine is imadjust:

  I_rescale = imadjust(I, [low high], [bottom top], gamma);

where I_rescale is the rescaled output image and I is the input image. The range between low and high in the input image is rescaled to be between bottom and top in the output image. Several pixel-based techniques are presented in Example 12.1.
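A brief usage sketch of these two routines, assuming I is a grayscale double image (the rescaling limits are illustrative):

  imhist(I);                                 % Invoked without outputs: plots the histogram
  I_rescale = imadjust(I, [.3 .8], [0 1]);   % Map input range 0.3-0.8 onto the full 0-1 range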
• 130. FIGURE 12.6A Histogram of the image shown in Figure 12.5 before (upper) and after (lower) lowpass filtering. Before filtering, the two regions overlap to such an extent that they cannot be identified. After lowpass filtering, the two regions are evident, and the boundary found by minimum variance is shown. The application of this boundary to the filtered image results in perfect separation, as shown in Figure 12.6B.

Example 12.1 An example of segmentation using pixel-based methods. Load the image of blood cells, and display along with the intensity histogram. Remove the edge pixels from the image and display the histogram of this modified image. Determine thresholds using the minimal variance iterative technique described above, and apply this approach to threshold both images. Display the resultant thresholded images.

Solution To remove the edge boundaries, first identify these boundaries using an edge detection scheme. While any of the edge detection filters described previously can be used, this application will use the Canny filter, as it is most robust to noise. This filter is implemented as an option of MATLAB's edge routine, which produces a binary image of the boundaries. This binary image will be converted to a boundary mask by inverting the image using imcomplement. After inversion, the edge pixels will be zero while all other pixels will be one. Multiplying the original image by the boundary mask will produce an image in which the boundary points are removed (i.e., set to zero, or black). All the images involved in this process, including the original image, will then be plotted.
• 131. FIGURE 12.6B Left side: The same image shown in Figure 12.5 after lowpass filtering. Right side: This filtered image can now be perfectly separated by thresholding.

  % Example 12.1 and Figures 12.2 and 12.3
  % Lowpass filter blood cell image, then display histograms
  % before and after edge point removal.
  % Applies "optimal" threshold routine to both original and
  % "masked" images and displays the results
  %
  % ........input image and convert to double.......
  h = fspecial('gaussian',12,2);           % Construct Gaussian filter
  I_f = imfilter(I,h,'replicate');         % Filter image
  %
  I_edge = edge(I_f,'canny',.3);           % To remove edge points, find edge,
  I_rem = I_f .* imcomplement(I_edge);     %   complement, and use as mask
  %
  subplot(2,2,1); imshow(I_f);             % Display images and histograms
  title('Original Figure');
  subplot(2,2,2); imhist(I_f); axis([0 1 0 1000]);
  title('Filtered histogram');
  subplot(2,2,3); imshow(I_rem);
  title('Edge Removed');
  subplot(2,2,4); imhist(I_rem); axis([0 1 0 1000]);
  title('Edge Removed histogram');
• 132.
  %
  figure;                                  % Threshold and display images
  t1 = graythresh(I);                      % Use minimum variance thresholds
  t2 = graythresh(I_f);
  subplot(1,2,1); imshow(im2bw(I,t1));
  title('Threshold Original Image');
  subplot(1,2,2); imshow(im2bw(I_f,t2));
  title('Threshold Masked Image');

The results have been shown previously in Figures 12.2 and 12.3, and the improvement in the histogram and threshold separation has been mentioned. While the change in the histogram is fairly small (Figure 12.2), it does lead to a reduction in artifacts in the thresholded image, as shown in Figure 12.3. This small improvement could be quite significant in some applications. Methods for removing the small remaining artifacts will be described in the section on morphological operations.

CONTINUITY-BASED METHODS

These approaches look for similarities or consistency in the search for structural units. As demonstrated in the examples below, these approaches can be very effective in segmentation tasks, but they all suffer from a lack of edge definition. This is because they are based on neighborhood operations, and these tend to blur edge regions, as edge pixels are combined with structural segment pixels. The larger the neighborhood used, the more poorly edges will be defined. Unfortunately, increasing neighborhood size usually improves the power of any given continuity-based operation, setting up a compromise between identification ability and edge definition. One easy technique that is based on continuity is lowpass filtering. Since a lowpass filter is a sliding neighborhood operation that takes a weighted average over a region, it enhances consistent characteristics. Figure 12.6A shows histograms of the image in Figure 12.5 before and after filtering with a Gaussian lowpass filter (alpha = 1.5). Note the substantial improvement in separability suggested by the associated histograms. Applying a threshold to the filtered image results in perfectly isolated segments, as shown in Figure 12.6B. The thresholded images in Figures 12.5 and 12.6B used the same minimum variance technique to set the threshold, yet the improvement brought about by simple lowpass filtering is remarkable.
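The lowpass-filter-then-threshold strategy described above takes only a few lines. A sketch, assuming I is a grayscale double image (the filter size is illustrative; alpha = 1.5 follows the text):

  h = fspecial('gaussian', 20, 1.5);    % Gaussian lowpass filter
  I_lp = imfilter(I, h, 'replicate');   % Sliding neighborhood weighted average
  t = graythresh(I_lp);                 % Variance-based threshold
  BW = im2bw(I_lp, t);                  % Segments separate much more cleanly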
• 133. Image features related to texture can be particularly useful in segmentation. Figure 12.7 shows three regions that have approximately the same average intensity values, but are readily distinguished visually because of differences in texture. Several neighborhood-based operations can be used to distinguish textures: the small segment Fourier transform, local variance (or standard deviation), the Laplacian operator, the range operator (the difference between maximum and minimum pixel values in the neighborhood), the Hurst operator (maximum difference as a function of pixel separation), and the Haralick operator (a measure of distance moment). Many of these approaches are either directly supported in MATLAB, or can be implemented using the nlfilter routine described in Chapter 10; a sketch is given below.

MATLAB Implementation

Example 12.2 attempts to separate the three regions shown in Figure 12.7 by applying one of these operators to convert the texture pattern to a difference in intensity that can then be separated using thresholding.

Example 12.2 Separate out the three segments in Figure 12.7 that differ only in texture. Use one of the texture operators described above and demonstrate the improvement in separability through histogram plots. Determine appropriate threshold levels for the three segments from the histogram plot.

FIGURE 12.7 An image containing three regions having approximately the same intensity, but different textures. While these areas can be distinguished visually, separation based on intensity or edges will surely fail. (Note the single peak in the intensity histogram in Figure 12.9, upper plot.)
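Several of the texture operators listed above follow the same nlfilter pattern used in the solution below. For instance, a local standard deviation sketch (the 7-by-7 neighborhood is an illustrative choice):

  f = inline('std2(x)');           % Standard deviation of the neighborhood
  I_std = nlfilter(I, [7 7], f);   % Sliding 7-by-7 neighborhood operation
  I_std = mat2gray(I_std);         % Rescale to 0-1 for display and thresholding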
• 134. Solution Use the nonlinear range filter to convert the textural patterns into differences in intensity. The range operator is a sliding neighborhood procedure that takes the difference between the maximum and minimum pixel values within a neighborhood. Implement this operation using MATLAB's nlfilter routine with a 7-by-7 neighborhood.

  % Example 12.2 Figures 12.8, 12.9, and 12.10
  % Load image 'texture3.tif' which contains three regions having
  % the same average intensities, but different textural patterns.
  % Apply the "range" nonlinear operator using 'nlfilter'
  % Plot original and range histograms and filtered image
  %
  clear all; close all;
  [I] = imread('texture3.tif');    % Load image and
  I = im2double(I);                %   convert to double
  %
  range = inline('max(max(x))-min(min(x))');   % Define range function
  I_f = nlfilter(I,[7 7],range);   % Compute local range
  I_f = mat2gray(I_f);             % Rescale intensities

FIGURE 12.8 The texture pattern shown in Figure 12.7 after application of the nonlinear range operation. This operator converts the textural properties in the original figure into a difference in intensities. The three regions are now clearly visible as intensity differences and can be isolated using thresholding.
• 135. FIGURE 12.9 Histogram of the original texture pattern before (upper) and after (lower) nonlinear filtering using the range operator. After filtering, the three intensity regions are clearly seen. The thresholds used to isolate the three segments are indicated.

  %
  imshow(I_f);                     % Display results
  title('"Range" Image');
  figure;
  subplot(2,1,1); imhist(I);       % Display both histograms
  title('Original Histogram');
  subplot(2,1,2); imhist(I_f);
  title('"Range" Histogram');
  figure;
  subplot(1,3,1); imshow(im2bw(I_f,.22));        % Display three segments
  subplot(1,3,2); imshow(islice(I_f,.22,.54));   % Uses 'islice' (see below)
  subplot(1,3,3); imshow(im2bw(I_f,.54));
• 136. The image produced by the range filter is shown in Figure 12.8, and a clear distinction in intensity level can now be seen between the three regions. This is also demonstrated in the histogram plots of Figure 12.9. The histogram of the original figure (upper plot) shows a single Gaussian-like distribution with no evidence of the three patterns.* After filtering, the three patterns emerge as three distinct distributions. Using this distribution, two thresholds were chosen at minima between the distributions (at 0.22 and 0.54: the solid vertical lines in Figure 12.9), and the three segments were isolated based on these thresholds. The two end patterns could be isolated using im2bw, but the center pattern used a special routine, islice. This routine sets to one all pixels whose values fall between an upper and a lower boundary; if a pixel's value lies above or below these boundaries, it is set to zero. (This routine is on the disk.) The three fairly well separated regions are shown in Figure 12.10. A few artifacts remain in the isolated images, and subsequent methods can be used to eliminate or reduce these erroneous pixels.

FIGURE 12.10 Isolated regions of the texture pattern in Figure 12.7. Although there are some artifacts, the segmentation is quite good considering the original image. Methods for reducing the small artifacts will be given in the section on edge detection.

*In fact, the distribution is Gaussian since the image patterns were generated by filtering an array filled with Gaussian-distributed numbers generated by randn.
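Since islice is supplied on the disk rather than in the Toolbox, a plausible minimal reconstruction is sketched below; the actual disk routine may differ in detail:

  function out = islice(I, lo, hi)
  % islice - set to one all pixels whose values fall between lo and hi;
  % pixels at or beyond either boundary are set to zero.
  out = (I > lo) & (I < hi);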
• 137. FIGURE 12.11 Textural pattern used in Example 12.3. The horizontal and vertical patterns have the same textural characteristics except for their orientation. As in Figure 12.7, the three patterns have the same average intensity.

Occasionally, segments will have similar intensities and textural properties, except that the texture differs in orientation. Such patterns can be distinguished using a variety of filters that have orientation-specific properties. The local Fourier transform can also be used to distinguish orientation. Figure 12.11 shows a pattern with texture regions that are different only in terms of their orientation. In this figure, also given in Example 12.3, orientation was identified by application of a direction operator that operates only in the horizontal direction. This is followed by a lowpass filter to improve separability. The intensity histograms in Figure 12.12, shown at the end of the example, demonstrate the intensity separations achieved by the directional range operator and the improvement provided by the lowpass filter. The different regions are then isolated using threshold techniques.

Example 12.3 Isolate segments from a texture pattern that includes two patterns with the same textural characteristics except for orientation. Note that the approach used in Example 12.2 will fail: the similarity in the statistical properties of the vertical and horizontal patterns will give rise to similar intensities following a range operation.

Solution Apply a filter that has directional sensitivity. A Sobel or Prewitt filter could be used, followed by the range or similar operator, or the operations could be done in a single step by using a directional range operator. The choice made in this example is to use a horizontal range operator implemented with nlfilter. This is followed by a lowpass filter (Gaussian, alpha = 4) to improve separation by removing intensity variation. Two segments are then isolated using standard thresholding. In this example, the third segment was constructed
• 138. FIGURE 12.12 Images produced by application of a directional range operator applied to the image in Figure 12.11 before (upper) and after (lower) lowpass filtering. The histograms demonstrate the improved separability of the filtered image, showing deeper minima in the filtered histogram.

by applying a logical operation to the other two segments. Alternatively, the islice routine could have been used as in Example 12.2.

  % Example 12.3 and Figures 12.11, 12.12, and 12.13
  % Analysis of texture pattern having similar textural
  % characteristics but with different orientations. Use a
  % direction-specific filter.
  %
  clear all; close all;
  I = imread('texture4.tif');    % Load "orientation" texture
  I = im2double(I);              % Convert to double
• 139. FIGURE 12.13 Isolated segments produced by thresholding the lowpass filtered image in Figure 12.12. The rightmost segment was found by applying logical operations to the other two images.

  %
  % Define filters and functions: 1-D range function
  range = inline('max(x)-min(x)');
  h_lp = fspecial('gaussian', 20, 4);
  %
  I_nl = nlfilter(I, [9 1], range);   % Directional nonlinear filter
  I_h = imfilter(I_nl*2, h_lp);       % Average (lowpass filter)
  %
  subplot(2,2,1); imshow(I_nl*2);     % Display image and histogram
  title('Modified Image');            %   before lowpass filtering
  subplot(2,2,2); imhist(I_nl);
  title('Histogram');
  subplot(2,2,3); imshow(I_h*2);      % Display image and histogram
  title('Modified Image');            %   after lowpass filtering
  subplot(2,2,4); imhist(I_h);
  title('Histogram');
  %
  figure;
  BW1 = im2bw(I_h,.08);               % Threshold to isolate segments
  BW2 = ~im2bw(I_h,.29);
  BW3 = ~(BW1 & BW2);                 % Find third image from other two
  subplot(1,3,1); imshow(BW1);        % Display segments
  subplot(1,3,2); imshow(BW2);
  subplot(1,3,3); imshow(BW3);
• 140. The image produced by the horizontal range operator with, and without, lowpass filtering is shown in Figure 12.12. Note the improvement in separation produced by the lowpass filtering, as indicated by a better defined histogram. The thresholded images are shown in Figure 12.13. As in Example 12.2, the separation is not perfect, but is quite good considering the challenges posed by the original image.

Multi-Thresholding

The results of several different segmentation approaches can be combined either by adding the images together or, more commonly, by first thresholding the images into separate binary images and then combining them using logical operations. Either the AND or OR operator would be used depending on the characteristics of each segmentation procedure. If each procedure identified all of the segments, but also included non-desired areas, the AND operator could be used to reduce artifacts. An example of the use of the AND operation was found in Example 12.3, where one segment was found using the inverse of a logical AND of the other two segments. Alternatively, if each procedure identified some portion of the segment(s), then the OR operator could be used to combine the various portions. This approach is illustrated in Example 12.4, where first two, then three, thresholded images are combined to improve segment identification. The structure of interest is a cell which is shown on a gray background. Threshold levels above and below the gray background are combined (after one is inverted) to provide improved isolation. Including a third binary image obtained by thresholding a texture image further improves the identification. A minimal sketch of these logical combinations is given below.
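A sketch of the two combinations, assuming BW_a and BW_b are binary segmentations of the same image produced by different methods (hypothetical variable names):

  BW_and = BW_a & BW_b;    % AND: keep only pixels found by both methods
                           %   (reduces non-desired areas)
  BW_or = BW_a | BW_b;     % OR: keep pixels found by either method
                           %   (combines partial identifications)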
• 141. FIGURE 12.14 Image of cells (left) on a gray background. The textural image (right) was created based on local variance (standard deviation) and shows somewhat more definition. (Cancer cell from rat prostate, courtesy of Alan W. Partin, M.D., Ph.D., Johns Hopkins University School of Medicine.)

Example 12.4 Isolate the cell structures from the image of a cell shown in Figure 12.14.

Solution Since the cell is projected against a gray background, it is possible to isolate some portions of the cell by thresholding above and below the background level. After inversion of the lower threshold image (the one that is below the background level), the images are combined using a logical OR. Since the cell also shows some textural features, a texture image is constructed by taking the regional standard deviation (Figure 12.14). After thresholding, this texture-based image is also combined with the other two images.

  % Example 12.4 and Figures 12.14 and 12.15
  % Analysis of the image of a cell using texture and intensity
  % information then combining the resultant binary images
  % with a logical OR operation.
  clear all; close all;
  I = imread('cell.tif');           % Load cell image
  I = im2double(I);                 % Convert to double
  %
  h = fspecial('gaussian', 20, 2);  % Gaussian lowpass filter
  %
  subplot(1,2,1); imshow(I);        % Display original image
  title('Original Image');
  I_std = (nlfilter(I,[3 3],'std2'))*6;   % Texture operation
  I_lp = imfilter(I_std, h);        % Average (lowpass filter)
  %
  subplot(1,2,2); imshow(I_lp*2);   % Display texture image
  title('Filtered image');
  %
  figure;
  BW_th = im2bw(I,.5);              % Threshold image
  BW_thc = ~im2bw(I,.42);           %   and its complement
  BW_std = im2bw(I_std,.2);         % Threshold texture image
  BW1 = BW_th | BW_thc;             % Combine two thresholded images (logical OR)
  BW2 = BW_std | BW_th | BW_thc;    % Combine all three images
  subplot(2,2,1); imshow(BW_th);    % Display thresholded and
  subplot(2,2,2); imshow(BW_thc);   %   combined images
  subplot(2,2,3); imshow(BW1);
  subplot(2,2,4); imshow(BW2);
• 142. FIGURE 12.15 Isolated portions of the cells shown in Figure 12.14. The upper images were created by thresholding the intensity. The lower left image is a combination (logical OR) of the upper images, and the lower right image adds a thresholded texture-based image.

The original and texture images are shown in Figure 12.14. Note that the texture image has been scaled up, first by a factor of six, then by an additional factor of two, to bring it within a nominal image range. The intensity thresholded images are shown in Figure 12.15 (upper images; the upper right image has been inverted). These images are combined in the lower left image. The lower right image shows the combination of both intensity-based images with the thresholded texture image. This method of combining images can be extended to any number of different segmentation approaches.

MORPHOLOGICAL OPERATIONS

Morphological operations have to do with processing shapes. In this sense they are continuity-based techniques, but in some applications they also operate on
• 143. edges, making them useful in edge-based approaches as well. In fact, morphological operations have many image processing applications in addition to segmentation, and they are well represented and supported in the MATLAB Image Processing Toolbox.

The two most common morphological operations are dilation and erosion. In dilation the rich get richer, and in erosion the poor get poorer. Specifically, in dilation, the center or active pixel is set to the maximum of its neighbors, and in erosion it is set to the minimum of its neighbors. Since these operations are often performed on binary images, dilation tends to expand edges, borders, or regions, while erosion tends to decrease or even eliminate small regions. Obviously, the size and shape of the neighborhood used will have a very strong influence on the effect produced by either operation.

The two processes can be done in tandem, over the same area. Since both erosion and dilation are nonlinear operations, they are not invertible transformations; that is, one followed by the other will not generally result in the original image. If erosion is followed by dilation, the operation is termed opening. If the image is binary, this combined operation will tend to remove small objects without changing the shape and size of larger objects. Basically, the initial erosion tends to reduce all objects, but some of the smaller objects will disappear altogether. The subsequent dilation will restore those objects that were not eliminated by erosion. If the order is reversed and dilation is performed first followed by erosion, the combined process is called closing. Closing connects objects that are close to each other, tends to fill up small holes, and smooths an object's outline by filling small gaps. As with the more fundamental operations of dilation and erosion, the size of objects removed by opening or filled by closing depends on the size and shape of the neighborhood that is selected.

An example of the opening operation is shown in Figure 12.16, including the erosion and dilation steps. This is applied to the blood cell image after thresholding, the same image shown in Figure 12.3 (left side). Since we wish to eliminate black artifacts in the background, we first invert the image as shown in Figure 12.16. As can be seen in the final, opened image, there is a reduction in the number of artifacts seen in the background, but there is also now a gap created in one of the cell walls. The opening operation would be more effective on the image in which intermediate values were masked out (Figure 12.3, right side), and this is given as a problem at the end of the chapter. Figure 12.17 shows an example of closing applied to the same blood cell image. Again the operation was performed on the inverted image. This operation tends to fill the gaps in the center of the cells, but it also has filled in gaps between the cells. A much more effective approach to filling holes is to use the imfill routine described in the section on MATLAB implementation.

Other MATLAB morphological routines provide local maxima and minima, and allow for manipulating the image's maxima and minima, which implement various fill-in effects.
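Anticipating the implementation section below, opening and closing can be built from the erosion and dilation primitives; MATLAB's single-statement imopen and imclose routines, described next, are equivalent. A sketch with a hypothetical disk-shaped neighborhood:

  SE = strel('disk', 3);                      % Disk-shaped structuring element, radius 3
  BW_open = imdilate(imerode(BW, SE), SE);    % Opening: erosion followed by dilation
  BW_close = imerode(imdilate(BW, SE), SE);   % Closing: dilation followed by erosion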
• 144. FIGURE 12.16 Example of the opening operation to remove small artifacts. Note that the final image has fewer background spots, but now one of the cells has a gap in the wall.

MATLAB Implementation

Erosion and dilation could be implemented using the nonlinear filter routine nlfilter, although this routine limits the shape of the neighborhood to a rectangle. The MATLAB routines imdilate and imerode provide for a variety of neighborhood shapes and are much faster than nlfilter. As mentioned above, opening consists of erosion followed by dilation, and closing is the reverse. MATLAB also provides routines for implementing these two operations in one statement. To specify the neighborhood used by all of these routines, MATLAB uses a structuring element.* A structuring element can be defined by a binary array, where the ones represent the neighborhood and the zeros are irrelevant. This allows for easy specification of neighborhoods that are nonrectangular, indeed that can have any arbitrary shape. In addition, MATLAB makes a number of popular shapes directly available, just as the fspecial routine makes a number

*Not to be confused with a similar term, structural unit, used in the beginning of this chapter. A structural unit is the object of interest in the image.
• 145. FIGURE 12.17 Example of closing to fill gaps. In the closed image, some of the cells are now filled, but some of the gaps between cells have been erroneously filled in.

of popular two-dimensional filter functions available. The routine to specify the structuring element is strel and is called as:

  structure = strel(shape, NH, arg);

where shape is the type of shape desired, NH usually specifies the size of the neighborhood, and arg is an argument, frequently optional, that depends on shape. If shape is 'arbitrary', or simply omitted, then NH is an array that specifies the neighborhood in terms of ones as described above. Prepackaged shapes include:

  'disk'       a circle of radius NH (in pixels)
  'line'       a line of length NH and angle arg in degrees
  'rectangle'  a rectangle where NH is a two-element vector specifying rows and columns
  'diamond'    a diamond where NH is the distance from the center to each corner
  'square'     a square with linear dimensions NH
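For example, an arbitrary cross-shaped neighborhood can be specified directly as a binary array (an illustrative shape, not from the text):

  NH = [0 1 0; 1 1 1; 0 1 0];    % Ones define the active neighborhood
  SE = strel('arbitrary', NH);   % Equivalent to strel(NH)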
• 146. For many of these shapes, the routine strel produces a decomposed structure that runs significantly faster. Based on the structure, the statements for dilation, erosion, opening, and closing are:

  I1 = imdilate(I, structure);
  I1 = imerode(I, structure);
  I1 = imopen(I, structure);
  I1 = imclose(I, structure);

where I1 is the output image, I is the input image, and structure is the neighborhood specification given by strel, as described above. In all cases, structure can be replaced by an array specifying the neighborhood as ones, bypassing the strel routine. In addition, imdilate and imerode have optional arguments that provide packing and unpacking of the binary input or output images.

Example 12.5 Apply opening and closing to the thresholded blood cell images of Figure 12.3 in an effort to remove small background artifacts and to fill holes. Use a disk-shaped structure with a radius of four pixels.

  % Example 12.5 and Figures 12.16 and 12.17
  % Demonstration of morphological opening to eliminate small
  % artifacts and of morphological closing to fill gaps
  % These operations will be applied to the thresholded blood cell
  % images of Figure 12.3 (left image).
  % Uses a circular or disk-shaped structure 4 pixels in radius
  %
  clear all; close all;
  I = imread('blood1.tif');      % Get image and threshold
  I = im2double(I);
  BW = ~im2bw(I,graythresh(I));
  %
  SE = strel('disk',4);          % Define structure: disk of radius 4 pixels
  BW1 = imerode(BW,SE);          % Opening operation: erode image first,
  BW2 = imdilate(BW1,SE);        %   then dilate
  % .......display images.....
  %
  BW3 = imdilate(BW,SE);         % Closing operation: dilate image first,
  BW4 = imerode(BW3,SE);         %   then erode
  % .......display images.....
• 147. This example produced the images in Figures 12.16 and 12.17.

Example 12.6 Apply an opening operation to remove the dark patches seen in the thresholded cell image of Figure 12.15.

  % Example 12.6 and Figure 12.18
  % Use opening to remove the dark patches in the thresholded cell
  % image of Figure 12.15
  %
  close all; clear all;
  %
  SE = strel('square',5);     % Define opening structure:
                              %   square 5 pixels on a side
  load fig12_15;              % Get data of Figure 12.15 (BW2)
  BW1 = ~imopen(~BW2,SE);     % Opening operation
  % .......Display images.....

The result of this operation is shown in Figure 12.18. In this case, the opening operation is able to completely remove the dark patches in the center of the cell image. A 5-by-5 pixel square structural element was used. The size (and shape) of the structural element controlled the size of artifact removed, and no attempt was made to optimize its shape. The size was set here as the minimum that would still remove all of the dark patches. The opening operation in this example used the single statement imopen. Again, the opening operation operates on activated (i.e., white) pixels, so to remove dark artifacts it is necessary to invert the image (using the logical NOT operator, ~) before performing the opening operation. The opened image is then inverted again before display.

FIGURE 12.18 Application of the opening operation to remove the dark patches in the binary cell image in Figure 12.15 (lower right). Using a 5-by-5 square structural element resulted in eliminating all of the dark patches.
• 148. MATLAB morphology routines also allow for manipulation of maxima and minima in an image. This is useful for identifying objects, and for filling. Of the many other morphological operations supported by MATLAB, only the imfill operation will be described here. This operation begins at a designated pixel and changes connected background pixels (0's) to foreground pixels (1's), stopping only when a boundary is reached. For grayscale images, imfill brings the intensity levels of the dark areas that are surrounded by lighter areas up to the same intensity level as the surrounding pixels. (In effect, imfill removes regional minima that are not connected to the image border.) The initial pixel can be supplied to the routine or obtained interactively. Connectivity can be defined as either four-connected or eight-connected. In four connectivity, only the four pixels bordering the four edges of the pixel are considered, while in eight connectivity all pixels that touch, including those that touch only at the corners, are considered connected. The basic imfill statement is:

  I_out = imfill(I, [r c], con);

where I is the input image, I_out is the output image, [r c] is a two-element vector specifying the beginning point, and con is an optional argument that is set to 8 for eight connectivity (four connectivity is the default). (See the help file to use imfill interactively.) A special option of imfill is available specifically for filling holes. If the image is binary, a hole is a set of background pixels that cannot be reached by filling in the background from the edge of the image. If the image is an intensity image, a hole is an area of dark pixels surrounded by lighter pixels. To invoke this option, the argument following the input image should be 'holes'. Figure 12.19 shows the operation performed on the blood cell image by the statement:

  I_out = imfill(I, 'holes');

EDGE-BASED SEGMENTATION

Historically, edge-based methods were the first set of tools developed for segmentation. To move from edges to segments, it is necessary to group edges into chains that correspond to the sides of structural units, i.e., the structural boundaries. Approaches vary in how much prior information they use, that is, how much is used of what is known about the possible shape. False edges and missed edges are two of the more obvious, and more common, problems associated with this approach.

The first step in edge-based methods is to identify edges which then become candidates for boundaries. Some of the filters presented in Chapter 11
• 149. FIGURE 12.19 Hole filling operation produced by imfill. Note that neither the edge cell (at the upper image boundary) nor the overlapped cell in the center is filled, since they are not actually holes. (Original image reprinted with permission from The Image Processing Handbook, 2nd edition. Copyright CRC Press, Boca Raton, Florida.)

perform edge enhancement, including the Sobel, Prewitt, and log (Laplacian of Gaussian) filters. In addition, the Laplacian, which takes the spatial second derivative, can be used to find edge candidates. The Canny filter is the most advanced edge detector supported by MATLAB, but it necessarily produces a binary output, while many of the secondary operations require a graded edge image.

Edge relaxation is one approach used to build chains from individual edge candidate pixels. This approach takes into account the local neighborhood: weak edges positioned between strong edges are probably part of the edge, while strong edges in isolation are likely spurious. The Canny filter incorporates a type of edge relaxation. Various formal schemes have been devised under this category. A useful method is described in Sonka (1995) that establishes edges between pixels (so-called crack edges) based on the pixels located at the end points of the edge.

Another method for extending edges into chains is termed graph searching. In this approach, the endpoints (which could both be the same point in a closed boundary) are specified, and the edge is determined based on minimizing some cost function. Possible pathways between the endpoints are selected from candidate pixels, those that exceed some threshold. The actual path is selected based on a minimization of the cost function. The cost function could include features such as the strength of an edge pixel and total length, curvature, and proximity of the edge to other candidate borders. This approach allows for a
• 150. great deal of flexibility. Finally, dynamic programming can be used, which is also based on minimizing a cost function.

The methods briefly described above use local information to build up the boundaries of the structural elements. Details of these methods can be found in Sonka et al. (1995). Model-based edge detection methods can be used to exploit prior knowledge of the structural unit. For example, if the shape and size of the image is known, then a simple matching approach based on correlation can be used (matched filtering). When the general shape is known, but not the size, the Hough transform can be used. This approach was originally designed for identifying straight lines and curves, but can be expanded to other shapes provided the shape can be described analytically.

The basic idea behind the Hough transform is to transform the image into a parameter space that is constructed specifically to describe the desired shape analytically. Maxima in this parameter space then correspond to the presence of the desired image in image space. For example, if the desired object is a straight line (the original application of the Hough transform), one analytic representation for this shape is y = mx + b,* and such shapes can be completely defined by a two-dimensional parameter space of m and b parameters. All straight lines in image space map to points in parameter space (also known as the accumulator array for reasons that will become obvious). Operating on a binary image of edge pixels, all possible lines through a given pixel are transformed into m,b combinations, which then increment the accumulator array. Hence, the accumulator array accumulates the number of potential lines that could exist in the image. Any active pixel will give rise to a large number of possible line slopes, m, but only a limited number of m,b combinations. If the image actually contains a line, then the accumulator element that corresponds to that particular line's m,b parameters will have accumulated a large number. The accumulator array is searched for maxima, or supra-threshold locations, and these locations identify a line or lines in the image.

This concept can be generalized to any shape that can be described analytically, although the parameter space (i.e., the accumulator) may have to include several dimensions. For example, to search for circles, note that a circle can be defined in terms of three parameters, a, b, and r, by the equation given below:

  (y − a)² + (x − b)² = r²   (1)

where a and b define the center point of the circle and r is the radius. Hence the accumulator space must be three-dimensional to represent a, b, and r.

*This representation of a line will not be able to represent vertical lines since m → ∞ for a vertical line. However, lines can also be represented in two dimensions using the polar parameters r and θ: r = x cos θ + y sin θ.
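The accumulator idea is simple to sketch directly. The code below implements a line Hough transform in the (r, θ) form of the footnote, assuming BW is a binary edge image; MATLAB's radon routine, described next, performs an equivalent computation far more efficiently.

  [yy, xx] = find(BW);                      % Coordinates of active (edge) pixels
  theta = (0:179)*pi/180;                   % Angles to test
  rmax = ceil(sqrt(size(BW,1)^2 + size(BW,2)^2));
  A = zeros(2*rmax+1, length(theta));       % Accumulator array
  for k = 1:length(xx)
      r = round(xx(k)*cos(theta) + yy(k)*sin(theta));
      for j = 1:length(theta)
          A(r(j)+rmax+1, j) = A(r(j)+rmax+1, j) + 1;   % Increment accumulator
      end
  end
  [rpk, cpk] = find(A == max(A(:)));        % Maximum identifies the strongest line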
• 151. MATLAB Implementation

Of the techniques described above, only the Hough transform is supported by MATLAB image processing routines, and then only for straight lines. It is supported as the Radon transform, which computes projections of the image along a straight line, but this projection can be done at any angle.* This results in a projection matrix that is the same as the accumulator array for a straight line Hough transform when expressed in polar form. The Radon transform is implemented by the statement:

  [R, xp] = radon(BW, theta);

where BW is a binary input image and theta is the projection angle in degrees, usually a vector of angles. If not specified, theta defaults to 0:179. R is the projection array, where each column is the projection at a specific angle. (R is a column vector if theta is a constant.) Hence, maxima in R correspond to the positions (encoded as an angle and distance) of lines in the image. An example of the use of radon to perform the Hough transformation is given in Example 12.7.

Example 12.7 Find the strongest line in the image of Saturn in image file 'saturn.tif'. Plot that line superimposed on the image.

Solution First convert the image to an edge array using MATLAB's edge routine. Use the Hough transform (implemented for straight lines using radon) to build an accumulator array. Find the maximum point in that array (using max), which will give theta, the angle perpendicular to the line, and the distance along that perpendicular line of the intersection. Convert that line to rectangular coordinates, then plot the line superimposed on the image.

  % Example 12.7 Example of the Hough transform
  % (implemented using 'radon') to identify lines in an image.
  % Use the image of Saturn in 'saturn.tif'
  %
  clear all; close all;
  radians = 2*pi/360;            % Convert from degrees to radians
  I = imread('saturn.tif');      % Get image of Saturn
  theta = 0:179;                 % Define projection angles
  BW = edge(I,.02);              % Threshold image, threshold set
  [R,xp] = radon(BW,theta);      % Hough (Radon) transform
  [X, map] = gray2ind(mat2gray(R));  % Convert to indexed image

*The Radon transform is an important concept in computed tomography (CT) as described in a following section.
• 152.
  %
  subplot(1,2,1); imshow(BW)       % Display results
  title('Saturn - Thresholded');
  subplot(1,2,2); imshow(X, hot);  % The hot colormap gives better
                                   %   reproduction
  %
  [M, c] = max(max(R));            % Find maximum element
  [M, r] = max(R(:,c));
  % Convert to rectangular coordinates
  [ri ci] = size(BW);              % Size of image array
  [ra ca] = size(R);               % Size of accumulator array
  m = tan((c-90)*radians);         % Slope from theta
  b = -r/cos((c-90)*radians);      % Intercept from basic trigonometry
  x = (0:ci);
  y = m*x + b;                     % Construct line
  subplot(1,2,1); hold on;
  plot(x,-y,'r');                  % Plot line on graph
  subplot(1,2,2); hold on;
  plot(c, ra-r,'*k');              % Plot maximum point in accumulator

This example produces the images shown in Figure 12.20. The broad white line superimposed is the line found as the most dominant using the Hough transform. The location of this line in the accumulator or parameter space array is shown in the right-hand image. Other points nearly as strong (i.e., bright) can be seen in the parameter array, which represent other lines in the image. Of course, it is possible to identify these lines as well by searching for maxima other than the global maximum. This is done in a problem below.

PROBLEMS

1. Load the blood cell image (blood1.tif). Filter the image with two lowpass filters, one having a weak cutoff (for example, Gaussian with an alpha of 0.5) and the other having a strong cutoff (alpha > 4). Threshold the two filtered images using the variance-based routine graythresh. Display the original and filtered images along with their histograms. Also display the thresholded images.

2. The Laplacian filter, which calculates the second derivative, can also be used to find edges. In this case edges will be located where the second derivative is near zero. Load the image of the spine ('spine.tif') and filter using the Laplacian filter (use the default constant). Then threshold this image using
• 153. FIGURE 12.20 Thresholded image of Saturn (from MATLAB's saturn.tif) with the dominant line found by the Hough transform. The right image is the accumulator array with the maximum point indicated by an '*'. (Original image is a public domain image courtesy of NASA, Voyager 2 image, 1981-08-24.)

islice. The threshold values should be on either side of zero and should be quite small (< 0.02) since you are interested in values quite close to zero.

3. Load image 'texture3.tif' which contains three regions having the same average intensities but different textural patterns. Before applying the nonlinear range operator used in Example 12.2, preprocess with a Laplacian filter (alpha = 0.5). Apply the range operator as in Example 12.2 using nlfilter. Plot the original and range images along with their histograms. Threshold the range image to isolate the segments and compare with the figures in the book. (Hint: You may have to adjust the thresholds slightly, but you do not have to rerun the time-consuming range operator to adjust these thresholds.) You should observe a modest improvement: one of the segments can now be perfectly separated.
• 154. 4. Load the texture orientation image texture4.tif. Separate the segments as well as possible by using a Sobel operator followed by a standard deviation operator implemented using nlfilter. (Note: you will have to multiply the standard deviation image by around 4 to get it into an appropriate range.) Plot the histogram and use it to determine the best boundaries for separating the three segments. Display the three segments as white objects.

5. Load the thresholded image of Figure 12.5 (found as Fig12_5.tif on the disk) and use opening to eliminate as many points as possible in the upper field without affecting the lower field. Then use closing to try to blacken as many points as possible in the lower field without affecting the upper field. (You should be able to blacken the lower field completely except for edge effects.)
• 155. 13 Image Reconstruction

Medical imaging utilizes several different physical principles or imaging modalities. Common modalities used clinically include x-ray, computed tomography (CT), positron emission tomography (PET), single photon emission computed tomography (SPECT), and ultrasound. Other approaches under development include optical imaging* and impedance tomography. Except for simple x-ray images, which provide a shadow of intervening structures, some form of image processing is required to produce a useful image. The algorithms used for image reconstruction depend on the modality. In magnetic resonance imaging (MRI), reconstruction techniques are fairly straightforward, requiring only a two-dimensional inverse Fourier transform (described later in this chapter). Positron emission tomography (PET) and computed tomography use projections from collimated beams, and the reconstruction algorithm is critical. The quality of the image is strongly dependent on the image reconstruction algorithm.†

*Of course, optical imaging is used in microscopy, but because of scattering it presents serious problems when deep tissues are imaged. A number of advanced image processing methods are under development to overcome problems due to scattering and provide useful images using either coherent or noncoherent light.

†CT may be the first instance where the analysis software is an essential component of medical diagnosis and comes between the physician and patient: the physician has no recourse but to trust the software.
• 156. CT, PET, AND SPECT

Reconstructed images from PET, SPECT, and CT all use collimated beams directed through the target, but they vary in the mechanism used to produce these collimated beams. CT is based on x-ray beams produced by an external source that are collimated by the detector: the detector includes a collimator, usually a long tube that absorbs diagonal or off-axis photons. A similar approach is used for SPECT, but here the photons are produced by the decay of a radioactive isotope within the patient. Because of the nature of the source, the beams are not as well collimated in SPECT, and this leads to an unavoidable reduction in image resolution. Although PET is also based on photons emitted from a radioactive isotope, the underlying physics provide an opportunity to improve beam collimation through so-called electronic collimation. In PET, the radioactive isotope emits a positron. Positrons are short lived, and after traveling only a short distance, they interact with an electron. During this interaction, their masses are annihilated and two photons are generated traveling in opposite directions, 180 degrees from one another. If two separate detectors are activated at essentially the same time, then it is likely a positron annihilation occurred somewhere along a line connecting these two detectors. This coincident detection provides an electronic mechanism for establishing a collimated path that traverses the original positron emission. Note that since the positron does not decay immediately, but may travel several cm in any direction before annihilation, there is an inherent limitation on resolution.

In all three modalities, the basic data consists of measurements of the absorption of x-rays (CT) or concentrations of radioactive material (PET and SPECT) along a known beam path. From this basic information, the reconstruction algorithm must generate an image of either the tissue absorption characteristics or isotope concentrations. The mathematics are fairly similar for both absorption and emission processes and will be described here in terms of absorption processes, i.e., CT. (See Kak and Slaney (1988) for a mathematical description of emission processes.)

In CT, the intensity of an x-ray beam is dependent on the intensity of the source, Io, the absorption coefficient, µ, and the length, R, of the intervening tissue:

  I(x,y) = Io exp(−µR)   (1)

where I(x,y) is the beam intensity (proportional to the number of photons) at position x,y. If the beam passes through tissue components having different absorption coefficients then, assuming the tissue is divided into equal sections ∆R, Eq. (1) becomes:

  I(x,y) = Io exp(−∑i µi(x,y) ∆R)   (2)
• 157. The projection p(x,y) is the log of the intensity ratio, and is obtained by dividing out Io and taking the natural log:

  p(x,y) = ln(Io / I(x,y)) = ∑i µi(x,y) ∆R   (3)

Eq. (3) is also expressed as a continuous equation where it becomes the line integral of the attenuation coefficients along the path from the source to the detector:

  p(x,y) = ∫ µ(x,y) dR   (4)

Figure 13.1A shows a series of collimated parallel beams traveling through tissue.* All of these beams are at the same angle, θ, with respect to the reference axis. The output of each beam is just the projection of the absorption characteristics of the intervening tissue as defined in Eq. (4). The projections of all the individual parallel beams constitute a projection profile of the intervening

FIGURE 13.1 (A) A series of parallel beam paths at a given angle, θ, is projected through biological tissue. The net absorption of each beam can be plotted as a projection profile. (B) A large number of such parallel paths, each at a different angle, is required to obtain enough information to reconstruct the image.

*In modern CT scanners, the beams are not parallel, but dispersed in a spreading pattern from a single source to an array of detectors, a so-called fan beam pattern. To simplify the analysis presented here, we will assume a parallel beam geometry. Kak and Slaney (1988) also cover the derivation of reconstruction algorithms for fan beam geometry.
• 158. tissue absorption coefficients. With only one projection profile, it is not possible to determine how the tissue absorptions are distributed along the paths. However, if a large number of projections are taken at different angles through the tissue, Figure 13.1B, it ought to be possible, at least in principle, to estimate the distribution of absorption coefficients from some combined analysis applied to all of the projections. This analysis is the challenge given to the CT reconstruction algorithm.

If the problem were reversed, that is, if the distribution of tissue absorption coefficients were known, determining the projection profile produced by a set of parallel beams would be straightforward. As stated in Eq. (4), the output of each beam is the line integral over the beam path through the tissue. If the beam is at an angle, θ (Figure 13.2), then the equation for a line passing through the origin at angle θ is:

  x cos θ + y sin θ = 0   (5)

and the projection for that single line at a fixed angle, pθ, becomes:

  pθ = ∫∫ I(x,y) δ(x cos θ + y sin θ) dx dy   (6)

where the integrals run over all x and y, and I(x,y) is the distribution of absorption coefficients as in Eq. (2). If the beam is displaced a distance, r, from the axis in a direction perpendicular to θ, Figure 13.2, the equation for that path is:

FIGURE 13.2 A single beam path is defined mathematically by the equation given in Eq. (5).
• 159.
  x cos θ + y sin θ − r = 0   (7)

The whole family of parallel paths can be mathematically defined using Eqs. (6) and (7) combined with the Dirac delta distribution, δ, to represent the discrete parallel beams. The equation describing the entire projection profile, pθ(r), becomes:

  pθ(r) = ∫∫ I(x,y) δ(x cos θ + y sin θ − r) dx dy   (8)

This equation is known as the Radon transform, ℛ. It is the same as the Hough transform (Chapter 12) for the case of straight lines. The expression for pθ(r) can also be written succinctly as:

  pθ(r) = ℛ[I(x,y)]   (9)

The forward Radon transform can be used to generate raw CT data from image data, useful in problems, examples, and simulations. This is the approach that is used in some of the examples given in the MATLAB Implementation section, and also to generate the CT data used in the problems.

The Radon transform is helpful in understanding the problem, but does not help in the actual reconstruction. Reconstructing the image from the projection profiles is a classic inverse problem. You know what comes out (the projection profiles), but want to know the image (or, in the more general case, the system) that produced that output. From the definition of the Radon transform in Eq. (9), the image should result from the application of an inverse Radon transform, ℛ⁻¹, to the projection profiles, pθ(r):

  I(x,y) = ℛ⁻¹[pθ(r)]   (10)

While the Radon transform (Eqs. (8) and (9)) and inverse Radon transform (Eq. (10)) are expressed in terms of continuous variables, in imaging systems the absorption coefficients are given in terms of discrete pixels, I(n,m), and the integrals in the above equations become summations. In the discrete situation, the absorption of each pixel is an unknown, and each beam path provides a single projection ratio that is the solution to a multi-variable equation. If the image contains N by M pixels, and there are N × M different projections (beam paths) available, then the system is adequately determined, and the reconstruction problem is simply a matter of solving a large number of simultaneous equations. Unfortunately, the number of simultaneous equations that must be solved is generally so large that a direct solution becomes unworkable. The early attempts at CT reconstruction used an iterative approach called the algebraic reconstruction technique, or ART. In this algorithm, each pixel was updated based on errors between projections that would be obtained from the current pixel values and the actual projections. When many pixels are involved, convergence was slow and the algorithm was computationally intensive and time-consuming.
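A brief sketch of this use of the forward transform, using the Toolbox test image generator phantom (the image size and angle set are illustrative):

  I = phantom(128);              % Shepp-Logan head phantom test image
  theta = 0:179;                 % Projection angles in degrees
  [p, xp] = radon(I, theta);     % Forward Radon transform: raw projection profiles
  I_rec = iradon(p, theta);      % Filtered back-projection reconstruction (described below)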
  • 160. gence was slow and the algorithm was computationally intensive and time- consuming. Current approaches can be classified as either transform methods or series expansion methods. The filtered back-projection method described below falls into the first category and is one of the most popular of CT reconstruction approaches. Filtered back-projection can be described in either the spatial or spatial frequency domain. While often implemented in the latter, the former is more intuitive. In back-projection, each pixel absorption coefficient is set to the sum (or average) of the values of all projections that traverse the pixel. In other words, each projection that traverses a pixel contributes its full value to the pixel, and the contributions from all of the beam paths that traverse that pixel are simply added or averaged. Figure 13.3 shows a simple 3-by-3 pixel grid with a highly absorbing center pixel (absorption coefficient of 8) against a back- ground of lessor absorbing pixels. Three projection profiles are shown traversing the grid horizontally, vertically, and diagonally. The lower grid shows the image that would be reconstructed using back-projection alone. Each grid contains the average of the projections though that pixel. This reconstructed image resembles the original with a large central value surrounded by smaller values, but the background is no longer constant. This background variation is the result of blurring or smearing the central image over the background. To correct the blurring or smoothing associated with the back-projection method, a spatial filter can be used. Since the distortion is in the form of a blurring or smoothing, spatial differentiation is appropriate. The most common filter is a pure derivative up to some maximum spatial frequency. In the fre- quency domain, this filter, termed the Ram-Lak filter, is a ramp up to some maximum cutoff frequency. As with all derivative filters, high-frequency noise will be increased, so this filter is often modified by the addition of a lowpass filter. Lowpass filters that can be used include the Hamming window, the Han- ning window, a cosine window, or a sinc function window (the Shepp-Logan filter). (The frequency characteristics of these filters are shown in Figure 13.4). Figure 13.5 shows a simple image of a light square on a dark background. The projection profiles produced by the image are also shown (calculated using the Radon transform). The back-projection reconstruction of this image shows a blurred version of the basic square form with indistinct borders. Application of a highpass filter sharpens the image (Figure 13.4). The MATLAB implementation of the inverse Radon transform, iradon described in the next section, uses the filtered back- projection method and also provides for all of the filter options. Filtered back-projection is easiest to implement in the frequency domain. The Fourier slice theorem states that the one-dimensional Fourier transform of a projection profile forms a single, radial line in the two-dimensional Fourier transform of the image. This radial line will have the same angle in the spatial Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
• 161. FIGURE 13.3 Example of back-projection on a simple 3-by-3 pixel grid. The upper grid represents the original image, which contains a dark (absorption 8) center pixel surrounded by lighter (absorption 2) pixels. The projections are taken as the linear addition of all intervening pixels. In the lower reconstructed image, each pixel is set to the average of all beams that cross that pixel. (Normally the sum would be taken over a much larger set of pixels.) The center pixel is still higher in absorption, but the background is no longer the same. This represents a smearing of the original image.

Once the two-dimensional Fourier transform space is filled from the individual one-dimensional Fourier transforms of the projection profiles, the image can be constructed by applying the inverse two-dimensional Fourier transform to this space. Before the inverse transform is done, the appropriate filter can be applied directly in the frequency domain using multiplication.

As with other images, reconstructed CT images can suffer from aliasing if they are undersampled. Undersampling can be the result of an insufficient
• 162. FIGURE 13.4 Magnitude frequency characteristics of four common filters used in filtered back-projection. They all show highpass characteristics at lower frequencies. The cosine filter has the same frequency characteristics as the two-point central difference algorithm.

number of parallel beams in the projection profile or too few rotation angles. The former is explored in Figure 13.7, which shows the square pattern of Figure 13.5 sampled with one-half (left-hand image) and one-quarter (right-hand image) the number of parallel beams used in Figure 13.5. The images have been multiplied by a factor of 10 to enhance the faint aliasing artifacts. One of the problems at the end of this chapter explores the influence of undersampling by reducing the number of angular rotations as well as reducing the number of parallel beams.

Fan Beam Geometry

For practical reasons, modern CT scanners use fan beam geometry. This geometry usually involves a single source and a ring of detectors. The source rotates around the patient while those detectors in the beam path acquire the data. This allows very high speed image acquisition, as short as half a second. The source fan beam is shaped so that the beam hits a number of detectors simultaneously (Figure 13.8). MATLAB provides several routines that implement the Radon and inverse Radon transforms for fan beam geometry.
• 163. FIGURE 13.5 Image reconstruction of a simple white square against a black background. Back-projection alone produces a smeared image which can be corrected with a spatial derivative filter. These images were generated using the code given in Example 13.1.

MATLAB Implementation

Radon Transform

The MATLAB Image Processing Toolbox contains routines that perform both the Radon and inverse Radon transforms. The Radon transform routine has already been introduced as an implementation of the Hough transform for straight line objects. The procedure here is essentially the same, except that an intensity image is used as the input instead of the binary image used in the Hough transform.

[p, xp] = radon(I, theta);

where I is the image of interest and theta is the projection angle in degrees, usually a vector of angles. If not specified, theta defaults to (1:179). The output parameter p is the projection array, where each column is the projection profile at a specific angle. The optional output parameter, xp, gives the radial coordinates for each row of p and can be used in displaying the projection data.
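A brief usage sketch may help fix the calling structure. It generates a test image, computes its projections, and displays the projection array as an image (a sinogram); the phantom test image and the display details are illustrative choices, not part of the routines just described.

% Compute and display projection profiles of a test image
I = phantom(128);                 % Shepp-Logan head phantom
theta = (1:179);                  % Projection angles in degrees
[p, xp] = radon(I, theta);        % One projection profile per column
imagesc(theta, xp, p);            % Show profiles as a function of angle
xlabel('Projection angle (deg.)');
ylabel('Radial position, xp');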
• 164. FIGURE 13.6 Schematic representation of the steps in filtered back-projection using frequency domain techniques. The steps shown are for a single projection profile and would be repeated for each projection angle.

FIGURE 13.7 Image reconstructions of the same simple pattern shown in Figure 13.5, but undersampled by a factor of two (left image) or four (right image). The contrast has been increased by a factor of ten to enhance the relatively low-intensity aliasing patterns.
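The Fourier slice theorem behind Figure 13.6 can be checked numerically. The sketch below compares the one-dimensional Fourier transform of the zero-degree projection with the corresponding central row of the two-dimensional Fourier transform of the image. The match is only approximate, since radon pads the image and uses its own radial sampling, so this is a qualitative demonstration rather than an exact identity.

% Rough numerical illustration of the Fourier slice theorem
I = phantom(128);
p0 = radon(I, 0);                     % Projection at 0 deg.
M = length(p0);
S1 = abs(fftshift(fft(p0)));          % 1-D FT of the projection profile
F = fftshift(abs(fft2(I, M, M)));     % 2-D FT of the (zero-padded) image
S2 = F(ceil((M+1)/2), :)';            % Central row: the zero-deg. slice
plot(S1/max(S1)); hold on;            % Compare normalized magnitudes
plot(S2/max(S2), '--');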
• 165. FIGURE 13.8 A series of beams is projected from a single source in a fan-like pattern. The beams fall upon a number of detectors arranged in a ring around the patient. Fan beams typically range between 30 and 60 deg. In the most recent CT scanners (so-called fourth-generation machines) the detectors completely encircle the patient, and the source can rotate continuously.

Inverse Radon Transform: Parallel Beam Geometry

MATLAB's inverse Radon transform is based on filtered back-projection and uses the frequency domain approach illustrated in Figure 13.6. A variety of filtering options are available and are implemented directly in the frequency domain. The calling structure of the inverse Radon transform is:

[I,h] = iradon(p,theta,interp,filter,d,n);

where p is the only required input argument and is a matrix where each column contains one projection profile. The angle of the projection profiles is specified by theta in one of two ways: if theta is a scalar, it specifies the angular spacing (in degrees) between projection profiles (the angles are then assumed to run from zero in steps of theta up to the number of columns − 1 times theta); if theta is a vector, it specifies the angles themselves, which must be evenly spaced. The default theta is 180 deg. divided by the number of columns. During reconstruction, iradon assumes that the center of rotation is half the number of rows (i.e., the midpoint of the projection profile: ceil(size(p,1)/2)).

The optional argument interp is a string specifying the back-projection interpolation method: 'nearest', 'linear' (the default), and 'spline'.
• 166. The filter option is also specified as a string. The 'Ram-Lak' option is the default and consists of a ramp in frequency (i.e., an ideal derivative) up to some maximum frequency (Figure 13.4). Since this filter is prone to high-frequency noise, other options multiply the ramp function by a lowpass function. These lowpass functions are the same as described above: Hamming window ('Hamming'), Hanning window ('Hann'), cosine ('cosine'), and sinc ('Shepp-Logan') function. Frequency plots of several of these filters are shown in Figure 13.4. The filter's frequency characteristics can be modified by the optional parameter, d, which scales the frequency axis: if d is less than one (the default value is one), then filter transfer function values above d, in normalized frequency, are set to 0. Hence, decreasing d increases the lowpass filter effect. The optional input argument, n, can be used to rescale the image. These filter options are explored in several of the problems.

The image is contained in the output matrix I (class double), and the optional output vector, h, contains the filter's frequency response. (This output vector was used to generate the filter frequency curves of Figure 13.4.) An application of the inverse Radon transform is given in Example 13.1.

Example 13.1 Example of the use of back-projection and filtered back-projection. After a simple image of a white square against a dark background is generated, the CT projections are constructed using the forward Radon transform. The original image is reconstructed from these projections using both the filtered and unfiltered back-projection algorithms. The original image, the projections, and the two reconstructed images are displayed in Figure 13.5.

% Example 13.1 and Figure 13.5
% Image reconstruction using back-projection and filtered
% back-projection.
% Uses MATLAB's 'iradon' for filtered back-projection and
% 'i_back' for unfiltered back-projection. (This routine is
% a version of 'iradon' modified to eliminate the filter.)
% Construct a simple image consisting of a white square against
% a black background. Then apply back-projection without
% filtering and with the derivative (Ram-Lak) filter.
% Display the original and reconstructed images along with the
% projections.
%
clear all; close all;
%
I = zeros(128,128);          % Construct image: black background
I(44:84,44:84) = 1;          % with a central white square
• 167. %
% Generate the projections using 'radon'
theta = (1:180);             % Angle between projections is 1 deg.
[p,xp] = radon(I, theta);
%
% Now reconstruct the image
I_back = i_back(p,theta);            % Back-projection alone
I_back = mat2gray(I_back);           % Convert to grayscale
I_filter_back = iradon(p,theta);     % Filtered back-projection
% .......Display images.......

The display generated by this code is given in Figure 13.5. Example 13.2 explores the effect of filtering on the reconstructed images.

Example 13.2 The inverse Radon transform filters. Generate CT data by applying the Radon transform to an MRI image of the brain (an unusual example of mixed modalities!). Reconstruct the image using the inverse Radon transform with the Ram-Lak (derivative) filter and the cosine filter with a maximum relative frequency of 0.4. Display the original and reconstructed images.

% Example 13.2 and Figure 13.9 Image reconstruction using
% filtered back-projection
% Uses MATLAB's 'iradon' for filtered back-projection
% Load a frame of the MRI image (mri.tif) and construct the CT
% projections using 'radon'. Then apply back-projection with
% two different filters: Ram-Lak and cosine (with 0.4 as the
% highest relative frequency)
%
clear all; close all;
frame = 18;                          % Use MR image slice 18
[I(:,:,:,1), map] = imread('mri.tif',frame);
if isempty(map) == 0                 % Check to see if indexed data
   I = ind2gray(I,map);              % If so, convert to intensity image
end
I = im2double(I);                    % Convert to double and scale
%
% Construct projections of MR image
delta_theta = (1:180);               % Angle between projections is 1 deg.
[p,xp] = radon(I,delta_theta);
• 168. %
% Reconstruct image using the Ram-Lak filter
I_RamLak = iradon(p,delta_theta,'Ram-Lak');
% .......Display images.......

Radon and Inverse Radon Transform: Fan Beam Geometry

The MATLAB routines for performing the Radon and inverse Radon transforms using fan beam geometry are termed fanbeam and ifanbeam, respectively, and have the form:

fan = fanbeam(I,D)

where I is the input image and D is a scalar that specifies the distance between the beam vertex and the center of rotation of the beams. The output, fan, is a matrix containing the fan beam projection profiles, where each column contains the sensor samples at one rotation angle. It is assumed that the sensors have a one-deg. spacing and the rotation angles are spaced equally over 0 to 359 deg. A number of optional input variables specify different geometries, sensor spacing, and rotation increments.

The inverse Radon transform for fan beam projections is specified as:

I = ifanbeam(fan,D)

FIGURE 13.9 Original MR image and reconstructed images using the inverse Radon transform with the Ram-Lak derivative filter and the cosine filter. The cosine filter's lowpass cutoff has been modified by setting its maximum relative frequency to 0.4. The Ram-Lak reconstruction is not as sharp as the original image, and sharpness is reduced further by the cosine filter with its lowered bandwidth. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The Math Works, Inc. Reprinted with permission.)
• 169. where fan is the matrix of projections and D is the distance between the beam vertex and the center of rotation. The output, I, is the reconstructed image. Again there are a number of optional input arguments specifying the same type of information as in fanbeam. This routine first converts the fan beam geometry into a parallel geometry, then applies filtered back-projection as in iradon. During the filtered back-projection stage, it is possible to specify filter options as in iradon. To specify a filter, the string 'Filter' should precede the filter name ('Hamming', 'Hann', 'cosine', etc.).

Example 13.3 Fan beam geometry. Apply the fan beam and parallel beam Radon transforms to a pattern of four squares of different intensities. Reconstruct the image using the inverse Radon transform for both geometries.

% Example 13.3 and Figure 13.10
% Example of reconstruction using fan beam geometry
% Reconstructs a pattern of 4 squares of different intensities
% using parallel beam and fan beam approaches.
%
clear all; close all;
D = 150;                 % Distance between fan beam vertex
                         % and center of rotation
theta = (1:180);         % Angle between parallel projections is 1 deg.
%
I = zeros(128,128);      % Generate image
I(22:52,22:52) = .25;    % Four squares of different shades
I(76:106,22:52) = .5;    % against a black background
I(22:52,76:106) = .75;
I(76:106,76:106) = 1;
%
% Construct projections: fan and parallel beam
[F,Floc,Fangles] = fanbeam(I,D,'FanSensorSpacing',.5);
[R,xp] = radon(I,theta);
%
% Reconstruct images. Use the Shepp-Logan filter
I_rfb = ifanbeam(F,D,'FanSensorSpacing',.5,'Filter', ...
   'Shepp-Logan');
I_filter_back = iradon(R,theta,'Shepp-Logan');
%
% Display images
subplot(1,2,1); imshow(I_rfb); title('Fan Beam')
subplot(1,2,2); imshow(I_filter_back); title('Parallel Beam')
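As an aside, the filter and cutoff options mentioned above can also be compared directly. The following small sketch, not part of Example 13.3, reconstructs noisy projections with the default Ram-Lak filter and with a Hamming-windowed filter cut off at half the maximum frequency; the noise level and cutoff are arbitrary illustrative values.

% Compare iradon filter options on noisy projections
I = phantom(128);
theta = (1:180);
p = radon(I,theta);
p = p + 0.5*randn(size(p));                      % Add detector noise
I1 = iradon(p,theta);                            % Ram-Lak (the default)
I2 = iradon(p,theta,'linear','Hamming',0.5);     % Hamming window, d = 0.5
subplot(1,2,1); imshow(mat2gray(I1)); title('Ram-Lak')
subplot(1,2,2); imshow(mat2gray(I2)); title('Hamming, d = 0.5')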
• 170. The images generated by Example 13.3 are shown in Figure 13.10. There are small artifacts due to the distance between the beam source and the center of rotation. The effect of this distance is explored in one of the problems.

MAGNETIC RESONANCE IMAGING

Basic Principles

MRI images can be acquired in a number of ways using different image acquisition protocols. One of the more common protocols, the spin echo pulse sequence, will be described with the understanding that a fair number of alternatives are commonly used. In this sequence, the image is constructed on a slice-by-slice basis, although the data are obtained on a line-by-line basis. For each slice, the raw MRI data encode the image as a variation in signal frequency in one dimension, and in signal phase in the other. To reconstruct the image only requires the application of a two-dimensional inverse Fourier transform to this frequency/phase encoded data. If desired, spatial filtering can be implemented in the frequency domain before applying the inverse Fourier transform.

The physics underlying MRI is involved and requires quantum mechanics for a complete description. However, most descriptions are approximations that use classical mechanics. The description provided here will be even more abbreviated than most. (For a detailed classical description of the MRI physics see Wright's chapter in Enderle et al., 2000.) Nuclear magnetism occurs in nuclei with an odd number of nucleons (protons and/or neutrons). In the presence of a magnetic field such nuclei possess a magnetic dipole due to a quantum mechanical

FIGURE 13.10 Reconstruction of an image of four squares at different intensities using parallel beam and fan beam geometry. Some artifact is seen in the fan beam geometry due to the distance between the beam source and object (see Problem 3).
• 171. property known as spin.* In MRI lingo, the nucleus and/or the associated magnetic dipole is termed a spin. For clinical imaging, the hydrogen proton is used because it occurs in large numbers in biological tissue. Although there are a large number of hydrogen protons, or spins, in biological tissue (1 mm3 of water contains 6.7 × 10^19 protons), the net magnetic moment that can be produced, even if they were all aligned, is small due to the near balance between spin-up (+1/2) and spin-down (−1/2) states. When they are placed in a magnetic field, the magnetic dipoles are not static, but rotate around the axis of the applied magnetic field like spinning tops, Figure 13.11A (hence, the spins themselves spin). A group of these spins produces a net moment in the direction of the magnetic field, z, but since they are not in phase, any horizontal moment in the x and y direction tends to cancel (Figure 13.11B). While the various spins do not have the same relative phase, they do all rotate at the same frequency, a frequency given by the Larmor equation:

ωo = γH (11)

FIGURE 13.11 (A) A single proton has a magnetic moment which rotates in the presence of an applied magnetic field, Bz. This dipole moment could be up or down with a slight favoritism towards up, as shown. (B) A group of upward dipoles create a net moment in the same direction as the magnetic field, but any horizontal moments (x or y) tend to cancel. Note that all of these dipole vectors should be rotating, but for obvious reasons they are shown as stationary with the assumption that they rotate, or more rigorously, that the coordinate system is rotating.

*Nuclear spin is not really a spin, but another one of those mysterious quantum mechanical properties. Nuclear spin can take on values of ±1/2, with +1/2 slightly favored in a magnetic field.
• 172. where ωo is the frequency in radians, H is the magnitude of the magnetic field, and γ is a constant termed the gyromagnetic constant. Although γ is primarily a function of the type of nucleus, it also depends slightly on the local chemical environment. As shown below, this equation contains the key to spatial localization in MRI: variations in local magnetic field will encode as variations in rotational frequency of the protons.

If these rotating spins are exposed to electromagnetic energy at the rotational or Larmor frequency specified in Eq. (11), they will absorb this energy and rotate further and further from their equilibrium position near the z axis: they are tipped away from the z axis (Figure 13.12A). They will also be synchronized by this energy, so that they now have a net horizontal moment. For protons, the Larmor frequency is in the radio frequency (rf) range, so an rf pulse of the appropriate frequency in the xy-plane will tip the spins away from the z-axis an amount that depends on the length of the pulse:

θ = γHTp (12)

where θ is the tip angle and Tp is the pulse time. Usually Tp is adjusted to tip the spins either 90 or 180 deg. As described subsequently, a 90 deg. tip is used to generate the strongest possible signal and a 180 deg. tip, which changes the sign of the

FIGURE 13.12 (A) After an rf pulse that tips the spins 90 deg., the net magnetic moment looks like a vector, Mxy, rotating in the xy-plane. The net vector in the z direction is zero. (B) After the rf energy is removed, all of the spins begin to relax back to their equilibrium position, increasing the z component, Mz, and decreasing the xy component, Mxy. The xy component also decreases as the spins desynchronize.
• 173. moment, is used to generate an echo signal. Note that a given 90 or 180 deg. Tp will only flip those spins that are exposed to the appropriate local magnetic field, H.

When all of the spins in a region are tipped 90 deg. and synchronized, there will be a net magnetic moment rotating in the xy-plane, but the component of the moment in the z direction will be zero (Figure 13.12A). When the rf pulse ends, the rotating magnetic field will generate its own rf signal, also at the Larmor frequency. This signal is known as the free induction decay (FID) signal. It is this signal that induces a small voltage in the receiver coil, and it is this signal that is used to construct the MR image. Immediately after the pulse ends, the signal generated is given by:

S(t) = ρ sin(θ) cos(ωot) (13)

where ωo is the Larmor frequency, θ is the tip angle, and ρ is the density of spins. Note that a tip angle of 90 deg. produces the strongest signal.

Over time the spins will tend to relax towards the equilibrium position (Figure 13.12B). This relaxation is known as the longitudinal or spin-lattice relaxation and is approximately exponential with a time constant denoted as "T1." As seen in Figure 13.12B, it has the effect of increasing the longitudinal moment, Mz, and decreasing the xy moment, Mxy. The xy moment is decreased even further, and much faster, by a loss of synchronization of the collective spins, since they are all exposed to a slightly different magnetic environment from neighboring atoms (Figure 13.12B). This so-called transverse or spin-spin relaxation time is also exponential and decays with a time constant termed "T2." The spin-spin relaxation time is always less than the spin-lattice relaxation time, so that by the time the net moment returns to the equilibrium position along the z axis the individual spins are completely de-phased. Local inhomogeneities in the applied magnetic field cause an even faster de-phasing of the spins. When the de-phasing time constant is modified to include this effect, it is termed T2* (pronounced tee two star). This time constant also includes the T2 influences. When these relaxation processes are included, the equation for the FID signal becomes:

S(t) = ρ cos(ωot) e^(−t/T2*) e^(−t/T1) (14)

While frequency dependence (i.e., the Larmor equation) is used to achieve localization, the various relaxation times as well as proton density are used to achieve image contrast. Proton density, ρ, for any given collection of spins is a relatively straightforward measurement: it is proportional to FID signal amplitude as shown in Eq. (14). Measuring the local T1 and T2 (or T2*) relaxation times is more complicated and is done through clever manipulations of the rf pulse and local magnetic field gradients, as briefly described in the next section.
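Eq. (14) is easy to visualize with a short simulation. The sketch below plots an FID-like signal; the relaxation times are assumed illustrative values, and the oscillation frequency is set far below a real Larmor frequency (which, per Eq. (11), would be in the MHz range for hydrogen at clinical field strengths) so that the decay envelope is visible in the plot.

% Simulate the FID signal of Eq. (14) with illustrative constants
rho = 1;                    % Spin density (arbitrary units)
T1 = 0.5;                   % Assumed spin-lattice time constant (sec)
T2star = 0.05;              % Assumed effective transverse time constant
f0 = 100;                   % Display frequency in Hz (not a real Larmor value)
t = 0:1e-4:0.25;            % Time vector
S = rho*cos(2*pi*f0*t).*exp(-t/T2star).*exp(-t/T1);
plot(t,S); xlabel('Time (sec)'); ylabel('FID signal');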
• 174. Data Acquisition: Pulse Sequences

A combination of rf pulses, magnetic gradient pulses, delays, and data acquisition periods is termed a pulse sequence. One of the clever manipulations used in many pulse sequences is the spin echo technique, a trick for eliminating the de-phasing caused by local magnetic field inhomogeneities and related artifacts (the T2* decay). One possibility might be to sample immediately after the rf pulse ends, but this is not practical. The alternative is to sample a realigned echo. After the spins have begun to spread out, if their direction is suddenly reversed they will come together again after a known delay. The classic example is that of a group of runners who are told to reverse direction at the same time, say one minute after the start. In principle, they all should get back to the start line at the same time (one minute after reversing) since the fastest runners will have the farthest to go at the time of reversal. In MRI, the reversal is accomplished by a phase-reversing 180 deg. rf pulse. The realignment will occur with the same time constant, T2*, as the misalignment. This echo approach will only cancel the de-phasing due to magnetic inhomogeneities, not the variations due to the sample itself: i.e., those that produce the T2 relaxation. That is actually desirable because the sample variations that cause T2 relaxation are often of interest.

As mentioned above, the Larmor equation (Eq. (11)) is the key to localization. If each position in the sample is subjected to a different magnetic field strength, then the locations are tagged by their resonant frequencies. Two approaches could be used to identify the signal from a particular region. One is to use an rf pulse with only one frequency component: if each location has a unique magnetic field strength, then only the spins in one region will be excited, those whose magnetic field corresponds to the rf frequency (by the Larmor equation). Alternatively, excite a broader region, then vary the magnetic field strength so that different regions are given different resonant frequencies. In clinical MRI, both approaches are used.

Magnetic field strength is varied by the application of gradient fields applied by electromagnets, so-called gradient coils, in the three dimensions. The gradient fields provide a linear change in magnetic field strength over a limited area within the MR imager. The gradient field in the z direction, Gz, can be used to isolate a specific xy slice in the object, a process known as slice selection.* In the absence of any other gradients, the application of a linear gradient in the z direction will mean that only the spins in one xy-plane will have a resonant frequency that matches a specific rf pulse frequency. Hence, by adjusting the

*Selected slices can be in any plane, x, y, z, or any combination, by appropriate activation of the gradients during the rf pulse. For simplicity, this discussion assumes the slice is selected by the z-gradient so spins in an xy-plane are excited.
• 175. gradient, different xy-slices will be associated with (by the Larmor equation), and excited by, a specific rf frequency. Since the rf pulse is of finite duration it cannot consist of a single frequency, but rather has a range of frequencies, i.e., a finite bandwidth. The thickness of the slice, that is, the region in the z-direction over which the spins are excited, will depend on the steepness of the gradient field and the bandwidth of the rf pulse:

∆z ∝ ∆ω/(γGz) (15)

Very thin slices, ∆z, would require a very narrowband pulse, ∆ω, in combination with a steep gradient field, Gz.

If all three gradients, Gx, Gy, and Gz, were activated prior to the rf pulse, then only the spins in one unique volume would be excited. However, only one data point would be acquired for each pulse repetition, and to acquire a large volume would be quite time-consuming. Other strategies allow the acquisition of entire lines, planes, or even volumes with one pulse excitation. One popular pulse sequence, the spin-echo pulse sequence, acquires one line of data in the spatial frequency domain. The sequence begins with a shaped rf pulse in conjunction with a Gz pulse that provides slice selection (Figure 13.13). The Gz pulse includes a reversal at the end to cancel a z-dependent phase shift. Next, a y-gradient pulse of a given amplitude is used to phase encode the data. This is followed by a second rf/Gz combination to produce the echo. As the echo regroups the spins, an x-gradient pulse frequency encodes the signal. The re-formed signal constitutes one line in the frequency domain (termed k-space in MRI), and is sampled over this period. Since the echo signal duration is several hundred microseconds, high-speed data acquisition is necessary to sample up to 256 points during this signal period.

As with slice thickness, the ultimate pixel size will depend on the strength of the magnetic gradients. Pixel size is directly related to the number of pixels in the reconstructed image and the actual size of the imaged area, the so-called field-of-view (FOV). Most modern imagers are capable of a 2 cm FOV with samples up to 256 by 256 pixels, giving a pixel size of 0.078 mm. In practice, image resolution is usually limited by signal-to-noise considerations since, as pixel area decreases, the number of spins available to generate a signal diminishes proportionately. In some circumstances special receiver coils can be used to increase the signal-to-noise ratio and improve image quality and/or resolution.

Figure 13.14A shows an image of the Shepp-Logan phantom and the same image acquired with different levels of detector noise.* As with other forms of signal processing, MR image noise can be improved by averaging. Figure

*The Shepp-Logan phantom was developed to demonstrate the difficulty of identifying a tumor in a medical image.
• 176. FIGURE 13.13 The spin-echo pulse sequence. Events are timed with respect to the initial rf pulse. See text for explanation.

13.14D shows the noise reduction resulting from averaging four of the images taken under the same noise conditions as Figure 13.14C. Unfortunately, this strategy increases scan time in direct proportion to the number of images averaged.

Functional Magnetic Resonance Imaging

Image processing for MR images is generally the same as that used on other images. In fact, MR images have been used in a number of examples and problems in previous chapters. One application of MRI does have some unique image
• 177. FIGURE 13.14 (A) MRI reconstruction of a Shepp-Logan phantom. (B) and (C) Reconstruction of the phantom with detector noise added to the frequency domain signal. (D) Frequency domain average of four images taken with noise similar to C. Improvement in the image is apparent. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The Math Works, Inc. Reprinted with permission.)

processing requirements: the area of functional magnetic resonance imaging (fMRI). In this approach, neural areas that are active in specific tasks are identified by increases in local blood flow. MRI can detect cerebral blood flow changes using an approach known as BOLD: blood oxygenation level dependent. Special pulse sequences have been developed that can acquire images very quickly, and these images are sensitive to the BOLD phenomenon. However, the effect is very small: changes in signal level are only a few percent.

During a typical fMRI experiment, the subject is given a task which is either physical (such as finger tapping), purely sensory (such as a flashing visual stimulus), purely mental (such as performing mathematical calculations), or involves sensorimotor activity (such as pushing a button whenever a given image appears). In single-task protocols, the task alternates with non-task or baseline activity periods. Task periods are usually 20–30 seconds long, but can be shorter and can even be single events under certain protocols. Multiple task protocols are possible and increasingly popular. During each task a number of MR images
• 178. are acquired. The primary role of the analysis software is to identify pixels that have some relationship to the task/non-task activity.

There are a number of software packages available that perform fMRI analysis, some written in MATLAB, such as SPM (statistical parametric mapping), others in C, such as AFNI (analysis of functional neuroimages). Some packages can be obtained at no charge off the Web. In addition to identifying the active pixels, these packages perform various preprocessing functions such as aligning the sequential images and reshaping the images to conform to standard models of the brain.

Following preprocessing, there are a number of different approaches to identifying regions where local blood flow correlates with the task/non-task timing. One approach is simply to use correlation; that is, to correlate the change in signal level, on a pixel-by-pixel basis, with a task-related function. This function could represent the task by a one and the non-task by a zero, producing a square wave-like function. More complicated task functions account for the dynamics of the BOLD process, which has a 4 to 6 second time constant. Finally, some new approaches based on independent component analysis (ICA, Chapter 9) can be used to extract the task function from the data itself. The use of correlation and ICA analysis is explored in the MATLAB Implementation section and in the problems. Other univariate statistical techniques are common, such as t-tests and f-tests, particularly in the multi-task protocols (Friston, 2002).

MATLAB Implementation

Techniques for fMRI analysis can be implemented using standard MATLAB routines. The identification of active pixels using correlation with a task protocol function is presented in Example 13.4. Several files have been created on the disk that simulate regions of activity in the brain. The variations in pixel intensity are small, and noise and other artifacts have been added to the image data, as would be the case with real data. The analysis presented here is done on each pixel independently. In most fMRI analyses, the identification procedure might require activity in a number of adjoining pixels. Lowpass filtering can also be used to smooth the image.

Example 13.4 Use correlation to identify potentially active areas from MRI images of the brain. In this experiment, 24 frames were taken (typical fMRI experiments would contain at least twice that number): the first 6 frames were acquired during baseline activity and the next 6 during the task. This off-on cycle was then repeated for the next 12 frames. Load the image in MATLAB file fmri1, which contains all 24 frames. Generate a function that represents the off-on task protocol and correlate this function with each pixel's variation over the 24 frames. Identify pixels that have a correlation above a given threshold and mark the image where these pixels occur. (Usually this would be done in color, with higher correlations given brighter color.) Finally, display the time sequence
• 179. of one of the active pixels. (Most fMRI analysis packages can display the time variation of pixels or regions, usually selected interactively.)

% Example 13.4 Example of identification of active areas
% using correlation.
% Load the 24 frames of the image stored in fmri1.mat.
% Construct a stimulus profile.
% In this fMRI experiment the first 6 frames were taken during
% no-task conditions, the next six frames during the task
% condition, and this cycle was repeated.
% Correlate each pixel's variation over the 24 frames with the
% task profile. Pixels that correlate above a certain threshold
% (use 0.5) should be identified in the image by a pixel
% whose intensity is the same as the correlation value.
%
clear all; close all;
thresh = .5;                 % Correlation threshold
load fmri1;                  % Get data
i_stim2 = ones(24,1);        % Construct task profile
i_stim2(1:6) = 0;            % First 6 frames are no-task
i_stim2(13:18) = 0;          % Frames 13 through 18 are also no-task
%
% Do correlation: pixel by pixel over the 24 frames
I_fmri_marked = I_fmri;
active = [0 0];
for i = 1:128
   for j = 1:128
      for k = 1:24
         temp(k) = I_fmri(i,j,1,k);
      end
      cor_temp = corrcoef([temp' i_stim2]);
      corr(i,j) = cor_temp(2,1);     % Get correlation value
      if corr(i,j) > thresh
         I_fmri_marked(i,j,:,1) = I_fmri(i,j,:,1) + corr(i,j);
         active = [active; i,j];     % Save supra-threshold locations
      end
   end
end
%
% Display marked image
imshow(I_fmri_marked(:,:,:,1));
title('fMRI Image');
figure;                      % Display one of the active areas
for i = 1:24                 % Plot one of the active areas
• 180.    active_neuron(i) = I_fmri(active(2,1),active(2,2),:,i);
end
plot(active_neuron);
title('Active neuron');

The marked image produced by this program is shown in Figure 13.15. The actual active area is the rectangular area on the right side of the image slightly above the centerline. However, a number of other error pixels are present due to noise that happens to have a sufficiently high correlation with the task profile (a correlation of 0.5 in this case). In Figure 13.16, the correlation threshold has been increased to 0.7 and most of the error pixels have been

FIGURE 13.15 White pixels were identified as active based on correlation with the task profile. The actual active area is the rectangle on the right side slightly above the center line. Due to inherent noise, false pixels are also identified, some even outside of the brain. The correlation threshold was set at 0.5 for this image. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The Math Works, Inc. Reprinted with permission.)
• 181. FIGURE 13.16 The same image as in Figure 13.15 with a higher correlation threshold (0.7). Fewer errors are seen, but the active area is only partially identified.

eliminated, but now the active region is only partially identified. An intermediate threshold might result in a better compromise, and this is explored in one of the problems.

Functional MRI software packages allow isolation of specific regions of interest (ROI), usually through interactive graphics. Pixel values in these regions of interest can be plotted over time and subsequent processing can be done on the isolated region. Figure 13.17 shows the variation over time (actually, over the number of frames) of one of the active pixels. Note the very approximate correlation with the square wave-like task profile also shown. The poor correlation is due to noise and other artifacts, and is fairly typical of fMRI data. Identifying the very small signal within the background noise is one of the major challenges for fMRI image processing algorithms.
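One simple way to reduce the isolated error pixels seen in Figure 13.15 is to smooth the correlation map before thresholding, so that single noisy pixels no longer exceed the threshold. This is a minimal sketch, not part of the examples above; it assumes the correlation matrix corr from Example 13.4 is available, and the filter size and threshold are arbitrary choices (a similar strategy appears in Problem 6).

% Smooth the pixel correlation map, then threshold it
h = fspecial('average',[4 4]);     % 4-by-4 averaging (lowpass) filter
corr_smooth = imfilter(corr,h);    % Smooth the correlation map
mask = corr_smooth > 0.5;          % Apply threshold to the smoothed map
figure; imshow(mask);
title('Thresholded, smoothed correlation map');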
• 182. FIGURE 13.17 Variation in intensity of a single pixel within the active area of Figures 13.15 and 13.16. A correlation with the task profile is seen, but considerable noise is also present.

Principal Component and Independent Component Analysis

In the above analysis, active pixels were identified by correlation with the task profile. However, the neuronal response would not be expected to follow the task temporal pattern exactly because of the dynamics of the blood flow response (i.e., blood hemodynamics), which requires around 4 to 6 seconds to reach its peak. In addition, there may be other processes at work that systematically affect either neural activity or pixel intensity. For example, respiration can alter pixel intensity in a consistent manner. Identifying the actual dynamics of the fMRI process and any consistent artifacts might be possible by a direct analysis of the data. One approach would be to search for components related to blood flow dynamics or artifacts using either principal component analysis (PCA) or independent component analysis (ICA).

Regions of interest are first identified using either standard correlation or other statistical methods so that the new tools need not be applied to the entire image. Then the isolated data from each frame are re-formatted so that they are one-dimensional by stringing the image rows, or columns, together. The data from each frame are now arranged as a single vector. ICA or PCA is applied to the transposed ensemble of frame vectors so that each pixel is treated as a different source and each frame is an observation of that source. If there are pixels whose intensity varies in a non-random manner, this should produce one or more components in the analyses. The component that is most like the task profile can then be used as a more accurate estimate of blood flow hemodynamics in the correlation analysis: the isolated component is used for the comparison instead of the task profile. An example of this approach is given in Example 13.5.
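Before turning to that example, note that the row-stringing operation described above can also be written compactly with reshape. This is a sketch under the assumption that ROI is a 4-D region-of-interest array of the form used in Example 13.5 (rows by columns by 1 by frames); transposing each frame before reshaping makes MATLAB's column-major reshape concatenate the image rows.

% Alternative one-line re-formatting of each frame into a row vector
[r c dummy frames] = size(ROI);
data = zeros(frames, r*c);
for k = 1:frames
   M = ROI(:,:,1,k);                 % One frame as an r-by-c matrix
   data(k,:) = reshape(M', 1, r*c);  % String the rows together
end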
• 183. Example 13.5 Select a region of interest from the data of Figure 13.16, specifically an area that surrounds and includes the potentially active pixels. Normally this area would be selected interactively by an operator. Reformat the images so that each frame is a single row vector and constitutes one row of an ensemble composed of the different frames. Perform both an ICA and PCA analysis and plot the resulting components.

% Example 13.5 and Figures 13.18 and 13.19
% Example of the use of PCA and ICA to identify signal
% and artifact components in a region of interest
% containing some active neurons.
% Load the region of interest, then re-format the images so that
% each of the 24 frames is a row. Transpose this ensemble
% so that the rows are pixels and the columns are frames.
% Apply PCA and ICA analysis. Plot the first four principal
% components and the first two independent components.
%
close all; clear all;
nu_comp = 2;                 % Number of independent components
load roi2;                   % Get ROI data
[r c dummy frames] = size(ROI);    % Find number of frames
%
% Convert each image frame to a single row and construct an
% ensemble where each row is a different frame
%
for i = 1:frames
   for j = 1:r
      row = ROI(j,:,:,i);    % Convert frame to a row
      if j == 1
         temp = row;
      else
         temp = [temp row];
      end
   end
   if i == 1
      data = temp;           % Concatenate rows
   else
      data = [data; temp];
   end
end
%
% Now apply PCA analysis
[U,S,pc] = svd(data',0);     % Use singular value decomposition
eigen = diag(S).^2;
for i = 1:length(eigen)
• 184. FIGURE 13.18 First four components from a principal component analysis applied to a region of interest in Figure 13.15 that includes the active area. A function similar to the task is seen in the second component. The third component also has a possible repetitive structure that could be related to respiration.

   pc(:,i) = pc(:,i) * sqrt(eigen(i));
end
%
% Determine the independent components
w = jadeR(data',nu_comp);
ica = (w * data');
• 185. FIGURE 13.19 Two components found by independent component analysis. The task-related function and the respiration artifact are now clearly identified.

% .......Display components.......

The principal components produced by this analysis are shown in Figure 13.18. A waveform similar to the task profile is seen in the second plot down. Since this waveform was derived from the data, it should more closely represent the actual blood flow hemodynamics. The third waveform shows a regular pattern, possibly due to respiration artifact. The other two components may also contain some of that artifact, but do not show any other obvious pattern.

The two patterns in the data are better separated by ICA. Figure 13.19 shows the first two independent components, and both the blood flow hemodynamics and the artifact are clearly shown. The former can be used instead of the task profile in the correlation analysis. The results of using the profile obtained through ICA are shown in Figures 13.20A and B. Both activity maps were obtained from the same data using the same correlation threshold. In Figure 13.20A, the task profile function was used, while in Figure 13.20B the hemodynamic
• 186. FIGURE 13.20A Activity map obtained by correlating pixels with the square-wave task function. The correlation threshold was 0.55. (Original image from the MATLAB Image Processing Toolbox. Copyright 1993–2003, The Math Works, Inc. Reprinted with permission.)

FIGURE 13.20B Activity map obtained by correlating pixels with the estimated hemodynamic profile obtained from ICA. The correlation threshold was 0.55.
• 187. profile (the function in the lower plot of Figure 13.19) was used in the correlation. The improvement in identification is apparent. When the task function is used, very few of the truly active areas are identified, and a number of error pixels appear. Figure 13.20B contains about the same number of errors, but all of the active areas are identified. Of course, the number of active areas identified using the task profile could be improved by lowering the correlation threshold, but this would also increase the errors.

PROBLEMS

1. Load slice 13 of the MR image used in Example 13.3 (mri.tif). Construct parallel beam projections of this image using the Radon transform with two different angular spacings between rotations: 5 deg. and 10 deg. In addition, reduce the number of samples (beams) in the 5 deg. data by a factor of two. Reconstruct the three images (5 deg. unreduced, 5 deg. reduced, and 10 deg.) and display along with the original image. Multiply the images by a factor of 10 to enhance any variations in the background.

2. The data file data_prob_13_2 contains projections of the test pattern image, testpat1.png, with noise added. Reconstruct the image using the inverse Radon transform with two filter options: the Ram-Lak filter (the default), and the Hamming filter with a maximum frequency of 0.5.

3. Load the image squares.tif. Use fanbeam to construct fan beam projections and ifanbeam to produce the reconstructed image. Repeat for two different beam distances: 100 and 300 (pixels). Plot the reconstructed images. Use a FanSensorSpacing of 1.

4. The rf pulse used in MRI is a shaped pulse consisting of a sinusoid at the base frequency that is amplitude modulated by some pulse shaping waveform. The sinc waveform (sin(x)/x) is commonly used. Construct a shaped pulse consisting of cos(ω1t) modulated by sinc(ω2t). Pulse duration should be such that ω2t ranges between ±2π: −2π ≤ ω2t ≤ 2π. The sinusoidal frequency, ω1, should be 10 ω2. Use the Fourier transform to plot the magnitude frequency spectrum of this slice selection pulse. (Note: the MATLAB sinc function is normalized to π, so the range of the vector input to this function should be ±2. In this case, the cos function will need to be multiplied by 2π, as well as by 10.)

5. Load the 24 frames of image fmri3.mat. This contains the 4-D variable, I_fmri, which has 24 frames. Construct a stimulus profile. Assume the same task profile as in Example 13.4: the first 6 frames were taken during no-task conditions, the next six frames during the task condition, then the cycle was repeated. Rearrange Example 13.4 so that the correlation coefficients are computed first, then the thresholds are applied (so each new threshold value does not
• 188. require another calculation of correlation coefficients). Search for the optimal threshold. Note that these images contain more noise than those used in Example 13.4, so even the best threshold will produce some error pixels.

6. Example of identification of active areas using correlation. Repeat Problem 5, except filter the matrix containing the pixel correlations before applying the threshold. Use a 4 by 4 averaging filter. (fspecial can be helpful here.)

7. Example of using principal component analysis and independent component analysis to identify signal and artifact. Load the region of interest file roi4.mat, which contains the variable ROI. This variable contains 24 frames of a small region around the active area of fmri3.mat. Reformat to a matrix as in Example 13.5 and apply PCA and ICA analysis. Plot the first four principal components and the first two independent components. Note the very slow time constant of the blood flow hemodynamics.
• 189. 2 Basic Concepts

NOISE

In Chapter 1 we observed that noise is an inherent component of most measurements. In addition to physiological and environmental noise, electronic noise arises from the transducer and associated electronics and is intermixed with the signal being measured. Noise is usually represented as a random variable, x(n). Since the variable is random, describing it as a function of time is not very useful. It is more common to discuss other properties of noise such as its probability distribution, range of variability, or frequency characteristics. While noise can take on a variety of different probability distributions, the Central Limit Theorem implies that most noise will have a Gaussian or normal distribution.* The Central Limit Theorem states that when noise is generated by a large number of independent sources it will have a Gaussian probability distribution regardless of the probability distribution characteristics of the individual sources. Figure 2.1A shows the distribution of 20,000 uniformly distributed random numbers between −1 and +1. The distribution is approximately flat between the limits of ±1 as expected. When the data set consists of 20,000 numbers, each of which is the average of two uniformly distributed random numbers, the distribution is much closer to Gaussian (Figure 2.1B, upper right). The distribution

*Both terms are used and the reader should be familiar with both. We favor the term "Gaussian" to avoid the value judgement implied by the word "normal."
• 190. FIGURE 2.1 (A) The distribution of 20,000 uniformly distributed random numbers. (B) The distribution of 20,000 numbers, each of which is the average of two uniformly distributed random numbers. (C) and (D) The distribution obtained when 3 and 8 random numbers, still uniformly distributed, are averaged together. Although the underlying distribution is uniform, the averages of these uniformly distributed numbers tend toward a Gaussian distribution (dotted line). This is an example of the Central Limit Theorem at work.

constructed from 20,000 numbers that are averages of only 8 random numbers appears close to Gaussian, Figure 2.1D, even though the numbers being averaged have a uniform distribution.

The probability density of a Gaussian-distributed variable, x, is specified in the well-known normal or Gaussian distribution equation:

p(x) = (1/(σ√(2π))) e^(−x^2/(2σ^2)) (1)
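The behavior shown in Figure 2.1 is easy to reproduce. The short sketch below, written in the spirit of that figure, compares the histogram of uniformly distributed numbers with the histogram of averages of 8 such numbers; the bin count is an arbitrary choice.

% Demonstrate the Central Limit Theorem with uniform random numbers
N = 20000;
x1 = 2*rand(N,1) - 1;              % Uniformly distributed on [-1, +1]
x8 = mean(2*rand(N,8) - 1, 2);     % Each value is the average of 8 such numbers
subplot(1,2,1); hist(x1,40); title('Uniform');
subplot(1,2,2); hist(x8,40); title('Average of 8');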
• 191. Two important properties of a random variable are its mean, or average value, and its variance, the term σ^2 in Eq. (1). The arithmetic quantities of mean and variance are frequently used in signal processing algorithms, and their computation is well-suited to discrete data. The mean value of a discrete array of N samples is evaluated as:

x̄ = (1/N) Σ (k=1 to N) xk (2)

Note that the summation in Eq. (2) is made between 1 and N as opposed to 0 and N − 1. This protocol will commonly be used throughout the text to be compatible with MATLAB notation, where the first element in an array has an index of 1, not 0. Frequently, the mean will be subtracted from the data sample to provide data with a zero mean value. This operation is particularly easy in MATLAB as described in the next section. The sample variance, σ^2, is calculated as shown in Eq. (3) below, and the standard deviation, σ, is just the square root of the variance.

σ^2 = (1/(N − 1)) Σ (k=1 to N) (xk − x̄)^2 (3)

Normalizing the standard deviation or variance by 1/(N − 1) as in Eq. (3) produces the best estimate of the variance if x is a sample from a Gaussian distribution. Alternatively, normalizing the variance by 1/N produces the second moment of the data around x̄. Note that this is the equivalent of the RMS value of the data if the data have zero as the mean.

When multiple measurements are made, multiple random variables can be generated. If these variables are combined or added together, the means add, so that the resultant random variable is simply the mean, or average, of the individual means. The same is true for the variance: the variances add, and the average variance is the mean of the individual variances:

σ̄^2 = (1/N) Σ (k=1 to N) σk^2 (4)

However, the standard deviation is the square root of the variance, and the standard deviations add as √N times the average standard deviation [Eq. (5)]. Accordingly, the mean standard deviation is the average of the individual standard deviations divided by √N [Eq. (6)]. From Eq. (4), Σ (k=1 to N) σk^2 = N σ̄^2, hence:

Σ (k=1 to N) σk = √(N σ̄^2) = √N σ̄ (5)
• 192. Mean Standard Deviation = (1/N) Σ (k=1 to N) σk = (1/N) √N σ̄ = σ̄/√N (6)

In other words, averaging noise from different sensors, or multiple observations from the same source, will reduce the standard deviation of the noise by the square root of the number of averages.

In addition to a mean and standard deviation, noise also has a spectral characteristic; that is, its energy distribution may vary with frequency. As shown below, the frequency characteristics of the noise are related to how well one instantaneous value of noise correlates with the adjacent instantaneous values: for digitized data, how much one data point is correlated with its neighbors. If the noise has so much randomness that each point is independent of its neighbors, then it has a flat spectral characteristic, and vice versa. Such noise is called white noise since it, like white light, contains equal energy at all frequencies (see Figure 1.5). The section on Noise Sources in Chapter 1 mentioned that most electronic sources produce noise that is essentially white up to many megahertz. When white noise is filtered, it becomes bandlimited and is referred to as colored noise since, like colored light, it only contains energy at certain frequencies. Colored noise shows some correlation between adjacent points, and this correlation becomes stronger as the bandwidth decreases and the noise becomes more monochromatic. The relationship between bandwidth and the correlation of adjacent points is explored in the section on autocorrelation.

ENSEMBLE AVERAGING

Eq. (6) indicates that averaging can be a simple, yet powerful signal processing technique for reducing noise when multiple observations of the signal are possible. Such multiple observations could come from multiple sensors, but in many biomedical applications, the multiple observations come from repeated responses to the same stimulus. In ensemble averaging, a group, or ensemble, of time responses are averaged together on a point-by-point basis; that is, an average signal is constructed by taking the average, for each point in time, over all signals in the ensemble (Figure 2.2). A classic biomedical engineering example of the application of ensemble averaging is the visual evoked response (VER), in which a visual stimulus produces a small neural signal embedded in the EEG. Usually this signal cannot be detected in the EEG signal, but by averaging hundreds of observations of the EEG, time-locked to the visual stimulus, the visually evoked signal emerges.

There are two essential requirements for the application of ensemble averaging for noise reduction: the ability to obtain multiple observations, and a reference signal closely time-linked to the response. The reference signal shows how the multiple observations are to be aligned for averaging. Usually a time
• 193. FIGURE 2.2 Upper traces: An ensemble of individual (vergence) eye movement responses to a step change in stimulus. Lower trace: The ensemble average, displaced downward for clarity. The ensemble average is constructed by averaging the individual responses at each point in time. Hence, the value of the average response at time T1 (vertical line) is the average of the individual responses at that time.

signal linked to the stimulus is used. An example of ensemble averaging is shown in Figure 2.2, and the code used to produce this figure is presented in the following MATLAB implementation section.

MATLAB IMPLEMENTATION

In MATLAB the mean, variance, and standard deviation are implemented as shown in the code lines below.

xm = mean(x);        % Evaluate mean of x
xvar = var(x);       % Evaluate the variance of x, normalizing by N-1
• 194. xnorm = var(x,1);    % Evaluate the variance of x, normalizing by N
xstd = std(x);       % Evaluate the standard deviation of x

If x is an array (also termed a vector for reasons given later), the output of these function calls is a scalar representing the mean, variance, or standard deviation. If x is a matrix, then the output is a row vector resulting from applying the appropriate calculation (mean, variance, or standard deviation) to each column of the matrix.

Example 2.1 below shows the implementation of ensemble averaging that produced the data in Figure 2.2. The program first loads the eye movement data (load verg1), then plots the ensemble. The ensemble average is determined using the MATLAB mean routine. Note that the data matrix, data_out, must be in the correct orientation (the responses must be in rows) for routine mean. If that were not the case (as in Problem 1 at the end of this chapter), the matrix transposition operation should be performed.* The ensemble average, avg, is then plotted displaced by 3 degrees to provide a clear view. Otherwise it would overlay the data.

Example 2.1 Compute and display the ensemble average of an ensemble of vergence eye movement responses to a step change in stimulus. These responses are stored in MATLAB file verg1.mat.

% Example 2.1 and Figure 2.2 Load eye movement data, plot
% the data, then generate and plot the ensemble average.
%
close all; clear all;
load verg1;                  % Get eye movement data
Ts = .005;                   % Sample interval = 5 msec
[nu,N] = size(data_out);     % Get data length (N)
t = (1:N)*Ts;                % Generate time vector
%
% Plot ensemble data superimposed
plot(t,data_out,'k'); hold on;
%
% Construct and plot the ensemble average
avg = mean(data_out);        % Calculate ensemble average
plot(t,avg-3,'k');           % and plot, separated from the other data
xlabel('Time (sec)');        % Label axes
ylabel('Eye Position');

*In MATLAB, matrix or vector transposition is indicated by an apostrophe following the variable. For example, if x is a row vector, x' is a column vector and vice versa. If X is a matrix, X' is that matrix with rows and columns switched.
• 195.
plot([.43 .43],[0 5],'-k');    % Plot vertical line at T1
text(1,1.2,'Averaged Data');   % Label data average

DATA FUNCTIONS AND TRANSFORMS

To mathematicians, the term function can take on a wide range of meanings. In signal processing, most functions fall into two categories: waveforms, images, or other data; and entities that operate on waveforms, images, or other data (Hubbard, 1998). The latter group can be further divided into functions that modify the data and functions used to analyze or probe the data. For example, the basic filters described in Chapter 4 use functions (the filter coefficients) that modify the spectral content of a waveform, while the Fourier transform detailed in Chapter 3 uses functions (harmonically related sinusoids) to analyze the spectral content of a waveform. Functions that modify data are also termed operations or transformations.

Since most signal processing operations are implemented using digital electronics, functions are represented in discrete form as a sequence of numbers:

$$x(n) = [x(1), x(2), x(3), \ldots, x(N)] \tag{5}$$

Discrete data functions (waveforms or images) are usually obtained through analog-to-digital conversion or other data input, while analysis or modifying functions are generated within the computer or are part of the computer program. (The consequences of converting a continuous time function into a discrete representation are described in the section below on sampling theory.) In some applications, it is advantageous to think of a function (of whatever type) not just as a sequence, or array, of numbers, but as a vector. In this conceptualization, x(n) is a single vector defined by a single point, the endpoint of the vector, in N-dimensional space (Figure 2.3). This somewhat curious and highly mathematical concept has the advantage of unifying some signal processing operations and fits well with matrix methods. It is difficult for most people to imagine higher-dimensional spaces and even harder to present them graphically, so operations and functions in higher-dimensional space are usually described in 2 or 3 dimensions, and the extension to higher-dimensional space is left to the imagination of the reader. (This task can sometimes be difficult for non-mathematicians: try to imagine a data sequence of even a 32-point array represented as a single vector in 32-dimensional space!)

A transform can be thought of as a re-mapping of the original data into a function that provides more information than the original.* The Fourier transform described in Chapter 3 is a classic example, as it converts the original time

*Some definitions would be more restrictive and require that a transform be bilateral; that is, it must be possible to recover the original signal from the transformed data. We will use the looser definition and reserve the term bilateral transform to describe reversible transformations.
• 196. FIGURE 2.3 The data sequence x(n) = [1.5, 2.5, 2] represented as a vector in three-dimensional space.

data into frequency information, which often provides greater insight into the nature and/or origin of the signal. Many of the transforms described in this text are achieved by comparing the signal of interest with some sort of probing function. This comparison takes the form of a correlation (produced by multiplication) that is averaged (or integrated) over the duration of the waveform, or some portion of the waveform:

$$X(m) = \int_{-\infty}^{\infty} x(t)\, f_m(t)\, dt \tag{7}$$

where x(t) is the waveform being analyzed, fm(t) is the probing function, and m is some variable of the probing function, often specifying a particular member in a family of similar functions. For example, in the Fourier transform fm(t) is a family of harmonically related sinusoids and m specifies the frequency of an
• 197. individual sinusoid in that family (e.g., sin(2πmfT t)). A family of probing functions is also termed a basis. For discrete functions, a probing function consists of a sequence of values, or vector, and the integral becomes a summation over a finite range:

$$X(m) = \sum_{n=1}^{N} x(n)\, f_m(n) \tag{8}$$

where x(n) is the discrete waveform and fm(n) is a discrete version of the family of probing functions. This equation assumes the probe and waveform functions are the same length. Other possibilities are explored below.

When either x(t) or fm(t) is of infinite length, it must be truncated in some fashion to fit within the confines of limited memory storage. In addition, if the length of the probing function, fm(n), is shorter than the waveform, x(n), then x(n) must be shortened in some way. The length of either function can be shortened by simple truncation or by multiplying the function by yet another function that has zero value beyond the desired length. A function used to shorten another function is termed a window function, and its action is shown in Figure 2.4. Note that simple truncation can be viewed as multiplying the function by a rectangular window, a function whose value is one for the portion of the function that is retained and zero elsewhere. The consequences of this artificial shortening will depend on the specific window function used. Consequences of data windowing are discussed in Chapter 3 under the heading Window Functions. If a window function is used, Eq. (8) becomes:

$$X(m) = \sum_{n=1}^{N} x(n)\, f_m(n)\, W(n) \tag{9}$$

where W(n) is the window function. In the Fourier transform, the length of W(n) is usually set to be the same as the available length of the waveform, x(n), but in other applications it can be shorter than the waveform. If W(n) is a rectangular function, then W(n) = 1 over the length of the summation (1 ≤ n ≤ N), and it is usually omitted from the equation. The rectangular window is implemented implicitly by the summation limits.

If the probing function is of finite length (in mathematical terms such a function is said to have finite support) and this length is shorter than the waveform, then it might be appropriate to translate or slide it over the signal and perform the comparison (correlation, or multiplication) at various relative positions between the waveform and probing function. In the example shown in Figure 2.5, a single probing function is shown (representing a single family member), and a single output function is produced. In general, the output would be a family of functions, or a two-variable function, where one variable corresponds to the relative position between the two functions and the other to the
• 198. FIGURE 2.4 A waveform (upper plot) is multiplied by a window function (middle plot) to create a truncated version (lower plot) of the original waveform. This particular window function is called the Kaiser window, one of many popular window functions.

specific family member. This sliding comparison is similar to convolution, described in the next section, and is given in discrete form by the equation:

$$X(m,k) = \sum_{n=1}^{N} x(n)\, f_m(n - k) \tag{10}$$

where the variable k indicates the relative position between the two functions and m is the family member, as in the above equations. This approach will be used in the filters described in Chapter 4 and in the continuous wavelet transform described in Chapter 7. A variation of this approach can be used for long, or even infinite, probing functions, provided the probing function itself is shortened by windowing to a length that is less than the waveform. Then the shortened probing function can be translated across the waveform in the same manner as a probing function that is naturally short. The equation for this condition becomes:
• 199. FIGURE 2.5 The probing function slides over the waveform of interest (upper panel) and at each position generates the summed, or averaged, product of the two functions (lower panel), as in Eq. (10). In this example, the probing function is one member of the "Mexican hat" family (see Chapter 7) and the waveform is a sinusoid that increases its frequency linearly over time (known as a chirp). The summed product (lower panel), also known as the scalar product, shows the relative correlation between the waveform and the probing function as it slides across the waveform. Note that this relative correlation varies sinusoidally as the phase between the two functions varies, but reaches a maximum around 2.5 sec, the time when the waveform is most like the probing function.
• 200.

$$X(m,k) = \sum_{n=1}^{N} x(n)\,[W(n - k)\, f_m(n)] \tag{11}$$

where fm(n) is a longer function that is shortened by the sliding window function, W(n − k), and the variables m and k have the same meaning as in Eq. (10). This is the approach taken in the short-term Fourier transform described in Chapter 6.

All of the discrete equations above, Eqs. (7) to (11), have one thing in common: they all feature the multiplication of two (or sometimes three) functions and the summation of the product over some finite interval. Returning to the vector conceptualization for data sequences mentioned above (see Figure 2.3), this multiplication and summation is the same as the scalar product of the two vectors.* The scalar product is defined as:

$$\langle a,b\rangle = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \tag{12}$$

Note that the scalar product results in a single number (i.e., a scalar), not a vector. The scalar product can also be defined in terms of the magnitudes of the two vectors and the angle between them:

$$\langle a,b\rangle = |a|\,|b| \cos\theta \tag{13}$$

where θ is the angle between the two vectors. If the two vectors are perpendicular to one another, i.e., they are orthogonal, then θ = 90°, and their scalar product will be zero. Eq. (13) demonstrates that the scalar product between a waveform and a probe function is mathematically the same as a projection of the waveform vector onto the probing function vector (after normalizing by the probe vector's length). When the probing function consists of a family of functions, then the scalar product operations in Eqs. (7)-(11) can be thought of as projecting the waveform vector onto vectors representing the various family members. In this vector-based conceptualization, the probing function family, or basis, can be thought of as the axes of a coordinate system. This is the motivation behind the development of probing functions that have family members that are orthogonal,

*The scalar product is also termed the inner product, the standard inner product, or the dot product.
• 201. or orthonormal:† the scalar product computations (or projections) can be done on each axis (i.e., on each family member) independently of the others.

CONVOLUTION, CORRELATION, AND COVARIANCE

Convolution, correlation, and covariance are similar-sounding terms and are similar in the way they are calculated. This similarity is somewhat misleading, at least in the case of convolution, since the areas of application and underlying concepts are not the same.

Convolution and the Impulse Response

Convolution is an important concept in linear systems theory, solving the need for a time domain operation equivalent to the transfer function. Recall that the transfer function is a frequency domain concept that is used to calculate the output of a linear system to any input. Convolution can be used to define a general input-output relationship in the time domain analogous to the transfer function in the frequency domain. Figure 2.6 demonstrates this application of convolution. The input, x(t), the output, y(t), and the function linking the two through convolution, h(t), are all functions of time; hence, convolution is a time domain operation. (Ironically, convolution algorithms are often implemented in the frequency domain to improve the speed of the calculation.)

The basic concept behind convolution is superposition. The first step is to determine a time function, h(t), that tells how the system responds to an infinitely short segment of the input waveform. If superposition holds, then the output can be determined by summing (integrating) all the response contributions calculated from the short segments. The way in which a linear system responds to an infinitely short segment of data can be determined simply by noting the system's response to an infinitely short input, an infinitely short pulse. An infinitely short pulse (or one that is at least short compared to the dynamics of the system) is termed an impulse or delta function (commonly denoted δ(t)), and the response it produces is termed the impulse response, h(t).

FIGURE 2.6 Convolution as a linear process.

†Orthonormal vectors are orthogonal, but also have unit length.
• 202. Given that the impulse response describes the response of the system to an infinitely short segment of data, and any input can be viewed as an infinite string of such infinitesimal segments, the impulse response can be used to determine the output of the system to any input. The response produced by an infinitely small data segment is simply this impulse response scaled by the magnitude of that data segment. The contribution of each infinitely small segment can be summed, or integrated, to find the response created by all the segments.

The convolution process is shown schematically in Figure 2.7. The left graph shows the input, x(n) (dashed curve), to a linear system having an impulse response of h(n) (solid line). The right graph of Figure 2.7 shows three partial responses (solid curves) produced by three different infinitely small data segments at N1, N2, and N3. Each partial response is an impulse response scaled by the associated input segment and shifted to the position of that segment. The output of the linear process (right graph, dashed line) is the summation of the individual

FIGURE 2.7 (A) The input, x(n), to a linear system (dashed line) and the impulse response of that system, h(n) (solid line). Three points on the input data sequence are shown: N1, N2, and N3. (B) The partial contributions from the three input data points to the output are impulse responses scaled by the value of the associated input data point (solid line). The overall response of the system, y(n) (dashed line, scaled to fit on the graph), is obtained by summing the contributions from all the input points.
• 203. impulse responses produced by each of the input data segments. (The output is scaled down to produce a readable plot.) Stated mathematically, the output, y(t), to any input, x(t), is given by:

$$y(t) = \int_{-\infty}^{+\infty} h(\tau)\, x(t - \tau)\, d\tau = \int_{-\infty}^{+\infty} h(t - \tau)\, x(\tau)\, d\tau \tag{14}$$

To determine the contribution of each infinitely small data segment, the impulse response is shifted a time τ with respect to the input, then scaled (i.e., multiplied) by the magnitude of the input at that point in time. It does not matter which function, the input or the impulse response, is shifted.* Shifting and multiplication is sometimes referred to as the lag product. For most systems, h(τ) is finite, so the limits of integration are finite. Moreover, a real system can only respond to past inputs, so h(τ) must be 0 for τ < 0 (negative τ implies future times in Eq. (14)), although for computer-based operations, where future data may be available in memory, τ can be negative.

For discrete signals, the integration becomes a summation and the convolution equation becomes:

$$y(n) = \sum_{k=1}^{N} h(n - k)\, x(k) \quad\text{or}\quad y(n) = \sum_{k=1}^{N} h(k)\, x(n - k) \equiv h(n) * x(n) \tag{15}$$

Again, either h(n) or x(n) can be shifted. Also, for discrete data both h(n) and x(n) must be finite (since they are stored in finite memory), so the summation is also finite (where N is the length of the shorter function, usually h(n)).

In signal processing, convolution can be used to implement some of the basic filters described in Chapter 4. Like their analog counterparts, digital filters are just linear processes that modify the input spectra in some desired way (such as reducing noise). As with all linear processes, the filter's impulse response, h(n), completely describes the filter. The process of sampling used in analog-to-digital conversion can also be viewed in terms of convolution: the sampled output, x(n), is just the convolution of the analog signal, x(t), with a very short pulse (i.e., an impulse function) that is periodic with the sampling frequency. Convolution has signal processing implications that extend beyond the determination of input-output relationships. We will show later that convolution in the time domain is equivalent to multiplication in the frequency domain, and vice versa. The former has particular significance to sampling theory, as described later in this chapter.

*Of course, shifting both would be redundant.
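The summation in Eq. (15) can be coded directly. The sketch below (a minimal illustration, not from the text; the variable names are ours) implements discrete convolution with explicit loops and checks the result against MATLAB's conv routine:

% Minimal sketch (not from the text): discrete convolution per Eq. (15),
% implemented with explicit loops and compared against conv
x = randn(1,64);                 % Input waveform
h = [1 1 1]/3;                   % Impulse response (3-point moving average)
Lx = length(x); Lh = length(h);
y = zeros(1, Lx + Lh - 1);       % Output length is Lx + Lh - 1
for n = 1:length(y)
    for k = 1:Lh
        if (n-k+1) >= 1 && (n-k+1) <= Lx
            y(n) = y(n) + h(k)*x(n-k+1);   % Sum of lag products
        end
    end
end
max(abs(y - conv(h,x)))          % Should be zero (to rounding error)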
• 204. Covariance and Correlation

The word correlation connotes similarity: how one thing is like another. Mathematically, correlations are obtained by multiplying and normalizing. Both covariance and correlation use multiplication to compare the linear relationship between two variables, but in correlation the coefficients are normalized to fall between minus one and plus one. This makes the correlation coefficients insensitive to variations in the gain of the data acquisition process or the scaling of the variables. However, in many signal processing applications the variable scales are similar, and covariance is appropriate. The operations of correlation and covariance can be applied to two or more waveforms, to multiple observations of the same source, or to multiple segments of the same waveform. These comparisons between data sequences can also result in a correlation or covariance matrix, as described below.

Correlation/covariance operations can not only be used to compare different waveforms at specific points in time, they can also make comparisons over a range of times by shifting one signal with respect to the other. The crosscorrelation function is an example of this process. The correlation function is the lagged product of two waveforms, and the defining equation, given here in both continuous and discrete form, is quite similar to the convolution equation above (Eqs. (14) and (15)):

$$r_{xy}(\tau) = \int_{0}^{T} y(t)\, x(t + \tau)\, dt \tag{16a}$$

$$r_{xy}(n) = \sum_{k=1}^{M} y(k + n)\, x(k) \tag{16b}$$

Eqs. (16a) and (16b) show that the only difference in the computation of the crosscorrelation versus convolution is the direction of the shift. In convolution the waveforms are shifted in opposite directions. This produces a causal output: the output function is created from past values of the input function (the output is caused by the input). This form of shifting is reflected in the negative sign in Eq. (15). Crosscorrelation shows the similarity between two waveforms at all possible relative positions of one waveform with respect to the other, and it is useful in identifying segments of similarity. The output of Eq. (16) is sometimes termed the raw correlation since there is no normalization involved. Various scalings can be used (such as dividing by N, the number of points in the sum), and these are described in the section on MATLAB implementation.

A special case of the correlation function occurs when the comparison is between two waveforms that are one and the same; that is, a function is correlated with different shifts of itself. This is termed the autocorrelation function, and it
• 205. provides a description of how similar a waveform is to itself at various time shifts, or time lags. The autocorrelation function will naturally be maximum for zero lag (n = 0), because at zero lag the comparison is between identical waveforms. Usually the autocorrelation is scaled so that the correlation at zero lag is 1. The function must be symmetric about n = 0, since shifting one version of the same waveform in the negative direction is the same as shifting the other version in the positive direction.

The autocorrelation function is related to the bandwidth of the waveform: the sharper the peak of the autocorrelation function, the broader the bandwidth. For example, in white noise, which has infinite bandwidth, adjacent points are uncorrelated, and the autocorrelation function will be nonzero only for zero lag (see Problem 2). Figure 2.8 shows the autocorrelation functions of noise that has been filtered to have two different bandwidths. In statistics, the crosscorrelation and autocorrelation sequences are derived from the expectation operation applied to infinite data. In signal processing, data lengths are finite, so the

FIGURE 2.8 Autocorrelation functions of a random time series with a narrow bandwidth (left) and broader bandwidth (right). Note the inverse relationship between the autocorrelation function and the spectrum: the broader the bandwidth, the narrower the first peak. These figures were generated using the code in Example 2.2 below.
• 206. expectation operation becomes summation (with or without normalization), and the crosscorrelation and autocorrelation functions are necessarily estimates.

The crosscovariance function is the same as the crosscorrelation function except that the means have been removed from the data before calculation. Accordingly, the equation is a slight modification of Eq. (16b), as shown below:

$$\mathrm{Cov}(n) = \sum_{k=1}^{M} [y(k + n) - \bar{y}]\,[x(k) - \bar{x}] \tag{17}$$

The terms correlation and covariance, when used alone (i.e., without the term function), imply operations similar to those described in Eqs. (16) and (17), but without the lag operation. The result is a single number. For example, the covariance between two functions is given by:

$$\mathrm{Cov} = \sigma_{x,y} = \sum_{k=1}^{M} [y(k) - \bar{y}]\,[x(k) - \bar{x}] \tag{18}$$

Of particular interest are the covariance and correlation matrices. These analysis tools can be applied to multivariate data where multiple responses, or observations, are obtained from a single process. A representative example in biosignals is the EEG, where the signal consists of a number of related waveforms taken from different positions on the head. The covariance and correlation matrices assume that the multivariate data are arranged in a matrix where the columns are different variables and the rows are different observations of those variables. In signal processing, the rows are the waveform time samples, and the columns are the different signal channels or observations of the signal.

The covariance matrix gives the variances of the columns of the data matrix in the diagonals, while the covariance between columns is given by the off-diagonals:

$$S = \begin{bmatrix} \sigma_{1,1} & \sigma_{1,2} & \cdots & \sigma_{1,N} \\ \sigma_{2,1} & \sigma_{2,2} & \cdots & \sigma_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N,1} & \sigma_{N,2} & \cdots & \sigma_{N,N} \end{bmatrix} \tag{19}$$

An example of the use of the covariance matrix to compare signals is given in the section on MATLAB implementation.

In its usual signal processing definition, the correlation matrix is a normalized version of the covariance matrix. Specifically, each entry of the correlation matrix is related to the covariance matrix of Eq. (19) by the equation:

$$C(i,j) = \frac{S(i,j)}{\sqrt{S(i,i)\, S(j,j)}} \tag{20}$$
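As a quick numerical check of Eqs. (19) and (20), the following sketch (a minimal illustration, not from the text; the variable names are ours) computes a covariance matrix with cov and normalizes it by its diagonal, which should reproduce the output of corrcoef:

% Minimal sketch (not from the text): normalize a covariance matrix
% per Eq. (20) and compare with corrcoef
x = randn(256,3);                    % 3 channels of Gaussian noise
x(:,3) = x(:,1) + 0.5*randn(256,1);  % Make channel 3 correlate with channel 1
S = cov(x);                          % Covariance matrix (normalized by N-1)
d = sqrt(diag(S));                   % Standard deviations of each channel
C = S ./ (d*d');                     % Eq. (20): S(i,j)/sqrt(S(i,i)*S(j,j))
max(max(abs(C - corrcoef(x))))       % Should be zero (to rounding error)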
• 207. The correlation matrix is a set of correlation coefficients between waveform observations or channels and has a positional relationship similar to that of the covariance matrix:

$$R_{xx} = \begin{bmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(L) \\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(L-1) \\ \vdots & \vdots & \ddots & \vdots \\ r_{xx}(L) & r_{xx}(L-1) & \cdots & r_{xx}(0) \end{bmatrix} \tag{21}$$

Since the diagonals in the correlation matrix give the correlation of a given variable or waveform with itself, they will all equal 1 (rxx(0) = 1), and the off-diagonals will vary between ±1.

MATLAB Implementation

MATLAB has specific functions for performing convolution, crosscorrelation/autocorrelation, crosscovariance/autocovariance, and construction of the correlation and covariance matrices. To implement convolution in MATLAB, the code is straightforward using the conv function:

y = conv(x,h)

where x and h are vectors containing the waveforms to be convolved and y is the output waveform. The length of the output waveform is equal to the length of x plus the length of h minus 1. This will produce additional data points, and methods for dealing with these extra points are presented at the end of this chapter, along with other problems associated with finite data. Frequently, the additional data points can simply be discarded. An example of the use of this routine is given in Example 2.2. Although the algorithm performs the process defined in Eq. (15), it may actually operate in the frequency domain to improve the speed of the operation.

The crosscorrelation and autocorrelation operations are both performed with the same MATLAB routine, with autocorrelation being treated as a special case:

[c,lags] = xcorr(x,y,maxlags,'options')

Only the first input argument, x, is required. If no y variable is specified, autocorrelation is performed. The optional argument maxlags specifies the shifting range. The shifted waveform is shifted between ±maxlags, or over the default range of −N+1 to N−1, where N is the length of the input vector, x. If a y vector is specified, then crosscorrelation is performed and the same shifting range applies. If one of the waveforms is shorter than the other (as is usually
• 208. the case), it is zero padded (defined and described at the end of this chapter) to be the same length as the longer segment; hence, N would be the length of the longer waveform. A number of scaling operations can be specified by the argument options. If options equals 'biased', the output is scaled by 1/N, which gives a biased estimate of the crosscorrelation/autocorrelation function. If options equals 'unbiased', the output is scaled by 1/(N − |M|), where M is the lag. Setting options to 'coeff' is used in autocorrelation and scales the autocorrelation function so that the zero-lag autocorrelation has a value equal to one. Finally, options equals 'none' indicates no scaling, which is the default.

The xcorr function produces an output argument, c, that is a vector of length 2*maxlags + 1 if maxlags is specified, or 2N − 1 if the default range is used. The optional output argument, lags, is simply a vector containing the lag values (i.e., a vector of integers ranging between ±maxlags) and is useful in plotting.

Autocovariance or crosscovariance is obtained using the xcov function:

[c,lags] = xcov(x,y,maxlags,'options')

The arguments are identical to those described above for the xcorr function. Correlation or covariance matrices are calculated using the corrcoef or cov functions, respectively. Again, the calls are similar for both functions:

Rxx = corrcoef(x)
S = cov(x), or S = cov(x,1);

Without the additional 1 in the calling argument, cov normalizes by N − 1, which provides the best unbiased estimate of the covariance matrix if the observations are from a Gaussian distribution. When the second argument is present, cov normalizes by N, which produces the second moment of the observations about their mean.

Example 2.2 shows the use of both the convolution and autocorrelation functions. The program produces autocorrelation functions of noise bandlimited at two different frequencies. To generate the bandlimited (i.e., colored) noise used for the autocorrelation, an impulse response function is generated in the form of sin(x)/x (i.e., the sinc function). We will see in Chapter 4 that this is the impulse response of one type of lowpass filter. Convolution of this impulse response with a white noise sequence is used to generate bandlimited noise. A vector containing Gaussian white noise is produced using the randn routine, and the lowpass filter is implemented by convolving the noise with the filter's im-
• 209. pulse response. The result is noise bandlimited by the cutoff frequency of the filter. The output of the filter is then processed by the autocorrelation routine to produce the autocorrelation curves shown in Figure 2.8 above. The two figures were obtained for bandlimited noise having bandwidths of π/20 rad/sec and π/8 rad/sec. The variable wc specifies the cutoff frequency of the lowpass filter in the code below. The theory and implementation of a lowpass filter such as the one used below are presented in Chapter 4.

Example 2.2 Generate bandlimited noise and compute and plot the autocorrelation function for two different bandwidths.

% Example 2.2 and Figure 2.8
% Generate colored noise having two different bandwidths
% and evaluate using autocorrelation.
%
close all; clear all;
N = 1024;                  % Size of arrays
L = 100;                   % FIR filter length
noise = randn(N,1);        % Generate noise
%
% Compute the impulse response of a lowpass filter
% This type of filter is covered in Chapter 4
%
wn = pi*[1/20 1/8];        % Use cutoff frequencies of pi/20 and pi/8
for k = 1:2                % Repeat for two different cutoff frequencies
  wc = wn(k);              % Assign filter cutoff frequency
  for i = 1:L+1            % Generate sin(x)/x function
    n = i-L/2;             % and make symmetrical
    if n == 0
      hn(i) = wc/pi;
    else
      hn(i) = (sin(wc*(n)))/(pi*n);   % Filter impulse response
    end
  end
  out = conv(hn,noise);    % Filter
  [cor, lags] = xcorr(out,'coeff');   % Calculate autocorrelation, normalized
  % Plot the autocorrelation functions
  subplot(1,2,k);
  plot(lags(1,:),cor(:,1),'k');       % Plot using 'lags' vector
  axis([-50 50 -.5 1.1]);             % Define axes scale
• 210.
  ylabel('Rxx');                      % Labels
  xlabel('Lags(n)');
  title(['Bandwidth = ',num2str(wc)]);
end

Example 2.3 evaluates the covariance and correlation of sinusoids that are, and are not, orthogonal. Specifically, this example demonstrates the lack of correlation and covariance between sinusoids that are orthogonal, such as a sine and cosine at the same frequency and harmonically related sinusoids (i.e., those having frequencies that are multiples of one another). It also shows the correlation and covariance for sinusoids that are not orthogonal, such as sines that are not at harmonically related frequencies.

Example 2.3 Generate a data matrix where the columns consist of orthogonal and non-orthogonal sinusoids. Specifically, the data matrix should consist of a 1 Hz sine and cosine, a 2 Hz sine and cosine, and a 1.5 Hz sine and cosine. The six sinusoids should all be at different amplitudes. The first four sinusoids are orthogonal and should show negligible correlation, while the two 1.5 Hz sinusoids should show some correlation with the other sinusoids.

% Example 2.3
% Application of the correlation and covariance matrices to
% sinusoids that are orthogonal and non-orthogonal
%
clear all; close all;
N = 256;                      % Number of data points in each waveform
fs = 256;                     % Sample frequency
n = (1:N)/fs;                 % Generate 1 sec of data
%
% Generate the sinusoids as columns of the matrix
x(:,1) = sin(2*pi*n)';        % Generate a 1 Hz sine
x(:,2) = 2*cos(2*pi*n)';      % Generate a 1 Hz cosine
x(:,3) = 1.5*sin(4*pi*n)';    % Generate a 2 Hz sine
x(:,4) = 3*cos(4*pi*n)';      % Generate a 2 Hz cosine
x(:,5) = 2.5*sin(3*pi*n)';    % Generate a 1.5 Hz sine
x(:,6) = 1.75*cos(3*pi*n)';   % Generate a 1.5 Hz cosine
%
S = cov(x)                    % Print covariance matrix
C = corrcoef(x)               % and correlation matrix

The output from this program is a covariance and a correlation matrix. The covariance matrix is:
• 211.
S =
   0.5020   0.0000   0.0000   0.0000   0.0000  -0.4474
   0.0000   2.0078  -0.0000  -0.0000   1.9172  -0.0137
   0.0000  -0.0000   1.1294   0.0000  -0.0000   0.9586
   0.0000  -0.0000   0.0000   4.5176  -2.0545  -0.0206
   0.0000   1.9172  -0.0000  -2.0545   2.8548   0.0036
  -0.4474  -0.0137   0.9586  -0.0206   0.0036   1.5372

In the covariance matrix, the diagonals, which give the variances of the six signals, differ since the amplitudes of the signals are different. The covariance between the first four signals is zero, demonstrating the orthogonality of these signals. The correlation between the 5th and 6th signals and the other sinusoids can best be observed from the correlation matrix:

Rxx =
   1.0000   0.0000   0.0000   0.0000   0.0000  -0.5093
   0.0000   1.0000  -0.0000  -0.0000   0.8008  -0.0078
   0.0000  -0.0000   1.0000   0.0000  -0.0000   0.7275
   0.0000  -0.0000   0.0000   1.0000  -0.5721  -0.0078
   0.0000   0.8008  -0.0000  -0.5721   1.0000   0.0017
  -0.5093  -0.0078   0.7275  -0.0078   0.0017   1.0000

In the correlation matrix, the correlation of each signal with itself is, of course, 1.0. The 1.5 Hz sine (the 5th column of the data matrix) shows good correlation with the 1.0 and 2.0 Hz cosines (2nd and 4th columns) but not with the other sine waves, while the 1.5 Hz cosine (the 6th column) shows the opposite. Hence, sinusoids that are not harmonically related are not orthogonal and do show some correlation.

SAMPLING THEORY AND FINITE DATA CONSIDERATIONS

To convert an analog waveform into a digitized version residing in memory requires two operations: sampling the waveform at discrete points in time,* and, if the waveform is longer than the computer memory, isolating a segment of the analog waveform for the conversion. The waveform segmentation operation is windowing, as mentioned previously, and the consequences of this operation are discussed in the next chapter. If the purpose of sampling is to produce a digitized copy of the original waveform, then the critical issue is how well this copy represents the original. Stated another way, can the original be reconstructed from the digitized copy? If so, then the copy is clearly adequate. The

*As described in Chapter 1, this operation involves both time slicing, termed sampling, and amplitude slicing, termed quantization.
• 212. answer to this question depends on the frequency at which the analog waveform is sampled relative to the frequencies that it contains.

The question of what sampling frequency should be used can best be addressed by assuming a simple waveform, a single sinusoid.* All finite, continuous waveforms can be represented by a series of sinusoids (possibly an infinite series), so if we can determine the appropriate sampling frequency for a single sinusoid, we have also solved the more general problem. The Shannon Sampling Theorem states that any sinusoidal waveform can be uniquely reconstructed provided it is sampled at least twice in one period. (Equally spaced samples are assumed.) That is, the sampling frequency, fs, must be ≥ 2fsinusoid. In other words, only two equally spaced samples are required to uniquely specify a sinusoid, and these can be taken anywhere over the cycle. Extending this to a general analog waveform, Shannon's Sampling Theorem states that a continuous waveform can be reconstructed without loss of information provided the sampling frequency is greater than twice the highest frequency in the analog waveform:

$$f_s > 2 f_{max} \tag{22}$$

As mentioned in Chapter 1, in practical situations fmax is usually taken as the highest frequency in the analog waveform above which only a negligible amount of energy exists.

The sampling process is equivalent to multiplying the analog waveform by a repeating series of short pulses. This repeating series of short pulses is sometimes referred to as the sampling function. Recall that the ideal short pulse is called the impulse function, δ(t). In theory, the impulse function is infinitely short, but it is also infinitely tall, so that its total area equals 1. (This must be justified using limits, but any pulse that is very short compared to the dynamics of the sampled waveform will do. Recall that the sampling pulse produced in most modern analog-to-digital converters, termed the aperture time, is typically less than 100 nsec.) The sampling function can be stated mathematically using the impulse function:

$$\mathrm{Samp}(n) = \sum_{k=-\infty}^{\infty} \delta(n - kT_s) \tag{23}$$

where Ts is the sample interval and equals 1/fs. For an analog waveform, x(t), the sampled version, x(n), is given by multiplying x(t) by the sampling function in Eq. (23):

*A sinusoid has a straightforward frequency domain representation: only a single complex point at the frequency of the sinusoid. Classical methods of frequency analysis described in the next chapter make use of this fact.
• 213.

$$x(n) = \sum_{k=-\infty}^{\infty} x(kT_s)\, \delta(n - kT_s) \tag{24}$$

The frequency spectrum of the sampling process represented by Eq. (23) can be determined by taking advantage of the fact that multiplication in the time domain is equivalent to convolution in the frequency domain (and vice versa). Hence, the frequency characteristic of a sampled waveform is just the convolution of the analog waveform spectrum with the sampling function spectrum. Figure 2.9A shows the spectrum of a sampling function having a repetition rate of Ts, and Figure 2.9B shows the spectrum of a hypothetical signal that has a well-defined maximum frequency, fmax. Figure 2.9C shows the spectrum of the sampled waveform assuming fs = 1/Ts ≥ 2fmax. Note that the frequency characteristic of the sampled waveform is the same as the original for the lower frequencies, but the sampled spectrum now has a repetition of the original spectrum reflected on either side of fs and at multiples of fs. Nonetheless, it would be possible to recover the original spectrum simply by filtering the sampled data with an ideal lowpass filter having a bandwidth > fmax, as shown in Figure 2.9E. Figure 2.9D shows the spectrum that results if the digitized data were sampled at fs < 2fmax, in this case fs = 1.5fmax. Note that the reflected portion of the spectrum has become intermixed with the original spectrum, and no filter can unmix them.* When fs < 2fmax, the sampled data suffer from spectral overlap, better known as aliasing. The sampled data no longer provide a unique representation of the analog waveform, and recovery is not possible.

When correctly sampled, the original spectrum can be recovered by applying an ideal lowpass filter (digital filter) to the digitized data. In Chapter 4, we show that an ideal lowpass filter has an impulse response given by:

$$h(n) = \frac{\sin(2\pi f_c T_s n)}{\pi n} \tag{25}$$

where Ts is the sample interval and fc is the filter's cutoff frequency.

Unfortunately, in order for this impulse function to produce an ideal filter, it must be infinitely long. As demonstrated in Chapter 4, truncating h(n) results in a filter that is less than ideal. However, if fs >> fmax, as is often the case, then any reasonable lowpass filter will suffice to recover the original waveform (Figure 2.9F). In fact, using sampling frequencies much greater than required is the norm, and often the lowpass filter is provided only by the response characteristics of the output, or display, device, which is sufficient to reconstruct an adequate-looking signal.

*You might argue that you could recover the original spectrum if you knew exactly the spectrum of the original analog waveform, but with this much information, why bother to sample the waveform in the first place!
• 214. FIGURE 2.9 Consequences of sampling expressed in the frequency domain. (A) Frequency spectrum of a repetitive impulse function sampling at 6 Hz. (B) Frequency spectrum of a hypothetical time signal that has a maximum frequency, fmax, around 2 Hz. (Note that negative frequencies occur with complex representation.) (C) Frequency spectrum of the sampled waveform when the sampling frequency was greater than twice the highest frequency component in the sampled waveform. (D) Frequency spectrum of the sampled waveform when the sampling frequency was less than twice the highest frequency component in the sampled waveform. Note the overlap. (E) Recovery of a correctly sampled waveform using an ideal lowpass filter (dotted line). (F) Recovery of a waveform when the sampling frequency is much greater than twice the highest frequency in the sampled waveform (fs = 10fmax). In this case, the lowpass filter (dotted line) need not have as sharp a cutoff.
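The aliasing illustrated in Figure 2.9D is easy to reproduce numerically. The sketch below (a minimal illustration, not from the text; the frequencies chosen are ours) samples a 6 Hz sine wave at 8 Hz, below the required 12 Hz, and the sampled points trace out a 2 Hz alias (8 − 6 Hz):

% Minimal sketch (not from the text): a 6 Hz sine sampled at 8 Hz
% (below the required 12 Hz) aliases to 2 Hz
f = 6;                            % Signal frequency (Hz)
fs = 8;                           % Sampling frequency (Hz); fs < 2*f
t = 0:.001:1;                     % "Continuous" time for reference
ts = 0:1/fs:1;                    % Sample times
plot(t,sin(2*pi*f*t),'k'); hold on;
plot(ts,sin(2*pi*f*ts),'ko');     % Sampled points
plot(t,-sin(2*pi*(fs-f)*t),'k:'); % The 2 Hz alias passes through the samples
xlabel('Time (sec)'); ylabel('Amplitude');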
• 215. Edge Effects

An advantage of dealing with infinite data is that one need not be concerned with the end points, since there are no end points. However, finite data consist of numerical sequences having a fixed length, with fixed end points at the beginning and end of the sequence. Some operations, such as convolution, may produce additional data points, while some operations require additional data points to complete their operation on the data set. The question then becomes how to add or eliminate data points, and there are a number of popular strategies for dealing with these edge effects.

There are three common strategies for extending a data set when additional points are needed: extending with zeros (or a constant), termed zero padding; extending using periodicity or wraparound; and extending by reflection, also known as symmetric extension. These options are illustrated in Figure 2.10. In the zero padding approach, zeros are added to the end or beginning of the data sequence (Figure 2.10A). This approach is frequently used in spectral analysis and is justified by the implicit assumption that the waveform is zero outside of the sample period anyway. A variant of zero padding is constant padding, where the data sequence is extended using a constant value, often the last (or first) value in the sequence. If the waveform can be reasonably thought of as one cycle of a periodic function, then the wraparound approach is clearly justified (Figure 2.10B). Here the data are extended by tacking the initial data sequence onto the end of the data set and vice versa. This is quite easy to implement numerically: simply make all operations involving the data sequence index modulo N, where N is the initial length of the data set. These two approaches will, in general, produce a discontinuity at the beginning or end of the data set, which can lead to artifact in certain situations. The symmetric reflection approach eliminates this discontinuity by tacking on the end points in reverse order (or the beginning points, if extending the beginning of the data sequence) (Figure 2.10C).*

To reduce the number of points in cases where an operation has generated additional data, two strategies are common: simply eliminate the additional points at the end of the data set, or eliminate data from both ends of the data set, usually symmetrically. The latter is used when the data are considered periodic and it is desired to retain the same period, or when other similar concerns are involved. An example of this is circular or periodic convolution. In this case, the original data set is extended using the wraparound strategy, convolution is performed on the extended data set, then the additional points are removed symmetrically. The goal is to preserve the relative phase between waveforms pre- and post-convolution. Periodic convolution is often used in wavelet analysis, where a data set may be operated on sequentially a number of times; examples are found in Chapter 7.

*When using this extension, there is a question as to whether or not to repeat the last point in the extension; either strategy will produce a smooth extension. The answer to this question will depend on the type of operation being performed and the number of data points involved, and determining the best approach may require empirical evaluation.
• 216. FIGURE 2.10 Three strategies for extending the length of a finite data set. (A) Zero padding: zeros are added at the ends of the data set. (B) Periodic or wraparound: the waveform is assumed periodic, so the end points are added at the beginning, and the beginning points are added at the end. (C) Symmetric: points are added to the ends in reverse order. Using this strategy, the edge points may be repeated, as was done at the beginning of the data set, or not repeated, as at the end of the set.

PROBLEMS

1. Load the data in ensemble_data.mat found on the CD. This file contains a data matrix labeled data. The data matrix contains 100 responses of a second-
• 217. order system buried in noise. In this matrix each row is a separate response. Plot several randomly selected samples of these responses. Is it possible to evaluate the second-order response from any single record? Construct and plot the ensemble average for these data. Also construct and plot the ensemble standard deviation.

2. Use the MATLAB autocorrelation and random number routines to plot the autocorrelation sequence of white noise. Use arrays of 2048 and 256 points to show the effect of data length on this operation. Repeat for both uniform and Gaussian (normal) noise. (Use the MATLAB routines rand and randn, respectively.)

3. Construct a 512-point noise array, then filter it by averaging the points three at a time. That is, construct a new array in which every point is the average of the preceding three points in the noise array: y(n) = (1/3)x(n) + (1/3)x(n−1) + (1/3)x(n−2). Note that the new array will be two points shorter than the original noise array. Construct and plot the autocorrelation of this filtered array. You may want to save the output, or the code that generates it, for use in a spectral analysis problem at the end of Chapter 3. (See Problem 2, Chapter 3.)

4. Repeat the operation of Problem 3 to find the autocorrelation, but use convolution to implement the filter. That is, construct a filter function consisting of 3 equal coefficients of 1/3: h(n) = [1/3 1/3 1/3]. Then convolve this weighting function with the random array using conv.

5. Repeat the process in Problem 4 using a 10-weight averaging filter. (Note that it is much easier to implement such a running average filter with this many weights using convolution.)

6. Construct an array containing the impulse response of a first-order process. The impulse response of a first-order process is given by the equation $y(t) = e^{-t/\tau}$ (scaled for unit amplitude). Assume a sampling frequency of 200 Hz and a time constant, τ, of 1 sec. Make sure the array is at least 5 time constants long. Plot this impulse response to verify its exponential shape. Convolve this impulse response with a 512-point noise array and construct and plot the autocorrelation function of this array. Repeat this analysis for an impulse response with a time constant of 0.2 sec. Save the outputs for use in a spectral analysis problem at the end of Chapter 3. (See Problems 4 and 5, Chapter 3.)

7. Repeat Problem 6 above using the impulse response of a second-order underdamped process. The impulse response of a second-order underdamped system is given by:

$$y(t) = \frac{1}{\sqrt{1 - \delta^2}}\, e^{-\delta 2\pi f_n t}\, \sin\!\left(2\pi f_n \sqrt{1 - \delta^2}\; t\right)$$
• 218. Use a sampling rate of 500 Hz and set the damping factor, δ, to 0.1 and the frequency, fn (termed the undamped natural frequency), to 10 Hz. The array should be the equivalent of at least 2.0 seconds of data. Plot the impulse response to check its shape. Again, convolve this impulse response with a 512-point noise array and construct and plot the autocorrelation function of this array. Save the outputs for use in a spectral analysis problem at the end of Chapter 3. (See Problem 6, Chapter 3.)

8. Construct 4 damped sinusoids similar to the signal, y(t), in Problem 7. Use a damping factor of 0.04 and generate two seconds of data assuming a sampling frequency of 500 Hz. Two of the 4 signals should have an fn of 10 Hz and the other two an fn of 20 Hz. The two signals at the same frequency should be 90 degrees out of phase (replace the sine with a cosine). Are any of these four signals orthogonal?
• 219. 3 Spectral Analysis: Classical Methods

INTRODUCTION

Sometimes the frequency content of a waveform provides more useful information than the time domain representation. Many biological signals demonstrate interesting or diagnostically useful properties when viewed in the so-called frequency domain. Examples of such signals include heart rate, EMG, EEG, ECG, eye movements and other motor responses, acoustic heart sounds, and stomach and intestinal sounds. In fact, just about all biosignals have, at one time or another, been examined in the frequency domain. Figure 3.1 shows the time response of an EEG signal and an estimate of its spectral content using the classical Fourier transform method described later. Several peaks in the frequency plot can be seen, indicating significant energy in the EEG at these frequencies.

Determining the frequency content of a waveform is termed spectral analysis, and the development of useful approaches for this frequency decomposition has a long and rich history (Marple, 1987). Spectral analysis can be thought of as a mathematical prism (Hubbard, 1998), decomposing a waveform into its constituent frequencies just as a prism decomposes light into its constituent colors (i.e., specific frequencies of the electromagnetic spectrum).

A great variety of techniques exist to perform spectral analysis, each having different strengths and weaknesses. Basically, the methods can be divided into two broad categories: classical methods based on the Fourier transform, and modern methods such as those based on the estimation of model parameters.
• 220. FIGURE 3.1 Upper plot: segment of an EEG signal from the PhysioNet data bank (Goldberger et al.). Lower plot: the resultant power spectrum.

The accurate determination of a waveform's spectrum requires that the signal be periodic, or of finite length, and noise-free. The problem is that in many biological applications the waveform of interest is either infinite or of sufficient length that only a portion of it is available for analysis. Moreover, biosignals are often corrupted by substantial amounts of noise or artifact. If only a portion of the actual signal can be analyzed, and/or if the waveform contains noise along with the signal, then all spectral analysis techniques must necessarily be approximate; they are estimates of the true spectrum. The various spectral analysis approaches attempt to improve the estimation accuracy of specific spectral features.

Intelligent application of spectral analysis techniques requires an understanding of what spectral features are likely to be of interest and which methods
• 221. provide the most accurate determination of those features. Two spectral features of potential interest are the overall shape of the spectrum, termed the spectral estimate, and local features of the spectrum, sometimes referred to as parametric estimates. For example, signal detection, finding a narrowband signal in broadband noise, requires a good estimate of local features. Unfortunately, techniques that provide good spectral estimation are poor local estimators, and vice versa. Figure 3.2A shows the spectral estimate obtained by applying the traditional Fourier transform to a waveform consisting of a 100 Hz sine wave buried in white noise. The SNR is −14 dB; that is, the signal amplitude is 1/5 of the noise. Note that the 100 Hz sine wave is readily identified as a peak in the spectrum at that frequency. Figure 3.2B shows the spectral estimate obtained by a smoothing process applied to the same signal (the Welch method, described later in this chapter). In this case, the waveform was divided into 32 segments, the Fourier transform was applied to each segment, then the 32 spectra were averaged. The resulting spectrum provides a more accurate representation of the overall spectral features (predominantly those of the white noise), but the 100 Hz signal is lost. Figure 3.2 shows that the smoothing approach is a good spectral estimator in the sense that it provides a better estimate of the dominant noise component, but it is not a good signal detector.

FIGURE 3.2 Spectra obtained from a waveform consisting of a 100 Hz sine wave and white noise using two different methods. The Fourier transform method was used to produce the left-hand spectrum, and the spike at 100 Hz is clearly seen. An averaging technique was used to create the spectrum on the right side, and the 100 Hz component is no longer visible. Note, however, that the averaging technique produces a better estimate of the white noise spectrum. (The spectrum of white noise should be flat.)
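The tradeoff shown in Figure 3.2 can be reproduced with a few lines of code. The sketch below (a minimal illustration, not from the text; it assumes the Signal Processing Toolbox for pwelch, and the variable names are ours) compares a raw periodogram with a Welch averaged estimate for a sine wave whose amplitude is one-fifth of the noise standard deviation:

% Minimal sketch (not from the text): raw FFT spectrum versus a Welch
% averaged estimate for a 100 Hz sine buried in white noise
fs = 1000;                       % Sample frequency (Hz)
t = (1:8192)/fs;                 % Time vector
x = sin(2*pi*100*t) + 5*randn(1,length(t));  % Signal is 1/5 the noise
X = abs(fft(x)).^2/length(x);    % Raw periodogram
f = (0:length(x)-1)*fs/length(x);
subplot(1,2,1);
plot(f(1:4096),X(1:4096),'k');   % Plot up to fs/2; 100 Hz peak visible
xlabel('Frequency (Hz)');
subplot(1,2,2);
pwelch(x,256,0,256,fs);          % Welch estimate: 32 segments of 256 points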
• 222. The classical procedures for spectral estimation are described in this chapter with particular regard to their strengths and weaknesses. These methods can be easily implemented in MATLAB, as described in the following section. Modern methods for spectral estimation are covered in Chapter 5.

THE FOURIER TRANSFORM: FOURIER SERIES ANALYSIS

Periodic Functions

Of the many techniques currently in vogue for spectral estimation, the classical Fourier transform (FT) method is the most straightforward. The Fourier transform approach takes advantage of the fact that sinusoids contain energy at only one frequency. If a waveform can be broken down into a series of sines or cosines of different frequencies, the amplitudes of these sinusoids must be proportional to the frequency components contained in the waveform at those frequencies.

From Fourier series analysis, we know that any periodic waveform can be represented by a series of sinusoids that are at the same frequency as, or multiples of, the waveform frequency. This family of sinusoids can be expressed either as sines and cosines, each of appropriate amplitude, or as a single sine wave of appropriate amplitude and phase angle. Consider the case where sines and cosines are used to represent the frequency components: to find the appropriate amplitude of these components, it is only necessary to correlate (i.e., multiply) the waveform with the sine and cosine family, and average (i.e., integrate) over the complete waveform (or one period if the waveform is periodic). Expressed as an equation, this procedure becomes:

$$a(m) = \frac{1}{T}\int_{0}^{T} x(t)\cos(2\pi m f_T t)\, dt \tag{1}$$

$$b(m) = \frac{1}{T}\int_{0}^{T} x(t)\sin(2\pi m f_T t)\, dt \tag{2}$$

where T is the period or time length of the waveform, fT = 1/T, and m is a set of integers, possibly infinite (m = 1, 2, 3, . . .), defining the family member. This gives rise to a family of sines and cosines having harmonically related frequencies, mfT.

In terms of the general transform discussed in Chapter 2, Fourier series analysis uses a probing function in which the family consists of harmonically
• 223. related sinusoids. The sines and cosines in this family have valid frequencies only at values of m/T, which is either the same frequency as the waveform (when m = 1) or higher multiples (when m > 1), termed harmonics. Since this approach represents waveforms by harmonically related sinusoids, it is sometimes referred to as harmonic decomposition. For periodic functions, the Fourier transform and Fourier series constitute a bilateral transform: the Fourier transform can be applied to a waveform to get the sinusoidal components, and the Fourier series sine and cosine components can be summed to reconstruct the original waveform:

$$x(t) = \frac{a(0)}{2} + \sum_{m=1}^{\infty} a(m)\cos(2\pi m f_T t) + \sum_{m=1}^{\infty} b(m)\sin(2\pi m f_T t) \tag{3}$$

Note that for most real waveforms, the number of sine and cosine components that have significant amplitudes is limited, so that a finite, sometimes fairly short, summation can be quite accurate. Figure 3.3 shows the construction

FIGURE 3.3 Two periodic functions and their approximations constructed from a limited series of sinusoids. Upper graphs: a square wave is approximated by a series of 3 and 6 sine waves. Lower graphs: a triangle wave is approximated by a series of 3 and 6 cosine waves.
• 224. of a square wave (upper graphs) and a triangle wave (lower graphs) using Eq. (3) and a series consisting of only 3 (left side) or 6 (right side) sine waves. The reconstructions are fairly accurate even when using only 3 sine waves, particularly for the triangular wave.

Spectral information is usually presented as a frequency plot, a plot of sine and cosine amplitude versus component number, or the equivalent frequency. To convert from component number, m, to frequency, f, note that f = m/T, where T is the period of the fundamental. (In digitized signals, the sampling frequency can also be used to determine the spectral frequency.) Rather than plot sine and cosine amplitudes, it is more intuitive to plot the amplitude and phase angle of a sinusoidal wave using the rectangular-to-polar transformation:

$$a\cos(x) + b\sin(x) = C\sin(x + \Theta) \tag{4}$$

where $C = (a^2 + b^2)^{1/2}$ and $\Theta = \tan^{-1}(a/b)$.

Figure 3.4 shows a periodic triangle wave (sometimes referred to as a sawtooth) and the resultant frequency plot of the magnitude of the first 10 components. Note that the magnitude of the sinusoidal components becomes quite small after the first 2 components. This explains why the triangle function can be so accurately represented by only 3 sine waves, as shown in Figure 3.3.

FIGURE 3.4 A triangle or sawtooth wave (left) and the first 10 terms of its Fourier series (right). Note that the terms become quite small after the second term.
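The reconstructions in Figure 3.3 are easy to reproduce. The sketch below (a minimal illustration, not from the text) sums the first three nonzero terms of a square wave's Fourier series using the well-known coefficients b(m) = 4/(mπ) for odd m:

% Minimal sketch (not from the text): approximate a square wave by
% summing its first 3 nonzero Fourier series terms, per Eq. (3)
T = 1;                          % Period (sec)
t = 0:.001:2*T;                 % Two periods of time
x = zeros(size(t));
for m = 1:2:5                   % Odd harmonics only: m = 1, 3, 5
    b = 4/(m*pi);               % Fourier coefficient of a unit square wave
    x = x + b*sin(2*pi*m*t/T);  % Add the m-th harmonic
end
plot(t,x,'k');
xlabel('Time (sec)'); ylabel('x(t)');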
• 225. Symmetry
Some waveforms are symmetrical or anti-symmetrical about t = 0, so that one or the other of the components, a(k) or b(k) in Eq. (3), will be zero. Specifically, if the waveform has mirror symmetry about t = 0, that is, x(t) = x(−t), then multiplication by the sine functions will give zero irrespective of the frequency, and this will cause all b(k) terms to be zero. Such mirror-symmetric functions are termed even functions. Similarly, if the function has anti-symmetry, x(t) = −x(−t), a so-called odd function, then all multiplications with cosines of any frequency will be zero, causing all a(k) coefficients to be zero. Finally, functions that have half-wave symmetry will have no even coefficients; both a(k) and b(k) will be zero for even m. These are functions where the second half of the period looks like the first half inverted; i.e., x(t) = −x(t − T/2). Functions having half-wave symmetry can also be either odd or even functions. These symmetries are useful for reducing the complexity of solving for the coefficients when such computations are done manually. Even when the Fourier transform is done on a computer (which is usually the case), these properties can be used to check the correctness of a program’s output, as shown in the sketch following Table 3.1.

Discrete Time Fourier Analysis
The discrete-time Fourier series analysis is an extension of the continuous analysis procedure described above, but modified by two operations: sampling and windowing. The influence of sampling on the frequency spectrum has been covered in Chapter 2. Briefly, the sampling process makes the spectrum repetitive at multiples of the sampling frequency, mfs (m = 1, 2, 3, . . . ), and symmetrically reflected about these frequencies (see Figure 2.9). Hence the discrete Fourier series of any waveform is theoretically infinite, but since it is periodic and symmetric about fs/2, all of the information is contained in the frequency range of 0 to fs/2 (fs/2 is the Nyquist frequency). This follows from the sampling theorem and the fact that the original analog waveform must be bandlimited, so that its highest frequency, fMAX, is less than fs/2 if the digitized data are to be an accurate representation of the analog waveform.

TABLE 3.1 Function Symmetries
Function Name   Symmetry              Coefficient Values
Even            x(t) = x(−t)          b(k) = 0
Odd             x(t) = −x(−t)         a(k) = 0
Half-wave       x(t) = −x(t − T/2)    a(k) = b(k) = 0 for even m
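The symmetry properties in Table 3.1 can be verified numerically, a useful check on any Fourier program. The fragment below is an illustration (not from the text): it builds an even function and confirms that the sine (imaginary) terms of its transform are essentially zero.

% Sketch: check that an even function produces b(k) = 0 (Table 3.1)
N = 512;
n = -N/2:N/2-1;                            % Time axis symmetric about t = 0
x = cos(2*pi*4*n/N) + .5*cos(2*pi*7*n/N);  % Even function: x(t) = x(-t)
X = fft(fftshift(x));                      % Shift so t = 0 is the first sample
disp(max(abs(imag(X))))                    % Sine terms: ~0 (roundoff only)
disp(max(abs(real(X))))                    % Cosine terms carry the energy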
• 226. The digitized waveform must necessarily be truncated, at least to the length of the memory storage array, a process described as windowing. The windowing process can be thought of as multiplying the data by some window shape (see Figure 2.4). If the waveform is simply truncated and no further shaping is performed on the resultant digitized waveform (as is often the case), then the window shape is rectangular by default. Other shapes can be imposed on the data by multiplying the digitized waveform by the desired shape. The influence of such windowing processes is described in a separate section below. The equations for computing the Fourier series analysis of digitized data are the same as for continuous data except that integration is replaced by summation. Usually these equations are presented using complex variable notation, so that both the sine and cosine terms can be represented by a single exponential term using Euler’s identity:

e^(jx) = cos x + j sin x    (5)

(Note that mathematicians use i to represent √−1, while engineers use j; i is reserved for current.) Using complex notation, the equation for the discrete Fourier transform becomes:

X(m) = Σ(n=0 to N−1) x(n) e^(−j2πmn/N)    (6)

where N is the total number of points and m indicates the family member, i.e., the harmonic number. This number must now be allowed to be both positive and negative when used in complex notation: m = −N/2, . . . , N/2 − 1. Note the similarity of Eq. (6) to Eq. (8) of Chapter 2, the general transform in discrete form; in Eq. (6), fm(n) is replaced by e^(−j2πmn/N). The inverse Fourier transform can be calculated as:

x(n) = (1/N) Σ(m=0 to N−1) X(m) e^(j2πmn/N)    (7)

Applying the rectangular-to-polar transformation described in Eq. (4), it is also apparent that |X(m)| gives the magnitude for the sinusoidal representation of the Fourier series, while the angle of X(m) gives the phase angle for this representation, since X(m) can also be written as:

X(m) = Σ(n=0 to N−1) x(n) cos(2πmn/N) − j Σ(n=0 to N−1) x(n) sin(2πmn/N)    (8)

As mentioned above, for computational reasons, X(m) must be allowed to have both positive and negative values for m; negative values imply negative frequencies, but these are only a computational necessity and have no physical meaning.
• 227. In some versions of the Fourier series equations shown above, Eq. (6) is multiplied by Ts (the sampling time) while Eq. (7) is divided by Ts, so that the sampling interval is incorporated explicitly into the Fourier series coefficients. Other methods of scaling these equations can be found in the literature. The discrete Fourier transform produces a function of m. To convert this to frequency, note that:

fm = m f1 = m/TP = m/(NTs) = m fs/N    (9)

where f1 ≡ fT is the fundamental frequency, Ts is the sample interval, fs is the sample frequency, N is the number of points in the waveform, and TP = NTs is the period of the waveform. Substituting m = fm NTs into Eq. (6), the equation for the discrete Fourier transform can also be written as:

X(f) = Σ(n=0 to N−1) x(n) e^(−j2πn fm Ts)    (10)

which may be more useful in manual calculations. If the waveform of interest is truly periodic, then the approach described above produces an accurate spectrum of the waveform. In this case, such analysis should properly be termed Fourier series analysis, but it is usually termed Fourier transform analysis. The latter term more appropriately applies to aperiodic or truncated waveforms, but the algorithms used in all cases are the same, so the term Fourier transform is commonly applied to all spectral analyses based on decomposing a waveform into sinusoids. Originally, the Fourier transform or Fourier series analysis was implemented by direct application of the above equations, usually using the complex formulation. Currently, the Fourier transform is implemented by a more computationally efficient algorithm, the fast Fourier transform (FFT), which cuts the number of computations from the order of N² to the order of N log N, where N is the length of the digital data.

Aperiodic Functions
If the function is not periodic, it can still be accurately decomposed into sinusoids if it is aperiodic; that is, it exists only for a well-defined period of time, and that time period is fully represented by the digitized waveform. The only difference is that, theoretically, the sinusoidal components can exist at all frequencies, not just multiple frequencies or harmonics. The analysis procedure is the same as for a periodic function, except that the frequencies obtained are really only samples along a continuous frequency spectrum. Figure 3.5 shows the frequency spectrum of a periodic triangle wave for three different periods. Note that as the period gets longer, approaching an aperiodic function, the spectral shape does not change, but the points get closer together. This is reasonable, since the spacing between the points is inversely related to the period (the points fall at m/T).
• 228. FIGURE 3.5 A periodic waveform having three different periods: 2, 2.5, and 8 sec. As the period gets longer, the shape of the frequency spectrum stays the same, but the points get closer together.

In the limit, as the period becomes infinite and the function becomes truly aperiodic, the points become infinitely close and the spectrum becomes a continuous curve. The analysis of waveforms that are not periodic and that cannot be completely represented by the digitized data is described below.

*The trick of adding zeros to a waveform to make it appear to have a longer period (and, therefore, more points in the frequency spectrum) is another example of zero padding.
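The behavior shown in Figure 3.5 is easy to demonstrate. The sketch below is an illustration with assumed values (not the code that produced the figure): the same 0.2-sec pulse is analyzed over three record lengths, and the spectral shape is unchanged while the frequency points, spaced 1/T apart, move closer together.

% Sketch: a longer "period" gives more closely spaced spectral points
fs = 1000;
x = triang(200)';                    % A fixed 0.2-sec triangular pulse
for Tp = [0.5 1 2]                   % Assumed total periods (sec)
    N = Tp*fs;
    xp = [x zeros(1,N-length(x))];   % Lengthen the period with zeros
    Xf = abs(fft(xp));
    f = (0:N-1)*fs/N;
    figure;
    stem(f(f <= 40), Xf(f <= 40), 'k');   % Low-frequency region only
    xlabel('Frequency (Hz)'); title(['T = ',num2str(Tp),' sec']);
end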
• 229. Frequency Resolution
From the discrete Fourier series equation above (Eq. (6)), the number of points produced by the operation is N, the number of points in the data set. However, since the spectrum produced is symmetrical about the midpoint, N/2 (or fs/2 in frequency), only half the points contain unique information.* If the sampling time is Ts, then each point in the spectrum represents a frequency increment of 1/(NTs). As a rough approximation, the frequency resolution of the spectrum will be the same as the frequency spacing, 1/(NTs). In the next section we show that frequency resolution is also influenced by the type of windowing that is applied to the data. As shown in Figure 3.5, the frequency spacing of the spectrum produced by the Fourier transform can be decreased by increasing the length of the data, N. Increasing the sample interval, Ts, should also improve the frequency resolution, but since that means a decrease in fs, the maximum frequency in the spectrum, fs/2, is reduced, limiting the spectral range. One simple way of increasing N even after the waveform has been sampled is to use zero padding, as was done in Figure 3.5. Zero padding is legitimate because the undigitized portion of the waveform is always assumed to be zero (whether true or not). Under this assumption, zero padding simply adds more of the unsampled waveform. The zero-padded waveform appears to have improved resolution because the frequency interval is smaller. In fact, zero padding does not enhance the underlying resolution of the transform, since the number of points that actually provide information remains the same; however, zero padding does provide an interpolated transform with a smoother appearance. In addition, it may remove ambiguities encountered in practice when a narrowband signal has a center frequency that lies between the 1/(NTs) frequency evaluation points (compare the upper two spectra in Figure 3.5). Finally, zero padding, by providing interpolation, can make it easier to estimate the frequency of peaks in the spectrum; a short demonstration follows the footnote below.

Truncated Fourier Analysis: Data Windowing
More often, a waveform is neither periodic nor aperiodic, but a segment of a much longer (possibly infinite) time series. Biomedical engineering examples are found in EEG and ECG analysis, where the waveforms being analyzed continue over the lifetime of the subject. Obviously, only a portion of such waveforms can be represented in the finite memory of the computer, and some attention must be paid to how the waveform is truncated.

*Recall that the Fourier transform contains magnitude and phase information. There are N/2 unique magnitude data points and N/2 unique phase data points, so the same number of actual data points is required to fully represent the data. Both magnitude and phase data are required to reconstruct the original time function, but we are often only interested in the magnitude data for analysis.
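Here is the promised demonstration of the zero-padding points above, a sketch with assumed values (not from the text): two sinusoids spaced more closely than 1/(NTs) remain unresolved no matter how much padding is added, but the padded spectrum is smoother and its peak is easier to localize.

% Sketch: zero padding interpolates but does not add resolution
fs = 1000; N = 64;                        % 1/(N*Ts) = 15.6 Hz spacing
t = (0:N-1)/fs;
x = sin(2*pi*200*t) + sin(2*pi*208*t);    % Only 8 Hz apart: unresolvable
subplot(2,1,1);
plot((0:N-1)*fs/N, abs(fft(x)), 'k');     % No padding
title('N = 64, no padding'); xlabel('Frequency (Hz)');
subplot(2,1,2);
plot((0:1023)*fs/1024, abs(fft(x,1024)), 'k');   % Padded to 1024
title('N = 64, zero padded to 1024'); xlabel('Frequency (Hz)');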
• 230. Often a segment is simply cut out from the overall waveform; that is, a portion of the waveform is truncated and stored, without modification, in the computer. This is equivalent to the application of a rectangular window to the overall waveform, and the analysis is restricted to the windowed portion of the waveform. The window function for a rectangular window is simply 1.0 over the length of the window and 0.0 elsewhere (Figure 3.6, left side). Windowing has some similarities to the sampling process described previously and has well-defined consequences on the resultant frequency spectrum. Window shapes other than rectangular are possible simply by multiplying the waveform by the desired shape (sometimes these shapes are referred to as tapering functions). Again, points outside the window are assumed to be zero even if that is not true. When a data set is windowed, which is essential if the data set is larger than the memory storage, the frequency characteristics of the window become part of the spectral result. In this regard, all windows produce artifact. An idea of the artifact produced by a given window can be obtained by taking the Fourier transform of the window itself. Figure 3.6 shows a rectangular window on the left side and its spectrum on the right. Again, the absence of a window function is, by default, a rectangular window. The rectangular window, and in fact all windows, produces two types of artifact: the actual spectrum is widened by an artifact termed the mainlobe, and additional peaks are generated, termed the sidelobes.

FIGURE 3.6 The time function of a rectangular window (left) and its frequency characteristics (right).
• 231. Most alternatives to the rectangular window reduce the sidelobes (they decay away more quickly than those of Figure 3.6), but at the cost of a wider mainlobe. Figures 3.7 and 3.8 show the shapes and frequency spectra produced by two popular windows: the triangular window and the raised cosine or Hamming window. The algorithms for these windows are straightforward:

Triangular window, for odd n:
w(k) = 2k/(n + 1)             for 1 ≤ k ≤ (n + 1)/2
w(k) = 2(n − k + 1)/(n + 1)   for (n + 1)/2 < k ≤ n    (11)

for even n:
w(k) = (2k − 1)/n         for 1 ≤ k ≤ n/2
w(k) = 2 − (2k − 1)/n     for n/2 + 1 ≤ k ≤ n    (12)

Hamming window:
w(k + 1) = 0.54 − 0.46 cos(2πk/(n − 1)),    k = 0, 1, . . . , n − 1    (13)

FIGURE 3.7 The triangular window in the time domain (left) and its spectral characteristic (right). The sidelobes diminish faster than those of the rectangular window (Figure 3.6), but the mainlobe is wider.
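Eq. (13) can be checked directly against MATLAB's hamming routine, described in the next section. The fragment below is an illustration only; it also plots the window's spectral characteristic in the manner of Figure 3.8 (right side).

% Sketch: implement Eq. (13) and compare with hamming(n)
n = 64;
k = 0:n-1;
w = 0.54 - 0.46*cos(2*pi*k/(n-1));   % Eq. (13)
disp(max(abs(w' - hamming(n))))      % ~0 (hamming returns a column vector)
W = abs(fft(w,1024));                % Spectral characteristic of the window
plot((0:511)/1024, 20*log10(W(1:512)/max(W)), 'k');
xlabel('Relative frequency'); ylabel('Magnitude (db)');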
• 232. FIGURE 3.8 The Hamming window in the time domain (left) and its spectral characteristic (right).

These and several other windows are easily implemented in MATLAB, especially with the Signal Processing Toolbox, as described in the next section. A MATLAB routine is also described to plot the spectral characteristics of these and other windows. Selecting the appropriate window, like so many other aspects of signal analysis, depends on what spectral features are of interest. If the task is to resolve two narrowband signals closely spaced in frequency, then a window with the narrowest mainlobe (the rectangular window) is preferred. If there is a strong and a weak signal spaced a moderate distance apart, then a window with rapidly decaying sidelobes is preferred, to prevent the sidelobes of the strong signal from overpowering the weak signal. If there are two moderate strength signals, one close and the other more distant from a weak signal, then a compromise window with a moderately narrow mainlobe and a moderate decay in sidelobes could be the best choice. Often the most appropriate window is selected by trial and error.

Power Spectrum
The power spectrum is commonly defined as the Fourier transform of the autocorrelation function. In continuous and discrete notation, the power spectrum equation becomes:
• 233. PS(f) = ∫(0 to T) rxx(τ) e^(−j2πf τ) dτ;    PS(f) = Σ(n=0 to N−1) rxx(n) e^(−j2πf nTs)    (14)

where rxx(n) is the autocorrelation function described in Chapter 2. Since the autocorrelation function has even symmetry, the sine terms, b(k), will all be zero (see Table 3.1), and Eq. (14) can be simplified to include only real cosine terms:

PS(f) = ∫(0 to T) rxx(τ) cos(2πf τ) dτ;    PS(f) = Σ(n=0 to N−1) rxx(n) cos(2πf nTs)    (15)

These equations in continuous and discrete form are sometimes referred to as the cosine transform. This approach to evaluating the power spectrum has lost favor to the so-called direct approach, given by Eq. (18) below, primarily because of the efficiency of the fast Fourier transform. However, a variation of this approach is used in certain time–frequency methods described in Chapter 6. One of the problems compares the power spectrum obtained using the direct approach of Eq. (18) with the traditional method represented by Eq. (14). The direct approach is motivated by the fact that the energy contained in an analog signal, x(t), is related to the magnitude of the signal squared, integrated over time:

E = ∫(−∞ to ∞) |x(t)|² dt    (16)

By an extension of Parseval’s theorem it is easy to show that:

∫(−∞ to ∞) |x(t)|² dt = ∫(−∞ to ∞) |X(f)|² df    (17)

Hence |X(f)|² equals the energy density function over frequency, also referred to as the energy spectral density, the power spectral density, or simply the power spectrum. In the direct approach, the power spectrum is calculated as the magnitude squared of the Fourier transform of the waveform of interest:

PS(f) = |X(f)|²    (18)

Power spectral analysis is commonly applied to truncated data, particularly when the data contain some noise, since phase information is less useful in such situations.
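The two definitions can be compared directly. The sketch below is an illustration with assumed values (scaling conventions differ between the two definitions, so both spectra are normalized before plotting); it computes the power spectrum of a noisy sinusoid from the autocorrelation function (Eq. (14), using the Signal Processing Toolbox routine xcorr) and by the direct method of Eq. (18).

% Sketch: cosine-transform vs. direct power spectrum (Eqs. (14) and (18))
fs = 1000; N = 512;
x = sin(2*pi*100*(1:N)/fs) + randn(1,N);   % 100 Hz sine in noise
rxx = xcorr(x);                            % Autocorrelation, length 2N-1
PS1 = abs(fft(rxx));                       % Eq. (14)
PS2 = abs(fft(x)).^2;                      % Eq. (18), direct method
subplot(2,1,1);
plot((0:2*N-2)*fs/(2*N-1), PS1/max(PS1), 'k');
axis([0 fs/2 0 1]); title('FT of autocorrelation');
subplot(2,1,2);
plot((0:N-1)*fs/N, PS2/max(PS2), 'k');
axis([0 fs/2 0 1]); title('Direct method');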
• 234. While the power spectrum can be evaluated by applying the FFT to the entire waveform, averaging is often used, particularly when the available waveform is only a sample of a longer signal. In such very common situations, power spectrum evaluation is necessarily an estimation process, and averaging improves the statistical properties of the result. When the power spectrum is based on a direct application of the Fourier transform followed by averaging, it is commonly referred to as an average periodogram. As with the Fourier transform, evaluation of power spectra involves necessary trade-offs to produce statistically reliable spectral estimates that also have high resolution. These trade-offs are implemented through the selection of the data window and the averaging strategy. In practice, the selection of data window and averaging strategy is usually based on experimentation with the actual data. Considerations regarding data windowing have already been described and apply similarly to power spectral analysis. Averaging is usually achieved by dividing the waveform into a number of segments, possibly overlapping, and evaluating the Fourier transform on each of these segments (Figure 3.9). The final spectrum is taken from an average of the Fourier transforms obtained from the various segments.

FIGURE 3.9 A waveform is divided into three segments with a 50% overlap between each segment. In the Welch method of spectral analysis, the Fourier transform of each segment would be computed separately, and an average of the three transforms would provide the output.
• 235. Segmentation necessarily reduces the number of data samples evaluated by the Fourier transform in each segment. As mentioned above, the frequency resolution of a spectrum is approximately equal to 1/(NTs), where N is now the number of samples per segment. Choosing a short segment length (a small N) will provide more segments for averaging and improve the reliability of the spectral estimate, but it will also decrease frequency resolution. Figure 3.2 shows spectra obtained from a 1024-point data array consisting of a 100 Hz sinusoid and white noise. In Figure 3.2A, the periodogram is taken from the entire waveform, while in Figure 3.2B the waveform is divided into 32 non-overlapping segments; a Fourier transform is calculated from each segment, then averaged. The periodogram produced from the segmented and averaged data is much smoother, but the loss in frequency resolution is apparent, as the 100 Hz sine wave is no longer visible. One of the most popular procedures to evaluate the average periodogram is attributed to Welch and is a modification of the segmentation scheme originally developed by Bartlett. In this approach, overlapping segments are used, and a window is applied to each segment. By overlapping segments, more segments can be averaged for a given segment and data length. Averaged periodograms obtained from noisy data traditionally average spectra from half-overlapping segments, that is, segments that overlap by 50%. Higher amounts of overlap have been recommended in other applications and, when computing time is not a factor, maximum overlap has been recommended. Maximum overlap means shifting over by just a single sample to get the new segment. Examples of this approach are provided in the next section on implementation. The use of data windowing for sidelobe control is not as important when the spectra are expected to be relatively flat. In fact, some studies claim that data windows give some data samples more importance than others and serve only to decrease frequency resolution without a significant reduction in estimation error. While these claims may be true for periodograms produced using all the data (i.e., no averaging), they are not true for Welch periodograms, because overlapping segments serves to equalize data treatment and the increased number of segments decreases estimation errors. In addition, windows should be applied whenever the spectra are expected to have large amplitude differences.

MATLAB IMPLEMENTATION
Direct FFT and Windowing
MATLAB provides a variety of methods for calculating spectra, particularly if the Signal Processing Toolbox is available. The basic Fourier transform routine is implemented as:

X = fft(x,n)
• 236. where x is the input waveform and X is a complex vector providing the sinusoidal coefficients. The argument n is optional and is used to modify the length of data analyzed: if n < length(x), then the analysis is performed over the first n points; if n > length(x), x is padded with trailing zeros to equal n. The fft routine implements Eq. (6) above and employs a high-speed algorithm. Calculation time is highly dependent on data length and is fastest if the data length is a power of two, or if the length has many small prime factors. For example, on one machine a 4096-point FFT takes 2.1 seconds, but requires 7 seconds if the sequence is 4095 points long, and 58 seconds if the sequence is 4097 points. If at all possible, it is best to stick with data lengths that are powers of two. The magnitude of the frequency spectrum can be easily obtained by applying the absolute value function, abs, to the complex output X:

Magnitude = abs(X)

This MATLAB function simply takes the square root of the sum of the real part of X squared and the imaginary part of X squared. The phase angle of the spectrum can be obtained by application of the MATLAB angle function:

Phase = angle(X)

The angle function takes the arctangent of the imaginary part divided by the real part of X. The magnitude and phase of the spectrum can then be plotted using standard MATLAB plotting routines. An example applying the MATLAB fft to an array containing sinusoids and white noise is provided below, and the resultant spectrum is given in Figure 3.10. Other applications are explored in the problem set at the end of this chapter. This example uses a special routine, sig_noise, found on the disk. The routine generates data consisting of sinusoids and noise that are useful in evaluating spectral analysis algorithms. The calling structure for sig_noise is:

[x,t] = sig_noise([f],[SNR],N);

where f specifies the frequency of the sinusoid(s) in Hz, SNR specifies the desired signal-to-noise ratio associated with the sinusoid(s) in db, and N is the number of points. The routine assumes a sample frequency of 1 kHz. If f and SNR are vectors, multiple sinusoids are generated. The output waveform is in x, and t is a time vector useful in plotting.

Example 3.1 Plot the power spectrum of a waveform consisting of a single sine wave and white noise with an SNR of −7 db.
• 237. FIGURE 3.10 Plot produced by the code in Example 3.1. The peak at 250 Hz is apparent. The sampling frequency of this data is 1 kHz; hence the spectrum is symmetric about the Nyquist frequency, fs/2 (500 Hz). Normally only the first half of this spectrum would be plotted (SNR = −7 db; N = 1024).

% Example 3.1 and Figure 3.10 Determine the power spectrum
% of a noisy waveform
% First generate a waveform consisting of a single sine in
% noise, then calculate the power spectrum from the FFT
% and plot
clear all; close all;
N = 1024;               % Number of data points
% Generate data using sig_noise
% 250 Hz sin plus white noise; N data points; SNR = -7 db
[x,t] = sig_noise(250,-7,N);
fs = 1000;              % The sample frequency of data is 1 kHz
Y = fft(x);             % Calculate FFT
PS = abs(Y).^2;         % Calculate PS as magnitude squared
• 238. freq = (1:N)*fs/N;          % Frequency vector for plotting
plot(freq,20*log10(PS),'k');       % Plot PS in log scale
title('Power Spectrum (note symmetric about fs/2)');
xlabel('Frequency (Hz)');
ylabel('Power Spectrum (db)');

The Welch Method for Power Spectral Density Determination
As described above, the Welch method for evaluating the power spectrum divides the data into several segments, possibly overlapping, performs an FFT on each segment, computes the magnitude squared (i.e., the power spectrum), then averages these spectra. Coding these steps in MATLAB is straightforward, but this is unnecessary, as the Signal Processing Toolbox features a function that performs these operations. In its more general form, the pwelch* function is called as:

[PS,f] = pwelch(x,window,noverlap,nfft,fs)

Only the first input argument, the name of the data vector, is required, as the other arguments have default values. By default, x is divided into eight sections with 50% overlap, each section is windowed with a Hamming window, and eight periodograms are computed and averaged. If window is an integer, it specifies the segment length, and a Hamming window of that length is applied to each segment. If window is a vector, then it is assumed to contain the window function (easily implemented using the window routines described below). In this situation, the window size will be equal to the length of the vector, usually set to be the same as nfft. If the window length is specified to be less than nfft (greater is not allowed), then the window is zero padded to have a length equal to nfft. The argument noverlap specifies the overlap in samples. The sampling frequency is specified by the optional argument fs and is used to fill the frequency vector, f, in the output with appropriate values. This output variable can be used in plotting to obtain a correctly scaled frequency axis (see Example 3.2). As is always the case in MATLAB, any variable can be omitted, and the default selected, by entering an empty vector, [ ]. If pwelch is called with no output arguments, the default is to plot the power spectral estimate in dB per unit frequency in the current figure window. If PS is specified, then it contains the power spectrum. PS is only half the length of the data vector, x, specifically either (nfft/2)+1 if nfft is even, or (nfft+1)/2 for nfft odd, since the additional points would be redundant.

*The calling structure for this function is different in MATLAB versions less than 6.1. Use the ‘Help’ command to determine the calling structure if you are using an older version of MATLAB.
• 239. (An exception is made if x is complex data, in which case the length of PS is equal to nfft.) Other options are available and can be found in the help file for pwelch.

Example 3.2 Apply Welch’s method to the sine plus noise data used in Example 3.1. Use 128-point data segments and a 50% overlap.

% Example 3.2 and Figure 3.11
% Apply Welch’s method to sin plus noise data of Figure 3.10
clear all; close all;
N = 1024;      % Number of data points
fs = 1000;     % Sampling frequency (1 kHz)

FIGURE 3.11 The application of the Welch power spectral method to data containing a single sine wave plus noise, the same as the one used to produce the spectrum of Figure 3.10. The segment length was 128 points and segments overlapped by 50%. A triangular window was applied. The improvement in the background spectra is obvious, although the 250 Hz peak is now broader.
• 240. % Generate data (250 Hz sin plus noise)
[x,t] = sig_noise(250,-7,N);
%
% Estimate the Welch spectrum using 128-point segments,
% a triangular window, and a 50% overlap
%
[PS,f] = pwelch(x,triang(128),[ ],128,fs);
plot(f,PS,'k');      % Plot power spectrum
title('Power Spectrum (Welch Method)');
xlabel('Frequency (Hz)');
ylabel('Power Spectrum');

Comparing the spectrum in Figure 3.11 with that of Figure 3.10 shows that the background noise is considerably smoother and reduced. The sine wave at 250 Hz is clearly seen, but the peak is now slightly broader, indicating a loss in frequency resolution.

Window Functions
MATLAB has a number of data windows available, including those described in Eqs. (11)–(13). The relevant MATLAB routine generates an n-point vector array containing the appropriate window shape. All have the same form:

w = window_name(N);   % Generate vector w of length N
                      % containing the window function
                      % of the associated name

where N is the number of points in the output vector and window_name is the name, or an abbreviation of the name, of the desired window. At this writing, thirteen different windows are available in addition to rectangular (rectwin), which is included for completeness. Using help window will provide a list of window names. A few of the more popular windows are: bartlett, blackman, gausswin, hamming (a common MATLAB default window), hann, kaiser, and triang. A few of the routines have additional optional arguments. In particular, chebwin (the Chebyshev window), which features a nondecaying, constant level of sidelobes, has a second argument to specify the sidelobe amplitude. Of course, the smaller this level is set, the wider the mainlobe and the poorer the frequency resolution. Details for any given window can be found through the help command. In addition to the individual functions, all of the window functions can be constructed with one call:

w = window(@name,N,opt)   % Get N-point window ‘name’
• 241. where name is the name of the specific window function (preceded by @), N the number of points desired, and opt possible optional argument(s) required by some specific windows. To apply a window to the Fourier series analysis such as in Example 2.1, simply multiply the digitized waveform point-by-point by the output of the MATLAB window routine before calling the FFT routine. For example:

w = triang(N);   % Get N-point triangular window curve
x = x .* w';     % Multiply (point-by-point) data by window
X = fft(x);      % Calculate FFT

Note that in the example above it was necessary to transpose the window function w so that it was in the same format as the data: the window function produces a column vector. Figure 3.12 shows two spectra obtained from a data set consisting of two sine waves closely spaced in frequency (235 Hz and 250 Hz) with added white noise in a 256-point array sampled at 1 kHz. Both spectra used the Welch method with the same parameters except for the windowing.

FIGURE 3.12 Two spectra computed for a waveform consisting of two closely spaced sine waves (235 and 250 Hz) in noise (SNR = −10 db). Welch’s method was used for both spectra with the same parameters (nfft = 128, overlap = 64) except for the window functions.
• 242. (The window function can be embedded in the pwelch calling structure.) The upper spectrum was obtained using a Hamming window (hamming), which has large sidelobes but a fairly narrow mainlobe, while the lower spectrum used a Chebyshev window (chebwin), which has small sidelobes but a larger mainlobe. A small difference is seen in the ability to resolve the two peaks. The Hamming window, with its smaller mainlobe, gives rise to a spectrum that shows two peaks, while the presence of two peaks might be missed in the Chebyshev windowed spectrum.

PROBLEMS
1. (A) Construct two arrays of white noise: one 128 points in length and the other 1024 points in length. Take the FT of both. Does increasing the length improve the spectral estimate of white noise? (B) Apply the Welch method to the longer noise array using a Hanning window with an nfft of 128 and no overlap. Does this approach improve the spectral estimate? Now change the overlap to 64 and note any changes in the spectrum. Submit all frequency plots appropriately labeled.
2. Find the power spectrum of the filtered noise data from Problem 3 in Chapter 2 using the standard FFT. Show frequency plots appropriately labeled. Scale, or rescale, the frequency axis to adequately show the frequency response of this filter.
3. Find the power spectrum of the filtered noise data in Problem 2 above using the FFT, but zero pad the data so that N = 2048. Note the visual improvement in resolution.
4. Repeat Problem 2 above using the data from Problem 6 in Chapter 2, applying the Hamming window to the data before calculating the FFT.
5. Repeat Problem 4 above using the Welch method with 256- and 64-point segment lengths and the window of your choice.
6. Repeat Problem 4 above using the data from Problem 7, Chapter 2.
7. Use routine sig_noise to generate a 256-point array that contains two closely spaced sinusoids at 140 and 180 Hz, both with an SNR of -10 db. (Calling structure: data = sig_noise([140 180], [-10 -10], 256);) Sig_noise assumes a sampling rate of 1 kHz. Use the Welch method. Find the spectrum of the waveform for segment lengths of 256 (no overlap) and 64 points with 0%, 50% and 99% overlap.
  • 243. FFT of the autocorrelation function. Compare this power spectrum with the one obtained using the direct method. Plot the two spectra side-by-side. 9. Using the data of Problem 7 above, find the power spectrum applying the Welch method with 64-point segments, and no overlap. Using the Chebyshev (chebwin), Kaiser (kaiser), and Gauss (gausswin) windows, find the best window in terms of frequency separation. Submit frequency plots obtained using the best and worse windows (in terms of separation). For the Chebyshev win- dow use a ripple of 40 db, and for the Kaiser window use a beta of 0 (minimum mainlobe option). 10. Use routine sig_noise to generate a 512-point array containing one sinus- oid at 150 Hz and white noise; SNR = −15db. Generate the power spectrum as the square of the magnitude obtained using the Fourier transform. Put the signal generator commands and spectral analysis commands in a loop and calculate the spectrum five times plotting the five spectra superimposed. Repeat using the Welch method and data segment length of 128 and a 90% overlap. Copyright 2004 by Marcel Dekker, Inc. All Rights Reserved.
• 244. 4 Digital Filters

Filters are closely related to spectral analysis, since the goal of filtering is to reshape the spectrum to one’s advantage. Most noise is broadband (the broadest-band noise being white noise, with a flat spectrum) and most signals are narrowband; hence, filters that appropriately reshape a waveform’s spectrum will almost always provide some improvement in SNR. As a general concept, a basic filter can be viewed as a linear process in which the input signal’s spectrum is reshaped in some well-defined (and, one hopes, beneficial) manner. Filters differ in the way they achieve this spectral reshaping, and can be classified into two groups based on their approach. These two groups are termed finite impulse response (FIR) filters and infinite impulse response (IIR) filters, although this terminology is based on characteristics that are secondary to the actual methodology. We will describe these two approaches separately, clarifying the major differences between them. As in preceding chapters, these descriptions will be followed by a presentation of the MATLAB implementation.

THE Z-TRANSFORM
The frequency-based analysis introduced in the last chapter is a most useful tool for analyzing systems or responses in which the waveforms are periodic or aperiodic, but it cannot be applied to transient responses of infinite length, such as step functions, or to systems with nonzero initial conditions. These shortcomings motivated the development of the Laplace transform in the analog domain. Laplace analysis uses the complex variable s (s = σ + jω) as a representation of complex frequency in place of jω in the Fourier transform.
• 245. The Z-transform is a digital operation analogous to the Laplace transform in the analog domain, and it is used in a similar manner. The Z-transform is based on the complex variable z, where z is an arbitrary complex number, |z| e^(jω). This variable is also termed the complex frequency, and as with its time domain counterpart, the Laplace variable s, it is possible to substitute e^(jω) for z to perform a strictly sinusoidal analysis.* The Z-transform follows the format of the general transform equation (Eq. (7)) and is also similar to the Fourier transform equation (Eq. (6)):

X(z) ≜ Z[x(n)] = Σ(n=−∞ to ∞) x(n) z^(−n)    (1)

where z is an arbitrary complex variable. Note that the probing function for this transform is simply z^(−n). In any real application, the limit of the summation will be finite, usually the length of x(n). When identified with a data sequence, such as x(n) above, z^(−n) represents an interval shift of n samples, or an associated time shift of nTs seconds. Note that Eq. (1) indicates that every data sample in the sequence x(n) is associated with a unique power of z, and this power of z defines the sample’s position in the sequence. This time shifting property of z^(−n) can be formally stated as:

Z[x(n − k)] = z^(−k) Z[x(n)]    (2)

For example, the time shifting characteristic of the Z-transform can be used to define a unit delay process, z^(−1). For such a process, the output is the same as the input, but shifted (or delayed) by one data sample (Figure 4.1).

FIGURE 4.1 A unit delay process shifts the input by one data sample. Other powers of z could be used to provide larger shifts.

*If |z| is set to 1, then z = e^(jω). This is called evaluating z on the unit circle. See Bruce (2001) for a thorough discussion of the properties of z and the Z-transform.
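The unit delay of Figure 4.1 can be demonstrated in two lines using the filter routine described later in this chapter (an illustration, not from the text): H(z) = z^(−1) corresponds to numerator coefficients b = [0 1] and denominator a = 1.

% Sketch: the unit delay process of Figure 4.1
x = [1 2 3 4 5];
y = filter([0 1], 1, x)   % Returns [0 1 2 3 4]: x shifted by one sample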
• 246. Digital Transfer Function
As in Laplace transform analysis, one of the most useful applications of the Z-transform lies in its ability to define the digital equivalent of a transfer function. By analogy to linear system analysis, the digital transfer function is defined as:

H(z) = Y(z)/X(z)    (3)

For the simple example of Figure 4.1, the digital transfer function would be H(z) = z^(−1). Of course, most transfer functions will be more complicated, including polynomials of z in both the numerator and denominator, just as analog transfer functions contain polynomials of s:

H(z) = (b0 + b1 z^(−1) + b2 z^(−2) + . . . + bN z^(−N)) / (1 + a1 z^(−1) + a2 z^(−2) + . . . + aD z^(−D))    (4)

While H(z) has a structure similar to the Laplace domain transfer function H(s), there is no simple relationship between them. For example, unlike analog systems, the order of the numerator, N, need not be less than, or equal to, the order of the denominator, D, for stability. In fact, systems that have a denominator order of 1 are more stable than those having higher order denominators. From the digital transfer function, H(z), it is possible to determine the output given any input. In the Z-transform domain this relationship is simply:

Y(z) = H(z) X(z) = X(z) [Σ(k=0 to N−1) b(k) z^(−k)] / [Σ(ℓ=0 to D−1) a(ℓ) z^(−ℓ)]    (5)

The input–output or difference equation analogous to the time domain equation can be obtained from Eq. (5) by applying the time shift interpretation to the terms z^(−k) and z^(−ℓ):

y(n) = Σ(k=0 to K) b(k) x(n − k) − Σ(ℓ=1 to L) a(ℓ) y(n − ℓ)    (6)

This equation assumes that a(0) = 1, as specified in Eq. (4). We will find that Eq. (6) is similar to the equation representing other linear processes, such as the ARMA model in Chapter 5 (Eq. (3), Chapter 5). This is appropriate, as the ARMA model is a linear digital process containing both denominator terms and numerator terms.* All basic digital filters can be interpreted as linear digital processes, and, in fact, the term digital filter is often used interchangeably with digital systems (Stearns and David, 1996).

*Borrowing from analog terminology, the term poles is sometimes used for denominator coefficients and zeros for numerator coefficients.
• 247. Filter design, then, is simply the determination of the appropriate filter coefficients, a(n) and b(n), that provide the desired spectral shaping. This design process can be aided by MATLAB routines that can generate the a(n) and b(n) coefficients of Eq. (6) given a desired frequency response. If the frequency spectrum of H(z) is desired, it can be obtained from a modification of Eq. (5) by substituting z = e^(jω):

H(m) = Y(m)/X(m) = [Σ(n=0 to N−1) b(n) e^(−j2πmn/N)] / [Σ(n=0 to D−1) a(n) e^(−j2πmn/N)] = fft(bn)./fft(an)    (7)

where fft indicates the Fourier transform. As with all Fourier transforms, frequency can be obtained from the variable m by multiplying by fs/N or 1/(NTs).

MATLAB Implementation
Many MATLAB functions used in filter design and application can also be used in digital transfer function analysis. The MATLAB routine filter described below uses Eq. (6) to implement a digital filter, but it can also be used to implement a linear process given the Z-transform transfer function (see Example 4.1). With regard to implementation, note that if the a(ℓ) coefficients in Eq. (6) are zero (with the exception of a(0) = 1), Eq. (6) reduces to convolution (see Eq. (15) in Chapter 2). The function filter determines the output, y(n), to an input, x(n), for a linear system with a digital transfer function as specified by the a and b coefficients. Essentially this function implements Eq. (6). The calling structure is:

y = filter(b,a,x)

where x is the input, y the output, and b and a are the coefficients of the transfer function in Eq. (4).

Example 4.1 Find and plot the frequency spectrum and the impulse response of a digital linear process having the digital transfer function:

H(z) = (0.2 + 0.5z^(−1)) / (1 − 0.2z^(−1) + 0.8z^(−2))

Solution: Find H(z) using MATLAB’s fft. Then construct an impulse function and determine the output using the MATLAB filter routine.

% Example 4.1 and Figures 4.2 and 4.3
% Plot the frequency characteristics and impulse response
• 248. FIGURE 4.2 Plot of the frequency characteristic (magnitude and phase) of the digital transfer function given above.

% of a linear digital system with the given digital
% transfer function
% Assume a sampling frequency of 1 kHz
%
close all; clear all;
fs = 1000;        % Sampling frequency
N = 512;          % Number of points
% Define a and b coefficients based on H(z)
a = [1 -.2 .8];   % Denominator of transfer function
b = [.2 .5];      % Numerator of transfer function
%
% Plot the frequency characteristic of H(z) using the fft
H = fft(b,N)./fft(a,N);   % Compute H(f)
• 249. FIGURE 4.3 Impulse response of the digital transfer function described above. Both Figure 4.2 and Figure 4.3 were generated using the MATLAB code given in Example 4.1.

Hm = 20*log10(abs(H));        % Get magnitude in db
Theta = angle(H)*360/(2*pi);  % and phase in deg.
f = (1:N/2)*fs/N;             % Frequency vector for plotting
%
subplot(2,1,1);
plot(f,Hm(1:N/2),'k');        % Plot and label mag H(f)
xlabel('Frequency (Hz)'); ylabel('|H(z)| (db)');
grid on;                      % Plot using grid lines
subplot(2,1,2);
plot(f,Theta(1:N/2),'k');     % Plot the phase
xlabel('Frequency (Hz)'); ylabel('Phase (deg)');
grid on;
%
% Compute the impulse response
x = [1, zeros(1,N-1)];        % Generate an impulse function
y = filter(b,a,x);            % Apply b and a to impulse using Eq. (6)
• 250. figure;                % New figure
t = (1:N)/fs;
plot(t(1:60),y(1:60),'k');    % Plot only the first 60 points for clarity
xlabel('Time (sec)');
ylabel('Impulse Response');

The digital filters described in the rest of this chapter use a straightforward application of these linear system concepts. The design and implementation of digital filters is merely a question of determining the a(n) and b(n) coefficients that produce linear processes with the desired frequency characteristics.

FINITE IMPULSE RESPONSE (FIR) FILTERS
FIR filters have transfer functions that have only numerator coefficients, i.e., H(z) = B(z). This leads to an impulse response that is finite, hence the name. They have the advantage of always being stable and of having linear phase shifts. In addition, they have initial transients of finite duration, and their extension to 2-dimensional applications is straightforward. The downside of FIR filters is that they are less efficient in terms of computer time and memory than IIR filters. FIR filters are also referred to as nonrecursive, because only the input (not the output) is used in the filter algorithm (i.e., only the first term of Eq. (6) is used). A simple FIR filter was developed in the context of Problem 3 in Chapter 2. This filter was achieved by taking three consecutive points in the input array and averaging them together. The filter output was constructed by moving this three-point average along the input waveform. For this reason, FIR filtering has also been referred to as a moving average process. (This term is used for any process that uses a moving set of multiplier weights, even if the operation does not really produce an average.) In Problem 4 of Chapter 2, this filter was implemented using a three-weight filter, [1/3 1/3 1/3], which was convolved with the input waveform to produce the filtered output. These three numbers are simply the b(n) coefficients of a third-order, or three-weight, FIR filter. All FIR filters are similar to this filter; the only difference between them is the number and value of the coefficients. The general equation for an FIR filter is a simplification of Eq. (6) and, after changing the limits to conform with MATLAB notation, becomes:

y(k) = Σ(n=1 to L) b(n) x(k − n)    (8)

where b(n) is the coefficient function (also referred to as the weighting function) of length L, x(n) is the input, and y(n) is the output. This is identical to the convolution equation in Chapter 2 (Eq. (15)) with the impulse response, h(n),
• 251. replaced by the filter coefficients, b(n). Hence, FIR filters can be implemented using either convolution or MATLAB’s filter routine. Eq. (8) indicates that the filter coefficients (or weights) of an FIR filter are the same as the impulse response of the filter. Since the frequency response of a process having an impulse response h(n) is simply the Fourier transform of h(n), the frequency response of an FIR filter having coefficients b(n) is just the Fourier transform of b(n):

X(m) = Σ(n=0 to N−1) b(n) e^(−j2πmn/N)    (9)

Eq. (9) is a special case of Eq. (5) when the denominator equals one. If b(n) consists of a small number of elements, this equation can sometimes be determined manually as well as by computer. The inverse operation, going from a desired frequency response to the coefficient function, b(n), is known as filter design. Since the frequency response is the Fourier transform of the filter coefficients, the coefficients can be found from the inverse Fourier transform of the desired frequency response. This design strategy is illustrated below in the design of an FIR lowpass filter based on the spectrum of an ideal filter. This filter is referred to as a rectangular window filter,* since its spectrum is ideally a rectangular window.

FIR Filter Design
The ideal lowpass filter was first introduced in Chapter 1 as a rectangular window in the frequency domain (Figure 1.7). The inverse Fourier transform of a rectangular window function is given in Eq. (25) in Chapter 2 and is repeated here with a minor variable change:

b(n) = sin[2πfc Ts (n − L/2)] / [π(n − L/2)]    (10)

where fc is the cutoff frequency, Ts is the sample interval in seconds, and L is the length of the filter. The argument, n − L/2, is used to make the coefficient function symmetrical, giving the filter linear phase characteristics. Linear phase characteristics are a desirable feature not easily attainable with IIR filters. The coefficient function, b(n), produced by Eq. (10), is shown for two values of fc in Figure 4.4. Again, this function is the same as the impulse response.

*This filter is sometimes called a window filter, but the term rectangular window filter will be used in this text so as not to confuse the filter with a window function as described in the last chapter. This can be particularly confusing since, as we show later, rectangular window filters also use window functions!
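The design strategy just described is easy to carry out. The sketch below is an illustration with assumed values (fc = 100 Hz, fs = 1 kHz, L = 65 coefficients), not the code used for the figures. The center index is written as (L − 1)/2 rather than L/2, so that it falls on an integer sample, and the 0/0 point at the center is filled with its limiting value, 2fcTs.

% Sketch: lowpass rectangular window filter coefficients from Eq. (10)
fs = 1000; Ts = 1/fs;
fc = 100;                        % Assumed cutoff frequency (Hz)
L = 65;                          % Number of coefficients
m = (0:L-1) - (L-1)/2;           % Symmetrical argument, centered on zero
b = sin(2*pi*fc*Ts*m)./(pi*m);   % Eq. (10)
b(m == 0) = 2*fc*Ts;             % Limit of the sin(x)/x form at the center
H = abs(fft(b,1024));            % Frequency response of the filter
plot((0:511)*fs/1024, 20*log10(H(1:512)), 'k');
xlabel('Frequency (Hz)'); ylabel('|H(f)| (db)');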
• 252. FIGURE 4.4 Symmetrical weighting function of a rectangular window filter (Eq. (10)) truncated at 64 coefficients. The cutoff frequencies are given relative to the sampling frequency, fs, as is often done in discussing digital filter frequencies. Left: Lowpass filter with a cutoff frequency of 0.1fs/2 Hz. Right: Lowpass cutoff frequency of 0.4fs/2 Hz.

Unfortunately, this coefficient function must be infinitely long to produce the filter characteristics of an ideal filter; truncating it will result in a lowpass filter that is less than ideal. Figure 4.5 shows the frequency response, obtained by taking the Fourier transform of the coefficients, for two different lengths. This filter also shows a couple of artifacts associated with finite length: an oscillation in the frequency curve, which increases in frequency when the coefficient function is longer, and a peak in the passband, which becomes narrower and higher when the coefficient function is lengthened. Since the artifacts seen in Figure 4.5 are due to truncation of an (ideally) infinite function, we might expect that some of the window functions described in Chapter 3 would help. In discussing window frequency characteristics in Chapter 3, we noted that it is desirable to have a narrow mainlobe and rapidly diminishing sidelobes, and that the various window functions were designed to make different compromises between these two features. When applied to an FIR weight function, the width of the mainlobe will influence the sharpness of the transition band, and the sidelobe energy will influence the oscillations seen in Figure 4.5.
• 253. FIGURE 4.5 Frequency characteristics of an FIR filter based on a weighting function derived from Eq. (10). The weighting functions were abruptly truncated at 17 and 65 coefficients. The artifacts associated with this truncation are clearly seen. The lowpass cutoff frequency is 100 Hz.

Figure 4.6 shows the frequency characteristics that are produced by the same coefficient function used in Figure 4.4, except that a Hamming window has been applied to the filter weights. The artifacts are considerably diminished by the Hamming window: the overshoot in the passband has disappeared and the oscillations are barely visible in the plot. As with the unwindowed filter, there is a significant improvement in the sharpness of the transition band for the filter when more coefficients are used. The FIR filter coefficients for highpass, bandpass, and bandstop filters can be derived in the same manner from equations generated by applying an inverse FT to rectangular structures having the appropriate associated shape. These equations have the same general form as Eq. (10), except that they include additional terms:

b(n) = sin[π(n − L/2)] / [π(n − L/2)] − sin[2πfc Ts (n − L/2)] / [π(n − L/2)]    Highpass    (11)

b(n) = sin[2πfH Ts (n − L/2)] / [π(n − L/2)] − sin[2πfL Ts (n − L/2)] / [π(n − L/2)]    Bandpass    (12)
• 254. FIGURE 4.6 Frequency characteristics produced by an FIR filter identical to the one used in Figure 4.5, except that a Hamming function has been applied to the filter coefficients. (See Example 1 for the MATLAB code.)

b(n) = sin[2πfL Ts (n − L/2)] / [π(n − L/2)] + sin[π(n − L/2)] / [π(n − L/2)] − sin[2πfH Ts (n − L/2)] / [π(n − L/2)]    Bandstop    (13)

An FIR bandpass filter designed using Eq. (12) is shown in Figure 4.7 for two different truncation lengths. Implementation of other FIR filter types is part of the problem set at the end of this chapter. A variety of FIR filters exist that use strategies other than the rectangular window to construct the filter coefficients, and some of these are explored in the section on MATLAB implementation. One FIR filter of particular interest is the filter used to construct the derivative of a waveform, since the derivative is often of interest in the analysis of biosignals. The next section explores a popular filter for this operation.

Derivative Operation: The Two-Point Central Difference Algorithm
The derivative is a common operation in signal processing and is particularly useful in analyzing certain physiological signals.
• 255. FIGURE 4.7 Frequency characteristics of an FIR bandpass filter with a coefficient function described by Eq. (12) in conjunction with the Blackman window function. The low and high cutoff frequencies were 50 and 150 Hz. The filter function was truncated at 33 and 129 coefficients. These figures were generated with code similar to that in Example 4.2 below, except modified according to Eq. (12).

Digital differentiation is defined as Δx/Δt and can be implemented by taking the difference between two adjacent points, scaling by 1/Ts, and repeating this operation along the entire waveform. In the context of the FIR filters described above, this is equivalent to a two-coefficient filter, [−1, +1]/Ts, and this is the approach taken by MATLAB’s derv routine. The frequency characteristic of the derivative operation is a linear increase with frequency, Figure 4.8 (dashed line), so there is considerable gain at the higher frequencies. Since the higher frequencies frequently contain a greater percentage of noise, this operation tends to produce a noisy derivative curve. Figure 4.9A shows a noisy physiological motor response (vergence eye movements) and the derivative obtained using the derv function. Figure 4.9B shows the same response and derivative when the derivative was calculated using the two-point central difference algorithm. This algorithm acts as a differentiator for the lower frequencies and as an integrator (or lowpass filter) for higher frequencies.
• 256. FIGURE 4.8 The frequency response of the two-point central difference algorithm using two different values for the skip factor: (A) L = 1; (B) L = 4. The sample time was 1 msec.

The two-point central difference algorithm uses two coefficients of equal but opposite value spaced L points apart, as defined by the input–output equation:

y(n) = [x(n + L) − x(n − L)] / (2LTs)    (14)

where L is the skip factor that influences the effective bandwidth as described below, and Ts is the sample interval. The filter coefficients for the two-point central difference algorithm would be:
• 257. FIGURE 4.9 A physiological motor response to a step input is shown in the upper trace and its derivative is shown in the lower trace. (A) The derivative was calculated by taking the difference in adjacent points and scaling by the sample frequency. (B) The derivative was computed using the two-point central difference algorithm with a skip factor of 4. Derivative functions were scaled by 1/2 and responses were offset to improve viewing.

h(n) = −1/(2LTs)   for n = −L
h(n) = +1/(2LTs)   for n = +L
h(n) = 0           otherwise    (15)

The frequency response of this filter algorithm can be determined by taking the Fourier transform of the filter coefficient function. Since this function contains only two nonzero coefficients, the Fourier transform can be done either analytically or using MATLAB’s fft routine. Both methods are presented in the example below.
• 258. Example 4.2 Determine the frequency response of the two-point central difference algorithm.

Analytical: Since the coefficient function is nonzero only for n = ±L, the Fourier transform, after adjusting the summation limits for a symmetrical coefficient function with positive and negative n, becomes:

X(k) = Σ(n=−L to L) b(n) e^(−j2πkn/N) = [1/(2LTs)] e^(−j2πkL/N) − [1/(2LTs)] e^(j2πkL/N)

X(k) = [e^(−j2πkL/N) − e^(j2πkL/N)] / (2LTs) = −j sin(2πkL/N) / (LTs)    (16)

where L is the skip factor and N is the number of samples in the waveform. To put Eq. (16) in terms of frequency, note that f = m/(NTs); hence, m = fNTs. Substituting:

|X(f)| = |−j sin(2πfLTs) / (LTs)| = |sin(2πfLTs)| / (LTs)    (17)

Eq. (17) shows that |X(f)| follows a sine function, which first returns to zero at f = 1/(2LTs). Figure 4.8 shows the frequency characteristics of the two-point central difference algorithm for two different skip factors, and the MATLAB code used to calculate and plot the frequency curves is shown in Example 4.2. A true derivative would have a linear change with frequency (specifically, a line with a slope of 2πf), as shown by the dashed lines in Figure 4.8. The two-point central difference curves approximate a true derivative for the lower frequencies, but have the characteristic of a lowpass filter for higher frequencies. Increasing the skip factor, L, has the effect of lowering the frequency range over which the filter acts like a derivative operator, as well as the lowpass filter range. Note that for skip factors > 1, the response curve repeats above f = 1/(LTs). Usually the assumption is made that the signal does not contain frequencies in this range. If this is not true, then these higher frequencies could be altered by the frequency characteristics of this filter above 1/(LTs).

MATLAB Implementation
Since the FIR coefficient function is the same as the impulse response of the filter process, design and application of these filters can be achieved using only the FFT and convolution. However, the MATLAB Signal Processing Toolbox has a number of useful FIR filter design routines that greatly facilitate the design of FIR filters, particularly if the desired frequency response is complicated.
• 259. The following two examples show the application of FIR filters using only convolution and the FFT, followed by a discussion and examples of FIR filter design using MATLAB’s Signal Processing Toolbox.

Example 4.2 Generate the coefficient function for the two-point central difference derivative algorithm and plot the frequency response. This program was used to generate the plots in Figure 4.8.

% Example 4.2 and Figure 4.8
% Program to determine the frequency response
% of the two-point central difference algorithm for
% differentiation
%
clear all, close all;
Ts = .001;            % Assume a Ts of 1 msec.
N = 1000;             % Assume 1 sec of data; N = 1000
Ln = [1 4];           % Define two different skip factors
for i = 1:2           % Repeat for each skip factor
   L = Ln(i);
   bn = zeros((2*L)+1,1);        % Set up b(n); initialize to zero
   bn(1,1) = -1/(2*L*Ts);        % Put negative coefficient at b(1)
   bn((2*L)+1,1) = 1/(2*L*Ts);   % Put positive coefficient at b(2L+1)
   H = abs(fft(bn,N));           % Calc. frequency response using FFT
   subplot(1,2,i);               % Plot the result
   hold on;
   plot(H(1:500),'k');           % Plot to fs/2
   axis([0 500 0 max(H)+.2*max(H)]);
   text(100,max(H),['Skip Factor = ',num2str(L)]);
   xlabel('Frequency (Hz)');
   ylabel('H(f)');
   y = (1:500)*2*pi;             % Plot ideal derivative function
   plot(y,'--k');
end

Note that the second to fourth lines of the for loop are used to build the filter coefficients, b(n), for the given skip factor, L. The next line takes the absolute value of the Fourier transform of this function. The coefficient function is zero-padded out to 1000 points, both to improve the appearance of the resulting frequency response curve and to simulate the application of the filter to a 1000-point data array sampled at 1 kHz.
• 260. Example 4.3 Develop and apply an FIR bandpass filter to physiological data. This example presents the construction and application of a narrowband filter such as shown in Figure 4.10 (right side) using convolution. The data are from a segment of an EEG signal in the PhysioNet data bank (http://www.physionet.org). A spectrum analysis is first performed on both the data and the filter to show the range of the filter’s operation with respect to the frequency spectrum of the data. The standard FFT is used to analyze the data without windowing or averaging. As shown in Figure 4.10, the bandpass filter transmits most of the signal’s energy, attenuating only a portion of the low frequency and

FIGURE 4.10 Frequency spectrum of the EEG data shown in Figure 4.11, obtained using the FFT. Also shown is the frequency response of an FIR bandpass filter constructed using Eq. (12). The MATLAB code that generated this figure is presented in Example 4.3.