Developing Image Processing System for Classification of
Indian Multispectral Satellite Images (Medium Resolution)
using Java
B.Tech
Winter Training Report
Submitted by
Sumedha Mishra
Electronics and Communication Engineering
National Institute of Technology, Srinagar
Under the guidance of
Dr. S. N. Omkar
Department of Aerospace Engineering,
Indian Institute of Science, Bangalore
Candidate’s Declaration
I hereby declare that the work presented in this project titled “Developing
Image Processing System for Classification of Indian Multispectral Satellite
Images (Medium Resolution) using Java”, submitted towards completion of the
summer project after the 5th semester of B.Tech at the Indian Institute of
Science (IISc), Bangalore, is an authentic record of my original work pursued
under the guidance of Dr. S. N. Omkar, Principal Research Scientist, Indian
Institute of Science, Bangalore.
I have not submitted the matter embodied in this project for the award of any
other degree elsewhere.
Sumedha Mishra
ECE/53/13
B.Tech
NIT, Srinagar
ACKNOWLEDGEMENT
First and foremost, I would like to express my sincere gratitude to my project
guide, Dr. S. N. Omkar. I was privileged to experience a sustained,
enthusiastic and involved interest from his side. His invaluable and untiring
support at crucial junctures helped materialize this project. His supreme
belief in our abilities to conquer new frontiers kept me motivated.
I am also indebted to the PhD scholars for sparing their valuable time to help
me with much-needed inputs during this work.
This project work was aided by innumerable references on this topic, all of
which have been duly acknowledged in the reference section.
Last but not the least, I would like to thank the IISc Bangalore staff members
and the institute, in general, for extending a helping hand at every juncture
of need.
Sumedha Mishra
ECE/53/13
NIT, Srinagar
ABSTRACT
Multispectral remote sensing is the collection and analysis of emitted,
reflected and back-scattered energy from an area of interest in multiple bands
of the electromagnetic spectrum. The main advantage of multispectral imaging
is the potential to classify the image using multispectral classification.
This report is based on the study of different unsupervised classification
algorithms and the development of a public-domain Java image processing
program called ImageJ, by adding different classification plugins implemented
in Java. This developed version is further used for the pixel-based
classification of very high resolution images (resolution: 1 m to 2.5 m) from
four different sensors, namely QuickBird, CARTOSAT, WorldView-3 and IKONOS.
Keywords: ImageJ, Plugin, Pixel-based classification, QuickBird, CARTOSAT,
WorldView-3, IKONOS.
Introduction:
Image processing is a procedure to convert an image into its digital form and
carry out some operations on it in order to get an improved image and to
extract useful information from it. ImageJ is a public-domain Java image
processing and analysis program inspired by NIH Image for the Macintosh. It
was designed with an open architecture that provides extensibility via Java
plugins. Custom acquisition, analysis and processing plugins can be developed
using ImageJ’s built-in editor and Java compiler. User-written plugins make it
possible to solve almost any image processing or analysis problem.
In this report, we present a way to develop ImageJ in Eclipse by providing
various algorithms for registration and classification and implementing them
in ImageJ as plugins. This developed version is further used to process
multispectral satellite images from the sensors mentioned above.
Introduction to Image Processing:
Image processing is a method to perform some operations on an image, in
order to get an enhanced image or to extract some useful information from it.
It is a type of signal processing in which the input is an image and the
output may be an image or characteristics/features associated with that image.
Nowadays, image processing is among the most rapidly growing technologies, and
it forms a core research area within engineering and computer science
disciplines.
Image processing basically includes the following three steps:
1. Importing the image via image acquisition tools;
2. Analyzing and manipulating the image;
3. Producing output, which can be an altered image or a report based on the
image analysis.
There are two types of methods used for image processing, namely analogue
and digital image processing.
Purpose of Image processing
The purpose of image processing is divided into 5 groups. They are:
1. Visualization - Observe the objects that are not visible.
2. Image sharpening and restoration - To create a better image.
3. Image retrieval - Seek for the image of interest.
4. Measurement of pattern – Measures various objects in an image.
5. Image Recognition – Distinguish the objects in an image.
Types:
The two types of methods used for image processing are analog and digital
image processing. Analog or visual techniques of image processing can be used
for hard copies like printouts and photographs. Image analysts use various
fundamentals of interpretation while using these visual techniques. The
analysis is not confined to the area being studied; it also draws on the
knowledge of the analyst. Association is another important tool in image
processing through visual techniques, so analysts apply a combination of
personal knowledge and collateral data to image processing.
Digital processing techniques help in the manipulation of digital images by
using computers. Raw data from imaging sensors on satellite platforms
contains deficiencies; to overcome such flaws and recover the original
information, the data has to undergo various phases of processing. The three
general phases that all types of data go through when using the digital
technique are pre-processing, enhancement and display, and information
extraction.
Applications
1. Intelligent Transportation Systems – This technique can be used in
automatic number-plate recognition and traffic-sign recognition.
2. Remote Sensing – For this application, sensors capture pictures of the
earth’s surface from remote sensing satellites or from a multispectral
scanner mounted on an aircraft. These pictures are transmitted to an Earth
station, where they are processed. Techniques used to interpret the objects
and regions are applied in flood control, city planning, resource
mobilization, monitoring of agricultural production, etc.
3. Moving object tracking – This application makes it possible to measure
motion parameters and acquire a visual record of the moving object. The
different approaches to tracking an object are:
· Motion-based tracking
· Recognition-based tracking
4. Defense surveillance – Aerial surveillance methods are used to
continuously keep an eye on the land and oceans. This application is also used
to locate the types and formations of naval vessels on the ocean surface. The
important task is to segment the various objects present in the water-body
part of the image. Parameters such as length, breadth, area, perimeter and
compactness are set up to classify each of the segmented objects. It is
important to recognize the distribution of these objects in the different
directions, that is, east, west, north, south, northeast, northwest, southeast
and southwest, to describe all possible formations of the vessels. We can
interpret the entire oceanic scenario from the spatial distribution of these
objects.
5. Biomedical imaging techniques – For medical diagnosis, different types of
imaging tools such as X-ray, ultrasound and computed tomography (CT) are
used.
Current Research
Extensive research is being carried out in image processing techniques.
1. Cancer Imaging – Different tools such as PET, MRI and computer-aided
detection help to diagnose and monitor tumors.
2. Brain Imaging – Focuses on the normal and abnormal development of the
brain, brain ageing and common disease states.
3. Image processing – This research incorporates structural and functional
MRI in neurology, analysis of bone shape and structure, development of
functional imaging tools in oncology, and PET image processing software
development.
4. Imaging Technology – Developments in imaging technology have created the
need to establish whether new technologies are effective and cost-beneficial.
This work covers the following areas:
· Magnetic resonance imaging of the knee
· Computer-aided detection in mammography
· Endoscopic ultrasound in staging oesophageal cancer
· Magnetic resonance imaging in low back pain
5. Ophthalmic Imaging – This works under two categories:
· Development of automated software – analyzes retinal images to show early
signs of diabetic retinopathy
· Development of instrumentation – concentrates on the development of the
scanning laser ophthalmoscope
Future
We are all in the midst of a revolution ignited by rapid developments in
computer technology and imaging. Contrary to common belief, computers cannot
yet match humans in many calculations related to image processing and
analysis. But with the increasing sophistication and power of modern
computing, computation will go beyond the conventional von Neumann sequential
architecture and may include optical execution as well. Parallel and
distributed computing paradigms are anticipated to improve response times for
image processing tasks.
Multispectral Images:
A multispectral image consists of several bands of data. For visual display,
each band of the image may be displayed one band at a time as a grey-scale
image, or in combinations of three bands at a time as a color composite
image. Interpretation of a multispectral color composite image requires
knowledge of the spectral reflectance signatures of the targets in the scene.
In this case, the spectral information content of the image is utilized in
the interpretation. In displaying a color composite image, three primary
colors (red, green and blue) are used. When these three colors are combined
in various proportions, they produce different colors in the visible
spectrum. Associating each spectral band (not necessarily a visible band)
with a separate primary color results in a color composite image. If a
multispectral image consists of the three visual primary color bands (red,
green, blue), the three bands may be combined to produce a "true color"
image. For example, bands 3 (red), 2 (green) and 1 (blue) of a LANDSAT TM
image or an IKONOS multispectral image can be assigned respectively to the R,
G and B colors for display. In this way, the colors of the resulting color
composite image closely resemble what would be observed by the human eye. For
optical images lacking one or more of the three visual primary color bands
(i.e. red, green and blue), the spectral bands (some of which may not be in
the visible region) may be combined in such a way that the appearance of the
displayed image resembles a visible color photograph, i.e. vegetation in
green, water in blue, soil in brown or grey, etc. Many people refer to this
composite as a "true color" composite. However, this term is misleading,
since in many instances the colors are only simulated to look similar to the
"true" colors of the targets. The term "natural color" is preferred.
Multispectral and hyperspectral imagery gives us the power to see as humans
do (red, green and blue), as goldfish do (infrared), as bumblebees do
(ultraviolet) and more. This comes in the form of EM radiation reflected to
the sensor.
The main difference between multispectral and hyperspectral imagery is the
number of bands and how narrow the bands are.
Multispectral imagery generally refers to 3 to 10 bands that are represented
in pixels. Each band is acquired using a remote sensing radiometer.
Hyperspectral imagery consists of much narrower bands (10-20 nm). A
hyperspectral image can have hundreds or even thousands of bands, acquired
using an imaging spectrometer.
Our objective here is pixel-based classification of satellite images using
the k-means, ISODATA and Fuzzy C Means algorithms on multispectral satellite
images through ImageJ’s developer version.
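The band-to-color assignment described above can be sketched in plain Java. This is an illustrative example (the class name and band arrays are my own, not part of ImageJ or any sensor API) that packs three 8-bit band values per pixel into the 32-bit RGB integers used by Java image classes.

```java
// Sketch: build a color composite by assigning one spectral band to each
// primary color and packing the values into 32-bit RGB pixels.
class ColorComposite {

    // redBand/greenBand/blueBand hold one 8-bit sample per pixel.
    static int[] compose(int[] redBand, int[] greenBand, int[] blueBand) {
        int[] rgb = new int[redBand.length];
        for (int i = 0; i < rgb.length; i++) {
            // pack as 0x00RRGGBB, the layout used by java.awt.image and ImageJ
            rgb[i] = (clamp(redBand[i]) << 16)
                   | (clamp(greenBand[i]) << 8)
                   |  clamp(blueBand[i]);
        }
        return rgb;
    }

    // keep each band value inside the displayable 0..255 range
    private static int clamp(int v) {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }
}
```

For a "natural color" composite from a sensor lacking a blue band, the same packing applies; only the choice of which spectral band feeds which primary color changes.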
Classification:
Classification of remotely sensed data is used to assign corresponding labels
to groups with homogeneous characteristics, with the aim of distinguishing
multiple objects from each other within the image. There are two types of
classification methods: supervised and unsupervised. The classifiers that we
discuss here are the k-means, ISODATA and Fuzzy C Means classifiers, which
come under the category of unsupervised classification techniques.
Supervised classification is based on the idea that a user can select sample
pixels in an image that are representative of specific classes and then
direct the image processing software to use these training sites as
references for the classification of all other pixels in the image. Training
sites (also known as testing sets or input classes) are selected based on the
knowledge of the user. The user also sets the bounds for how similar other
pixels must be to be grouped together. These bounds are often set based on
the spectral characteristics of the training area, plus or minus a certain
increment (often based on "brightness" or strength of reflection in specific
spectral bands). The user also designates the number of classes that the
image is classified into. Many analysts use a combination of supervised and
unsupervised classification processes to develop final output analyses and
classified maps.
Unsupervised classification is where the outcomes (groupings of pixels with
common characteristics) are based on the software analysis of an image
without the user providing sample classes. The computer uses techniques to
determine which pixels are related and groups them into classes. The user can
specify which algorithm the software will use and the desired number of
output classes, but otherwise does not aid in the classification process.
However, the user must have knowledge of the area being classified when the
groupings of pixels with common characteristics produced by the computer have
to be related to actual features on the ground (such as wetlands, developed
areas, coniferous forests, etc.).
K-Means Classification
The k-means clustering algorithm is a simple method for estimating the means
(vectors) of a set of k groups. It is a method of vector quantization that is
popular for cluster analysis in data mining. It aims to partition n
observations into k clusters in which each observation belongs to the cluster
with the nearest mean, which serves as a prototype of the cluster. The
algorithmic steps for k-means classification are as follows.
Let X = {x1, x2, x3, ..., xn} be the set of data points and V = {v1, v2, ...,
vc} be the set of centers.
1) Randomly select ‘k’ cluster centers.
2) Calculate the distance between each data point and the cluster centers.
3) Assign each data point to the cluster center whose distance from it is the
minimum over all the cluster centers.
4) Recalculate the new cluster centers.
5) Recalculate the distance between each data point and the newly obtained
cluster centers.
6) If no data point was reassigned, stop; otherwise repeat from step 3.
Given a set of observations (x1, x2, ..., xn), where each observation is a
d-dimensional real vector, k-means clustering aims to partition the n
observations into k (≤ n) sets S = {S1, S2, ..., Sk} so as to minimize the
within-cluster sum of squares (WCSS), the sum of the squared distances of
each point in a cluster to the cluster center. In other words, its objective
is to find:
arg min_S Σ_{i=1}^{k} Σ_{x ∈ S_i} ||x − μ_i||²
where μ_i is the mean of the points in S_i.
K-means uses the squared Euclidean distance to allocate objects to clusters.
There is an implicit assumption that the data should have roughly the same
scale for such distances to be used. Because the squared Euclidean distance
is used, the k-means algorithm tries to find a minimum of the error sum of
squares. The weak assumption is that each group has roughly the same WCSS,
i.e., the variance/covariance matrix between objects within a group is equal.
K-means is a nice method to quickly sort data into clusters; all we need to
know is the number of clusters required. The main drawback is the local
optima that can derail our results. To overcome this, we can run the process
many times with differing starting values. Solutions found by k-means are
typically locally optimal: they have found the peak of the objective in a
small part of the space.
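The six steps above can be sketched in Java for one-dimensional pixel intensities (grey levels). The class and method names are illustrative; a real multispectral implementation would use d-dimensional vectors and Euclidean distance instead of absolute difference.

```java
// Sketch of the k-means steps on 1-D pixel intensities.
class KMeans1D {

    // Returns the cluster label of every pixel after at most maxIter passes.
    static int[] cluster(double[] pixels, int k, int maxIter) {
        double[] centers = new double[k];
        for (int j = 0; j < k; j++)                 // step 1: pick initial centers
            centers[j] = pixels[j * pixels.length / k];

        int[] labels = new int[pixels.length];
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            for (int i = 0; i < pixels.length; i++) {   // steps 2-3: nearest center
                int best = 0;
                for (int j = 1; j < k; j++)
                    if (Math.abs(pixels[i] - centers[j])
                            < Math.abs(pixels[i] - centers[best]))
                        best = j;
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            double[] sum = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < pixels.length; i++) {
                sum[labels[i]] += pixels[i];
                count[labels[i]]++;
            }
            for (int j = 0; j < k; j++)             // steps 4-5: recompute centers
                if (count[j] > 0) centers[j] = sum[j] / count[j];
            if (!changed) break;                    // step 6: no reassignment
        }
        return labels;
    }
}
```

Because the result depends on the initial centers (the local-optima drawback noted above), the usual remedy is to run the procedure several times with different starting values and keep the lowest-WCSS result.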
Flowchart for k-means:
ISODATA Classification
ISODATA (Iterative Self-Organizing Data Analysis Technique) is an
unsupervised classification technique which first places class means evenly
distributed in data space and then iteratively clusters the remaining pixels
using minimum-distance techniques. Each iteration recalculates the means and
reclassifies pixels with respect to the new means. Iterative class splitting,
merging and deleting is done based on the input threshold parameters. All
pixels are classified to the nearest class unless a standard deviation or
distance threshold is specified, in which case some pixels may be left
unclassified if they do not meet the selected criteria. This procedure
continues until the number of pixels in each class changes by less than the
selected pixel-change threshold or the maximum number of iterations is
reached. In the k-means method, the number of clusters K remains the same
throughout the iterations, although it may turn out later that more or fewer
clusters would fit the data better. This drawback is overcome in the ISODATA
algorithm, which allows the number of clusters to be adjusted automatically
during the iterations by merging similar clusters and splitting clusters with
large standard deviations. The steps for implementing ISODATA are as follows:
1. Cluster centers are randomly placed and pixels are assigned using the
shortest-distance-to-center method.
2. The standard deviation within each cluster and the distances between
cluster centers are calculated.
3. A second iteration is performed with the new cluster centers.
4. Further iterations are performed until:
i) the average inter-center distance falls below the user-defined threshold,
ii) the average change in the inter-center distance between iterations is
less than a threshold, or
iii) the maximum number of iterations is reached.
Today, several different unsupervised classification algorithms are commonly
used in remote sensing, and the most frequently used are the k-means and
ISODATA clustering algorithms. Both are iterative processes. The ISODATA
algorithm is similar to the k-means algorithm, with the distinct difference
that the ISODATA algorithm allows for a different number of clusters while
k-means assumes that the number of clusters is known a priori.
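A minimal sketch of ISODATA's distinguishing step, the automatic split and merge of clusters, is given below for 1-D data. The class name and thresholds are illustrative; a full implementation would wrap this step inside the iterative reassignment loop described above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of one ISODATA split/merge pass over 1-D cluster centers:
// a cluster with a large standard deviation is split in two, and
// centers closer than a merge threshold are fused, so the number of
// clusters adjusts automatically between iterations.
class IsodataStep {

    static List<Double> splitMerge(double[] pixels, List<Double> centers,
                                   double splitStd, double mergeDist) {
        List<Double> next = new ArrayList<>();
        for (double c : centers) {
            // mean and standard deviation of the pixels nearest to this center
            double sum = 0, sumSq = 0;
            int n = 0;
            for (double p : pixels)
                if (nearest(p, centers) == c) { sum += p; sumSq += p * p; n++; }
            double mean = n > 0 ? sum / n : c;
            double std = n > 0 ? Math.sqrt(Math.max(0, sumSq / n - mean * mean)) : 0;
            if (std > splitStd) {            // split: one center on each side
                next.add(mean - std / 2);
                next.add(mean + std / 2);
            } else {
                next.add(mean);
            }
        }
        // merge any center that lies within mergeDist of an earlier one
        List<Double> merged = new ArrayList<>();
        for (double c : next) {
            boolean fused = false;
            for (int i = 0; i < merged.size(); i++)
                if (Math.abs(merged.get(i) - c) < mergeDist) {
                    merged.set(i, (merged.get(i) + c) / 2);
                    fused = true;
                    break;
                }
            if (!fused) merged.add(c);
        }
        return merged;
    }

    private static double nearest(double p, List<Double> centers) {
        double best = centers.get(0);
        for (double c : centers)
            if (Math.abs(p - c) < Math.abs(p - best)) best = c;
        return best;
    }
}
```

With a loose split threshold and a tight merge threshold the cluster count grows; with the opposite settings it shrinks, which is exactly the flexibility the text attributes to ISODATA.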
Flowchart for ISODATA classification:
Fuzzy C Means Classification (FCM)
Fuzzy c-means (FCM) is a method of clustering which allows one piece of data
to belong to two or more clusters. This method (developed by Dunn in 1973 and
improved by Bezdek in 1981) is frequently used in pattern recognition. It is
based on minimization of the following objective function:
J_m = Σ_{i=1}^{N} Σ_{j=1}^{C} u_ij^m ||x_i − c_j||²
where m is any real number greater than 1, u_ij is the degree of membership
of x_i in cluster j, x_i is the i-th d-dimensional measured data point, c_j
is the d-dimensional center of cluster j, and ||*|| is any norm expressing
the similarity between measured data and a center. The algorithm is composed
of the following steps:
1. Initialize the membership matrix U = [u_ij] as U(0).
2. At step k, calculate the center vectors C(k) = [c_j] using U(k).
3. Update U(k) to U(k+1).
4. If ||U(k+1) − U(k)|| < ε then STOP; otherwise return to step 2.
FCM gives the best results for overlapping data sets and is comparatively
better than the k-means algorithm, because unlike k-means, where a data point
must belong exclusively to one cluster center, here each data point is
assigned a membership to every cluster center, so a data point may belong to
more than one cluster.
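The two update rules in steps 2 and 3 can be sketched in Java for 1-D data with the common choice m = 2. The class is illustrative; a full FCM loop would alternate these updates until the stopping test of step 4 is met.

```java
// Sketch of the two FCM update rules for 1-D data.
class FuzzyCMeans {

    // Step 2: c_j = sum_i(u_ij^m * x_i) / sum_i(u_ij^m)
    static double[] updateCenters(double[] x, double[][] u, double m) {
        int clusters = u[0].length;
        double[] c = new double[clusters];
        for (int j = 0; j < clusters; j++) {
            double num = 0, den = 0;
            for (int i = 0; i < x.length; i++) {
                double w = Math.pow(u[i][j], m);
                num += w * x[i];
                den += w;
            }
            c[j] = num / den;
        }
        return c;
    }

    // Step 3: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)), d = distance to center
    static double[][] updateMemberships(double[] x, double[] c, double m) {
        double[][] u = new double[x.length][c.length];
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < c.length; j++) {
                double dij = Math.abs(x[i] - c[j]) + 1e-12; // avoid divide-by-zero
                double sum = 0;
                for (int k = 0; k < c.length; k++) {
                    double dik = Math.abs(x[i] - c[k]) + 1e-12;
                    sum += Math.pow(dij / dik, 2.0 / (m - 1));
                }
                u[i][j] = 1.0 / sum;                        // memberships sum to 1
            }
        return u;
    }
}
```

Note how, unlike k-means' hard assignment, every pixel here retains a non-zero membership in every cluster.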
Flowchart for FCM:
About ImageJ:
ImageJ is a public-domain Java image processing program inspired by NIH Image
for the Macintosh. It runs, either as an online applet or as a downloadable
application, on any computer with a Java 1.1 or later virtual machine.
Downloadable distributions are available for Windows, Mac OS, Mac OS X and
Linux. It can display, edit, analyze, process, save and print 8-bit, 16-bit
and 32-bit images. It can read many image formats including TIFF, GIF, JPEG,
BMP, DICOM, FITS and “raw”. It supports “stacks”, a series of images that
share a single window. It is multithreaded, so time-consuming operations such
as image file reading can be performed in parallel with other operations. It
can calculate area and pixel value statistics of user-defined selections. It
can measure distances and angles. It can create density histograms and line
profile plots. It supports standard image processing functions such as
contrast manipulation, sharpening, smoothing, edge detection and median
filtering. It does geometric transformations such as scaling, rotation and
flips. Images can be zoomed up to 32:1 and down to 1:32. All analysis and
processing functions are available at any magnification factor. The program
supports any number of windows (images) simultaneously, limited only by
available memory. Spatial calibration is available to provide real-world
dimensional measurements in units such as millimeters. Density or gray-scale
calibration is also available.
ImageJ was designed with an open architecture that provides extensibility via
Java plugins. Custom acquisition, analysis and processing plugins can be
developed using ImageJ’s built-in editor and Java compiler. User-written
plugins make it possible to solve almost any image processing or analysis
problem. ImageJ is developed on Mac OS X using its built-in editor and Java
compiler, plus the BBEdit editor and the Ant build tool. The source code is
freely available. The author, Wayne Rasband (wayne@codon.nih.gov), is at the
Research Services Branch, National Institute of Mental Health, Bethesda,
Maryland, USA.
Requirements for developing ImageJ:
For running ImageJ we need the ImageJ class and configuration files, a Java
Runtime Environment (JRE) and, for compiling our own plugins, a Java compiler
with the required libraries, as for example included in the Java 2 SDK
Standard Edition (J2SE) from Sun Microsystems. Depending on the ImageJ
distribution we are using, some or all of this may already be included. The
latest distribution of ImageJ can be downloaded from
http://rsb.info.nih.gov/ij/download.html. In the following, the installation
of ImageJ is described briefly; more detailed and up-to-date installation
instructions for different operating systems can be found at
http://rsb.info.nih.gov/ij/docs/install. If we already have a JRE (and a Java
compiler) installed on our computer and we are familiar with Java, we just
need to download the ImageJ class and configuration files, which are
available as a ZIP archive. To run ImageJ, we have to add ij.jar to our
classpath and execute the class ij.ImageJ. This also works for all operating
systems for which there is no specific ImageJ distribution but for which a
Java runtime environment is available. Installing a Java compiler is only
necessary if it is not included in the ImageJ distribution or provided by the
operating system. In any case (also if we are using an operating system which
is not mentioned here but for which a Java compiler is available), we can use
a Java compiler of our choice to compile our plugins (e.g. the J2SE SDK from
Sun Microsystems, which we can download from http://www.javasoft.com).
Plugin Concept of ImageJ:
The functions provided by ImageJ’s built-in commands can be extended by
user-written code in the form of macros and plugins. These two options differ
in their complexity and capabilities. Macros are an easy way to execute a
series of ImageJ commands. The simplest way to create a macro is to call
“Plugins/Macros/Record” and execute the commands to be recorded. The macro
code can be modified in the built-in editor. The ImageJ macro language
contains a set of control structures, operators and built-in functions and
can be used to call built-in commands and macros. A reference for the macro
language can be found at
http://rsb.info.nih.gov/ij/developer/macro/macros.html. Plugins are a much
more powerful concept than macros, and most of ImageJ’s built-in menu
commands are in fact implemented as plugins. Plugins are implemented as Java
classes, which means that we can use all features of the Java language,
access the full ImageJ API and use all standard and third-party Java APIs in
a plugin. This opens a wide range of possibilities of what can be done in a
plugin. The most common uses of plugins are filters performing some analysis
or processing on an image or image stack, and I/O plugins for reading/writing
formats that are not natively supported from/to file or other devices.
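As a concrete example of the filter-plugin type just described, the skeleton below follows the standard ImageJ PlugInFilter pattern. It assumes ij.jar on the classpath, and the class name Invert_Demo is illustrative; the underscore in the name is what makes the plugin appear in the Plugins menu.

```java
import ij.ImagePlus;
import ij.plugin.filter.PlugInFilter;
import ij.process.ImageProcessor;

// Minimal ImageJ filter plugin: inverts an 8-bit grayscale image.
// Compile against ij.jar and place the .class file in the plugins folder.
public class Invert_Demo implements PlugInFilter {

    // Called once when the plugin starts; the returned flags declare
    // which image types the plugin accepts (here: 8-bit grayscale).
    public int setup(String arg, ImagePlus imp) {
        return DOES_8G;
    }

    // Called with the processor of the active image (or each stack slice).
    public void run(ImageProcessor ip) {
        int w = ip.getWidth(), h = ip.getHeight();
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                ip.putPixel(x, y, 255 - ip.getPixel(x, y));
    }
}
```

A classification plugin follows the same shape: setup() declares the accepted image types and run() replaces each pixel value with its cluster label.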
ImageJ user plugins have to be located in a folder called plugins, which is a
subfolder of the ImageJ folder. But only class files in the plugins folder
with at least one underscore in their name appear automatically in the
“Plugins” menu. Since version 1.20 it is also possible to create subfolders
of the plugins folder and place plugin files there. The subfolders are
displayed as submenus of ImageJ’s “Plugins” menu.
To install a plugin, we have to copy the .class file into the plugins folder
or one of its subfolders. The plugin will appear in the Plugins menu (or one
of its submenus) the next time we start ImageJ. We can add it to a menu and
assign a shortcut to it using the “Plugins/Shortcuts/Install Plugin...”
command; in this case, the plugin will appear in the menu without restarting
ImageJ. Alternatively, if we have the source code of a plugin, we can compile
and run it from within ImageJ. We can specify the plugins directory using the
plugins.dir property. This can be done by adding an argument like
-Dplugins.dir=c:\plugindirectory to the command line calling ImageJ.
Depending on the type of installation we are using, this modification is made
in the run script, the ImageJ.cfg file or the shortcut calling ImageJ.
Sensors under Study:
The sensors under study are CARTOSAT, QuickBird, IKONOS and WorldView-3.
Results: Fuzzy C Means Classification – the image histogram was obtained
after 21 iterations.
Conclusion:
The ISODATA algorithm is a refined version of k-means that splits and merges
clusters. It is similar to the k-means algorithm, with the distinct
difference that it allows for a different number of clusters while k-means
assumes that the number of clusters is known a priori; i.e., it provides
flexibility to the classifier. Thus we can say that ISODATA is a modification
of the k-means algorithm which overcomes its disadvantages. The k-means
clustering algorithm requires the number of final clusters to be defined
beforehand. Such algorithms also suffer from problems like susceptibility to
local optima, sensitivity to outliers, memory requirements and an unknown
number of iteration steps required to cluster. K-means was found to be
considerably faster than Fuzzy C Means when implemented on the multispectral
images. FCM is an algorithm based on more iterative fuzzy calculations, so
its execution time was found to be comparatively higher, as expected. On
comparison, k-means was found better than FCM, because FCM requires more
computational time due to the fuzzy-measure calculations involved in the
algorithm. FCM is, however, more suitable for handling issues related to the
understandability of patterns, incomplete or noisy data, mixed-media
information and human interaction, and it can provide approximate solutions
faster. Thus we can conclude that the ISODATA clustering algorithm is
superior to the other two algorithms discussed above, i.e., the k-means and
Fuzzy C Means algorithms.
References:
1. W. Burger, M. Burge, “Digital Image Processing: An Algorithmic
Introduction Using Java”.
2. Werner Bailer, “Writing ImageJ Plugins – A Tutorial”, Version 1.71.
3. Junjie Wu, “Advances in K-means Clustering”.
4. R. Sathya, Annamma Abraham, “Comparison of Supervised and Unsupervised
Learning Algorithms for Pattern Classification”, (IJARAI) International
Journal of Advanced Research in Artificial Intelligence, Vol. 2, No. 2, 2013.
5. Soumi Ghosh, Sanjay Kumar Dubey, “Comparative Analysis of K-Means and
Fuzzy C Means Algorithms”, (IJACSA) International Journal of Advanced
Computer Science and Applications, Vol. 4, No. 4, 2013.
6. B. Fergani, M. K. Kholladi, M. Bahr, “Comparison of FCM and FISODATA”,
International Journal of Computer Applications (0975 – 8887), Vol. 56, No. 8,
October 2012.