SlideShare a Scribd company logo
1 of 14
Download to read offline
A cluster computer system for the analysis and
classi®cation of massively large biomedical image data
T. Daggett *, I. R. Greenshields
Medical Imaging Laboratory, U-31 Room 19, T. L. Booth Research Center, Department of Computer Science and
Engineering, 233 Glenbrook Road, The University of Connecticut, Storrs, CT, U.S.A.
Received 25 March 1997
Abstract
The current trend in medical image acquisition is towards the generation of image datasets which are
massively large, either because they exhibit ®ne x, y, or z resolution, are volumetric, are multispectral, or
a combination of all of the preceding. Such images pose a signi®cant computational challenge in their
analysis, not only in terms of data throughput, but also in terms of platform costs and simplicity. In this
paper we describe the role of a cluster of workstations together with two quite di€erent application
programming interfaces (APIs) in the quantitative analysis of anatomic image data from the visible
human project using an MRF-Gibbs classi®cation algorithm. We describe the typical architecture of a
cluster computer, two API options and the parallelization of the MRF-Gibbs procedure for the cluster.
Finally, we show speedup results obtained on the cluster and sample classi®cations of visible human
data. # 1998 Elsevier Science Ltd. All rights reserved.
Keywords: Image processing; Cluster computing; Parallel computing; MRF-Gibbs; Classi®cation
1. Introduction
Medical image datasets, like a wide variety of other image datasets, are increasingly
characterizable as massive in terms of their spatial resolution and/or their spectral resolution.
Since the lower bound on processing costs for an n  n image grows quadratically (O(n2
)) with
the on-side dimensions of an image (since most meaningful algorithms have to examine every
pixel at least once), higher resolution images bring nonlinear growth in computational costs
associated with their processing. On the positive side, many imaging algorithms are easily
adapted to the single program multiple data model of parallelism, so that multiple copies of an
Computers in Biology and Medicine 28 (1998) 47±60
0010-4825/98/$19.00 # 1998 Elsevier Science Ltd. All rights reserved.
PII: S0010-4825(97)00032-2
PERGAMON
* Corresponding author.
algorithm can execute in parallel acting on di€erent subdivisions of the image. Thus,
discounting typical parallel overheads, many imaging algorithms are in principle amenable to
the p-speedup when distributed over p processors. (Many image algorithms are, of course, not
so amenable.) It is therefore often the case that the issue impacting the processing throughput
of massive images lies not in complex algorithm design, but in platform availability. It will
sometimes be true that a large parallel system's costs can be folded into the cost of the imager
itself without apparent impact on total system cost (as might be the case with a modern MR
imager), but the average biological or medical laboratory interested in analysing these images is
unlikely to wish to acquire a large parallel system, both because of cost and system
management problems. However, even the most modest laboratories will have access to
inexpensive PC-level workstations, and it has now become a trend in parallel system design to
exploit groups of these systems networked together to form what is known as a cluster
computer. Cluster computers, unlike the more expensive symmetric multiprocessors which
characterize the bulk of non-cluster parallel systems are inexpensive to assemble, easy to
maintain, extremely fault-tolerant (in the sense that the failure of a cluster component Ða
workstationÐ simply means that the cluster has less processing power than it had before),
surprisingly powerful (a tribute to the modem microprocessor) and (with newer application
programming interfaces) relatively easy to program. Clusters can be quite small yet still
powerful; the architecture we describe below consists of 8 Pentium systems capable of a
(theoretical) throughput SpecFP of about 245 [1].
The principal software component for application development on parallel systems is the
underlying communication mechanism used for inter-process communications. There are a
number of di€erent software communication paradigms that have received signi®cant attention.
Of these, the message passing interface (MPI) has been much heralded as the future standard
for message passing based communications [2, 3]. A very di€erent and less well known
technique for inter-process communications is virtual shared memory. Virtual shared memory
allows processes to communicate by directly sharing data as if it existed in a global shared
memory space. Processes can access (read or write) information in the memory space without
concern or knowledge of external processes, and can therefore be developed in a more
sequential program fashion [4]. This conceptually simpler view supports the main advantage
usually associated with virtual shared memory, which is that the application programming
interface is usually quite simple and therefore the complexity of developing parallel
applications is greatly reduced. Paradise (Scienti®c Computing, New Haven, CT) is a widely
used virtual shared memory based communications package that o€ers a very simple API.
The particular problem we address involves the derivation of quantitative anatomical
parameters (such as tissue volume by type etc.) from the visible human dataset [5]. Our medical
imaging goal is to segment (by automatic image classi®cation) the anatomical images of the
visible human dataset so that quantitative anatomical issues and computational geometric
models can be constructed from the dataset. In reference to earlier work from MR images [6, 7]
we have elected to use the visible human dataset to extract out and model the bladder and
urethra. Both because it is anatomically simpler and because the dataset is homogeneous in x,
y, and z resolution (0.33 mm per pixel), we are working with the newer visible female dataset.
Even in the small anatomical region of interest we are examining, the dataset is huge: each raw
image (2048 Â 1216 Â 3) exceeds 7 MB. The pelvic imagery alone (from the base of the sacrum
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6048
to about the linea terminalis on the sacral promontory) occupies about 4200 MB. The
algorithmic strategy involves image preprocessing (to remove background and non-anatomical
image details) followed by an unsupervised context-dependent classi®cation of the RGB images
using a Gibbs classi®er (described below).
Automatic image classi®cation, though dicult, is not a new problem and numerous
algorithms have been developed for other problem domains which may aid in the classi®cation
process [8]. These algorithms can be grouped into two principal categories: context-dependent
and context-independent. Context-independent algorithms perform an image classi®cation
based directly on a pixel's intensity and the distribution of the intensity values of an image. In
simpler terms they attempt to classify the pixel purely from its intensity with respect to the rest
of the pixel intensi®es of the image. Context-independent algorithms classify the pixel values of
a random image (one where pixels are randomly scattered) identically to the pixels of an image
that contains well de®ned regions (typical digital image), as long as the images have identical
intensity distributions. Well known examples of context-independent algorithms include the
nearest mean, maximum likelihood estimation and K nearest neighbors methods [8]. In
comparison to context-independent algorithms, context-dependent approaches attempt to
utilize relationships, or contextual information, between pixels within an image to perform a
classi®cation. An important and popular example from the context-dependent category is the
MRF-Gibbs algorithm [9, 10].
The MRF-Gibbs classi®cation method utilizes contextual information derived from the
relationship that a point (pixel) in an image has with the neighborhood (spatially close pixels)
that encompasses it. This relationship is assumed to exhibit the characteristics of a Markov
random ®eld (MRF), in which the neighboring pixels can be e€ectively used to classify the
pixel. The Gibbs component of the MRF-Gibbs classi®cation method stems from the
evaluation of this relationship in terms of an energy function. This energy function can be used
to produce the ideal image classi®cation through the maximization of the energy function at
each pixel of the image. However, the determination of the ideal classi®cation requires that
every possible combination of pixel classi®cations of an image be examined. Therefore an n  n
image with L possible classes contained within would require Ln  n
combinations to be
evaluated. This evaluation is considered computationally intractable and hence is considered
not plausible for practical sized images. To address this processing limitation, the
determination of the ideal classi®cation can be approximated through the employment of
stochastic maximization techniques, such as simulated annealing, to reduce the computational
burden of the MRF-Gibbs algorithm. Theoretically, the MRF-Gibbs classi®er iterates
hundreds (or even thousands) of times over each image, repeatedly computing energy functions
and probability distributions, so that the computational demands are enormous.
Faced with this challenge, we elected to construct a cluster computer comprising 8 PC
platforms (Pentium 166 MHz systems, 32 MB running Microsoft Windows NT). The machines
are connected both by a 25 Mbs ATM switch and 10 Mbs Ethernet, and (as we describe
below) support two distinct application programming interfaces (MPI and Paradise) for the
implementations of parallel algorithms. In the sections that follow we describe the
parallelization of the MRF-Gibbs algorithm in conjunction with simulated annealing, by ®rst
identifying the basic operational concepts of the algorithm and then de®ning how it was
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 49
parallelized on the cluster. The speedup of the parallelized algorithm using both MPI and
Paradise based communications over a serial version is presented.
2. MRF-Gibbs
In the core of the MRF-Gibbs algorithm lies the Markov random ®eld (MRF) assumption.
The MRF assumption states that the true interpretation of any pixel Xij given the true
interpretation of all image pixels G depends only on the interpretation of its neighboring pixels
in a neighborhood Nij (Eq. (1)).
P…Xij ˆ okjG † ˆ P…Xij ˆ okjNij † …1†
This interpretation of a pixel can be used for the assignment, or classi®cation of a pixel to a
class ok, which represents a labeling of the pixel from a given set of L possible classes. These
classes may correspond to tissue types believed to be contained in the image (muscle tissue,
bone, etc.).
A neighborhood Nij consists of those pixels that fall into a de®ned region surrounding the
pixel at location (ij). Various shaped neighborhoods can be used, and their shapes (sizes) are
usually determined from the underlying characteristics present in the regions, or structures of
an image. A very commonly employed and simple neighborhood system may be de®ned as
consisting of those pixels which are within a particular Euclidean distance from the pixel Xij.
Typically the relationship between neighboring points diminishes greatly with increasing
distance. It has been shown [9, 10] that a meaningful neighborhood may in fact be quite small,
and can be de®ned as consisting of those neighbors, which are a Euclidean distance (d) of one
from the pixel in question. Thus
Nij ˆ f…r,s†:d……r,s†,…ij†† ˆ 1g …2†
de®nes a four-element neighborhood, consisting of the east, west, north and south adjacent
pixels about the site (ij), and it is this simple but e€ective structure that we use.
With this concept of a four-pixel neighborhood, the assignment of pixel Xij to a class ok can
be evaluated in terms of the posterior probability P(okvXij,Q), the conditional probability that
the assignment of class ok is correct given observation Xij and prior information Q. The prior
information Q consists of the pixel's neighborhood Nij, global information such as the class
probability of the various classes occurring naturally in the image P(ok), and local information
de®ning the likelihood of the contents of the neighborhood. In simple terms, the probability of
class ok, being assigned to site (ij) depends only upon the observations contained in its
neighborhood Nij, Eq. (3).
P…okjXij,Q† ˆ
‰
…xy†PNij
P…okjXxy,Q† …3†
In Eq. (3), Xxy represents an observation contained in the neighborhood Nij. From a
Bayesian perspective, the best classi®cation for Xij is the class ok that maximizes Eq. (4).
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6050
‰
…xy†PNij
P…ok†P…Xxyjok†r
‰
…xy†PNij
P…oh†P…Xxyjoh†, Vh Tˆ k …4†
The decision rule, de®ned above, involves maximizing the probability of the classi®cation of
an entire neighborhood. The MRF-Gibbs equivalence shows that Eq. (4) can be written in a
Gibbsian form P(Xij,ok),
P…Xij,ok† ˆ
1
Z
eÀU…Xij,ok†aT
…5†
U…Xij,ok† ˆ À
ˆ
…x,y†PNij
log P…Xxyjok† ‡ U…ok† …6†
and that this Gibbsian equivalent can be evaluated using an energy function, U(Xij,ok) used
here, a normalizing constant Z, and an arti®cial temperature T. Proof of the equivalence
between the Gibbs and MRF representations is given in Refs. [9, 10]. The MRF-Gibbs
classi®cation process therefore consists of assigning the pixels of an image those class values
that produce a maximum posterior distribution when the energy is minimized. The description
above has concentrated on the local expression of the MRF-Gibbs model for classi®cation.
Evidently, energy minimization cannot proceed on a purely local basis; instead, we consider the
maximum a posteriori (MAP) estimate of the image's class structure as follows. Suppose there
are L classes in an n  n image. A con®guration z $ C is an enumeration of the classes
attached to each point of the image. In the case described, cardinality card{C} = Ln  n
. Let W
be the (random) observation of a con®guration, and let X be the (random) observation of the
image data; then the MAP estimate is the estimate [8, 9].
WMAP ˆ arg maxzPCfP…W ˆ zjX ˆ x†g …7†
However, the assignment of the pixels cannot be performed pixel by pixel but instead
represents a dynamic programming problem.
One can employ simulated annealing to migrate towards an approximate solution to this
maximization problem. Simulated annealing is an iterative optimization technique which
attempts to minimize an energy function through random excitation [11]. It can be utilized to
reduce the overall computation of locating an ``ideal'' classi®cation to a more reasonable
amount. Here ``ideal'' now refers to an approximate solution to the classi®cation problem,
which may or may not be the true classi®cation solution. Typically, the procedure terminates
after a ®xed number of iterations of the annealing process.
In the MRF-Gibbs classi®cation approach, simulated annealing functions by periodically re-
classifying a pixel to a ``worst'' classi®cation, i.e. a classi®cation that actually produces a higher
energy potential. It is this occasional acceptance of a ``worst'' classi®cation that allows the
algorithm to jump out of a local minimum in search of a global minimum, consequently
producing an improved classi®ed image. The decision rule for the acceptance of a ``worst''
classi®cation is governed by a temperature and a cooling schedule.
Numerous cooling schedules have been developed for simulated annealing [11], however it
appears that the best schedule is usually determined through ad hoc trial and error. In our
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 51
case, a cooling schedule T(t), (Eq. (8))
T…t† ˆ
T0
log…t ‡ 1†
…8†
where T0 is the initial temperature is employed [12]. The initial temperature T0 was determined
from the ®rst iteration of the algorithm, which evaluated the image by summing the total
energy encountered (increase in energy) during an attempted classi®cation of the pixels to a
new classi®cation with higher energy. The gradual cooling of the temperature produced by the
schedule ideally allows the algorithm to settle in the global minimum.
With the employment of simulated annealing to ®nd the minimal energy produced by Eq. (6),
the remaining complexity of the MRF-Gibbs approach involves determining a useful energy
function. The derivation of an energy function typically involves understanding the nature of
the image, identifying a meaningful neighborhood shape (and size), and describing the unique
attributes of the image regions, or structures. Numerous models have been developed to
describe these di€ering attributes in terms of relationships of the pixels of the neighborhoods,
contained in the regions or structures. As result, many di€erent energy functions have been
developed which attempt to evaluate the likelihood or quality of a neighborhood relative to a
set of attribute quali®ers [10]. For our energy function,
U…Xij,ok† ˆ À
ˆ
…x,y†PNij
log P…Xxyjok† ‡
ˆ
…x,y†PNij
log P…o…x,y††
‡
ˆ
…x,y†PNij
log P…x,y†,…x‡1,y†…o…x,y†,o…x‡1,y†† ‡
ˆ
…x,y†PNij
log P…x,y†,…x,y‡1†…o…x,y†,o…x,y‡1†† …9†
we utilized the conditional probability of each neighbor in Nij given the class ok, the
probability of the classi®cations of neighbors, and the transitional probabilities associated with
the spatial con®guration of the neighborhood [13]. This spatial likelihood encompasses both
the horizontal and vertical transitions of the neighborhood. Since our neighborhood consisted
of the four-pixel stencil mentioned earlier, the horizontal relationship consisted of evaluating
the transitional probability of the west pixel to pixel Xij and Xij to the east pixel. Likewise the
vertical relationship consisted of the vertical transitional probability of the north pixel to Xij
and Xij to the south pixel. It should be noted that a transitional probability of ok to oj is the
likelihood, or probability, that ok, is followed by oj in the direction of the transition type.
Simulated annealing is employed in conjunction with Eq. (9), as de®ned by the following
series of steps.
1. Randomly select a new class for the pixel xij.
2. Evaluate the new classi®cation in terms of the neighborhood relationships and the
likelihood of the neighborhood via the energy function (Eq. (9)), DU.
3. If the re-classi®cation produces a lower energy then the current classi®cation then accept it.
4. Else accept it conditionally based upon an annealing schedule derived probability Ai>r.
The probability Ai,
Ai ˆ eÀDUaT…t†
…10†
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6052
is determined based upon the magnitude of the increase in energy DU from the current pixel
classi®cation to the new one, and the annealing schedule T(t). The acceptance guide r is de®ned
as a uniform random variable between 0 and 1.
The series of steps de®ned in the re-classi®cation rule were applied to each pixel of the image
in succession. Upon the completion, the temperature controlling the annealing schedule was
updated and the next iteration of the algorithm was begun. The number of iterations required
to produce the ``ideal'' classi®cation is not predictable. A classi®cation resulting from a
context-independent classi®cation approach (K nearest neighbors) was used to estimate the
transitional probabilities, the probabilities of the classes, the class statistics (means and
covariances) and as an initial classi®cation of the image.
3. Implementation
3.1. Parallel design description
The parallelization of the MRF-Gibbs algorithm was approached through domain
decomposition in which both data and computation was partitioned. This led to the principal
employment of a single program multiple data (SPMD) design, in which n child processes
(n = 1 to 7) cooperatively work on di€erent portions of an image (W wide by H high) under
the direction of a single parent process. The principal responsibilities of the parent process are
the partitioning and distributing of the image data amongst the children, performing collective
calculations, and communicating important global classi®cation information. Child processes
are tasked with performing the required image classi®cation operations on their image portion.
Information used by the child processes, obtained from the parent process, consists primarily
of image data and class related information, while information obtained from neighboring
child processes consists of the conditional probabilities of the image partition edges and the
classi®ed values associated with the same edges. This edge information is required in order to
generate the maximum number of complete neighborhoods utilized in the MRF-Gibbs
classi®cation process. Evidently, the outcomes of the serial and parallel algorithms are
identical.
Each child process is given an approximately equal size portion of the image to work on.
This image section consists of a rectangular segment of the image, which was Wan in width
(the last process receives extra columns when required by the magnitude of n) with a height of
H. Therefore the partitioning of the data is basically performed by mapping the columns of the
image to particular child processes. Image data (both raw image data and initial classi®ed data)
is sent by the parent to the children in row by row fashion. Hence, each image message
consisted of a Wan image pixel values with a total of 2nH messages sent from the parent to
the children. In addition, a total of nH number of messages is sent from the children to the
parent at the end of a classi®cation, to return the classi®ed portions of the image to the parent
for re-uni®cation.
Each process, child and parent, were mapped to a separate processor of the cluster. This was
chosen in attempt to both maximize processor utilization and load balance the overall system.
In addition, the principal communications required between child processes consists of the
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 53
exchanging of classi®ed edge data at the beginning of each iteration. During a given MRF-
Gibbs iteration, each child can operate simply on their image portion without interference by
neighboring processes. Both the parent and child processes exhibit synchronous
communications in that parent and children will wait for expected information to arrive.
The MRF-Gibbs parallel application was designed and implemented with respect to
scalability, future software modi®cation, and software reuse. Both the child and parent
processes were created to function without concern for either the number of processors or
processes employed. A high-level communications interface (C + + class) was designed and
implemented to abstract the actual algorithm above the communications mechanism utilized,
therefore supporting the rapid employment of other forms of underlying inter-process
communications. This class also provided overloaded operations, which abstracted the
development of the algorithm above the complexities of the communication of various data
types, thus further simplifying code development.
3.2. Image dataset
The MRF-Gibbs algorithm was customized to improve its performance due to the images
used and the parallel environment employed. Sample images from the visible human project
female dataset were used for all classi®cations. The raw images have dimensions of
2048 Â 1216 Â 3, but are reduced by background elimination to 1300 Â 700 Â 3.
3.3. Algorithm realization
The implementation of the MRF-Gibbs algorithm was approached through a series of
phases. The phases were data loading, transitional probability determination, class conditional
probability construction, and MRF-Gibbs re-classi®cation iteration.
The ®rst phase, data loading, consists of loading in the raw image data, the context-
independent classi®ed version of the image, and the statistics of the classes determined from
the context-independent approach. The context-independent classi®ed image contained a single
classi®ed value (0 to L À 1 classes) for each pixel. The class statistics consists of a class mean g,
covariance matrix Sk and probability Pk for each class.
The next phase consists of determining the transitional probabilities of the classes de®ned in
the context-independent classi®cation of the image.
The class conditional probability construction phase followed the transitional probability
phase. This phase consists of determining the class conditional probability of each class for
each pixel of the image. The class conditional probability was modeled as a maximum
likelihood estimation posterior probability, which is the posterior probability of a class ok
given a pixel value X. Eq. (11) de®nes the class conditional probability P(okvXij) of class ok, of
L total classes based upon the probability of Xij given class ok, which is identi®ed as the
conditional probability p(Xijvok), and the overall probability of class ok, P(ok).
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6054
P…okjXij † ˆ
P…ok†P…Xijjok†
ˆL
zˆ1
P…oz†P…Xijjoz†
…11†
In Eq. (11), the conditional probability p(Xijvok) is modeled as a normal distribution using
the class mean and covariance matrix de®ned in the class statistics. Overall this phase consists
of calculating and storing W Â H Â L conditional probabilities, where W, H, L refers to the
image width, image height and number of classes of the non-contextually classi®ed image,
respectively.
The ®nal phase, MRF-Gibbs iteration, involves the actual re-classifying of the pixels through
a series of re-classi®cation iterations. During the ®rst iteration the initial temperature, T0 is set
based upon the total energy encountered during the attempted re-classi®cation of all pixels.
During this initial iteration no pixels are re-classi®ed which produced a ``worst'' classi®cation,
but instead the increase in energy encountered during an attempted re-classi®cation of a pixel
is accumulated, and used to set T0. During the second and subsequent iterations this phase
attempts to re-classify each pixel to a random class value. The energy of each new classi®ed
pixel value is evaluated and accepted based upon the guidelines de®ned in the re-classi®cation
rules for MRF-Gibbs. This phase iterates through the MRF-Gibbs re-classi®cation algorithm
for a ®xed number of times. At the end of each iteration, the temperature is adjusted in
accordance with the annealing schedule de®ned earlier.
Overall the implementation of the MRF-Gibbs algorithm can be examined as a series of
steps. These steps can be separated and listed as the following.
1. Load in the raw image, the class statistics, and a previously classi®ed image.
2. Determine the transitional probabilities from the previously classi®ed image.
3. Determine the class conditional probability for each class at each pixel.
4. Evaluate the raw image and set the initial temperature.
5. For each pixel in the image choose a new class value at random, and evaluate the change in
energy resulting from the energy function (using class probabilities, conditional
probabilities, and transitional probabilities). Accept the new pixel classi®cation if the
resulting energy is lower, else accept it conditionally base upon an annealing probability.
6. Once all the pixels have been attempted to be re-classi®ed, then lower the temperature based
upon the annealing schedule.
7. Go to step 5 until the desired number of iterations has been reached.
In a typical image classi®cation the majority of the execution time of the program is spent in
either the conditional probability construction phase (phase 3) or in the re-classi®cation phase
(phase 5). The other phases do not constitute a signi®cant amount of computation. Fig. 1
represents a graphical view of the parallel version of the MRF-Gibbs algorithm.
Steps are 1, 8, and 10 of the MRG-Gibbs parallel implementation are considered to be
principally communication steps. Steps 1 and 10 occur only once during an image
classi®cation, while the classi®ed edge exchanging step (step 8) will occur during each iteration
of the MRF-Gibbs algorithm.
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 55
Fig. 1. Depiction of parallel version of the MRF-Gibbs algorithm.
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6056
Hence, step 8 is perhaps the true communications cost of the parallelized version of the
algorithm, since it is the only communications that results from various numbers of iterations.
The other steps (other then 1, 8, and 10) which require communications represent relatively
insigni®cant communication amounts and also only occur once per image classi®cation. In
Figs. 2 and 3, Ixy represents the image partition and Cxy represents the classi®ed partition of
the same image segment.
The computationally most intense step is the determination of the class conditional
probabilities (step 5), though this step is only required to be performed once per image
classi®cation. In step 5 the class conditionals are computed once and stored for re-use during
the MRF-Gibbs classi®cation phase (steps 8±9).
Fig. 2. Original image after background removal.
Fig. 3. Classi®ed image produced by MRF-Gibbs algorithm.
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 57
Once the program entered the MRF-Gibbs classi®cation phase (steps 8±9), most of time is
spent in the evaluation of random class values at each pixel in the image (step 9). This step
requires a fairly large amount of computation due primarily to the evaluation of the energy
function using both the current pixel classi®cation value and the randomly selected value. Note
that the current energy at pixel Xij would not be accurate if the classi®cation of pixel Xij + 1
changes during the attempted re-classi®cation of pixel Xij + 1, hence the energy potentials of
the neighborhoods change continuously as neighbors, are subsequently re-classi®ed.
At the end of the required number of iterations each child process returns the classi®ed
partitions of the image to the parent (step 10), who combines the image sections into a ®nal
classi®ed version of the original image. This signals the completion of the classi®cation process.
4. Performance results
Table 1 shows the performance of the algorithm for varying number of child processes in
conjunction with a single parent process. The number of iterations was ®xed at 200 with 6
classes for all trials. Both MPI and Paradise, as indicated in the table, were utilized for all
inter-process communications. The information exchanged consisted of the class statistics, the
raw image data and classi®ed partitions of the images, along with the class conditional and
classi®ed image edge values. The table lists speedup relative to a single sequential process and
the timeline employed started with the distribution of the image data and ended with the
returning to and combining of all classi®ed image sections by the parent.
5. Experimental framework
All of the parallelization experiments were performed on the University of Connecticut CoW
(Cluster of Workstations) facility located inside the Image Processing Laboratory (IPL), Booth
Research Center. The machines contained in the cluster were identical and consisted of 32 MB
Table 1
Speedupa
of MRF-Gibbs resulting from using MPI and Paradise
Number of Child Processes Speedup with MPI Speedup with Paradise
1 0.14 0.21
2 1.32 1.86
3 2.39 2.79
4 2.37 3.67
5 2.43 4.47
6 2.65 5.15
7 2.56 6.02
a
Speedup generated from estimated single process performance.
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6058
166 MHz Intel Pentium based personal computers executing the Windows NT 3.5.1 operating
system, connected together via an IBM ATM 25.5 MBps Switch. The ATM switch was
con®gured to provide LAN Emulation support. All software was written in C + + using the
Visual C + + software development environment. Version 0.9 of MPI was used.
6. Conclusion
Automatic image classi®cation is an important and computationally challenging task. We
have taken the MRF-Gibbs algorithm and have parallelized it using both message passing and
virtual shared memory based communications. We demonstrated that the algorithm can be
parallelized and implemented on a common cluster using standard inter-process
communications. Our results show the following.
1. Classi®cation algorithms can be e€ectively parallelized by partitioning the image between
cooperating child processes and then reconciling the local attributes of the image partitions.
This coarse grain partitioning approach could perhaps be employed for other classi®cation
algorithms, either context-dependent or context-independent.
2. Both MPI and Paradise generate an overhead but can still be e€ectively used if the
algorithm computation burden is large enough to compensate for additional
communications cost. For relatively computational simple algorithms, such as nearest mean,
communications costs may greatly limit the speedup experienced.
3. A small cluster can be con®gured and utilized for the classi®cation process. However, the
real-time acquisition and display requirements of future medical image visualization systems
are currently far greater then can be realistically supported with current small scale systems.
4. In general, Paradise far out performed MPI for the MRF-Gibbs application developed
under the hardware and software suite utilized.
In summary, the parallelization of the MRF-Gibbs algorithm demonstrated that statistically
based classi®cation algorithms can be parallelized using a coarse grain approach based upon
partitioning the image data. Partitioning the data does require that statistics associated with
the image partitions be reconciled. However, the amount of data required to be communicated
is typically quite small and can be performed using standard communication mechanisms. It
should be noted that both MPI and Paradise do generate a processing overhead. The impact of
these overheads could potentially be reduced by increasing the processing capacity of the
cluster, or perhaps by employing a lower level communication level then TCP/IP for MPI or
Paradise. In addition, both MPI and Paradise provide an e€ective application programming
interface for the classi®cation problem domain.
Acknowledgements
This work was funded by State of Connecticut under its Critical Technologies Program. We
also wish to gratefully acknowledge Scienti®c Computing Associates, New Haven, CT for the
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 59
delineation of the Paradise Software for the duration of the project and Dr Michael J.
Ackerman, Visible Human Project, National Library of Medicine for providing the Visible
Human Female dataset.
References
[1] SPECfp_rate95, Spec Benchmarks, http://open.specbench.org/osg/cpu95/results/rfp95.html.
[2] Snir, M., Otto, S., Huss-Lederman, S., Walker, D. and Dongarra, J., MPI The Complete Reference. MIT Press,
Cambridge, MA, 1996.
[3] Gropp, W., Lusk, E. and Skjellum, A., Using MPI Portable Parallel Programming with the Message-Passing
Interface. MIT Press, Cambridge, MA, 1996.
[4] Amza, C. and Cox, A., TreadMarks: shared memory computing on networks of workstations, IEEE Computer,
February, 1996, 18±28.
[5] Visible Human Dataset, National Library of Medicine, National Institutes of Health, Bethesda, MD.
[6] Greenshields, I. R. and Chun, J., Simulation of ¯uid ¯ow through biological inhomogeneous elastic tubes,
Proc. of IASTED Int. Conference on Modeling and Simulation, Pittsburgh, PA, 1993, pp. 66±68.
[7] Greenshields, I. R., Peters, T. J. and Chun, J., Parallel strategies in the reconstruction of surfaces from contour
data, Proc. of 4th ISMM Conf. on Parallel and Distributed Computing Systems, 1991, pp. 355±358.
[8] Fukunaga, K., Statistical Pattern Recognition, 2nd edition. Academic Press, San Diego, CA, 1990.
[9] Li, S. Z., Markov Random Field Modeling in Computer Vision. Springer, Berlin, 1995.
[10] Winkler, G., Image Analysis, Random Fields and Dynamic Monte Carlo Methods. Springer, Berlin, 1991.
[11] L. Ingber, Simulated annealing: practice versus theory, Mathematical and Computer Modeling 18 (11) (1993)
29±57.
[12] Geman, S. and Gemen, D., Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,
IEEE Trans. on Pattern Analysis and Machine Intelligence, 1984, PAMI-8(6), 721±741.
[13] Zhu, J. and Greenshields, I., Classi®cation of multiecho magnetic resonance images of brain using MRG-Gibbs
classi®er, Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 1993.
Thomas A. Daggett is a Ph.D. candidate at the University of Connecticut. He received a BSE in computer
engineering from the University of Connecticut in 1985 and his MS in computer science from Rennselaer
Polytechnic Institute in 1989. He is currently serving as an operability and display engineer in the Advanced Display
Research Facility (ADRF) at the Naval Undersea Warfare Center (NUWC) division Newport, Newport Rhode
Island. His research interests lie in parallel/distributed processing, image processing, small-scale cluster computing,
and advanced display technologies.
Ian Greenshields is an Associate Professor of Computer Science and Engineering at the University of Connecticut.
His research interests lie in the areas of biomedical computing, biomedical image processing and biodynamical
modeling.
T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6060

More Related Content

What's hot

On Text Realization Image Steganography
On Text Realization Image SteganographyOn Text Realization Image Steganography
On Text Realization Image SteganographyCSCJournals
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsVijay Karan
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsVijay Karan
 
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...IRJET Journal
 
Intel, Intelligent Systems Lab: Syable View Synthesis Whitepaper
Intel, Intelligent Systems Lab: Syable View Synthesis WhitepaperIntel, Intelligent Systems Lab: Syable View Synthesis Whitepaper
Intel, Intelligent Systems Lab: Syable View Synthesis WhitepaperAlejandro Franceschi
 
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...광희 이
 
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...CSCJournals
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Editor IJARCET
 
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...IOSR Journals
 
Probabilistic model based image segmentation
Probabilistic model based image segmentationProbabilistic model based image segmentation
Probabilistic model based image segmentationijma
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTIONijcses
 
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...IOSR Journals
 
Energy minimization based spatially
Energy minimization based spatiallyEnergy minimization based spatially
Energy minimization based spatiallysipij
 
An Efficient Clustering Method for Aggregation on Data Fragments
An Efficient Clustering Method for Aggregation on Data FragmentsAn Efficient Clustering Method for Aggregation on Data Fragments
An Efficient Clustering Method for Aggregation on Data FragmentsIJMER
 

What's hot (18)

On Text Realization Image Steganography
On Text Realization Image SteganographyOn Text Realization Image Steganography
On Text Realization Image Steganography
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab Projects
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab Projects
 
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...
IRJET - Symmetric Image Registration based on Intensity and Spatial Informati...
 
Intel, Intelligent Systems Lab: Syable View Synthesis Whitepaper
Intel, Intelligent Systems Lab: Syable View Synthesis WhitepaperIntel, Intelligent Systems Lab: Syable View Synthesis Whitepaper
Intel, Intelligent Systems Lab: Syable View Synthesis Whitepaper
 
O017429398
O017429398O017429398
O017429398
 
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...
PR-065 : High-Resolution Image Synthesis and Semantic Manipulation with Condi...
 
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...
A Novel Multiple-kernel based Fuzzy c-means Algorithm with Spatial Informatio...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932
 
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
 
Probabilistic model based image segmentation
Probabilistic model based image segmentationProbabilistic model based image segmentation
Probabilistic model based image segmentation
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
 
Ad04603175180
Ad04603175180Ad04603175180
Ad04603175180
 
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...
 
Cray HPC + D + A = HPDA
Cray HPC + D + A = HPDACray HPC + D + A = HPDA
Cray HPC + D + A = HPDA
 
Energy minimization based spatially
Energy minimization based spatiallyEnergy minimization based spatially
Energy minimization based spatially
 
An Efficient Clustering Method for Aggregation on Data Fragments
An Efficient Clustering Method for Aggregation on Data FragmentsAn Efficient Clustering Method for Aggregation on Data Fragments
An Efficient Clustering Method for Aggregation on Data Fragments
 

Similar to ClusterPaperDaggett

Region wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environRegion wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environIAEME Publication
 
Region wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environRegion wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environIAEME Publication
 
Development and Comparison of Image Fusion Techniques for CT&MRI Images
Development and Comparison of Image Fusion Techniques for CT&MRI ImagesDevelopment and Comparison of Image Fusion Techniques for CT&MRI Images
Development and Comparison of Image Fusion Techniques for CT&MRI ImagesIJERA Editor
 
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...acijjournal
 
Influence of local segmentation in the context of digital image processing
Influence of local segmentation in the context of digital image processingInfluence of local segmentation in the context of digital image processing
Influence of local segmentation in the context of digital image processingiaemedu
 
Image recogonization
Image recogonizationImage recogonization
Image recogonizationSANTOSH RATH
 
A Review on Matching For Sketch Technique
A Review on Matching For Sketch TechniqueA Review on Matching For Sketch Technique
A Review on Matching For Sketch TechniqueIOSR Journals
 
A deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleA deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleIAESIJAI
 
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONSSVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONSijscmcj
 
Sparse Sampling in Digital Image Processing
Sparse Sampling in Digital Image ProcessingSparse Sampling in Digital Image Processing
Sparse Sampling in Digital Image ProcessingEswar Publications
 
IRJET- Fusion based Brain Tumor Detection
IRJET- Fusion based Brain Tumor DetectionIRJET- Fusion based Brain Tumor Detection
IRJET- Fusion based Brain Tumor DetectionIRJET Journal
 
Analog signal processing solution
Analog signal processing solutionAnalog signal processing solution
Analog signal processing solutioncsandit
 
k means clustering-based data compression
k means clustering-based data compressionk means clustering-based data compression
k means clustering-based data compressionmohammed alrekabe
 
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...CSCJournals
 
Low complexity features for jpeg steganalysis using undecimated dct
Low complexity features for jpeg steganalysis using undecimated dctLow complexity features for jpeg steganalysis using undecimated dct
Low complexity features for jpeg steganalysis using undecimated dctPvrtechnologies Nellore
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsSubhajit Sahu
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsSubhajit Sahu
 
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...CSCJournals
 
A spatial image compression algorithm based on run length encoding
A spatial image compression algorithm based on run length encodingA spatial image compression algorithm based on run length encoding
A spatial image compression algorithm based on run length encodingjournalBEEI
 

Similar to ClusterPaperDaggett (20)

Region wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environRegion wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environ
 
Region wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environRegion wise processing of an image using multithreading in multi core environ
Region wise processing of an image using multithreading in multi core environ
 
Development and Comparison of Image Fusion Techniques for CT&MRI Images
Development and Comparison of Image Fusion Techniques for CT&MRI ImagesDevelopment and Comparison of Image Fusion Techniques for CT&MRI Images
Development and Comparison of Image Fusion Techniques for CT&MRI Images
 
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
 
Influence of local segmentation in the context of digital image processing
Influence of local segmentation in the context of digital image processingInfluence of local segmentation in the context of digital image processing
Influence of local segmentation in the context of digital image processing
 
Image recogonization
Image recogonizationImage recogonization
Image recogonization
 
A Review on Matching For Sketch Technique
A Review on Matching For Sketch TechniqueA Review on Matching For Sketch Technique
A Review on Matching For Sketch Technique
 
A deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleA deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicle
 
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONSSVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS
 
Sparse Sampling in Digital Image Processing
Sparse Sampling in Digital Image ProcessingSparse Sampling in Digital Image Processing
Sparse Sampling in Digital Image Processing
 
IRJET- Fusion based Brain Tumor Detection
IRJET- Fusion based Brain Tumor DetectionIRJET- Fusion based Brain Tumor Detection
IRJET- Fusion based Brain Tumor Detection
 
Analog signal processing solution
Analog signal processing solutionAnalog signal processing solution
Analog signal processing solution
 
B0310408
B0310408B0310408
B0310408
 
k means clustering-based data compression
k means clustering-based data compressionk means clustering-based data compression
k means clustering-based data compression
 
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
 
Low complexity features for jpeg steganalysis using undecimated dct
Low complexity features for jpeg steganalysis using undecimated dctLow complexity features for jpeg steganalysis using undecimated dct
Low complexity features for jpeg steganalysis using undecimated dct
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
 
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...
Joint, Image-Adaptive Compression and Watermarking by GABased Wavelet Localiz...
 
A spatial image compression algorithm based on run length encoding
A spatial image compression algorithm based on run length encodingA spatial image compression algorithm based on run length encoding
A spatial image compression algorithm based on run length encoding
 

ClusterPaperDaggett

  • 1. A cluster computer system for the analysis and classi®cation of massively large biomedical image data T. Daggett *, I. R. Greenshields Medical Imaging Laboratory, U-31 Room 19, T. L. Booth Research Center, Department of Computer Science and Engineering, 233 Glenbrook Road, The University of Connecticut, Storrs, CT, U.S.A. Received 25 March 1997 Abstract The current trend in medical image acquisition is towards the generation of image datasets which are massively large, either because they exhibit ®ne x, y, or z resolution, are volumetric, are multispectral, or a combination of all of the preceding. Such images pose a signi®cant computational challenge in their analysis, not only in terms of data throughput, but also in terms of platform costs and simplicity. In this paper we describe the role of a cluster of workstations together with two quite di€erent application programming interfaces (APIs) in the quantitative analysis of anatomic image data from the visible human project using an MRF-Gibbs classi®cation algorithm. We describe the typical architecture of a cluster computer, two API options and the parallelization of the MRF-Gibbs procedure for the cluster. Finally, we show speedup results obtained on the cluster and sample classi®cations of visible human data. # 1998 Elsevier Science Ltd. All rights reserved. Keywords: Image processing; Cluster computing; Parallel computing; MRF-Gibbs; Classi®cation 1. Introduction Medical image datasets, like a wide variety of other image datasets, are increasingly characterizable as massive in terms of their spatial resolution and/or their spectral resolution. Since the lower bound on processing costs for an n  n image grows quadratically (O(n2 )) with the on-side dimensions of an image (since most meaningful algorithms have to examine every pixel at least once), higher resolution images bring nonlinear growth in computational costs associated with their processing. On the positive side, many imaging algorithms are easily adapted to the single program multiple data model of parallelism, so that multiple copies of an Computers in Biology and Medicine 28 (1998) 47±60 0010-4825/98/$19.00 # 1998 Elsevier Science Ltd. All rights reserved. PII: S0010-4825(97)00032-2 PERGAMON * Corresponding author.
  • 2. algorithm can execute in parallel acting on di€erent subdivisions of the image. Thus, discounting typical parallel overheads, many imaging algorithms are in principle amenable to the p-speedup when distributed over p processors. (Many image algorithms are, of course, not so amenable.) It is therefore often the case that the issue impacting the processing throughput of massive images lies not in complex algorithm design, but in platform availability. It will sometimes be true that a large parallel system's costs can be folded into the cost of the imager itself without apparent impact on total system cost (as might be the case with a modern MR imager), but the average biological or medical laboratory interested in analysing these images is unlikely to wish to acquire a large parallel system, both because of cost and system management problems. However, even the most modest laboratories will have access to inexpensive PC-level workstations, and it has now become a trend in parallel system design to exploit groups of these systems networked together to form what is known as a cluster computer. Cluster computers, unlike the more expensive symmetric multiprocessors which characterize the bulk of non-cluster parallel systems are inexpensive to assemble, easy to maintain, extremely fault-tolerant (in the sense that the failure of a cluster component Ða workstationÐ simply means that the cluster has less processing power than it had before), surprisingly powerful (a tribute to the modem microprocessor) and (with newer application programming interfaces) relatively easy to program. Clusters can be quite small yet still powerful; the architecture we describe below consists of 8 Pentium systems capable of a (theoretical) throughput SpecFP of about 245 [1]. The principal software component for application development on parallel systems is the underlying communication mechanism used for inter-process communications. There are a number of di€erent software communication paradigms that have received signi®cant attention. Of these, the message passing interface (MPI) has been much heralded as the future standard for message passing based communications [2, 3]. A very di€erent and less well known technique for inter-process communications is virtual shared memory. Virtual shared memory allows processes to communicate by directly sharing data as if it existed in a global shared memory space. Processes can access (read or write) information in the memory space without concern or knowledge of external processes, and can therefore be developed in a more sequential program fashion [4]. This conceptually simpler view supports the main advantage usually associated with virtual shared memory, which is that the application programming interface is usually quite simple and therefore the complexity of developing parallel applications is greatly reduced. Paradise (Scienti®c Computing, New Haven, CT) is a widely used virtual shared memory based communications package that o€ers a very simple API. The particular problem we address involves the derivation of quantitative anatomical parameters (such as tissue volume by type etc.) from the visible human dataset [5]. Our medical imaging goal is to segment (by automatic image classi®cation) the anatomical images of the visible human dataset so that quantitative anatomical issues and computational geometric models can be constructed from the dataset. In reference to earlier work from MR images [6, 7] we have elected to use the visible human dataset to extract out and model the bladder and urethra. Both because it is anatomically simpler and because the dataset is homogeneous in x, y, and z resolution (0.33 mm per pixel), we are working with the newer visible female dataset. Even in the small anatomical region of interest we are examining, the dataset is huge: each raw image (2048 Â 1216 Â 3) exceeds 7 MB. The pelvic imagery alone (from the base of the sacrum T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6048
  • 3. to about the linea terminalis on the sacral promontory) occupies about 4200 MB. The algorithmic strategy involves image preprocessing (to remove background and non-anatomical image details) followed by an unsupervised context-dependent classi®cation of the RGB images using a Gibbs classi®er (described below). Automatic image classi®cation, though dicult, is not a new problem and numerous algorithms have been developed for other problem domains which may aid in the classi®cation process [8]. These algorithms can be grouped into two principal categories: context-dependent and context-independent. Context-independent algorithms perform an image classi®cation based directly on a pixel's intensity and the distribution of the intensity values of an image. In simpler terms they attempt to classify the pixel purely from its intensity with respect to the rest of the pixel intensi®es of the image. Context-independent algorithms classify the pixel values of a random image (one where pixels are randomly scattered) identically to the pixels of an image that contains well de®ned regions (typical digital image), as long as the images have identical intensity distributions. Well known examples of context-independent algorithms include the nearest mean, maximum likelihood estimation and K nearest neighbors methods [8]. In comparison to context-independent algorithms, context-dependent approaches attempt to utilize relationships, or contextual information, between pixels within an image to perform a classi®cation. An important and popular example from the context-dependent category is the MRF-Gibbs algorithm [9, 10]. The MRF-Gibbs classi®cation method utilizes contextual information derived from the relationship that a point (pixel) in an image has with the neighborhood (spatially close pixels) that encompasses it. This relationship is assumed to exhibit the characteristics of a Markov random ®eld (MRF), in which the neighboring pixels can be e€ectively used to classify the pixel. The Gibbs component of the MRF-Gibbs classi®cation method stems from the evaluation of this relationship in terms of an energy function. This energy function can be used to produce the ideal image classi®cation through the maximization of the energy function at each pixel of the image. However, the determination of the ideal classi®cation requires that every possible combination of pixel classi®cations of an image be examined. Therefore an n  n image with L possible classes contained within would require Ln  n combinations to be evaluated. This evaluation is considered computationally intractable and hence is considered not plausible for practical sized images. To address this processing limitation, the determination of the ideal classi®cation can be approximated through the employment of stochastic maximization techniques, such as simulated annealing, to reduce the computational burden of the MRF-Gibbs algorithm. Theoretically, the MRF-Gibbs classi®er iterates hundreds (or even thousands) of times over each image, repeatedly computing energy functions and probability distributions, so that the computational demands are enormous. Faced with this challenge, we elected to construct a cluster computer comprising 8 PC platforms (Pentium 166 MHz systems, 32 MB running Microsoft Windows NT). The machines are connected both by a 25 Mbs ATM switch and 10 Mbs Ethernet, and (as we describe below) support two distinct application programming interfaces (MPI and Paradise) for the implementations of parallel algorithms. In the sections that follow we describe the parallelization of the MRF-Gibbs algorithm in conjunction with simulated annealing, by ®rst identifying the basic operational concepts of the algorithm and then de®ning how it was T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 49
  • 4. parallelized on the cluster. The speedup of the parallelized algorithm using both MPI and Paradise based communications over a serial version is presented. 2. MRF-Gibbs In the core of the MRF-Gibbs algorithm lies the Markov random ®eld (MRF) assumption. The MRF assumption states that the true interpretation of any pixel Xij given the true interpretation of all image pixels G depends only on the interpretation of its neighboring pixels in a neighborhood Nij (Eq. (1)). P…Xij ˆ okjG † ˆ P…Xij ˆ okjNij † …1† This interpretation of a pixel can be used for the assignment, or classi®cation of a pixel to a class ok, which represents a labeling of the pixel from a given set of L possible classes. These classes may correspond to tissue types believed to be contained in the image (muscle tissue, bone, etc.). A neighborhood Nij consists of those pixels that fall into a de®ned region surrounding the pixel at location (ij). Various shaped neighborhoods can be used, and their shapes (sizes) are usually determined from the underlying characteristics present in the regions, or structures of an image. A very commonly employed and simple neighborhood system may be de®ned as consisting of those pixels which are within a particular Euclidean distance from the pixel Xij. Typically the relationship between neighboring points diminishes greatly with increasing distance. It has been shown [9, 10] that a meaningful neighborhood may in fact be quite small, and can be de®ned as consisting of those neighbors, which are a Euclidean distance (d) of one from the pixel in question. Thus Nij ˆ f…r,s†:d……r,s†,…ij†† ˆ 1g …2† de®nes a four-element neighborhood, consisting of the east, west, north and south adjacent pixels about the site (ij), and it is this simple but e€ective structure that we use. With this concept of a four-pixel neighborhood, the assignment of pixel Xij to a class ok can be evaluated in terms of the posterior probability P(okvXij,Q), the conditional probability that the assignment of class ok is correct given observation Xij and prior information Q. The prior information Q consists of the pixel's neighborhood Nij, global information such as the class probability of the various classes occurring naturally in the image P(ok), and local information de®ning the likelihood of the contents of the neighborhood. In simple terms, the probability of class ok, being assigned to site (ij) depends only upon the observations contained in its neighborhood Nij, Eq. (3). P…okjXij,Q† ˆ ‰ …xy†PNij P…okjXxy,Q† …3† In Eq. (3), Xxy represents an observation contained in the neighborhood Nij. From a Bayesian perspective, the best classi®cation for Xij is the class ok that maximizes Eq. (4). T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6050
  • 5. ‰ …xy†PNij P…ok†P…Xxyjok†r ‰ …xy†PNij P…oh†P…Xxyjoh†, Vh Tˆ k …4† The decision rule, de®ned above, involves maximizing the probability of the classi®cation of an entire neighborhood. The MRF-Gibbs equivalence shows that Eq. (4) can be written in a Gibbsian form P(Xij,ok), P…Xij,ok† ˆ 1 Z eÀU…Xij,ok†aT …5† U…Xij,ok† ˆ À ˆ …x,y†PNij log P…Xxyjok† ‡ U…ok† …6† and that this Gibbsian equivalent can be evaluated using an energy function, U(Xij,ok) used here, a normalizing constant Z, and an arti®cial temperature T. Proof of the equivalence between the Gibbs and MRF representations is given in Refs. [9, 10]. The MRF-Gibbs classi®cation process therefore consists of assigning the pixels of an image those class values that produce a maximum posterior distribution when the energy is minimized. The description above has concentrated on the local expression of the MRF-Gibbs model for classi®cation. Evidently, energy minimization cannot proceed on a purely local basis; instead, we consider the maximum a posteriori (MAP) estimate of the image's class structure as follows. Suppose there are L classes in an n  n image. A con®guration z $ C is an enumeration of the classes attached to each point of the image. In the case described, cardinality card{C} = Ln  n . Let W be the (random) observation of a con®guration, and let X be the (random) observation of the image data; then the MAP estimate is the estimate [8, 9]. WMAP ˆ arg maxzPCfP…W ˆ zjX ˆ x†g …7† However, the assignment of the pixels cannot be performed pixel by pixel but instead represents a dynamic programming problem. One can employ simulated annealing to migrate towards an approximate solution to this maximization problem. Simulated annealing is an iterative optimization technique which attempts to minimize an energy function through random excitation [11]. It can be utilized to reduce the overall computation of locating an ``ideal'' classi®cation to a more reasonable amount. Here ``ideal'' now refers to an approximate solution to the classi®cation problem, which may or may not be the true classi®cation solution. Typically, the procedure terminates after a ®xed number of iterations of the annealing process. In the MRF-Gibbs classi®cation approach, simulated annealing functions by periodically re- classifying a pixel to a ``worst'' classi®cation, i.e. a classi®cation that actually produces a higher energy potential. It is this occasional acceptance of a ``worst'' classi®cation that allows the algorithm to jump out of a local minimum in search of a global minimum, consequently producing an improved classi®ed image. The decision rule for the acceptance of a ``worst'' classi®cation is governed by a temperature and a cooling schedule. Numerous cooling schedules have been developed for simulated annealing [11], however it appears that the best schedule is usually determined through ad hoc trial and error. In our T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 51
  • 6. case, a cooling schedule T(t), (Eq. (8)) T…t† ˆ T0 log…t ‡ 1† …8† where T0 is the initial temperature is employed [12]. The initial temperature T0 was determined from the ®rst iteration of the algorithm, which evaluated the image by summing the total energy encountered (increase in energy) during an attempted classi®cation of the pixels to a new classi®cation with higher energy. The gradual cooling of the temperature produced by the schedule ideally allows the algorithm to settle in the global minimum. With the employment of simulated annealing to ®nd the minimal energy produced by Eq. (6), the remaining complexity of the MRF-Gibbs approach involves determining a useful energy function. The derivation of an energy function typically involves understanding the nature of the image, identifying a meaningful neighborhood shape (and size), and describing the unique attributes of the image regions, or structures. Numerous models have been developed to describe these di€ering attributes in terms of relationships of the pixels of the neighborhoods, contained in the regions or structures. As result, many di€erent energy functions have been developed which attempt to evaluate the likelihood or quality of a neighborhood relative to a set of attribute quali®ers [10]. For our energy function, U…Xij,ok† ˆ À ˆ …x,y†PNij log P…Xxyjok† ‡ ˆ …x,y†PNij log P…o…x,y†† ‡ ˆ …x,y†PNij log P…x,y†,…x‡1,y†…o…x,y†,o…x‡1,y†† ‡ ˆ …x,y†PNij log P…x,y†,…x,y‡1†…o…x,y†,o…x,y‡1†† …9† we utilized the conditional probability of each neighbor in Nij given the class ok, the probability of the classi®cations of neighbors, and the transitional probabilities associated with the spatial con®guration of the neighborhood [13]. This spatial likelihood encompasses both the horizontal and vertical transitions of the neighborhood. Since our neighborhood consisted of the four-pixel stencil mentioned earlier, the horizontal relationship consisted of evaluating the transitional probability of the west pixel to pixel Xij and Xij to the east pixel. Likewise the vertical relationship consisted of the vertical transitional probability of the north pixel to Xij and Xij to the south pixel. It should be noted that a transitional probability of ok to oj is the likelihood, or probability, that ok, is followed by oj in the direction of the transition type. Simulated annealing is employed in conjunction with Eq. (9), as de®ned by the following series of steps. 1. Randomly select a new class for the pixel xij. 2. Evaluate the new classi®cation in terms of the neighborhood relationships and the likelihood of the neighborhood via the energy function (Eq. (9)), DU. 3. If the re-classi®cation produces a lower energy then the current classi®cation then accept it. 4. Else accept it conditionally based upon an annealing schedule derived probability Ai>r. The probability Ai, Ai ˆ eÀDUaT…t† …10† T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6052
  • 7. is determined based upon the magnitude of the increase in energy DU from the current pixel classi®cation to the new one, and the annealing schedule T(t). The acceptance guide r is de®ned as a uniform random variable between 0 and 1. The series of steps de®ned in the re-classi®cation rule were applied to each pixel of the image in succession. Upon the completion, the temperature controlling the annealing schedule was updated and the next iteration of the algorithm was begun. The number of iterations required to produce the ``ideal'' classi®cation is not predictable. A classi®cation resulting from a context-independent classi®cation approach (K nearest neighbors) was used to estimate the transitional probabilities, the probabilities of the classes, the class statistics (means and covariances) and as an initial classi®cation of the image. 3. Implementation 3.1. Parallel design description The parallelization of the MRF-Gibbs algorithm was approached through domain decomposition in which both data and computation was partitioned. This led to the principal employment of a single program multiple data (SPMD) design, in which n child processes (n = 1 to 7) cooperatively work on di€erent portions of an image (W wide by H high) under the direction of a single parent process. The principal responsibilities of the parent process are the partitioning and distributing of the image data amongst the children, performing collective calculations, and communicating important global classi®cation information. Child processes are tasked with performing the required image classi®cation operations on their image portion. Information used by the child processes, obtained from the parent process, consists primarily of image data and class related information, while information obtained from neighboring child processes consists of the conditional probabilities of the image partition edges and the classi®ed values associated with the same edges. This edge information is required in order to generate the maximum number of complete neighborhoods utilized in the MRF-Gibbs classi®cation process. Evidently, the outcomes of the serial and parallel algorithms are identical. Each child process is given an approximately equal size portion of the image to work on. This image section consists of a rectangular segment of the image, which was Wan in width (the last process receives extra columns when required by the magnitude of n) with a height of H. Therefore the partitioning of the data is basically performed by mapping the columns of the image to particular child processes. Image data (both raw image data and initial classi®ed data) is sent by the parent to the children in row by row fashion. Hence, each image message consisted of a Wan image pixel values with a total of 2nH messages sent from the parent to the children. In addition, a total of nH number of messages is sent from the children to the parent at the end of a classi®cation, to return the classi®ed portions of the image to the parent for re-uni®cation. Each process, child and parent, were mapped to a separate processor of the cluster. This was chosen in attempt to both maximize processor utilization and load balance the overall system. In addition, the principal communications required between child processes consists of the T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 53
  • 8. exchanging of classi®ed edge data at the beginning of each iteration. During a given MRF- Gibbs iteration, each child can operate simply on their image portion without interference by neighboring processes. Both the parent and child processes exhibit synchronous communications in that parent and children will wait for expected information to arrive. The MRF-Gibbs parallel application was designed and implemented with respect to scalability, future software modi®cation, and software reuse. Both the child and parent processes were created to function without concern for either the number of processors or processes employed. A high-level communications interface (C + + class) was designed and implemented to abstract the actual algorithm above the communications mechanism utilized, therefore supporting the rapid employment of other forms of underlying inter-process communications. This class also provided overloaded operations, which abstracted the development of the algorithm above the complexities of the communication of various data types, thus further simplifying code development. 3.2. Image dataset The MRF-Gibbs algorithm was customized to improve its performance due to the images used and the parallel environment employed. Sample images from the visible human project female dataset were used for all classi®cations. The raw images have dimensions of 2048 Â 1216 Â 3, but are reduced by background elimination to 1300 Â 700 Â 3. 3.3. Algorithm realization The implementation of the MRF-Gibbs algorithm was approached through a series of phases. The phases were data loading, transitional probability determination, class conditional probability construction, and MRF-Gibbs re-classi®cation iteration. The ®rst phase, data loading, consists of loading in the raw image data, the context- independent classi®ed version of the image, and the statistics of the classes determined from the context-independent approach. The context-independent classi®ed image contained a single classi®ed value (0 to L À 1 classes) for each pixel. The class statistics consists of a class mean g, covariance matrix Sk and probability Pk for each class. The next phase consists of determining the transitional probabilities of the classes de®ned in the context-independent classi®cation of the image. The class conditional probability construction phase followed the transitional probability phase. This phase consists of determining the class conditional probability of each class for each pixel of the image. The class conditional probability was modeled as a maximum likelihood estimation posterior probability, which is the posterior probability of a class ok given a pixel value X. Eq. (11) de®nes the class conditional probability P(okvXij) of class ok, of L total classes based upon the probability of Xij given class ok, which is identi®ed as the conditional probability p(Xijvok), and the overall probability of class ok, P(ok). T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6054
  • 9. P…okjXij † ˆ P…ok†P…Xijjok† ˆL zˆ1 P…oz†P…Xijjoz† …11† In Eq. (11), the conditional probability p(Xijvok) is modeled as a normal distribution using the class mean and covariance matrix de®ned in the class statistics. Overall this phase consists of calculating and storing W Â H Â L conditional probabilities, where W, H, L refers to the image width, image height and number of classes of the non-contextually classi®ed image, respectively. The ®nal phase, MRF-Gibbs iteration, involves the actual re-classifying of the pixels through a series of re-classi®cation iterations. During the ®rst iteration the initial temperature, T0 is set based upon the total energy encountered during the attempted re-classi®cation of all pixels. During this initial iteration no pixels are re-classi®ed which produced a ``worst'' classi®cation, but instead the increase in energy encountered during an attempted re-classi®cation of a pixel is accumulated, and used to set T0. During the second and subsequent iterations this phase attempts to re-classify each pixel to a random class value. The energy of each new classi®ed pixel value is evaluated and accepted based upon the guidelines de®ned in the re-classi®cation rules for MRF-Gibbs. This phase iterates through the MRF-Gibbs re-classi®cation algorithm for a ®xed number of times. At the end of each iteration, the temperature is adjusted in accordance with the annealing schedule de®ned earlier. Overall the implementation of the MRF-Gibbs algorithm can be examined as a series of steps. These steps can be separated and listed as the following. 1. Load in the raw image, the class statistics, and a previously classi®ed image. 2. Determine the transitional probabilities from the previously classi®ed image. 3. Determine the class conditional probability for each class at each pixel. 4. Evaluate the raw image and set the initial temperature. 5. For each pixel in the image choose a new class value at random, and evaluate the change in energy resulting from the energy function (using class probabilities, conditional probabilities, and transitional probabilities). Accept the new pixel classi®cation if the resulting energy is lower, else accept it conditionally base upon an annealing probability. 6. Once all the pixels have been attempted to be re-classi®ed, then lower the temperature based upon the annealing schedule. 7. Go to step 5 until the desired number of iterations has been reached. In a typical image classi®cation the majority of the execution time of the program is spent in either the conditional probability construction phase (phase 3) or in the re-classi®cation phase (phase 5). The other phases do not constitute a signi®cant amount of computation. Fig. 1 represents a graphical view of the parallel version of the MRF-Gibbs algorithm. Steps are 1, 8, and 10 of the MRG-Gibbs parallel implementation are considered to be principally communication steps. Steps 1 and 10 occur only once during an image classi®cation, while the classi®ed edge exchanging step (step 8) will occur during each iteration of the MRF-Gibbs algorithm. T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 55
  • 10. Fig. 1. Depiction of parallel version of the MRF-Gibbs algorithm. T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6056
  • 11. Hence, step 8 is perhaps the true communications cost of the parallelized version of the algorithm, since it is the only communications that results from various numbers of iterations. The other steps (other then 1, 8, and 10) which require communications represent relatively insigni®cant communication amounts and also only occur once per image classi®cation. In Figs. 2 and 3, Ixy represents the image partition and Cxy represents the classi®ed partition of the same image segment. The computationally most intense step is the determination of the class conditional probabilities (step 5), though this step is only required to be performed once per image classi®cation. In step 5 the class conditionals are computed once and stored for re-use during the MRF-Gibbs classi®cation phase (steps 8±9). Fig. 2. Original image after background removal. Fig. 3. Classi®ed image produced by MRF-Gibbs algorithm. T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 57
  • 12. Once the program entered the MRF-Gibbs classi®cation phase (steps 8±9), most of time is spent in the evaluation of random class values at each pixel in the image (step 9). This step requires a fairly large amount of computation due primarily to the evaluation of the energy function using both the current pixel classi®cation value and the randomly selected value. Note that the current energy at pixel Xij would not be accurate if the classi®cation of pixel Xij + 1 changes during the attempted re-classi®cation of pixel Xij + 1, hence the energy potentials of the neighborhoods change continuously as neighbors, are subsequently re-classi®ed. At the end of the required number of iterations each child process returns the classi®ed partitions of the image to the parent (step 10), who combines the image sections into a ®nal classi®ed version of the original image. This signals the completion of the classi®cation process. 4. Performance results Table 1 shows the performance of the algorithm for varying number of child processes in conjunction with a single parent process. The number of iterations was ®xed at 200 with 6 classes for all trials. Both MPI and Paradise, as indicated in the table, were utilized for all inter-process communications. The information exchanged consisted of the class statistics, the raw image data and classi®ed partitions of the images, along with the class conditional and classi®ed image edge values. The table lists speedup relative to a single sequential process and the timeline employed started with the distribution of the image data and ended with the returning to and combining of all classi®ed image sections by the parent. 5. Experimental framework All of the parallelization experiments were performed on the University of Connecticut CoW (Cluster of Workstations) facility located inside the Image Processing Laboratory (IPL), Booth Research Center. The machines contained in the cluster were identical and consisted of 32 MB Table 1 Speedupa of MRF-Gibbs resulting from using MPI and Paradise Number of Child Processes Speedup with MPI Speedup with Paradise 1 0.14 0.21 2 1.32 1.86 3 2.39 2.79 4 2.37 3.67 5 2.43 4.47 6 2.65 5.15 7 2.56 6.02 a Speedup generated from estimated single process performance. T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6058
  • 13. 166 MHz Intel Pentium based personal computers executing the Windows NT 3.5.1 operating system, connected together via an IBM ATM 25.5 MBps Switch. The ATM switch was con®gured to provide LAN Emulation support. All software was written in C + + using the Visual C + + software development environment. Version 0.9 of MPI was used. 6. Conclusion Automatic image classi®cation is an important and computationally challenging task. We have taken the MRF-Gibbs algorithm and have parallelized it using both message passing and virtual shared memory based communications. We demonstrated that the algorithm can be parallelized and implemented on a common cluster using standard inter-process communications. Our results show the following. 1. Classi®cation algorithms can be e€ectively parallelized by partitioning the image between cooperating child processes and then reconciling the local attributes of the image partitions. This coarse grain partitioning approach could perhaps be employed for other classi®cation algorithms, either context-dependent or context-independent. 2. Both MPI and Paradise generate an overhead but can still be e€ectively used if the algorithm computation burden is large enough to compensate for additional communications cost. For relatively computational simple algorithms, such as nearest mean, communications costs may greatly limit the speedup experienced. 3. A small cluster can be con®gured and utilized for the classi®cation process. However, the real-time acquisition and display requirements of future medical image visualization systems are currently far greater then can be realistically supported with current small scale systems. 4. In general, Paradise far out performed MPI for the MRF-Gibbs application developed under the hardware and software suite utilized. In summary, the parallelization of the MRF-Gibbs algorithm demonstrated that statistically based classi®cation algorithms can be parallelized using a coarse grain approach based upon partitioning the image data. Partitioning the data does require that statistics associated with the image partitions be reconciled. However, the amount of data required to be communicated is typically quite small and can be performed using standard communication mechanisms. It should be noted that both MPI and Paradise do generate a processing overhead. The impact of these overheads could potentially be reduced by increasing the processing capacity of the cluster, or perhaps by employing a lower level communication level then TCP/IP for MPI or Paradise. In addition, both MPI and Paradise provide an e€ective application programming interface for the classi®cation problem domain. Acknowledgements This work was funded by State of Connecticut under its Critical Technologies Program. We also wish to gratefully acknowledge Scienti®c Computing Associates, New Haven, CT for the T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±60 59
  • 14. delineation of the Paradise Software for the duration of the project and Dr Michael J. Ackerman, Visible Human Project, National Library of Medicine for providing the Visible Human Female dataset. References [1] SPECfp_rate95, Spec Benchmarks, http://open.specbench.org/osg/cpu95/results/rfp95.html. [2] Snir, M., Otto, S., Huss-Lederman, S., Walker, D. and Dongarra, J., MPI The Complete Reference. MIT Press, Cambridge, MA, 1996. [3] Gropp, W., Lusk, E. and Skjellum, A., Using MPI Portable Parallel Programming with the Message-Passing Interface. MIT Press, Cambridge, MA, 1996. [4] Amza, C. and Cox, A., TreadMarks: shared memory computing on networks of workstations, IEEE Computer, February, 1996, 18±28. [5] Visible Human Dataset, National Library of Medicine, National Institutes of Health, Bethesda, MD. [6] Greenshields, I. R. and Chun, J., Simulation of ¯uid ¯ow through biological inhomogeneous elastic tubes, Proc. of IASTED Int. Conference on Modeling and Simulation, Pittsburgh, PA, 1993, pp. 66±68. [7] Greenshields, I. R., Peters, T. J. and Chun, J., Parallel strategies in the reconstruction of surfaces from contour data, Proc. of 4th ISMM Conf. on Parallel and Distributed Computing Systems, 1991, pp. 355±358. [8] Fukunaga, K., Statistical Pattern Recognition, 2nd edition. Academic Press, San Diego, CA, 1990. [9] Li, S. Z., Markov Random Field Modeling in Computer Vision. Springer, Berlin, 1995. [10] Winkler, G., Image Analysis, Random Fields and Dynamic Monte Carlo Methods. Springer, Berlin, 1991. [11] L. Ingber, Simulated annealing: practice versus theory, Mathematical and Computer Modeling 18 (11) (1993) 29±57. [12] Geman, S. and Gemen, D., Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. on Pattern Analysis and Machine Intelligence, 1984, PAMI-8(6), 721±741. [13] Zhu, J. and Greenshields, I., Classi®cation of multiecho magnetic resonance images of brain using MRG-Gibbs classi®er, Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 1993. Thomas A. Daggett is a Ph.D. candidate at the University of Connecticut. He received a BSE in computer engineering from the University of Connecticut in 1985 and his MS in computer science from Rennselaer Polytechnic Institute in 1989. He is currently serving as an operability and display engineer in the Advanced Display Research Facility (ADRF) at the Naval Undersea Warfare Center (NUWC) division Newport, Newport Rhode Island. His research interests lie in parallel/distributed processing, image processing, small-scale cluster computing, and advanced display technologies. Ian Greenshields is an Associate Professor of Computer Science and Engineering at the University of Connecticut. His research interests lie in the areas of biomedical computing, biomedical image processing and biodynamical modeling. T. Daggett, I.R. Greenshields / Computers in Biology and Medicine 28 (1998) 47±6060