IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999 1299
A New Self-Routing Multicast Network
Yuanyuan Yang, Senior Member, IEEE, and Jianchao Wang, Member, IEEE
AbstractÐIn this paper, we propose a design for a new self-routing multicast network which can realize arbitrary multicast
assignments between its inputs and outputs without any blocking. The network design uses a recursive decomposition approach and is
based on the binary radix sorting concept. All functional components of the network are reverse banyan networks. Specifically, the new
multicast network is recursively constructed by cascading a binary splitting network and two half-size multicast networks. The binary
splitting network, in turn, consists of two recursively constructed reverse banyan networks. The first reverse banyan network serves as
a scatter network and the second reverse banyan network serves as a quasisorting network. The advantage of this approach is to
provide a way to self-route multicast assignments through the network and a possibility to reuse part of network to reduce the network
cost. The new multicast network we design is compared favorably with the previously proposed multicast networks. It uses y n logP n
logic gates, and has y logP n depth and y logP n routing time where the unit of time is a gate delay. By reusing part of the network,
the feedback implementation of the network can further reduce the network cost to y n log n.
Index TermsÐMulticast network, self-routing, binary radix sorting network, reverse banyan network, compact routing, recursive
M ULTICAST or one-to-many communication is one of the
most important collective communication operations
time for any k, I k log n.1 Since the routing algorithm in
 relies on a cube or a perfect shuffle connected parallel
 and is highly demanded in parallel and distributed computer consisting of y knIk processors, as mentioned
applications as well as in other communication environ- in , the routing process actually takes y k logP n gate
ments. For example, multicast is required to make updates delays. Lee and Oruc  designed a multicast network with
in replicated and distributed databases and in commonly a special built-in routing circuit. Their network uses
used parallel algorithms such as matrix multiplication and y n logP n logic gates, and has y logP n depth and
Fast Fourier Transform (FFT). Multiprocessor systems also y logQ n routing time where the unit of time is a gate delay.
require multicast for barrier synchronization and massage Another notable feature in our multicast network design
is to adopt a self-routing scheme. Self-routing is a promising
passing, and multicast is a critical operation for video/
routing scheme which usually renders a network with
teleconference calls, video-on-demands services and dis-
faster switch setting and lower hardware complexity.
tance learning in a telecommunication environment.
However, most of self-routing network designs described
Clearly, providing multicast support at hardware/inter- in the literature are for permutation networks, for example,
connection network level is the most efficient way support- , , , . In a recent work, Cheng and Chen 
ing such communication operations, , . A switching designed a new self-routing permutation network con-
network that can realize every multicast assignment structed by reverse banyan networks.
between its inputs and outputs over edge-disjoint trees is In this paper, we propose a design for a new self-routing
referred to as a multicast network. This type of network has multicast network, which is also based on the reverse
been investigated by several researchers in the literature , banyan networks. We will explore the properties of a
, , , , , . reverse banyan network so that it can be used to handle
In this paper, we design a new type of multicast network arbitrary multicast connections in a self-routing manner.
using an approach based on recursive decompositions of Different from earlier proposed multicast networks, the
multicast networks. This approach was first introduced by new multicast network is conceptually simple and has good
Nassimi and Sahni  and was later adopted by Lee and modularity. The network design is based on the binary
Oruc  in their design of a multicast network. The n Â n
Ë radix sorting concept and all functional components of the
multicast network proposed in  uses y knIk log n P Â P network are recursively constructed reverse banyan net-
switches and has y k log n depth and y k log n routing works. Therefore, it has a potential to greatly reduce the
network cost by reusing part of the network. The new
multicast network we design uses y n logP n logic gates,
. Y. Yang is with the Department of Electrical and Computer Engineering, and has y logP n depth and y logP n routing time where
State University of New York, at Stony Brook, Stony Brook, NY 11794.
the unit of time is a gate delay. Moreover, by reusing part of
. J. Wang is with GTE Laboratories, 40 Sylvan Road, Waltham, MA 02454. the network, the feedback version of our design can reduce
E-mail: email@example.com. the network cost to y n log n.
Manuscript received 3 Sept., 1997.
For information on obtaining reprints of this article, please send e-mail to: 1. k is a network parameter which equals log m , and (IY m)-generator and
firstname.lastname@example.org, and reference IEEECS Log Number 105737. nY m-concentrator are components of the network.
1045-9219/99/$10.00 ß 1999 IEEE
1300 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
We may also represent this multicast assignment in a
V V W W
`& ' ` HII a & 'a
0Y IHH Y fHIHgY 0Y 0Y 0Y X
X HHI X Y IIH Y
Clearly, a permutation assignment is a special case of a
multicast assignment where each si has, at most, one
In this paper, we design a multicast network based on
binary radix sorting concept and refer to it as a binary radix
Fig. 1. The construction of an n Â n binary radix sorting multicast sorting multicast network (BRSMN). An n Â n BRSMN can be
network (BRSMN). recursively constructed by an n Â n binary splitting network
(BSN) followed by two n Â n BRSMNs which are directly
The rest of the paper is organized as follows: Section 2 linked to the upper half and the lower half of the outputs of
introduces the binary radix sorting concept and gives the n Â n BSN, respectively. The construction is shown in
the recursive definition of the new multicast network Fig. 1. For an input i, depending on its destination set we
based on it. Section 3 describes a key component of the have four possible cases:
new multicast network, the binary splitting network.
Section 4 discusses the basic building block of the binary . Case 1: If all elements in its destination set si are in
splitting network, the reverse banyan network. Section 5 the upper half of the network outputs (i.e., the most
further explores the properties of the reverse banyan significant bit of every binary address in si is H),
network as a scatter network and as a quasisorting net- there will be a single connection from input i via the
n Â n BSN to an input of the upper n Â n BRSMN
work. Section 6 presents the distributed self-routing P P
with the same destination set si .
algorithms, and Section 7 discusses the implementation
. Case 2: If all elements in si are in the lower half of the
and complexity issues. Section 8 concludes the paper.
network outputs (i.e., the most significant bit of
Finally, Appendices A, B, and C give some detailed proofs every binary address in si is I), there will be a single
and descriptions of the routing algorithms. connection from input i via the BSN to an input of
the lower n Â n BRSMN with the same destination
2 MULTICAST NETWORKS BASED oN BINARY set si .
RADIX SORTING . Case 3: If some elements in si go to the upper half
and other elements in si go to the lower half (i.e., the
We consider an n Â n interconnection network with n most significant bits of the addresses in si contain
inputs and n outputs where n Pm . Clearly, each input or both H and I), there will be two connections from
output address can be expressed as an m-bit binary number input i via the BSN. One goes to the upper n Â n P P
H I F F F mÀI . For a multicast connection from network input BRSMN, and another goes to the lower n Â n P P
i (H i n À I) to a subset of network outputs, let si denote BRSMN. The original destination set si will be split
the subset of the outputs that input i is connected to. si is into two subsets which form the destination sets of
referred to as the destination set of the multicast connection, the corresponding inputs of the upper and the lower
or simply the destination set of input i. Then, a multicast P Â P BRSMNs, respectively.
assignment can be expressed as a set fsH Y sI Y F F F Y snÀI g . Case 4: The destination set is empty (i.e., the input
where, si sj 0 for i T j and nÀI si fHY IY F F F Y n À Ig.
carries no message).
For example, the following is a multicast assignment of Then, for an n Â n BRSMN, we also have four cases for
an V Â V network: ffHY IgY 0Y fQY RY UgY fPgY 0Y 0Y 0Y fSY TggX each input, but this time we need to check the second most
Fig. 2. A routing example for a multicast assignment in an V Â V BRSMN.
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1301
Fig. 3. Legal operations on the four values in a P Â P switch, where xY y P fHY IY Y g. (a)Parallel. (b) Crossing. (c) Upper broadcast. (d) Lower
significant bit of the binary addresses in the corresponding unicast with no value changed, and the operations in Fig. 3c
destination set, and so on. Finally, for a P Â P BRSMN (i.e., a and Fig. 3d are broadcast with values and on the inputs
P Â P switch), realizing a multicast or a unicast connection is changed to H and I on the outputs.
straightforward. Fig. 2 shows the routing for the multicast In an n Â n BSN using the four value routing tags, let nH ,
assignment in the example mentioned earlier in an V Â V nI , n , and n denote the numbers of inputs with value H, I,
binary radix sorting multicast network. From Fig. 2, we can , , respectively. Clearly, they satisfy some constraints.
also see that n Â n BRSMN is constructed by an n Â n BSN First, we have
(level 1), followed by P n Â n BSN's (level 2), then followed
by PP PP Â PP BSN's (level 3), F F F , and finally followed by n
n n nH nI n n nX I
P Â P switches (level log n). For more discussions on this Since at most a half of the outputs are in the upper half of
type of multicast network structure, readers may also refer the network outputs, we must have
Clearly, the problem of constructing a binary radix n n
n H n nd nI n X P
sorting multicast network (BRSMN) is now transformed to P P
the problem of designing a binary splitting network (BSN) Also, it is interesting to note that
n n Y Q
3 THE BINARY SPLITTING NETWORK which can be derived from (1) and (2).
Now, the function of a BSN is to transform the input tags
As described in Section 2, the function of the binary
Hs, Is, s, and s to the output side such that all s are
splitting network is to split (when necessary) the multicast
eliminated, all Hs are in the upper half of outputs, all Is are
connection on each input based on whether each output in
the destination set of the multicast connection belongs to the in the lower half of outputs. The numbers of the different
upper half or the lower half of the network outputs, and tags on the outputs of the BSN should have the following
pass the (possibly split) multicast connections properly to
relations: Let nH , nI , n , and n denote the numbers of
the upper or lower n Â n BRSMNs following it. To simplify outputs with value H, I, , and , respectively. Since any
the routing in a BSN, we use a routing tag with four values paired with one will be transformed to a pair of H and I by
for each link: H, I, , and , which correspond to Cases 1, 2, switch broadcast operations, we must have
3, and 4 described in the previous section, respectively, and
nH nH n Y nI nI n Y n n À n Y nd n HX R
are determined by the ith most significant bit of the
multicast destination set on the inputs of BSNs at level i. Having described the function of a BSN, we now
The operations on four values in a P Â P switch are consider the construction of the BSN. An n Â n BSN can
simply an extension to those on only two values H and I. be constructed by cascading two n Â n networks as shown
Fig. 3 shows all legal operations on four values in a P Â P in Fig. 4a. The first network is referred to as a scatter network
switch, where, the operations in Fig. 3a and Fig. 3b are which scatters all s to Hs and Is. In other words, the
Fig. 4. Tags scattered in the first subnetwork and then quasisorted in the second subnetwork. (a) The construction of a binary splitting network
(BSN). (b) An example of routing in a BSN.
1302 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
Cheng and Chen's permutation network . An n Â n
RBN is recursively constructed by two n Â n RBNs followed
by an n Â n merging network. An n Â n merging network
consists of one stage of n P Â P switches, such that both the
input and the output links of the stage are connected
according to the perfect shuffle interconnection function
. The merging network merges the outputs of the upper
and the lower n Â n RBNs to form the outputs of the n Â n
RBN. The recursive construction of an n Â n RBN, and an
n Â n merging network are shown in Fig. 5.
To facilitate the discussions on the functions of an RBN
as a scatter network and as a quasisorting network, we need
to introduce the following notations.
Suppose an n-bit sequence of two symbols, say, and ,
is to be applied in parallel to the n inputs of a reverse
Fig. 5. The recursive definition of an n Â n RBN. The dashed box is an banyan network. We define an n-bit circular compact sequence
n Â n merging network. of two symbols s and s as follows:
transformation from the inputs to the outputs of the scatter n s l nÀsÀl if s l n
network is lÀns nÀl nÀs if s l b nY
fHY IY Y gÃ A fHY IY gÃ X where H s ` n and H l n. The real meaning of gsYlYY is
that, in an n-bit sequence all l -bits are compacted together
The second network is referred to as a quasisorting followed by also compacted n À l -bits in a circular way
network which transfers all Hs to the upper half of the (modulo n), and s is the starting position for the -bit
network outputs, and all Is to the lower half of the network sequence. This concept is very important in this paper. In
outputs, while each of s may go to either the upper half or fact, the key results of this paper pertain to that under what
the lower half. It is easy to see that the network constructed conditions two circular compact sequences can be merged
by cascading a scatter network and quasisorting network into a longer circular compact sequence. For example, the
can perform the functions of a BSN. special circular compact sequence gnYnYHYI HP IP is what
In Fig. 4b, we show how input tags in a BSN are scattered we expect after sorting tags 0s and 1s in a bit sorting
in the first subnetwork and then quasisorted in the second network. In a later section, we will see that circular compact
subnetwork. sequences will be used in merging and eliminating tags s
In this paper, we use a reverse banyan network to and s in a scatter network.
implement both the scatter network and the quasisorting Cheng and Chen  considered the circular compact
network in a BSN. As can be seen later, this implementation sequence of Hs and Is in their permutation network design,
of a BSN not only leads to a faster self-routing multicast and found an interesting property which was used in their
network but also provides a possibility to reuse part of the bit-sorting network. Since this property is also very useful
network to further reduce the network cost. In the following for the designs of the scatter network and quasisorting
sections, we will mainly discuss how a reverse banyan network, we state it below in Theorem 1 in a more general
network can perform the functions of a scatter network and form, and provide a different, much easier to understand
a quasisorting network. proof for it. As can be seen later, the bit sorting network
directly related to Theorem 1 will be used in the
quasi-sorting network and the new technique used in the
4 THE REVERSE BANYAN NETWORK proof (i.e., Lemma 1) and some observations will be applied
The reverse banyan network (RBN) used in our design is a to the design of the scatter network.
variant of reverse banyan network, which was also used in Theorem 1. For any - values on the inputs of an RBN, a
circular compact sequence with any starting position can be
achieved at the outputs of the RBN under a proper setting for
switches in the network.
Suppose the inputs of the n Â n RBN consist of l s and
n À l s, among which the upper half n inputs contain lH s
and the lower half n inputs contain lI s, where lH lI l.
Assume Theorem 1 holds for an n Â n RBN. We will prove
that it also holds for an n Â n RBN by giving a positive
answer to the following question.
Question 1. Given integers n, s, l, lH , and lI satisfying that n is
Fig. 6. The connections through a switch in a merging network, where an even number, H s ` n, H l n, H lH Y lI n , and P
exhnge . l lH lI , do there exist integers sH and sI , H sH Y sI ` n ,
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1303
such that gsH YlH YY and gsI YlI YY can be merged to gsYlYY
through an n Â n merging network (as defined in this
section) under a proper switch setting (in a one-to-one
Before we answer this question, we first review some
observations on the shuffle/exchange interconnection func-
tions. Consider an n Â n merging network. Recall that the
inputs (outputs) of the merging network are linked to
switches according to the perfect shuffle function. Let be a
binary address of an input of a switch (with address ) in
the network. Then exchange denoted as is the other
input of the switch. The inputs of the merging network with
addresses shuffle and shuffle are linked to inputs and
Fig. 7. Four different switch settings for a P Â P switch in an n Â n
of the switch, respectively, and the outputs and of the merging network. (a) Parallel. (b) Crossing. (c) Upper broadcast. (d)
switch are linked to the outputs of the merging network Lower broadcast.
with addresses shuffle and shuffle , respectively. (See
Fig. 6, and the definition of shuffle/exchange functions in in addition to switch settings ri H or I, let the switch
.) We can see that connections are only possible between setting ri P if the switch is set to upper broadcast, and
the inputs and the outputs of a merging network with ri Q if the switch is set to lower broadcast. We can extend
addresses shuffle and shuffle . Since only one-to-one
the binary circular compact switch setting to trinary. We
connections are considered here, the switch has only two explain the extension by the following example. nA trinary
settings: parallel and crossing. The parallel setting circular compact sequence of switch setting sYlI YlP YI YP YQ
corresponds to the connections from input shuffle to means lI consecutive P s followed by lP consecutive Q s and
output shuffle and from input shuffle to output
then followed by n À lI À lP I s in a circular way, with s
shuffle (see Fig. 7a); while the crossing setting corre-
being the starting position of the P s sequence.
sponds to the connections from input shuffle to output
The following lemma gives a positive answer to
shuffle and from input shuffle to output shuffle (see
Fig. 7b). Note that jshuffle À shuffle j n , that is,
two inputs n apart in their addresses can be connected to
Lemma 1. Given integers n, s, l, lH , and lI satisfying that n is an
outputs with the same addresses in a parallel or crossing even number, H s ` n, H l n, H lH Y lI n , and P
way as shown in Fig. 7a and Fig. 7b, respectively. l lH lI . Let
Let ri denote the setting for switch i. Let ri H if the
switch is set to parallel, and ri I if the switch is set to sH s mod Y sI s lH mod
crossing. Thus, the switch setting of the n switches in the
merging network is an n -bit sequence of Hs and Is. Similar and the switch setting of an n Â n merging network be
to n definition of circular compact sequence in (5), we use
the HYs YY , where
sYlYHYI to specify the compact switch setting of n switches in
an n Â n merging network, where l consecutive switches s lH div mod PY
have setting I with starting position s, and the rest of n n
and I À mod P. Then, gsH YlH YY and gsI YlI YY can be
switches have setting H in a circular way. In the case of a
switch can also be set to upper broadcast and lower merged to gsYlYY through the n Â n merging network under
broadcast as shown in Fig. 3c and Fig. 3d, for any switch i, the above switch setting.
Fig. 8. The binary tree structure of an RBN. (a) The forward and backward phases in the binary tree. (b) The forward and backward trees embedded
in a reverse banyan network.
1304 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
Proof. See Appendix A. u
t the switch have values 1s. By exploring the properties of the
circular compact sequence, we can obtain the following
Lemma 1 is actually the inductive step of the proof of general results for an n Â n RBN with any numbers of s
Theorem 1, and it is easy to check that for n P Theorem 1 and s on its inputs. This result will be used in the proof of
holds. Hence, Theorem 1 is proved. Theorem 2.
Note that for a full permutation assignment, only two tag Theorem 3. For any values on the inputs of an n Â n RBN, let
values H and I are used. Let be H, be I, and the initial n , and n be the numbers of inputs with value and ,
starting position for an n Â n RBN be s n . Clearly the total
P respectively. Clearly, n n n, and the rest of the inputs of
number of Is in this case is l n . By Theorem 1, the circular
P n the RBN have values 1s. Let s be any integer such that
compact sequence gsYlYHYI HP IP , which represents the
H s ` n. Then,
function of bit sorting (in an ascending order), can be
achieved by a proper switch setting. 1. n
if n n , a circular compact sequence gsYn Àn Y1Y
with any starting position s can be achieved at the
outputs of the RBN under a proper setting for switches
5 CONSTRUCTING THE BINARY SPLITTING
in the network;
NETWORK BY THE RBNS 2. n
if n ! n , a circular compact sequence gsYn Àn Y1Y
Recall that a binary splitting network consists of a scatter with any starting position s can be achieved at the
network and a quasisorting network. In this section, we outputs of the RBN under a proper setting for switches
show how to use the same basic component network, a in the network.
reverse banyan network, to perform the function of a scatter In the above theorem, we say that (or ) is the dominating
network and the function of a quasisorting network, type among and if n n (or n ! n ).
We now further explore the property of the circular
5.1 The Reverse Banyan Network as a Scatter compact sequence in order to deal with multicast assign-
Network ments. In the case of merging two circular compact
In the last section, we considered only two tag values sequences with the same set of binary values, (for example,
(0 and 1) for a full permutation assignment in an RBN. merging two sequences with 1s and s, gsH YlH Y1Y and
Now, for a multicast assignment in an RBN, we must deal n
gsI YlI Y1Y ), we can simply use the results in Theorem 1 or
with four tag values H, I, , and . Recall that the function of
Lemma 1. However, in other cases, we may also need to
a scatter network is to split the s on its input into Hs and Is
merge two circular compact sequences with different sets of
on its output.
binary values (for example, merging a sequence with 1s and
The following theorem gives one of the main results of n n
this paper, which states that the function of a scatter s, gsH YlH Y1Y , and a sequence with 1s and s, gsI YlI Y1Y ). In this
network can be achieved under a proper switching setting. case, we need to add two more switch settings: upper
The proof of this theorem is given at the end of this broadcast and lower broadcast, and use the notation of the
subsection. trinary circular compact switch setting defined in Section 4.
Theorem 2. Given an n Â n RBN, which is used as the scatter To merge two compact sequences with different sets of
network of an n Â n BSN, with H, I, and values on the binary values, the following question must be answered.
inputs. Let nH , nI , n , and n be the numbers of inputs with
value H, I, , and , respectively. Then under a proper setting Question 2. Given integers n, s, l, lH , and lI satisfying that n is
for switches in the network, s can be eliminated at the outputs an even number, H s ` n, H l n, H lI lH n , and P
of the RBN, i.e. the outputs of the RBN have only values H, I, l lH À lI , do there exist integers sH and sI (H sH Y sI ` n )
and with n
such that gsH YlH Y1Y and gsI YlI Y1Y can be merged to gsYlY1Y
through an n Â n merging network under a proper switch
nH nH n Y nI nI n Y n n À n Y nd n HY
where nH , nI , n , and n are the numbers of outputs with value
H, I, , and , respectively, and satisfying that H nH Y nI n
The following lemma gives a positive answer to
and nH nI n n. Question 2.
Lemma 2. Given integers n, s, l, lH , and lI satisfying that n is an
It is worth pointing out that although n n holds
even number, H s ` n, H l n, H lI lH n , and
globally for the original n Â n RBN which is the scatter P
l lH À lI . Let
network of an n Â n BSN (see (3)), for the recursively
defined subnetwork nH Â nH RBN of the n Â n RBN, we may n n
sH s mod Y sI s l mod
have nH ! nH , where nH and nH are the numbers of inputs (of P P
this nH Â nH RBN) with values and , respectively. This is
and the switch setting be
because that s and s are distributed nonuniformly.
To simplify the problem, we can combine H and I into a 1. sPI YlI YHYP , if s l ` n ;
single value 1. A link has a value 1 if it has a single value H 2.
sPI YlI YnÀsI ÀlI YIYPYH , if s ` n and s l ! n ;
or I. If the two inputs of a P Â P switch have values and n P
respectively, can be scattered so that the two outputs of 3. sPI YlI YIYP , if s ! n and s l ` n;
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1305
4. sPI YlI YnÀsI ÀlI YHYPYI , if s ! n and s l ! n.
P 4. sPH YlH YnÀsH ÀlH YIYQYH , if s ! n and s l ! n,
n n n n
P P n
Then, gsH YlH Y1Y and gsI YlI Y1Y can be merged to gsYlY1Y through then, gsH YlH Y1Y and gsI YlI Y1Y can be merged to gsYlY1Y through
an n Â n merging network under the above switch setting. an n Â n merging network under the above switch setting.
Proof. See Appendix B. u
t Now we are in the position to prove Theorem 3 by using
Lemmas 1, 2 ,3, 4, and 5.
Question 2 has three other symmetric variants, which
Proof of Theorem 3. By induction on the network size n.
can be obtained by changing the condition in Lemma 2 to
lI ! lH and l lI À lH , and/or swapping for . The For n P (base case), the network is a P Â P switch. There
corresponding solutions to these three variants are also are six possible cases:
symmetric. Lemmas 3, 4, and 5 give such solutions. . Case 2.1. n n H. Let the switch set to
By swapping lH for lI in Lemma 2, we can obtain the parallel. Then both outputs of the switch have
following lemma. P
1s, that is, sequence gsYHY1Y (or equivalently
Lemma 3. Given integers n, s, l, lH , and lI satisfying that n is an gsYHY1Y ) can be achieved at the outputs of the
even number, H s ` n, H l n, H lH lI n , and P
switch for H s I.
l lI À lH . Let . Case 2.2. n n I. Let the switch set to either
upper-broadcast or lower-broadcast (depending
sH s l mod Y sI s mod on which input of the switch has ). Then both
P P outputs of the switch have 1s as in Case 2.1.
and the switch setting be . Case 2.3. n I and n H. Let the switch set to
n parallel, if s H and the upper input has , or
sH YlH YIYP, if s l ` n ;
P s I and the lower input has . Otherwise, let
sH YlH YnÀsH ÀlH YHYPYI , if s ` n and s l ! n ;
the switch set to crossing. Then sequence gsYIY1Y
, if s ! n
and s l ` n; can be achieved at the outputs of the switch for
sH YlH YHYP P
n H s I.
sH YlH YnÀsH ÀlH YIYPYH , if s ! n and s l ! n,
P . Case 2.4. n H and n I. Similar to Case 2.3.
then, g P
sH YlH Y1Y and g P
sI YlI Y1Y
can be merged to gsYlY1Y through an . Case 2.5. n P and n H. Let the switch set to
parallel. Then sequence gsYPY1Y can be achieved at
n Â n merging network under the above switch setting.
the outputs of the switch for H s I.
The two other variants are obtained by simply swapping . Case 2.6. n H and n P. Similar to Case 2.5.
for , and changing upper broadcast to lower broadcast in Hence, Theorem 3 holds for n P. Now, assume
Lemma 2 and Lemma 3, respectively. Theorem 3 holds for an n Â n RBN (inductive hypothesis)
Lemma 4. Given integers n, s, l, lH , and lI satisfying that n is an and consider an n Â n RBN. In the recursive definition of
even number, H s ` n, H l n, H lI lH n , and an n Â n RBN, let nH and nH denote the numbers of s
l lH À lI . Let and s, respectively in the upper n Â n RBN, and nHH and
nHH denote the numbers of s and s in the lower n Â n
n n RBN, respectively. Clearly, we have
sH s mod Y sI s l mod
nH nHH n nd nH nHH n X
and the switch setting be
n We first assume n n and consider the following
1. sPI YlI YHYQ , if s l ` n ;
sI YlI YnÀsI ÀlI YIYQYH , if s ` n and s l ! n ;
n . Case n 1. nH nH and nHH nHH . By the inductive
sI YlI YIYQ, if s ! P and s l ` n; P
n hypothesis, Theorem 3 holds for both upper and
sI YlI Y PÀsI ÀlI YHYQYI , if s ! n and s l ! n,
P lower n Â n RBNs. That is, for anyn given integers
sHn a n d sI ( H sH Y sI ` n ) , gsH YnH ÀnH Y1Y a n d
n n P
then, g P
sH YlH Y1Y and g P
sI YlI Y1Y can be merged to gsYlY1Y through an P
gsI YnHH ÀnHH 1Y can be achieved at the outputs of the
n Â n merging network under the above switch setting.
upper and lower n Â n RBNs, respectively. Now,
Lemma 5. Given integers n, s, l, lH , and lI satisfying that n is an let lH nH À nH , lI nHH À nHH , and l n À n ,
even number, H s ` n, H l n, H lH lI n , and which implies l lH lI . Then by Lemma 1,
given any integer s (H s ` n), there exist
l lI À lH . Let
integers sH and sI (H sH Y sI ` n ) such that
n n P
n n n
gsH YlH Y1Y and gsI YlI Y1Y can be merged to gsYlY1Y
sH s l mod Y sI s mod
P P through the n Â n merging network under a
and the switch setting be proper switch setting. Hence, Theorem 3 holds
in this case.
1. sPH YlH YIYQ , if s l ` n ;
. Case n 2. nH nH and nHH ! nHH . By the
, if s ` n and s l ! n ; inductive hypothesis, for any given integers sH
sH YlH YnÀsH ÀlH YHYQYI P P
a n d sI ( H sH Y sI ` n ) , gsH YnH ÀnH Y1Y a n d
, if s !
sH YlH YHYQ P and s l ` n; gsI YnHH ÀnHH Y1Y can be achieved at the outputs of
1306 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
the upper and lower n Â n RBNs, respectively.
P P TABLE 1
Let lH nH À nH , lI nHH À nHH , and l n À n .
An Encoding Scheme for Tag Values
Then we have l lH À lI and lH ! lI . By Lemma
4, given any integer s (H s ` n), there exist
integers sH and sI (H sH Y sI ` n ) such that
n n P
gsH YlH Y1Y and gsI YlI Y1Y can be merged to gsYlY1Y
through the n Â n merging network under a
proper switch setting. Thus, the result is also
true in this case. In the following, we discuss how an RBN can be used as
. Case n 3. nH ! nH and nHH nHH . Similar to Case n 2, a quasisorting network for partial permutation or multicast
by the inductive hypothesis and Lemma 3, we can assignments. As can be seen in the last subsection, the
see Theorem 3 holds in this case. u
t outputs of the n Â n RBN as a scatter network have tag
values H, I, and only, which are passed to the inputs of the
Thus, we have proved Theorem 3 for n n . Symme-
n Â n RBN as a quasisorting network, and the numbers of
trically, we can show that the theorem holds for n ! n . such Hs or Is are no more than n . The function of an n Â n
In the proof of Theorem 3, we can see that the number of
RBN as a quasisorting network is to route all Hs and Is on
the tag values of dominating type in the outputs of an RBN the inputs of the RBN to the upper and lower halves of its
is the sum of those in its two sub-RBNs, if the two outputs, respectively, and to route s to the remaining
sub-RBNs have the same dominating type. In this case, positions at the outputs. We can let some of s be dummy Hs
Lemma 1 is applied, and we refer to it as /-addition. On (denoted as H s) and the rest of s be dummy Is (denoted as
the other hand, the number of the tag values of dominating I s), such that both the number of all Hs (including all the
type is the difference between those of its two sub-RBNs, real Hs and the dummy Hs) and the number of all Is
if the two-sub RBNs have different dominating types. In
(including all the real Is and the dummy Is) are equal to n .
this case, Lemmas 2, 3, 4, and 5 are applied, and we refer P
to it as /-elimination. Then by applying the bit sorting results in Theorem 1 we
Finally, we are in the position to prove Theorem 2 by can achieve the quasisorting. The distributed algorithm
using Theorem 3. dividing s to H s and I s will be given in Section 6.
Proof of Theorem 2. Since n n holds for the original
n Â n RBN which is the scatter network of an n Â n BSN 6 THE DISTRIBUTED SELF-ROUTING ALGORITHMS
(see (3)), by Theorem 3 we can always eliminate s at the FOR RBNS
outputs of the RBN under a proper switch setting (i.e., Lemmas 1, 2, 3, 4, and 5 actually provide a way to perform
gsYn Àn Y1Y can be achieved at the outputs of the RBN, switch setting for all the switches in an RBN as a bit sorting
where s can be any number between H and n À I). On the network and as a scatter network, while switch setting for
other hand, from the proof of Theorem 3, we know that a an RBN as a quasisorting network can be transformed to
tag value H or I, once presented on a link at some stage of that for an RBN as a bit sorting network after all the s on
the RBN, will be passed in a unicast way to some output the inputs are properly divided into H s and I s.
without any value change. We also know that value In this section, we give a higher level description of the
changes among Hs, Is, s, and s occur only in some P Â P distributed self-routing algorithms used in switch setting
switches. In this case, two inputs of the switch have for each type of RBNs: As a bit sorting network and as a
and , respectively, the switch setting is upper or lower scatter network, and describe a distributed algorithm for
broadcast (this is always true in our design), and two dividing s to H s and I s in a quasisorting network.
outputs of the switch have H and I respectively. Needless to say, these switch setting algorithms can be
Consequently, a pair of and are transformed to a implemented using proper logic circuits. In Section 7, we
pair of H and I. In total, n such pairs are transformed. will discuss how they can be implemented in a pipelined
Hence, nH , nI , n , and n , which are the numbers of fashion so that the circuits used for the switch setting can be
outputs with value H, I, , and , respectively, satisfy the distributed to each switch module and a lower hardware
following: cost can be achieved.
By observing Lemmas 1, 2, 3, 4, and 5, we can see that
nH nH n Y nI nI n Y n n À n Y nd n HX switch setting at the last stage of RBN can be obtained if n,
Also by using (1) and (2), we show that H
nH Y nI n s, lH , and lI are given. Before the switch setting being
and nH nI n n. u
t determined, l must be obtained from lH and lI (forward
phase), and sH and sI must be obtained from s, l (or lH in
5.2 The Reverse Banyan Network as a Quasisorting Lemma 1) and n (backward phase). We can use the
Network recursive property of an RBN to design a distributed
The results of Theorem 1 can be used to perform bit sorting routing algorithm for switch setting in the RBN.
in an RBN, that is, under a proper switch setting, all Hs and Based on the recursive construction of an RBN, we can
Is on the inputs of the RBN can be routed to its outputs in formulate the structure of an RBN into a complete binary
an ascending order. However, this works only for full tree shown in Fig. 8a. The root node of the tree represents
permutation assignments, in which each input of the RBN the original RBN as a bit sorting network, a scatter network,
has a tag value either H or I, and cannot be directly applied or a quasisorting network; the two child nodes of the root
to partial permutation or multicast assignments. represent the two recursively defined sub-RBN networks of
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1307
Comparisons of Recursively Constructed Multicast Networks
the original RBN and so on; and, finally, the leaves of the The self-routing algorithm for the RBN as a bit sorting
tree represent the inputs of the original RBN. network is shown in Table 3 in Appendix C. It directly
The distributed algorithms are performed by each node follows Lemma 1. The self-routing algorithm for the RBN as
of the binary tree. The algorithms start from leaves and a scatter network is described in Table 4 in Appendix C.
perform forward computations all the way to the root, and This algorithm actually combines the results of all cases in
then start from root and perform backward computations Lemmas 1, 2, 3, 4, and 5. The variable type in the
all the way to the leaves. For a clearer presentation, we omit algorithm represents the dominating type among s and
the initialization step in each of the algorithms. s in the corresponding sub RBN network of size nH Â nH .
If the two dominating types in the forward inputs agree
6.1 Distributed Switch Setting Algorithms (i.e., /-addition), Lemma 1 applies as in the algorithm
The switch setting at a node of the binary tree means the for the RBN as a bit sorting network; otherwise (i.e.,
compact switch setting for all the switches in the last stage /-elimination) Lemmas 2, 3, 4, and 5 apply. The
of the RBN represented by this node. Typically, there are functions BinaryCompactSetting() and TrinaryCompactSet-
three phases performed at each node of the tree. In the ting() used in the algorithms correspond to the compact
forward phase, whenever the two forward inputs lH and lI switch setting in those lemmas, and are described in
are available, l is computed and sent forward to the node in Table 5 in Appendix C. All switches in one stage are set
the next stage. In the backward phase, whenever the two simultaneously according to the forward and backward
forward inputs lH and lI and the backward input s are values of the nH Â nH subnetwork and its own switch-
address in modulo n .
available, sH and sI are generated and sent backward to the P
nodes in the previous stage. In the switch setting phase, 6.2 Distributed -Dividing Algorithm in a
under the same precondition as the backward phase, each Quasisorting Network
switch is set simultaneously by using the forward and As stated in the last section, in an n Â n RBN quasisorting
backward input values and its own address in the stage. network, the tag value on each input belongs only to
The Distributed Self-Routing Algorithm for the RBN as a Bit Sorting Network
1308 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
The Distributed Self-Routing Algorithm for the RBN as a Scatter Network
fHY IY g, and the number of the inputs with tag H or the n nH nHH X
number of the inputs with tag I are no more than n . TheP When the forward phase reaches the root node, the
function of -dividing algorithm is to reassign H (dummy H) backward phase starts, which will determine that among
or I (dummy I) to each input with tag value such that n of s, how many should be dummy Hs (H s) and how
both the number of all Hs (including real Hs and dummy Hs) many should be dummy Is (I s). Denote the numbers as nH
and the number of all Is (including real Is and dummy Is) and nI , respectively. We must have
on the inputs of the network are equal to n . P
The distributed -dividing algorithm is given in Table 6 n nH nI Y U
in Appendix C. There are a forward phase and a backward and
phase performed at each node of the binary tree in Fig. 8a.
The forward phase is to compute n , the number of s in the nH nHH nHHH Y
inputs of the sub-RBN represented by this node, when nH
and nHH , the numbers of s in the inputs of its two sub-RBNs
nI nHI nHHI Y W
are available. Clearly, we have
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1309
The Compact Switching Setting Algorithms for the Last Stage of an nH Â nH RBN
where nHH and nHI , and nHHH and nHHI are the numbers of
an H , and if the nI of the leaf node is I, this input is
dummy Hs and dummy Is for two child nodes, respectively. assigned an I .
Equations (6), (8), and (9) are invariants for all but leaf
nodes and (7) is an invariant for all nodes. To balance the
7 IMPLEMENTATION ISSUES AND COMPLEXITY
number of Hs and the number of Is (both ªrealº and
ªdummyº ones), initially in the backward phase, nH and nI
of the root node can be set to In this section, we first discuss some implementation issues
pertaining to the newly designed self-routing multicast
nI À nI nd nH n À nI network, and then analyze the complexity of the network.
where nI represents the number of the real Is of the RBN, 7.1 Routing Tag Format and Handling
which can also be computed through the forward phase. To send a multicast message (i.e., multidestination message)
Then, nHH and nHI , and nHHH and nHHI of the child node can be
through a multicast network, the header of the message
computed and passed backward as in the backward must carry the information of the addresses of the
algorithm. That is, when forward inputs nH and nHH and destination nodes. This information, as an overhead due
backward inputs nH and nI are available, the backward to the nature of this type of communication, is considered as
outputs is computed as follows:
part of the message data. The preparation of the header is
nHH minfnH Y nH gY nHI nH À nHH Y done before the message enters the network. For more
detailed discussions on this issue and various multiaddress
nHHH nH À nHH Y nd nHHI nHH À nHI X
encoding schemes, readers can refer to . In this subsec-
It can be verified that the invariants shown in (6), (7), (8), tion, we introduce a multiaddress encoding scheme for the
and (9) hold at every node. Finally, in the last step of the routing tag of the proposed self-routing multicast network,
algorithm, if the nH of a leaf node is I, this input is assigned and briefly describe the routing tag handling technique.
1310 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
The Distributed Algorithm Dividing s to H s and I s in the RBN as a Quasi-Sorting Network
In order to describe the multicast routing tag format, we through the n Â n BSN so that the tags can be used as
first need the following preparations. routing tag sequence(s) of the upper or lower or both n Â n P P
A multicast, represented by a set of addresses of its BSNs. As shown in Fig. 10, if H equals , it passes
destinations in binary, in an n Â n BRSMN can be expressed I Y Q Y F F F Y PnÀQ to the upper n Â n BSN and P Y R Y F F F Y PnÀP
as a complete binary tree of height log n (number of levels) to the lower n Â n BSN simultaneously; if H equals H (or I),
with each node assigned a tag value chosen from fHY IY Y g.
it passes only I Y Q Y F F F Y PnÀQ (or P Y R Y F F F Y PnÀP ) to the
In this binary tree, level i (I i log n) gives the informa-
upper (or lower) n Â n BSN. Passing i s alternatively to the
tion of the ith most significant bits of the destinations of the P P
upper and the lower subnetworks is for efficiency con-
multicast. The binary tree can also be defined recursively as
follows: The root node (i.e., the unique node in level 1) of sideration. In fact, this way only a constant number of
the tree represents the entire multicast, and its left (or right) buffers are needed to store the tag sequence at each input of
child node in level 2, represents a multicast which is formed a BSN as it passes through the network. To ensure the
by all the destination addresses of the original multicast message is sent to the destinations of the multicast, the
with the most significant bit being 0 (or 1). In general, for a sequences I Y Q Y F F F Y PnÀQ and P Y R Y F F F Y PnÀP must match
node in level i I i log n À I, its left (or right) child the left and the right subtrees of the original complete
node in level i I represents a multicast which is formed binary tree. This should also hold for all smaller BSNs.
by all the destination addresses of the multicast at this node Thus, the routing tag sequence H Y I Y P Y F F F Y PnÀQ Y PnÀP
in level i with the ith most significant bit being 0 (or 1). The consists of all tags of the complete binary tree arranged in a
tag assigned to each node in the binary tree depends on the proper order. In the following, we give a formal definition
multicast it represents. For a node in level i I i log n,
of the routing tag sequence.
it is assigned the tag if and only if the destination
Let SEQ denote the final routing tag sequence for a
addresses of the multicast it represents have both 0 and 1 in
multicast. Let the tags of all PiÀI nodes (from left to right) in
the ithe most significant bit; it is assigned the tag 0 (or 1) if,
and only if, the destination addresses of the multicast it level i I i log n be tiI Y tiP Y F F F Y tiYPiÀP Y tiYPiÀP I Y F F F Y tiYPiÀI
represents have only 0 (or 1) in the ith most significant bit; it which is denoted as sequence ii (Fig. 11 gives an
is assigned the tag if and only if it represents an empty example for n IT). Let conc be a function which con-
multicast. Clearly, for a node in level i I i log n À I, if catenates a number of sequences to form a single sequence,
it has a tag , both its children must have a tag other than ; merge be a function which merges two sequences of equal
if it has a tag 0 (or 1), its left (or right) child must have a tag length and is defined as:
other than , and its right (or left) child must have a tag ;
and if it has a tag , both its children have a tag . It can be merge I Y P Y F F F Y k Y I Y P Y F F F Y k I Y I Y P Y P Y F F F Y k Y k
shown that, for a given multicast, the binary tree with tags IH
defined above is unique. Fig. 9a and Fig. 9b give two
and order be a recursive function of a sequence of length
examples of the binary trees with a tag at each node and
k Pi which is defined as:
their corresponding multicasts.
Now we are in the position to describe the multicast order I Y F F F Y k merge order I Y F F F Y k Y
routing tag format and its handling technique. First of all, P
any network input without a message is always assumed to order kI Y F F F Y k Y nd order I I X
have a tag . Each message at a network input is assigned a Finally, the routing tag sequence is defined as:
routing tag sequence H Y I Y P Y F F F Y PnÀQ Y PnÀP , which
corresponds to the complete binary tree of the multicast i on order iI Y order iP Y F F F Y
on the input. We use H as the routing tag to route the IP
order ilog n X
message on this input in the n Â n BSN, i.e., set up switches
in the scatter and quasisorting networks as described in In the example shown in Fig. 11 for n IT, we have the
Section 5; then pass the rest of tags along the setup path(s) ordered sequences of four levels,
YANG AND WANG: A NEW SELF-ROUTING MULTICAST NETWORK 1311
Fig. 9. (a) and (b) Examples of the binary trees with tags and their corresponding multicasts. (c) Routing tag sequences for (a) and (b) and their
order iI tII Y First of all, as can be seen, the different tag values H, I, ,
order iP tPI Y tPP Y and (including H and I ) are used in the RBN as a scatter
order iQ tQI Y tQQ Y tQP Y tQR Y nd network and as a quasisorting networks. We need to use
order iR tRI Y tRS Y tRQ Y tRU Y tRP Y tRT Y tRR Y tRV X three bits, H I P to represent a tag value. Table 1 lists an
encoding scheme for these tag values. Also, when we
Thus, the final routing tag format is their concatenation: compute the numbers of s, s, and Is on the inputs, we
need a combination of all the bits for every input tag. In this
tII Y tPI Y tPP Y tQI Y tQQ Y tQP Y tQR Y tRI Y tRS Y tRQ Y tRU Y tRP Y tRT Y tRR Y tRV X IQ
encoding, we can use H I of an input tag for counting the
We omit the proof of (12). Instead, we verify its number of for this input, and use H I for counting the
correctness by checking the tag sequence (13) in the number of of an input in the forward phases of
example and, for simplicity, we check only one branch of self-routing for the RBN as a scatter network and -dividing
the binary tree. By using our tag handling method, the algorithm for the RBN as a quasisorting network.
tag sequence tPI Y tQI Y tQP Y tRI Y tRQ Y tRP Y tRR , which corresponds Furthermore, since only Hs, Is, and s (including H s and
to the left subtree, is sent to the upper V Â V BSN; and I s) can be the input tags for the RBN as a quasi-sorting
then the tag sequence tQI Y tRI Y tRP , which corresponds to the network, we need to use only bit P of an input tag for
first subtree in level Q, is sent to the upper R Â R BSN, counting all Is (including real and dummy Is) in the
and so on. For the multicast examples in Fig. 9a and forward phase of self-routing for this RBN.
Fig. 9b, their routing tag sequences are HH and Second, the binary tree structure used in all distributed
IHII, respectively, and Fig. 9c shows the handling for algorithms can be embedded into an RBN. For a balanced
these two sequences. hardware distribution, we separate the tree to a forward
tree and a backward tree as shown in Fig. 8b, where the first
7.2 Circuit Design Issues
and the last switches of the last stage of a sub-RBN network
Based on the distributed self-routing algorithms presented
can serve as the nodes in the forward tree and the backward
in the last section, we can design a self-routing circuit for tree, respectively, and the switches in between can use the
our RBN-based multicast network in a similar approach to results from these two nodes for their switch settings.
Cheng and Chen's  self-routing circuit design for their Third, we can see that the most frequently used
RBN-based permutation network. In this subsection, we operation in the distributed algorithms is addition (or
only briefly discuss some circuit design issues. addition-like operations). For an n Â n RBN, the maximum
Fig. 10. Three possible cases of routing in an n Â n BSN.
1312 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 11, DECEMBER 1999
Fig. 12. One bit adder used in a pipelined fashion.
Note that the distributed routing algorithms work in a
pipelined fashion. It takes y log n unit time delay for the
first bit (from input) to reach a switch in the last stage of an
Fig. 11. A binary tree represents the routing tag sequence of a multicast n Â n RBN. Then it takes only y I unit time for each of the
for n IT. subsequent log n À I bits to reach a switch in the last stage.
Hence, the propagation delay in the forward phase is
values of n , n , or nI are n, which can be represented by y log n unit time, and so is in the backward phase. Let n
using at most log n bits. We implement the adder in a denote the total propagation delay of the routing algorithm
pipelined fashion as shown in Fig. 12. Then, the log n bit for an n Â n BRSMN. Since there are a constant number of
adder can be reduced to a one bit adder. Also, since the forward and backward phases in switch setting in a BSN,
distributed algorithms work in a pipelined fashion, the the routing time for an n Â n BSN is y log n. From
delay caused by running the forward and the backward n y log n n, we obtain n y logP n.
processes is also reduced. For the feedback implementation discussed in Section
7.3 Feedback Implementation of the Network 7.3, since an n Â n BRSMN is now simply an n Â n RBN and
the routing circuit distributed in each switch still has y I
Note that in our design all major functional components are
cost, we conclude that the feedback version of our design
recursively defined reverse banyan networks. We can easily
reuse part of the network to reduce network cost. In has y n log n network cost.
particular, we can construct a feedback version of our In Table 2, we list the cost, depth and routing time of the
multicast network, BRSMN as depicted in Fig. 13. Let an newly designed multicast network along with those of the
n Â n BSN use only one n Â n RBN, with each output fed existing multicast networks constructed by the recursive
back to the input with the same address. The first pass on decomposition approach. Finally, we can see that the
the RBN functions as a scatter network and the second pass RBN-based multicast network presented in this paper
functions as a quasisorting network. Note that in the achieves the same order of complexities of network cost,
original design of the BRSMN, (see Fig. 2 for an example), network depth, routing time as those of the RBN-based
the n Â n BSN is followed by two n Â n BSNs, and so on. permutation network proposed by Cheng and Chen .
Now, a BSN is simply an RBN of the same size and an n Â n
RBN consists of two n Â n RBNs followed by an n Â n
P P 8 CONCLUSIONS
merging network (see Fig. 5). In the feedback version of the
network, after the first split according to the most In this paper, we have proposed a design for a new
significant bit of a destination address, we can reuse the self-routing multicast network using an approach based
two n Â n RBNs in the n Â n RBN as the two BSNs to on recursive decompositions of multicast networks.
perform the second split according to the second most Different from earlier proposed multicast networks, the
significant bit of the destination address, and so on.
Consequently, the feedback version of an n Â n BRSMN is
simply an n Â n RBN.
7.4 Complexity Analyses
In this subsection, we analyze the hardware cost, network
depth and routing time of the new multicast network.
Compared to Chen and Cheng's permutation network
, since we add only a constant cost (i.e., a constant
number of one bit adders or adder-like circuits) to each
switch for the self-routing circuit and there are n log n
switches in an n Â n RBN, the hardware cost of the RBN
is y n log n, and so is that of an n Â n BSN. Let g n
denote the cost of an n Â n BRSMN. By the recursive À Á
construction in Fig. 1, we have g n y n log n Pg n Y
which implies g n y n logP n.
Since the network depth of an n Â n RBN is y log n,
so is an n Â n BSN. Let h n denote the network Á À depth
for an n Â n BRSMN. From h n y log n h n Y weP
immediately have h n y logP n. Fig. 13. The feedback implementation of the network.