SlideShare a Scribd company logo
1 of 42
CSCI 8150
Advanced Computer Architecture

Hwang, Chapter 2
Program and Network Properties
2.4 System Interconnect Architectures
System Interconnect Architectures
 Direct networks for static connections
 Indirect networks for dynamic connections
 Networks are used for
   internal connections in a centralized system among
    • processors
    • memory modules
    • I/O disk arrays
   distributed networking of multicomputer nodes
Goals and Analysis
 The goals of an interconnection network are to
 provide
   low-latency
   high data transfer rate
   wide communication bandwidth
 Analysis includes
   latency
   bisection bandwidth
   data-routing functions
   scalability of parallel architecture
Network Properties and Routing
 Static networks: point-to-point direct connections
 that will not change during program execution
 Dynamic networks:
   switched channels dynamically configured to match user
   program communication demands
   include buses, crossbar switches, and multistage
   networks
 Both network types also used for inter-PE data
 routing in SIMD computers
Terminology - 1
 Network usually represented by a graph with a finite
 number of nodes linked by directed or undirected edges.
 Number of nodes in graph = network size .
 Number of edges (links or channels) incident on a node =
 node degree d (also note in and out degrees when edges
 are directed). Node degree reflects number of I/O ports
 associated with a node, and should ideally be small and
 constant.
 Diameter D of a network is the maximum shortest path
 between any two nodes, measured by the number of links
 traversed; this should be as small as possible (from a
 communication point of view).
Terminology - 2
 Channel bisection width b = minimum number of edges cut
 to split a network into two parts each having the same
 number of nodes. Since each channel has w bit wires, the
 wire bisection width B = bw. Bisection width provides good
 indication of maximum communication bandwidth along the
 bisection of a network, and all other cross sections should
 be bounded by the bisection width.
 Wire (or channel) length = length (e.g. weight) of edges
 between nodes.
 Network is symmetric if the topology is the same looking
 from any node; these are easier to implement or to
 program.
 Other useful characterizing properties: homogeneous
 nodes? buffered channels? nodes are switches?
Data Routing Functions
 Shifting
 Rotating
 Permutation (one to one)
 Broadcast (one to all)
 Multicast (many to many)
 Personalized broadcast (one to many)
 Shuffle
 Exchange
 Etc.
Permutations
 Given n objects, there are n ! ways in which they
 can be reordered (one of which is no reordering).
 A permutation can be specified by giving the rule fo
 reordering a group of objects.
 Permutations can be implemented using crossbar
 switches, multistage networks, shifting, and
 broadcast operations. The time required to
 perform permutations of the connections between
 nodes often dominates the network performance
 when n is large.
Perfect Shuffle and Exchange
 Stone suggested the special permutation that
 entries according to the mapping of the k-bit
 binary number a b … k to b c … k a (that is,
 shifting 1 bit to the left and wrapping it around to
 the least significant bit position).
 The inverse perfect shuffle reverses the effect of
 the perfect shuffle.
Hypercube Routing Functions
 If the vertices of a n-dimensional cube are labeled
 with n-bit numbers so that only one bit differs
 between each pair of adjacent vertices, then n
 routing functions are defined by the bits in the node
 (vertex) address.
 For example, with a 3-dimensional cube, we can
 easily identify routing functions that exchange data
 between nodes with addresses that differ in the
 least significant, most significant, or middle bit.
Factors Affecting Performance
 Functionality – how the network supports data routing,
 interrupt handling, synchronization, request/message
 combining, and coherence
 Network latency – worst-case time for a unit message to be
 transferred
 Bandwidth – maximum data rate
 Hardware complexity – implementation costs for wire, logic,
 switches, connectors, etc.
 Scalability – how easily does the scheme adapt to an
 increasing number of processors, memories, etc.?
Static Networks
 Linear Array
 Ring and Chordal Ring
 Barrel Shifter
 Tree and Star
 Fat Tree
 Mesh and Torus
Static Networks – Linear Array
 N nodes connected by n-1 links (not a bus);
 segments between different pairs of nodes can be
 used in parallel.
 Internal nodes have degree 2; end nodes have
 degree 1.
 Diameter = n-1
 Bisection = 1
 For small n, this is economical, but for large n, it is
 obviously inappropriate.
Static Networks – Ring, Chordal Ring
 Like a linear array, but the two end nodes are
 connected by an n th link; the ring can be uni- or bi-
 directional. Diameter is n/2 for a bidirectional
 ring, or n for a unidirectional ring.
 By adding additional links (e.g. “chords” in a circle),
 the node degree is increased, and we obtain a
 chordal ring. This reduces the network diameter.
 In the limit, we obtain a fully-connected network,
 with a node degree of n -1 and a diameter of 1.
Static Networks – Barrel Shifter
 Like a ring, but with additional links between all
 pairs of nodes that have a distance equal to a
 power of 2.
 With a network of size N = 2n, each node has
 degree d = 2n -1, and the network has diameter D
 = n /2.
 Barrel shifter connectivity is greater than any
 chordal ring of lower node degree.
 Barrel shifter much less complex than fully-
 interconnected network.
Static Networks – Tree and Star
 A k-level completely balanced binary tree will have
 N = 2k – 1 nodes, with maximum node degree of 3
 and network diameter is 2(k – 1).
 The balanced binary tree is scalable, since it has a
 constant maximum node degree.
 A star is a two-level tree with a node degree d = N
 – 1 and a constant diameter of 2.
Static Networks – Fat Tree
 A fat tree is a tree in which the number of edges
 between nodes increases closer to the root (similar
 to the way the thickness of limbs increases in a
 real tree as we get closer to the root).
 The edges represent communication channels
 (“wires”), and since communication traffic
 increases as the root is approached, it seems
 logical to increase the number of channels there.
Static Networks – Mesh and Torus
 Pure mesh – N = n k nodes with links between each
 adjacent pair of nodes in a row or column (or higher
 degree). This is not a symmetric network; interior node
 degree d = 2k, diameter = k (n – 1).
 Illiac mesh (used in Illiac IV computer) – wraparound is
 allowed, thus reducing the network diameter to about half
 that of the equivalent pure mesh.
 A torus has ring connections in each dimension, and is
 symmetric. An n × n binary torus has node degree of 4 and
 a diameter of 2 × n / 2 .
Static Networks – Systolic Array
 A systolic array is an arrangement of processing
 elements and communication links designed
 specifically to match the computation and
 communication requirements of a specific
 algorithm (or class of algorithms).
 This specialized character may yield better
 performance than more generalized structures, but
 also makes them more expensive, and more
 difficult to program.
Static Networks – Hypercubes
 A binary n-cube architecture with N = 2n nodes
 spanning along n dimensions, with two nodes per
 dimension.
 The hypercube scalability is poor, and packaging is
 difficult for higher-dimensional hypercubes.
Static Networks – Cube-connected
Cycles
 k-cube connected cycles (CCC) can be created
 from a k-cube by replacing each vertex of the k-
 dimensional hypercube by a ring of k nodes.
 A k-cube can be transformed to a k-CCC with k ×
 2k nodes.
 The major advantage of a CCC is that each node
 has a constant degree (but longer latency) than in
 the corresponding k-cube. In that respect, it is
 more scalable than the hypercube architecture.
Static Networks – k-ary n-Cubes
 Rings, meshes, tori, binary n-cubes, and Omega
 networks (to be seen) are topologically isomorphic
 to a family of k-ary n-cube networks.
 n is the dimension of the cube, and k is the radix,
 or number of of nodes in each dimension.
 The number of nodes in the network, N, is k n.
 Folding (alternating nodes between connections)
 can be used to avoid the long “end-around” delays
 in the traditional implementation.
Static Networks – k-ary n-Cubes
 The cost of k-ary n-cubes is dominated by the
 amount of wire, not the number of switches.
 With constant wire bisection, low-dimensional
 networks with wider channels provide lower
 latecny, less contention, and higher “hot-spot”
 throughput than higher-dimensional networks with
 narrower channels.
Network Throughput
 Network throughput – number of messages a network can
 handle in a unit time interval.
 One way to estimate is to calculate the maximum number of
 messages that can be present in a network at any instant
 (its capacity); throughput usually is some fraction of its
 capacity.
 A hot spot is a pair of nodes that accounts for a
 disproportionately large portion of the total network traffic
 (possibly causing congestion).
 Hot spot throughput is maximum rate at which messages
 can be sent between two specific nodes.
Minimizing Latency
 Latency is minimized when the network radix k and
 dimension n are chose so as to make the
 components of latency due to distance (# of hops)
 and the message aspect ratio L / W (message
 length L divided by the channel width W )
 approximately equal.
 This occurs at a very low dimension. For up to
 1024 nodes, the best dimension (in this respect) is
 2.
What is Dynamic Network
   Dynamic Network is the network that can connect any input
   to any output by enabling or disabling some switches in the
   network
   Examples:
   - Shared Bus: The bus arbiter connects a processor to a
   memory
   - Multistage Network: Consists of several stages of
   switches that are enabled to get connections
   - Crossbar: Consists of a lot of switching elements, which
   can be enabled to connect many inputs to many outputs
   simultaneously
- The nodes in static networks (like Mesh) also consist of
   dynamic crossbars
Dynamic Networks – Bus Systems
 A bus system (contention bus, time-sharing bus) has
    a collection of wires and connectors
    multiple modules (processors, memories, peripherals, etc.) which
    connect to the wires
    data transactions between pairs of modules
 Bus supports only one transaction at a time.
 Bus arbitration logic must deal with conflicting requests.
 Lowest cost and bandwidth of all dynamic schemes.
 Many bus standards are available.
A Bus Connected multiprocessor system




bus
Dynamic Networks – Switch Modules
 An a × b switch module has a inputs and b outputs.
  A binary switch has a = b = 2.
 It is not necessary for a = b, but usually a = b = 2k,
 for some integer k>=1.
 In general, any input can be connected to one or
 more of the outputs. However, multiple inputs may
 not be connected to the same output.
 When only one-to-one mappings are allowed, the
 switch is called a crossbar switch.
Multistage Networks
 In general, any multistage network is comprised of a
 collection of a × b switch modules and fixed network
 modules. The a × b switch modules are used to provide
 variable permutation or other reordering of the inputs, which
 are then further reordered by the fixed network modules.
 A generic multistage network consists of a sequence
 alternating dynamic switches (with relatively small values
 for a and b) with static networks (with larger numbers of
 inputs and outputs). The static networks are used to
 implement interstage connections (ISC).
Omega Network
 A 2 × 2 switch can be configured for
    Straight-through
    Crossover
    Upper broadcast (upper input to both outputs)
    Lower broadcast (lower input to both outputs)
    (No output is a somewhat vacuous possibility as well)
 With four stages of eight 2 × 2 switches, and a static perfect
 shuffle for each of the four ISCs, a 16 by 16 Omega
 network can be constructed (but not all permutations are
 possible).
 In general , an n-input Omega network requires log 2 n
 stages of 2 × 2 switches and n / 2 switch modules.
A 16×16 Omega Network
Baseline Network
 A Baseline network can be generated recursively.

 First stage contains N*N block and second stage
 contains N/2*N/2 sub blocks, labeled C0, C1 and
 go on until sub blocks size reached 2*2.
4 × 4 Baseline Network
Crossbar Networks
 A m × n crossbar network can be used to provide a
 constant latency connection between devices; it can be
 thought of as a single stage switch.
 Different types of devices can be connected, yielding
 different constraints on which switches can be enabled.
   With m processors and n memories, one processor may be able to
   generate requests for multiple memories in sequence; thus several
   switches might be set in the same row.
   For m × m interprocessor communication, each PE is connected to
   both an input and an output of the crossbar; only one switch in each
   row and column can be turned on simultaneously. Additional
   control processors are used to manage the crossbar itself.
Crossbar Network
Summary: Notes


Bus                  n processors, bus width w
Multistage Network   n × n network using k × k switches, line
                     width w
Crossbar             n × n crossbar, with line width w
Summary: Minimum Latency


Bus                  Constant
Multistage Network   O(logk n)
Crossbar             Constant
Summary: Bandwidth per Processor


Bus                  O(w/n) to O(w)
Multistage Network   O(w) to O(nw)
Crossbar             O(w) to O(nw)
Summary: Wiring Complexity


Bus                  O(w)
Multistage Network   O(nw logk n)
Crossbar             O(n2w)
Summary: Switching Complexity


Bus                  O(n)
Multistage Network   O(n logk n)
Crossbar             O(n2)
Summary: Connectivity and Routing


Bus                  One to one, and only one at a time
Multistage Network   Some permutations and broadcast (if
                     network “unblocked”)
Crossbar             All permutations, one at a time

More Related Content

What's hot

Distributed document based system
Distributed document based systemDistributed document based system
Distributed document based systemChetan Selukar
 
Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemorySHIKHA GAUTAM
 
RISC (reduced instruction set computer)
RISC (reduced instruction set computer)RISC (reduced instruction set computer)
RISC (reduced instruction set computer)LokmanArman
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingVaibhav Khanna
 
Chapter 5: Mapping and Scheduling
Chapter  5: Mapping and SchedulingChapter  5: Mapping and Scheduling
Chapter 5: Mapping and SchedulingHeman Pathak
 
CS878 Green Computing Anna University Question Paper
CS878 Green Computing Anna University Question Paper CS878 Green Computing Anna University Question Paper
CS878 Green Computing Anna University Question Paper Gobinath Subramaniam
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platformsSyed Zaid Irshad
 
Chapter 14 replication
Chapter 14 replicationChapter 14 replication
Chapter 14 replicationAbDul ThaYyal
 
Message passing ( in computer science)
Message   passing  ( in   computer  science)Message   passing  ( in   computer  science)
Message passing ( in computer science)Computer_ at_home
 
Advanced Comuter Architecture Ch6 Problem Solutions
Advanced Comuter Architecture Ch6 Problem SolutionsAdvanced Comuter Architecture Ch6 Problem Solutions
Advanced Comuter Architecture Ch6 Problem SolutionsJoe Christensen
 
Elliptic Curve Cryptography for those who are afraid of maths
Elliptic Curve Cryptography for those who are afraid of mathsElliptic Curve Cryptography for those who are afraid of maths
Elliptic Curve Cryptography for those who are afraid of mathsMartijn Grooten
 
Parallel sorting Algorithms
Parallel  sorting AlgorithmsParallel  sorting Algorithms
Parallel sorting AlgorithmsGARIMA SHAKYA
 

What's hot (20)

Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
 
Array Processor
Array ProcessorArray Processor
Array Processor
 
Distributed document based system
Distributed document based systemDistributed document based system
Distributed document based system
 
Agreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared MemoryAgreement Protocols, distributed File Systems, Distributed Shared Memory
Agreement Protocols, distributed File Systems, Distributed Shared Memory
 
RISC (reduced instruction set computer)
RISC (reduced instruction set computer)RISC (reduced instruction set computer)
RISC (reduced instruction set computer)
 
Desktop and multiprocessor systems
Desktop and multiprocessor systemsDesktop and multiprocessor systems
Desktop and multiprocessor systems
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
 
Chapter 5: Mapping and Scheduling
Chapter  5: Mapping and SchedulingChapter  5: Mapping and Scheduling
Chapter 5: Mapping and Scheduling
 
CS878 Green Computing Anna University Question Paper
CS878 Green Computing Anna University Question Paper CS878 Green Computing Anna University Question Paper
CS878 Green Computing Anna University Question Paper
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platforms
 
Threshold cryptography
Threshold cryptographyThreshold cryptography
Threshold cryptography
 
Chapter 14 replication
Chapter 14 replicationChapter 14 replication
Chapter 14 replication
 
Neural Networks: Introducton
Neural Networks: IntroductonNeural Networks: Introducton
Neural Networks: Introducton
 
Message passing ( in computer science)
Message   passing  ( in   computer  science)Message   passing  ( in   computer  science)
Message passing ( in computer science)
 
Network layer tanenbaum
Network layer tanenbaumNetwork layer tanenbaum
Network layer tanenbaum
 
Run time storage
Run time storageRun time storage
Run time storage
 
Advanced Comuter Architecture Ch6 Problem Solutions
Advanced Comuter Architecture Ch6 Problem SolutionsAdvanced Comuter Architecture Ch6 Problem Solutions
Advanced Comuter Architecture Ch6 Problem Solutions
 
Elliptic Curve Cryptography for those who are afraid of maths
Elliptic Curve Cryptography for those who are afraid of mathsElliptic Curve Cryptography for those who are afraid of maths
Elliptic Curve Cryptography for those who are afraid of maths
 
24 Multithreaded Algorithms
24 Multithreaded Algorithms24 Multithreaded Algorithms
24 Multithreaded Algorithms
 
Parallel sorting Algorithms
Parallel  sorting AlgorithmsParallel  sorting Algorithms
Parallel sorting Algorithms
 

Viewers also liked (20)

Bengali optical character recognition system
Bengali optical character recognition systemBengali optical character recognition system
Bengali optical character recognition system
 
Interconnection Network
Interconnection NetworkInterconnection Network
Interconnection Network
 
Parallel computing chapter 3
Parallel computing chapter 3Parallel computing chapter 3
Parallel computing chapter 3
 
Parallel computing(2)
Parallel computing(2)Parallel computing(2)
Parallel computing(2)
 
Clustering manual
Clustering manualClustering manual
Clustering manual
 
Observer pattern
Observer patternObserver pattern
Observer pattern
 
Mediator pattern
Mediator patternMediator pattern
Mediator pattern
 
Parallel searching
Parallel searchingParallel searching
Parallel searching
 
Static Networks
Static NetworksStatic Networks
Static Networks
 
Interconnection mechanisms
Interconnection mechanismsInterconnection mechanisms
Interconnection mechanisms
 
os
osos
os
 
Apache hadoop & map reduce
Apache hadoop & map reduceApache hadoop & map reduce
Apache hadoop & map reduce
 
Map reduce
Map reduceMap reduce
Map reduce
 
R with excel
R with excelR with excel
R with excel
 
Twitter
TwitterTwitter
Twitter
 
Icons presentation
Icons presentationIcons presentation
Icons presentation
 
New microsoft office word 97 2003 document
New microsoft office word 97   2003 documentNew microsoft office word 97   2003 document
New microsoft office word 97 2003 document
 
Big data
Big dataBig data
Big data
 
Strategy pattern.pdf
Strategy pattern.pdfStrategy pattern.pdf
Strategy pattern.pdf
 
Job search_resume
Job search_resumeJob search_resume
Job search_resume
 

Similar to Parallel computing chapter 2

system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACAPankaj Kumar Jain
 
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...IDES Editor
 
What Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika BhatiaWhat Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika Bhatiakulachihansraj
 
2015 11-07 -ad_hoc__network architectures and protocol stack
2015 11-07 -ad_hoc__network architectures and protocol stack2015 11-07 -ad_hoc__network architectures and protocol stack
2015 11-07 -ad_hoc__network architectures and protocol stackSyed Ariful Islam Emon
 
COMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMCOMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMprapti borthakur
 
NETWORKS & TOPOLOGY
NETWORKS & TOPOLOGYNETWORKS & TOPOLOGY
NETWORKS & TOPOLOGYPRINCE KUMAR
 
Bca3040– data communication
Bca3040– data communicationBca3040– data communication
Bca3040– data communicationsmumbahelp
 
Wireless sensor network
Wireless sensor network  Wireless sensor network
Wireless sensor network Sandeep Kumar
 
Iaetsd game theory and auctions for cooperation in
Iaetsd game theory and auctions for cooperation inIaetsd game theory and auctions for cooperation in
Iaetsd game theory and auctions for cooperation inIaetsd Iaetsd
 
Fundamentals of Networking
Fundamentals of NetworkingFundamentals of Networking
Fundamentals of Networkingjashhad
 
Distributed Spatial Modulation based Cooperative Diversity Scheme
Distributed Spatial Modulation based Cooperative Diversity SchemeDistributed Spatial Modulation based Cooperative Diversity Scheme
Distributed Spatial Modulation based Cooperative Diversity Schemeijwmn
 
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION) SOLVED PAPERS
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION)  SOLVED PAPERSVTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION)  SOLVED PAPERS
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION) SOLVED PAPERSvtunotesbysree
 
Comm. & net. concepts
Comm. & net. conceptsComm. & net. concepts
Comm. & net. conceptsAshwin Kumar
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...IJERA Editor
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...IJERA Editor
 

Similar to Parallel computing chapter 2 (20)

system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACA
 
Static networks
Static networksStatic networks
Static networks
 
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
 
Gk2411581160
Gk2411581160Gk2411581160
Gk2411581160
 
D031202018023
D031202018023D031202018023
D031202018023
 
What Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika BhatiaWhat Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika Bhatia
 
2015 11-07 -ad_hoc__network architectures and protocol stack
2015 11-07 -ad_hoc__network architectures and protocol stack2015 11-07 -ad_hoc__network architectures and protocol stack
2015 11-07 -ad_hoc__network architectures and protocol stack
 
Computer network introduction
Computer network introductionComputer network introduction
Computer network introduction
 
COMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMCOMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEM
 
NETWORKS & TOPOLOGY
NETWORKS & TOPOLOGYNETWORKS & TOPOLOGY
NETWORKS & TOPOLOGY
 
Bca3040– data communication
Bca3040– data communicationBca3040– data communication
Bca3040– data communication
 
Wireless sensor network
Wireless sensor network  Wireless sensor network
Wireless sensor network
 
Iaetsd game theory and auctions for cooperation in
Iaetsd game theory and auctions for cooperation inIaetsd game theory and auctions for cooperation in
Iaetsd game theory and auctions for cooperation in
 
Fundamentals of Networking
Fundamentals of NetworkingFundamentals of Networking
Fundamentals of Networking
 
Distributed Spatial Modulation based Cooperative Diversity Scheme
Distributed Spatial Modulation based Cooperative Diversity SchemeDistributed Spatial Modulation based Cooperative Diversity Scheme
Distributed Spatial Modulation based Cooperative Diversity Scheme
 
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION) SOLVED PAPERS
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION)  SOLVED PAPERSVTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION)  SOLVED PAPERS
VTU 5TH SEM CSE COMPUTER NETWORKS-1 (DATA COMMUNICATION) SOLVED PAPERS
 
Comm. & net. concepts
Comm. & net. conceptsComm. & net. concepts
Comm. & net. concepts
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
 
Topology ppt
Topology pptTopology ppt
Topology ppt
 

More from Md. Mahedi Mahfuj

More from Md. Mahedi Mahfuj (17)

Parallel computing(1)
Parallel computing(1)Parallel computing(1)
Parallel computing(1)
 
Message passing interface
Message passing interfaceMessage passing interface
Message passing interface
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
Matrix multiplication graph
Matrix multiplication graphMatrix multiplication graph
Matrix multiplication graph
 
Strategy pattern
Strategy patternStrategy pattern
Strategy pattern
 
Database management system chapter16
Database management system chapter16Database management system chapter16
Database management system chapter16
 
Database management system chapter15
Database management system chapter15Database management system chapter15
Database management system chapter15
 
Database management system chapter12
Database management system chapter12Database management system chapter12
Database management system chapter12
 
Strategies in job search process
Strategies in job search processStrategies in job search process
Strategies in job search process
 
Report writing(short)
Report writing(short)Report writing(short)
Report writing(short)
 
Report writing(long)
Report writing(long)Report writing(long)
Report writing(long)
 
Job search_interview
Job search_interviewJob search_interview
Job search_interview
 
Basic and logical implementation of r language
Basic and logical implementation of r language Basic and logical implementation of r language
Basic and logical implementation of r language
 
R language
R languageR language
R language
 
Chatbot Artificial Intelligence
Chatbot Artificial IntelligenceChatbot Artificial Intelligence
Chatbot Artificial Intelligence
 
Cloud testing v1
Cloud testing v1Cloud testing v1
Cloud testing v1
 
Paper review
Paper review Paper review
Paper review
 

Recently uploaded

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Parallel computing chapter 2

  • 1. CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures
  • 2. System Interconnect Architectures Direct networks for static connections Indirect networks for dynamic connections Networks are used for internal connections in a centralized system among • processors • memory modules • I/O disk arrays distributed networking of multicomputer nodes
  • 3. Goals and Analysis The goals of an interconnection network are to provide low-latency high data transfer rate wide communication bandwidth Analysis includes latency bisection bandwidth data-routing functions scalability of parallel architecture
  • 4. Network Properties and Routing Static networks: point-to-point direct connections that will not change during program execution Dynamic networks: switched channels dynamically configured to match user program communication demands include buses, crossbar switches, and multistage networks Both network types also used for inter-PE data routing in SIMD computers
  • 5. Terminology - 1 Network usually represented by a graph with a finite number of nodes linked by directed or undirected edges. Number of nodes in graph = network size . Number of edges (links or channels) incident on a node = node degree d (also note in and out degrees when edges are directed). Node degree reflects number of I/O ports associated with a node, and should ideally be small and constant. Diameter D of a network is the maximum shortest path between any two nodes, measured by the number of links traversed; this should be as small as possible (from a communication point of view).
  • 6. Terminology - 2 Channel bisection width b = minimum number of edges cut to split a network into two parts each having the same number of nodes. Since each channel has w bit wires, the wire bisection width B = bw. Bisection width provides good indication of maximum communication bandwidth along the bisection of a network, and all other cross sections should be bounded by the bisection width. Wire (or channel) length = length (e.g. weight) of edges between nodes. Network is symmetric if the topology is the same looking from any node; these are easier to implement or to program. Other useful characterizing properties: homogeneous nodes? buffered channels? nodes are switches?
  • 7. Data Routing Functions Shifting Rotating Permutation (one to one) Broadcast (one to all) Multicast (many to many) Personalized broadcast (one to many) Shuffle Exchange Etc.
  • 8. Permutations Given n objects, there are n ! ways in which they can be reordered (one of which is no reordering). A permutation can be specified by giving the rule fo reordering a group of objects. Permutations can be implemented using crossbar switches, multistage networks, shifting, and broadcast operations. The time required to perform permutations of the connections between nodes often dominates the network performance when n is large.
  • 9. Perfect Shuffle and Exchange Stone suggested the special permutation that entries according to the mapping of the k-bit binary number a b … k to b c … k a (that is, shifting 1 bit to the left and wrapping it around to the least significant bit position). The inverse perfect shuffle reverses the effect of the perfect shuffle.
  • 10. Hypercube Routing Functions If the vertices of a n-dimensional cube are labeled with n-bit numbers so that only one bit differs between each pair of adjacent vertices, then n routing functions are defined by the bits in the node (vertex) address. For example, with a 3-dimensional cube, we can easily identify routing functions that exchange data between nodes with addresses that differ in the least significant, most significant, or middle bit.
  • 11. Factors Affecting Performance Functionality – how the network supports data routing, interrupt handling, synchronization, request/message combining, and coherence Network latency – worst-case time for a unit message to be transferred Bandwidth – maximum data rate Hardware complexity – implementation costs for wire, logic, switches, connectors, etc. Scalability – how easily does the scheme adapt to an increasing number of processors, memories, etc.?
  • 12. Static Networks Linear Array Ring and Chordal Ring Barrel Shifter Tree and Star Fat Tree Mesh and Torus
  • 13. Static Networks – Linear Array N nodes connected by n-1 links (not a bus); segments between different pairs of nodes can be used in parallel. Internal nodes have degree 2; end nodes have degree 1. Diameter = n-1 Bisection = 1 For small n, this is economical, but for large n, it is obviously inappropriate.
  • 14. Static Networks – Ring, Chordal Ring Like a linear array, but the two end nodes are connected by an n th link; the ring can be uni- or bi- directional. Diameter is n/2 for a bidirectional ring, or n for a unidirectional ring. By adding additional links (e.g. “chords” in a circle), the node degree is increased, and we obtain a chordal ring. This reduces the network diameter. In the limit, we obtain a fully-connected network, with a node degree of n -1 and a diameter of 1.
  • 15. Static Networks – Barrel Shifter Like a ring, but with additional links between all pairs of nodes that have a distance equal to a power of 2. With a network of size N = 2n, each node has degree d = 2n -1, and the network has diameter D = n /2. Barrel shifter connectivity is greater than any chordal ring of lower node degree. Barrel shifter much less complex than fully- interconnected network.
  • 16. Static Networks – Tree and Star A k-level completely balanced binary tree will have N = 2k – 1 nodes, with maximum node degree of 3 and network diameter is 2(k – 1). The balanced binary tree is scalable, since it has a constant maximum node degree. A star is a two-level tree with a node degree d = N – 1 and a constant diameter of 2.
  • 17. Static Networks – Fat Tree A fat tree is a tree in which the number of edges between nodes increases closer to the root (similar to the way the thickness of limbs increases in a real tree as we get closer to the root). The edges represent communication channels (“wires”), and since communication traffic increases as the root is approached, it seems logical to increase the number of channels there.
  • 18. Static Networks – Mesh and Torus Pure mesh – N = n k nodes with links between each adjacent pair of nodes in a row or column (or higher degree). This is not a symmetric network; interior node degree d = 2k, diameter = k (n – 1). Illiac mesh (used in Illiac IV computer) – wraparound is allowed, thus reducing the network diameter to about half that of the equivalent pure mesh. A torus has ring connections in each dimension, and is symmetric. An n × n binary torus has node degree of 4 and a diameter of 2 × n / 2 .
  • 19. Static Networks – Systolic Array A systolic array is an arrangement of processing elements and communication links designed specifically to match the computation and communication requirements of a specific algorithm (or class of algorithms). This specialized character may yield better performance than more generalized structures, but also makes them more expensive, and more difficult to program.
  • 20. Static Networks – Hypercubes A binary n-cube architecture with N = 2n nodes spanning along n dimensions, with two nodes per dimension. The hypercube scalability is poor, and packaging is difficult for higher-dimensional hypercubes.
  • 21. Static Networks – Cube-connected Cycles k-cube connected cycles (CCC) can be created from a k-cube by replacing each vertex of the k- dimensional hypercube by a ring of k nodes. A k-cube can be transformed to a k-CCC with k × 2k nodes. The major advantage of a CCC is that each node has a constant degree (but longer latency) than in the corresponding k-cube. In that respect, it is more scalable than the hypercube architecture.
  • 22. Static Networks – k-ary n-Cubes Rings, meshes, tori, binary n-cubes, and Omega networks (to be seen) are topologically isomorphic to a family of k-ary n-cube networks. n is the dimension of the cube, and k is the radix, or number of of nodes in each dimension. The number of nodes in the network, N, is k n. Folding (alternating nodes between connections) can be used to avoid the long “end-around” delays in the traditional implementation.
  • 23. Static Networks – k-ary n-Cubes The cost of k-ary n-cubes is dominated by the amount of wire, not the number of switches. With constant wire bisection, low-dimensional networks with wider channels provide lower latecny, less contention, and higher “hot-spot” throughput than higher-dimensional networks with narrower channels.
  • 24. Network Throughput Network throughput – number of messages a network can handle in a unit time interval. One way to estimate is to calculate the maximum number of messages that can be present in a network at any instant (its capacity); throughput usually is some fraction of its capacity. A hot spot is a pair of nodes that accounts for a disproportionately large portion of the total network traffic (possibly causing congestion). Hot spot throughput is maximum rate at which messages can be sent between two specific nodes.
  • 25. Minimizing Latency Latency is minimized when the network radix k and dimension n are chose so as to make the components of latency due to distance (# of hops) and the message aspect ratio L / W (message length L divided by the channel width W ) approximately equal. This occurs at a very low dimension. For up to 1024 nodes, the best dimension (in this respect) is 2.
  • 26. What is Dynamic Network Dynamic Network is the network that can connect any input to any output by enabling or disabling some switches in the network Examples: - Shared Bus: The bus arbiter connects a processor to a memory - Multistage Network: Consists of several stages of switches that are enabled to get connections - Crossbar: Consists of a lot of switching elements, which can be enabled to connect many inputs to many outputs simultaneously - The nodes in static networks (like Mesh) also consist of dynamic crossbars
  • 27. Dynamic Networks – Bus Systems A bus system (contention bus, time-sharing bus) has a collection of wires and connectors multiple modules (processors, memories, peripherals, etc.) which connect to the wires data transactions between pairs of modules Bus supports only one transaction at a time. Bus arbitration logic must deal with conflicting requests. Lowest cost and bandwidth of all dynamic schemes. Many bus standards are available.
  • 28. A Bus Connected multiprocessor system bus
  • 29. Dynamic Networks – Switch Modules An a × b switch module has a inputs and b outputs. A binary switch has a = b = 2. It is not necessary for a = b, but usually a = b = 2k, for some integer k>=1. In general, any input can be connected to one or more of the outputs. However, multiple inputs may not be connected to the same output. When only one-to-one mappings are allowed, the switch is called a crossbar switch.
  • 30. Multistage Networks In general, any multistage network is comprised of a collection of a × b switch modules and fixed network modules. The a × b switch modules are used to provide variable permutation or other reordering of the inputs, which are then further reordered by the fixed network modules. A generic multistage network consists of a sequence alternating dynamic switches (with relatively small values for a and b) with static networks (with larger numbers of inputs and outputs). The static networks are used to implement interstage connections (ISC).
  • 31. Omega Network A 2 × 2 switch can be configured for Straight-through Crossover Upper broadcast (upper input to both outputs) Lower broadcast (lower input to both outputs) (No output is a somewhat vacuous possibility as well) With four stages of eight 2 × 2 switches, and a static perfect shuffle for each of the four ISCs, a 16 by 16 Omega network can be constructed (but not all permutations are possible). In general , an n-input Omega network requires log 2 n stages of 2 × 2 switches and n / 2 switch modules.
  • 32. A 16×16 Omega Network
  • 33. Baseline Network A Baseline network can be generated recursively. First stage contains N*N block and second stage contains N/2*N/2 sub blocks, labeled C0, C1 and go on until sub blocks size reached 2*2.
  • 34. 4 × 4 Baseline Network
  • 35. Crossbar Networks A m × n crossbar network can be used to provide a constant latency connection between devices; it can be thought of as a single stage switch. Different types of devices can be connected, yielding different constraints on which switches can be enabled. With m processors and n memories, one processor may be able to generate requests for multiple memories in sequence; thus several switches might be set in the same row. For m × m interprocessor communication, each PE is connected to both an input and an output of the crossbar; only one switch in each row and column can be turned on simultaneously. Additional control processors are used to manage the crossbar itself.
  • 37. Summary: Notes Bus n processors, bus width w Multistage Network n × n network using k × k switches, line width w Crossbar n × n crossbar, with line width w
  • 38. Summary: Minimum Latency Bus Constant Multistage Network O(logk n) Crossbar Constant
  • 39. Summary: Bandwidth per Processor Bus O(w/n) to O(w) Multistage Network O(w) to O(nw) Crossbar O(w) to O(nw)
  • 40. Summary: Wiring Complexity Bus O(w) Multistage Network O(nw logk n) Crossbar O(n2w)
  • 41. Summary: Switching Complexity Bus O(n) Multistage Network O(n logk n) Crossbar O(n2)
  • 42. Summary: Connectivity and Routing Bus One to one, and only one at a time Multistage Network Some permutations and broadcast (if network “unblocked”) Crossbar All permutations, one at a time