A Survey on Topology Mapping
for Large Scale Interconnection
Networks
Soheila Abrishami, Peyman Faizian
Overview
• Background
• Definition
• Performance metrics
• Mapping techniques
Background
• Research in interconnect design can be classified as:
• Communication infrastructure (topology)
• Communication paradigm (routing, switching)
• Evaluation framework (throughput, latency)
• Topology/application mapping
Data Locality
• The challenge is scalability, and it can be expressed in several
ways:
• How can the available resources be used to their full potential?
• How can this be done with acceptable energy consumption?
• One global and practical answer to these questions is to improve the
“Data Locality” of parallel applications
Data Locality…
• Data locality: the way the data are placed, accessed and moved by the
multiple hardware processing units of the underlying target
architecture
• Improving data locality can lead to:
• Reduced communication cost
• Decreased application execution time
• Decreased energy consumption
Mapping
• One way to improve data locality is to dedicate physical processing
units to specific software processing entities
• This means that a matching between the application's virtual topology
and the target hardware architecture has to be determined
Mapping…
• Virtual topology: expresses the existing dependencies between
software processing entities
• Static: the number of processing entities and the dependencies between these
entities do not change
• Dynamic: at least one of the two conditions above (possibly both) does not
hold
• In addition, as much detail as possible about the target hardware
has to be gathered
Mapping…
• The matching between the virtual and the physical topologies can be
done in either direction
• the virtual topology can be mapped onto the physical one
• the physical topology can also be mapped onto the virtual one
Topology mapping
• The network is typically modeled by a weighted graph
$H = (V_H, W_H, R_H)$
• $V_H$: represents the execution units
• $W_H(u, v)$: represents the weight of the edge between two vertices $u$ and $v$
• $R_H$: represents the routing as a probability distribution
• The static application graph is often modeled as a weighted graph
$A = (V_A, w_A)$
• $V_A$: represents the set of communicating processes
• $w_A(u, v)$: represents some metric for the communication between two
processes $u$ and $v$
Topology mapping…
• Topology mapping considers mappings $\sigma : V_A \to V_H$
• Each concrete mapping $\sigma$ is evaluated with two metrics:
• Dilation: defined as either the maximum or the sum of the pairwise
distances in $H$ between the images of neighbors in $A$ (correlates with the
dynamic energy consumption of the network)
• Congestion: counts how many communication pairs use a given link
(correlates strongly with the execution time of bulk-synchronous parallel
applications)
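The two metrics can be sketched in a few lines of Python. This is a minimal illustration, not the survey's definition in full: it assumes a connected, unweighted hardware graph, replaces the routing distribution with a single BFS shortest path per pair, uses a traffic-weighted sum for the "sum" variant of dilation, and reports congestion as the load of the busiest link. The names `shortest_path` and `dilation_and_congestion` are hypothetical.

```python
from collections import deque

def shortest_path(adj, src, dst):
    """Return one shortest path from src to dst (BFS; assumes dst is reachable)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def dilation_and_congestion(app_edges, hw_adj, sigma):
    """app_edges: {(u, v): weight}, hw_adj: {node: [neighbors]}, sigma: {process: node}.

    Returns (maximum dilation, weighted sum dilation, congestion of the busiest link).
    """
    max_dilation, sum_dilation, link_load = 0, 0, {}
    for (u, v), weight in app_edges.items():
        path = shortest_path(hw_adj, sigma[u], sigma[v])
        hops = len(path) - 1
        max_dilation = max(max_dilation, hops)
        sum_dilation += weight * hops
        # Every link on the route carries this communication pair.
        for a, b in zip(path, path[1:]):
            link = frozenset((a, b))
            link_load[link] = link_load.get(link, 0) + 1
    return max_dilation, sum_dilation, max(link_load.values(), default=0)

# A 4-node hardware ring and a 4-process application ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
app = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1, ("d", "a"): 1}
good = dilation_and_congestion(app, ring, {"a": 0, "b": 1, "c": 2, "d": 3})
bad = dilation_and_congestion(app, ring, {"a": 0, "b": 2, "c": 1, "d": 3})
print(good, bad)  # (1, 4, 1) (2, 6, 2)
```

The second mapping places communicating neighbors two hops apart, which raises both dilation and the load on the shared links.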
Mapping Techniques
Finding an optimal mapping (with respect to a specific metric) is NP-complete.
• LP based algorithms
• Constructive approaches (greedy)
• Partitioning approaches
• Transformative approaches
• Graph similarity based approaches
LP formulation
• Given a physical topology G (links and bandwidths)
• Given a virtual topology H (communication graph; note that H here denotes the application, not the network)
• Find a mapping H → G that optimizes a chosen objective, such as:
• Maximum throughput
• Minimum latency
• Minimum congestion
• Minimum dilation
• Software solutions are available to solve the linear program
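One way such a program could be written down, for the minimum (sum) dilation objective only, is the standard linearized assignment formulation below. It is a sketch under assumptions: the symbols $E_H$ (edges of the virtual topology), $w(a,b)$ (traffic between $a$ and $b$), $d_G(p,q)$ (hop distance in G), and the binary variables $x_{a,p}$ ("process $a$ is placed on node $p$") and $y_{a,p,b,q}$ are introduced here for illustration and do not come from the slides.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{(a,b) \in E_H} \; \sum_{p,q \in V_G} w(a,b)\, d_G(p,q)\, y_{a,p,b,q} \\
\text{s.t.}\quad & \sum_{p \in V_G} x_{a,p} = 1 && \forall a \in V_H \\
& \sum_{a \in V_H} x_{a,p} \le 1 && \forall p \in V_G \\
& y_{a,p,b,q} \ge x_{a,p} + x_{b,q} - 1 && \forall (a,b) \in E_H,\; p,q \in V_G \\
& x_{a,p},\; y_{a,p,b,q} \in \{0, 1\}
\end{aligned}
```

With binary variables this is an integer linear program. Relaxing the variables to the interval $[0,1]$ yields a linear program, but its solution is generally fractional and only bounds the optimum.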
Greedy Approaches
• Select two starting vertices u and v, from G and H respectively
• Local
• Add next vertices from the neighborhood of the initial vertices
• Global
• Add next vertices based on a global property (e.g., node degree)
• Continue until a full mapping is found
• The end result relies heavily on the choice of the first vertices
• Compute mappings from different initial choices and select the best one
• Define selection criteria for choosing the initial vertices
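The local variant can be sketched as follows. This is one possible greedy rule, not a specific published algorithm: the seed pair is the heaviest-communicating process and the highest-degree node (an assumed selection criterion), the next process is the one with the most traffic to already placed processes, and it goes to the free node with the lowest traffic-weighted hop distance. The hardware graph is assumed connected, with at least as many nodes as processes and with sortable node labels. `greedy_map` and `bfs_distances` are hypothetical names.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_map(app_w, hw_adj):
    """app_w: {u: {v: weight}} (symmetric), hw_adj: {node: [neighbors]}."""
    dist = {h: bfs_distances(hw_adj, h) for h in hw_adj}
    # Seed pair: heaviest communicator onto the best-connected node.
    first_app = max(app_w, key=lambda u: sum(app_w[u].values()))
    first_hw = max(hw_adj, key=lambda h: len(hw_adj[h]))
    sigma = {first_app: first_hw}
    free = set(hw_adj) - {first_hw}
    while len(sigma) < len(app_w):
        def attached(u):
            # Traffic between u and the processes that are already placed.
            return sum(w for v, w in app_w[u].items() if v in sigma)
        u = max((x for x in app_w if x not in sigma), key=attached)

        def cost(h):
            # Traffic-weighted distance from h to u's placed neighbors.
            return sum(w * dist[sigma[v]][h]
                       for v, w in app_w[u].items() if v in sigma)
        h = min(sorted(free), key=cost)
        sigma[u] = h
        free.remove(h)
    return sigma

# A 4-process chain mapped onto a 4-node path.
chain = {"a": {"b": 5}, "b": {"a": 5, "c": 1},
         "c": {"b": 1, "d": 5}, "d": {"c": 5}}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(greedy_map(chain, path))  # {'b': 1, 'a': 0, 'c': 2, 'd': 3}
```

Running the function from several seed pairs and keeping the cheapest result corresponds to the "different initial choices" strategy in the list above.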
Partitioning Approaches
• Based on k-way graph partitioning (e.g., 2-way partitioning)
• The graphs H and G are recursively cut into k parts based on a property
(e.g., minimum weighted edge-cut)
• The resulting subgraphs are mapped onto each other using the same approach
• Graph partitioning is itself NP-complete, but several heuristics are
available to perform it
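The recursive structure can be sketched with k = 2. The partitioner below is a deliberately naive stand-in: it splits a BFS ordering in half instead of minimizing a weighted edge-cut, so a real implementation would swap in a proper partitioning heuristic at `bisect`. Both graphs are assumed to have the same number of vertices; all function names are hypothetical.

```python
from collections import deque

def bfs_order(adj, nodes):
    """Vertices of `nodes` in BFS order, following only edges inside `nodes`."""
    nodes = set(nodes)
    order, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in sorted(adj[u]):
                if v in nodes and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return order

def bisect(adj, nodes):
    """Naive 2-way cut: first half and second half of a BFS ordering."""
    order = bfs_order(adj, nodes)
    half = len(order) // 2
    return order[:half], order[half:]

def map_by_bisection(app_adj, app_nodes, hw_adj, hw_nodes):
    """Recursively bisect both graphs and pair up the matching halves."""
    if len(app_nodes) != len(hw_nodes):
        raise ValueError("this sketch needs equally sized vertex sets")
    if len(app_nodes) == 1:
        return {app_nodes[0]: hw_nodes[0]}
    a1, a2 = bisect(app_adj, app_nodes)
    h1, h2 = bisect(hw_adj, hw_nodes)
    mapping = map_by_bisection(app_adj, a1, hw_adj, h1)
    mapping.update(map_by_bisection(app_adj, a2, hw_adj, h2))
    return mapping

app = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hw = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(map_by_bisection(app, list(app), hw, list(hw)))
# {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

Because both halves always have equal sizes on the two sides, the recursion ends with one process per node.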
Transformative approaches
• Start with an initial mapping
• Iteratively transform it into better ones
• Typically, evolutionary techniques are used
• Genetic Algorithms
• Ant Colony Optimization
• …
• Fitness measure
• Delay
• Power consumption
• …
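The iterate-and-improve loop can be illustrated with a much simpler technique than the evolutionary ones listed above: a pairwise-swap hill climber. It is named plainly as a stand-in, not a genetic algorithm or ant colony optimization. The fitness measure assumed here is the traffic-weighted hop distance, and `dist` is a precomputed distance table; both are illustrative choices, and the function names are hypothetical.

```python
import itertools

def mapping_cost(app_edges, dist, sigma):
    """Fitness: traffic-weighted hop distance summed over all communicating pairs."""
    return sum(w * dist[sigma[u]][sigma[v]] for (u, v), w in app_edges.items())

def improve_by_swaps(app_edges, dist, sigma):
    """Swap the placement of two processes whenever that lowers the cost."""
    sigma = dict(sigma)  # leave the caller's mapping untouched
    best = mapping_cost(app_edges, dist, sigma)
    improved = True
    while improved:
        improved = False
        for u, v in itertools.combinations(sorted(sigma), 2):
            sigma[u], sigma[v] = sigma[v], sigma[u]
            cost = mapping_cost(app_edges, dist, sigma)
            if cost < best:
                best = cost
                improved = True
            else:
                sigma[u], sigma[v] = sigma[v], sigma[u]  # undo the swap
    return sigma, best

# Hop distances on a 4-node ring.
dist = {i: {j: min(abs(i - j), 4 - abs(i - j)) for j in range(4)} for i in range(4)}
app = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1, ("d", "a"): 1}
start = {"a": 0, "b": 2, "c": 1, "d": 3}
print(mapping_cost(app, dist, start))      # 6
print(improve_by_swaps(app, dist, start))  # ({'a': 3, 'b': 2, 'c': 1, 'd': 0}, 4)
```

A hill climber like this stops at the first local optimum, which is one reason population-based methods are used in practice.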
Graph Similarity Based Approaches
• A graph's adjacency matrix can be stored as a sparse matrix
• The mapping problem then becomes bringing the two matrices
into a similar shape
• One possible approach:
• Reduce the bandwidth of the two sparse matrices
• This brings both into a banded, near-diagonal form
• Do the mapping
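A sketch of this idea, assuming the Cuthill-McKee ordering as the bandwidth-reduction step (the slides do not name a specific algorithm): both graphs are reordered, and the i-th vertex of one ordering is mapped to the i-th vertex of the other. Function names are hypothetical, and vertex labels within each graph are assumed sortable.

```python
from collections import deque

def cuthill_mckee(adj):
    """Vertex ordering that tends to reduce adjacency-matrix bandwidth."""
    degree = {u: len(adj[u]) for u in adj}
    order, seen = [], set()
    # Start each component from a lowest-degree vertex.
    for start in sorted(adj, key=lambda u: (degree[u], u)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Visit unseen neighbors in order of increasing degree.
            for v in sorted(adj[u], key=lambda x: (degree[x], x)):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return order

def bandwidth(adj, order):
    """Largest distance from the diagonal of a nonzero in the reordered matrix."""
    pos = {u: i for i, u in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)

def map_by_ordering(app_adj, hw_adj):
    """Pair the i-th vertex of one ordering with the i-th vertex of the other."""
    return dict(zip(cuthill_mckee(app_adj), cuthill_mckee(hw_adj)))

# A path graph with scrambled labels (c - a - d - b) and a 4-node hardware path.
app = {"a": ["c", "d"], "b": ["d"], "c": ["a"], "d": ["a", "b"]}
hw = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(bandwidth(app, sorted(app)))          # 3
print(bandwidth(app, cuthill_mckee(app)))   # 1
print(map_by_ordering(app, hw))             # {'b': 0, 'd': 1, 'a': 2, 'c': 3}
```

In this small case every pair of communicating processes lands on adjacent nodes. `zip` silently truncates if the graphs differ in size, so a real implementation would have to handle that case explicitly.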
Final Comments
• None of the techniques gives optimal results in all cases
• Topology-specific mapping approaches seem to work better
• Some papers propose combining the above approaches to
achieve better results
• Data locality is not always desirable (e.g., on dragonfly networks)
• So far, only a few papers have explored parallelized mapping techniques
• Results improve when the routing algorithm is considered during mapping
