Multi-cores have become ubiquitous both in the general-purpose computing and the
embedded domain. The current technology trends show that the number of on-chip cores is
rapidly increasing, while their complexity is decreasing due to power and thermal constraints.
An increasing number of simple cores enables parallel applications to benefit from abundant thread-level parallelism (TLP), while sequential fragments suffer from poor exploitation of instruction-level parallelism (ILP). Recent research has proposed adaptive multi-core architectures that are
capable of coalescing simple physical cores to create more complex virtual cores so as to
accelerate sequential code. Such adaptive architectures can seamlessly exploit both ILP and TLP.
The goal of this paper is to quantitatively characterize the performance potential of adaptive
multi-core architectures. Previous research has primarily focused on sequential
workloads on adaptive multi-cores. We address a more realistic scenario where parallel and
sequential applications co-exist on an adaptive multi-core platform. Scheduling tasks on adaptive
architectures reveals challenging resource allocation problems for existing schedulers. We
construct offline and online schedulers that intelligently reconfigure and allocate the cores to the
applications so as to minimize the overall makespan under the constraints of a realistic adaptive
multi-core architecture. Experimental results reveal that adaptive multi-core architectures can
substantially decrease the makespan compared to both static symmetric and asymmetric multi-core architectures.
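To make the scheduling problem concrete, here is a minimal sketch (in Python, and not the paper's actual algorithm) of greedy makespan-oriented scheduling on an adaptive multi-core: jobs are placed in longest-first order, and each job picks the core-coalescing configuration that finishes it earliest. The sqrt(k) sequential speedup, the parallel efficiency factor, and the `max_fuse` limit are illustrative assumptions, not figures from the paper.

```python
# Minimal sketch (not the paper's scheduler): greedy offline scheduling of
# sequential and parallel jobs on an adaptive multi-core, where up to
# `max_fuse` simple cores can be coalesced into one virtual core.
import math

def makespan(jobs, n_cores, max_fuse=4, efficiency=0.85):
    """jobs: list of (work, is_parallel). Returns the estimated makespan when
    each job greedily grabs the configuration that finishes it earliest."""
    free_at = [0.0] * n_cores        # time at which each base core is free
    for work, is_parallel in sorted(jobs, reverse=True):   # LPT order
        best = None
        limit = n_cores if is_parallel else max_fuse
        for k in range(1, limit + 1):
            cores = sorted(range(n_cores), key=lambda c: free_at[c])[:k]
            start = free_at[cores[-1]]      # must wait for all k cores
            # Assumed speedup models: sqrt(k) for a coalesced virtual core,
            # linear-with-efficiency for a parallel job on k cores.
            speed = efficiency * k if is_parallel else math.sqrt(k)
            finish = start + work / speed
            if best is None or finish < best[0]:
                best = (finish, cores)
        finish, cores = best
        for c in cores:
            free_at[c] = finish
    return max(free_at)

print(makespan([(100, False), (400, True), (60, False)], n_cores=8))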
Low Energy Task Scheduling based on Work Stealing (LEGATO project)
Abstract: Optimizing the energy efficiency of parallel execution on computing systems, from server farms to mobile devices to embedded systems, is increasingly a first-order concern. A common way to express a parallel application is as a directed acyclic graph (DAG) in which each node represents a task; the task scheduling problem on multiprocessor systems is to find the proper processors on which to execute these tasks. This matters especially for today's asymmetric multiprocessor systems, which feature different types of cores with different performance and power consumption, e.g. Arm big.LITTLE and Intel Lakefield. Naive task assignment that ignores core types and task features can result in inefficient resource utilization and detrimentally impact overall energy consumption. Dynamic task scheduling is a widely used strategy that requires no prior knowledge (e.g. architecture heterogeneity or task DAG structure) before execution, making its decisions at runtime instead. Among dynamic approaches, work stealing has proven effective, with better scalability in larger systems. DVFS is a common technique for improving energy efficiency; however, exploiting it incurs reconfiguration overhead ranging from tens of microseconds to one millisecond. With fine-grained tasks as small as milliseconds, as required to expose large parallelism, it is not realistic to apply DVFS at a per-task level. Moreover, the energy consumed during cores' under-utilized periods has been shown to be significant.
Based on these problem statements, we propose a low-energy task-scheduling work-stealing runtime based on XiTAO, in which the system environment configurations are either fixed or managed by the OS power governors or system administrators. The runtime contains a dynamic performance tracing module, an idleness tracing module, a power profiling module, and a task mapping algorithm. The dynamic performance model gives accurate predictions for future tasks given a set of resources; it is independent of platforms and frequencies, achieving scalability and portability. Power profiling helps the runtime understand CPU power consumption trends with respect to the number and type of cores and the frequencies used. Idleness tracing presents the real-time status of cores and contributes to saving energy during under-utilized periods. It also provides the real-time parallel slackness of active cores, which allows the task mapping algorithm to attribute the corresponding power consumption to each concurrently running task. The task mapping algorithm integrates the information from the above three modules and outputs the predicted best resource placements for ready tasks.
Poster presented by Jing Chen at the LEGaTO Final Event: 'Low-Energy Heterogeneous Computing Workshop'
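For readers unfamiliar with work stealing, the following is a minimal illustrative skeleton (not XiTAO's implementation, which adds the tracing, profiling, and mapping modules described above): each worker treats its own deque as a LIFO stack and steals from a random victim's FIFO end, the classic arrangement that preserves locality while balancing load.

```python
# Minimal work-stealing skeleton (illustrative only).
import random
import threading
from collections import deque

class WorkStealingPool:
    def __init__(self, n_workers):
        self.queues = [deque() for _ in range(n_workers)]
        self.locks = [threading.Lock() for _ in range(n_workers)]
        self.n = n_workers

    def push(self, worker_id, task):
        with self.locks[worker_id]:
            self.queues[worker_id].append(task)

    def next_task(self, worker_id):
        # Own tasks first, newest-first (LIFO) for cache locality.
        with self.locks[worker_id]:
            if self.queues[worker_id]:
                return self.queues[worker_id].pop()
        # Otherwise try stealing the oldest task (FIFO) from a random victim.
        victim = random.randrange(self.n)
        if victim != worker_id:
            with self.locks[victim]:
                if self.queues[victim]:
                    return self.queues[victim].popleft()
        return None
```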
Moldable pipelines for CNNs on heterogeneous edge devices (LEGATO project)
Abstract: Modern edge devices are equipped with more powerful computing resources than ever before, which opens up the opportunity to execute deep neural networks on the devices themselves instead of in cloud-based implementations. Existing DNN frameworks such as TensorFlow, Caffe, and Torch do not exploit the heterogeneous features of these devices beyond supporting GPUs. Modern edge devices contain different categories of heterogeneity, i.e. clusters of different types of cores fabricated on the same chip; an example of such a board is the Jetson TX2 from Nvidia. To support DNN applications on heterogeneous edge devices, we have developed a framework that generates an efficient and balanced parallel pipelined implementation of CNN inference from a simplified template language interface. We leverage the input and output information provided by the template language to generate a balanced pipeline. Since the cores are heterogeneous, we run a brief online training phase to find the best core distribution for a balanced, high-throughput pipeline.
Our experiments show that a pipeline mapping configuration obtained from online training yields an up to 22% faster pipeline compared to the baseline. We compare against a kernel-level parallel implementation of VGG-16, a widely used image classification CNN, on the Nvidia Jetson TX2 board.
Poster presented by Pirah Noor Soomro at the LEGaTO Final Event: 'Low-Energy Heterogeneous Computing Workshop'
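A hedged sketch of the "brief online training phase" idea: if `measure(stage, cluster)` stands in for a real timed run of one pipeline stage on one core cluster (both names are hypothetical, not the framework's API), a balanced mapping can be found by minimizing the pipeline's bottleneck stage. A real implementation would prune this exhaustive search.

```python
# Illustrative search for a balanced stage-to-cluster mapping: the pipeline's
# throughput is limited by its slowest stage, so minimize that bottleneck.
from itertools import product

def balance_pipeline(stages, clusters, measure):
    best_assign, best_bottleneck = None, float("inf")
    for assign in product(clusters, repeat=len(stages)):
        bottleneck = max(measure(s, c) for s, c in zip(stages, assign))
        if bottleneck < best_bottleneck:
            best_assign, best_bottleneck = assign, bottleneck
    return best_assign, best_bottleneck
```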
Talk @ APT Group, University of Manchester, 06 August 2014
Abstract:
Nowadays HPC systems, such as those in the Top500, are equipped with a range of different processors, from multi-core CPUs to GPUs. Programming them can be a tough job, especially if we want to squeeze every last FLOP of performance out of them.
As a PhD student, I am now doing a brief research visit in the APT group, working on topics related to the programmability and efficient use of GPUs and many-core coprocessors. In particular, I am implementing a large database operation using OpenCL on these state-of-the-art systems. In this talk I will summarize my work in Manchester and discuss future work on this topic.
Extracting a Rails Engine to a separate application (Jônatas Paganini)
As a Rails Application grows, there is a need to decouple heavy systems from the monolithic applications. Several teams in different companies are doing the same: extracting (micro) services from their monolithic applications to give the engineering teams more flexibility to speed up the workflow.
From the separation of the business logic to the server's setup, every change should respect the zero-downtime approach.
This talk shares the automated steps and exercises we created to have a smooth transition to the new system.
I'll share the context of the tool that automatically extracts an entire Rails engine from a project and moves it to a separate service.
We live in an era where the atomic building elements of silicon computers, e.g. transistors and wires, are no longer visible using traditional optical microscopes and their sizes are measured in just tens of angstroms. In addition, power dissipation per unit volume is bounded by the laws of physics, which, among other things, has resulted in stagnating processor clock frequencies. Adding more and more processor cores that perform simpler and simpler tasks, in an attempt to efficiently fill the available on-chip area, seems to be the current trend taken by the industry.
A customer case implemented for the Forsmark power group, a Swedish company that owns and operates the Forsmark nuclear power plant north of Stockholm.
This is a customer where safety always comes first, and with the use of FME they can facilitate access to data without compromising security. The presentation covers everything from manual paperwork to the handling of sophisticated models in BIM. We solved the assignment using open standards such as IFC and used FME to move, and assure the quality of, data from both flat DWG models and smart 3D models to a database. The end result is a customized system with FME Server running in the background.
See more presentations from the FME User Conference 2014 at: www.safe.com/fmeuc
Parallel Distributed Deep Learning on HPCC Systems (HPCC Systems)
As part of the 2018 HPCC Systems Community Day event:
The training process for modern deep neural networks requires big data and large amounts of computational power. Combining HPCC Systems and Google’s TensorFlow, Robert created a parallel stochastic gradient descent algorithm to provide a basis for future deep neural network research, thereby helping to enhance the distributed neural network training capabilities of HPCC Systems.
Robert Kennedy is a first-year Ph.D. student in computer science at Florida Atlantic University with research interests in deep learning and parallel and distributed computing. His current research is in improving distributed deep learning by implementing and optimizing distributed algorithms.
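As a rough illustration of the approach (not the HPCC/TensorFlow code itself), synchronous data-parallel SGD averages per-shard gradients before each shared weight update:

```python
# Illustrative synchronous data-parallel SGD step: each of the P workers
# computes a gradient on its own data shard; the gradients are averaged
# before the shared weights are updated.
import numpy as np

def parallel_sgd_step(w, shards, grad_fn, lr=0.01):
    """w: weight vector; shards: list of (X, y), one per worker;
    grad_fn(w, X, y): gradient of the loss on that shard."""
    grads = [grad_fn(w, X, y) for X, y in shards]  # run in parallel in practice
    return w - lr * np.mean(grads, axis=0)
```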
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016 (MLconf)
Applying Deep Learning at Facebook Scale: Facebook leverages deep learning for various applications, including event prediction, machine translation, natural language understanding, and computer vision, at a very large scale. More than a billion users log on to Facebook every day, generating thousands of posts per second and uploading more than a billion images and videos daily. This talk will explain how Facebook scaled deep learning inference for real-time applications with latency budgets in the milliseconds.
Image Classification Done Simply using Keras and TensorFlow (Rajiv Shah)
This presentation walks through the process of building an image classifier using Keras with a TensorFlow backend. It will give a basic understanding of image classification and show the techniques used in industry to build image classifiers. The presentation will start with building a simple convolutional network, augmenting the data, using a pretrained network, and finally using transfer learning by modifying the last few layers of a pretrained network. The classification will be based on the classic example of classifying cats and dogs. The code for the presentation can be found at https://github.com/rajshah4/image_keras, and the presentation will discuss how to extend the code to your own pictures to make a custom image classifier.
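A condensed transfer-learning sketch in the spirit of the presentation (the complete code lives in the linked repository): freeze a pretrained VGG16 base and train only a small new classification head for the two-class cats-vs-dogs problem.

```python
# Transfer learning with a frozen VGG16 base and a fresh classification head.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(150, 150, 3))
base.trainable = False                      # keep pretrained features fixed

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # cat vs. dog
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # datasets not shown
```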
Object classification using CNN & VGG16 Model (Keras and TensorFlow) (Lalit Jain)
Using a CNN with Keras and TensorFlow, we have deployed a solution that can train on any image on the fly. The code uses the Google API to fetch new images, the VGG16 model for training, and is deployed using the Python Django framework.
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote) (Wim Vanderbauwhede)
Keynote I gave at the ParCo conference (http://www.parco2015.org) workshop paraFPGA in Edinburgh, Sept 2015, on the need to raise the abstraction level for programming of heterogeneous systems.
OpenPOWER Acceleration of HPCC Systems (HPCC Systems)
JT Kellington (IBM) and Allan Cantle (Nallatech) present at the 2015 HPCC Systems Engineering Summit Community Day on porting HPCC Systems to the POWER8-based ppc64el architecture.
Assisting User’s Transition to Titan’s Accelerated Architecture (inside-BigData.com)
Oak Ridge National Lab is home to Titan, the largest GPU-accelerated supercomputer in the world. This fact alone can be intimidating for users new to leadership computing facilities. Our facility has collected over four years of experience helping users port applications to Titan. This talk will explain common paths and tools for successfully porting applications, and expose common difficulties experienced by new users. Lastly, learn how our free and open training program can assist your organization in this transition.
The Barcelona Supercomputing Center (BSC) was established in 2005 and hosts MareNostrum, one of the most powerful supercomputers in Spain. We are the pioneering supercomputing center in Spain. Our specialty is high-performance computing (HPC), and our mission is twofold: to offer supercomputing infrastructure and services to Spanish and European scientists, and to generate knowledge and technology for transfer to society. We are a Severo Ochoa Center of Excellence, a first-level member of the European research infrastructure PRACE (Partnership for Advanced Computing in Europe), and we manage the Red Española de Supercomputación (RES). As a research center, we have more than 456 experts from 45 countries, organized in four major research areas: computer sciences, life sciences, earth sciences, and computational applications in science and engineering.
ParaForming - Patterns and Refactoring for Parallel Programming (khstandrews)
Despite Moore's "law", uniprocessor clock speeds have now stalled. Rather than single processors running at ever higher clock speeds, it is common to find dual-, quad- or even hexa-core processors, even in consumer laptops and desktops. Future hardware will not be slightly parallel, however, as in today's multicore systems, but will be massively parallel, with manycore and perhaps even megacore systems becoming mainstream.
This means that programmers need to start thinking parallel. To achieve this they must move away from traditional programming models where parallelism is a bolted-on afterthought. Rather, programmers must use languages where parallelism is deeply embedded into the programming model from the outset.
By providing a high-level model of computation, without explicit ordering of computations, declarative languages in general, and functional languages in particular, offer many advantages for parallel programming. One of the most fundamental advantages of the functional paradigm is purity. In a purely functional language, as exemplified by Haskell, there are simply no side effects: it is therefore impossible for parallel computations to conflict with each other in ways that are not well understood.
ParaForming aims to radically improve the process of parallelising purely functional programs through a comprehensive set of high-level parallel refactoring patterns for Parallel Haskell, supported by advanced refactoring tools. By matching parallel design patterns with appropriate algorithmic skeletons using advanced software refactoring techniques and novel cost information, we will bridge the gap between fully automatic and fully explicit approaches to parallelisation, helping programmers "think parallel" in a systematic, guided way. This talk introduces the ParaForming approach, gives some examples and shows how effective parallel programs can be developed using advanced refactoring technology.
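As a language-neutral illustration of the algorithmic-skeleton idea (ParaForming itself targets Parallel Haskell, where purity makes this safe by construction), a task-farm skeleton can be as small as a parallel map; in Python, the purity discipline falls on the programmer:

```python
# A task-farm skeleton as a parallel map (illustrative; not ParaForming code).
from multiprocessing import Pool

def par_map(f, xs, workers=4):
    # Farm skeleton: xs are independent tasks, f must be free of side effects.
    with Pool(workers) as pool:
        return pool.map(f, xs)

def heavy(x):              # must be a top-level, pure function for Pool
    return sum(i * i for i in range(x))

if __name__ == "__main__":
    print(par_map(heavy, [10_000] * 8))
```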
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility (inside-BigData.com)
In this deck from the Swiss HPC Conference, Mark Wilkinson presents: 40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility.
"DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics, and astrophysics, cosmology, and nuclear physics, all areas in which the UK is world-leading. DiRAC provides a variety of compute resources, matching machine architecture to the algorithm design and requirements of the research problems to be solved. As a single federated Facility, DiRAC allows more effective and efficient use of computing resources, supporting the delivery of the science programs across the STFC research communities. It provides a common training and consultation framework and, crucially, provides critical mass and a coordinating structure for both small- and large-scale cross-discipline science projects, the technical support needed to run and develop a distributed HPC service, and a pool of expertise to support knowledge transfer and industrial partnership projects. The on-going development and sharing of best-practice for the delivery of productive, national HPC services with DiRAC enables STFC researchers to produce world-leading science across the entire STFC science theory program."
Watch the video: https://wp.me/p3RLHQ-k94
Learn more: https://dirac.ac.uk/
and
http://hpcadvisorycouncil.com/events/2019/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Slides from Strata+Hadoop Singapore 2016 presenting how deep learning can be scaled both vertically and horizontally, and when to use CPUs versus GPUs.
For the full video of this presentation, please visit:
http://www.embedded-vision.com/platinum-members/auvizsystems/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Nagesh Gupta, Founder and CEO of Auviz Systems, presents the "Semantic Segmentation for Scene Understanding: Algorithms and Implementations" tutorial at the May 2016 Embedded Vision Summit.
Recent research in deep learning provides powerful tools that begin to address the daunting problem of automated scene understanding. Modifying deep learning methods, such as CNNs, to classify pixels in a scene with the help of the neighboring pixels has provided very good results in semantic segmentation. This technique provides a good starting point towards understanding a scene. A second challenge is how such algorithms can be deployed on embedded hardware at the performance required for real-world applications. A variety of approaches are being pursued for this, including GPUs, FPGAs, and dedicated hardware.
This talk provides insights into deep learning solutions for semantic segmentation, focusing on current state-of-the-art algorithms and implementation choices. Gupta discusses the effect of porting these algorithms to fixed-point representation and the pros and cons of implementing them on FPGAs.
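A toy example of the fixed-point porting question the talk raises (illustrative only; real FPGA flows choose per-layer formats): quantize float weights to a Q8.8 format and inspect the worst-case rounding error.

```python
# Quantize float weights to 16-bit fixed point (Q8.8) and back, then measure
# the rounding error introduced by the conversion.
import numpy as np

def to_fixed(x, frac_bits=8, total_bits=16):
    scale = 1 << frac_bits
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int32)

def to_float(q, frac_bits=8):
    return q.astype(np.float32) / (1 << frac_bits)

w = np.random.randn(1000).astype(np.float32)
err = np.abs(w - to_float(to_fixed(w))).max()   # worst-case rounding error
print(f"max quantization error: {err:.5f}")     # ~1/512 for Q8.8
```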
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us... (Xavier Llorà)
Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the data-intensive computing paradigm to evolutionary computation algorithms. Two representative cases (selectorecombinative genetic algorithms and estimation of distribution algorithms) are presented, analyzed, and discussed. This study shows that equivalent data-intensive computing evolutionary computation algorithms can be easily developed, providing robust and scalable algorithms for the multicore-computing era. Experimental results show how such algorithms scale with the number of available cores without further modification.
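A schematic of the paper's idea in plain Python (the actual study targets a real data-intensive/MapReduce stack): fitness evaluation is the embarrassingly parallel map step, and selection plays the role of the reduce step.

```python
# Map/reduce decomposition of one generation of a selectorecombinative GA.
import random

def map_phase(population, fitness):
    # Each (individual, fitness) pair is independent: trivially parallel.
    return [(ind, fitness(ind)) for ind in population]

def reduce_phase(scored, k=2):
    # Tournament selection over the scored population.
    selected = []
    for _ in range(len(scored)):
        contenders = random.sample(scored, k)
        selected.append(max(contenders, key=lambda p: p[1])[0])
    return selected

onemax = sum                                   # toy fitness: count of 1 bits
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(50)]
parents = reduce_phase(map_phase(pop, onemax))
```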
Application Profiling at the HPCAC High Performance Center (inside-BigData.com)
Pak Lui from the HPC Advisory Council presented this deck at the 2017 Stanford HPC Conference.
"To achieve good scalability performance on the HPC scientific applications typically involves good understanding of the workload though performing profile analysis, and comparing behaviors of using different hardware which pinpoint bottlenecks in different areas of the HPC cluster. In this session, a selection of HPC applications will be shown to demonstrate various methods of profiling and analysis to determine the bottleneck, and the effectiveness of the tuning to improve on the application performance from tests conducted at the HPC Advisory Council High Performance Center."
Watch the video presentation: http://wp.me/p3RLHQ-gpY
Learn more: http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Hokkaido University Academic Cloud: Largest Academic Cloud System in Japan (Masaharu Munetomo)
The Hokkaido University Academic Cloud, which started services in 2011, is the largest academic cloud system in Japan. Its peak performance is 43 TFlops, and HPC/Hadoop cluster instances can be deployed automatically.
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra, and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and I will share these foundational concepts to build on:
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features available on those devices, but many features provide convenience and capability while sacrificing security. This best practices guide outlines steps users can take to better protect personal devices and information.
UiPath Test Automation using UiPath Test Suite series, part 5 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series, part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of a CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Communications Mining Series - Zero to Hero - Session 1 (DianaGray10)
This session provides an introduction to UiPath Communications Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communications Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behavior in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
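A hedged sketch of the seed-trimming idea (DIAR's actual criterion for identifying uninteresting bytes is not reproduced here; `behavior` is a hypothetical stand-in for a coverage fingerprint): remove byte ranges whose deletion leaves observed behavior unchanged, so mutations stop landing on dead weight.

```python
# Greedy seed trimming: drop chunks that do not change observed behavior.
def trim_seed(seed: bytes, behavior, chunk=16):
    baseline = behavior(seed)
    i = 0
    while i < len(seed):
        candidate = seed[:i] + seed[i + chunk:]   # try removing one chunk
        if behavior(candidate) == baseline:
            seed = candidate                      # chunk was inert: drop it
        else:
            i += chunk                            # chunk matters: keep it
    return seed
```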
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Generative AI Deep Dive: Advancing from Proof of Concept to Production (Aggregage)
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
1. Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Masaharu Munetomo
Information Initiative Center, Hokkaido University, Sapporo, JAPAN.
2. Toward the next generation of supercomputing
• Next year (2012): 10 PFlops systems will be installed
• "K" Computer (Fujitsu): 10 PFlops
• Sequoia (IBM): 20 PFlops
• Near future (~2018): Toward exascale (10^18 Flops)
• International Exa-Scale Software Project
3. Potential Architecture in Exa-Flops Era
• # of nodes: O(10^6)
• # of concurrency per node: O(10^3)
• Total concurrency: O(10^9)
• Memory BW: 400 GB/s
• Inter-node BW: 50 GB/s
(Jack Dongarra, presentation at SC10)
4. Uniform or Heterogeneous
• Uniform Architecture: using cores with the same architecture
• PROS: Natural approach, easy to program
• CONS: Needs more power compared with heterogeneous architecture
• Heterogeneous Architecture: CPU + Accelerator with different architecture
• PROS: Energy efficiency, concurrency
• CONS: Difficult to program and optimize compared with uniform one
5. Challenges toward Exa-Scale
• Improvements in interconnect bandwidth and latency are not keeping pace with the increase in the number of cores
• It is difficult to develop massively parallel libraries for many-core architectures
• It is also difficult to develop scalable algorithms for many-core
• In the field of high performance computing, efforts concentrate on parallel numerical libraries such as large-scale matrix calculations. Extreme-scale parallel optimization is not considered.
6. Scalability of Master-Worker Model
• When delay is proportional to # of processors
• When delay is proportional to log of # of processors (ideal situation)
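The equations these two bullets summarize did not survive the transcript; a plausible reconstruction of the standard master-worker analysis (a hedged sketch, not the slide's exact formulas) uses n tasks of evaluation cost T_e, a per-message cost T_c, and P workers:

```latex
% Linear delay: the master serves workers one by one.
% Logarithmic delay: tree-structured task distribution (the ideal case).
\[
T_{\mathrm{lin}}(P) = \frac{n\,T_e}{P} + P\,T_c,
\qquad
P^{*} = \sqrt{\frac{n\,T_e}{T_c}},
\qquad
T_{\mathrm{log}}(P) = \frac{n\,T_e}{P} + T_c \log_2 P .
\]
```

Minimizing T_lin(P) (set its derivative to zero) gives the finite optimum P*, which is consistent with the concrete P* figure quoted on the next slide; with logarithmic delay the communication term grows so slowly that useful scaling continues far beyond that point.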
7. Extremely Parallel Master-Slave Model
• P* should be around 100,000 ~ 1,000,000
• X should be more than 10^10 !!
• OK
• It is preferable if we can hide communication overheads.
8. Toward exa-scale massive parallelization
• Communication latency should be at most O(log n)
• Network hardware architecture: Hypercube, etc.
• Network software stack: Optimizing collective communications, etc.
• Algorithms that reduce communications
• Employing heterogeneous architecture: GPGPU, etc.
9. Massive parallelization of Evolutionary
computation
• Master-Worker Model: Same as general parallelized algorithms
• Island models: difficult to implement massive parallelism as it is
• Localize inter-node communications such as cellular GA
• Massive parallelization of advanced methods like EDAs
→ Dependency (communications) among variables should be considered,
ex) DBOA, pLINC, pD5, etc.
• Massive parallelization on cloud systems: MapReduce GA, ECGA, etc.
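A minimal cellular GA step for illustration (the `fitness`, `mutate`, and `crossover` callbacks are left to the caller): every individual interacts only with its four grid neighbours, so communication stays local, which is exactly the property that makes cGAs attractive on many-core hardware.

```python
# One generation of a cellular GA on a toroidal grid with local selection.
def cga_step(grid, fitness, mutate, crossover):
    n, m = len(grid), len(grid[0])
    new = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            nbrs = [grid[(i - 1) % n][j], grid[(i + 1) % n][j],
                    grid[i][(j - 1) % m], grid[i][(j + 1) % m]]
            mate = max(nbrs, key=fitness)                 # local selection
            child = mutate(crossover(grid[i][j], mate))
            new[i][j] = max(child, grid[i][j], key=fitness)  # elitist replace
    return new
```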
10. Research toward exa-scale ECs
• We should assume more than million-way parallelization
• Making communication overheads as low as possible
• Designing algorithms that need less communication
• Dependency on the target application problems
• When fitness evaluations are costly, simply parallelizing them is enough
• Parallelization of "intelligent" algorithms needs to be considered
• Analyzing interactions among genes to solve complex problems
11. Promising approach toward massive
parallelization
• Massively parallel cellular GAs with local communications on many-core
• Massive parallelization of linkage identification: pLINC, pLIEM, pD5
• Parallelization of EDAs: pBOA, gBOA (A GPGPU implementation)
• GPGPU implementations: gBOA, MAXSAT, MINLP
• MapReduce implementations: MapReduce GA, ECGA, linkage identification
12. A cellular GA implemented on CellBE [Asim 2009]
• A cellular GA (cGA) to solve the Capacitated Vehicle Routing Problem (CVRP) implemented on a many-core architecture, the Cell Broadband Engine (Sony, IBM, Toshiba).
• The cellular GA was employed due to the limited communication BW between SPE and PPE.
13. Parallelization of linkage identification techniques
• Linkage identification techniques are based on pair-wise perturbations to detect nonlinearity/non-monotonicity.
• They are relatively easy to parallelize by assigning the calculation of each pair to a processor
• pLINC, pLIEM, pD5 have been proposed.
• We have succeeded in solving a half-million-bit problem by employing pD5
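A sketch of the pairwise-perturbation test that LINC-style methods build on (simplified; pLINC/pD5 add statistical machinery and the parallel decomposition): genes i and j are flagged as linked when flipping both changes fitness non-additively, and since every (i, j) pair is independent, the l(l-1)/2 tests distribute trivially across processors.

```python
# Pairwise perturbation: detect nonlinearity between genes i and j.
def flip(s, i):
    return s[:i] + [1 - s[i]] + s[i + 1:]

def linked(f, s, i, j, eps=1e-9):
    df_i  = f(flip(s, i)) - f(s)
    df_j  = f(flip(s, j)) - f(s)
    df_ij = f(flip(flip(s, i), j)) - f(s)
    # Linked if the joint effect differs from the sum of individual effects.
    return abs(df_ij - (df_i + df_j)) > eps

# Each processor can test its own slice of the l*(l-1)/2 pairs.
```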
14. A half-million-bit optimization with pD5 [Tsuji 2004]
• A 500,000-bit problem (5-bit trap × 100,000)
• # of local optima = 2^100,000 − 1 ≒ 10^30,000
• HITACHI SR-11000 model K1 (5.4 TFlops)
• 32 days by 1 node (16 CPU)
• 1 day by whole system (40 nodes)
15. pBOA on GPGPU [Asim 2008]
• Implementation of parallel Bayesian Optimization
Algorithm on NVIDIA CUDA.
16. Massive Parallelization over the Cloud: MapReduce
• MapReduce is a simple but powerful approach to realizing massive parallelization over cloud computing systems
• Several papers have been published on MapReduce implementations of GA and ECGA.
• MapReduce ECGA: Calculation of the Marginal Product Model (MPM) and the generation of offspring can be parallelized via the "Map" process.
• Fault tolerance can be handled by MapReduce for massive parallelization
17. Concluding Remarks
• Designing robust and scalable algorithms is not easy, since analysis of linkage or interactions among genes is necessary, which usually requires at least O(l^2) computational overhead and communication among processors.
• In order to adapt to exa-scale massively parallel processing with more than O(10^6) cores, we need to reduce the communication overhead of each message exchange to less than O(log n). Otherwise, the parallel algorithm cannot scale well to such massively parallel architectures.
• We also need to adapt to modern architectures and systems such as heterogeneous architectures with GPGPUs and cloud computing environments.
Editor's Notes
I’m Masaharu Munetomo from the Information Initiative Center of Hokkaido University, one of the Japanese national supercomputing centers. This talk reviews the current status of massively parallel evolutionary algorithms toward the exascale era. Robustness means that an algorithm is applicable to a wide spectrum of applications by analyzing interactions among genes. Realizing robustness and scalability at the same time is generally considered difficult.