BLUE GENE




      Sunitha M. Jenarius
What is Blue Gene

 A massively parallel supercomputer using
  tens of thousands of embedded PowerPC
  processors supporting a large memory space
 With standard compilers and a message-passing
  environment
Why the name “Blue Gene”?

 “Blue”: The corporate color of IBM
 “Gene”: The intended use of the Blue Gene
  clusters – Computational biology, specifically,
  protein folding
History

   Dec. 1999: IBM Research announced a US$100M effort
    to build a petaflop-scale supercomputer.
   Two goals of the Blue Gene project:
    –   Advance massively parallel machine architecture and software
    –   Advance biomolecular simulation by orders of magnitude
   Nov. 2001: partnership with Lawrence
    Livermore National Laboratory (LLNL)

                                        and this resulted in …
Results

 [Figure: Linpack results, Top 500 Supercomputers list]
Blue Gene Projects

 Four Blue Gene projects:
  –   BlueGene/L
  –   BlueGene/C
  –   BlueGene/P
  –   BlueGene/Q
Blue Gene/L

 The first computer in the Blue Gene
  series
 Sept. 29, 2004: IBM announced that a Blue Gene/L
  prototype had overtaken NEC's Earth Simulator as
  the fastest computer in the world
 Final configuration was launched in
  October 2005
Blue Gene/L - Unsurpassed Performance

 Designed to deliver the most performance
  per kilowatt of power consumed
 Theoretical peak performance of 360
  TFLOPS
 Final configuration (Oct. ‘05) sustained over
  280 TFLOPS on the Linpack
  benchmark
 Nov. 14, ‘06, at Supercomputing 2006: Blue
  Gene/L won the top prize in every
  HPC Challenge award class
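
As a quick check of how the peak and sustained figures on this slide relate, the ratio between them can be computed directly. This is only a sketch: the names are illustrative, and the inputs are the rounded numbers quoted above rather than exact benchmark results.

```python
# Rounded figures quoted on the slide, in TFLOPS.
PEAK_TFLOPS = 360.0
LINPACK_TFLOPS = 280.0


def efficiency(sustained: float, peak: float) -> float:
    """Return the fraction of theoretical peak achieved on a benchmark."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


linpack_efficiency = efficiency(LINPACK_TFLOPS, PEAK_TFLOPS)
print(f"Linpack efficiency: {linpack_efficiency:.0%}")
```

With these rounded inputs the machine sustains a little under four fifths of its theoretical peak on Linpack.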
Blue Gene/L Architecture

 Can be scaled up to 65,536 compute or I/O
  nodes, with 131,072 processors
 Each node is a single ASIC with associated
  DRAM memory chips
 Each ASIC has two 700 MHz IBM PowerPC
  processors
 PowerPC processors
  –   Low-frequency, low-power embedded processors,
      presented as superior in power efficiency to the
      high-frequency, high-power microprocessors of the
      time by a factor of 2 or more
Blue Gene/L Architecture contd…

    –   Double-pipeline, double-precision floating-point unit (FPU)
    –   A cache sub-system with built-in DRAM controller
   Node CPUs are not cache coherent with one another
   FPUs and CPUs are designed for low power
    consumption
    –   Using transistors with low leakage current
    –   Local clock gating
    –   Putting the FPU or CPU/FPU pair to sleep
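
The node count, processor count and clock rate given on these slides are enough to estimate the theoretical peak. The sketch below assumes, beyond what the slides state, that each of the two FPU pipelines completes one fused multiply-add (two floating-point operations) per cycle; that figure and all names here are illustrative assumptions.

```python
NODES = 65_536        # from the slide
CPUS_PER_NODE = 2     # from the slide
CLOCK_HZ = 700e6      # 700 MHz, from the slide

# Assumption (not on the slide): 2 pipelines x 1 fused multiply-add per
# cycle, and a fused multiply-add counts as 2 floating-point operations.
FLOPS_PER_CYCLE = 2 * 2


def peak_flops(nodes: int, cpus_per_node: int,
               clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in floating-point operations per second."""
    return nodes * cpus_per_node * clock_hz * flops_per_cycle


per_cpu = peak_flops(1, 1, CLOCK_HZ, FLOPS_PER_CYCLE)
system = peak_flops(NODES, CPUS_PER_NODE, CLOCK_HZ, FLOPS_PER_CYCLE)

print(f"{per_cpu / 1e9:.1f} GFLOPS per processor")
print(f"{system / 1e12:.0f} TFLOPS for the full system")
```

Under that assumption the estimate comes out at about 367 TFLOPS, in line with the roughly 360 TFLOPS peak quoted for Blue Gene/L.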
Blue Gene/L Architecture contd…

 [Figure: System Overview; surviving label: "1024 nodes"]
Blue Gene/L Architecture contd…

 One rack holds 1,024 nodes (2,048 processors)
 Nodes optimized for low power consumption
 ASIC based on system-on-a-chip technology
  –   Large numbers of low-power system-on-a-chip nodes
      allow it to outperform commodity clusters while saving
      power
  –   Aggressive packaging of processors, memory and
      interconnect
  –   Power-efficient and space-efficient
  –   Allows for latencies and bandwidths that are significantly
      better than those of nodes typically used in ASC-scale
      supercomputers
Blue Gene/L Networks

 Each node is attached to three main parallel
  communication networks
  –   3D torus network - peer-to-peer communication between compute
      nodes
  –   Collective network - collective and global
      communication
  –   Ethernet network - I/O and management (such as
      access to any node for configuration, booting and
      diagnostics)
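
To make the 3D torus concrete: each node has six nearest neighbours (one in each direction along three axes), and the edges wrap around so no node sits on a boundary. The sketch below is illustrative only. The 64 x 32 x 32 shape is an assumption chosen because it multiplies out to 65,536 nodes, and the function names are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

# Assumed torus shape for illustration: 64 * 32 * 32 = 65,536 nodes.
DIMS: Coord = (64, 32, 32)


def torus_neighbors(node: Coord, dims: Coord = DIMS) -> List[Coord]:
    """Return the six nearest neighbours of `node`, wrapping at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord = DIMS) -> int:
    """Minimum hop count from a to b, going the shorter way round each ring."""
    return sum(min(abs(x - y), d - abs(x - y))
               for x, y, d in zip(a, b, dims))


print(torus_neighbors((0, 0, 0)))
print(torus_hops((0, 0, 0), (63, 31, 31)))
```

Because of the wraparound links, the "opposite corner" of the machine is only three hops away from the origin, and the worst case in this assumed shape is 32 + 16 + 16 = 64 hops.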
Blue Gene/L System Software

   System software supports efficient execution of
    parallel applications
   Compiler support for DFPU (C, C++, Fortran)
   Compute nodes use a minimal operating system
    called “BlueGene/L compute node kernel”
    –   A lightweight, single-user operating system
    –   Supports execution of a single, dual-threaded application
        compute process
    –   Kernel provides a single, static virtual address space to
        the one running compute process
    –   Because only one process runs, no context switching is
        required
Blue Gene/L System Software contd…

   To allow multiple programs to run concurrently
    –   Blue Gene/L system can be partitioned into electronically
        isolated sets of nodes
    –   The number of nodes in a partition must be a positive
        integer power of 2
    –   To run a program, first reserve a partition
    –   No other program can use that partition until the current
        program finishes
    –   With so many nodes, component failures are inevitable. The
        system is able to electrically isolate faulty hardware to allow
        the machine to continue to run
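
The power-of-2 rule for partition sizes can be illustrated with a small helper that validates a size and rounds a request up to the next legal one. This is a sketch, not the actual Blue Gene/L scheduler: the function names are hypothetical, "positive integer power of 2" is read here as 2, 4, 8, and so on, and any larger minimum partition size the real system may impose is not modelled.

```python
SYSTEM_NODES = 65_536  # maximum system size quoted earlier in the deck


def is_valid_partition_size(nodes: int) -> bool:
    """True if `nodes` is 2**k for some integer k >= 1."""
    return nodes >= 2 and (nodes & (nodes - 1)) == 0


def smallest_partition_for(requested: int,
                           system_nodes: int = SYSTEM_NODES) -> int:
    """Round a node request up to the next legal partition size."""
    if requested < 1 or requested > system_nodes:
        raise ValueError("request must be between 1 and the system size")
    # (requested - 1).bit_length() is the exponent of the next power of 2.
    return max(2, 1 << (requested - 1).bit_length())


for request in (1, 1000, 1024, 1025):
    print(request, "->", smallest_partition_for(request))
```

A job that needs 1,000 nodes would therefore reserve a 1,024-node partition, while one that needs 1,025 would have to reserve 2,048.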
Blue Gene/L System Software contd…

 Parallel programming model
  –   Message passing - supported through an
      implementation of MPI
  –   Only a subset of POSIX calls is supported
  –   Green threads are also used to simulate local
      concurrency
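
The idea behind green threads is that several tasks share one processor and hand control to each other voluntarily, so the kernel never has to switch contexts. The sketch below shows that general idea with Python generators and a round-robin scheduler. It is not how Blue Gene/L implements green threads, and every name in it is hypothetical.

```python
from collections import deque
from typing import Deque, Generator, Iterable, List


def worker(name: str, steps: int, log: List[str]) -> Generator[None, None, None]:
    """A cooperative task: do one unit of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks: Iterable[Generator[None, None, None]]) -> None:
    """Run tasks in turn inside a single OS thread until all have finished."""
    ready: Deque[Generator[None, None, None]] = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)      # task yielded; put it at the back of the queue


log: List[str] = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The two tasks interleave their steps even though only one of them is ever running, which is the sense in which green threads simulate local concurrency.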
Blue Gene/C

 Sister project to BlueGene/L
 Renamed to Cyclops64
 Massively parallel, supercomputer-on-a-chip
  cellular architecture
 Cellular architecture gives the programmer
  the ability to run large numbers of concurrent
  threads within a single processor.
Blue Gene/P

 Architecturally similar to BlueGene/L
 Expected to operate around one petaflop
 Expected around 2008
Blue Gene/Q

 Last known supercomputer in the Blue Gene
  series
 Expected to reach 3-10 petaflops
Resources

 Wikipedia.org
 IBM website
  –   (www.03.ibm.com/servers/deepcomputing/bluegene.htm)
 www.supercomp.org/sc2002/paperpdfs/pap.pap207.pdf
