© 2012 IBM Corporation
Blue Gene/Q®: Design for Sustained Multi-Petaflop Computing
Michael Gschwind
IBM Corp.
This disclosure is provided for informational purposes only and is subject to change, without notice, at the sole discretion of IBM. This information is intended to outline IBM’s
product direction and does not commit IBM to make, offer or deliver any products or technologies which are not currently publicly announced or generally available from IBM.
© 2012 IBM Corporation
Michael Gschwind, Blue Gene/Q®: Design for Sustained Multi-Petaflop Computing
International Conference on Supercomputing 2012, San Servolo, Venice
IBM System Design
• Holistic system-level optimization across hardware and software
  – From technology and circuits to software
  – Co-design of technologies for maximum efficiency
• Innovation across the entire value stack
• Benefit from IBM’s system skills
  – Experience in system tuning and optimization
Blue Gene/Q Design Goals
• Provide the platform for the future of supercomputing
  – Order of magnitude improvement in efficiency
  – Build the most scalable, most reliable, most programmable and fastest supercomputer
• Give programmers the tools to exploit this platform with groundbreaking applications
Leverage IBM’s technologies for HPC
• High-reliability SOI design
• Power ISA™
• SIMD Vector architecture
• Communication technologies
• Packaging and Cooling
• System-level RAS experience
• Numeric middleware: ESSL, MASSV
• Deep understanding of algorithms and workloads
And Use HPC Innovation to Benefit Commercial Systems
• Multicore SoC
• Fast Networks
• Workload Optimized
• Hybrid Architectures
• Scalability
• High Efficiency
The “Walls”: Traditional Obstacles to Design
• Memory Wall
  – High bandwidth interfaces
  – Exploiting memory level parallelism
• Power Wall
  – Optimize for FLOPS / W
  – Improve design efficiency
• Frequency Wall
  – Optimize balance between frequency and parallelism
The New “Walls”
• Scalability Wall
• Communication Wall
• Reliability Wall
• Programmer Productivity Wall
Blue Gene/Q Concept
• Avoid scaling of processors into area of diminishing returns
• Scalable, integrated SMP node
• Integrated system-level network
• Familiar environment and new abstractions
• Design with a focus on system-level reliability
System Scaling to Multi-Petaflop
• Chip: 16 cores @ 205 GF/s
• Compute Card: one chip in SCM, 16 GB DDR3 memory
• Node Card: 32 compute cards @ 6.5 TF/s, link chips, optics, torus
• Rack: 32 node cards in 2 midplanes, I/O drawers
• System: 96 racks @ 20 PF/s
Rack
• Peak performance: 209 TF
• Sustained (Linpack): ~170+ TF
• Power: ~100 kW
• Power efficiency: ~1.7 GF/W
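The packaging hierarchy above can be checked with a short roll-up. This is a minimal sketch that multiplies the per-chip peak through the card, rack, and system counts quoted on the slides; the function and constant names are illustrative only and do not come from the talk.

```python
# Peak-performance roll-up from chip to system, using figures quoted on the slides.
CHIP_PEAK_GFLOPS = 204.8     # one compute chip (rounded to 205 GF/s on the slide)
CARDS_PER_NODE_CARD = 32     # compute cards per node card
NODE_CARDS_PER_RACK = 32     # node cards per rack (2 midplanes)
RACKS_PER_SYSTEM = 96        # racks in the 20 PF/s configuration


def rollup(chip_gflops=CHIP_PEAK_GFLOPS):
    """Return peak GFLOPS at each packaging level."""
    node_card = chip_gflops * CARDS_PER_NODE_CARD
    rack = node_card * NODE_CARDS_PER_RACK
    system = rack * RACKS_PER_SYSTEM
    return {"chip": chip_gflops, "node_card": node_card,
            "rack": rack, "system": system}


def gflops_per_watt(teraflops, kilowatts):
    """TF/kW equals GF/W because both units scale by 1000."""
    return teraflops / kilowatts


levels = rollup()
print(f"node card: {levels['node_card'] / 1e3:.2f} TF/s")
print(f"rack:      {levels['rack'] / 1e3:.1f} TF/s")
print(f"system:    {levels['system'] / 1e6:.2f} PF/s")
print(f"sustained rack efficiency: {gflops_per_watt(170, 100):.1f} GF/W")
```

The results (about 6.55 TF/s per node card, 209.7 TF/s per rack, 20.13 PF/s per system) agree with the rounded values on the slide, and ~170 TF at ~100 kW gives the quoted ~1.7 GF/W.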
BlueGene/Q Compute chip
• 360 mm² Cu-45 technology (SOI)
  – ~1.47 B transistors
• 18-way CMP
  – 16 user + 1 service processors
  – plus 1 redundant processor
  – all processors are symmetric
  – peak performance 204.8 GFLOPS @ 55 W
• Central shared L2 cache: 32 MB
  – eDRAM
• Dual memory controller
  – 16 GB external DDR3 memory
  – 1.33 Gb/s
  – 2 * 16 byte-wide interface (+ECC)
• Chip-to-chip networking
  – Router logic integrated into BQC chip
• External IO
  – PCIe Gen2 interface
System-on-a-Chip design: integrates processors, memory and networking logic into a single chip
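The chip-level numbers follow from a few multiplications. The sketch below assumes that the 204.8 GFLOPS peak counts only the 16 user cores at 1.6 GHz with 8 floating-point operations per cycle, and that "1.33 Gb/s" is the per-pin DDR3 data rate; both readings are assumptions, and the names are illustrative.

```python
USER_CORES = 16          # cores available to applications
CLOCK_GHZ = 1.6          # core clock quoted for the A2 core
FLOPS_PER_CYCLE = 8      # 4 SIMD lanes x fused multiply-add (2 flops)
CHIP_POWER_W = 55        # power quoted alongside the peak figure


def chip_peak_gflops():
    """Peak double-precision GFLOPS of one compute chip."""
    return USER_CORES * CLOCK_GHZ * FLOPS_PER_CYCLE


def chip_gflops_per_watt():
    """Peak efficiency of the chip alone (not the whole rack)."""
    return chip_peak_gflops() / CHIP_POWER_W


def ddr3_peak_gbytes_per_s(controllers=2, bytes_wide=16, gbit_per_pin=1.33):
    """Peak memory bandwidth, ASSUMING 1.33 Gb/s is the per-pin data rate."""
    return controllers * bytes_wide * gbit_per_pin


print(f"peak:       {chip_peak_gflops():.1f} GFLOPS")
print(f"efficiency: {chip_gflops_per_watt():.2f} GF/W")
print(f"memory:     {ddr3_peak_gbytes_per_s():.1f} GB/s")
```

Under these assumptions the chip-only efficiency is roughly 3.7 GF/W, higher than the rack-level figure because rack power also covers memory, network, and power conversion.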
BG/Q Processor Unit
• A2 processor core
• Based on PowerEN™ design
• Implements 64-bit Power ISA™
• Optimized for aggregate throughput:
  – 4-way simultaneously multi-threaded (SMT)
  – 2-way concurrent issue: XU (br/int/l/s) + QPU
• 1.6 GHz @ 0.8 V
Putting the Q into Blue Gene/Q: Quadvector Processing Unit (QPU)
• Power-efficient scalable performance
  – 8 floating point ops (FMA) per cycle
  – Concurrent load and store to ensure peak utilization
• 4 double precision pipelines, usable as:
  – scalar FPU
  – 4-wide FPU SIMD
  – 2-wide complex arithmetic SIMD
• QPX Instruction extensions to Power ISA
• Load/store supports a multitude of alignments
• Permute instructions to reorganize vector data
[Figure: QPU block diagram with four register file (RF) slices, pipelines MAD0 to MAD3, permute and load units, and the A2 core; data paths labeled 256 and 64]
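To make the "4-wide" and "2-wide complex" usage modes concrete, here is a small model of four double-precision lanes. It is a sketch only: the lane layout for complex data ([re0, im0, re1, im1]) is an assumption, and the functions model the arithmetic, not any specific QPX instruction.

```python
LANES = 4  # four double-precision pipelines per QPU


def fma4(a, b, c):
    """Lane-wise fused multiply-add: a[i] * b[i] + c[i] for four lanes."""
    assert len(a) == len(b) == len(c) == LANES
    return [x * y + z for x, y, z in zip(a, b, c)]


def flops_per_cycle(lanes=LANES):
    """One FMA per lane per cycle counts as two floating-point operations."""
    return lanes * 2


def complex_mul_2wide(x, y):
    """Multiply two pairs of complex numbers held as [re0, im0, re1, im1].

    The interleaved layout is an assumption made for this illustration.
    """
    out = []
    for k in (0, 2):
        xr, xi, yr, yi = x[k], x[k + 1], y[k], y[k + 1]
        out += [xr * yr - xi * yi, xr * yi + xi * yr]
    return out


print(fma4([1, 2, 3, 4], [10, 10, 10, 10], [1, 1, 1, 1]))
print(flops_per_cycle())
print(complex_mul_2wide([1, 2, 3, 4], [5, 6, 7, 8]))
```

The same four lanes serve either as four independent real values or as two complex values, which is why the slide lists both modes for one set of pipelines.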
Putting the Q into Blue Gene/Q: Quadvector Processing Extensions (QPX)
• New FPU SIMD architecture optimized for Blue Gene/Q
  – Scalable, efficient performance
  – Co-design for efficient code generation
• Builds on Blue Gene application insights
  – Optimized for scientific applications
• Improved programmer productivity
  – Automatic SIMD-ization
  – Efficient handling of unaligned data
  – Conversions from control flow to dataflow
  – Efficient SIMD exception handling
[Figure: QPU and QPX design attributes: programmability, compilability, PowerPC compatibility, IEEE capability, efficiency]
Putting the Q into Blue Gene/Q: Beyond SIMD
Flexible communication between element lanes
[Figure panels: vector permute; load & replicate (from D$); complex numbers]
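The two lane-communication patterns named on this slide can be modeled in a few lines. This is a rough sketch: the pattern encoding (indices 0 to 7 into the concatenation of two source vectors) is an assumption for illustration and is not taken from the QPX definition.

```python
def permute(src_a, src_b, pattern):
    """Build a 4-lane result by picking lanes from two 4-lane sources.

    pattern[i] is an index 0..7 into src_a + src_b (an assumed encoding).
    """
    pool = list(src_a) + list(src_b)
    assert len(pool) == 8 and len(pattern) == 4
    return [pool[p] for p in pattern]


def load_and_replicate(memory, index, lanes=4):
    """Load one element and copy it into every lane (a 'splat')."""
    return [memory[index]] * lanes


a = [0.0, 1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0, 7.0]
print(permute(a, b, [3, 2, 1, 0]))   # reverse the lanes of a
print(permute(a, b, [1, 2, 3, 4]))   # shift by one lane across a and b
print(load_and_replicate(b, 2))      # broadcast a scalar from memory
```

The "shift across two sources" pattern is the kind of operation that helps with data that does not start on a vector boundary, and the broadcast covers the common scalar-times-vector case.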
Breaking the control flow limit with data-level parallelism
for (i = 0; i < VL; i++)
    if (a[i] != 0)
        m[i] = b[i] / a[i];
    else
        m[i] = b[i] * 10;
Control-flow form (long latency), for each i = 0..3:
    a[i] != 0 ?  m[i] = b[i]/a[i]  :  m[i] = b[i]*10
Exploit data parallelism, for each lane i = 0..3:
    a'[i] = b[i]/a[i]        b'[i] = b[i]*10
    s[i] = a[i] != 0
    m[i] = s[i] ? a'[i] : b'[i]
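The conversion from control flow to dataflow shown above can be sketched as follows. The helper `ieee_div` is a hypothetical stand-in for a hardware divide that returns infinity or NaN instead of trapping; Python's own `/` would raise on a zero divisor, which is precisely the hazard of computing both sides of the branch for every lane.

```python
import math


def ieee_div(x, y):
    """Divide without trapping: x/0 gives +/-inf and 0/0 gives NaN."""
    if y != 0:
        return x / y
    if x == 0 or math.isnan(x):
        return math.nan
    return math.copysign(math.inf, x) * math.copysign(1.0, y)


def branchy(a, b):
    """Original loop: one data-dependent branch per element."""
    m = []
    for ai, bi in zip(a, b):
        if ai != 0:
            m.append(bi / ai)
        else:
            m.append(bi * 10)
    return m


def if_converted(a, b):
    """Dataflow form: compute both candidates for all lanes, then select."""
    a_prime = [ieee_div(bi, ai) for ai, bi in zip(a, b)]   # may hold inf/NaN
    b_prime = [bi * 10 for bi in b]
    s = [ai != 0 for ai in a]
    return [ap if si else bp for si, ap, bp in zip(s, a_prime, b_prime)]


a = [2.0, 0.0, 4.0, 0.0]
b = [8.0, 3.0, 2.0, 5.0]
print(branchy(a, b))
print(if_converted(a, b))
```

Both forms give the same result; the second has no branches, but lanes with a[i] == 0 briefly hold a non-finite quotient that is never selected.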
Putting the Q into Blue Gene/Q: Exception condition detection in data-parallel code
QPX instruction sequence and the per-lane operation it performs (i = 0..3):
    qvfcmpgt  qS, qA, qB            s[i]  = a[i] != 0
    qvre      qT, qA                t[i]  = 1/a[i]
    qvfmul    qT', qB, qT           t'[i] = b[i]*t[i]
    qvfmul    qB', qB, q10_0        b'[i] = b[i]*10
    qvfsel    qM, qS, qT', qB'      m[i]  = s[i] ? t'[i] : b'[i]
    qvstfdxi  qM, rA, rB            store & indicate m[i]
The first build of this slide ends with a plain store (qvstfdx qM, rA, rB: store m[i]); the final build replaces it with the store-and-indicate form (qvstfdxi).
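The idea of deferring exception detection to the store can be illustrated with a small model. This sketch assumes that "indicate" means flagging a stored lane that holds NaN or infinity; the slide does not spell out the exact condition, so treat that as an assumption. It also uses an exact reciprocal where the slide uses the estimate instruction qvre.

```python
import math


def recip(x):
    """1/x that yields +/-inf for zero instead of raising."""
    return 1.0 / x if x != 0 else math.copysign(math.inf, x)


def compute_lanes(a, b):
    """Mirror the slide's sequence: compare, reciprocal, two multiplies, select."""
    s = [ai != 0 for ai in a]
    t = [recip(ai) for ai in a]
    t_prime = [bi * ti for bi, ti in zip(b, t)]      # may be inf or NaN
    b_prime = [bi * 10 for bi in b]
    m = [tp if si else bp for si, tp, bp in zip(s, t_prime, b_prime)]
    return m, t_prime


def store_and_indicate(memory, offset, m):
    """Store the lanes; return True if any STORED value is non-finite (assumed rule)."""
    memory[offset:offset + len(m)] = m
    return any(not math.isfinite(v) for v in m)


a = [2.0, 0.0, 4.0, 0.0]
b = [8.0, 3.0, 2.0, 5.0]
m, t_prime = compute_lanes(a, b)
memory = [0.0] * 4
flag = store_and_indicate(memory, 0, m)
print(m, flag)
print(t_prime)   # lanes 1 and 3 held inf but were never selected
```

In this model the infinities produced in unselected lanes raise nothing, and only a non-finite value that actually reaches memory would be reported.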
Scalability innovations for thread-level communication
• Atomic operations
  – Faster OpenMP work hand off
  → Reduced messaging latency
• Wake-up unit
  → Improve latency and reduce contention
Scalability innovation to enable speculation: breaking data dependences with cache multi-versioning
• Multi-versioned data
  – Tracked by score board in L2 cache
• Transactional Memory:
  – Load/store conflicts detected and resolved by SW
• Speculative Execution:
  – Load/store conflicts detected and resolved based on original program ordering
[Figure: a single MPI task with user-defined parallelism, showing a user-defined transaction start and end, a hardware-detected dependency conflict followed by rollback, and the parallelization completion synchronization point]
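The start, conflict, rollback, and retry cycle in the figure can be mimicked in software. The sketch below is only an analogy: on Blue Gene/Q the versions are tracked by hardware in the L2 cache, whereas here a Python dictionary of version counters plays that role, and all class and function names are hypothetical.

```python
class VersionedStore:
    """Shared data with a per-key version counter (a software stand-in for
    the version tracking the slide attributes to the L2 cache)."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in self.data}


class Transaction:
    def __init__(self, store):
        self.store = store
        self.reads = {}    # key -> version observed at first read
        self.writes = {}   # key -> buffered (speculative) value

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Conflict: a value we read was committed by another transaction.
        if any(self.store.version[k] != v for k, v in self.reads.items()):
            self.writes.clear()   # rollback: drop speculative state
            return False
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True


def run_atomically(store, body, max_retries=10):
    """Re-run body until it commits; return the number of rollbacks."""
    for rollbacks in range(max_retries):
        tx = Transaction(store)
        body(tx)
        if tx.commit():
            return rollbacks
    raise RuntimeError("transaction did not commit")


counter = VersionedStore({"x": 0})
run_atomically(counter, lambda tx: tx.write("x", tx.read("x") + 1))
print(counter.data["x"])
```

A transaction that read a value which another transaction has since committed fails its commit and is re-executed, which corresponds to the rollback arrow in the figure.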
Scalability with the 17th core
• Common IO Client Interface
  – Asynchronous I/O completion hand-off
• Application Agents: privileged application processing
  – Messaging assist, e.g., MPI pacing thread
  – Performance and trace helpers
• RAS Event handling and interrupt off-load
  – Reduce O/S noise and jitter
Interconnect Innovations for MPI Scaling
• Integrated 5D torus
  – High nearest neighbor bandwidth
    • measured ~1.75 GB/s per link
  – Simple partitioning
• 10 links
  – Hardware assisted collective and barrier
  – FP addition support in network
• Hardware latency
  – Nearest: 80 ns
  – Farthest: 3 μs (96-rack 20 PF system)
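The link count follows directly from the torus dimensionality: two links (plus and minus) in each of five dimensions. The sketch below computes neighbors and hop distance on a 5D torus. The shape (16, 16, 16, 12, 2) is an assumption chosen only because its product is 98,304 nodes; the slides do not state the actual torus dimensions.

```python
import math

ASSUMED_SHAPE = (16, 16, 16, 12, 2)   # illustrative; product is 98,304


def torus_neighbors(coord, shape):
    """Return the node reached over each of the 2 * len(shape) links."""
    result = []
    for dim, (c, n) in enumerate(zip(coord, shape)):
        for step in (-1, 1):
            nb = list(coord)
            nb[dim] = (c + step) % n   # wrap around at the torus edges
            result.append(tuple(nb))
    return result


def torus_hops(a, b, shape):
    """Minimum hop count, taking the shorter way around each ring."""
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, shape))


def torus_diameter(shape):
    """Largest possible minimum hop count between any two nodes."""
    return sum(n // 2 for n in shape)


origin = (0, 0, 0, 0, 0)
print(math.prod(ASSUMED_SHAPE))
print(len(torus_neighbors(origin, ASSUMED_SHAPE)))
print(torus_diameter(ASSUMED_SHAPE))
```

In a dimension of size 2 both links reach the same node, so a node can have ten links but fewer than ten distinct neighbors.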
Communication Innovations for MPI Scaling: High-Performance, application-level RDMA messaging acceleration
• Dedicated accelerators for MPI
• 32 MEs offload packet operations
• Accessible from problem state applications
[Figure: Messaging Unit (MU) block diagram: crossbar switch slave port and master ports; DCRs; interrupts and UPC events; global barrier control and status; iMEs with injection control and arbitration; rMEs with reception control; IC, MC, RC and RPUT SRAMs; connections to the ND injection FIFOs, the ND reception FIFOs, and the central global barrier logic in the ND]
Design for Reliability: Blue Gene Compute Chip
• DDR3, L2 cache, network, all major arrays and buses: ECC protected
• Stacked / DICE latches for critical state holding latches
• Register files and minor buses: parity protected, with recovery
[Figure labels: FMA, PRM, recovery, Copy 0, Copy 1; US7,512,772]
Co-Design for Reliability and Serviceability
• Avoid “silent hangs” by catching and reporting failure indications (“machine checks”)
• Firmware operates from on-chip BeDRAM with minimal resources
  – Machine check vectors point to BeDRAM
• Hardware provides on-chip 256K BeDRAM (“Boot eDRAM”)
  – Node firmware can continue to operate in the presence of L2, DDR interface and memory failure
  – Minimize requirements to report node failure
Packaging and Cooling for Reliability
• Water cooled node board
  – Very stable, low temperature environment for ASICs, DRAM, power supplies, optics
• Node Board integrates
  – 32 compute cards
  – 8 link ASICs
    • drive 4D links using 10 Gb/s optical transceivers
• Compute Cards
  – Memory soldered to card for reliability
• Hot pluggable front-end power supplies
Blue Gene/Q Compute Card
• Basic FRU of a BlueGene/Q system
• Compute Card has 1 BQC chip + 72 SDRAMs (16 GB DDR3)
• Two heat sink options: water-cooled → “Compute Node” / air-cooled → “IO Node”
• Connectors carry power supplies, JTAG, etc., and 176 HSS signals (4 and 5 Gbps)
Packaging and Cooling
• Height: 2095.5 mm (82.5 inches)
• Width: 1219.2 mm (48 inches)
• Depth: 1321 mm (52 inches)
• Weight: 1520 kg (3350 lbs), including water
• I/O enclosure with 4 drawers: 177 kg (390 lbs)
[Figure: water cooled node board with 32 compute cards]
Cabled Rack
Rack Water Cooling
Blue Gene Rack Group
Argonne Mira
Blue Gene/Q Sequoia System
Programmability: emphasis on general purpose applicability
Messaging Software Architecture
• BG/Q Messaging is based on the optimized Parallel Active Message Interface (PAMI) open source library
• Multiple programming models may coexist within a task (e.g. MPI + GA/ARMCI)
• Direct access to PAMI and bare-metal SPI interfaces is allowed
[Figure: messaging software stack. Application; high-level APIs: MPICH, Global Arrays, ARMCI, Converse/Charm++, UPC*, other paradigms*; PAMI API (C); Message Layer Core (C++) with pt2pt protocols and collective protocols; low-level API: SPI; DMA device, collective device, GI device, shmem device; network hardware (DMA, collective network, global interrupt network)]
Application Performance: Speedup from Blue Gene/P to Blue Gene/Q
• Q/P ratio indicates performance increase per node
  – using the same number of cores on both BG/P and BG/Q
  – 2x-4x more total threads on BG/Q
• Applications expected to benefit from tuning over system life
  – Improved vectorization and SIMD exploitation
  – Tuning to network topologies

Application          Q/P ratio
Ported Applications
  DNS3D              16.9
  FLASH              5.9
  GFMC               10.5
  GTC                11.2
  HELD-SUAREZ        8.2
  MILC               6.1
  NAMD               5.5
  NEK                7.4
  Geometric Mean     8.4
Tuned Applications
  GROMACS            15
  GADGET             16
  GENE               11
  OCTOPUS            11
  Geometric Mean     13
Linpack              173 TF/s*

* Linpack (tuned) performance is ~82.5% of peak performance on BG/Q
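The summary rows of the table can be reproduced from the individual entries. This is a minimal sketch; the rack peak used for the Linpack fraction (32 × 32 chips at 204.8 GFLOPS) comes from the earlier slides, and the function name is illustrative.

```python
import math

PORTED = {"DNS3D": 16.9, "FLASH": 5.9, "GFMC": 10.5, "GTC": 11.2,
          "HELD-SUAREZ": 8.2, "MILC": 6.1, "NAMD": 5.5, "NEK": 7.4}
TUNED = {"GROMACS": 15, "GADGET": 16, "GENE": 11, "OCTOPUS": 11}


def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


RACK_PEAK_TF = 204.8 * 32 * 32 / 1e3   # 209.7 TF per rack
LINPACK_TF = 173.0                     # tuned Linpack result from the table

print(f"ported: {geometric_mean(PORTED.values()):.1f}")
print(f"tuned:  {geometric_mean(TUNED.values()):.0f}")
print(f"Linpack fraction of peak: {LINPACK_TF / RACK_PEAK_TF:.1%}")
```

The geometric mean is the usual choice for averaging ratios, since it is not dominated by a single large speedup such as DNS3D.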
A Vision for Applying Supercomputers
• Traditional usage for science discoveries
• For advances in drug discovery and understanding life sciences
[Figure panels: Materials Science; Molecular Dynamics; Life Sciences: Sequencing; Life Sciences: In-Silico Trials, Drug Discovery; Biological Modeling; Brain Science]
For a Smarter, Greener Planet
[Figure panels: Fraud Prevention; Telecom; Environment; Law Enforcement; Health Care; Traffic Control; Energy]
For improved business results with Advanced Business Analytics for new “Big Data”
[Figure: data scale (Kilo, Mega, Giga, Tera, Peta, Exa) versus decision frequency (yr, mo, wk, day, hr, min, sec, ms, μs; occasional, frequent, real-time). Traditional data warehouse and business intelligence sits at data at rest; data in motion extends the range up to 10,000 times larger and up to 10,000 times faster.]
Telco Promotions
• 100,000 records/sec, 6B/day
• 10 ms/decision
• 270 TB for Deep Analytics
And for Answering All Questions
[Figure: question-answering pipeline. Stages shown: clue/category analysis; distributed primary search; search and candidate generation; shallow scoring and filtering with shallow answer scorers; supporting evidence retrieval; deep evidence scorers and deep answer scorers; final merging/ranking with statistical models; refine interpretation. Volumes shown: huge volumes of text and 100s of millions of facts for knowledge; 100s of hits and 10Ks of hits; 1,000s of candidates and 100s of candidates; 10Ks of pieces of evidence; 100K scores. Result: answer with confidence in <300 milliseconds.]
New Era of High Performance Computing
“Physics Driven” and “Data Driven”
→ Technology becomes affordable and pervasive
→ Broad adoption across a variety of industries
→ Usage for modeling, simulation, predictive analysis workloads
→ Delivered via Clusters, Grids, and Cloud
Supercomputers
Science, Research & Government
BlueGene/Q, Power 775, iDataPlex
Business Analytics
Financial Services
Digital Media
Analytics / Big Data / Cloud deployment
16.32 PFLOP (81.08% eff) on 2012-06-08
jobid: 24760 size=98304 end=2012-06-08 09:55:38.726489 /bgsys/apps/linpack//BURN_R0-RB-96K.2012_0607_1042 PASS
16324.751 TF (81.08%)
stdout[0]: ================================================================================
stdout[0]: T/V N NB P Q Time Gflops
stdout[0]: --------------------------------------------------------------------------------
stdout[0]: WR16L2L4 12681215 256 384 1024 83280.78786813750048168 1.63247519148838781e+07
stdout[0]: --VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
stdout[0]: Max aggregated wall time rfact . . . : 10.29742211375065608
stdout[0]: + Max aggregated wall time pfact . . : 6.47944774470261109
stdout[0]: + Max aggregated wall time mxswp . . : 4.23286864816418529
stdout[0]: + Max aggregated wall time laswp . . : 2011.12352745312227853
stdout[0]: Max aggregated wall time up tr sv . : 69.15868747876083944
stdout[0]: Max aggregated wall time dgemm . . . : 75943.87711409496841952
stdout[0]: Max aggregated wall time dtrsm . . . : 466.78703577646933809
stdout[0]: --------------------------------------------------------------------------------
stdout[0]: ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005574431767873 ...... PASSED
stdout[0]: RAMP ended at 1
stdout[0]: ================================================================================
stdout[0]:
stdout[0]: Finished 1 tests with the following results:
stdout[0]: 1 tests completed and passed residual checks,
stdout[0]: 0 tests completed and failed residual checks,
stdout[0]: 0 tests skipped because of illegal input values.
stdout[0]: --------------------------------------------------------------------------------
stdout[0]:
stdout[0]: End of Tests.
stdout[0]: ================================================================================
Blue Gene/Q®: The World’s Fastest Supercomputer
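The headline numbers in this log can be re-derived from N, the run time, and the node count. The sketch uses the standard HPL operation count of 2/3·N³ + 3/2·N² and assumes a per-node peak of 204.8 GFLOPS (the chip figure from the earlier slide); variable names are illustrative.

```python
N = 12681215                 # matrix order from the log
TIME_S = 83280.78786813750   # wall time in seconds from the log
NODES = 98304                # "size=98304" in the job line
NODE_PEAK_GFLOPS = 204.8     # assumed per-node peak (one compute chip)


def hpl_flop_count(n):
    """Floating-point operations HPL credits for solving an n x n system."""
    return (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2


def hpl_gflops(n, seconds):
    """Sustained rate in GFLOPS."""
    return hpl_flop_count(n) / seconds / 1e9


sustained = hpl_gflops(N, TIME_S)
peak = NODES * NODE_PEAK_GFLOPS
print(f"sustained:  {sustained / 1e6:.2f} PFLOPS")
print(f"peak:       {peak / 1e6:.2f} PFLOPS")
print(f"efficiency: {sustained / peak:.2%}")
```

The sustained rate agrees with the 1.632e+07 Gflops printed by the benchmark, and dividing by the aggregate peak gives an efficiency of about 81%.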
Top500 Performance Trends
Over the long haul, IBM has demonstrated continued leadership in various TOP500 metrics, even as performance continues its relentless growth.
[Figure: Rmax performance (TFlops, logarithmic axis from 0.0001 to 1,000,000) from Jun 93 to Jun 12, with curves for total aggregate performance, #1, #10 and #500. Blue square markers indicate IBM leadership. The latest #1 point is labeled 16.33 PF. Source: www.top500.org]
Blue Gene/Q is the most energy-efficient supercomputer (*)
# Site Vendor Processor kWatts Mflops/w
1 DOE/NNSA/LLNL IBM BlueGene/Q, Power BQC 16C 1.60GHz 41.1 2100.88
2 IBM Thomas J. Watson Research Center IBM BlueGene/Q, Power BQC 16C 1.60GHz 41.1 2100.88
3 DOE/SC/Argonne National Laboratory IBM BlueGene/Q, Power BQC 16C 1.60GHz 82.2 2100.86
4 DOE/SC/Argonne National Laboratory IBM BlueGene/Q, Power BQC 16C 1.60GHz 82.2 2100.86
5 Rensselaer Polytechnic Institute IBM BlueGene/Q, Power BQC 16C 1.60GHz 82.2 2100.86
6 University of Rochester IBM BlueGene/Q, Power BQC 16C 1.60GHz 82.2 2100.86
7 IBM Thomas J. Watson Research Center IBM BlueGene/Q, Power BQC 16C 1.60GHz 82.2 2100.86
8 University of Edinburgh IBM BlueGene/Q, Power BQC 16C 1.60GHz 493.1 2099.56
9 Daresbury Laboratory (S&T Fac. Council) IBM BlueGene/Q, Power BQC 16C 1.60GHz 575.3 2099.50
10 Forschungszentrum Juelich (FZJ) IBM BlueGene/Q, Power BQC 16C 1.60GHz 657.5 2099.46
11 CINECA IBM BlueGene/Q, Power BQC 16C 1.60GHz 821.9 2099.39
12 High Energy Accelerator Research Org. /KEK IBM BlueGene/Q, Power BQC 16C 1.60GHz 246.6 2099.14
13 EDF R&D IBM BlueGene/Q, Power BQC 16C 1.60GHz 328.8 2099.14
14 IDRIS/GENCI IBM BlueGene/Q, Power BQC 16C 1.60GHz 328.8 2099.14
15 Victorian Life Sciences Computation Initiative IBM BlueGene/Q, Power BQC 16C 1.60GHz 328.8 2099.14
16 IBM - Rochester IBM BlueGene/Q, Power BQC 16C 1.60GHz 164.4 2099.14
17 IBM - Rochester IBM BlueGene/Q, Power BQC 16C 1.60GHz 164.4 2099.14
18 DOE/NNSA/LLNL IBM BlueGene/Q, Power BQC 16C 1.60GHz 164.4 2099.14
19 DOE/SC/Argonne National Laboratory IBM BlueGene/Q, Power BQC 16C 1.60GHz 3945 2069.04
20 DOE/NNSA/LLNL IBM BlueGene/Q, Power BQC 16C 1.60GHz 7890 2069.04
Source: www.green500.org
* As measured by Green500, 6/2011, 11/2011 and 6/2012
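The two numeric columns of the table imply each system's sustained performance, which gives a quick consistency check against the Linpack result quoted earlier. A minimal sketch, with an illustrative function name:

```python
def implied_rmax_tf(mflops_per_watt, kilowatts):
    """Sustained TFLOPS implied by an efficiency and a power figure.

    Mflops/W x kW = 1e3 Mflops per unit, and 1e6 Mflops = 1 TFLOPS,
    so the product is divided by 1e3.
    """
    return mflops_per_watt * kilowatts / 1e3


# Rows 20, 19 and 1 of the table above.
print(f"{implied_rmax_tf(2069.04, 7890):.1f} TF")
print(f"{implied_rmax_tf(2069.04, 3945):.1f} TF")
print(f"{implied_rmax_tf(2100.88, 41.1):.1f} TF")
```

Row 20 (7890 kW at 2069.04 Mflops/W) works out to about 16,325 TF, which matches the 16.32 PFLOPS Linpack run.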
Blue Gene/Q® is designed to overcome hurdles
• Ultra-scalability for breakthrough science, broad range of HPC applicability
• Address traditional walls for computer systems
• Address new walls on the path to ultrascalable exa-scale computing
• Low power, small footprint, low total cost of ownership (TCO)
• High reliability, 10-100X better MTBF/TF, low maintenance requirements
• New, even higher performing 256-bit quad-vector SIMD with 8 FLOPs per cycle
• Low latency, high performance inter-processor communications
• Open source and standards-based programming environment
Thank you!
Disclaimer
IBM Corporation
Integrated Marketing Communications, Systems & Technology Group
Route 100
Somers, NY 10589
Produced in the United States of America
06/28/2012
All Rights Reserved
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole
discretion. Information regarding potential future products is intended to outline our general product direction and it should
not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a
commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future
products may not be incorporated into any contract. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
Some information in this document addresses anticipated future capabilities. Such information is not intended as a definitive
statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future
products. Such commitments are only made in IBM product announcements. The information is presented here to
communicate IBM’s current investment and development activities as a good faith effort to help with our customers’ future
planning.
All performance information was determined in a controlled environment. Actual results may vary. Performance information
is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM.
IBM does not warrant that the information offered herein will meet your requirements or those of your distributors or
customers. IBM provides this information “AS IS” without warranty. IBM disclaims all warranties, express or implied,
including the implied warranties of noninfringement, merchantability and fitness for a particular purpose.
References in this publication to IBM products or services do not imply that IBM intends to make them
available in every country in which IBM operates. Consult your local IBM business contact for information
on the products, features, and services available in your area.
Blue Gene, Blue Gene/Q and PowerEN are trademarks or registered trademarks of IBM Corporation in the
United States, other countries or both.
PowerISA and Power Architecture are trademarks or registered trademarks in the United States, other
countries, or both, licensed through Power.org
Linux is a registered trademark of Linus Torvalds.
Tivoli is a registered trademark of Tivoli Systems Inc. in the United States, other countries or both.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through The
Open Group.
Other trademarks and registered trademarks are the properties of their respective companies.
Photographs shown are of engineering prototypes. Changes may be incorporated in production models.
This equipment is subject to all applicable FCC rules and will comply with them upon delivery.
IBM makes no representations or warranties, expressed or implied, regarding non-IBM products and
services.

Michael Gschwind, Blue Gene/Q: Design for Sustained Multi-Petaflop Computing

  • 1. Blue Gene/Q®: Design for Sustained Multi-Petaflop Computing
      Michael Gschwind, IBM Corp.
      © 2012 IBM Corporation
      This disclosure is provided for informational purposes only and is subject to change, without notice, at the sole discretion of IBM. This information is intended to outline IBM’s product direction and does not commit IBM to make, offer or deliver any products or technologies which are not currently publicly announced or generally available from IBM.
  • 2. IBM System Design
      - Holistic system-level optimization across hardware and software
          - From technology and circuits to software
          - Co-design of technologies for maximum efficiency
      - Innovation across the entire value stack
      - Benefit from IBM’s system skills
          - Experience in system tuning and optimization
  • 3. Blue Gene/Q Design Goals
      - Provide the platform for the future of supercomputing
          - Order of magnitude improvement in efficiency
          - Build the most scalable, most reliable, most programmable and fastest supercomputer
      - Give programmers the tools to exploit this platform with groundbreaking applications
  • 4. Leverage IBM’s technologies for HPC
      - High-reliability SOI design
      - Power ISA™
      - SIMD Vector architecture
      - Communication technologies
      - Packaging and Cooling
      - System-level RAS experience
      - Numeric middleware: ESSL, MASSV
      - Deep understanding of algorithms and workloads
  • 5. And Use HPC Innovation to Benefit Commercial Systems
      - Multicore SoC
      - Fast Networks
      - Workload Optimized
      - Hybrid Architectures
      - Scalability
      - High Efficiency
  • 6. The “Walls”: Traditional Obstacles to Design
      - Memory Wall
          - High bandwidth interfaces
          - Exploiting memory level parallelism
      - Power Wall
          - Optimize for FLOPS / W
          - Improve design efficiency
      - Frequency Wall
          - Optimize balance between frequency and parallelism
  • 7. The New “Walls”
      - Scalability Wall
      - Communication Wall
      - Reliability Wall
      - Programmer Productivity Wall
  • 8. Blue Gene/Q Concept
      - Avoid scaling of processors into area of diminishing returns
      - Scalable, integrated SMP node
      - Integrated system-level network
      - Familiar environment and new abstractions
      - Design with a focus on system-level reliability
  • 9. System Scaling to Multi-Petaflop
      - Chip: 16 cores @ 205 GF/s
      - Compute Card: one chip in SCM, 16 GB DDR3 memory
      - Node Card: 32 Compute Cards @ 6.5 TF/s, Link Chips, Optics, Torus
      - Rack: 32 Node Cards in 2 Midplanes, I/O Drawers
      - System: 96 racks @ 20 PF/s
      Rack:
      - Peak Performance: 209 TF
      - Sustained (Linpack): ~170+ TF
      - Power: ~100 kW
      - Power Efficiency: ~1.7 GF/W
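The packaging hierarchy on this slide multiplies out to the quoted rack and system peaks. The C sketch below redoes that arithmetic. It assumes the unrounded per-chip peak of 204.8 GFLOPS given for the compute chip (this slide rounds it to 205 GF/s); the macro and function names are illustrative and do not come from the talk.

```c
#include <assert.h>

/* Counts from the slide; CHIP_PEAK_GFLOPS is the unrounded chip figure. */
#define CHIP_PEAK_GFLOPS    204.8
#define CHIPS_PER_NODE_CARD 32   /* one chip per compute card */
#define NODE_CARDS_PER_RACK 32
#define RACKS_PER_SYSTEM    96

/* Number of compute chips in one rack. */
long chips_per_rack(void)
{
    return (long)CHIPS_PER_NODE_CARD * NODE_CARDS_PER_RACK;
}

/* Number of compute chips in the full 96-rack system. */
long chips_per_system(void)
{
    return chips_per_rack() * RACKS_PER_SYSTEM;
}

/* Aggregate peak in GFLOP/s for a given number of chips. */
double peak_gflops(long chips)
{
    assert(chips >= 0);
    return CHIP_PEAK_GFLOPS * (double)chips;
}

/* Efficiency in GFLOP/s per watt. */
double gflops_per_watt(double gflops, double watts)
{
    assert(watts > 0.0);
    return gflops / watts;
}
```

With these inputs a node card comes to about 6.55 TF, a rack to about 209.7 TF, and 96 racks to about 20.1 PF, consistent with the rounded slide figures. `gflops_per_watt(170000.0, 100000.0)` reproduces the quoted ~1.7 GF/W from the sustained Linpack and power numbers.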
  • 10. BlueGene/Q Compute chip
      System-on-a-Chip design: integrates processors, memory and networking logic into a single chip
      - 360 mm² Cu-45 technology (SOI)
          - ~1.47 B transistors
      - 18-way CMP
          - 16 user + 1 service processors
          - plus 1 redundant processor
          - all processors are symmetric
          - peak performance 204.8 GFLOPS @ 55 W
      - Central shared L2 cache: 32 MB
          - eDRAM
      - Dual memory controller
          - 16 GB external DDR3 memory
          - 1.33 Gb/s
          - 2 * 16 byte-wide interface (+ECC)
      - Chip-to-chip networking
          - Router logic integrated into BQC chip
      - External IO
          - PCIe Gen2 interface
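The 204.8 GFLOPS peak can be cross-checked against figures given elsewhere in the deck: 16 user cores, a 1.6 GHz clock, and 8 floating point operations per cycle per core. The sketch below does that, and also estimates peak memory bandwidth under the assumption (not stated on the slide) that "1.33 Gb/s" is the per-pin data rate, i.e. 1.33 billion transfers per second on each of the two 16-byte interfaces. The function names are illustrative.

```c
#include <assert.h>

/* Peak GFLOP/s = cores * clock in GHz * floating point ops per cycle per core. */
double chip_peak_gflops(int cores, double ghz, int flops_per_cycle)
{
    assert(cores > 0 && ghz > 0.0 && flops_per_cycle > 0);
    return cores * ghz * flops_per_cycle;
}

/* Peak memory bandwidth in GB/s.
 * ASSUMPTION: gtransfers_per_s is the per-pin data rate in Gb/s, so each
 * interface moves bytes_wide bytes per transfer. ECC bits are not counted. */
double memory_bandwidth_gbs(int interfaces, int bytes_wide, double gtransfers_per_s)
{
    assert(interfaces > 0 && bytes_wide > 0 && gtransfers_per_s > 0.0);
    return interfaces * bytes_wide * gtransfers_per_s;
}
```

`chip_peak_gflops(16, 1.6, 8)` gives 204.8, matching the slide. Under the stated assumption, `memory_bandwidth_gbs(2, 16, 1.33)` gives roughly 42.6 GB/s, or about 0.2 bytes per peak floating point operation.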
  • 11. BG/Q Processor Unit
      - A2 processor core
      - Based on PowerEN™ design
      - Implements 64-bit Power ISA™
      - Optimized for aggregate throughput:
          - 4-way simultaneously multi-threaded (SMT)
          - 2-way concurrent issue: XU (br/int/l/s) + QPU
      - 1.6 GHz @ 0.8 V
      [Figure label: QPU]
  • 12. Putting the Q into Blue Gene/Q: Quadvector Processing Unit (QPU)
      - Power-efficient scalable performance
          - 8 floating point ops (FMA) per cycle
          - Concurrent load and store to ensure peak utilization
      - 4 double precision pipelines, usable as:
          - scalar FPU
          - 4-wide FPU SIMD
          - 2-wide complex arithmetic SIMD
      - QPX Instruction extensions to Power ISA
      - Load/store supports a multitude of alignments
      - Permute instructions to reorganize vector data
      [Figure: QPU block diagram. Labels: RF (x4), MAD0, MAD1, MAD2, MAD3, Permute, Load, A2, 256, 64]
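To make the "4-wide FPU SIMD" and "2-wide complex arithmetic SIMD" modes concrete, here is a portable C emulation of a four-lane double precision register. It is a sketch only: `qvec`, `qv_fma`, `qv_cmul` and `daxpy4` are hypothetical names, not QPX instructions or compiler intrinsics, and the C expression `a * b + c` is not guaranteed to be evaluated as a single fused operation.

```c
#include <assert.h>

enum { QV_LANES = 4 };
typedef struct { double lane[QV_LANES]; } qvec;  /* hypothetical 4 x double register */

/* Per-lane multiply-add: 4 lanes * (1 multiply + 1 add) = 8 flops per call,
 * which is where the "8 floating point ops per cycle" figure comes from. */
qvec qv_fma(qvec a, qvec b, qvec c)
{
    qvec r;
    for (int i = 0; i < QV_LANES; i++)
        r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}

/* 2-wide complex multiply: lane pairs (0,1) and (2,3) each hold (re, im). */
qvec qv_cmul(qvec x, qvec y)
{
    qvec r;
    for (int i = 0; i < QV_LANES; i += 2) {
        r.lane[i]     = x.lane[i] * y.lane[i]     - x.lane[i + 1] * y.lane[i + 1];
        r.lane[i + 1] = x.lane[i] * y.lane[i + 1] + x.lane[i + 1] * y.lane[i];
    }
    return r;
}

/* y[i] += alpha * x[i], processed four elements at a time. */
void daxpy4(int n, double alpha, const double *x, double *y)
{
    assert(n >= 0 && n % QV_LANES == 0);  /* sketch handles whole vectors only */
    for (int i = 0; i < n; i += QV_LANES) {
        qvec va = {{ alpha, alpha, alpha, alpha }};
        qvec vx = {{ x[i], x[i + 1], x[i + 2], x[i + 3] }};
        qvec vy = {{ y[i], y[i + 1], y[i + 2], y[i + 3] }};
        qvec r = qv_fma(va, vx, vy);
        for (int l = 0; l < QV_LANES; l++)
            y[i + l] = r.lane[l];
    }
}
```

The same four pipelines serve both modes: `qv_fma` treats the register as four independent reals, while `qv_cmul` treats it as two complex numbers and needs values to cross between neighbouring lanes.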
  • 13. Putting the Q into Blue Gene/Q: Quadvector Processing Extensions (QPX)
      - New FPU SIMD architecture optimized for Blue Gene/Q
          - Scalable, efficient performance
          - Co-design for efficient code generation
      - Builds on Blue Gene application insights
          - Optimized for scientific applications
      - Improved programmer productivity
          - Automatic SIMD-ization
          - Efficient handling of unaligned data
          - Conversions from control flow to dataflow
          - Efficient SIMD exceptions handling
      [Figure labels: QPU, QPX, Programmability, Compilability, PowerPC Compatibility, IEEE Capability, Efficiency]
  • 14. Putting the Q into Blue Gene/Q: Beyond SIMD
      Flexible communication between element lanes
      [Figure labels: from D$, Vector permute, Load & replicate, Complex numbers]
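The lane-communication operations named on this slide can be illustrated with a small C emulation. This is a sketch under assumptions: the slide gives no semantics, so the two-source permute (each result lane picks any of the eight lanes of two inputs) and the replicate-on-load behaviour shown here are an illustrative model, and all names are hypothetical rather than QPX mnemonics.

```c
#include <assert.h>

enum { QV_LANES = 4 };
typedef struct { double lane[QV_LANES]; } qvec;  /* hypothetical 4 x double register */

/* Two-source permute: sel[i] in 0..7 picks lane sel[i] of the
 * concatenation (a, b) for result lane i. */
qvec qv_permute(qvec a, qvec b, const int sel[QV_LANES])
{
    qvec r;
    for (int i = 0; i < QV_LANES; i++) {
        assert(sel[i] >= 0 && sel[i] < 2 * QV_LANES);
        r.lane[i] = sel[i] < QV_LANES ? a.lane[sel[i]]
                                      : b.lane[sel[i] - QV_LANES];
    }
    return r;
}

/* Load & replicate: one scalar from memory copied into every lane. */
qvec qv_load_splat(const double *p)
{
    qvec r;
    for (int i = 0; i < QV_LANES; i++)
        r.lane[i] = *p;
    return r;
}

/* Complex load & replicate: one (re, im) pair copied into both lane pairs. */
qvec qv_load_complex_splat(const double *p)
{
    qvec r = {{ p[0], p[1], p[0], p[1] }};
    return r;
}
```

For example, the selector `{3, 2, 1, 0}` reverses a vector, and `{1, 4, 2, 7}` interleaves lanes from both inputs. Replicating loads avoid a separate permute when a scalar or a single complex value multiplies a whole vector.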
  • 15. Breaking the control flow limit with data-level parallelism
          for (i = 0; i < VL; i++)
              if (a[i] != 0)
                  m[i] = b[i] / a[i];
              else
                  m[i] = b[i] * 10;
      ▪ Control-flow form (long latency), for each lane i = 0..3:
          – test a[i] != 0, then either m[i] = b[i]/a[i] or m[i] = b[i]*10
      ▪ Exploit data parallelism, for each lane i = 0..3:
          – a’[i] = b[i]/a[i]
          – b’[i] = b[i]*10
          – s[i] = a[i] != 0
          – m[i] = s[i] ? a’[i] : b’[i]
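The conversion from control flow to dataflow on this slide can be sketched in plain C. Both arms are computed for every lane and a per-lane select picks the result, so no branch depends on the data. The function names `branchy` and `if_converted` are hypothetical, and the sketch assumes IEEE 754 arithmetic with default (non-trapping) floating-point exceptions, so that the discarded `b[i] / a[i]` for `a[i] == 0` simply yields an infinity or NaN.

```c
#include <assert.h>
#include <stddef.h>

enum { VL = 4 };  /* vector length used on the slide */

/* Control-flow form: one data-dependent branch per element. */
static void branchy(const double *a, const double *b, double *m)
{
    assert(a && b && m);
    for (size_t i = 0; i < VL; i++) {
        if (a[i] != 0)
            m[i] = b[i] / a[i];
        else
            m[i] = b[i] * 10;
    }
}

/* Dataflow form: evaluate both arms for all lanes, then select.
 * a_prime, b_prime and s mirror a', b' and s on the slide. */
static void if_converted(const double *a, const double *b, double *m)
{
    double a_prime[VL], b_prime[VL];
    int s[VL];

    assert(a && b && m);
    for (size_t i = 0; i < VL; i++)
        a_prime[i] = b[i] / a[i];   /* inf or NaN when a[i] == 0; discarded below */
    for (size_t i = 0; i < VL; i++)
        b_prime[i] = b[i] * 10;
    for (size_t i = 0; i < VL; i++)
        s[i] = (a[i] != 0);
    for (size_t i = 0; i < VL; i++)
        m[i] = s[i] ? a_prime[i] : b_prime[i];
}
```

Each of the four loops in `if_converted` maps naturally onto one vector operation across all lanes, which is why the select-based form vectorizes where the branchy form does not.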
  • 16. Putting the Q into Blue Gene/Q: Exception condition detection in data-parallel code
      Instruction sequence and per-lane effect (for each lane i = 0..3):
          qvfcmpgt qS, qA, qB          s[i] = a[i] != 0
          qvre     qT, qA              t[i] = 1/a[i]
          qvfmul   qT’, qB, qT         t’[i] = b[i]*t[i]
          qvfmul   qB’, qB, q10_0      b’[i] = b[i]*10
          qvfsel   qM, qS, qT’, qB’    m[i] = s[i] ? t’[i] : b’[i]
          qvstfdx  qM, rA, rB          store m[i]
  • 17. Putting the Q into Blue Gene/Q: Exception condition detection in data-parallel code
      Instruction sequence and per-lane effect (for each lane i = 0..3); this build shows both the plain store and the store-and-indicate form:
          qvfcmpgt qS, qA, qB          s[i] = a[i] != 0
          qvre     qT, qA              t[i] = 1/a[i]
          qvfmul   qT’, qB, qT         t’[i] = b[i]*t[i]
          qvfmul   qB’, qB, q10_0      b’[i] = b[i]*10
          qvfsel   qM, qS, qT’, qB’    m[i] = s[i] ? t’[i] : b’[i]
          qvstfdx  qM, rA, rB          store m[i]
          qvstfdxi qM, rA, rB          store & indicate m[i]
  • 18. Putting the Q into Blue Gene/Q: Exception condition detection in data-parallel code
      Instruction sequence and per-lane effect (for each lane i = 0..3):
          qvfcmpgt qS, qA, qB          s[i] = a[i] != 0
          qvre     qT, qA              t[i] = 1/a[i]
          qvfmul   qT’, qB, qT         t’[i] = b[i]*t[i]
          qvfmul   qB’, qB, q10_0      b’[i] = b[i]*10
          qvfsel   qM, qS, qT’, qB’    m[i] = s[i] ? t’[i] : b’[i]
          qvstfdxi qM, rA, rB          store & indicate m[i]
  • 20. Scalability innovations for thread-level communication
      ▪ Atomic operations
          – Faster OpenMP work hand-off
          → Reduced messaging latency
      ▪ Wake-up unit
          → Improved latency and reduced contention
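To illustrate why atomic operations speed up work hand-off between threads, here is a minimal sketch of a shared work cursor where each thread claims its next chunk of iterations with a single fetch-and-add instead of taking a lock. It uses standard C11 `<stdatomic.h>` as a stand-in; it does not model the Blue Gene/Q hardware atomics or the OpenMP runtime, and `work_queue_t`, `work_queue_init` and `work_queue_claim` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared cursor into an iteration space [0, total). */
typedef struct {
    atomic_long next;   /* first iteration not yet claimed */
    long total;         /* number of iterations overall */
} work_queue_t;

static void work_queue_init(work_queue_t *q, long total)
{
    assert(total >= 0);
    atomic_init(&q->next, 0);
    q->total = total;
}

/* Claim up to `chunk` iterations with one atomic fetch-and-add.
 * Returns the number claimed and writes the first index to *begin;
 * returns 0 once the iteration space is exhausted. */
static long work_queue_claim(work_queue_t *q, long chunk, long *begin)
{
    assert(chunk > 0);
    long start = atomic_fetch_add(&q->next, chunk);
    if (start >= q->total)
        return 0;
    *begin = start;
    long left = q->total - start;
    return left < chunk ? left : chunk;
}
```

Every worker thread would loop on `work_queue_claim` until it returns 0; because the claim is a single atomic update, threads never block one another while handing off work.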
  • 21. Scalability innovation to enable speculation: breaking data dependences with cache multi-versioning
      ▪ Multi-versioned data
          – Tracked by scoreboard in L2 cache
      ▪ Transactional Memory:
          – Load/store conflicts detected and resolved by SW
      ▪ Speculative Execution:
          – Load/store conflicts detected and resolved based on original program ordering
      [Diagram labels: Single MPI Task; User defined parallelism; User defined transaction start; User defined transaction end; Hardware detected dependency conflict rollback; parallelization completion synchronization point]
  • 22. Scalability with the 17th core
      ▪ Common I/O client interface
          – Asynchronous I/O completion hand-off
      ▪ Application agents: privileged application processing
          – Messaging assist, e.g., MPI pacing thread
          – Performance and trace helpers
      ▪ RAS event handling and interrupt off-load
          – Reduce O/S noise and jitter
  • 23. Interconnect Innovations for MPI Scaling
      ▪ Integrated 5D torus
          – High nearest-neighbor bandwidth (measured ~1.75 GB/s per link)
          – Simple partitioning
      ▪ 10 links
          – Hardware-assisted collective and barrier
          – FP addition support in network
      ▪ Hardware latency
          – Nearest: 80 ns
          – Farthest: 3 μs (96-rack 20 PF system)
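The relationship between the 5D torus and the 10 links per node (two directions in each of five dimensions), and the way wraparound shortens paths, can be sketched as a hop-count calculation. This is a generic minimal-path computation for a torus, not a description of the actual Blue Gene/Q routing logic; `ring_hops`, `torus_hops` and the example partition shape in the usage note are assumptions for illustration.

```c
#include <assert.h>

enum { TORUS_DIMS = 5, TORUS_LINKS = 2 * TORUS_DIMS };  /* +/- link per dimension = 10 */

/* Minimal hops between two positions on a ring of `size` nodes,
 * going whichever way around is shorter. */
static int ring_hops(int from, int to, int size)
{
    assert(size > 0 && from >= 0 && from < size && to >= 0 && to < size);
    int d = from > to ? from - to : to - from;
    return d < size - d ? d : size - d;
}

/* Minimal hop count between two nodes of a 5D torus: the sum of the
 * per-dimension ring distances. */
static int torus_hops(const int from[TORUS_DIMS], const int to[TORUS_DIMS],
                      const int shape[TORUS_DIMS])
{
    int hops = 0;
    for (int d = 0; d < TORUS_DIMS; d++)
        hops += ring_hops(from[d], to[d], shape[d]);
    return hops;
}
```

For a hypothetical 4x4x4x4x2 partition, the node at (3,2,1,0,1) is 5 hops from the origin: the wraparound link makes coordinate 3 only one hop away in a ring of four.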
  • 24. Communication Innovations for MPI Scaling: High-performance, application-level RDMA messaging acceleration
      ▪ Dedicated accelerators for MPI
      ▪ 32 MEs offload packet operations
      ▪ Accessible from problem-state applications
      [Diagram: MU block diagram. Labels: Crossbar Switch slave port and master ports; DCRs; Interrupts & UPC events; Global Barrier Control & Status; rMEs; iMEs; Injection Control & Arbitration; Reception Control; IC SRAM, MC SRAM, RC SRAM, RPUT SRAM; To Central Global Barrier Logic in ND; To ND Injection FIFOs; To ND Reception FIFOs]
  • 25. Design for Reliability: Blue Gene Compute Chip
      ▪ DDR3, L2 cache, network, all major arrays and buses: ECC protected
      ▪ Stacked / DICE latches for critical state-holding latches
      ▪ Register files and minor buses: parity protected, with recovery
      [Diagram labels: FMA, PRM recovery, Copy 0, Copy 1, US7,512,772]
  • 26. Co-Design for Reliability and Serviceability
      ▪ Avoid “silent hangs” by catching and reporting failure indications (“machine checks”)
      ▪ Firmware operates from on-chip BeDRAM with minimal resources
          – Machine check vectors point to BeDRAM
      ▪ Hardware provides on-chip 256K BeDRAM (“Boot eDRAM”)
          – Node firmware can continue to operate in the presence of L2, DDR interface and memory failure
          – Minimize requirements to report node failure
  • 27. Packaging and Cooling for Reliability
      ▪ Water-cooled node board
          – Very stable, low-temperature environment for ASICs, DRAM, power supplies, optics
      ▪ Node board integrates
          – 32 compute cards
          – 8 link ASICs, which drive 4D links using 10 Gb/s optical transceivers
      ▪ Compute cards
          – Memory soldered to card for reliability
      ▪ Hot-pluggable front-end power supplies
  • 28. Blue Gene/Q Compute Card
      ▪ Basic FRU of a BlueGene/Q system
      ▪ Compute Card has 1 BQC chip + 72 SDRAMs (16 GB DDR3)
      ▪ Two heat sink options: water-cooled → “Compute Node” / air-cooled → “IO Node”
      ▪ Connectors carry power supplies, JTAG, etc., and 176 HSS signals (4 and 5 Gbps)
  • 29. Packaging and Cooling
      Water-cooled node board with 32 compute cards
          Height                            2095.5 mm (82.5 inches)
          Width                             1219.2 mm (48 inches)
          Depth                             1321 mm (52 inches)
          Weight                            1520 kg (3350 lbs) (including water)
          I/O enclosure with 4 drawers      177 kg (390 lbs)
  • 30. Cabled Rack
  • 31. Rack Water Cooling
  • 32. Blue Gene Rack Group
  • 33. Argonne Mira
  • 34. Blue Gene/Q Sequoia System
  • 35. Programmability: emphasis on general purpose applicability
  • 36. Messaging Software Architecture
      ▪ BG/Q messaging is based on the optimized Parallel Active Message Interface (PAMI) open-source library
      ▪ Multiple programming models may coexist within a task (e.g., MPI + GA/ARMCI)
      ▪ Direct access to PAMI and bare-metal SPI interfaces is allowed
      [Diagram: software stack from Application down to Network Hardware (DMA, Collective Network, Global Interrupt Network). Labels: High-Level API (MPICH, Converse/Charm++, ARMCI, Global Arrays, UPC*, Other Paradigms*); PAMI API (C); Message Layer Core (C++) with pt2pt protocols and collective protocols; DMA Device, Collective Device, GI Device, Shmem Device; Low-Level API (SPI)]
  • 37. Application Performance: Speedup from Blue Gene/P to Blue Gene/Q
      ▪ Q/P ratio indicates performance increase per node
          – using the same number of cores on both BG/P and BG/Q
          – 2x-4x more total threads on BG/Q
      ▪ Applications expected to benefit from tuning over system life
          – Improved vectorization and SIMD exploitation
          – Tuning to network topologies

      Application            Q/P ratio
      Ported applications
        DNS3D                16.9
        FLASH                5.9
        GFMC                 10.5
        GTC                  11.2
        HELD-SUAREZ          8.2
        MILC                 6.1
        NAMD                 5.5
        NEK                  7.4
        Geometric mean       8.4
      Tuned applications
        GROMACS              15
        GADGET               16
        GENE                 11
        OCTOPUS              11
        Geometric mean       13
      Linpack                173 TF/s*

      * Linpack (tuned) performance is ~82.5% of peak performance on BG/Q
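The geometric means quoted in the table can be reproduced from the listed Q/P ratios. The sketch below computes the n-th root of the product with a Newton iteration so that it needs no math library; `geometric_mean` is a hypothetical helper name, and the starting point and tolerance are choices made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Geometric mean of n positive values: the n-th root of their product.
 * The root is found by Newton's method on x^n = product, started from
 * the arithmetic mean (which is never below the geometric mean), so
 * the iteration descends monotonically toward the root. */
static double geometric_mean(const double *v, size_t n)
{
    assert(n > 0);
    double product = 1.0, sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(v[i] > 0.0);
        product *= v[i];
        sum += v[i];
    }

    double x = sum / (double)n;
    for (int iter = 0; iter < 100; iter++) {
        double x_pow = 1.0;                 /* x^(n-1) */
        for (size_t k = 1; k < n; k++)
            x_pow *= x;
        double next = ((double)(n - 1) * x + product / x_pow) / (double)n;
        double step = x - next;
        x = next;
        if (step < 1e-12 * x)               /* converged */
            break;
    }
    return x;
}
```

Applied to the eight ported-application ratios this gives roughly 8.4, and to the four tuned-application ratios roughly 13, in line with the values shown on the slide.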
  • 38. A Vision for Applying Supercomputers
      ▪ Traditional usage for science discoveries
      ▪ For advances in drug discovery and understanding life sciences
      [Image labels: Materials Science; Molecular Dynamics; Life Sciences: Sequencing; Life Sciences: In-Silico Trials, Drug Discovery; Biological Modeling; Brain Science]
  • 39. For a Smarter, Greener Planet
      [Image labels: Fraud Prevention; Telecom; Environment; Law Enforcement; Health Care; Traffic Control; Energy]
  • 40. For improved business results with Advanced Business Analytics for new “Big Data”
      [Chart: Data Scale (Kilo, Mega, Giga, Tera, Peta, Exa) versus Decision Frequency (Occasional, Frequent, Real-time; yr, mo, wk, day, hr, min, sec, ms, μs). Labels: Traditional Data Warehouse and Business Intelligence; Data at Rest; Data in Motion; Up to 10,000 times larger; Up to 10,000 times faster]
      ▪ Telco Promotions: 100,000 records/sec, 6B/day; 10 ms/decision; 270 TB for Deep Analytics
  • 41. And for Answering All Questions
      Answer with confidence in <300 milliseconds
      [Diagram: question-answering pipeline. Stage labels: Clue/Category Analysis; Dist. Search; Primary Search; Search & Candidate Generation; Shallow Answer Scorers; Shallow Scoring & Filtering; Supporting Evidence Retrieval; Deep Evidence Scorers; Deep Answer Scorers; Final Merging / Ranking; Statistical Models. Volume labels: 100’s of Hits; 10k’s of Hits; 1,000s of Candidates; 100s of Candidates; 10Ks Pieces of Evidence; 100K Scores; Huge volumes of text for knowledge; 100’s of millions of facts refine interpretation]
  • 42. New Era of High Performance Computing
      → Technology becomes affordable and pervasive
      → Broad adoption across a variety of industries
      → Usage for modeling, simulation, predictive analysis workloads
      → Delivered via Clusters, Grids, and Cloud
      [Diagram labels: “Physics Driven”; “Data Driven”; Supercomputers; Science, Research & Government; BlueGene/Q, Power 775, iDataPlex; Business Analyt.; Financial Services; Digital Media; Analytics / Big Data / Cloud deployment]
  • 43. Blue Gene/Q®: The World’s Fastest Supercomputer
      16.32 PFLOP (81.08% eff) on 2012-06-08

      jobid: 24760 size=98304 end=2012-06-08 09:55:38.726489 /bgsys/apps/linpack//BURN_R0-RB-96K.2012_0607_1042 PASS 16324.751 TF (81.08%)
      stdout[0]: ================================================================================
      stdout[0]: T/V                N    NB     P     Q               Time                 Gflops
      stdout[0]: --------------------------------------------------------------------------------
      stdout[0]: WR16L2L4    12681215   256   384  1024   83280.78786813750048168   1.63247519148838781e+07
      stdout[0]: --VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
      stdout[0]: Max aggregated wall time rfact . . . : 10.29742211375065608
      stdout[0]: + Max aggregated wall time pfact . . : 6.47944774470261109
      stdout[0]: + Max aggregated wall time mxswp . . : 4.23286864816418529
      stdout[0]: + Max aggregated wall time laswp . . : 2011.12352745312227853
      stdout[0]: Max aggregated wall time up tr sv . : 69.15868747876083944
      stdout[0]: Max aggregated wall time dgemm . . . : 75943.87711409496841952
      stdout[0]: Max aggregated wall time dtrsm . . . : 466.78703577646933809
      stdout[0]: --------------------------------------------------------------------------------
      stdout[0]: ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005574431767873 ...... PASSED
      stdout[0]: RAMP ended at 1
      stdout[0]: ================================================================================
      stdout[0]:
      stdout[0]: Finished 1 tests with the following results:
      stdout[0]: 1 tests completed and passed residual checks,
      stdout[0]: 0 tests completed and failed residual checks,
      stdout[0]: 0 tests skipped because of illegal input values.
      stdout[0]: --------------------------------------------------------------------------------
      stdout[0]:
      stdout[0]: End of Tests.
      stdout[0]: ================================================================================
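The headline numbers in this log can be cross-checked from figures given elsewhere in the deck: 16 cores per node at 1.6 GHz with 8 floating-point operations per cycle. The sketch below assumes that `size=98304` is the node count and that the reported rate follows the usual HPL operation count of 2/3·N³ + 3/2·N²; both are assumptions of this example, and the function names are hypothetical.

```c
#include <assert.h>

/* Peak rate in flop/s for a machine built from identical nodes. */
static double peak_flops(double nodes, double cores_per_node,
                         double clock_hz, double flops_per_cycle)
{
    return nodes * cores_per_node * clock_hz * flops_per_cycle;
}

/* Operation count conventionally credited to an HPL run of order n
 * (assumed here: 2/3 n^3 + 3/2 n^2). */
static double hpl_flops(double n)
{
    return (2.0 / 3.0) * n * n * n + 1.5 * n * n;
}

/* Fraction of peak achieved by an HPL run of order n taking `seconds`. */
static double hpl_efficiency(double n, double seconds, double peak)
{
    assert(seconds > 0.0 && peak > 0.0);
    return hpl_flops(n) / seconds / peak;
}
```

With N = 12681215 and a run time of about 83280.79 s, `hpl_flops(N) / seconds` comes to about 1.632e16 flop/s, matching the reported 1.63247519e+07 Gflops. Against a peak of 98304 × 16 × 1.6e9 × 8 ≈ 2.013e16 flop/s this is an efficiency of about 81%, consistent with the 81.08% shown in the log.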
  • 44. Top500 Performance Trends
      Over the long haul, IBM has demonstrated continued leadership in various TOP500 metrics, even as performance continues its relentless growth.
      [Chart: Rmax Performance (TFlops), logarithmic axis from 0.0001 to 1000000, by TOP500 list from Jun 93 to Jun 12. Series: Total Aggregate Performance, #1, #10, #500. Blue square markers indicate IBM leadership. Marked value: 16.33 PF]
      Source: www.top500.org
  • 45. Blue Gene/Q is the most energy-efficient supercomputer (*)

      #   Site                                             Vendor  Processor                          kWatts   Mflops/W
      1   DOE/NNSA/LLNL                                    IBM     BlueGene/Q, Power BQC 16C 1.60GHz  41.1     2100.88
      2   IBM Thomas J. Watson Research Center             IBM     BlueGene/Q, Power BQC 16C 1.60GHz  41.1     2100.88
      3   DOE/SC/Argonne National Laboratory               IBM     BlueGene/Q, Power BQC 16C 1.60GHz  82.2     2100.86
      4   DOE/SC/Argonne National Laboratory               IBM     BlueGene/Q, Power BQC 16C 1.60GHz  82.2     2100.86
      5   Rensselaer Polytechnic Institute                 IBM     BlueGene/Q, Power BQC 16C 1.60GHz  82.2     2100.86
      6   University of Rochester                          IBM     BlueGene/Q, Power BQC 16C 1.60GHz  82.2     2100.86
      7   IBM Thomas J. Watson Research Center             IBM     BlueGene/Q, Power BQC 16C 1.60GHz  82.2     2100.86
      8   University of Edinburgh                          IBM     BlueGene/Q, Power BQC 16C 1.60GHz  493.1    2099.56
      9   Daresbury Laboratory (S&T Fac. Council)          IBM     BlueGene/Q, Power BQC 16C 1.60GHz  575.3    2099.50
      10  Forschungszentrum Juelich (FZJ)                  IBM     BlueGene/Q, Power BQC 16C 1.60GHz  657.5    2099.46
      11  CINECA                                           IBM     BlueGene/Q, Power BQC 16C 1.60GHz  821.9    2099.39
      12  High Energy Accelerator Research Org. /KEK       IBM     BlueGene/Q, Power BQC 16C 1.60GHz  246.6    2099.14
      13  EDF R&D                                          IBM     BlueGene/Q, Power BQC 16C 1.60GHz  328.8    2099.14
      14  IDRIS/GENCI                                      IBM     BlueGene/Q, Power BQC 16C 1.60GHz  328.8    2099.14
      15  Victorian Life Sciences Computation Initiative   IBM     BlueGene/Q, Power BQC 16C 1.60GHz  328.8    2099.14
      16  IBM - Rochester                                  IBM     BlueGene/Q, Power BQC 16C 1.60GHz  164.4    2099.14
      17  IBM - Rochester                                  IBM     BlueGene/Q, Power BQC 16C 1.60GHz  164.4    2099.14
      18  DOE/NNSA/LLNL                                    IBM     BlueGene/Q, Power BQC 16C 1.60GHz  164.4    2099.14
      19  DOE/SC/Argonne National Laboratory               IBM     BlueGene/Q, Power BQC 16C 1.60GHz  3945     2069.04
      20  DOE/NNSA/LLNL                                    IBM     BlueGene/Q, Power BQC 16C 1.60GHz  7890     2069.04

      Source: www.green500.org
      * as measured by Green500, 6/2011, 11/2011 and 6/2012
  • 46. Blue Gene/Q® is designed to overcome hurdles
      ▪ Ultra-scalability for breakthrough science, broad range of HPC applicability
      ▪ Address traditional walls for computer systems
      ▪ Address new walls on the path to ultrascalable exa-scale computing
      ▪ Low power, small footprint, low total cost of ownership (TCO)
      ▪ High reliability, 10-100X better MTBF/TF, low maintenance requirements
      ▪ New, even higher performing 256-bit quad-vector SIMD with 8 FLOPS per cycle
      ▪ Low-latency, high-performance inter-processor communications
      ▪ Open-source and standards-based programming environment
  • 47. Thank you!
  • 48. Disclaimer
      IBM Corporation Integrated Marketing Communications, Systems & Technology Group, Route 100, Somers, NY 10589. Produced in the United States of America 06/28/2012. All Rights Reserved.
      IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
      Some information in this document addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM’s current investment and development activities as a good faith effort to help with our customers’ future planning.
      All performance information was determined in a controlled environment. Actual results may vary. Performance information is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM. IBM does not warrant that the information offered herein will meet your requirements or those of your distributors or customers. IBM provides this information “AS IS” without warranty. IBM disclaims all warranties, express or implied, including the implied warranties of noninfringement, merchantability and fitness for a particular purpose.
  • 49. Disclaimer
      References in this publication to IBM products or services do not imply that IBM intends to make them available in every country in which IBM operates. Consult your local IBM business contact for information on the products, features, and services available in your area.
      Blue Gene, Blue Gene/Q and PowerEN are trademarks or registered trademarks of IBM Corporation in the United States, other countries or both. PowerISA and Power Architecture are trademarks or registered trademarks in the United States, other countries, or both, licensed through Power.org. Linux is a registered trademark of Linus Torvalds. Tivoli is a registered trademark of Tivoli Systems Inc. in the United States, other countries or both. UNIX is a registered trademark in the United States and other countries, licensed exclusively through The Open Group. Other trademarks and registered trademarks are the properties of their respective companies.
      Photographs shown are of engineering prototypes. Changes may be incorporated in production models. This equipment is subject to all applicable FCC rules and will comply with them upon delivery. IBM makes no representations or warranties, expressed or implied, regarding non-IBM products and services.