Advanced Computer Architecture

NADAR SARASWATHI COLLEGE OF ARTS AND SCIENCE
VADAPUTHUPATTI, THENI – 625 531
Department of Computer Science and Information Technology
PRESENTED BY
NIBIYA.G
I-MSC (INFORMATION TECHNOLOGY)

ADVANCED COMPUTER
ARCHITECTURE
SEMINAR TOPIC:-
CONNECTION MACHINES-CM-5

CONNECTION MACHINES
• 1981: MIT AI Lab technical memo on the CM
• 1982: Thinking Machines Inc. founded
• 1985: Danny Hillis wins ACM “Best PhD” award
• 1986: CM-1 ships
• 1987: CM-2 ships
• 1991: CM-5 announced
• 1991: CM-5 ships

CM-1 AND CM-2 ARCHITECTURE
• Original design goal: support neuron-like simulations
• Up to 64K single-bit processors (actually 3 bits in and 2 out)
• 16 processors/chip, 32 chips/PCB, 16 PCBs/cube, 8 cubes/hypercube
• Hypercube architecture: each 16-processor chip is a hyper-node
• Each processor has 4K bits of bit-addressable RAM
 Distributed Physical Memory
 Global Memory Addresses
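The packaging figures above multiply out to the 64K processor count, and the chip count fixes the hypercube dimension. A minimal Python check of that arithmetic follows; the constant and variable names are illustrative and do not come from the slides.

```python
# Packaging figures taken from the slide; names are illustrative only.
PROCS_PER_CHIP = 16
CHIPS_PER_PCB = 32
PCBS_PER_CUBE = 16
CUBES_PER_MACHINE = 8
RAM_BITS_PER_PROC = 4 * 1024  # "4K bits" per processor

# Each chip is one hyper-node, so the chip count sets the hypercube size.
chips = CHIPS_PER_PCB * PCBS_PER_CUBE * CUBES_PER_MACHINE
processors = PROCS_PER_CHIP * chips

# A hypercube with 2**d nodes has dimension d.
hypercube_dim = chips.bit_length() - 1

# Total distributed memory of a full machine, in bytes.
total_ram_bytes = processors * RAM_BITS_PER_PROC // 8

print(chips, processors, hypercube_dim, total_ram_bytes)
```

With these numbers a full machine has 4096 hyper-nodes, so every chip address fits in 12 bits and every chip has 12 hypercube links.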

CM-1 AND CM-2 ARCHITECTURE
• Up to 4 front-end computers talk to sequencers via a 4x4 crossbar
• Sequencers issue SIMD instructions over a broadcast network
• Bit-processor communication via 2D local hardware grid connections (“NEWS”)
• Bit-processor communication via the hypercube network using message passing
• Lots of Twinkling Lights
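In a hypercube, two nodes are neighbors when their addresses differ in exactly one bit, so the number of hops between two nodes equals the number of differing address bits. The sketch below assumes 12-bit hyper-node addresses (4096 chips) and uses dimension-order routing, a common textbook scheme. It is not claimed to be the routing algorithm of the actual CM router, and the function names are hypothetical.

```python
DIM = 12  # assumption: 4096 hyper-nodes, so 12-bit addresses


def neighbors(node, dim=DIM):
    """Return the addresses that differ from `node` in exactly one bit."""
    return [node ^ (1 << d) for d in range(dim)]


def route(src, dst, dim=DIM):
    """Dimension-order route: fix differing bits from lowest to highest."""
    path = [src]
    cur = src
    for d in range(dim):
        bit = 1 << d
        if (cur ^ dst) & bit:  # addresses still differ in dimension d
            cur ^= bit         # hop across that dimension
            path.append(cur)
    return path


print(route(0b0000, 0b0101))
```

The longest route in this model crosses all 12 dimensions, which is why hypercube diameter grows only logarithmically with machine size.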

CM-1 and CM-2 PROGRAMMING
• ISA support:
 Bit-oriented operations
 Arbitrary-precision multi-bit scalar ops using a bit-serial implementation on the bit processors
 Full multi-dimensional vector ops
• “Virtual processor” idea, similar to CUDA threads, but virtual processors are statically allocated
• OS and programming tools run on the front ends
• Lisp as the initial programming language
• Later C* and CM Fortran
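The bit-serial idea can be illustrated with addition: a 1-bit unit processes one bit position per step and carries state forward, so operand width is limited only by memory. This is a minimal Python sketch of the technique, not a model of the real CM instruction set; all function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One step of a 1-bit unit: three bits in, (sum, carry) out."""
    total = a + b + carry_in
    return total & 1, total >> 1


def bit_serial_add(x_bits, y_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)  # final carry becomes the top bit
    return out


def to_bits(value, width):
    """Non-negative integer to an LSB-first bit list of the given width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


result = from_bits(bit_serial_add(to_bits(1234, 20), to_bits(5678, 20)))
print(result)
```

An n-bit add takes n steps on one processor, but in a SIMD machine every processor performs the same step on its own data at once.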

CM-2 IMPROVEMENTS
• 1 Weitek IEEE FP coprocessor per 32 1-bit processors
• Up to 256K bits of memory per processor
• Added ECC to memory
• Implemented the I/O subsystem
 Up to 80 GB RAID array called the “Data Vault”, using 39 striped disks with ECC, plus spare disks on standby
 High-speed graphics output
• En-route message combining in the hypercube router
• New implementation of multi-dimensional NEWS on top of the hypercube (special addressing mode)

CM-5 vs. CM-1 and CM-2
• Significant departure from CM-1 and CM-2
• Targeted at more scientific and business applications
• More commercial off-the-shelf components (“COTS”)
• Large array of SPARC processing nodes
 1-bit processors are abandoned
• Abandoned “NEWS” grid and hypercube networks
• Delivered a 1024-node machine, with claims that 16K nodes were possible
• Even more Twinkling lights

CM-5 OVERALL ARCHITECTURE
• “Coordinated Homogeneous Array of RISC Microprocessors”, or “CHARM”
• Asymmetric coprocessor model
 Large array of processor nodes
 Small collection of control nodes
• 2 separate scalable networks
 One for data
 One for control and synchronization
• Still uses striped RAID for high disk Bandwidth

DIVISION OF LABOR
• Processor nodes can be assigned to a “partition”
• One control node per partition
• Control node runs scalar code, then broadcasts parallel work to processor nodes
• Processor nodes can access another node’s memory by reading or writing a global memory address
• Processor nodes also communicate via message passing
• Processor nodes cannot issue system calls
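The division of labor above can be pictured as one control node handing the same work to every processor node in its partition and then combining the results. The sketch below is a toy, single-process Python model of that pattern; the class and method names are hypothetical and do not correspond to any CM-5 software interface.

```python
class ProcessorNode:
    """Toy processor node that owns a slice of the data."""

    def __init__(self, node_id, local_data):
        self.node_id = node_id
        self.local_data = local_data

    def run(self, work):
        # Apply the broadcast work to this node's local data only.
        return work(self.local_data)


class ControlNode:
    """Toy control node: one per partition, drives the processor nodes."""

    def __init__(self, partition):
        self.partition = partition

    def broadcast(self, work):
        # Hand the same work to every node and collect the results.
        return [node.run(work) for node in self.partition]


# Scalar setup on the control node: split 16 values over 4 nodes.
data = list(range(16))
nodes = [ProcessorNode(i, data[i * 4:(i + 1) * 4]) for i in range(4)]
control = ControlNode(nodes)

partial_sums = control.broadcast(sum)  # parallel phase (modeled serially)
total = sum(partial_sums)              # scalar combine on the control node
print(partial_sums, total)
```

On the real machine the processor nodes would run concurrently; the list comprehension here only models the data flow.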

CONTROL NODES
• Full Sun workstations
• Running UNIX
• Connected to the “outside world”
• Handle partition time sharing
• Connected to both data and control networks
• Perform system diagnostics

PROCESSOR NODES
• Each node is a 5-chip microprocessor
 Off-the-shelf SPARC processor @ 40 MHz
 32MB local node memory
 Multi-port memory controller for added BW
 “caching techniques do not perform as well on large parallel machines”
 Proprietary 4-FPU vector coprocessor
 Proprietary network controller

DATA NETWORK ARCHITECTURE
• Point-to-point inter-node communication and I/O
• Implemented as a fat tree
 Fat tree invented by TMI employee Charles Leiserson
• Claim: bandwidth is expandable on site
• Delivers 5 GB/sec bisection bandwidth on a 1024-node machine
• Data router chip is an 8x8 crossbar switch
• Faulty nodes are mapped out of the network
 Programs cannot assume a network topology
• Network can be flushed when time-share swaps occur
• Network, not processors, guarantees end-to-end delivery
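In a tree-shaped network, a message climbs only as far as the lowest switch that covers both source and destination, then descends. The sketch below counts those levels for leaves numbered from 0. The arity of 4 is an assumption made for illustration and is not stated on the slide, and the function names are hypothetical.

```python
ARITY = 4  # assumption for illustration: each switch has 4 children


def levels_to_common_ancestor(src, dst, arity=ARITY):
    """Levels a message must climb before src and dst share a subtree."""
    height = 0
    while src != dst:
        src //= arity  # move each endpoint up to its parent switch
        dst //= arity
        height += 1
    return height


def hop_count(src, dst, arity=ARITY):
    """Total link hops: climb to the common ancestor, then descend."""
    return 2 * levels_to_common_ancestor(src, dst, arity)


print(hop_count(0, 3), hop_count(0, 4), hop_count(0, 1023))
```

Nearby nodes stay low in the tree, and a fat tree adds link capacity toward the root so that long-distance traffic does not bottleneck there.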

SEPARATE CONTROL NETWORK
• Synchronization & control network
• Complete binary tree organization
• Provides broadcast capability
• Implements barrier operations
• Implements interrupts for timesharing
• Performs reduction operations (Sum, Max, AND, OR, Count, etc.)
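A binary tree lets N values be combined in about log2(N) steps, because each level merges pairs from the level below. The following Python sketch shows that pattern in software; it illustrates the technique only and is not a model of the CM-5 control network hardware. The name tree_reduce is hypothetical.

```python
import operator


def tree_reduce(values, op):
    """Combine values pairwise, level by level; return (result, levels)."""
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    steps = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(op(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # odd one out passes through
        level = nxt
        steps += 1
    return level[0], steps


print(tree_reduce([3, 1, 4, 1, 5, 9, 2, 6], operator.add))
print(tree_reduce([3, 1, 4, 1, 5, 9, 2, 6], max))
```

A barrier can be viewed the same way: an AND reduction over per-node "arrived" flags yields 1 only once every node has reached the barrier.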

CM-5 PROGRAMMING
• Supports multiple parallel high-level languages and programming styles
 Including the data-parallel model from CM-1 and CM-2
• Goal: hide many decisions from programmers
 CM-1 and CM-2 vs. CM-5 ISA changes
 Use of processor node CPU vs. vector coprocessors
 Partition-wide synchronization generated by the compiler
• Is it MIMD, SPMD, or SIMD?
 “Globally synchronized MIMD”

SAMPLE CM APPS
• Machine learning
• VLSI design
• Geophysics (oil exploration), plate tectonics
• Fluid flow simulation
• Computer vision
• Computer graphics, animation
• Protein sequence matching
• Global climate model simulation

CONNECTION MACHINES
• The light panels of FROSTBURG, a CM-5 on display at the National Cryptologic Museum
• The panels were used to check the usage of the processing nodes and to run diagnostics
