Parallel Computing in India
What is parallel computing?
 Parallel computing is a form of computation in which many instructions are carried out simultaneously. It operates on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (in parallel).
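To make the idea concrete, here is a minimal Python sketch (Python and the toy problem of summing a large list are assumptions, not something from the slides): the large problem is split into chunks that worker processes solve concurrently, and the partial results are combined.

```python
# Minimal sketch: divide a large problem (summing a big list) into
# smaller sub-problems and solve them concurrently in worker processes.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker solves one smaller sub-problem independently.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    size = len(data) // n_workers
    chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]

    with Pool(n_workers) as pool:
        # Combine the sub-results to get the answer to the large problem.
        total = sum(pool.map(partial_sum, chunks))
    print(total)  # same value as sum(data), computed in parallel
```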
Why Parallel Computing?
 The primary reasons for using parallel computing:
 Save time - wall clock time
 Solve larger problems
 Provide concurrency (do multiple things at the same time)
Other reasons might include:
 Cost savings - using multiple "cheap" computing
resources instead of paying for time on a supercomputer.
 Overcoming memory constraints - single computers
have very finite memory resources. For large problems,
using the memories of multiple computers may
overcome this obstacle.
Architecture Concepts
 von Neumann Architecture
Comprised of four main components:
• Memory
• Control Unit
• Arithmetic Logic Unit
• Input / Output
 Memory is used to store both program instructions
and data
• Program instructions are coded data which tell the
computer to do something
• Data is simply information to be used by the program
 Control unit fetches instructions/data from memory, decodes the instructions and then sequentially coordinates operations to accomplish the programmed task.
 Arithmetic Logic Unit performs basic arithmetic and logical operations
 Input / Output is the interface to the human operator
Flynn’s Classical Taxonomy
Single Instruction, Single Data (SISD):
 A serial (non-parallel) computer
 Single instruction: only one instruction stream is being acted on by the CPU during any one clock cycle
 Single data: only one data stream is being used as input during any one clock cycle
 Deterministic execution
Single Instruction, Multiple Data (SIMD):
 A type of parallel computer
 Single instruction: all processing units execute the same instruction at any given clock cycle
 Multiple data: each processing unit can operate on a different data element
 Best suited for specialized problems characterized by a high degree of regularity, such as graphics/image processing.
 Two varieties: Processor Arrays and Vector Pipelines
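A rough software illustration of the SIMD idea, assuming NumPy (not mentioned on the slide): one operation is applied uniformly to many data elements, and NumPy dispatches such element-wise operations to vectorized machine code rather than an element-by-element Python loop.

```python
# SIMD in spirit: one instruction (multiply-add, then clip) is applied
# to every element of a large array at once.
import numpy as np

pixels = np.random.rand(1920, 1080)     # e.g. an image brightness map
brightened = pixels * 1.2 + 0.05        # same instruction, every element
clipped = np.clip(brightened, 0.0, 1.0) # again: one operation, all data
print(clipped.shape, clipped.max())
```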
Multiple Instruction, Single Data (MISD):
 A single data stream is fed into multiple processing units.
 Each processing unit operates on the data independently via independent instruction streams.
 Some conceivable uses might be:
 multiple frequency filters operating on a single signal stream
 multiple cryptography algorithms attempting to crack a single coded message.
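A toy sketch of the filter-bank idea above, using Python threads as stand-ins for independent processing units: several different "instruction streams" (the filter functions, which are illustrative and not from the slides) all consume the same single data stream.

```python
# Toy MISD-style sketch: independent "instruction streams" (different
# filters) each process the SAME single data stream.
from threading import Thread

signal = [0.0, 1.0, 0.5, -0.2, 0.8, -1.0, 0.3]  # the single data stream

def low_pass(s):
    # Simple moving average of adjacent samples.
    print("low-pass :", [round((a + b) / 2, 2) for a, b in zip(s, s[1:])])

def rectifier(s):
    print("rectified:", [abs(x) for x in s])

threads = [Thread(target=f, args=(signal,)) for f in (low_pass, rectifier)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```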
Multiple Instruction, Multiple Data (MIMD):
 Multiple Instruction: every processor may be executing a different instruction stream
 Multiple Data: every processor may be working with a different data stream
 Execution can be synchronous or asynchronous, deterministic or non-deterministic
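A minimal MIMD-flavoured sketch using Python processes (an assumption; real MIMD machines are multiprocessors or clusters): each process runs a different instruction stream on a different data stream, asynchronously.

```python
# MIMD sketch: every "processor" (here, a process) runs a different
# instruction stream on a different data stream.
from multiprocessing import Process

def count_words(text):
    print("words:", len(text.split()))

def sum_numbers(numbers):
    print("sum  :", sum(numbers))

if __name__ == "__main__":
    jobs = [
        Process(target=count_words, args=("parallel computing in india",)),
        Process(target=sum_numbers, args=([1, 2, 3, 4, 5],)),
    ]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
```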
Parallel Computer Memory Architectures:
 Shared Memory
i) Uniform Memory Access (UMA)
ii) Non-Uniform Memory Access (NUMA)
 Distributed Memory
 Hybrid Distributed-Shared Memory
Shared Memory:
 Shared memory parallel computers have in common the ability for all processors to access all memory as a global address space.
 Multiple processors can operate independently
 Changes in a memory location effected by one
processor are visible to all other processors
 Shared memory machines can be divided into two
main classes based upon memory access times: UMA
and NUMA.
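The programming model can be sketched with Python threads (a software stand-in for a shared-memory machine, not an SMP itself): all workers address one global data structure, so an update made by one is visible to the others, and synchronization remains the programmer's job.

```python
# Shared-memory model: all workers see one global address space, so a
# change made by one thread is visible to every other thread.
from threading import Thread, Lock

shared = [0] * 8          # one array in the shared address space
lock = Lock()             # synchronization is still the programmer's job

def worker(index, value):
    with lock:
        shared[index] = value   # visible to all other threads

threads = [Thread(target=worker, args=(i, i * i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)   # every update made by every thread is visible here
```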
Shared Memory: UMA
 Uniform Memory Access (UMA):
 Identical processors, Symmetric Multiprocessor (SMP)
 Equal access and access times to memory
 Sometimes called CC-UMA - Cache Coherent UMA.
Cache coherent means if one processor updates a
location in shared memory, all the other processors
know about the update.
Shared Memory: NUMA
 Non-Uniform Memory Access (NUMA):
 Often made by physically linking two or more SMPs
 One SMP can directly access memory of another SMP
 Not all processors have equal access time to all
memories
 Memory access across link is slower
 If cache coherency is maintained, then it may also be called CC-NUMA - Cache Coherent NUMA
Distributed Memory:
 Distributed memory systems require a communication
network to connect inter-processor memory
 Processors have their own local memory
 Because each processor has its own local memory, it
operates independently. Hence, the concept of cache
coherency does not apply
 When a processor needs access to data on another processor, it is usually the task of the programmer to explicitly define how and when data is communicated. Synchronization between tasks is likewise the programmer's responsibility.
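A small sketch of the distributed-memory model, using a multiprocessing Pipe as a stand-in for the communication network (real systems would more likely use MPI): each process holds private local data, and nothing moves unless the programmer explicitly sends it.

```python
# Distributed-memory model: each process has private local memory, and
# data moves only when the programmer explicitly communicates it.
from multiprocessing import Process, Pipe

def worker(conn):
    local_data = [1, 2, 3, 4]      # lives only in this process's memory
    conn.send(sum(local_data))     # explicit communication step
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    print("received over the 'network':", parent_end.recv())
    p.join()
```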
Hybrid Distributed-Shared Memory:
 The shared memory component is usually a cache coherent SMP machine. Processors on a given SMP can address that machine's memory as global.
 The distributed memory component is the networking of multiple SMPs. SMPs know only about their own memory - not the memory on another SMP. Therefore, network communications are required to move data from one SMP to another.
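A rough sketch of the hybrid arrangement: processes stand in for distributed SMP nodes (explicit message passing between them), while threads inside each process stand in for the CPUs sharing that node's memory. The node names and toy workload are illustrative, not from the slides.

```python
# Hybrid sketch: processes ~ distributed SMP nodes (message passing
# between them); threads inside a process ~ shared memory within one SMP.
from multiprocessing import Process, Queue
from threading import Thread

def smp_node(name, chunk, out):
    partials = [0, 0]                     # shared within this "SMP"

    def cpu(idx, data):
        partials[idx] = sum(data)         # threads share the node's memory

    half = len(chunk) // 2
    threads = [Thread(target=cpu, args=(0, chunk[:half])),
               Thread(target=cpu, args=(1, chunk[half:]))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    out.put((name, sum(partials)))        # "network" step between nodes

if __name__ == "__main__":
    out = Queue()
    nodes = [Process(target=smp_node, args=("node0", list(range(0, 50)), out)),
             Process(target=smp_node, args=("node1", list(range(50, 100)), out))]
    for n in nodes:
        n.start()
    results = [out.get() for _ in nodes]  # drain queue before joining
    for n in nodes:
        n.join()
    print(results, "total =", sum(v for _, v in results))
```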
The main fields that need advanced computing are:
 Computational Fluid Dynamics
 Design of large structures
 Computational physics and Chemistry
 Climate Modeling
 Vehicle Simulation
 Image processing
 Signal Processing
 Oil reservoir Management
 Seismic data processing
India’s contribution towards parallel computing
 India has made significant strides in developing high-performance parallel computers.
Major Indian parallel computing projects:
 PARAM (from CDAC, Pune)
 ANUPAM (from BARC, Bombay)
 MTPPS (from BARC, Bombay)
 PACE (from ANURAG, Hyderabad)
 CHIPPS (from CDOT, Bangalore)
 FLOSOLVER (from NAL, Bangalore)
PARAM (Centre for Development of Advanced Computing)
 India’s first supercomputer – PARAM 8000
 It was indigenously built in 1990
 1 GF (Gigaflops) parallel machine
 64-node prototype, i.e. it had 64 CPUs
 PARAM is Sanskrit and means “Supreme”
 Programming environment called PARAS
 Based on Transputers 800/805
 Theoretical peak performance was 1 Gflops
 In practice it provided 100 to 200 Mflops
 A hardware upgrade was given to PARAM 8000 to produce the new PARAM 8600
 The upgrade was the integration of the Intel i860 coprocessor with PARAM 8000
 It was a 256-CPU computer
PARAM 9000
 Mission was to deliver a teraflops-range parallel system
 This architecture emphasizes flexibility
 PARAM 9000SS is based on SuperSPARC processors
 Operating speed of the processor is 75 MHz
 PARAM 10000 has a peak speed of 6.4 Gflops
PARAM Padma
 Introduced in April 2003
 Top speed of 1024 Gflops (1 Tflops)
 Used IBM POWER4 CPUs
 Operating system was IBM AIX 5
 First Indian computer to break the 1 Tflops barrier
PARAM Yuva
 Unveiled in November 2008
 The maximum sustainable speed is 38.1 Tflops
 The peak speed is 54 Tflops
 Uses Intel 73XX processors running at 2.9 GHz each
 Storage capacity of 25 TB, expandable up to 200 TB
 PARAM Yuva II released in February 2013
 Peak performance of 524 Tflops
 Uses less power compared to its predecessor
ANUPAM
 Developed by Bhabha Atomic Research Centre (BARC), Bombay
 BARC needed 200 Mflops of sustained computing power
 Based on standard MultiBus II i860 hardware
ANUPAM 860
 First developed in December 1991
 It made use of the i860 microprocessor @ 40 MHz
 Overall sustained speed of 30 Mflops
 Upgraded version released in August 1992 had a computational speed of 52 Mflops
 A further upgrade, released in November 1992, provided a sustained speed of 110 Mflops
ANUPAM Alpha
 Developed in July 1997, having a sustained speed of 1000 Mflops
 Made use of the Alpha 21164 microprocessor @ 400 MHz
 This system used complete servers / workstations as compute nodes instead of processor boards.
 Updated version released in March 1998 had a sustained speed of 1.5 Gflops.
ANUPAM Pentium
 Started in January 1998
 Main focus of its development was the minimization of cost
 The first version, ANUPAM Pentium II/4, gave a sustained speed of 248 Mflops
 ANUPAM Pentium II was expanded in March 1999 with a sustained speed of 1.3 Gflops
 In April 2000 the system was upgraded to Pentium III/16, which gave a sustained speed of 3.5 Gflops
 ANUPAM PIV 64-node has a sustained speed of 43 Gflops
Applications
 All three versions of ANUPAM were introduced to solve computational problems at BARC.
 The main fields in which ANUPAM is used are:
◦ Molecular Dynamics Simulation
◦ Neutron Transport Calculation
◦ Gamma Ray Simulation by the Monte Carlo method
◦ Crystal Structure Analysis
PACE
 Developed by ANURAG (Advanced Numerical Research and Analysis Group) under DRDO
 Developed as a result of R&D in parallel computing
 Uses VLSI
 Started in 1998
 Motorola 68020 processor @ 16.67 MHz
 Processor for Aerodynamic Computation and Evaluation (PACE)
 Used for the computational fluid dynamics needed in aircraft design
 A further-developed version, PACE Plus 32, is used in missile development
 A more advanced version is PACE++
ANAMICA - Software
 ANURAG’s Medical Imaging and Characterization Aid (ANAMICA)
 Medical visualization software for data obtained from MRI, CT and ultrasound
 Has both two-dimensional and three-dimensional visualization
 Used for medical diagnosis etc.
DHRUVA 3
 Set up by DRDO for solving mission-critical Defence Research and Development applications
 Used in the design of aircraft
 E.g.: Advanced Medium Combat Aircraft (AMCA)
FLOSOLVER
 Started in 1986 by National Aerospace Laboratories (NAL)
 Used in numerical weather prediction
 The Varsha GCM, which runs on Flosolver, could predict the weather accurately up to two weeks in advance
 Based on 16-bit Intel 8086 and 8087 processors
 Updated versions were released to increase the performance
CHIPPS
 Developed to provide indigenous digital switching technology
 Deployed in rural exchanges and secondary switching areas
 A speed of 200 Mflops was achieved
THANK YOU
Preeti Chauhan
