2. The User Challenge
Verification Bottleneck
Long verification process
In 66% of designs, verification takes >50% of the design cycle
In ~40% of projects, simulation regression runtime is longer than 1 day
Large design/SoC simulation challenge
>40% of designs are larger than 10M gates
Difficult to simulate the entire design/SoC
Excessive computing resources
Requires tens to hundreds of GBytes of memory
Needs the most advanced CPUs
Effort spent on verification increased by >58% in the last 4 years
Source: 2010 Wilson Research Group and Mentor Graphics Functional Verification Study
May 2, 2012
5. Simulators using CPUs
Event-driven: a single queue of events
Memory access patterns cause cache misses
Multi-core CPUs:
At most one order of magnitude of speedup
Communication latency
Limited bandwidth
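
To make the single-queue bottleneck concrete, here is a minimal sketch of an event-driven gate simulator in Python. It is illustrative only and is not how any commercial simulator is implemented; the two-gate netlist, the unit delays, and all names (`simulate`, `gates`, `fanout`) are assumptions made for this example. Every value change goes through one time-ordered queue, which is what serializes the work.

```python
import heapq
from itertools import count


def simulate(gates, fanout, values, stimuli):
    """Event-driven simulation with one global, time-ordered event queue.

    gates:   output net -> (function, input nets, delay)
    fanout:  net -> list of gate outputs that read this net
    values:  net -> current value (updated in place)
    stimuli: list of (time, net, value) input changes
    """
    queue = []
    seq = count()  # tie-breaker so events at equal times pop in insertion order
    for time, net, value in stimuli:
        heapq.heappush(queue, (time, next(seq), net, value))

    processed = 0
    while queue:
        time, _, net, value = heapq.heappop(queue)
        processed += 1
        if values[net] == value:
            continue  # no change on this net, so nothing downstream is re-evaluated
        values[net] = value
        for gate in fanout.get(net, []):
            func, inputs, delay = gates[gate]
            new_value = func(*(values[i] for i in inputs))
            heapq.heappush(queue, (time + delay, next(seq), gate, new_value))
    return values, processed


# Hypothetical netlist: y = a AND b, z = NOT y, each with a delay of 1 time unit.
gates = {
    "y": (lambda a, b: a & b, ("a", "b"), 1),
    "z": (lambda y: 1 - y, ("y",), 1),
}
fanout = {"a": ["y"], "b": ["y"], "y": ["z"]}
initial = {"a": 0, "b": 0, "y": 0, "z": 1}

final_values, events_processed = simulate(
    gates, fanout, dict(initial), [(0, "a", 1), (5, "b", 1)]
)
print(final_values, events_processed)
```

Each popped event touches a different net and its fanout, so consecutive iterations tend to jump around memory, which is one way to read the cache-miss bullet above.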
6. HW Solutions
Hardware accelerators and emulators are not simulators
Suitable for system-level debug
Are very expensive – HW cost, custom design
Require significant effort for bring up
Are limited in capacity (large designs require several boxes)
They lack:
Support for non-synthesizable code
4-state logic
Full debug visibility
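
As a reminder of what 4-state logic means, the sketch below evaluates an AND gate over the Verilog value set 0, 1, X (unknown) and Z (high impedance). The function name `and4` and the string encoding of values are assumptions for this example; the truth table itself follows standard Verilog semantics, where a Z on a gate input is treated as unknown.

```python
def and4(a, b):
    """4-state AND over the values "0", "1", "X" (unknown), "Z" (high impedance)."""
    if a == "0" or b == "0":
        return "0"  # a 0 on either input forces the output to 0
    if a == "1" and b == "1":
        return "1"
    return "X"      # any remaining combination involves X or Z, so the result is unknown


# An uninitialized register (X) stays visible at the output unless it is masked by a 0.
for a, b in [("1", "1"), ("0", "X"), ("1", "X"), ("1", "Z")]:
    print(a, b, and4(a, b))
```

A 2-state engine would have to pick 0 or 1 for the unknown input, which can hide reset and initialization bugs that a 4-state simulator exposes.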
12. GPU Computing – cont’
146X: Medical Imaging (U of Utah)
36X: Molecular Dynamics (U of Illinois, Urbana)
149X: Financial simulation (Oxford)
50X: Matlab Computing (AccelerEyes)
100X: Astrophysics (RIKEN)
Source: NVIDIA
13. The Power of GPU
[Chart: peak single-precision performance (GFlops/sec) of the Tesla 8-series, Tesla 10-series, and Tesla 20-series GPUs compared with a 3 GHz Nehalem CPU]
Source: NVIDIA
14. GPU 100% Utilization
Thread #1 ALU Memory Load ALU Memory Load
Thread #2 ALU Memory Load ALU Memory Load
Thread #3 ALU Memory Load ALU Memory Load
Thread #4 ALU Memory Load ALU Memory Load
Thread #5 ALU Memory Load ALU Memory Load
Pipelining multiple threads can increase utilization to 100%
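
The effect can be approximated with a simple back-of-the-envelope model. The sketch below assumes each thread alternates between an ALU phase and a memory load, that loads from different threads overlap freely, and that switching threads costs nothing; the cycle counts (10 ALU cycles, 40 load cycles) are hypothetical and chosen only so that five threads, as in the diagram, reach full utilization.

```python
def alu_utilization(num_threads, alu_cycles, load_cycles):
    """Fraction of cycles the ALU is busy when threads are interleaved.

    One thread keeps the ALU busy for alu_cycles out of every
    (alu_cycles + load_cycles); additional threads fill the load gaps
    until the ALU is saturated.
    """
    return min(1.0, num_threads * alu_cycles / (alu_cycles + load_cycles))


# Hypothetical timings: 10 cycles of ALU work followed by a 40-cycle memory load.
for n in (1, 2, 5, 8):
    print(n, alu_utilization(n, alu_cycles=10, load_cycles=40))
```

Under these assumptions a single thread keeps the ALU busy 20% of the time, and adding threads beyond the saturation point brings no further gain.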
15. Logic Simulations - Challenge
– Billions of computing elements
– Short/simple calculations
– Many dependencies
– How to SIMD?
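
One textbook way to expose data parallelism despite the dependencies is levelization: gates are grouped by their depth from the primary inputs, so that all gates in one group depend only on earlier groups and can be evaluated together. The sketch below illustrates that general idea only; the slides do not say that RocketSim works this way, and the netlist and the names `levelize` and `gate_inputs` are assumptions for this example. It handles acyclic combinational logic only.

```python
def levelize(gate_inputs, primary_inputs):
    """Group gates into batches whose members are independent of each other.

    gate_inputs:    gate output net -> tuple of input nets (acyclic netlist)
    primary_inputs: nets driven from outside, assigned level 0
    Returns a list of batches; batch k only reads nets from levels below k + 1.
    """
    level = {net: 0 for net in primary_inputs}

    def level_of(net):
        if net not in level:
            level[net] = 1 + max(level_of(i) for i in gate_inputs[net])
        return level[net]

    batches = {}
    for gate in gate_inputs:
        batches.setdefault(level_of(gate), []).append(gate)
    return [batches[k] for k in sorted(batches)]


# Hypothetical netlist: n1 = f(a, b), n2 = f(c, d), n3 = f(n1, n2).
netlist = {"n1": ("a", "b"), "n2": ("c", "d"), "n3": ("n1", "n2")}
schedule = levelize(netlist, primary_inputs=["a", "b", "c", "d"])
print(schedule)
```

Here n1 and n2 land in the same batch and could be computed by the same instruction on different data, while n3 has to wait for both.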
20. RocketSim™ Overview
Summary
Highly cost-effective simulation offload engine, based on GPUs
10x acceleration factor compared to Cadence ncsim & Synopsys vcs
Acceleration increases with every new GPU generation
Works seamlessly with every existing simulator (Cadence, Synopsys, …)
Zero ramp-up time
Supports extremely large designs (giga-gate scale)
Short compilation time (minutes)
Full visibility
21. Customer testimonials
“With RocketSim, simulation time was reduced dramatically from
weeks to days, with a tenfold increase in speed and five-fold
decrease in server RAM requirements. This, together with support
of 4-state and capacity of more than 1G gates, give us a superb
tool to simulate our next generation GPU designs.”
- Dan Smith, Director of Engineering, NVIDIA
"Rocketick's RocketSim™ simulation accelerator solved verification
bottleneck of our SwitchX® switch silicon IC project by running
37 days’ worth of simulation over a single weekend without
changes in our standard verification environment and scripts. In
addition, reducing memory consumption from 192GB to less than
8GB allowed full chip simulation of SwitchX®, which was
previously impossible using standard simulators."
- Eitan Zahavi, Senior Director of Engineering, Mellanox
22. Thank you for your time
For more information:
www.rocketick.com