Emulation on Your Desktop
1. The Verification Problem
Emulation
Case Studies
Summary
Emulation On Your Desktop
Nirajnayan Sharma
Head, Engineering and Marketing - India
Bluespec, Inc.
niraj.sharma@bluespec.com
March 12, 2010
Nirajnayan Sharma Emulation On Your Desktop
Outline
1 The Verification Problem
2 Emulation
First Generation Emulation
Next Generation Emulation
3 Case Studies
AXI Switch
CMU ProtoFLEX
4 Summary
Simulation – The Fundamental Verification Bottleneck
Simulations are great for dead-on-arrival (DOA) checks and initial testing
Simulators allow the highest visibility into the hardware
Simulations remain the best method for the debug-fix-verify
cycle
However . . .
Speed quickly becomes a limiting factor in the verification of
complex systems
System simulations can last days
1 sec of D1 H.264 encode => 15 hours
Linux boot initialization => 140 hours . . .
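To put these figures in perspective, the slowdown relative to real time follows from simple arithmetic. The sketch below only restates the two numbers quoted above; the helper name `slowdown` is an illustrative choice, not something from the talk.

```python
def slowdown(real_seconds: float, sim_hours: float) -> float:
    """Ratio of simulation wall-clock time to the real time being modeled."""
    return (sim_hours * 3600.0) / real_seconds


# 1 second of D1 H.264 encode takes 15 hours to simulate (figure from the slide).
h264 = slowdown(real_seconds=1.0, sim_hours=15.0)
print(f"H.264 encode slowdown: {h264:,.0f}x")  # 54,000x

# 140 hours of simulation expressed in days (figure from the slide).
linux_boot_days = 140.0 / 24.0
print(f"Linux boot simulation: {linux_boot_days:.1f} days")  # 5.8 days
```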
From Simulation To Emulation
[Figure: progression from traditional pure software simulation to hardware emulation. First the DUT is moved to the emulator, then the DUT and the performance-critical testbench (TB) are moved to the emulator, with the resulting speedup annotated.]
First Generation Emulation
Emulation, when used effectively, can be a game-changer, but
existing systems are . . .
Expensive – difficult to justify except at the highest end
Complex – co-emulation link API requires bit-level data
packing and timing synchronization
[Figure: first-generation emulation. A workstation/simulator running the test bench is connected through a co-emulation link using proprietary protocols to a physical FPGA box/emulator holding the design (device under test).]
Difficult to remove bottlenecks between testbench and
DUT, resulting in unrealized performance
Complex testbenches from simulation will not port into
emulation
Limited to later in the development cycle, when RTL has
reached a critical point in verification
Only functional after a lot of verification (including
architectural and spec) has already been completed
Next Generation Emulation
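[Figure: next-generation emulation. The workstation/simulator and the physical FPGA board/emulator each contain transactors, an SCE-MI protocol layer, and a HAL, joined by a link (USB, PCI-E, Ethernet, ...). The workstation side holds the test bench; the FPGA side holds the device under test and a synthesizable test bench.]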
Hardware Abstraction Layer – decouples user
application from physical layer
Proprietary or standard (USB, Ethernet, PCI-E, . . . ), serial
or parallel physical link
Work with high-level data structures
Low-level link data and timing details hidden
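As a rough illustration of what such an abstraction layer does, the sketch below lets the application exchange a high-level `WriteRequest` while the packing and the link stay hidden. Every name here (`PhysicalLink`, `LoopbackLink`, `Hal`, `WriteRequest`) and the wire layout are assumptions made for this example; they are not the API of any particular product.

```python
import struct
from abc import ABC, abstractmethod
from collections import deque
from dataclasses import dataclass


class PhysicalLink(ABC):
    """Hypothetical byte-oriented link (USB, Ethernet, PCI-E, ...)."""

    @abstractmethod
    def send(self, payload: bytes) -> None: ...

    @abstractmethod
    def recv(self) -> bytes: ...


class LoopbackLink(PhysicalLink):
    """In-memory stand-in for a real link, used only for this sketch."""

    def __init__(self) -> None:
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def recv(self) -> bytes:
        return self._queue.popleft()


@dataclass(frozen=True)
class WriteRequest:
    """High-level data structure the user application works with."""
    address: int
    data: int


class Hal:
    """Hides packing and link details behind put/get of WriteRequest."""

    _FORMAT = "<IQ"  # assumed wire layout: 32-bit address, 64-bit data

    def __init__(self, link: PhysicalLink) -> None:
        self._link = link

    def put(self, request: WriteRequest) -> None:
        self._link.send(struct.pack(self._FORMAT, request.address, request.data))

    def get(self) -> WriteRequest:
        address, data = struct.unpack(self._FORMAT, self._link.recv())
        return WriteRequest(address, data)


hal = Hal(LoopbackLink())
hal.put(WriteRequest(address=0x1000, data=0xDEADBEEF))
received = hal.get()
print(received)
```

Swapping `LoopbackLink` for another `PhysicalLink` subclass would leave the application code unchanged, which is the decoupling the slide describes.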
SCE-MI (Standard Co-Emulation Modeling Interface)
Standardized API and communication protocol to take full
advantage of emulation performance
Both on hardware and software sides
Ensures simulation and emulation component
inter-operability
Transactors – Virtual termination of external design
interfaces
Help eliminate link bottlenecks by converting from timed
bit-level to untimed transaction-level
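The conversion from untimed transactions to timed bit-level activity can be sketched in software. The example below is a simplified model, assuming a hypothetical valid/ready handshake and made-up `Transaction` and `PinState` types; a real transactor would be synthesized hardware on the FPGA side.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Transaction:
    """Untimed, transaction-level message (hypothetical format)."""
    address: int
    data: int


@dataclass(frozen=True)
class PinState:
    """Timed, bit-level view: the values driven in one clock cycle."""
    cycle: int
    valid: bool
    address: int
    data: int


def drive(transactions: Iterable[Transaction],
          ready_per_cycle: Iterable[bool]) -> Iterator[PinState]:
    """Expand each transaction into per-cycle pin values.

    A transaction is held on the pins with valid=1 until a cycle in which
    the DUT asserts ready; the next transaction starts on the following cycle.
    """
    pending = iter(transactions)
    current = next(pending, None)
    for cycle, ready in enumerate(ready_per_cycle):
        if current is None:
            yield PinState(cycle, False, 0, 0)  # idle cycle
            continue
        yield PinState(cycle, True, current.address, current.data)
        if ready:
            current = next(pending, None)


txns = [Transaction(0x10, 0xAA), Transaction(0x14, 0xBB)]
ready = [False, True, True, False]  # DUT stalls in cycle 0
trace: List[PinState] = list(drive(txns, ready))
for state in trace:
    print(state)
```

Only two messages would cross the link here, while four cycles of pin activity are produced next to the DUT; keeping the cycle-level handshake off the link is what removes the bottleneck.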
Synthesizable testbench
Helps eliminate bottlenecks in the simulator and
co-emulation link
Allows debug instrumentation to be placed around the DUT
without affecting performance
Emulator – Commodity FPGAs
Must move to commodity FPGAs and support virtualization
of FPGA platforms (portability)
Allows tracking of latest FPGA technology at lowest cost
AXI Switch Performance Validation
[Figure: AXI switch emulation setup. On the workstation, a Tcl/Tk GUI, statistics, and M master traffic directors feed a serializer/deserializer on the USB co-emulation link. On the physical FPGA board, M master traffic generators and transactors drive an M x N AXI switch (the DUT) connected to N slaves.]
DUT and emulation setup
AXI switch as DUT
Configurable number of masters and slaves
Implemented on ML507 Virtex-5 board
Test setup
Directed random traffic
Traffic generator and analyzer split between simulator and
emulator
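A minimal sketch of what directed random traffic might look like on the software side is shown below. The `Burst` fields, the weighting scheme, and the burst length range are all assumptions for illustration; the slides do not describe the actual generator.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Burst:
    """One generated request (hypothetical fields, not the AXI signal set)."""
    master: int
    slave: int
    is_write: bool
    length: int


def generate_traffic(master: int, slave_weights: List[float],
                     write_ratio: float, count: int, seed: int) -> List[Burst]:
    """Random bursts whose distribution is steered by the given knobs."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    slaves = list(range(len(slave_weights)))
    bursts = []
    for _ in range(count):
        slave = rng.choices(slaves, weights=slave_weights, k=1)[0]
        bursts.append(Burst(
            master=master,
            slave=slave,
            is_write=rng.random() < write_ratio,
            length=rng.randint(1, 16),  # assumed burst length range
        ))
    return bursts


# Direct all traffic from master 0 at slaves 1 and 2, never slave 0.
traffic = generate_traffic(master=0, slave_weights=[0.0, 3.0, 1.0],
                           write_ratio=0.5, count=1000, seed=42)
hits = [sum(b.slave == s for b in traffic) for s in range(3)]
print(hits)
```

The weights are the "directed" part: they bias the random stream toward the switch paths under study, while the seed keeps a run repeatable between simulation and emulation.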
SCE-MI communication – consistent framework for
implementing untimed communication between simulator
and emulator
HAL automatically serializes and de-serializes
transaction-level communication over the USB link
Performance – entire system runs at 50 MHz in emulation
on the board.
CMU ProtoFLEX – Accelerating Software Validation
For many SoCs, software development and verification can dominate the development cycle
[Figure: ProtoFLEX hybrid architecture. On the FPGA, a synthesized device on the FPGA fabric handles compute-intensive ISA functionality and software on an embedded PowerPC handles less frequent ISA functionality, joined by low-latency, high-bandwidth communication ("micro-transplant"). Ethernet communication ("transplant") connects the FPGA to Virtutech Simics, which models the rest of the full system.]
Virtutech Simics
Virtutech Simics – Commercial SW simulator for whole
systems (OS/devices/apps)
Despite clever tricks, steady slowdown for each added
thread and for each added bit of instrumentation
CMU ProtoFLEX
Fully operational model of a 16-CPU UltraSPARC III Sun Fire
3800 server, running unmodified Solaris 8, on an
FPGA at 90 MHz
Hybrid simulation – continue to use Simics for modeling
rest of system (I/O devices, ...)
Benchmarks
TPC-C OLTP on Oracle 10g Enterprise Database Server
SPECint (bzip2, crafty, gcc, gzip, parser, vortex)
Performance: 10-60 MIPS, 39x faster than Virtutech Simics
alone on the same system/benchmark
Written by 1 graduate student (Eric Chung) in 1 year
Summary
What if you had a 50-100 MHz emulation platform on your desktop right from the start?
FPGA densities and parallel interconnect speeds today
allow
Significant unit verification to be done in a single FPGA
Complete chips to be cost effectively emulated on FPGA
boards and systems
A well-engineered co-emulation stack and link
virtualization are key to enabling mainstream use
Ensure simple and rapid bringup
Flexible hardware-software partitioning
Portability across hardware and applications
Paired with a general-purpose high-level synthesis language, a
well-productized co-emulation stack can enable mass adoption
of emulation.