Leveraging Low-Cost FPGA Prototyping for Validation of
Highly Threaded Server-on-Chip
DV Club - July 2009




Jai Kumar,
Verification Technologist
Sun Microsystems Inc.
jai.kumar@sun.com
http://sun.com
Outline
  •       Verification Challenges
  •       Emulation alternatives
  •       FPGA Prototyping Basics
  •       Prototyping Challenges
  •       Guidelines
  •       Results
  •       Summary

What's in it for you -
  Managers:
  - Requirements – effort, $$, Time, tools
  Engineers:
  - Challenges
  - Avoid Pitfalls
  Vendors:
  - Enhancements to simplify adoption
DV Club                         Jai Kumar                              Slide 2
Design Challenges Impacting Verification

[Figure: Simulation Speed (cycles/sec, log scale from 1 to 1,000,000) vs. Design Size (M gates), with regions for SW Sim, Emulation and FPGA Prototyping]

[Charts: growth across server generations]
                   T1000    T5220    T5240    T5440
  Threads             32       64      128      256
  Design Size        41M      80M     120M     160M
  Performance         1X     2.5X       4X       8X
  Memory             64G     128G     256G     512G
Server-on-Chip: Verification Complexity
  • 2x+ performance over UltraSPARC T1, within the same power envelope
  • Up to 8 cores @1.4GHz
  • 2x the threads
          > Up to 64 threads per CPU
  • 2x the memory
          > Up to 128GB memory
          > Up to 16 fully buffered DIMMs
          > 2.5x memory BW = 60+GB/s
  • 8x FPUs, 1 fully pipelined floating point unit/core
  • 4MB L2$ (8 banks), 16-way set associative
  • Security co-processor per core
          > DES, 3DES, AES, RC4, SHA1, SHA256, MD5, RSA to 2096 key, ECC
  • Powers SunFire T5120, T5220, T6320 Servers

[Block diagram: four dual-channel FB-DIMM memory controllers; eight L2$ banks; crossbar; eight cores (C1-C8), each with 16 KB I$, 8 KB D$, FPU and SPU; Sys I/F buffer switch core; NIU (10 Gb Ethernet); PCIe (x8 @ 2.5 GHz, 2 GB/s each direction); SSI, JTAG debug port]
Problem: cost of Emulation going up




[Image: emulator HW (big iron) pictured beside a Gulfstream jet]


FPGA Roadmap




[Figure: FPGA roadmap. Source: MPSOC Keynote 2006, Xilinx]
          FPGAs are getting bigger, cheaper and faster!
Solution: Supplement Emulation with cheaper FPGA prototyping alternatives
• Why use FPGA prototyping?
    > Not enough $$ for HW Emulators (big iron) – R&D dollars
    > Need to run at close to real-time speed
    > New advancements in FPGA technology create opportunities for leverage
• Benefits
    > Availability of standard off-the-shelf, mix-n-match FPGA HW/SW tools (small iron)
    > Allows you to stretch your R&D dollars
    > Deploy many replicas – multiple systems in parallel
    > Supplements your emulators (big iron) – does not replace them

Think Small, Fast and Many
FPGA Prototyping 101
      What is Prototyping:
      • Process of mapping RTL functionality to FPGAs
      Hardware:
      • Multiple FPGAs (latest, largest devices) on a board
      • Two Major Vendors: Altera & Xilinx
      • Capacity: 3-150M Gates
      • Performance: 5 to 50MHz
      Software:
      • Synthesis, Design Partition, FPGA P&R
      • Debug Tools
Big Picture
[Figure: verification platforms plotted by Simulation Speed (Hz, 1 to 1G+) against Modeling Effort. Other labels on the chart: HW verification, System-level (HW/SW) verification, Silicon, SW Development Productivity, Debug Productivity]

  Platform              Solaris Boot Time
  Simulation            15 years
  Acceleration          1 day 18 hrs
  Emulation             6 hours
  FPGA Prototyping      38 mins
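
The boot times on this slide can be related to platform speed with simple arithmetic. The sketch below is only an illustration: it assumes the 38-minute FPGA boot ran at the 8 MHz quoted on the results slide (the deck does not state that pairing), derives a boot length in cycles from it, and then back-computes the speed each other platform would need to match its quoted boot time. All function and constant names are made up for this example.

```python
# Back-of-the-envelope link between platform speed and Solaris boot time.
# ASSUMPTION: the 38-minute boot ran at the 8 MHz quoted for the FPGA model;
# the slides do not state this pairing explicitly.

FPGA_SPEED_HZ = 8_000_000        # "Runs @8MHz" from the results slide
FPGA_BOOT_SECONDS = 38 * 60      # "38mins" from the Big Picture slide

# Implied length of the boot workload in clock cycles (derived, not quoted).
BOOT_CYCLES = FPGA_SPEED_HZ * FPGA_BOOT_SECONDS

SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR

# Boot times as quoted on the slide, converted to seconds.
QUOTED_BOOT_SECONDS = {
    "Simulation": 15 * SECONDS_PER_YEAR,
    "Acceleration": (24 + 18) * SECONDS_PER_HOUR,
    "Emulation": 6 * SECONDS_PER_HOUR,
    "FPGA Prototyping": FPGA_BOOT_SECONDS,
}


def boot_seconds(cycles, speed_hz):
    """Wall-clock time needed to run `cycles` at `speed_hz`."""
    return cycles / speed_hz


def implied_speed_hz(cycles, seconds):
    """Platform speed that would finish `cycles` in `seconds`."""
    return cycles / seconds


if __name__ == "__main__":
    for platform, seconds in QUOTED_BOOT_SECONDS.items():
        speed = implied_speed_hz(BOOT_CYCLES, seconds)
        print(f"{platform:18s} ~{speed:,.0f} Hz")
```

Under that assumption the quoted boot times correspond to roughly 40 Hz for simulation, about 120 kHz for acceleration and about 840 kHz for emulation. These are derived estimates, not figures from the deck.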

FPGA Prototyping Vs. Emulation
          Features                                FPGA Prototype   Emulation
          General:
            Capacity Expandability                Good             Very Good
            Memory Capacity                       Very Good        Good
            Ease of use                           Low              Very Good
            Cost                                  Low              High
          Model Build Efficiency:
            Compile Time                          OK               Very Good
            Model Size                            Smaller          Bigger
            RTL Flexibility                       OK               Good
            Test bench support                    OK               Very Good
          Simulation Efficiency:
            Simulation Speed                      Very Good        Good
            Save/Restore                          No               Very Good
            IO Expandability (PCIE,Ethernet etc)  Very Good        Good
          Debug Efficiency:
            Signal Visibility                     Limited          Very Good
            Waveforms w/o re-run                  No               Very Good


FPGA Tools
[Tool-flow diagram, top to bottom]
  • Design RTL
  • Design Partition: Auspy; Synopsys Certify
  • RTL Synthesis: Altera Quartus; Synopsys Synplify; Xilinx ISE
  • Place & Route: Altera Place & Route; Xilinx Place & Route
  • FPGA: Altera Stratix3 FPGA; Xilinx Virtex5 FPGA
  • HW Boards: Gidel HW; DINI HW; Synopsys; DINI; Vendor X
  • Debug: Altera SignalTap Debug; Xilinx Chipscope Debug
  • Advanced Debug Tools: ALDEC; DAFCA; Synopsys Identify Pro

  Off-the-Shelf, Mix-n-Match FPGA Emulation HW/SW Tools
Deployment Strategy
  • Understand platform capabilities and limitations
          > Build your use model
          > Set management, user expectations
  • Identify Applicable Model Configurations
          > Size limited to small capacity (<16MGates)
  • Identify Workload
          > Primary Platform for SW Development
          > Secondary Platform for RTL/IO Verification
  • Design Mapping
          > Automated FPGA RTL Coding enforcements
  • Leverage simulators/emulators for debug
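
The "size limited to small capacity" bullet can be turned into a quick feasibility check when choosing model configurations. The sketch below is an illustration only. The 3M-gates-per-FPGA figure is an assumption derived from the results slide (6M gates across 2 FPGAs), the 6-FPGA ceiling comes from the challenges slide, and `usable_fraction` is a hypothetical knob for keeping utilization below 100%. Gate counts are not directly comparable across vendors, so treat the result as a first-pass estimate.

```python
import math


def fpgas_needed(design_gates, gates_per_fpga, usable_fraction=1.0):
    """Number of FPGAs needed to hold `design_gates`.

    `usable_fraction` (hypothetical) derates each device, e.g. 0.8 means
    only 80% of the nominal capacity is assumed to be usable.
    """
    if design_gates < 0 or gates_per_fpga <= 0:
        raise ValueError("gate counts must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(design_gates / (gates_per_fpga * usable_fraction))


def fits_platform(design_gates, gates_per_fpga, max_fpgas, usable_fraction=1.0):
    """True if the design fits within `max_fpgas` devices."""
    return fpgas_needed(design_gates, gates_per_fpga, usable_fraction) <= max_fpgas


if __name__ == "__main__":
    GATES_PER_FPGA = 3_000_000   # assumption: 6M gates / 2 FPGAs
    MAX_FPGAS = 6                # upper end of the "4-6 FPGAs" guideline
    for gates in (3_800_000, 16_000_000, 40_000_000):
        n = fpgas_needed(gates, GATES_PER_FPGA)
        ok = fits_platform(gates, GATES_PER_FPGA, MAX_FPGAS)
        print(f"{gates:>11,} gates -> {n} FPGAs, fits: {ok}")
```

With these assumed numbers the 3.8M-gate model lands on 2 FPGAs, which matches the hardware listed on the results slide.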
Prototyping Challenges
  • Design Mapping – Size, Style
          > Limit to 4-6 FPGAs (~16M Gates)
  • Memory Mapping
    > RTL Arrays (custom logic) – BLK RAM inferencing
    > Multi-ported arrays – over clocking
    > Large system memory - mapping to DDR
  • Verification Infrastructure
    > TestBench – synthesizable, self-checking
    > Initialization - Use back-door access to download/upload big memories
    > Monitors, SVA, $display are not supported – use logic analyzer (LA) triggers
  • Mapping Transformation Verification
          > Gate-level Simulation at every stage
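
The "multi-ported arrays – over clocking" item can be illustrated with a small behavioral model. This is a sketch of the general idea, not the mapping used in the talk: one single-port RAM is assumed to run `num_ports` times faster than the design clock, so each port gets one access slot per design cycle. The class name and request format are invented for the example, and serving reads before writes is an assumed read-during-write policy.

```python
class TimeMultiplexedRam:
    """Multi-ported array emulated with one single-port RAM.

    The RAM is assumed to run `num_ports` times faster than the design
    clock, so every port owns one access slot per design cycle.
    """

    def __init__(self, depth, num_ports):
        self.mem = [0] * depth
        self.num_ports = num_ports
        self.fast_cycles = 0  # fast-clock slots consumed so far

    def design_cycle(self, requests):
        """Serve one design-clock cycle.

        `requests` holds one entry per port: None (idle), ("r", addr) or
        ("w", addr, value). Returns the read data (or None) for each port.
        """
        if len(requests) != self.num_ports:
            raise ValueError("one request slot per port is required")
        results = [None] * self.num_ports
        # Assumed policy: reads are served first, so they observe the
        # contents from before this cycle's writes.
        for port, req in enumerate(requests):
            if req is not None and req[0] == "r":
                results[port] = self.mem[req[1]]
        for req in requests:
            if req is not None and req[0] == "w":
                self.mem[req[1]] = req[2]
        # Each port's slot elapses whether or not the port used it.
        self.fast_cycles += self.num_ports
        return results


if __name__ == "__main__":
    ram = TimeMultiplexedRam(depth=16, num_ports=3)
    print(ram.design_cycle([("w", 5, 42), ("r", 5), None]))
    print(ram.design_cycle([("r", 5), ("r", 5), None]))
```

The cost of this approach is visible in `fast_cycles`: the memory clock has to run `num_ports` times faster than the design clock, which caps the achievable design clock on the FPGA.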

Guidelines
  • RTL Coding Guidelines for FPGAs
          > No XMRs (cross-module references), no force/release; avoid latches and clock gating
          > No initializations (constant inits result in undesired synthesis
            optimizations)
          > Perform FPGA RTL Linting Check
  • Stand-alone Synthesis & Verif of custom logic
          > check for RAM utilization & reduced CLK domains
          > Mixed-mode RTL-Gate Simulations
  • Perform full-chip gate simulations at different stages
          > After synthesis, after partitioning, after insertion of signal
            multiplexing logic
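
Some of the coding guidelines above lend themselves to an automated first-pass check. The sketch below is a toy, regex-based scan and not a substitute for a real FPGA RTL lint tool: it flags force/release, `initial` blocks, declaration-time initializers and dotted hierarchical references on a per-line basis. The rule names and the `lint_rtl` function are invented for the example. Latch inference and clock gating need structural analysis and are deliberately left out, and the dotted-name rule will also flag legitimate SystemVerilog struct or interface accesses.

```python
import re

# Heuristic patterns only; each is matched per line after stripping
# `//` comments. Block comments and multi-line statements are not handled.
RULES = {
    "force-release": re.compile(r"\b(?:force|release)\b"),
    "initial-block": re.compile(r"\binitial\b"),
    # Declaration with an initializer, e.g. `reg [3:0] cnt = 4'b0;`
    "decl-init": re.compile(r"\b(?:reg|logic)\b[^;=]*="),
    # Dotted name such as `top.u0.sig`; named port connections like
    # `.clk(clk)` are skipped because nothing precedes the leading dot.
    "xmr": re.compile(r"(?<![\w.])[A-Za-z_]\w*\.[A-Za-z_]\w*"),
}


def lint_rtl(source):
    """Return a list of (line_number, rule_name) findings for `source`."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # drop trailing line comment
        for rule, pattern in RULES.items():
            if pattern.search(code):
                findings.append((lineno, rule))
    return findings


if __name__ == "__main__":
    example = "\n".join([
        "module t(input clk, output reg q);",
        "  reg [3:0] cnt = 4'b0;",
        "  initial q = 0;",
        "  always @(posedge clk) q <= top.u0.d;",
        "endmodule",
    ])
    for lineno, rule in lint_rtl(example):
        print(f"line {lineno}: {rule}")
```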
FPGA Flow
[Flow diagram, stages as laid out on the slide]
  • Emulation RTL Model
  • Modular Synthesis / Parallel Synthesis
  • Netlist Qualification
  • Design Partition
  • Design Visibility (with C-API Compile alongside)
  • FPGA Place & Route
  • FPGA Platform
  • Gate-level Simulation / RTL Simulation
          > verify latch, clk-gate conversions
          > fpga partitioning
          > pin multiplexing
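
The pin multiplexing step in this flow trades inter-FPGA pins against clock speed, which a small calculation can make concrete. The sketch below uses a simplified model that is an assumption of this example, not something stated in the talk: signals crossing a partition boundary are time-multiplexed over the available pins, and the system clock can be no faster than the multiplexing clock divided by the mux ratio plus a few overhead slots. All numbers in the usage section are hypothetical.

```python
import math


def mux_ratio(signals_crossing, pins_available):
    """Signals that must share each physical pin (at least 1)."""
    if signals_crossing < 0 or pins_available <= 0:
        raise ValueError("signal and pin counts must be positive")
    return max(1, math.ceil(signals_crossing / pins_available))


def max_system_clock_hz(tdm_clock_hz, ratio, overhead_slots=0):
    """Upper bound on the system clock imposed by multiplexing alone.

    Simplified model (assumption): one system clock cycle must cover
    `ratio` transfer slots plus `overhead_slots` of the multiplexing clock.
    """
    if ratio < 1 or overhead_slots < 0:
        raise ValueError("ratio must be >= 1 and overhead_slots >= 0")
    return tdm_clock_hz / (ratio + overhead_slots)


if __name__ == "__main__":
    # Hypothetical partition: 4000 crossing signals, 500 usable pins,
    # 200 MHz multiplexing clock, 2 slots of synchronization overhead.
    ratio = mux_ratio(4000, 500)
    bound = max_system_clock_hz(200_000_000, ratio, overhead_slots=2)
    print(f"mux ratio {ratio}:1, system clock <= {bound / 1e6:.1f} MHz")
```

A partition that cuts fewer signals lowers the ratio and raises the bound, which is one reason partition quality matters as much as raw FPGA capacity.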

FPGA Prototyping Results
  • OpenSPARC T2 Model
     > 3.8M Gates, Runs @8MHz
     > Being opensourced soon – opensparc.net
  • Hardware:
     > 6M Gates
     > 2 Altera Stratix III SL340 FPGAs
  • Software:
     > RTL Partitioner, Bundled FPGA tools
  • Effort:
     > 1 engineer; 3 months
  • Applications:
     > Verify Core, SOC, IO
     > Verify Firmware (HV/OBP), Solaris, Application

[Block diagram: memory controllers, L2$ banks, crossbar, cores C1-C8 with I$, D$, FPU and SPU, NIU, Sys I/F buffer switch core, PCIe]
Platform improvements – to ease adoption
  • Bridge gap between Emulator and FPGA
    Prototyping
          >   Learn from advances in the emulator space
          >   Ease of model build
          >   Support for RTL, SVA, TB constructs
          >   Seamless RTL partitioning
          >   Eliminate need for gate-simulations
  • Support for Verification infrastructure
          > XMRs, preserve net names, ports
  • Enhance Debug experience
          > Improve debug tools, offload to simulators
Summary
  • Low cost FPGA prototyping supplements expensive
    emulators
  • Collaborate with vendors to implement feature-set
    for your use models
  • FPGA Prototyping is effort-intensive, but will pay off
    in cost savings & higher performance
  • Benefit:
          > Higher HW & SW coverage (fewer silicon respins)
          > Debug bring-up tools before tape-out (TO), for faster bring-up and
            productization time savings


More Related Content

What's hot

Deutsche EuroShop | Company Presentation | 11/11
Deutsche EuroShop | Company Presentation | 11/11Deutsche EuroShop | Company Presentation | 11/11
Deutsche EuroShop | Company Presentation | 11/11
Deutsche EuroShop AG
 
Shape from Distortion - 3D Digitization
Shape from Distortion - 3D DigitizationShape from Distortion - 3D Digitization
Shape from Distortion - 3D Digitization
Vanya Valindria
 
A Function by Any Other Name is a Function
A Function by Any Other Name is a FunctionA Function by Any Other Name is a Function
A Function by Any Other Name is a Function
Jason Strate
 
Deutsche EuroShop | Company Presentation | 10/11
Deutsche EuroShop | Company Presentation | 10/11Deutsche EuroShop | Company Presentation | 10/11
Deutsche EuroShop | Company Presentation | 10/11
Deutsche EuroShop AG
 
iSLC Technology
iSLC TechnologyiSLC Technology
iSLC Technology
Jessika Remolona
 
Textile industry webinar: 28-Jun-2011
Textile industry webinar: 28-Jun-2011Textile industry webinar: 28-Jun-2011
Textile industry webinar: 28-Jun-2011
BusinessVibes_Network
 
Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++...
Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++...Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++...
Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++...
CUBRID
 
Real Application Testing
Real Application TestingReal Application Testing
Real Application Testing
oracleonthebrain
 
Fctcp Chw Trends 2 12 2007 10 2
Fctcp Chw Trends 2 12 2007 10 2Fctcp Chw Trends 2 12 2007 10 2
Fctcp Chw Trends 2 12 2007 10 2
guest6ac2d0
 
Seminar Saham
Seminar SahamSeminar Saham
Seminar Saham
Bilawal Alhariri Anwar
 
Engagement Metrics August 2012
Engagement Metrics August 2012Engagement Metrics August 2012
Engagement Metrics August 2012
Absolute Radio
 
Aslo 2012
Aslo 2012Aslo 2012
Aslo 2012
emmats
 






Jai kumar fpga_prototyping

  • 1. Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server-on-Chip
      DV Club - July 2009
      Jai Kumar, Verification Technologist, Sun Microsystems Inc.
      jai.kumar@sun.com
      http://sun.com
  • 2. Outline
      • Verification Challenges
      • Emulation alternatives
      • FPGA Prototyping Basics
      • Prototyping Challenges
      • Guidelines
      • Results
      • Summary
      What's in it for you:
      • Managers: Requirements (effort, $$, time, tools)
      • Engineers: Challenges, avoid pitfalls
      • Vendors: Enhancements to simplify adoption
      DV Club Jai Kumar Slide 2
  • 3. Design Challenges Impacting Verification
      [Figure: simulation speed (cycles/sec, log scale) vs. design size (M gates), with regions labeled SW Sim, Emulation, and FPGA Prototyping]
      [Figure: four bar charts across T1000, T5220, T5240, T5440: Threads (32, 64, 128, 256), Design Size (41M, 80M, 120M, 160M), Performance (1X, 2.5X, 4X, 8X), Memory (64G, 128G, 256G, 512G)]
  • 4. Server-on-Chip: Verification Complexity
      • 2x+ performance over UltraSPARC T1, within the same power envelope
      • Up to 8 cores @1.4GHz
      • 2x the threads
          > Up to 64 threads per CPU
      • 2x the memory
          > Up to 128GB memory
          > Up to 16 fully buffered DIMMs
          > 2.5x memory BW = 60+GB/s
      • 8x FPUs, 1 fully pipelined floating point unit/core
      • 4MB L2$ (8 banks), 16-way set associative
      • Security co-processor per core
          > DES, 3DES, AES, RC4, SHA1, SHA256, MD5, RSA to 2096 key, ECC
      • Powers SunFire T5120, T5220, T6320 Servers
      [Block diagram: four dual-channel FB-DIMM memory controllers; eight L2$ banks; crossbar; eight cores C1-C8, each with 16 KB I$, 8 KB D$, FPU and SPU; Sys I/F buffer switch core; NIU (10 Gb Ethernet); PCIe (x8 @ 2.5 GHz, 2 GB/s each direction); SSI, JTAG debug port]
  • 5. Problem: cost of Emulation going up
      [Figure: Emulator HW (big iron) shown alongside a Gulfstream jet]
  • 6. FPGA Roadmap
      [Figure: FPGA roadmap. Source: MPSOC Keynote 2006, Xilinx]
      FPGAs are getting bigger, cheaper and faster!
  • 7. Solution: Supplement Emulation with cheaper FPGA prototyping alternatives
      • Why use FPGA prototyping?
          > Not enough $$ for HW Emulators (big iron) – R&D dollars
          > Need to run at close to real-time speed
          > New advancements in FPGA technology create opportunity for leverage
      • Benefits
          > Availability of standard off-the-shelf, mix-n-match FPGA HW/SW tools (small iron)
          > Allows you to stretch your R&D dollars
          > Deploy many replicates – multiple systems in parallel
          > Supplements your emulators (big iron) – does not replace them
      Think Small, Fast and Many
  • 8. FPGA Prototyping 101
      What is Prototyping:
      • Process of mapping RTL functionality to FPGAs
      Hardware:
      • Multiple latest, largest FPGAs on a board
      • Two major vendors: Altera & Xilinx
      • Capacity: 3-150M Gates
      • Performance: 5 to 50MHz
      Software:
      • Synthesis, Design Partition, FPGA P&R
      • Debug Tools
  • 9. Big Picture
      [Figure: verification platforms (Simulation, Acceleration, Emulation, FPGA Prototyping, Silicon) placed along a simulation-speed axis (Hz) running 1, 10, 100, 1K, 10K, 100K, 500K, 1M, 5M, 10M, 100M, 1G+; usage regions labeled HW verification, System-level (HW/SW) verification, and SW Development; trend arrows labeled Productivity, Modeling Effort, and Debug Productivity; Solaris Boot Time annotations: 15 years, 1 day 18 hrs, 6 hours, 38 mins]
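The boot-time annotations on the Big Picture slide follow from a simple relationship: wall-clock time is the number of design cycles divided by the platform speed in Hz. The sketch below illustrates that scaling only. The workload size `WORKLOAD_CYCLES` is a hypothetical value obtained by assuming a 38-minute run at 8 MHz (the slides do not state which speed the 38-minute figure was measured at), and the helper names are made up for this example.

```python
def wall_clock_seconds(cycles: float, speed_hz: float) -> float:
    """Wall-clock time needed to run `cycles` design clock cycles at `speed_hz`."""
    if speed_hz <= 0:
        raise ValueError("speed_hz must be positive")
    return cycles / speed_hz


def humanize(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    units = [
        ("years", 365 * 24 * 3600),
        ("days", 24 * 3600),
        ("hours", 3600),
        ("mins", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} s"


# ASSUMPTION: hypothetical workload, sized as a 38-minute run at 8 MHz.
WORKLOAD_CYCLES = 38 * 60 * 8e6

# Illustrative platform speeds picked from the slide's Hz axis range.
for label, hz in [("100 Hz", 1e2), ("100 kHz", 1e5), ("1 MHz", 1e6), ("8 MHz", 8e6)]:
    print(f"{label:>8}: {humanize(wall_clock_seconds(WORKLOAD_CYCLES, hz))}")
```

Under these assumptions, the same workload that takes minutes at MHz speeds takes years at simulator-class speeds, which is the gap the slide is drawing attention to.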
  • 10. FPGA Prototyping vs. Emulation
      Features                                FPGA Prototype   Emulation
      General:
        Capacity Expandability                Good             Very Good
        Memory Capacity                       Very Good        Good
        Ease of use                           Low              Very Good
        Cost                                  Low              High
      Model Build Efficiency:
        Compile Time                          OK               Very Good
        Model Size                            Smaller          Bigger
        RTL Flexibility                       OK               Good
        Test bench support                    OK               Very Good
      Simulation Efficiency:
        Simulation Speed                      Very Good        Good
        Save/Restore                          No               Very Good
        IO Expandability (PCIe, Ethernet etc) Very Good        Good
      Debug Efficiency:
        Signal Visibility                     Limited          Very Good
        Waveforms w/o re-run                  No               Very Good
  • 11. FPGA Tools
      [Flow diagram, top to bottom:]
      • Design RTL
      • Design Partition: Synopsys Certify, Auspy
      • RTL Synthesis: Altera Quartus, Synopsys Synplify, Xilinx ISE
      • Place & Route: Altera, Xilinx
      • FPGA: Altera Stratix3, Xilinx Virtex5
      • HW Boards: Gidel, DINI, Synopsys, Vendor X
      • Debug: Altera SignalTap, Xilinx Chipscope
      • Advanced Debug Tools: Synopsys Identify Pro, ALDEC, DAFCA
      Off-the-Shelf, Mix-n-Match FPGA Emulation HW/SW Tools
  • 12. Deployment Strategy
      • Understand platform capabilities and limitations
          > Build your use model
          > Set management, user expectations
      • Identify Applicable Model Configurations
          > Size limited to small capacity (<16M Gates)
      • Identify Workload
          > Primary Platform for SW Development
          > Secondary Platform for RTL/IO Verification
      • Design Mapping
          > Automated FPGA RTL Coding enforcements
      • Leverage simulators/emulators for debug
  • 13. Prototyping Challenges
      • Design Mapping – Size, Style
          > Limit to 4-6 FPGAs (~16M Gates)
      • Memory Mapping
          > RTL Arrays (custom logic) – block RAM inferencing
          > Multi-ported arrays – over-clocking
          > Large system memory – mapping to DDR
      • Verification Infrastructure
          > TestBench – synthesizable, self-checking
          > Initialization – use back-door access to download/upload big memories
          > Monitors, SVA, $display are not supported – use logic analyzer (LA) triggers
      • Mapping Transformation Verification
          > Gate-level Simulation at every stage
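The "multi-ported arrays – over-clocking" item can be illustrated with a small behavioral model: a single-port memory is clocked several times per design clock, and each fast cycle services one logical port. This is only a sketch of the general time-multiplexing idea, not the mapping used in the presentation; the class name, request format, and fixed port ordering are assumptions made for the example.

```python
class OverclockedRam:
    """Single-port RAM that serves several logical ports per design clock.

    ASSUMPTION: the RAM runs `num_ports` fast cycles for every design-clock
    cycle, and ports are serviced in a fixed order (port 0 first).
    """

    def __init__(self, depth: int, num_ports: int):
        self.mem = [0] * depth
        self.num_ports = num_ports
        self.fast_cycles = 0  # total fast-clock cycles consumed

    def design_cycle(self, requests):
        """Service one design-clock cycle.

        `requests` holds one entry per logical port: None (idle),
        ("r", addr) or ("w", addr, data). Returns the read data per port,
        with None for idle ports and for writes.
        """
        if len(requests) != self.num_ports:
            raise ValueError("one request slot per port is required")
        results = []
        for req in requests:  # one fast-clock slot per logical port
            self.fast_cycles += 1
            if req is None:
                results.append(None)
            elif req[0] == "r":
                results.append(self.mem[req[1]])
            elif req[0] == "w":
                self.mem[req[1]] = req[2]
                results.append(None)
            else:
                raise ValueError(f"unknown operation {req[0]!r}")
        return results


ram = OverclockedRam(depth=16, num_ports=3)
ram.design_cycle([("w", 4, 0xAB), None, None])
out = ram.design_cycle([("r", 4), ("w", 5, 0xCD), ("r", 5)])
print(out, ram.fast_cycles)
```

Note that with a fixed service order, a later port sees a write made by an earlier port in the same design cycle. If the original array has different same-cycle read/write semantics, the port order or extra bypass logic would need to account for that.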
  • 14. Guidelines
      • RTL Coding Guidelines for FPGAs
          > No XMRs, no force/release; avoid latches and clock gating
          > No initializations (constant inits result in undesired synthesis optimizations)
          > Perform FPGA RTL linting check
      • Stand-alone synthesis & verification of custom logic
          > Check for RAM utilization & reduced CLK domains
          > Mixed-mode RTL-gate simulations
      • Perform full-chip gate simulations at different stages
          > After synthesis, after partitioning, after insertion of signal multiplexing logic
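A few of the coding guidelines above (no XMRs, no force/release, no initializations, no unsupported system tasks) can be screened with a simple text scan before running a real lint tool. The following is a minimal sketch under stated assumptions: the rule set, names, and regular expressions are invented for illustration, they work line by line rather than parsing the RTL, and they do not detect latches or gated clocks.

```python
import re

# Each rule is (name, pattern). Patterns are deliberately simple and
# line-based; expect false positives (e.g. struct members flagged as XMRs).
RULES = [
    ("force/release", re.compile(r"\b(force|release)\b")),
    ("initial block", re.compile(r"\binitial\b")),
    ("system task", re.compile(r"\$(display|monitor|strobe)\b")),
    # Dotted identifier path such as tb.dut.core0.reset.
    ("possible XMR", re.compile(r"\b[A-Za-z_]\w*(\.[A-Za-z_]\w*)+")),
]


def strip_comment(line: str) -> str:
    """Drop a trailing // comment (does not handle /* */ or // inside strings)."""
    return line.split("//", 1)[0]


def lint(source: str):
    """Return a list of (line_number, rule_name) findings."""
    findings = []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        line = strip_comment(raw)
        for name, pattern in RULES:
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Hypothetical RTL snippet used only to exercise the rules.
SAMPLE = """\
module tb;
  initial begin
    force tb.dut.core0.reset = 1'b1;  // override reset
    $display("reset forced");
  end
  assign ok = a & b; // force is mentioned only in a comment
endmodule
"""

for lineno, rule in lint(SAMPLE):
    print(f"line {lineno}: {rule}")
```

A scan like this is only a first filter; the slide's recommendation to run a proper FPGA RTL lint check still applies.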
  • 15. FPGA Flow
      [Flow diagram; boxes include: RTL Model, RTL Simulation, Emulation Synthesis, Modular Parallel Synthesis, Netlist Qualification (Gate-level Simulation; verify latch and clk-gate conversions), Design Partition (FPGA partitioning, pin multiplexing), FPGA Compile, Place & Route, FPGA Platform, C-API, Design Visibility]
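The pin multiplexing step in the flow trades inter-FPGA pins for speed: when more signals cross a partition boundary than there are pins, several signals share one pin and the design clock must slow down. The sketch below shows a rough, first-order estimate. The formula (design clock roughly equal to the multiplexing clock divided by the mux ratio plus a fixed synchronization overhead), the function names, and all numbers are assumptions for illustration and are not taken from the presentation or from any vendor tool.

```python
import math


def mux_ratio(cut_signals: int, available_pins: int) -> int:
    """Signals that must share each inter-FPGA pin (at least 1)."""
    if available_pins <= 0:
        raise ValueError("available_pins must be positive")
    return max(1, math.ceil(cut_signals / available_pins))


def max_design_clock_hz(mux_clock_hz: float, ratio: int,
                        sync_overhead_cycles: int = 2) -> float:
    """First-order bound on the design clock.

    ASSUMPTION: each design cycle needs `ratio` multiplexing-clock cycles
    to transfer all signals plus a fixed synchronization overhead.
    """
    return mux_clock_hz / (ratio + sync_overhead_cycles)


# Hypothetical partition: 4800 signals cross the cut, 600 pins are usable.
ratio = mux_ratio(cut_signals=4800, available_pins=600)
clock = max_design_clock_hz(mux_clock_hz=200e6, ratio=ratio)
print(f"mux ratio {ratio}:1, design clock up to {clock / 1e6:.1f} MHz")
```

An estimate like this can help when comparing candidate partitions: a cut with fewer crossing signals lowers the mux ratio and raises the achievable design clock.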
  • 16. FPGA Prototyping Results
      • OpenSPARC T2 Model
          > 3.8M Gates, runs @8MHz
          > Being open-sourced soon – opensparc.net
      • Hardware:
          > 6M Gates
          > 2 Altera Stratix III SL340 FPGAs
      • Software:
          > RTL Partitioner, bundled FPGA tools
      • Effort:
          > 1 engineer; 3 months
      • Applications:
          > Verify Core, SOC, IO
          > Verify Firmware (HV/OBP), Solaris, Application
      [Block diagram: memory controllers, L2$ banks, crossbar, cores C1-C8 (16 KB I$, 8 KB D$, FPU, SPU), Sys I/F buffer switch core, NIU, PCIe]
  • 17. Platform improvements – to ease adoption
      • Bridge gap between Emulator and FPGA Prototyping
          > Learn from advances in the emulator space
          > Ease of model build
          > Support for RTL, SVA, TB constructs
          > Seamless RTL partitioning
          > Eliminate need for gate-simulations
      • Support for Verification infrastructure
          > XMRs, preserve net names, ports
      • Enhance Debug experience
          > Improve debug tools, offload to simulators
  • 18. Summary
      • Low cost FPGA prototyping supplements expensive emulators
      • Collaborate with vendors to implement feature-set for your use models
      • FPGA Prototyping is effort-intensive, but will pay off in cost savings & higher performance
      • Benefit:
          > Higher HW & SW coverage (fewer silicon respins)
          > Debug Bringup Tools before TO (faster bringup; productization time savings)