CPU Verification
Ramdas M
Introduction
● Verification engineer with 20+ years of experience
○ AMD, Intel, CPU startups (Montalvo, Applied Micro)
● www.verificationexcellence.in
Agenda
CPUs
CPU vs IP / SOC Verification
Architecture Verification
Microarchitecture Verification
Verification Milestones and Metrics for completeness
CPU SOCs
CPU vs Cores
Single Core vs Multi core CPUs
CPUs
[Figures: simple core pipeline; multi-core CPU]
Architecture vs Microarchitecture
● Architecture
- Describes high-level attributes/features of a system
- Hardware-software interface
- Instruction Set Architecture (ISA) for CPUs: instructions, registers, memory model, interrupts and exceptions, programming model
- e.g. x86, ARM-v8, RISC-V, MIPS, Power
● Microarchitecture
- Details of the implementation of the design/CPU
- Units/sub-units, interfaces, state machines and other data structures
- Pipelining, in-order vs out-of-order execution, parallel execution units, branch prediction, caches and coherency implementation, memory and interconnects
- e.g. instruction decode, microcode
IP, SOC and CPU SOC Verification
● IP - Intellectual Property
- A logical block / unit with defined interfaces
- e.g. an Ethernet MAC, a PCIe bridge, a USB controller, a DRAM controller
- Uses standard verification methodologies: simulation, constrained random etc.
- Verification focus is on implementation
● SOC - A System on a Chip
- A collection of logical blocks/IPs with interconnects
- May contain one or more CPUs / controllers
- Verification focus is on system-level scenarios, the interconnect and interaction between IPs
● CPU - Considered a sub-system inside an SOC
- Verification focus is on ISA compliance
Architectural Verification
● Implementation meets Architecture definition – ISA compliance
- Instructions, modes, memory management, interrupts/exceptions
● Random Instruction Sequences / Generators
- Developed primarily in a high-level language (C, Python)
- Inline assembly is sometimes used for finer control
● Verified for correctness against an Architectural Simulator
● Test suites are developed in house or available from third parties for compliance
- Large collection of tests; debug is always challenging
● Done at a CPU / Core level Test bench
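The generator-plus-simulator flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-instruction toy ISA, the register count and the function names (`generate_sequence`, `reference_model`, `compare_state`) are hypothetical, and the "actual" state is a stand-in for what an RTL simulation would produce.

```python
import random

# Hypothetical toy ISA: three-operand ALU ops on 4 registers, 32-bit wraparound.
MASK = 0xFFFFFFFF
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
}
NUM_REGS = 4


def generate_sequence(seed, length):
    """Return a reproducible random list of (op, rd, rs1, rs2) tuples."""
    rng = random.Random(seed)
    return [
        (rng.choice(sorted(OPS)), rng.randrange(NUM_REGS),
         rng.randrange(NUM_REGS), rng.randrange(NUM_REGS))
        for _ in range(length)
    ]


def reference_model(program, init_regs):
    """Architectural simulator stand-in: executes one instruction at a time."""
    regs = list(init_regs)
    for op, rd, rs1, rs2 in program:
        regs[rd] = OPS[op](regs[rs1], regs[rs2])
    return regs


def compare_state(expected, actual):
    """Return the register indices whose values mismatch."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


program = generate_sequence(seed=1, length=20)
expected = reference_model(program, init_regs=[1, 2, 3, 4])
# In a real flow 'actual' would come from the RTL simulation of the same program.
actual = list(expected)
mismatches = compare_state(expected, actual)
print("mismatching registers:", mismatches)
```

Seeding the generator makes every failing sequence reproducible, which matters when a large test collection has to be debugged.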
Micro-architecture Verification
● Focus on Implementation of CPU core
● Block-level and cluster/sub-system-level verification
- Blocks: IF, ID, LS, EX (instruction fetch, instruction decode, load/store, execute)
- Single Core Cluster
● Constrained Random Verification – SV, UVM
● Functional and Code Coverage
● Formal Verification techniques
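Constrained random stimulus and functional coverage are normally written in SystemVerilog/UVM; the idea itself can be sketched in Python. Everything here is an assumption made for illustration: the load/store transaction fields, the alignment constraint, and the (kind, size) cross-coverage bins.

```python
import random

# Hypothetical load/store transaction fields and coverage bins.
SIZES = (1, 2, 4, 8)
KINDS = ("load", "store")


def random_access(rng):
    """Draw one access; the constraint is natural alignment of the address."""
    size = rng.choice(SIZES)
    kind = rng.choice(KINDS)
    addr = rng.randrange(0, 4096 // size) * size  # always a multiple of size
    return kind, size, addr


def run_coverage(seed, max_tests):
    """Generate accesses until every (kind, size) cross bin is hit."""
    rng = random.Random(seed)
    all_bins = {(k, s) for k in KINDS for s in SIZES}
    hit = set()
    for count in range(1, max_tests + 1):
        kind, size, addr = random_access(rng)
        assert addr % size == 0  # checker for the alignment constraint
        hit.add((kind, size))
        if hit == all_bins:
            return count, len(hit) / len(all_bins)
    return max_tests, len(hit) / len(all_bins)


tests_run, coverage = run_coverage(seed=7, max_tests=1000)
print(f"{tests_run} accesses, cross coverage {coverage:.0%}")
```

The coverage model, not the test count, decides when stimulus has been sufficient: the loop stops as soon as every bin has been observed.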
Other Verification Focus Areas
● Power Management Flows
- Power states and power savings: sleep, hibernate
● DFT/DFX Verification
- How to test after manufacturing?
● Performance Verification
- IPC (instructions per cycle), single-threaded vs multi-threaded
- Benchmarks: SPECint, SPECfp, LMbench
- Memory latency and bandwidth
● System Level Verification or Emulation
- Long running Stress tests
- Software readiness before Silicon including OS boot
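As a small illustration of the IPC metric mentioned under performance verification, the sketch below computes IPC from two counters and compares it with a target. The counter values, the target and the 5% tolerance are made up for the example; `ipc` and `check_ipc` are hypothetical helper names.

```python
def ipc(instructions_retired, cycles):
    """Instructions per cycle; cycles must be positive."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions_retired / cycles


def check_ipc(measured, target, tolerance=0.05):
    """True when measured IPC is no more than 'tolerance' (a fraction) below target."""
    return measured >= target * (1.0 - tolerance)


# Made-up counter values for illustration only.
measured = ipc(instructions_retired=1_800_000, cycles=1_000_000)
print(f"IPC = {measured:.2f}, meets target: {check_ipc(measured, target=1.75)}")
```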
Regressions
● A collection of tests that is run frequently and efficiently
- Important for making sure that progress keeps moving forward
- New tests are added to the regression as they are developed
- Makes sure a bug fix does not break a verified feature
- Tests are distributed across multiple machines/computers
- A complex process with large test suites
● Architecture Verification
- Huge test suite – Several thousands of tests
● Microarchitecture Verification
- Each block might have a handy suite of constrained random tests
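A regression run can be sketched as a list of (test, seed) pairs handed to a pool of workers. In this sketch a thread pool stands in for a compute farm, and `run_test` is a hypothetical placeholder for launching a simulation; its pass/fail rule is invented so that the example is deterministic.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test(test):
    """Stand-in for launching one simulation; returns (name, seed, passed).

    The pass/fail rule is made up so that the sketch is deterministic.
    """
    name, seed = test
    passed = seed % 10 != 0  # pretend every tenth seed exposes a bug
    return name, seed, passed


def run_regression(tests, workers=4):
    """Run all tests on a worker pool and summarize the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_test, tests))
    failures = [(name, seed) for name, seed, ok in results if not ok]
    pass_rate = (len(results) - len(failures)) / len(results) if results else 0.0
    return pass_rate, failures


# Hypothetical test list: one test name run with 20 different seeds.
tests = [("alu_random", seed) for seed in range(1, 21)]
pass_rate, failures = run_regression(tests)
print(f"pass rate {pass_rate:.0%}, failing seeds: {[s for _, s in failures]}")
```

Recording the seed with every failure lets the same test be rerun after a bug fix, which is how a regression shows that a fix did not break a verified feature.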
Debugs
● Detecting failures is the first step of debug
○ Assertions, checkers/scoreboards, hangs, timeouts etc.
○ Failures are classified based on signature (type of failure)
● Root-causing a failure
○ Understanding the failure and locating the cause
○ The failure could be in verification code or in the design
○ Understanding design behavior is key to debugging and tracing back
○ Might need to isolate the failure before locating the cause
○ Debug tools: monitors, log files, waveform viewers
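Classifying failures by signature can be sketched as masking the volatile parts of a failure message (timestamps, data values) and grouping what remains. The log lines below are invented and real message formats depend on the test bench; `signature` and `bucket_failures` are hypothetical helper names.

```python
import re
from collections import defaultdict


def signature(message):
    """Reduce a failure message to a signature by masking volatile details."""
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    sig = re.sub(r"\d+", "<n>", sig)
    return sig


def bucket_failures(failures):
    """Group (test_name, message) pairs by signature."""
    buckets = defaultdict(list)
    for test_name, message in failures:
        buckets[signature(message)].append(test_name)
    return dict(buckets)


# Made-up log lines; real messages depend on the test bench.
failures = [
    ("test_a", "ASSERT fifo_overflow at time 1200"),
    ("test_b", "ASSERT fifo_overflow at time 98000"),
    ("test_c", "SCOREBOARD mismatch r3 exp=0x1f got=0x2f"),
    ("test_d", "TIMEOUT no retire for 5000 cycles"),
]
for sig, names in bucket_failures(failures).items():
    print(f"{len(names)} x {sig}: {names}")
```

Bucketing this way means one representative per signature can be debugged first, instead of opening every failing test.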
Milestones, Metrics and Checklist
● CPU / CPU-SOC development takes from one year to a few years
● Development is divided into milestones, with metrics that monitor progress
● Checklists and sign-off criteria are used to gain high confidence before tape-out
Milestones
• Rev0.0
- Design documentation, initial RTL structure
- Test bench infrastructure, build and regression flows
• Rev0.25
- Basic features verified. Test bench components and stimulus infrastructure in place.
- Initial test plans reviewed, random regressions for verified features
• Rev0.50 to Rev0.75
- More features coded and verified. More random regressions and bug hunting
- Bug rates and pass rates are monitored
Milestones (continued)
• Rev1.0
- All features coded and stimulus in place. Functional coverage monitors coded
- Regressions and bug hunting. Coverage collection and analysis started
• Freeze
- Pass rates and functional coverage at 100% with known exceptions. All known/critical bugs fixed.
- Incoming bug rate very low, with no open risks.
• Tapeout
Metrics
Why Metrics?
- Design verification rarely completes on schedule.
- There is always more to verify in less time.
- Be paranoid: there is always a bug hiding in there.
Metrics for Completion
● Test plan completion
● Test plan reviews and Actions
● Tests, stimulus, checkers, assertions and functional coverage coding completion
● Simulation-based metrics
● Regression pass rates
● Bug trends: incoming bug rates, open bugs, historical trends
● Functional Coverage
● Code Coverage
● Code (RTL/DV) Stability
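The freeze criteria and completion metrics can be turned into an automated checklist. The sketch below is one possible shape under stated assumptions: the `Metrics` fields, the way "known exceptions" are modeled as waivers, and the incoming-bug threshold are all hypothetical and would be set by the project.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Snapshot of completion metrics; field names are illustrative."""
    unwaived_failures: int        # regression failures not covered by a known exception
    unwaived_coverage_holes: int  # functional coverage bins not hit and not waived
    open_critical_bugs: int
    incoming_bugs_per_week: float


def freeze_blockers(m, max_incoming_per_week=1.0):
    """Return the reasons the design is not ready to freeze (empty list = ready)."""
    checks = [
        (m.unwaived_failures == 0, "regression failures without a waiver"),
        (m.unwaived_coverage_holes == 0, "coverage holes without a waiver"),
        (m.open_critical_bugs == 0, "open critical bugs"),
        (m.incoming_bugs_per_week <= max_incoming_per_week,
         "incoming bug rate too high"),
    ]
    return [reason for ok, reason in checks if not ok]


# Made-up snapshot for illustration only.
snapshot = Metrics(unwaived_failures=0, unwaived_coverage_holes=3,
                   open_critical_bugs=0, incoming_bugs_per_week=0.5)
print("blockers:", freeze_blockers(snapshot))
```

Returning the list of blockers rather than a single yes/no keeps the sign-off review focused on what is still missing.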