Verification Strategy for PCI-Express


Published in: Technology


  1. Verification Strategy for PCI-Express
     Presenter: Pradip Thaker
     July 4th, 2008
  2. Outline
     • PCI-Express Protocol Overview
     • Verification Paradigm
     • Design-for-Verification (Well-aligned implementation and verification architectures)
       • A key ingredient for a timely verification closure
  3. PCI to PCI Express
     • Limitations of PCI
       • Not enough bandwidth
         • 32-bit/33 MHz (132 MB/s)
         • 64-bit/66 MHz (528 MB/s)
         • Shared bus bandwidth
       • No support for Isochronous applications (TDM or Synchronous Traffic application)
       • Cost of hardware for parallel busses
     • Evolution Path
       • Growing faster is the only possibility (not wider)
       • Point-to-point communication (Shared bus connectivity impossible above 100/150 MHz)
       • CDR architecture (Speed limitation of a synchronous bus above a few hundred MHz)
       • Backward compatibility – a must
     • Fast forward to future – PCI Express (PCIe)
       • Packet-level data-units over high-speed SERDES based connectivity
       • Layered architecture – much like networking protocols
         • Mechanical, Physical, Data-link, Transaction, Software and System Layers
       • Compatible with existing PCI software infrastructure
       • Weird wedding of two distinct architectural and business practices – Networking and Computer
       • Creation of nightmarish scenario for chip verification (Details on later slides)
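The PCI peak-bandwidth figures on this slide follow from bus width times clock rate. A minimal sketch of that arithmetic, assuming one transfer per clock and the nominal 33/66 MHz clocks quoted on the slide (the function name is illustrative, not from the deck):

```python
def parallel_bus_peak_mb_per_s(width_bits: int, clock_mhz: int) -> int:
    """Peak bandwidth in MB/s of a parallel bus moving one word per clock."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_mhz


print(parallel_bus_peak_mb_per_s(32, 33))  # 132
print(parallel_bus_peak_mb_per_s(64, 66))  # 528
```

Because the bus is shared, every device on it divides this peak, which is the motivation for the point-to-point links described next.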
  4. PCI-Express Protocol Overview - Terminology
     • Dual Simplex – a related set of two differential pairs (Tx and Rx)
     • Lane – “Dual Simplex” when PCI-Express compliant
     • Port – A group of Txs and Rxs within a single device that represent a single connection to PCI-Express fabric
     • Link – Two ports and the collection of lanes that interconnect them
     • x1, x4, x8, xN – Number of lanes within a port or a link
     • Upstream – Flow of traffic towards the CPU or a port that establishes link in that direction within the hierarchy
     • Downstream – Flow of traffic away from the CPU or a port that establishes a link in that direction within the hierarchy
     • Ingress Port – the portion of a PCIe port that receives the incoming traffic
     • Egress Port – the portion of a PCIe port that transmits outgoing traffic
     • Root Complex – The combination of a PCIe host bridge and one or more downstream ports
     • Endpoint – A device that terminates a path within the hierarchy
     • Bridge – A device that physically and electrically connects PCIe to another protocol
     • Switch – A device that provides a physical connection between two or more PCIe ports
  5. PCI-Express Hierarchy
  6. PCI-Express Protocol Overview: Physical Layer
     • Logical Functions
       • 8B/10B Encoding and Decoding
       • Scrambling
       • Reset, initialization, multi-lane de-skew
       • Lane mapping
       • Adjustments of bit-transmission order for various throughput options (x1 through x32)
       • Logical idle behavior and transition to active state as per protocol
       • TLP and DLLP transmission and reception: Insertion and Processing of Special Symbols per protocol conditions
       • Link initialization (recovery from link errors, transition from low power states)
       • Link negotiations
         • Width
         • Data-rate
         • Lane reversal
         • Polarity inversion
       • Link synchronization
         • Bit-wise per lane
         • Symbol-wise per lane
         • Lane-to-lane de-skew
       • Ordered (TS and Skip) set handling and processing
       • Fast training sequence
       • Link power management
       • Delay insertions as per protocol
       • … more that could not fit here
     • Electrical Functions
       • Link within 600 ppm at all times
       • Spread spectrum clocking
       • AC coupling
       • Interconnect parasitic capacitance adherence
       • Receiver DC common mode voltage of 0 V
       • Transmitter DC common mode established during “Detect”
       • Receiver Detect under various scenarios
       • Total jitter
       • Maximum loss budget
       • De-emphasis
       • Maximum BER
       • Beacon
       • … more that could not fit here
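To make the "Scrambling" item concrete, here is a rough sketch of additive scrambling: data bytes are XORed with a keystream from a 16-bit LFSR, and running the same operation again with the same seed restores the data. The tap positions (16, 5, 4, 3) and the 0xFFFF seed are assumptions chosen to mirror the polynomial commonly cited for 2.5 GT/s links; the shift structure, bit ordering, and the rules for which symbols bypass or reset the scrambler are simplified away, so this is not a spec-accurate model.

```python
def lfsr_bits(seed=0xFFFF, taps=(16, 5, 4, 3)):
    """Yield keystream bits from a 16-bit LFSR (illustrative structure)."""
    state = seed & 0xFFFF
    while True:
        yield state & 1
        feedback = 0
        for tap in taps:
            # Tap positions are 1-based bit indices into the register.
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & 0xFFFF


def scramble(data: bytes, seed=0xFFFF) -> bytes:
    """XOR each byte with 8 keystream bits; the same call descrambles."""
    bits = lfsr_bits(seed)
    out = bytearray()
    for byte in data:
        key = 0
        for i in range(8):
            key |= next(bits) << i
        out.append(byte ^ key)
    return bytes(out)


payload = bytes(range(16))
assert scramble(scramble(payload)) == payload
```

A verification environment needs the same property on both sides of the link: the checker must track the scrambler state in lock-step with the DUT, or every later symbol miscompares.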
  7. PCI-Express Protocol Overview: Data-link Layer
     • Link management
       • DL_UP, DL_Down, DL_Inactive, DL_Active, DL_Init state transitions
       • Slot power limit handling
       • Propagation of link-reset downstream
     • Point-to-point reliable data exchange
       • Error detection, re-try as well as Error Logging and Reporting
       • Power Management message decoding, state transitions for activation and de-activation
       • TLP sequence number generation and tracking
       • LCRC computation and decoding
       • DLLP integrity encoding and decoding
       • ACK/NAK generation and processing
       • ACK time-out notification and handling
       • Flow control computation, tracking and processing – Credit based flow-control
       • Data poisoning
       • Completion Time-out
       • Re-transmission of packets
       • Packet storage for re-try/replay
     • DLLP generation, processing and actuation based on current status
       • ACK DLLP
       • NAK DLLP
       • InitFC1
       • InitFC2
       • UpdateFC
       • Power Management
       • Vendor specific
     • Cut-through routing
     • TLP/DLLP ordering permutations per protocol
     • TLP integrity check insertion and processing
     • ACK/NAK latency timer rules processing a limit-triggered response
     • … more that could not fit here
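The sequence-number, ACK/NAK and replay items above interact as one mechanism. The following is a simplified sketch of the transmitter side, assuming 12-bit sequence numbers and fewer than 4096 TLPs outstanding; the class and method names are hypothetical, and the replay timer, replay count and link-retrain behavior are left out.

```python
from collections import OrderedDict

SEQ_MODULO = 4096  # 12-bit TLP sequence numbers wrap at 4096


class ReplayBuffer:
    """Holds transmitted TLPs until the link partner acknowledges them."""

    def __init__(self):
        self.next_seq = 0
        self.pending = OrderedDict()  # seq -> TLP bytes, oldest first

    def transmit(self, tlp: bytes) -> int:
        """Assign the next sequence number and keep a copy for replay."""
        seq = self.next_seq
        self.pending[seq] = tlp
        self.next_seq = (seq + 1) % SEQ_MODULO
        return seq

    def ack(self, acked_seq: int) -> None:
        """Purge every stored TLP up to and including acked_seq."""
        if acked_seq not in self.pending:
            return  # stale or duplicate ACK: nothing to purge
        while self.pending:
            seq, _ = self.pending.popitem(last=False)
            if seq == acked_seq:
                break

    def nak(self, last_good_seq: int) -> list:
        """Purge through last_good_seq, then return the TLPs to replay in order."""
        self.ack(last_good_seq)
        return list(self.pending.items())
```

For example, after transmitting sequence numbers 0 to 3, `ack(1)` leaves 2 and 3 stored, and a following `nak(2)` purges 2 and returns only TLP 3 for replay.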
  8. PCI-Express Protocol Overview: Transaction Layer
     • Flow control management
       • TL manages, DL executes
       • Point-to-point, not end-to-end
       • Independent for each VC ID
       • Mechanism presumes “Ideal” conditions
       • Credit types – PH, PD, NPH, NPD, CPLH, CPLD
     • Data transactions
       • TLP storage and processing for transmission or consumption
       • TLP generation: Header, Payload and Digest
       • TLP generation and handling of various lengths (4 Bytes to 4096 Bytes)
     • Transaction types
       • Memory (32-bit and 64-bit addressing)
       • I/O
       • Configuration
       • Message
         • INTx
         • PME
         • ERR
         • Unlock
         • Slot Power
         • Hot Plug
         • Vendor-defined
     • Transaction Completion
       • Reads and non-posted writes
       • Completion routing is by ID
       • Provide completion status
     • Transaction Ordering
     • Routing rules
     • Arbitration
       • Port arbitration
       • VC arbitration
     • Virtual channels
     • Traffic classes
     • Locked transactions support
     • Isochronous support
     • Advanced error processing and reporting
     • … more that could not fit here
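As a rough illustration of credit-based flow control with the six credit types listed above, the sketch below gates transmission on header and data credits. It assumes one header credit per TLP and one data credit per 16 bytes of payload; the class name, the cumulative-limit bookkeeping and the `update_fc` signature are illustrative simplifications (no modular counters, no infinite credits, a single VC).

```python
import math
from dataclasses import dataclass, field

DATA_CREDIT_BYTES = 16  # assumed size of one data credit (4 DW)


@dataclass
class CreditGate:
    """Transmit-side credit check for one virtual channel."""
    limits: dict                       # cumulative credits advertised, e.g. {"PH": 2}
    consumed: dict = field(default_factory=dict)

    def credits_needed(self, kind: str, payload_bytes: int) -> dict:
        """kind is "P", "NP" or "CPL"; returns header and data credit needs."""
        return {
            kind + "H": 1,
            kind + "D": math.ceil(payload_bytes / DATA_CREDIT_BYTES),
        }

    def try_send(self, kind: str, payload_bytes: int) -> bool:
        """Consume credits and return True only if every needed type has room."""
        need = self.credits_needed(kind, payload_bytes)
        for ctype, n in need.items():
            if self.consumed.get(ctype, 0) + n > self.limits.get(ctype, 0):
                return False  # blocked until an UpdateFC raises the limit
        for ctype, n in need.items():
            self.consumed[ctype] = self.consumed.get(ctype, 0) + n
        return True

    def update_fc(self, ctype: str, new_limit: int) -> None:
        """Model an UpdateFC DLLP carrying a new cumulative credit limit."""
        self.limits[ctype] = new_limit
```

With `CreditGate({"PH": 2, "PD": 4})`, a 64-byte posted write consumes all four data credits, so a second write is blocked until `update_fc("PD", 8)` arrives. This is the point-to-point behavior the slide describes: the gate knows only what its link partner advertised.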
  9. PCI-Express Protocol Overview: Summary
     • Open standard containing over 500 pages
     • Many more pages of supporting literature
     • Each line of each page in the standards document is a cryptic edict dictating a specific behavior for each condition, not a detailed explanation of behavior or implementation
     • Much room for misinterpretation of protocol details, resulting in malfunction or non-compliance
     • Hundreds of configuration bits, each controlling a complex behavior within the chip, with strict adherence to the standard's dictates to guarantee backward software compatibility
     • No wiggle room to claim a bug as a feature!!!
  10. Verification Paradigm
     • Chips based on Open-Standard – Pressure Points
       • Technology/Feature differentiator – Marginal or Non-existent
       • Commodity product – Power, Performance and Price
       • Time-to-market – Very Critical
         • First product – To Establish Credible Presence
         • Subsequent products with various flavors – To Capture Market Share
           • Bridges: PCI-to-PCIe, SATA-to-PCIe, 1394-to-PCIe, USB-to-PCIe etc.
           • Switches: 4-port x1 throughput, 4-port x4 throughput, 8-port x4 throughput, etc.
           • Root Complex: x1 throughput, x4 throughput, etc.
       • Quality of First Silicon – Critical
     • Verification Plays A Major Role in Success of Chips based on Open-Standard
       • Addresses Two Key Aspects: TTM and Quality of Silicon
     • Verification Execution: Focal Points
       • Functionality
       • Performance
       • Interoperability (Compliance and Compatibility)
     • Verification Platform Architecture and Methodology: Focal Points
       • Re-usability
       • Scalability (Modularity)
       • Comprehensiveness (with leveraging of automation)
  11. Verification Strategy: A Broader Definition
     • Verification – A vehicle to deliver chips with “Zero Bugs(!)”, Compliance and Superior performance
     • Performance Modeling (C/C++/SystemC)
       • Architecture and Micro-architecture of Key Data and Control Paths
     • RTL Verification
     • FPGA-based Emulation
       • Compliance and Compatibility testing
       • PCI-SIG certification to be on Integrator’s List
       • Performance verification
     • 3rd party Compliance Checkers and Vectors
     • Mixed-signal Simulations
  12. Functional Verification: Four Pillars
     • Coverage-driven constrained-random testing with reference models (HVLs)
       • Reference Model (RFM)
       • Temporal Checkers
       • Protocol Monitors
       • Sequence Generators
       • Constraints
       • Functional Coverage
       • Test-plan
     • Assertion-based verification for key building blocks
       • Detects design errors at the source – increases observability and decreases debug-time
       • Can identify subtle bugs that may be hard to reach with SBV
       • Black-box assertions – Protocol oriented
       • Effective for size/complexity to an extent (memory-size and run-time limitations)
       • Suitable for block-level deployment rather than as an end-to-end chip-level stand-alone verification method
       • Complex properties are verified through bounded-proof (neither proven nor falsified)
       • Effective for verification of control-path oriented logic (state space exploration rather than data-path logic)
       • Assertions written by an engineer other than the designer can help detect specification (interpretation) errors
     • Asynchronous clock-domain simulations
     • Power-domain simulations – Power Management Compliance Check-list
       • Improper Buffer Insertion, Missing Level Shifters, Missing Power Good, Power Sequencing Tests
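The first pillar combines three pieces: a constrained sequence generator, functional coverage bins, and a loop that keeps generating until the bins are hit. The sketch below shows that loop in plain Python rather than an HVL; the TLP kinds, the length bins and the "configuration accesses are one DW" constraint are illustrative assumptions, not the deck's actual test plan.

```python
import random

KINDS = ("MemRd", "MemWr", "CfgRd", "CfgWr")
# Length bins in DW; 1024 DW corresponds to the 4096-byte maximum.
LENGTH_BINS = {"min": (1, 1), "small": (2, 16), "large": (17, 1023), "max": (1024, 1024)}


def length_bin(length_dw: int) -> str:
    """Map a length to its coverage bin name."""
    for name, (lo, hi) in LENGTH_BINS.items():
        if lo <= length_dw <= hi:
            return name
    raise ValueError(length_dw)


def random_tlp(rng: random.Random):
    """Draw one (kind, length) pair subject to simple constraints."""
    kind = rng.choice(KINDS)
    if kind.startswith("Cfg"):
        length_dw = 1  # constraint: configuration accesses are a single DW
    else:
        # Constraint: pick a bin first so corner cases are not starved.
        lo, hi = LENGTH_BINS[rng.choice(list(LENGTH_BINS))]
        length_dw = rng.randint(lo, hi)
    return kind, length_dw


def legal_bins() -> set:
    """All (kind, length-bin) crosses the constraints allow."""
    bins = set()
    for kind in KINDS:
        names = ["min"] if kind.startswith("Cfg") else list(LENGTH_BINS)
        bins.update((kind, name) for name in names)
    return bins


def run_until_covered(seed: int, max_tests: int = 10_000):
    """Generate stimulus until every legal bin is hit or the budget runs out."""
    rng = random.Random(seed)
    goal, hit = legal_bins(), set()
    for n in range(1, max_tests + 1):
        kind, length_dw = random_tlp(rng)
        hit.add((kind, length_bin(length_dw)))
        if hit == goal:
            return n, hit
    return max_tests, hit
```

Excluding illegal crosses from the goal matters: a coverage target that includes unreachable bins can never close, which defeats the point of using coverage as the completion criterion.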
  13. Functional Verification: CDV (Re-usability and Scalability)
  14. Functional Verification: Golden Rules for RFM
     • Reference Model shall be independent of the DUT implementation
       • Reference Model to be created by an engineer other than the designer of the block
       • Reference Model is created in a high-level language and hence does not have any low-level mechanics analogous to the RTL implementation to realize functionality
     • Reference Model shall support co-simulation with the DUT in order to predict and verify run-time behavior
     • Reference Model for each block shall be created such that it can be integrated into the chip-level verification environment seamlessly
     • Hybrid Modeling
       • Control paths: Cycle-accurate modeling
       • Data paths: Packet-accurate or Data-unit-accurate modeling
       • A fully cycle-accurate model is a maintenance nightmare as well as a cumbersome task without significant value-add to verification quality
     • Comprehensiveness (with leveraging of automation)
       • CDV is only as powerful as the comprehensiveness of the automated checking features of the reference model and monitors
       • Can run millions of RTG cycles with a comprehensive reference model and monitors without much manual overhead
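A packet-accurate reference model used in co-simulation can be sketched as a scoreboard: an input monitor feeds the model, the model predicts the expected output packets, and an output monitor compares content and order without regard to cycle timing. Everything here is a hypothetical stand-in: the `Scoreboard` class, the dict-based packets, and the example model (a block that forwards TLPs and discards those flagged with a bad LCRC).

```python
from collections import deque


def forward_good_lcrc(packet: dict) -> list:
    """Example reference model: forward a packet unless its LCRC is bad."""
    return [] if packet["bad_lcrc"] else [packet]


class Scoreboard:
    """Compares DUT output against reference-model predictions, packet by packet."""

    def __init__(self, reference_model):
        self.reference_model = reference_model
        self.expected = deque()
        self.mismatches = []

    def on_input(self, packet: dict) -> None:
        """Called by the input monitor: predict what the DUT should emit."""
        self.expected.extend(self.reference_model(packet))

    def on_output(self, packet: dict) -> None:
        """Called by the output monitor: check content and order, not timing."""
        if not self.expected:
            self.mismatches.append(("unexpected", packet))
            return
        want = self.expected.popleft()
        if want != packet:
            self.mismatches.append((want, packet))

    def passed(self) -> bool:
        """True when nothing miscompared and no prediction is still outstanding."""
        return not self.mismatches and not self.expected
```

Because the model is just a function from input packets to expected output packets, the same scoreboard can wrap a block-level model or a chain of models at chip level, which is the integration property the golden rules ask for.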
  15. Performance Verification
     • Performance Parameters (to be supported with variable-sized packets across mixed traffic types, across all traffic patterns, mixed VCs and mixed packet sizes)
       • Aggregate Throughput
       • Latency (to be balanced against power dissipation)
       • Jitter in Latency
       • Availability/Blocking – Internal back-pressure
       • N+1 Performance limitation (small TLPs back-to-back)
       • Flow-control credits
       • Load distribution and balancing (peer-to-peer as well as vertical traffic flows with a mix of traffic types, VCs and packet sizes)
       • Link utilization – No bubbles within or between TLPs (really challenging for cut-through mode)
       • Zero tolerance for packet loss
       • Zero tolerance for wrong packet routing
       • 20% overhead lost in 8B/10B coding
       • Small TLPs with header as well as DL layer overhead impacting transaction layer efficiency even with 100% link utilization
       • Traffic-aware flow-control credit updates (large and small TLPs)
     • Performance Modeling (C/C++/SystemC)
       • Architecture and Micro-architecture of Key Data and Control Paths
     • FPGA-based Emulation
     • RTL Verification – Not an adequate method for performance testing for PCIe development
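The 8B/10B and small-TLP overhead points can be quantified with a short calculation. This sketch assumes a 2.5 GT/s lane and a per-TLP overhead of framing (2 bytes), sequence number (2 bytes), a 12- or 16-byte header, LCRC (4 bytes) and an optional 4-byte ECRC; DLLP traffic and skip ordered sets are ignored, so real efficiency is somewhat lower.

```python
def lane_payload_rate_mb_s(line_rate_gt_s: float = 2.5) -> float:
    """Bytes per second one lane carries after 8B/10B coding, in MB/s."""
    raw_mbit_s = line_rate_gt_s * 1000
    return raw_mbit_s * 8 / 10 / 8  # 8 data bits per 10 line bits, 8 bits per byte


def tlp_efficiency(payload_bytes: int, header_bytes: int = 12, ecrc: bool = False) -> float:
    """Fraction of link bytes that are payload for back-to-back TLPs."""
    framing, seq_num, lcrc = 2, 2, 4
    overhead = framing + seq_num + header_bytes + lcrc + (4 if ecrc else 0)
    return payload_bytes / (payload_bytes + overhead)


print(lane_payload_rate_mb_s())        # 250.0
print(round(tlp_efficiency(4), 3))     # 0.167
print(round(tlp_efficiency(64), 3))    # 0.762
print(round(tlp_efficiency(4096), 3))  # 0.995
```

Under these assumptions a stream of 4-byte TLPs delivers only about a sixth of the post-coding bandwidth even with no bubbles on the link, which is why mixed packet sizes appear among the performance parameters.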
  16. Compliance Verification
     • Electrical Compliance Check-list
       • Signal Quality Analysis
       • Eye pattern, jitter and BER analysis
       • Signaling for upstream and downstream
       • Jitter Analysis DLL
       • Clock recovery
       • Interpolation
       • Transition/non-transition eye points
     • Data-Link Layer Compliance Check-list
       • Reserved Fields testing
       • NAK Response
       • Replay Timer
       • Replay Count
       • Link Retrain
       • Replay TLP Order
       • Bad CRC
       • Undefined Packet
       • Bad Sequence Number
       • Duplicate TLP
     • Transaction Layer Compliance Check-list
       • Completion request, completion time-out, read-data
       • Messaging – Legacy interrupts, Native power management, Hot-plug, Error Signaling
       • Flow Control – Initialization, Transmit and Receive States, Negotiated Link Width
       • Virtual Channel
     • System Architecture/Platform-configuration Check-list
       • Capability registers testing
       • Default values
       • Stress test
       • Slot reporting
       • Hot plug event reporting
  17. Compliance Verification
     • Separate compliance check-list with some overlap for RC, Endpoints and Switches
     • Integrated PHY in the silicon
     • FPGA platforms with discrete PHY and digital logic
     • FPGA-based emulation (Native or 3rd Party)
       • Compliance testing with Agilent PTC and PCI-SIG Golden Suite
       • Compatibility testing with over 80% of the systems during PlugFest
       • PCI-SIG certification to be on Integrator’s List
     • Native protocol checkers – static and temporal
     • 3rd party Compliance Checkers and Vectors
       • Synopsys, Denali, nSys and others
  18. Design-for-Verification
     • Cafeteria Architecture: Modular and Scalable
       • For rapid deployment of various flavors of bridges and switches based on the flagship platform part
       • Speed of capturing market share is as critical as first product deployment to establish credible presence
     • Modular architecture to enable thorough block-level or sub-system level simulations
       • Functional partitioning to reduce scope of chip-level verification effort and complexity
       • Push vs. Pull Inter-block Data-threads
       • Distributed vs. Centralized Control Processing
     • Standardized block interface
       • Reduce scope of “Error of Specification” and “Error of Omission”
       • Promote verification component re-use (BFMs, Sequences, etc.)
       • Minimum number as well as flavors of physical interconnects between blocks (may use in-band signaling where applicable)
     • Emphasis on correct-by-construction practices during design-creation phase
       • Otherwise the TTM window will be missed due to prolonged verification or multiple re-spins (PCIe is unforgiving of bugs that hamper compliance or compatibility)
  19. Thank You!