Emulation Error Recovery
Paper presented at ASQED 2009, KL, Malaysia

Transcript

  • 1. Automatic Error Recovery in Targetless Logic Emulation. Somnath Banerjee, Tushar Gupta, Mentor Graphics Pvt. Ltd. (India). Presented at ASQED 09, KL, Malaysia
  • 2. Agenda
    • Logic emulation overview.
    • Targetless emulation use models.
    • Emulation errors and recovery.
    • Proposed system.
    • Checkpoint and checkpoint server.
    • Experimental results.
    • Conclusion.
  • 3. Logic Emulation
    • Mostly FPGA based.
    • Hardware assisted functional verification.
    • Maps the design on HW (FPGA + memory banks).
    • Much faster than software simulators.
    • Typical speed is 1 MHz.
  • 4. Logic Emulation Components (diagram: HDL Files, SW Compiler, Emulation HW, Emulation Kernel, Waveform and Debug, Testbench)
  • 5. Targetless Emulation
    • No external hardware to send stimuli.
    • Software testbench.
    • Standalone emulation – vector file stimuli.
    • Co-simulation – stimuli from SW simulator.
    • Advantage is ease of use – easy to edit testbench.
  • 6. Standalone Emulation (diagram: DUT, Vector File, Batch Script, Loader, Pattern Memory)
  • 7. Co-simulation (diagram: DUT, Batch Script, Transactor, SW Simulator, Loader)
  • 8. Typical Usage of Targetless Emulation
    • Long running verification jobs.
    • Massive sets of stimuli.
    • Automated setup to run multiple tests.
    • Nightly runs on farm setups.
    • Uses automatic job scheduler like LSF.
    • No manual intervention.
  • 9. User Operations during Emulation
    • Load stimuli.
    • Set/Get flop/latch.
    • Force/release flop/latch.
    • Download/upload memory.
    • Set trigger on events.
    • Set source line breakpoint.
    • Run design clock.
  • 10. Emulation Batch Script
    • Contains sequence of user operations
    • Example:
    • loadStimuli
    • registerSet top.f1 1
    • memoryDnld top.mem1 data.dat
    • runClock 1000000
    • Value = registerGet top.f2
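A minimal sketch of how a batch script like the one on slide 10 could be split into commands. The tuple layout and the function name `parse_batch_line` are assumptions made for illustration; the slides do not describe the real script parser.

```python
import shlex


def parse_batch_line(line):
    """Split one batch-script line into (result_var, command, args)."""
    tokens = shlex.split(line)
    result_var = None
    # A line such as "Value = registerGet top.f2" captures a result.
    if len(tokens) >= 3 and tokens[1] == "=":
        result_var, tokens = tokens[0], tokens[2:]
    return result_var, tokens[0], tokens[1:]


# The example script from the slide, one operation per line.
script = """loadStimuli
registerSet top.f1 1
memoryDnld top.mem1 data.dat
runClock 1000000
Value = registerGet top.f2"""

commands = [parse_batch_line(l) for l in script.splitlines() if l.strip()]
```

Each entry in `commands` can then be costed, grouped into blocks, or dispatched to the emulator one at a time.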
  • 11. Logical Command Block (LCB)
    • Defines a block of operations.
    • May be to verify a section of DUT.
    • Batch script is a collection of LCBs.
    • One error corrupts the corresponding LCB.
  • 12. Error Recovery in Targetless Emulation
    • Avoid halting of emulation run.
    • Recover from error and generate report.
    • Roll back the system to a last stable state.
  • 13. Typical Emulation Errors
    • Set 1 on a flop with clear pin at logic 1.
    • Set 0 on a flop with preset pin at logic 1.
    • Set 0 on a flop/latch forced to 1.
    • Setting trigger on unsupported signal.
    • An expected trigger never matured.
    • Set source line breakpoint on a non-breakable line.
  • 14. Proposed Error Recovery System
    • Save checkpoint at the beginning of each LCB.
    • In case of error, roll back the system to last saved checkpoint.
    • Skip failing LCB and resume emulation.
    • Generate report.
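The four steps on slide 14 can be sketched as a small loop. This is an illustrative model only: the system state is a plain dictionary, `EmulationError` and `toy_execute` are hypothetical stand-ins for the emulator, and a deep copy stands in for the real checkpoint.

```python
import copy


class EmulationError(Exception):
    """Hypothetical error type raised by a failing user operation."""


def run_with_recovery(lcbs, state, execute):
    """Run each LCB; on error roll back to its checkpoint and skip it."""
    report = []
    for index, lcb in enumerate(lcbs):
        checkpoint = copy.deepcopy(state)  # save at the start of the LCB
        try:
            for command in lcb:
                execute(state, command)
            report.append((index, "ok"))
        except EmulationError as err:
            state.clear()
            state.update(checkpoint)  # roll back to the last checkpoint
            report.append((index, f"skipped: {err}"))
    return report


def toy_execute(state, command):
    """Toy executor: ("set", name, value) writes state, ("fail",) errors."""
    if command[0] == "fail":
        raise EmulationError("illegal set")
    _, name, value = command
    state[name] = value


state = {}
lcbs = [[("set", "f1", 1)], [("set", "f2", 1), ("fail",)], [("set", "f3", 0)]]
report = run_with_recovery(lcbs, state, toy_execute)
```

The second block fails after writing `f2`, so that write is undone and the run continues with the third block.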
  • 15. Batch Script Instrumentation
    • User marks start/end of LCBs.
    • For commands outside any user-marked LCB, automatic LCBs are inserted.
    • A new automatic LCB starts once the accumulated command cost reaches the cost limit.
    • Cost limit can be set by user.
    • Compiler dumps a command cost DB.
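One way to read the automatic insertion rule is sketched below: commands are accumulated until their total cost reaches the limit, then the block is closed. The function name and the exact boundary rule (close the block on reaching the limit) are assumptions; the slides only say that a new LCB follows commands worth a cost limit.

```python
def insert_automatic_lcbs(commands_with_cost, cost_limit):
    """Group (command, cost) pairs into LCBs of roughly cost_limit units."""
    lcbs, current, accumulated = [], [], 0
    for command, cost in commands_with_cost:
        current.append(command)
        accumulated += cost
        if accumulated >= cost_limit:  # limit reached: close this LCB
            lcbs.append(current)
            current, accumulated = [], 0
    if current:  # trailing commands form a final, shorter LCB
        lcbs.append(current)
    return lcbs


# Hypothetical command names with illustrative costs.
example = [("a", 100), ("b", 1_000_000), ("c", 300),
           ("d", 500_000), ("e", 600_000)]
blocks = insert_automatic_lcbs(example, cost_limit=1_000_000)
```

A smaller cost limit gives finer-grained recovery at the price of more frequent checkpoints.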
  • 16. Command Cost DB
    • Unit cost is 1 cycle of clock run.
    • User operation’s cost in terms of unit cost.
    • Typical costs:
    • registerSet – 100 units/bit.
    • registerForce – 300 units/bit.
    • memoryDnld – 1M units/MB.
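As a worked example of the cost model, the sketch below stores the typical costs from slide 16 in a table and totals the cost of three operations. The table layout, the `runClock` entry (derived from the unit-cost definition), and the 2 MB download size are assumptions for illustration.

```python
# Cost per unit of work, in clock-cycle units (values from the slide).
COST_DB = {
    "registerSet": ("bit", 100),
    "registerForce": ("bit", 300),
    "memoryDnld": ("MB", 1_000_000),
    "runClock": ("cycle", 1),  # unit cost: one cycle of clock run
}


def command_cost(name, quantity):
    """Cost of one command given how many bits, MB, or cycles it touches."""
    _unit, cost_per_unit = COST_DB[name]
    return quantity * cost_per_unit


# Set a 1-bit flop, download an assumed 2 MB image, run 1,000,000 cycles.
total = (command_cost("registerSet", 1)
         + command_cost("memoryDnld", 2)
         + command_cost("runClock", 1_000_000))
```

Under these assumptions the three operations total 3,000,100 units, dominated by the memory download and the clock run.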
  • 17. Error Handling Function
    • Called in case an error occurs.
    • User customizable.
    • Example:
    • void errorHandler()
    • {
    • log_error();
    • restore_last_checkpoint();
    • put_file_ptr_after_failing_lcb_in_batch_script();
    • resume_execution();
    • }
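The step that repositions the script after the failing LCB can be sketched as a forward scan for the block's end marker. The marker names `lcbBegin` and `lcbEnd` are hypothetical; the slides do not show the actual syntax users write to mark LCB boundaries.

```python
def line_after_failing_lcb(script_lines, failing_line, end_marker="lcbEnd"):
    """Return the index of the first line after the LCB containing failing_line."""
    for index in range(failing_line, len(script_lines)):
        if script_lines[index].strip() == end_marker:
            return index + 1
    return len(script_lines)  # no end marker found: nothing left to run


script_lines = [
    "lcbBegin",
    "registerSet top.f1 1",  # assume this command fails
    "runClock 1000",
    "lcbEnd",
    "lcbBegin",
    "runClock 2000",
    "lcbEnd",
]
resume_at = line_after_failing_lcb(script_lines, failing_line=1)
```

Execution would then resume at `script_lines[resume_at]`, the start of the next block.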
  • 18. System Checkpoint
    • Data representing system state.
    • Online or offline.
    • SW simulators provide save/restore feature.
    • Contains following:
    • State element values
    • Memory contents
    • Some software states + testbench state
  • 19. Online Checkpoint
    • Online – data resides in RAM.
    • Data is stored in compressed buffers.
    • Consumes ~100 MB for an FPGA with a capacity of 9-10M gates.
    • Faster checkpoint save/restore.
    • Can support designs of up to ~256M gates on 32-bit systems.
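A rough plausibility check of the 32-bit limit, assuming checkpoint size scales linearly with gate count and that only one checkpoint is held in RAM. The slides do not show how the ~256M gate figure was derived, so this is only an estimate under those assumptions.

```python
MB_PER_FPGA = 100              # from the slide: ~100 MB per FPGA
GATES_PER_FPGA = 10_000_000    # upper end of the 9-10M gate range
ADDRESS_SPACE_MB = 4096        # 2**32 bytes addressable on a 32-bit system


def checkpoint_mb(design_gates):
    """Estimated in-RAM checkpoint size in MB, assuming linear scaling."""
    return design_gates * MB_PER_FPGA // GATES_PER_FPGA


estimate = checkpoint_mb(256_000_000)
fits_in_32_bit = estimate < ADDRESS_SPACE_MB
```

Under these assumptions a 256M gate design needs about 2,560 MB, which stays below the 4,096 MB address space while leaving room for the process itself.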
  • 20. Offline Checkpoint
    • Data resides on-disk.
    • More reliability.
    • Slower checkpoint save/restore.
    • Can support design of any size.
  • 21. Save/Restore Techniques
    • Firmware provided functions.
    • Function to save/restore all states in an FPGA.
    • Detect modified memories since last checkpoint.
    • Save only modified memories.
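The idea of saving only modified memories can be illustrated with a small in-memory model. Here a SHA-256 digest is used to detect changes and `zlib` to compress the saved copy; both are stand-ins chosen for this sketch, since the slides attribute save/restore to firmware-provided functions and do not say how modification is detected.

```python
import hashlib
import zlib


class IncrementalCheckpoint:
    """Keep compressed copies of memories, re-saving only changed ones."""

    def __init__(self):
        self.digests = {}  # memory name -> digest of last saved content
        self.saved = {}    # memory name -> compressed content

    def save(self, memories):
        """memories maps name -> bytes; returns the names actually re-saved."""
        changed = []
        for name, content in memories.items():
            digest = hashlib.sha256(content).digest()
            if self.digests.get(name) != digest:
                self.digests[name] = digest
                self.saved[name] = zlib.compress(content)
                changed.append(name)
        return changed

    def restore(self):
        """Return the most recently saved content of every memory."""
        return {name: zlib.decompress(blob) for name, blob in self.saved.items()}


cp = IncrementalCheckpoint()
first = cp.save({"mem1": b"\x00" * 1024, "mem2": b"\x01" * 1024})
second = cp.save({"mem1": b"\x00" * 1024, "mem2": b"\x02" * 1024})
```

The first save stores both memories; the second stores only `mem2`, the one whose content changed.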
  • 22. Checkpoint Server
    • Manages checkpoint data.
    • Keeps original emulation system unaffected.
    • Can be 32-bit or 64-bit.
    • Keeps most recent checkpoint.
    • Talks to emulation kernel and SW testbench.
  • 23. Checkpoint Server (diagram: Emulation Kernel and SW Testbench connected to the Checkpoint Server for checkpoint save/restore)
  • 24. Experimental Results
    • Design – 128M gate, 16 FPGAs.
    • Checkpoint frequency – commands equivalent to 100M cycles clock run.
    • Typical emulation speed 1MHz.
    • Average time to save a checkpoint is 5 seconds.
    • 5% overhead in online checkpointing.
    • 20% overhead in offline checkpointing.
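The reported figures can be cross-checked with simple arithmetic: 100M cycles at 1 MHz take 100 seconds, so a 5-second save adds 5%. The sketch below computes that ratio. The slides do not state which mode the 5-second average refers to; it matches the online figure, and the per-checkpoint time implied by the 20% offline overhead is an inference, not a reported number.

```python
def checkpoint_overhead(interval_cycles, clock_hz, save_seconds):
    """Fraction of extra time spent saving one checkpoint per interval."""
    run_seconds = interval_cycles / clock_hz
    return save_seconds / run_seconds


# 100M cycles between checkpoints, 1 MHz emulation speed, 5 s per save.
online = checkpoint_overhead(100_000_000, 1_000_000, 5)

# Inferred: save time per checkpoint that a 20% overhead would imply.
implied_offline_seconds = 0.20 * (100_000_000 / 1_000_000)
```

With these inputs the online overhead comes out at 5%, and a 20% overhead would correspond to roughly 20 seconds per offline checkpoint.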
  • 25. Conclusion
    • Automatic error recovery is important in targetless emulation.
    • The proposed method is flexible, scalable, reliable.
    • No special HW extension is required.
    • Minimal overhead on execution speed.
  • 26. Q & A
    • Thank you