Implementing Useful Clock Skew Using Skew Groups

Published in: Engineering, Business, Technology


Transcript

  • 1. Implementing Useful Skew Using Skew Groups
      Matthew Mei, Cisco Systems
  • 2. Outline
      • Overview of skew
      • Example design affected by skew
      • What is useful skew
      • Using skew groups to achieve useful skew
      • Experimental results of trials on example design
      • Inserting clock buffers to achieve useful skew
      • Comparing skew groups and buffer insertion
      • Conclusions
  • 3. Skew
      [Diagram: a Clock Port driving the clock pins of a Launch Flip Flop and a Capture Flip Flop]
      • Skew equals insertion delay at capture minus insertion delay at launch
      • The insertion delay comes from:
          report_clock_timing -to <pin> -type latency -setup
      • Common path pessimism removal comes from:
          report_crpr -from <pin1> -to <pin2> -setup
  • 4. The Example Design
      • 40 nm technology was used
      • The block was about 8000 µm × 4000 µm
      • Block utilization was about 75%, while standard cell utilization was only about 20% (~600K cells)
      • The block was mostly Ternary Content Addressable Memories (TCAMs), which are large memory macros used for fast searches
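  • 5. Example Failing Path (Diagram)
      [Diagram: clk_core drives a Memory and the Capture Flip Flops. Labeled delays: 1.1783 ns clock latency to the Memory, 1.0460 ns clock latency to the Capture Flip Flops, 1.4831 ns through the Memory, 0.0000 ns on the net to the flip flops]
      • Thus, the skew is equal to: 1.0460 ns - 1.1783 ns = -0.132 ns
      • Therefore, this timing path has -132 ps of skew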
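
The skew arithmetic on this slide can be checked in a few lines of Python. This is only a sketch: the function name `skew_ns` is made up for illustration, and the two latencies are the values shown on the slide.

```python
def skew_ns(capture_latency_ns: float, launch_latency_ns: float) -> float:
    """Skew = clock insertion delay at capture minus insertion delay at launch."""
    return capture_latency_ns - launch_latency_ns


# Clock latencies from the failing-path slide, in ns.
launch = 1.1783   # clk_core to the memory (launch) clock pin
capture = 1.0460  # clk_core to the capture flip flop clock pin

skew = skew_ns(capture, launch)
print(f"skew = {skew:.4f} ns ({skew * 1000:.0f} ps)")
```

A negative result means the capture clock arrives earlier than the launch clock, which tightens the setup check on this path.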
  • 6. Example Failing Path (Timing Report)

      Path Type: max

      Point                                        Incr       Path
      ----------------------------------------------------------------
      clock clk_core (rise edge)                 0.0000     0.0000
      clock network delay (propagated)           1.1783     1.1783
      w/m_36x1/CLK                               0.0000     1.1783 r
      w/m_36x1/QXY[13]                           1.4831     2.6614 f
      w/r0_data_read1_s_36x1_13_ (net)           0.0000     2.6614 f
      w/r1_data_read1_s_36x1_reg_13_/D           0.0000 &   2.6614 f
      data arrival time                                     2.6614

      clock clk_core (rise edge)                 1.6670     1.6670
      clock network delay (propagated)           1.0460     2.7130
      clock uncertainty                         -0.0580     2.6550
      w/r1_data_read1_s_36x1_reg_13_/CK          0.0000     2.6550 r
      library setup time                        -0.1197     2.5353
      data required time                                    2.5353
      ----------------------------------------------------------------
      data required time                                    2.5353
      data arrival time                                    -2.6614
      ----------------------------------------------------------------
      slack (VIOLATED)                                     -0.1261
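
The slack in this report can be re-derived from its own rows. The Python sketch below does that; the variable names are chosen here for readability and are not from the slides, and it treats the 1.6670 ns capture edge as the clock period, which assumes a single-cycle path.

```python
# Values copied from the timing report, in ns.
launch_clock_latency = 1.1783    # clock network delay to w/m_36x1/CLK
memory_output_delay = 1.4831     # w/m_36x1/CLK -> QXY[13]
net_delay = 0.0000               # net into the capture flop D pin

capture_edge = 1.6670            # next clk_core rising edge (assumed one period)
capture_clock_latency = 1.0460   # clock network delay to the capture flop CK pin
clock_uncertainty = 0.0580
library_setup = 0.1197

# Setup check: data must arrive before the required time.
arrival = launch_clock_latency + memory_output_delay + net_delay
required = (capture_edge + capture_clock_latency
            - clock_uncertainty - library_setup)
slack = required - arrival

print(f"data arrival time  = {arrival:.4f} ns")
print(f"data required time = {required:.4f} ns")
print(f"slack              = {slack:.4f} ns")
```

The computed values match the report: arrival 2.6614 ns, required 2.5353 ns, slack -0.1261 ns. Almost all of the violation is accounted for by the -0.132 ns of skew.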
  • 7. Example Failing Path (Layout)
      • Pipeline flops already added and magnet placed
  • 8. Using Skew Groups to Achieve Useful Skew
      [Diagram: clk_core drives the TCAMs and the Pipeline Flip Flops; the Target Skew is marked on the clock path to the Pipeline Flip Flops]
      • To improve the setup timing performance, delay can be added to the red clock path
      • Tried to achieve the target skew using skew groups
      • Also tried manual buffer insertion (later)
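
To see why delaying the capture clock helps, the sketch below applies an idealized model to the example failing path: every nanosecond added to the capture clock path moves the required time later by the same amount. The function names and the trial delays are illustrative assumptions. The model ignores changes in common path pessimism removal and in any other path.

```python
BASE_SLACK_NS = -0.1261  # setup slack of the example failing path


def slack_with_capture_delay(added_delay_ns: float,
                             base_slack_ns: float = BASE_SLACK_NS) -> float:
    """Idealized setup slack after delaying the capture clock."""
    return base_slack_ns + added_delay_ns


def min_delay_to_close(base_slack_ns: float = BASE_SLACK_NS) -> float:
    """Smallest added capture-clock delay that brings the slack to zero."""
    return max(0.0, -base_slack_ns)


# Illustrative delays, not measured results from the trials.
for added in (0.05, 0.10, 0.15):
    print(f"added {added:.2f} ns -> slack {slack_with_capture_delay(added):+.4f} ns")

print(f"delay needed to close timing: {min_delay_to_close():.4f} ns")
```

The same delay is taken away from the hold margin of paths ending at these flip flops and from the setup margin of paths they launch, which is why hold results are tracked alongside setup results.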
  • 9. Skew Groups
      • Skew groups were defined before clock tree synthesis
      • The following commands were used before clock_opt to create a skew group:
          set_skew_group -name <name> -target_skew <skew> <pins list>
          report_skew_group -name <name>
          commit_skew_group
      • The pins list in the example design included the clock pins of about 8000 flip flops
      • Tried 50 ps, 120 ps, 200 ps, 240 ps, 300 ps
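
The five trials differ only in the target skew, so the command text for each run can be generated from one template. The Python sketch below prints the three commands from the slide for each target. The group names are hypothetical, `<pins list>` is left as the slide's placeholder, and the conversion of the target to ns is an assumption about the tool's time unit.

```python
def skew_group_commands(name: str, target_skew_ns: float,
                        pins: str = "<pins list>") -> list[str]:
    """Return the three skew group commands from the slide for one trial."""
    return [
        f"set_skew_group -name {name} -target_skew {target_skew_ns:g} {pins}",
        f"report_skew_group -name {name}",
        "commit_skew_group",
    ]


# Target skews tried in the experiment, in ps.
targets_ps = [50, 120, 200, 240, 300]

for ps in targets_ps:
    name = f"useful_skew_{ps}ps"  # hypothetical group name
    # Assumption: -target_skew is given in ns.
    for line in skew_group_commands(name, ps / 1000):
        print(line)
    print()
```

Each group of three lines corresponds to one separate trial run before clock_opt.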
  • 10. Skew Groups: Effective Skew vs. Target Skew
      [Figure: Effective Skew vs. Target Skew. x-axis: Target Skew (ns); y-axis: Effective Skew (ns). Series: Clock Opt Effective Skew, Route Opt Effective Skew, Post Route Effective Skew]
  • 11. Skew Groups: Setup Timing Performance
      [Figure: Negative Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns). Series: WNS, TNS]
      [Figure: Failing Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths]
  • 12. Skew Groups: Hold Timing Performance
      [Figure: Failing Hold Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths]
      [Figure: Negative Hold Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns). Series: Worst Hold, Total Hold]
  • 13. Skew Groups: Path Skew Distribution
      [Figure: Cumulative Distribution of Path Skew Among Skew Group Flip Flops. x-axis: Skew of Individual Path (ns); y-axis: Number of Flops (Cumulative). Series: Effective Skew 0.005 ns, Effective Skew 0.085 ns, Effective Skew 0.121 ns, Effective Skew 0.138 ns]
  • 14. Skew Groups: Effects on Clock Tree
      • Using skew groups causes the clock tree to branch out at an early level
      • The TCAMs and the pipeline flip flops had zero common path pessimism removed
      • More complex clock tree, more cells and routing
  • 15. Skew Groups: Clock Tree Cells and Buffer Area
      [Figure: Clock Tree vs. Target Skew. x-axis: Target Skew (ns), with points Control, 0.05, 0.12, 0.2, 0.24, 0.3; y-axes: Buffer Area (µm²) and Number of Clock Cells. Series: Buffer Area, Clock Cells]
      • Increased clock tree size by about 250 cells
  • 16. Skew Groups: Power Consumption
      [Figure: Power Increase vs. Target Skew. x-axis: Target Skew (ns), with points 0.05, 0.12, 0.2, 0.24, 0.3; y-axes: Increase in Total Power (%) and Increase in Clock Tree Power (%). Series: Percent Total Power Increase, Percent Clock Tree Power Increase]
      • On average, clock tree power increased by 5.16% and total block power by 0.66%
  • 17. Manual Buffer Insertion to Achieve Useful Skew
      [Diagram: clk_core drives the TCAMs and the Pipeline Flip Flops; the Target Skew is marked on the clock path to the Pipeline Flip Flops]
      • The instinctive way of inserting delay is to manually insert clock buffers:
          insert_buffer -no_of_cells <num buffers> <pins list> <buffer type>
      • The target skew is determined by the number and type of buffers, not by a numerical value
  • 18. Manual Buffer Insertion
      • Clock buffers were inserted right before clock tree routing
      • Two buffers of low drive strength were used. Each buffer added about 40 ps of delay
      • The pins list in the example design included the clock pins of the same ~8000 flip flops
      • The clock buffer insertion resulted in a “Post Route Effective Skew” of about 0.084 ns
      • The TCAMs and the flip flops had on average 38 ps of common path pessimism removed
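
Because the skew from buffer insertion comes in whole-buffer steps, a rough estimate is simple multiplication. The Python sketch below assumes each buffer adds the same ~40 ps and that the delays add linearly, which is a simplification. The helper names are illustrative and not part of any tool.

```python
import math

DELAY_PER_BUFFER_PS = 40          # approximate delay of one low-drive buffer
BUFFERS_INSERTED = 2              # buffers inserted per clock pin
REPORTED_POST_ROUTE_SKEW_PS = 84  # "Post Route Effective Skew" of about 0.084 ns


def nominal_added_delay_ps(num_buffers: int,
                           per_buffer_ps: float = DELAY_PER_BUFFER_PS) -> float:
    """Nominal delay added by a chain of identical buffers."""
    return num_buffers * per_buffer_ps


def buffers_for_target(target_ps: float,
                       per_buffer_ps: float = DELAY_PER_BUFFER_PS) -> int:
    """Smallest whole number of buffers whose nominal delay meets the target."""
    return math.ceil(target_ps / per_buffer_ps)


nominal = nominal_added_delay_ps(BUFFERS_INSERTED)
print(f"nominal added delay: {nominal} ps")
print(f"reported effective skew: {REPORTED_POST_ROUTE_SKEW_PS} ps")
print(f"buffers for a 132 ps target: {buffers_for_target(132)}")
```

Under these assumptions the two buffers give a nominal 80 ps, close to the reported 0.084 ns. A 132 ps target would need four buffers and overshoot to 160 ps, which shows how coarse the control is compared with a numerical target skew.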
  • 19. Manual Buffer Insertion: Setup Timing Performance
      [Figure: Negative Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns). Series: WNS, WNS (clkbuf), TNS, TNS (clkbuf)]
      [Figure: Failing Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths. Series: Failing Paths, Failing Paths (clkbuf)]
  • 20. Manual Buffer Insertion: Hold Timing Performance
      [Figure: Failing Hold Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths. Series: Failing Paths, Failing Paths (clkbuf)]
      [Figure: Negative Hold Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns). Series: Worst Hold, Worst Hold (clkbuf), Total Hold, Total Hold (clkbuf)]
  • 21. Manual Buffer Insertion: Path Skew Distribution
      [Figure: Cumulative Distribution of Path Skew Among Skew Group Flip Flops. x-axis: Path Skew (ns); y-axis: Number of Flops (Cumulative). Series: Effective Skew 0.005 ns, Effective Skew 0.085 ns, Effective Skew 0.121 ns, Effective Skew 0.138 ns, Effective Skew clkbuf]
  • 22. Manual Buffer Insertion: Power Consumption
      • Buffer insertion resulted in about 22000 clock cells, dramatically increasing power
      [Figure: Power Increase vs. Target Skew. x-axis: Target Skew (ns), with points 0.05, 0.12, 0.2, 0.24, 0.3, clkbuf; y-axes: Increase in Total Power (%) and Increase in Clock Tree Power (%). Series: Percent Total Power Increase, Percent Clock Tree Power Increase]
  • 23. Conclusions
      • Both methods are easy to set up in IC Compiler
      • Skew groups:
          – Easy to specify target skew
          – Results in a smaller increase in cells, power, and area
      • Manual buffer insertion:
          – Relies on past experience for buffer selection
          – Results in a larger increase in cells, power, and area
  • 24. Questions?