Understanding cts log_messages

Understanding Clock Tree Synthesis Messages

Understanding cts log_messages

  1. 1. Understanding Clock Tree Synthesis Log Messages © Synopsys 2012
  2. 2. Agenda
      • Prerequisites for Clock Tree Synthesis
      • Enabling Useful Debug Messages in IC Compiler Clock Tree Synthesis
      • Clock Tree Synthesis Log Messages
      • Clock Tree Optimization Log Messages
  3. 3. Agenda
      • Prerequisites for Clock Tree Synthesis
      • Enabling Useful Debug Messages in IC Compiler Clock Tree Synthesis
      • Clock Tree Synthesis Log Messages
      • Clock Tree Optimization Log Messages
  4. 4. Prerequisite 1: Run the check_clock_tree Command
      • Run the check_clock_tree command before clock tree synthesis, and fix the issues it reports.
      • This command checks the following and reports issues that can lead to bad QoR:
        – Clock tree structure
        – Constraints
        – Clock tree exceptions
  5. 5. Prerequisite 2: Ensure Placement Legality
      • For clock tree synthesis to proceed without errors, the design must be legally placed.
      • Use the check_legality command before CTS to check whether the design is properly placed and legalized.
      • If there are legality issues, use the legalize_placement command to resolve them.
      Note:
      • Clock tree synthesis aborts if there are placement legality issues.
      • In some cases, such as overlapping standard cells, it may still proceed and issue a warning during placement legality checking, but continuing with placement legality issues can lead to bad QoR.
          Warning: Some cells in the design are not legal. (CTS-242)
  6. 6. Default Constraints
      • Clock tree synthesis uses the following default constraints:
          Maximum transition time   0.5 ns
          Maximum capacitance       0.6 pF
          Maximum fanout            2000
  7. 7. Design Rule Constraints
      • In addition to the clock tree design rule constraint values specified using set_clock_tree_options, IC Compiler also considers the design rule constraint values from the logic library and the design.
      • The following table summarizes how IC Compiler determines the design rule constraint values used during the design rule fixing stage of clock tree synthesis and optimization.

      Case 1: Default behavior
          cts_use_lib_max_fanout=false
          cts_use_sdc_max_fanout=false
          cts_force_user_constraints=false
      Case 2: Use library and SDC settings for maximum fanout
          cts_use_lib_max_fanout=true
          cts_use_sdc_max_fanout=true
          cts_force_user_constraints=false
      Case 3: Use only user settings for clock tree synthesis and clock tree optimization
          cts_force_user_constraints=true

      Maximum capacitance
          Cases 1 and 2: the minimum value from
            • The set_clock_tree_options command
            • The CTS default value (0.6 pF)
            • The logic library
            • The SDC constraints
          Case 3: the value set using set_clock_tree_options
      Maximum transition time
          Cases 1 and 2: the minimum value from
            • The set_clock_tree_options command
            • The CTS default value (0.5 ns)
            • The logic library
            • The SDC constraints
          Case 3: the value set using set_clock_tree_options
      Maximum fanout
          Case 1: the value set using set_clock_tree_options
          Case 2: the minimum value from
            • The logic library
            • The SDC constraints
            • The set_clock_tree_options command
          Case 3: the value set using set_clock_tree_options
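The three cases in the table on slide 7 can be sketched as a small resolution function. This is only an illustration of the table, not IC Compiler code: the function name, the dictionary of defaults, and the keyword arguments are hypothetical, and two behaviors are assumptions (an unset user value falls back to the CTS default, and the two fanout variables are treated independently, whereas the table only shows them set together).

```python
# Hypothetical model of the constraint table; values are in the slide's units.
CTS_DEFAULTS = {"max_capacitance": 0.6, "max_transition": 0.5, "max_fanout": 2000}


def effective_constraint(kind, user=None, library=None, sdc=None, *,
                         use_lib_max_fanout=False,
                         use_sdc_max_fanout=False,
                         force_user_constraints=False):
    """Return the value used for design rule fixing for one constraint kind."""
    default = CTS_DEFAULTS[kind]
    user_or_default = user if user is not None else default  # assumption
    if force_user_constraints:
        # Case 3: only the set_clock_tree_options value is used.
        return user_or_default
    if kind == "max_fanout":
        # Case 1 uses the user value; Case 2 also considers library and SDC.
        candidates = [user_or_default]
        if use_lib_max_fanout:
            candidates.append(library)
        if use_sdc_max_fanout:
            candidates.append(sdc)
    else:
        # Cases 1 and 2: minimum of user, default, library, and SDC values.
        candidates = [user, default, library, sdc]
    return min(v for v in candidates if v is not None)


# A user transition of 0.1 with an SDC value of 0.05 resolves to 0.05,
# unless user constraints are forced.
print(effective_constraint("max_transition", user=0.1, sdc=0.05))
print(effective_constraint("max_transition", user=0.1, sdc=0.05,
                           force_user_constraints=True))
```

Keeping the fanout branch separate mirrors the table, where maximum fanout is the only constraint whose library and SDC values are ignored by default.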
  8. 8. Constraints Specified Using the set_clock_tree_options Command
      • Time and capacitance values specified with the set_clock_tree_options command are in library units.
      • The smallest values accepted for the -max_capacitance and -max_transition options of the set_clock_tree_options command are 1 fF and 1 ps, respectively.
      • For example, if the library units are pF and ps and you specify the following command, IC Compiler issues errors:
          icc_shell> set_clock_tree_options -max_cap 0.0009 -max_tran 0.300
          Error: User max_cap constraint (0.900000 fF) is too small. (CTS-206)
          Error: User max_tran constraint (0.300000 ps) is too small. (CTS-207)
        – IC Compiler does not accept these small values. During clock tree synthesis it uses the previously specified values, or the default values, for maximum capacitance and maximum transition.
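The unit arithmetic behind the CTS-206 and CTS-207 errors on slide 8 can be checked with a short sketch. The function and the unit tables below are hypothetical helpers, not part of IC Compiler; they only convert a value from library units to fF and ps and compare it with the 1 fF and 1 ps limits stated on the slide.

```python
# Conversion factors from library units to fF and ps (illustrative helper tables).
UNIT_TO_FF = {"pF": 1000.0, "fF": 1.0}
UNIT_TO_PS = {"ns": 1000.0, "ps": 1.0}


def too_small(max_cap, max_tran, cap_unit="pF", time_unit="ps"):
    """Report which values fall below the 1 fF / 1 ps minimum from the slide."""
    cap_ff = max_cap * UNIT_TO_FF[cap_unit]
    tran_ps = max_tran * UNIT_TO_PS[time_unit]
    return {"max_cap": cap_ff < 1.0, "max_tran": tran_ps < 1.0}


# The slide's example: 0.0009 pF is 0.9 fF and 0.300 ps is below 1 ps.
print(too_small(0.0009, 0.300, cap_unit="pF", time_unit="ps"))
# The default constraints (0.6 pF, 0.5 ns) are comfortably above the limits.
print(too_small(0.6, 0.5, cap_unit="pF", time_unit="ns"))
```

Running a check like this on a constraints script before launching the tool is one way to catch a value that was written in the wrong unit.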
  9. 9. Agenda
      • Prerequisites for Clock Tree Synthesis
      • Enabling Useful Debug Messages in IC Compiler Clock Tree Synthesis
      • Clock Tree Synthesis Log Messages
      • Clock Tree Optimization Log Messages
  10. 10. Enabling Debug Messages
      • To enable clock tree synthesis debug messages in IC Compiler, use:
          set cts_use_debug_mode true
      • Many of the messages discussed in this presentation are available only when debug mode is enabled.
  11. 11. Agenda
      • Prerequisites for Clock Tree Synthesis
      • Enabling Useful Debug Messages in IC Compiler Clock Tree Synthesis
      • Clock Tree Synthesis Log Messages
      • Clock Tree Optimization Log Messages
  12. 12. Messages in the compile_clock_tree Command Log
      • Before clock tree synthesis:
        – Design update
        – Buffer and inverter information
        – Clock tree constraints
        – Clock structure before clock tree synthesis
      • During clock tree synthesis:
        – Clustering
        – Meeting target early delay
        – Gate level clock tree synthesis results
      • After clock tree synthesis:
        – Summary report
        – Embedded clock tree optimization
        – DRC fixing beyond exceptions
        – Placement legalization
  13. 13. Overview of the compile_clock_tree Command Log
      [Prelude]
          START_CMD: compile_clock_tree CPU: 55 s ( 0.02 hr) ELAPSE: 288 s ( 0.08 hr) MEM-PEAK: 203 Mb Wed Dec 28 22:33:54 2011 (PSYN-508)
          CTS: CTS Operating Condition(s): MAX(Worst)
          START_FUNC: prelude CPU: 55 s ( 0.02 hr) ELAPSE: 288 s ( 0.08 hr) MEM-PEAK: 203 Mb Wed Dec 28 22:33:54 2011 (PSYN-508)
          Loading design ORCA_TOP
          …
          Information: Design Library and main library capacitance units are matched - 1.000 pf.
          END_FUNC: prelude CPU: 56 s ( 0.02 hr) ELAPSE: 288 s ( 0.08 hr) MEM-PEAK: 203 Mb Wed Dec 28 22:33:54 2011 (PSYN-508)
          …
      [Extraction related messages]
          ****************************************************************
          Information: TLUPlus based RC computation is enabled. (RCEX-141)
          ****************************************************************
          Information: The distance unit in Capacitance and Resistance is 1 micron. (RCEX-007)
          Information: The RC model used is TLU+. (RCEX-015)
          …
          CTS: Blockage Aware Algorithm
          CTS: Marking Ignore Pins....
          …
      [Buffer characterization]
          Warning: too small maximum transition (=0.300000) defined at library cell dl02d4. (CTS-619)
          CTS: buffer estimated skew target delay driving res input cap
          CTS: invbdk [0.009 0.010] [0.043 0.058] [0.197 0.213] [0.059 0.059]
          ...
          CTS: Prepare sources for clock domain SD_DDR_CLK
          CTS: Prepare sources for clock domain SDRAM_CLK
          CTS: Prepare sources for clock domain SYS_2x_CLK
          …
          CTS: Region Aware Algorithm is automatically turned off when design has no region or only has one region.
          CTS: Info: Found net sys_2x_clk, on cell I_RISC_CORE/I_REG_FILE/REG_FILE_B_RAM is macro. Will not treat as pad.
          …
          clean drc fixing cell first...
          In all, 0 drc fixing cell(s) are cleaned
          In all, 0 drc fixing cell(s) beyond exception pins are cleaned
          …
          CTS: I_SDRAM_TOP/I_SDRAM_IF/sd_mux_dq_out_8/S is implicit ignore
          CTS: I_SDRAM_TOP/I_SDRAM_IF/sd_mux_dq_out_9/S is implicit ignore
          …
  14. 14. CTS: I_SDRAM_TOP/I_SDRAM_IF/sd_mux_dq_out_8/S is implicit ignore
          CTS: I_SDRAM_TOP/I_SDRAM_IF/sd_mux_dq_out_11/S is implicit ignore
          …
          Warning: Ignore net sd_CK since it has no synchronous pins. (CTS-231)
          CTS: Info: will use target transition value for initial CTS stages
      [Pruning of buffers and inverters]
          Pruning library cells (r/f, pwr) Min drive = 0.000372606.
          …
          Final pruned buffer set (7 buffers): bufbd1
          …
          CTDN lib estimation: buffers should result in better clock power.
          CTS: BA: Net sdram_clk
          CTS: Starting clock tree synthesis ...
          CTS: Conditions = worst(1)
      [Reporting global clock tree constraints]
          CTS: Global design rule constraints [rise fall]
          CTS: max transition = worst[0.300 0.300] GUI = worst[0.300 0.300] SDC = undefined/ignored
          …
          Information: Removing clock transition on clock PCI_CLK ... (CTS-103)
      [Clock tree synthesis]
          CTS: gate level 1 clock tree synthesis
          CTS: clock net = sdram_clk
          CTS: gate level 1 clock tree synthesis results
          CTS: clock net : sdram_clk
          …
          CTS: Clock tree synthesis completed successfully
          CTS: CPU time: 18 seconds
      [Reporting the results of clock tree synthesis]
          CTS: Reporting clock tree violations ...
          …
          CTS: ------------------------------------------------
          CTS: Clock Tree Synthesis Summary
          CTS: ------------------------------------------------
          …
      [Embedded clock tree optimization]
          CTS: Starting block level clock tree optimization
          …
          CTS: gate level 1 clock tree optimization
          CTS: clock net = pclk
  15. 15. Gate Upsizing During Clock Tree Synthesis
      • The compile_clock_tree command upsizes all the preexisting cells in the clock tree before building the clock tree.
          Information: Replaced the library cell of sys_ctl/sunburst_clk_mux_div1/clk_buf from bufbd4 to bufbdf. (CTS-152)
      • In this example, the preexisting gate sys_ctl/sunburst_clk_mux_div1/clk_buf is upsized from bufbd4 to bufbdf.
      • This upsizing reduces the number of buffer levels needed to build the clock tree, which in turn reduces the buffer count.
  16. 16. Maximum Capacitance and Transition Related Warnings
      • Even if the set_clock_tree_options command does not issue any errors when you set the maximum capacitance and transition constraints, the compile_clock_tree command can issue warnings if the values are too small.
          Warning: too small maximum transition (=0.050000) defined at pin instCLK1GC1/Q. (CTS-620)
            (Maximum transition of 50 ps is too tight for the pin instCLK1GC1/Q.)
          Warning: too small maximum capacitance (=0.050000) defined at pin instCLK1GC1/Q. (CTS-620)
            (Maximum capacitance of 50 fF is too tight for the pin instCLK1GC1/Q.)
          Warning: too small maximum transition (=0.050000) defined at library cell bufbdk. (CTS-619)
      • Tight constraints can cause clock tree synthesis to use an excessive number of buffers to build the clock trees.
  17. 17. Buffers and Inverters Used During Clock Tree Synthesis
      • Before synthesizing the clock tree, IC Compiler characterizes each buffer and inverter.
        – To see the characterization details, set the following variable to true:
            set cts_do_characterization true
        – After characterization is done, the characterized values for each buffer and inverter are reported. Each bracketed pair is [rise fall]:
            CTS: buffer estimated skew target delay driving res input cap
            CTS: bufbdf [0.013 0.015] [0.217 0.200] [0.210 0.248] [0.007 0.007]
            CTS: inv0da [0.018 0.021] [0.097 0.119] [0.294 0.347] [0.036 0.036]
            CTS: bufbd7 [0.025 0.030] [0.223 0.234] [0.415 0.503] [0.008 0.008]
            CTS: bufbd4 [0.047 0.053] [0.347 0.357] [0.786 0.880] [0.004 0.004]
      • Driving resistance determines the drive strength of the buffer or inverter.
      • The smaller the driving resistance, the greater the drive strength.
      • In this example, bufbdf is the buffer with the highest drive strength.
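To illustrate the point on slide 17 that the smallest driving resistance means the highest drive strength, the characterization rows can be parsed and ranked. This is a minimal sketch: the regular expression, the dictionary keys, and the choice to rank by the larger of the rise and fall resistance are assumptions made for this example, and the log text is copied from the slide.

```python
import re

# One cell name followed by four "[rise fall]" pairs.
ROW = re.compile(r"CTS:\s+(\w+)\s+" + r"\[([\d.]+) ([\d.]+)\]\s*" * 4)

LOG = """\
CTS: buffer estimated skew target delay driving res input cap
CTS: bufbdf [0.013 0.015] [0.217 0.200] [0.210 0.248] [0.007 0.007]
CTS: inv0da [0.018 0.021] [0.097 0.119] [0.294 0.347] [0.036 0.036]
CTS: bufbd7 [0.025 0.030] [0.223 0.234] [0.415 0.503] [0.008 0.008]
CTS: bufbd4 [0.047 0.053] [0.347 0.357] [0.786 0.880] [0.004 0.004]
"""


def parse_characterization(log):
    """Map each cell name to its characterized [rise, fall] value pairs."""
    cells = {}
    for line in log.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # the header row has no bracketed values
        values = [float(v) for v in match.groups()[1:]]
        cells[match.group(1)] = {
            "estimated_skew": values[0:2],
            "target_delay": values[2:4],
            "driving_res": values[4:6],
            "input_cap": values[6:8],
        }
    return cells


cells = parse_characterization(LOG)
# Rank by the worse (larger) of the rise and fall driving resistance.
strongest = min(cells, key=lambda name: max(cells[name]["driving_res"]))
print(strongest)
```

On the slide's data this selects bufbdf, which agrees with the slide's own conclusion.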
  18. 18. Unbalanced Buffers
      • Buffers and inverters that have a large difference between their rise and fall delays, referred to as the rise/fall delay skew, are reported.
          CTS: inverter inv0da: rise/fall delay skew = 0.204816 (> 0.200000)
      • Remove unbalanced buffers from the buffer list specified for clock tree synthesis, because they can cause bad skew.
      • Use the set_clock_tree_references command to specify the buffers and inverters that should be used for clock tree synthesis.
  19. 19. Pruning of Buffers and Inverters
      • Pruning is the process by which IC Compiler selects the buffers and inverters that are best suited for clock tree synthesis, based on the buffer and inverter characterization, and prevents the remaining ones from being used.
      • IC Compiler prunes the buffers and inverters based on drive strength and power:
          Pruning library cells (r/f, pwr) Min drive = 0.264263.
          Pruning inv0d0 because drive of 0.149845 is less than 0.264263.
          Pruning inv0d2 because it is (w/ power-considered) inferior to invbd2.
      • IC Compiler calculates a minimum drive value based on heuristics. Buffers and inverters whose drive strength is less than the minimum drive value are considered weak drivers and are pruned by IC Compiler.
      • It is not possible to override the default pruning process.
  20. 20. Maximum Transition, Maximum Capacitance and Timing Constraints
      Before clock tree synthesis begins, all the global clock tree constraints are reported in the log, in the format shown below:
          CTS: Global design rule constraints [rise fall]
          CTS: max transition = worst[0.050 0.050] GUI = worst[0.100 0.100] SDC = worst[0.050 0.050]
          CTS: max capacitance = worst[0.600 0.600] GUI = worst[0.600 0.600] SDC = undefined/ignored
          CTS: max fanout = 2000 GUI = 2000 SDC = undefined/ignored
          CTS: Global timing/clock tree constraints
          CTS: clock skew = worst[0.100]
          CTS: insertion delay = worst[2.000]
          CTS: levels per net = 200
      • The first value on each design rule line is the value used by CTS.
      • GUI is the default value or the value set using set_clock_tree_options.
      • SDC is the value from SDC. Undefined means no value is specified in SDC. Ignored means the value from SDC is ignored because the cts_force_user_constraints variable is set to true.
      • The clock skew and insertion delay lines are the skew and insertion delay targets. The timing and clock tree constraints are values set using the set_clock_tree_options command.
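The three columns of a "Global design rule constraints" line on slide 20 (value used, GUI, SDC) can be pulled apart with a small parser. This is a sketch under stated assumptions: the regular expressions and the function name are made up for this example, and the sample lines are taken from the slide.

```python
import re

# A column is a worst[...] pair, the literal undefined/ignored, or an integer.
TOKEN = r"(worst\[[\d. ]+\]|undefined/ignored|\d+)"
LINE = re.compile(
    rf"CTS:\s+(max \w+)\s*=\s*{TOKEN}\s+GUI\s*=\s*{TOKEN}\s+SDC\s*=\s*{TOKEN}"
)


def to_values(token):
    """Convert one column to a list of floats, or None if undefined/ignored."""
    if token == "undefined/ignored":
        return None
    if token.startswith("worst["):
        return [float(v) for v in token[len("worst["):-1].split()]
    return [float(token)]


def parse_constraint(line):
    match = LINE.search(line)
    if match is None:
        return None
    name, used, gui, sdc = match.groups()
    return {"name": name, "used": to_values(used),
            "gui": to_values(gui), "sdc": to_values(sdc)}


transition = parse_constraint(
    "CTS: max transition = worst[0.050 0.050] GUI = worst[0.100 0.100] SDC = worst[0.050 0.050]"
)
capacitance = parse_constraint(
    "CTS: max capacitance = worst[0.600 0.600] GUI = worst[0.600 0.600] SDC = undefined/ignored"
)
print(transition)
print(capacitance)
```

In the transition line the value used (0.050) is the smaller of the GUI and SDC values, which matches the minimum-value rule described earlier in the deck.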
  21. 21. Clock Tree Synthesis Target Specifications
      • Target specifications are the internal targets for clock tree synthesis, but are not guaranteed. Only target constraints are guaranteed to be achieved.
          CTS: Global target spec [rise fall]
          CTS: transition = worst[0.250 0.250]
          CTS: capacitance = worst[0.300 0.300]
          CTS: fanout = 32
            (This target fanout value is not considered by CTS.)
      • Target specifications:
        – maxTransSpec: Min(0.25, 80% of the max_transition constraint)
        – maxCapSpec: Min(0.30, 80% of the max_capacitance constraint)
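The two formulas on slide 21 are easy to check against the numbers printed elsewhere in the deck. The helper names below are hypothetical; the arithmetic is exactly Min(cap, 80% of the constraint) as written on the slide.

```python
def target_spec(max_constraint, ceiling):
    """Min(ceiling, 80% of the constraint), as given on the slide."""
    return min(ceiling, 0.8 * max_constraint)


def target_transition(max_transition):
    return target_spec(max_transition, 0.25)


def target_capacitance(max_capacitance):
    return target_spec(max_capacitance, 0.30)


# Default constraints (0.5 ns, 0.6 pF) give the 0.250 / 0.300 targets shown above.
print(target_transition(0.5), target_capacitance(0.6))
# A 0.300 constraint gives a 0.240 target, and a 0.050 constraint gives 0.040.
print(round(target_transition(0.3), 3), round(target_transition(0.05), 3))
```

The 0.240 and 0.040 results match the per-gate-level and optimization-stage target specs that appear in later log excerpts in this deck.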
  22. 22. Preexisting Clock Tree Information in the Log File
      Before starting to build the clock tree, the preexisting clock tree structure is printed in the log file.
          CTS: Design infomation
          CTS: total gate levels = 8        <- maximum number of gate levels available
          CTS: Root clock net CLK2
          CTS: clock gate levels = 2        <- number of gate levels for clock CLK2
          CTS: clock sink pins = 4          <- number of sinks
          CTS: level 2: gates = 1           <- existing gate levels and number of gates at each level
          CTS: level 1: gates = 1
          CTS: Buffer/Inverter list for CTS for clock net CLK2:
          CTS: invbdk
          CTS: bufbdk
          ...
          CTS: Root clock net CLK1
          CTS: clock gate levels = 8
          CTS: clock sink pins = 8431
          CTS: level 8: gates = 2           <- gate levels run from the flip-flops towards the clock source
          CTS: level 7: gates = 3
          CTS: level 6: gates = 4
          CTS: level 5: gates = 3
          CTS: level 4: gates = 1
          CTS: level 3: gates = 5
          CTS: level 2: gates = 4
          CTS: level 1: gates = 1
          CTS: Buffer/Inverter list for CTS for clock net CLK1:
          CTS: invbdk
          CTS: bufbdk
          ...
  23. 23. Real Gates and Guide Buffers
      • You may see the term real gates in the preexisting clock tree structure information section:
          CTS: Root clock net CLK1
          CTS: clock gate levels = 16
          CTS: clock sink pins = 70644
          ...
          CTS: level 13: gates = 14 (real gates = 4)
          CTS: level 12: gates = 111 (real gates = 101)
          CTS: level 11: gates = 146 (real gates = 136)
          CTS: level 10: gates = 2488 (real gates = 2478)
      • Real gates are preexisting gates in the clock tree, and are not gates added by the tool.
      • Guide buffers are buffers or inverters that the tool inserts before it begins to build the tree. They are intended to help clock tree synthesis build a better clock tree.
      • The number of guide buffers inserted at each level is the difference between gates and real gates.
        – In the above example, the tool has added 10 guide buffers at each of the clock tree levels shown.
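The subtraction described on slide 23 (guide buffers = gates minus real gates) can be applied to every level line at once. This is a minimal sketch; the regular expression and function name are illustrative, and the log text is the slide's example.

```python
import re

LEVEL = re.compile(r"CTS: level (\d+): gates = (\d+) \(real gates = (\d+)\)")

LOG = """\
CTS: level 13: gates = 14 (real gates = 4)
CTS: level 12: gates = 111 (real gates = 101)
CTS: level 11: gates = 146 (real gates = 136)
CTS: level 10: gates = 2488 (real gates = 2478)
"""


def guide_buffers_per_level(log):
    """Return {gate level: gates - real gates} for each matching log line."""
    return {
        int(level): int(gates) - int(real)
        for level, gates, real in LEVEL.findall(log)
    }


print(guide_buffers_per_level(LOG))
```

Lines without a "(real gates = N)" suffix are skipped by this pattern, so levels reported only as "gates = N" would need separate handling.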
  24. 24. Buffers and Inverters Used
      • Before it begins to build the clock tree, the tool lists all the buffers and inverters it will use to build the tree.
          CTS: Buffer/Inverter list for CTS for clock net sdram_clk:                          <- CTS uses this list
          CTS: CLKBUFX20
          CTS: CLKBUFX16
          CTS: CLKBUFX12
          CTS: Buffer/Inverter LEQ cell list for Boundary Cell for clock net sdram_clk:       <- CTS uses this list for inserting boundary cells
          CTS: CLKBUFX20
          CTS: CLKBUFX16
          CTS: CLKINVX8
          CTS: Buffer/Inverter LEQ cell list for CTO for clock net sdram_clk:                 <- CTO uses this list for sizing
          CTS: CLKBUFX20
          CTS: CLKBUFX16
          CTS: CLKINVX8
          CTS: Buffer/Inverter list for DelayInsertion for clock net sdram_clk:               <- CTO uses this list for delay insertion
          CTS: CLKBUFX20
          CTS: CLKBUFX16
          CTS: CLKINVX8
      • You can change the buffer and inverter list by using the following command:
          set_clock_tree_references
  25. 25. Clock Tree Synthesis Removes User-Specified Ideal Attributes on Clocks
      • Synthesized clocks are set to be propagated, and clock transition, which is an attribute of an ideal clock, is removed.
          CTS: Information: Removing clock transition on clock SP0XCLK ... (CTS-103)
          CTS: Information: Removing clock transition on clock SP0RCLK ... (CTS-103)
      • Latency, another attribute of an ideal clock, is also removed.
          CTS: Information: Removing clock latency on pin Idma_scr_wrap0__Idma_scrba0_m2m0_wrap/I_dma_scrba0_m2m0/I_dma@ ... (CTS-098)
      • Source latency is removed for generated clocks.
          Information: Removing clock source latency on clock CLK1GC1 ... (CTS-289)
      • These messages are informational only, and no action is required.
  26. 26. Overlap or Reconvergent Paths
      • Overlap or reconvergent paths occur when multiple clocks can drive a node.
      • IC Compiler issues warnings about such paths.
          Warning: Either the driven net has been synthesized previously or clock path overlaps/reconverges at pin periph/U1852/Y. (CTS-209)
      • Such messages should be treated as informational, rather than as warnings.
        – IC Compiler has no problems handling such situations.
  27. 27. Gate Level-by-Level Clock Tree Synthesis
      • Clock tree building is done gate level by gate level, starting from the sinks and moving to the clock root.
      • For each gate level, just before the synthesis starts, the following information is printed in the log:
          CTS: gate level 2 clock tree synthesis
          CTS: clock net = I_BLENDER_1/gclk          <- net and driver at this gate level
          CTS: driving pin = I_BLENDER_1/U483/Z
          CTS: gate level 2 design rule constraints [rise fall]
          CTS: max transition = worst[0.300 0.300]
          CTS: max capacitance = worst[0.300 0.300]
          CTS: max fanout = 2000
          CTS: gate level 2 target spec [rise fall]
          CTS: transition = worst[0.240 0.240]
          CTS: capacitance = worst[0.240 0.240]
          CTS: driver cap. = worst[0.088 0.088]
          CTS: fanout = 32
          CTS: gate level 2 timing constraints
          CTS: clock skew = worst[0.000]
          CTS: levels per net = 200
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
  28. 28. Clustering During Clock Tree Synthesis
      • Clock tree building starts with clustering. Clustering is the process of dividing a set of sink pins (fanouts) into groups. Each group is driven by a buffer.
        – The instances of a cluster are all close to each other.
      • The following messages say that 423 sink pins are divided into 27 clusters, each with approximately 423/27 sink pins.
          CTS: gate level 2 clock tree synthesis
          ...
          CTS: gate level 2 design rule constraints [rise fall]
          CTS: max transition = worst[0.300 0.300]
          CTS: max capacitance = worst[0.300 0.300]
          CTS: max fanout = 2000
          CTS: gate level 2 target spec [rise fall]
          CTS: transition = worst[0.240 0.240]
          CTS: capacitance = worst[0.240 0.240]
          CTS: driver cap. = worst[0.088 0.088]
          CTS: fanout = 32
          CTS: gate level 2 timing constraints
          ...
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
          CTS: Completed 423 to 27 clustering
          CTS: BA: lp (1.520, 0.673): skew (0.149, 0.080) c(1.481, 0.198) viol(n y)
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
          CTS: Completed 27 to 4 clustering
          CTS: BA: lp (0.673, 0.597): skew (0.080, 0.105) c(0.198, 0.026) viol(n n)
          CTS: -----------------------------------------------
      • One buffer level is added with each clustering.
      • In the BA lines, each pair is (before clustering, after clustering), for example skew (before, after).
      • viol represents DRCs (cap, trans): y means a violation is present, n means no violation.
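The clustering messages on slide 28 can be summarized with a short parser that pairs each "Completed N to M clustering" line with the BA line that follows it. This is a sketch only: the regular expressions, the dictionary keys, and the function name are assumptions for this example, and only the fields the slide explains (skew before and after, cap and trans violation flags) are interpreted.

```python
import re

DONE = re.compile(r"CTS: Completed (\d+) to (\d+) clustering")
BA = re.compile(
    r"CTS: BA: lp \(([\d.]+), ([\d.]+)\): skew \(([\d.]+), ([\d.]+)\) "
    r"c\(([\d.]+), ([\d.]+)\) viol\(([yn]) ([yn])\)"
)

LOG = """\
CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
CTS: Completed 423 to 27 clustering
CTS: BA: lp (1.520, 0.673): skew (0.149, 0.080) c(1.481, 0.198) viol(n y)
CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
CTS: Completed 27 to 4 clustering
CTS: BA: lp (0.673, 0.597): skew (0.080, 0.105) c(0.198, 0.026) viol(n n)
"""


def parse_clustering(log):
    """Return one record per clustering step, in log order."""
    steps = []
    for line in log.splitlines():
        done = DONE.search(line)
        if done:
            sinks, clusters = map(int, done.groups())
            steps.append({"sinks": sinks, "clusters": clusters,
                          "avg_cluster_size": sinks / clusters})
            continue
        ba = BA.search(line)
        if ba and steps:
            steps[-1].update(
                skew_before=float(ba.group(3)),
                skew_after=float(ba.group(4)),
                cap_violation=ba.group(7) == "y",    # first viol flag: cap
                trans_violation=ba.group(8) == "y",  # second viol flag: trans
            )
    return steps


for step in parse_clustering(LOG):
    print(step)
```

For the first step this gives an average of about 15.7 sinks per cluster and flags a remaining transition violation, which the second clustering step clears.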
  29. 29. Clustering With Hookup Pins
      • Hookup pins are input pins of gates or macros.
      • Unlike clock pins of flip-flops and latches (sink pins), hookup pins have a nonzero phase delay that must be balanced with the sink pins.
  30. 30. Clustering With Hookup Pins
      • Initially, the tool attempts to cluster hookup pins along with the normal sinks (trial clustering). In this example, there are 479 sinks and 1 hookup pin.
          CTS: gate level 1 clock tree synthesis
          ...
          CTS: gate level 1 design rule constraints [rise fall]
          CTS: max transition = worst[0.300 0.300]
          CTS: max capacitance = worst[0.300 0.300]
          CTS: max fanout = 2000
          CTS: gate level 1 target spec [rise fall]
          CTS: transition = worst[0.240 0.240]
          CTS: capacitance = worst[0.240 0.240]
          CTS: driver cap. = worst[0.150 0.150]
          CTS: fanout = 32
          CTS: gate level 1 timing constraints
          ...
          CTS: -----------------------------------------------
        [Trial clustering]
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
          CTS: Completed 480 to 34 clustering
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
          CTS: Completed 34 to 6 clustering
          CTS: BA: this delay [max min] (skew) = worst[0.000 0.000] (0.000)
          CTS: BA: next delay [max min] (skew) = worst[0.124 0.124] (0.000)
          CTS: BA: target cap = 0.070 pf
        [Actual clustering]
          CTS: Starting clustering for bufbda with target load = worst[0.240 0.240]
          CTS: BA: CAC set: target cap = 0.070317: targetWireCap = 0.274866
          CTS: Completed 479 to 39 clustering
          CTS: BA: lp (1.574, 0.770): skew (0.821, 0.451) c(1.737, 0.269) viol(n y)
          CTS: -----------------------------------------------
      • At the trial clustering stage, the hookup pin is considered along with the other sink pins, and (479+1) to 34 to 6 clustering is obtained.
      • At the actual clustering stage, the tool clusters the 479 sink pins separately from the hookup pin.
  31. 31. Clustering With Hookup Pins: Hookup Pin Clustered With Sinks
      • If the trial clustering gives good QoR results, the following message (shown in blue on the slide) is displayed:
          CTS: BA: lp (1.968, 2.031): skew (0.257, 0.194) c(0.076, 0.072) viol(y y)
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbd7 with target load = worst[0.000 0.005]
          CTS: BA: rootNetCap = 0.071776: targ cap = 0.045000: targ wirecap = 0.000000: not relaxed
          CTS: Completed 2 to 2 clustering
          CTS: Starting clustering for bufbd7 with target load = worst[0.000 0.005]
          CTS: BA: rootNetCap = 0.071776: targ cap = 0.045000: targ wirecap = 0.000000: not relaxed
          CTS: Completed 2 to 1 clustering
          CTS: BA: this delay [max min] (skew) = worst[2.040 1.844] (0.196)
          CTS: BA: next delay [max min] (skew) = worst[2.161 1.965] (0.196)
          CTS: BA: target cap = 0.048 pf
          CTS: Pin 1: periph/U5659/A is selected for next level
          CTS: delay [max min] (skew) = worst[1.976 1.921] (0.055)
          CTS: Starting clustering for bufbd7 with target load = worst[0.000 0.005]
          CTS: Completed 2 to 2 clustering
          CTS: BA: lp (2.031, 2.153): skew (0.194, 0.210) c(0.072, 0.026) viol(n n)
          CTS: -----------------------------------------------
      • When the phase delay of the hookup pin periph/U5659/A matches the delay of the already built tree at that gate level, the pin is clustered at that buffer level.
  32. 32. Meeting Target Early Delay
      • After the synthesis of the root clock net (gate level 1 synthesis), the tool checks whether the delay constraint set by the user is met.
      • If it is not met, the tool inserts buffers at the root clock net to achieve the target delay specified by the user.
      • In the following messages, 16 buffers are inserted at the root clock net to increase the delay from 0.569 ns to 2 ns, which is the user-specified target.
          CTS: gate level 1 clock tree synthesis
          CTS: clock net = sys_clk
          CTS: driving pin = sys_clk
          CTS: gate level 1 design rule constraints [rise fall]
          ...
          CTS: gate level 1 target spec [rise fall]
          ...
          CTS: gate level 1 timing constraints
          CTS: clock skew = worst[0.000]
          CTS: insertion delay = worst[2.000]          <- constraint set by the user
          CTS: levels per net = 200
          CTS: -----------------------------------------------
          CTS: Starting clustering for CLKBUF_X20 with target load = worst[0.211 0.270]
          ...
          CTS: -----------------------------------------------
          CTS: Starting clustering for CLKBUF_X20 with target load = worst[0.211 0.270]
          CTS: Completed 19 to 2 clustering
          CTS: BA: lp (0.563, 0.569): skew (0.142, 0.112) c(0.008, 0.008) viol(n n)
          CTS: -----------------------------------------------
          CTS: Inserting delay cells for clock tree sys_clk ...
          CTS: current delay = worst[0.569] worst[0.457]
          CTS: constraint = worst[2.000] worst[0.000]
          CTS: inserted 16 (buffd3) delay cells to the clock net sys_clk
  33. 33. Synthesis Results of One Gate Level
      After the synthesis of a gate level, the results are printed in the log.
          CTS: gate level 1 clock tree synthesis results
          CTS: clock net : sdram_clk
          CTS: driving pin: sdram_clk
          CTS: load pins : 5 sink pins, 0 gates/macros pins, 0 ignore pins
          CTS: buffer level 1: bufbd7 (1)
          CTS: buffer level 2: bufbd7 (1)
          CTS: clock tree skew = worst[0.036]
          CTS: longest path delay = worst[0.327](rise)
          CTS: shortest path delay = worst[0.291](rise)
          CTS: total capacitance = worst[0.389 0.389]
          CTS: buffer level phase delay
          CTS: 1 (I): worst[0.293](rise), worst[0.256](rise); skew = worst[0.036]
          CTS: (O): worst[0.151](rise), worst[0.129](rise); skew = worst[0.022]
          CTS: 2 (I): worst[0.150](rise), worst[0.128](rise); skew = worst[0.022]
          CTS: (O): worst[0.004](rise), worst[0.000](rise); skew = worst[0.004]
          CTS: buffer level output transition delays [rise fall]
          CTS: level 0: worst[0.088 0.085] worst[0.088 0.085]
          CTS: load 0: worst[0.088 0.085] worst[0.088 0.085]
          CTS: level 1: worst[0.111 0.115] worst[0.091 0.092]
          CTS: load 1: worst[0.111 0.115] worst[0.091 0.092]
          CTS: level 2: worst[0.158 0.153] worst[0.080 0.071]
          CTS: load 2: worst[0.158 0.153] worst[0.080 0.071]
          CTS: buffer level total load capacitance
          CTS: level 0: worst[0.045 0.045]
          CTS: level 1: worst[0.093 0.093]
          CTS: level 2: worst[0.251 0.251]
          CTS: drc violations: 0 0
      • The clock tree skew and the longest and shortest path delays are the skew and insertion delay at the driving pin (here sdram_clk, labeled A in the slide's diagram).
      • "worst" is the operating condition.
      • The per-level load capacitance values are added and reported as the total capacitance of the subtree.
      • The two numbers on the drc violations line are the number of cap violations and the number of trans violations.
      [Figure: clock tree diagram with points A, B, and C and buffer levels 1 and 2]
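Two relationships in the slide 33 report can be checked numerically: the clock tree skew is the longest path delay minus the shortest path delay, and the total capacitance is the sum of the per-level load capacitances. The sketch below uses an excerpt of the slide's log; the helper names and the 0.0015 tolerance (an allowance for the three-decimal rounding in the log) are assumptions for this example.

```python
import re

LOG = """\
CTS: clock tree skew = worst[0.036]
CTS: longest path delay = worst[0.327](rise)
CTS: shortest path delay = worst[0.291](rise)
CTS: total capacitance = worst[0.389 0.389]
CTS: buffer level total load capacitance
CTS: level 0: worst[0.045 0.045]
CTS: level 1: worst[0.093 0.093]
CTS: level 2: worst[0.251 0.251]
"""


def worst(label, log):
    """Return the numbers inside worst[...] on the line starting with label."""
    match = re.search(rf"CTS: {re.escape(label)}\s*=\s*worst\[([\d. ]+)\]", log)
    return [float(v) for v in match.group(1).split()]


def check_report(log, tolerance=0.0015):
    skew = worst("clock tree skew", log)[0]
    longest = worst("longest path delay", log)[0]
    shortest = worst("shortest path delay", log)[0]
    total_cap = worst("total capacitance", log)[0]
    # First (rise) value of each per-level load capacitance line.
    level_caps = [float(v) for v in
                  re.findall(r"CTS: level \d+: worst\[([\d.]+) [\d.]+\]", log)]
    return {
        "skew_matches": abs(skew - (longest - shortest)) <= tolerance,
        "cap_matches": abs(total_cap - sum(level_caps)) <= tolerance,
        "summed_cap": round(sum(level_caps), 3),
    }


print(check_report(LOG))
```

The tolerance matters in practice: other reports in this deck show a skew that differs from the delay difference by 0.001 because each value is rounded separately.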
  34. 34. Maximum Transition and Capacitance Violations
      • After each gate level is synthesized, the maximum capacitance and maximum transition violations at that gate level are reported.
          CTS: gate level 3 clock tree synthesis results
          ...
          CTS: buffer level total load capacitance
          ...
          CTS: capacitance violation on periph/CTS_755
          CTS: capacitance = worst[0.052 0.052]
          CTS: constraint = worst[0.050 0.050]
          CTS: capacitance violation on periph/CTS_757
          CTS: capacitance = worst[0.051 0.051]
          CTS: constraint = worst[0.050 0.050]
          ...
          CTS: transition delay violation at periph/CLKBUFX20_G3B1I3/A
          CTS: transition delay = worst[0.052 0.050] worst[0.052 0.050]
          CTS: constraint = worst[0.050 0.050]
          CTS: transition delay violation at periph/CLKBUFX20_G3B2I14/A
          CTS: transition delay = worst[0.053 0.051] worst[0.053 0.051]
          CTS: constraint = worst[0.050 0.050]
          ...
          CTS: drc violations: 18 5          <- number of cap violations, number of trans violations
  35. 35. A More Complex Synthesis Result
          CTS: gate level 1 clock tree synthesis results
          CTS: clock net : clk
          CTS: driving pin: clk
          CTS: load pins : 80 sink pins, 0 gates/macros pins, 0 ignore pins
          CTS: buffer level 1: CLKBUFX20 (1)
          CTS: buffer level 2: CLKBUFX20 (2) CLKBUFX12 (1)
          CTS: clock tree skew = worst[0.001]
          CTS: longest path delay = worst[0.248](rise)
          CTS: shortest path delay = worst[0.246](rise)
          CTS: total capacitance = worst[0.549 0.549]
          CTS: buffer level phase delay
          CTS: 1 (I): worst[0.247](rise), worst[0.246](rise); skew = worst[0.001]
          CTS: (O): worst[0.141](rise), worst[0.140](rise); skew = worst[0.001]
          CTS: 2 (I): worst[0.141](rise), worst[0.140](rise); skew = worst[0.001]
          CTS: (O): worst[0.001](rise), worst[0.000](rise); skew = worst[0.001]
          CTS: buffer level output transition delays [rise fall]
          CTS: level 0: worst[0.000 0.000] worst[0.000 0.000]
          CTS: load 0: worst[0.000 0.000] worst[0.000 0.000]
          CTS: level 1: worst[0.089 0.076] worst[0.089 0.076]
          CTS: load 1: worst[0.089 0.076] worst[0.089 0.076]
          CTS: level 2: worst[0.109 0.093] worst[0.104 0.091]
          CTS: load 2: worst[0.109 0.093] worst[0.104 0.091]
          CTS: buffer level total load capacitance
          CTS: level 0: worst[0.038 0.038]
          CTS: level 1: worst[0.108 0.108]
          CTS: level 2: worst[0.403 0.403]
          CTS: drc violations: 0 0
  36. 36. Gate Level and Buffer Level Nomenclature
      [Figure: clock tree drawn from the clock source pin (gate level 1) through buffer levels 1 and 2 of gate level 1, then gate level 2 and buffer levels 1 to 4 of gate level 2. Red: preexisting gates. Black: CTS-introduced gates.]
      • At each gate level, the clock tree is built bottom-up, but the buffer names are changed to appear top-down.
  37. 37. DRC Violation Report After Synthesis
      • After the complete clock tree is built, all the remaining DRC violations in the entire clock tree are reported in the log file:
          CTS: Clock tree synthesis completed successfully
          CTS: CPU time: 50 seconds
          CTS: Reporting clock tree violations ...
          CTS: Global design rules:                                  <- constraints
          CTS: maximum transition delay [rise,fall] = [0.05,0.05]
          CTS: maximum capacitance = 0.05
          CTS: maximum fanout = 2000
          CTS: maximum buffer levels per net = 200
          CTS: transition delay violation at sdram_clk
          CTS: user specified transition delay = worst[0.056 0.050] worst[0.056 0.050]
          CTS: constraint = worst[0.050 0.050]
          CTS: transition delay violation at CLKBUF_X20_G1B21I1/Z
          CTS: transition delay = worst[0.051 0.050] worst[0.051 0.050]
          CTS: constraint = worst[0.050 0.050]
          CTS: capacitance violation on CTS_6557
          CTS: capacitance = worst[0.074 0.074]
          CTS: constraint = worst[0.050 0.050]
          CTS: Summary of clock tree violations:
          CTS: Total number of transition violations = 2
          CTS: Total number of capacitance violations = 1
      • The report lists only transition and capacitance violations.
      • The summary gives the total number of transition and capacitance violations.
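The violation entries in the slide 37 report follow a fixed three-line pattern (violation, measured value, constraint), so they can be collected and ranked by how far each one exceeds its constraint. This is a minimal sketch: the regular expression, the field names, and the definition of overshoot as the largest rise or fall excess are assumptions for this example, and the log text is the slide's.

```python
import re

VIOLATION = re.compile(
    r"CTS: (transition delay|capacitance) violation (?:at|on) (\S+)\n"
    r"CTS: .*?= worst\[([\d. ]+)\].*\n"
    r"CTS: constraint = worst\[([\d. ]+)\]"
)

LOG = """\
CTS: transition delay violation at sdram_clk
CTS: user specified transition delay = worst[0.056 0.050] worst[0.056 0.050]
CTS: constraint = worst[0.050 0.050]
CTS: transition delay violation at CLKBUF_X20_G1B21I1/Z
CTS: transition delay = worst[0.051 0.050] worst[0.051 0.050]
CTS: constraint = worst[0.050 0.050]
CTS: capacitance violation on CTS_6557
CTS: capacitance = worst[0.074 0.074]
CTS: constraint = worst[0.050 0.050]
"""


def parse_violations(log):
    """Return a list of violations with their worst rise/fall overshoot."""
    violations = []
    for kind, where, value, limit in VIOLATION.findall(log):
        values = [float(v) for v in value.split()]
        limits = [float(v) for v in limit.split()]
        overshoot = max(v - c for v, c in zip(values, limits))
        violations.append({"kind": kind, "where": where,
                           "overshoot": round(overshoot, 3)})
    return violations


violations = parse_violations(LOG)
worst_case = max(violations, key=lambda v: v["overshoot"])
print(len(violations), worst_case)
```

Sorting by overshoot is one way to decide which nets to look at first; here the capacitance violation on CTS_6557 exceeds its limit by the widest margin.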
  38. 38. Summary Report After Clock Tree Synthesis
          CTS: ------------------------------------------------
          CTS: Clock Tree Synthesis Summary
          CTS: ------------------------------------------------
          CTS: 5 clock domain synthesized
          CTS: 30 gated clock nets synthesized
          CTS: 26 buffer trees inserted
          CTS: 722 buffers used (total size = 45974.2)
          CTS: 752 clock nets total capacitance = worst[76.868 76.868]
      • Each gate level can have multiple nets.
  39. 39. Clock-by-Clock Summary
      • A summary is reported for each clock:
          CTS: ------------------------------------------------
          CTS: Clock-by-Clock Summary
          CTS: ------------------------------------------------
          CTS: Root clock net pclk
          CTS: 3 gated clock nets synthesized
          CTS: 2 buffer trees inserted
          CTS: 2 buffers used (total size = 159.667)
          CTS: 5 clock nets total capacitance = worst[0.514 0.514]
          CTS: clock tree skew = worst[0.341]
          CTS: longest path delay = worst[5.959](rise)
          CTS: shortest path delay = worst[5.619](rise)
          CTS: Root clock net sys_clk
          ...
      • A buffer tree is inserted only if necessary.
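In the sample above, the reported skew is within rounding of the longest path delay minus the shortest path delay (5.959 - 5.619 = 0.340 against a reported 0.341). The Python sketch below checks that relationship per root clock net. It assumes this is how the skew figure is derived, which the slides do not state explicitly, and the helper names are hypothetical.

```python
import re

# Excerpt copied from the slide.
LOG = """\
CTS: Root clock net pclk
CTS: 3 gated clock nets synthesized
CTS: clock tree skew = worst[0.341]
CTS: longest path delay = worst[5.959](rise)
CTS: shortest path delay = worst[5.619](rise)
"""

ROOT = re.compile(r"^CTS:\s+Root clock net (\S+)")
FIELD = re.compile(
    r"^CTS:\s+(clock tree skew|longest path delay|shortest path delay)"
    r" = worst\[([\d.]+)\]"
)


def parse_clocks(log):
    """Group skew and path delay values under each root clock net."""
    clocks = {}
    current = None
    for line in log.splitlines():
        root = ROOT.match(line)
        if root:
            current = clocks.setdefault(root.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = float(field.group(2))
    return clocks


def skew_is_consistent(entry, tol=0.0015):
    """True if skew matches longest minus shortest within print rounding."""
    spread = entry["longest path delay"] - entry["shortest path delay"]
    return abs(spread - entry["clock tree skew"]) <= tol


clocks = parse_clocks(LOG)
for name, entry in clocks.items():
    print(name, entry["clock tree skew"], skew_is_consistent(entry))
```

The tolerance is an assumption that covers the three-decimal rounding of the printed values.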
  40. 40. Embedded Clock Tree Optimization
      • After clock tree synthesis, embedded clock tree optimization begins.
      • The characteristics of the buffers and inverters used are reported again:
          CTS: buffer   estimated skew   target delay    driving res     input cap
          CTS: bufbdf   [0.013 0.015]    [0.217 0.200]   [0.210 0.248]   [0.007 0.007]
          CTS: inv0da   [0.018 0.021]    [0.097 0.119]   [0.294 0.347]   [0.036 0.036]
          ...
      • The global constraints for the clock tree are also reported again:
          CTS: Global design rule constraints [rise fall]
          CTS: max transition = worst[0.050 0.050] GUI = worst[0.050 0.050] SDC = undefined/ignored
          ...
          CTS: Global timing/clock tree constraints
          CTS: clock skew = worst[0.000]
          ...
          CTS: Global target spec [rise fall]
          CTS: transition = worst[0.040 0.040]
          ...
      Note: Embedded clock tree optimization is called only when the compile_clock_tree command is used. It is not called when the clock_opt command is used.
  41. 41. More Messages on Real Gates and Guide Buffers
      • At the beginning of optimization, you might get the following messages:
          CTS: Root clock net chip_sclk_src
          CTS: clock gate levels = 75
          CTS: clock sink pins = 125896
          ...
          CTS: level 73: gates = 3 (real gates = 1)
          CTS: level 72: gates = 2 (no real gates, guide buffers only)
      • All the gates are guide buffers and inverters inserted during clock tree synthesis.
      • This information is similar to that printed prior to clock tree synthesis.
  42. 42. Gate Level Optimization
      • Clock tree optimization is also done for each gate level, similar to when the clock tree is built.
      • Before a gate level is optimized, the current skew, longest path delay, and shortest path delay from the driving pin of that gate level are reported:
          CTS: gate level 2 clock tree optimization
          CTS: clock net = I_BLENDER_1/gclk
          CTS: driving pin = I_BLENDER_1/U483/Z
          CTS: clock tree skew = worst[0.517]
          CTS: longest path delay = worst[5.339](rise)
          CTS: shortest path delay = worst[4.822](fall)
      • That gate level is then optimized.
  43. 43. Buffer Sizing
      • The following message indicates that buffer sizing was successful:
          CTO-BS: Starting buffer sizing ...
          Information: Replaced the library cell of CLKBUF_X20_G2B2I1 from CLKBUF_X20 to CLKBUF_X16. (CTS-152)
          CTO-BS: CPU time = 0 seconds for buffer sizing
      • Clock tree optimization tries to resize buffers to improve skew and insertion delay. If it does not find the resizing beneficial, the original cell master is restored:
          CTO-BS: Starting buffer sizing ...
          CTO-BS: Restoring original cellMaster <CLKBUF_X20> of <CLKBUF_X20_G2B2I4>
          CTO-BS: CPU time = 1 seconds for buffer sizing
  44. 44. Gate Sizing
          CTO-GS: Starting gate sizing ...
          Information: Replaced the library cell of I7188625 from TLQMUX2X60 to TULQMUX2ZSX40. (CTS-152)
          Information: Replaced the library cell of I7586451 from TLTMUX2X60 to TLTMUX2X50. (CTS-152)
          Information: Replaced the library cell of I3342873 from TULTMUX2X50 to TLTMUX2ZSX60. (CTS-152)
          Information: Replaced the library cell of I1387108 from TULTMUX2X80 to TULTMUX2ZSX80. (CTS-152)
          ...
          Information: Replaced the library cell of I6717862 from THQMUX2ZSX80 to TSTMUX2ZSX20. (CTS-152)
          Information: Replaced the library cell of I9359863 from TLTMUX2ZSX80 to TULTMUX2ZSX60. (CTS-152)
          Information: Replaced the library cell of I10258160 from TLTMUX2ZSX60 to TLTMUX2ZSX40. (CTS-152)
          Information: Replaced the library cell of I7636259 from TLTMUX2ZFFX80 to TULTMUX2ZSX60. (CTS-152)
          CTO-GS: 1: Sized 14/40 cell instances (tested 40X247)
          CTO-GS: delay (from) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: delay (to) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: improvement = worst[0.106%]
          Information: Replaced the library cell of I2130284 from TLTMUX2X80 to TLTMUX2ZSX40. (CTS-152)
          Information: Replaced the library cell of I8618764 from TLTMUX2ZFFX80 to TLTMUX2X80. (CTS-152)
          Information: Replaced the library cell of I1749911 from TULTMUX2ZFFFX80 to TULTMUX2ZFFX80. (CTS-152)
          Information: Replaced the library cell of I3342873 from TLTMUX2ZSX60 to TLTMUX2ZSX40. (CTS-152)
          Information: Replaced the library cell of I8872989 from TULTMUX2ZFFFX60 to TLTMUX2ZFFX80. (CTS-152)
          Information: Replaced the library cell of I1387108 from TULTMUX2ZSX80 to TULTMUX2X50. (CTS-152)
          CTO-GS: 2: Sized 6/40 cell instances (tested 40X247)
          CTO-GS: delay (from) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: delay (to) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: improvement = worst[0.000%]
          CTO-GS: Summary of cell sizing
          CTO-GS: Sized 20/40 cell instances (tested 80X247)
          CTO-GS: delay (from) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: delay (to) = worst[9.104] worst[8.633]; skew = worst[0.471]
          CTO-GS: improvement = worst[0.106%]
          CTO-GS: CPU time = 2413 seconds for gate sizing
      • In the first round, 14 cells were sized. The block after "1: Sized 14/40" is the summary of the first round of sizing:
          • Number of gates sized (here, 14 out of 40 gates)
          • The improvement in skew
      • The "Summary of cell sizing" block is the overall summary of gate sizing done at this gate level. A total of 14 + 6 = 20 gates were sized, giving a 0.106% improvement in skew at this gate level.
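The per-round counts and the overall summary of the gate sizing log can be reconciled with a few lines of Python. This is only an illustrative sketch: the names `ROUND`, `SUMMARY`, and `sizing_rounds` are hypothetical, and the patterns assume the "Sized N/M cell instances" wording shown on the slide.

```python
import re

# Round and summary lines copied from the slide.
LOG = """\
CTO-GS: 1: Sized 14/40 cell instances (tested 40X247)
CTO-GS: improvement = worst[0.106%]
CTO-GS: 2: Sized 6/40 cell instances (tested 40X247)
CTO-GS: improvement = worst[0.000%]
CTO-GS: Summary of cell sizing
CTO-GS: Sized 20/40 cell instances (tested 80X247)
CTO-GS: improvement = worst[0.106%]
"""

# A round line carries a round number before "Sized"; the summary line does not.
ROUND = re.compile(r"^CTO-GS:\s+(\d+): Sized (\d+)/(\d+) cell instances", re.M)
SUMMARY = re.compile(r"^CTO-GS:\s+Sized (\d+)/(\d+) cell instances", re.M)


def sizing_rounds(log):
    """Return {round number: cells sized} for each sizing round."""
    return {int(r): int(sized) for r, sized, _ in ROUND.findall(log)}


rounds = sizing_rounds(LOG)
summary_sized, summary_total = map(int, SUMMARY.search(LOG).groups())
print(rounds, summary_sized, summary_total)
```

Here the rounds add up to the summary figure (14 + 6 = 20), matching the annotation on the slide.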
  45. 45. Gate Relocation
      • Gate relocation works on preexisting gates.
      • If you have no preexisting gates, you might see the following message:
          CTO-GR: gate relocation is skipped since there are no hookup pins
  46. 46. A Successful Gate Relocation
          CTO-GR: Starting gate relocation ...
          CTO-GR: delay [max min] (skew) = worst[9.023 8.563] (0.460)
          CTO-GR: 1: Relocated 1/40 cell instances (tested 2 cell instances at 47 points)
          CTO-GR: delay (from) = worst[9.023] worst[8.563]; skew = worst[0.460]
          CTO-GR: delay (to) = worst[9.023] worst[8.563]; skew = worst[0.460]
          CTO-GR: improvement = worst[0.000%]
          CTO-GR: delay [max min] (skew) = worst[9.018 8.563] (0.455)
          CTO-GR: delay [max min] (skew) = worst[9.018 8.563] (0.455)
          CTO-GR: 2: Relocated 2/40 cell instances (tested 5 cell instances at 83 points)
          CTO-GR: delay (from) = worst[9.023] worst[8.563]; skew = worst[0.460]
          CTO-GR: delay (to) = worst[9.018] worst[8.563]; skew = worst[0.455]
          CTO-GR: improvement = worst[1.118%]
          CTO-GR: Summary of cell relocation
          CTO-GR: Relocated 3/40 cell instances (tested 7 cell instances at 130 points)
          CTO-GR: delay (from) = worst[9.023] worst[8.563]; skew = worst[0.460]
          CTO-GR: delay (to) = worst[9.018] worst[8.563]; skew = worst[0.455]
          CTO-GR: improvement = worst[1.118%]
          CTO-GR: CPU time = 2 seconds for gate relocation
      • In round 1, 2 cells were tried at 47 new locations, and 1 was moved.
      • The "delay (from)" line gives the initial skew, the "delay (to)" line gives the final skew, and the "improvement" line gives the improvement in skew.
      • The "Summary of cell relocation" block is the overall summary of gate relocation at this gate level.
  47. 47. Gate Relocation: Failed Attempts
          CTO-GR: Starting gate relocation ...
          CTO-GR: Summary of cell relocation
          CTO-GR: Relocated 0/1 cell instances (tested 1 cell instances at 24 points)
          CTO-GR: delay (from) = worst[1.207] worst[0.980]; skew = worst[0.227]
          CTO-GR: delay (to) = worst[1.207] worst[0.980]; skew = worst[0.227]
          CTO-GR: improvement = worst[0.000%]
          CTO-GR: CPU time = 0 seconds for gate relocation
      • In this example, clock tree optimization tried to move one gate instance to 24 different locations. Since the attempts did not improve the QoR, the gate relocation was abandoned.
  48. 48. Buffer Relocation
      • Buffer relocation is done on all buffers inserted by clock tree synthesis:
          CTO-BR: Buffer relocation ...
          CTO-BR: Optimization level: net
          CTO-BR: delay [max min] (skew) = worst[9.087 8.503] (0.584)
          CTO-BR: 1: Relocated 1/6 cell instances (tested 6 cell instances at 74 points)
          CTO-BR: delay (from) = worst[9.099] worst[8.503]; skew = worst[0.596]
          CTO-BR: delay (to) = worst[9.087] worst[8.503]; skew = worst[0.584]
          CTO-BR: improvement = worst[2.013%]
          CTO-BR: delay [max min] (skew) = worst[9.087 8.503] (0.584)
          CTO-BR: 2: Relocated 1/6 cell instances (tested 5 cell instances at 62 points)
          CTO-BR: delay (from) = worst[9.087] worst[8.503]; skew = worst[0.584]
          CTO-BR: delay (to) = worst[9.087] worst[8.503]; skew = worst[0.584]
          CTO-BR: improvement = worst[0.000%]
          CTO-BR: Summary of cell relocation
          CTO-BR: Relocated 2/6 cell instances (tested 11 cell instances at 136 points)
          CTO-BR: delay (from) = worst[9.099] worst[8.503]; skew = worst[0.596]
          CTO-BR: delay (to) = worst[9.099] worst[8.503]; skew = worst[0.584]
          CTO-BR: improvement = worst[2.013%]
          CTO-BR: CPU time = 0 seconds for buffer relocation
      • The information is similar to that for gate relocation.
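The "improvement" figure in the buffer relocation log is consistent with the relative reduction in skew, (from - to) / from. The slides do not define the formula, so the Python sketch below is an assumption that happens to reproduce the 2.013% value from the printed skews 0.596 and 0.584; `improvement_percent` is a hypothetical helper name.

```python
def improvement_percent(skew_from, skew_to):
    """Relative skew reduction in percent (assumed definition)."""
    if skew_from == 0:
        return 0.0
    return (skew_from - skew_to) / skew_from * 100.0


# Buffer relocation round 1 from the slide: skew 0.596 -> 0.584.
print(round(improvement_percent(0.596, 0.584), 3))
# A round that changes nothing reports no improvement.
print(round(improvement_percent(0.584, 0.584), 3))
```

Because the log prints skews to three decimals, recomputing from the printed values does not always match exactly. For the gate relocation example (0.460 to 0.455), this formula gives about 1.087% against the reported 1.118%, which is plausibly a rounding effect in the printed skews.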
  49. 49. Post Embedded Clock Tree Synthesis
      • After the embedded clock tree optimization, the tool prints the summary.
      • It looks similar to the summary printed after clock tree synthesis:
          CTS: ------------------------------------------------
          CTS: Clock Tree Optimization Summary
          CTS: ------------------------------------------------
          CTS: 4 clock domain synthesized
          CTS: 5 gated clock nets synthesized
          CTS: 5 buffer trees inserted
          CTS: 1000 buffers used (total size = 16570.8)
          CTS: 1005 clock nets total capacitance = worst[14.010 14.010]
          CTS: ------------------------------------------------
          CTS: Clock-by-Clock Summary
          CTS: ------------------------------------------------
          CTS: Root clock net sdram_clk
          CTS: 1 gated clock nets synthesized
          CTS: 1 buffer trees inserted
          CTS: 302 buffers used (total size = 5039.47)
          CTS: 303 clock nets total capacitance = worst[4.170 4.170]
          CTS: clock tree skew = worst[0.035]
          CTS: longest path delay = worst[2.041](rise)
          CTS: shortest path delay = worst[2.006](fall)
          CTS: Root clock net sys_2x_clk
          ...
      • After the summary, all the transition and capacitance violations on the clock tree are also reported:
          CTS: Global design rules:
          CTS: maximum transition delay [rise,fall] = [0.05,0.05]
          CTS: maximum capacitance = 0.05
          CTS: maximum fanout = 2000
          CTS: maximum buffer levels per net = 200
          CTS: transition delay violation at sdram_clk
          CTS: user specified transition delay = worst[0.056 0.050] worst[0.056 0.050]
          CTS: constraint = worst[0.050 0.050]
          CTS: transition delay violation at buffd2_G1B1I1/Z
          ...
          CTS: Summary of clock tree violations:
          CTS: Total number of transition violations = 3994
          CTS: Total number of capacitance violations = 1
  50. 50. DRC Fixing Beyond Exceptions
      • After embedded clock tree optimization, the tool starts fixing the DRC violations beyond exceptions.
      • The messages are similar to those for clustering:
          CTS: fixing DRC beyond exception pins under clock CLK1
          CTS: gate level 2 DRC fixing (exception level 1)
          CTS: clock net = CLK1_G1IP
          CTS: driving pin = bufbd2_G1IP_1/Z
          CTS: gate level 2 design rule constraints [rise fall]
          CTS: max transition = worst[0.100 0.100]
          CTS: max capacitance = worst[0.600 0.600]
          CTS: max fanout = 2000
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbdf with target load = worst[0.056 0.056]
          CTS: Completed 4 to 1 clustering
          CTS: -----------------------------------------------
          CTS: Starting clustering for bufbd7 with target load = worst[0.050 0.050]
          CTS: Completed 1 to 1 clustering
          CTS: ------------------------------------------------
      • After fixing the DRC violations, the tool reports the whole summary and the clock-by-clock summary of DRC fixing beyond exceptions.
  51. 51. Placement Legalization is Called After Clock Tree Synthesis
      • When clock tree synthesis places a clock tree buffer or inverter, it places it at a legal location, but the location might already be occupied.
          • This causes overlaps, which need to be resolved.
      • The tool calls the placement legalizer, which moves the cells to resolve the overlaps.
      • After legalization, the cells with large displacement are reported in the log:
          Largest displacement cells:
          Cell: periph/U122 (AND3X)
          Input location: (906.380 1597.520)
          Legal location: (897.140 1582.400)
          Displacement: 17.720 um, e.g. 3.52 row height.
          Total 6 cells has large displacement (e.g. > 15.120 um or 3 row height)
      • The cell shown is 1 of the 6 cells that were displaced.
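The displacement figure in this report can be reproduced from the two locations. In this sample it matches the straight-line (Euclidean) distance between the input and legal locations, and the row height can be inferred from the threshold line (15.120 um for 3 rows, so 5.04 um per row). The Python sketch below works through that arithmetic. The function names are hypothetical, and whether the tool always uses Euclidean distance is an assumption based on this one example.

```python
import math


def displacement_um(input_xy, legal_xy):
    """Straight-line distance between two (x, y) locations in microns."""
    return math.hypot(legal_xy[0] - input_xy[0], legal_xy[1] - input_xy[1])


def in_row_heights(distance_um, row_height_um):
    """Express a distance as a multiple of the row height."""
    return distance_um / row_height_um


# Values copied from the slide.
input_location = (906.380, 1597.520)
legal_location = (897.140, 1582.400)
row_height = 15.120 / 3  # inferred from "> 15.120 um or 3 row height"

distance = displacement_um(input_location, legal_location)
print(round(distance, 3), round(in_row_heights(distance, row_height), 2))
```

The result agrees with the logged "17.720 um" and "3.52 row height".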
  52. 52. Agenda
      • Prerequisites for Clock Tree Synthesis
      • Enabling Useful Debug Messages in IC Compiler Clock Tree Synthesis
      • Clock Tree Synthesis Log Messages
      • Clock Tree Optimization Log Messages
  53. 53. The optimize_clock_tree Command Log File Messages
      • Optimization options
      • Report before optimization
      • Optimization
      • Report after optimization
  54. 54. Standalone Optimization Using the optimize_clock_tree Command
      • Standalone optimization differs from embedded optimization in the algorithms used.
      • Some of the log messages are similar to those printed when you use the compile_clock_tree command:
          • Design update information
          • Buffer characterization
          • Pruning of cells
          • List of cells used for clock tree optimization
  55. 55. CTS-352 Warning
      • The default delay calculation engine is Elmore. Elmore delay calculation might lead to inferior accuracy in skew and latency estimation.
      • Enable the Arnoldi delay calculation engine for more accurate delay calculation during optimization by using the following command:
          set_delay_calculation -clock_arnoldi
      • Otherwise, the optimize_clock_tree command issues the following warning:
          Warning: set_delay_calculation is currently set to elmore. clock_arnoldi is suggested. (CTS-352)
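Messages such as CTS-352 carry their code in parentheses at the end of the line, so a log can be scanned for them before reading it in detail. The following Python sketch tallies messages by severity and code. The sample lines are taken from this deck, the helper name `count_messages` is hypothetical, and the inclusion of an "Error" severity is an assumption that does not appear in these slides.

```python
import re
from collections import Counter

# Sample messages quoted elsewhere in this deck.
LOG = """\
Warning: set_delay_calculation is currently set to elmore. clock_arnoldi is suggested. (CTS-352)
Information: Replaced the library cell of CLKBUF_X20_G2B2I1 from CLKBUF_X20 to CLKBUF_X16. (CTS-152)
Warning: Some cells in the design are not legal. (CTS-242)
CTO-BS: CPU time = 0 seconds for buffer sizing
"""

# Severity prefix, free text, then a trailing "(CTS-nnn)" code.
# "Error" is included as an assumed third severity.
MESSAGE = re.compile(
    r"^(Warning|Information|Error): .*\((CTS-\d+)\)[ \t]*$", re.M
)


def count_messages(log):
    """Count coded messages as {(severity, code): occurrences}."""
    return Counter(MESSAGE.findall(log))


counts = count_messages(LOG)
for (severity, code), n in sorted(counts.items()):
    print(severity, code, n)
```

A nonzero count for ("Warning", "CTS-352") is a cue to check the delay calculation setting described on this slide.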
  56. 56. Optimization Options
      • Before starting optimization, the optimize_clock_tree command reports the root pin and the optimization options for each clock.
      • These are the options that you specified by using the set_clock_tree_optimization_options command:
          Initializing parameters for clock CLK2GC:
          Root pin: instCLK2GC/Q
          Using the following optimization options:
          gate sizing : on
          gate relocation : on
          preserve levels : off
          area recovery : on
          relax insertion delay : off
          balance rc : off
  57. 57. Preoptimization Report
      • Before the tool begins to optimize the clock tree, it reports some of the current characteristics of the clock tree:
          *****************************************
          * Preoptimization report (clock CLK3) *
          *****************************************
          Corner max
          Estimated Skew (r/f/b) = (0.073 0.000 0.073)
          Estimated Insertion Delay (r/f/b) = (1.903 -inf 1.903)
          Corner RC-ONLY
          Estimated Skew (r/f/b) = (0.005 0.000 0.005)
          Estimated Insertion Delay (r/f/b) = (0.008 -inf 0.008)
          Wire capacitance = 0.8 pf
          Total capacitance = 2.3 pf
          Max transition = 0.448 ns
          Cells = 24 (area=67.500000)
          Buffers = 23 (area=67.500000)
          Buffer Types
          ============
          bufbd2: 1
          bufbdf: 8
          bufbd7: 5
          bufbd4: 3
          bufbd1: 6
      • The report header gives the clock name, and "Corner max" is the CTS corner.
      • The estimated skew and insertion delay (ID) are the starting values for the clock as seen by CTO.
      • "Max transition" is the maximum transition value present in the clock tree.
      • "Buffer Types" gives information about the buffers and inverters present in the clock tree.
  58. 58. Optimization Messages
      • During optimization, the tool prints out messages for sizing, insertion and removal, and switching of metal layers:
          Deleting cell I_SDRAM_TOP/bufbda_G1B1I10 and output net I_SDRAM_TOP/sdram_clk_G1B1I10.
          iteration 1: (0.314104, 3.328620)
          Total 1 buffers removed on clock CLK3
          Start (3.256, 3.527), End (3.015, 3.329)
          ....
          iteration 2: (0.313991, 3.314841)
          iteration 3: (0.308073, 3.295621)
          Total 2 cells sized on clock CLK3
          Start (3.015, 3.329), End (2.988, 3.296)
          ....
          iteration 6: (0.305181, 3.275623)
          Total 1 delay buffers added on clock sck_in12 (LP)
          Start (2.975, 3.283), End (2.970, 3.276)
          ....
          Switch to low metal layer for clock 'CLK3':
          Total 9 out of 13 nets switched to low metal layer for clock 'CLK3' with largest cap change 0.00 percent
      • Each "iteration" line reports (skew, ID).
      • The message groups cover, in order: buffer removal, cell sizing, buffer insertion, and metal layer switching.
      • Start (sp, lp): initial delays. End (sp, lp): final delays.
      • sp: shortest path delay. lp: longest path delay.
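Because each "Start"/"End" pair lists the shortest and longest path delays, the skew before and after a step can be recovered as lp - sp and compared with the "(skew, ID)" value on the preceding iteration line. The Python sketch below does this for two of the steps on the slide. The function name `skew_changes` is hypothetical, and the pattern assumes the exact "Start (sp, lp), End (sp, lp)" layout shown.

```python
import re

# Lines copied from the slide.
LOG = """\
iteration 1: (0.314104, 3.328620)
Total 1 buffers removed on clock CLK3
Start (3.256, 3.527), End (3.015, 3.329)
iteration 3: (0.308073, 3.295621)
Total 2 cells sized on clock CLK3
Start (3.015, 3.329), End (2.988, 3.296)
"""

START_END = re.compile(
    r"^Start \(([\d.]+), ([\d.]+)\), End \(([\d.]+), ([\d.]+)\)", re.M
)


def skew_changes(log):
    """Return (skew before, skew after) per step, computed as lp - sp."""
    changes = []
    for match in START_END.finditer(log):
        sp0, lp0, sp1, lp1 = map(float, match.groups())
        changes.append((round(lp0 - sp0, 3), round(lp1 - sp1, 3)))
    return changes


changes = skew_changes(LOG)
for before, after in changes:
    print(before, "->", after)
```

In this excerpt the buffer removal step lowers the longest path delay (3.527 to 3.329) while the skew rises from 0.271 to 0.314, and the sizing step then brings the skew to 0.308. Both end values agree with the skew printed on the corresponding iteration lines to three decimals.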
  59. 59. Optimization Messages
      • If the area recovery option is enabled, the tool does area recovery after optimizing each clock, and reports the changes made to that clock:
          Area recovery optimization for clock 'CLK3':
          15% 23% 30% 46% 53% 61% 76% 84% 92% 100%
          Deleting cell I_SDRAM_TOP/bufbda_G1B1I9 and output net I_SDRAM_TOP/sdram_clk_G1B1I9.
          Total 1 buffers removed (all paths) for clock 'CLK3'
  60. 60. Post Optimization Report
      • After completing the optimization of a clock, the tool reports the new characteristics of the clock tree.
      • This is similar to the information printed before optimization:
          **************************************************
          * Multicorner optimization report (clock CLK3) *
          **************************************************
          Corner 'max'
          Estimated Skew (r/f/b) = (0.041 0.000 0.041)
          Estimated Insertion Delay (r/f/b) = (1.725 -inf 1.725)
          Corner RC-ONLY
          Estimated Skew (r/f/b) = (0.007 0.000 0.007)
          Estimated Insertion Delay (r/f/b) = (0.009 -inf 0.009)
          Wire capacitance = 0.8 pf
          Total capacitance = 2.3 pf
          Max transition = 0.356 ns
          Cells = 24 (area=59.000000)
          Buffers = 23 (area=59.000000)
          Buffer Types
          ============
          bufbd7: 4
          bufbdf: 6
          bufbd4: 5
          bufbd1: 7
          bufbd2: 1
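Since the preoptimization and post optimization reports share a layout, the two can be compared field by field. The Python sketch below parses the max-corner values quoted in this deck and computes the change in each. It is an illustration only: `parse_report` and the `PATTERNS` table are hypothetical names, and the excerpts are trimmed to the max corner so that each pattern matches once.

```python
import re

# Max-corner values from the preoptimization report in this deck.
PRE = """\
Estimated Skew (r/f/b) = (0.073 0.000 0.073)
Estimated Insertion Delay (r/f/b) = (1.903 -inf 1.903)
Max transition = 0.448 ns
Buffers = 23 (area=67.500000)
bufbd2: 1
bufbdf: 8
bufbd7: 5
bufbd4: 3
bufbd1: 6
"""

# Max-corner values from the post optimization report on this slide.
POST = """\
Estimated Skew (r/f/b) = (0.041 0.000 0.041)
Estimated Insertion Delay (r/f/b) = (1.725 -inf 1.725)
Max transition = 0.356 ns
Buffers = 23 (area=59.000000)
bufbd7: 4
bufbdf: 6
bufbd4: 5
bufbd1: 7
bufbd2: 1
"""

# Each pattern captures the last value of the (r/f/b) triple or the scalar.
PATTERNS = {
    "skew": r"^Estimated Skew \(r/f/b\) = \(\S+ \S+ ([\d.]+)\)",
    "insertion_delay": r"^Estimated Insertion Delay \(r/f/b\) = \(\S+ \S+ ([\d.]+)\)",
    "max_transition": r"^Max transition = ([\d.]+) ns",
    "buffer_area": r"^Buffers = \d+ \(area=([\d.]+)\)",
}


def parse_report(text):
    """Extract scalar metrics and per-type buffer counts from a report."""
    values = {}
    for name, pattern in PATTERNS.items():
        values[name] = float(re.search(pattern, text, re.M).group(1))
    values["buffer_count"] = int(re.search(r"^Buffers = (\d+)", text, re.M).group(1))
    values["buffer_types"] = {
        cell: int(n) for cell, n in re.findall(r"^(\w+): (\d+)$", text, re.M)
    }
    return values


before, after = parse_report(PRE), parse_report(POST)
deltas = {name: round(after[name] - before[name], 3) for name in PATTERNS}
print(deltas)
```

In this example the buffer count stays at 23 while the mix of buffer types shifts, and skew, insertion delay, maximum transition, and buffer area all decrease.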
  61. 61. Reporting the Longest and Shortest Paths
      • The longest and shortest paths corresponding to all corners are reported soon after the post optimization report:
          ++ Longest path for clock CLK3 in corner max:
          object fan cap trn inc arr r location
          clk3 (port) 32 0 0 r ( 440 748)
          clk3 (net) 13 97
          …
          I_SDRAM_TOP/I_SDRAM_READ_FIFO/reg_array_reg_3__8_/CP (senrq1) 167 4 289 r ( 521 520)
          ++ Shortest path for clock CLK3 in corner max:
          object fan cap trn inc arr r location
          clk3 (port) 32 0 0 r ( 440 748)
          clk3 (net) 13 97
          …
          I_SDRAM_TOP/I_SDRAM_READ_FIFO/reg_array_reg_4__11_/CP (senrq1) 217 4 247 r ( 687 656)
      • Placement legalization related messages are located at the end of the optimize_clock_tree command log.
  62. 62. Thank you