Clock Skew Versus Data Skew Analysis in Launch to
Capture Flip-Flop Pair Timing Paths to Verify Process
Range Uniformity
Jack Knutson
Independent
Abstract:
An analysis of clock skew is often needed to verify post-routing integrated circuit
integrity and to balance clock trees. When large clock skew has been balanced, such
that capture flip-flops are skewed from launch flip-flops by similar amounts, a further
analysis is necessary to verify process range uniformity. This is because Primetime will
report timing on paths with launch to capture clock skew greater than a clock period
without any further qualitative notification. If one or more timing paths have balanced
clock skew in the launch to capture pair on the order of, or even greater than, a clock
period under worst case process conditions, but balanced clock skew much less than a
clock period under best case conditions, Primetime static timing analysis could pass
timing for a circuit that behaves differently over the process, voltage and temperature range.
This paper presents a case analysis of such a circuit along with a concise TCL script that
analyzes a post-routed netlist with an SDF file and reports any timing paths
where this may be true. The selection of timing paths is easily adjusted: the user
specifies thresholds on the clock skew and the data delay that a path must exceed to be
reported.
There is a temptation to simply insert delays in the data path to resolve hold errors.
Usually this happens when hold checks fail under Best Case (also known as Fast or Minimum)
conditions. But what happens when clock skew has been balanced in a clock
domain with delays greater than the period of the clock itself? With enormous fanout this
is a possibility. A case analysis is presented here of a circuit where this is true under
Worst Case conditions but not under Best Case, when the propagation delays become
much smaller. The goal then becomes to coordinate Best Case and Worst Case timing over
the process range when clock delays are known to be on the order of, or larger than, the
clock period. Primetime makes no special announcement that a timing path is
longer than a clock period. This matters because it is
possible to construct a circuit with input delays such that timing passes Primetime yet the
circuit behaves differently at the logical functional level over the process, voltage and
temperature range.
It is better to minimize clock skew and fix timing errors by adjusting clock skew than to
accommodate clock skew by inserting delays in data paths. This minimizes the sensitivity
to process variation of delay buffers in data paths and reduces gate count. Also,
different delay buffers in the same delay path may not track one another over the process
variation from best case to worst case.
If the clock skew is greater than the clock period, it is especially important to reduce
it to a value less than the period. Otherwise there exists a set of circuits and
constraints that STA (Static Timing Analysis) will report as meeting timing at both
Best Case hold and Worst Case setup, yet these circuits will behave functionally
differently over the process range.
At least some of these are:
(i) A state machine that has an enable that is SYNCHRONOUSLY applied to
state machine DATA enable and even has set_input_delay constraints with
min and max applied relative to the launch clock when the capture clock is
skewed greater than a clock period relative to the launch clock over some
portion of the process variation.
(ii) A state machine that has an enable that is an asynchronous set or reset of a
flip-flop even if the signal applied to that asynchronous input is
SYNCHRONOUSLY applied and the launch to capture skew is as in (i).
(iii) Parallel bus members with different delays that go directly to an output pin
and the launch to capture skew is as in (i).
(i) and (ii) could describe a state machine that misses a branch instruction, as shown
by the State Diagram in Figure 1.
[Figure 1: State Diagram. States: State 0, State 1, State 2, State 3. Transition
labels: Reset = 0; Enable = 1 and Reset = 1; BranchD = 1 (true in Best Case);
BranchD = 0 (true in Worst Case).]
This diagram shows that when the data is delayed longer than the clock period
it effectively slips a cycle and allows an extra clock edge to reach the state machine,
causing a change to a state different from the one reached if the branch instruction
had arrived when it should in State 1.
(iii) would describe a multi-bit bus that had data exiting the chip out of sequence.
(iii) should be caught if a correct set_output_delay relative to the appropriate clock were
applied to all bus members. (i) and (ii) are harder to detect and very likely would slip
through the STA if the specified set_input_delay constraints were in a certain range, which
may very well be the "correct" ones.
Figure 2 shows a schematic of a circuit corresponding to (i).
[Figure 2: Circuit Schematic for STA. Signals shown: Launch Clock, Capture Clock,
Reset, EnableA, EnableB, BranchA, BranchB, BranchC, BranchD. Elements shown: five
D flip-flops (D, Q, SET, CLR), a Data Buffer, a Clock Buffer and the State Machine.]
Constraints and delays annotated on the figure:
Set_input_delay -min 3.30 -clock Launch Clock
Set_input_delay -max 5.90 -clock Launch Clock
Clock Period 6.25 ns
Buffer          Rise Delay          Fall Delay
                BC       WC         BC       WC
Data Buffer     3.71ns   8.16ns     3.27ns   7.87ns
Clock Buffer    3.10ns   6.81ns     3.10ns   6.81ns
Figure 2 is an example schematic of a circuit described in (i) above. The state diagram is
shown in Figure 1. The Worst Case and Best Case waveforms are
shown in Figure 3 and Figure 4. This circuit was analyzed using Primetime
2000.11. Under Worst Case, BranchD is so delayed after EnableB that an extra rising
edge of the Capture Clock occurring in State 1 makes the state machine transition to State
2. This extra rising edge does not occur under Best Case conditions, since BranchD has
arrived in time in State 1 to make the state machine transition to State 3. In Figure 3,
BranchC is delayed from BranchB by 8.16 ns, which is about 1 1/3 clock periods. The Best
Case hold (minimum) check and the Worst Case setup (maximum) check both pass without
violation. Only if a hold (minimum) analysis is done under Worst Case conditions, where
often only a setup (maximum) check is done, will a hold error be flagged against
EnableA.
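As a quick check of the "about 1 1/3 clock periods" figure, the buffer delays from the Figure 2 table can be expressed as multiples of the 6.25 ns clock period. This is only an illustrative sketch: the dictionary layout and the names `DELAYS_NS`, `in_periods` and `exceeds_period` are assumptions made for this example and are not part of the original analysis.

```python
# Values copied from the Figure 2 table; the dictionary layout is only illustrative.
CLOCK_PERIOD_NS = 6.25

DELAYS_NS = {
    ("clock buffer", "BC"): 3.10,
    ("clock buffer", "WC"): 6.81,
    ("data buffer rise", "BC"): 3.71,
    ("data buffer rise", "WC"): 8.16,
}


def in_periods(delay_ns, period_ns=CLOCK_PERIOD_NS):
    """Express a delay as a multiple of the clock period."""
    return delay_ns / period_ns


# True where a single buffer delay is longer than one clock period.
exceeds_period = {key: delay > CLOCK_PERIOD_NS for key, delay in DELAYS_NS.items()}

for (name, corner), delay in DELAYS_NS.items():
    flag = "exceeds period" if exceeds_period[(name, corner)] else "within period"
    print(f"{name:17s} {corner}: {delay:5.2f} ns = {in_periods(delay):.2f} periods ({flag})")
```

Both buffers stay within one period at Best Case and exceed it at Worst Case, which is the situation the case analysis is built around.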
[Figure 3: STA Worst Case Timing. Waveforms: Launch Clock, Capture Clock (WC),
EnableA, EnableB, BranchA, BranchB, BranchC, BranchD (WC) and State Cycle
(State 0, State 1, State 2), plotted against clock edges numbered 0 to 5.
Annotated intervals: 5.9 ns, 8.16 ns and 6.25 ns.]
[Figure 4: STA Best Case Timing. Waveforms: Launch Clock, Capture Clock (BC),
EnableA, EnableB, BranchA, BranchB, BranchC, BranchD (BC) and State Cycle
(State 0, State 1, State 3), plotted against clock edges numbered 0 to 5.]
The netlist corresponding to the schematic is transcribed below as “device_test” in
Verilog.
module device_test (launch_clock, branch, reset, enableA, qout3, qout4);
  input launch_clock;
  input branch;
  input reset;
  input enableA;
  output qout3;
  output qout4;
  wire qout3;
  wire qout4;
  wire capture_clock;
  wire enableB;
  wire branchD;
  wire branchB;
  wire branchC;
  wire andout3;
  wire andout4;
  /** inverting buffers if want to use them for something **/
  /*N1AFP delay_1 (.Z(clockin_delay) ,.A(clockin)); */
  /*N1AFP delay_2 (.Z(dataout1_delay) ,.A(dataout1));*/
  BUFAFP delay_1 (.Z(capture_clock) ,.A(launch_clock));
  FD4QCFP reg_1 (.Q(branchB) ,.D(branch) ,.CP(launch_clock) ,.SD(reset));
  BUFAFP delay_2 (.Z(branchC) ,.A(branchB));
  /*** Delayed clock flip-flop ***/
  FD4QCFP reg_2 (.Q(branchD) ,.D(branchC) ,.CP(capture_clock) ,.SD(reset));
  /*** Non-delayed clock flip-flop ***/
  AND3DFP and_3 (.Z(andout3) ,.A(branchD) ,.B(qout3) ,.C(enableB));
  FD4QCFP reg_3 (.Q(qout3) ,.D(andout3) ,.CP(capture_clock) ,.SD(reset));
  /*** State machine ***/
  AND3DFP and_4 (.Z(andout4) ,.A(andout3) ,.B(qout4) ,.C(enableB));
  FD4QCFP reg_4 (.Q(qout4) ,.D(andout4) ,.CP(capture_clock) ,.SD(reset));
  FD4QCFP reg_5 (.Q(enableB) ,.D(enableA) ,.CP(capture_clock) ,.SD(reset));
endmodule
The Best Case and Worst Case SDF format is transcribed below.
/* PARASITIC FILES USED FOR DELAY PREDICTION :
"
*/
(DELAYFILE
  (SDFVERSION "2.1")
  (DESIGN "delay_test")
  (DATE "05/22/2002 15:24:25")
  (VENDOR "LOGIC")
  (PROGRAM "DELAY")
  (VERSION "62")
  (DIVIDER /)
  (VOLTAGE 2.0, 1.62::)
  (PROCESS "0.80, 1.17::")
  (TEMPERATURE 115.00::)
  (TIMESCALE 1ns)
  (CELL
    (CELLTYPE "device_test")
    (INSTANCE device_test)
    /** (DELAY
      (ABSOLUTE
      )
    ) **/
  )
  (CELL
    /*(CELLTYPE "N1AFP") this is the CLOCK delay */
    (CELLTYPE "BUFAFP")
    (INSTANCE delay_1 )
    (DELAY
      (ABSOLUTE
        /*Best Case*/
        (IOPATH A Z (3.1100:0.0000:3.1100) (3.1100:0.0000:3.1100) )
        /*Worst Case */
        (IOPATH A Z (6.8100:0.0000:6.8100) (6.8100:0.0000:6.8100) )
      )
    )
  )
  (CELL
    /*(CELLTYPE "N1AFP") this is the DATA delay */
    (CELLTYPE "BUFAFP")
    (INSTANCE delay_2 )
    (DELAY
      (ABSOLUTE
        /*Best Case*/
        (IOPATH A Z (3.2700:0.0000:3.2700) (3.7100:0.0000:3.7100) )
        /*Worst Case*/
        (IOPATH A Z (8.1600:0.0000:8.1600) (7.8700:0.0000:7.8700) )
      )
    )
  )
)
The Worst Case Enable to register timing is reported as follows.
Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: max
Point                                       Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)              0.00       0.00
clock network delay (propagated)            0.00       0.00
input external delay                        6.40       6.40 r
enable (in)                                 0.00       6.40 r
reg_5/D (FD4QCFP)                           0.00 *     6.40 r
data arrival time                                      6.40
clock launch_clock (rise edge)              6.25       6.25
clock network delay (propagated)            6.81      13.06
reg_5/CP (FD4QCFP)                                    13.06 r
library setup time                         -0.23      12.83
data required time                                    12.83
---------------------------------------------------------------
data required time                                    12.83
data arrival time                                     -6.40
---------------------------------------------------------------
slack (MET)                                            6.43
The Best Case Enable to register timing is reported as follows:
Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: min
Point                                       Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)              0.00       0.00
clock network delay (propagated)            0.00       0.00
input external delay                        3.30       3.30 r
enable (in)                                 0.00       3.30 r
reg_5/D (FD4QCFP)                           0.00 *     3.30 r
data arrival time                                      3.30
clock launch_clock (rise edge)              0.00       0.00
clock network delay (propagated)            3.11       3.11
reg_5/CP (FD4QCFP)                                     3.11 r
library hold time                           0.15       3.26
data required time                                     3.26
---------------------------------------------------------------
data required time                                     3.26
data arrival time                                     -3.30
---------------------------------------------------------------
slack (MET)                                            0.04
All paths passed timing in all four cases (Best Case Maximum, Best Case Minimum, Worst
Case Maximum and Worst Case Minimum) except one: the Enable to register path under
Worst Case Minimum, shown as follows.
Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: min
Point                                       Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)              0.00       0.00
clock network delay (propagated)            0.00       0.00
input external delay                        3.30       3.30 f
enable (in)                                 0.00       3.30 f
reg_5/D (FD4QCFP)                           0.00 *     3.30 f
data arrival time                                      3.30
clock launch_clock (rise edge)              0.00       0.00
clock network delay (propagated)            6.81       6.81
reg_5/CP (FD4QCFP)                                     6.81 r
library hold time                           0.18       6.99
data required time                                     6.99
---------------------------------------------------------------
data required time                                     6.99
data arrival time                                     -3.30
---------------------------------------------------------------
slack (VIOLATED)                                      -3.69
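The slack values in the three reports above can be reproduced by hand from the numbers they list. The sketch below is a simplified model of the setup and hold arithmetic for an input port to register path, not Primetime's implementation; the function names `setup_slack` and `hold_slack` are assumptions made for this illustration, and all numeric inputs are copied from the reports.

```python
def setup_slack(period, capture_latency, setup_time, input_delay, launch_latency=0.0):
    """Setup (max) slack: data must arrive before the NEXT capture edge minus setup."""
    required = period + capture_latency - setup_time
    arrival = launch_latency + input_delay
    return required - arrival


def hold_slack(capture_latency, hold_time, input_delay, launch_latency=0.0):
    """Hold (min) slack: data must arrive after the SAME capture edge plus hold."""
    required = capture_latency + hold_time
    arrival = launch_latency + input_delay
    return arrival - required


# Worst Case Maximum report: input delay 6.40, clock latency 6.81, setup 0.23
wc_max = setup_slack(period=6.25, capture_latency=6.81, setup_time=0.23, input_delay=6.40)
# Best Case Minimum report: input delay 3.30, clock latency 3.11, hold 0.15
bc_min = hold_slack(capture_latency=3.11, hold_time=0.15, input_delay=3.30)
# Worst Case Minimum report: input delay 3.30, clock latency 6.81, hold 0.18
wc_min = hold_slack(capture_latency=6.81, hold_time=0.18, input_delay=3.30)

for name, slack in [("WC max", wc_max), ("BC min", bc_min), ("WC min", wc_min)]:
    status = "MET" if slack >= 0 else "VIOLATED"
    print(f"{name}: slack {slack:6.2f} ({status})")
```

The same 3.30 ns minimum input delay that clears the Best Case hold check by a small margin falls well short of the hold requirement once the clock latency grows to 6.81 ns at Worst Case.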
A hold error is flagged under Worst Case Slow (WCS) conditions with a Minimum Primetime
analysis for the circuit model in Figure 2, with a "set_input_delay -max" value of 5.9
ns. This value was purposely chosen so that the hold error for Worst Case Minimum would
be caught. This example
shows that, for a "reasonable" looking set of input delay values (set_input_delay max and
min values both less than the appropriate clock period), using Primetime only for a
Worst Case Maximum (setup check) and a Best Case Minimum (hold check) is
not enough. In fact, four runs need to be done to catch this type of circuit behavior:
1. Worst Case Maximum
2. Worst Case Minimum
3. Best Case Maximum
4. Best Case Minimum
Also, if no paths like the one modeled exist (both clock path and data path delayed
longer than a period), then it is not necessary to do the four runs; the traditional two
runs, Worst Case Maximum and Best Case Minimum, are enough. The example also shows that
both maximum and minimum input delay values need to be used on set_input_delay
for I/Os, and not just a single value with no maximum-minimum range specified. For
example, for the model circuit used there is a range from which one can choose a single
set_input_delay value, approximately 6.9 ns to 9.2 ns, which would pass all four cases
above with no timing error reported, because for those values the circuit behavior would
be identical. But this simply ignores the full range of values possible. So if one were to
use a single "set_input_delay 7.2 -clock LaunchClock EnableA", then no timing error
would be reported even when running all four timing cases. Also, if one were to define the
maximum and minimum input delay range as follows:
set_input_delay -min 6.9 -clock LaunchClock EnableA
set_input_delay -max 9.2 -clock LaunchClock EnableA
again no timing error would be flagged for any of the four cases. It then becomes a
matter of interpreting the tool's reports against the input delay values that are really
supposed to be used. This is not a problem if clock and data path delays are not
greater than the clock period.
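The window of input delay values that passes all four runs can be estimated from the report numbers: the lower bound is the largest hold requirement and the upper bound is the smallest setup requirement across both corners. The following is a rough sketch under stated assumptions. The Best Case setup time does not appear in the reports, so the 0.16 ns used here is a placeholder chosen only so that the upper bound lands near the quoted 9.2 ns; the names `CORNERS` and `passing_window` are likewise hypothetical.

```python
PERIOD = 6.25

# Latency and hold values come from the timing reports; WC setup comes from the
# Worst Case Maximum report. The BC setup time is NOT given in the paper and
# 0.16 ns is an assumed placeholder.
CORNERS = {
    "BC": {"latency": 3.11, "hold": 0.15, "setup": 0.16},
    "WC": {"latency": 6.81, "hold": 0.18, "setup": 0.23},
}


def passing_window(period, corners):
    """Return (lower, upper) input delay bounds that meet hold and setup in every corner."""
    lower = max(c["latency"] + c["hold"] for c in corners.values())
    upper = min(period + c["latency"] - c["setup"] for c in corners.values())
    return lower, upper


lower, upper = passing_window(PERIOD, CORNERS)
print(f"input delay must lie between {lower:.2f} ns and {upper:.2f} ns")

# The constraint used in the case analysis has a minimum below the window.
print("min 3.30 ns inside window:", lower <= 3.30 <= upper)
```

Under these numbers the lower bound is set by the Worst Case hold requirement (6.99 ns) and the upper bound by the Best Case setup requirement, which is why the window sits entirely above the 3.30 ns minimum used in the case analysis.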
Also, Primetime assumes a linear variation in delay from Best Case to Worst Case, and
this is the basis for assuming that intermediate timing will also be without
violations. When delays in clock or data are greater than a period, the circuit is
more susceptible to any non-linearities, perhaps resulting from a p-channel dominated
clock path versus an n-channel dominated data path.
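To make the linearity assumption concrete, one can interpolate the clock buffer delay between its Best Case and Worst Case values and find where it crosses the clock period. This is only a sketch of the linear model the text describes, with hypothetical helper names (`interpolate`, `crossing_fraction`); a real process would not necessarily follow this line.

```python
def interpolate(bc, wc, t):
    """Delay at fraction t of the way from Best Case (t=0) to Worst Case (t=1)."""
    return bc + t * (wc - bc)


def crossing_fraction(bc, wc, level):
    """Fraction t at which the interpolated delay equals level, or None if it never does."""
    if bc == wc or not (min(bc, wc) <= level <= max(bc, wc)):
        return None
    return (level - bc) / (wc - bc)


CLOCK_PERIOD = 6.25
BC_CLOCK_DELAY = 3.11  # from the Best Case SDF
WC_CLOCK_DELAY = 6.81  # from the Worst Case SDF

t_cross = crossing_fraction(BC_CLOCK_DELAY, WC_CLOCK_DELAY, CLOCK_PERIOD)
print(f"clock latency reaches one period at t = {t_cross:.3f} of the BC-to-WC range")
```

Under the linear assumption the latency exceeds one period only in roughly the slowest 15% of the range, so the two extreme corners straddle the point where the circuit's behavior changes.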
A TCL script was written that runs in the Primetime environment with a netlist and
SDF file loaded and looks for all circuits in which the clock delay to capture flip-flops is
greater than a programmable threshold AND data delays are greater than another
programmable threshold. A list of such possible circuits is then made and timing analysis
is run as specified by the user. The script is useful since it can identify all such sensitive
circuits. The limitation is that, although all the possibly sensitive circuits are listed, the
user must still interpret the results and make the final determination. The TCL script, entitled
“find_lc_skew_all.pt”, is transcribed below.
### Script to find launch to capture skew greater than a
### clock period when data is also greater than a clock period in delay
### Jack Knutson July 5, 2002
### find_lc_skew_all.pt
### Last edit of File August 1, 2002
###
set iter 0
set inneriter 0
foreach_in_collection wanted_clock [all_clocks] {
    set your_clock [get_clocks $wanted_clock]
    set iter [expr {$iter + 1}]
    echo "iter is " $iter
    set your_clock_name [get_attribute $your_clock full_name]
    set clock_period [get_attribute $your_clock period]
    ### Use the clock fanout (number of register clock pins) as the path limit
    set endPoints [all_registers -clock $your_clock -clock_pins]
    set fanout_this_clock [sizeof_collection $endPoints]
    if {$fanout_this_clock < 1} {
        set fanout_this_clock 1
    }
    foreach_in_collection path [get_timing_paths -from $your_clock -nworst 1 \
            -max_paths $fanout_this_clock] {
        set inneriter [expr {$inneriter + 1}]
        set capture [get_attribute $path endpoint_clock_latency]
        set launch [get_attribute $path startpoint_clock_latency]
        if {$launch >= 0.0 && $capture >= 0.001} {
            set skew_lc [expr {$capture - $launch}]
            set skew_lc_abs [expr {abs($skew_lc)}]
            ### Other examples of criteria for proceeding with analysis
            ###   if {$skew_lc_abs >= $clock_period}
            ###   if {$skew_lc_abs >= 0.5}
            set threshold [expr {$clock_period/1}]
            if {$skew_lc_abs >= $threshold} {
                echo "The clock information is as follows"
                echo "The clock period is " $clock_period
                echo "The clock name is " $your_clock_name
                echo "The clock fanout information is as follows"
                echo "The fanout for the clock is" $fanout_this_clock
                echo "Launch to capture skew is GREATER than the skew threshold"
                echo "The threshold for clock skew comparison is " $threshold
                echo "This is calculated as: clock_period/1 "
                echo [format "%-25s %-25s %-25s" "Launch" "Capture" "Capture to Launch skew"]
                echo [format "%-25s %-25s %-8s" $launch $capture $skew_lc_abs]
                set theArrival [get_attribute $path arrival]
                set startclock [get_attribute $path startpoint_clock]
                set startclock_period [get_attribute $startclock period]
                set startclock_name [get_attribute $startclock full_name]
                set endclock [get_attribute $path endpoint_clock]
                set endclock_period [get_attribute $endclock period]
                set endclock_name [get_attribute $endclock full_name]
                echo [format "%-15s %-15s %-15s %-15s %-15s %-15s" "Arrival Delay for Data" \
                    "Clock Period" "Endclock Period" \
                    "Endclock Name" "Startclock Period" "Startclock Name"]
                echo [format "%-15s %-15s %-15s %-15s %-15s %-15s" $theArrival $clock_period \
                    $endclock_period \
                    $endclock_name $startclock_period $startclock_name]
                set theStartpoint [get_attribute $path startpoint]
                set theEndpoint [get_attribute $path endpoint]
                report_timing -from $theStartpoint -to $theEndpoint -delay_type max
                ### Both the skew and the data arrival exceed one clock period
                if {$skew_lc_abs >= $clock_period && $theArrival >= $clock_period} {
                    echo "L C skew is GREATER than the clock Period and WOW Heavens the data is delayed greater than a period"
                    report_timing -from $theStartpoint -to $theEndpoint -delay_type min_max
                }
            }
            ##### echo [format "%-25s %-25s %-25s" "Launch" "Capture" "Capture to Launch skew"]
            ##### echo [format "%-25s %-25s %-8s" $launch $capture $skew_lc_abs]
        }
    }
}
echo "Final inneriter is"
echo $inneriter
echo "Final iter is"
echo $iter
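The selection logic of the script can be summarized outside Primetime as a filter over path records. The sketch below is a tool-independent restatement, not a replacement for the script: `PathInfo`, `flag_paths` and the two sample paths are hypothetical, and the sample numbers are illustrative apart from the 6.81 ns latency and 6.40 ns arrival taken from the Worst Case Maximum report.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    name: str
    launch_latency: float   # startpoint clock latency, ns
    capture_latency: float  # endpoint clock latency, ns
    arrival: float          # data arrival time, ns


def flag_paths(paths, clock_period, skew_threshold=None):
    """Return (name, abs_skew, severe) for paths whose skew meets the threshold.

    severe is True when both the skew and the data arrival are at least one period.
    """
    if skew_threshold is None:
        skew_threshold = clock_period  # mirrors "clock_period/1" in the script
    flagged = []
    for p in paths:
        # Same guard as the script: skip paths without a meaningful capture latency.
        if not (p.launch_latency >= 0.0 and p.capture_latency >= 0.001):
            continue
        abs_skew = abs(p.capture_latency - p.launch_latency)
        if abs_skew < skew_threshold:
            continue
        severe = abs_skew >= clock_period and p.arrival >= clock_period
        flagged.append((p.name, abs_skew, severe))
    return flagged


sample_paths = [
    PathInfo("enable_to_reg_5", launch_latency=0.0, capture_latency=6.81, arrival=6.40),
    PathInfo("small_skew_path", launch_latency=0.50, capture_latency=0.90, arrival=2.00),
]
result = flag_paths(sample_paths, clock_period=6.25)
print(result)
```

Lowering `skew_threshold` (for example to half a period) widens the set of reported paths, which corresponds to editing the `threshold` line in the script.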
Further development might include other scripts that extend the analysis with additional
threshold indicators which, based on the type of circuit (for example a state machine or
simply a data path), flag whether it may be sensitive to varying behavior. Also worth
investigating are any process parameters and data that may cause a non-linear
propagation delay response between best case and worst case. The circuit becomes
more susceptible to any non-linearities with data delay buffer insertion as opposed to
clock skew reduction. A linear response between the two extremes of best and worst case
is a fundamental assumption behind the validity of static timing analysis conducted only at the
two extremes.
Summary and Conclusions
As the process, voltage and temperature change on a CMOS integrated circuit, the timing
delays vary by a factor of two to nearly three. Even if paths track fairly linearly, if they
have a wide range, an enable signal may reach a state machine, for example, in time for two
clock edges before a strobe signal appears in best case conditions, but only in time for
one clock edge before the strobe signal appears in worst case conditions. A state machine
would go to different states under these two conditions. The problem arises from two
possible sources:
(1) Inadequate modeling of data input. In this case setup/hold timing can pass best
case and worst case.
(2) Process range variation is large.
Running Setup and Hold timing at both worst case and best case conditions may catch
these situations but not necessarily if the set_input_delay values are not correct or if the
clock versus data response does not track linearly as expected relative to one another.
Primetime defines clock skew as the difference in latency between the
two clock inputs of a launch-capture flip-flop pair. This is done regardless of the
magnitude of the latencies and independent of clock period. Another definition of clock
skew is the time difference between a launch rising edge and the next nearest subsequent
rising edge of the clock at the capture flip-flop. When the latency is less than the clock
period these are identical. Let us call the former definition, the one used in Primetime,
the absolute skew, and the latter the relative skew. The maximum data delay between a
launch-capture flip-flop pair that ensures that only one rising edge of the capture clock
occurs before the edge the launched data is set up to is as follows:
Maximum Data Delay = (Capture clock period) + (Relative clock skew) - (set up time)
If the Maximum Data Delay is kept within this constraint, the dangers
described in this paper do NOT occur. The reason is that all setup/hold errors
would then have to be fixed via clock skew adjustment, not data delay buffer insertion, and
extra clock edges could not occur over the process range.
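The Maximum Data Delay formula can be applied to the Worst Case numbers of the example circuit. The sketch below interprets the relative skew as the absolute skew folded into one clock period, which matches the definition above when the launch latency is zero; that interpretation, and the names `relative_skew` and `max_data_delay`, are assumptions made for this illustration.

```python
def relative_skew(absolute_skew_ns, period_ns):
    """Fold a non-negative absolute skew into one clock period."""
    return absolute_skew_ns % period_ns


def max_data_delay(period_ns, rel_skew_ns, setup_ns):
    """Maximum Data Delay = clock period + relative skew - setup time."""
    return period_ns + rel_skew_ns - setup_ns


PERIOD = 6.25
WC_ABSOLUTE_SKEW = 6.81   # capture clock latency at Worst Case, launch latency 0
WC_SETUP = 0.23           # library setup time from the Worst Case Maximum report
WC_DATA_BUFFER = 8.16     # Worst Case rise delay of the data buffer

wc_relative = relative_skew(WC_ABSOLUTE_SKEW, PERIOD)
wc_limit = max_data_delay(PERIOD, wc_relative, WC_SETUP)

print(f"relative skew      : {wc_relative:.2f} ns")
print(f"maximum data delay : {wc_limit:.2f} ns")
print(f"data buffer alone  : {WC_DATA_BUFFER:.2f} ns, exceeds limit: {WC_DATA_BUFFER > wc_limit}")
```

With these values the data buffer delay by itself is already longer than the limit at Worst Case, which is consistent with the extra capture clock edge seen in the Worst Case waveforms.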
There is no reason to delay a clock greater than a period since a phase shift is functional
only modulo 360 degrees. For example, a phase shift of 180 degrees is functionally the
same as a phase shift of 540 degrees and so on. Therefore adding more delay buffers just
increases process variation sensitivity.
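The modulo argument can be checked numerically by converting a clock delay into a phase shift. This is a small illustrative sketch; the helper name `phase_degrees` is hypothetical.

```python
def phase_degrees(delay_ns, period_ns):
    """Phase shift of a delayed clock, reduced modulo 360 degrees."""
    return (delay_ns / period_ns * 360.0) % 360.0


PERIOD = 6.25

# Half a period and one and a half periods give the same functional phase.
half = phase_degrees(3.125, PERIOD)
one_and_half = phase_degrees(9.375, PERIOD)
print(half, one_and_half)

# The Worst Case clock buffer delay of 6.81 ns is a little over one full period.
print(f"{phase_degrees(6.81, PERIOD):.3f} degrees")
```

A delay of 6.81 ns therefore provides the same phase as a delay of about 0.56 ns, at the cost of a full extra period of process-sensitive buffering.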
Perhaps Primetime should flag any clock latencies and clock skew greater than a clock
period as a warning.
In any circuit, but especially those with absolute skew greater than a clock period, when
the setup check is met with a large slack but the hold check fails, the fix can be done with
clock skew minimization, data delay buffer insertion or a combination of the two. Using a
combination, when it is too difficult to meet the clock skew requirement over the process
range, at least minimizes the insertion of data delay buffers. If this
still results in a data delay greater than the Maximum Data Delay defined above, the
designer needs to verify independently that this is acceptable.
Another check on how a path is meeting timing is to compare the setup (i.e., max
delay) timing reports using both relative and absolute skew, when these skew definitions
differ because of clock delays greater than a clock period. If, after fixing a hold
error, the timing path now meets setup only with the absolute skew,
whereas previously it also met timing with the relative skew, the engineer needs to
manually ascertain whether this signal is or may be the enable to a state machine. This
may or may not be acceptable considering the extra clock edges that may occur relative
to other inputs. This relative versus absolute type of analysis is another way of showing
that timing has been fixed with data buffer delays as opposed to by adjusting
clock skew.
 
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...Emmanuel Msumali
 
A Security Analysis of Circuit Clock Obfuscation
A Security Analysis of Circuit Clock ObfuscationA Security Analysis of Circuit Clock Obfuscation
A Security Analysis of Circuit Clock ObfuscationRajeshKumarDattaShao
 
Dalls02950 1
Dalls02950 1Dalls02950 1
Dalls02950 1S BW
 

Similar to Jack_Knutson_SNUG2003_ Copy (20)

IRJET- Study Over Current Relay (MCGG53) Response using Matlab Model
IRJET- Study Over Current Relay (MCGG53) Response using Matlab ModelIRJET- Study Over Current Relay (MCGG53) Response using Matlab Model
IRJET- Study Over Current Relay (MCGG53) Response using Matlab Model
 
IRJET- Metastability Mitigation & Error Masking of High Speed Flip-Flop
IRJET- Metastability Mitigation & Error Masking of High Speed Flip-FlopIRJET- Metastability Mitigation & Error Masking of High Speed Flip-Flop
IRJET- Metastability Mitigation & Error Masking of High Speed Flip-Flop
 
Challenges in verification_of_clock_domains
Challenges in verification_of_clock_domainsChallenges in verification_of_clock_domains
Challenges in verification_of_clock_domains
 
DELAY ERROR WITH META-STABILITY DETECTION AND CORRECTION USING CMOS TRANSMISS...
DELAY ERROR WITH META-STABILITY DETECTION AND CORRECTION USING CMOS TRANSMISS...DELAY ERROR WITH META-STABILITY DETECTION AND CORRECTION USING CMOS TRANSMISS...
DELAY ERROR WITH META-STABILITY DETECTION AND CORRECTION USING CMOS TRANSMISS...
 
Clock distribution in high speed board
Clock distribution in high speed boardClock distribution in high speed board
Clock distribution in high speed board
 
Ericsson SDCCH establishment Issue
Ericsson SDCCH establishment IssueEricsson SDCCH establishment Issue
Ericsson SDCCH establishment Issue
 
Library Characterization Flow
Library Characterization FlowLibrary Characterization Flow
Library Characterization Flow
 
A survey of scan-capture power reduction techniques
A survey of scan-capture power reduction techniquesA survey of scan-capture power reduction techniques
A survey of scan-capture power reduction techniques
 
Timing notes 2006
Timing notes 2006Timing notes 2006
Timing notes 2006
 
Performance Comparison of Various Clock Gating Techniques
Performance Comparison of Various Clock Gating TechniquesPerformance Comparison of Various Clock Gating Techniques
Performance Comparison of Various Clock Gating Techniques
 
Physical design
Physical design Physical design
Physical design
 
Synchronization and timing loop presentation -mapyourtech
Synchronization and timing loop presentation -mapyourtechSynchronization and timing loop presentation -mapyourtech
Synchronization and timing loop presentation -mapyourtech
 
State_Machine_Documentation_ CharlotteQin_Summer2014
State_Machine_Documentation_ CharlotteQin_Summer2014State_Machine_Documentation_ CharlotteQin_Summer2014
State_Machine_Documentation_ CharlotteQin_Summer2014
 
Burst clock controller
Burst clock controllerBurst clock controller
Burst clock controller
 
Optimal Coordination of DOCR for Radial Distribution Systems in Presence of TCSC
Optimal Coordination of DOCR for Radial Distribution Systems in Presence of TCSCOptimal Coordination of DOCR for Radial Distribution Systems in Presence of TCSC
Optimal Coordination of DOCR for Radial Distribution Systems in Presence of TCSC
 
Channelconfih s9
Channelconfih s9Channelconfih s9
Channelconfih s9
 
OPTICAL BURST SWITCHING
OPTICAL BURST SWITCHINGOPTICAL BURST SWITCHING
OPTICAL BURST SWITCHING
 
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...
11gsmpo b-en-gsmgsmnetworksdcchcongestionsolutions-word-201009-130915134031-p...
 
A Security Analysis of Circuit Clock Obfuscation
A Security Analysis of Circuit Clock ObfuscationA Security Analysis of Circuit Clock Obfuscation
A Security Analysis of Circuit Clock Obfuscation
 
Dalls02950 1
Dalls02950 1Dalls02950 1
Dalls02950 1
 

Jack_Knutson_SNUG2003_ Copy

Clock Skew Versus Data Skew Analysis in Launch to Capture Flip-Flop Pair Timing Paths to Verify Process Range Uniformity

Jack Knutson
Independent

Abstract:
An analysis of clock skew is often needed to verify post-routing integrated circuit integrity and to balance clock trees. When large clock skew has been balanced, such that capture flip-flops are skewed from launch flip-flops by similar amounts, a further analysis is necessary to verify process range uniformity. This is because Primetime will report timing on paths with launch to capture clock skew greater than a clock period without any further qualitative notification. If one or more timing paths have balanced clock skew in the launch to capture pair on the order of, or even greater than, a clock period under worst case process conditions, but balanced clock skew much less than a clock period under best case conditions, Primetime static timing analysis could pass timing for a circuit that behaves differently over the process, voltage and temperature range. This paper presents a case analysis of such a circuit along with a concise TCL script that analyzes a post-routed netlist with an SDF file and reports any timing paths where this may be true. The selection of timing paths is also easy to adjust: the user specifies the relationship between clock skew and data delay that a path must satisfy to be reported.

There is a temptation to simply insert delays in the data path to resolve hold errors. Usually this is when hold checks fail under Best Case, also known as Fast or Minimum, conditions. But what happens when clock skew has been balanced in a clock domain with delays greater than the period of the clock itself? With enormous fanout this is a possibility. A case analysis is presented here of a circuit where this is true under Worst Case conditions but not under Best Case, when the propagation delays become much smaller. The goal then becomes to coordinate Best Case to Worst Case timing over the process range when clock delays are known to be on the order of or larger than the clock period. Primetime will not make any special announcement that the timing path is longer than a clock period. This can be a problem, because it is possible to construct a circuit with input delays such that timing passes Primetime yet the circuit behaves differently at the logical, functional level over the process, voltage and temperature range.

It is better to minimize clock skew and fix timing errors by adjusting clock skew than to accommodate clock skew by inserting delays in data paths. This minimizes the sensitivity to process variation of delay buffers in data paths and reduces gate count. Also, different delay buffers in the same delay path may not track each other over the process variation from best case to worst case. If the clock skew is greater than the clock period it is especially important to reduce it. If the clock skew is not reduced to a value less than the period of the clock, there exists a set of circuits and constraints that STA (Static Timing Analysis) will report as meeting timing at both Best Case hold and Worst Case setup, yet these circuits will behave functionally differently over the process range. At least some of these are:

(i) A state machine that has an enable that is SYNCHRONOUSLY applied to the state machine DATA enable, and even has set_input_delay constraints with min and max applied relative to the launch clock, when the capture clock is skewed by more than a clock period relative to the launch clock over some portion of the process variation.
(ii) A state machine that has an enable that is an asynchronous set or reset of a flip-flop, even if the signal applied to that asynchronous input is SYNCHRONOUSLY applied, and the launch to capture skew is as in (i).
(iii) Parallel bus members with different delays that go directly to an output pin, where the launch to capture skew is as in (i).

Cases (i) and (ii) could describe a state machine that missed a branch instruction, such as the one shown by the State Diagram in Figure 1.
[Figure 1: State Diagram. States: State 0, State 1, State 2, State 3. Transition labels: Reset = 0; Enable = 1 and Reset = 1; BranchD = 1 (this is true in Best Case); BranchD = 0 (this is true in Worst Case).]

Figure 1. This diagram shows that when the data is delayed longer than the clock period it effectively slips a cycle and allows an extra clock edge to reach the state machine. The state machine then changes to a state different from the one it would reach if the branch instruction had arrived when it should, in State 1.

Case (iii) would describe a multi-bit bus that had data exiting the chip out of sequence. It should be caught if a correct set_output_delay relative to the appropriate clock were applied to all bus members. Cases (i) and (ii) are harder to detect and would very likely slip through the STA if the specified set_input_delay constraints were in a certain range, which may very well be the "correct" ones. Figure 2 shows a schematic of a circuit corresponding to (i).
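The cycle slip that Figure 1 describes can be illustrated with a small behavioral model. This is only a sketch of the state diagram, not of the gate-level netlist: the function names are hypothetical, and it assumes an active-low reset, one input sample per capture clock edge, and that State 2 and State 3 hold once reached.

```python
# Behavioral sketch of the Figure 1 state diagram (illustrative only).
# Assumptions not stated in the paper: reset is active low, inputs are
# sampled once per capture clock edge, and States 2 and 3 are terminal.

def next_state(state, reset, enable, branch_d):
    """Return the state reached after one capture clock edge."""
    if reset == 0:
        return 0
    if state == 0:
        return 1 if enable == 1 else 0
    if state == 1:
        return 3 if branch_d == 1 else 2
    return state  # States 2 and 3 hold (assumed)


def run(samples):
    """Apply (reset, enable, branch_d) samples in order, starting in State 0."""
    state = 0
    trace = [state]
    for reset, enable, branch_d in samples:
        state = next_state(state, reset, enable, branch_d)
        trace.append(state)
    return trace


# Best Case: BranchD is already 1 at the edge taken while in State 1.
best_case = run([(1, 1, 0), (1, 1, 1)])
# Worst Case: BranchD is still 0 at that edge and rises one edge too late.
worst_case = run([(1, 1, 0), (1, 1, 0), (1, 1, 1)])

print("best case trace: ", best_case)
print("worst case trace:", worst_case)
```

With identical logic and identical stimulus, the two traces end in different states solely because BranchD is sampled one capture edge later in the second run.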
[Figure 2: Circuit Schematic for STA. Elements shown: Launch Clock, Clock Buffer, Capture Clock, Data Buffer, Reset, EnableA, EnableB, BranchA, BranchB, BranchC, BranchD, five flip-flops and the State Machine. Input constraints annotated on the schematic: set_input_delay -min 3.30 -clock Launch Clock and set_input_delay -max 5.90 -clock Launch Clock. Clock Period: 6.25 ns.]

Buffer          Rise Delay BC   Rise Delay WC   Fall Delay BC   Fall Delay WC
Data Buffer     3.71 ns         8.16 ns         3.27 ns         7.87 ns
Clock Buffer    3.10 ns         6.81 ns         3.10 ns         6.81 ns

Figure 2 is an example schematic of a circuit described in (i) above. The state diagram is shown in Figure 1. The Worst Case and Best Case waveforms are shown in Figure 3 and Figure 4. This circuit was analyzed using Primetime 2000.11.

Under Worst Case conditions BranchD is delayed so long after EnableB that an extra rising edge of the Capture Clock occurs in State 1 and makes the state machine transition to State 2. This extra rising edge does not occur under Best Case conditions, since BranchD has arrived in time in State 1 to make the state machine transition to State 3. In Figure 3 BranchC is delayed from BranchB by 8.16 ns, which is about 1 1/3 clock periods. The Best Case hold (minimum) check and the Worst Case setup (maximum) check both pass without violation. Only if a hold (minimum) analysis is done under Worst Case conditions, where often only a setup (maximum) check is done, will a hold error be flagged against EnableA.
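The hold behavior described above follows from simple arithmetic on the numbers in this example. The sketch below is illustrative: the helper names are hypothetical, the hold times (0.15 ns Best Case, 0.18 ns Worst Case) are the ones printed in the Primetime reports later in the paper, and the Best Case clock buffer delay is taken as the 3.11 ns shown in the SDF listing and reports rather than the 3.10 ns in the table.

```python
# Hold check on the EnableA input using the paper's example numbers.
# hold slack = data arrival - (capture clock latency + library hold time)

CLOCK_PERIOD_NS = 6.25
MIN_INPUT_DELAY_NS = 3.30  # set_input_delay -min

CORNERS = {
    "best_case": {"clock_latency_ns": 3.11, "hold_ns": 0.15},
    "worst_case": {"clock_latency_ns": 6.81, "hold_ns": 0.18},
}


def hold_slack(arrival_ns, clock_latency_ns, hold_ns):
    """Positive or zero means the hold check is met."""
    return round(arrival_ns - (clock_latency_ns + hold_ns), 2)


def latency_in_periods(clock_latency_ns, period_ns=CLOCK_PERIOD_NS):
    """Capture clock latency expressed as a fraction of the clock period."""
    return clock_latency_ns / period_ns


for name, corner in CORNERS.items():
    slack = hold_slack(MIN_INPUT_DELAY_NS, **corner)
    ratio = latency_in_periods(corner["clock_latency_ns"])
    status = "MET" if slack >= 0 else "VIOLATED"
    print(f"{name}: latency = {ratio:.2f} periods, hold slack = {slack} ns ({status})")
```

The same minimum input delay that clears the hold check when the capture clock latency is about half a period fails it once the latency exceeds a full period.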
[Figure 3: STA Worst Case Timing. Waveforms shown: Launch Clock, Capture Clock, WC EnableA, EnableB, BranchA, BranchB, BranchC, WC BranchD and the WC State Cycle (State 0, State 1, State 2). Annotated intervals: 5.9 ns, 8.16 ns and 6.25 ns.]
[Figure 4: STA Best Case Timing. Waveforms shown: Launch Clock, Capture Clock, BC EnableA, EnableB, BranchA, BranchB, BranchC, BC BranchD and the BC State Cycle (State 0, State 1, State 3).]

The netlist corresponding to the schematic is transcribed below as "device_test" in Verilog.

module device_test (launch_clock, branch, reset, enableA, qout3, qout4);
  input launch_clock;
  input branch;
  input reset;
  input enableA;
  output qout3;
  output qout4;
  wire qout3;
  wire qout4;
  wire capture_clock;
  wire enableB;
  wire branchD;
  wire branchB;
  wire branchC;
  wire andout3;
  wire andout4;

  /** inverting buffers if want to use them for something **/
  /*N1AFP delay_1 (.Z(clockin_delay) ,.A(clockin)); */
  /*N1AFP delay_2 (.Z(dataout1_delay) ,.A(dataout1));*/

  BUFAFP delay_1 (.Z(capture_clock) ,.A(launch_clock));
  FD4QCFP reg_1 (.Q(branchB) ,.D(branch) ,.CP(launch_clock) ,.SD(reset));
  BUFAFP delay_2 (.Z(branchC) ,.A(branchB));

  /*** Delayed clock flip-flop ***/
  FD4QCFP reg_2 (.Q(branchD) ,.D(branchC) ,.CP(capture_clock) ,.SD(reset));

  /*** Non-delayed clock flip-flop ***/
  AND3DFP and_3 (.Z(andout3) ,.A(branchD) ,.B(qout3) ,.C(enableB));
  FD4QCFP reg_3 (.Q(qout3) ,.D(andout3) ,.CP(capture_clock) ,.SD(reset));

  /*** State machine ***/
  AND3DFP and_4 (.Z(andout4) ,.A(andout3) ,.B(qout4) ,.C(enableB));
  FD4QCFP reg_4 (.Q(qout4) ,.D(andout4) ,.CP(capture_clock) ,.SD(reset));
  FD4QCFP reg_5 (.Q(enableB) ,.D(enableA) ,.CP(capture_clock) ,.SD(reset));
endmodule

The Best Case and Worst Case SDF format is transcribed below, with both sets of IOPATH values shown in one listing.

/* PARASITIC FILES USED FOR DELAY PREDICTION : " */
(DELAYFILE
  (SDFVERSION "2.1")
  (DESIGN "delay_test")
  (DATE "05/22/2002 15:24:25")
  (VENDOR "LOGIC")
  (PROGRAM "DELAY")
  (VERSION "62")
  (DIVIDER /)
  (VOLTAGE 2.0, 1.62::)
  (PROCESS "0.80, 1.17::")
  (TEMPERATURE 115.00::)
  (TIMESCALE 1ns)
  (CELL
    (CELLTYPE "device_test")
    (INSTANCE device_test)
    /** (DELAY (ABSOLUTE ) ) **/
  )
  (CELL
    /*(CELLTYPE "N1AFP") this is the CLOCK delay */
    (CELLTYPE "BUFAFP")
    (INSTANCE delay_1 )
    (DELAY (ABSOLUTE
      /*Best Case*/
      (IOPATH A Z (3.1100:0.0000:3.1100) (3.1100:0.0000:3.1100) )
      /*Worst Case */
      )
      (IOPATH A Z (6.8100:0.0000:6.8100) (6.8100:0.0000:6.8100) )
    )
  )
  (CELL
    /*(CELLTYPE "N1AFP") this is the DATA delay */
    (CELLTYPE "BUFAFP")
    (INSTANCE delay_2 )
    (DELAY (ABSOLUTE
      /*Best Case*/
      (IOPATH A Z (3.2700:0.0000:3.2700) (3.7100:0.0000:3.7100) )
      /*Worst Case*/
      )
      (IOPATH A Z (8.1600:0.0000:8.1600) (7.8700:0.0000:7.8700) )
    )
  )
)

The Worst Case Enable to register timing is reported as follows.

Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: max

Point                                      Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)             0.00       0.00
clock network delay (propagated)           0.00       0.00
input external delay                       6.40       6.40 r
enable (in)                                0.00       6.40 r
reg_5/D (FD4QCFP)                          0.00 *     6.40 r
data arrival time                                     6.40

clock launch_clock (rise edge)             6.25       6.25
clock network delay (propagated)           6.81      13.06
reg_5/CP (FD4QCFP)                                   13.06 r
library setup time                        -0.23      12.83
data required time                                   12.83
---------------------------------------------------------------
data required time                                   12.83
data arrival time                                    -6.40
---------------------------------------------------------------
slack (MET)                                           6.43

The Best Case Enable to register timing is reported as follows.

Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: min

Point                                      Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)             0.00       0.00
clock network delay (propagated)           0.00       0.00
input external delay                       3.30       3.30 r
enable (in)                                0.00       3.30 r
reg_5/D (FD4QCFP)                          0.00 *     3.30 r
data arrival time                                     3.30

clock launch_clock (rise edge)             0.00       0.00
clock network delay (propagated)           3.11       3.11
reg_5/CP (FD4QCFP)                                    3.11 r
library hold time                          0.15       3.26
data required time                                    3.26
---------------------------------------------------------------
data required time                                    3.26
data arrival time                                    -3.30
---------------------------------------------------------------
slack (MET)                                           0.04

All paths for all four cases of Best Case Maximum, Best Case Minimum, Worst Case Maximum and Worst Case Minimum passed timing, except one path when running Worst Case Minimum: the Enable to register path shown as follows.

Startpoint: enable (input port clocked by launch_clock)
Endpoint: reg_5 (rising edge-triggered flip-flop clocked by launch_clock)
Path Group: launch_clock
Path Type: min

Point                                      Incr       Path
---------------------------------------------------------------
clock launch_clock (rise edge)             0.00       0.00
clock network delay (propagated)           0.00       0.00
input external delay                       3.30       3.30 f
enable (in)                                0.00       3.30 f
reg_5/D (FD4QCFP)                          0.00 *     3.30 f
data arrival time                                     3.30

clock launch_clock (rise edge)             0.00       0.00
clock network delay (propagated)           6.81       6.81
reg_5/CP (FD4QCFP)                                    6.81 r
library hold time                          0.18       6.99
data required time                                    6.99
---------------------------------------------------------------
data required time                                    6.99
data arrival time                                    -3.30
---------------------------------------------------------------
slack (VIOLATED)                                     -3.69

A hold error is flagged, with a "set_input_delay -max" value of 5.9 ns, under Worst Case Slow (WCS) conditions with a Minimum Primetime analysis for the circuit model in Figure 2. This value was purposely chosen so that the hold error for Worst Case Minimum would be caught. The example shows that, for a "reasonable" looking set of input delay values (set_input_delay max and min values both less than the clock period), running Primetime only for Worst Case Maximum (setup check) and Best Case Minimum (hold check) is not enough. In fact, four runs need to be done to catch this type of circuit behavior:

1. Worst Case Maximum
2. Worst Case Minimum
3. Best Case Maximum
4. Best Case Minimum

If no paths like the one modeled exist (both clock path and data path delayed longer than a period), then it is not necessary to do the four runs; the traditional two runs, Worst Case Maximum and Best Case Minimum, are enough.

The example also shows that both maximum and minimum input delay values need to be used in set_input_delay for I/Os, and not just a single value with no maximum-minimum range specified. For example, for the model circuit used there is a range of single set_input_delay values, approximately 6.9 ns to 9.2 ns, that would pass all four cases above with no timing error reported, because for those values the circuit behavior would be identical. But this simply ignores the full range of values possible. So if one were to use a single "set_input_delay 7.2 -clock LaunchClock EnableA", no timing error would be reported even when running all four timing cases. Also, if one were to define the maximum and minimum input delay range as follows:

set_input_delay -min 6.9 -clock LaunchClock EnableA
set_input_delay -max 9.2 -clock LaunchClock EnableA

again no timing error would be flagged for any of the four cases. It then becomes a matter of interpreting the tool's reports against the input delay values that are really supposed to be used. This is not a problem if there are no clock and data path delays greater than the clock period.

Also, Primetime assumes a linear variation in delay from Best Case to Worst Case, and this is the basis for assuming that intermediate timing will also be without violations. When delays in clock or data are greater than a period, the circuit is more susceptible to any non-linearities, perhaps resulting from a p-channel dominated clock path versus an n-channel dominated data path.

A TCL script was written that parses the Primetime environment with a netlist and SDF file and looks for all circuits in which the clock delay to the capture flip-flops is greater than a programmable threshold AND the data delays are greater than another programmable threshold. A list of such circuits is then made and timing analysis is run as specified by the user. The script is useful because it can identify all such sensitive circuits. The limitation is that, although all the possibly sensitive circuits are listed, the user must still interpret the results and make the final determination. The TCL script, entitled "find_lc_skew_all.pt", is transcribed below.
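As a reading aid for the listing that follows, the selection rule it applies can be sketched in a few lines of Python. This is not the author's script: the `TimingPath` class and `reports_for` function are hypothetical names, and the example arrival times simply reuse the data buffer delays from the paper while ignoring clock-to-Q delay.

```python
from dataclasses import dataclass


@dataclass
class TimingPath:
    name: str
    launch_latency_ns: float
    capture_latency_ns: float
    arrival_ns: float
    clock_period_ns: float


def reports_for(path, skew_threshold_ns=None):
    """Return the report types the selection rule would request for a path.

    The skew threshold defaults to one clock period, matching the
    clock_period/1 setting in the listing.
    """
    threshold = path.clock_period_ns if skew_threshold_ns is None else skew_threshold_ns
    reports = []
    # Same guard as the listing: skip paths without usable latencies.
    if not (path.launch_latency_ns >= 0.0 and path.capture_latency_ns >= 0.001):
        return reports
    skew = abs(path.capture_latency_ns - path.launch_latency_ns)
    if skew >= threshold:
        reports.append("max")
        if skew >= path.clock_period_ns and path.arrival_ns >= path.clock_period_ns:
            reports.append("min_max")
    return reports


# Hypothetical example paths built from the paper's buffer delays.
paths = [
    TimingPath("reg_1 to reg_2, worst case", 0.0, 6.81, 8.16, 6.25),
    TimingPath("reg_1 to reg_2, best case", 0.0, 3.11, 3.71, 6.25),
    TimingPath("large skew, short data path", 0.0, 6.81, 2.00, 6.25),
]
for p in paths:
    print(p.name, "->", reports_for(p))
```

Lowering `skew_threshold_ns` widens the net, which corresponds to editing the threshold expression in the listing.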
### Script to find launch to capture skew greater than a
### clock period when data is also greater than a clock period in delay
### Jack Knutson July 5, 2002
### find_lc_skew_all.pt
### Last edit of File August 1, 2002
###
set iter 0
set inneriter 0
foreach_in_collection wanted_clock [all_clocks] {
    set your_clock [get_clocks $wanted_clock]
    ###########
    set iter [expr {$iter +1}]
    echo "iter is " $iter
    set your_clock_name [ get_attribute $your_clock full_name ]
    set clock_period [ get_attribute $your_clock period ]
    ###########
    set endPoints [all_registers -clock $your_clock -clock_pins]
    set fanout_this_clock [sizeof_collection $endPoints]
    if {[expr {$fanout_this_clock < 1 }]} {
        set fanout_this_clock 1
    }
    foreach_in_collection path [get_timing_paths -from $your_clock -nworst 1 -max_paths $fanout_this_clock ] {
        set inneriter [expr {$inneriter +1}]
        set capture [get_attribute $path endpoint_clock_latency]
        set launch [get_attribute $path startpoint_clock_latency]
        if {[expr {$launch >= 0.0 }] && [expr {$capture >= 0.001 }] } {
            set skew_lc [expr { $capture - $launch }]
            set skew_lc_abs [expr {abs($skew_lc)}]
            ###Other examples of criteria for proceeding with analysis
            ### if {[expr {$skew_lc_abs >= $clock_period }]}
            ### if {[expr {$skew_lc_abs >= 0.5 }]}
            set threshold [expr { $clock_period/1 }]
            if {[expr {$skew_lc_abs >= $threshold }]} {
                echo "The clock information is as follows"
                echo "The clock period is " $clock_period
                echo "The clock name is " $your_clock_name
                echo "The clock fanout information is as follows"
                echo "The fanout for the clock is" $fanout_this_clock
                echo [format "Launch to capture skew is GREATER than the skew threshold"]
                echo "The threshold for clock skew comparison is " $threshold
                echo "This is calculated as: clock_period/1 "
                echo [format "%-25s %-25s %-25s" "Launch" "Capture" "Capture to Launch skew" ]
                echo [format "%-25s %-25s %-8s" $launch $capture $skew_lc_abs ]
                set theArrival [get_attribute $path arrival ]
                set startclock [get_attribute $path startpoint_clock ]
                set startclock_period [get_attribute $startclock period ]
                set startclock_name [get_attribute $startclock full_name ]
                set endclock [get_attribute $path endpoint_clock ]
                set endclock_period [get_attribute $endclock period ]
                set endclock_name [get_attribute $endclock full_name ]
                echo [format "%-15s %-15s %-15s %-15s %-15s %-15s" \
                    "Arrival Delay for Data" "Clock Period" "Endclock Period" \
                    "Endclock Name" "Startclock Period" "Startclock Name" ]
                echo [format "%-15s %-15s %-15s %-15s %-15s %-15s" \
                    $theArrival $clock_period $endclock_period \
                    $endclock_name $startclock_period $startclock_name]
                set theStartpoint [get_attribute $path startpoint]
                set theEndpoint [get_attribute $path endpoint]
                report_timing -from $theStartpoint -to $theEndpoint -delay_type max
                if { [expr {$skew_lc_abs >= $clock_period }] && [expr {$theArrival >= $clock_period }] } {
                    echo [format "L C skew is GREATER than the clock Period and WOW Heavens the data is delayed greater than a period "]
                    report_timing -from $theStartpoint -to $theEndpoint -delay_type min_max
                }
            }
            ##### echo [format "%-25s %-25s %-25s" "Launch" "Capture" "Capture to Launch skew" ]
            ##### echo [format "%-25s %-25s %-8s" $launch $capture $skew_lc_abs ]
        }
    }
}
echo "Final inneriter is"
echo $inneriter
echo "Final iter is"
echo $iter

Further development might include other scripts with extended analysis and other threshold indicators that, based on the type of circuit (for example a state machine or simply a data path), flag whether it may be sensitive to varying behavior. Also worth investigating are any process parameters and data that may cause a non-linear propagation delay response between best case and worst case. A circuit becomes more susceptible to non-linearities when timing is fixed with data delay buffer insertion as opposed to clock skew reduction. A linear response between the two extremes of best and worst case is a fundamental assumption behind the validity of static timing analysis conducted only at the two extremes.

Summary and Conclusions

As the process, voltage and temperature change on a CMOS integrated circuit, the timing delays vary by a factor of two to nearly three. Even if paths track fairly linearly, if they have a wide range, an enable signal may reach a state machine, for example, in time for two clock edges before a strobe signal appears in best case conditions, but only in time for one clock edge before the strobe signal appears in worst case conditions. A state machine would go to different states under these two conditions. The problem can arise from two possible sources:
(1) Inadequate modeling of data input. In this case setup/hold timing can pass best case and worst case.
(2) Process range variation is large.

Running setup and hold timing at both worst case and best case conditions may catch these situations, but not necessarily if the set_input_delay values are not correct or if the clock and data responses do not track linearly relative to one another as expected.

Primetime defines clock skew as the time difference in the latencies between the two clock inputs of a launch-capture flip-flop pair. This is done regardless of the magnitude of the latencies and independent of the clock period. Another definition of clock skew is the time difference between a launch rising edge and the nearest subsequent rising edge of the clock at the capture flip-flop. When the latency is less than the clock period these are identical. Let us call the former, the one used in Primetime, the absolute skew, and the latter the relative skew. The maximum delay between a launch-capture flip-flop pair that ensures that only one rising edge of the capture clock occurs before the edge the launched data is set up to is as follows:

Maximum Data Delay = (Capture clock period) + (Relative clock skew) - (setup time)

If the Maximum Data Delay were kept within this constraint, the dangers described in this paper would NOT occur. The reason is that all setup/hold errors would have to be fixed via clock skew adjustment, not data delay buffer insertion, and extra clock edges over the process range would not be able to occur. There is no reason to delay a clock by more than a period, since a phase shift is functional only modulo 360 degrees. For example, a phase shift of 180 degrees is functionally the same as a phase shift of 540 degrees, and so on. Adding more delay buffers therefore just increases process variation sensitivity.
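The Maximum Data Delay constraint can be evaluated on the example circuit's Worst Case numbers. The sketch below is illustrative and rests on one assumption beyond the text: for a non-negative absolute skew, the relative skew is taken as the absolute skew modulo the clock period, which agrees with the statement that the two definitions coincide when the latency is less than a period. The function names are hypothetical, and the 0.23 ns setup time is the one printed in the Worst Case setup report.

```python
# Maximum Data Delay = clock period + relative skew - setup time

def relative_skew(absolute_skew_ns, period_ns):
    """Relative skew for a non-negative absolute skew (assumed: modulo period)."""
    return absolute_skew_ns % period_ns


def max_data_delay(period_ns, absolute_skew_ns, setup_ns):
    """Upper bound on data delay under the paper's constraint."""
    return period_ns + relative_skew(absolute_skew_ns, period_ns) - setup_ns


period = 6.25            # ns, clock period
wc_absolute_skew = 6.81  # ns, Worst Case capture clock buffer delay
wc_setup = 0.23          # ns, Worst Case library setup time
wc_data_delay = 8.16     # ns, Worst Case data buffer rise delay

limit = max_data_delay(period, wc_absolute_skew, wc_setup)
print(f"relative skew      = {relative_skew(wc_absolute_skew, period):.2f} ns")
print(f"maximum data delay = {limit:.2f} ns")
print(f"data buffer delay  = {wc_data_delay:.2f} ns, exceeds limit: {wc_data_delay > limit}")
```

Under these assumptions the Worst Case data buffer delay alone is larger than the bound, which is consistent with the extra capture edge seen in the Worst Case waveforms.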
Perhaps Primetime should flag any clock latency or clock skew greater than a clock period as a warning. In any circuit, but especially those with absolute skew greater than a clock period, when the setup check is met with a large slack but the hold check fails, the fix can be made with clock skew minimization, data delay buffer insertion, or a combination of the two. Using a combination, when it is too difficult to meet the clock skew requirement over the process range, at least minimizes the insertion of data delay buffers. If this still results in a data delay greater than the Maximum Data Delay defined above, the designer needs to verify independently that this is acceptable.

Another way to check how a path is meeting timing is to compare the setup (max delay) timing reports using both relative and absolute skew, when these skew definitions differ because of clock delays greater than a clock period. If, after fixing a hold error, the timing path now meets setup only with the absolute skew whereas previously it also met timing with the relative skew, the engineer needs to manually ascertain whether this signal is or may be the enable to a state machine. This may or may not be acceptable, considering the extra clock edges that may occur relative to other inputs. This relative versus absolute type of analysis is another way of showing that timing has been fixed with data buffer delays as opposed to by adjusting clock skew.