This article is licensed under a Creative Commons Attribution 4.0 International License.
Timing derating means adding an extra margin to static timing analysis (STA) to accommodate variation in the timing parameters of gates (as they were characterized in a timing library). Timing libraries are characterized for a particular operating condition, which is a combination of process, voltage and temperature (PVT for short).
However, the real operating conditions will certainly differ from the ones used for STA (e.g. the supply voltage fluctuates, it drops across the power distribution network and in response to current draw peaks, the manufacturing process fluctuates within a fab, parameters change from wafer to wafer, etc.). This variation of operating conditions has a global and a local component.
The global variation is countered by running STA at the corner conditions that form a bounding "box" around where the real silicon is to operate. Even with the worst-case and best-case shifts in operating conditions accounted for, every chip still suffers from local variations (a.k.a. on-chip variation or OCV), meaning that some gates will be a little faster, some a little slower, and some will see a slightly higher temperature or a slightly lower voltage.
To account for the local variation, designers add extra margin to make sure the performance of the chip stays on the safe side. The simplest means is to scale (or de-rate) the timing from a gate library to yield more pessimistic timing. Hence the term timing derating.
Note
The examples in this text were run with PrimeTime S-2021.06-SP5 and Tempus 21.14.
The easiest way to derate STA timing is to apply a fixed scaling factor to the timing paths so that slow paths become slower and fast paths become faster. That is what the `set_timing_derate -early|-late` SDC command does. By default, all paths have a scaling factor of 1.0 (shown by PrimeTime `report_timing -derate`). To make slow (late) paths 10% slower, we use `set_timing_derate -late 1.1`. Note that the late path for the setup check is the data path and for the hold check the clock path:
Setup check with the default setting (no derate):

    pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
                  -derate

    Last common pin: clk
    Path Group: CLK
    Path Type: max

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.000     5.000
    FF1/CK (dffprqx05_d)                          0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    3.000     8.000 f
    BUF1/A (bufx10_d)                    1.000    0.000     8.000 f
    BUF1/Q (bufx10_d)                    1.000    1.000     9.000 f
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    12.000 f
    data arrival time                                      12.000

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.000    15.000
    clock uncertainty                            -0.500    14.500
    FF2/CK (dffprqx05_d)                                   14.500 r
    library setup time                   1.000   -0.400    14.100
    data required time                                     14.100
    -------------------------------------------------------------
    data required time                                     14.100
    data arrival time                                     -12.000
    -------------------------------------------------------------
    slack (MET)                                             2.100

Setup check with the late derate:

    pt_shell> set_timing_derate -late 1.1
    pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
                  -derate

    Last common pin: CKBUF2/Q
    Path Group: CLK
    Path Type: max

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.500     5.500
    FF1/CK (dffprqx05_d)                          0.000     5.500 r
    FF1/Q (dffprqx05_d) <-               1.100    3.300     8.800 f
    BUF1/A (bufx10_d)                    1.100    0.000     8.800 f
    BUF1/Q (bufx10_d)                    1.100    1.100     9.900 f
    ...
    FF2/D (dffprqx05_d)                  1.100    0.000    13.200 f
    data arrival time                                      13.200

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.200    15.200
    clock uncertainty                            -0.500    14.700
    FF2/CK (dffprqx05_d)                                   14.700 r
    library setup time                   1.000   -0.400    14.300
    data required time                                     14.300
    -------------------------------------------------------------
    data required time                                     14.300
    data arrival time                                     -13.200
    -------------------------------------------------------------
    slack (MET)                                             1.100

Hold check with the default setting (no derate):

    pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
                  -derate -path_type full_clock

    Last common pin: clk
    Path Group: CLK
    Path Type: min

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF1/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    2.000     7.000 r
    BUF1/A (bufx10_d)                    1.000    0.000     7.000 r
    BUF1/Q (bufx10_d)                    1.000    1.000     8.000 r
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    11.000 r
    data arrival time                                      11.000

    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF2/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    clock reconvergence pessimism                 0.000     5.000
    clock uncertainty                             0.500     5.500
    library hold time                    1.000    0.300     5.800
    data required time                                      5.800
    -------------------------------------------------------------
    data required time                                      5.800
    data arrival time                                     -11.000
    -------------------------------------------------------------
    slack (MET)                                             5.200

Hold check with the late derate:

    pt_shell> set_timing_derate -late 1.1
    pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
                  -derate -path_type full_clock

    Last common pin: CKBUF2/Q
    Path Group: CLK
    Path Type: min

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF1/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    2.000     7.000 r
    BUF1/A (bufx10_d)                    1.000    0.000     7.000 r
    BUF1/Q (bufx10_d)                    1.000    1.000     8.000 r
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    11.000 r
    data arrival time                                      11.000

    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.100    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.100    1.100     1.100 r
    ...
    FF2/CK (dffprqx05_d)                 1.100    0.000     5.500 r
    clock reconvergence pessimism                -0.200     5.300
    clock uncertainty                             0.500     5.800
    library hold time                    1.000    0.300     6.100
    data required time                                      6.100
    -------------------------------------------------------------
    data required time                                      6.100
    data arrival time                                     -11.000
    -------------------------------------------------------------
    slack (MET)                                             4.900
There are a few things to note in the above example:
- Only the late path is scaled (i.e. the data path for `max` timing and the capture clock path for `min` timing).
- (Clock) cells that appear in both the late and the early path are scaled only in the late path (see the `min` path timing, which used the `-path_type full_clock` option). The STA tool removes clock reconvergence pessimism (CRP) to counter this effect: it adjusts the capture clock path by the early/late difference accumulated on the common clock path segment, adding it for the setup check and subtracting it for the hold check.
- Slack reduction (due to being 10% more pessimistic) is more pronounced for the setup timing than for the hold timing. As we will see in the next example, the impact reverses for the early derate, so that hold timing is impacted more and setup timing less.
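The CRP entries in the late-derate reports can be checked by hand. The worked arithmetic below assumes that the clock segment shared by the launch and capture paths (up to the last common pin `CKBUF2/Q`) has an un-derated delay of 2.000. That figure is inferred from the reports, where the `CKBUF2` rows are elided, and the symbols $d_{\mathrm{common}}$, $k_{\mathrm{late}}$ and $k_{\mathrm{early}}$ are notation introduced here rather than tool terminology.

```latex
\begin{aligned}
\mathrm{CRP} &= d_{\mathrm{common}} \cdot (k_{\mathrm{late}} - k_{\mathrm{early}})
              = 2.000 \cdot (1.1 - 1.0) = 0.200 \\
\text{setup capture clock: } & 15.000 + 0.200 = 15.200 \\
\text{hold capture clock: }  & 5.500 - 0.200 = 5.300
\end{aligned}
```

Both results match the `clock reconvergence pessimism` rows of the derated setup and hold reports.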
To also scale the early path, we need a second `set_timing_derate` command. However, as the early path is the fast one, we need to scale it down to make it faster. Hence `set_timing_derate -early 0.9` gives a 10% margin. The reports below show the early derate vs. the default no-scaling baseline:
Setup check with the default setting (no derate):

    pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
                  -derate

    Last common pin: clk
    Path Group: CLK
    Path Type: max

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.000     5.000
    FF1/CK (dffprqx05_d)                          0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    3.000     8.000 f
    BUF1/A (bufx10_d)                    1.000    0.000     8.000 f
    BUF1/Q (bufx10_d)                    1.000    1.000     9.000 f
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    12.000 f
    data arrival time                                      12.000

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.000    15.000
    clock uncertainty                            -0.500    14.500
    FF2/CK (dffprqx05_d)                                   14.500 r
    library setup time                   1.000   -0.400    14.100
    data required time                                     14.100
    -------------------------------------------------------------
    data required time                                     14.100
    data arrival time                                     -12.000
    -------------------------------------------------------------
    slack (MET)                                             2.100

Setup check with the early derate:

    pt_shell> set_timing_derate -early 0.9
    pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
                  -derate

    Last common pin: CKBUF2/Q
    Path Group: CLK
    Path Type: max

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.000     5.000
    FF1/CK (dffprqx05_d)                          0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    3.000     8.000 f
    BUF1/A (bufx10_d)                    1.000    0.000     8.000 f
    BUF1/Q (bufx10_d)                    1.000    1.000     9.000 f
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    12.000 f
    data arrival time                                      12.000

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              4.500    14.500
    clock reconvergence pessimism                 0.200    14.700
    clock uncertainty                            -0.500    14.200
    FF2/CK (dffprqx05_d)                                   14.200 r
    library setup time                   1.000   -0.400    13.800
    data required time                                     13.800
    -------------------------------------------------------------
    data required time                                     13.800
    data arrival time                                     -12.000
    -------------------------------------------------------------
    slack (MET)                                             1.800

Hold check with the default setting (no derate):

    pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
                  -derate -path_type full_clock

    Last common pin: clk
    Path Group: CLK
    Path Type: min

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF1/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    FF1/Q (dffprqx05_d) <-               1.000    2.000     7.000 r
    BUF1/A (bufx10_d)                    1.000    0.000     7.000 r
    BUF1/Q (bufx10_d)                    1.000    1.000     8.000 r
    ...
    FF2/D (dffprqx05_d)                  1.000    0.000    11.000 r
    data arrival time                                      11.000

    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF2/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    clock reconvergence pessimism                 0.000     5.000
    clock uncertainty                             0.500     5.500
    library hold time                    1.000    0.300     5.800
    data required time                                      5.800
    -------------------------------------------------------------
    data required time                                      5.800
    data arrival time                                     -11.000
    -------------------------------------------------------------
    slack (MET)                                             5.200

Hold check with the early derate:

    pt_shell> set_timing_derate -early 0.9
    pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
                  -derate -path_type full_clock

    Last common pin: CKBUF2/Q
    Path Group: CLK
    Path Type: min

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  0.900    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  0.900    0.900     0.900 r
    ...
    FF1/CK (dffprqx05_d)                 0.900    0.000     4.500 r
    FF1/Q (dffprqx05_d) <-               0.900    1.800     6.300 r
    BUF1/A (bufx10_d)                    0.900    0.000     6.300 r
    BUF1/Q (bufx10_d)                    0.900    0.900     7.200 r
    ...
    FF2/D (dffprqx05_d)                  0.900    0.000     9.900 r
    data arrival time                                       9.900

    clock CLK (rise edge)                         0.000     0.000
    clock source latency                          0.000     0.000
    clk (in)                                      0.000     0.000 r
    CKBUF1/A (bufx10_d)                  1.000    0.000     0.000 r
    CKBUF1/Q (bufx10_d)                  1.000    1.000     1.000 r
    ...
    FF2/CK (dffprqx05_d)                 1.000    0.000     5.000 r
    clock reconvergence pessimism                -0.200     4.800
    clock uncertainty                             0.500     5.300
    library hold time                    1.000    0.300     5.600
    data required time                                      5.600
    -------------------------------------------------------------
    data required time                                      5.600
    data arrival time                                      -9.900
    -------------------------------------------------------------
    slack (MET)                                             4.300
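The slack values in the reports above can be reproduced with a few lines of arithmetic. The Python sketch below is a simplified model for illustration, not how PrimeTime computes timing: the `Path` class and the function names are made up here, the delays are read off the baseline reports, and the 2.0 delay of the common clock segment is an assumption inferred from the reported CRP values.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Un-derated delays read off the baseline reports (same units as the reports)."""
    launch_clock: float = 5.0    # clock network delay to FF1/CK
    capture_clock: float = 5.0   # clock network delay to FF2/CK
    common_clock: float = 2.0    # ASSUMPTION: shared segment up to CKBUF2/Q
    data_max: float = 7.0        # FF1/CK -> FF2/D on the setup (max) path
    data_min: float = 6.0        # FF1/CK -> FF2/D on the hold (min) path
    period: float = 10.0
    uncertainty: float = 0.5
    setup: float = 0.4
    hold: float = 0.3


def crp(p: Path, early: float, late: float) -> float:
    """Pessimism on the common clock segment, which cannot be early and late at once."""
    return p.common_clock * (late - early)


def setup_slack(p: Path, early: float = 1.0, late: float = 1.0) -> float:
    arrival = late * (p.launch_clock + p.data_max)        # late launch clock + data
    required = (p.period + early * p.capture_clock        # early capture clock
                + crp(p, early, late) - p.uncertainty - p.setup)
    return round(required - arrival, 3)


def hold_slack(p: Path, early: float = 1.0, late: float = 1.0) -> float:
    arrival = early * (p.launch_clock + p.data_min)       # early launch clock + data
    required = (late * p.capture_clock                    # late capture clock
                - crp(p, early, late) + p.uncertainty + p.hold)
    return round(arrival - required, 3)


p = Path()
print(setup_slack(p), hold_slack(p))                        # 2.1 5.2 (baseline)
print(setup_slack(p, late=1.1), hold_slack(p, late=1.1))    # 1.1 4.9 (-late 1.1)
print(setup_slack(p, early=0.9), hold_slack(p, early=0.9))  # 1.8 4.3 (-early 0.9)
```

The model makes the asymmetry visible: a late derate scales the whole launch-plus-data path of the setup check but only the capture clock of the hold check, and an early derate does the opposite.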
The idea of timing derating is intuitive, but it gets more complex in real applications. Different IPs may need to be scaled differently, simply because some already have timing margins incorporated. Certain IP instances may need tighter or more relaxed margins. Nets may derate differently than cell delays. Different cell timing parameters may scale differently (e.g. setup/hold vs. cell delay). Some timing derates should be applied incrementally, which is better done with additive margins than with multiplicative scale factors.
STA engines support these cases through various `set_timing_derate` options. Yet there are slight differences in how they do so, and in how they prioritize or combine derate factors that target the same instance through derates associated with different design objects (i.e. instances vs. libraries vs. design).
PrimeTime can set timing derates for different design objects and uses a "more specific derate wins" precedence rule. Hence a derate applied to the design (i.e. without specifying a libcell or an instance) applies to all instances. It may be overridden by a more specific derate set on a libcell, which may in turn be overridden by a derate set on a particular instance. For example:
    pt_shell> set_timing_derate -cell_delay -late 1.1
    pt_shell> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    pt_shell> report_timing_derate

                              ----- Clock ------          ------ Data ------
                              Rise          Fall          Rise          Fall
                           Early   Late  Early   Late  Early   Late  Early   Late
    -----------------------------------------------------------------------------
    design: circ
    Net delay static          --     --     --     --     --     --     --     --
    Net delay dynamic         --     --     --     --     --     --     --     --
    Cell delay                --  1.100     --  1.100     --  1.100     --  1.100
    Cell check                --     --     --     --     --     --     --     --

    cell (leaf): INV2
    Cell delay                --  1.300     --  1.300     --  1.300     --  1.300

    lib_cell: testlib01/invx05_d
    Cell delay                --  1.200     --  1.200     --  1.200     --  1.200

    pt_shell> report_timing -delay_type max -derate -nosplit

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.700     5.700
    FF1/CK (dffprqx05_d)                          0.000     5.700 r
    FF1/Q (dffprqx05_d)                  1.100    3.300     9.000 f
    BUF1/A (bufx10_d)                    1.000    0.000     9.000 f
    BUF1/Q (bufx10_d)                    1.100    1.100    10.100 f  <--- all cells are derated by at least the global 1.1 factor
    INV2/A (invx05_d)                    1.000    0.000    10.100 f
    INV2/Q (invx05_d)                    1.300    1.300    11.400 r  <--- INV2 cell is specifically derated by 1.3 factor
    INV3/A (invx05_d)                    1.000    0.000    11.400 r
    INV3/Q (invx05_d)                    1.200    1.200    12.600 f  <--- all other invx05_d cells are derated by 1.2 factor
    BUF4/A (bufx10_d)                    1.000    0.000    12.600 f
    BUF4/Q (bufx10_d)                    1.100    1.100    13.700 f
    FF2/D (dffprqx05_d)                  1.000    0.000    13.700 f
    data arrival time                                      13.700

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.200    15.200
    clock uncertainty                            -0.500    14.700
    FF2/CK (dffprqx05_d)                                   14.700 r
    library setup time                   1.000   -0.400    14.300
    data required time                                     14.300
    -------------------------------------------------------------
    data required time                                     14.300
    data arrival time                                     -13.700
    -------------------------------------------------------------
    slack (MET)                                             0.600
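The precedence rule can be written down as a short lookup. The Python sketch below is a hypothetical illustration, not PrimeTime's implementation: `resolve_derate` and the data structures are invented here, and the un-derated cell delays are assumed to be the `Incr` values divided by the `Derate` values in the report above.

```python
# Late cell-delay derates from the example, keyed by the object they were set on.
DESIGN_DERATE = 1.1
LIBCELL_DERATES = {"invx05_d": 1.2}
INSTANCE_DERATES = {"INV2": 1.3}

# (instance, library cell, un-derated cell delay) along the reported data path.
# ASSUMPTION: un-derated delay = Incr / Derate from the report.
DATA_PATH = [
    ("FF1", "dffprqx05_d", 3.0),
    ("BUF1", "bufx10_d", 1.0),
    ("INV2", "invx05_d", 1.0),
    ("INV3", "invx05_d", 1.0),
    ("BUF4", "bufx10_d", 1.0),
]


def resolve_derate(instance: str, libcell: str) -> float:
    """Return the most specific derate: instance, then library cell, then design."""
    if instance in INSTANCE_DERATES:
        return INSTANCE_DERATES[instance]
    if libcell in LIBCELL_DERATES:
        return LIBCELL_DERATES[libcell]
    return DESIGN_DERATE


def derated_path_delay(path) -> float:
    """Sum the cell delays, each scaled by its resolved derate."""
    return round(sum(d * resolve_derate(inst, lib) for inst, lib, d in path), 3)


for inst, lib, _ in DATA_PATH:
    print(inst, resolve_derate(inst, lib))
print(derated_path_delay(DATA_PATH))  # 8.0 = data arrival 13.700 - launch clock 5.700
```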
In addition to the scale factors, we can define incremental adjustments through `set_timing_derate -increment`. Notice that the incremental derates are not reported by `report_timing_derate` but only by `report_timing_derate -increment`:
    pt_shell> set_timing_derate -cell_delay -late 0.01 -increment
    pt_shell> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
    pt_shell> report_timing_derate

                              ----- Clock ------          ------ Data ------
                              Rise          Fall          Rise          Fall
                           Early   Late  Early   Late  Early   Late  Early   Late
    -----------------------------------------------------------------------------
    design: circ
    Net delay static          --     --     --     --     --     --     --     --
    Net delay dynamic         --     --     --     --     --     --     --     --
    Cell delay                --  1.100     --  1.100     --  1.100     --  1.100
    Cell check                --     --     --     --     --     --     --     --

    cell (leaf): INV2
    Cell delay                --  1.300     --  1.300     --  1.300     --  1.300

    lib_cell: testlib01/invx05_d
    Cell delay                --  1.200     --  1.200     --  1.200     --  1.200

    pt_shell> report_timing_derate -increment

                              ----- Clock ------          ------ Data ------
                              Rise          Fall          Rise          Fall
                           Early   Late  Early   Late  Early   Late  Early   Late
    -----------------------------------------------------------------------------
    design: circ
    Net delay static          --     --     --     --     --     --     --     --
    Net delay dynamic         --     --     --     --     --     --     --     --
    Cell delay                --  0.010     --  0.010     --  0.010     --  0.010
    Cell check                --     --     --     --     --     --     --     --

    cell (leaf): INV3
    Cell delay                --  0.030     --  0.030     --  0.030     --  0.030

    lib_cell: testlib01/invx05_d
    Cell delay                --  0.020     --  0.020     --  0.020     --  0.020

    pt_shell> report_timing -delay_type max -derate -nosplit

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              5.770     5.770
    FF1/CK (dffprqx05_d)                          0.000     5.770 r
    FF1/Q (dffprqx05_d)                  1.110    3.330     9.100 f  <--- all cells have scale factor increased by at least 0.01
    BUF1/A (bufx10_d)                    1.000    0.000     9.100 f
    BUF1/Q (bufx10_d)                    1.110    1.110    10.210 f
    INV2/A (invx05_d)                    1.000    0.000    10.210 f
    INV2/Q (invx05_d)                    1.320    1.320    11.530 r  <--- invx05_d cells have scale factor increased by at least 0.02
    INV3/A (invx05_d)                    1.000    0.000    11.530 r       (note that INV2 has unique scale factor of 1.3)
    INV3/Q (invx05_d)                    1.230    1.230    12.760 f  <--- INV3 has its scale factor specifically increased by 0.03
    BUF4/A (bufx10_d)                    1.000    0.000    12.760 f       (note that INV3 has a scale factor of 1.2, the default factor of
    BUF4/Q (bufx10_d)                    1.110    1.110    13.870 f       invx05_d cells)
    FF2/D (dffprqx05_d)                  1.000    0.000    13.870 f
    data arrival time                                      13.870

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.220    15.220
    clock uncertainty                            -0.500    14.720
    FF2/CK (dffprqx05_d)                                   14.720 r
    library setup time                   1.000   -0.400    14.320
    data required time                                     14.320
    -------------------------------------------------------------
    data required time                                     14.320
    data arrival time                                     -13.870
    -------------------------------------------------------------
    slack (MET)                                             0.450
Note that using `set_timing_derate -increment` again is not incremental to the already existing increment; it actually overrides the previous increment. The same holds for the scale factor, too: a new `set_timing_derate` overrides the last one. See the examples below.
It implies that PrimeTime actually maintains the timing derate as two separate components, a scale factor and an incremental margin. The total derate is then `total_derate = scale_factor + margin`. The scale factor is set directly through `set_timing_derate` and the margin through `set_timing_derate -increment`. Hence if the total derate is composed of multiple components, users need to come up with sub-totals for both the scale factor and the margin (or can come up with the total itself and use only either the scale factor or the margin).
    pt_shell> set_timing_derate -cell_delay -late -0.01 -increment
    pt_shell> set_timing_derate -cell_delay -late -0.02 -increment [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late -0.03 -increment [get_cells INV3]
    pt_shell> report_timing_derate -increment

                              ----- Clock ------          ------ Data ------
                              Rise          Fall          Rise          Fall
                           Early   Late  Early   Late  Early   Late  Early   Late
    -----------------------------------------------------------------------------
    design: circ
    Net delay static          --     --     --     --     --     --     --     --
    Net delay dynamic         --     --     --     --     --     --     --     --
    Cell delay                -- -0.010     -- -0.010     -- -0.010     -- -0.010
    Cell check                --     --     --     --     --     --     --     --

    cell (leaf): INV3
    Cell delay                -- -0.030     -- -0.030     -- -0.030     -- -0.030

    lib_cell: testlib01/invx05_d
    Cell delay                -- -0.020     -- -0.020     -- -0.020     -- -0.020
In the scale factor overrides, notice that we have not overridden the `INV2` scale factor, which therefore remained at its last value of 1.3 despite the new global and libcell derates. This is in line with the expected precedence, where the more specific derate applies:
    pt_shell> set_timing_derate -cell_delay -late 0.9
    pt_shell> set_timing_derate -cell_delay -late 0.8 [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 0.7 [get_cells INV3]
    pt_shell> report_timing_derate

                              ----- Clock ------          ------ Data ------
                              Rise          Fall          Rise          Fall
                           Early   Late  Early   Late  Early   Late  Early   Late
    -----------------------------------------------------------------------------
    design: circ
    Net delay static          --     --     --     --     --     --     --     --
    Net delay dynamic         --     --     --     --     --     --     --     --
    Cell delay                --  0.900     --  0.900     --  0.900     --  0.900
    Cell check                --     --     --     --     --     --     --     --

    cell (leaf): INV2
    Cell delay                --  1.300     --  1.300     --  1.300     --  1.300

    cell (leaf): INV3
    Cell delay                --  0.700     --  0.700     --  0.700     --  0.700

    lib_cell: testlib01/invx05_d
    Cell delay                --  0.800     --  0.800     --  0.800     --  0.800
The timing after the derate updates then looks as follows:
    pt_shell> report_timing -delay_type max -derate -nosplit

    Point                               Derate     Incr      Path
    -------------------------------------------------------------
    clock CLK (rise edge)                         0.000     0.000
    clock network delay (propagated)              4.230     4.230
    FF1/CK (dffprqx05_d)                          0.000     4.230 r
    FF1/Q (dffprqx05_d)                  0.890    2.670     6.900 f
    BUF1/A (bufx10_d)                    1.000    0.000     6.900 f
    BUF1/Q (bufx10_d)                    0.890    0.890     7.790 f
    INV2/A (invx05_d)                    1.000    0.000     7.790 f
    INV2/Q (invx05_d)                    1.280    1.280     9.070 r
    INV3/A (invx05_d)                    1.000    0.000     9.070 r
    INV3/Q (invx05_d)                    0.670    0.670     9.740 f
    BUF4/A (bufx10_d)                    1.000    0.000     9.740 f
    BUF4/Q (bufx10_d)                    0.890    0.890    10.630 f
    FF2/D (dffprqx05_d)                  1.000    0.000    10.630 f
    data arrival time                                      10.630

    clock CLK (rise edge)                        10.000    10.000
    clock network delay (propagated)              5.000    15.000
    clock reconvergence pessimism                 0.000    15.000
    clock uncertainty                            -0.500    14.500
    FF2/CK (dffprqx05_d)                                   14.500 r
    library setup time                   1.000   -0.400    14.100
    data required time                                     14.100
    -------------------------------------------------------------
    data required time                                     14.100
    data arrival time                                     -10.630
    -------------------------------------------------------------
    slack (MET)                                             3.470
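The `Derate` column of the PrimeTime reports is consistent with resolving the scale factor and the increment independently, each by its own "most specific wins" lookup, and then adding the two. The Python sketch below models that reading. It is an interpretation of the transcripts rather than documented tool behavior, and the dictionary layout and function names are invented for illustration.

```python
def most_specific(table: dict, instance: str, libcell: str, default: float) -> float:
    """Pick the instance value, else the library cell value, else the design value."""
    if instance in table.get("instance", {}):
        return table["instance"][instance]
    if libcell in table.get("libcell", {}):
        return table["libcell"][libcell]
    return table.get("design", default)


def total_derate(scale: dict, increment: dict, instance: str, libcell: str) -> float:
    """total_derate = scale_factor + margin, each resolved on its own."""
    factor = most_specific(scale, instance, libcell, 1.0)
    margin = most_specific(increment, instance, libcell, 0.0)
    return round(factor + margin, 3)


CELLS = {"FF1": "dffprqx05_d", "BUF1": "bufx10_d", "INV2": "invx05_d",
         "INV3": "invx05_d", "BUF4": "bufx10_d"}

# State after the first set of scale factors and increments.
scale_1 = {"design": 1.1, "libcell": {"invx05_d": 1.2}, "instance": {"INV2": 1.3}}
incr_1 = {"design": 0.01, "libcell": {"invx05_d": 0.02}, "instance": {"INV3": 0.03}}

# State after the overrides (INV2's scale factor was never overridden).
scale_2 = {"design": 0.9, "libcell": {"invx05_d": 0.8},
           "instance": {"INV2": 1.3, "INV3": 0.7}}
incr_2 = {"design": -0.01, "libcell": {"invx05_d": -0.02}, "instance": {"INV3": -0.03}}

for name, lib in CELLS.items():
    print(name,
          total_derate(scale_1, incr_1, name, lib),
          total_derate(scale_2, incr_2, name, lib))
```

For `INV2` this gives 1.3 + 0.02 = 1.32 and then 1.3 - 0.02 = 1.28: the instance-level scale factor combines with the libcell-level increment because no instance-level increment exists for that cell.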
Tempus takes a slightly different approach. It maintains a single derate scale factor set by `set_timing_derate`. Users can incrementally update this scale factor by a multiplicative factor (`set_timing_derate -multiply`) or an additive increment (`set_timing_derate -add`). There can be as many incremental updates as one likes:
    @tempus 3> set_timing_derate -cell_delay -late 1.1
    @tempus 4> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.100       --    1.100       --    1.100       --    1.100
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
    @tempus 5> set_timing_derate -cell_delay -late 1.1 -multiply
    @tempus 6> set_timing_derate -cell_delay -late 1.1 -multiply
    @tempus 7> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.331       --    1.331       --    1.331       --    1.331
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
    @tempus 8> set_timing_derate -cell_delay -late 0.01 -add
    @tempus 9> set_timing_derate -cell_delay -late 0.01 -add
    @tempus 10> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.351       --    1.351       --    1.351       --    1.351
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
    @tempus 11> set_timing_derate -cell_delay -late 1.1 -multiply
    @tempus 12> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.486       --    1.486       --    1.486       --    1.486
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
    @tempus 13> set_timing_derate -cell_delay -late 1.1
    @tempus 14> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.100       --    1.100       --    1.100       --    1.100
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
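The session above amounts to simple running arithmetic on one number. The Python sketch below replays it with a hypothetical `ScaleFactor` class (not a Tempus API). Treating an undefined derate as 1.0 for `-multiply` is an assumption, since the session only updates an already defined value.

```python
from typing import Optional


class ScaleFactor:
    """Toy model of a single Tempus derate value, e.g. the late cell-delay derate."""

    def __init__(self) -> None:
        self.value: Optional[float] = None   # None means no derate is defined

    def _current(self) -> float:
        # ASSUMPTION: an undefined derate behaves like 1.0 when updated.
        return 1.0 if self.value is None else self.value

    def set(self, factor: float) -> None:
        """Plain set_timing_derate: replaces whatever was accumulated before."""
        self.value = factor

    def multiply(self, factor: float) -> None:
        """set_timing_derate -multiply: scales the current value."""
        self.value = self._current() * factor

    def add(self, delta: float) -> None:
        """set_timing_derate -add: shifts the current value."""
        self.value = self._current() + delta

    def report(self) -> float:
        return round(self._current(), 3)


late = ScaleFactor()
late.set(1.1)
print(late.report())                       # 1.1
late.multiply(1.1); late.multiply(1.1)
print(late.report())                       # 1.331
late.add(0.01); late.add(0.01)
print(late.report())                       # 1.351
late.multiply(1.1)
print(late.report())                       # 1.486
late.set(1.1)
print(late.report())                       # 1.1
```

Because `-add` and `-multiply` do not commute, the final value depends on the order of the updates: (1.331 + 0.02) × 1.1 differs from 1.331 × 1.1 + 0.02.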
Tempus honors a derate precedence similar to PrimeTime's, in the sense that the more specific derate for a particular design object applies. However, there are certain quirks that are unexpected from the user's perspective and make the overall derate specification more intricate.
The intuitive derate precedence, in decreasing priority, is as follows:
- Specific instance/cell object.
- Library cell object.
- Design object.
For example:

    @tempus 15> set_timing_derate -cell_delay -late 1.1
    @tempus 16> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    @tempus 17> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    @tempus 18> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.100       --    1.100       --    1.100       --    1.100
    Net Delay Static        --       --       --       --       --       --       --       --
    ...
    Cell (leaf): INV2
    Cell Delay              --    1.300       --    1.300       --    1.300       --    1.300
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.200       --    1.200       --    1.200       --    1.200
    Input_switching         --       --       --       --       --       --       --       --

    @tempus 19> report_timing -late
                          Capture       Launch
    Clock Edge:+           10.000        0.000
    Src Latency:+           0.000        0.000
    Net Latency:+           5.000 (P)    5.700 (P)
    Arrival:=              15.000        5.700

    Setup:-                 0.400
    Uncertainty:-           0.500
    Cppr Adjust:+           0.200
    Required Time:=        14.300
    Launch Clock:=          5.700
    Data Path:+             8.000
    Slack:=                 0.600

    #--------------------------------------------------------------------------
    # Pin     Cell           Load   Trans    Incr    Total   Delay  Arrival
    #                        (pf)    (ns)   Delay   Derate    (ns)     (ns)
    #--------------------------------------------------------------------------
      FF1/CK  (arrival)     0.005   0.201       -        -       -    5.700
      FF1/Q   dffprqx05_d   0.003   0.201   0.000    1.100   3.300    9.000
      BUF1/Q  bufx10_d      0.005   0.301   0.000    1.100   1.100   10.100
      INV2/Q  invx05_d      0.005   0.301   0.000    1.300   1.300   11.400
      INV3/Q  invx05_d      0.003   0.201   0.000    1.200   1.200   12.600
      BUF4/Q  bufx10_d      0.005   0.301   0.000    1.100   1.100   13.700
      FF2/D   dffprqx05_d   0.005   0.301   0.000    1.000   0.000   13.700
    #--------------------------------------------------------------------------
The quirky behavior is tied to the incremental derate specification, whether through `-multiply` or `-add`. The behavior is similar for both, so only the `-add` option is shown in the example that follows. The effect of the incremental derate specs is as follows:
- An incremental derate on cell/instance objects affects only those target objects.
- An incremental derate on library cell objects affects the existing derates of those library cells and of any specific cells/instances that use those library cells.
- An incremental derate on the design object affects all existing derates.
The following session illustrates these rules:

    @tempus> reset_timing_derate
    @tempus> set_timing_derate -cell_delay -late 1.1
    @tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.100       --    1.100       --    1.100       --    1.100
    Cell (leaf): INV2
    Cell Delay              --    1.300       --    1.300       --    1.300       --    1.300
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.200       --    1.200       --    1.200       --    1.200

    ########
    ######## Notice the following affects *all* objects.
    ########
    @tempus> set_timing_derate -cell_delay -late 0.01 -add
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.310       --    1.310       --    1.310       --    1.310
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.210       --    1.210       --    1.210       --    1.210

    ########
    ######## Notice the following affects library and instance objects.
    ########
    @tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.330       --    1.330       --    1.330       --    1.330
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.230       --    1.230       --    1.230       --    1.230

    ########
    ######## Finally the following affects only target instance objects.
    ########
    @tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.330       --    1.330       --    1.330       --    1.330   <-- Total additive increment of 0.01+0.02.
    Cell (leaf): INV3
    Cell Delay              --    1.260       --    1.260       --    1.260       --    1.260   <-- Total additive increment of 0.01+0.02+0.03.
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.230       --    1.230       --    1.230       --    1.230   <-- Total additive increment of 0.01+0.02.
The unfortunate result is that the order of these incremental derate specifications matters; not just relative to each other (i.e. `-multiply` vs. `-add`, which is expected) but relative to the global ones, too (which may be unexpected). For example:
    ########
    ######## Reference case.
    ########
    @tempus> reset_timing_derate
    @tempus> set_timing_derate -cell_delay -late 1.1
    @tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    @tempus>
    @tempus> set_timing_derate -cell_delay -late 0.01 -add
    @tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
    @tempus>
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.330       --    1.330       --    1.330       --    1.330
    Cell (leaf): INV3
    Cell Delay              --    1.260       --    1.260       --    1.260       --    1.260
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.230       --    1.230       --    1.230       --    1.230

    ########
    ######## Notice the reverse order of incremental derates (which makes no difference)
    ########
    @tempus> reset_timing_derate
    @tempus> set_timing_derate -cell_delay -late 1.1
    @tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    @tempus>
    @tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
    @tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 0.01 -add
    @tempus>
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.330       --    1.330       --    1.330       --    1.330
    Cell (leaf): INV3
    Cell Delay              --    1.260       --    1.260       --    1.260       --    1.260
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.230       --    1.230       --    1.230       --    1.230

    ########
    ######## Notice a different order of global and incremental derates (and hence different ending derates).
    ########
    @tempus> reset_timing_derate
    @tempus>
    @tempus> set_timing_derate -cell_delay -late 1.1
    @tempus> set_timing_derate -cell_delay -late 0.01 -add
    @tempus>
    @tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
    @tempus>
    @tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
    @tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
    @tempus>
    @tempus> report_timing_derate
                      -------------Clock----------------- ---------------Data-----------------
                            Rise              Fall              Rise              Fall
                         Early     Late    Early     Late    Early     Late    Early     Late
    -----------------------------------------------------------------------------------------
    Cell Delay              --    1.110       --    1.110       --    1.110       --    1.110
    Cell (leaf): INV2
    Cell Delay              --    1.300       --    1.300       --    1.300       --    1.300
    Cell (leaf): INV3
    Cell Delay              --    1.250       --    1.250       --    1.250       --    1.250
    LibraryCell: testlib01/invx05_d
    Cell_delay              --    1.220       --    1.220       --    1.220       --    1.220
There is, however, more to this quirkiness to keep in mind: incremental derates affect only the derates that exist at that moment. Resetting derates through reset_timing_derate (which makes all derates undefined) therefore yields different results than setting derates to 1.0 (which makes them defined). For example:
########
######## Reference.
########
@tempus> reset_timing_derate; # removes all existing derates (which is different than setting them to 1.0)
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus>
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell Delay          --       1.010    --       1.010    --       1.010    --       1.010
Cell (leaf): INV3
Cell Delay          --       1.050    --       1.050    --       1.050    --       1.050
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.020    --       1.020    --       1.020    --       1.020

########
######## Notice the reverse order of incremental derates (which makes a difference now)
########
@tempus> reset_timing_derate; # removes all existing derates (which is different than setting them to 1.0)
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus>
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell Delay          --       1.010    --       1.010    --       1.010    --       1.010
Cell (leaf): INV3
Cell Delay          --       1.060    --       1.060    --       1.060    --       1.060
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.030    --       1.030    --       1.030    --       1.030
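The accumulation behaviour seen in the transcripts above can be mimicked with a small stand-alone model. The following Python sketch is not tool code: the class AddDerateModel and its inheritance rules are assumptions, chosen only so that the model reproduces the report_timing_derate values shown above (in particular, that an undefined library cell derate starts from 1.0, while an undefined cell derate starts from its library cell derate).

```python
class AddDerateModel:
    """Toy model of set_timing_derate with and without -add (hypothetical)."""

    def __init__(self, cell_to_libcell):
        self.cell_to_libcell = dict(cell_to_libcell)  # instance -> library cell
        self.reset()

    def reset(self):
        # Like reset_timing_derate: all derates become undefined.
        self.design = None
        self.libcell = {}
        self.cell = {}

    def set(self, value, libcell=None, cell=None):
        # Instant derate: replaces the scale factor of exactly one object.
        if cell is not None:
            self.cell[cell] = value
        elif libcell is not None:
            self.libcell[libcell] = value
        else:
            self.design = value

    def add(self, margin, libcell=None, cell=None):
        # Incremental derate: accumulates into the target and into every
        # more specific derate that is already defined at this moment.
        if cell is not None:
            lib = self.cell_to_libcell[cell]
            # Assumption: an undefined cell derate starts from its library
            # cell derate, or from 1.0 if that is undefined as well.
            base = self.cell.get(cell, self.libcell.get(lib, 1.0))
            self.cell[cell] = base + margin
        elif libcell is not None:
            # Assumption: an undefined library cell derate starts from 1.0.
            self.libcell[libcell] = self.libcell.get(libcell, 1.0) + margin
            for name in self.cell:
                if self.cell_to_libcell[name] == libcell:
                    self.cell[name] += margin
        else:
            self.design = (1.0 if self.design is None else self.design) + margin
            for name in self.libcell:
                self.libcell[name] += margin
            for name in self.cell:
                self.cell[name] += margin

    def report(self):
        # Rounded to three digits like report_timing_derate.
        out = {} if self.design is None else {"design": round(self.design, 3)}
        out.update({k: round(v, 3) for k, v in self.libcell.items()})
        out.update({k: round(v, 3) for k, v in self.cell.items()})
        return out


CELLS = {"INV2": "invx05_d", "INV3": "invx05_d"}


def undefined_generic_first():
    m = AddDerateModel(CELLS)
    m.add(0.01)
    m.add(0.02, libcell="invx05_d")
    m.add(0.03, cell="INV3")
    return m.report()


def undefined_specific_first():
    m = AddDerateModel(CELLS)
    m.add(0.03, cell="INV3")
    m.add(0.02, libcell="invx05_d")
    m.add(0.01)
    return m.report()


def defined_reference():
    m = AddDerateModel(CELLS)
    m.set(1.1)
    m.set(1.2, libcell="invx05_d")
    m.set(1.3, cell="INV2")
    m.add(0.01)
    m.add(0.02, libcell="invx05_d")
    m.add(0.03, cell="INV3")
    return m.report()


print(undefined_generic_first())
print(undefined_specific_first())
print(defined_reference())
```

With undefined derates the two orders end at different values for INV3 and the library cell, which is the order dependence shown in the transcript. Whether Tempus follows exactly these rules in other situations is not something this sketch can confirm.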
In recent versions, Tempus added the -increment option, which supposedly brings it closer to how e.g. PrimeTime keeps derate settings. This is what the help message says:
-increment # incrementally add the derate value to total derate value in all (OCV/AOCV/SOCV) mode
So the expectation is that -increment is like -add but does not accumulate. The following shows what it looks like in the reference case. The results are almost like in PrimeTime, except that the -increment option is not supported at the design/global level. Yet it is obvious that Tempus now keeps the derating as a separate scaling factor and an additive margin, and that a new set_timing_derate -increment replaces the existing one (rather than accumulating as in the -add case).
########
######## Reference.
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus> set_timing_derate -cell_delay -late 0.01 -increment
**ERROR: (TCLCMD-1022): -incremental_adjust/-increment options must be specified to instance or library cell objects.
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell Delay          --       1.100    --       1.100    --       1.100    --       1.100
Cell (leaf): INV2
Cell Delay          --       1.300    --       1.300    --       1.300    --       1.300
Cell (leaf): INV3
Cell Delay          --       1.200    --       1.200    --       1.200    --       1.200
Incremental_adjust  0.000    0.030    0.000    0.030    0.000    0.030    0.000    0.030
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.200    --       1.200    --       1.200    --       1.200
Incremental_adjust  0.000    0.020    0.000    0.020    0.000    0.020    0.000    0.020

########
######## Another `-increment` replaces the previous one.
########
@tempus> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.035 -increment [get_cells INV3]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell Delay          --       1.100    --       1.100    --       1.100    --       1.100
Cell (leaf): INV2
Cell Delay          --       1.300    --       1.300    --       1.300    --       1.300
Cell (leaf): INV3
Cell Delay          --       1.200    --       1.200    --       1.200    --       1.200
Incremental_adjust  0.000    0.035    0.000    0.035    0.000    0.035    0.000    0.035
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.200    --       1.200    --       1.200    --       1.200
Incremental_adjust  0.000    0.025    0.000    0.025    0.000    0.025    0.000    0.025

########
######## Also works when scaling factors are undefined.
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell (leaf): INV3
Cell Delay          --       --       --       --       --       --       --       --
Incremental_adjust  0.000    0.030    0.000    0.030    0.000    0.030    0.000    0.030
LibraryCell: testlib01/invx05_d
Cell_delay          --       --       --       --       --       --       --       --
Incremental_adjust  0.000    0.020    0.000    0.020    0.000    0.020    0.000    0.020
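The contrast between the accumulating -add and the replacing -increment can be condensed into a few lines. This is a plain Python illustration of the behaviour observed in the transcripts, not tool code; the function names apply_add and apply_increment are hypothetical.

```python
def apply_add(scale, margins):
    """-add: every margin is folded into the scale factor (accumulates)."""
    for margin in margins:
        scale += margin
    return scale


def apply_increment(scale, margins):
    """-increment: kept as a separate margin; the last setting replaces
    any earlier one, and the scale factor itself is left untouched."""
    increment = margins[-1] if margins else 0.0
    return scale, increment


# Library cell with scale factor 1.2 and two successive margins,
# 0.02 followed by 0.025 (the values used in the transcript above).
print(round(apply_add(1.2, [0.02, 0.025]), 3))
print(apply_increment(1.2, [0.02, 0.025]))
```

With -add the two margins pile up on the scale factor, whereas with -increment only 0.025 survives next to the unchanged 1.2, as in the Incremental_adjust rows of the report.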
There seem to be some unexpected side effects with -increment (that look more like a bug than a feature):
One problem is the interaction between the separate -increment margin and the Cadence legacy incremental derates (such as -add in the example below). The problem is that a different order of derate application yields a different total derate. This is exposed when -add <libcell> follows the -increment derates. Changing the order of -add vs. -increment, or squeezing a different derate in between, yields the expected results.
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.022 -increment [get_cells INV3]
@tempus> set_timing_derate -cell_delay -late 0.033 -increment [get_lib_cells */invx05_d]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell (leaf): INV3
Cell Delay          --       --       --       --       --       --       --       --
Incremental_adjust  0.000    0.022    0.000    0.022    0.000    0.022    0.000    0.022
LibraryCell: testlib01/invx05_d
Cell_delay          --       --       --       --       --       --       --       --
Incremental_adjust  0.000    0.033    0.000    0.033    0.000    0.033    0.000    0.033

#######################
######## `-add <libcell>` yields double-counting the new derate into a specific cell derate
#######################
@tempus> set_timing_derate -cell_delay -late 0.015 -add [get_lib_cells */invx05_d]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
Cell (leaf): INV3
Cell Delay          --       1.030    --       1.030    --       1.030    --       1.030
Incremental_adjust  0.000    0.022    0.000    0.022    0.000    0.022    0.000    0.022
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.015    --       1.015    --       1.015    --       1.015
Incremental_adjust  0.000    0.033    0.000    0.033    0.000    0.033    0.000    0.033

######## `-add <cell>` updates only the specific cell as expected
@tempus> set_timing_derate -cell_delay -late 0.01 -add [get_cells INV3]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
Cell (leaf): INV3
Cell Delay          --       1.040    --       1.040    --       1.040    --       1.040
Incremental_adjust  0.000    0.022    0.000    0.022    0.000    0.022    0.000    0.022
LibraryCell: testlib01/invx05_d
Cell_delay          --       1.015    --       1.015    --       1.015    --       1.015
Incremental_adjust  0.000    0.033    0.000    0.033    0.000    0.033    0.000    0.033
The other inconvenience is that the -increment part of the total derate is not reported in the user_derate field of report_timing (and neither in a separate field, such as incr_derate).
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.01 -add [get_lib_cells */bufx10_d]
@tempus> set_timing_derate -cell_delay -late 0.005 -add [get_cells BUF1]
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */bufx10_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells BUF1]
@tempus> report_timing_derate
                    -------------Clock----------------- ---------------Data-----------------
                    Rise              Fall              Rise              Fall
                    Early    Late     Early    Late     Early    Late     Early    Late
---------------------------------------------------------------------------
Cell (leaf): BUF1
Cell Delay          --       1.015    --       1.015    --       1.015    --       1.015
Incremental_adjust  0.000    0.030    0.000    0.030    0.000    0.030    0.000    0.030
LibraryCell: testlib01/bufx10_d
Cell_delay          --       1.010    --       1.010    --       1.010    --       1.010
Incremental_adjust  0.000    0.020    0.000    0.020    0.000    0.020    0.000    0.020

@tempus> report_timing
#---------------------------------------------------------------------------
# Pin     Cell         Load   Trans  Incr   Delay  Arrival  User    Total
#                      (pf)   (ns)   Delay  (ns)   (ns)     Derate  Derate
#---------------------------------------------------------------------------
  FF1/CK  (arrival)    0.005  0.201  -      -      5.090    -       -
  FF1/Q   dffprqx05_d  0.003  0.201  0.000  3.000  8.090    1.000   1.000
  BUF1/Q  bufx10_d     0.005  0.301  0.000  1.045  9.135    1.015   1.045   <-- 0.015 from `-add <libcell> + <cell>` + 0.030 from `-increment <cell>`
  INV2/Q  invx05_d     0.005  0.301  0.000  1.000  10.135   1.000   1.000
  INV3/Q  invx05_d     0.003  0.201  0.000  1.000  11.135   1.000   1.000
  BUF4/Q  bufx10_d     0.005  0.301  0.000  1.030  12.165   1.010   1.030   <-- 0.01 from `-add <libcell>` + 0.02 from `-increment <libcell>`
  FF2/D   dffprqx05_d  0.005  0.301  0.000  0.000  12.165   1.000   1.000
#---------------------------------------------------------------------------
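The two annotated rows of the timing report can be cross-checked by hand: the total derate equals the reported user derate plus the -increment margin that applies to the cell. The Python sketch below does this bookkeeping. The dictionaries and the most_specific helper are hypothetical, and the rule that a cell-level setting overrides the library cell setting is read off the report above rather than taken from tool documentation.

```python
# Settings as listed by report_timing_derate in the transcript above.
SCALE = {"cell": {"BUF1": 1.015}, "libcell": {"bufx10_d": 1.010}}
INCREMENT = {"cell": {"BUF1": 0.030}, "libcell": {"bufx10_d": 0.020}}
LIBCELL_OF = {"BUF1": "bufx10_d", "BUF4": "bufx10_d"}


def most_specific(table, cell, default):
    """Cell-level setting if defined, else library cell setting, else default."""
    if cell in table["cell"]:
        return table["cell"][cell]
    return table["libcell"].get(LIBCELL_OF[cell], default)


def user_and_total_derate(cell):
    user = most_specific(SCALE, cell, 1.0)      # "User Derate" column
    incr = most_specific(INCREMENT, cell, 0.0)  # not shown in report_timing
    return user, round(user + incr, 3)          # "Total Derate" column


print(user_and_total_derate("BUF1"))
print(user_and_total_derate("BUF4"))
```

The difference between the two returned numbers is the part that the report does not attribute to any field.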
As noted above, flat OCV timing derating can range from a simple, global scale factor to a complex combination of scale factors and incremental margins set on different design objects. When the flat derates are to be used across different tools, users must be very careful about differences in how those tools interpret the derate commands and manage the derate settings internally.
The following list gives a summary of key aspects in major STA tools:
- Synopsys PrimeTime
  - Internally represents the derate settings by two components, a scale factor and an incremental margin. The former is reported by report_timing_derate, the latter by report_timing_derate -increment.
  - Total derate is a scale factor derived by summing the two components together.
  - Generally supports two commands, set_timing_derate <scale_factor> and set_timing_derate -increment <margin>.
  - Both derate commands are instant and do not accumulate. That is, when a command is applied multiple times, the latest occurrence replaces any former one(s).
  - Derate settings can be applied to different design objects. If a cell instance is subject to multiple derate settings (for different objects such as cell and library cell), the more specific (in terms of the object identification) setting applies.
- Cadence Tempus
  - Internally represents the derate settings as a scale factor. The setting can be done instantly through set_timing_derate <scale_factor>.
  - Settings can be applied to different design objects and, like in PrimeTime, the more specific derate setting applies.
  - Provides additional commands to incrementally alter the scale factor setting, set_timing_derate -multiply <factor> and set_timing_derate -add <margin>.
    - These incremental derates accumulate and hence the order of their application matters.
    - The incremental effect is on every design object that can be affected by the incremental settings and has an existing derate setting already defined. For example, a design-level incremental derate would also alter existing derate settings of a library cell and of a cell/instance; a library cell incremental derate would also affect an existing cell/instance derate (that qualifies for the library cell pattern).
  - Recent tool versions also support set_timing_derate -increment with the semantics of PrimeTime. The only difference is that this command is supported only for library cell and cell/instance objects.
Here is some guidance that would hopefully lead to more consistent results (at least for Synopsys and Cadence tools):
- Mind the differences among tools. Always report the final derate factors and review the report against the intended derate settings.
- Whenever possible, compile the total derate factors for individual design object groups and apply them through the instant set_timing_derate <scale_factor>. This would have a consistent effect in all tools.
- When you need to use incremental margins (e.g. in combination with AOCV, see later), express all derate settings as incremental margins (to the default scale factor of 1.0). This is mainly to avoid exposing differences among tools.
- When using the -add or -multiply options in Cadence tools:
  - Arrange the derate settings so that you do not mix those options. That is, all set_timing_derate commands would use either -add or -multiply.
  - Order the incremental derate commands from the most generic to the most specific. This would help to avoid "double counting" (i.e. a more generic incremental setting affecting a more specific incremental setting).
  - Avoid mixing instant derate commands and incremental derate commands. This will help to avoid "double counting".
- Prefer using -increment to any of -add, -multiply. This is possible in recent Cadence tools. Mind that Cadence tools support only library cell and cell/instance objects with this option.
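As an illustration of the second recommendation, the total factors can be compiled outside of the STA tool and emitted as instant set_timing_derate commands only. The Python sketch below is one possible way to do that. The function name, the idea of describing the intent as "design scale plus per-object margins", and the example numbers are assumptions made for illustration; only the command forms themselves appear in the transcripts of this article.

```python
def compile_late_cell_derates(design_scale, libcell_margin, cell_margin, libcell_of):
    """Return instant set_timing_derate commands, ordered from the most
    generic to the most specific object.

    design_scale   -- flat late scale factor for the whole design
    libcell_margin -- {library cell name: extra margin on top of design_scale}
    cell_margin    -- {instance name: extra margin on top of its library cell}
    libcell_of     -- {instance name: library cell name}
    """
    cmds = [f"set_timing_derate -cell_delay -late {design_scale:.3f}"]
    for lib, margin in sorted(libcell_margin.items()):
        total = design_scale + margin
        cmds.append(
            f"set_timing_derate -cell_delay -late {total:.3f} [get_lib_cells */{lib}]"
        )
    for cell, margin in sorted(cell_margin.items()):
        total = design_scale + libcell_margin.get(libcell_of[cell], 0.0) + margin
        cmds.append(
            f"set_timing_derate -cell_delay -late {total:.3f} [get_cells {cell}]"
        )
    return cmds


commands = compile_late_cell_derates(
    design_scale=1.1,
    libcell_margin={"invx05_d": 0.02},
    cell_margin={"INV3": 0.03},
    libcell_of={"INV3": "invx05_d"},
)
print("\n".join(commands))
```

Because every emitted command sets a complete scale factor, the result does not depend on accumulation rules or on the order in which a tool processes the commands. The final derates should still be reported and reviewed in each tool.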
One size fits all works neither for humans nor for timing derate. A fixed flat OCV derate turns either overly pessimistic or sometimes pessimistic/sometimes optimistic as the technology node scales down. SPICE statistical simulation shows that the process local variation effects (per gate) decrease as the length of timing path increases. That is, the local variation effect for long paths averages out.
The next evolution step is to come up with look-up tables that identify a derate factor based on the number of cells (a.k.a. stages) in the timing path. Such a look-up table is compiled from the results of statistical SPICE simulations (a.k.a. Monte Carlo simulations) for cells in a series of increasing length and put into a side file that is used along with the traditional timing library. This approach is called advanced OCV (AOCV) or stage-based OCV (SBOCV). The side file with the scaling factor look-up tables looks as follows:
object_type: lib_cell
delay_type: cell
rf_type: rise
derate_type: late
object_spec: testlib01/bufx10_d
depth: 1 2 3 4 5 6 7 8
distance: 0
// Here the derate is only an example and comes from `1 + 1/(10*depth)` calculation.
table: 1.1 1.05 1.033 1.025 1.020 1.017 1.014 1.013
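To make the role of the table concrete, the following Python sketch looks up a late derate by stage count. It is a simplification under stated assumptions: it ignores the distance dimension, does not interpolate, and clamps to the first or last entry outside the tabulated depths. Real tools may treat these cases differently. The shortened four-entry table used at the end is hypothetical.

```python
import bisect

# Depths and late derates copied from the example side file above.
DEPTHS = [1, 2, 3, 4, 5, 6, 7, 8]
LATE_DERATES = [1.1, 1.05, 1.033, 1.025, 1.020, 1.017, 1.014, 1.013]


def lookup_derate(depths, derates, depth):
    """Return the derate of the largest tabulated depth <= depth.

    Assumption: requests outside the table are clamped to the nearest end.
    """
    idx = bisect.bisect_right(depths, depth) - 1
    idx = max(0, min(idx, len(derates) - 1))
    return derates[idx]


# The example table agrees with 1 + 1/(10*depth) up to rounding.
assert all(
    abs(value - (1 + 1 / (10 * depth))) < 1e-3
    for depth, value in zip(DEPTHS, LATE_DERATES)
)

print(lookup_derate(DEPTHS, LATE_DERATES, 6))
print(lookup_derate(DEPTHS, LATE_DERATES, 8))
# Hypothetical table that ends at depth 4, queried at depth 8.
print(lookup_derate(DEPTHS[:4], LATE_DERATES[:4], 8))
```

The last call shows the clamping assumption: a path deeper than the table still receives the derate of the last tabulated depth.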
Note
Please note the derate factors for particular stages, as we will be referring to them from timing reports.
STA analysis session with AOCV is not much different; it only needs to load the AOCV side file and turn on the advanced analysis:
# Synopsys PrimeTime                              # Cadence Tempus (stylus/CUI)
# -------------------------------                 # -------------------------------
                                                  # enable on AOCV
                                                  set_db timing_analysis_aocv true;
                                                  read_libs -aocv ...;  # AOCV side file
# usual session commands                          # usual session commands
set link_path ...;   # Liberty timing library     read_libs ...;        # Liberty timing library
read_verilog ...;    # netlist                    read_netlist ...;     # netlist
link                                              init_design
# enable on AOCV
set_app_var timing_aocvm_enable_analysis true;
read_aocvm ...;      # AOCV side file
# usual timing analysis                           # usual timing analysis
report_timing ...                                 report_timing ...
We can test the AOCV flow with the example data. Notice in the prepared AOCV library that not all cells have AOCV tables defined. This is intentional, to see differences between cells with and without AOCV models when combining AOCV and OCV.
To see that the AOCV side file was used, use the report_aocvm command, which prints a summary of annotated AOCV tables. To see more details, use the -list_annotated option. The same command is then used for reporting derate details for a particular timing path.
pt_shell> report_aocvm
AOCV Table Set : *Default*
******************************************************
            |           |   Fully   | Partially |    Not    |
            |   Total   | annotated | annotated | annotated |
------------+-----------+-----------+-----------+-----------+
Leaf cells  |        14 |         8 |         0 |         6 |
Nets        |        16 |         0 |         0 |        16 |
------------+-----------+-----------+-----------+-----------+
            |        30 |         8 |         0 |        22 |
pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded
  Last common pin: CKBUF2/Q
  Path Group: CLK
  Path Type: max

  Point                       Derate    Incr      Path
  -----------------------------------------------------------
  clock CLK (rise edge)                 0.000     0.000
  clock source latency                  0.000     0.000
  clk (in)                              0.000     0.000 r
  CKBUF1/A (bufx10_d)         1.000     0.000     0.000 r
  CKBUF1/Q (bufx10_d)         1.017     1.017     1.017 r   <--- 1.017 late scaling factor due to 6 cells in the launch clock path
  ...                                                            (CKBUF1-2, CKBUF3a1-3, FF1)
  FF1/CK (dffprqx05_d)        1.000     0.000     5.047 r
  FF1/Q (dffprqx05_d) <-      1.025     3.075     8.122 f   <--- 1.025 as AOCV table for dffprqx05_d ending at 4 stages
  ...
  INV3/A (invx05_d)           1.000     0.000    10.135 r   <--- no derate due to no AOCV table for invx05_d
  INV3/Q (invx05_d)           1.000     1.000    11.135 f
  BUF4/A (bufx10_d)           1.000     0.000    11.135 f
  BUF4/Q (bufx10_d)           1.013     1.013    12.148 f   <--- 1.013 derate due to 8 cells in data path since the common clock point
  FF2/D (dffprqx05_d)         1.000     0.000    12.148 f        (CKBUF3a1-3, FF1, BUF1, INV1-2, BUF2)
  data arrival time                              12.148

  clock CLK (rise edge)                10.000    10.000
  clock source latency                  0.000    10.000
  clk (in)                              0.000    10.000 r
  CKBUF1/A (bufx10_d)         1.000     0.000    10.000 r
  CKBUF1/Q (bufx10_d)         0.980     0.980    10.980 r   <--- 0.980 early factor as there are 5 cells in the capture clock path
  ...                                                            (CKBUF1-2, CKBUF3b1-3)
  CKBUF3b1/A (bufx10_d)       1.000     0.000    11.960 r
  CKBUF3b1/Q (bufx10_d)       0.967     0.967    12.927 r   <--- 0.967 as there are 3 cells in capture clock from the common clock point
  CKINV3b2/A (invx05_d)       1.000     0.000    12.927 r
  CKINV3b2/Q (invx05_d)       1.000     1.000    13.927 f
  CKINV3b3/A (invx05_d)       1.000     0.000    13.927 f
  CKINV3b3/Q (invx05_d)       1.000     1.000    14.927 r
  FF2/CK (dffprqx05_d)        1.000     0.000    14.927 r
  clock reconvergence pessimism         0.074    15.001
  clock uncertainty                    -0.500    14.501
  library setup time          1.000    -0.400    14.101
  data required time                             14.101
  -----------------------------------------------------------
  data required time                             14.101
  data arrival time                             -12.148
  -----------------------------------------------------------
  slack (MET)                                     1.953

  Derate Summary Report
  -------------------------------------------------------------------
  total derate : required time                           0.073
  total derate : arrival time                           -0.148
  -------------------------------------------------------------------
  total derate : slack                                   0.221

  slack (with derating applied) (MET)                    1.953
  clock reconvergence pessimism (due to derating)       -0.074
  -------------------------------------------------------------------
  slack (with no derating) (MET)                         2.100

###################
######## actuals for CKBUF2 stage count
###################
pt_shell> report_aocvm [get_timing_arc -from CKBUF1/A -to CKBUF1/Q]
From pin: CKBUF1/A
To pin: CKBUF1/Q
Arc type: cell (clock network)

AOCVM arc metrics        Launch     Capture
--------------------------------------------------------
Distance                 --         --
Depth                    6.00       5.00

AOCVM arc derates        Launch     Capture
--------------------------------------------------------
Early rise               0.9830     0.9800
Early fall               0.9830     0.9800
Late rise                1.0170     1.0200
Late fall                1.0170     1.0200

From pin: CKBUF1/A
To pin: CKBUF1/Q
Arc type: cell (data network)

AOCVM arc metrics        Launch
--------------------------------------
Distance                 --
Depth                    5.00

AOCVM arc derates        Launch
--------------------------------------
Early rise               0.9800
Early fall               0.9800
Late rise                1.0200
Late fall                1.0200

###################
######## actuals for FF1 stage count
###################
pt_shell> report_aocvm [get_timing_arc -from FF1/CK -to FF1/Q]
From pin: FF1/CK
To pin: FF1/Q
Arc type: cell (data network)

AOCVM arc metrics        Launch
--------------------------------------
Distance                 --
Depth                    8.00

AOCVM arc derates        Launch
--------------------------------------
Early rise               0.9750
Early fall               0.9750
Late rise                1.0250
Late fall                1.0250

###################
######## actuals for BUF1 derates
###################
pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
From pin: BUF1/A
To pin: BUF1/Q
Arc type: cell (data network)

AOCVM arc metrics        Launch
--------------------------------------
Distance                 --
Depth                    8.00

AOCVM arc derates        Launch
--------------------------------------
Early rise               0.9880
Early fall               0.9880
Late rise                1.0130
Late fall                1.0130
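The figures in the Derate Summary Report of the timing report above are consistent with simple bookkeeping, which the Python sketch below reproduces. It is only an arithmetic cross-check of the reported numbers, not a description of how PrimeTime computes them internally, and the function and parameter names are hypothetical.

```python
def derate_summary(slack_derated, derate_required, derate_arrival, crp_due_to_derating):
    """Recompute the 'total derate : slack' and 'slack (with no derating)'
    lines from the other lines of the Derate Summary Report."""
    derate_slack = derate_required - derate_arrival
    slack_no_derating = slack_derated + derate_slack + crp_due_to_derating
    return round(derate_slack, 3), round(slack_no_derating, 3)


# Values copied from the report: slack 1.953, required time derate 0.073,
# arrival time derate -0.148, clock reconvergence pessimism -0.074.
print(derate_summary(1.953, 0.073, -0.148, -0.074))
```

In other words, derating costs 0.221 of slack on this path, of which 0.074 is returned by clock reconvergence pessimism removal, leaving a net loss of 0.147 against the underated slack of 2.100.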
Tempus results are consistent with those of PrimeTime. The advantage of Tempus report_timing is a larger set of reported fields, so one can see the number of AOCV stages and the implied derate directly in a timing report. The reported fields can be set either globally through set_db timing_report_fields <list> or temporarily through report_timing -fields <list>.
There are slight differences (compared to PrimeTime) in how Tempus counts AOCV stages, such as for the clock path to FF1/CK. In some cases, the reported stage count seems wrong and does not correspond to the actual AOCV derate, e.g. CKBUF2 in both the data and clock path.
As the differences (to PrimeTime) were only in the common clock path, they do not cause differences in timing, thanks to CPPR removal/adjustment. Hence in our example Tempus yields the same timing result as PrimeTime.
@tempus> set_db timing_report_fields [concat [get_db timing_report_fields] stage_count aocv_derate]
@tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
Group: CLK
Startpoint: (R) FF1/CK
Clock: (R) CLK
Endpoint: (F) FF2/D
Clock: (R) CLK

                    Capture       Launch
Clock Edge:+         10.000        0.000
Src Latency:+         0.000        0.000
Net Latency:+         4.927 (P)    5.053 (P)
Arrival:=            14.927        5.053

Setup:-               0.400
Uncertainty:-         0.500
Cppr Adjust:+         0.080
Required Time:=      14.107
Launch Clock:=        5.053
Data Path:+           7.101
Slack:=               1.953

Timing Path:
#-----------------------------------------------------------------------------------------------
# Pin         Cell         Load   Trans  Incr   Delay  Arrival  User    Total   Aocv   Aocv
#                          (pf)   (ns)   Delay  (ns)   (ns)     Derate  Derate  Stage  Derate
#                                                                               Count
#-----------------------------------------------------------------------------------------------
  clk         (arrival)    0.003  0.003  0.000  0.000  0.000    -       -       5.000  -
  CKBUF1/Q    bufx10_d     0.003  0.003  0.000  1.020  1.020    1.000   1.020   5.000  1.020   <-- Tempus does not count FF1 as a stage, hence 5 (instead of 6 in PrimeTime)
  CKBUF2/Q    bufx10_d     0.006  0.201  0.000  1.020  2.040    1.000   1.020   8.000  1.020   <-- Tempus reports 8 stages but counts 5 (i.e. as for CKBUF1, which is correct)
  CKBUF3a1/Q  bufx10_d     0.005  0.201  0.000  1.013  3.053    1.000   1.013   8.000  1.013
  CKINV3a2/Q  invx05_d     0.005  0.201  0.000  1.000  4.053    1.000   1.000   8.000  1.000
  CKINV3a3/Q  invx05_d     0.005  0.301  0.000  1.000  5.053    1.000   1.000   8.000  1.000
  FF1/Q       dffprqx05_d  0.003  0.201  0.000  3.075  8.128    1.000   1.025   8.000  1.025
  BUF1/Q      bufx10_d     0.005  0.301  0.000  1.013  9.141    1.000   1.013   8.000  1.013   <-- 8 stages as in PrimeTime
  INV2/Q      invx05_d     0.005  0.301  0.000  1.000  10.141   1.000   1.000   8.000  1.000
  INV3/Q      invx05_d     0.003  0.201  0.000  1.000  11.141   1.000   1.000   8.000  1.000
  BUF4/Q      bufx10_d     0.005  0.301  0.000  1.013  12.154   1.000   1.013   8.000  1.013
  FF2/D       dffprqx05_d  0.005  0.301  0.000  0.000  12.154   1.000   1.000   8.000  1.000
#-----------------------------------------------------------------------------------------------
Other End Path:
#-----------------------------------------------------------------------------------------------
# Pin         Cell         Load   Trans  Incr   Delay  Arrival  User    Total   Aocv   Aocv
#                          (pf)   (ns)   Delay  (ns)   (ns)     Derate  Derate  Stage  Derate
#                                                                               Count
#-----------------------------------------------------------------------------------------------
  clk         (arrival)    0.003  0.003  0.000  0.000  10.000   -       -       5.000  -
  CKBUF1/Q    bufx10_d     0.003  0.003  0.000  0.980  10.980   1.000   0.980   5.000  0.980   <-- 5 stages as in PrimeTime
  CKBUF2/Q    bufx10_d     0.006  0.201  0.000  0.980  11.960   1.000   0.980   3.000  0.980   <-- Tempus reports 3 stages but counts 5 (i.e. as for CKBUF1, which is correct)
  CKBUF3b1/Q  bufx10_d     0.005  0.201  0.000  0.967  12.927   1.000   0.967   3.000  0.967   <-- 3 stages as in PrimeTime
  CKINV3b2/Q  invx05_d     0.005  0.201  0.000  1.000  13.927   1.000   1.000   3.000  1.000
  CKINV3b3/Q  invx05_d     0.005  0.301  0.000  1.000  14.927   1.000   1.000   3.000  1.000
  FF2/CK      dffprqx05_d  0.005  0.201  0.000  0.000  14.927   1.000   1.000   3.000  1.000
#-----------------------------------------------------------------------------------------------

###################
######## actuals for CKBUF2 stage count and derates
###################
@tempus> foreach t {early late} { \
           foreach a {data launch_clock capture_clock} { \
             puts "aocv_stage_count_${a}_${t} = [get_db [get_arcs -from CKBUF2/A -to CKBUF2/Q] .aocv_stage_count_${a}_${t}]"; \
           } \
         }
aocv_stage_count_data_early = 5.0
aocv_stage_count_launch_clock_early = 5.0
aocv_stage_count_capture_clock_early = 5.0
aocv_stage_count_data_late = no_value
aocv_stage_count_launch_clock_late = 5.0
aocv_stage_count_capture_clock_late = 5.0
@tempus> foreach t {early late} { \
           foreach a {data launch_clock capture_clock} { \
             foreach e {rise fall} { \
               puts "aocv_derate_${a}_${t}_${e} = [get_db [get_arcs -from CKBUF2/A -to CKBUF2/Q] .aocv_derate_${a}_${t}_${e}]"; \
             } \
           } \
         }
aocv_derate_data_early_rise = 0.98
aocv_derate_data_early_fall = 0.98
aocv_derate_launch_clock_early_rise = 0.98
aocv_derate_launch_clock_early_fall = 0.98
aocv_derate_capture_clock_early_rise = 0.98
aocv_derate_capture_clock_early_fall = 0.98
aocv_derate_data_late_rise = 1.02
aocv_derate_data_late_fall = 1.02
aocv_derate_launch_clock_late_rise = 1.02
aocv_derate_launch_clock_late_fall = 1.02
aocv_derate_capture_clock_late_rise = 1.02
aocv_derate_capture_clock_late_fall = 1.02

###################
######## actuals for FF1 stage count and derates
###################
@tempus> foreach t {early late} { \
           foreach a {data launch_clock capture_clock} { \
             puts "aocv_stage_count_${a}_${t} = [get_db [get_arcs -from FF1/CK -to FF1/Q] .aocv_stage_count_${a}_${t}]"; \
           } \
         }
aocv_stage_count_data_early = 8.0
aocv_stage_count_launch_clock_early = 8.0
aocv_stage_count_capture_clock_early = no_value
aocv_stage_count_data_late = no_value
aocv_stage_count_launch_clock_late = 8.0
aocv_stage_count_capture_clock_late = no_value
@tempus> foreach t {early late} { \
           foreach a {data launch_clock capture_clock} { \
             foreach e {rise fall} { \
               puts "aocv_derate_${a}_${t}_${e} = [get_db [get_arcs -from FF1/CK -to FF1/Q] .aocv_derate_${a}_${t}_${e}]"; \
             } \
           } \
         }
aocv_derate_data_early_rise = 0.975
aocv_derate_data_early_fall = 0.975
aocv_derate_launch_clock_early_rise = 0.975
aocv_derate_launch_clock_early_fall = 0.975
aocv_derate_capture_clock_early_rise = 1.0
aocv_derate_capture_clock_early_fall = 1.0
aocv_derate_data_late_rise = 1.025
aocv_derate_data_late_fall = 1.025
aocv_derate_launch_clock_late_rise = 1.025
aocv_derate_launch_clock_late_fall = 1.025
aocv_derate_capture_clock_late_rise = 1.0
aocv_derate_capture_clock_late_fall = 1.0

###################
######## actuals for BUF1 derates (early & late reported separately)
###################
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q -min
From pin : BUF1/A
To Pin : BUF1/Q
Cell : bufx10_d
Library : testlib01
Arc sense : positive unate
Delay type : cell delay
                           Rise           Fall
-------------------------------------------------------------
Input transition time    : 0.200600 ns    0.300800 ns
Cell delay               : 0.999900 ns    0.999900 ns
Timing Derate            : 0.988000       0.988000
Derated Cell delay       : 0.987900 ns    0.987900 ns
Output transition time   : 0.200600 ns    0.300800 ns
-------------------------------------------------------------
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q -max
From pin : BUF1/A
To Pin : BUF1/Q
Cell : bufx10_d
Library : testlib01
Arc sense : positive unate
Delay type : cell delay
                           Rise           Fall
-------------------------------------------------------------
Input transition time    : 0.200600 ns    0.300800 ns
Cell delay               : 1.000000 ns    1.000000 ns
Timing Derate            : 1.013000       1.013000
Derated Cell delay       : 1.013000 ns    1.013000 ns
Output transition time   : 0.200600 ns    0.300800 ns
-------------------------------------------------------------
As AOCV models only process variation, users still need to somehow account for variation components from voltage and temperature. Hence some reduced flat OCV margins are typically used along with AOCV tables.
While the principle is simple and intuitive, the practice is a little harder. The flat OCV component still comes from set_timing_derate, but STA tools may differ in how they combine it with AOCV.
PrimeTime follows these rules:
- PT keeps the OCV margin as two components, a multiplicative scale factor (set_timing_derate) plus an additive margin (set_timing_derate -increment). The flat-only total derate formula is then <total derate> = <flat scaling> + <flat add margin>.
- In the AOCV flow:
  - The OCV additive margin component is always added.
  - The OCV multiplicative scale component is applied differently to cells with and without an AOCV model. Cells with AOCV models are only affected by set_timing_derate -aocvm_guardband. Cells with no AOCV model are only affected by set_timing_derate. This makes sense as a larger scale factor is to be used for cells with no AOCV tables.
  - OCV selection precedence rules apply as in the flat-only OCV flow.
Hence the total OCV derate formula becomes:
- cells w/ AOCV model: <total derate> = <AOCV derate> * <flat aocvm_guardband scaling> + <flat add margin>
- cells w/o AOCV model: <total derate> = <flat scaling> + <flat add margin>
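The two formulas can be written down as one small function. The Python sketch below is only a restatement of the rules above; the function and parameter names are hypothetical, and the numeric inputs are taken from the PrimeTime reports in this article (AOCV derates 1.017 and 1.013 for bufx10_d, 1.025 for FF1, flat scaling 1.010 and margin 0.020 for invx05_d, margin 0.025 for bufx10_d).

```python
def total_derate(flat_scale=1.0, flat_increment=0.0,
                 aocv_derate=None, aocvm_guardband=1.0):
    """Total late derate of one cell under the rules listed above.

    aocv_derate is None for a cell without an AOCV model. In that case only
    the flat scale factor applies; otherwise only the guardband applies.
    """
    if aocv_derate is not None:
        return aocv_derate * aocvm_guardband + flat_increment
    return flat_scale + flat_increment


# bufx10_d cells have an AOCV model, so their flat scaling of 1.015 is ignored.
print(round(total_derate(flat_scale=1.015, flat_increment=0.025, aocv_derate=1.017), 3))
print(round(total_derate(flat_scale=1.015, flat_increment=0.025, aocv_derate=1.013), 3))
# invx05_d has no AOCV model, so flat scaling and margin are summed.
print(round(total_derate(flat_scale=1.010, flat_increment=0.020), 3))
# FF1 has an AOCV model and no flat settings at all.
print(round(total_derate(aocv_derate=1.025), 3))
# An AOCV derate of 1.013 scaled by a guardband of 1.025, without margin.
print(round(total_derate(aocv_derate=1.013, aocvm_guardband=1.025), 4))
```

The first four results correspond to the Derate column of the report without a guardband, and the last one to the late derate that report_aocvm shows once a guardband of 1.025 is set on bufx10_d.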
The following example shows combining OCV and AOCV timing without using -aocvm_guardband; that is, the library AOCV derate is used with no scaling. The example shows various combinations, with BUF1 having an AOCV model plus OCV scaling and incremental margin, FF1 having an AOCV model but no OCV derates, and INV2 having only OCV scaling and incremental margin.
    pt_shell> reset_timing_derate
    pt_shell> set_timing_derate -cell_delay -late 1.01 [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 1.015 [get_lib_cells */bufx10_d]
    pt_shell> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */bufx10_d]

    pt_shell> report_timing_derate
                        ----- Clock ------          ------ Data ------
                        Rise          Fall          Rise          Fall
                    Early   Late  Early   Late  Early   Late  Early   Late
    ----------------------------------------------------------------------------------------------
    lib_cell: testlib01/bufx10_d
    Cell delay         --  1.015     --  1.015     --  1.015     --  1.015
    lib_cell: testlib01/invx05_d
    Cell delay         --  1.010     --  1.010     --  1.010     --  1.010

    pt_shell> report_timing_derate -increment
    ----------------------------------------------------------------------------------------------
    lib_cell: testlib01/bufx10_d
    Cell delay         --  0.025     --  0.025     --  0.025     --  0.025
    lib_cell: testlib01/invx05_d
    Cell delay         --  0.020     --  0.020     --  0.020     --  0.020

    pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
    ...
    AOCVM arc derates     Launch
    ---------------------------------------
    Early rise            0.9880
    Early fall            0.9880
    Late rise             1.0130
    Late fall             1.0130

    pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded
      Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
      Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
      Last common pin: CKBUF2/Q
      Path Group: CLK
      Path Type: max

      Point                              Derate      Incr      Path     aocv_derate aocv_scaling flat_scaling flat_inc
      --------------------------------------------------------------------------------------------------------------
      clock CLK (rise edge)                         0.000     0.000
      clock source latency                          0.000     0.000
      clk (in)                                      0.000     0.000 r
      CKBUF1/Q (bufx10_d)                 1.042     1.042     1.042 r
      CKBUF2/Q (bufx10_d)                 1.042     1.042     2.084 r   1.017                    1.015        0.025    <--- 1.017 + 0.025 = 1.042
      ...
      FF1/CK (dffprqx05_d)                1.000     0.000     5.182 r
      FF1/Q (dffprqx05_d) <-              1.025     3.075     8.257 f   1.025
      BUF1/Q (bufx10_d)                   1.038     1.038     9.295 f   1.013                    1.015        0.025    <--- 1.013 + 0.025 = 1.038
      INV2/Q (invx05_d)                   1.030     1.030    10.325 r   n/a                      1.010        0.020    <--- 1.010 + 0.020 = 1.030
      ...
      FF2/D (dffprqx05_d)                 1.000     0.000    12.393 f
      data arrival time                                      12.393

      clock CLK (rise edge)                        10.000    10.000
      clock source latency                          0.000    10.000
      clk (in)                                      0.000    10.000 r
      CKBUF1/Q (bufx10_d)                 0.980     0.980    10.980 r
      CKBUF2/Q (bufx10_d)                 0.980     0.980    11.960 r
      CKBUF3b1/Q (bufx10_d)               0.967     0.967    12.927 r
      CKINV3b2/Q (invx05_d)               1.000     1.000    13.927 f
      CKINV3b3/Q (invx05_d)               1.000     1.000    14.927 r
      FF2/CK (dffprqx05_d)                1.000     0.000    14.927 r
      clock reconvergence pessimism                 0.124    15.051
      clock uncertainty                            -0.500    14.551
      library setup time                  1.000    -0.400    14.151
      data required time                                     14.151
      ---------------------------------------------------------
      data required time                                     14.151
      data arrival time                                     -12.393
      ---------------------------------------------------------
      slack (MET)                                             1.758

      Derate Summary Report
      -------------------------------------------------------------------
      total derate : required time                            0.073
      total derate : arrival time                            -0.393
      -------------------------------------------------------------------
      total derate : slack                                    0.466

      slack (with derating applied) (MET)                     1.758
      clock reconvergence pessimism (due to derating)        -0.124
      -------------------------------------------------------------------
      slack (with no derating) (MET)                          2.100
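The bottom of the report can be cross-checked by hand. The following is a minimal sketch that only re-does the arithmetic with numbers copied from the PrimeTime report above. The variable names are illustrative (they are not PrimeTime attributes), and the way the derate summary lines add up is my reading of the report layout, not a documented formula.

```python
# Values copied from the PrimeTime report above (report time units).
capture_clock_arrival = 14.927   # FF2/CK arrival on the capture path
crpr = 0.124                     # clock reconvergence pessimism credit
clock_uncertainty = 0.500
setup_time = 0.400
data_arrival = 12.393

# Required time and slack of the setup check.
data_required = capture_clock_arrival + crpr - clock_uncertainty - setup_time
slack = data_required - data_arrival

# Derate summary (assumed reading of the report): the derate effect on slack
# is the effect on required time minus the effect on arrival time.
derate_on_required = 0.073
derate_on_arrival = -0.393
derate_on_slack = derate_on_required - derate_on_arrival

# Removing the derating also removes the CRPR credit it created.
slack_no_derate = slack + derate_on_slack - crpr

print(f"data required time : {data_required:.3f}")
print(f"slack              : {slack:.3f}")
print(f"derate on slack    : {derate_on_slack:.3f}")
print(f"slack, no derating : {slack_no_derate:.3f}")
```

The four printed values line up with the `data required time`, `slack`, `total derate : slack` and `slack (with no derating)` lines of the report.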
When adding -aocvm_guardband, all cells with an AOCV model are scaled accordingly:
    pt_shell> set_timing_derate -cell_delay -late 1.02 -aocvm_guardband [get_lib_cells */invx05_d]
    pt_shell> set_timing_derate -cell_delay -late 1.025 -aocvm_guardband [get_lib_cells */bufx10_d]

    pt_shell> report_timing_derate -aocvm_guardband
                        ----- Clock ------          ------ Data ------
                        Rise          Fall          Rise          Fall
                    Early   Late  Early   Late  Early   Late  Early   Late
    ----------------------------------------------------------------------------------------------
    lib_cell: testlib01/bufx10_d
    Cell delay         --  1.025     --  1.025     --  1.025     --  1.025
    lib_cell: testlib01/invx05_d
    Cell delay         --  1.020     --  1.020     --  1.020     --  1.020

    # Note that AOCV scaling shows up immediately in the reported aocv derating.
    pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
    ...
    AOCVM arc derates     Launch
    ---------------------------------------
    Early rise            0.9880
    Early fall            0.9880
    Late rise             1.0383
    Late fall             1.0383

    pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded -significant_digits 5
      Point                                Derate       Incr       Path     aocv_derate aocv_scaling flat_scaling flat_inc
      --------------------------------------------------------------------------------------------------------------------
      ...
      CKBUF1/Q (bufx10_d)                 1.05725    1.05725    1.05725 r
      ...
      CKBUF2/Q (bufx10_d)                 1.05725    1.05726    2.11451 r   1.017       1.015        1.015        0.025    <--- 1.017*1.015 + 0.025 = 1.057255
      ...
      FF1/CK (dffprqx05_d)                1.00000    0.00000    5.22771 r
      FF1/Q (dffprqx05_d) <-              1.02500    3.07500    8.30270 f   1.025
      BUF1/Q (bufx10_d)                   1.05319    1.05319    9.35590 f   1.013       1.015        1.015        0.025    <--- 1.013*1.015 + 0.025 = 1.053195
      INV2/Q (invx05_d)                   1.03000    1.03000   10.38590 r   n/a         1.010        1.010        0.020    <--- 1.010 + 0.020 = 1.030
      ...
      FF2/D (dffprqx05_d)                 1.00000    0.00000   12.46910 f
      data arrival time                                        12.46910

      clock CLK (rise edge)                         10.00000   10.00000
      clock source latency                           0.00000   10.00000
      clk (in)                                       0.00000   10.00000 r
      CKBUF1/Q (bufx10_d)                 0.98000    0.98000   10.98000 r
      CKBUF2/Q (bufx10_d)                 0.98000    0.98000   11.96000 r
      CKBUF3b1/Q (bufx10_d)               0.96700    0.96700   12.92700 r
      CKINV3b2/Q (invx05_d)               1.00000    1.00000   13.92700 f
      CKINV3b3/Q (invx05_d)               1.00000    1.00000   14.92700 r
      FF2/CK (dffprqx05_d)                1.00000    0.00000   14.92700 r
      clock reconvergence pessimism                  0.15451   15.08151
      clock uncertainty                             -0.50000   14.58151
      library setup time                  1.00000   -0.40000   14.18151
      data required time                                       14.18151
      ---------------------------------------------------------------
      data required time                                       14.18151
      data arrival time                                       -12.46910
      ---------------------------------------------------------------
      slack (MET)                                               1.71242
Tempus follows these rules:
- Tempus keeps a single flat OCV multiplicative scaling factor. This scaling factor can be set directly by set_timing_derate or adjusted incrementally by subsequent calls to set_timing_derate -add|-multiply. Without AOCV, the total derate formula is simply <total derate> = <flat scaling>.
- Recent versions of Tempus allow an incremental additive margin that is maintained separately from the scaling factor (see Flat Deratings (Tempus)). This affects how OCV gets combined with AOCV!
- In AOCV flow:
  - The OCV flat derate scaling factor is applied either multiplicatively (default) or, when the timing_aocv_derate_mode root attribute is set to aocv_additive, as an additive margin.
  - Tempus makes no distinction between cells with and without AOCV models. The same flat OCV scaling factor is used, following the OCV selection precedence rules (like in PrimeTime).
  - If -increment is used (in recent Tempus versions), the incremental additive margin is added on top of the combined AOCV derate and flat OCV scaling factor.
  - Hence the total OCV derate formula (note that <flat add margin> is an additive margin set through set_timing_derate -increment):
    - timing_aocv_derate_mode==aocv_multiplicative: <total derate> = <aocv_derate> * <flat scaling> + <flat add margin>.
    - timing_aocv_derate_mode==aocv_additive: <total derate> = <aocv_derate> + (<flat scaling> - 1.0) + <flat add margin>.
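The two Tempus formulas can be written down as a small function. This is a sketch of the arithmetic only, not of any Tempus API: `tempus_total_derate` and its parameter names are hypothetical, and the sample numbers are the late derates of BUF1 and INV2 from the example that follows (AOCV derate 1.013 for BUF1, no AOCV model for INV2, hence 1.0).

```python
def tempus_total_derate(aocv_derate, flat_scaling, flat_add_margin=0.0,
                        mode="aocv_multiplicative"):
    """Combine AOCV and flat OCV derates per the two formulas listed above.

    Hypothetical helper: pass aocv_derate=1.0 for a cell without an AOCV model.
    """
    if mode == "aocv_multiplicative":
        return aocv_derate * flat_scaling + flat_add_margin
    if mode == "aocv_additive":
        return aocv_derate + (flat_scaling - 1.0) + flat_add_margin
    raise ValueError(f"unknown mode: {mode}")


# (aocv_derate, flat_scaling, flat_add_margin) for the late cell delay.
cells = {
    "BUF1 (bufx10_d)": (1.013, 1.015, 0.025),
    "INV2 (invx05_d)": (1.000, 1.010, 0.020),
}

for name, (aocv, scaling, margin) in cells.items():
    for mode in ("aocv_multiplicative", "aocv_additive"):
        total = tempus_total_derate(aocv, scaling, margin, mode)
        print(f"{name:16s} {mode:20s} {total:.6f}")
```

For BUF1 the two modes give 1.053195 and 1.053000. For INV2 both modes give 1.030000, because with an AOCV derate of 1.0 the multiplicative and additive forms coincide.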
The following example shows the timing in the aocv_multiplicative mode. The results match PrimeTime except for the AOCV stage count (and hence the AOCV and total derate) of CKBUF1 and CKBUF2. That difference is compensated by CRPR removal, so the resulting slack is the same.
    @tempus> reset_timing_derate
    @tempus> set_timing_derate -cell_delay -late 1.01 [get_lib_cells */invx05_d]
    @tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
    # the following yields same as `-cell_delay -late 1.015` without `-add`
    @tempus> set_timing_derate -cell_delay -late 0.015 -add [get_lib_cells */bufx10_d]
    @tempus> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */bufx10_d]

    @tempus> report_timing_derate
                          -------------Clock-----------------  ---------------Data-----------------
                          Rise              Fall               Rise              Fall
                          Early    Late     Early    Late      Early    Late     Early    Late
    ---------------------------------------------------------------------------
    LibraryCell: testlib01/bufx10_d
    Cell_delay            --       1.015    --       1.015     --       1.015    --       1.015
    Incremental_adjust    0.000    0.025    0.000    0.025     0.000    0.025    0.000    0.025
    Input_switching       --       --       --       --        --       --       --       --
    LibraryCell: testlib01/invx05_d
    Cell_delay            --       1.010    --       1.010     --       1.010    --       1.010
    Incremental_adjust    0.000    0.020    0.000    0.020     0.000    0.020    0.000    0.020

    @tempus> get_db timing_aocv_derate_mode
    aocv_multiplicative

    @tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
    Path 1: MET (1.71242 ns) Setup Check with Pin FF2/CK->D
              Group: CLK
         Startpoint: (R) FF1/CK
              Clock: (R) CLK
           Endpoint: (F) FF2/D
              Clock: (R) CLK

                           Capture        Launch
         Clock Edge:+     10.00000       0.00000
        Src Latency:+      0.00000       0.00000
        Net Latency:+      4.92700 (P)   5.23380 (P)
            Arrival:=     14.92700       5.23380

              Setup:-      0.40000
        Uncertainty:-      0.50000
        Cppr Adjust:+      0.16060
      Required Time:=     14.18760
       Launch Clock:=      5.23380
          Data Path:+      7.24139
              Slack:=      1.71242

    Timing Path:
    #---------------------------------------------------------------------------------------------------------
    # Pin         Cell          Load    Trans     Incr    Delay   Arrival   User  Total     Aocv   Aocv
    #                           (pf)     (ns)    Delay     (ns)      (ns) Derate Derate    Stage Derate
    #                                                                                      Count
    #---------------------------------------------------------------------------------------------------------
      clk         (arrival)    0.003  0.00300  0.00000  0.00000   0.00000      -      -  5.00000      -
      CKBUF1/Q    bufx10_d     0.003  0.00300  0.00000  1.06030   1.06030  1.015  1.060  5.00000  1.020
      CKBUF2/Q    bufx10_d     0.006  0.20060  0.00000  1.06030   2.12060  1.015  1.060  8.00000  1.020  <-- 1.02 * 1.015 + 0.025 = 1.06030
      CKBUF3a1/Q  bufx10_d     0.005  0.20060  0.00000  1.05319   3.17379  1.015  1.053  8.00000  1.013
      CKINV3a2/Q  invx05_d     0.005  0.20060  0.00000  1.03000   4.20379  1.010  1.030  8.00000  1.000
      CKINV3a3/Q  invx05_d     0.005  0.30080  0.00000  1.03000   5.23380  1.010  1.030  8.00000  1.000
      FF1/Q       dffprqx05_d  0.003  0.20060  0.00000  3.07500   8.30879  1.000  1.025  8.00000  1.025
      BUF1/Q      bufx10_d     0.005  0.30080  0.00000  1.05319   9.36199  1.015  1.053  8.00000  1.013  <-- 1.013 * 1.015 + 0.025 = 1.053195
      INV2/Q      invx05_d     0.005  0.30080  0.00000  1.03000  10.39199  1.010  1.030  8.00000  1.000
      INV3/Q      invx05_d     0.003  0.20060  0.00000  1.03000  11.42199  1.010  1.030  8.00000  1.000
      BUF4/Q      bufx10_d     0.005  0.30080  0.00000  1.05319  12.47518  1.015  1.053  8.00000  1.013
      FF2/D       dffprqx05_d  0.005  0.30080  0.00000  0.00000  12.47518  1.000  1.000  8.00000  1.000
    #---------------------------------------------------------------------------------------------------------
    Other End Path:
    #---------------------------------------------------------------------------------------------------------
    # Pin         Cell          Load    Trans     Incr    Delay   Arrival   User  Total     Aocv   Aocv
    #                           (pf)     (ns)    Delay     (ns)      (ns) Derate Derate    Stage Derate
    #                                                                                      Count
    #---------------------------------------------------------------------------------------------------------
      clk         (arrival)    0.003  0.00300  0.00000  0.00000  10.00000      -      -  5.00000      -
      CKBUF1/Q    bufx10_d     0.003  0.00300  0.00000  0.98000  10.98000  1.000  0.980  5.00000  0.980
      CKBUF2/Q    bufx10_d     0.006  0.20060  0.00000  0.98000  11.96000  1.000  0.980  3.00000  0.980
      CKBUF3b1/Q  bufx10_d     0.005  0.20060  0.00000  0.96700  12.92700  1.000  0.967  3.00000  0.967
      CKINV3b2/Q  invx05_d     0.005  0.20060  0.00000  1.00000  13.92700  1.000  1.000  3.00000  1.000
      CKINV3b3/Q  invx05_d     0.005  0.30080  0.00000  1.00000  14.92700  1.000  1.000  3.00000  1.000
      FF2/CK      dffprqx05_d  0.005  0.20060  0.00000  0.00000  14.92700  1.000  1.000  3.00000  1.000
    #---------------------------------------------------------------------------------------------------------
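The CRPR compensation can be verified from the two reports. The sketch below assumes that the CRPR/CPPR credit equals the late-minus-early delay difference accumulated over the common clock cells (CKBUF1 and CKBUF2, up to the last common pin CKBUF2/Q). That assumption is not stated by either tool here, but it reproduces both reported credits. The PrimeTime delays are taken from the -aocvm_guardband report above, and the function name is illustrative.

```python
# Cell delays of the common clock cells CKBUF1 and CKBUF2, copied from the reports.
early = [0.98000, 0.98000]            # capture path, identical in both tools
late_primetime = [1.05725, 1.05726]   # launch path, PrimeTime (-aocvm_guardband run)
late_tempus = [1.06030, 1.06030]      # launch path, Tempus (aocv_multiplicative)


def crpr_credit(late, early):
    """Assumed CRPR model: late minus early delay, summed over the common cells."""
    return sum(l - e for l, e in zip(late, early))


crpr_pt = crpr_credit(late_primetime, early)
crpr_tempus = crpr_credit(late_tempus, early)

# Extra launch clock delay Tempus sees on the common path versus PrimeTime.
extra_launch_delay = sum(late_tempus) - sum(late_primetime)
# Extra CRPR credit Tempus gives back on the required time.
extra_credit = crpr_tempus - crpr_pt

print(f"CRPR PrimeTime     : {crpr_pt:.5f}")
print(f"CPPR Tempus        : {crpr_tempus:.5f}")
print(f"extra launch delay : {extra_launch_delay:.5f}")
print(f"extra credit       : {extra_credit:.5f}")
```

The credits come out as 0.15451 and 0.16060, matching the reports. The extra launch delay and the extra credit are the same 0.00609, so they cancel in the slack.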
The difference between the two timing_aocv_derate_mode settings is fairly small when the derates are reasonably sized. Using the preceding setup, here is the cell delay of BUF1 calculated with aocv_multiplicative and with aocv_additive:
    @tempus> set_db timing_aocv_derate_mode aocv_multiplicative
    @tempus> report_delay_calculation -from BUF1/A -to BUF1/Q
    From pin   : BUF1/A
    To Pin     : BUF1/Q
    Cell       : bufx10_d
    Library    : testlib01
    Arc sense  : positive unate
    Delay type : cell delay
                                 Rise           Fall
    -------------------------------------------------------------
    Input transition time  : 0.200600 ns    0.300800 ns
    Cell delay             : 0.999900 ns    0.999900 ns
    Timing Derate          : 1.053195       1.053195
    Derated Cell delay     : 1.053100 ns    1.053100 ns
    Output transition time : 0.200600 ns    0.300800 ns
    -------------------------------------------------------------

    @tempus> set_db timing_aocv_derate_mode aocv_additive
    @tempus> report_delay_calculation -from BUF1/A -to BUF1/Q
    ...
                                 Rise           Fall
    -------------------------------------------------------------
    Input transition time  : 0.200600 ns    0.300800 ns
    Cell delay             : 0.999900 ns    0.999900 ns
    Timing Derate          : 1.053000       1.053000
    Derated Cell delay     : 1.052900 ns    1.052900 ns
    Output transition time : 0.200600 ns    0.300800 ns
    -------------------------------------------------------------
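As a quick check of the two reports, the derated cell delay is the library cell delay multiplied by the timing derate. The sketch below only repeats that multiplication with the reported numbers. The four-decimal rounding is an assumption made to match the reported precision.

```python
cell_delay_ns = 0.9999  # BUF1 cell delay before derating, from the report

# Timing derate reported by Tempus for BUF1 in each mode.
timing_derate = {
    "aocv_multiplicative": 1.053195,
    "aocv_additive": 1.053000,
}

# Derated delay = cell delay * timing derate.
derated_ns = {mode: cell_delay_ns * d for mode, d in timing_derate.items()}

for mode, value in derated_ns.items():
    print(f"{mode:20s} {value:.4f} ns")

# The mode only changes the delay of this cell by a fraction of a picosecond.
delta_ns = derated_ns["aocv_multiplicative"] - derated_ns["aocv_additive"]
print(f"difference           {delta_ns:.6f} ns")
```

This reproduces the reported 1.0531 ns and 1.0529 ns. The difference is about 0.0002 ns for this single cell.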
When using aocv_additive, the full path timing looks as follows and, as expected, yields a slightly different slack:
    @tempus> set_db timing_aocv_derate_mode aocv_additive
    @tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
              Group: CLK
         Startpoint: (R) FF1/CK
              Clock: (R) CLK
           Endpoint: (F) FF2/D
              Clock: (R) CLK

                           Capture        Launch
         Clock Edge:+     10.00000       0.00000
        Src Latency:+      0.00000       0.00000
        Net Latency:+      4.92700 (P)   5.23300 (P)
            Arrival:=     14.92700       5.23300

              Setup:-      0.40000
        Uncertainty:-      0.50000
        Cppr Adjust:+      0.16000
      Required Time:=     14.18700
       Launch Clock:=      5.23300
          Data Path:+      7.24100
              Slack:=      1.71300

    Timing Path:
    #---------------------------------------------------------------------------------------------------------
    # Pin         Cell          Load    Trans     Incr    Delay   Arrival   User  Total     Aocv   Aocv
    #                           (pf)     (ns)    Delay     (ns)      (ns) Derate Derate    Stage Derate
    #                                                                                      Count
    #---------------------------------------------------------------------------------------------------------
      clk         (arrival)    0.003  0.00300  0.00000  0.00000   0.00000      -      -  5.00000      -
      CKBUF1/Q    bufx10_d     0.003  0.00300  0.00000  1.06000   1.06000  1.015  1.060  5.00000  1.020
      CKBUF2/Q    bufx10_d     0.006  0.20060  0.00000  1.06000   2.12000  1.015  1.060  8.00000  1.020
      CKBUF3a1/Q  bufx10_d     0.005  0.20060  0.00000  1.05300   3.17300  1.015  1.053  8.00000  1.013
      CKINV3a2/Q  invx05_d     0.005  0.20060  0.00000  1.03000   4.20300  1.010  1.030  8.00000  1.000
      CKINV3a3/Q  invx05_d     0.005  0.30080  0.00000  1.03000   5.23300  1.010  1.030  8.00000  1.000
      FF1/Q       dffprqx05_d  0.003  0.20060  0.00000  3.07500   8.30800  1.000  1.025  8.00000  1.025
      BUF1/Q      bufx10_d     0.005  0.30080  0.00000  1.05300   9.36100  1.015  1.053  8.00000  1.013
      INV2/Q      invx05_d     0.005  0.30080  0.00000  1.03000  10.39100  1.010  1.030  8.00000  1.000
      INV3/Q      invx05_d     0.003  0.20060  0.00000  1.03000  11.42100  1.010  1.030  8.00000  1.000
      BUF4/Q      bufx10_d     0.005  0.30080  0.00000  1.05300  12.47400  1.015  1.053  8.00000  1.013
      FF2/D       dffprqx05_d  0.005  0.30080  0.00000  0.00000  12.47400  1.000  1.000  8.00000  1.000
    #---------------------------------------------------------------------------------------------------------
    Other End Path:
    #---------------------------------------------------------------------------------------------------------
    # Pin         Cell          Load    Trans     Incr    Delay   Arrival   User  Total     Aocv   Aocv
    #                           (pf)     (ns)    Delay     (ns)      (ns) Derate Derate    Stage Derate
    #                                                                                      Count
    #---------------------------------------------------------------------------------------------------------
      clk         (arrival)    0.003  0.00300  0.00000  0.00000  10.00000      -      -  5.00000      -
      CKBUF1/Q    bufx10_d     0.003  0.00300  0.00000  0.98000  10.98000  1.000  0.980  5.00000  0.980
      CKBUF2/Q    bufx10_d     0.006  0.20060  0.00000  0.98000  11.96000  1.000  0.980  3.00000  0.980
      CKBUF3b1/Q  bufx10_d     0.005  0.20060  0.00000  0.96700  12.92700  1.000  0.967  3.00000  0.967
      CKINV3b2/Q  invx05_d     0.005  0.20060  0.00000  1.00000  13.92700  1.000  1.000  3.00000  1.000
      CKINV3b3/Q  invx05_d     0.005  0.30080  0.00000  1.00000  14.92700  1.000  1.000  3.00000  1.000
      FF2/CK      dffprqx05_d  0.005  0.20060  0.00000  0.00000  14.92700  1.000  1.000  3.00000  1.000
    #---------------------------------------------------------------------------------------------------------
Because AOCV models only the local process variation of transistors/gates, other local variations (V, T, RC) still need to be modeled by flat OCV, hence the need to combine the two margining methods. In practice this means nothing more than enabling the AOCV flow in the STA tool and specifying flat OCV margins alongside it.
The following lists the key aspects of combining AOCV and OCV deratings:
- Synopsys PrimeTime
  - PrimeTime maintains two derating components, a (multiplicative) scaling factor and an additive margin. This is consistent with flat OCV deratings.
  - PrimeTime differentiates between cells with and without AOCV models by keeping a separate multiplicative scaling factor for each.
    - For cells without AOCV, it uses the flat scaling defined by set_timing_derate <factor>.
    - For cells with AOCV, it uses the aocv scaling defined by set_timing_derate -aocvm_guardband <factor>.
  - The flat add margin is always applied.
  - The total timing derate is then:
    - without AOCV: <total derate> = 1.0 * <flat scaling> + <flat add margin>
    - with AOCV: <total derate> = <aocv derate> * <aocv scaling> + <flat add margin>
- Cadence Tempus
  - Tempus maintains a single (multiplicative) scaling factor. This scaling factor is used regardless of whether the cell has an AOCV model or not.
  - Users can select whether the scaling factor is applied multiplicatively or additively. This setting is global and defaults to multiplicative.
  - Tempus versions that support set_timing_derate -increment also maintain a separate flat add margin.
  - The total timing derate is then (based on timing_aocv_derate_mode; for cells without AOCV, <aocv derate> = 1.0):
    - aocv_multiplicative scaling: <total derate> = <aocv derate> * <flat scaling> [+ <flat add margin>]
    - aocv_additive scaling: <total derate> = <aocv derate> + (<flat scaling> - 1) [+ <flat add margin>]
Some recommendations for yielding consistent analysis results across tools:
Follow the recommendations for flat OCVs.
Use separate scaling factors for cells with and without AOCV models. Typically only standard cells come with AOCV models, and those cells tend to have lower flat OCV margins anyway.
The split into cells with and without AOCV models matters only for PrimeTime, which needs it to decide when the -aocvm_guardband option applies. A conservative approach is to always apply both set_timing_derate and set_timing_derate -aocvm_guardband with the same factor, as the two factors do not interfere with one another.
Avoid incremental margins altogether if possible. The only incremental margin that yields consistent results is set_timing_derate -increment.
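The effect of applying the same factor through both set_timing_derate and set_timing_derate -aocvm_guardband can be illustrated with the total derate formulas summarized earlier. The sketch below is arithmetic only: the function names are hypothetical, `None` is my convention for "no AOCV model", and the numbers are the late derates of BUF1 and INV2 from the examples.

```python
def primetime_total_derate(aocv_derate, flat_scaling, aocv_scaling, flat_add_margin):
    """PrimeTime formula; aocv_derate is None for a cell without an AOCV model."""
    if aocv_derate is None:
        return 1.0 * flat_scaling + flat_add_margin
    return aocv_derate * aocv_scaling + flat_add_margin


def tempus_total_derate(aocv_derate, flat_scaling, flat_add_margin):
    """Tempus formula in aocv_multiplicative mode (AOCV derate 1.0 if no model)."""
    base = 1.0 if aocv_derate is None else aocv_derate
    return base * flat_scaling + flat_add_margin


# (aocv_derate, scaling factor, incremental margin)
cells = {
    "BUF1 (AOCV model)": (1.013, 1.015, 0.025),
    "INV2 (no AOCV model)": (None, 1.010, 0.020),
}

for name, (aocv, factor, margin) in cells.items():
    # Only the flat factor is set; the guardband stays at its default of 1.0.
    pt_flat_only = primetime_total_derate(aocv, factor, 1.0, margin)
    # The same factor is also applied as -aocvm_guardband.
    pt_both = primetime_total_derate(aocv, factor, factor, margin)
    tempus = tempus_total_derate(aocv, factor, margin)
    print(f"{name:22s} PT flat only {pt_flat_only:.6f}  "
          f"PT both {pt_both:.6f}  Tempus {tempus:.6f}")
```

For BUF1, PrimeTime with only the flat factor gives 1.038, while Tempus gives 1.053195. Once the same factor is also applied as the guardband, PrimeTime gives 1.053195 as well. INV2 has no AOCV model and yields 1.030 in all three cases.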
Some more practical considerations:
The worst-case AOCV PVT (i.e. the one with the largest margins) may not coincide with the worst-case Liberty/timing PVT. This may shift the combined worst case away from the Liberty-only one, or may call for sign-off at more potentially worst-case corners.
Another inconvenience is that some tools may not honor (A)OCV in all phases of the flow, or may optimize at only a single PVT.
AOCV ceases to model OCV effects adequately as technology nodes scale further. First, AOCV factors scale only cell delays and do not address timing constraints (e.g. setup/hold times). Second, AOCV factors are simple 1D tables and hence do not capture variation with input slew or cell loading.
The next OCV evolution step is statistical OCV (SOCV), where potentially every timing parameter is represented by its mean value (i.e. the 2D tables from the good old timing library) and by statistical distribution parameters (such as sigma), given as either 1D or 2D tables. The STA tool then does "simple" statistical calculations to compute the cumulative statistical effect of propagating an input distribution through the series of cells in a timing path. This yields more accurate (and still computationally affordable) OCV modeling than the other OCV methods. The approach also allows modeling distributions that are "skewed" and "tilted" relative to the typical normal one.
SOCV details are not covered in this text.
While OCV is not a complicated matter, getting consistent results from it across different tools may quickly turn into a nightmare with complex derate setups. It is therefore generally recommended to minimize the use of incremental derates.
[StaBasics] Brabec, Tomas. Static Timing Analysis Basics. On-line.