January 2014
Independent control by ATPG of each clock domain to improve coverage, reduce pattern count, and
achieve safe clocking with minimal user intervention
Enable slow or fast clocks during capture for application of slow and at-speed patterns
Scan-programmable clock waveforms generated within a wrapped core are ideal for generating
patterns at the core level that can be retargeted to the top level while simultaneously testing
multiple cores without conflicts in how clocks are controlled within each core
This application note describes a practical on-chip clock control circuit design and demonstrates its use in
a test case using Tessent ATPG tools.
If the target EDT hardware uses the Low Pin Count Test (LPCT) controller, contact Mentor Customer
Support for information on how to use a modified OCC design with the LPCT EDT hardware.
A complete test case that demonstrates the use of this OCC design in a circuit is described in the last
section. The test case is available from Mentor Graphics by downloading it from the following SupportNet
page: http://supportnet.mentor.com/reference/tutorials/index.cfm?id=MG576857
It is important to balance only the functional clock path of the mux in order to avoid over-constraining the
clock tree synthesis flow and causing excessive clock latency. For example, in a layout tool such as
IC Compiler, this can be accomplished by using a set_clock_tree_exceptions -exclude_pins
command and listing the slow and fast clock inputs of the clock control block. In tools such as Talus from
Magma, a skew group definition for each clock control block can be used.
The clock control design described in this application note should supply the clock when in test mode
while using the clock output of the PLL as the fast clock for at-speed capture. A top-level slow clock will
be used for shift and slow capture. The reference clock supplied to the PLL is a free-running clock.
It is also recommended not to flatten the clock control blocks during layout in order to ease definition of
the test procedure file after layout.
The following table describes the functionality of pins at the top of the clock control block as well as some
of the internal signals:
Static signals that do not change during the test session can be controlled through on-chip controllers
(such as JTAG) or other means in order to reduce the need for top-level pins.
In version 1.1 of the clock controller RTL, a flop was added on the input side of the synchronization cell
and clocked by the trailing edge of SLOW_CLK. Since SCAN_EN normally fans out to the entire circuit
and may arrive after FAST_CLK, the flop on SLOW_CLK ensures that SCAN_EN is not synchronized by
the fast clock until SLOW_CLK is pulsed, thus reducing the risk of a race condition.
Note that the scan enable synchronization logic is not used for slow capture mode which uses
SLOW_CLK for shift and capture.
In the RTL description, the synchronization cell is described as module tessent_sync_cell so that it can
be replaced with a technology specific synchronization cell from the appropriate library.
module tessent_sync_cell (d, clk, q);
input d, clk;
output q;
reg [1:0] R;
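As a rough behavioral illustration of what a synchronization cell of this kind does, the following Python sketch models a two-stage synchronizer. It assumes that the two-bit register R in the RTL excerpt above acts as a two-flop shift chain clocked by clk, which the excerpt itself does not show. The class name SyncCell and its method are hypothetical.

```python
class SyncCell:
    """Behavioral model of a two-flop synchronizer (assumed structure)."""

    def __init__(self):
        # r[0] is the first stage, r[1] is the second stage (drives q)
        self.r = [0, 0]

    def clock(self, d):
        """Apply one active clock edge with input d and return output q."""
        # Both flops update together: stage 2 takes the old stage 1 value.
        self.r = [d, self.r[0]]
        return self.r[1]


cell = SyncCell()
outputs = [cell.clock(d) for d in [1, 1, 1, 0, 0]]
print(outputs)
```

In this model a change on d reaches q only on the second clock edge after it is applied, which is the delay a technology-specific replacement cell would also be expected to provide.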
In order to ensure proper DRC analysis and simulation, the output of the clock gater cell driven by the
synchronization logic should be defined as a free-running internal clock. This is indicated by an arrow in
Figure 2 and ensures correct simulation of the logic during load_unload and avoids DRC violations.
wire te_fe;
reg latch;
With this specification, the slow shift clock is supplied to the design anytime SCAN_EN is high and should
be pulsed in the shift procedures. The capture clock will be controlled by a clock control definition that
describes the effects of the shift registers condition bits on each capture cycle.
The shift register block contains the programmable scan cells which will be loaded during shift in order to
pulse the internal clock during the cycle required by ATPG.
The AND gate on the input of the shift register loads zeros into the register during capture to clear it. The
EN output signal of the shift register block is used in the clock control block (Figure 2) to turn off the fast
clock to the shift register once the shift register has been unloaded. This ensures switching from the fast
capture clock to slow shift clock without risk of glitches and disturbing the values present in the shift
register. It also ensures that the shift register flip-flops have stable values when the ATPG tool simulates
the load_unload procedure and eliminates unnecessary DRC violations.
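The behavior described above can be sketched with a small Python model. It is only an illustration under stated assumptions: the register length of 4, the use of SCAN_EN as the second input of the AND gate, and the idea that the bit in FF[0] enables a clock pulse in the current capture cycle are all assumptions, since the complete RTL of the shift register block is not reproduced here. The OR-reduction for EN and the use of FF[0] as SCAN_OUT follow the RTL fragments in this note.

```python
class ConditionShiftReg:
    """Behavioral sketch of the condition-bit shift register."""

    def __init__(self, length=4):          # length of 4 is an assumption
        self.ff = [0] * length             # ff[0] drives SCAN_OUT

    def shift(self, scan_in, scan_en):
        """One clock edge; returns the bit that was in ff[0]."""
        out = self.ff[0]
        # AND gate on the input: scan data passes only while SCAN_EN is 1,
        # so zeros are shifted in during capture (assumed gating signal).
        new_bit = scan_in & scan_en
        self.ff = self.ff[1:] + [new_bit]
        return out

    @property
    def en(self):
        """EN is the OR of all register bits (assign EN = |FF)."""
        return int(any(self.ff))


reg = ConditionShiftReg(length=4)
for bit in [0, 1, 1, 0]:                   # load during shift (SCAN_EN = 1)
    reg.shift(bit, scan_en=1)

pulses, en_trace = [], []
for _ in range(4):                         # capture (SCAN_EN = 0)
    pulses.append(reg.shift(scan_in=1, scan_en=0))
    en_trace.append(reg.en)
print(pulses, en_trace)
```

With the loaded value 0110, the model enables pulses in the second and third capture cycles, and EN drops to 0 once the last 1 has left the register, which is the point at which the fast clock to the shift register can be turned off.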
A lockup cell on the SCAN_OUT output of the shift register block ensures proper shift operation when
several clock control blocks are concatenated into a scan chain or when scan cells from chains with
different clocks are combined with the shift register flops. This is important because as described in
section 2.1, each clock control block forms a locally balanced clock tree which is not balanced with any
other chain segment.
A key feature of the shift register block is the ability to bypass up to 3 shift registers in order to reduce the
number of bits that must be specified in the patterns. This is done by setting the CAP_CYCLE_CONFIG
signals per the following table:
In addition to limiting the overall number of specified scan cells, it is also important to limit the number of
scan cells that must be specified in each shift cycle. When using compression, the output of the
decompressor loads all chains simultaneously one shift cycle at a time. When stitching the shift register
sub-chains into the design scan chains, care should be taken to avoid the alignment of multiple condition
bits into the same shift cycle. One approach is to stitch the condition registers into an uncompressed scan
chain which is directly loaded. For designs in which all scan chains are compressed, placing condition bits
at the beginning, end or similar cell number of all scan chains should be avoided so that it is not
necessary to load many specified bits in the same shift cycle.
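One way to check a proposed stitching against this guideline is to count how many condition bits share the same cell number across chains. The Python sketch below does that for hypothetical placements; the chain names, cell numbers, and function names are illustrative only. It treats equal cell numbers as landing in the same shift cycle, which is a simplification that holds when the chains have equal length.

```python
from collections import Counter


def specified_bits_per_cell_number(condition_bits):
    """Count condition bits at each cell number across all chains.

    condition_bits: iterable of (chain_name, cell_number) pairs.
    """
    return Counter(cell for _chain, cell in condition_bits)


def worst_alignment(condition_bits):
    """Return (cell_number, count) for the most crowded cell number."""
    counts = specified_bits_per_cell_number(condition_bits)
    # Highest count wins; ties are broken in favor of the lowest cell number.
    return max(counts.items(), key=lambda kv: (kv[1], -kv[0]))


# Hypothetical placement: condition bits at the start of three chains
aligned = [("chain1", 0), ("chain2", 0), ("chain3", 0)]
# The same bits staggered to different cell numbers
staggered = [("chain1", 0), ("chain2", 7), ("chain3", 15)]

print(worst_alignment(aligned))
print(worst_alignment(staggered))
```

A count greater than 1 for any cell number indicates that the decompressor would have to deliver several specified bits in one shift cycle.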
When operating in functional mode (TEST_MODE = 0), all clock gaters are disabled to reduce
power.
In shift mode (SCAN_EN = 1), SLOW_CLK is used to load/unload scan chains, which
include the condition bits in ShiftReg.
In slow capture mode (FAST_CAP_MODE = 0), SLOW_CLK is used to capture data into
scan cells and to shift the condition bits in ShiftReg.
In fast capture mode (FAST_CAP_MODE = 1), FAST_CLK is used to capture data into scan
cells and to shift the condition bits in ShiftReg.
(* version=1.2 *)
module tessent_atpg_clock_controller (FAST_CLK, SLOW_CLK, TEST_MODE, SCAN_IN, SCAN_EN,
FAST_CAP_MODE, CAP_CYCLE_CONFIG, SCAN_OUT, CLK_OUT);
input FAST_CLK, SLOW_CLK, TEST_MODE, SCAN_IN, SCAN_EN, FAST_CAP_MODE;
input [1:0] CAP_CYCLE_CONFIG;
output SCAN_OUT, CLK_OUT;
wire SCAN_EN_sync;
wire ShiftReg_EN;
wire ShiftReg_SCAN_OUT;
wire SHIFT_REG_CLK_en;
wire SHIFT_REG_CLK_G;
wire SHIFT_REG_CLK;
wire CLK_OUT_source;
wire CLK_OUT_en;
wire CLK_OUT_G;
reg ShiftReg_SCAN_OUT_lockup;
reg SE_SLOW_CLK;
tessent_cgc cgc_SHIFT_REG_CLK
(.clk(FAST_CLK), .fe(SHIFT_REG_CLK_en), .te(1'b0), .clkg(SHIFT_REG_CLK_G));
tessent_clk_mux clock_mux_SHIFT_REG_CLK
(.a(SHIFT_REG_CLK_G), .b(SLOW_CLK), .s(SCAN_EN | ~FAST_CAP_MODE), .y(SHIFT_REG_CLK));
tessent_clk_mux clock_mux_CLK_OUT_source
(.a(FAST_CLK), .b(SLOW_CLK), .s(TEST_MODE & ~FAST_CAP_MODE), .y(CLK_OUT_source));
tessent_cgc cgc_CLK_OUT
(.clk(CLK_OUT_source), .fe(CLK_OUT_en), .te(1'b0), .clkg(CLK_OUT_G));
tessent_atpg_cc_shift_reg ShiftReg
(.CLK(SHIFT_REG_CLK), .SCAN_EN(SCAN_EN), .CAP_CYCLE_CONFIG(CAP_CYCLE_CONFIG),
.EN(ShiftReg_EN), .SCAN_IN(SCAN_IN), .SCAN_OUT(ShiftReg_SCAN_OUT));
endmodule
wire te_fe;
reg latch;
reg [1:0] R;
assign y = (s) ? b : a;
endmodule
assign EN = |FF;
assign SCAN_OUT = FF[0];
endmodule
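The two clock multiplexer select expressions in the tessent_atpg_clock_controller listing above can be tabulated with a short Python sketch. It mirrors only the mux selection (y = s ? b : a) and the select expressions shown in the listing; the clock gater enables and the final CLK_OUT logic are not part of this excerpt and are not modeled. The function names are hypothetical.

```python
def clk_mux(a, b, s):
    """Mirror of tessent_clk_mux: y = s ? b : a."""
    return b if s else a


def clock_sources(test_mode, scan_en, fast_cap_mode):
    """Return the source selected by each mux in the top-level module."""
    # clock_mux_SHIFT_REG_CLK: s = SCAN_EN | ~FAST_CAP_MODE
    shift_reg_clk = clk_mux("FAST_CLK (gated)", "SLOW_CLK",
                            bool(scan_en) or not fast_cap_mode)
    # clock_mux_CLK_OUT_source: s = TEST_MODE & ~FAST_CAP_MODE
    clk_out_source = clk_mux("FAST_CLK", "SLOW_CLK",
                             bool(test_mode) and not fast_cap_mode)
    return shift_reg_clk, clk_out_source


# (TEST_MODE, SCAN_EN, FAST_CAP_MODE) for three illustrative settings
modes = {
    "shift, slow capture mode": (1, 1, 0),
    "slow capture": (1, 0, 0),
    "fast capture": (1, 0, 1),
}
for name, (tm, se, fcm) in modes.items():
    print(name, clock_sources(tm, se, fcm))
```

The result agrees with the mode descriptions: the shift register is clocked by SLOW_CLK except during fast capture, when it follows the gated FAST_CLK.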
With the synthesis library used for this test case, the synthesized OCC consists of 36 library cells with an
approximate area of 72 two-input NAND equivalent gates.
Gates: 10k
Clocks: 3 internal clocks
Scan chains: 12
o 11 design chains
o 1 clock control condition-bit chain (load only)
Scan flops: 446
Total faults: 48k
Test Coverage:
o Stuck-at: ~99%
o Transition: ~93%
/design
  o /edt_created
      Verilog, dofiles, and procedure files generated during EDT IP creation
  o /gates
      Gate-level netlist
      Synthesized clock controller
  o /rtl
      /clock_controller
        Clock controller RTL
        Simulation test benches and scripts
        Synthesis script
      /pll
        Simple RTL simulation model
/dofiles
  o Command dofile
  o Test procedure file
/library
  o /liberty
      Liberty file
  o /synopsys
      Compiled synthesis library file t18.db is not shipped with this test case due to
      licensing agreements but should be placed in this directory for use by various
      synthesis steps. The t18.db file can be created from the supplied liberty file
      (t18.lib) using Synopsys lc_shell library compiler.
  o /tessent
      Tessent cell library
  o /verilog
      Simulation library
/logfiles
  o Generated log files
Step 2
o Insert clock control logic
Steps 3 and 4
o Create and simulate uncompressed patterns with internal clocks (slow and fast capture)
Steps 5 and 6
o Create, insert, and synthesize EDT hardware
Steps 7 and 8
o Create and simulate compressed patterns with internal clocks (slow and fast capture)
Simulation
Before synthesizing the clock controller RTL, it can be simulated to ensure clock pulses are generated as
expected. To run the simulation, execute the following scripts in the design/rtl/clock_controller
directory.
run_vsim_slow_capture
run_vsim_fast_capture
For the slow capture test, the test bench in the referenced directory sets the CAP_CYCLE_CONFIG signals
to 01 and loads the condition bits to generate a single clock pulse during capture. The waveform for the slow
capture test is shown in Figure 10.
RTL Synthesis
After verification, the RTL is synthesized using a Design Compiler synthesis script in the
design/rtl/clock_controller directory:
run_synthesis
This script will run a simple synthesis based on the supplied ADK library and can be used as a template
for synthesizing the RTL with your technology-specific library. The synthesis script will create the following
gate-level netlist:
design/gates/tessent_occ.vg
NOTE: The compiled synthesis library file t18.db is not shipped with this test case due to licensing
agreements but should be placed in the library/synopsys directory for use by various synthesis
steps. The t18.db file can be created from the supplied liberty file (t18.lib) using Synopsys lc_shell
library compiler.
1.display_occ_muxes
The tool will display the PLL and 3 clock muxes in DFTVisualizer's Design window as seen in Figure 12.
As shown, the test_mode signal selects the functional path to allow the PLL clock to drive design
logic.
The A1 input of each mux is left floating so that the output of the OCC can be connected and selected
when test_mode is set to 1. The OCC insertion script will connect to each OCC mux and supply the slow
and fast clock for test through these muxes which have already been timed and balanced.
2.insert_clock_control
See the insertion dofile for details of various commands in Tessent Shell for insertion and design editing.
#! /bin/sh -f
# Execute Tessent Shell and specify log file and dofile commands
#\
exec tessent -shell -log logfiles/$0.log -replace -dofile "$0"
###################################################################
## SCRIPT TO INSERT CLOCK CONTROLLER INTO THE DESIGN
## TESSENT VERSION 2013.4
###################################################################
set_system_mode insertion
# Create collection of OCC mux input ports where OCCs should be created. For
# each specified connection point, an OCC will be created and its CLK_OUT
# will be connected to the specified connection point.
set clock_connections [get_pins MUX_OCC*/A1]
# Define scan-enable and PLL clock output port that should drive the fast port on clock
# controllers
set pll_clock_port pll_i/pll_clock_180
set scan_enable_port test_se
set test_mode_port test_mode
exit -force
The Tessent Shell insertion script calls the insert_occ procedure in the following Tcl script
(scripts/insert_occ.tcl) to perform OCC insertion and design edits:
###################################################################
## PROCEDURE TO INSERT TESSENT OCC INTO A DESIGN
## TESSENT VERSION 2013.4
###################################################################
# Concatenate condition bit shift registers into new load-only scan chain
create_connection $var_scan_in clock_control_i1/SCAN_IN
for {set i 1} {$i < $num_clocks} {incr i} {
create_connection clock_control_i$i/SCAN_OUT clock_control_i[expr $i + 1]/SCAN_IN
}
set internal_scan_out clock_control_i$num_clocks/SCAN_OUT
# Connect the output of OCC blocks to the input of the clock muxes
for {set i 1} {$i <= $num_clocks} {incr i} {
create_connection clock_control_i$i/CLK_OUT [get_single_name [index_collection \
$clock_connections [expr $i - 1]]]
}
puts $file_dofile "# Define slow clock created by this script. If it already exists and \
connected to the OCC"
puts $file_dofile "# outside of this script, this can be removed but it should be \
defined during ATPG."
puts $file_dofile "add_clocks 0 $var_slow_clock"
puts $file_dofile ""
puts $file_dofile "# Define clock control scan chain as load-only to reduce pins. \
Should be uncompressed if"
puts $file_dofile "# number of OCCs is high to reduce specified bits."
puts $file_dofile "add_scan_chains -load_only clock_control_chain $var_scan_group_name \
$var_scan_in $internal_scan_out"
close $file_dofile
}
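To make the effect of the two connection loops easier to see, the following Python sketch lists the condition-bit chain connections that the procedure creates for a given number of OCCs. It only mirrors the naming pattern used in the Tcl excerpt above; the function name and the scan input port name occ_scan_in are hypothetical.

```python
def occ_chain_connections(scan_in, num_clocks):
    """List (source, destination) pairs for the condition-bit chain.

    Returns the pairs and the pin that becomes the chain output.
    """
    pairs = [(scan_in, "clock_control_i1/SCAN_IN")]
    for i in range(1, num_clocks):
        # SCAN_OUT of OCC i feeds SCAN_IN of OCC i+1
        pairs.append((f"clock_control_i{i}/SCAN_OUT",
                      f"clock_control_i{i + 1}/SCAN_IN"))
    return pairs, f"clock_control_i{num_clocks}/SCAN_OUT"


# "occ_scan_in" is a placeholder for the value of $var_scan_in
pairs, scan_out = occ_chain_connections("occ_scan_in", 3)
for src, dst in pairs:
    print(f"{src} -> {dst}")
print("chain output:", scan_out)
```

For three OCCs this yields one connection from the scan input and two OCC-to-OCC connections, with the SCAN_OUT of the last OCC serving as the internal end of the load-only chain.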
After OCC insertion, the modified design is saved and patterns can be generated using internal clocks.
3.create_patterns
The script can be used to perform slow or fast clock capture and can initialize the design using either
IJTAG/PDL or explicit input constraints. Invocation options described at the top of the script determine which
combination is used. If no arguments are specified, the script initializes the design using
IJTAG/PDL and creates slow-speed capture patterns. The contents of this and other included dofiles are
shown here:
#! /bin/sh -f
#\
exec tessent -shell -dofile "$0" -arg ${1+"$@"}
if {$initialization == "pdl"} {
set_context patterns -ijtag
# Read design netlist and ICL file to set up the design for initialization with IJTAG/PDL
read_verilog design/gates/cpu_scan_occ.vg
read_cell_library library/tessent/adk.tessent_cell
read_icl design/gates/tessent_occ.icl
# Read the PDL file that defines necessary constraint values on boundary of OCC module
source design/gates/tessent_occ.pdl
# Apply the constraint values in PDL to all instances of the OCC
if {$atpg_mode == "slow_capture"} {
set fast_cap_mode_value 0
} elseif {$atpg_mode == "fast_capture"} {
set fast_cap_mode_value 1
} else {
error "Invalid command line argument '$atpg_mode' for 'atpg_mode'. Use '-arguments \
atpg_mode=<value>' to specify. Valid values are 'slow_capture' and 'fast_capture'"
}
foreach occ_inst [get_name_list [get_icl_instances -of_module tessent_occ]] {
set_test_setup_icall [list $occ_inst.setup fast_cap_mode $fast_cap_mode_value \
cap_cycle_config 4] -append
}
} elseif {$initialization == "constraints"} {
# The above IJTAG commands automatically create the below 4 input constraints by tracing
# these control signals from the boundary of each OCC instance to the top of the design
# and enforcing the values defined in the PDL file for each port. This eliminates the
# need for you to map these signals and set the input constraints manually. This
# method also verifies that each OCC's control signal is correctly connected to top.
# Alternatively, you can use the following explicit commands to constrain the top level
# ports assuming that you ensure the connections to all OCCs are correct.
set_context patterns -scan
read_verilog design/gates/cpu_scan_occ.vg
read_cell_library library/tessent/adk.tessent_cell
# Define internal clocks for clock controller
# Script and procedure call to find all Tessent OCCs in design and
# automatically define internal clocks and setup for ATPG
dofile scripts/tessent_occ.tcl
tessent_occ_config $atpg_mode
# Run Design Rule Checks
set_system_mode analysis
report_input_constraints
# Procedure call to configure design based on DRC results
tessent_occ_config $atpg_mode
if {$atpg_mode == "fast_capture"} {
set_external_capture_options -capture_procedure ext_fast_cap_proc
}
# Run ATPG
create_patterns
# Save pattern (SIM_KEEP_PATH includes full pattern file pathnames in test bench)
report_patterns > patterns/patterns_report_$atpg_mode.txt
write_patterns patterns/pat1_${atpg_mode}_parallel.v -verilog -parallel -parameter_list \
{SIM_KEEP_PATH 1} -replace
set_pattern_filtering -sample 2
write_patterns patterns/pat1_${atpg_mode}_serial.v -verilog -serial -parameter_list \
{SIM_KEEP_PATH 1} -replace
exit
Once patterns are generated, test benches are saved for simulation against Verilog. The
SIM_KEEP_PATH parameter ensures that the full pathnames of pattern files are saved in the test bench
so that it is not mandatory to run the simulation from the patterns directory.
The test procedure file used for ATPG is common for all test modes and is stored in the file
dofiles/atpg.testproc. The content of this file is shown below:
//
// Test Procedure File
//
timeplate tmp1 =
// Timeplate globally defines force, measure, and pulse timing for all
// signals. Note the pulse_clock statement is new in 2013.4 release.
force_pi 0 ;
measure_po 2 ;
pulse_clock 16 32;
period 64 ;
end;
procedure test_setup =
timeplate tmp1 ;
cycle =
force test_mode 1;
pulse rst_in;
pulse slow_clock;
end;
end;
procedure shift =
timeplate tmp1 ;
cycle =
force_sci ;
measure_sco ;
pulse slow_clock;
pulse int_clk1;
pulse int_clk2;
pulse int_clk3;
end;
end;
procedure load_unload =
scan_group grp1 ;
timeplate tmp1 ;
cycle =
force test_se 1;
end;
apply shift 44;
end;
Most of the test procedure file is typical of scan designs. In addition, the test procedure file contains
the external capture procedure ext_fast_cap_proc, which ensures that the proper pulses on external clocks
are saved to the pattern file when the fast clock is used for capture.
4.simulate_patterns
The script compiles the design, library, and patterns using ModelSim and verifies that all patterns (serial
and parallel) simulate with no mismatches. The simulation waveform for slow capture pattern number 0 is
shown in Figure 13:
The last 3 signals are the output of the clock control blocks which show a single pulse on the output of
clock_control_i2/CLK_OUT. This matches the output of the report_patterns command which is stored
in the file patterns/patterns_report_slow_capture.txt. The content of this file for the first few
patterns is shown here:
//
// pattern # type cycles loads ... capture_clock_sequence
// --------- ------- ------ ----- ---------------------------------------------
// 0 basic 1 1 [slow_clock,int_clk2,int_pll]
// 1 basic 1 1 [slow_clock,int_clk2,int_pll]
// 2 basic 1 1 [slow_clock,int_clk3,int_pll]
// 3 basic 1 1 [slow_clock,int_clk2,int_pll]
// 4 basic 1 1 [slow_clock,int_clk2,int_pll]
// 5 basic 1 1 [slow_clock,int_clk2,int_pll]
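The capture_clock_sequence column of a report like the one above can be tallied with a short script to see how often each internal clock is pulsed. The Python sketch below parses the six pattern lines shown here; the regular expression is an assumption based on this excerpt only and may need adjusting for the complete report format, whose middle columns are elided above.

```python
import re
from collections import Counter

REPORT = """\
// 0 basic 1 1 [slow_clock,int_clk2,int_pll]
// 1 basic 1 1 [slow_clock,int_clk2,int_pll]
// 2 basic 1 1 [slow_clock,int_clk3,int_pll]
// 3 basic 1 1 [slow_clock,int_clk2,int_pll]
// 4 basic 1 1 [slow_clock,int_clk2,int_pll]
// 5 basic 1 1 [slow_clock,int_clk2,int_pll]
"""

# Pattern number, then any columns, then the bracketed clock list
LINE_RE = re.compile(r"^//\s+(\d+)\s+\S+.*\[([^\]]*)\]\s*$")


def capture_clocks(report_text):
    """Map pattern number to its list of capture clocks."""
    result = {}
    for line in report_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            result[int(match.group(1))] = match.group(2).split(",")
    return result


clocks = capture_clocks(REPORT)
usage = Counter(c for seq in clocks.values()
                for c in seq if c.startswith("int_clk"))
print(usage)
```

For the lines shown, int_clk2 is pulsed in five patterns and int_clk3 in one.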
We can see that scan pattern number 0 pulses the top-level slow_clock and internal clocks int_clk2
and int_pll. Note that the top level reference_clock which is a free-running clock is also pulsed in the
Similar observations can be made for fast capture pattern number 0 which pulses in_clk2 followed by
int_clk2. The waveform for this pattern is shown in Figure 14:
5.insert_edt_ip
6.synthesize_edt_ip
The IP creation step runs Tessent Shell with slow clock and fast clock setup files in order to automatically
create the ATPG files for both clock speeds. Since only one compression hardware design file is needed,
only the IP creation step with slow clock writes the compression hardware to a file. This file is synthesized
in step 6.
NOTE: The compiled synthesis library file t18.db is not shipped with this test case due to licensing
agreements but should be placed in the library/synopsys directory for use by various synthesis
steps. The t18.db file can be created from the supplied liberty file (t18.lib) using Synopsys lc_shell
library compiler.
The next step is to create compressed patterns using the slow and fast capture clocks [step 7]:
7.create_edt_patterns
Similar to the 3.create_patterns script, the default behavior of the pattern generation script is to use the
ICL and PDL descriptions of the OCC to constrain the inputs of each OCC. This simplifies the flow by
automatically finding all OCCs in the design and ensuring that their control inputs defined in the PDL are
properly constrained. An alternative method is to use explicit input constraints in the dofile. This
requires the user to ensure that the OCCs are connected to the top level; it works for directly connected
pins but may be more complicated in other cases. The ICL/PDL method is a more general solution which also
handles hardware that is initialized through IJTAG or TAP controllers.
To use input constraints in the pattern generation script, specify the initialization=constraints
argument on the command line. For example:
7.create_edt_patterns initialization=constraints
7.create_edt_patterns initialization=constraints atpg_mode=fast_capture
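The argument handling described above (name=value pairs, defaults of IJTAG/PDL initialization and slow capture, and rejection of invalid values) can be illustrated with a small Python sketch. This is not the script's actual implementation, which is a Tessent Shell dofile; the function and table names are hypothetical, and only the option names and values come from this note.

```python
VALID = {
    "initialization": ("pdl", "constraints"),
    "atpg_mode": ("slow_capture", "fast_capture"),
}
# Defaults as described in this note: IJTAG/PDL setup, slow capture
DEFAULTS = {"initialization": "pdl", "atpg_mode": "slow_capture"}


def parse_script_args(args):
    """Parse name=value arguments, applying defaults and validating."""
    options = dict(DEFAULTS)
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep or name not in VALID:
            raise ValueError(f"Unrecognized argument '{arg}'")
        if value not in VALID[name]:
            raise ValueError(
                f"Invalid value '{value}' for '{name}'. "
                f"Valid values are {VALID[name]}")
        options[name] = value
    return options


print(parse_script_args([]))
print(parse_script_args(["initialization=constraints",
                         "atpg_mode=fast_capture"]))
```

Arguments that are omitted keep their defaults, so passing only initialization=constraints still produces slow capture patterns.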
The final step is to simulate the compressed patterns to ensure no mismatches exist. Script 8 simulates
all compressed patterns:
8.simulate_edt_patterns
To run all steps in the test case, you can use the 0.run_all_steps script. As stated earlier, however, the
synthesis library is not part of the test case and must be set up properly before running step 6 to synthesize
the EDT hardware.
To obtain the complete test case for the flow described in this application note, use the download link in
section 1 of this document.