
Dealing with Growing Design Complexity: Hierarchical DFT for Multicore Design
Juhee Han
Senior Engineer
Samsung Electronics
May 2018
Motivation
 GPUs have recently become larger and more complex, so they need a long TAT and more resources
 Despite the long run time, test results (coverage/pattern count) are not good
 For the next project
— Reduce design TAT
– Optimize/reduce ATPG and verification run time
— Improve test quality
– High test coverage with low pattern count
— Find an optimized DFT solution

Samsung Foundry Services
 Samsung 8-nanometer LPP (low power plus)
— Provides an excellent choice for performance, power, and area
 Mentor Tessent TestKompress leveraged in 8LPP reference flow
— Dramatic test time savings for large designs
— Important for mobile, high-speed network/server, cryptocurrency, and autonomous driving
 SFF local event next week

4 JH, Samsung Hierarchical DFT, May 2018


HIERARCHICAL DFT
Overall methodology
 Problem – large designs cause long TAT and high resource usage
— Coverage also suffers
 Solution
— Plug-and-play approach with hierarchical DFT
— Create DFT IPs and patterns at the core and reuse/retarget to top
— Create patterns for duplicate blocks once and map to all copies
 Methodology
1. Insert TestKompress, MBIST, and OCC at RTL
2. Share input channels
3. Retarget patterns



TestKompress skeleton flow with shared-inputs
 Skeleton flow
— Compression logic and input sharing logic inserted at RTL by Tessent Shell
— One-pass synthesis flow by 3rd party doesn’t need to be broken
 Shared-input channels
— Identical core compressor inputs shared
— Less pin limitation leads to a lower compression ratio (internal scan chain count / input channel count)

[Figure: ① RT level insertion, ② Input sharing. At top, TOP_SI[1:0] is shared by top/inst_a, top/inst_b, and top/inst_c. Each instance is a module sub_blk with inputs sub_si[1:0], an EDT instance edt_i, and output sub_so, driving the dedicated outputs TOP_SO_inst_a, TOP_SO_inst_b, and TOP_SO_inst_c.]
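The compression ratio on this slide can be illustrated with a small calculation. The pin, core, and chain counts below are hypothetical numbers chosen only for illustration; they are not from the presentation. The sketch assumes that without sharing the top-level input pins are split evenly among the cores, and that with sharing every core sees all of them.

```python
def compression_ratio(internal_scan_chains: int, input_channels: int) -> float:
    """Ratio as written on the slide: internal scan chains / input channels."""
    if input_channels <= 0:
        raise ValueError("input_channels must be positive")
    return internal_scan_chains / input_channels


# Hypothetical numbers, for illustration only.
TOP_INPUT_PINS = 6      # scan input pins available at top level
CORES = 3               # identical core instances (inst_a, inst_b, inst_c)
CHAINS_PER_CORE = 120   # internal scan chains in each core

# Dedicated inputs: the pins are divided among the cores.
dedicated_channels = TOP_INPUT_PINS // CORES
# Shared inputs: every identical core is driven by all of the pins.
shared_channels = TOP_INPUT_PINS

dedicated_ratio = compression_ratio(CHAINS_PER_CORE, dedicated_channels)
shared_ratio = compression_ratio(CHAINS_PER_CORE, shared_channels)

print(f"dedicated inputs: {dedicated_channels} channels/core, ratio {dedicated_ratio:.0f}x")
print(f"shared inputs:    {shared_channels} channels/core, ratio {shared_ratio:.0f}x")
```

With these assumed numbers, sharing triples the channels each core sees, so the same chain count is reached at one third of the compression ratio.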

Pattern retargeting
 Pattern retargeting
— Core level ATPG patterns are retargeted to top level patterns
— Painful top level ATPG is no longer needed for core test

[Figure: ① RT level insertion, ② Input sharing, ③ Pattern retargeting. A core level pattern table for top/inst_a (columns sub_si[1], sub_si[0], sub_so) is mapped to a top level pattern table (columns TOP_SI[1], TOP_SI[0], TOP_SO_inst_a, TOP_SO_inst_b, TOP_SO_inst_c) for the three sub_blk instances top/inst_a, top/inst_b, and top/inst_c.]
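The mapping in the figure can be sketched in a few lines of Python. This is a simplified model and not how Tessent implements retargeting: it assumes shared inputs are broadcast unchanged to every instance, each instance has its own top level output, and there are no pipeline stages or inversions on the path. The signal names come from the slide; the pattern values are illustrative.

```python
from typing import Dict, List, Tuple, Union

# One core level pattern row: (sub_si[1], sub_si[0], expected sub_so).
CorePattern = Tuple[int, int, str]
TopPattern = Dict[str, Union[int, str]]

INSTANCES = ["inst_a", "inst_b", "inst_c"]


def retarget(core_patterns: List[CorePattern],
             instances: List[str] = INSTANCES) -> List[TopPattern]:
    """Map core level rows to top level rows for identical cores.

    Shared inputs: one stimulus value per top level pin feeds all instances.
    Dedicated outputs: the expected value is repeated for each instance.
    """
    top_patterns: List[TopPattern] = []
    for si1, si0, expected_so in core_patterns:
        row: TopPattern = {"TOP_SI[1]": si1, "TOP_SI[0]": si0}
        for inst in instances:
            row[f"TOP_SO_{inst}"] = expected_so
        top_patterns.append(row)
    return top_patterns


# Illustrative core level pattern (values are assumptions, not from the slide).
core = [(1, 0, "H"), (0, 0, "L"), (1, 1, "H")]
top = retarget(core)
for row in top:
    print(row)
```

Because the cores are identical and share inputs, the pattern is generated once and the top level pattern count equals the core level pattern count.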

TestKompress skeleton flow



EDT (TestKompress) generation
 Using skeleton netlist and atpglib, dofile
 Define channel count and EDT option
— Input/output channel count
— bypass ON/OFF
 Power option: set_edt_power
— Shift enable
— Min_switching_threshold_percentage N
 Low pin count controller
— Option to reduce pin count

[Figure: generated outputs are RTL (DECOMPRESSOR, Xpress Compactor) and a synthesis script (_dc.scr).]
Pipeline stage insertion
 Pipeline can be inserted when EDT integrated
create_instance edt_channel_in_${i}_pipe -of_module pipe
create_connection edt_channel_in_${i}_pipe/CLK slow_clk_stk
delete_connections edt_channel_in_$i
create_connection edt_channel_in_${i}_pipe/D edt_channel_in_$i

[Figure: DECOMPRESSOR and Xpress Compactor with clocks head_pipe_clock, edt_clock, and tail_pipe_clock.]
Synthesis
 Using EDT inserted RTL
 Set don’t touch on EDT and pipeline stages
— set_dont_touch [get_cells -hier -filter "full_name =~ *edt_i* and is_hierarchical == false"]
— set_boundary_optimization [get_cells -hier -filter "full_name =~ *edt_i* and is_hierarchical == true"] false
 Scan insertion
— Scan chain stitching
— Wrapping a core
Top level EDT generation
 Same as block level EDT generation flow
 Integration
— EDT insertion
— Channel connection for shared input/dedicated output
— (Pipeline insertion: pipeline stage balancing is not needed)
— Bypass mode channel connection
 Using tessent -shell

[Figure: four DECOMPRESSOR blocks.]
PATTERN RETARGETING
Generate core level pattern
[Figure: core level pattern table for top/inst_a with columns sub_si[1], sub_si[0], and sub_so.]

 Core internal test mode (wrapper in-test mode)
— To retarget a core pattern, wrapper cells should be integrated
— Core should be in internal test mode
— set_current_mode current_mode -type internal
— Automatically assigns ‘X’ on PIs and masks POs
 Core description information
— Read Tessent IP (EDT/OCC etc.) information (TCD)
— Read scan structure and procedure (stil2mgc)
 Core level pattern generation
— Recommended pattern format is PATDB (binary)
— Save fault list

Core level pattern
[Figure: core level pattern table for top/inst_a with columns sub_si[1], sub_si[0], and sub_so.]

 Constraints for retargeting
— Only scan data can be retargeted
— Clock/Reset/ScanEnable should hold the same value across all patterns
— Special care for asynchronous reset signal test
– Reset test logic should be inserted, or a dedicated reset test pattern should be generated
Pattern retargeting to top level

[Figure: core level pattern table for top/inst_a (columns sub_si[1], sub_si[0], sub_so) mapped to the top level pattern table (columns TOP_SI[1], TOP_SI[0], TOP_SO_inst_a, TOP_SO_inst_b, TOP_SO_inst_c).]

 Set pattern retargeting mode
 Read top and core design
— Read top and core (gray box or black box) netlist
 Read core scan information and mapping
— Read core TCD file from core level pattern generation step
— Sub core instances can be recognized together
 Set top test mode constraints
— Setting top level test mode signal or TDR register
 Insert top level capture procedure
— Clock information is not ported automatically
— DRC does not check whether the clock waveform is the same as in the core pattern
 Core level pattern retarget
— Read core level pattern and re-write pattern as top level
Summary
 Hierarchical DFT flow enables plug-and-play automation
 Pattern retargeting maps core pattern to top
— Automatically adjusts for pipelines and inversions
 ATPG performed at core level early in the design flow
 Full top level netlist not needed
 Tessent hierarchical DFT flow with TestKompress, OCC, diagnosis, and MemoryBIST is available in Samsung Foundry 8LPP
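The summary mentions that retargeting adjusts for pipelines and inversions. The sketch below shows one way to picture that adjustment on a single channel. It is a toy model under stated assumptions, not Tessent's algorithm: each pipeline stage delays data by one shift cycle, an inverting path flips every bit, and the function names are hypothetical.

```python
from typing import List

FLIP = {"H": "L", "L": "H", "X": "X"}


def adjust_stimulus(core_bits: List[int], pipeline_stages: int,
                    inverted: bool) -> List[int]:
    """Top level bits to drive so that core_bits reach the core input.

    Inversion on the path is compensated by pre-inverting each bit.
    Extra padding cycles are appended so the last bit is pushed through
    every pipeline stage (the pad value 0 is an arbitrary choice).
    """
    bits = [b ^ 1 for b in core_bits] if inverted else list(core_bits)
    return bits + [0] * pipeline_stages


def adjust_expected(core_expected: List[str], pipeline_stages: int,
                    inverted: bool) -> List[str]:
    """Top level expected values for one core output channel.

    The first pipeline_stages cycles carry no core data yet, so they are
    masked with 'X'; the core values follow, flipped if the path inverts.
    """
    values = [FLIP[v] for v in core_expected] if inverted else list(core_expected)
    return ["X"] * pipeline_stages + values


# Illustrative use: two input pipeline stages on an inverting path,
# one output pipeline stage on an inverting path.
print(adjust_stimulus([1, 0, 1], pipeline_stages=2, inverted=True))
print(adjust_expected(["H", "L"], pipeline_stages=1, inverted=True))
```

The point of the model is that the core level pattern data itself is unchanged; only timing and polarity are corrected at the top level boundary.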



Thank you
Please complete a session survey in the U2U Conference App.
Known Limitation
 Reset – constrained to its inactive state in core retargeting patterns
— Work-around – reset test
– Use a named capture procedure
– Hold all other clocks in their off state and toggle only reset
– Logic insertion for reset test
– Insert the two OR gates shown in red in the figure during EDT integration, and insert a dedicated wrapper cell on the reset port
– When LPCT is not used, use scan enable instead
– For stuck-at test, set Async_set_reset_static_disable = 0 and toggle reset during the test; for transition test, set Async_set_reset_static_disable = 1 so that reset cannot toggle
– Use different logic when the reset active state is reversed
– Identify the reset types used in the design

[Figure: reset test logic. Labels: Dedicated Input Wrapper cell (DIW_sff1), RESET, Data_out, Scan_in, Scan_out, Scan_en_in, Async_set_reset_static_disable, Test_en, LPCT_capture_en, Clock, Sff1 to Sffn.]
Known Limitation
 No DRC check that the external capture clock waveform is preserved
— If an external ATE clock is used for capture, there is no DRC to check whether the core level capture procedure is preserved correctly at top level
For example:
Suppose you generate a core level pattern that uses a double capture clock procedure driven by the ATE clock due to SWT, and then retarget the pattern to top with the wrong capture procedure (for example, a single capture procedure inserted by mistake). Because there is no DRC check for the external capture clock, the mistake only shows up in the simulation step as an expected data mismatch.
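Since the tool does not flag this case, a user-side sanity check before simulation is one possible mitigation. The sketch below is a hypothetical helper, not a Tessent feature: it assumes you have already extracted the number of capture clock pulses from the core level and top level capture procedures by some means, and it only compares those counts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CaptureProcedure:
    """Minimal, hypothetical description of a capture procedure."""
    name: str
    clock: str    # clock pin name (core pin or top pin, so names may differ)
    pulses: int   # number of capture clock pulses in the procedure


def check_capture_preserved(core: CaptureProcedure,
                            top: CaptureProcedure) -> List[str]:
    """Return a list of problems; an empty list means the counts match."""
    problems: List[str] = []
    if core.pulses != top.pulses:
        problems.append(
            f"capture pulse count differs: core '{core.name}' has "
            f"{core.pulses}, top '{top.name}' has {top.pulses}"
        )
    return problems


# Illustrative case from the slide: double capture at core, single at top.
# The procedure and clock names are made up for the example.
core_proc = CaptureProcedure("core_double_capture", "clk", pulses=2)
top_proc = CaptureProcedure("top_single_capture", "TOP_CLK", pulses=1)

for problem in check_capture_preserved(core_proc, top_proc):
    print("WARNING:", problem)
```

A check like this catches only the pulse-count mismatch described above; it says nothing about pulse timing or waveform shape, which would still need simulation.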

