
Logic BIST for Large Industrial Designs: Real Issues and Case Studies

Nagesh Tamarapalli, Mark Kassab, Abu Hassan, and Janusz Rajski
Mentor Graphics Corporation, 8005 S.W. Boeckman Road, Wilsonville, OR 97070, USA
Abstract
[1]. Note that manufacturing test is applied to every device multiple times: at different voltage levels, at the wafer, on the packaged device, etc. The manufacturing test cost is incurred for every manufactured device and might be as high as 25-30% of the total manufacturing cost.

Logic built-in self-test (BIST) is based on scan as the fundamental DFT methodology [2,3,4,5]. Initially, the most compelling reason for adopting BIST was the requirement to perform in-field testing. Recently, there has been growing interest in BIST because it can reduce the cost of manufacturing test and improve the quality of the test by providing at-speed testing capability. In BIST, pseudorandom patterns are generated on chip, the responses are compacted on chip, and the control signals are driven by an on-chip controller. The amount of test data exchanged with the tester is therefore drastically reduced. In addition, the scan cells are configured into a large number of relatively short scan chains, thus reducing the time required to apply a single test pattern. The low memory and performance requirements on the tester allow very low cost testers to be used for manufacturing test of designs with logic BIST.

Logic BIST is based on pseudorandom patterns and involves compaction of test responses. Those two characteristics impose more stringent design rules on the BISTed logic than scan with stored patterns. Logic BIST requires that bus conflicts are eliminated, that sources of X states are properly bounded to prevent corruption of the signatures, that the circuit is random-pattern testable, etc. In many cases, the original design does not satisfy many of these requirements, thus posing barriers to BIST. In those cases, and in general, the only practical way to implement BIST is through automation of the design tasks and their integration in the overall methodology and design flow.

The introduction of logic BIST at the Texas Instruments MOS design center is driven by limitations of the currently used test equipment and a number of specific goals. In particular, the testers currently used already limit the ability to run available tests in the following ways:
1. Scan operates at a maximum frequency of 50 MHz.
2. Tester scan memory is usually filled.
3. Tester has a maximum of 8 scan chains, resulting in a long test application time for large designs.
4. Tester functional test memory is also filled, leading to utilization of as little as 10% of the available functional tests.

Paper 142

ITC INTERNATIONAL TEST CONFERENCE

358

0-7803-5753-1/99 $10.00 ©1999 IEEE

5. Transition fault or path delay scan ATPG patterns are not used due to lack of tester memory.

All these problems are constantly getting worse. They could be solved by investing in tester technology. However, logic BIST is an attractive alternative solution as it removes most of the tester limitations. Given the current design environment and ATPG practice, the following basic logic BIST goals were derived:
G1. Eliminate tester memory and frequency limitations.
G2. Solution provides at-speed scan testing.
G3. Solution works for 1-2 million logic gate designs.
G4. BIST stuck-at grade ≥ 95%.
G5. Logic BIST area overhead ≤ 2% of logic.
G6. Silicon BIST run time < 1 second.
G7. Engineering effort < 2 person months per design.

It is also important that logic BIST fits seamlessly into the current design process and that the overnight design synthesis time is not compromised, hence the following additional flow-related goals:
G8. Ability to use ATPG or logic BIST tests.
G9. Minimal impact on current design methodology.
G10. Automation of the logic BIST flow.
G11. Additional RTL-to-gates run time < 2 hours.
G12. Logic BIST fault grade run time < 12 hours.
G13. Logic BIST Ikos™ simulation time < 12 hours.
G14. BIST can be run on a very low cost tester.

In Section II, the logic BIST architecture is presented with particular emphasis on the controller and its ability to support multi-clock multi-frequency designs. Section III covers generation of the BIST-ready core, including insertion of test points, bounding of X generators, and handling of primary inputs and outputs. Section IV is devoted to a detailed presentation of four case studies. Finally, conclusions are presented in Section V.

II. LOGIC BIST ARCHITECTURE

A. Generic scan based logic BIST architecture

A generic single clock logic BIST architecture based on the well known STUMPS technique [6] is illustrated in Figure 1. The figure depicts the circuit-under-test, or core, and the logic BIST controller in the highlighted area. The circuit is composed of combinational logic, and possibly embedded memories, separated by multiple scan chains. The components of the logic BIST controller include the test pattern generation block, composed of the pseudo-random pattern generator (PRPG) and phase shifter circuit, and the output response analysis block, composed of the multiple-input signature register (MISR), space compactor, and optional AND gates. In addition, there are two counters: the pattern counter, and the shift counter, which for each pattern keeps track of the number of cycles required to fill the scan chains. The decoder block shown in the figure drives the test points. Finally, the multiplexers between the phase shifter and scan inputs are used to concatenate several shallow BIST-mode scan chains into a few deep ATPG-mode scan chains accessed directly from the chip pins in case top-up ATPG is used to improve the fault coverage obtained by BIST.

Figure 1: Generic scan-based logic BIST architecture.

The BIST can be initiated either through a boundary scan TAP controller or, if a stand-alone logic BIST controller is implemented, by appropriately asserting a set of new primary inputs. Prior to running the actual test, controller components such as the PRPG, MISR, and pattern counter need to be initialized. The internal scan chains can optionally be initialized as well. The actual test of the circuit, consisting of several patterns, then begins. For each pattern the shift counter counts (Nsc + Ncc) cycles, where Nsc, the number of cycles in the shift window, is equal to the length of the longest scan chain, and Ncc, the number of cycles in the capture window, is typically equal to one for a simple capture window. Hence, to reduce the test application time, it is necessary to configure the scan cells into a large number of shallow scan chains. A systematically designed phase shifter circuit [6,7] is placed between the LFSR and the scan chain inputs to eliminate structural dependencies and allow a large number of scan chains to be driven by a relatively short LFSR. Similarly, an XOR structure called a space compactor is required to compact the large number of scan outputs before feeding them to a small MISR. As with the phase shifter, care must be taken in designing the space compactor to avoid loss of test coverage due to fault masking.

During the shift window of a pattern, new pseudo-random values from the PRPG are loaded into the scan chains while the circuit's response to the previous pattern is simultaneously unloaded and compacted into the MISR. If the internal scan chains are not initialized, their unknown contents can be blocked for the first pattern by means of AND gates in front of the MISR, as shown in Figure 1. After the scan chains are completely loaded, the multiplexers in the scan cells are placed in system mode for one cycle to capture the circuit's response. This sequence of events continues for each pattern. In addition, if multi-


the capture window of a test pattern. It is shown that, unlike previous methods [10], achieving at-speed test of the circuit does not require at-speed shift of the scan chains. In fact, only events in the capture window are crucial to at-speed testing of the circuit.

Figure 2: Multi-frequency logic BIST controller.

The timing diagram shown in Figure 3 illustrates the partitioning of a test pattern into a shift window and a programmable capture window. The shift window comprises the multiple shift operations required to load/unload the scan chains. These shift operations can be performed at the frequency of any of the three clocks or their sub-multiples. This freedom in selecting the shift frequency provides a trade-off between the design of the scan chains and the test application time. In the example timing diagram in Figure 3, scan chains are shifted at frequency F2 of clock clk2. Note that memory elements in scan chain SC1 use the faster frequency F1 during functional operation. This frequency is reduced to F2 in the shift window through clock suppression. Clock suppression in this case suppresses every other pulse of clock sys_clk1 to generate the slower clock clk1 of frequency F2. Scan chain SC2 is clocked by clk2, which is driven by sys_clk2. Since the frequency of sys_clk2 is F2, no modification of this clock is necessary for the shift window. Finally, scan chain SC3 is driven by a slower clock sys_clk3 in the system mode. Clock multiplexing is used to drive clk3 with sys_clk2 of frequency F2 during the shift window. The timing diagram of clk3 shows the effect of multiplexing in the faster clock.

Figure 3: Multi-frequency logic BIST timing diagram (load-unload window and programmable capture window for vector N).

The programmable capture window comprises captures in different clock domains and some shift operations to create inter-domain at-speed capture. The functional clock of each domain is used to obtain a shift followed by a capture. These two consecutive events using the functional clock guarantee that every intra-domain path can be tested at-speed; i.e., the time between the launch and capture events is equal to one functional clock period. Also, to test inter-domain paths, at-speed clock edges are placed appropriately as shown in Figure 3. The exact placement of clock edges for at-speed test of all nine relations is detailed in Table I. Each table cell T[i, j], corresponding to launch clock clki and capture clock clkj, lists the position of the launch pulse of clki followed by the capture pulse of clkj. All the positions are described in terms of pulses of the fastest clock, sys_clk1. For example, the capture edge for clk3 is the rising edge of the second clock pulse of clk3 in the capture window, which is equivalent to the rising edge of the fifth clock pulse of sys_clk1. As can be seen, in the capture window clock suppression is used to suppress some pulses of clk1 and clk2, while no pulses of clk3 are suppressed.

The scan enable signals for each of the clock domains switch to system mode prior to their respective capture edges. Since the scan enable signals have to be routed to all the scan cells in the circuit, their design constraints can be relaxed by opting for slow scan enables. In the example timing diagram shown in Figure 3, scan enable Sen1 is designed to be fast, i.e., it has half a cycle of the fastest clock to settle, whereas the scan enables Sen2 and Sen3 are designed to be slow, i.e., they have one and a half cycles of the fastest clock to settle.

One of the important advantages of the programmable capture window is its robustness against clock skew. Figure 3 illustrates that whenever any clock domain is capturing data, the other clock domains do not have an active edge. Thus the capture edge, unlike in previously proposed solutions, is not susceptible to inter-domain clock skew. In addition, the capture window can be programmed to perform multiple captures in each domain as well as to control slow scan enables. Performing multiple captures reduces the risk of delay test invalidation and of false paths that might occur due to illegal states in scan chains resulting from filling them with pseudo-random values from the PRPG. Slow scan enables, by providing multiple cycles of the fastest clock for the scan enable signals to settle, reduce constraints on their design. They no longer need to be routed as clock signals. Note that the programmability of the capture window can also be used to handle a circuit containing multiple clock domains of the same frequency.

In order to generate appropriate clock control and scan enable signals, the pattern and shift counters have to operate using the fastest clock. Also, unlike in the previous controller, the number of cycles in the shift window, Nsc, is not necessarily equal to the length of the longest scan chain. Nsc depends on the longest effective scan length as determined by the frequency used for shifting the scan chains. Similarly, Ncc, the number of fastest clock cycles in the capture window, is usually more than one. On completion of scan chain loading, a sequence of events is launched in the capture window to perform at-speed testing of intra- and inter-domain logic. Once a predetermined number of patterns has been applied, the contents of the MISR can be, as explained earlier, scanned out and compared externally or compared with an on-chip golden signature.
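The relationship between Nsc, Ncc, and test application time can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the shift clock is an integer sub-multiple of the fastest clock, so that each shift occupies a whole number of fastest-clock cycles. The chain length, capture window length, and frequencies in the usage example are made-up values, not figures from the case studies.

```python
def effective_shift_cycles(longest_chain, f_fast_hz, f_shift_hz):
    """Fastest-clock cycles needed to load/unload the longest scan chain (Nsc).

    Assumption: the shift frequency is an integer sub-multiple of the
    fastest clock, so every shift takes f_fast / f_shift counter cycles.
    """
    if f_shift_hz <= 0 or f_fast_hz % f_shift_hz != 0:
        raise ValueError("shift clock must be an integer sub-multiple of the fastest clock")
    return longest_chain * (f_fast_hz // f_shift_hz)


def bist_test_time(num_patterns, n_sc, n_cc, f_fast_hz):
    """Seconds needed to apply num_patterns patterns.

    Each pattern occupies (Nsc + Ncc) cycles of the fastest clock,
    which is the clock that drives the pattern and shift counters.
    """
    if f_fast_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return num_patterns * (n_sc + n_cc) / f_fast_hz


if __name__ == "__main__":
    # Hypothetical example: 100-cell chains shifted at half the fastest clock,
    # an 8-cycle capture window, and 262,144 patterns at 150 MHz.
    n_sc = effective_shift_cycles(100, 150_000_000, 75_000_000)
    seconds = bist_test_time(262_144, n_sc, 8, 150_000_000)
    print(f"Nsc = {n_sc} cycles, test time = {seconds:.3f} s")
```

Under these assumptions, halving the shift frequency doubles Nsc, which is the trade-off between scan chain design effort and test application time noted earlier.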

III. GENERATING A BIST-READY CIRCUIT


In addition to having scan, a BIST-ready circuit should be random pattern testable and should have no unknown values propagating to observable points. In this section, those and other barriers to the implementation of logic BIST will be discussed and automated solutions to overcome them will be presented.

A. Random-pattern resistance
Logic BIST is in general based on pseudo-random patterns. Most circuits, however, have inherent random-pattern resistance, which results in relatively poor test coverage. To achieve test coverage approaching that of ATPG, control and observe points are added to the circuit to increase its susceptibility to random pattern testing.
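A back-of-the-envelope calculation shows why some faults resist random patterns and why a test point helps. The sketch below is not the test point selection method of this paper; it is a simplified model that assumes every input of an AND gate is independently 0 or 1 with equal probability on each pattern, which real PRPG sequences only approximate. The function names are hypothetical.

```python
def detection_probability_per_pattern(num_inputs):
    """Probability that one random pattern sets all inputs of an AND gate to 1.

    Driving the output to 1 is what is needed to detect a stuck-at-0
    fault on it. Assumes independent, equiprobable inputs.
    """
    return 0.5 ** num_inputs


def escape_probability(p_detect, num_patterns):
    """Probability that a fault is still undetected after num_patterns patterns."""
    return (1.0 - p_detect) ** num_patterns


if __name__ == "__main__":
    patterns = 262_144
    # A 32-input AND is almost never excited by random patterns.
    wide = escape_probability(detection_probability_per_pattern(32), patterns)
    # Observing a 16-input sub-cone instead makes its faults far easier to detect.
    narrow = escape_probability(detection_probability_per_pattern(16), patterns)
    print(f"32-input cone escape probability: {wide:.5f}")
    print(f"16-input cone escape probability: {narrow:.5f}")
```

In this simplified model, shortening the cone that must be sensitized, which is the effect a control or observe point has, changes the fault from practically undetectable to very likely detected within the same pattern budget.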

Table I: Edge placement for intra- and inter-domain at-speed test.

Control points
A control point is inserted on a signal that has a very high probability of logic 0 or 1 if this predominant value causes poor controllability or observability of a sufficiently


Figure 4: Connecting observe points to existing scan cells.

OR gate can similarly be used. 3. AND or OR gates can be inserted to force a constant 0

value, since they are only asserted in certain phases, they do not block fault propagation for the entire test session. In addition to the low area overhead of the control points in MTPI, typically fewer are required than when test points are selected using other methods. MTPI's control points are also unlikely to introduce timing problems in test mode, since they force constant values when activated and the signals driving them are only changed during the relatively long scan load/unload cycle. Note that inserting control points can affect a circuit's timing, since they add delays along functional paths. However, it is possible to prevent insertion of control points along critical paths or in critical blocks.

B. X generators
An essential requirement for a BIST-ready circuit is that it should not generate any observable unknown states. If an X propagates to the MISR, it corrupts the signature and makes it impossible to distinguish faulty and fault-free circuits. Therefore, test logic must be inserted to suppress unknown states or prevent them from propagating to an observable point. Typical potential X generators include the following:
1. Non-scan flip-flops (FFs).
2. RAMs and CAMs.
3. Combinational loops.
4. Undriven primary inputs.
5. Bus contention.
6. Violation on a wired gate.

Potential X generators can be identified by a design rule checker. Preventing those X sources from propagating to the MISR can be accomplished using several methods, which trade off area overhead against loss of test coverage.

Bounding X generators
After identifying potential X generators, analysis is performed to determine which of those X sources need to be bounded. An X generator only needs to be bounded if its value can propagate to an observable point, or if an observe point can be inserted such that the X generator becomes observable. A trade-off is available for X sources which are already blocked at a nearby location. In this case, the X generator will only become observable if an observe point is added between the X source and the locations at which it is blocked, so simply excluding all gates in this region from the observe point candidates eliminates the problem without bounding the source. The threshold used to determine whether to re-bound a blocked X generator or exclude its blocked fanout region from consideration for observe points can be set by the user.

X generators which must be bounded are handled by inserting one or more control points before the X can propagate to an observable point. For example, if a non-scan FF has two outputs (Q and its complement), one control point can be inserted on each of the outputs and activated in test mode. Alternatively, if the FF has asynchronous set/reset pins, a control point can be added to force the FF to 0 or 1 during the test. While a control point can be added to force a constant value, it is recommended for higher test coverage to insert a MUX control point driven by a nearby existing scan cell, as explained in the control points section.

This method of X bounding ensures that no Xs will be observed. However, it does not provide a means of observing faults which can only propagate to an observable point through the now-blocked X source. This can result in loss of test coverage. If the number of such faults for a given bounded X generator justifies the cost, one or more observe points can be added before the X source to provide an observation point to which those faults can propagate.

Handling embedded memories
Embedded memories, typically RAMs, can act as X generators. However, bounding their outputs can severely impact test coverage, as faults which only propagate to the RAM will not be testable. This includes faults propagating to the RAM's data lines as well as to its address and control lines. The preferred method for handling embedded RAMs is to bypass them in test mode. The RAM inputs are connected to scan cells for observation. The inputs can be connected to space compactors (XOR trees) before connecting them to the scan cells, to reduce the number of scan cells required. Those same scan cells are used to drive the outputs of the RAMs in test mode. Therefore, in test mode, the RAM's inputs and outputs become pseudo primary outputs and pseudo primary inputs, respectively: the logic driving the RAM is observed and the logic driven by the RAM is controlled. This is illustrated in Figure 5. It is assumed that some other DFT methodology, typically memory BIST, is used to test the RAM itself.
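The reason that a single unbounded X is fatal to signature analysis, as stated at the start of this subsection, can be seen with a toy MISR model. The sketch below is a generic Galois-style MISR, not the MISR used in the designs described here; the 8-bit width, the feedback polynomial x^8 + x^4 + x^3 + x^2 + 1, the function names, and the response values are all assumptions chosen for illustration.

```python
def misr_signature(responses, width=8, taps=(0, 2, 3, 4)):
    """Compact a sequence of width-bit response words into one signature.

    Assumed structure: a left-shifting register whose shifted-out bit is
    fed back into the tap positions, with one response word XORed in per cycle.
    """
    mask = (1 << width) - 1
    state = 0
    for word in responses:
        msb = (state >> (width - 1)) & 1  # bit shifted out this cycle
        state = (state << 1) & mask
        if msb:
            for tap in taps:              # apply the feedback polynomial
                state ^= 1 << tap
        state ^= word & mask              # fold in one bit per scan chain output
    return state


def possible_signatures(responses, word_index, bit):
    """Signatures obtained when one response bit is unknown (0 or 1)."""
    signatures = set()
    for value in (0, 1):
        trial = list(responses)
        trial[word_index] = (trial[word_index] & ~(1 << bit)) | (value << bit)
        signatures.add(misr_signature(trial))
    return signatures


if __name__ == "__main__":
    responses = [0x3A, 0xC5, 0x0F, 0x71]  # made-up fault-free responses
    outcomes = possible_signatures(responses, word_index=1, bit=0)
    print(sorted(hex(s) for s in outcomes))
```

Because this register update is linear and invertible, the two possible values of the unknown bit always lead to two different signatures, so no single golden signature exists for the fault-free circuit. Bounding removes the unknown bit before it reaches the MISR.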

Figure 5: RAM bypass.

If the output multiplexers are not acceptable for timing or introduce an unacceptable hardware overhead, an alternative is to freeze the RAM after it is initialized. If the logic BIST session follows a memory BIST run, then the RAM can be disabled at the end of the memory BIST session. This forces the outputs of the RAM to have constant values throughout the logic BIST run. While this method has low area overhead, faults propagating to the RAM will be blocked if no observe points are inserted on the RAM's inputs. Furthermore, constant values will be applied from the RAM, which can decrease the testability of faults in the logic driven by the RAM.

It may also be possible to bypass some RAMs with low hardware overhead and without adding any logic on their inputs and outputs. If the RAM supports a pass-through mode, in which the same address is written and read simultaneously, this mode can be used to make the memory transparent. Test logic would be required to force the memory into this mode during the BIST session. The main disadvantage of this method is that while it allows the data inputs to pass through, faults propagating to the address lines may not be tested. Furthermore, if multiple RAMs operate in this mode, combinational loops may form. It is therefore recommended to use the RAM bypass method discussed above.

C. Handling of primary inputs and outputs

In logic BIST, only the scan chains are controlled and observed by default. Since the tester does not drive the test, it does not drive the primary inputs (PIs) or observe the primary outputs (POs). If POs are not observed, loss of test coverage will result, since faults which only propagate to POs and not to scan cells will not be tested. More importantly, PIs must be driven; in addition to causing loss of coverage, a floating PI is an X generator. Control points can be added on the PIs to force them to constant values during the BIST session. While this prevents the PIs from generating Xs, loss in coverage may result due to the constant values forced. The recommended solution is to use MUX control points and drive the PIs from nearby existing scan cells. PIs are therefore handled exactly the same as X generators. Only PIs which are directly driven by the BIST controller do not need to be bounded. To observe POs during the BIST session, observe points are used to connect them to scan cells.

IV. PRACTICAL EXPERIENCE WITH LOGIC BIST

A. Background and motivation

This section describes the practical aspects of introducing logic BIST into a department that designs large ASICs. The designs have 200-800K NAND2 gate equivalents of logic plus an equivalent area of RAMs. There are often multiple clock domains with frequencies ranging from 2.5 MHz to 150 MHz. Register Transfer Level (RTL) VHDL is the design sourcing language. A simplified diagram of the overall design flow is shown in Figure 6. One of the key design processes is daily execution of some design flows: design synthesis to gates including gate level scan insertion, RTL simulation regression, and Ikos™

Table II: Design data.

                          ASIC1   ASIC2   ASIC3   ASIC4
Core comb. gate count     180K    356K    558K    748K
Number of RAMs            16      10      10      11
75 MHz clocks             0       1       1       1
125 MHz clocks            9       0       0       0
150 MHz clocks            0       1       1       1
25/2.5 MHz clocks         0       0       16      16
125/25 MHz clocks         0       4       2       2
50 MHz clocks             1       0       0       1

Figure 6: Design flow.

Daily execution of these flows requires them to run overnight and therefore in less than 12 hours. The DFT used is nearly full scan, muxed-scan style, with scan insertion performed on the gate level core netlist. Scan insertion currently takes 1-2 hours of the allotted 12 hours for the flow. While ATPG is important, it is not a critical path flow; ATPG with pattern compression takes approximately one day. Normally, a 10-20% sample of the ATPG scan patterns is serially simulated on an Ikos™, which takes approximately 30 hours. Silicon testing uses a suite of parametric tests followed by functional tests and the scan ATPG tests. Stuck-at ATPG grades are approximately 97% and pseudo stuck-at IDDQ grades are typically 80% with 10 stops. The largest designs have 40K scan cells, and ATPG of these designs generates approximately 5K scan patterns. Assuming that each scan cell generates 3 bits of scan test data per scan pattern, these 5K scan patterns translate to 600 Mb of scan test data. These designs have 8 parallel scan chains and ATPG patterns are applied at 20-50 MHz, giving silicon run times of 1.3-0.5 seconds. Scan overhead is approximately 9% extra logic, which translates to a 4% chip area overhead.
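The scan data volume and run times quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes perfectly balanced scan chains, one tester cycle per shifted bit, and a negligible number of capture cycles, none of which the text states explicitly.

```python
def scan_data_megabits(scan_cells, patterns, bits_per_cell):
    """Scan test data volume in megabits (10**6 bits)."""
    return scan_cells * patterns * bits_per_cell / 1e6


def scan_run_time_s(scan_cells, chains, patterns, shift_hz):
    """Approximate silicon run time of a stored-pattern scan test in seconds.

    Assumes balanced chains and ignores capture cycles, so each pattern
    costs one shift cycle per cell of the longest chain.
    """
    chain_length = -(-scan_cells // chains)  # ceiling division
    return patterns * chain_length / shift_hz


if __name__ == "__main__":
    cells, chains, patterns = 40_000, 8, 5_000
    print(f"data volume: {scan_data_megabits(cells, patterns, 3):.0f} Mb")
    for mhz in (20, 50):
        seconds = scan_run_time_s(cells, chains, patterns, mhz * 1e6)
        print(f"run time at {mhz} MHz: {seconds:.2f} s")
```

With the figures from the text, this gives 600 Mb of scan data and run times of 1.25 s at 20 MHz and 0.5 s at 50 MHz, consistent with the quoted 1.3-0.5 second range.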

B. BIST implementation

Logic BIST was implemented on a trial basis in four designs. These are large designs with multiple clocks. Table II gives some vital statistics of the designs. The 75 MHz and 150 MHz clocks are generated within the designs. Although there are lower frequency clocks, all logic works at 75 MHz. For ATPG testing, all clocks run at 50 MHz, and this single-frequency multi-clock test mode was used for the current logic BIST implementation. ASIC1 clocks run at 125 MHz. The other ASICs' clocks run at 75 MHz.

Logic BIST was implemented using the STUMPS architecture described in Section II. Moving from a scan ATPG methodology to a STUMPS logic BIST methodology is a small step. However, additional design work arises as follows:
1. Generation of a BIST controller.
2. Multiplexing and balancing of clocks.
3. Insertion of many short scan chains for use in BIST mode, and the ability to reconfigure them into relatively few, long scan chains for use in ATPG mode.
4. Test point insertion.
5. Handling of observable X generators.
6. Bounding module inputs.
7. Fault simulation to measure the fault grade and compute the MISR signature.
8. Timing analysis (TA) and resolution of any test point-related TA issues.
9. Gate level timing simulation verification of BIST.

In what follows, issues related to clocks, test point insertion, and X generators will be described.

Multi-clock designs require careful balancing of the clock trees. Clock skew within a clock tree and between clock trees must typically be reduced to 0.3 ns at clock speeds of 75 MHz. Such clock control can be achieved through ASIC layout clock tree synthesis (CTS) tools. The clock multiplexing inherent in a multi-clock logic BIST controller is therefore safe as long as the ASIC CTS macros are placed directly on the output of the BIST controller clock multiplexers.

MTPI test point insertion uses simple AND or OR gates for control points, driven by the test phase control signal. Selecting gates of sufficient drive ensures correct operation of these static signals. MTPI observe points are implemented as new output signals, as new scan cells, or by connection into existing scan cells. Each of these three observe point types also has the option of observe point sharing via XOR trees used as space compactors. Observe point signals must operate at speed, so they must be captured close to their source, typically within a 20K gate region. The sparsity of observe points within 400-800K gate designs is such that observe point sharing would require non-local XOR compaction trees, which would not work at speed. For large designs, observe points must therefore either be connected into local pre-existing scan cells or connected into new additional local scan cells. The current logic BIST implementation utilized new scan cell observe points for ASIC1, and observe points connected into pre-existing scan cells (using the XOR/MUX circuit of Figure 4) for the larger ASICs.

The RAMs in the designs can shadow up to 3% of the faults. Therefore RAM bypass mode was used, as shown in Figure 5. Using 10 scan cells per RAM leads to a bypass cost of approximately 125 gates per RAM. Removal of the other X generators can be done either by manual alteration of the source VHDL or by automatic bounding as described in Section III. Manual VHDL source fixing of X generators is not practical within the 1-2 person month resource limit, so automatic X bounding was used.

The only practical way to implement logic BIST is automation of the design tasks together with a methodology which minimizes the probability of failing timing analysis and simulation. Our STUMPS implementation therefore utilized a TAP controller whose RTL was automatically generated. The BIST controller described in Section II was used; its RTL was also automatically generated. Finally, the automatic multi-phase test point insertion, X bounding, and module input bounding described in Section III were used. This combination allowed complete automation of the logic BIST implementation, thereby freeing the 1-2 person months of resource to handle issues of timing analysis and simulation verification.

C. BIST implementation results

The basic results of the logic BIST implementation are given in Table III. These results represent typical achieved values, not necessarily the optimum possible. Fault grades are quoted for the design core only, but all logic in the design core is counted, including the bounding multiplexers and test point logic. The fault list used is the same as that used in ATPG, so all faults in the core are included. Thus, no credit is given for possible detected faults, scan enable faults are not implicitly detected, and faults associated with tied logic are included as not detected.

As can be seen, BIST fault grades of 95-96% are achievable with approximately 2% logic overhead. BIST fault simulation time is within the goal. However, scan and test point insertion time ranges from 1-14 hours, giving an additional RTL-to-gates time of 0.5-12.5 hours versus the goal of 2 hours. This additional time is mainly the time spent performing test point insertion, which includes fault grading. ATPG is performed on the BISTed cores under the same conditions in which BIST is run. The ATPG comparative grades of 97-98% indicate an expected grade shortfall when using logic BIST. Note that while the BIST silicon run times are within the 1 second goal, they are 2-3 times longer than the ATPG pattern silicon run times. In the future, as ASIC clock frequencies rise to 0.5 GHz and beyond, BIST silicon run times will become less than ATPG pattern run times.

Logic BIST test can be topped up with ATPG of the residual faults. For these designs, the ATPG top-up pattern volume is 25-65% of a full ATPG test. A breakdown of the BIST overhead for ASIC4 is given in Table IV. The biggest contributors are the observe points and the BIST controller.

Table III: Summary of logic BIST results.

                                        ASIC1   ASIC2   ASIC3   ASIC4
BIST pattern count                      65K     262K    262K    262K
BIST stuck-at fault grade (%)           96.0    95.7    95.3    95.6
BIST gates overhead (%)                 3.4     2.6     2.1     1.58
Scan + test point insertion time (hr)   0.7     3.3     6.7     14.5
Fault simulation time (hr)              0.9     3.4     5.2     4.0

[Values for the remaining rows of Table III are not legible: BIST chip area overhead (%), Ikos™ simulation time, BIST silicon run time (sec), ATPG grade (%), ATPG pattern volume (Mb), ATPG top-up pattern volume (Mb), ATPG frequency (MHz), ATPG silicon run time (sec).]

Table IV: BIST overhead for ASIC4.

Component               NAND2 gate equivalents   Overhead as % of scan-inserted netlist
BIST controller         3489                     0.43%
Core input bounding     304                      0.04%
CAM bounding            1246                     0.15%
RAM bounding            1140                     0.14%
X bounding              (not legible)            0.04%
592 control points      (not legible)            0.15%
1200 observe points     (not legible)            0.63%
Total BIST component    12762                    1.58%

Sensitivity of the BIST fault grades to the number of added test points is shown in Figure 7 for design ASIC3. The BIST grades rise sharply as control and observe points are added. In the 94-96% region, the grade is relatively insensitive to the number of control points but rises significantly as observe points are added. Sensitivity of the grade to the number of BIST patterns is shown in Figure 8 for design ASIC4. Significant grade increases occur for pattern counts up to 256K patterns and beyond.

This is pre- or post-layout gate level timing simulation of the 65K BIST patterns in full serial mode. This long simulation time meant that any debug had to be done using a BIST controller set up for just a few patterns. Automatic checking of expected against simulated values at key points in the BISTed design also facilitated debug. Key points are the PRPG state, STUMPS scan-in points, STUMPS scan-out points, and MISR signature. ASIC1 Ikos™ simulation found one timing issue. At the end of the BIST run, the MISR signature is scanned out through a relatively slow chip pin driver. Driver delay variation between min and max timing made the MISR signature slip by one cycle between these conditions. The solution was to slow the clock to 50 MHz during scan-out of the MISR.

Addressing the design process goals of logic BIST, the new design flow is shown in Figure 9. The changed components are highlighted. From the viewpoint of the main design processes, this new design flow is essentially unchanged. The only change to these main processes is the addition of automatic bounding and test point insertion to the scan insertion step. The DFT engineer, however, has the extra tasks of generating the RTL VHDL of the TAP and BIST controller and of getting satisfactory results for gate level BIST fault simulation and for functional and timing simulation. Finally, any timing issues with the inserted test points will also result in additional engineering time.

Figure 8: ASIC4 logic BIST grades versus patterns (x-axis: number of BIST patterns, with ticks from 100000 to 600000).

Figure 9: Design flow with logic BIST.

An assessment of the success of current implementations of logic BIST in the designs is presented in Table V. Most of the goals were achieved. However, the run time for the design compile is too long, and work is underway to address this through design partitioning and distributed processing. Ikos™ simulation times are also over the goal, but in retrospect this goal was impractical. Future simulations will mostly be partial simulations, as is our current practice with ATPG pattern Ikos™ simulations. Confidence in using logic BIST is now high enough that it will actually be used in new designs.

Goal   Description                                          Status
G1     Eliminate tester memory and frequency limitations    Achieved
G2     Provides at-speed scan testing                       Achieved
G3     Works for 1-2 million gate designs                   Expected
G4     BIST stuck-at grade ≥ 95%                            Achieved
G5     Logic BIST area ≤ 2% of logic                        Achieved
G6     Silicon BIST run time < 1 sec                        Achieved
G7     Effort < 2 person month per design                   Expected
G8     Ability to use ATPG or logic BIST                    Achieved
G9     Minimal impact on design methodology                 Achieved
G10    Automation of the logic BIST flow                    Achieved
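A rough feel for goal G6 can be obtained from a back-of-the-envelope estimate of test application time. The sketch below is an approximation under stated assumptions rather than a measurement from these designs: it charges each pattern one full shift of the longest scan chain plus a few capture cycles, and it ignores the final signature unload. The chain length of 100 cells and the 2 capture cycles are hypothetical values. The 50 MHz figure appears in this paper only as the slowed clock for MISR scan-out, and its use here as the shift clock is likewise an assumption.

```python
def bist_run_time_s(n_patterns: int, longest_chain: int,
                    capture_cycles: int, shift_clock_hz: float) -> float:
    """Approximate BIST run time: every pattern costs one shift of the
    longest scan chain plus its capture cycles, all at the shift clock."""
    total_cycles = n_patterns * (longest_chain + capture_cycles)
    return total_cycles / shift_clock_hz


# 65,536 patterns, hypothetical 100-cell chains, 2 capture cycles, 50 MHz shift.
print(f"{bist_run_time_s(65536, 100, 2, 50e6):.3f} s")  # 0.134 s
```

Because the run time scales with the length of the longest chain, configuring the scan cells into many short chains is what keeps the estimate well under one second even for tens of thousands of patterns.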

V. CONCLUSIONS
In this paper, a practical logic BIST solution for large and complex industrial digital designs has been presented. The challenges in making logic BIST a viable test solution include making a design BIST-ready, achieving high test quality, automating logic BIST, and integrating logic BIST into the overall design flow without impacting the product schedule. Techniques like automatically identifying and bounding X generators, bypassing RAMs, bounding I/Os, and test point insertion have been proposed and discussed to make a design BIST-ready. The multi-phase test point insertion technique has been used to improve random pattern testability of the designs and to make BIST test coverage approach that of ATPG. A novel BIST controller has been proposed to handle at-speed testing of multi-frequency designs. This multi-frequency BIST scheme is designed to test various intra- and inter-clock domain paths at-speed, thereby increasing the quality of test, without requiring that scan shifting be performed at speed.

The results of implementing the logic BIST solution on four industrial designs have been reported. The solution embodies the techniques described above, and a number of tools have been used to automate the BIST flow. These tools and techniques have made logic BIST a feasible solution for such large and complex industrial designs. The results presented demonstrate that most of the objectives set for logic BIST have been satisfied for the four designs. It has also been shown that logic BIST is implemented in these designs with low area overhead and high stuck-at fault coverage. The test application time as well as the fault simulation time were shown to be low. Finally, with the use of automation, it has been possible to implement logic BIST without impacting the product schedule. The proposed scheme, together with the implementation experience reported, shows that logic BIST is a viable and acceptable test solution for large industrial designs. Future work will report on practical issues of implementing multi-frequency at-speed logic BIST as well as measuring the effectiveness of logic BIST test.

ACKNOWLEDGEMENTS

The authors would like to thank Theo Powell and the MOS design center engineers of Texas Instruments, as well as Ian Burgess, Ralph Sanchez, and Kelly Scott of Mentor Graphics, for their contributions and support.

REFERENCES
[1] B. Bottoms, "The Third Millennium's Test Dilemma," IEEE Design & Test of Computers, pp. 7-11, Vol. 15, No. 4, Fall 1998.
[2] E. J. McCluskey, "Built-In Self Test Techniques," IEEE Design & Test of Computers, pp. 21-28, Vol. 2, No. 2, April 1985.
[3] W. Needham and N. Gollakota, "DFT Strategy for Intel Microprocessors," Proc. of International Test Conference, pp. 396-399, 1996.
[4] T. Foote, D. Hoffman, W. Huott and M. Kusko, "Testing the 400-MHz IBM Generation-4 CMOS Chip," Proc. of International Test Conference, pp. 106-114, 1997.
[5] C.-J. Lin, Y. Zorian and S. Bhawmik, "PSBIST: A Partial Scan Based Built-In Self Test Scheme," Proc. of International Test Conference, 1993.
[6] P. H. Bardell, W. H. McAnney, and J. Savir, Built-In Test for VLSI: Pseudorandom Techniques, John Wiley and Sons, New York, 1987.
[7] J. Rajski, N. Tamarapalli, and J. Tyszer, "Automated Synthesis of Large Phase Shifters for Built-In Self-Test," Proc. of International Test Conference, pp. 1047-1056, 1998.
[8] N. Tamarapalli and J. Rajski, "Constructive Multi-Phase Test Point Insertion for Scan-Based BIST," Proc. of International Test Conference, pp. 649-658, 1996.
[9] A. Hassan, J. Rajski, R. Thompson and N. Tamarapalli, "Method and Apparatus for At-Speed Testing of Digital Circuits," US patent pending.
[10] B. Nadeau-Dostie, D. Burek and A. S. M. Hassan, "ScanBIST: A Multifrequency Scan-Based BIST Method," IEEE Design & Test of Computers, pp. 7-17, Vol. 11, No. 1, Spring 1994.
