### **Mirosław PUCZKO**

BIALYSTOK UNIVERSITY OF TECHNOLOGY, COMPUTER SCIENCE DEPARTMENT 45A Wiejska St., 15-351 Bialystok

# Low power BIST

#### Abstract

In recent years, designers have concentrated mainly on low power consumption in mobile computing devices and cellular phones. In this paper, new solutions for reducing the switching activity of the BIST environment in scan-organized Built-In Self-Test (BIST) architectures are presented. The key idea behind this technique is a new LFSR structure that generates more than one pseudo-random bit per clock pulse. Theoretical calculations were verified in hardware in two digital system design environments: WebPACK ISE by Xilinx and Quartus II by Altera. Power consumption was measured with Xilinx XPower and the Altera PowerPlay Power Analyzer Tool. The practical verification covers the power consumption of the Test Pattern Generator (TPG) as well as of the complete BIST. The obtained results are over a dozen percent better than those of similar works.

**Keywords**: low power, BIST, Test Pattern Generator, Signature Analyzer, test-per-scan, test-per-clock, power consumption.

### 1. Introduction

Power consumption in digital CMOS circuits depends directly on the switching frequency and on the physical parameters of the basic gates. It has been shown in [1] that power consumption in test mode is higher than in normal operating mode. That is why any new solution for reducing power in test mode is of great practical interest. At the same time, an adequate power consumption metric remains an open issue, because existing approaches yield only an approximate value of the real energy consumption rather than a worst-case (maximal) estimate.

Several approaches to low power BIST have been proposed. In [2] a test scheduling approach that takes power consumption into consideration is presented. For the general BIST structure, a new test pattern generator is proposed in [3] that reduces the circuit input activity without affecting test efficiency, thus reducing power consumption. A set of solutions eliminates useless pseudo-random patterns in test mode [4], which keeps the same fault coverage at an acceptable test length while lowering power consumption.

Scan design techniques are the most widespread and best known. They assume that in test mode all memory elements (flip-flops and latches) of a sequential circuit are connected into one or more shift registers, or scan paths. The LFSR is the most popular device for generating pseudo-random test sequences. The LFSR structure can also be modified to accept an additional input so that it works as a polynomial divider (called a Signature Analyzer, SA). As a result, a significant amount of energy is consumed in shift mode: for bit stream generation, output response compaction and shifting the bit stream into the scan path. In [5] it has been shown that BIST hardware may consume up to 70% of the total energy in test mode.

In the vast majority of research works, the authors take into account only the power consumed by one of the BIST blocks (the Test Pattern Generator or the Signature Analyzer), depending on the proposed solution [6]. Only in exceptional cases is the power consumption of the whole BIST taken into consideration [7]. The works differ mainly in the blocks for which the power is calculated and in the software used for the calculation.

This paper presents new solutions for decreasing the BIST switching activity for scan-based architecture.

The paper is organized as follows. Section 2 discusses the power consumption issue and weighted switching activity modeling. Section 3 presents the new approach to the TPG. Sections 4 and 5 present the hardware verification of the new approach. Section 6 is the summary.

### 2. Power model

For decades, CMOS technology has dominated among manufacturers of digital semiconductor chips. The basic element for implementing logic functions in CMOS technology is the inverter, a pair of complementary MOSFET transistors [8].

Power in CMOS circuits has a static and a dynamic (switching) component [9]. Dynamic power is dissipated when the circuit switches from one logical state to another. When a MOSFET switches, its gate capacitance must be charged or discharged, which directly influences switching times, switching currents and, consequently, the power. Dynamic power depends on the switching activity of the circuit: the higher the switching activity, the higher the dissipated power. In the absence of switching, the dynamic power is zero.

Static power in CMOS is vanishingly small, so the power loss in CMOS is determined by the dynamic power.

The dynamic power can be calculated as:

$$P = \frac{1}{2} \cdot C_L \cdot U^2 \cdot f \tag{1}$$

where: $C_L$ – output capacitance, $U$ – supply voltage, $f$ – switching frequency.
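As a quick numerical illustration of Eq. (1), the sketch below evaluates the formula for one set of values. The function name and the chosen capacitance and voltage are assumptions made for this example only; they are not measurements from the paper.

```python
def dynamic_power(c_load_farads, supply_volts, freq_hz):
    """Dynamic power from Eq. (1): P = 1/2 * C_L * U^2 * f."""
    return 0.5 * c_load_farads * supply_volts ** 2 * freq_hz


# Illustrative values only (assumed): 10 pF load, 2.5 V supply, 100 MHz.
p = dynamic_power(10e-12, 2.5, 100e6)
print(f"P = {p * 1e3:.3f} mW")  # P = 3.125 mW
```

Because the supply voltage enters quadratically and the frequency linearly, halving the frequency halves the result, whereas halving the voltage reduces it by a factor of four.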

According to [9, 10], the average energy associated with recharging the capacitance of a single node *j* is:

$$E_j = \frac{1}{2} \cdot U_{dd}^2 \cdot C_0 \cdot \left(SA_j \cdot v_j\right) , \tag{2}$$

where: $U_{dd}$ – power supply voltage, $C_0$ – capacitance of one node, $SA_j$ – switching activity of node *j*, $v_j$ – branching factor of node *j*.

The product of the switching activity of node *j* and its branching factor is called the Weighted Switching Activity (WSA) of the node and is written as $WSA_{j}$:

$$WSA_j = SA_j \cdot v_j \,. \tag{3}$$

WSA for the whole circuit can be written as:

$$WSA = \sum_{j=1}^{n} SA_j \cdot v_j \ . \tag{4}$$
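Eqs. (3) and (4) can be illustrated with a minimal sketch. It assumes that the switching activity of a node is counted as the number of value changes in a recorded trace and that the branching factor is given per node; the node names, traces and fan-out values below are hypothetical.

```python
def switching_activity(trace):
    """Count the 0->1 and 1->0 transitions in a sequence of node values."""
    return sum(a != b for a, b in zip(trace, trace[1:]))


def weighted_switching_activity(traces, fanouts):
    """WSA of the whole circuit, Eq. (4): sum over nodes of SA_j * v_j."""
    return sum(switching_activity(traces[node]) * fanouts[node] for node in traces)


# Hypothetical three-node circuit observed over five clock cycles.
traces = {"a": [0, 1, 0, 1, 0], "b": [0, 0, 1, 1, 1], "c": [1, 1, 1, 1, 1]}
fanouts = {"a": 1, "b": 3, "c": 2}

print(weighted_switching_activity(traces, fanouts))  # 4*1 + 1*3 + 0*2 = 7
```

A node that never toggles (node `c` here) contributes nothing regardless of its fan-out, which is why reducing the number of transitions directly reduces WSA.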

The average energy associated with recharging the capacitance of node *j*, expressed through its weighted switching activity, is:

$$E_j = \frac{1}{2} \cdot U_{dd}^2 \cdot C_0 \cdot WSA_j . \tag{5}$$

The average power of node *j* is:

$$P_j = \frac{1}{2} \cdot U_{dd}^2 \cdot C_0 \cdot WSA_j \cdot f . \tag{6}$$

The total average power is the sum of the average power of all nodes:

$$P_{av} = \frac{1}{2} \cdot U_{dd}^2 \cdot C_0 \cdot f \cdot \sum_{j=1}^n WSA_j = \frac{1}{2} \cdot U_{dd}^2 \cdot C_0 \cdot f \cdot WSA . \tag{7}$$

*WSA* is the only quantity in (7) that varies, so the power calculation reduces to calculating *WSA*. The quantities commonly used in the literature to express power consumption are:

- power P in W (or in a submultiple unit) [6, 12],

- Weighted Switching Activity (WSA) [13].
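The relation between the two metrics can be sketched as follows. With $U_{dd}$, $C_0$ and $f$ fixed, Eq. (7) is linear in *WSA*, so a relative power saving can be computed from two *WSA* values alone. The function names and all numbers below are assumptions chosen for illustration.

```python
def average_power(wsa, c0_farads, u_dd_volts, freq_hz):
    """Total average power from Eq. (7): P_av = 1/2 * U_dd^2 * C_0 * f * WSA."""
    return 0.5 * u_dd_volts ** 2 * c0_farads * freq_hz * wsa


def relative_saving(wsa_reference, wsa_modified):
    """Fractional power saving; U_dd, C_0 and f cancel, only WSA remains."""
    return 1.0 - wsa_modified / wsa_reference


# Hypothetical values for illustration only.
C0, U_DD, F = 1e-15, 2.5, 100e6
p_ref = average_power(1000, C0, U_DD, F)
p_mod = average_power(600, C0, U_DD, F)

print(f"saving from power values: {1.0 - p_mod / p_ref:.2f}")
print(f"saving from WSA values:   {relative_saving(1000, 600):.2f}")
```

Both lines report the same saving, which is why some works quote WSA directly instead of a power value in watts.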

### 3. New approach of TPG

Theoretical calculations presented in [14], [16] were checked in practice in two digital system design environments: WebPACK ISE by Xilinx and Quartus II by Altera. Power consumption was measured with Xilinx XPower and the Altera PowerPlay Power Analyzer Tool. The practical verification covers the power consumption of the Test Pattern Generator (TPG) as well as of the whole Built-In Self-Test (BIST). The main idea of the presented solution is shown in Figs. 1 and 2. The standard LFSR (Fig. 1) generates one new bit in each clock cycle.



Fig. 1. Connection schema in the standard LFSR
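The behaviour of a standard LFSR can be sketched in software as below. This is an illustrative model, not the VHDL used in the paper: the register length (5 stages) and the feedback taps (stages 3 and 5, a well-known maximal-length configuration) are assumptions chosen for the example.

```python
def lfsr_step(state, taps):
    """Advance a Fibonacci LFSR by one clock: exactly one new bit enters.

    state -- list of bits, state[0] is the most recently generated bit
    taps  -- indices of the stages XORed together to form the new bit
    """
    new_bit = 0
    for t in taps:
        new_bit ^= state[t]
    return [new_bit] + state[:-1]


def lfsr_period(seed, taps):
    """Number of clocks until the register returns to its seed state."""
    state = lfsr_step(list(seed), taps)
    steps = 1
    while state != list(seed):
        state = lfsr_step(state, taps)
        steps += 1
    return steps


# Assumed example: 5 stages, feedback from stages 3 and 5 (indices 2 and 4).
TAPS = (2, 4)
seed = [1, 0, 0, 0, 0]

print(lfsr_step(seed, TAPS))    # [0, 1, 0, 0, 0]
print(lfsr_period(seed, TAPS))  # 31
```

Each call to `lfsr_step` models one clock pulse and shifts every stage, so filling a scan chain of length *m* costs *m* clock cycles in which all register stages may toggle.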

The modified LFSR (Fig. 2) generates q new bits in each clock cycle. This difference in the number of generated new bits leads to power saving.



Fig. 2. Connection schema in the new approach (modified LFSR)
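One way to model the generation of q bits per clock is to raise the LFSR transition matrix to the q-th power over GF(2), so that a single matrix-vector product advances the register by q positions. The sketch below follows that idea; it is an assumption about how the $V^q$ matrix mentioned in Section 5 can be formed, not a reproduction of the construction in [14], [16]. The 5-stage register and the taps are hypothetical.

```python
def companion_matrix(n, taps):
    """Matrix V such that next_state = V * state (mod 2) for one LFSR clock."""
    v = [[0] * n for _ in range(n)]
    for t in taps:
        v[0][t] = 1          # row 0 forms the feedback (new) bit
    for i in range(1, n):
        v[i][i - 1] = 1      # remaining rows shift the register by one
    return v


def mat_mul(a, b):
    """Matrix product over GF(2)."""
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]


def mat_pow(v, q):
    """V raised to the q-th power over GF(2) by repeated multiplication."""
    n = len(v)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(q):
        result = mat_mul(result, v)
    return result


def mat_vec(m, state):
    """Matrix-vector product over GF(2)."""
    return [sum(row[k] & state[k] for k in range(len(state))) % 2 for row in m]


def single_step(state, taps):
    """Reference model: one clock of the standard LFSR."""
    new_bit = 0
    for t in taps:
        new_bit ^= state[t]
    return [new_bit] + state[:-1]


N, TAPS, Q = 5, (2, 4), 3            # assumed example parameters
vq = mat_pow(companion_matrix(N, TAPS), Q)

state = [1, 0, 0, 0, 0]
reference = state
for _ in range(Q):
    reference = single_step(reference, TAPS)

print(mat_vec(vq, state))            # [1, 0, 0, 1, 0]
print(mat_vec(vq, state) == reference)
```

The sketch only covers the state update. When q exceeds the register length, some of the q output bits no longer fit in the state and would need their own XOR equations.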

The standard generator and the new approach were described in VHDL and verified at the RTL level. The RTL scheme of the standard LFSR is presented in Fig. 3.



Fig. 3. The standard LFSR RTL scheme

The RTL of the modified LFSR scheme is shown in Fig. 4.



Fig. 4. The modified LFSR RTL scheme

### 4. Hardware verification in XILINX

With the Xilinx tools, only the power consumption of the TPG was measured. The power of the circuit under test was not taken into account.

To lower the power consumption, more than one new bit should be generated per clock cycle. To achieve this, the scheme from Fig. 2 was used. Projects for all primitive polynomials of the $5^{\text{th}}$ degree, with the number of new bits (per clock cycle) ranging from 2 to 10, were written in VHDL. Each project was then synthesized in WebPACK ISE for the Spartan II XC2S100 FPGA, and the power was measured using XPower. After verifying that the modified generator worked properly, the power was measured at a frequency of 100 MHz and an operating time of 4000 ns. The obtained results are presented in Tab. 1.

Tab. 1. Dynamic power consumption and number of logical elements per one new bit for the modified generator, for 2-10 new bits, for primitive polynomials of the 5<sup>th</sup> degree. P [mW] – minimum and maximum dynamic power, LM – number of macrocells, LE – number of logical elements

| Number of new bits | 2       | 3       | 4       | 5       | 6     | 7     | 8     | 9  | 10    |
|--------------------|---------|---------|---------|---------|-------|-------|-------|----|-------|
| P, mW              | 225-228 | 172-174 | 129-131 | 103-105 | 88-90 | 75-76 | 66-67 | 59 | 53-54 |
| LME                | 21      | 14      | 11      | 9       | 7     | 6     | 5     | 5  | 5     |
| LE                 | 3       | 2       | 1       | 1       | 1     | 1     | 1     | 1  | 1     |

As shown in Tab. 1, the modified TPG lowers the power consumption per one new bit in BIST.

### 5. Hardware verification in ALTERA

With the Altera tools, the power consumption was measured for the whole BIST (TPG, circuit under test, SA). For this purpose, a special tool was written in C to generate the code of the generator. Its output is VHDL code consisting of three files:

- modulo q counter,
- block generating the values of the $V^q$ matrix,
- block generating q new bits.

The tool takes as parameters the primitive polynomial for which the matrix $V^q$ is to be generated and the number of new bits (q). Executing the program produces the three files. In a practical realization, these files should be added to the project and correctly connected to the inputs (the CLOCK and RES signals) and the outputs (the generated bits).

Fig. 5 presents the RTL scheme of the generator of 5 new bits for the primitive polynomial  $f(x)=1+x^3+x^6$ .



Fig. 5. RTL scheme for 5 new bit generator using primitive polynomial  $f(x)=1+x^3+x^6$ 

There are two input signals (clk and res). The output *y* carries the new bits produced by the generator.

Fig. 6 shows the RTL scheme of the block calculating the coefficients of the matrix M for the primitive polynomial  $f(x)=1+x^3+x^6$ . The matrix M is coded as logical equations, so no external memory is needed.
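How a matrix can be turned into logical equations may be sketched as follows: each new bit is tracked symbolically as the set of current register stages whose XOR produces it, and the sets are then printed as XOR expressions. This is only an illustration of the idea, written in Python rather than C; the register length, the taps, the signal names `s` and `y`, and the VHDL-like output format are all assumptions and do not reproduce the output of the author's tool.

```python
def new_bit_equations(n, taps, q):
    """For each of q consecutive new bits, return the set of current-state
    stage indices whose XOR equals that bit."""
    # Initially every stage depends only on itself.
    state = [frozenset([i]) for i in range(n)]
    equations = []
    for _ in range(q):
        new_bit = frozenset()
        for t in taps:
            new_bit = new_bit ^ state[t]   # symmetric difference == XOR in GF(2)
        equations.append(new_bit)
        state = [new_bit] + state[:-1]     # symbolic shift by one position
    return equations


def format_equation(k, terms):
    """Render one equation as a VHDL-like concurrent assignment (illustrative)."""
    rhs = " xor ".join(f"s({i})" for i in sorted(terms)) or "'0'"
    return f"y({k}) <= {rhs};"


# Assumed example: 5-stage register, taps at indices 2 and 4, q = 3 new bits.
for k, terms in enumerate(new_bit_equations(5, (2, 4), 3)):
    print(format_equation(k, terms))
# y(0) <= s(2) xor s(4);
# y(1) <= s(1) xor s(3);
# y(2) <= s(0) xor s(2);
```

Here `y(0)` is the first bit that a standard LFSR would produce and `y(2)` the third. Because every equation depends only on the current register contents, all q bits can be computed combinationally within one clock cycle.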

Fig. 7 shows the RTL scheme of the counter for the primitive polynomial  $f(x)=1+x^3+x^6$ .

This tool was used to prepare the files implementing the generator used in the power measurements.



Fig. 6. RTL scheme of the block calculating factors of matrix M for the primitive polynomial  $f(x)=1+x^3+x^6$ 



Fig. 7. RTL scheme of the counter for the primitive polynomial  $f(x)=1+x^3+x^6$ 

The experiments used primitive polynomials of the 28<sup>th</sup> and 33<sup>rd</sup> degree. The primitive polynomial of the 33<sup>rd</sup> degree allows generating a test sequence for the circuit C1908 (ISCAS'85), and the primitive polynomial of the 28<sup>th</sup> degree for the circuit s38417 (ISCAS'89). These benchmark circuits (ISCAS'85, ISCAS'89) were used in comparable works.

Tab. 2 presents the power consumption and the number of logical elements for the primitive polynomial  $f(x)=1+x^4+x^6+x^{33}$ .

Tab. 2. Dynamic power consumption and number of logical elements for 2-10 new bits for the primitive polynomial of the 33<sup>rd</sup> degree. P, mW – dynamic power, LE – number of logical elements

| Number of new bits | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   |
|--------------------|------|------|------|------|------|------|------|------|------|
| P, mW              | 0.13 | 0.08 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 |
| LE                 | 81.5 | 54.3 | 42.2 | 33.6 | 28.1 | 24.1 | 21.1 | 18.8 | 17.0 |

The results presented in Tab. 2 show that the number of logical elements per one new bit decreases as the number of new bits increases: for ten new bits it is lower than for two new bits. The same holds for the dynamic power consumption, where the difference between generating ten new bits and two new bits is 82.53%.



Fig. 8. Power consumption for LFSR of 33<sup>rd</sup> degree. P1 – $f(x)=1+x^4+x^6+x^{33}$, P2 – $f(x)=1+x^2+x^3+x^4+x^6+x^7+x^{33}$, P3 – $f(x)=1+x^4+x^{10}+x^{19}+x^{20}+x^{23}+x^{33}$.

The results obtained for three primitive polynomials of the 33<sup>rd</sup> degree are presented in Fig. 8 (for the primitive polynomials:  $f(x)=1+x^4+x^6+x^{33}$ ,  $f(x)=1+x^2+x^3+x^4+x^6+x^7+x^{33}$ ,  $f(x)=1+x^4+x^{10}+x^{19}+x^{20}+x^{23}+x^{33}$ ).

Tab. 3 presents the power consumption and the number of logical elements for the primitive polynomial  $f(x)=1+x^3+x^{28}$ . The results for the primitive polynomials of the 28<sup>th</sup> degree are also shown in Fig. 9.

Tab. 3. Dynamic power consumption and number of logical elements for 2-10 new bits for the primitive polynomial of the 28<sup>th</sup> degree. P, mW – dynamic power, LE – number of logical elements

| Number of new bits | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   |
|--------------------|------|------|------|------|------|------|------|------|------|
| P, mW              | 0.10 | 0.05 | 0.05 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 | 0.01 |
| LE                 | 65.0 | 43.3 | 32.5 | 26.2 | 22.1 | 19.1 | 16.7 | 14.7 | 13.5 |

The results presented in Tab. 3 show that the number of logical elements per one new bit decreases as the number of new bits increases. Between two and ten new bits, the difference is 79.23%. For the dynamic power consumption per one new bit, the difference is 82.35%.
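The percentage for the logical elements can be rechecked from the LE row of Tab. 3 with a short sketch. The helper name is an assumption. The power percentage is not recomputed here, because the power values in the table are rounded to two decimal places and are too coarse for that purpose.

```python
def relative_reduction(reference, value):
    """Reduction of `value` relative to `reference`, in percent."""
    return (reference - value) / reference * 100.0


# LE per one new bit, copied from Tab. 3 (number of new bits -> LE).
le_per_bit = {2: 65.0, 3: 43.3, 4: 32.5, 5: 26.2, 6: 22.1,
              7: 19.1, 8: 16.7, 9: 14.7, 10: 13.5}

print(f"{relative_reduction(le_per_bit[2], le_per_bit[10]):.2f}%")  # 79.23%
```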



Fig. 9. Power consumption for LFSR of 28<sup>th</sup> degree. P1 – $f(x)=1+x^3+x^{28}$, P2 – $f(x)=1+x+x^4+x^6+x^{28}$, P3 – $f(x)=1+x+x^4+x^5+x^6+x^{8}+x^{28}$.

### 6. Summary

The methods and algorithms presented in [14, 15, 16] address the problem of the considerable power consumption of test pattern generators and signature analyzers in BIST. This paper briefly presents the methods and techniques used to design test pattern generators and signature analyzers that minimize power consumption. For the developed methods and algorithms, simulation and hardware verification confirm that their use leads to lower power consumption in BIST.

### 7. References

- [1] Zorian Y.: A Distributed BIST Control Scheme for Complex VLSI Devices. Proc. 11th IEEE VLSI Test Symposium, 1993, pp. 4-9.
- [2] Wang S., Gupta S.: DS-LFSR: A new BIST TPG for low Heat Dissipation. Proc. of IEEE International Test Conference (ITC'97), November 1997, pp. 848-857.
- [3] Corno F., Rebaudengo M., Sonza Reorda M., Violante M.: A new BIST Architecture for Low Power Circuits. IEEE European Test Workshop (ETW'99), 1999, pp. 160-164.
- [4] Girard P., Guiller L., Landrault C., Pravossoudovitch S.: A Test Vector Inhibiting Technique for Low Energy BIST Design. Proc. 17th IEEE VLSI Test Symposium, 1999, pp. 407-412.
- [5] Gerstendorfer S., Wunderlich H.J.: Minimized Power Consumption for Scan-Based BIST. Proc. of IEEE Int. Test Conf., 1999, pp. 77-83.

- [6] Kavitha A., Seetharaman G., Prabakar T.N.: Design of Low Power TPG Using LP-LFSR. Third International Conference on Intelligent Systems Modelling and Simulation, 2012, pp. 334-338.
- [7] Nourani M., Tehranipoor M., Ahmed N.: Low-Transition Test Pattern Generation for BIST-Based Applications. IEEE Transactions on Computers, Vol. 57, no. 3, March 2008, pp. 303-315.
- [8] Baker R.J.: CMOS Circuit Design, Layout, and Simulation. Third edition. John Wiley & Sons, Inc., 2010, pp. 331-352.
- [9] Cirit M.A.: Estimating Dynamic Power Consumption of CMOS Circuits. ACM/IEEE International Conference on CAD, November 1987, pp.534-537.
- [10] Wang Y., Roy K.: Maximum power estimation for CMOS circuits using deterministic and statistical approaches. IEEE VLSI Conference, 1996.
- [11] Yeap G.P.: Practical Low Power Digital VLSI Design. Kluwer Academic Publishers, 1998.
- [12] Ye B., Li T-W.: A Novel BIST Scheme for Low Power Testing. Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference, 2010, pp. 134-137.
- [13] Vijay R., Chitra S.: Power Reduction in Scan Based BIST Using BS-LFSR and Scan-Chain Ordering. IEEE- International Conference On Advances In Engineering, Science And Management (ICAESM -2012) March 30, 31, 2012, pp. 534-540.
- [14] Puczko M., Murashko I.: Techniki zmniejszania poboru mocy wykorzystywane podczas wbudowanego samotestowania. Pomiary, Automatyka, Kontrola, 2006, R.51, No. 6, pp. 56-58.

- [15] Chowdhury S., Barkatullah J.S.: Estimation of maximum currents in MOS IC logic circuits. IEEE Transactions on Computer-Aided Design, 1990, vol. 9, No. 6, pp. 642-654.
- [16] Puczko M., Yarmolik V.N.: Two-pattern test generation with low power consumption based on LFSR. Information processing and security systems, Springer-Verlag, 2005, pp. 159-166.

Received: 21.04.2015 Paper reviewed Accepted: 02.06.2015

#### Mirosław PUCZKO, MSc

Works on low power scan-organized Built-In Self Tests.



e-mail: miroslaw.puczko@gmail.com