A Fault Detection Method for Combinational Circuits

Ali Abbass Zoraghchian 1, Moslem Didehban 2, MohammadReza Mehrabian 3

1. Department of Computer Engineering, Allame Mohaddes Noori Institute of Higher Education, Mazandaran, Iran, Zaliabbass@yahoo.com
2. Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran, m_didehban@aut.ac.ir
3. Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran, mhrb78@aut.ac.ir

Abstract. As transistors become smaller and faster and noise margins become tighter, circuits and chips, especially microprocessors, tend to become more vulnerable to permanent and transient hardware faults. Most microprocessor designers focus on protecting memory elements, rather than other parts of the microprocessor, against hardware faults by adding redundant error-detecting or error-correcting bits such as parity bits. However, the rate of soft errors in the combinational parts of microprocessors is nowadays considered as important as that in sequential parts such as memory elements. The reason is that advances in technology scaling have reduced electrical masking. This paper proposes and evaluates a logic-level fault-tolerant method based on parity for designing combinational circuits. Experimental results on a full adder circuit show that the proposed method makes the circuit fault-tolerant with less overhead than traditional methods. We also demonstrate that the proposed method enables the traditional TMR method to detect multiple faults in addition to masking single faults.

Keywords: Soft Error, Transient Fault, Fault-Tolerance, Combinational Circuits, Full Adder.

1. Introduction

As transistor dimensions have shrunk and the scale of integration has increased, chip fabricators can place more than one billion transistors on a single chip. Such a scale of integration can increase chip performance.
Many new architectural techniques, such as superscalar and Chip Multi-Processor (CMP) designs, actually need this many transistors. However, the ever-increasing, nonlinear growth in power consumption could be disastrous for circuits, because transistor density is rising steeply. To prevent this problem the supply voltage must be decreased, and this change lowers the noise margin of the circuit [1]. On the other hand, shrinking the feature size also decreases the critical charge QCritical (the minimum charge needed to upset a circuit node), which increases the probability of fault occurrence in the circuit [2]. It has been shown that large-scale circuit integration increases the failure rate exponentially [3].

Generally, new technology generations are less reliable than older ones. Some of the reasons are: lower CL (load capacitance), lower VDD or VCC (supply voltage), which leads to a smaller noise margin, lower QCritical, more process variation [4], and manufacturing defects [5]. These factors affect reliability, which is a key metric along with performance and power, and call for a fault-tolerance mechanism [3, 6, 7].

Typically, all components of a chip can be classified into two categories: logic blocks and memory elements. Commercial microprocessors typically use Error Correction Codes (ECCs) to protect memory elements. ECCs, such as parity, add latency to each access, which results in an appreciable performance penalty; moreover, they are difficult to implement for logic blocks [8, 9]. However, combinational circuits are very important in fault-tolerant design, because new technologies exhibit less electrical masking, which makes circuits more susceptible to faults [10]. In [3] it is stated that from the year 2011 onward, the importance of improving fault coverage in combinational circuits will overtake that of sequential ones.
For this reason, we chose this area for the implementation of our technique.

Fault-tolerance techniques are generally implemented using redundancy in hardware, software, time, or information [11]. In this paper, we use hardware redundancy in combinational circuits. Among the hardware fault-tolerance methods is Duplication With Comparison (DWC), in which the module is duplicated, the results are compared, and an error flag is raised if a mismatch occurs. N-Modular Redundancy (NMR) design techniques add reliability to a system at the expense of extra hardware resources. In an NMR system, every protected module must be replicated N times in order to automatically mask faults in up to ⌊N/2⌋ separate modules (for odd N). Standard Triple-Modular Redundancy (TMR) methods are used frequently. With these methods, three copies of a module and a voting circuit are implemented on an Application-Specific Integrated Circuit (ASIC) or a Field-Programmable Gate Array (FPGA). When a fault occurs, the voting circuit disregards the value of the faulty module and takes the correct value from the other two, non-faulty modules. These methods come with high area and power dissipation penalties and are inherently intended to detect or mask a single fault [11].

This paper is organized as follows: Section 2 presents a brief background on fault sensitivity in combinational and sequential circuits; Section 3 proposes a new fault-tolerance technique for combinational circuits; Section 4 applies this method to a full adder circuit; and finally, Section 5 concludes.

2. Background

A single-event transient pulse is induced when cosmic particles such as neutrons, or radiation from packaging materials such as alpha particles, hit a sensitive region of the circuit with enough energy [10]. The voltage pulse propagates through an activated path in the logic circuit. When it is captured at a clock edge, a soft error occurs.
Otherwise, the pulse is called a transient fault [12]. In recent years, with advances in fabrication technology and growing transistor counts, processors have become increasingly vulnerable to transient faults [13]. Transient faults currently account for over 80% of faults in processor-based devices [14].

In a typical integrated circuit, memory arrays, latch elements, and combinational logic are the most sensitive parts and can be affected by soft errors and transient faults. Historically, soft errors were a concern in the design of memory elements, but the susceptibility of combinational blocks to transient faults is increasing as a side effect of technology scaling. Combinational logic is usually used to design arithmetic circuits (such as adders and multipliers), in other words the data path of a computer. The importance of combinational circuits in practical processing chips is rising, as they are simpler, operate faster, and consume less power than sequential ones; several studies support this statement [15, 16]. Combinational circuits occupy a considerable portion of processing chips in comparison with sequential circuits. For example, in FPGAs the ratio of combinational to sequential circuits varies between 5 and 100 [17, 18].

Continuous device scaling, a higher degree of pipelining, and a decreasing electrical masking effect contribute to the increase in soft error rates in combinational circuits [10]. Transient faults in combinational circuits are catching up with errors in memory elements [3].
A transient fault in a logic circuit might not be captured in a memory circuit, because it can be masked by one of the following three phenomena [3, 19, 20]:

First, logical masking occurs when a particle strikes a portion of the combinational logic that is blocked from affecting the output by a subsequent gate whose result is completely determined by its other input values.

Second, electrical masking occurs when the pulse resulting from a particle strike is attenuated by subsequent logic gates, due to the electrical properties of the gates, to the point that it does not affect the result of the circuit.

Third, latching-window masking occurs when the pulse resulting from a particle strike reaches a latch, but not at the clock transition where the latch captures its input value.

These masking effects have been found to result in a significantly lower rate of soft errors in combinational logic than in storage circuits in equivalent device technology [3]. However, these effects could diminish significantly as feature sizes decrease and the number of stages in the processor pipeline increases. Electrical masking could be reduced by device scaling because smaller transistors are faster and hence may attenuate a pulse less. Also, deeper processor pipelines allow higher clock rates, meaning the latches in the processor cycle more frequently, which may reduce latching-window masking. Hence, in this work we focus on the occurrence of transient faults and soft errors in combinational logic circuits and suggest a logic-level fault-tolerant design method.

3. New Approach Framework

In this paper we present a new approach to designing fault-tolerant combinational circuits. Assume a logic circuit with m input and n output lines. Each output is a logic function of the inputs. In this method, we use hardware redundancy to add a redundant output signal to the circuit. This new output generates the parity bit for the output set.
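As a minimal sketch of this idea, the redundant output can be tabulated by evaluating the circuit for every input pattern and recording the parity of its outputs. The sketch is written in Python; the function name `parity_truth_table` and the two-input example circuit are illustrative assumptions, not part of the proposed design.

```python
from itertools import product


def parity_truth_table(circuit, m, even=True):
    """Tabulate the redundant parity output for every m-bit input pattern.

    `circuit` is any function mapping a tuple of m input bits to a tuple of
    n output bits (a hypothetical stand-in for the main logic circuit).
    """
    table = {}
    for bits in product((0, 1), repeat=m):
        outputs = circuit(bits)
        p = sum(outputs) % 2  # 1 when the outputs contain an odd number of ones
        # Even parity: outputs plus parity bit hold an even number of ones.
        # Odd parity: the parity bit is simply inverted.
        table[bits] = p if even else 1 - p
    return table


def example_circuit(bits):
    """Hypothetical 2-input, 2-output circuit: (a AND b, a OR b)."""
    a, b = bits
    return (a & b, a | b)


even_table = parity_truth_table(example_circuit, m=2)
odd_table = parity_truth_table(example_circuit, m=2, even=False)
```

The resulting table depends only on the input bits, so it can be simplified and implemented as a logic function in its own right.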
The value of the redundant output is derived directly from the input lines.

There are two main types of parity checking in digital systems: odd parity (Po) and even parity (Pe). Either type can be used in our scheme, because parity checking is a matter of convention: it is sufficient that both sides agree on the convention used [21]. If we model the input and output sides of a logic circuit as the transmitter and receiver stations of a telecommunication system, parity checking is a conventional method for checking bit errors in such systems. In this technique, the transmitter sends an extra bit along with the data bits so that the receiver, after getting the data, can detect single-bit errors that may have occurred on the channel.

In our scheme, we first compute the truth table of the redundant output line as the even or odd parity of the output lines, and then express it as a function of the input bit combinations. After finding this function and simplifying it, we can implement it with the fewest logic gates needed. It is important not to use intermediate signals of the main circuit to design this redundant line. If intermediate signals are used, faults occurring before these branch points propagate to both the main outputs and the redundant parity signal, and may therefore go undetected.

If we assume even parity for the redundant output line, XORing this line with the main circuit outputs reveals the occurrence of an error. We call the output of this XOR gate the ERROR signal. A zero on this line means that no error has been detected, and a one indicates an error in some part of the circuit. With odd parity, the XOR gate becomes an XNOR gate, but the overall method remains the same. The framework of this scheme is shown in Fig. 1.

The TMR method is a conventional technique for designing fault-tolerant circuits.
This technique can mask a single fault by voting on the results. However, if multiple faults occur in the modules, this method is unable to mask or detect them. As a result, incorrect outputs are voted on and an error appears at the output of the circuit. Hence, TMR performs poorly when facing multiple faults. Replacing the conventional TMR modules with our proposed module can help to detect multiple faults by voting on the ERROR signals in addition to the other output lines. It is worth emphasizing that our proposed approach is capable of detecting all single faults that may occur in the logic circuit, whether they lie in the main part of the circuit or not.

Fig. 1: Overall new approach framework

4. A Case Study

ALUs are very important combinational circuit blocks that lie on the computational data path of processors. Their importance stems from being on the critical path. Typically, a key determinant of propagation delay in a processor is the maximum path length between source registers and destination registers [22]. This length depends on the number of ALU functional units (FUs) that lie on this computational path. The critical path of a processor should be designed as a fault-tolerant path in order to increase reliability.

A. Designing Fault-Detection Full Adder (FDFA)

Figure 2. Gate-level FDFA

To show the effects of our approach in practice, we used an FDFA, a logic circuit applicable to ALUs. The FA is a simple logic circuit used as a basic element in the design of many functional units, such as adders, subtracters, multipliers, and dividers. All circuits in our experiments are custom designed and laid out in 1.8 V, 0.5 µm CMOS technology and simulated using the HSPICE tool. This circuit uses a redundant part to provide an even parity bit for the outputs SUM and COUT, which is named E. E is a logic function of the FA input lines, named A, B and CIN.
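The behaviour expected of the FDFA can be checked with a short behavioural model. The sketch below is written in Python and is not the authors' gate-level design: the function names, the closed form used for E (derived here from the even-parity requirement), the XOR-based check signal, and the single-bit-flip fault model are all assumptions made for illustration.

```python
from itertools import product


def full_adder(a, b, cin):
    """Fault-free full adder: returns (SUM, COUT)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def parity_e(a, b, cin):
    """Even-parity bit for (SUM, COUT), computed from the inputs only.

    Assumed closed form: E is 0 exactly when all three inputs are equal.
    """
    all_ones = a & b & cin
    all_zeros = (1 - a) & (1 - b) & (1 - cin)
    return 1 - (all_ones | all_zeros)


def error_signal(s, cout, e):
    """XOR of the three output lines: 1 when their overall parity is odd."""
    return s ^ cout ^ e


INPUTS = list(product((0, 1), repeat=3))

# Fault-free operation: the check signal stays at 0 for every input pattern.
fault_free_ok = all(
    error_signal(*full_adder(*x), parity_e(*x)) == 0 for x in INPUTS
)


def single_flip_errors():
    """Check-signal values after flipping one output line at a time.

    Flipping a single output line is an assumed, simplified fault model.
    """
    results = []
    for x in INPUTS:
        s, cout = full_adder(*x)
        e = parity_e(*x)
        results.append(error_signal(s ^ 1, cout, e))  # fault on SUM
        results.append(error_signal(s, cout ^ 1, e))  # fault on COUT
        results.append(error_signal(s, cout, e ^ 1))  # fault on E
    return results
```

Under this model every single flipped output line raises the check signal, while a simultaneous flip of two lines leaves the parity even and would go unnoticed by this check alone.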
The value of this output line is selected so that the number of 1 bits on the three output lines is always even. TABLE I shows the truth table of the FDFA.

TABLE I: Truth table of FDFA

A   B   CIN   SUM   COUT   E
0   0   0     0     0      0
0   0   1     1     0      1
0   1   0     1     0      1
0   1   1     0     1      1
1   0   0     1     0      1
1   0   1     0     1      1
1   1   0     0     1      1
1   1   1     1     1      0

$$E = \overline{A \cdot B \cdot \mathrm{CIN} + \overline{A} \cdot \overline{B} \cdot \overline{\mathrm{CIN}}} = \overline{A \cdot B \cdot \mathrm{CIN}} \cdot \overline{\overline{A} \cdot \overline{B} \cdot \overline{\mathrm{CIN}}} \qquad (1)$$

After deriving the value of the E line for all input combinations, equation (1) follows directly. This equation shows that three NANDs and three NOTs in total are needed to implement the E line. In the next stage, we design the circuit with these gates. The derived gate-level design of the FDFA is shown in Fig. 2. We then draw the layout of this circuit in the L-Edit tool and obtain the waveforms of FDFA operation for some experimental inputs. Fig. 3 shows these waveforms for all input and output lines.

Figure 3. FDFA waveforms

B. Implementation Results

We evaluated the hardware and timing overheads of the FDFA in comparison with the NFA, the full adder without the redundant part. Next, by adding an XOR gate to the FDFA (giving the XFDFA), we compared the resulting overheads with those of the DWC method built from NFA modules. TABLE II shows the area, propagation delay, and dynamic power consumption overheads. The area overhead is about 27% (the mean of the 24% layout-area and 31% transistor-count overheads in TABLE II), which results from the 12 additional transistors in the FDFA design. In propagation delay, a very important factor in evaluating circuit specifications, the two designs are identical (54τ), because the worst-case delay is limited by the path that generates the SUM signal. In TABLE III, we compare XFDFA with DWC because both of them are capable of detecting a single fault.

TABLE II: NFA vs. FDFA

Method     Area: Layout   Area: # of Transistors   Delay   Power (fwT)
NFA        204*58 λ²      38                       54τ     2.68
FDFA       254*58 λ²      50                       54τ     3.86
Overhead   24%            31%                      0%      44%

TABLE III: DWC vs. XFDFA

Method        Area: Layout   Area: # of Transistors   Delay     Power (fwT)
DWC           339*215 λ²     92                       131.2τ    6.387
XFDFA         311*128 λ²     64                       179.8τ    4.981
Improvement   45%            30%                      -27%      22%

To implement DWC, we use two NFAs whose corresponding outputs are XORed together to check the results. Our XFDFA is designed with 28 fewer transistors than DWC and saves about 22% in dynamic power consumption (TABLE III), because the redundant component in DWC is larger than that in XFDFA. In delay, however, we are penalized by about 37% relative to DWC (179.8τ versus 131.2τ; TABLE III expresses the same difference as -27% relative to the XFDFA delay). The reason for this penalty is the two-level XOR in the final stage of XFDFA. Examining the arrangement of gates in each circuit at the logic level shows that both methods use a two-level combinational circuit at their output side, but the last gate in DWC (an OR gate) is faster than the last gate of XFDFA (an XOR gate).

5. Conclusions

In this paper, we proposed a new approach to designing fault-tolerant combinational circuits. The main idea behind the proposed approach is to use a redundant circuit that operates as an even parity generator for the main output lines. We also showed that traditional methods such as TMR can detect multiple faults in addition to single faults if they are combined with the proposed method. Experimental results obtained by testing the proposed approach on a full adder circuit show about 27.5% overhead in area and a 44% penalty in power consumption relative to the NFA, while the XFDFA shows about 37.5% improvement in area and 22% in power over the traditional DWC method. In propagation delay, the FDFA matches the NFA (TABLE II), whereas the XFDFA is slower than DWC (TABLE III).

References

[1] J. Srinivasan, S. V. Adve, P. Bose, and J. A. Rivers, "The impact of technology scaling on lifetime reliability," in Int. Conf. Dependable Systems and Networks, June 2004.
[2] K. Mohanram and N. A.
Touba, "Cost-Effective Approach for Reducing Soft Error Failure Rate in Logic Circuits," in International Test Conference (ITC), 2003, p. 893.
[3] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, "Modeling the effect of technology trends on the soft error rate of combinational logic," in Proc. Int. Conf. Dependable Systems and Networks (DSN), June 2002, pp. 389–98.
[4] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture," in Design Automation Conf., June 2003.
[5] K. Constantinides, S. Plaza, J. Blome, B. Zhang, V. Bertacco, S. Mahlke, T. Austin, and M. Orshansky, "Bulletproof: A defect-tolerant CMP switch architecture," in Int. Symp. High Performance Computer Architecture, February 2006.
[6] V. Narayanan and Y. Xie, "Reliability concerns in embedded system designs," Computer, IEEE Computer Society, Jan. 2006, 39(1):118–20.
[7] D. T. Franco, J. F. Naviner, and L. Naviner, "Yield and reliability issues in nanoelectronic technologies," Annals of Telecommunications, 2006, 61(11–12):1422–57.
[8] J. F. Ziegler and H. Puchner, "SER-History, Trends and Challenges," Cypress Semiconductor Corporation, 2004.
[9] V. Stojanovic, "A Cost-Effective Implementation of an ECC-Protected Instruction Queue for Out-of-Order Microprocessors," in Design Automation Conf. (DAC), 2006.
[10] S. Mukherjee, "Architecture Design for Soft Errors," Morgan Kaufmann Publishers, 2008.
[11] B. Johnson, "Design and Analysis of Fault-Tolerant Digital Systems," Addison-Wesley, Reading, MA, 1989.
[12] J. A. Blome, S. Gupta, S. Feng, and S. Mahlke, "Cost-efficient soft error protection for embedded microprocessors," in CASES, 2006, pp. 421–31.
[13] S. Mukherjee, J. Emer, and S. Reinhardt, "The Soft-Error Problem: An Architectural Perspective," in 11th Int. Symp. High Performance Computer Architecture (HPCA-11), 2005.
[14] R. K. Iyer and D. J. Rossetti, "A Measurement-Based Model for Workload Dependence of CPU Errors," IEEE Trans. Comp., vol. C-35, pp. 511–19, June 1986.
[15] Z. A. Obaid, N. Sulaiman, and M. N. Hamidon, "Developed Method of FPGA-based Fuzzy Logic Controller Design with the Aid of Conventional PID Algorithm," Australian Journal of Basic and Applied Sciences, 2009, 3(3):2724–40.
[16] K. Perkuszewski, K. T. Pozniak, W. Jalmuna, W. Koprek, J. Szewinski, and R. S. Romaniuk, "FPGA based Multichannel Optical Concentrator SIMCON 4.0," TESLA cavities LLRF Control System, Deutsches Elektronen-Synchrotron (DESY), Germany, 2007.
[17] T. Siriwan and P. Nilagupta, "HPGAST: High Performance GA-based Sequential circuits Test generation on Beowulf PC-Cluster," Pahonyothin Rd., Lardyao, Jatujak, Bangkok 10900, Thailand, 2002.
[18] F. Kocan and D. G. Saab, "Dynamic Fault Diagnosis of Combinational and Sequential Circuits on Reconfigurable Hardware," Journal of Electron Test, Springer Science, September 2007, 23:405–20.
[19] M. P. Baze and S. P. Buchner, "Attenuation of Single Event Induced Pulses in CMOS Combinational Logic," IEEE Trans. on Nuclear Science, vol. 44, no. 6, pp. 2217–22, December 1997.
[20] P. Liden, P. Dahlgren, R. Johansson, and J. Karlsson, "On Latching Probability of Particle Induced Transient in Combinatorial Networks," in 24th Symposium on Fault-Tolerant Computing (FTCS), pp. 340–49, June 1994.
[21] M. D. Chinn, "Survey Based Expectations and Uncovered Interest Rate Parity," University of Wisconsin, Madison and NBER, October 2009.
[22] A. Saberkari, A. Afzalikosha, and S. B. Shokouhi, "A new low voltage and low power CMOS one bit full-adder using GDI technique," in 14th Iranian Conference on Electrical Engineering (ICEE), Amirkabir University, Tehran, Iran, May 2006.