# Performance analysis of Low energy and high-speed DA-RNS based FIR filter design for SDR Applications on FPGA

<sup>1</sup>Mohan Kumar B. N., <sup>2</sup>Rangaraju H. G.

<sup>1</sup>Research Scholar, Govt. SKSJT Institute, Department of Electronics and Communication Engineering, Dept of ECE, RRIT, Affiliated to Visvesvaraya Technological University, Belagavi, Karnataka, India.

<sup>2</sup>Department of Electronics and Communication Engineering, Govt. SKSJT Institute, Affiliated to Visvesvaraya Technological University, Belagavi, Karnataka, India <sup>1</sup>mohankumarbn1@gmail.com and <sup>2</sup>rang raju@yahoo.com

Received: February 15, 2021. Revised: June 22, 2021. Accepted: July 19, 2021. Published: July 22, 2021.

Abstract—The Finite Impulse Response (FIR) filter is widely used in digital signal processing (DSP) applications. This article presents a Residue Number System (RNS)-based FIR filter design for Software Defined Radio (SDR) filtering. Owing to its inherent parallelism and its partitioning of data into independent residue channels, the RNS offers important advantages for FIR implementation. However, the multiple residue computations and the reverse conversion, together with the expanded bit width, result in a significant performance trade-off. This study proposes a filter-length-optimized RNS arithmetic that uses distributed arithmetic (DA)-based residue processing during RNS multiplication, followed by conditional delay-optimized reverse conversion, to reduce the trade-off between FIR filter performance and filter length. To carry out the reverse conversion and to store pre-computed values, the proposed RNS architecture uses the built-in RAM blocks found in field-programmable gate array (FPGA) devices. The proposed FIR filter with an optimized RNS core has the benefit of lowering processing latency while raising throughput. The efficiency of the FIR filter core is verified first through fetal audio signal detection, followed by FPGA hardware synthesis for different input word sizes and FIR lengths. The test results show that the optimized RNS method narrows the trade-off of the traditional RNS FIR filter with respect to filter size and substantially reduces complexity.

Keywords—RNS-Based Adder, Distributed Arithmetic based Multiplier, FIR, RNS, Low Energy-power product, FPGA, SDR.

#### I. INTRODUCTION

The modulo multiplier, a circuit-level arithmetic block, is perhaps the most important component of the residue number system (RNS) and has gained popularity. This investigation provides a systematic study of modulo $2^n-1$ multiplier design. Modulo multipliers can be implemented with non-linear VLSI components using a look-up table, i.e. a read-only memory (ROM). This approach was originally applied to short word lengths, but it can be extended to longer words. Memory-less multipliers improve on this, and algorithms such as Booth encoding are used to improve them further, exploiting the parallel and scalable nature of residue number systems. A highly developed $2^n-1$ modulo multiplier based on the redundant residue number system (RRNS) has also been proposed for very high dynamic ranges. The principles of $2^n-1$ modulo multipliers have thus been evaluated and tested for new modulo multiplier developments using architecture-level descriptions. RNS-based $2^n-1$ modulo multipliers are widely accepted as fast and convenient arithmetic circuits for signal processing uses such as image processing, finite impulse response filters, communication, cryptography, the discrete cosine transform, and digital signal processors. The residue number system is more advantageous than the traditional 2's complement system because it is a carry-free, unweighted number system [27]. An RNS is characterized by a set of pairwise co-prime integer moduli $m_1, m_2, m_3, \dots, m_k$, such that any integer $X$ is represented as $X = (x_1, x_2, x_3, \dots, x_k)$, where the $x_i$ are referred to as the residues of $X$ and $x_i = X \bmod m_i$ ($x_i = |X|_{m_i}$). The four primary elements of the RNS block diagram are depicted in Figure 1: forward and reverse converters, inter-modulo operations, and arithmetic channels.

The forward and reverse converters perform the inter-modulo operations. Reverse converters transform residue numbers into weighted numbers and, conversely, forward converters transform weighted numbers into residue numbers. Multiplication and addition are carried out by the arithmetic channels, which perform the intra-modulo operations [26].

Modulo multipliers relying on look-up tables were earlier used for modulo multiplication of shorter word lengths [12]. Because the ROM size grows substantially with the modulus size, this technique is impractical for reasonably large moduli. To perform multiplication between two decomposed numbers while reducing the ROM size, [13] proposes a $2^n-1$ modulo multiplier based on a cost-effective hardware look-up table that employs cyclic convolution. The look-up-table technique, however, remains suited only to moduli of fairly limited size. For medium and large moduli, [14] presents a memory-less $2^n-1$ modulo multiplier built from logic gates, adders, and multipliers. The binary multiplier requires substantially greater area, which slows down the process. [15] proposes and implements a non-encoded $2^n-1$ modulo multiplier that operates at high speed without a binary multiplier. Compared with the existing methodology, the circuit proposed in [15] has lower complexity. The Booth technique is then used to speed up the multiplier, and a radix-2 Booth-encoded multiplier is provided in [16]. In both [15] and [16], however, the number of partial products equals the number of input bits. The multiplier's speed depends on the number of partial products added. To reduce the number of partial products, [17-20] developed $2^n-1$ modulo multipliers relying on the Booth technique. For the radix-4 Booth-encoded multipliers proposed in [17-18], the partial products needed for $2^n-1$ modulo multiplication are reduced to $[(n+1)/2]$ and $n/2$. For the radix-8 Booth-encoded multiplier, the total number of partial products is reduced to $[(n-2)/3] + 1$, as shown in [19-20]; one disadvantage of this approach is the increased latency caused by generating the hard multiples required by radix-8 Booth encoding. [19] used an improved parallel-prefix and ripple-carry adder design to construct the hard-multiple generators. A redundant residue number system (RRNS) is applied at a higher level to avoid intra-carry propagation in the RNS [38].
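To illustrate why Booth encoding roughly halves the number of partial products, the following is a minimal sketch of ordinary radix-4 Booth recoding for a two's-complement operand. It is not the modulo $2^n-1$ variant of [17-18]; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical and chosen only for this illustration.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement integer into radix-4 Booth digits.

    Returns digits d_i in {-2, -1, 0, 1, 2}, least significant first,
    such that y == sum(d_i * 4**i).
    """
    if n % 2:
        n += 1                      # sign-extend to an even number of bits
    u = y & ((1 << n) - 1)          # n-bit two's-complement bit pattern
    bits = [(u >> i) & 1 for i in range(n)]
    digits = []
    prev = 0                        # implicit bit y_{-1} = 0
    for i in range(0, n, 2):
        # Each overlapping bit triplet (y_{i+1}, y_i, y_{i-1}) gives one digit.
        digits.append(-2 * bits[i + 1] + bits[i] + prev)
        prev = bits[i + 1]
    return digits


def booth_multiply(x, y, n):
    """Multiply x by the n-bit value y with one partial product per Booth digit."""
    return sum((d * x) << (2 * i)
               for i, d in enumerate(booth_radix4_digits(y, n)))


digits = booth_radix4_digits(7, 4)
print(digits, booth_multiply(13, 7, 4))
```

For the 4-bit operand 7 the recoding yields two digits, so the product needs two partial products instead of four. In hardware, each digit selects 0, ±x, or ±2x, all of which are obtainable by shifting and negation.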

#### II. LITERATURE SURVEY

Residue Number System (RNS): The residue number system has been proposed as a replacement for the weighted two's complement number system and has been one of the most challenging and attractive number systems in computer arithmetic for the last fifty years [4]. Several properties of residue number systems make them suitable for fast computer arithmetic. Cryptography and digital signal processing [5] are two applications of residue number systems in which subtraction, multiplication, and addition are the most significant arithmetic operations. In addition to these applications, the residue number system can detect and correct faults using the fault-tolerant and redundant RNS facilities given in [3]. The residue representation can also be used to send data safely and without errors in a communication system, with the data conveyed in the form of residues [4]. Because the residues are independent of each other, an error stays confined to the moduli channel over which that residue is sent. As a result, the properties of the residue number system are well suited to a wide range of wireless sensor networks. The Redundant Residue Number System can be used to detect and repair errors in a communication system by adding redundant moduli to an existing set of moduli [8].

Multi-rate techniques: In digital signal processing, several sampling frequencies are often used, and systems that perform such computations are known as multirate systems. Multi-rate methods have been shown to decrease the number of multipliers and adders required [1]-[2]. Accordingly, there are a variety of methods for improving multi-rate filters in digital signal processing. For example, [1] introduces a methodology for designing Mth-band FIR filters to improve a two-stage polyphase system for various integer sampling rate conversions, and it demonstrates that conversions by even factors are not as economical as conversions by odd factors. Wideband filters and differentiators are developed using a novel design methodology in [2]-[3], resulting in a significant reduction in complexity. This methodology uses two-rate systems that benefit from the Frequency Response Masking (FRM) technique to achieve the lowest computational cost and sharp transition bands. One of the most common uses of multi-rate systems is filter banks [4]-[7]. To achieve computationally efficient filter banks, the methodology in [4] uses the Fast Fourier Transform and its inverse. In the approach based on transmultiplexers and cosine-modulated filter banks [5], the Interpolated Finite Impulse Response technique is used to construct the prototype filter. Nature-inspired meta-heuristic algorithms were used in [6]-[7] to introduce a strategy for optimizing filter banks. To obtain computationally efficient approaches, the integer conversion factor $D$ is factorized into $q$ factors, which divides the interpolation or decimation procedure into $q$ stages. Take the case where $q = 2$ and $D = M \times R$.
Traditionally, the Cascaded Integrator-Comb (CIC) design is used, which is substantially more efficient in terms of chip area but consumes a lot of power because the integrators run at the high sampling rate. As a result, a decimation system based on a multistage comb is commonly used. The value of $q$ is taken as 3 (that is, $M = M_1 M_2$) in the methods of [8]-[9], while $q$ is not less than 3 in the designs of [10]-[12], where $D$ is restricted to a power of 2 or 3. The first-stage filter is constructed using non-recursive multistage designs, which reduces power consumption while increasing computational time. It has been demonstrated that FIR filter design can be made more efficient by using simple sub-filters. Compared with direct design procedures, decomposing the entire filter into simple sub-filters yields filters with fewer arithmetic operations and a narrow transition band. As a result, these strategies are widely used in applications that require low computational complexity [11]. Because of its capabilities, the FRM approach has attracted a lot of interest in digital filter design. The fundamental building blocks of the FRM approach are masking filters and model filters. The model filters are referred to as sparse filters because of their many zero-valued coefficients. The sparse filters provide the complete filter's transition band as well as images of the undesired frequency response. Masking sub-filters remove these undesired frequency-response images. The most recent improvements in FRM methodology are given in [23]. An FRM-based architecture is described in [13], where the model filter is used in a hybrid form, allowing very few hardware resources to be used and reducing the complexity of the critical path. [14] describes a combination model based on convex-concave quality management. The Frequency Transformation (FT) method, which aims to create linear-phase Type I FIR filters with very small error in the stopbands and passbands and a narrow transition band, is one of the effective sub-filter-based algorithms. The complete filter is made up of interconnected, cascaded identical sub-filters. This interconnection contains the structural parameters, which appear in parallel with the sub-filters. With the use of a mapping such as the sub-filter amplitude response, the prototype filter's amplitude response is mapped onto bands that yield the structural parameters. A methodology for constructing Hilbert transformers was recently presented in [15], in which the FT technique is applied at nested levels and very few multipliers are needed. [16] proposes an approach for FIR filters based on a combined perspective of frequency transformation, in which the entire filter frequency response is considered as a function and constructed using fewer similar characteristics.
Sharpening approaches, which are used to construct filters with improved magnitude characteristics, are a recent research topic that is particularly useful in comb-based decimation filters (i.e., CIC-based architectures). Further strategies, such as the ACFs described in [24] and more recently in [28], complement the aforementioned sharpening strategies. These ACFs can be calculated with the use of refinement, which cannot be done using activities organized. Saramaki-Ritoniemi proposed an appropriate structure for sharpened CIC decimators in [28], which is used as the foundation for the other sharpened CIC decimators. The constants are represented using Canonical Signed Digit (CSD), binary, or Minimal Signed Digit (MSD) representations across several Common Sub-Expression Elimination (CSE) approaches, including [25]. Because these strategies rely on the number representation, one of their drawbacks is that they produce suboptimal results.

The delay D has been eliminated in the methodology provided in [5] to produce a minimum-phase (MP) equalizer. As a result, the first method is to use the identical strategy provided in [5] to create a Finite Impulse Response equalizer. Aside from the technique in [5], numerous additional design techniques for MP FIR filters have been developed, as shown in [6]-[8]. A drawback of all of these methods, however, is that the resulting filters require multipliers, which are the most expensive elements in digital filters [1]. To address this, the cascaded expanded CSCF is used as a prefilter so that the whole MP FIR filter is implemented with CSCFs and without multipliers. Multiplication by constants is a common operation in digital signal processing systems, and it is regarded as costly in terms of hardware area and power consumption. Consequently, Multiple Constant Multiplication (MCM) and Single Constant Multiplication (SCM) operations are implemented using only subtractions, shifts, and additions [29]. For the shift-and-add technique, theoretical lower bounds on the number of adders and on the depth levels (the critical path, i.e. the maximum number of serially connected adders) in SCM, MCM, and other constant multiplication blocks have been derived assuming two-input adders. [4] provided new constraints and tighter lower bounds for single constant multiplication, such as the need for extra adders to keep the depth levels low. Even so, theoretical lower bounds have not been established for constant multiplication blocks built from multiple-input additions and subtractions with pipeline registers, although operations of this type must be taken into account when pipelined constant multiplication blocks are deployed on Field Programmable Gate Arrays.
The cost of pipelining is greatly reduced because FPGA logic blocks contain memory elements [5]-[12]. Adders with three inputs have received a lot of attention in recent years because current FPGA generations feature larger logic blocks, allowing far more complex adders to be fitted into the same computing resources [10]-[12]. In the last two decades, various high-level synthesis techniques for building multiplierless constant multiplication blocks have been developed. The usual objective function minimized in these techniques is the number of additions and subtractions needed to implement the multiplications. This design objective, however, is considered to have a severe disadvantage in terms of energy and speed [13]-[18]. Field-programmable gate arrays [5]-[10], [22]-[25] and Application-Specific Integrated Circuits [20]-[21] have both been the subject of substantial study. The main aim of this framework is to reduce the number of arithmetic operations, which lowers the depth levels.
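The shift-and-add idea behind multiplierless constant multiplication can be sketched as follows. This is a minimal illustration, assuming a non-negative integer constant and the Canonical Signed Digit (CSD) representation mentioned above; the names `to_csd` and `shift_add_multiply` are hypothetical, and the sketch does not reproduce any specific MCM or SCM algorithm from the cited works.

```python
def to_csd(c):
    """Canonical signed digit form of an integer c >= 0.

    Returns digits in {-1, 0, 1}, least significant first, with no two
    adjacent non-zero digits.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)   # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(to_csd(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


# 7 = 8 - 1, so x * 7 needs one subtraction instead of two additions.
print(to_csd(7), shift_add_multiply(5, 7))
```

The number of non-zero CSD digits minus one gives the adder count for a single constant under this simple scheme, which is why the result depends on the chosen number representation.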

#### III. PROPOSED METHODOLOGY OF RRNS FOR FIR FILTER DESIGN

Beyond cryptography and digital signal processing, where subtraction, multiplication, and addition are regarded as the most significant arithmetic operations, residue number systems can detect and correct errors through fault-tolerant and redundant RNS facilities. The residue number system can also be used to convey data in a safe and error-free manner in data transmission, with the information sent as residues. Because the system's residues are not interdependent, an error stays confined to the moduli channel over which that residue is transmitted. As a result, the residue number system's properties are well suited to a wide range of wireless sensor network applications. By combining redundant moduli with an existing set of moduli, the Redundant RNS (RRNS) can be used to detect and correct errors in a communication system.

The finite impulse response (FIR) filter is one of the most commonly used building blocks in digital signal processing applications. FIR filters are typically implemented with a binary number system, which incurs considerable latency. Higher-order FIR filters, in particular, introduce a significant delay because of the large number of multiplications and additions. The FIR filter speed can be increased by using the Residue Number System, since it performs addition and multiplication without carry propagation between channels. Modulo multiplication is an intra-modulo operation; as a result, there is no carry propagation across modulo channels. To eliminate intra-channel carry propagation from arithmetic operations in residue number systems, and because it is suited to a very high dynamic range, a redundant encoding was created. The redundant residue number system (RRNS) can therefore be explored to achieve an effective modulo multiplier in the field of digital signal processing.

**Contributions:** The following are the aspects that are highlighted in this academic research:

- The Minimum Phase (MP) property is established analytically for the filter built with the sharpening approach based on Chebyshev polynomials and cascaded cosine sub-filters. Expanded and cascaded Chebyshev-sharpened cosine filters are shown to be MP filters, and they have lower group delay than traditional cascaded cosine filters for the same magnitude characteristics. The group delay can be improved further at the cost of more hardware resources. As a short-delay decimation filter, though, the proposed methodology uses fewer hardware resources and has low complexity.
- A comb-based decimator consists of an area-efficient structure driven by a generalized embedded Chebyshev-sharpened unit. The proposed approach improves the worst-case aliasing rejection of comb filters while maintaining low computational complexity, much lower power consumption, and fewer costly resources. Regularity is one of the distinctive and required characteristics of the proposed structure, which is not found in other comb-based schemes.
- By applying the Hartnett-Boudreaux sharpening technique, a comb-based decimation filter is introduced to improve the magnitude response. Improved sharpening is a technique for enhancing the worst-case attenuation and correcting the droop in the passband. Sharpening coefficients are expressed as a Sum of Powers of Two (SPT), resulting in a multiplier-free approach.
- The comb-based decimation structure is split into several stages and is based on Hartnett-Boudreaux sharpening. When the downsampling factor is a power of two, the non-recursive comb-based decimation design is used; for larger composite downsampling factors, two and three stages are used, with the non-recursive architecture in the first stage and the recursive design in the later stages.
- An existing theoretical lower bound is applied to the number of operations required for pipelined constant multiplication blocks. The constant multiplications are built using the shift-and-add technique with every operation pipelined; in this simplified structure, pipelined additions or subtractions with n inputs are possible using pure pipelining registers. This requirement is important because such operations are available in the most recent Field Programmable Gate Array families, which are widely regarded as the most important platform for implementing digital signal processing algorithms, and because it leads to minimal hardware and low-cost constant multiplication units.
- Numerous studies have shown that the residue number system outperforms other number systems in terms of hardware consumption. We propose a DA-based hypothetical Residue Number System MAC unit for FIR filter design in this paper, evaluated on the following points:
  - Over a large variety of moduli sets, a Distributed Arithmetic-based residue computing unit gives significant solution quality.
  - It can be used to relax the trade-off restrictions of the typical RNS system in higher-order FIR architectures.
  - With direct RAM-based data access, aggressive reverse conversion at the final stage can improve the overall speed.
#### A. RNS system

Arithmetic in the residue number system is carried out using a preset moduli set consisting of pairwise co-prime integers. The range of input operands that the RNS can represent without truncating the results is determined by the moduli values and the number of moduli used. An RNS is employed for a specific dynamic range: regardless of the arithmetic performed, a system with dynamic range $M$, the product of the moduli, can handle results in the range $[0, M-1]$. Furthermore, during RNS computation, each modulus and its accompanying computations are performed as independent channels in $L$ parallel paths, which significantly reduces path propagation latency. In the proposed method, this path delay is further improved relative to current designs by using modified parallel-prefix-adder-based accumulation within the RNS system, as illustrated in Fig. 1 [13,14].
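The channel-wise nature of RNS arithmetic can be sketched in a few lines. This is a minimal illustration, assuming the small moduli set (7, 8, 9) that appears in Algorithm 1; the helper names (`dynamic_range`, `to_residues`, `rns_add`, `rns_mul`) are hypothetical and do not describe the hardware architecture of the paper.

```python
from math import gcd, prod

MODULI = (7, 8, 9)  # assumed example set; pairwise co-prime


def dynamic_range(moduli):
    """Return M, the product of the moduli, after checking co-primality."""
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError("moduli must be pairwise co-prime")
    return prod(moduli)


def to_residues(x, moduli=MODULI):
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carry crosses from one channel to another."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication, each channel reduced by its own modulus."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


M = dynamic_range(MODULI)              # results must stay within [0, M-1]
a, b = to_residues(14), to_residues(20)
print(M, a, b, rns_mul(a, b), to_residues(14 * 20))
```

Because 14 × 20 = 280 is below the dynamic range of 504, the channel-wise product and the residues of the true product coincide. A result outside $[0, M-1]$ would wrap around modulo $M$.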

**Moduli conversion:** The considered moduli $\{m_1, m_2, \dots, m_n\}$ and the associated residues $\{r_1, r_2, \dots, r_n\}$ are related to the input operand $X$ as given in the equations below:

$$X = \sum_{j=0}^{4p-1} X_j 2^j = M_3 2^{3p} + M_2 2^{2p} + M_1 2^p + M_0 \qquad (1)$$

where $M_3 = \{X_{3p+k-1} \dots X_{3p}\}$ for $1 \le k \le p$ and $M_3 = 0$ for $k = 0$, while $M_2$, $M_1$, and $M_0$ are the successive $p$-bit fields of $X$ below it.

Equation (1) is the basis of the binary-to-RNS forward conversion: $X$ is an integer in the range $[0, N-1]$, and $\{x_1, x_2, x_3\}$ are its residues for the moduli set $\{m_1, m_2, m_3\} = \{2^p - 1, 2^{p+k}, 2^p + 1\}$, where $N = m_1 m_2 m_3$.

The $2^{p+k}$ channel: Forward conversion for $m_2 = 2^{p+k}$ is straightforward. The residue $x_2$ is the remainder of dividing $X$ by $2^{p+k}$, which is obtained by truncating $X$ to its $p+k$ least significant bits, as in equation (2).

$$x_2 = |X|_{2^{p+k}} = X_{p+k-1} \dots X_0 \qquad (2)$$

The $2^p - 1$ channel: The residues for the $2^p - 1$ and $2^p + 1$ channels are more complex because they depend on all bits of $X$. To minimize area and increase speed, they are computed by a sequence of additions, as shown in equation (3).

$$x_1 = |X|_{2^p - 1} = \left| M_3 2^{3p} + M_2 2^{2p} + M_1 2^p + M_0 \right|_{2^p - 1} \qquad (3)$$

Since $2^p \equiv 1 \pmod{2^p - 1}$, equation (3) can be rewritten as

$$x_1 = \left| M_3 + M_2 + M_1 + M_0 \right|_{m_1}$$

Therefore, the forward conversion of $x_1$ for modulus $2^p - 1$ can be performed simply by modulo $2^p - 1$ addition.

The $2^p + 1$ channel: The $2^p + 1$ residue is computed by an identical procedure, as shown in equation (4).

$$x_3 = |X|_{2^p + 1} = \left| M_3 2^{3p} + M_2 2^{2p} + M_1 2^p + M_0 \right|_{2^p + 1} \qquad (4)$$

Since $2^p \equiv -1 \pmod{2^p + 1}$, equation (4) can be simplified as

$$x_3 = \left| -M_3 + M_2 - M_1 + M_0 \right|_{m_3}$$

All input operands and moduli are kept within the appropriate range, based on a dynamic range determined before computation, and the modulo reduction is applied to both forms of equation (4).
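The three channel computations in equations (2) to (4) can be checked numerically. The sketch below assumes the moduli set $\{2^p - 1, 2^{p+k}, 2^p + 1\}$ with $0 \le k \le p$ and splits $X$ into $p$-bit fields $M_0 \dots M_3$; the function name `forward_convert` is hypothetical, and the final `%` reductions stand in for the modular adders a hardware design would use.

```python
def forward_convert(x, p, k=0):
    """Binary-to-RNS conversion for the moduli {2^p - 1, 2^(p+k), 2^p + 1}.

    The operand is split into p-bit fields M0..M3 and the three residues
    are formed with additions and subtractions only, as in equations (2)-(4).
    """
    if not 0 <= k <= p:
        raise ValueError("k must satisfy 0 <= k <= p")
    m1, m2, m3 = (1 << p) - 1, 1 << (p + k), (1 << p) + 1
    if not 0 <= x < m1 * m2 * m3:
        raise ValueError("operand outside the dynamic range")

    mask = (1 << p) - 1
    M0, M1, M2, M3 = [(x >> (i * p)) & mask for i in range(4)]

    x1 = (M3 + M2 + M1 + M0) % m1      # 2^p is congruent to +1 modulo 2^p - 1
    x2 = x & (m2 - 1)                  # truncate to the p + k low bits
    x3 = (-M3 + M2 - M1 + M0) % m3     # 2^p is congruent to -1 modulo 2^p + 1
    return x1, x2, x3


# p = 3, k = 0 gives the moduli set (7, 8, 9).
print(forward_convert(14, 3), forward_convert(20, 3))
```

With $p = 3$ and $k = 0$ the function reproduces the residues (0, 6, 5) and (6, 4, 2) listed for the operands 14 and 20 in Algorithm 1.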

The basic FIR filter function is given by

$$y(n) = \sum_{k=0}^{M-1} h(k)\, x(n-k) \qquad (1)$$

where $k$ is the tap index and $M$ is the filter length; in this work, $k$ runs from 0 to 63. This FIR equation suggests a direct-form Finite Impulse Response filter structure that uses fewer registers, as shown in Figure 2. The Residue Number System-based RFIR shown in Fig. 1 contains the structure of Fig. 2 as a sub-block.

Standard FIR filter architectures use a binary number system for the adders and multipliers, which results in greater propagation and net delays and restricts the operating speed. To overcome these drawbacks, the proposed Residue Number System-based FIR filter implementing the FIR equation above uses a faster modified Parallel Prefix Adder (PPA) that avoids carry bit propagation; the results of the existing Parallel Prefix Adder and the modified Parallel Prefix Adder are shown in Table 1.
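The inner product in the FIR sum can be evaluated without multipliers by distributed arithmetic (DA), the technique named in the title. The following is a generic DA sketch for one residue channel, not the paper's circuit: it assumes unsigned input samples of `bits` bits, fixed coefficients already reduced modulo `m`, and a look-up table with one entry per combination of input bits. The names `build_da_lut` and `da_inner_product_mod` are hypothetical.

```python
def build_da_lut(coeffs, m):
    """Pre-compute, for every bit pattern, the sum of the selected coefficients mod m."""
    taps = len(coeffs)
    lut = [0] * (1 << taps)
    for addr in range(1, 1 << taps):
        low = addr & -addr                 # lowest set bit of the address
        k = low.bit_length() - 1           # tap index selected by that bit
        lut[addr] = (lut[addr ^ low] + coeffs[k]) % m
    return lut


def da_inner_product_mod(samples, lut, bits, m):
    """Compute sum(h[k] * x[k]) mod m bit-serially, using table look-ups only."""
    acc = 0
    for b in reversed(range(bits)):        # most significant bit first
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k    # gather bit b of every sample
        acc = (2 * acc + lut[addr]) % m    # shift-accumulate in the channel
    return acc


coeffs = [3, 5, 2, 6]                      # example taps for the modulus-7 channel
lut = build_da_lut(coeffs, 7)
samples = [5, 1, 7, 2]
direct = sum(h * x for h, x in zip(coeffs, samples)) % 7
print(da_inner_product_mod(samples, lut, bits=3, m=7), direct)
```

The table grows as $2^K$ for $K$ taps, so practical designs usually split a long filter into several smaller tables. How the proposed architecture partitions its tables is not specified in this section.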

#### B. Memory efficient post computation

Following the residue arithmetic, reverse conversion is performed as a post-computation step to convert the residue number back to an integer. This method uses values with a limited range determined by the size of each modulus.



Fig.3. RNS based Reconfigurable FIR filter design for audio signals.



Fig.1. Existing Residue Number System-based Finite Impulse Response filter [14], which uses a Parallel Prefix Adder with less delay and a Residue Number System-based multiplier.



Fig.2. Overall diagram of the proposed Residue Number System Ternary Parallel Prefix Adder based Finite Impulse Response filter for Software Defined Radio applications with high speed and low energy usage.

All possible outcomes are calculated ahead of time and saved in memory as directly accessible blocks for the reverse conversion operation. Since each of these memory units is mapped onto dedicated on-board block RAMs during hardware synthesis, the hardware complexity of the reverse conversion unit is reduced. Compared with several other algorithms, this on-chip memory not only reduces cost but also saves time by minimizing access latency.
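Memory-based reverse conversion can be illustrated with a small table. The sketch below assumes, for illustration only, a single table holding every residue combination of the moduli set (7, 8, 9); a Python dictionary stands in for the block RAM, and the names `build_reverse_table` and `crt_reverse` are hypothetical. The Chinese Remainder Theorem routine is included only as a cross-check of the table contents.

```python
from math import prod


def build_reverse_table(moduli):
    """Pre-compute the integer for every residue tuple (stands in for block RAM)."""
    M = prod(moduli)
    return {tuple(x % m for m in moduli): x for x in range(M)}


def crt_reverse(residues, moduli):
    """Reference reverse conversion using the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(..., -1, m): modular inverse (Python 3.8+)
    return total % M


moduli = (7, 8, 9)
table = build_reverse_table(moduli)
# Residues of the product 14 * 20 from Algorithm 1.
print(table[(0, 0, 1)], crt_reverse((0, 0, 1), moduli))
```

A look-up replaces the multiplications and the final modulo-$M$ reduction of the CRT with a single memory read, at the cost of a table with $M$ entries. For large dynamic ranges the table would have to be partitioned, which this sketch does not attempt.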

**Algorithm 1:** Proposed Ternary-RNS based FIR filter

**Input:** Sampled audio signal, moduli set $(m_1, m_2, m_3, \dots, m_n)$

**Output:** Filtered sampled audio signal $y(n-k)$

Start:

Process 1: RNS computation

- Define the moduli set: (7, 8, 9)
- Input operands for the RNS multiplier: A = 14 and B = 20
- Input to the FIR filter: sampled audio signal (x)

Process 2: Forward conversion for RNS computation

- Let $Ar_i$ be the modulo operation between operand A and modulus $m_i$ of the moduli set, and likewise $Br_i$ for operand B
- $Ar_1 = A \bmod m_1$; $Ar_2 = A \bmod m_2$; $Ar_3 = A \bmod m_3$
- $Br_1 = B \bmod m_1$; $Br_2 = B \bmod m_2$; $Br_3 = B \bmod m_3$
- Computing $Ar_1, Ar_2, Ar_3$ gives Input Residual 1 = (0, 6, 5), and computing $Br_1, Br_2, Br_3$ gives Input Residual 2 = (6, 4, 2)

Process 3: Residue computation for multiplication, where * is the RNS-based multiplier

- Compute $r_1 = (Ar_1 * Br_1) \bmod m_1 = 0$
- Compute $r_2 = (Ar_2 * Br_2) \bmod m_2 = 0$
- Compute $r_3 = (Ar_3 * Br_3) \bmod m_3 = 1$
| Output of the process 3 i.e Residue Components are $(0,0,1)$ |                                           |                 |                                     |             |                           |  |  |  |
| Process 4:Residue Computation for Adder (+) where + is       |                                           |                 |                                     |             |                           |  |  |  |
| Ternary-RNS based adder                                      |                                           |                 |                                     |             |                           |  |  |  |
| Compute $r_1 =$                                              | $= (Ar_1 + B)$                            | $(r_1)moo$      | $l(m_1) = 6$                        |             |                           |  |  |  |
| Compute $r_2 =$                                              | $Compute r_2 = (Ar_2 + Br_2)mod(m_2) = 2$ |                 |                                     |             |                           |  |  |  |
| $Compute r_3 = (Ar_3 + Br_3)mod(m_3) = 7$                    |                                           |                 |                                     |             |                           |  |  |  |
| Output of the process 4 i.e Residue Components are (6,2,7)   |                                           |                 |                                     |             |                           |  |  |  |
| Process 5: Reverse Conversion for RNS computation            |                                           |                 |                                     |             |                           |  |  |  |
| Inputs: Modu                                                 | li set and i                              | nvM1, i         | nvM <sub>2,</sub> invM <sub>3</sub> |             |                           |  |  |  |
| Outputs: RON                                                 | M1, ROM2                                  | and R           | DM3                                 |             |                           |  |  |  |
| Where RC                                                     | OM1 st                                    | ores            | computationa                        | ıl of       | $(M_1 *$                  |  |  |  |
| $invM_1$ )mod                                                | $(m_1)=1$                                 |                 |                                     |             |                           |  |  |  |
| Where RC                                                     | DM2 st                                    | ores            | computationa                        | l of        | $(M_2 *$                  |  |  |  |
| $invM_2$ )mod                                                | $(m_2)=1$                                 |                 |                                     |             |                           |  |  |  |
| Where RC                                                     | DM3 st                                    | ores            | computationa                        | l of        | ( <i>M</i> <sub>3</sub> * |  |  |  |
| invM <sub>2</sub> )mod                                       | $(m_2)=1$                                 |                 | 1                                   |             |                           |  |  |  |
| 3)                                                           | Compute                                   | $M_1 =$         | $m_2 * m_2$                         |             |                           |  |  |  |
|                                                              | Comput                                    | $M_{c} =$       | $m_1 * m_2$                         |             |                           |  |  |  |
| Compute $M_2 = m_1 * m_3$                                    |                                           |                 |                                     |             |                           |  |  |  |
|                                                              | Comput                                    | $d_{3} =$       | $m_1 \uparrow m_2$                  | till to get | (M                        |  |  |  |
| in a M ) and J                                               | $(m)^{-1}$                                | <i>z uo</i> 1te | native process                      | s un to get | ( <i>M</i> <sub>n</sub> * |  |  |  |
| invin <sub>n</sub> )mod                                      | $(m_n)^{=1}$                              |                 |                                     | 1           |                           |  |  |  |
|                                                              | $ M_1 $                                   | * INVM          | $ m_1 moa(m_1) $                    | -1          |                           |  |  |  |
|                                                              |                                           |                 |                                     |             |                           |  |  |  |

$$|M_2 * invM_2|mod (m_2)=1 |M_3 * invM_3|mod (m_3)=1 At M_1 = 72, M_2 = 63 and M_3 = 56 we will$$

get above three equations output is 1 Process 6: Final Reverse Conversion for RNS computation RNS output = $(|M_1 * invM_1 * r_1| + |M_2 * invM_2 * r_2| + |M_3 * invM_3 * r_3|)mod(m_1 * m_2 * m_3)$ RNS output=|280|mod 504=280i.e 14\*20=280 hence it is proved end:
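
The worked example above can be reproduced with a short software sketch. This is only an illustrative model, not the hardware design: it assumes the moduli set (7, 8, 9), which is implied by $M_1 = 72$, $M_2 = 63$ and $M_3 = 56$, and the function names (`forward`, `rns_mul`, `rns_add`, `crt_constants`, `reverse`) are hypothetical. The modular inverses are obtained with Python's built-in `pow` instead of the iterative search, and no ROM is modelled.

```python
from math import prod

# Assumed moduli set (2^n - 1, 2^n, 2^n + 1) with n = 3, implied by M1 = 72, M2 = 63, M3 = 56.
MODULI = (7, 8, 9)


def forward(x, moduli=MODULI):
    """Process 2: forward conversion of an integer into its residues."""
    return tuple(x % m for m in moduli)


def rns_mul(a, b, moduli=MODULI):
    """Process 3: channel-wise modular multiplication of two residue tuples."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def rns_add(a, b, moduli=MODULI):
    """Process 4: channel-wise modular addition of two residue tuples."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def crt_constants(moduli=MODULI):
    """Process 5: return (M_i, invM_i) with (M_i * invM_i) mod m_i == 1."""
    total = prod(moduli)
    constants = []
    for m in moduli:
        big_m = total // m
        inv = pow(big_m, -1, m)  # modular inverse, available from Python 3.8
        constants.append((big_m, inv))
    return constants


def reverse(residues, moduli=MODULI):
    """Process 6: reverse conversion (Chinese Remainder Theorem)."""
    acc = sum(big_m * inv * r
              for r, (big_m, inv) in zip(residues, crt_constants(moduli)))
    return acc % prod(moduli)


a_res = forward(14)  # (0, 6, 5)
b_res = forward(20)  # (6, 4, 2)
print(reverse(rns_mul(a_res, b_res)))  # 280
print(reverse(rns_add(a_res, b_res)))  # 34
```

The result is only valid while the true product or sum stays below the dynamic range $m_1 \times m_2 \times m_3 = 504$.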

## C. DA based residue computation using approximated speculation

Including speculation during accumulation bounds the overall path delay propagation through appropriate carry approximation. As shown in [10], it is used to execute multiplication as a sequence of additions. Multiplier-less DA arithmetic is therefore combined with the most suitable pre-processing units to restrict the carry propagation path and keep the critical path delay constant using prior computations. Because of its inherently low complexity, the accumulation speed is high without any inner-stage pipelining units. A multiply-accumulate (MAC) unit built with a speculative, delay-optimized accumulation unit helps meet the demands of FIR filter design while also reducing the performance penalty gap that arises when FIR taps are extended. Furthermore, in contrast to conventional parallel prefix computation methods, the speculative units compute the accumulation in identical blocks, resulting in substantial path delay reduction. The advantages of implementing an FIR filter using the Residue Number System with Distributed Arithmetic-based arithmetic are an improved data rate with inherent parallelism, modularity, and path-optimized speculation. As shown in Figure 2, the entire MAC operation is performed using RNS in each FIR tap. The MAC operation is the core operation in FIR filters, and it combines the speculation features of the prefix topology with the RNS method to achieve high performance [9].
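
As a rough illustration of multiplier-less distributed arithmetic inside one residue channel, the sketch below computes an inner product modulo $m$ from a precomputed look-up table and shift-accumulate steps only. It is a generic textbook-style DA model under stated assumptions (unsigned input samples, a full $2^K$-entry table for $K$ taps); the names `build_da_lut` and `da_mac_mod` are hypothetical, and it does not model the speculative accumulation unit of the proposed design.

```python
def build_da_lut(coeffs, modulus):
    """Precompute LUT[addr] = sum of coeffs[k] over the set bits k of addr, mod modulus."""
    size = 1 << len(coeffs)
    lut = [0] * size
    for addr in range(1, size):
        low = addr & -addr        # lowest set bit of the address
        k = low.bit_length() - 1  # index of the coefficient it selects
        lut[addr] = (lut[addr ^ low] + coeffs[k]) % modulus
    return lut


def da_mac_mod(samples, lut, modulus, word_bits):
    """Return sum(coeffs[k] * samples[k]) mod modulus using only LUT reads,
    shifts and additions. Samples are assumed unsigned and below 2**word_bits."""
    acc = 0
    for bit in range(word_bits - 1, -1, -1):  # most significant bit-plane first
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k     # gather one bit from every sample
        acc = ((acc << 1) + lut[addr]) % modulus  # shift-accumulate step
    return acc


coeffs = [3, 5, 2, 6]       # illustrative filter coefficients
samples = [14, 20, 7, 255]  # illustrative 8-bit input samples
for m in (7, 8, 9):
    print(m, da_mac_mod(samples, build_da_lut(coeffs, m), m, word_bits=8))
```

The table grows exponentially with the number of taps, so practical DA designs usually split long filters into smaller groups of taps, each with its own table.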

#### D. Delay optimization

Carry-approximated accumulation and speculation-guided reverse conversion mitigate the main limitations that come with a longer FIR impulse response in RNS multiplication. Consequently, an image processing device is used to correct the experimental error in the DA-based successive stage-wise shifting operations during FIR computation. As seen in Table 4, both corrective feedback and reverse optimization play a significant role in achieving optimal output in terms of background subtraction and critical path removal in the Residue Number System method. As shown in Table 3, the FIR order improves the system performance of Distributed Arithmetic-based residue processing in terms of both background subtraction and performance retention.
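
To make the idea of carry-approximated accumulation concrete, the following sketch models one common form of carry speculation, in which the carry into each bit is predicted from only a fixed window of lower bits. This is an assumption chosen for illustration (a generic windowed-carry adder), not the circuit used in this work, and `speculative_add` is a hypothetical name. It shows why the carry chain, and hence the critical path, is bounded by the window length, and why a correction step is needed when a longer carry chain occurs.

```python
def speculative_add(a, b, width, window):
    """Add two unsigned width-bit words; the carry into bit i is speculated
    from only the `window` bit positions directly below it."""
    result = 0
    for i in range(width):
        lo = max(0, i - window)
        span = i - lo
        mask = (1 << span) - 1
        # Carry out of the truncated slice [lo, i), assuming carry-in 0 at lo.
        carry = (((a >> lo) & mask) + ((b >> lo) & mask)) >> span
        bit = ((a >> i) ^ (b >> i) ^ carry) & 1
        result |= bit << i
    return result


# With a window as wide as the word, the adder is exact.
print(speculative_add(100, 27, width=8, window=8))         # 127
# A carry chain longer than the window produces a speculation error.
print(speculative_add(0b01111111, 1, width=8, window=2))   # 120 instead of 128
```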

#### IV. RESULTS AND DISCUSSION

The performance metrics of the speculative Residue Number System FIR design are validated using experimental results from various sets of moduli. The design is described in Verilog HDL and its functionality is verified using ModelSim simulation. For synthesis and place & route, the metrics are analyzed on Artix-7 FPGA hardware, and the synthesis summary and resource usage are given in Tables 1 and 2. According to the test results, the speculation-based Distributed Arithmetic design brings a greater path delay reduction with a limited level of resource utilization. The superiority of the speculative Distributed Arithmetic based Residue Number System design across different FIR filter configurations was demonstrated by evaluating the computation overhead as logic element utilization during hardware synthesis.

An FIR filter based on the Residue Number System (RNS) is proposed. The chosen moduli set offers the advantage of a shift-and-add approach. The proposed filter design is compared with a previously proposed version of a reconfigurable RNS FIR filter. The filters are synthesized using Cadence RTL Compiler in UMC 90 nm technology. The performance of the filters is analyzed in terms of Area (A), Power (P), and Delay (T). The proposed approach is also verified functionally using FPGA DSP Builder.

FIR filters are routinely used in the implementation of modern digital signal processing systems. Their efficient implementation using commercially available VLSI technology is a subject of continuous study and development. This paper presents the residue number system (RNS) implementation of reduced-complexity and high-performance FIR filters, using modern Altera APEX20K field-programmable logic (FPL) devices. Index arithmetic over Galois fields and the Quadratic Residue Number System (QRNS), along with the selection of a small word-width modulus set, are the keys to attaining low complexity and high throughput in real and complex FIR filters. RNS-FPL merged FIR filters demonstrated their superiority when compared with 2C (two's complement) filters, being about 65% faster and requiring fewer logic elements (LEs) in most of the cases studied. Special attention was paid to an efficient implementation of the multi-operand modulo adders. Replacing a classical modulo adder tree by a binary adder with extended precision followed by a single modulo reduction stage reduced area requirements by 10% for a 32-tap FIR filter. On the other hand, an index arithmetic QRNS-based complex FIR filter yielded up to 60% performance improvement over a three-multiplier-per-tap 2C filter, while requiring fewer LEs for filters with more than eight taps. In particular, a 32-tap filter required 24% fewer LEs than the classical design.

The inclusion of speculation during accumulation bounds overall path delay propagation with appropriate carry approximation. Here it is incorporated to perform multiplication as a sequence of additions, as given in [10]. Multiplier-less DA arithmetic is combined with the most appropriate pre-processing units to narrow down the carry propagation path, which keeps the critical path delay constant using prior computations. Accumulation speed is increased without using any inner-stage pipelining units, owing to the inherently low complexity. Here a MAC unit designed with a speculative, delay-optimized accumulation unit helps to meet the requirements of FIR filter design and reduces the performance penalty gap that arises with FIR tap extension. Moreover, unlike conventional parallel prefix computation methods, the speculative units compute the accumulation in identical blocks, which leads to significant path delay optimization. Implementing the FIR filter using the Residue Number System (RNS) with DA-based arithmetic has the following advantages: improved data rate with inherent parallelism, modularity, and path-optimized speculation. Here the entire MAC operation is performed using RNS in each FIR tap, as shown in Figure 2. MAC is the core operation in FIR filters, in which the speculation features of the prefix topology combined with the RNS system are beneficial for high-performance implementation [9].

However, there is still scope for advanced study in the area of 2<sup>n</sup>-1 modulo multipliers. In future research work, we will consider the structure of a modified Booth-encoded multiplier with higher radix by utilizing hard-multiple generation. Further, to achieve an effective modulo multiplier for digital signal processing applications, the redundant residue number system (RRNS) can be studied. This paper focused on the implementation of high-end FIR filters using optimized RNS units for the fetal ECG signal detection process. The hardware synthesis results presented in this work show that each level of optimization carried out in RNS computation has a direct impact on the hardware rate and performance retention of the FIR filter design. Here, both the RAM-based speculative reverse conversion and the DA-based residue computation are used for path delay reduction in the RNS system, which reduces the performance penalty gap in FIR filter design. This work restores consistent performance metrics with the extension of the FIR filter tap by incorporating an optimized RNS MAC and a memory-efficient reverse converter unit.

| Input word length | Moduli set (2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>+1) | RNS with speculation and conventional residue computation<br>Area (LEs) | RNS with speculation and conventional residue computation<br>Fmax | RNS with speculation and residue computation<br>Area (LEs) | RNS with speculation and residue computation<br>Fmax |
|-------------------|-------------------------------|------------|-----------|------------|-----------|
| 8 bit             | (7, 8, 9)                     | 1396       | 57.3 MHz  | 396        | 75.48 MHz |
| 16 bit            | (31, 32, 33)                  | 1823       | 24.96 MHz | 1263       | 29.78 MHz |

Table 1. Reference of optimization technique based on input block size.
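
The two moduli sets listed in Table 1 correspond to $n = 3$ and $n = 5$ in the form $(2^n - 1, 2^n, 2^n + 1)$. As a small illustrative check (the helper names `ternary_moduli` and `dynamic_range` are hypothetical), the sketch below generates such a set, confirms that its members are pairwise coprime, and computes the product of the moduli, which bounds the range of values that can be represented uniquely.

```python
from itertools import combinations
from math import gcd, log2


def ternary_moduli(n):
    """Return the three-moduli set (2^n - 1, 2^n, 2^n + 1)."""
    return (2**n - 1, 2**n, 2**n + 1)


def dynamic_range(moduli):
    """Return the product of the moduli after checking pairwise coprimality."""
    if any(gcd(a, b) != 1 for a, b in combinations(moduli, 2)):
        raise ValueError("moduli must be pairwise coprime")
    total = 1
    for m in moduli:
        total *= m
    return total


for n in (3, 5):
    moduli = ternary_moduli(n)
    total = dynamic_range(moduli)
    print(moduli, total, f"{log2(total):.2f} bits")
```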

Table 2. Performance analysis of speculative DA-based RNS FIR design.

| FIR length | RNS with speculation and conventional residue computation<br>Area (LEs) | RNS with speculation and conventional residue computation<br>Fmax | RNS with speculation and DA based residue computation<br>Area (LEs) | RNS with speculation and DA based residue computation<br>Fmax |
|------------|------|-----------|------|-------------|
| 8 tap      | 2096 | 63.46 MHz | 2009 | 209.600 MHz |
| 16 tap     | 8623 | 57.3 MHz  | 8193 | 356.125 MHz |
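
The relative gains implied by Table 2 can be derived with a few lines of arithmetic. The sketch below only restates the table values; the dictionary layout and the function name `relative_gains` are illustrative assumptions, and the two derived figures (area saving in percent and Fmax ratio) are not reported in this form in the table itself.

```python
# (area in LEs, Fmax in MHz) copied from Table 2
TABLE2 = {
    8: {"conventional": (2096, 63.46), "da": (2009, 209.600)},
    16: {"conventional": (8623, 57.3), "da": (8193, 356.125)},
}


def relative_gains(conventional, da):
    """Return (area saving in percent, Fmax ratio) of DA-based over conventional."""
    area_saving = (conventional[0] - da[0]) / conventional[0] * 100.0
    fmax_ratio = da[1] / conventional[1]
    return area_saving, fmax_ratio


for taps, row in TABLE2.items():
    saving, ratio = relative_gains(row["conventional"], row["da"])
    print(f"{taps} tap: area saving {saving:.2f} %, Fmax ratio {ratio:.2f}x")
```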



(a) Hardware complexity overhead

(b) Performance penalty gap

Figure 5. Improved performance trade-off comparison of DA-based arithmetic in RNS over FIR length.

| Multiplier | Slices (area) | LUT | Delay (ns) | Power (mW) | Area × delay | Time × power | Area × time × power |
|------------|---------------|-----|------------|------------|--------------|--------------|---------------------|
| Two speed Radix-4 serial-parallel multiplier for 32 bit [19] | 1590 |     | 26.820 | 86.6  | 7186 x10<sup>-6</sup>    | 143.27 x10<sup>-9</sup> | 22.852 x10<sup>-9</sup> |
| Booth Serial-Parallel Multiplier for 16 bits [19]            | 1200 |     | 27.2   | 0.85  | 19.06 x10<sup>-6</sup>   | 23.13 x10<sup>-9</sup>  | 16.19 x10<sup>-6</sup>  |
| Modified Shift-Add Multiplier for 16 bits [19,23]            | 2107 |     | 20.51  | 0.1   | 21.07                    | 25.1                    | 52.8                    |
| Baugh-Wooley multiplier                                      | 10475 |    | 10.25  | 22.62 | 1.07368 x10<sup>-4</sup> | 1.691 x10<sup>-10</sup> | 2.42 x10<sup>-6</sup>   |
| Wallace tree multiplier                                      | 111  |     | 8.51   | 16.5  | 9.4461 x10<sup>-7</sup>  | 1.40 x10<sup>-10</sup>  | 1.55 x10<sup>-8</sup>   |
| Proposed Ternary-RNS based FIR design for SDR                | 174  | 249 | 4.771  | 0.088 | 8.30 x10<sup>-6</sup>    | 41.9 x10<sup>-9</sup>   | 7.3 x10<sup>-6</sup>    |

Table 3. Assessment of the speculative DA-based RNS FIR design in terms of area, delay, power, and their products.



Figure 6. Overall device utilization comparison for the proposed ternary-RNS based FIR filter design for SDR applications.

[ModelSim waveform window for `test_bench`, showing the clock, reset, `in`, `filter_enable`, `level_selection`, `out` and `syn_clk` signals together with the level-one to level-three approximate and detailed outputs of the filter instance.]

Figure 7. Audio samples were extracted, and the simulated data was filtered afterward.

Fig. 7 shows an audio signal for Software Defined Radio applications being applied to confirm the reliability and functionality of a long FIR filter design. A high-precision transfer function with enhanced carrier frequency and reliable processing activities is used in the filtering.

#### V. CONCLUSION

This article presents a systematic study of various forms of 2<sup>n</sup>-1 modulo multipliers without memory, assessed by applying the Booth algorithm. The Booth algorithm improves the efficiency of 2<sup>n</sup>-1 modulo multipliers by restricting the number of partial products (PPs), which is limited to one third for radix-8 Booth-encoded multipliers. Because of the presence of the hard multiple, this multiplier is ineffective for smaller dynamic ranges, but it is superior in terms of power and area for larger dynamic ranges. In addition to the above-mentioned modulo multiplier used in the residue number system, redundant encoding has been used to overcome intra-carry propagation in residue number system computation, as it is suitable for a considerable dynamic range. In the field of 2<sup>n</sup>-1 modulo multipliers, though, there is still scope for more research. In future work, we will use hard-multiple generation to study the structure of a modified Booth-encoded multiplier with higher radix. The redundant residue number system (RRNS) can also be studied to achieve an effective modulo multiplier for digital signal processing applications. For the analysis of fetal ECG signals, this research involved the implementation of high-end FIR filters using optimized Residue Number System units. The hardware synthesis results presented in this work demonstrated that each level of optimization carried out in RNS computation has a direct effect on the hardware rate and performance retention of the FIR filter design. Both RAM-based speculative reverse conversion and DA-based residue computation are used for path delay reduction in the Residue Number System method, allowing the performance penalty gap in FIR filter design to be reduced. By integrating an optimized Residue Number System MAC and a memory-efficient reverse converter unit, this work restores consistent performance metrics as the FIR filter tap count is extended.

#### REFERENCES

- [1]. Oscal T. C. Chen, Sandy Wang and Yi-Wen Wu, Minimization of switching activities of partial products for designing low-power multipliers, IEEE Trans. Very Large Scale Integration (VLSI) Systems. 11 (2003) 418–433.
- [2]. Jinn-Shyan Wang, Chien-Nan Kuo, and Tsung-Han Yang, Low-power fixed-width array multipliers, in Proc. IEEE Int. Symp. Low Power Electronics and Design (ISLPED04), 2004, pp. 307–312.
- [3]. Huang Zhijun and Milos D. Ercegovac, High-performance low-power left-to-right array multiplier design, IEEE Trans. Computers. 54 (2005) 272–283.
- [4]. Kuan Hung Chen and Yuan Sun Chu, A low-power multiplier with the spurious power suppression technique, IEEE Trans. Very Large Scale Integration (VLSI) Systems. 15 (2007) 846–850.
- [5]. Hasan Krad and Aws Yousif Al Taie, Performance analysis of a 32-bit multiplier with a carry-lookahead adder and a 32-bit multiplier with a ripple adder using VHDL, J. Computer Science. 4(4) (2008) 305–308.
- [6]. M. Mottaghi-Dastjerdi, A. Afzali-Kusha, and M. Pedram, BZ-FAD: A low-power low-area multiplier based on shift-and-add architecture, IEEE Trans. Very Large Scale Integration (VLSI) Systems. 17 (2009) 302–306.
- [7]. Aloke Saha, Dipankar Pal, and Mahesh Chandra, Low-power 6-GHz wave-pipelined 8b × 8b multiplier, IET Circuits, Devices & Systems. 7(3) (2013) 124–140.
- [8]. Cong Liu, Jie Han, and Fabrizio Lombardi, A low-power, high-performance approximate multiplier with configurable partial error recovery, in Proc. IEEE Design, Automation and Test in Europe Conf. and Exhibition (DATE), (2014), pp. 1–4.
- [9]. Basant K. Mohanty and Vikas Tiwari, Modified PEB formulation for hardware efficient fixed-width Booth multiplier, J. Circuits Syst. Signal Process. 33 (2014) 3981–3994.
- [10]. Botang Shao and Peng Li, Array-based approximate arithmetic computing: A general model and applications to the multiplier and squarer design, IEEE Trans. Circuits and Systems-I: Regular Papers. 62 (2015) 1081–1090.
- [11]. Wen-Quan He, Yuan-Ho Chen, and Shyh-Jye Jou, Dynamic error compensated fixed-width Booth multiplier based on conditional-probability of input series, J. Circuits Syst. Signal Process. 35 (2016) 2972–2991.
- [12]. Zain Shabbir, Anas Razzaq Ghumman, and Shabbir Majeed Chaudhry, A reduced-sp D3Lsum adder based high frequency 4 × 4-bit multiplier using Dadda algorithm, J. Circuits Syst. Signal Process. 35 (2016) 3113–3134.
- [13]. Ahmad Hiasat et al., "A Scaler Design for the RNS Three-Moduli Set {2<sup>n+1</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>-1} Based on Mixed-Radix Conversion", Journal of Circuits, Systems, and Computers, Vol. 29, No. 3 (2020) 2050041 (12 pages), World Scientific Publishing Company, DOI: 10.1142/S0218126620500413.
- [14]. Raj Kumar et al., "Perspective and Opportunities of Modulo 2<sup>n</sup>-1 Multipliers in Residue Number System: A Review", Journal of Circuits, Systems, and Computers, Vol. 29, No. 11 (2020) 2030008 (24 pages), World Scientific Publishing Company, DOI: 10.1142/S0218126620300081.
- [15]. Grande Naga Jyothi et al., "ASIC implementation of distributed arithmetic based FIR filter using RNS for high-speed DSP systems", International Journal of Speech Technology, Springer, 2020, <u>https://doi.org/10.1007/s10772-020-09683-1</u>.
- [16]. Rami Akeela et al., "Software-defined Radios: Architecture, State-of-the-art, and Challenges", Internet of Things Research Lab, Department of Computer Engineering, Santa Clara University, USA, March 2018.
- [17]. Lamjed Touil et al., "Design of Low-Power Structural FIR Filter Using Data-Driven Clock Gating and Multibit Flip-Flops", Hindawi, Journal of Electrical and Computer Engineering, Volume 2020, Article ID 8108591, 9 pages, <u>https://doi.org/10.1155/2020/8108591</u>.
- [18]. C. Efstathiou et al., Modified Booth modulo 2<sup>n</sup>-1 multiplier, IEEE Trans. Comput. 53 (2004) 370–374.
- [19]. R. Muralidharan et al., Radix-8 Booth encoded modulo 2<sup>n</sup>-1 multiplier with adaptive delay for high dynamic range residue number system, IEEE Trans. Circuits Syst. I Regul. Pap. 58 (2011) 982–993.
- [20]. R. Muralidharan et al., Area-power efficient modulo 2<sup>n</sup>-1 and modulo 2<sup>n</sup>+1 multipliers for {2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>+1} based RNS, IEEE Trans. Circuits Syst. I Regul. Pap. 59 (2012) 2263–2274.
- [21]. R. Muralidharan et al., Radix-4 and Radix-8 Booth encoded multi-modulus multipliers, IEEE Trans. Circuits Syst. I Regul. Pap. 60 (2013) 2940–2952.
- [22]. H. Pettenghi et al., Efficient method for designing modulo {2<sup>n</sup>±k} multipliers, J. Circ. Syst. Comp. 23 (2014) 1450001.
- [23]. Romero, D.E.T. "High-speed multiplierless Frequency Response Masking (FRM) FIR filters with reduced usage of hardware resources," IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 1-4, 2015.
- [24]. Lu, W.-S. and Takao H., "A unified approach to the design of interpolated and frequency response masking FIR filters," IEEE Transactions on Circuits and Systems I – Reg. Papers, 2016.
- [25]. Demirtas, S. and Oppenheim A. V., "A functional composition approach to filter sharpening and modular filter design," IEEE Transactions on Signal Processing, 2016. (in press)
- [26]. Candan, C. "Optimal Sharpening of CIC filters and an efficient implementation through Saramaki-Ritoniemi decimation filter structure," 2011. http://www.eee.metu.edu.tr/~ccandan/bar dir/pick honed CIC felt broadened new.pdf. (last access on February 2017)
- [27]. Molnar G., Dudarin A. and Vucic M. "Minimax design of multiplierless sharpened CIC filters based on interval analysis," IEEE Internat. Conv. on Information and Communication Technology, Electronics, and Microelectronics (MIPRO), May 2016.
- [28]. Aksoy, L., Costa, E., Flores, P., and Monteiro, J. *Multiplierless design of linear DSP transforms*, in VLSI-SoC: Advanced Research for Systems on Chip, Springer, Chap. 5, pp. 73–93, 2012.
- [29]. Meyer-Baese, U. Digital Signal Processing with Field Programmable Gate Arrays, Springer, 2014.

### **Creative Commons Attribution License 4.0** (Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the Creative Commons Attribution License 4.0 https://creativecommons.org/licenses/by/4.0/deed.en\_US