Design and Analysis of a Novel Low Complexity and Low Power Ping Lock Arbiter by using EGDI based CMOS Technique

Sangeeta Singh1*, JVR Ravindra2, B Rajendra Naik3

1Research Scholar, Dept. of ECE, JNTUK, Kakinada, Andhra Pradesh, INDIA
2Dept. of ECE, C-ACRL, Vardhaman College of Engineering, Hyderabad, Telangana, INDIA
3Dept. of ECE, University College of Engineering, Osmania University, Hyderabad, Telangana, INDIA

*Corresponding Author

DOI: https://doi.org/10.30880/ijie.2022.14.01.034
Received 28 June 2021; Accepted 05 October 2021; Available online 07 March 2022

Network-on-chip (NoC) provides a solution to the complications of the on-chip interconnect architecture in multi-core systems. It mainly consists of routers, links and network interfaces. An essential component of an on-chip router is the arbiter, which significantly impacts the performance of the router. The arbiter should provide fast and fair arbitration when it is placed on the Critical Path Delay (CPD) of a system. The main aim of this research work is to design a novel arbiter for an effective network scheduler in complex real time applications while keeping resource allocation and power consumption very low. Previously, a novel gate level Ping Lock Arbiter (PLA) was designed to overcome the limited fair arbitration of the Improved Ping Pong Arbiter (IPPA) with less delay, but its chip size and power consumption are very high. To overcome this problem, an Effective Gate Diffusion Input (EGDI) logic-based CMOS scheme is used to design a novel Compact Ping Lock Arbiter (CPLA). The proposed CPLA is compared with the existing PLA based on the static CMOS scheme. The comparison between the conventional and proposed arbiters analyzes area, delay and power by using Tanner Tool 14.1 with 250nm and 45nm technology. The proposed CPLA is suitable for compact and smart applications with glitch free and fair arbitration. The proposed CPLA presents a 41.2% reduction in total area consumption. It also provides a 47.12% average Area and Power Product (APP) reduction when compared to the existing 4-bit PLA. Similarly, the proposed 8-bit CPLA shows a 49.48% average APP reduction when compared to the conventional 8-bit PLA. For low power and area constrained applications, the proposed CPLA with EGDI is preferable to the static CMOS based ping lock round robin arbiter. The results therefore demonstrate that the proposed CPLA consumes less power and less area than the existing ping lock arbiter.

**Keywords:** Network on chip, ping lock arbiter, effective gate diffusion input logic, CMOS Logic.

1. Introduction

The advancements in VLSI technology have allowed millions of transistors on a single die, leading to system-on-chip designs. The major challenge in such devices is to ensure functionally correct, reliable operation of the interacting components [1]. In addition, with the continuous decrease in feature size and increasing integration density, the global interconnect delay now dominates gate delays [2]. Interconnections have therefore become the dominant factor in determining the overall characteristics of a chip. Global interconnects cause severe on-chip synchronization errors, difficult timing closure, routing congestion and high power consumption [3].
The communication network connects diverse functional blocks on a chip to form a homogeneous, heterogeneous or hybrid system [4]. This network connecting different functional blocks together is the backbone of any system. The amount of power and area associated with these interconnects keeps increasing with each scaling step, and thus the communication architecture has to be modified to achieve low area and high performance. Traditionally, system designs were based on critical paths and clock trees, which led to increased delay and power consumption. The emerging technology that targets such connections is the on-chip interconnection network, also known as Network on Chip (NoC) [5]. An NoC integrates a massive number of storage blocks and processing elements (PEs) on a single chip and provides the network through which they exchange data within the chip. A typical NoC architecture contains Routers (Switches), Links, Network Interfaces (NI) and Processing Elements (PE) as shown in Fig.1.

![Basic NoC Architecture](image1)

**Fig. 1 - Basic NoC Architecture**

Routers play an imperative role in NoC. Routers handle the data transmission through the network as packets [6]. Thus, to achieve high throughput, the router design should have low delay with little area and power consumption [7].

The arbiter is a main building block of the NoC router. An arbiter determines the sequence in which access to a shared resource is granted when multiple input ports try to access the same output port. The available arbitration schemes include round robin, lottery, token ring [24], and fixed priority [30]. Among these, the round robin token scheme is the standard scheme used in Network on Chip for bus arbitration. The advantages of round robin arbitration are low complexity, fairness in scheduling, and a predictable worst-case wait time. Most arbiter designs target performance and scalability, but with advances in technology, power consumption tends to dominate the design.

Different types of Round Robin Arbiter (RRA) are available to resolve conflicts among multiple input ports requesting a single output port. A few RRA based designs are: Programmable Priority Encoder Arbiter (PPE), Ping Pong Arbiter (PPA), Thermometer Coded Arbiter (TCA), Merged Arbiter and Multiplexer (MARX) [28], Switch Arbiter etc. Among the round robin arbiters available, the PPA provides a binary tree-based structure for building higher order and complex arbiters [8].

![Tree Structure of PPA](image2)

**Fig. 2 - Tree Structure of PPA**

The Ping Pong Arbiter (PPA) suffers from the drawback of unfair arbitration when the input requests are not uniformly distributed. Thus, a gate level Ping Lock Arbiter (PLA) was developed to overcome the limited fair arbitration of the Improved Ping Pong Arbiter (IPPA) with less delay [9]. But this leads to an increase in chip size and power consumption. This research work aims to design a low area, low power and high-speed Ping Lock Arbiter (PLA) for the NoC router. The PLA is based on the Ping Pong Arbiter (PPA), which is designed by connecting small arbiters in a binary tree format [8]. The four-bit ping pong arbiter is built from three 2-bit PPAs with some additional gates. In addition, the internal feedback flag signal “f” is given as an input through the feedback process. A 2-bit PPA issues a grant signal only if the next level of the tree provides it with a grant signal. Thus, the grant signals of the arbiter in each layer are masked with the grant signal of its parent arbiter. The total arbitration time required is very low in the PPA, but it suffers from the drawback of very high resource utilization.
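The unfairness under non-uniform requests can be illustrated with a small behavioural model. The sketch below is an assumption-laden simplification, not the gate-level PPA of [8]: the class names `PingPongNode` and `PingPongArbiter4` are hypothetical, each node keeps a single priority flag, and a node flips its flag away from whichever input it just granted.

```python
class PingPongNode:
    """Behavioural 2-input ping-pong node (illustrative, not gate level)."""

    def __init__(self):
        self.flag = 0  # 0: input 0 is favoured, 1: input 1 is favoured

    def choose(self, r0, r1):
        """Return the winning input (0 or 1), or None when nothing requests."""
        if not (r0 or r1):
            return None
        if r0 and r1:
            winner = self.flag          # both request: the flag decides
        else:
            winner = 0 if r0 else 1     # single requester wins directly
        self.flag = 1 - winner          # favour the other input next time
        return winner


class PingPongArbiter4:
    """Four-input arbiter built as a binary tree of three 2-input nodes."""

    def __init__(self):
        self.root = PingPongNode()
        self.leaves = [PingPongNode(), PingPongNode()]

    def grant(self, requests):
        r0, r1, r2, r3 = requests
        # The root arbitrates between the OR of each leaf's requests.
        side = self.root.choose(r0 or r1, r2 or r3)
        if side is None:
            return None
        pair = requests[2 * side: 2 * side + 2]
        return 2 * side + self.leaves[side].choose(pair[0], pair[1])


def simulate(requests, cycles):
    """Hold the same request pattern for several cycles and log the grants."""
    arbiter = PingPongArbiter4()
    return [arbiter.grant(requests) for _ in range(cycles)]


if __name__ == "__main__":
    print(simulate([1, 1, 1, 1], 4))  # uniform requests
    print(simulate([1, 0, 1, 1], 8))  # port 1 idle: non-uniform requests
```

With port 1 idle, this model grants port 0 every second cycle while ports 2 and 3 each receive only every fourth grant, which is the kind of imbalance the ping lock scheme is meant to remove.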

In the proposed system, the Ping Lock Arbiter (PLA) is used when more than one request arrives for a particular port. The ports are arranged in the order East, West, North, South and Local, and arbitration is based on round robin priority. This paper presents two types of ping lock arbiter, designed by using the static CMOS technique and the gate diffusion input (GDI) logic based CMOS technique [10,11]. Tanner tool is used to design the circuits and to check the functionality of the arbiter for the NoC router. The existing and proposed arbiters are compared in terms of resource utilization, performance and power consumption. In future, these arbiters can be incorporated into the Application Specific Network on Chip router (ASNoC).

The rest of the paper is arranged as follows: Section 2 describes various works related to arbiters. Section 3 presents the implementation details of the ping lock arbiter based on the static CMOS technique. The design of the proposed ping lock arbiter based on EGDI is demonstrated in Section 4. The results and corresponding analysis are shown in Section 5 and the conclusion is given in Section 6.

2. Related Work

The round robin arbiter works on the regular bus arbitration method. A token is used to allocate the resource equally among users. Each user holds the token in turn, and the token determines which user receives the grant signal at a particular time. Only one user at a time receives the grant signal from among numerous requests, in a cyclic manner. Currently, many efficient resource allocation methods are used for on-chip communication. Researchers have concentrated on the implementation and analysis of different types of arbiters, but to the best of our knowledge round robin arbitration is the most suitable method for many real time applications, as it is less complex and provides fairness in scheduling under uniform traffic conditions. This section provides insights into various works related to the implementation of arbiters.
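The cyclic grant behaviour described above can be sketched in a few lines. This is a minimal behavioural illustration under stated assumptions, not any of the cited hardware designs: the names `round_robin_arbiter`, `pointer` and `simulate` are hypothetical, and the token is modelled as an index that moves to the port just after the one that was granted.

```python
def round_robin_arbiter(requests, pointer):
    """Return (grant_index, next_pointer) for one arbitration cycle.

    requests: list of 0/1 request bits, one per input port.
    pointer:  index of the port that currently has the highest priority
              (the position of the token in this sketch).
    """
    n = len(requests)
    for offset in range(n):
        port = (pointer + offset) % n
        if requests[port]:
            # The granted port becomes the lowest priority port next cycle.
            return port, (port + 1) % n
    return None, pointer  # nobody requested: no grant, token stays put


def simulate(requests, cycles):
    """Hold one request pattern for several cycles and log the grants."""
    pointer, grants = 0, []
    for _ in range(cycles):
        grant, pointer = round_robin_arbiter(requests, pointer)
        grants.append(grant)
    return grants


if __name__ == "__main__":
    print(simulate([1, 1, 1, 1], 4))
    print(simulate([1, 0, 1, 1], 6))
```

Because the pointer always advances past the last winner, every active requester is served once before any requester is served twice, which is what bounds the worst-case wait time.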

In [12], the resource allocation policy is implemented by using a fast crossbar scheduler. The multiplexing and demultiplexing processes are performed in the crossbar switch of the router. More than one request may be received at a time from various input channels. However, only the input channel with the highest priority in the queue is awarded a grant signal. Subsequently, all other input channels obtain a grant signal at regular clock cycle intervals. In addition, the scheduler speed is improved by using a pipelined scheme. However, power consumption and area are high. In [13], a round robin arbiter with a Programmable Priority Encoder (PPE) is used to set the routing policy, which leads to a large gate delay. To overcome this problem, a Parallel Round Robin Arbiter (PRRA) with a binary tree structure is introduced for high performance and low complexity. In the PRRA, the gate delay is lower than in the crossbar scheduler, but the power and resource utilization are very high.

Previously, a Dual Round Robin Arbiter (DRRA) was designed for a split transaction bus in SoC implementation [14]. The data bus arbiter and address bus arbiter are used to construct a non-demultiplexed transaction bus. Every request is processed after a certain bounded waiting time which is enforced by the DRRA. In [15], a circulated arbiter is created by using a circulated scheduler-based crossbar switch in order to analyze the performance of the router. The mask trail and compressed scheduler are used to create a fair crossbar control. The scheduling delay, critical delay, hardware cost and power consumption are decreased in the fair crossbar switch. In addition, the fairness of the arbiter can be predicted. Currently, a few other arbiter variations have been developed to improve fairness over the existing arbiters.

In [16], a novel bit level algorithm and concurrent prefix adder logic are used to design a novel programmable priority arbiter. This increases the complexity of the overall design, although the performance of the programmable priority arbiter is high. In [17], an Edge Detector (ED) and two Fixed Priority Encoders (FPE) are used to construct a parallel prefix arbiter based on thermometer code. Various types of parallel prefix adders are used, such as the Brent-Kung, Ladner-Fischer, Han-Carlson and Kogge-Stone formations. A Carry Look Ahead Adder (CLA) is created by performing carry generation and carry propagation in three stages of the adder. Carry generation is executed by performing an AND operation. Similarly, carry propagation is performed by using an OR operation. A low complexity parallel prefix adder is formed by reducing the number of stages. Therefore, it provides small resource utilization and high performance.
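For readers unfamiliar with parallel prefix carry computation, the following sketch shows the generate (AND) and propagate (OR) signals being combined in a Kogge-Stone arrangement. It is a generic textbook-style illustration written on plain integers, not the circuit of [16] or [17]; the function name `kogge_stone_add` and the `width` parameter are assumptions made for this example.

```python
def kogge_stone_add(a, b, width=8):
    """Add two unsigned integers modulo 2**width with a Kogge-Stone network."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b  # carry generate per bit: AND
    p = a | b  # carry propagate per bit: OR
    distance = 1
    while distance < width:
        # Merge each bit's (g, p) pair with the pair `distance` bits below it.
        g = g | (p & (g << distance))
        p = p & (p << distance)
        distance *= 2
    # After log2(width) stages, bit i of g is the carry out of bits i..0,
    # so shifting left by one gives the carry into each bit position.
    carries = (g << 1) & mask
    return (a ^ b ^ carries) & mask


if __name__ == "__main__":
    print(kogge_stone_add(23, 42))
    print(kogge_stone_add(200, 100))  # wraps modulo 256
```

The number of prefix stages grows with the logarithm of the word width, which is why such networks are attractive wherever a priority or carry chain lies on the critical path.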

A novel sorting-based arbitration algorithm and a new Merged Arbiter and Multiplexer (MARX) structure are designed in [18]. The structure is then integrated into the proposed merged switch allocation and traversal in NoC switches. This router can be used with both simple and high complexity arbitration strategies. Moreover, it offers low energy, delay and chip size. The Index Based Round Robin Arbiter (IRRA) is designed in [19]. The key is stored in index format to perform switch functions. The input ports are scaled by using a logarithmic operation. In addition, the IRRA provides low die size, low power and high performance, but it limits the fairness of the router.

The design of a concurrent pseudo round robin arbiter is presented in [20]. It serves several requests at a time via parallel processing and hence the performance is high. However, the area and power utilization are very high. A weighted round robin arbiter is built by using a generic algorithm-based optimization method [21]. Each input port has a suitable weight. The grant signal is given to the channel with the highest weight and thus the speed is increased. However, the complexity of this arbiter is very high. The design of a round robin arbiter with a clock gating scheme is presented in order to achieve less delay and low power by reducing the switching activity of the clock signal [22]. The corresponding clock signal is activated only when the clock signal and the enable are high simultaneously. The power consumption is therefore low because the number of clock cycles required by the arbiter is reduced.

An optimized buffered and greedy arbiter was presented in [23]. The complexity of the developed design increases with the number of inputs, and the efficiency decreases under high activity, wherein all the inputs are requesting access to resources or output ports. An arbitration method based on token ring was presented in [24], which is suitable for on-chip CDMA based bus architectures. But it suffers from starvation among the input requestors. In addition, the efficiency of the technique was measured in terms of latency and throughput with no focus on power consumption.

In [25], a modified round robin arbiter was developed using a clock gating technique, which is power and area efficient but suffers from the drawback of low throughput. The proposed round robin block does not require the ring counter to choose the priority block. The clock signal is provided by the global ring oscillator, with optional dividers for slower clocking options.

According to [26], a large number of arbiters are present in a Network on Chip and they play a critical role in defining the performance of the overall NoC. That work presented a comparative analysis of various arbiters. It concentrated on decreasing the response time of arbiters to the requesting input ports, thereby increasing their speed. The comparison did not provide any insight into the power and area consumption, which are crucial for arbiters.

The performance analysis between fixed priority arbiter, round robin arbiter and matrix arbiter was given in [27]. The results show that the fixed priority arbiter doesn’t support fairness in scheduling whereas matrix arbiter leads to large delay, thus making them less efficient when compared to round robin arbiter.

The fairness in granting resources to the requestors is an essential functionality of arbiter. A generic architecture of Dynamic Priority Arbiter was presented in [28] where in the priority was updated dynamically to achieve fast arbitration process. But this method increased the complexity of the designed arbiters and also led to high area and power consumption.

A technique for fair arbitration was developed to guarantee transmitter access for all the input channels in [29]. The proposed technique focused on improving timing towards granting access to the requested input channels without any reference to power and area consumption of the designed arbiter.

A high throughput arbiter based on fixed priority mechanism for applications related to Time Correlated Single Photon Counting system was presented in [30]. The major drawback of this technique is starvation of certain input requestors with more requests leading to increased wait time to get access to the resources.

The problem of constant wait time associated with a round robin arbiter was addressed in [31]. The proposed scheme increased the frequency of operation and throughput but led to an increase in area. A modified round robin arbiter was designed in [32] to overcome the problem of starvation, wherein the requestors need not wait long to win the grant signal. The arbiter is made aware of the load present on the buffer to decide the priority of input requestors. This method suffers from the drawback of increased area and power consumption with reduced speed of operation.

The ping lock arbiter is designed in [9] with a lock signal for good arbitration and fairness. Its delay is low, but its area and power consumption are high. To overcome this problem, an Effective Gate Diffusion Input logic based Compact Ping Lock Arbiter (CPLA) is designed in the proposed system with a smaller number of transistors than the existing static CMOS logic-based ping lock arbiter. Thus, the proposed PLA offers a smaller die size and lower power consumption than the other state-of-the-art methods.

3. Existing Ping Lock Arbiter with Static CMOS Technique

Fair arbitration is attained in the modified ping pong arbiter by altering the priority update policy. The round robin arbiter at the upper level is reorganized so that it does not pass on the first priority before every input has been processed in a single round of arbitration. Fair arbitration is obtained by processing the highest-index request first while not updating the priority vector of the intermediate arbiter.

The modified ping pong arbiter known as Ping Lock Arbiter uses a lock signal “l0” to avoid change of priority register value until the arbiter grants access to the requesting input port. This helps to overcome the problem of unfair arbitration associated with the ping pong arbiter for non-uniform active requests.

The existing ping lock arbiter achieves fair arbitration for both uniform and non-uniform patterns of requesting signals. Its efficiency in terms of area, power and delay decreases as the width of the arbiter increases. For higher order arbiters, the ping lock arbiter’s performance is almost similar to that of the ping pong arbiter. The basic architecture of the ping lock arbiter is based on a binary tree structure in which higher width arbiters can be constructed by using lower order arbiters.

![Diagram of 2-bit Ping Lock Arbiter’s Intermediate Arbiter](image1)

**Fig. 3 - Structure of 2-bit Ping Lock Arbiter’s Intermediate Arbiter [9]**

The gate level structure of intermediate node of 2-bit ping lock arbiter is shown in Fig.3. It consists of two lock signals (li0, li1) corresponding to each requesting input port. The intermediate node produces an output lock signal “li” to be used by its parent arbiter.

![Diagram of 2-bit Ping Lock Arbiter’s Leaf Arbiter](image2)

**Fig. 4 - Structure of 2-bit Ping Lock Arbiter’s Leaf Arbiter [9]**

The gate level structure of the leaf node corresponding to the 2-bit ping lock arbiter is shown in Fig.4. It asserts the lock signal “l0” only when it receives active requests from both r0 and r1 along with a zero priority pointer. The leaf and intermediate cells consist of OR, NAND and AND gates, an inverter and a register. The OR operation is performed between the two request signals to produce the asserting signal (AG).
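The effect of the lock rule can be explored with a behavioural model. The sketch below is a simplification under stated assumptions rather than the gate-level design of [9]: the leaf raises its lock when both requests are active and its pointer is zero, as described above, while the way the root consumes that lock (it keeps its own priority unchanged while the selected leaf is locked) is an assumption made for this illustration. The class names are hypothetical.

```python
class LockLeaf:
    """Behavioural 2-input leaf with a priority pointer and a lock output."""

    def __init__(self):
        self.pointer = 0  # 0: r0 is served first, 1: r1 is served first

    def choose(self, r0, r1):
        """Return (winner, lock); call only when at least one request is set."""
        # Lock rule from the text: both requests active and pointer at zero.
        lock = bool(r0 and r1 and self.pointer == 0)
        if r0 and r1:
            winner = self.pointer
        else:
            winner = 0 if r0 else 1
        self.pointer = 1 - winner
        return winner, lock


class PingLockArbiter4:
    """Two leaves under a root that holds its priority while a leaf is locked."""

    def __init__(self):
        self.root_pointer = 0
        self.leaves = [LockLeaf(), LockLeaf()]

    def grant(self, requests):
        left, right = any(requests[:2]), any(requests[2:])
        if not (left or right):
            return None
        if left and right:
            side = self.root_pointer
        else:
            side = 0 if left else 1
        winner, lock = self.leaves[side].choose(*requests[2 * side: 2 * side + 2])
        if not lock:
            # Assumed root behaviour: switch sides only once the leaf is done.
            self.root_pointer = 1 - side
        return 2 * side + winner


def simulate(requests, cycles):
    """Hold one request pattern for several cycles and log the grants."""
    arbiter = PingLockArbiter4()
    return [arbiter.grant(requests) for _ in range(cycles)]


if __name__ == "__main__":
    print(simulate([1, 1, 1, 1], 4))
    print(simulate([1, 0, 1, 1], 6))
```

In this model the three active ports of the non-uniform pattern are served in strict rotation, because the root does not move on until the locked leaf has served both of its requesters.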

The priority register (P-reg) is designed asynchronously to make the circuit glitch free. A glitch is any unwanted signal produced at the output. The enable signal of the P-reg is generated by an AND gate whose output is fed back as an input together with the error input (ei). When the enable is one, the grant signal is generated based on the request inputs and the P-reg output. The input to output, input to register and register to output delays are calculated to find the arbiter maximum delay (MD). The granting delay (GD) is defined as the maximum delay of the arbiter to assert the grant after receiving a request. The granting delay should be less than or equal to the maximum delay of the arbiter.

The general structure of the n-bit arbiter is depicted in Fig. 5, where 'r' indicates request signals, 'g' grant signals and 'lo' the lock signal, while ei and eo represent the input and output enable signals. An n-bit ping lock arbiter (PLA_n) is constructed with the help of two PLA_{n/2} and an intermediate PLA_{2}.
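The recursive construction implies a simple count of 2-bit cells for any power-of-two width. The helper below is a small sketch of that bookkeeping; the function name `pla_cells` is hypothetical, and it assumes that the recursion bottoms out in a single leaf cell for n = 2, which matches the description of the 4-bit arbiter as two leaf cells plus one intermediate cell.

```python
def pla_cells(n):
    """Return (leaf_cells, intermediate_cells) for an n-bit tree arbiter.

    Assumes n is a power of two, n >= 2, and that a 2-bit arbiter is one leaf.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    if n == 2:
        return 1, 0
    leaves, intermediates = pla_cells(n // 2)
    # Two half-width arbiters plus one intermediate cell joining them.
    return 2 * leaves, 2 * intermediates + 1


if __name__ == "__main__":
    for width in (2, 4, 8, 16):
        print(width, pla_cells(width))
```

Under these assumptions an n-bit arbiter uses n/2 leaf cells and n/2 - 1 intermediate cells, so the number of 2-bit cells grows linearly with the arbiter width.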

![Fig. 5 - Structure of n-bit ping lock arbiter](image)

The four-bit Ping Lock Arbiter (PLA) shown in Fig.6 is created by combining two 2-bit PLA leaf cells and one 2-bit PLA intermediate cell (Fig.3 and Fig.4) along with four AND gates; it is also called the new gate level compact round robin arbiter. A grant updating policy is included in the PLA to attain fair arbitration. The lock out (lo), AG, error output (eo) and grant signals are generated by the PLA. The output is obtained without any signal distortion and with fair arbitration. Static Complementary Metal Oxide Semiconductor (S-CMOS) logic is used to design the existing ping lock arbiter of basic and higher widths. This leads to increased area and power consumption, and the overall performance is quite similar to the conventional ping pong arbiter. Similarly, an 8-bit arbiter built using two 4-bit PLAs and one intermediate PLA is also analyzed and compared with the existing CMOS logic-based arbiter defined in [9].

![Fig. 6 - Structure of 4-bit ping lock arbiter](image)

4. Proposed Ping Lock Arbiter with EGDI Logic

The existing ping lock arbiter achieves fairness in arbitration but suffers from the drawback of high power and area consumption. To overcome this drawback, a modified PLA is proposed using the GDI technique to reduce the total area and power consumed by the arbiter. The effective GDI logic uses a smaller number of transistors to implement the basic logic gates compared to other styles like static CMOS logic, FINFET logic, Pass Transistor Logic (PTL), domino logic etc. The implementation of the AND gate using EGDI logic is shown in Fig. 7(a). It needs only two transistors to perform the operation. In gate diffusion input logic, the drain and source terminals of the MOSFETs are also used as input terminals rather than only the gate terminal. Hence the number of transistors is very small in EGDI logic. When A and B are both zero, the PMOS is ON and the NMOS is OFF; the PMOS drain is connected to A, so the value of A (zero) is passed to the output. When A=0 and B=1, the PMOS is again ON and the NMOS is OFF, and because the drain of the PMOS is connected to A, the output is zero. If A=1 and B=0, the PMOS is OFF and the NMOS is ON. As the NMOS drain terminal is connected to B, the value of B (zero) is passed to the output. Similarly, when both inputs are high, the PMOS is OFF and the NMOS is ON. The drain of the NMOS is attached to B, making the output equal to B, i.e. one.

![Circuit diagram of AND, OR, NAND gates using EGDI logic](image)

**Fig. 7 - Circuit diagram of AND, OR, NAND gates using EGDI logic**

The circuit diagram of the OR gate using EGDI is shown in Fig.7 (b). When both inputs are zero, the PMOS is ON and the NMOS is OFF. The drain terminal of the PMOS is attached to A, which makes the output zero. When both inputs are one, the PMOS is OFF and the NMOS, which is connected to B, is ON. Thus, the output is equal to one. If A=0 and B=1, the NMOS is ON and the PMOS is OFF; as the NMOS is connected to B, the output is pulled to one. The inverse operation is performed when A=1 and B=0. The EGDI logic-based NAND gate is shown in Fig.7 (c). The NAND gate performs the inverse of the EGDI AND operation by adding an inverter in the final stage.
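The case-by-case descriptions of the AND, OR and NAND gates can be checked with an idealized switch-level model. This is a sketch under stated assumptions, not a transistor-accurate simulation: each two-transistor cell is treated as an ideal selector (the PMOS conducts when the gate input is 0, the NMOS when it is 1), threshold-voltage drops are ignored, and the pin assignment is inferred from the textual description (A on the gate for AND, B on the gate for OR). The function names are hypothetical.

```python
def gdi_cell(gate, pmos_input, nmos_input):
    """Ideal two-transistor cell: pass pmos_input when gate is 0, else nmos_input."""
    return pmos_input if gate == 0 else nmos_input


def egdi_and(a, b):
    # Gate driven by A: the PMOS passes A (which is 0 whenever it conducts),
    # the NMOS passes B.
    return gdi_cell(a, a, b)


def egdi_or(a, b):
    # Gate driven by B: the PMOS passes A when B is 0,
    # the NMOS passes B (which is 1 whenever it conducts).
    return gdi_cell(b, a, b)


def egdi_nand(a, b):
    # AND cell followed by an inverter in the final stage.
    return 1 - egdi_and(a, b)


if __name__ == "__main__":
    print("A B AND OR NAND")
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, egdi_and(a, b), egdi_or(a, b), egdi_nand(a, b))
```

The model reproduces the standard truth tables, which confirms that the four input cases described in the text are mutually consistent for a two-transistor realization.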

A basic CMOS logic AND gate normally uses six transistors, whereas the same gate when implemented using EGDI logic uses only two transistors making it more area effective and less complex. Similarly, other basic gates when implemented using EGDI logic turn out to be area efficient and less complex when compared to static CMOS logic circuits.

The proposed compact ping lock round robin arbiter is designed by integrating EGDI based AND, OR and NAND gates instead of static CMOS logic gates in the 4-bit and 8-bit ping lock arbiters. Initially, the PLA2-leaf cell and PLA2-intermediate cell are designed using basic gates built with EGDI logic, which uses a smaller number of transistors than the existing CMOS logic-based cells. The schematics of PLA2-leaf and PLA2-inter are presented in Fig.8 and Fig.9 respectively. These schematics are taken from the S-Edit window of Tanner tool.

**Fig. 8 - Schematic of EGDI based PLA2 Leaf Cell**

**Fig. 9 - Schematic of EGDI based PLA2 Intermediate Cell**

The ping lock arbiter uses a binary tree-based architecture to build higher order complex arbiter circuits. The 4-bit PLA is developed using EGDI based PLA-leaf and PLA-intermediate cells as shown in Fig. 10. It is designed by using the PLA-2 inter and PLA-2 leaf structures. Finally, six AND gates are used to combine the outputs of PLA2 leaf and PLA2 inter to produce the grant value and the error output.

![Fig. 10 - Schematic of proposed 4-bit PLRRA with EGDI logic](image)

The 8-bit PLA is designed by using two 4-bit ping lock arbiters with a 2-bit PLA intermediate structure as depicted in Fig. 11. The proposed compact ping lock arbiter (CPLA) offers a smaller die size and lower power consumption than the other state-of-the-art schemes.

![Fig. 11 - Schematic of proposed 8-bit PLRRA with EGDI logic](image)

5. Results and Discussion

In this work, two types of arbiter, the static CMOS based Ping Lock Round Robin Arbiter (PLRRA) and the EGDI logic based PLRRA, are designed by using Tanner tool to verify the working principle of the arbiters. Simulation results are taken to examine the number of transistors (area) needed along with the values of delay and power consumption. The power, delay and area are compared for the conventional and proposed schemes to judge their efficiency. The Tanner tool W-edit is used to check the functionality of the Design under Test (DUT), and T-Spice is used to calculate the area, delay and power values.

The results of the proposed 4-bit PLRRA are shown in Table 1. The results show that the proposed PLRRA with EGDI offers a 41.27% area reduction and an 85.55% power reduction relative to the existing PLA with the static CMOS scheme. However, the static CMOS based ping lock arbiter is faster. Similarly, the results of the 8-bit ping lock round robin arbiter are given in Table 2. From the obtained results, the existing static CMOS based PLRRA offers low delay but has high area and power consumption in comparison to the proposed EGDI design. The proposed EGDI based 8-bit PLRRA provides a 41.22% area reduction and a 44.42% power reduction when compared to the existing static CMOS logic based 8-bit PLRRA.


### Table 1 - Comparison between existing and proposed 4-bit PLRRA for 250nm

<table>
<thead>
<tr>
<th>PLA 4-bit</th>
<th>Number of transistors</th>
<th>Delay (ns)</th>
<th>Power (mW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>With CMOS</td>
<td>252</td>
<td>0.9</td>
<td>3.53</td>
</tr>
<tr>
<td>With EGDI</td>
<td>148</td>
<td>1.12</td>
<td>0.51</td>
</tr>
</tbody>
</table>

### Table 2 - Comparison between existing and proposed 8-bit PLRRA for 250nm

<table>
<thead>
<tr>
<th>PLA 8-bit</th>
<th>Number of transistors</th>
<th>Delay (ns)</th>
<th>Power (mW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>With static CMOS</td>
<td>592</td>
<td>1.62</td>
<td>7.88</td>
</tr>
<tr>
<td>With EGDI</td>
<td>348</td>
<td>3.18</td>
<td>4.38</td>
</tr>
</tbody>
</table>

### Table 3 - Comparison between existing and proposed 4-bit PLRRA for 45nm

<table>
<thead>
<tr>
<th>PLA 4-bit</th>
<th>Number of transistors</th>
<th>Delay (ns)</th>
<th>Power (µW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>With Static CMOS</td>
<td>252</td>
<td>1.78</td>
<td>8.41</td>
</tr>
<tr>
<td>With EGDI</td>
<td>148</td>
<td>3.22</td>
<td>7.145</td>
</tr>
</tbody>
</table>

The results show that the proposed 4-bit and 8-bit Ping Lock Round Robin Arbiters (PLRRA) with EGDI logic require fewer transistors and less power than the conventional PLRRA with the static CMOS scheme. The proposed 4-bit PLRRA offers a 41.27% area reduction and a 15% power reduction when compared to the conventional 4-bit PLRRA, as shown in Table 3. The obtained results demonstrate that the proposed EGDI based ping lock arbiter is more area and power efficient than the existing CMOS logic-based arbiter.

### Table 4 - Comparison between existing and proposed 8-bit PLRRA for 45nm

<table>
<thead>
<tr>
<th>PLA 8-bit</th>
<th>Number of transistors</th>
<th>Delay (ns)</th>
<th>Power (µW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>With Static CMOS</td>
<td>592</td>
<td>4.563</td>
<td>20.45</td>
</tr>
<tr>
<td>With EGDI</td>
<td>348</td>
<td>6.754</td>
<td>16.15</td>
</tr>
</tbody>
</table>

Likewise, from Table 4, the proposed 8-bit PLRRA gives a 41.22% chip size reduction and a 21% power reduction relative to the existing 8-bit PLRRA. However, the delay is lower for the existing static CMOS based 4-bit and 8-bit PLA than for the proposed 4-bit and 8-bit PLA with EGDI.
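The percentage figures quoted for area and power follow directly from the tabulated values. The short script below recomputes them as a cross-check; the numbers are copied from Tables 1 to 4, while the names `RESULTS`, `reduction_percent` and `summarize` are hypothetical helpers introduced for this illustration. Transistor count is used as the area measure, as in the text.

```python
# (transistors, delay in ns, power) copied from Tables 1-4.
# Power is in mW for 250 nm and in µW for 45 nm; only ratios are used here.
RESULTS = {
    ("250nm", 4): {"cmos": (252, 0.9, 3.53), "egdi": (148, 1.12, 0.51)},
    ("250nm", 8): {"cmos": (592, 1.62, 7.88), "egdi": (348, 3.18, 4.38)},
    ("45nm", 4): {"cmos": (252, 1.78, 8.41), "egdi": (148, 3.22, 7.145)},
    ("45nm", 8): {"cmos": (592, 4.563, 20.45), "egdi": (348, 6.754, 16.15)},
}


def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def summarize(results=RESULTS):
    """Return area/power reductions and delay increase for every table row."""
    summary = {}
    for key, row in results.items():
        (t_cmos, d_cmos, p_cmos) = row["cmos"]
        (t_egdi, d_egdi, p_egdi) = row["egdi"]
        summary[key] = {
            "area_reduction": round(reduction_percent(t_cmos, t_egdi), 2),
            "power_reduction": round(reduction_percent(p_cmos, p_egdi), 2),
            # A negative "reduction" in delay is reported as an increase.
            "delay_increase": round(-reduction_percent(d_cmos, d_egdi), 2),
        }
    return summary


if __name__ == "__main__":
    for key, values in summarize().items():
        print(key, values)
```

The same script also reports the delay penalty of the EGDI design for each configuration, which makes the area and power versus speed trade-off explicit.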

**Fig. 12 - Comparison result for total count of transistors required for existing and proposed PLRRA**

The comparison of the total number of transistors required to design the 4-bit and 8-bit PLRRA using existing static CMOS and EGDI is shown in Fig.12. It shows that the EGDI based PLRRA requires fewer transistors than the static CMOS PLRRA and thus reduces the overall area consumption.

**Fig. 13 - Comparison result for total power consumption in 250 nm for existing and proposed PLRRA**

Similarly, Fig.13 presents the comparison of total power consumption for the existing and proposed PLRRA in 250 nm technology. The graphical comparison shows that the total power consumption of the EGDI PLRRA is much lower than that of the existing CMOS logic PLRRA, and hence it is more power efficient.

**Fig. 14 - Comparison result for total power consumption in 45 nm for existing and proposed PLRRA**

Fig.14 shows the comparison of total power consumption for the existing and proposed PLRRA in 45 nm technology. The area utilization output obtained for the proposed 4-bit ping lock round robin arbiter is given in Fig.15.

**Fig. 15 - Resource utilization of proposed 4-bit PLRRA with EGDI logic**

The number of transistors used by the proposed 4-bit PLRRA is 148 as shown in Fig.15, and the power consumed, presented in Fig.16, is 7.14 µW.

Similarly, the number of transistors required and the power consumption of the proposed 8-bit ping lock round robin arbiter are illustrated in Fig.17 and Fig.18 respectively.

6. Conclusion

In this paper, a compact PLRRA is constructed to attain fair arbitration. Many kinds of arbiters have been presented for resource sharing. Arbiters can be designed using various logic styles such as static CMOS, pseudo NMOS, pass transistor and gate diffusion input. Of these styles, EGDI logic can implement circuits with a smaller number of transistors, thereby reducing the area and power consumption of the design. Two variations of the arbiter, the Ping Lock Arbiter (PLRRA) with static CMOS and the Compact Ping Lock Arbiter (CPLRRA) with EGDI logic, are designed to evaluate the die size, implementation delay and power values. The arbiters are designed by using Tanner tool S-edit based on switch level (SL) execution. The functionality of every arbiter is demonstrated via Tanner tool W-edit. T-Spice is used to calculate the area, delay and power outputs. The proposed CPLRRA is well suited to compact and smart applications with glitch free and fair arbitration. The proposed 4-bit CPLRRA offers a 47.12% average Area and Power Product (APP) reduction compared with the existing 4-bit PLRRA. Likewise, the proposed 8-bit CPLRRA achieves a 49.48% average APP reduction when compared to the conventional 8-bit PLRRA. For high-speed applications, the static CMOS based ping lock round robin arbiter is preferable to the proposed PLRRA with EGDI. In future, the proposed PLRRA will be integrated into the application specific NoC router, computer networks etc., and their performances can be compared.

References


