Low Cost High Performance VLSI Architecture for Montgomery Modular Multiplication
Digital architectures underlie nearly every real-world application, and they are routinely modified to suit the enhancement required. VLSI techniques can be used to optimize any such architecture. Multiplication is a fundamental arithmetic-logic operation that requires substantially more hardware resources and processing time than addition or subtraction. Because multiplication dominates the execution time of most DSP algorithms, the computational performance of a DSP system is limited by its multiplier. One low-power multiplier structure for the shift-and-add architecture is known as Bypass Zero, Feed A Directly (BZ-FAD). It reduces the switching activity of conventional multipliers. For a multiplier that computes A × B, the modifications are: removing the shift of the B register, feeding A directly to the adder, bypassing the adder whenever possible, using a ring counter instead of a binary counter, and removing the partial-product shift. Simulation results for a 16-bit multiplier show that the BZ-FAD architecture lowers the total switching activity. Such a multiplier can be used in low-power applications where speed is not a primary design parameter. The architecture makes use of a low-power ring counter proposed in this work.
The proposed Montgomery modular multiplier receives and outputs data in binary representation and uses only a one-level carry-save adder (CSA) to avoid carry propagation at each addition. The same CSA also performs operand precomputation and format conversion from the carry-save format to the binary representation. This yields a low hardware cost and a short critical path delay, at the expense of extra clock cycles to complete one modular multiplication.
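To make the carry-save idea concrete, the following is a minimal software sketch of radix-2 Montgomery multiplication in which the running result is kept as a (sum, carry) pair and updated by a single three-input CSA per iteration. It is an illustration under stated assumptions, not the design described here: the function names `csa`, `to_binary`, and `montgomery_multiply` are hypothetical, the radix-2 formulation is assumed, and the precomputed operand `b + n` is formed with an ordinary addition rather than by reusing the CSA.

```python
def csa(x, y, z):
    """Three-input carry-save adder on whole words.

    Returns (s, c) with s + c == x + y + z; no carry ripples across bits.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def to_binary(ss, sc):
    """Convert a carry-save pair to binary by reusing the CSA.

    Each loop pass models one extra clock cycle of format conversion.
    """
    cycles = 0
    while sc:
        ss, sc = csa(ss, sc, 0)
        cycles += 1
    return ss, cycles


def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n for odd n with a, b < n < 2^k."""
    assert n % 2 == 1 and 0 <= a < n and 0 <= b < n and n < (1 << k)
    # Assumption: b + n is precomputed once so that every iteration
    # adds a single operand chosen from this table.
    addends = (0, b, n, b + n)
    ss, sc = 0, 0  # running result in carry-save form
    for i in range(k):
        a_i = (a >> i) & 1
        # q_i is chosen so that S + a_i*b + q_i*n is even (n is odd).
        q_i = ((ss ^ sc) ^ (a_i & b)) & 1
        ss, sc = csa(ss, sc, addends[a_i + 2 * q_i])
        # The carry word has a zero LSB and the total is even, so the
        # sum word is even too; both halves can be shifted right exactly.
        ss, sc = ss >> 1, sc >> 1
    s, _ = to_binary(ss, sc)
    return s - n if s >= n else s
```

Calling `montgomery_multiply(100, 57, 239, 8)` returns a value `r` in `[0, 239)` satisfying `r * 2**8 ≡ 100 * 57 (mod 239)`. The loop in `to_binary` shows where the extra clock cycles mentioned above come from: the carry word must be absorbed step by step before a binary result is available.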
To overcome this weakness, a configurable CSA (CCSA), which can operate as one full adder or as two serial half adders, is proposed to halve the extra clock cycles needed for operand precomputation and format conversion. In addition, a mechanism is developed that detects and skips unnecessary carry-save additions in the one-level CCSA architecture while maintaining the short critical path delay. As a result, the extra clock cycles for operand precomputation and format conversion can be hidden and high throughput can be obtained. Experimental results show that the proposed Montgomery modular multiplier achieves higher performance and a significant improvement in area-time product compared with previous designs.
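The halving of conversion cycles can be illustrated with a small model. The sketch below is one plausible reading of the two-serial-half-adder configuration, not the actual circuit: it assumes that converting a two-word carry-save value needs only half-adder rows, and that chaining two such rows in one clock cycle does two conversion steps at once. The names `half_adder_row` and `convert` are hypothetical.

```python
def half_adder_row(s, c):
    """One row of half adders applied bitwise to a (sum, carry) pair.

    The returned pair has the same total as the input pair.
    """
    return s ^ c, (s & c) << 1


def convert(ss, sc, rows_per_cycle):
    """Convert carry-save to binary and count modeled clock cycles.

    rows_per_cycle=1 models a single adder level per cycle;
    rows_per_cycle=2 models two serial half-adder rows per cycle.
    """
    cycles = 0
    while sc:
        for _ in range(rows_per_cycle):
            ss, sc = half_adder_row(ss, sc)
        cycles += 1
    return ss, cycles


value_1, cycles_1 = convert(0b0111, 0b0001, 1)
value_2, cycles_2 = convert(0b0111, 0b0001, 2)
```

Both calls produce the same binary value, and in this model the two-row configuration needs half as many cycles as the single-row one, rounded up. A full adder is commonly built from two half adders plus an OR gate, which suggests why switching between the two configurations can be inexpensive in hardware.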