Fast and Low Complexity Architectures for Arithmetics over GF(2^m)

1
Fast and Low Complexity Architectures for
Arithmetics over GF(2^m)
  • Hua Li
  • Department of Math & CS
  • University of Lethbridge

2
Introduction
  • Finite field arithmetics are of great importance
    in the applications of public-key cryptography
    (ECC, Elliptic Curve Cryptography) and digital
    signal processing (Reed-Solomon encoder/decoder).
  • The addition operation is fast and inexpensive as
    it can be realized with m bitwise XOR operations.
    The multiplication operation is costly in terms
    of gate count and time delay.
  • There is a great need for fast as well as
    low-complexity VLSI (Very Large Scale Integrated)
    chips that can efficiently implement fundamental
    finite field arithmetic operations.
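
As a rough illustration of the point about addition, the sketch below stores a GF(2^m) element as an integer whose bit i is the coefficient of x^i (a representation assumed here, not taken from the slides). Addition is then a single bitwise XOR over the m bits.

```python
def gf2m_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements held as bit vectors (bit i = coefficient of x^i)."""
    # Coefficients live in GF(2), so coefficient-wise addition is XOR.
    return a ^ b


if __name__ == "__main__":
    a = 0b1011  # x^3 + x + 1
    b = 0b0110  # x^2 + x
    print(bin(gf2m_add(a, b)))
```

Every element is its own additive inverse, so subtraction is the same operation.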

3
Introduction
  • This thesis focuses on the designs of fast and
    low-complexity architectures for fundamental
    finite field arithmetic operations and their
    applications in public-key cryptography systems.
  • The proposed architectures have the properties of
    modularity, simplicity, and regular
    interconnection, and are well suited to VLSI
    implementation.

4
A LOW-COMPLEXITY PIPELINED ARCHITECTURE FOR
NORMAL BASIS MULTIPLIER
  • In normal basis, the squaring is a cost-free
    cyclic shift operation.
  • The inversion (the most complicated operation
    among the important finite field arithmetic
    operations) can be effectively computed by
    recursive squaring and multiplication.
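
The idea of inversion by repeated squaring and multiplication can be sketched with Fermat's identity a^(-1) = a^(2^m - 2). The example below is only an illustration: it works in a polynomial basis over GF(2^4) with the irreducible polynomial x^4 + x + 1 (both assumptions made for brevity), whereas the slides target a normal basis, where each squaring would be a cyclic shift.

```python
M = 4
POLY = 0b10011  # x^4 + x + 1, an assumed irreducible polynomial for GF(2^4)


def mul(a: int, b: int, poly: int = POLY, m: int = M) -> int:
    """Shift-and-add multiplication in GF(2^m), polynomial basis."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:  # degree reached m: reduce by the field polynomial
            a ^= poly
    return result


def inverse(a: int, m: int = M) -> int:
    """Compute a^(2^m - 2) as a^2 * a^4 * ... * a^(2^(m-1))."""
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    result = 1
    power = a
    for _ in range(m - 1):
        power = mul(power, power)    # one squaring per step
        result = mul(result, power)  # one multiplication per step
    return result


if __name__ == "__main__":
    for a in range(1, 1 << M):
        assert mul(a, inverse(a)) == 1
    print("all inverses verified")
```

This naive chain uses m - 1 multiplications; addition-chain methods such as Itoh-Tsujii reduce that count but follow the same squaring-and-multiplying pattern.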

5
A LOW-COMPLEXITY PIPELINED ARCHITECTURE FOR
NORMAL BASIS MULTIPLIER
  • Most of the previously proposed finite field
    multipliers operate over a fixed field.
  • In other words, a new multiplier is needed if
    there is a change of the irreducible polynomial.

6
A LOW-COMPLEXITY PIPELINED ARCHITECTURE FOR
NORMAL BASIS MULTIPLIER
  • A new versatile pipelined multiplier based on
    the normal basis representation.
  • Advantages
  • 1. The finite field parameters can be changed
    according to the application environments,
    increasing the flexibility of using the same
    multiplier for different applications.

7
A LOW-COMPLEXITY PIPELINED ARCHITECTURE FOR
NORMAL BASIS MULTIPLIER
  • Advantages
  • 2. The structure of the multiplier can be easily
    extended to higher order finite fields.
  • 3. The basic architecture of the proposed
    multiplier can be modified to a low-cost
    multiplier which is very suitable for both
    embedded systems and wireless devices.

8
PIPELINED ARCHITECTURE FOR SERIAL VERSATILE
NORMAL BASIS MULTIPLIER
9
A LOW-COST SERIAL VERSATILE NORMAL BASIS
MULTIPLIER
10
(No Transcript)
11
DESIGN OF MULTIPLIERS USING REDUNDANT
CANONICAL BASIS
  • A redundant canonical basis representation is
    defined with the irreducible All One Polynomial
    (AOP).
  • Based on the proposed redundant representation,
    the multiplication operation can be simplified
    and the squaring operation will be a cost-free
    permutation of the element coefficients.
  • Three new multipliers in redundant basis are
    presented.
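
A minimal sketch of why the redundant representation simplifies arithmetic, assuming m = 4 so that the AOP is x^4 + x^3 + x^2 + x + 1. Because (x + 1)·AOP = x^(m+1) + 1, elements can be carried with m + 1 coefficients and reduced modulo x^(m+1) + 1: multiplication becomes a cyclic convolution and squaring a permutation of the coefficients. The function names are illustrative and do not come from the presentation.

```python
M = 4       # field GF(2^4); AOP = x^4 + x^3 + x^2 + x + 1 (irreducible for m = 4)
N = M + 1   # number of coefficients in the redundant representation


def rb_mul(a, b):
    """Multiply modulo x^N + 1: a cyclic convolution over GF(2)."""
    c = [0] * N
    for i in range(N):
        if a[i]:
            for j in range(N):
                c[(i + j) % N] ^= b[j]
    return c


def rb_square(a):
    """Squaring moves coefficient i to position 2i mod N (no gates needed)."""
    c = [0] * N
    for i in range(N):
        c[(2 * i) % N] = a[i]
    return c


def to_canonical(a):
    """Fold the x^m coefficient back using x^m = 1 + x + ... + x^(m-1)."""
    return [a[i] ^ a[M] for i in range(M)]


if __name__ == "__main__":
    a = [1, 1, 0, 1, 0]  # 1 + x + x^3
    assert rb_square(a) == rb_mul(a, a)
    print(to_canonical(rb_square(a)))
```

The final conversion uses one XOR per output coefficient, which is the kind of simple XOR-only step the redundant approach defers until the end of a computation.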

12
  • The first two are fast bit-parallel multipliers
    (Design 1, 2) whose time delays are smaller than
    those of the previous bit-parallel multipliers
    [31,33,34,35].

13
Design 1: Logic circuit diagram for bit-parallel
multiplier in GF(2^4)
14
Design 2: Another structure for a bit-parallel
multiplier in GF(2^4)
15
Design 3: Bit-serial multiplier in GF(2^4)
16
  • The third one is a low-cost bit-serial multiplier
    (Design 3) which only requires m+1 2-input
    AND/XOR gates.
  • It reduces the clock period to T_AND + T_XOR,
    which is a significant improvement in comparison
    with the bit-serial multiplier proposed by Wu in
    [52].

17
The time delays of the proposed redundant basis
bit-parallel multipliers (Design 1 and 2)
are smaller than those of the previous bit-parallel
multipliers in [Itoh89, Hasan92, Koc98, Hasan93,
Wu98]. In particular, the decrease in time
delay is significant when a number
of multiplication/squaring operations are
performed, because all of the arithmetic
operations are carried out in the redundant
canonical basis until the final operation, at
which point the result is converted to a regular
canonical basis (if required) by simple hardware
built from XOR gates.
18
The proposed bit-serial multiplier (Design 3)
reduces the clock period to T_AND + T_XOR,
which is a significant improvement in comparison
with the bit-serial multiplier proposed by Wu
et al. [Wu99], in which the clock period is
T_AND + ⌈log_2(m+1)⌉ T_XOR.
The proposed bit-serial multiplier is
very competitive in restricted computing
environments, such as smart cards and wireless
communications, especially for applications where
large values of m are used.
19
A HYBRID ARITHMETIC ARCHITECTURE FOR REDUNDANT
CANONICAL BASIS AND NORMAL BASIS
  • In order to make the proposed redundant
    multipliers applicable to the normal basis
    representation, a new hybrid arithmetic
    architecture is presented to compute the
    multiplication, squaring, and inversion
    efficiently in both redundant and normal bases.

20
The conversion from a normal basis to a redundant
canonical basis is a cost-free permutation of the
coefficients of the element. The conversion
from a redundant canonical basis to a normal
basis requires m 2-input XOR gates.
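
The two conversions can be sketched as follows, under the assumption that the normal basis is the one generated by a root β of the AOP, i.e. {β, β^2, β^4, ...} with exponents taken modulo m + 1 (this requires m + 1 to be prime with 2 as a primitive root). The example uses m = 4, and the function names are illustrative only.

```python
M = 4
N = M + 1
# Exponent of the i-th normal basis element beta^(2^i), reduced mod m+1.
EXP = [pow(2, i, N) for i in range(M)]  # [1, 2, 4, 3] for m = 4


def normal_to_redundant(b):
    """Pure rewiring: normal coefficient i becomes the coefficient of x^(2^i mod N)."""
    a = [0] * N
    for i, e in enumerate(EXP):
        a[e] = b[i]
    return a


def redundant_to_normal(a):
    """The constant term equals beta + beta^2 + ... + beta^m, so a[0] is
    XORed into each of the m outputs: m two-input XOR gates."""
    return [a[e] ^ a[0] for e in EXP]


if __name__ == "__main__":
    b = [1, 0, 1, 1]
    assert redundant_to_normal(normal_to_redundant(b)) == b
    print(normal_to_redundant(b))
```

The first direction only reorders wires, while the second needs exactly one XOR per normal-basis coefficient, matching the gate count stated above.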
21
A HYBRID ARITHMETIC ARCHITECTURE FOR REDUNDANT
CANONICAL BASIS AND NORMAL BASIS
22
The logic structure of the hybrid arithmetic
unit is illustrated in the figure for m = 10. We
use two signals, the Basis-signal and the
Operation-signal, to control the input basis and
the output. We set Basis-signal = 0 if the unit
is used in a redundant canonical basis and
Basis-signal = 1 if it is used in a normal basis.
We set Operation-signal = 0 if the operation
is multiplication and Operation-signal = 1 if
the operation is squaring. The output of the
multiplexer is the left input if the control
signal is "0"; otherwise, the output is the
right input. The core unit of the hybrid
arithmetic architecture is the redundant basis
multiplier proposed in the previous chapter.
23
There are two options for the multiplier. The
first is to use the proposed bit-parallel
multiplier (Design 1 or 2) for fast parallel
computation, and the second is to use the
bit-serial multiplier (Design 3) to achieve the
best space/time trade-off in embedded systems.
The "Shifter" and "Permutation" modules in the
figure are used for the squaring operation
in a normal basis and in a redundant canonical
basis, respectively. The inversion operation can
be obtained by iterative squaring and
multiplication.
24
The proposed bit-parallel hybrid arithmetic
architecture requires m^2 + 2m XOR gates and
(m+1)^2 AND gates. The maximum time delay is
T_AND + (⌈log_2(m+1)⌉ + 1) T_XOR. It achieves
a significant space improvement compared with
the optimal normal basis multiplier proposed by
Sunar and Koc [Sunar2001], which requires
1.5(m^2 - m) XOR gates. A reconfigurable VLSI
chip for hybrid finite field arithmetic in
GF(2^10) has been designed and simulated in
Verilog HDL. It was also synthesized and placed
using Synopsys VLSI design packages.
25
CELLULAR AUTOMATA BASED RECONFIGURABLE
ARCHITECTURE FOR SYMMETRIC-KEY AND PUBLIC-KEY
CRYPTOSYSTEMS
  • In practical applications, hybrid cryptosystems
    combine public-key and symmetric-key
    cryptosystems. They have both the
    security advantages of public-key cryptosystems
    and the speed advantages of symmetric-key
    cryptosystems.
  • A low-complexity Programmable Cellular Automata
    (PCA) based reconfigurable architecture is
    proposed.

26
  • Through simple configuration, the architecture
    can be used not only as the PCA-based block
    cipher for symmetric-key encryption, but also
    as an efficient versatile modular multiplier
    in GF(2^m), an essential operation in
    public-key cryptography.
  • The distinguishing properties of this
    reconfigurable architecture are that it can be
    reconfigured on-line and that its
    throughput/area ratio is much higher than that
    of the traditional FPGA (Field Programmable
    Gate Array) approach.

27
Preliminary CA Theory
  • Programmable CA (PCA) is a structure where the CL
    (Combinational Logic) of each cell is not fixed
    but controlled by a number of control signals
    such that different functions (rules) can be
    realized on the same structure.

28
PCA-based Block Cipher Scheme
  • Encryption: C = E_K(M)
  • Decryption: M = E_K^(-1)(C)
  • A PCA-based block cipher scheme can be achieved
    by applying some special characteristics of CA
    rules to form the transformation function E. The
    fundamental transformations are the combinations
    of rules 51, 153, and 195.
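
For readers unfamiliar with the rule numbers, here is a small sketch of one step of a one-dimensional, three-neighbor cellular automaton using the standard Wolfram rule numbering. Null (zero) boundary cells and the per-cell rule list are assumptions made for illustration; the slides do not specify them. Under this numbering, rule 51 complements the cell, rule 153 is the XNOR of the cell and its right neighbor, and rule 195 is the XNOR of the left neighbor and the cell.

```python
def ca_step(state, rules):
    """Advance a 1-D, 3-neighbor CA by one step.

    state: list of 0/1 cell values
    rules: one Wolfram rule number (0..255) per cell
    Boundary cells see a constant 0 outside the array (assumed null boundary).
    """
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        # The neighborhood (left, self, right) selects one bit of the rule number.
        idx = (left << 2) | (state[i] << 1) | right
        nxt.append((rules[i] >> idx) & 1)
    return nxt


if __name__ == "__main__":
    s = [1, 0, 1, 1]
    print(ca_step(s, [51] * 4))
    print(ca_step(s, [153] * 4))
    print(ca_step(s, [195] * 4))
```

Giving each cell its own entry in `rules` is what makes the structure programmable: changing the list changes the transformation without changing the wiring.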

29
2-D Pipelined PCA-based Block Cipher
We first propose a fast scheme of 2-D (two
dimensional) pipelined PCA block cipher. In this
scheme, each message block is enciphered by only
one fundamental transformation; that is, q
fundamental transformations (T0, T1, ..., Tq-1)
are applied to q message blocks.
30
High Security PCA-based Block Cipher Scheme
  • In order to improve the encryption security, the
    proposed fast 2-D scheme can be extended such
    that each message block is encrypted by q
    fundamental transformations.

31
PCA-based Versatile Multiplier in GF(2^m)
32
PCA-based Versatile Multiplier in GF(2^m)
  • A PCA based versatile modular multiplier in the
    canonical (standard, polynomial) basis.
  • The field parameters can be changed according to
    the application environments.
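
What "versatile" means here can be sketched in software: the irreducible polynomial and the field size are run-time parameters rather than being fixed in the multiplier. The shift-and-add routine below is a generic polynomial-basis algorithm used for illustration; it is not the PCA circuit from the slides.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m), polynomial basis.

    poly is the irreducible polynomial including its x^m term,
    e.g. 0b10011 for x^4 + x + 1. Changing poly and m selects another field.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a       # add the current multiple of a
        b >>= 1
        a <<= 1               # multiply a by x
        if (a >> m) & 1:
            a ^= poly         # modular reduction
    return result


if __name__ == "__main__":
    # Same routine, two different fields.
    print(hex(gf2m_mul(0b0010, 0b1000, 0b10011, 4)))  # GF(2^4), x^4 + x + 1
    print(hex(gf2m_mul(0x57, 0x83, 0x11B, 8)))        # GF(2^8), AES polynomial
```

In hardware the analogous step is loading the polynomial coefficients into registers, so the same datapath serves different fields.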

33
Optimal PCA Multiplier
34
Complexity of the PCA Based Multiplier
35
The major differences between the standard PCA
and the extended PCA are that (1) the three
neighbor cells of the standard PCA are the cell
itself and its two nearest left/right neighbor
cells, whereas in the extended PCA the neighbor
cells can be either the nearest left neighbor
cell or the leftmost/rightmost cells; and (2) the
standard PCA cell (used in the PCA block cipher)
performs an XNOR operation, whereas the extended
PCA cell performs an XOR operation.
36
Unified PCA Cell
  • If the control signal C is set to '1' --> the
    block cipher encryption (Standard PCA).
  • If C is set to '0' --> a finite field multiplier
    (Extended PCA).
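
One plausible way to merge the XNOR cell and the XOR cell behind a single control bit is to XOR the control signal into the result, since x XNOR y = x XOR y XOR 1. The sketch below shows only that logic identity; the actual unified cell in the presentation also selects different neighbors and is not reproduced here.

```python
def unified_cell(x: int, y: int, c: int) -> int:
    """Hypothetical unified cell core: XNOR when c = 1, XOR when c = 0."""
    return x ^ y ^ c


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, unified_cell(x, y, 1), unified_cell(x, y, 0))
```

With C = 1 the output matches the standard PCA (block cipher) behavior, and with C = 0 it matches the extended PCA (multiplier) behavior.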

37
Unified PCA Based Reconfigurable Architecture
  • If the control signal C is set to '1' --> the
    block cipher encryption. The CA rules of the
    fundamental transformations (T0, T1, ..., Tq-1)
    can be loaded into the cyclic shift registers at
    configuration time, and they do not need to
    change once the encryption scheme is settled.
  • If C is set to '0' --> the architecture will be
    configured as a finite field multiplier. We
    can load the coefficients of the irreducible
    polynomial into the registers at initialization
    and change the register values only if the
    irreducible polynomial is changed.

38
Unified PCA Based Reconfigurable Architecture
39
A FAST ALGORITHM FOR MULTIPLICATION ON ELLIPTIC
CURVES
  • A new fast algorithm for multiplication of a
    point on the elliptic curve with a large integer
    based on non-adjacent form (NAF) Frobenius
    expansion is proposed.
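
As background, the sketch below computes the ordinary binary non-adjacent form of an integer. This is deliberately the plain integer NAF, not the Frobenius-expansion variant proposed in the presentation, whose details are not given in the transcript. It shows the property both rely on: digits come from {-1, 0, 1} and no two adjacent digits are nonzero, which lowers the number of point additions in a scalar multiplication.

```python
def naf(k: int) -> list:
    """Return the NAF digits of k >= 0, least significant digit first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d           # now k is divisible by 4, forcing a 0 next
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits


if __name__ == "__main__":
    digits = naf(7)          # 7 = 8 - 1
    print(digits)
    assert sum(d << i for i, d in enumerate(digits)) == 7
```

Each nonzero digit corresponds to one point addition or subtraction, and subtraction costs about the same as addition on an elliptic curve because point negation is cheap.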

40
(No Transcript)
41
Simulation and VLSI Design
  • An 8-bit Reconfigurable Crypto-Chip
  • 879 Standard Cells
  • 55 I/O Circuit Cells
  • Total Area is 131845
  • 200MHz clock

42
As a multiplier: two clock cycles are required for
one multiplication. As a PCA block cipher: one
clock cycle is required to encrypt a message
block, and two message blocks can be encrypted
simultaneously.
43
Simulation and VLSI Design
44
Future Research
  • In future research, we will extract the
    common components of the different basis
    multipliers and design a reconfigurable and
    versatile multiplier which can be used in any
    basis and can be applied in hybrid cryptography
    systems.
  • Furthermore, hardware/software co-design and
    system-on-chip will be considered in order to
    implement the fast and low-complexity
    reconfigurable VLSI architectures.

45
  • Thank you!