Title: How do I make hardware fast, power-efficient, less noisy, and easy-to-design?
1. Clockless Chips, or: How do I make hardware fast, power-efficient, less noisy, and easy to design?
2. Presentation Flow
- Motivation
- Introduction to clocked circuits
- Challenges with clocked circuits
- Clockless (asynchronous) circuits
- How do clockless chips work?
- Usefulness and challenges of clockless design
- Applications
3. Motivation for Clockless Design (Designer's View)
- Modularity for system-on-chip design
  - Plug-and-play interconnectivity
- Average-case performance
  - No worst-case delay synchronization
- Many interfaces are asynchronous
  - Buses, networks, ...
4. Introduction: Clocked Digital Design
- Most current digital systems are synchronous
- Clock: a global signal that paces the operation of all components
- Benefit of clocking: enables a discrete-time representation
  - all components operate exactly once per clock tick
  - component outputs need to be ready by the next clock tick
  - allows glitchy or incorrect outputs between clock ticks
5. Challenges with Clocked Design
- Unprecedented challenges
  - Distributing the clock globally.
  - Wasted energy, because the clocks themselves consume a lot of energy (30%).
  - Signals must traverse the chip's longest wires in one clock cycle.
  - Order of arrival of the signals is unimportant.
- Design is becoming unmanageable with a centralized (synchronous) approach.
6. Why Clockless?
- The world already is mostly clockless!
- So why bother with clocks at all?
- Make everything clockless → greater elegance and robustness

Events at the level of | Time scale | Example
Between large-scale systems | Several seconds to milliseconds | PC-printer communication, keyboard input, network communication
Between chips | Milliseconds to 100 nanoseconds | CPU-memory interface, interrupts
Within a chip (functional units) | Nanoseconds to 100 picoseconds | Adders, control logic
Single logic gate | 10 picoseconds |
Quantum level | Picoseconds to femtoseconds |
7. What is Clockless Design?
- Digital design with no centralized clock
- Operates without reference to a clock
- Different parts work at different speeds
- Synchronization uses local handshaking
- Each part hands off its result immediately
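The contrast between global clocking and local hand-off can be sketched numerically. The following is a minimal illustration, assuming hypothetical per-operation delays (the numbers and the function names `clocked_time` and `self_timed_time` are invented for this example, not taken from any real design). A clocked unit must size its period for the slowest operation, while a self-timed unit finishes each operation after its own delay.

```python
def clocked_time(delays_ns, margin_ns=0.0):
    """Total time when every operation takes one full clock period.

    The period must cover the slowest operation (plus an optional margin).
    """
    period = max(delays_ns) + margin_ns
    return period * len(delays_ns)


def self_timed_time(delays_ns, handshake_ns=0.0):
    """Total time when each operation hands off its result as soon as it is done.

    handshake_ns is an assumed fixed cost per local handshake.
    """
    return sum(d + handshake_ns for d in delays_ns)


# Hypothetical delays: most operations are fast, one is slow.
delays = [2.0, 2.0, 3.0, 2.0, 8.0, 2.0]

print(clocked_time(delays))     # 6 operations * 8.0 ns = 48.0
print(self_timed_time(delays))  # 2 + 2 + 3 + 2 + 8 + 2 = 19.0
```

The gap narrows as the assumed handshake cost grows, which is why handshake overhead matters in practice.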
8. How is data represented in an asynchronous system? How is information exchanged?
The answers to these questions are:
- data encoding
- control signaling (handshake styles)
9. Handshaking Protocols: Components
- Bundled-data handshaking protocols
  - 2-phase bundled-data protocol
  - 4-phase bundled-data protocol
- Dual-rail handshaking protocols (delay-insensitive method)
  - 2-phase dual-rail protocol
  - 4-phase dual-rail protocol
10. Data Encoding: Bundled Data
- Single-rail "bundled datapath": the simplest approach
  - widely used
- Features
  - datapath: 1 wire per bit (e.g. standard synchronous blocks)
  - matched delay produces a delayed "done" signal
  - worst-case delay: longer than the slowest path
- Practical style: can reuse synchronous components; small area
- Fixed (worst-case) completion time
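The matched-delay idea can be sketched in a few lines. This is a minimal illustration only: the path delays, the safety margin, and the function names `matched_delay_ns` and `done_time_ns` are all assumptions made for the example. It shows that the "done" signal arrives at a fixed time, later than the slowest path, regardless of which path the data actually exercises.

```python
def matched_delay_ns(path_delays_ns, margin_ns=0.5):
    """Matched delay: the slowest datapath delay plus a safety margin."""
    return max(path_delays_ns) + margin_ns


def done_time_ns(request_time_ns, path_delays_ns, margin_ns=0.5):
    """'done' is the request delayed by the matched delay, independent of the data."""
    return request_time_ns + matched_delay_ns(path_delays_ns, margin_ns)


# Hypothetical delays of the paths through one single-rail block.
paths = [1.0, 2.5, 4.0]

delay = matched_delay_ns(paths)
print(delay)                          # 4.5
print(all(delay > p for p in paths))  # True: data is stable before 'done'
print(done_time_ns(10.0, paths))      # 14.5
```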
11. Data Encoding: Dual-Rail
- Dual-rail: uses 2 wires per data bit
- Each dual-rail pair provides both the data value and its validity
  - provides robust data-dependent completion
  - needs completion detectors
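A dual-rail pair can be modeled as a small tuple. The sketch below assumes one common convention, which the slides do not spell out: the pair is `(true_rail, false_rail)`, `(1, 0)` means a valid 1, `(0, 1)` means a valid 0, `(0, 0)` means "no data yet", and `(1, 1)` is illegal. The helper names are hypothetical.

```python
EMPTY = (0, 0)  # spacer: no valid data on this pair (assumed convention)


def encode_bit(bit):
    """Encode one bit as a (true_rail, false_rail) pair."""
    return (1, 0) if bit else (0, 1)


def decode_pair(pair):
    """Return 0 or 1 for a valid pair, None for the spacer; reject (1, 1)."""
    if pair == (1, 1):
        raise ValueError("illegal dual-rail code (1, 1)")
    if pair == EMPTY:
        return None
    return 1 if pair == (1, 0) else 0


def encode_word(bits):
    """Encode a list of bits as a list of dual-rail pairs."""
    return [encode_bit(b) for b in bits]


word = encode_word([1, 0, 1])
print(word)                            # [(1, 0), (0, 1), (1, 0)]
print([decode_pair(p) for p in word])  # [1, 0, 1]
print(decode_pair(EMPTY))              # None
```

Because each pair says whether it carries data, the receiver can tell when a value has arrived without a separate timing assumption.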
12. Dual-Rail Completion Sensing
- Dual-rail completion detector
  - combines dual-rail signals
  - indicates when all bits are valid (or reset)
- C-element
  - if all inputs = 1, output → 1
  - if all inputs = 0, output → 0
  - else, maintain output value
- OR together the 2 rails of each bit
- Merge the results using a Muller C-element
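The detector described above can be sketched behaviorally. This is a software model under assumptions, not a circuit: the class `CElement` and the function `completion` are hypothetical names, and each pair is taken to be `(true_rail, false_rail)` with `(0, 0)` as the reset state.

```python
class CElement:
    """Behavioral model of a C-element with any number of inputs."""

    def __init__(self, initial=0):
        self.output = initial

    def update(self, inputs):
        if all(v == 1 for v in inputs):
            self.output = 1      # all inputs 1: output goes to 1
        elif all(v == 0 for v in inputs):
            self.output = 0      # all inputs 0: output goes to 0
        return self.output       # otherwise: hold the previous value


def completion(pairs, c_element):
    """OR the two rails of each bit, then merge the results in the C-element."""
    per_bit_valid = [t | f for (t, f) in pairs]
    return c_element.update(per_bit_valid)


c = CElement()
print(completion([(0, 0), (0, 0)], c))  # 0: all bits reset
print(completion([(1, 0), (0, 0)], c))  # 0: only one bit valid, output held
print(completion([(1, 0), (0, 1)], c))  # 1: all bits valid
print(completion([(0, 0), (0, 1)], c))  # 1: partially reset, output held
print(completion([(0, 0), (0, 0)], c))  # 0: all bits reset again
```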
13. Muller C-element
[Figure: Muller C-element with inputs A and B and output Z, including a static logic implementation between Vdd and Gnd]
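A two-input C-element is often written as the next-state equation Z' = A·B + Z·(A + B), that is, a majority function of A, B, and the fed-back output Z. The sketch below evaluates that textbook equation; it is not necessarily the exact transistor circuit drawn on the slide, and the function name `c_element_next` is hypothetical.

```python
def c_element_next(a, b, z):
    """Next output of a 2-input C-element: Z' = A*B + Z*(A + B)."""
    return (a & b) | (z & (a | b))


print(c_element_next(1, 1, 0))  # 1: both inputs high sets the output
print(c_element_next(0, 0, 1))  # 0: both inputs low clears the output
print(c_element_next(1, 0, 0))  # 0: inputs disagree, output holds 0
print(c_element_next(0, 1, 1))  # 1: inputs disagree, output holds 1
```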
14. Handshaking Styles: 4-Phase
- 4-phase: requires 4 events per handshake
- Level-sensitive → simpler logic implementation
- Overhead of return-to-zero (RTZ, or resetting)
  - extra events which do no useful computation
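The four events can be traced in a small sequential model. This is a sketch under assumptions: it uses a request/acknowledge pair with the data latched when the request is high (a bundled-data style), and `four_phase_transfer` is a hypothetical name. Real circuits run the two sides concurrently; the model only shows the order of events.

```python
def four_phase_transfer(items):
    """Transfer items over a req/ack channel with 4-phase (return-to-zero) signaling.

    Returns the received items and the trace of (signal, level) events.
    """
    req = ack = 0
    received, trace = [], []
    for item in items:
        req = 1                      # 1. sender raises request (data is valid)
        trace.append(("req", req))
        received.append(item)        #    receiver latches the data
        ack = 1                      # 2. receiver raises acknowledge
        trace.append(("ack", ack))
        req = 0                      # 3. sender lowers request (return to zero)
        trace.append(("req", req))
        ack = 0                      # 4. receiver lowers acknowledge
        trace.append(("ack", ack))
    return received, trace


received, trace = four_phase_transfer(["a", "b"])
print(received)    # ['a', 'b']
print(len(trace))  # 8: four events per handshake
print(trace[:4])   # [('req', 1), ('ack', 1), ('req', 0), ('ack', 0)]
```

Events 3 and 4 are the return-to-zero overhead: they move no data.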
15. Handshaking Styles: 2-Phase
- 2-phase: requires 2 events per handshake
  - transition signaling
- Elegant: no return-to-zero
- Slower logic implementation
  - logic primitives are inherently level-sensitive, not event-based (at least in CMOS)
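Transition signaling can be modeled by toggling wires instead of raising and lowering them. The sketch below is a sequential illustration with a hypothetical function name, `two_phase_transfer`; it assumes a request is pending whenever the request and acknowledge wires differ.

```python
def two_phase_transfer(items):
    """Transfer items over a req/ack channel with 2-phase (transition) signaling.

    Every transition, rising or falling, is an event. Returns the received
    items and the trace of (signal, level) events.
    """
    req = ack = 0
    received, trace = [], []
    for item in items:
        req ^= 1                     # 1. sender toggles request
        trace.append(("req", req))
        if req != ack:               #    wires differ: a request is pending
            received.append(item)
        ack ^= 1                     # 2. receiver toggles acknowledge
        trace.append(("ack", ack))
    return received, trace


received, trace = two_phase_transfer(["a", "b"])
print(received)    # ['a', 'b']
print(len(trace))  # 4: two events per handshake
print(trace)       # [('req', 1), ('ack', 1), ('req', 0), ('ack', 0)]
```

The same four transitions that make up one 4-phase handshake carry two handshakes here.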
16. Handshaking and Data Representation: Which to Use?
- Several combinations are possible
  - dual-rail 4-phase, single-rail 4-phase, dual-rail 2-phase, and single-rail 2-phase
- Example: dual-rail 4-phase
  - dual-rail data functions as an implicit "request"
  - 4-phase cycle between acknowledge and implicit request
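The dual-rail 4-phase example can be traced step by step. The sketch below is a sequential model under assumptions: pairs are `(true_rail, false_rail)` with `(0, 0)` as the empty state, and the function name `dual_rail_four_phase` is hypothetical. There is no separate request wire; valid data on the rails plays that role.

```python
def dual_rail_four_phase(bits):
    """One dual-rail 4-phase handshake; returns (received_bits, trace)."""
    trace = []

    # 1. Sender drives a valid pair for every bit: the implicit request.
    rails = [(1, 0) if b else (0, 1) for b in bits]
    trace.append(("data valid", list(rails)))

    # 2. Receiver detects that every pair is valid, latches, raises ack.
    if not all(t | f for t, f in rails):
        raise RuntimeError("data not complete")
    received = [t for t, f in rails]   # true rail carries the bit value
    trace.append(("ack", 1))

    # 3. Sender returns every pair to the empty state (return to zero).
    rails = [(0, 0)] * len(bits)
    trace.append(("data empty", list(rails)))

    # 4. Receiver detects that every pair is empty and lowers ack.
    trace.append(("ack", 0))
    return received, trace


received, trace = dual_rail_four_phase([1, 0, 1])
print(received)                  # [1, 0, 1]
print([name for name, _ in trace])
# ['data valid', 'ack', 'data empty', 'ack']
```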
17. Why Clockless? (Technical View)
- Power consumption
- Performance
- Robustness
- Electromagnetic compatibility
18. Challenges of Clockless Design
- Difficult to DESIGN
- Concurrent models for SPECIFICATION
- Complex TIMING ANALYSIS
- Difficult to TEST
19. Applications of Clockless Chips
Clockless logic applications
21. Questions? Suggestions?