VLSI DESIGN 1998 TUTORIAL Part 1: Core Building Blocks and Building Systems Using Cores

1
VLSI DESIGN 1998 TUTORIAL Part 1. Core Building
Blocks and Building Systems Using Cores
  • What are cores?
  • Building systems using cores
  • Challenges in using cores
  • Rajesh K. Gupta
  • University of California, Irvine.

2
Available Core Building Blocks
68030
ARM810
PPC401
3
What Is A Core Cell?
  • Working definition
  • at least 5K gates
  • pre-designed
  • pre-verified
  • re-usable
  • Examples
  • Processor: LSI Logic CW4001/4010/4100, ARM 7TDMI,
    ARM 810, NEC 85x, Motorola 680x0, IBM PPC
  • DSP cores: TI TMS320C54X, Pine, Oak
  • Encryption: PKuP, DES
  • Controllers: USB, PCI, UART
  • Multimedia: JPEG comp., MPEG decoder, DAC
  • Networking: ATM SAR, Ethernet

4
Core Types
  • Soft cores (code)
  • HDL description
  • flexible, i.e., can be changed to suit an
    application
  • technology independent: may be resynthesized
    across processes
  • significant IP protection risks
  • Firm cores (code + structure)
  • gate-level netlist to be placed and routed
  • technology sampled
  • Hard cores (physical)
  • ready for drop in
  • include layout and timing (technology dependent)
  • IP is easily protected
  • mostly processors and memory
  • functional test vectors or ATPG vectors available.

5
Core Types and Their Use
Technology ASIC or FPGA
6
Core Portability
  • Determined by technology independence and data
    format.
  • Technology independence based on the type of core
  • both open and proprietary data formats are
    currently in use.

  • DEF: Design Exchange Format (Cadence)
  • SPEF: Standard Parasitic Exchange Format (Cadence)
  • GDSII: layout format (Cadence)
  • ITL: Interpolated Table Lookup, a cell-level timing model (Mentor)
  • LEF: Library Exchange Format (Cadence)
  • MMF: Motive Modeling Format (Viewlogic)
  • NLDM: Non-Linear Delay Model (Synopsys)
  • TLF: Timing Library Format (Cadence)
  • VCD: Value Change Dump (Cadence)
  • WGL: Waveform Generation Language (TSSI)
7
Timing Information in Firm and Hard Cores
  • Timing behavior can be generated from SPICE
    inputs
  • However, it is not always possible for big cores
  • static timing information is necessary
  • Basic delay model
  • propagation delay model from inputs to outputs
  • slew model (as a function of load and input slew)
  • input/output capacitances
  • setup and hold constraints on inputs.
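The slew and delay models above are commonly stored as lookup tables indexed by input slew and output load. The following is a minimal Python sketch of such a lookup with bilinear interpolation; the table values, units and the names `table_delay`, `SLEWS`, `LOADS` and `DELAY` are illustrative assumptions, not data for any real core.

```python
from bisect import bisect_right

def interpolate(axis, values, x):
    """Piecewise-linear interpolation over a sorted axis, clamped at both ends."""
    if x <= axis[0]:
        return values[0]
    if x >= axis[-1]:
        return values[-1]
    hi = bisect_right(axis, x)
    lo = hi - 1
    frac = (x - axis[lo]) / (axis[hi] - axis[lo])
    return values[lo] + frac * (values[hi] - values[lo])

def table_delay(slews, loads, table, in_slew, load):
    """Bilinear lookup: table[i][j] is the delay at slews[i] and loads[j]."""
    # First interpolate along the load axis in every row, then along the slew axis.
    per_slew = [interpolate(loads, row, load) for row in table]
    return interpolate(slews, per_slew, in_slew)

# Hypothetical 2x2 characterization table (slew in ns, load in pF, delay in ns).
SLEWS = [0.1, 0.5]
LOADS = [0.01, 0.05]
DELAY = [[0.20, 0.40],
         [0.30, 0.50]]

if __name__ == "__main__":
    print(table_delay(SLEWS, LOADS, DELAY, 0.3, 0.03))
```

A static timing tool would chain such lookups along a path, feeding the output slew of one stage in as the input slew of the next.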

8
  • What are cores?
  • Building systems using cores
  • Challenges in using cores

9
Building Systems-On-A-Chip Using Cores
Commodity Hardware: compression, encryption, modem, signal proc., image proc.
Commodity Software: encryption/decryption, device drivers, legacy code, operating/runtime system
SOC is a service mark (SM) of LSI Logic Corporation.
10
S-O-C Application Classes
11
Systems-On-A-Chip (SOCs)
  • Two Types
  • Technology-Driven
  • Developed In-House, maximum leverage of
    technology crown-jewels
  • Close cooperation between module developers and
    system designers
  • or wide-ranging cross-licensing agreements
    between partners
  • Component-Driven
  • Core cells as IP carriers
  • IP encapsulated into usable products
  • design reuse is critical to IP products

12
Component-Driven SOC
  • Core supplier different from core user
  • Third party IP providers
  • Significant technology packaging without
    importing it
  • The IP provider wants to sell a product and not
    the technology behind the product
  • Enormous technical and legal challenges
  • can it be done successfully?
  • who guarantees that an SOC works as required?
  • who is liable in case the end product does not
    perform?

13
ASIC Cores Availability
  • 3Soft: uC, DSP, LAN, SCSI, PI
  • ARM: uC, uP
  • Plessey: per. controllers, DSP
  • Scenix: uC, PCI, DMA
  • Western Digital Center: uC
  • TI: DSP
  • NEC: DSP, uC
  • Symbios: ARM7 TC
  • VAutomation: uP, controllers
  • CAST: 2910A, IDT49C410, DMAc
  • LSI Logic: CoreWare
  • IBM Microelectronics
  • Motorola: FlexWare
  • Lucent

One-Stop Shops
  • Digital Design Dev: MIDI
  • Hitachi: MPEG, PCI, SCSI, uC
  • Palmchip: MPEG, UART, ECC
  • Silicon Engg.: micro VGA
  • Butterfly DSP: DSP, FFT, DFT, ADSL, OFDM
  • Int. Sil. Systems: ADPCM, FIR
  • Analog Devices: DSP
  • DSP Group: Pine, Oak
  • LogicVision: BIST, JTAG
  • ROHM: UART, SIO, PIO, FIFOc, Add, Mpy, ALU
  • Synopsys: DesignWare, ISA, Intel uC
  • Chip Express: FIFO, RAM, ROM
  • VLSI Libraries: Memory, Mpy
  • Eureka: PCI
  • Virtual Chips: PCI, USB
  • Logic Innovations: PCI, ATM
  • OKI: PCI, PCMCIA, DMA, UART
  • Sand: USB, PCI
  • Sierra: ATM SAR, Ether, R3000
  • Focus Semi: PLL, VCXO
  • VLSI Cores: Encryption, DES
  • ASIC Intl: DES

NOT EXHAUSTIVE.
14
FPGA/CPLD Cores Availability
  • Capacity constrained cores
  • do not include wide/high performance PCI, ATM
    SAR, or Microprocessors
  • Altera
  • 8-bit 6502
  • DMAC 8237
  • Xilinx
  • PCI
  • Actel
  • System Programmable Gate Array (SPGA)
  • combine FPGA with customer ASIC
  • ASIC examples: PCI, router, DMA controller.

15
Current Core Market Models
Three ways
  • 1. A design house licenses design and tools
  • DSP Group (Pine and Oak Cores), 3Soft, ARM (RISC)
  • offering includes HDL simulation model, tool
    and/or an emulator
  • customer does the design, fab.
  • 2. Core vendor designs and fabs ICs
  • TI, Motorola, Lucent
  • VLSI, SSI, Cirrus, Adaptec
  • 3. Core vendor sells cores, takes customer
    designs and fabs ICs
  • LSI logic, TI, Lucent

Licensable
Foundry Captive
Foundry-captive cores do not have to reveal the
internal design and layout of the core. The
foundry provides a bounding box.
16
Core Trends: 1997 Survey of Designers
Months to completion
  • 74 hardware designers.
  • 26 plan to purchase core for next design
  • 40 hard, 68 soft, 32 firm

Source: Integrated System Design
17
Application Needs
Source: Integrated System Design
18
Using Cores: PCI
  • Class of interface cores such as
  • USB, UART, SCSI, PCI, 1394 etc.
  • Identify target technology
  • ASIC, FPGA
  • PCI (Peripheral Component Interconnect)
  • processor-independent CPU interface to
    peripherals
  • multi-master, peer-to-peer protocol
  • synchronous, 8-33 MHz (132 MB/s)
  • arbitration: central, access oriented, hidden
  • variable-length bursting on reads and writes
  • (I/O, Mem) x (Read, Write) and IACK commands
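The 132 MB/s figure quoted above is the peak burst rate of a 32-bit bus clocked at 33 MHz, one transfer per clock. A small Python sketch of that arithmetic follows; the function name `peak_bandwidth_mb_s` is a hypothetical helper, and real sustained throughput is lower because of arbitration, address phases and wait states.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz):
    """Peak burst bandwidth in MB/s, assuming one data transfer per clock."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz

if __name__ == "__main__":
    print(peak_bandwidth_mb_s(32, 33))  # 32-bit, 33 MHz PCI
    print(peak_bandwidth_mb_s(64, 66))  # 64-bit, 66 MHz option
```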

19
PCI Cores
  • VHDL/Verilog synthesizable cores with options
  • PCI-Host, PCI-Satellite
  • 32-bit (33 MHz) or 64-bit (66 MHz)
  • FIFO or register data storage
  • Synchronous or Asynchronous host interface
  • Core components
  • Master/Target Read/Write FIFOs,
  • Master/Target State Machines
  • Configuration registers
  • Timing requirements
  • input setup time 7 ns, clock-to-output delay
    11 ns
  • DC specs: input pin caps 10 pF, clk pin 12 pF,
    IDSEL 8 pF

20
User Experience
  • Hughes Network Systems
  • DirecPC ASIC in a satellite receiver card
  • 80K-gate device on Chip Express process
  • DirecPC consists of
  • IDT R3041 RISC controller
  • Memory, Demodulator, Error-check, PCI core
  • PCI core from Virtual Chips
  • 17K gates including asynchronous FIFOs
  • Guesstimate: 4K extra gates due to the core (5%)
  • Comments
  • "Their test vectors assume you have direct access
    to the internal interface of the core. I looked
    through their test vectors and tried to do the
    same things using my back end."
  • "They were kind of giving us a reference
    documentation. It wasn't turnkey."

Source: EE Times
21
Using Cores: DSPs
  • 16-bit fixed point processors are most commonly
    used.
  • DSPs
  • simple: Clarkspur Design CD2450 (variable data
    width)
  • compatible: DSPGroup, TI, SGS-T 320C5x
  • clone
  • Options
  • memory, mem controller, interrupt controller,
    host port, serial port
  • Criticals
  • power consumption as most DSP applications go
    into portable products

22
Design using DSP Cores
  • Core vendors often supply a development chip or
    core version of the COTS processor
  • board-level prototyping fairly common
  • followed by single-chip solution
  • To avoid board-level prototyping, a
    full-functional simulation model is a must,
    particularly for foundry captive cores.
  • Software tools provided
  • assembler, linker, instruction set simulator,
    debugger, (high-level language compiler?)

23
DSP Sample Points
  • TI TEC320C52
  • 16-bit fixed-point TMS320C52
  • 1Kx16 data RAM, 4Kx16 program RAM
  • 2 serial ports, 1 16-bit timer
  • and 0.8 micron 15,000-gate gate array
  • Motorola 7-Day CSIC
  • 8-16 MHz HC08, DMA, MMU, ..
  • SGS-Thomson ST18932, ST18950
  • 16-bit fixed-point DSPs, 0.5 u, 3.3 volt CMOS,
    80MHz
  • has no off-the-shelf DSP IC
  • used in PC sound cards, 950 has a better assembly

Not exhaustive, only a representative sample.
24
Third Party DSP Cores
  • DSPGroup Pine
  • 16-bit fixed-point, 0.8u CMOS, 5.0/3.3 V, 40 MHz
  • 36-bit ALU, 16-bit MPY, 2Kx16 RAM/ROM, (prog mem
    is outside core)
  • used in pagers and answering machines
  • DSPGroup Oak
  • same as Pine, plus includes a bit manipulation
    unit
  • Viterbi decoding support instructions (min, max)
  • used in digital cellular telephony
  • Clarkspur CD2400, CD2450
  • 16-bit fixed-point
  • 24-bit ALU, MPY, Acc, 2x 256x16 data RAM/450
    makes it 48 bits
  • used in fax-modem

25
One-Stop Shops: LSI Logic CoreWare
  • Cores for building ASICs for most embedded
    applications
  • laser printer, ATM, PDA, set-top, router,
    graphics accelerators, etc.
  • CPU cores: miniRISC CW4K, Oak DSP
  • miniRISC compatible with MIPS R4000
  • 0.5u CMOS, 2 mW/MHz, 60 MHz, 3-stage pipeline
  • 32-bit address/data bus
  • full scan, 99% fault coverage, gate-level timing
    model
  • Interface: PCI, Fibre Channel, SerialLink
  • Networking: Ethernet, ATM (SAR), Viterbi, RS
  • Compression etc.: MPEG, JPEG, DAC/ADC.

26
Core Examples
  • Only a representative sample of cores. Not
    exhaustive or even comparative.
  • Processor cores
  • LSI Logic CW4001, CW4010
  • ARM (7) processors
  • Motorola FlexCore
  • Memory cores
  • 16M/18M Rambus DRAM
  • Multimedia cores
  • CompCore CD2
  • Networking
  • Media Access Controller (MAC)
  • Encryption cores
  • VLSI cores, ASIC international.

27
LSI Logic CW4001 Core
  • Behavioral Verilog/VHDL model
  • Gate-level timing accurate model
  • Specifications
  • 60 MHz, 60 MIPS (45 MIPS average), 3-stage
    pipeline
  • 0.5 micron CMOS process, 4 sq. mm, 2 mW/MHz
  • Full-scan with 99% fault coverage.
  • Interfaces
  • CBUS, Computational Bolt-On (CBO), Co-processor,
    MMU
  • Customizability
  • BIU, cache controller, MDU, MMU, DRAM/SRAM
    controllers, timers, caches (<16K), RAM/ROM, DMAc
  • Up to 3 co-processors (FPU, Graphics, Compression,
    Network Protocol), MPY/DIV unit, CRC, direct
    access to CPU GPRs

28
Using CW4001
  • Co-processor has its own instruction set
    including
  • read data bus for instruction, rd/wr to external
    mem.
  • read/write to CPU registers, stall and interrupt
    CPU
  • CW delivers the 0-5 and 26-31 opcode fields to the
    co-processor instruction decoder
  • Co-processor executes in lockstep with CPU
    pipeline stages.
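To make the two opcode fields concrete, here is a minimal Python sketch that extracts bits 0-5 and bits 26-31 from a 32-bit instruction word. It assumes a MIPS-style layout with bit 0 as the least significant bit; the function name `coprocessor_fields` and the sample word are hypothetical, and the actual co-processor decode logic is not described in the slides.

```python
def coprocessor_fields(instr):
    """Return (bits 26-31, bits 0-5) of a 32-bit instruction word."""
    major = (instr >> 26) & 0x3F   # upper six bits
    minor = instr & 0x3F           # lower six bits
    return major, minor

if __name__ == "__main__":
    word = (0x12 << 26) | (0x1234 << 6) | 0x05   # made-up instruction word
    print(coprocessor_fields(word))
```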

29
CW4010 CPU Core
  • Verilog/VHDL model with gate-level timing
  • 80MHz, 160 MIPS (110 MIPS average), 6 stage
    pipeline
  • 0.5 micron CMOS, 9 sq. mm., 5 mW/MHz
  • Integrated cache controllers with separate I and
    D caches
  • cache size from 2-16 KB
  • 64-bit memory and cache interface
  • Up to 3 co-processors
  • Full-scan with 99% fault coverage.
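The mW/MHz figures translate directly into a power estimate at the rated clock. The Python sketch below works this out for the CW4010 numbers on this slide (5 mW/MHz, 80 MHz, 160 MIPS) and the CW4001 numbers quoted earlier (2 mW/MHz, 60 MHz, 60 MIPS). The helper names are hypothetical, and the estimate assumes power scales linearly with frequency and covers the core only.

```python
def core_power_mw(mw_per_mhz, clock_mhz):
    """Estimated core power in mW, assuming linear scaling with clock frequency."""
    return mw_per_mhz * clock_mhz

def mips_per_mw(peak_mips, mw_per_mhz, clock_mhz):
    """Peak MIPS delivered per mW at the rated clock."""
    return peak_mips / core_power_mw(mw_per_mhz, clock_mhz)

if __name__ == "__main__":
    print("CW4001:", core_power_mw(2, 60), "mW,", mips_per_mw(60, 2, 60), "MIPS/mW")
    print("CW4010:", core_power_mw(5, 80), "mW,", mips_per_mw(160, 5, 80), "MIPS/mW")
```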

30
Advanced RISC Machines (ARM)
  • A family of 32-bit RISC processor cores
  • ARM6, ARM7: MPU with cache, MMU, write buffer and
    JTAG
  • ARM7TDMI: ARM7 with Thumb ISA, ICE, Debug, MPY
  • ARM8: cached, low power, 5-stage pipe (vs 3 in
    others)
  • StrongARM1, StrongARM2: available as Digital
    SA-110 (21285)
  • Piccolo: DSP co-processor for ARM, shares system
    bus (AMBA)
  • support for Viterbi, bit manipulation operations
  • four nestable zero-overhead hardware loop
    constructs
  • splittable ALU, 1 cycle dual 16-bit operations
  • saturation arithmetic
  • 1024 point in place complex radix 2 FFT in 33,331
    cycles
  • Manufacturing partnerships and/or licensing with
  • Cirrus logic, GEC Plessey, Sharp, TI and VLSI
    Tech.
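Two of the Piccolo features listed above, saturation arithmetic and dual 16-bit operations on a split ALU, can be illustrated in a few lines of Python. This is a behavioral sketch only: the function names are hypothetical and it does not model Piccolo's actual instruction semantics.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def sat16(x):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, x))

def sat_add16(a, b):
    return sat16(a + b)

def to_signed16(h):
    """Interpret a 16-bit pattern as a two's-complement value."""
    return h - 0x10000 if h & 0x8000 else h

def dual_sat_add16(x, y):
    """Add two 32-bit words as two independent signed 16-bit lanes."""
    lanes = []
    for shift in (0, 16):
        a = to_signed16((x >> shift) & 0xFFFF)
        b = to_signed16((y >> shift) & 0xFFFF)
        lanes.append(sat_add16(a, b) & 0xFFFF)
    return lanes[0] | (lanes[1] << 16)

if __name__ == "__main__":
    print(sat_add16(30000, 10000))
    print(hex(dual_sat_add16(0x7FFF0001, 0x00010002)))
```

In the second call the upper lane would overflow and is clamped to 0x7FFF, while the lower lane adds normally.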

31
ARM Processor Cores
Source: ARM Inc.
  • Enhancements: ARM7D, ARM7DM, ARM7DMI
  • M: 64-bit result hardware multiplier running at
    8 bits/cycle
  • D: 2 boundary scan chains for basic debug
  • I: EmbeddedICE debug
  • Thumb instruction set

32
ARM Enhancements Embedded ICE
  • The EmbeddedICE core cell allows debugging of an ARM
    core embedded within an ASIC
  • real time address and data-dependent breakpoints
  • full access and control of the CPU
  • can be reduced for size savings once the part
    goes into production.

[Figure: a debug host running ARMsd connects through the boundary scan pins to the EmbeddedICE cell, which sits next to the ARM core inside the ASIC. 40 KB/s software download.]
Source: ARM Inc.
33
ARM Enhancements Thumb ISA
  • 8- or 16-bit external, 32-bit internal
  • Thumb instruction set is a subset of 32-bit ARM
    instruction set
  • 16-bit instructions
  • expanded into 32-bit ARM instructions at run
    time without any penalty
  • Up to 65-70% smaller code size compared to ARM
  • 130% of ARM performance with 8/16-bit memory
  • 85% of ARM performance with 32-bit memory

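[Figure: the 16-bit Thumb instruction ADD Rd, #constant (major opcode 001, minor opcode 10, Rd, constant) expands into a 32-bit ARM instruction with the fields 1110 (always), 001, 01001, 0 Rd, 0 Rd (destination and source) and 0000 Constant, with Rd and the constant zero extended.]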
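The expansion in the figure can be written out as bit manipulation. The Python sketch below assumes the field layout shown on the slide: a Thumb word with major opcode 001, minor opcode 10, a 3-bit Rd and an 8-bit constant, mapped to an ARM word with condition 1110, the bit groups 001 and 01001, Rd in both register fields, and the constant in the low byte. The function name is hypothetical and only this one instruction form is handled.

```python
def expand_thumb_add_imm(thumb):
    """Expand a 16-bit Thumb 'ADD Rd, #constant' into its 32-bit ARM equivalent."""
    if (thumb >> 11) & 0x1F != 0b00110:     # major opcode 001, minor opcode 10
        raise ValueError("not a Thumb ADD Rd, #constant instruction")
    rd = (thumb >> 8) & 0x7                  # 3-bit register, zero extended to 4 bits
    constant = thumb & 0xFF                  # 8-bit constant, zero extended to 12 bits
    return ((0b1110 << 28)                   # condition: always
            | (0b001 << 25)
            | (0b01001 << 20)
            | (rd << 16)                     # source register
            | (rd << 12)                     # destination register
            | constant)

if __name__ == "__main__":
    thumb = (0b00110 << 11) | (1 << 8) | 5   # ADD r1, #5
    print(hex(thumb), "->", hex(expand_thumb_add_imm(thumb)))
```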
34
ARM Applications
  • Widely used in a variety of applications
  • low cost 16-bit applications
  • mobile phones, modems, fax machines, pagers
  • hard disk and CD drive controllers
  • engine management
  • low cost 32-bit applications
  • smart cards
  • ATM and ethernet network interfaces
  • low power, on-chip application code
  • high performance 32-bit applications
  • digital cameras
  • set top boxes, network switches, laser printers
  • external memory system (RAM, ROMs)

Courtesy S. Dey, ICCAD96
35
Motorola FlexCore
  • CPU cores based on 680x0 family
  • EC000, EC020, EC030
  • all with static operation, 5/3.3 volt supplies
  • performance
  • EC000: 2.7 MIPS at 16.67 MHz, 33 mW
  • EC020: 7.4 MIPS at 25 MHz, 150 mW
  • EC030: 11.8 MIPS at 33 MHz, 258 mW
  • Serial I/O cores: 68681 UART, MBus, SPI
  • RT clock, dual timer cores
  • SCSI, parallel I/O, 8051 interfaces
  • DRAM, Interrupt, JTAG controllers
  • PLA, PLL, oscillators, power management cells.

36
Memory Core Example
  • Virtual Chips 16M/18M bit Rambus DRAM
  • Verilog/VHDL simulation model
  • Organization
  • two banks, 512 pages per bank, 72x256 per page
  • dual internal banks, 2K byte cache per bank
  • Programmable ack, write, read delays through
    control registers
  • Synchronous protocol for fast block oriented
    xfrs.
  • Modes of operation
  • reset, stand-by, power-down, active
  • Deliverables: VHDL and Verilog source, test bench,
    test vectors, documentation.
  • Others: Sand DRAM, VRAM Verilog models.
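The organization figures above can be checked against the 16M/18M capacity with simple arithmetic. The Python sketch below reads "72x256 per page" as 256 words of 72 bits, and assumes that the 16M figure corresponds to 64 data bits out of every 72 (the remaining bits being the extra ninth bit per byte). Both readings are assumptions, and `capacity_bits` is a hypothetical helper.

```python
MBIT = 1 << 20

def capacity_bits(banks, pages_per_bank, words_per_page, bits_per_word):
    """Total storage in bits for a banked, paged memory organization."""
    return banks * pages_per_bank * words_per_page * bits_per_word

with_ninth_bit = capacity_bits(2, 512, 256, 72)
data_only = capacity_bits(2, 512, 256, 64)

if __name__ == "__main__":
    print(with_ninth_bit // MBIT, "Mbit and", data_only // MBIT, "Mbit")
```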

37
Multimedia Cores
MPEG input
Source CompCore
  • JPEG compression, MPEG decoding, Video DAC, etc.
  • IBM Microelectronics, LSI logic, PalmChip,
    Silicon Engineering, Mentor Graphics, CompCore,
    Intrinsix VGA
  • Example MPEG-2 decoder from CompCore
  • 70K-80K gates
  • 18K bits of internal SRAM
  • 16Mbit SDRAM (external)
  • bitstream buffering, frames
  • 54MHz, 16-bit external mem. bus

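[Figure: CD2 decoder block diagram with a microcontroller interface, audio decoder, video decoder, synchronization, virtual memory controller, three SRAM blocks, and a physical memory controller attached to a 1Mx16 SDRAM; the outputs are the audio stream and the video stream.]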
38
Other Core Categories
Networking
  • Protocol choices
  • switched Ether, s. TR, ATM155, ATM25
  • Example: SYM1000 from Symbios
  • HDL code, 3.3 V, 0.5u
  • CSMA/CD Ethernet
  • programmable inter-packet gap
  • optional CRC insertion and check
  • MII interface to physical layer device
  • host bus interface
  • LSI Logic ATMizer
Encryption
  • VLSI Cores
  • PKuP encryption core
  • implements modular exponentiation
  • synthesizable HDL core
  • DES core as a synthesizable Verilog model
  • two models: 8 bytes/8 cycles, 8 bytes/16 cycles
  • ASIC International
  • DES cores
  • Exponentiator Engine
  • Hash function cores
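Modular exponentiation, the operation attributed to the PKuP core above, is usually computed with the square-and-multiply method. The Python sketch below shows that textbook algorithm; it is not a description of how the PKuP core or the Exponentiator Engine is implemented, and `mod_exp` is a hypothetical name.

```python
def mod_exp(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by binary square-and-multiply."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current exponent bit is set
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result

if __name__ == "__main__":
    print(mod_exp(4, 13, 497))
```

The loop runs once per exponent bit, which is why hardware implementations are typically characterized by cycles per modular multiplication.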

39
  • What are cores?
  • Building systems using cores
  • Challenges in using cores

40
Challenges in Using Cores
  • A core cell is not a single product
  • a PCI cell consists of 25 separate Verilog files
  • plus as many synthesis scripts
  • immature interface abstraction
  • e.g., there is no direct access to the core from
    the end product. Access must be created.
  • A core is not an end product
  • using a core cell takes design know-how for a
    particular process, tools and even application
  • Testability and testing is a challenge
  • as opposed to design, testing is not a
    hierarchical problem
  • using 90% testable cores does not give 90% system
    testability
  • tests are core-specific, not applicable from
    primary IO
  • What is an efficient design methodology using
    cores?

41
SOC Design Problem Components
  • 1. Design environment, co-simulation, constraint
    analysis.
  • 2. HDL modeling, architectural synthesis, logic
    synthesis, physical synthesis.
  • 3. Software synthesis, optimization, retargetable
    code generation, debugging and programming environment.
  • 4. Test issues: test access, isolation, ATPG.
[Figure: SOC block diagram with processor, ASIC, memory, DMA, interface and analog I/O blocks.]
Processor cores introduce a software part into system
design.
42
Co-Design Components
  • Specification, Modeling and Analysis
  • How to capture designer intent efficiently in a
    design language?
  • HDL optimizations
  • Constraint modeling and analysis
  • System Validation
  • How to use description in building a
    (computational) prototype capable of running
    actual applications?
  • Co-simulation, Formal Verification
  • System Design and Synthesis
  • Delayed partitioning of hardware and software
  • Software synthesis and optimizations
  • Interface design and optimizations.

43
System Specification: Goals and Characteristics
  • Main purpose: provide a clear and unambiguous
    description of the system function, and
    documentation of the initial design process
  • Support
  • diverse models of computation
  • allow the application of computer-aided design
    tools for
  • design space exploration
  • partitioning
  • software-hardware synthesis
  • validation (verification, simulation)
  • testing
  • Should not constrain the implementation options.
  • diverse implementation technologies.

44
Embedded System Modeling
  • Reactive and time-constrained interactions
  • Consist of structural and behavioral components.
  • Hierarchically organized components.
  • Synchronous and asynchronous communications.
  • Locally or globally clocked.
  • Idealized as Synchronous Reactive Systems.

45
Synchronous Reactive Modeling
  • Zero computation time
  • System outputs produced in synchrony with inputs
  • Instantaneous broadcast communications
  • Deterministic behavior
  • a given sequence of inputs always produces same
    output sequence.
  • Example languages using this model
  • ESTEREL, LUSTRE.
  • More later.

46
Example: Esterel
  • Reactivity and atomicity of reactions
  • watching implements a generalized watchdog
  • Time as discrete instants
  • Easily translated into a transducer (FSM
    generation)
  • Perfect synchrony hypothesis
  • Instantaneous broadcast
  • Implicit communication architecture.
  • Using signals which are present or absent and may
    carry a value.
  • Pure signals do not carry a value.
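The ideas above (discrete instants, signals that are present or absent, a watchdog that preempts running behavior, and translation into an FSM) can be sketched in Python as a small state machine. This is an illustration and not Esterel: the signal names START, DONE, RESET, BUSY, OK and ABORTED are hypothetical, and only pure signals are modeled.

```python
def react(state, present):
    """One instant: map (state, set of present inputs) to (next state, emitted signals)."""
    if state == "idle":
        if "START" in present:
            return "running", {"BUSY"}
        return "idle", set()
    # state == "running"
    if "RESET" in present:          # watchdog: RESET preempts the running task
        return "idle", {"ABORTED"}
    if "DONE" in present:
        return "idle", {"OK"}
    return "running", {"BUSY"}

def run(inputs):
    """Feed a sequence of instants (sets of present signals) and collect the outputs."""
    state, trace = "idle", []
    for present in inputs:
        state, emitted = react(state, present)
        trace.append(emitted)       # outputs are produced in the same instant as inputs
    return trace

if __name__ == "__main__":
    print(run([{"START"}, set(), {"RESET"}, {"START"}, {"DONE"}]))
```

Because `react` is a pure function of the state and the present signals, a given input sequence always yields the same output sequence, which is the determinism property of the synchronous model.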

47
Constraint and Interface Modeling
  • Source of timing constraints
  • Time-constrained interactions between system
    components and environment
  • Specified using statement tags on HDL
    descriptions.
  • Types of constraints
  • Delay and interval constraints (latency-type)
  • Rate constraints (throughput-type)
  • Constraint satisfiability
  • Are constraints satisfied for a given
    implementation?
  • Given an implementation, resynthesize to satisfy
    a given set of constraints.

48
Example
Derived from events at system interfaces.
49
Interface Modeling using Constraints
  • Interface described using events.
  • Events are instances of actions.
  • Most common interface action is a signal
    transition on a wire.
  • Temporal relationship between events
  • Propagation delays
  • Bounds on event separation intervals: min, max,
    linear
  • Absolute versus relative rate constraints.

50
Binary Delay Constraints
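[Figure: two constraint graphs over events i, j and k, one labeled MAX with max constraints on its edges and one labeled MIN with min constraints on its edges.]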
51
Interface Delay Timing Constraints
  • Three types (McMillan & Dill)
  • Given events i and j with time stamps $t_i$ and $t_j$
    respectively, and $d_{ij}$ as the delay from event i to event j,
    such that $l_{ij} < d_{ij} < u_{ij}$
  • min constraints: $t_j = \min_{i<j} (t_i + d_{ij})$
  • max constraints: $t_j = \max_{i<j} (t_i + d_{ij})$
  • linear constraints: $t_j - t_i < s_{ij}$, where $s_{ij}$
    is the maximum achievable separation between i and
    j.
  • Constraint graph
  • nodes <-> events, edges <-> constraints.
  • Synthesis: find the maximum achievable separation
    between pairs of events (minimum separation
    depends upon operation delays.)
  • Rate constraint analysis and debugging.
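As a small illustration of min and max constraints on a constraint graph, the Python sketch below computes event time stamps for fixed delays by visiting events in topological order. It assumes an acyclic graph and single delay values rather than the lower and upper bounds discussed above, so it does not perform the maximum-separation analysis itself; all names and numbers are hypothetical.

```python
def event_times(preds, kinds, order, start=0):
    """Compute time stamps for events on an acyclic constraint graph.

    preds: {event j: [(event i, delay d_ij), ...]}
    kinds: {event j: "max" or "min"} selects how j combines its predecessors
    order: events listed in topological order
    """
    t = {}
    for j in order:
        incoming = preds.get(j, [])
        if not incoming:                 # source event
            t[j] = start
            continue
        candidates = [t[i] + d for i, d in incoming]
        t[j] = max(candidates) if kinds.get(j, "max") == "max" else min(candidates)
    return t

if __name__ == "__main__":
    preds = {"k": [("i", 3), ("j", 5)]}
    print(event_times(preds, {"k": "max"}, ["i", "j", "k"]))
    print(event_times(preds, {"k": "min"}, ["i", "j", "k"]))
```

With a max constraint event k waits for the later of its two causes, and with a min constraint it fires on the earlier one.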

52
Hardware Modeling As A Programming Activity
  • Programming languages are often used for
    constructing system models
  • Core based designs assume that all new designs
    originate as an HDL model
  • Hardware
  • concurrency in operations
  • I/O ports and interconnection of blocks
  • exact event timing is important: open computation
  • Software
  • typically sequential execution
  • structural information is less important
  • exact event timing is not important: closed
    computation.

53
HDL Semantic Necessities
  • Abstraction
  • provide a mechanism for building larger systems
    by composing smaller ones
  • Reactive programming
  • provide mechanisms to model non-terminating
    interaction with other components
  • watching (signal) and waiting (condition)
  • must be separate (else one is an implementation
    of the other)
  • exception handling
  • Determinism
  • provide a predictable simulation behavior
  • Simultaneity
  • model hardware parallelism, multiple clocks

54
HDL Pragmatics
  • Data types
  • simple (bit/Boolean): HardwareC, Verilog
  • complex (records): VHDL
  • Interface abstraction
  • provide an external view independent of
    implementation
  • Classes (packages) in C++, VHDL
  • Entity interfaces or tasks: VHDL, ADA

55
Pragmatics (contd.)
  • Communication
  • shared variables using explicit communication
    architectures
  • synchronous handshaking using implicit
    communications (ADA task entry call)
  • instantaneous broadcast (Esterel)
  • asynchronous message passing using explicit
    communication architectures
  • Time
  • global, multiple clocks, logics.

56
Going from HLL to HDL
[Figure: flow from a (restricted) HLL description to an HDL description. DATA: refine data types (bit true, fixed point, saturation arithmetic). CONTROL: add reactivity, clock(s), waiting and watching.]
57
HLL Restrictions
  • Classes for synthesis target: do not use
  • unions, floating, pointers (only interface with
    lib)
  • type casts
  • virtual functions (restricted to only library
    classes)
  • policy of use on shared variables
  • Suggestions
  • explicit initialization blocks
  • use defines instead of conditional process
    enables for statically determined conditions

58
Adding Reactivity
  • Reactivity can be added in one of three ways
  • 1. use annotations, comments
  • commonly used in home-grown C-based HDLs
  • sometimes use semantic overloading, that is,
    associating an alternative interpretation.
  • 2. use library assists
  • additional library elements that can be used by
    the programmer in modeling hardware.
  • example: additional classes in C++
  • 3. use additional language constructs
  • new constructs require a specific language
    front-end, new debugging tools.
  • example: divide operations across cycles using
    next()
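The idea of dividing operations across cycles can be mimicked in Python with a generator, where each `yield` plays the role of a clock boundary in a sequential process. This is only an analogy under that assumption; it is not the next() construct of any particular HDL, and the process and helper names are hypothetical.

```python
def mac_process(samples, coeffs):
    """Multiply-accumulate spread over two cycles per sample pair."""
    acc = 0
    for x, c in zip(samples, coeffs):
        product = x * c          # cycle 1: multiply
        yield ("mul", acc)       # clock boundary
        acc += product           # cycle 2: accumulate
        yield ("add", acc)       # clock boundary

def simulate(process):
    """Run a process to completion; return (cycle count, final accumulator value)."""
    trace = list(process)
    return len(trace), trace[-1][1]

if __name__ == "__main__":
    print(simulate(mac_process([1, 2], [3, 4])))
```

Moving a `yield` changes the cycle count without changing the computed result, which is the kind of latency versus cycle-time tradeoff such a construct exposes to the designer.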

59
Adding Data Types
  • Identify signals
  • storage elements, structured memory blocks
  • Type variables: signed, unsigned, std_logic
  • Size state variables on instantiation

60
Language Comparisons
  • Verilog, VHDL: compiler produces inputs to run a
    discrete-event simulator (DES).
  • Esterel: compiler produces a single deterministic
    FSM.
  • Scenic: compiler produces (synthesizable)
    processes and a simulator.

61
From HDL to Circuit/System: Compilation and
Synthesis
  • Compilation spans programming language theory,
    architecture and algorithms
  • Synthesis spans concurrency, finite automata,
    switching theory and algorithms
  • In practice, the two tasks are inter-related.
  • Compilation and synthesis tasks are done in three
    steps
  • front-end, intermediate optimizations, back-end.

62
Compilation
  • Program compilation for software target
  • Front-end: parsing into intermediate form
  • Optimization over the intermediate form
  • Back-end: code generation for a given processor
  • HDL compilation for hardware target
  • Front-end: parsing into intermediate form
  • Optimization over the intermediate form
  • Back-end: architecture, logic and physical
    synthesis.

63
Synthesis and Optimization
  • Substantial growth in last twenty years
  • Industry-standard tools in
  • Logic synthesis
  • Physical synthesis
  • Behavioral synthesis just becoming commercial.
  • Substantial room for growth when considered
    together with software compilation.

64
Behavioral to RTL
  • Basic transformations needed
  • 1. Operation scheduling
  • 2. Resource binding
  • 3. Control generation: central or distributed.
  • Evolutionary growth to synthesis tools
  • Designer expertise today lies in the RTL coding
  • Synthesis tools are strongly dependent upon
    design methodology.
  • Generate a structure suitable for synchronous and
    single-phase circuits
  • resource performance in terms of execution delay
  • in number of clock cycles
  • Design space
  • area, cycle time, latency, throughput

65
Synthesis Tasks
  • Operation scheduling, resource binding, control
    generation
  • Scheduling determines operation start times
  • minimize latency
  • Resource binding: resource selection, allocation
  • minimize area (maximize sharing)
  • Control synthesis
  • data-path connectivity synthesis
  • detailed resource connections
  • steering logic
  • connection to the interface
  • control synthesis
  • synthesize controller that provides
    operations/resource enables, operation
    synchronization, resource arbitration
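Operation scheduling, the first task above, can be illustrated with the simplest scheduler: as-soon-as-possible (ASAP) scheduling, which gives the minimum latency when resources are unlimited. The Python sketch below assumes an acyclic dependency graph and integer delays in clock cycles; the operation names and delays are hypothetical, and resource binding is not modeled.

```python
def asap_schedule(deps, delay):
    """ASAP scheduling. deps: {op: [predecessor ops]}; delay: {op: cycles}.

    Returns {op: start cycle}, where each operation starts as soon as all of
    its predecessors have finished.
    """
    start = {}

    def visit(op):
        if op not in start:
            start[op] = max((visit(p) + delay[p] for p in deps.get(op, [])), default=0)
        return start[op]

    for op in delay:
        visit(op)
    return start

def latency(start, delay):
    """Total schedule length in cycles."""
    return max(start[op] + delay[op] for op in start)

if __name__ == "__main__":
    deps = {"add": ["mul1", "mul2"], "sub": ["add"]}
    delay = {"mul1": 2, "mul2": 2, "add": 1, "sub": 1}
    schedule = asap_schedule(deps, delay)
    print(schedule, latency(schedule, delay))
```

Both multiplications start in cycle 0 here, so this schedule needs two multipliers. Sharing a single multiplier would reduce area at the cost of a longer latency, which is the tradeoff between scheduling and binding.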

66
A CAD Methodology for SW
  • Automated software synthesis from specs.
  • Synthesis tools generate implementation
  • Global optimization of the program.
  • Optimization used to achieve design goals.
  • Analysis and verification tools for feedback.
  • Compilation for embeddable software
  • Software Optimizations
  • Code compression
  • Optimization for power
  • Instruction-set generation
  • Static memory allocation

67
Compression
  • Block-based compression
  • Program compressed in small blocks to preserve
    random-access properties (e.g., cache line
    blocks)
  • Transparent code compression
  • ISA unchanged. Compression uses compiler output.
  • Decompression performed by cache refill engine.
  • Processor sees only uncompressed code.
  • Techniques: Huffman coding.
  • Key issue: where is code located in memory after
    compression?
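Block-based Huffman compression can be sketched in Python as follows. The program image is split into fixed-size blocks that are encoded independently with one shared code, and a table of compressed bit offsets is kept so that any block can still be located directly. That offset table is one possible answer to the code-location question and is an assumption here, as are all function names; the sketch also assumes a non-empty input and keeps bits as strings for readability.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    freq = Counter(data)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table for that subtree).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tick, merged))
        tick += 1
    return heap[0][2]

def compress_blocks(program, block_size):
    """Encode each block separately; record the bit offset where each block starts."""
    code = huffman_code(program)
    blocks, offsets, pos = [], [], 0
    for i in range(0, len(program), block_size):
        bits = "".join(code[b] for b in program[i:i + block_size])
        offsets.append(pos)
        blocks.append(bits)
        pos += len(bits)
    return code, blocks, offsets

def decode_block(bits, code):
    """Decode one block; works because no code word is a prefix of another."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return bytes(out)

if __name__ == "__main__":
    program = b"abracadabra abracadabra"
    code, blocks, offsets = compress_blocks(program, 8)
    print(offsets, decode_block(blocks[1], code))
```

In the scheme described above, a cache refill engine would use such a table to find the compressed block for a missing cache line and decompress only that block.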

68
Compilation: What is New?
  • Machine description
  • in terms of architecture -> programming
  • in terms of organization -> hardware
  • Retargetable code generation has traditionally
    addressed the problem of compilation for an
    architecture.
  • SOCs also need input about machine organization
    in order to perform timing analysis on generated
    code
  • Two approaches
  • describe detailed machine
  • extract ISA from machine organization

69
Co-Design Framework
Hardware Design Synthesis
70
Test Strategy for Firm/Hard Cores
  • System-level test strategy
  • build test sets for cores
  • generate functional vectors
  • fault grade for interconnects
  • prepare cores for test application from primary
    inputs through access/isolation, Scan/DFT
  • if BIST, schedule BIST application and signature
    analysis.
  • System-level DFT
  • goal is to reduce testing cost
  • increase accessibility of the internal nodes
  • controllability: ability to establish a specific
    signal value at each node from primary inputs
    (PIs)
  • observability: ability to determine a signal value by
    controlling PIs and observing primary outputs
  • tradeoffs: area, I/O pins, performance, yield, TTM

71
DFT Techniques
  • Commonly used approach is to modify a sequential
    circuit into a combinational one during test.
  • Automatic test generation is much easier for
    combinational circuits
  • Current monitoring techniques.
  • For sequential circuits, scan techniques are
    often used
  • link memory elements into a shift register
  • serially load and read out
  • boundary scan is commonly used to test
    board-level devices
  • Built-In Self Test
  • minimal external support, high fault coverage,
    easy access requirements, protect IP
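The scan technique described above (link the memory elements into a shift register, serially load a pattern, capture the response, and serially read it out) can be modeled in a few lines of Python. This is a behavioral sketch under simplifying assumptions: one scan chain, one capture clock, and the combinational logic given as a plain function. The class and method names are hypothetical.

```python
class ScanChain:
    """Flip-flops linked into a shift register for test access."""

    def __init__(self, length):
        self.flops = [0] * length

    def shift(self, scan_in):
        """One scan clock: shift by one position and return the bit that falls out."""
        scan_out = self.flops[-1]
        self.flops = [scan_in] + self.flops[:-1]
        return scan_out

    def load(self, pattern):
        """Serially shift in a pattern so that the flops end up equal to it."""
        for bit in reversed(pattern):
            self.shift(bit)

    def capture(self, logic):
        """One functional clock: the flops capture the combinational logic outputs."""
        self.flops = list(logic(self.flops))

    def unload(self):
        """Serially shift out the captured state, last flop first."""
        return [self.shift(0) for _ in range(len(self.flops))]

if __name__ == "__main__":
    chain = ScanChain(3)
    chain.load([1, 0, 0])
    chain.capture(lambda s: [s[0], s[0] & s[1], s[1] | s[2]])
    print(chain.unload())
```

During test the circuit behaves as purely combinational logic between a controllable state (the loaded pattern) and an observable state (the unloaded response), which is what makes automatic test generation tractable.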

72
Test Access for Cores
  • Peripheral access techniques
  • parallel access, serial access or functional
    access
  • Parallel access
  • add MUXs to connect core IOs, high routing
    overhead, pin limitations may prevent parallel
    access
  • Serial access
  • most common is ring approach, during test core
    I/Os are connected via a scan chain, low
    overhead, delay penalty, easy to test
    user-defined logic, long test application time
  • Functional access
  • sensitize path through cores, low hardware cost,
    parallel test pattern translation possible.
  • Also need isolation mechanisms for cores.

73
Summary of Part I
  • Core cells present a new market opportunity
  • core cells are breathing life into many old
    designs (6502)
  • a new class of third-party vendors who bridge
    the gap between design houses and EDA vendors.
  • Productization of cores faces many challenges
  • portability of cores versus design reuse
  • socketing standards (portability and reuse)
  • IP protection: encryption, product versus
    technology
  • design and test methodologies
  • Research outlook is aligned with industry
    expectations
  • all new designs start with HDL description
  • immediate focus on validation, testability issues
  • long term focus on software optimization,
    complexity management.