Giga-Scale System-On-A-Chip: International Center on System-on-a-Chip (ICSOC)

1
Giga-Scale System-On-A-Chip International Center
on System-on-a-Chip (ICSOC)
  • Jason Cong
  • University of California, Los Angeles
  • Tel: 310-206-2775, Email: cong_at_cs.ucla.edu
  • (Other participants are listed inside)

2
Project Summary
  • Develop new design methodology to enable
    efficient giga-scale integration for
    system-on-a-chip (SOC) designs
  • Project includes three major components
  • SOC synthesis tools and methodologies
  • SOC verification, test, and diagnosis
  • SOC design driver: network processor

3
Research Team by Institutions
  • US:
  • UCLA: Jason Cong
  • UC Santa Barbara: Tim Cheng
  • Taiwan:
  • NTHU: Shi-Yu Huang, Tingting Hwang, J. K. Lee,
    Youn-Long Lin, C. L. Liu, Cheng-Wen Wu, Allen Wu
  • NCTU: Jing-Yang Jou
  • China:
  • Tsinghua Univ.: Jinian Bian, Xianlong Hong, Zeyi
    Wang, Hongxi Xue
  • Peking Univ.: Xu Cheng
  • Zhejiang Univ.: Xiaolang Yan

4
Current Research Team
  • US:
  • UCLA: Jason Cong
  • UC Santa Barbara: Tim Cheng
  • Taiwan:
  • NTHU: Shi-Yu Huang, Tingting Hwang, J. K. Lee,
    Youn-Long Lin, C. L. Liu, Cheng-Wen Wu, Allen Wu
  • NCTU: Jing-Yang Jou
  • China:
  • Tsinghua Univ.: Jinian Bian, Xianlong Hong, Zeyi
    Wang, Hongxi Xue
  • Peking Univ.: Xu Cheng
  • Zhejiang Univ.: Xiaolang Yan
  • Several new faculty members in the 7 institutions
  • Guest members from National University of
    Singapore, Purdue Univ., and UCLA (EE Dept)

5
Thrust 1 -- SOC Synthesis Environment/Methodology
(Led by Jason Cong)
  • VHDL/C co-simulation
  • Design spec: VHDL/C
  • Design partitioning
  • Code generation for retargetable compiler and
    assembler generator
  • DSP synthesis and optimization
  • FPGA synthesis and technology mapping
  • Embedded processors
  • DSPs
  • Embedded FPGAs
  • Customized logic

6
Interconnect Bottleneck in Nanometer Designs
  • 2nd challenge: single-cycle full-chip
    synchronization is no longer possible
  • Not supported by the current CAD toolset
  • About to happen soon
  • ITRS'01 0.07 µm technology
  • 5.63 GHz across-chip clock
  • 800 mm² (28.3 mm x 28.3 mm)
  • IPEM BIWS estimations
  • Buffer size 100x
  • Driver/receiver size 100x
  • On semi-global layer (tier 3)
  • Can travel up to 11.4 mm in one cycle
  • Need 5 clock cycles from corner to corner
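
The corner-to-corner figure in the list above can be checked with a little arithmetic. The sketch below assumes Manhattan (x plus y) routing between opposite corners of the 28.3 mm x 28.3 mm die, which the slide does not state; the function name is hypothetical.

```python
import math

def cycles_to_cross(distance_mm: float, reach_mm_per_cycle: float) -> int:
    """Clock cycles needed for a signal to cover distance_mm."""
    return math.ceil(distance_mm / reach_mm_per_cycle)

DIE_EDGE_MM = 28.3   # from the slide: edge of the 800 mm^2 die
REACH_MM = 11.4      # from the slide: distance covered in one cycle

# Assumption: corner-to-corner wires follow Manhattan routing.
manhattan_mm = 2 * DIE_EDGE_MM
corner_cycles = cycles_to_cross(manhattan_mm, REACH_MM)
print(corner_cycles)
```

Under the Manhattan assumption the result is 5 cycles, which agrees with the slide; a straight diagonal of about 40 mm would need 4.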

7
Regular Distributed Register Architecture (2)
  • Use register banks
  • Registers in each island are partitioned into k
    banks for 1-cycle, 2-cycle, ..., k-cycle
    interconnect communication
  • Highly regular

8
MCAS: Placement-Driven Architectural Synthesis
Using RDR Architecture

9
Experimental Results (3)
  • MCAS basic flow vs. Synopsys Behavioral Compiler
    (on Virtex-II)
  • Synopsys Behavioral Compiler setting: default
    (optimizing latency)
  • Average latency ratio of MCAS vs. BC: 69%

[Charts: latency and resource comparison]

10
Optimality Study of Large-Scale Circuit Placement
  • Construction of Placement Examples with Known
    Optimal (PEKO) [C. Chang et al., 2003]

11
High Interest in the Community
  • Coverage in two EE Times articles
  • "Placement tools criticized for hampering IC
    designs" (Feb. 2003)
  • "IC placement benchmarks needed, researchers say"
    (April 2003)
  • More than 60 downloads from our website
  • Cadence, IBM, Intel, Magma, Mentor Graphics,
    Synopsys, etc.
  • CMU, SUNY, UCB, UCSB, UCSD, UIC, UMichigan,
    UWaterloo, etc.
  • Used in every placement since its publication

http://ballade.cs.ucla.edu/pubbench

12
1. Synthesis and Verification
  • Hardware/software partitioning
  • Propose an SSS-based H/S partitioning algorithm
    (ASICON 2003)
  • Better solutions than simulated annealing (SA)
    and less runtime than tabu search
  • High-level synthesis
  • Re-synthesis algorithm after floorplanning for
    timing optimization (ASICON 2003)
  • Perform floorplanning based on the initial
    scheduling
  • After floorplanning, perform re-scheduling and
    re-allocation using a force-balance method
  • Controller synthesis
  • A heuristic state minimization algorithm for
    incompletely specified finite state machines
    (ASICON 2003, JCST)

13
2. Floorplanning and Interconnect Planning
  • Building on the proposed Corner Block List (CBL)
    representation, propose several extended corner
    block lists (ECBL, CCBL, and SUB-CBL) to speed up
    floorplanning and handle more complicated
    L/T-shaped and rectilinear blocks
  • Propose floorplanning algorithms with geometric
    constraints such as boundary, abutment, and
    L/T-shaped blocks
  • Propose integrated floorplanning and buffer
    planning algorithms that take congestion into
    account
  • Using research results from UCLA on interconnect
    planning
  • About 30 papers published in DAC, ICCAD, ISPD,
    ASPDAC, ISCAS, and Transactions

14
3. P/G Network Analysis and Optimization
  • Propose area minimization of power distribution
    networks using efficient nonlinear programming
    techniques (ICCAD 2001; accepted by IEEE Trans.
    on CAD)
  • Propose a decoupling capacitance optimization
    algorithm for robust on-chip power delivery
    (ASPDAC 2004, ASICON 2003)

4. Global Routing and Special Routing
  • Propose several global routing algorithms that
    optimize congestion, timing, or both timing and
    congestion
  • Papers were published in ASPDAC, ISCAS, and IEEE
    Transactions

15
5. Parasitic R/L/C Extraction
  • 3-D R/C extraction using the Boundary Element
    Method (BEM)
  • Quasi-Multiple Medium (QMM) BEM algorithms
  • Hierarchical Block BEM (HBBEM) technique
  • Fast 3-D Inductance Extraction (FIE)
  • Papers were published in ASPDAC, ASICON, and IEEE
    Transactions on MTT

16
Thrust 2 -- SOC Verification, Test, and
Diagnosis (Led by Tim Cheng)
  • Verification and testing
  • Enabling techniques for semi-formal functional
    verification
  • Testing and diagnosis for heterogeneous SOC
  • Self-testing using on-chip programmable components
  • Self-testing for on-chip analog/mixed-signal
    components
  • Automatic/semi-automatic functional vector
    generation from HDL code
  • Scalable constraint-solving techniques
  • Integrated framework for simulation, vector
    generation, and model checking
  • New test techniques for deep-submicron embedded
    memories

17
Key Results - Verification
  • Developed and released ATPG-based SAT solvers for
    circuits (Univ. of California, Santa Barbara)
  • Integrating structural ATPG and SAT techniques
    with new conflict learning
  • CSAT: fast combinational solver (released
    March 2003)
  • Demonstrated 10-100X speedup over
    state-of-the-art SAT solvers on industrial test
    cases (reported by Intel and Calypto)
  • Has been integrated into Intel's FV verification
    system and a startup's verification engine
  • Publications: DATE 2003 and DAC 2003
  • Satori2: fast sequential solver (released Dec.
    2003)
  • Demonstrated 10X-200X speedup over a commercial
    sequential ATPG engine on public benchmark
    circuits
  • Publications: ICCAD 2003, HLDVT 2003, and
    ASPDAC 2004

18
Key Results - Testing
A new statistical delay testing and diagnosis
framework consisting of five major components
(UCSB)
  • Statistical timing analysis
  • Statistical critical path selection [DAC'02,
    ICCAD'02]
  • Selecting statistical long true paths whose
    tests maximize detection of parametric failures
  • Path coverage metric [ASPDAC'03]
  • Estimating the quality of a path set
  • Selection/generation of high-quality tests for
    target paths [ITC'01, DATE'04]
  • Identifying tests that activate longer delay
    along the target path
  • Delay fault diagnosis based on statistical timing
    model [DATE'03, VTS'03, DAC'03]
  • Ref: Krstic, Wang, Cheng, Abadir, DATE'03 (Best
    Paper Award in Test)

19
Key Results - Testing
  • On-Chip Jitter Extraction for Bit-Error-Rate
    (BER) Testing of Multi-GHz Signal (UCSB)
  • Using on-chip, single-shot measurement unit to
    sample signal periods for spectral analysis
  • Demonstrated, through simulation, accurate
    extraction of multiple sinusoids and random
    jitter components for a 3GHz signal
  • Publications: ASPDAC 2004 and DATE 2004
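
To illustrate the idea of sampling signal periods and inspecting their spectrum, here is a minimal sketch. It is not the UCSB measurement unit: the period samples are synthetic, and the sample count, jitter amplitude, jitter frequency, and function names are all assumptions chosen for the example.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Normalized magnitude of each DFT bin of a real sequence (naive O(N^2))."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))) / n
        for k in range(n)
    ]

def jitter_spectrum(periods):
    """Spectrum of the period deviations around their mean."""
    mean = sum(periods) / len(periods)
    return dft_magnitudes([p - mean for p in periods])

def dominant_jitter_bin(periods):
    """Index of the strongest non-DC bin in the lower half of the spectrum."""
    mags = jitter_spectrum(periods)
    return max(range(1, len(periods) // 2), key=lambda k: mags[k])

# Synthetic data: nominal 3 GHz period plus one sinusoidal jitter component.
N = 64
NOMINAL_PS = 1000.0 / 3.0   # period of a 3 GHz signal in picoseconds
JITTER_PS = 2.0             # assumed jitter amplitude
JITTER_BIN = 5              # assumed jitter frequency, in cycles per N samples
periods = [NOMINAL_PS + JITTER_PS * math.sin(2 * math.pi * JITTER_BIN * i / N)
           for i in range(N)]

peak = dominant_jitter_bin(periods)
amplitude = 2 * jitter_spectrum(periods)[peak]   # one-sided amplitude estimate
print(peak, round(amplitude, 6))
```

With a single sinusoid that completes a whole number of cycles in the window, the peak bin and its amplitude recover the injected component; real measurements would also contain random jitter and need windowing or averaging.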

20
Thrust 3 -- Design Driver: Network Security
Processor (Led by Prof. C. W. Wu)
  • Applications: IPSec, SSL, VPN, etc.
  • Functionalities
  • Public key: RSA, ECC
  • Secret key: AES
  • Hashing (message authentication): HMAC
    (SHA-1/MD5)
  • Truly random number generator (FIPS 140-1, 140-2
    compliant)
  • Target technology: 0.18 µm or below
  • Clock rate: 200 MHz or higher (internal)
  • 32-bit data and instruction word
  • 10 Gbps (OC-192)
  • Power: 1 to 10 mW/MHz at 3 V (LP to HP)
  • Die size: 50 mm²
  • On-chip bus: AMBA (Advanced Microcontroller Bus
    Architecture)
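
The HMAC entry in the list above wraps a hash such as SHA-1 or MD5 in two keyed passes. The following software sketch follows the HMAC construction from RFC 2104 and checks itself against Python's standard `hmac` module. It illustrates the function only, not the processor's hardware design; the helper name `hmac_digest` and the sample key and message are hypothetical.

```python
import hashlib
import hmac

def hmac_digest(key: bytes, message: bytes, hash_name: str) -> bytes:
    """HMAC per RFC 2104: H((K ^ opad) || H((K ^ ipad) || message))."""
    block_size = hashlib.new(hash_name).block_size   # 64 bytes for SHA-1 and MD5
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")             # then zero-padded to a block
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.new(hash_name, inner_pad + message).digest()
    return hashlib.new(hash_name, outer_pad + inner).digest()

key = b"example key"                  # illustrative values only
message = b"example packet payload"
for name in ("sha1", "md5"):
    ours = hmac_digest(key, message, name)
    reference = hmac.new(key, message, name).digest()
    print(name, ours.hex(), hmac.compare_digest(ours, reference))
```

The two hash passes are the reason a hardware HMAC unit can reuse one SHA-1 or MD5 datapath for both the inner and the outer computation.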

21
Encryption Modules (PKEM)
  • Public key encryption module
  • Operations
  • 32-bit word-based modular multiplication
  • Multiplication over GF(p) and GF(2^m)
  • An RSA cryptography engine with small area
    overhead and high speed
  • Scalable word-width
  • TSMC 0.35 µm
  • 34K gates (1.7 x 1.8 mm²)
  • 100 MHz clock
  • Scalable key length
  • Throughput
  • 512-bit key: 1.79 Kbps/MHz
  • 1024-bit key: 470 bps/MHz
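
As a software illustration of the two kinds of field multiplication named in the list above, here is a minimal sketch. It shows the arithmetic only and makes no claim about how the PKEM hardware is organized; the function names and the example moduli are assumptions.

```python
def gfp_mul(a: int, b: int, p: int) -> int:
    """Multiply in GF(p): integer product reduced modulo the prime p."""
    return (a * b) % p

def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply in GF(2^m) by shift-and-XOR.

    Elements are bit vectors of polynomial coefficients (both below 2^m);
    poly is the irreducible polynomial including its x^m term.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a      # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1
        if a >> m:           # degree reached m: reduce by the polynomial
            a ^= poly
    return result

# Example moduli (assumptions for illustration): a small prime, and the
# GF(2^8) polynomial x^8 + x^4 + x^3 + x + 1 that AES uses.
print(gfp_mul(7, 9, 13))
print(hex(gf2m_mul(0x57, 0x83, 0x11B, 8)))
```

The GF(2^8) call reproduces the worked example from the AES specification (FIPS 197), where {57} times {83} equals {c1}. The two functions differ only in how "add" and "reduce" are defined, which is what makes a shared datapath for both fields plausible.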

22
Encryption Modules (SKEM)
  • Secret key encryption module
  • Operations
  • Matrix operations, manipulation
  • AES cryptography
  • 32-bit external interface
  • 58K gates
  • Over 200MHz clock
  • Throughput: 2 Gbps
  • Support key length of 128/192/256 bits

Technology: TSMC 0.25 µm CMOS
Package: 128CQFP
Core size: 1,279 x 1,271 µm²
Gate count: 63.4K
Max. freq.: 250 MHz
Throughput: 2.977 Gbps (128-bit key), 2.510 Gbps (192-bit key), 2.169 Gbps (256-bit key)
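
The throughput figures above can be related to the clock rate with a little arithmetic. The sketch below assumes each figure was measured at the stated 250 MHz maximum frequency and uses the fixed 128-bit AES block size; reading the result as cycles per block is a derived estimate, not a number from the slides, and the function name is hypothetical.

```python
AES_BLOCK_BITS = 128     # AES block size is fixed at 128 bits
CLOCK_HZ = 250e6         # from the slide: maximum frequency

# Throughput in bits per second for each key length, from the slide
# (the middle entry is taken as the 192-bit AES key length).
throughput_bps = {128: 2.977e9, 192: 2.510e9, 256: 2.169e9}

def cycles_per_block(bps: float, clock_hz: float = CLOCK_HZ) -> float:
    """Implied clock cycles spent per 128-bit block at a given throughput."""
    return AES_BLOCK_BITS * clock_hz / bps

implied = {bits: cycles_per_block(bps) for bits, bps in throughput_bps.items()}
for key_bits, cycles in implied.items():
    print(f"{key_bits}-bit key: {cycles:.2f} cycles per block")
```

The estimates come out near 10.75, 12.75, and 14.75 cycles, which tracks the 10, 12, and 14 rounds that AES uses for the three key lengths.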
23
Journal Publications
  • C.-T. Huang and C.-W. Wu, "High-speed easily
    testable Galois-field inverter," IEEE Trans.
    Circuits and Systems II: Analog and Digital
    Signal Processing, vol. 47, no. 9, pp. 909-918,
    Sept. 2000.
  • S.-A. Hwang and C.-W. Wu, "Unified VLSI systolic
    array design for LZ data compression," IEEE
    Trans. VLSI Systems, vol. 9, no. 4, pp. 489-499,
    Aug. 2001.
  • C.-H. Wu, J.-H. Hong, and C.-W. Wu, "VLSI design
    of RSA cryptosystem based on the Chinese
    Remainder Theorem," J. Inform. Science and
    Engineering, vol. 17, no. 6, pp. 967-979, Nov.
    2001.
  • J.-H. Hong and C.-W. Wu, "Cellular array modular
    multiplier for the RSA public-key cryptosystem
    based on modified Booth's algorithm," IEEE
    Trans. VLSI Systems, vol. 11, no. 3, pp. 474-484,
    June 2003.
  • C.-P. Su, T.-F. Lin, C.-T. Huang, and C.-W. Wu,
    "A high-throughput low-cost AES processor,"
    IEEE Communications Magazine, vol. 41, no. 12,
    pp. 86-91, Dec. 2003.

24
Conference Publications
  • J.-H. Hong and C.-W. Wu, "Radix-4 modular
    multiplication and exponentiation algorithms for
    the RSA public-key cryptosystem," in Proc. Asia
    and South Pacific Design Automation Conf.
    (ASP-DAC), Yokohama, Jan. 2000, pp. 565-570.
  • J.-H. Hong, P.-Y. Tsai, and C.-W. Wu,
    "Interleaving schemes for a systolic RSA
    public-key cryptosystem based on an improved
    Montgomery's algorithm," in Proc. 11th VLSI
    Design/CAD Symp., Pingtung, Aug. 2000, pp.
    163-166.
  • C.-H. Wu, J.-H. Hong, and C.-W. Wu, "An RSA
    cryptosystem based on the Chinese Remainder
    Theorem," in Proc. 11th VLSI Design/CAD Symp.,
    Pingtung, Aug. 2000, pp. 167-170.
  • C.-H. Wu, J.-H. Hong, and C.-W. Wu, "RSA
    cryptosystem design based on the Chinese
    Remainder Theorem," in Proc. Asia and South
    Pacific Design Automation Conf. (ASP-DAC),
    Yokohama, Jan. 2001, pp. 391-395.
  • Y.-C. Lin, C.-P. Su, C.-W. Wang, and C.-W. Wu,
    "A word-based RSA public-key crypto-processor
    core," in Proc. 12th VLSI Design/CAD Symp.,
    Hsinchu, Aug. 2001.
  • T.-F. Lin, C.-P. Su, C.-T. Huang, and C.-W. Wu,
    "A high-throughput low-cost AES cipher chip,"
    in Proc. 3rd IEEE Asia-Pacific Conf. ASIC,
    Taipei, Aug. 2002, pp. 85-88.
  • Y.-T. Lin, C.-P. Su, C.-T. Huang, C.-W. Wu, S.-Y.
    Huang, and T.-Y. Chang, "Low-power embedded
    memory architecture design for SOC," in Proc.
    13th VLSI Design/CAD Symp., Taitung, Aug. 2002,
    pp. 306-309.
  • M.-C. Sun, C.-P. Su, C.-T. Huang, and C.-W. Wu,
    "Design of a scalable RSA and ECC
    crypto-processor," in Proc. Asia and South
    Pacific Design Automation Conf. (ASP-DAC),
    Kitakyushu, Jan. 2003, pp. 495-498 (Best Paper
    Award).
  • C.-P. Su, T.-F. Lin, C.-T. Huang, and C.-W. Wu,
    "A highly efficient AES cipher chip," in Proc.
    Asia and South Pacific Design Automation Conf.
    (ASP-DAC), Kitakyushu, Jan. 2003, pp. 561-562
    (Design Contest Special Feature Award).
  • J.-H. Hong, C.-L. Liu, B.-Y. Tsai, and C.-W. Wu,
    "A radix-4 modular multiplier for fast RSA
    public-key cryptosystem," in Proc. 14th VLSI
    Design/CAD Symp., Hualien, Aug. 2003, pp.
    553-556.
  • M.-Y. Wang, C.-P. Su, C.-T. Huang, and C.-W. Wu,
    "An HMAC processor with integrated SHA-1 and MD5
    algorithms," in Proc. Asia and South Pacific
    Design Automation Conf. (ASP-DAC), Yokohama, Jan.
    2004 (to appear).