1
Using a Communication Architecture Specification
in an Application-driven Retargetable Prototyping
Platform for Distributed Processing
  • Xinping Zhu, Sharad Malik
  • Princeton University, USA

2
Outline
  • Research Context and Motivation
  • Previous Work
  • Communication Architecture Description
  • Prototype Implementations
  • Case Study: 3DES Application
  • Summary and Future Work

3
Generic Multiprocessing SoC Architecture
4
OCA Actions
[Figure: the sender calls Send(to, message, length) and the receiver calls Recv(from, message, length). Labeled elements: Message (on each side), outgoing packets/flits queue, incoming packets/flits queue, flits, Switch A, Switch B, Network.]
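The send side of this figure can be illustrated with a minimal C sketch that splits a message into flits for an outgoing queue. The flit size, the flit_t layout, and the name packetize are assumptions made for illustration; the slides do not specify any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FLIT_PAYLOAD 4  /* assumed payload bytes per flit; not from the slides */

/* Hypothetical flit layout: destination, head/tail markers, and payload. */
typedef struct {
    int dest;
    int is_head;                         /* first flit of the message */
    int is_tail;                         /* last flit of the message */
    size_t used;                         /* valid payload bytes in this flit */
    unsigned char payload[FLIT_PAYLOAD];
} flit_t;

/* Split `length` bytes of `message` into flits written to `out`.
 * Returns the number of flits produced, or 0 if the message is empty
 * or `out` cannot hold all of them. */
size_t packetize(int dest, const unsigned char *message, size_t length,
                 flit_t *out, size_t out_capacity)
{
    assert(message != NULL && out != NULL);
    size_t needed = (length + FLIT_PAYLOAD - 1) / FLIT_PAYLOAD;
    if (needed == 0 || needed > out_capacity)
        return 0;
    for (size_t i = 0; i < needed; i++) {
        size_t offset = i * FLIT_PAYLOAD;
        size_t chunk = length - offset;
        if (chunk > FLIT_PAYLOAD)
            chunk = FLIT_PAYLOAD;
        out[i].dest = dest;
        out[i].is_head = (i == 0);
        out[i].is_tail = (i == needed - 1);
        out[i].used = chunk;
        memset(out[i].payload, 0, FLIT_PAYLOAD);
        memcpy(out[i].payload, message + offset, chunk);
    }
    return needed;
}
```

A receiver would do the reverse: collect flits from its incoming queue until it sees the tail flit, then hand the reassembled message to Recv.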
5
OCA Structure
[Figure: three example topologies. Bus Topology with 8 Nodes (PEs attached to a shared bus); Cube Topology with 8 Nodes (nodes addressed 000 through 111); 4x4 Mesh Topology with 16 Nodes.]
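The cube and mesh topologies can be captured by simple neighbor functions, sketched below. The function names, the row-by-row node numbering of the mesh, and the north/south orientation are assumptions for illustration, not taken from the slides.

```c
#include <assert.h>

/* Neighbors of `node` in a cube (hypercube) with `dims` address bits:
 * each neighbor differs from `node` in exactly one bit.
 * Writes `dims` entries to `out` and returns that count. */
int cube_neighbors(unsigned node, int dims, unsigned *out)
{
    assert(out != 0 && dims > 0 && dims < 32 && node < (1u << dims));
    for (int d = 0; d < dims; d++)
        out[d] = node ^ (1u << d);
    return dims;
}

/* Neighbors of `node` in a width x height mesh (no wraparound links),
 * with nodes numbered row by row starting at 0 (assumed numbering).
 * Writes up to 4 entries to `out` and returns how many were written. */
int mesh_neighbors(int node, int width, int height, int *out)
{
    assert(out != 0 && width > 0 && height > 0);
    assert(node >= 0 && node < width * height);
    int x = node % width;
    int y = node / width;
    int n = 0;
    if (x > 0)          out[n++] = node - 1;      /* west  */
    if (x < width - 1)  out[n++] = node + 1;      /* east  */
    if (y > 0)          out[n++] = node - width;  /* north (row 0 on top) */
    if (y < height - 1) out[n++] = node + width;  /* south */
    return n;
}
```

For the 8-node cube on the slide, cube_neighbors(5, 3, out) yields the three nodes that differ from 101 in one bit: 100, 111, and 001.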
6
OCA Microarchitecture
[Figure: router architecture. A scheduler exchanges request/grant signals with five buffers (Buf West, Buf South, Buf East, Buf North, Buf Local), each with an in and an out port, and drives the select/config inputs of a 5 x 5 crossbar.]
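The request/grant handshake between the buffers and the scheduler can be sketched as a small arbiter. The round-robin policy below is an assumption for illustration (the slides name round_robin only as a bus protocol option), and rr_arbiter_t, rr_init, and rr_grant are hypothetical names.

```c
#include <assert.h>

#define NUM_PORTS 5  /* West, South, East, North, Local, as in the figure */

/* Round-robin state: remembers which port was served most recently. */
typedef struct {
    int last_granted;
} rr_arbiter_t;

/* Start so that port 0 has the highest priority on the first grant. */
void rr_init(rr_arbiter_t *a)
{
    assert(a != 0);
    a->last_granted = NUM_PORTS - 1;
}

/* `requests` is a bitmask: bit i set means port i is requesting.
 * Grants the first requesting port after the last one served.
 * Returns the granted port index, or -1 if no port is requesting. */
int rr_grant(rr_arbiter_t *a, unsigned requests)
{
    assert(a != 0);
    for (int i = 1; i <= NUM_PORTS; i++) {
        int port = (a->last_granted + i) % NUM_PORTS;
        if (requests & (1u << port)) {
            a->last_granted = port;
            return port;
        }
    }
    return -1;
}
```

In a full router model, one such arbiter per crossbar output would pick among the input buffers competing for that output, and the result would set the crossbar's select/config inputs.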
7
Research Motivation
  • Choices for Design Space Exploration
  • Enhance system-level design productivity
  • Focus on the OCA part

[Figure: OCA structure]
9
Our Previous Work: Microarchitectural Building Blocks
  • A Hierarchical Modeling Framework
  • A Classified Library of Reusable OCA Components

Module
  • Link
    • Mux
    • Duplex Link
    • CrossBar
    • Bus Backplane
  • Buffer
    • FIFO
    • Central Pool
  • Interface
    • SendInterface
    • ReceiveInterface
    • SlaveInterface
    • MasterInterface
  • ResourceScheduler
    • Allocator
    • Arbiter

Ref: Zhu, Malik, "A Hierarchical Modeling Framework for On-Chip Communication Architectures," ICCAD '02
10
Our Previous Work: Simulation Environments
  • Methodology and library successfully used in two modular modeling environments
  • Implementations
    • Liberty Simulation Environment (LSE): a fast execution-driven modeling and simulation framework targeting processor microarchitecture modeling
    • SystemC: a general synchronous digital design framework that enables system-level design

11
Related Previous Work
  • Metropolis (UC Berkeley)
    • Top-down vs. bottom-up
  • StepNP
    • System-level design tool for NPUs using SystemC
    • Functional for now
  • Benini et al. (IEEE Computer 36-4, 2003)
    • Integrates SystemC with a GNU GDB-based ISS; no PE model
  • CoWare's ConvergenSC
    • System-level modeling and verification
    • Multiple LISA 2.0 PE models with complex on-chip buses
  • Tensilica's XTMP
    • Integrates C-callable Xtensa instruction simulators
    • Functional simulation, custom interconnect

12
Paper Contributions
  • Communication architecture specification
    • Simple template-based specification for rapid prototyping
  • Integration of application, processor, and communication architecture models
    • Provides application-accurate workloads instead of statistical/synthesized workloads

13
Design Exploration for OCAs
  • Specifying OCAs through descriptions
  • Evaluating OCA choices

15
Representing OCAs
  • A retargetable OCA description/modeling language
  • Control path vs. datapath
    • Datapath: microarchitecture components and structure
    • Control path: how the communication resources are allocated concurrently
  • Current emphasis on type and topology
    • Control path is implicitly encoded in the modules
  • Template-based: short and expressive, with C-like syntax

[Figure: an OCA shown in terms of its datapath and control path. Labeled aspects: Protocol, Topology, µarch blocks, Timing.]
16
Topology- and Type-Based Descriptions: Examples

Bus
    NODE n0, n1, n2, n3;
    n0.addr = 0;
    n1.addr = 1;
    n2.addr = 2;
    n3.addr = 3;
    CLUSTER my_bus;
    my_bus.data_width = 32;
    my_bus.buffer_size = 64;
    my_bus.protocol = round_robin;
    my_bus = bus(n0, n1, n2, n3);

Mesh
    CLUSTER my_net;
    my_net.init_credit = 64;
    my_net.routing = dimension;
    my_net = torus(16);

[Figure: four PEs attached to a shared bus]
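The "dimension" routing setting in the network example can be illustrated with a dimension-order next-hop function for a square torus. This is a minimal sketch under assumptions: the slides do not define the routing algorithm, so resolving X first and then Y, the row-by-row node numbering, and the names ring_step and torus_next_hop are all illustrative.

```c
#include <assert.h>

/* Move one hop from coordinate `from` toward `to` on a ring of size `n`,
 * taking the shorter direction around the ring. Ties go in the
 * increasing direction. Returns `from` unchanged if already there. */
static int ring_step(int from, int to, int n)
{
    int forward = (to - from + n) % n;  /* hops needed going in +1 direction */
    if (forward == 0)
        return from;
    if (forward <= n - forward)
        return (from + 1) % n;
    return (from + n - 1) % n;
}

/* Next hop from node `cur` toward node `dst` on a side x side torus.
 * The X coordinate is corrected first, then Y (dimension order). */
int torus_next_hop(int cur, int dst, int side)
{
    assert(side > 0);
    assert(cur >= 0 && cur < side * side);
    assert(dst >= 0 && dst < side * side);
    int cx = cur % side, cy = cur / side;
    int dx = dst % side, dy = dst / side;
    if (cx != dx)
        cx = ring_step(cx, dx, side);
    else if (cy != dy)
        cy = ring_step(cy, dy, side);
    return cy * side + cx;
}
```

With side = 4 (16 nodes, matching torus(16)), a packet from node 0 to node 3 takes the wraparound link directly, which is the difference from a plain mesh.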
18
Retargetable Simulation Flow
  • Application Model
    • Enables us to go beyond statistical/synthetic traffic patterns
  • System Architecture includes both PE and OCA
  • Flexible Implementation Strategy
    • SystemC: discrete-event model of computation (MoC)
    • Liberty Simulation Environment (LSE): synchronous reactive MoC

[Figure: simulation flow. Labeled elements: System Architecture Description (PE, OCA), Model Configuration, Distributed Application Model, Application Binary, Simulation Engine, SystemC Model and LSE Model (each with a Wrapper), Execution, Performance.]
19
Integrating PE models
  • Need a cycle-accurate PE model to simulate/execute real-world applications
    • SimIt-ARM simulator (W. Qin, DATE '03)
  • Wrapper Strategy
    • Define a well-maintained interface between PE and OCA so that PE details are hidden behind it
    • Flexible: other PE models could be added (applicable to commercial PE IPs)
    • Wrappers can currently be written in either SystemC or LSE style

20
Distributed Application Modeling
  • Currently message passing (C-based)
  • Several send/recv message-passing primitives are defined

arm_mp.h
    /* send */
    int ns_arm_send(int dest, int *value, int length);
    /* recv: a return value of -1 means the receive failed */
    int ns_arm_recv(int source, int *buf, int length);
    /* get the local PE address mapping */
    int ns_arm_get_addr();

main.c
    #include "arm_mp.h"
    main() {
        if (ns_arm_get_addr() == 0) {
            ns_arm_send(1, c, 2);
        } else {
            a = -1;
            do {
                a = ns_arm_recv(0, b, 2);
            } while (a == -1);
        }
    }

Both files are compiled with the GNU ARM Compiler Suite.
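To try the send/recv pattern above without the ARM target or the simulator, the three primitives can be mocked on a host machine. The sketch below is an assumption-heavy stand-in, not the authors' library: the pointer parameter types, the mailbox storage, the success return values (0 for send, word count for recv), and the helper mock_set_addr are all hypothetical. Only the "-1 means the receive failed" convention comes from the slide.

```c
#include <assert.h>
#include <string.h>

#define NUM_PES   2   /* assumed: just enough PEs for the two-node example */
#define MAX_WORDS 16  /* assumed mailbox capacity in ints */

/* One single-slot mailbox per (source, destination) pair. */
typedef struct {
    int full;
    int length;
    int data[MAX_WORDS];
} mailbox_t;

static mailbox_t mailbox[NUM_PES][NUM_PES];
static int current_pe = 0;  /* which PE the host mock is pretending to be */

/* Hypothetical test helper: switch the PE identity seen by the primitives. */
void mock_set_addr(int pe)
{
    assert(pe >= 0 && pe < NUM_PES);
    current_pe = pe;
}

int ns_arm_get_addr(void)
{
    return current_pe;
}

/* Returns 0 on success, -1 if the slot is busy or an argument is invalid. */
int ns_arm_send(int dest, int *value, int length)
{
    if (dest < 0 || dest >= NUM_PES || length < 0 || length > MAX_WORDS)
        return -1;
    mailbox_t *m = &mailbox[current_pe][dest];
    if (m->full)
        return -1;
    memcpy(m->data, value, (size_t)length * sizeof(int));
    m->length = length;
    m->full = 1;
    return 0;
}

/* Returns the number of words received, or -1 if nothing is waiting
 * or the caller's buffer is too small. */
int ns_arm_recv(int source, int *buf, int length)
{
    if (source < 0 || source >= NUM_PES || length < 0)
        return -1;
    mailbox_t *m = &mailbox[source][current_pe];
    if (!m->full || m->length > length)
        return -1;
    memcpy(buf, m->data, (size_t)m->length * sizeof(int));
    m->full = 0;
    return m->length;
}
```

A host test would call mock_set_addr(0), send, then call mock_set_addr(1) and receive, which exercises the same polling loop on -1 that main.c uses.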
21
Target Specific Communication Libraries
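[Figure: on the sender (Node 0), ns_arm_send(1, c, 2) is compiled to ARM assembly; on the receiver (Node 1), a = ns_arm_recv(0, b, 2) is compiled to ARM assembly. The assembly shown uses the coprocessor instructions ldc p6, cr0, r1 and stc p8, cr2, r1. Messages pass through the outgoing packets/flits queue, travel across the network as flits, and arrive in the incoming packets/flits queue.]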
23
Case Study
  • Applications
    • 3DES: a widely used encryption algorithm
    • Two subcomponents
      • Key exchange (KEY_EX): communication-oriented
      • Actual encryption (3DES): computation-oriented
  • System Architecture
    • PEs: 3x3 array of ARM-V PEs
    • OCA types
      • Simple bus (BUS)
      • 3x3 2D torus (TORUS)
      • Fully connected crossbar (FULL)

[Figure: Speedup comparison of different machine configurations]
24
Toolset Evaluation
  • Simulation Speed
    • Up to 15.8K cycles/s on a 1.1 GHz P3 with g
    • 4x slower than the single-PE model, after factoring out the 9x slowdown expected from the 9-PE model
  • Fast prototyping
    • Building the whole system takes less than 10 minutes once parameters and models are ready

[Figure: Comparison between two simulation platforms]
26
Work in Progress
  • Revision of OCA description syntax
  • Modeling OCA microarchitecture concurrency using
    the Operation State Machine (OSM) model
  • Fully automated simulator synthesis
  • Toolkit software release

[Figure labels: Machine Configuration, Execution, Performance]
27
Summary
  • Integrated application, PE, and OCA modeling and simulation for design space exploration
  • Type- and topology-based OCA descriptions
  • Fast and accurate application-driven SoC prototyping
  • Proof-of-concept embedded system application

28
Acknowledgements
  • Part of the MESCAL Project
    • Modern Embedded Systems: Compilers, Architectures, and Languages
    • Princeton and UC Berkeley
    • www.gigascale.org/mescal
    • mescal.princeton.edu
  • A Gigascale System Research Center (GSRC) effort
    • www.gigascale.org
    • Funded by DARPA and MARCO
  • Liberty Research Group at Princeton
    • http://liberty.cs.princeton.edu