DATE: Smart Interconnects for HMP SoCs - PowerPoint PPT Presentation

About This Presentation
Title:

DATE: Smart Interconnects for HMP SoCs

Description:

In-house interconnect design for most SoCs. Simple Bus (AMBA) Advanced Fabric Features ... Pipelined, multi-threaded, non-blocking fabric. Distributed QoS arbiter ...

Slides: 31
Provided by: pcas
Transcript and Presenter's Notes

Title: DATE: Smart Interconnects for HMP SoCs


1
DATE: Smart Interconnects for HMP SoCs
  • Efficient Data Flow for Multimedia-Intensive
    Heterogeneous MultiProcessor SoCs

Jeff Haight, Director of Technical Marketing, Sonics, Inc.
650-605-6171, jhaight_at_sonicsinc.com
2
Who Is Sonics?
  • Established in 1996
  • Headquartered in Mountain View, CA
  • Field offices: London, Munich, Nice, Seoul, Tokyo
  • Development center: Yerevan
  • Senior management team
  • Customer base of industry leaders in Wireless
    and Communications, Digital Consumer, and Office
    Automation
  • Over 200 million chips shipped using Sonics
    SMART Interconnect Solutions

3
Using the Old, Adding the New
  • AXI for Seamless Connections
  • OCP Maximizes Flexibility
  • AHB Legacy Support
  • APB Legacy Support
  • SMART Interconnects: Intelligent Internal
    Interconnect, Adding Intelligent Data Flow
    Services
[Diagram: Core 1, Core 2 ... Core N, µP, DSP Core, AHB Cores, I/O and Memory attached to the SMART Interconnect. Caption: SoCs Circa 2006.]
4
Multicore Mobile Handset Example
[Block diagram: CPU Tile, 2D/3D Graphics Tile, MPEG4 Codec Tile, MP3, USB 2.0, DSP Tile, DMA, Flash Controller, LCD Controller, Camera Interface, Embedded SRAM and SDRAM Controller, attached to two SMX fabrics and an S3220 through ports labeled I, T and P.]
5
Mobile Handset Example
[Block diagram: same handset block diagram as slide 4.]
6
Physical Implementation
  • Cores: 18 (10 initiator agents, 8 target agents)
  • Process: TSMC 90nm (CLN90G-HiVt) low-power process
  • SMX gate count: 369K
  • Die area: 8x8 sq mm
  • SMX cell area: 1.7 sq mm
  • Frequency: 250 MHz
  • Features: 1 XB, 1 SL, 2 PPs, fully connected
  • Benchmark for a wireless cell phone chip

7
The Requirements
  • Multimedia traffic demands high bandwidth
  • Congestion avoidance is critical, especially for
    shared memory access: high-efficiency DRAM
    scheduling
  • Predictability in performance
  • Low power and leakage management
  • Access protection support for Digital Rights
    Management
  • Low latency requirements and guaranteed
    throughput: high QoS
  • Scalability
  • Fast time to market, IP reuse, and rapid feature
    set evolution

8
Data Flow Design Challenges: Typical Bus-Style
Offerings Address Few of the Real Issues
Perf. Verification
Virtual Prototyping
Parallel IP Creation
Arch. Modeling
Bus Generator
Design Re-use
SW Development
Variable Clock Freq.
Timing Closure
Voltage Isolation
On-Chip Bus
Complex Memory Hierarchies
Power Management
Error Management
Signal Integrity
Access Security
High Peripheral Count
Data Width Conversion
Distributed Processing
Mixed Endianness
Guaranteed BW QoS
Protocol Conversion
Pipelining
9
SoC Design Reality.
10
Perceptions of the Problem
11
IP Core Integration is THE SoC challenge
  • Problem: Hardware performance bottlenecks
  • Cause: Processors blocked from intercommunication
    or memory access
  • Problem: Competitive chip power-performance-area
  • Cause: Non-optimal interconnect forces design
    compromises
  • Problem: Missed time-to-market windows
  • Cause: Long verification tails and
    re-engineering times because problems are found
    right before tape-out
  • Problem: Increasing software development
    dependency
  • Cause: Software having to work around
    architecture issues

12
Drivers of SoC Design Economics
  • Three drivers: Cost, Time To Market, Feature Set
  • More engineers for longer project schedules
  • More features require more engineers for longer
    project cycles
  • Longer schedules escalate feature demand
  • The primary source for these trends is
    increasingly complex requirements for SoC
    interconnectivity
13
SoC Design Complexity Circa 2003-2005
SoC Design Complexity Circa 2005-2007
Sonics Offers SMART Interconnect Solutions
with Comprehensive Data Flow Services
  • Advanced features available early when
    architecting product lines lower development
    costs
  • Minimal re-engineering of product derivatives
    via IP core and interconnect decoupling reduces
    time-to-market gaps
  • Consistent IP block and subsystem sharing
    reduces the probability of chip respins
Interconnects today are more than just wires
In-house interconnect design for most SoCs
Heterogeneous multi-processing exponentially
increases architectural complexity of interconnect
14
Outsourcing Has The Largest Impact
Outsourcing improves productivity >25....
15
And Outsourcing is > 5 Times LESS Expensive
[Chart: IP Acquisition Costs. Vertical axis: Percent of Silicon Cost, 0 to 35. Horizontal axis: years 1998 to 2010. Series: 3rd Party IP Cost, Internally Developed IP Cost.]
Significant cost advantage over in-house
interconnect
16
The Intelligence is in the Agents
  • Agents provide:
  • Protocol conversion: the agent adapts to the
    IP core
  • Decoupling of IP cores from the fabric: a local,
    isolated environment
  • Data flow services
  • Agent data flow services:
  • QoS-based arbitration
  • Power management
  • Access security
  • Error management
  • Burst, width, and command conversion

[Diagram: INITIATOR SOCKETS (I) connect to Initiator Agents (IA), which connect through the Fabric to Target Agents (TA) and the TARGET SOCKETS (T).]
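As one illustration of the "width conversion" service listed above, the sketch below splits a wide write into beats that fit a narrower target data path. It is a minimal sketch under stated assumptions, not Sonics' implementation: the function name `split_write`, the byte-oriented interface and the requirement that the payload be a whole number of beats are all hypothetical.

```python
def split_write(addr: int, data: bytes, target_width: int):
    """Split one wide write into beats of target_width bytes each.

    Hypothetical helper: returns a list of (address, payload) pairs in
    ascending address order. Assumes the payload length is a multiple of
    the target width; a real agent would also handle byte enables.
    """
    if target_width <= 0:
        raise ValueError("target_width must be positive")
    if len(data) % target_width != 0:
        raise ValueError("payload must be a whole number of beats")
    beats = []
    for offset in range(0, len(data), target_width):
        beats.append((addr + offset, data[offset:offset + target_width]))
    return beats
```

For example, an 8-byte write from a 64-bit initiator aimed at a 32-bit target becomes two 4-byte beats at consecutive addresses, so the initiator core does not need to know the target's width.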
17
Network-based SoC Example: Actively Decoupled
  • Separation
  • Abstraction
  • Optimization
  • Independence

[Diagram: DRAM Controller, DMA and CPU connected through the Network.]
18
SonicsMX Basic Architecture
  • Hybrid topologies: full or partial crossbar,
    shared bus
  • Fully split (dual) request / response
  • Pipelined, multi-threaded, non-blocking fabric
  • Distributed QoS arbiter
  • Spans cycle, frequency, and data width boundaries
  • Supports flexible thread-merging tree topologies

[Diagram: CPU, ROM, DSP, SRAM, GFX, Flash Ctl. and DRAM Ctl. attached to two SMX fabrics through initiator (I) and target (T) ports.]
19
Sonics Delivers Time-to-Market
[Schedule chart: Today (Typical). Phases: Architectural Definition; Logic Design & Verification; Physical Design; Fab, Assy, Test. Rows: First Design, Derivative Design(s).]
12 to 18 months of engineering time savings!
  • How is this possible?
  • Socket-Based Design Methodology
  • Highly Configurable Interconnect IP

20
QoS-based Arbitration
  • Initiator data flow threads are mapped to target
    threads by the SMX fabric
  • E.g. 40 data flows sharing 8 DRAM threads in a
    digital video system
  • Data flows sharing a target thread are arbitrated
    using bandwidth weighting
  • Independent threads are assigned a QoS level
    (maintained throughout SMX)
  • Non-blocking, multi-threaded fabric and target
    interfaces allow:
  • Higher-priority requests to interleave with, and
    respond before, others
  • Guaranteed-BW threads to minimize buffering and
    receive latency guarantees
  • Optimum DRAM efficiency

Thread QoS Level   Bandwidth Allocation?   QoS Model
Priority           Yes                     Low latency while within BW allocation, best-effort otherwise
Bandwidth          Yes                     Guaranteed BW while within BW allocation, best-effort otherwise
Best-effort        No                      N/A
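The three QoS levels in the table can be illustrated with a small arbiter model. This is a sketch only, not the SMX arbiter: the names (`Thread`, `arbitrate`, `run_window`), the accounting of allocation as grants per window, and the round-robin tie-break are all assumptions made for the example. A thread keeps its level while it is within its bandwidth allocation and drops to best-effort once the allocation is used up.

```python
from dataclasses import dataclass

# Lower value wins arbitration (assumed ordering of the three levels).
PRIORITY, BANDWIDTH, BEST_EFFORT = 0, 1, 2


@dataclass
class Thread:
    name: str
    level: int
    allocation: int = 0  # grants allowed per window (hypothetical unit)
    used: int = 0        # grants consumed in the current window
    pending: int = 0     # requests waiting to be served

    def effective_level(self) -> int:
        """Keep the QoS level while within allocation, else best-effort."""
        if self.level != BEST_EFFORT and self.used < self.allocation:
            return self.level
        return BEST_EFFORT


def arbitrate(threads, last_served):
    """Return the index of the next thread to grant, or None if idle.

    Scans round-robin starting after the last served thread, so threads
    at the same effective level take turns.
    """
    best = None
    n = len(threads)
    for offset in range(1, n + 1):
        i = (last_served + offset) % n
        t = threads[i]
        if t.pending == 0:
            continue
        if best is None or t.effective_level() < threads[best].effective_level():
            best = i
    return best


def run_window(threads, cycles):
    """Serve one grant per cycle for one accounting window."""
    for t in threads:
        t.used = 0
    order, last = [], -1
    for _ in range(cycles):
        i = arbitrate(threads, last)
        if i is None:
            break
        threads[i].pending -= 1
        threads[i].used += 1
        order.append(threads[i].name)
        last = i
    return order
```

With a priority thread and a bandwidth thread that each have an allocation of two grants, plus one best-effort thread, the priority thread is served first, then the bandwidth thread, and the remaining requests are shared round-robin as best-effort traffic.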
21
Design Flow With Smart Interconnects
[Schedule chart: Today (Typical). Phases: Architectural Definition; Logic Design & Verification; Physical Design; Fab, Assy, Test. Rows: First Design, Derivative Design(s).]
12 to 18 months of engineering time savings!
  • How is this possible?
  • Socket-Based Design Methodology
  • Highly Configurable Interconnect IP

22
Access Security
  • Optional multi-region firewall
  • Per-target, re-programmable
  • Layered architecture supports rich set of
    security domains with variable region sizes
  • Access permissions determined per role and access
    type
  • Flexible security error caching and reporting

Region permissions per initiator:
       I_0   I_1   ...   I_N
PR0    R     RW          R
PR1    X     W           R
...
PR7    X     X           R

Role permissions:
role_1   role_2   ...   role_32
Y        Y              N
N        Y              N
...
N        Y              Y

[Diagram labels: inputs MAddr, MAddrSpace, MReqInfo (role), MCmd and Init thread ID; blocks L0 CAM to L3 CAM, each with valid and permissions outputs, priority, group ROM (group), role permissions, read permissions, write permissions; outputs roleOK, read OK, write OK and Access OK to the TARGET CORE.]
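The check described on this slide (address match, then role and read/write permission) can be sketched in a few lines. This is an illustrative model under stated assumptions, not the Sonics firewall: the `Region` class, the rule that the highest matching layer wins, the "RD"/"WR" command strings and the example address map are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One protection region of a per-target firewall (hypothetical model)."""
    level: int           # layer number; assumed: highest matching layer wins
    base: int
    size: int
    readers: frozenset   # initiator groups with read permission
    writers: frozenset   # initiator groups with write permission
    roles: frozenset     # roles allowed to access the region

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def access_ok(regions, addr, cmd, group, role):
    """Return True only if both the role check and the command check pass."""
    matches = [r for r in regions if r.contains(addr)]
    if not matches:
        return False  # assumed default: no matching region means no access
    region = max(matches, key=lambda r: r.level)
    role_ok = role in region.roles
    if cmd == "RD":
        perm_ok = group in region.readers
    elif cmd == "WR":
        perm_ok = group in region.writers
    else:
        perm_ok = False
    return role_ok and perm_ok


ALL_ROLES = frozenset({"user", "supervisor", "secure_supervisor"})
ANY = frozenset({"cpu", "dma"})
CPU = frozenset({"cpu"})

# Hypothetical address map: a secure default region with two overrides.
REGIONS = [
    Region(0, 0x0000, 0x10000, CPU, CPU, frozenset({"secure_supervisor"})),
    Region(1, 0x4000, 0x1000, ANY, ANY, ALL_ROLES),          # any initiator, RW
    Region(1, 0x8000, 0x1000, CPU, frozenset(), ALL_ROLES),  # CPU, read-only
]
```

For instance, `access_ok(REGIONS, 0x8010, "WR", "cpu", "user")` is rejected because the region at 0x8000 grants no write permission, while a DMA write to 0x4010 is accepted.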
23
Data Flow Services: Security Management
  • Optional multi-region firewall
  • Per-target, re-programmable
  • Layered architecture supports secure update of
    permissions and sizes
  • Access permissions determined per role and access
    type
  • Flexible security error caching and reporting

Example regions:
  • Any initiator, RW
  • CPU in user, RO
  • CPU in supervisor
  • CPU in supervisor, non-secure
  • Default region: CPU in supervisor, secure
Example accesses:
  • CPU in user mode: fails
  • CPU in supervisor, non-secure: OK
  • DMA: OK
24
Power Management
  • Simplifies design of the Application Power
    Manager (APM)
  • Active status indication configurable on a
    per-socket basis
  • Allows target-specific power management
  • Supports interconnect chaining: Unit Power
    Managers can simply OR the active flags for all
    incoming signaling
  • Request/OK handshake provided by the interconnect
  • Handshake sequence:
  • APM makes the request
  • Interconnect blocks new transactions (continuing
    existing transactions)
  • Interconnect drains
  • Interconnect indicates OK
  • APM removes clock or voltage, as appropriate

[Diagram: the APM exchanges Down_req and Down_ok signals with the interconnect's initiator agents (IA) and target agents (TA); Active flags from the sockets I0 to I2 and T0 to T2 feed the Unit Pwr Mgr blocks.]
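The handshake sequence on this slide can be modeled as a small state holder. This is a sketch only: the class `InterconnectPort`, its method names and the transaction counter are assumptions for illustration; only the signal names Active, Down_req and Down_ok come from the slide.

```python
class InterconnectPort:
    """Hypothetical model of one interconnect socket's power handshake."""

    def __init__(self):
        self.in_flight = 0     # transactions issued but not yet completed
        self.down_req = False  # set by the power manager (Down_req)
        self.powered = True

    @property
    def active(self) -> bool:
        """Active flag: true while any transaction is outstanding."""
        return self.in_flight > 0

    @property
    def down_ok(self) -> bool:
        """Down_ok: power-down was requested and the port has drained."""
        return self.down_req and self.in_flight == 0

    def issue(self) -> bool:
        """Try to start a transaction; refused once Down_req is set."""
        if self.down_req or not self.powered:
            return False
        self.in_flight += 1
        return True

    def complete(self) -> None:
        """Finish one existing transaction (these continue during drain)."""
        if self.in_flight > 0:
            self.in_flight -= 1

    def request_power_down(self) -> None:
        self.down_req = True

    def remove_power(self) -> None:
        """Power manager removes clock or voltage only after Down_ok."""
        if not self.down_ok:
            raise RuntimeError("port has not drained yet")
        self.powered = False


def unit_active(ports) -> bool:
    """Unit power manager view: OR of the Active flags of its ports."""
    return any(p.active for p in ports)
```

In this model a port with two outstanding transactions refuses new ones after `request_power_down()`, reports `down_ok` only after both complete, and raises an error if power is removed earlier.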
25
Multicore Mobile Handset Example
[Block diagram: same handset block diagram as slide 4.]
26
SMART Interconnect Approach Addresses the Total
Global Interconnect Challenge
Perf. Verification
Virtual Prototyping
Parallel IP Creation
Methodology Automation
Arch. Modeling
Design Re-use
SW Development
Variable Clock Freq.
Scalable Fabrics
Timing Closure
Voltage Isolation
Power Management
Complex Memory Hierarchies
Intelligent Agents
Error Management
Signal Integrity
Access Security
High Peripheral Count
Data Width Conversion
Distributed Processing
Mixed Endianness
Guaranteed BW QoS
Protocol Conversion
Pipelining
27
Sonics Delivers Time-to-Market
[Schedule chart: Today (Typical). Phases: Architectural Definition; Logic Design & Verification; Physical Design; Fab, Assy, Test. Rows: First Design, Derivative Design(s).]
12 to 18 months of engineering time savings!
  • How is this possible?
  • Socket-Based Design Methodology
  • Highly Configurable Interconnect IP

28
Continuous Integration
  • In traditional bus-based designs, integration is
    performed once, at the end of the logic design
    phase
  • Architecture, µArchitecture, and logic nearly
    frozen
  • Labor-intensive and error-prone
  • In SMART Interconnect-based design, integration
    is performed continuously
  • Validate choices
  • Explore implications at lower levels
  • Cope with (inevitable) specification changes
  • Allow optimization at any time, at any level
  • SystemC modeling capabilities allow different
    levels of abstraction, rapid architectural
    exploration, simulation, and concurrent software
    development

29
Key Schedule and Resource Differentiation
  • Use of SMART Interconnects cuts design time
  • Conventional design is serial and iterative
  • Sonics structured approach is parallel and
    predictable
  • Key differences:
  • Decoupling / Complete socket for IP cores
  • Modeling of Communications / tradeoffs
  • Predictable physical implementation
  • Quality of Service Guarantees
  • Automation of integration
  • Architectural investigations based on real
    process technology data

30
Thank You
Over 200 million Sonics-enabled chips shipped
[Logos: Nokia, Sony, Hughes Network Systems, Cisco, Samsung, Dell, Toshiba]