NoC Symposium07 Panel: Proliferating the Use and Acceptance of NoC Benchmark Standards

Transcript and Presenter's Notes

1
NoC Symposium07 Panel Proliferating the Use
and Acceptance of NoC Benchmark Standards
  • Timothy M. Pinkston
  • National Science Foundation (NSF)
  • tpinksto@nsf.gov
  • University of Southern California (USC)
  • tpink@usc.edu

2
Driving Forces
  • Applications (algorithms, SW) define what system functions should be supported
  • Architecture defines how system functions are supported
  • Implementation (circuit) technology defines the extent to which desired system functions can be implemented in hardware
Trends Towards On-chip Networked Microsystems,
T. Pinkston and J. Shin, IJHPCN.
(http://ceng.usc.edu/smart/publications/archives/CENG-2004-17.pdf)
3
Is There a Need for a NoC Benchmark Suite?
  • A sampling of benchmark suites already out there

Gen-Purpose/PC: SPEC CPU2000/2006, SPLASH-2, Netperf, Dhry-/Whetstone, BAPCo SYSmark, BYTEmark, LMBench
Embedded/SoC: EEMBC, MiBench, MediaBench, ALPBench, GraalBench, NPCryptBench, CommBench, DMABench, BioBench
Sci-Eng/HPC: STREAM, HPL, LINPACK, LAPACK, ScaLAPACK, NPB (NAS PB), LFK (Livermore), SparseBench, LLCbench
  • Do we really need yet another benchmark suite?

4
December 2006 NSF OCIN Workshop
Recommendations (www.ece.ucdavis.edu/ocin06)
  • A set of standard workloads/benchmarks and evaluation methods is needed to enable realistic evaluation and uniform (fair) comparison between various approaches
  • Need for cooperation (agreement) between academia and industry
  • Need for qualified performance metrics: latency and bandwidth under power, energy, thermal, reliability, area, etc., constraints
  • Need for standardization of metrics: clear definition of what is being represented by the metrics (e.g., network latency, throughput, ...)
  • Need for effective alternatives to time-consuming full-system execution-driven simulation, including use of microbenchmarks, parameterized synthetic traffic/workloads, traces, etc.
  • Need for accurate characterization and modeling of system traffic behavior across various domains (general-purpose, embedded)
  • Need for analytical methods (complementary to simulation) to explore and quantitatively narrow down the large design space

Challenges in Computer Architecture Evaluation,
K. Skadron, M. Martonosi, D. August, M. Hill, D.
Lilja, V. Pai, in IEEE Computer, pp. 30-36,
August 2003.
5
Meaning of Latency and Throughput
  • Latency: fabric only, endnode-to-endnode, average, no-load, saturation?
  • Throughput: peak, sustained, saturation, best-case, worst-case?

Simulation: 3-D torus, 4,096 nodes (16 × 16 × 16), uniform traffic load, virtual cut-through switching, three-phase arbitration, 2 and 4 virtual channels. Bubble flow control is used in dimension order on one virtual channel; the other virtual channel(s) is supplied in dimension order (deterministic routing) or along any shortest path to destination (adaptive routing).
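The uniform traffic load above has a convenient closed form; a minimal sketch (not the panel's simulator) of why a 16-node ring contributes k/4 = 4 average hops per dimension, giving an average distance of 12 hops on this 16 × 16 × 16 torus:

```python
K = 16        # radix per dimension (16 x 16 x 16 torus)
DIMS = 3      # three dimensions -> 4,096 nodes total

def ring_hops(src, dst, k=K):
    """Minimal hop count between two coordinates on a k-node ring."""
    d = abs(src - dst)
    return min(d, k - d)

def avg_uniform_distance(k=K, dims=DIMS):
    """Exact average minimal distance under uniform traffic.

    Each dimension contributes its mean ring distance independently,
    since destination coordinates are chosen uniformly at random.
    """
    per_dim = sum(ring_hops(0, d, k) for d in range(k)) / k
    return dims * per_dim

print(avg_uniform_distance())   # 12.0 -> the average hop count d used in latency models
```

This average hop count is the d that appears in the analytical latency model on the next slide.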
6
Simple (Analytical) Latency and Throughput Models
  • Hennessy & Patterson, Interconnection Networks chapter (ceng.usc.edu/smart/slides/appendixE.html)
  • Network traffic pattern/load determines σ and γ, the traffic-dependent parameters
  • Topology and switch microarchitecture determine d, Tr, Ta, Ts, BWBisection
  • Routing, switching, FC, microarchitecture, etc., influence the network efficiency factor, ρ:
  • internal switch speedup: reduction of contention within switches
  • buffer organizations to mitigate HOL blocking in and across switches
  • balance load across network links: maximally utilize link bandwidth
  • ρ = ρL × ρR × ρA × ρS × ρmArch × ..., the architecture-dependent parameters
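The throughput side of this model can be sketched as a back-of-the-envelope bound in the spirit of the appendix cited above: effective network bandwidth is the minimum of the injection limit, the efficiency-scaled bisection limit, and the reception limit. Function and parameter names here are illustrative assumptions, not taken from the slides.

```python
def effective_bandwidth(n_nodes, bw_inject, bw_receive,
                        bw_bisection, rho=1.0, gamma=1.0, sigma=1.0):
    """Back-of-the-envelope network throughput bound (same units as inputs).

    rho   : network efficiency factor (routing, switching, flow control,
            microarchitecture): rho = rho_L * rho_R * rho_A * rho_S * ...
    gamma : fraction of injected traffic crossing the bisection (traffic-dependent)
    sigma : average reception factor (traffic-dependent)
    """
    injection_limit = n_nodes * bw_inject          # endnodes can't inject faster
    bisection_limit = rho * bw_bisection / gamma   # network core can't carry more
    reception_limit = sigma * n_nodes * bw_receive # endnodes can't absorb faster
    return min(injection_limit, bisection_limit, reception_limit)

# Example with Cell-EIB-like numbers (GB/s), 50% efficiency:
print(effective_bandwidth(12, 25.6, 25.6, 204.8, rho=0.5))
```

With ρ = γ = σ = 1 the bound collapses to whichever of the three raw limits is smallest, which is the worst-case analysis applied on the next slide.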

7
Modeling Throughput of Cell BE EIB (Worst-Case)
  • BWNetwork ≤ BWBisection / γ
  • BWBisection = 8 links × 25.6 GB/s = 204.8 GB/s (command bus bandwidth)
  • Traffic pattern determines σ and γ; here σ = 1, γ = 1
  • BWNetwork ≤ 204.8 / 1 GB/s, but ρ is limited, at best, to only 50% due to ring interference
  • Measured worst case: 78 GB/s
  • Injection bandwidth: 25.6 GB/s per element × 12 nodes = 307.2 GB/s network injection
  • Reception bandwidth: 25.6 GB/s per element × 12 nodes = 307.2 GB/s network reception
  • Peak BWNetwork: 25.6 GB/s × 3 transfers per ring × 4 rings = 307.2 GB/s (4 rings, each with 12 links)
  • Aggregate bandwidth: 1,228.8 GB/s
8
[Figure: SPEC integer and floating-point program results. Ref: Hennessy & Patterson, Computer Architecture: A Quantitative Approach, 4th Ed.]
9
In Conclusion: Answers to Panel Questions
  • What are the hallmarks of successful benchmark suites?
  • Fairness: represent the proper workload behavior/characteristics
  • Portability: open, free access, not architecture/vendor-specific
  • Transparency: yield reproducible performance results (reporting)
  • Evolutionary: adaptable over time in composition and reporting
  • How can industry and academia facilitate use?
  • Establish the need/importance for common evaluation best practices
  • Cross-cutting effort: architects, circuit designers, CAD researchers
  • Need to place high value on developing and using evaluation standards
  • What are the main obstacles to establishing a de facto NoC standard benchmark suite, and how to address them?
  • Capturing the diversity of NoC applications and computing domains
  • Red herrings → converge on performance evaluation standards and agree on characteristic traffic loads and/or microbenchmarks
  • Ultimately, system-level performance is what matters, not component-level performance