Operating System Support for Fine-Grain Parallelism on Multicore Architectures

1
Operating System Support for Fine-Grain
Parallelism on Multicore Architectures
John Giacomoni

Manish Vachharajani
University of Colorado at Boulder
2007.10.14

2
Problem

Uniprocessor (UP) performance at end of life
Chip-Multiprocessor systems
Individual cores less powerful than UP
Asymmetric and Heterogeneous
10s-100s-1000s of cores
What do we want from multicore systems?
Performance!
Figure labels: Intel (2x2-core), MIT RAW (16-core), 100-core, 400-core
3
Extracting Performance

Task Parallelism
Desktop
Data Parallelism
Web serving
Split/Join, MapReduce, etc.
Pipeline Parallelism
Video decoding
Network processing

4
Extracting Performance (2)

Stream Parallelism
Combines
Data Parallelism
Pipeline Parallelism
Ad-Hoc Parallelism
Semi- or unstructured
Usual thread model

5
Focus on Pipeline Parallelism

Most stringent timing requirements
Example applications
Network Processing
Network Intrusion Detection
DDoS Filtering
Multimedia processing
Transcoding
Signal Processing
Software Defined Radio
Also applies to
Data parallelism
Stream Parallelism

6
Soft Network Processing (Soft-NP)

How do we protect?

GigE Network Properties
1,488,095 frames/sec
672 ns/frame
Frame dependencies
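
The frame rate and per-frame time above can be reproduced with a short calculation. This is a sketch that assumes worst-case minimum-size frames and the usual Ethernet framing constants (64-byte frame, 8-byte preamble, 12-byte inter-frame gap), none of which are stated on the slide.

```python
# Worst-case Gigabit Ethernet frame rate, assuming minimum-size frames.
# The framing constants below are standard Ethernet values and are an
# assumption here; the slide only gives the resulting numbers.
LINK_BITS_PER_SEC = 1_000_000_000  # 1 Gb/s line rate
MIN_FRAME_BYTES = 64               # minimum Ethernet frame, including CRC
PREAMBLE_BYTES = 8                 # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12          # mandatory idle time between frames

bits_on_wire = 8 * (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES)
ns_per_frame = bits_on_wire * 1_000_000_000 // LINK_BITS_PER_SEC
frames_per_sec = LINK_BITS_PER_SEC // bits_on_wire

print(ns_per_frame, "ns/frame")
print(frames_per_sec, "frames/sec")
```

Under these assumptions each frame occupies 672 bit times on the wire, which gives the 672 ns/frame budget that every pipeline stage has to meet at line rate.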

Frame Shared Memory: Line-Rate Networking on
Commodity Hardware. To appear in Proceedings of
the ACM/IEEE Symposium on Architectures for
Networking and Communications Systems (ANCS),
December 2007. John Giacomoni, John K.
Bennett, Antonio Carzaniga, Douglas C. Sicker,
Manish Vachharajani, and Alexander L. Wolf.
7
Frame Shared Memory (Soft-NP)
Figure labels: Input (IP), Output (OP)
8
What OS support is necessary?
9
Low-Overhead Communication
Gigabit Ethernet
Syscalls 170 ns
pthread mutex 200 ns
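
To put these costs in context, the sketch below compares each one with the 672 ns per-frame time quoted on the Soft-NP slide. The helper name `budget_share` is hypothetical, and treating each cost as paid once per frame is an assumption made for illustration.

```python
# Share of the per-frame time budget consumed by one operation.
# FRAME_BUDGET_NS comes from the Soft-NP slide; the two costs come from
# this slide. Charging each cost once per frame is an assumption.
FRAME_BUDGET_NS = 672
costs_ns = {"syscall": 170, "pthread mutex": 200}


def budget_share(cost_ns, budget_ns=FRAME_BUDGET_NS):
    """Return the fraction of the per-frame budget used by one operation."""
    return cost_ns / budget_ns


for name, cost in costs_ns.items():
    print(f"{name}: {budget_share(cost):.0%} of the per-frame budget")
```

A single system call or mutex operation per frame would use roughly a quarter to a third of the budget before any useful work is done, which is the motivation for cheaper inter-stage communication.
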
10
FastForward

Portable, software-only framework
35-40 ns/queue operation on a 2.0 GHz AMD Opteron
26-28 ns/queue operation on a 2.6 GHz AMD Opteron
Architecturally tuned concurrent lock-free (CLF) queues
Works with strong to weak consistency models
Hides die-to-die communication
Robust against unbalanced stages
Poster: FastForward for Efficient Pipeline
Parallelism. Proceedings of the 16th
International Conference on Parallel
Architectures and Compilation Techniques (PACT),
September 2007. John Giacomoni, Tipp Moseley,
and Manish Vachharajani.
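
The slide does not show the queue algorithm, so the following is only a minimal sketch of one common way to build a lock-free single-producer/single-consumer queue: an empty slot is marked with a sentinel, and each side keeps its own private index. The names `SpscQueue`, `try_enqueue`, `try_dequeue`, and `run_pipeline` are hypothetical, and whether this matches the FastForward implementation is an assumption. Python is used to show the logic only; it says nothing about the nanosecond timings above or about memory ordering on real hardware.

```python
import threading
import time


class SpscQueue:
    """Single-producer/single-consumer ring buffer (illustrative sketch).

    A slot holding None is empty; any other value is a pending item. The
    producer touches only `head` and the consumer only `tail`, so the two
    sides never share an index variable.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0  # next slot to write; used only by the producer
        self.tail = 0  # next slot to read; used only by the consumer

    def try_enqueue(self, item):
        """Store item and return True, or return False if the queue is full."""
        if item is None:
            raise ValueError("None is reserved as the empty-slot marker")
        if self.slots[self.head] is not None:
            return False  # consumer has not drained this slot yet
        self.slots[self.head] = item
        self.head = (self.head + 1) % len(self.slots)
        return True

    def try_dequeue(self):
        """Return the oldest item, or None if the queue is empty."""
        item = self.slots[self.tail]
        if item is None:
            return None
        self.slots[self.tail] = None  # hand the slot back to the producer
        self.tail = (self.tail + 1) % len(self.slots)
        return item


def run_pipeline(n_items, capacity=8):
    """Pass the integers 1..n_items through one queue between two threads."""
    queue = SpscQueue(capacity)
    received = []

    def producer():
        for i in range(1, n_items + 1):
            while not queue.try_enqueue(i):
                time.sleep(0)  # queue full: yield and retry

    def consumer():
        while len(received) < n_items:
            item = queue.try_dequeue()
            if item is None:
                time.sleep(0)  # queue empty: yield and retry
            else:
                received.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `run_pipeline(200)` returns the integers in the order they were enqueued. Neither side takes a lock or makes a system call on the fast path, which is the property the slide contrasts with the mutex and syscall costs.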

11
FastForward Performance
Figure series: Lamport, FF, FF Unbalanced, FF Re-Balanced
12
Zero-Stall Guarantee
13
Gang Scheduling

Optimize for application performance
Instead of system throughput or fairness
Computer Utility -> max(System Utilization)
Multicore system -> excess of resources
Dedicate resources to pipeline applications
Want selective timesharing

14
System Services

Fast!
Synchronous calls introduce too much overhead
System calls 170 ns
Asynchronous calls may limit parallelism
Want system services with independent I/O paths

15
Pipelinable System Services

Mixing stages from multiple process domains
Push model vs. call/return or poll
Hardware can be an active participant

16
Heterogeneous Gang Scheduling

Need a single scheduling label for every pipeline stage
Ensures simultaneous scheduling of every necessary resource (zero-stall guarantee)
Including hardware stages
Scheduling multi-domain entities
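
The idea of one scheduling label per pipeline can be sketched as an all-or-nothing admission check: a labeled gang runs only if every resource its stages need is free at the same time. The function `runnable_gangs`, the labels, and the resource names below are hypothetical, and this toy check is an assumption about how such a label might be used, not the scheduler described in the talk.

```python
from collections import defaultdict


def runnable_gangs(stages, free_resources):
    """Return labels whose stages can all be placed at the same time.

    stages: iterable of (label, resource) pairs, one per pipeline stage.
    free_resources: set of resource names that are currently idle.
    A gang is admitted only if every resource it needs is free; admitted
    gangs claim their resources so later gangs cannot take them.
    """
    needed = defaultdict(set)
    for label, resource in stages:
        needed[label].add(resource)

    available = set(free_resources)
    admitted = []
    for label, resources in needed.items():
        if resources <= available:  # every required resource is free
            admitted.append(label)
            available -= resources
    return admitted


# Hypothetical pipelines: "nids" includes a hardware stage (nic0).
stages = [
    ("nids", "core0"), ("nids", "core1"), ("nids", "nic0"),
    ("transcode", "core1"), ("transcode", "core2"),
]
print(runnable_gangs(stages, {"core0", "core1", "core2", "nic0"}))
print(runnable_gangs(stages, {"core1", "core2"}))
```

Because hardware stages carry the same label as software stages, the check covers devices as well as cores, and a pipeline is never started with one of its stages missing.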

17
Multi-Domain Entities

Application state
Shared with local stages
Pipeline private state
Stage state shared with pipeline and parent process
The multi-domain application model respects the
private data model implicit in single-domain
applications while providing first-class naming
for multi-domain pipelines.

18
Summary of Discussion