Operating System Support for Improving Data Locality on CC-NUMA Machines

Provided by: amul9
Learn more at: http://www.cse.psu.edu

1
Operating System Support for improving data
locality on CC-NUMA machines
  • CSE597A Presentation
  • By V.N.Murali

2
WHY CC-NUMA?
  • Scales as the number of nodes increases.
  • Attractive property: transparent access to local and
    remote memory, at the cost of increased access
    latency to remote memory.
  • Two variations: CC-NUMA (Stanford DASH, MIT
    Alewife, Sequent) and CC-NOW (Sun s3.mp).

3
OS support
  • Most important issue: data locality.
  • OS-supported page migration and replication can
    improve performance by as much as 30%.

4
Issues in Migration/Replication
  • When should pages be migrated?
  • When should pages be replicated?
  • Both are needed to boost performance.
  • When not to migrate/replicate is also important.
  • Which system parameter can be used to decide?
    Ideas?

5
Differences with S/W shared memory
  • Migration and replication (M&R) in S/W DSM is
    needed for correctness. On CC-NUMA, M&R is purely
    an optimization.
  • M&R in S/W DSM is triggered by page faults. On
    CC-NUMA, M&R is triggered by cache misses.

6
  • If a workload exhibits good cache locality, it
    benefits less from M&R. Hence pages must be
    selected for moving by explicit criteria.
  • The study is based on the SimOS environment.

7
Solution
  • How do we improve data locality?
  • Three access patterns:
    a) primarily accessed by a single process
    b) mostly read accesses by many processes
    c) both read and write accesses by many processes
  • Which method should be applied for a), b), and c)?

8
Costs to be considered
  • 1) Cost of determining candidate pages for M&R
    (counting cache misses or TLB misses).
  • 2) Overhead of M&R (creating new mappings,
    allocating a page, flushing the TLB).
  • 3) Actual data transfer.
  • 4) Memory pressure!

9
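Decision tree (figure):
  • Miss rate to page LOW: do nothing.
  • Miss rate to page HIGH: check sharing.
  • Sharing HIGH: check write frequency and memory
    pressure. HIGH: do nothing. LOW: replicate.
  • Sharing LOW: check migration rate. HIGH: do
    nothing. LOW: migrate.
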
10
Key Parameters

11
Summary of the algorithm
  • Hot page: a page whose miss counter for a
    processor reaches the trigger threshold.
  • If the miss counter for this page on any other
    processor reaches the sharing threshold, the page
    is considered for replication; otherwise it is
    considered for migration.
  • The page is replicated only if its write counter
    has not exceeded the write threshold. It is
    migrated only if its migrate counter has not
    exceeded the migrate threshold.
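
The rules on this slide can be sketched as a small decision function. This is only an illustration under stated assumptions: the threshold values, the PageCounters class, and the decide function are hypothetical names and numbers, not taken from the slides or from IRIX, and memory pressure is left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Illustrative threshold values (assumptions, not from the presentation).
TRIGGER_THRESHOLD = 128
SHARING_THRESHOLD = 32
WRITE_THRESHOLD = 1
MIGRATE_THRESHOLD = 1


@dataclass
class PageCounters:
    """Hypothetical per-page counters kept between counter resets."""
    misses: dict = field(default_factory=lambda: defaultdict(int))  # cpu -> miss count
    writes: int = 0
    migrations: int = 0


def decide(page: PageCounters, cpu: int) -> str:
    """Return 'replicate', 'migrate', or 'nothing' for a page, as seen from cpu."""
    if page.misses[cpu] < TRIGGER_THRESHOLD:
        return "nothing"  # page is not hot for this processor
    # Shared if any *other* processor has reached the sharing threshold.
    shared = any(
        count >= SHARING_THRESHOLD
        for other, count in page.misses.items()
        if other != cpu
    )
    if shared:
        # Replicate only if the write counter has not exceeded its threshold.
        return "replicate" if page.writes <= WRITE_THRESHOLD else "nothing"
    # Migrate only if the migrate counter has not exceeded its threshold.
    return "migrate" if page.migrations <= MIGRATE_THRESHOLD else "nothing"
```

For example, a page that is hot on processor 0 and untouched elsewhere would be migrated, while the same page with enough misses from processor 1 and few writes would be replicated.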

12
Implementation details
  • The directory controller maintains the miss
    counters and generates a low-priority interrupt.
  • It batches a few hot pages before raising the
    interrupt.
  • A write to a replicated page collapses the
    replicas into a single page.
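
A toy model of these details may help. The sketch below assumes a hypothetical DirectoryController class with made-up trigger_threshold and batch_size parameters; the real mechanism lives in hardware and in the IRIX kernel. The rule for which copy survives a collapse is also an assumption.

```python
class DirectoryController:
    """Toy model: counts misses per (page, cpu) and batches hot pages."""

    def __init__(self, trigger_threshold=4, batch_size=2):
        self.trigger_threshold = trigger_threshold  # assumed value
        self.batch_size = batch_size                # assumed value
        self.miss_counts = {}   # (page, cpu) -> miss count
        self.pending = []       # hot pages waiting for the next interrupt
        self.delivered = []     # batches handed to the OS so far

    def record_miss(self, page, cpu):
        key = (page, cpu)
        self.miss_counts[key] = self.miss_counts.get(key, 0) + 1
        # A page becomes hot exactly when its counter reaches the threshold.
        if self.miss_counts[key] == self.trigger_threshold:
            self.pending.append(key)
            if len(self.pending) >= self.batch_size:
                self.raise_interrupt()

    def raise_interrupt(self):
        # Stand-in for the low-priority interrupt: hand over the whole batch.
        self.delivered.append(list(self.pending))
        self.pending.clear()


def collapse_on_write(replicas, writer_node):
    """Return the set of nodes holding the page after a write collapses it.

    Assumption: keep the writer's copy if it has one, else the lowest node.
    """
    keep = writer_node if writer_node in replicas else min(replicas)
    return {keep}
```

Batching amortizes the interrupt and TLB-flush overhead over several pages, which is the kernel-overhead cost listed earlier.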

13
IRIX changes
  • Replication support
  • Finer grain locking
  • Page table back mappings

14
Workloads
  • Engineering workload: large, sequential,
    memory-intensive; uses a Verilog simulator and
    Flashlite.
  • Parallel application: Raytrace, a parallel
    graphics algorithm.
  • Scientific workload: Splash.
  • Decision support database.
  • Multiprogrammed software: Pmake.

15
Performance analysis
  • Three factors: a) user stall time, b) fraction of
    misses satisfied in local memory, c) kernel
    overhead.
  • Engineering: large user stall time, hence the
    best performance gain. M&R were used
    successfully.
  • Raytrace: mostly read-only accesses. Mainly
    benefits from replication.

16
  • Splash: three parallel applications (Raytrace,
    Ocean, Volume rendering). For Ocean, migration is
    helpful. Raytrace and Volume rendering can
    benefit from replication.
  • Database: mostly read accesses, hence replication.
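
The second factor in the analysis, the fraction of misses satisfied in local memory, can be computed from a miss trace. The sketch below is an assumption about how one might measure it; the function name, the trace format, and the example data are hypothetical and are not results from the study.

```python
def local_miss_fraction(misses, copies):
    """Fraction of misses served by a copy on the missing processor's node.

    misses: list of (page, node) pairs, one per cache miss.
    copies: dict mapping page -> set of nodes that hold a copy of it.
    """
    if not misses:
        return 0.0
    local = sum(1 for page, node in misses if node in copies.get(page, set()))
    return local / len(misses)


# Made-up trace: page 1 is read from nodes 0 and 1, page 2 only from node 1.
trace = [(1, 0), (1, 1), (2, 1), (2, 1)]
before = {1: {0}, 2: {0}}      # both pages live on node 0
after = {1: {0, 1}, 2: {1}}    # page 1 replicated, page 2 migrated
```

With this made-up trace, replicating the read-shared page and migrating the single-user page raises the local fraction from 0.25 to 1.0.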

17
Alternative policies
  • Static policies and dynamic policies.
  • Static: round robin, first touch, post facto
    (similar to the optimal page replacement
    algorithm).
  • Dynamic: migration only, replication only,
    migration and replication.
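
The three static policies can be sketched as page-placement functions over an access trace. The function names and the trace format are hypothetical. The post facto version assumes the page goes to the node that accesses it most over the whole trace, which requires knowing the future and is why the slide compares it to optimal page replacement.

```python
from collections import Counter


def round_robin_placement(num_pages, num_nodes):
    """Page i is placed on node i mod num_nodes, ignoring access patterns."""
    return [page % num_nodes for page in range(num_pages)]


def first_touch_placement(accesses):
    """Each page is placed on the node that touches it first.

    accesses: iterable of (page, node) pairs in time order.
    """
    home = {}
    for page, node in accesses:
        home.setdefault(page, node)  # only the first access sets the home
    return home


def post_facto_placement(accesses):
    """Each page is placed on the node with the most accesses to it overall."""
    counts = {}
    for page, node in accesses:
        counts.setdefault(page, Counter())[node] += 1
    return {page: c.most_common(1)[0][0] for page, c in counts.items()}
```

Unlike the dynamic policies, none of these moves a page after its initial placement.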