Title: Shrinking AIX as a compute node OS


1
Fast-OS
Shrinking AIX as a compute node OS
July 10, 2002
Terry Jones, Integrated Computing and Communications Dept.
trj_at_llnl.gov
2
Outline
  • Introduction
  • Today's landscape
  • Directions
  • Problem Areas Ripe for Investigation
  • Parallel Aware Scheduling
  • Parallel Aware Memory Management
  • Metrics for evaluating system software
  • Why would anyone want to muck with AIX?
  • Bottom-up and Top-down Approaches
  • Why AIX?
  • How AIX?
  • Conclusion

4
The Landscape
  • Parallel applications need to span thousands of
    nodes
  • Architectures are adding more processor state
  • Applications are not mission critical
  • Both interrupts and busy-waiting are bad
  • Cache effects (processor affinity) cannot be
    ignored
  • Two modes:
      • Capability mode (jobs are dedicated)
      • Capacity mode (jobs may space-share the machine)

5
Directions
  • Continue to move from a monolithic operating
    system which communicates via shared-memory TO a
    decentralized design which communicates via
    efficient messages
  • Small kernel plus process-level managers
  • Modularity
  • Fault-tolerance
  • Extensibility

Question: How much should system software offer
in terms of features?
Answer: Everything required, and as much of what is
desired as possible.

7
Problem Areas Ripe For Investigation
  • Add parallel awareness
  • CPU resource (local/global program context,
    scheduling)
  • Memory resource (demand paging, address space
    extent)
  • Metrics
  • Other possibilities: fault tolerance, membership
    services
  • Re-visit where we insert boundaries (e.g.
    boundary between kernel and user-level code)

8
Scheduling Is An Overloaded Word
  • Spatial Scheduling
      • Assign processes to nodes
      • For example, batch schedulers and gang schedulers
      • Coarse-grain view of work to be done
  • Temporal Scheduling
      • For example, native operating system scheduling
      • Fine-grain view of work to be done (e.g.
        efficient pthread-level scheduling)
      • Lacks the necessary global view
  • Coscheduling

9
The Need for Parallel Aware Scheduling
  • Even on the most bare-bones operating systems,
    there can be more runnable processes than
    processors
  • Many parallel algorithms are extremely sensitive
    to serializations
  • A first-order goal is to maximize the overlap in
    time, across nodes, of competing (interfering)
    processes while a parallel application runs.
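
The sensitivity to serialization can be illustrated with a toy model. This is only a sketch under stated assumptions, not a measurement of AIX: the application is assumed to be barrier-synchronized, so each step takes as long as its slowest node, and every node is assumed to lose a fixed slice of time to a competing process once per period. The function name, parameters, and numbers are all hypothetical.

```python
import random

def total_time(nodes, steps, work, noise, period, coscheduled, seed=0):
    """Time for `steps` barrier-synchronized steps across `nodes` nodes.

    Assumed model: each node is interrupted for `noise` time units once
    every `period` steps. If `coscheduled`, every node takes the
    interruption in the same step; otherwise each node has a random phase.
    """
    rng = random.Random(seed)
    if coscheduled:
        phases = [0] * nodes
    else:
        phases = [rng.randrange(period) for _ in range(nodes)]

    total = 0.0
    for step in range(steps):
        # A barrier makes the step as slow as the slowest node.
        total += max(
            work + (noise if step % period == phase else 0.0)
            for phase in phases
        )
    return total

uncoordinated = total_time(1000, 100, 1.0, 0.5, 10, coscheduled=False)
coordinated = total_time(1000, 100, 1.0, 0.5, 10, coscheduled=True)
print("uncoordinated interference:", uncoordinated)
print("overlapped interference:   ", coordinated)
```

In this model each node loses the same amount of time either way. The difference is that when the interruptions are overlapped, only one step per period is slowed, whereas with many nodes and random phases nearly every step contains at least one interrupted node.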

10
Improving Memory Management
  • Provide as much memory as possible with as
    little pain as possible
  • Memory systems are becoming more complex
  • Improved mechanisms to counter false-sharing.

11
Why Demand Paging
  • External storage (secondary and networked) will
    continue to exceed local memory
  • Memory requirements for certain simulations are
    almost unbounded
  • Removing constraints on memory is very desirable,
    but the cost of a page-fault is too much to have
    hidden from an application
  • A default process-level manager would provide
    page-cache management, as in Stanford DASH.
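
The point about page-fault cost can be made concrete with the standard effective-access-time formula, (1 - p) * t_mem + p * t_fault. The sketch below uses assumed, order-of-magnitude latencies (100 ns for a memory access, 10 ms to service a fault from external storage); neither figure comes from the presentation.

```python
def effective_access_ns(fault_rate, memory_ns, fault_ns):
    """Average cost of one memory access when a fraction `fault_rate`
    of accesses trigger a page fault."""
    return (1.0 - fault_rate) * memory_ns + fault_rate * fault_ns

MEMORY_NS = 100.0         # assumed cost of an access that hits memory
FAULT_NS = 10_000_000.0   # assumed cost of servicing a page fault (10 ms)

for rate in (0.0, 1e-6, 1e-3):
    eat = effective_access_ns(rate, MEMORY_NS, FAULT_NS)
    print(f"fault rate {rate:g}: {eat:.1f} ns per access "
          f"({eat / MEMORY_NS:.1f}x the no-fault cost)")
```

Under these assumptions, even one fault per thousand accesses makes the average access roughly a hundred times slower, which is why an application would want faults to be visible rather than hidden.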

12
Challenges For A
Virtual Memory Environment
  • Thought to preclude, or at least complicate,
    OS-bypass communication
  • An application cannot know the amount of physical
    memory it has available
  • An application cannot efficiently control the
    contents of the physical memory allocated to it
  • An application cannot control the read-ahead,
    writeback and discarding of pages within its
    physical memory.
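
Why control over page discarding matters can be shown with a small simulation. This is a hypothetical sketch, not any system's actual policy: it counts page faults for an application that repeatedly sweeps over five pages while holding only four physical frames, first with a fixed least-recently-used (LRU) policy and then with a most-recently-used (MRU) policy that an application aware of its own cyclic access pattern might choose. All names and sizes are illustrative.

```python
def count_faults(refs, frames, pick_victim):
    """Count page faults for the reference string `refs` with `frames`
    physical frames. `pick_victim` chooses the page to discard from the
    resident list, which is ordered least recently used first."""
    resident = []
    faults = 0
    for page in refs:
        if page in resident:
            # Hit: move the page to the most-recently-used end.
            resident.remove(page)
            resident.append(page)
            continue
        faults += 1
        if len(resident) == frames:
            resident.remove(pick_victim(resident))
        resident.append(page)
    return faults

def lru(resident):
    return resident[0]    # discard the least recently used page

def mru(resident):
    return resident[-1]   # discard the most recently used page

refs = list(range(5)) * 4   # four sweeps over pages 0..4
print("LRU faults:", count_faults(refs, 4, lru))
print("MRU faults:", count_faults(refs, 4, mru))
```

For this access pattern LRU always discards the page that will be needed soonest, so every reference faults, while the application-chosen policy keeps most of the working set resident.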

13
Metrics For Evaluating System Software
  • An aid for reaching agreement on what we want
  • A quantitative measure of different approaches
  • Compared to the scheduler work and the virtual
    memory work, this may be the most difficult

15
Bottom-up and Top-down Approaches
  • Bottom-up
      • Start with a clean slate
      • Add features as the need arises
      • Settle on a reasonable boundary
  • Top-down
      • Start with a full-featured implementation
      • Remove the unnecessary cruft
      • Settle on a reasonable boundary

16
Why AIX?
  • AIX is ubiquitous in supercomputer centers
  • AIX already has extensive capabilities
  • Not required to build everything before we try
    anything
  • AIX is mature (read: is not in radical change
    mode)
  • AIX scalability (32-way with AIX 5.x)

17
How AIX?
  • In close conjunction with IBM
  • Expect successes to pay off in IBM products
  • Done in an operating system independent manner
  • Findings apropos and available to other operating
    systems
  • Evaluated with real applications on very large
    machines

19
Conclusion
  • New needs arising from today's parallel machines
    pose new challenges for system software
  • Among the key needs that emerge:
      • Parallel-aware scheduling
      • Improved memory management
      • Metrics for evaluating operating systems
  • These can be investigated from a bottom-up
    approach, a top-down approach, or both
  • AIX is a reasonable choice for a top-down
    approach

This work was performed under the auspices of the
U.S. Department of Energy by the University of
California, Lawrence Livermore National Laboratory,
under contract No. W-7405-Eng-48.