The Utility of Exploiting Idle Memory for Data-Intensive Computations

1
The Utility of Exploiting Idle Memory for
Data-Intensive Computations
  • Anurag Acharya and Sanjeev Setia
  • Presented by John Oleszkiewicz
  • CSE 807 4/22/03

2
Problem Statement
  • We would like to use the idle memory of a network
    of workstations to allow data-intensive
    applications to run more quickly

3
Questions
  • What fraction of memory is idle at any given
    time on a NOW?
  • What fraction of individual host memory is idle
    at any given time?
  • How long will memory typically stay idle?
  • What is the benefit of harvesting idle memory to
    the owner of the program?
  • Will harvesting memory have an adverse impact on
    users of the workstations?

4
Workstation Environments
  • Alpha
  • 10 stations, each a 4-processor DEC machine, batch processing
  • Umdres
  • 14 stations, faculty and grad students
  • Gmu
  • 32 stations, faculty and grad students
  • Comsearch
  • 37 workstations, engineers and software developers

5
Traces
  • Two traces collected for each workstation
  • User activity (e.g., mouse movement)
  • Memory load information
  • Each trace has a resolution of 90 seconds
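
The collection loop implied by this slide can be sketched as follows. This is a hypothetical reconstruction, not the authors' tracer; the two probe callbacks (`read_user_activity`, `read_memory_load`) stand in for whatever platform hooks the real collector used:

```python
import time

def collect_trace(read_user_activity, read_memory_load,
                  n_samples, resolution_s=90):
    """Sample user activity and memory load once per interval,
    mirroring the 90-second trace resolution described above."""
    trace = []
    for _ in range(n_samples):
        trace.append((read_user_activity(),   # e.g., mouse/keyboard seen?
                      read_memory_load()))    # e.g., fraction of RAM in use
        time.sleep(resolution_s)
    return trace
```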

6
Workloads
  • OLAP: 5 queries to a department-store database
  • 1 million customers
  • 100,000 products
  • 10,000 employees
  • 200 departments
  • 100 million transactions
  • Can't fit in available memory of NOW

7
Workloads
  • DMINE: an application that extracts association rules from retail databases
  • 50 million transactions
  • Each transaction typically contains 10 items
  • Smaller than OLAP, but still can't fit in the idle memory of a workstation

8
Workloads
  • LU
  • Decomposition of an 8192x8192 matrix
  • Can fit in idle memory of NOW

9
Metrics
  • Memory Availability
  • Benefits
  • User Perception

10
Memory Availability
  • Aggregate availability
  • total memory in pool that is available
  • Fractional availability
  • fraction of workstations that have at least x
    memory available
  • Idle
  • No user activity and memory load less than 30% for 5 minutes
  • Idle Interval
  • How long a chunk of memory remains idle
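
Over hypothetical trace data (per-host samples of idle memory), the availability metrics above could be computed with a sketch like this; the sample format is an assumption, not the paper's:

```python
def aggregate_availability(samples):
    """Total idle memory (MB) in the pool at one sampling instant.
    samples: list of (free_mb, is_idle) tuples, one per workstation."""
    return sum(free for free, idle in samples if idle)

def fractional_availability(samples, x_mb):
    """Fraction of workstations with at least x_mb of idle memory."""
    hits = sum(1 for free, idle in samples if idle and free >= x_mb)
    return hits / len(samples)

def idle_intervals(idle_flags, resolution_s=90):
    """Lengths (seconds) of maximal runs of consecutive idle samples
    in one workstation's trace (True = idle in that 90 s slot)."""
    intervals, run = [], 0
    for idle in idle_flags:
        if idle:
            run += 1
        else:
            if run:
                intervals.append(run * resolution_s)
            run = 0
    if run:
        intervals.append(run * resolution_s)
    return intervals
```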

11
Benefits
  • Execution time
  • reduction in execution time by using idle memory in the NOW as a cache
  • Equivalent Memory
  • Amount of memory that has to be added to the
    local workstation to get the equivalent speedup
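
As a toy illustration of these two benefit metrics (the runtimes and memory sizes below are made up, not the paper's measurements):

```python
def time_reduction(t_local, t_now_cache):
    """Fractional execution-time reduction from using the NOW's idle
    memory as a cache, versus running with local memory only."""
    return (t_local - t_now_cache) / t_local

def equivalent_memory(t_now_cache, local_runs):
    """Smallest amount of extra local RAM whose measured runtime matches
    the NOW-cache runtime. local_runs: [(extra_mb, runtime)] sorted by
    extra_mb; returns None if no configuration is fast enough."""
    for extra_mb, runtime in local_runs:
        if runtime <= t_now_cache:
            return extra_mb
    return None
```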

12
User Perception
  • Memory context size
  • How much memory must be reclaimed when the
    workstation becomes active
  • Directly impacts user-perceived slowdown

13
Parameters
  • Recruitment policy
  • Network Bandwidth
  • Memory vacate time
  • How fast should harvested memory be relinquished?

14
Results
15
Metrics
  • Aggregate Availability
  • 70-85% of total memory is available on average
  • 60% is (almost) always available
  • Fractional Availability
  • 90% of workstations have at least 50% of their memory available
  • Idle Interval
  • Median 18-30 minutes

16
What does it mean?
  • The recruitment-policy parameter should be set to 50%
  • Very large memory chunks are likely to survive at least 15 minutes
  • Apps that access their datasets several times within 15-20 minutes will benefit

17
Workload Results
  • LU
  • Performance benefits on some clusters
  • Equivalent memory: 379 MB
  • EQ limited by network overhead
  • DMINE
  • Performance benefits reflect the amount of idle memory available
  • EQ highest at 1 GB
  • EQ limited by idle memory

18
Workload Results (cont'd)
  • OLAP
  • Performance increase similar to DMINE, but the benefit was much lower
  • EQ benefits lower
  • Bottleneck: average I/O requirements are much larger, causing bigger communication overhead

19
Metrics: User Costs
  • In most cases, no delays
  • Memory context size typically between 0 and 1.5 MB
  • Harvesting at most 50% of memory has very little impact on users
  • What about increasing the recruitment policy to over 50%?
  • Beneficial to apps with a footprint greater than the idle memory available
  • Cost: user-visible delays

20
Other Results
  • Increasing network bandwidth increased equivalent
    memory
  • Memory vacate times
  • Values other than 0 never resulted in an improvement of more than 10%
  • The small benefit is probably not worth slowing down the workstation owner

21
Conclusions
  • 30% of the memory on a typical NOW can be harvested with no adverse impact on users
  • If this memory is used as an intermediate cache, execution time can be reduced by 26%
  • Restricting maximum recruited memory to 50% ensures no delays during reclamation
