Title: Dynamic Adaptation of Data Distribution Policies in a Shared Data Space System
1 Dynamic Adaptation of Data Distribution Policies in a Shared Data Space System

Giovanni Russello, Michel Chaudron (Dept. of Mathematics and Computing Science, Eindhoven University of Technology)
Maarten van Steen (Faculty of Science, Dept. of Computer Science, Vrije Universiteit Amsterdam)
2 Problem Context: Perspectives

- Distributed system in which
  - application-level software components join and leave the system dynamically
  - the communication profile of applications is irregular/unpredictable
- Middleware designer
  - How can I best cater for the communication needs of an assembly of components? → optimize resource use
- Component designer
  - Can I design my component such that it is independent of design choices in the communication mechanisms of the middleware? → hence enhancing reusability
3 Shared Data Space Model: Overview

[Diagram: components A, B, and C interact with the shared data space via put, read, and take operations]

- Tuple: ordered sequence of typed fields with specified values
  - <str name, int age> → <Giovanni, 28>
- Template: ordered sequence of typed fields with or without a specified value
  - <str name, int age> → <Giovanni, int ?>
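As an illustration of how a template matches a tuple (the function below is our own sketch, not GSpace's API), a formal field such as `int ?` matches any value of that type, while a specified field must match by value:

```python
# Sketch of tuple/template matching in a shared data space.
# A template field is either a concrete value or a bare type (a wildcard).

def matches(template, tup):
    """Return True if tup matches template field by field."""
    if len(template) != len(tup):
        return False
    for t_field, value in zip(template, tup):
        if isinstance(t_field, type):          # formal field, e.g. int ?: match by type only
            if not isinstance(value, t_field):
                return False
        elif t_field != value:                 # actual field: match by value
            return False
    return True

# Template <Giovanni, int ?> against tuple <Giovanni, 28>
print(matches(("Giovanni", int), ("Giovanni", 28)))  # True
print(matches(("Giovanni", int), ("Michel", 28)))    # False
```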
4 Shared Data Space Model: Features

- Small, yet powerful, API
- Uncoupling in time: applications do not need to communicate at the same time in order to exchange data
- Uncoupling in space: applications can cooperate even if they do not know each other's location
- Computation is separated from coordination
5 Advantages for Component-Based Systems

- Application components are not bound to any specific application interface
- Support for run-time dynamic (de)composition
- Absence of referential information among application components
6 Existing Shared Data Space Implementations

- Shared data spaces typically employ a static, system-wide scheme for distributing data.
- Often, the distribution scheme is dictated by
  - the application characteristics
  - the target platform (HW)
- Examples of distribution schemes
  - Centralized (JavaSpaces)
  - Uniform distribution (Corradi et al.)
  - Hash-based distribution (Rowstron)
7 Problem Statement

- Generally, applications have different needs for the different types of data they use.
- Examples of data usage patterns in a Process Farm application:
  - Master-Workers: job data
  - Write-many: partial-result data
  - Read-most: result data

How can we maintain the simple programming model of the shared data space, yet also cater for different quality needs?
8 A Solution: Separation of Concerns

- Specify application functionality separately from extra-functional concerns (such as data distribution)
- A precondition is that computation is separated from coordination
- Treat different data types using different distribution policies ..
- .. in order to distribute data more efficiently. Huh, more efficiently?!
9 Our Approach: GSpace

- Distributed shared data space system
- Separation of functionality from extra-functional requirements
- Differentiation of distribution policy per tuple type
- Dynamic adaptation of distribution policies
- Extensible suite of distribution policies
10 GSpace Kernel Deployment

[Diagram: a GSpace kernel runs on each of Node 1, Node 2, ..., Node n, connected by the network]
11 Examples of Distribution Policies
- Store locally (SL)
- Full replication (FR)
- Caching with invalidation (CI)
- Caching with verification (CV)
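To make the differences concrete, two of these policies can be sketched behind a common put/read shape over in-memory per-node stores. This is our own illustrative model (class names, method signatures, and the dict-of-lists stores are assumptions, not GSpace's implementation):

```python
# Hypothetical sketch of two distribution policies over in-memory
# node stores. `nodes` maps a node id to that node's local tuple list.

class StoreLocally:
    """SL: a tuple stays on the node that wrote it; reads may go remote."""
    def put(self, node, nodes, tup):
        nodes[node].append(tup)

    def read(self, node, nodes, match):
        # Search the local store first, then the remote ones.
        stores = [nodes[node]] + [s for n, s in nodes.items() if n != node]
        for store in stores:
            for tup in store:
                if match(tup):
                    return tup
        return None

class FullReplication:
    """FR: every put is replicated to all nodes; reads are always local."""
    def put(self, node, nodes, tup):
        for store in nodes.values():
            store.append(tup)

    def read(self, node, nodes, match):
        for tup in nodes[node]:
            if match(tup):
                return tup
        return None

nodes = {"n1": [], "n2": []}
FullReplication().put("n1", nodes, ("Giovanni", 28))
print(FullReplication().read("n2", nodes, lambda t: t[0] == "Giovanni"))
# -> ('Giovanni', 28)
```

The trade-off the cost model later quantifies is already visible here: FR pays bandwidth and memory on every put so that reads stay local, while SL does the opposite.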
12 Separating Concerns in GSpace

[Diagram: the application/computation layer sits on top of the coordination middleware; a policy descriptor specifies the mapping from tuple type Ti to distribution policy Pj, and its implementation is downloaded into the distribution middleware layer above the network level]
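At its core, the policy descriptor specifies a mapping Ti → Pj from tuple types to distribution policies. The fragment below only illustrates that mapping idea; the dict format, type names, and lookup helper are our assumptions, not GSpace's actual descriptor syntax:

```python
# Hypothetical policy descriptor: tuple type -> distribution policy.
# The Process Farm types and policy choices are illustrative only.

policy_descriptor = {
    "JobTuple":           "SL",  # store locally
    "PartialResultTuple": "FR",  # full replication
    "ResultTuple":        "CI",  # caching with invalidation
}

def policy_for(tuple_type, descriptor, default="SL"):
    """Look up the distribution policy for a tuple type."""
    return descriptor.get(tuple_type, default)

print(policy_for("ResultTuple", policy_descriptor))  # CI
```

Because this mapping lives outside the application code, a policy can be swapped per tuple type without touching the components that put and read those tuples.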
13 More Efficient, Huh?

- Minimize the costs involved in data distribution.
- A cost function captures the performance of a given distribution policy during a period of time:

CF(p) = w1·m1,p + w2·m2,p + ... + wn·mn,p,  with Σ wi = 1

- The policy that produces the lowest CF value is the best policy.
- The mi,p are performance indicators (to be measured from the running system).
14 Performance Metrics

- Read latency (rl): time spent reading a tuple
- Take latency (tl): time spent taking a tuple
- Bandwidth usage (bu): amount of bandwidth used for distributing tuples and synchronization messages
- Memory usage (mu): amount of memory used for storing tuples

CF(p) = w1·rl_p + w2·tl_p + w3·mu_p + w4·bu_p,  with Σ wi = 1
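Evaluating CF for each candidate policy is then a weighted sum over the measured metrics; a minimal sketch (the metric values below are invented for illustration and assumed to be normalized to comparable scales):

```python
# Compute CF(p) = w1*rl + w2*tl + w3*mu + w4*bu per policy and pick
# the policy with the lowest cost. Metric values are made up.

weights = {"rl": 0.25, "tl": 0.25, "mu": 0.25, "bu": 0.25}  # sum to 1

# Normalized metrics measured per policy over the last period (illustrative).
metrics = {
    "SL": {"rl": 0.9, "tl": 0.8, "mu": 0.1, "bu": 0.2},
    "FR": {"rl": 0.1, "tl": 0.3, "mu": 0.9, "bu": 0.8},
    "CI": {"rl": 0.3, "tl": 0.4, "mu": 0.5, "bu": 0.4},
}

def cost(policy_metrics, weights):
    """Weighted sum of the performance indicators for one policy."""
    return sum(weights[m] * v for m, v in policy_metrics.items())

best = min(metrics, key=lambda p: cost(metrics[p], weights))
print(best)  # CI has the lowest weighted sum with these numbers
```

Changing the weights shifts the trade-off: a memory-constrained deployment would raise w3, penalizing full replication.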
15 GSpace Kernel: Internal Structure

[Diagram: at the middleware level, the GSpace kernel handles application requests coming from the application level and inter-kernel communication at the network/OS level]
16 OPS Modules

[Diagram: the OPS modules sit at the middleware level alongside the GSpace kernel, between the application level and the network/OS level]
17 Adaptation System Modules

For each tuple type there is one AM working in master mode; all the others work in slave mode.

[Diagram: the Adaptation System (AS) at the middleware level comprises the Adaptation Module, Logger, Cost Computation Module, and Adapt-Comm Module; it receives input from the Controller and connects to the AddTable and PolTable, to the LocDataSpace and DPS, and to the DPCM within the GSpace kernel, above the network/OS level]
18 Adaptation Mechanism: Phases
- Logging
- Evaluation
- Adaptation (optional)
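The three phases could be wired together roughly as follows. This is a sketch under our own assumptions about module interfaces (the function names, the stand-in cost function, and the threshold parameter are ours, not GSpace's code):

```python
# Sketch of the logging -> evaluation -> (optional) adaptation cycle
# for one tuple type. Log entries and policies are simplified stand-ins.

def evaluation_cycle(log, policies, current, cost_fn, threshold=0.0):
    """Return the policy to use next; switch only if the gain exceeds threshold."""
    costs = {p: cost_fn(p, log) for p in policies}     # evaluation phase
    best = min(costs, key=costs.get)
    if costs[current] - costs[best] > threshold:       # adaptation phase (optional)
        return best
    return current

# Illustrative cost model: replication-style policies suit read-heavy logs.
def cost_fn(policy, log):
    reads = sum(1 for op in log if op == "read")
    takes = len(log) - reads
    return {"SL": reads, "FR": takes, "CI": 0.5 * len(log)}[policy]

log = ["read"] * 8 + ["take"] * 2                      # logging phase output
print(evaluation_cycle(log, ["SL", "FR", "CI"], "SL", cost_fn))  # FR
```

The threshold makes adaptation optional in practice: if the best alternative is only marginally cheaper than the current policy, the switch (and its overhead) is skipped.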
19 Logging Phase (MSC)

20 Evaluation Phase (MSC)

21 Adaptation Phase (MSC)
22 Experiment Settings

[Diagram: application model with a coordinator and node1, node2, ..., noden, where n = 2..10]

- Application usage patterns simulated by the application model:
  - Local Usage Pattern (LUP)
  - Write-many Usage Pattern (WUP)
  - Read-mostly Usage Pattern (RUP), variants (i) and (ii)

Example of an operation run: (p,l1) (r,l2) (r,l2) (r,l2) (t,l1) ..
23 Experiment Settings

[Diagram: the coordinator generates an operation run, which is then executed in run phase 1, run phase 2, and run phase 3]

- Static settings: adaptation disabled
- Dynamic settings: adaptation enabled
24 Experiment Settings: Static Settings (Adaptation Disabled)

For a given operation run:
- While there are more distribution policies:
  - assign the next distribution policy to the tuple type
  - execute the operation run
  - compute the CF value
- Select the policy with the minimum CF value
25 Experiment Settings: Dynamic Settings (Adaptation Enabled)

Same operation run as in the previous experiments.
- While there are more threshold values:
  - execute the operation run for the selected threshold
  - during each evaluation phase, store the min CF value and the CF value of the actual policy
- On termination, aggregate the min CF and actual CF values
26 CF of Adaptive and Static Settings

wi = 0.25 for all i; run-phase length 500

27 CF of Adaptive and Static Settings

wi = 0.25 for all i; run-phase length 8000
28 Accuracy of the Cost Model

29 Adaptation Mechanism Overhead

Percentage of time spent in the different modules of a kernel
30 Conclusions and Future Work

- C1: Architecture and distributed implementation of a shared data space
  - High flexibility at small programming effort
  - Adaptivity caters for changing application behavior
- C2: Separation of concerns enhances reusability of application components and distribution policies
- C3: Experimental validation
- F: Extend support to other extra-functional concerns (real-time, availability, ...)