1
First NASA Workshop on Performance-Engineered
Information Systems
  • A Performance Evaluation Model for Scheduling in
    Global Computing Systems

Kento Aida (TIT), Atsuko Takefusa (Ochanomizu Univ.), Hidemoto Nakada (ETL), Satoshi Matsuoka (TIT), Satoshi Sekiguchi (ETL), Umpei Nagashima (National Institute of Materials and Chemical Research)
http://ninf.etl.go.jp/
2
Global Computing System
  • Proposed Global Computing Systems
  • Globus, Netsolve, Ninf, Legion, RCS, etc.

3
Scheduling for Global Computing Systems
Effective scheduling is required to achieve high-performance global computing!
  • Scheduling under Dynamic, Heterogeneous Environments
  • computing server performance / load
  • network topology / bandwidth / congestion
  • multiple users at multiple sites
  • Software Systems for Scheduling
  • AppLes, Netsolve agent, Nimrod, Ninf Metaserver,
    Prophet, etc.

4
Framework to Evaluate Scheduling Algorithms
  • Benchmarking on Real Systems
  • practical measurement
  • difficult to perform large-scale experiments
  • a small number of replications
  • partial solution

No effective framework exists to evaluate the performance of scheduling in global computing systems!
5
Performance Evaluation Model
  • Objective
  • modeling various global computing systems
  • large-scale simulation
  • reproducibility
  • Contents
  • overview of the model
  • verification of the model
  • evaluation of scheduling algorithm on the model

6
General Arch. of Global Computing System
  • Clients
  • Computing Servers
  • Scheduling System
  • Schedulers (e.g. AppLes, Prophet)
  • perform scheduling according to system / user
    policy
  • Directory Service (e.g. Globus-MDS)
  • central database of resource information
  • Monitors/Predictors (e.g. NWS)
  • monitor and predict server and network status

7
Canonical Model of Task Execution
(0) The monitor periodically observes server and network status.
(1) The client queries the scheduler about a suitable server.
(2) The scheduler assigns a suitable server.
(3) The client requests execution on the assigned server.
(4) The server returns the computed result.

[Figure: clients and servers at multiple sites (e.g. Client A and Server A at Site 1) connected over a WAN; the scheduling system (scheduler, directory service, and monitor) mediates steps (0)-(4).]
8
Requirement for the Model
  • Modeling
  • various topology
  • clients, servers, networks
  • server
  • performance, load (congestion), variance over
    time
  • network
  • bandwidth, throughput (congestion), variance
    over time
  • Performing
  • large-scale simulation
  • reproducible simulation

9
Proposed Performance Evaluation Model
Queueing Network
  • Global Computing System
  • Qs computational servers
  • Qns network from the client to the server
  • Qnr network from the server to the client
  • Congestion on Servers and Networks
  • other tasks
  • tasks which are invoked from other processes
    and enter Qs
  • other data
  • data which are transmitted from other
    processes and enter Qns or Qnr

10
Example of Proposed Model
Site 1
Site 1
Server A
Qns1
Qnr1
Client A
Client A
Qs1
Qns2
Qnr2
Site 2
Site 2
Server B
Client B
Client B
Qns3
Qnr3
Qs2
Qns4
Qnr4
Server C
Client C
Client C
Qs3
11
Client
  • Task Invoked by a Client
  • data transmitted to the server (Dsend)
  • computation of the task
  • data transmitted from the server, or computed
    result (Drecv)
  • Procedure to Invoke Tasks
  • query scheduler about a suitable server
  • The scheduler assigns a server.
  • decompose Dsend into logical packets and transmit
    these packets to Qns connected to the assigned
    server
  • The server completes the execution of the task.
  • receive Drecv from Qnr
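The decomposition step above (splitting Dsend into logical packets of size Wpacket) can be sketched in a few lines. This is a minimal illustration, not code from the original work; the function name and units (KB integers) are my assumptions:

```python
def decompose(d_send, w_packet):
    """Split a message of size d_send into logical packet sizes
    of at most w_packet each (sizes in KB, illustratively)."""
    full, rem = divmod(d_send, w_packet)
    packets = [w_packet] * full
    if rem:
        packets.append(rem)  # last packet carries the remainder
    return packets

# e.g. a 35 KB message with 10 KB logical packets
# decompose(35, 10) -> [10, 10, 10, 5]
```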

12
Parameters for the Client
  • Packet Transmission Rate
  • λpacket = Tnet / Wpacket
  • Tnet: bandwidth of the network between the client and the server
  • Wpacket: logical packet size

13
Queue as a Network (Qns)
[Figure: other data and the client's packets enter Qns, which feeds Qs.]
a single-server queue with finite buffer, FCFS
  • Procedure
  • A packet transmitted from the client enters Qns.
  • A packet is retransmitted when the buffer is full.
  • A packet in Qns is processed for Wpacket / Tnet time.
  • A packet of the client's task leaves for Qs.
  • The arrival rate of other data models congestion of the network.
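The Qns procedure above behaves as a single-server FCFS queue with deterministic service time Wpacket / Tnet. A minimal discrete-event sketch follows; it is illustrative only (the finite buffer and retransmission are omitted for brevity, and the function name is mine):

```python
def fcfs_departures(arrivals, service_time):
    """Departure times from a single-server FCFS queue with
    deterministic service time (infinite buffer, for brevity)."""
    departures = []
    free_at = 0.0  # time at which the server next becomes idle
    for t in sorted(arrivals):
        start = max(t, free_at)        # wait until the server is free
        free_at = start + service_time  # occupy the server
        departures.append(free_at)
    return departures

# Wpacket = 0.01 MB, Tnet = 1.0 MB/s -> service time 0.01 s per packet
```

Two packets arriving together are serialized 0.01 s apart, while an isolated packet leaves one service time after it arrives, which is exactly the FCFS behavior described above.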

14
Parameters for Qns
  • Arrival Rate of Other Data
  • determines network throughput
  • Arrival is currently assumed to be Poisson.
  • λns_others = (Tnet / Tact - 1) × λpacket
  • Tact: ave. actual throughput of the network to be simulated
  • Buffer Size of Queue
  • determines network latency
  • N = Tlatency × Tnet / Wpacket
  • Tlatency: ave. actual latency of the network to be simulated

15
Example
  • Simulated Condition
  • bandwidth: Tnet = 1.0 MB/s
  • ave. actual throughput: Tact = 0.1 MB/s
  • latency: Tlatency = 0.1 sec.
  • logical packet size: Wpacket = 0.01 MB
  • Arrival Rate of Other Data and Latency
  • λpacket = Tnet / Wpacket = 1.0 / 0.01 = 100
  • λns_others = (Tnet / Tact - 1) × λpacket = (1.0 / 0.1 - 1) × 100 = 900
  • N = Tlatency × Tnet / Wpacket = 0.1 × 1.0 / 0.01 = 10
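The slide's arithmetic can be reproduced directly from its own values (variable names are mine):

```python
# Qns parameters from the simulated condition above
T_net = 1.0        # network bandwidth, MB/s
T_act = 0.1        # ave. actual throughput, MB/s
T_latency = 0.1    # ave. actual latency, s
W_packet = 0.01    # logical packet size, MB

lam_packet = T_net / W_packet                     # packet transmission rate: 100
lam_ns_others = (T_net / T_act - 1) * lam_packet  # other-data arrival rate: 900
N = T_latency * T_net / W_packet                  # buffer size: 10 packets
```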

16
Queue as a Server (Qs)
[Figure: other tasks and packets arriving from Qns enter Qs; result packets leave toward Qnr.]
a single-server queue, FCFS or other strategies
  • Procedure
  • The computation of the client's task enters Qs after all associated data arrive at Qs.
  • A queued task waits for its turn and is processed for Wc / Tser time (Tser: server performance, Wc: ave. computation size).
  • Data of the computed result are decomposed into logical packets, which are transmitted to Qnr.
  • The arrival rate of other tasks models congestion on the server.

17
Parameters for Qs
  • Arrival Rate of Other Tasks
  • determines server utilization
  • Arrival is currently assumed to be Poisson.
  • λs_others = Tser / Ws_others × U
  • Tser: performance of the server
  • Ws_others: ave. computation size of other tasks
  • U: ave. actual utilization on the server to be simulated
  • Packet Transmission Rate
  • λpacket = Tnet / Wpacket

18
Example
  • Simulated Condition
  • server performance: Tser = 100 MFlops
  • ave. actual utilization: U = 10%
  • ave. computation size: Ws_others = 0.1 MFlops
  • Arrival Rate of Other Tasks
  • λs_others = Tser / Ws_others × U = 100 / 0.1 × 0.1 = 100
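The server-side arithmetic checks out the same way (a trivial sketch; names are mine):

```python
# Qs parameters from the simulated condition above
T_ser = 100.0      # server performance, MFlops
W_s_others = 0.1   # ave. computation size of other tasks, MFlops
U = 0.10           # ave. actual utilization (10%)

lam_s_others = T_ser / W_s_others * U  # other-task arrival rate: 100
```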

19
Queue as a Network (Qnr)
[Figure: other data and packets arriving from Qs enter Qnr, which feeds the client.]
a single-server queue with finite buffer, FCFS
  • Procedure
  • A packet transmitted from Qs enters Qnr.
  • A packet is retransmitted when the buffer is full.
  • A packet in Qnr is processed for Wpacket / Tnet time.
  • A packet of the client's task leaves for the client.
  • The arrival rate of other data models congestion of the network.

20
Verification of the Proposed Model
  • Comparison
  • results of simulation on the proposed model
  • results of experiments on an actual global computing system, the Ninf system

21
Ninf System
[Figure: Ninf system architecture. A client's Ninf RPC program reaches Ninf computational servers via Meta Servers over the Internet; a Ninf DB server and other systems are also connected.]
22
Simulation Parameter (1)
  • Client
  • invoking tasks repeatedly
  • Linpack (problem size n = 600, 1000, 1400)
  • (comput. O((2/3)n³ + 2n²), comm. 8n² + 20n + O(1))
  • invocation rate of Ninf_call at the client
  • λrequest = 1 / (worst response time + interval)
  • packet size: 10, 50, 100 KB

23
Simulation Parameter (2)
  • Network
  • bandwidth: 1.5 MB/s
  • other data
  • ave. packet size: 10, 50, 100 KB (Exp. Dist.)
  • Poisson arrival
  • Server
  • CPU performance: 500 MFlops
  • ave. actual utilization: 4%
  • other tasks
  • ave. computation size: 10 MFlops (Exp. Dist.)
  • Poisson arrival

24
Performance of a Client's Tasks
client: WS at Ochanomizu Univ., server: J90 at ETL
  • The performance of the client's tasks in the simulation closely matches the experimental results.
  • The effect of different packet sizes is almost negligible.
  • Simulation cost could thus be reduced.

25
Performance of Clients' Tasks
clients: WSs at U-Tokyo, NITech, and TITech, server: J90 at ETL
  • The performance of tasks invoked by multiple clients in the simulation closely matches the experimental results.
  • The effect of different packet sizes is almost negligible.
  • Simulation cost could thus be reduced.

26
Evaluation of Scheduling Algorithm
  • Evaluation
  • evaluation of basic scheduling algorithms in an imaginary environment, simulated on the proposed model
  • Scheduling Algorithm
  • RR: round robin
  • LOAD: server load
  • min. (L + 1) / P  (L: ave. load, P: server performance)
  • LOTH: server load + network congestion
  • min. Comput. / (P / (L + 1)) + Comm. / Tnet
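The three selection rules above can be sketched as follows. The dict layout and function names are illustrative assumptions, not the original scheduler's code:

```python
import itertools

def pick_load(servers):
    """LOAD: pick the server minimizing (L + 1) / P."""
    return min(servers, key=lambda s: (s["L"] + 1) / s["P"])

def pick_loth(servers, comput, comm):
    """LOTH: pick the server minimizing
    Comput. / (P / (L + 1)) + Comm. / Tnet."""
    return min(servers,
               key=lambda s: comput / (s["P"] / (s["L"] + 1)) + comm / s["Tnet"])

servers = [
    {"name": "A", "P": 400.0, "L": 0.0, "Tnet": 0.05},  # Mops, ave. load, MB/s
    {"name": "B", "P": 100.0, "L": 0.0, "Tnet": 0.20},
]

rr = itertools.cycle(servers)  # RR: next(rr) ignores load and bandwidth
```

With these illustrative numbers, LOAD always favors the faster Server A, while for a communication-heavy task (e.g. comput = 100 Mop, comm = 1 MB) LOTH favors Server B's faster link, which is the kind of difference the evaluation below exposes.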

27
Imaginary Environment
[Figure: imaginary environment. Server A (400 Mops) and Server B (100 Mops) are reached over 50 KB/s and 200 KB/s network links; Clients 1-4 invoke tasks on them.]
28
Simulation Parameter (1)
  • Client
  • invoking tasks repeatedly
  • Linpack (problem size 600)
  • (comput. O((2/3)n³ + 2n²), comm. 8n² + 20n + O(1))
  • EP (problem size 2²¹)
  • (comput. proportional to the number of random numbers, comm. O(1))
  • invocation rate of Ninf_call at the client
  • λrequest = 1 / (worst response time + interval)
  • interval: Linpack 5 sec., EP 20 sec.
  • Poisson arrival
  • packet size: 100 KB

29
Simulation Parameter (2)
  • Network
  • bandwidth: 1.5 MB/s
  • other data
  • ave. packet size: 100 KB (Exp. Dist.)
  • Poisson arrival
  • Server
  • ave. actual utilization: 10%
  • other tasks
  • ave. computation size: 10 Mops (Exp. Dist.)
  • Poisson arrival

30
Scheduling Performance
  • RR
  • performs worst
  • LOAD
  • performs well for EP
  • causes network congestion and degrades the
    performance for Linpack
  • LOTH
  • performs best

31
Conclusions
  • Proposal
  • performance evaluation model for scheduling in
    global computing systems
  • Verification of the Model
  • The proposed model could effectively simulate the performance of clients' tasks in a simple setup of an actual global computing system, the Ninf system.
  • Evaluation on the Model
  • Dynamic information of both servers and networks
    should be employed for scheduling.

32
Future Work
  • Modeling
  • parallel task execution
  • invocation of parallel tasks at the client
  • inter-server communication / synchronization
  • co-allocation of parallel tasks
  • arrival of other data / task
  • Developing Scheduling Algorithm
  • prediction of server load and network congestion