MONARC 2 distributed systems simulation - PowerPoint PPT Presentation

Provided by: dobrecipr

Transcript and Presenter's Notes

1
MONARC 2: distributed systems simulation
2
The Goals of the Project
  • To perform realistic simulation and modelling of
    large scale distributed computing systems,
    customised for specific large scale HEP
    applications.
  • To provide a design framework to evaluate the
    performance of a range of possible computer
    systems, as measured by their ability to provide
    the physicists with the requested data in the
    required time, and to optimise the cost.
  • To narrow down a region in this parameter space
    in which viable models can be chosen by any of
    the future LHC-era experiments.
  • To offer a dynamic and flexible simulation
    environment.

3
LHC Computing: Different from Previous Experiment Generations
One of the four LHC detectors (CMS)
Online system: multi-level trigger to filter out background and reduce data volume
  40 MHz (40 TB/sec)
  level 1 - special hardware
  75 kHz (75 GB/sec)
  level 2 - embedded processors
  5 kHz (5 GB/sec)
  level 3 - PCs
  100 Hz (100-1000 MB/sec)
Data processing: offline analysis, selection
Raw recording rate: 0.1 - 1 GB/sec; 3 - 8 PetaBytes / year
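As a rough consistency check on the rates listed on this slide, the data rate at each trigger stage is the event rate multiplied by the event size. The sketch below assumes an event size of 1 MByte (the figure quoted on the regional center hierarchy slide) and decimal units; the stage labels and the function name are illustrative only.

```python
# Data rate at each trigger stage = event rate x event size.
# Assumption: 1 MByte per event, decimal units (1 GB = 10**9 bytes).
EVENT_SIZE_BYTES = 1_000_000

# Event rates (Hz) at each stage, as listed on the slide.
STAGE_RATES_HZ = {
    "detector output": 40_000_000,  # 40 MHz
    "after level 1": 75_000,        # 75 kHz
    "after level 2": 5_000,         # 5 kHz
    "after level 3": 100,           # 100 Hz
}


def data_rate_bytes_per_sec(event_rate_hz, event_size_bytes=EVENT_SIZE_BYTES):
    """Return the data rate in bytes/sec for a given event rate."""
    return event_rate_hz * event_size_bytes


if __name__ == "__main__":
    for stage, rate_hz in STAGE_RATES_HZ.items():
        gb_per_sec = data_rate_bytes_per_sec(rate_hz) / 1e9
        print(f"{stage}: {gb_per_sec:g} GB/sec")
```

With these assumptions 40 MHz corresponds to 40 TB/sec, 75 kHz to 75 GB/sec, and 100 Hz to 100 MB/sec, which matches the lower end of the quoted recording rate.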
4
Off-Line LHC Computing Data Analysis
Geographical dispersion of people and resources
Complexity: the detector and the LHC environment
Scale: 100 times more processing power; Petabytes per year of data
CMS: 1800 Physicists, 150 Institutes, 32 Countries
A VERY LARGE SCALE DISTRIBUTED SYSTEM THAT HAS TO
PROVIDE (NEAR) REAL-TIME DATA ACCESS FOR ALL THE
PARTICIPANTS
5
Regional Center Hierarchy (Worldwide Data Grid)
[Diagram: tiered data flow from the experiment down to the workstations; labels in the order they appear]
Experiment
  PBytes/sec
Online System
  100 - 1000 MBytes/sec
  One bunch crossing per 25 nsecs. Each event is 1 MByte in size.
Tier 0 + 1: Offline Farm, CERN Computer Center, HPSS
  0.6 - 2.5 Gbits/sec
Tier 1: FNAL Center, Italy Center, UK Center, France Center
  2.4 Gbits/sec
Tier 2
  622 Mbits/sec
Tier 3: Institutes (0.25 TIPS), physics data cache
  100 - 1000 Mbits/sec
Tier 4: Workstations
Physicists work on analysis channels. Processing power: 200,000 of today's fastest PCs.
6
Simulation Models
  • The simulation model
      • abstracts the components of the real system and
        their interactions
      • must be equivalent to the simulated system
  • Simulation models
      • continuous time - the system is described by a
        set of differential equations
      • discrete time - the state changes only at certain
        moments in time
  • MONARC uses one of the discrete time models,
    Discrete Event Simulation (DES): the events
    represent important activities in the system and
    are managed with the aid of an internal clock
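
The discrete event idea can be sketched in a few lines: events sit in a time-ordered queue, and the internal clock jumps from one event to the next. This is a generic illustration, not the MONARC engine; the `Simulator` class and the two job handlers are hypothetical names chosen for the example.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete event simulation loop (illustrative only)."""

    def __init__(self):
        self.now = 0.0                  # internal simulation clock
        self._queue = []                # heap of (time, seq, action)
        self._seq = itertools.count()   # tie-breaker keeps insertion order

    def schedule(self, delay, action):
        """Schedule `action` (a callable taking the simulator) `delay` units from now."""
        if delay < 0:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), action))

    def run(self):
        """Process events in time order; the clock jumps from event to event."""
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action(self)


log = []


def job_start(sim):
    log.append((sim.now, "job start"))
    sim.schedule(5.0, job_end)   # hypothetical job that needs 5 time units


def job_end(sim):
    log.append((sim.now, "job end"))


sim = Simulator()
sim.schedule(2.0, job_start)
sim.run()
print(log)   # [(2.0, 'job start'), (7.0, 'job end')]
```

Nothing happens between events, so the cost of a run depends on the number of events rather than on the length of the simulated time span.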

7
A Global View for Modelling
MONITORING
REAL Systems
Testbeds
8
Regional Center Model
REGIONAL CENTER
LAN
FARM
9
The Simulation Engine
  • Provides the multithreading mechanism for the
    simulation
  • The entities with time dependent behavior are
    mapped onto active objects
  • The simulation engine manages the active
    objects and the events
  • Thread reusability (thread pool)

[Class diagram: Activity, Scheduler, AJob, Job, Event, Task, EventQueue, Farm, JobScheduler, Pool, WorkerThread, Engine, CPUUnit]
10
Multitasking Processing Model
Concurrently running tasks share resources (CPU,
memory, I/O).
Interrupt-driven scheme: for each new task, or when
a task finishes, an interrupt is generated and all
processing times are recomputed.
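
A minimal sketch of this interrupt-driven scheme for a single CPU follows. It assumes that running tasks share the CPU equally, which the slide does not state, and the function name and job tuples are hypothetical.

```python
def simulate_shared_cpu(jobs, cpu_power=1.0):
    """Return {name: finish_time} for jobs sharing one CPU.

    jobs: list of (name, arrival_time, work) tuples, where `work` is expressed
    in units of cpu_power * time. Equal sharing is an assumption of this sketch.
    """
    pending = sorted(jobs, key=lambda j: j[1])   # jobs that have not arrived yet
    remaining = {}                               # name -> work still to do
    finished = {}
    now = 0.0
    while pending or remaining:
        # Per-task speed, recomputed after every interrupt.
        speed = cpu_power / len(remaining) if remaining else 0.0
        # Next interrupt: the earliest arrival or the earliest completion.
        next_arrival = pending[0][1] if pending else float("inf")
        if remaining:
            first = min(remaining, key=remaining.get)
            next_finish = now + max(remaining[first], 0.0) / speed
        else:
            next_finish = float("inf")
        t = min(next_arrival, next_finish)
        # Advance every running task up to the interrupt time.
        for name in remaining:
            remaining[name] -= speed * (t - now)
        now = t
        if next_finish <= next_arrival:
            del remaining[first]          # a task finished
            finished[first] = now
        else:
            name, _, work = pending.pop(0)  # a new task arrived
            remaining[name] = work
    return finished


print(simulate_shared_cpu([("A", 0.0, 10.0), ("B", 5.0, 10.0)]))
# {'A': 15.0, 'B': 20.0}
```

Job A runs alone for 5 time units, then shares the CPU with B at half speed and finishes at 15; B then runs alone and finishes at 20.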
11
Engine tests
Processing a total of 100 000 simple jobs on
1, 10, 100, 1 000, 2 000, 4 000, 10 000 CPUs
(number of CPUs = number of parallel threads)
More tests: http://monalisa.cacr.caltech.edu/MONARC/
12
Job Scheduling
  • Dynamically loadable modules for each regional
    center
  • Basic job scheduler assigns the jobs to CPUs
    from the local farm
  • More complex schedulers allow job migration
    between regional centers

Dynamically loadable module
13
Centralized Scheduling
Site A
GLOBAL Job Scheduler
14
Distributed Scheduling: market model
COST
Request
DECISION
JobScheduler
Site A
15
Example: simple distributed scheduling
  • Very simple scheduling algorithm, based on
    searching for the center with the minimum load
  • We simulated the activity of 4 regional centers
  • When all the centers are heavily loaded, the
    number of job transfers grows unnecessarily
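
The minimum-load rule can be sketched as follows. The center names and load values are made up for illustration, and the loads are kept static (a real scheduler would update them after every assignment).

```python
def pick_center(loads):
    """Return the name of the center with the minimum load.

    loads: dict mapping center name -> load (e.g. fraction of busy CPUs).
    Ties are broken by the lexicographically smallest name.
    """
    return min(sorted(loads), key=loads.get)


def count_transfers(submissions, loads):
    """Count jobs sent to a center other than the one they were submitted at."""
    return sum(1 for origin in submissions if pick_center(loads) != origin)


# Hypothetical loads for four centers, all heavily loaded.
loads = {"A": 0.97, "B": 0.95, "C": 0.98, "D": 0.96}
print(pick_center(loads))                            # B
print(count_transfers(["A", "C", "D", "B"], loads))  # 3
```

In this made-up case three of four jobs are moved to gain only a few percent in load, which illustrates how such a rule can generate transfers with little benefit when every center is busy.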

16
Network Model
Simulated network components
Farm
Farm
WAN
WAN
LinkPort
LinkPort
LAN
LAN
Simulated local traffic
Simulated inter-regional traffic
17
LAN/WAN Simulation Model
Link
Node
LAN
ROUTER
Internet Connections
ROUTER
Interrupt-driven simulation: for each new message an
interrupt is created, and for all the active
transfers the speed and the estimated time to
complete the transfer are recalculated.
Continuous flow between events: an efficient and
realistic way to simulate concurrent transfers
having different sizes / protocols.
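
The recalculation step can be sketched for a single link as below. Equal sharing of the link bandwidth among active transfers is an assumption of this example, as are the function name and the transfer identifiers.

```python
def recompute_transfers(remaining_bytes, link_bandwidth):
    """Return {transfer_id: (speed, eta)} for the active transfers on one link.

    remaining_bytes: dict transfer_id -> bytes still to send.
    link_bandwidth: link capacity in bytes/sec, split equally (an assumption).
    """
    if not remaining_bytes:
        return {}
    speed = link_bandwidth / len(remaining_bytes)
    return {tid: (speed, left / speed) for tid, left in remaining_bytes.items()}


# One transfer alone on a hypothetical 100 MB/sec link.
active = {"t1": 500e6}
print(recompute_transfers(active, 100e6))   # t1: 100 MB/sec, 5 s remaining

# A new message arrives: the interrupt triggers a recomputation.
active["t2"] = 100e6
print(recompute_transfers(active, 100e6))   # both at 50 MB/sec; t1 10 s, t2 2 s
```

Each estimate holds only until the next interrupt; between interrupts every transfer is treated as a continuous flow at its current speed.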
18
Network Model
The TCP/IP layers are closely followed
Application Layer
Transport Layer
Internet Layer
Network Access Layer
19
Data Model
Database Index
Client
Mapping
Database
LinkPort
Database
Task
Database Entity
Database
DContainer
DContainer
Database Server
Mass Storage
DContainer
20
Data Model
  • Generic Data
  • Container
  • Size
  • Event Type
  • Event Range
  • Access Count
  • INSTANCE

META DATA Catalog Replication Catalog
Network FILE
FILE
Data Base
Custom Data Server
FTP Server Node
DB Server
NFS Server
Export / Import
21
Data Model
META DATA Catalog Replication Catalog
Data Processing JOB
Data Request
Data Container
Select from the options
JOB
List Of IO Transactions
22
Activities: Arrival Patterns
A flexible mechanism to define the stochastic
process of how users perform data processing
tasks.
Dynamic loading of Activity tasks, which are
threaded objects and are controlled by the
simulation scheduling mechanism.
Physics Activities Injecting Jobs
Each Activity thread generates data processing
jobs.
These dynamic objects are used to model the users'
behavior.
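
One common choice for such a stochastic process is a Poisson arrival stream, sketched below. The slide does not say which distribution is used, so the exponential inter-arrival gaps, the rate, and the function name are assumptions made for illustration.

```python
import random


def poisson_arrivals(rate_per_hour, duration_hours, seed=0):
    """Return increasing job submission times (in hours) for a Poisson process."""
    rng = random.Random(seed)   # fixed seed gives reproducible runs
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_hour)   # exponential inter-arrival gap
        if t >= duration_hours:
            return times
        times.append(t)


arrivals = poisson_arrivals(rate_per_hour=10, duration_hours=8)
print(len(arrivals), "jobs submitted in 8 hours")
```

In a simulation each returned time would become a job submission event injected by the activity object.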
23
Output of the simulation
Node
Simulation Engine
DB
Output Listener Filters
GRAPHICS
Router
Output Listener Filters
Log Files EXCEL
User C
Any component in the system can generate generic
result objects. Any client can subscribe with a
filter and will receive the results it is
interested in. The structure is very similar to
that of MonALISA. The output of the simulation
framework will soon be integrated into MonALISA.
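
The subscribe-with-a-filter pattern can be sketched as follows. The `ResultBus` class, the dictionary-shaped results, and the metric names are hypothetical and do not reflect the MONARC or MonALISA APIs.

```python
class ResultBus:
    """Delivers published results to every subscriber whose filter accepts them."""

    def __init__(self):
        self._subscribers = []   # list of (filter, callback) pairs

    def subscribe(self, accepts, callback):
        """Register `callback` for results where `accepts(result)` is true."""
        self._subscribers.append((accepts, callback))

    def publish(self, result):
        """Called by any simulation component that produces a result."""
        for accepts, callback in self._subscribers:
            if accepts(result):
                callback(result)


bus = ResultBus()
cpu_results, all_results = [], []
bus.subscribe(lambda r: r["metric"] == "cpu_load", cpu_results.append)
bus.subscribe(lambda r: True, all_results.append)

bus.publish({"source": "farm-A", "metric": "cpu_load", "value": 0.8})
bus.publish({"source": "wan-1", "metric": "throughput", "value": 120.0})
print(len(cpu_results), len(all_results))   # 1 2
```

Producers never need to know who consumes their results, so a graphics client, a log writer, or a database client can each attach its own filter.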
24
Conclusions
  • Modelling and understanding current systems,
    their performance and limitations, is essential
    for the design of large scale distributed
    processing systems. This will require continuous
    iteration between modelling and monitoring.
  • Simulation and modelling tools must provide the
    functionality to help in designing complex
    systems and to evaluate different strategies and
    algorithms for the decision making units and the
    data flow management.
  • Future development: efficient distributed
    scheduling algorithms, data replication, more
    complex examples.
  • http://monalisa.cacr.caltech.edu/MONARC