PicsouGrid: A Grid Framework For Computational Finance


1
PicsouGrid: A Grid Framework For Computational Finance
  • Francoise BAUDE
  • Mireille BOSSY
  • Viet Dung DOAN
  • Ian STOKES-REES
  • INRIA Sophia-Antipolis
  • France

2
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

3
High Level Project Objectives
  • Framework for distributed computational finance
    algorithms
  • Investigate grid component model
  • http://gridcomp.ercim.org/
  • Implement open source versions of parallel
    algorithms for computational finance
  • Utilise ProActive grid middleware
  • Deploy and evaluate on various grid platforms
  • Grid5000 (France)
  • DAS3 (Netherlands)
  • EGEE (Europe)

4
Grid Emphasis
  • This presentation and subsequent paper focus on a
  • Multi-site (5)
  • Large-scale (500-2000 cores)
  • Long-term (days to weeks)
  • Multi-grid (2)
  • parallel computing grid framework
  • Consequently, de-emphasises computational
    finance-specific aspects (i.e. algorithms and
    application domain)
  • However other team members are working hard on
    this!

5
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

6
ProActive
  • http://www.objectweb.org/proactive
  • Java Library for Distributed Computing
  • Developed by INRIA Sophia Antipolis, France
    (Project OASIS)
  • 50-100 person-years of R&D work invested
  • Provides transparent asynchronous distributed
    method calls
  • Implemented on top of Java RMI
  • Fully documented (600 page manual)
  • Available under LGPL
  • Used in commercial applications
  • Graphical debugger

7
ProActive (II)
  • OO SPMD with Active Objects
  • Any Java Object can automatically be turned into
    an Active Object
  • Utilises Java Reflection
  • "Wait by necessity" and futures allow method
    calls to return immediately; subsequent access
    to the result object blocks until the result is ready
  • Objects appear local but may be deployed on any
    system within ProActive environment (local
    system/cluster, or remote system, cluster, or
    grid)
  • Easy Integration with Existing Systems
  • Extensions seamlessly support various cluster,
    network, and grid environments: Globus, ssh,
    http(s), LSF, PBS, SGE, EGEE, Grid5000
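
The "wait by necessity" behaviour described on this slide can be illustrated with a small analogy. This is a sketch using Python's standard `concurrent.futures` module, not the ProActive API; `slow_square` is a hypothetical stand-in for a remote method call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Stand-in for a remote method call that takes a while to finish."""
    time.sleep(0.2)
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    start = time.monotonic()
    future = pool.submit(slow_square, 7)      # returns immediately with a future
    submit_delay = time.monotonic() - start   # time spent in the call itself

    value = future.result()                   # blocks here until the result is ready
    total_delay = time.monotonic() - start
```

The caller only pays the waiting cost at the point where it actually needs the value, which is the idea behind blocking on first access to the result.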

8
Background: Options
  • Option: traded financial instrument which allows
    buyers to bet on future asset prices and sellers
    to reduce the risk of owning an asset
  • Call option: allows holder to purchase an asset
    at a fixed price in the future
  • Put option: allows holder to sell an asset at a
    fixed price in the future
  • Option Pricing
  • European: fixed future exercise date
  • American: can be exercised any time up to expiry
    date
  • Basket: prices an option on a set of underlying assets together
  • Barrier: exercise depends on a certain barrier
    price being reached
  • Uses Monte Carlo simulations
  • Possibility to aggregate statistical results
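
As a rough illustration of Monte Carlo pricing for the European case, the sketch below prices a call option and reports a standard error. The lognormal (geometric Brownian motion) asset model, the parameter values, and the function names are assumptions made for this example; the slides do not specify the model PicsouGrid uses. The Black-Scholes closed form appears only as a sanity check.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo price of a European call under geometric Brownian motion.

    Returns (price, standard_error). The asset model is an assumption
    made for this sketch.
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    discount = math.exp(-rate * maturity)

    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        payoff = discount * max(s_t - strike, 0.0)
        total += payoff                       # running sums are all we need,
        total_sq += payoff * payoff           # so partial results can be merged

    mean = total / n_paths
    variance = max(total_sq / n_paths - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_paths)


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes call price, used only as a sanity check."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


price, stderr = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000, seed=42)
reference = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

Because only the count, the sum, and the sum of squares are kept, independent runs can be combined by adding those three numbers, which is what makes aggregating statistical results across workers straightforward.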

9
Background: PicsouGrid v1, 2, 3
  • Original versions of PicsouGrid utilised:
  • Grid5000
  • ProActive
  • JavaSpaces
  • Implemented:
  • European Simple, Basket, and Barrier Pricing
  • Medium-size distributed system: 4 sites, 180
    nodes
  • Short operational runs (5-10 minutes)
  • Fault Tolerance mechanisms
  • Achieved 90x speed-up with 140 systems
  • 65% efficiency
  • Reported in e-Science 2006 (Amsterdam, Nov 2006)
  • "A Fault Tolerant and Multi-Paradigm Grid
    Architecture for Time Constrained Problems.
    Application to Option Pricing in Finance."

10
PicsouGrid v3 Performance
[Figure: performance plot with annotations Multi-site, Peak speed-up, Performance degradation]
11
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

12
PicsouGrid Architecture
  • Server/Control Node
  • Provides User Interface
  • Instantiates network of Sub-Servers
  • Allows configuration of Simulator network
  • Creates Request for Option Price (with
    algorithm parameters)
  • Controls Sub-Servers and aggregates/reports
    results
  • Monitors Sub-Servers for failures and spawns new
    Sub-Servers if necessary
  • Sub-Server
  • Acts as local site/cluster/system controller
  • Instantiates local Simulators
  • Delegates simulations in packets to Simulators
  • Collects results, aggregates, and returns to
    Server
  • Monitors Simulators for failures and spawns new
    Simulators if necessary
  • Simulator
  • Computes Monte Carlo simulations for option
    pricing using packets

13
PicsouGrid Deployment and Operation
14
PicsouGrid v5 Design Objectives
  • Multi-Grid
  • Grid5000
  • gLite/EGEE
  • INRIA Sophia desktop cluster
  • Decoupled Workers
  • Autonomous
  • Independent deployment and operation
  • P2P discover and acquire
  • Long Running, Multi-Algorithm
  • Create standing application
  • Augment (or reduce) P2P worker network based on
    demand
  • Computational tasks specify algorithm and
    parameters

15
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

16
Grid Performance Monitoring and State Machines
  • Grid-ified distributed applications add at least
    three new layers of complexity compared to their
    serial counterparts
  • Grid interaction and management
  • Local cluster interaction and management
  • Distributed application code
  • Notoriously difficult to figure out what is going
    on where and when it is happening
  • Bottlenecks
  • Hot spots
  • Idle time
  • Limiting factor: CPU, storage, network?
  • What state is an application/task/process/system
    currently in?
  • Solution: utilise a common state machine model
    for grid applications/processes

17
Layered System
[Figure: nested layers, outermost to innermost: Grid, Site, Cluster, Host, Core, VM, Process]
18
Proof of layering
  • What I execute on a Grid5000 Submit (UI) Node:
  • mysub -l nodes=30 es-bench1e6
  • What eventually runs on Worker Node:
  • /bin/sh -c /usr/lib/oar/oarexecuser.sh
    /tmp/OAR_59658 30 59658 istokes-rees \/bin/bash
    /proc/fgrillon1.nancy.grid5000.fr/submit N
    script-wrapper \/bin/script-wrapper
    fgrillon1.nancy.grid5000.fr \/es-bench1e6
  • Granted, this is nothing more than good system
    design and separation of concerns
  • We are just looking at the implicit API layers of
    the grid
  • Universal interface: command shell, environment
    variables and file system

19
Abstract Recursive Process Model
  • Question: Is it possible to propose a recursive
    process model which can be applied at all layers?
  • Create: process description
  • Bind: process to the physical layer
  • Prepare: prepare for execution (software, stage
    in, config)
  • Execute: initiate process execution (enter next
    lower layer)
  • Complete: book keeping, stage out, clean up
  • Clear: wipe system, ready for next invocation
  • Each stage can be in a particular state:
  • Ready
  • Active
  • Done
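
The recursive model can be sketched as a small traversal: each layer walks the six stages, each stage passes through Ready, Active, and Done, and the Execute stage enters the next lower layer while it is Active. This is a minimal illustration; the function name `run_process` and the three example layers are assumptions, and the exceptional states (fail, cancel, suspend) are left out.

```python
from enum import Enum


class Stage(Enum):
    CREATE = "Create"
    BIND = "Bind"
    PREPARE = "Prepare"
    EXECUTE = "Execute"
    COMPLETE = "Complete"
    CLEAR = "Clear"


class State(Enum):
    READY = "Ready"
    ACTIVE = "Active"
    DONE = "Done"


def run_process(layers, trace=None):
    """Walk every stage and state for layers[0], recursing into the next
    lower layer while the Execute stage is Active."""
    if trace is None:
        trace = []
    layer, lower = layers[0], layers[1:]
    for stage in Stage:
        for state in State:
            trace.append(f"{layer}.{stage.value}.{state.value}")
            if stage is Stage.EXECUTE and state is State.ACTIVE and lower:
                run_process(lower, trace)     # enter the next lower layer
    return trace


trace = run_process(["Grid", "Cluster", "Process"])
```

The resulting trace interleaves the layers: the Grid layer stays in Execute.Active for the whole lifetime of the Cluster layer, which in turn stays in Execute.Active while the Process layer runs.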

20
Grid Process State Machine
[Figure: state machine diagram. The Prepare, Execute, Complete, and Clear stages each pass through Ready, Active, and Done. Other labels in the diagram: Fail, Cancel, System, User, Suspend, Pause.]
  • Create: create process description
  • Bind: bind to a particular system
  • Prepare: prepare system to execute process
  • Execute: execute process (recurse to next lower level)
  • Complete: tidy up system and accounting after completion of
    process
  • Clear: clear process from system
21
CREAM Job States
  • New LCG/EGEE Workload Management System
  • Can be mapped to Grid Process State Machine
  • This only shows one level of mapping
  • In practice, would apply state machine at Grid
    level, LRMS level, and task level
  • Timestamps on state entry
  • Layer.Stage.State
[Figure: CREAM job state diagram annotated with the labels Create, Bind, Prepare, Suspend, Execute, Done, Failed, Cancelled]
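
One way to use timestamps on state entry is to key each record by Layer.Stage.State and derive the time spent in each state afterwards. The sketch below assumes a hypothetical `StateLog` class and made-up timestamps in seconds; it is not taken from CREAM or PicsouGrid.

```python
from collections import defaultdict


class StateLog:
    """Record a timestamp each time a Layer.Stage.State is entered."""

    def __init__(self):
        self.entries = []                     # (timestamp, "Layer.Stage.State")

    def enter(self, timestamp, layer, stage, state):
        self.entries.append((timestamp, f"{layer}.{stage}.{state}"))

    def time_in_state(self, end_time):
        """Seconds spent in each state, per layer. A state lasts until the
        same layer enters its next state (or until end_time)."""
        by_layer = defaultdict(list)
        for timestamp, key in sorted(self.entries):
            by_layer[key.split(".")[0]].append((timestamp, key))
        durations = {}
        for events in by_layer.values():
            following = events[1:] + [(end_time, None)]
            for (t0, key), (t1, _) in zip(events, following):
                durations[key] = durations.get(key, 0.0) + (t1 - t0)
        return durations


log = StateLog()
log.enter(0.0, "Grid", "Execute", "Ready")      # job queued at grid level
log.enter(40.0, "Grid", "Execute", "Active")    # grid hands over to the LRMS
log.enter(40.0, "LRMS", "Execute", "Ready")     # queued in the local batch system
log.enter(55.0, "LRMS", "Execute", "Active")    # task actually running
log.enter(122.0, "LRMS", "Execute", "Done")
log.enter(125.0, "Grid", "Execute", "Done")
durations = log.time_in_state(end_time=125.0)
```

In this made-up run the task computes for 67 s out of the 85 s that the grid layer reports as active; the difference is queuing and stage-out at the lower layer, which is the kind of idle time the layered model is meant to expose.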
22
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

23
Grid5000 Stats
  • 9 Sites across France
  • Lille, Nancy, Paris-Orsay, Rennes, Lyon, Bordeaux,
    Grenoble, Toulouse, Sophia
  • 21 Clusters
  • 17 Batch systems
  • 3138 cores
  • Xeons
  • Opterons
  • Itaniums
  • G5
24
Characteristics of Grid5000
  • Private network
  • Outbound Internet access possibly via ssh tunnel
  • Access based on ssh keys (passwordless)
  • Shared NFS file space at each site
  • Very limited data management facilities
  • Myrinet and Infiniband prevalent on many clusters
  • RENATER: French research network, 2.5 to 10 Gb/s
    inter-site
  • Focus on multi-node (and multi-site) grid
    computing
  • Kadeploy provides mechanism for custom system
    image to be loaded before job starts

[Figure: Grid5000 site]
25
Deployment and Execution on Grid5000
  • Limited grid-wide (cross-site) job submission
    mechanisms
  • In practice, submit individually at each site
  • Coordinate between sites via multiple
    reservation job submissions with same
    reservation window
  • Limited data-management/staging/configuration
  • Kadeploy (often too heavy weight)
  • rsync
  • Configuration wrapper scripts
  • Node count reservations are best effort
  • Rule of thumb: don't expect more than 80% of
    requested nodes to be available when reservation
    starts
  • Experience shows reservation start times could be
    delayed 30 seconds to 10 minutes
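
The 80% rule of thumb suggests over-requesting nodes when a minimum count is needed. A minimal sketch, assuming a hypothetical helper `nodes_to_request`; since the slide gives 80% as an upper expectation, the result is best read as a lower bound on what to ask for.

```python
def nodes_to_request(nodes_needed, expected_availability_percent=80):
    """Smallest request size whose expected yield covers nodes_needed.

    Uses integer ceiling division to avoid floating-point rounding surprises.
    """
    if not 0 < expected_availability_percent <= 100:
        raise ValueError("availability must be in (0, 100]")
    return -(-nodes_needed * 100 // expected_availability_percent)


request = nodes_to_request(100)
```

For example, needing 100 nodes at an expected 80% availability leads to a request of 125 nodes.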

26
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

27
Experimental Setup
  • European Simple call/put option price
  • 1e6 Monte Carlo iterations
  • Single asset pricing reference
  • t_reference = 67.3 seconds
  • AMD Opteron 2218 (64 bit), 2.6 GHz, 1 MB L2, 667 MHz
    bus (best performing core available)
  • Objective 1: maximize number of options priced in
    a fixed time window
  • Objective 2: maximize speed-up efficiency
  • efficiency $= \dfrac{n_{options} \times t_{reference}}{\sum_{i \in sites} \left( n_{cores,i} \times t_{reservation,i} \right)}$
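
The efficiency measure for Objective 2 can be computed directly from per-site core counts and reservation times. The sketch below uses a hypothetical function name and a made-up two-site run; only the 67.3 s reference time comes from this slide.

```python
def speedup_efficiency(n_options, t_reference_s, sites):
    """Efficiency = useful reference compute time / core time occupied.

    sites is an iterable of (n_cores, t_reservation_s) pairs, one per site.
    """
    occupied = sum(n_cores * t_res for n_cores, t_res in sites)
    if occupied <= 0:
        raise ValueError("no core time was occupied")
    return n_options * t_reference_s / occupied


# Hypothetical two-site run in which every core prices one option.
example = speedup_efficiency(
    n_options=150,
    t_reference_s=67.3,
    sites=[(100, 120.0), (50, 200.0)],
)
```

An efficiency of 1.0 would mean every occupied core-second was spent computing at the speed of the reference core, with no queuing, start-up, or stage-out overhead.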

28
Run Now Experiment
  • Make immediate request for maximum number of
    nodes on all Grid5000 clusters
  • Price one option per acquired core
  • Not really fair: Grid5000 is not a production
    grid
  • Submit to 15 clusters
  • 8 clusters at 6 sites completed tasks within 6
    hours
  • Remainder either failed or hadn't started 24
    hours later
  • 1272 cores utilised
  • 85 core-hours occupied
  • This is the total amount of time the tasks held
    a particular core: idle time + execution time
  • Objective 1 (alt): 1272 options priced in 8
    minute window
  • Objective 2: 1272 options × 67.3 s / 85 hr = 28%
    efficient
  • Discovered various grid issues (e.g. NTP, rsync)

29
[Figure: job timeline with phases labelled Queuing, Execution, and Result stage-out]
30
When everything is working
31
NTP Problems (Time Sync)
32
Unexplained slow downs (homogeneous cluster)
33
Erratic node/core startup
34
Coordinated Start with Reservation
  • Reservation made 12 hours in advance
  • Confirmed no other reservations for time slot
  • Start time at low utilisation point of 6:05am
  • 5 minutes provided for system restarts and
    Kadeploy re-imaging after end of reservations
    going to 6am
  • Submitted to 12 clusters, at 8 sites
  • 9 clusters at 7 sites ran successfully
  • 894 cores utilised
  • 31.3 core-hours occupied
  • No task started on time
  • Start time delays of 20s to 5.5 minutes
  • Illustrates difficulty of cross-site coordinated
    parallel processing
  • Objective 1: 894 options priced in 9.5 minute
    window
  • Objective 2: 894 options × 67.3 s / 31.3 hr =
    53.4% efficient
  • Still problems (heterogeneous clusters, NTP,
    rsync)

36
Intra-node timing variations
37
Heterogeneous clusters (hyper threading on)
38
Mis-configured timezone
39
Overall cluster benchmarks
40
Outline
  • Objectives
  • Background
  • Architecture
  • Layered Grid Process Model
  • Grid5000
  • Performance Results
  • Future

41
Parallelism
  • American option pricing with floating exercise
    date is much more difficult to calculate
  • Two algorithms with good opportunities for
    parallelism are available
  • Longstaff-Schwartz (2001)
  • Ibanez-Zapatero (2002)
  • Interesting to see what speed up can be achieved
    by parallel implementation
  • Interested in possibility of cross-site parallel
    computation utilising ProActive

42
Longstaff-Schwartz
43
Ibanez-Zapatero
44
Multi-Grids
  • Very interested in experimenting with Multi-Grid
    environment
  • Grid5000
  • gLite/EGEE
  • DAS3
  • Local cluster/desktop-grid/p2p network
  • ProActive deploys on LCG (gLite/EGEE)
  • Other ProActive applications deployed and run
    successfully
  • VO problems in Feb/March meant PicsouGrid could
    not be run on LCG, so no results for ISGC!
  • Investigate use of HTTP-based task pools to
    bridge grids

45
Future for PicsouGrid
  • Many more computational finance algorithms have
    already been developed and need to be similarly
    benchmarked
  • Barrier, Basket
  • American (Longstaff-Schwartz and Ibanez-Zapatero)
  • Continuous operation of option pricing, rather
    than one-shot
  • Incorporate dynamic node availability
  • Improve modularization/componentization of
    finance algorithms

46
Summary of Observations
  • Deploying parallel applications in a grid
    environment continues to be a challenging problem
  • Heterogeneity in a grid is pervasive and still
    hard to deal with
  • Understanding performance issues, hot spots,
    bottlenecks, wasted idle time, and
    synchronisation points can be aided by a grid
    process model
  • Middleware really is critical: gLite, LRMS, OAR,
    ProActive, etc. need to provide end users and
    application developers with a reliable, consistent,
    and easy-to-use interface to the grid

47
Thank you
  • Questions?
  • https://gforge.inria.fr/projects/picsougrid/
  • Ian.Stokes-Rees_at_inria.fr