Visualization Resource Scheduling with AlphaServer SC

PITTSBURGH SUPERCOMPUTING CENTER HP-CAST September 2003 ... 16 HP xw8000 workstations. 2.8GHz Pentium 4 Xeon (single) Elan card ...

1
Visualization Resource Scheduling with
AlphaServer SC
  • Chad Vizino
  • HP-CAST
  • September 2003

2
Goal
  • Compute on AlphaServer SC computational nodes
    using an OpenGL package to render
  • Render remotely to specialized graphics hardware
    running Chromium
  • End result: produce a web page with visual results
    as the computation is performed

3
How does it look to the user?
  • Standard graphics packages can drive Chromium
      • DrawP3D, OpenDX, VTK, EnSight
      • Minor tweaking is often needed
  • Downstream, a web server
  • Treat vis nodes as an allocatable resource of
    AlphaServer SC machine

4
A Real Example
5
A Real Example (2)
  • qsub -l rmsnodes=3:12,other=visnodes=5
  • Job waits until 3 nodes (12 CPUs) become
    available AND 5 vis nodes are available
  • When the resources are available, the job runs
  • Visit the vis web page to see the rendering
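
The start condition described on this slide can be sketched as follows. This is a minimal illustration only: the `Request` class and `can_start` function are hypothetical names, not part of Simon or OpenPBS.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Resources one job asks for (hypothetical structure)."""
    compute_nodes: int  # e.g. 3 nodes (12 CPUs)
    vis_nodes: int      # e.g. 5 vis nodes


def can_start(req, free_compute_nodes, free_vis_nodes):
    """Return True only when BOTH pools can satisfy the request."""
    return (free_compute_nodes >= req.compute_nodes
            and free_vis_nodes >= req.vis_nodes)


req = Request(compute_nodes=3, vis_nodes=5)
print(can_start(req, free_compute_nodes=10, free_vis_nodes=4))  # still waiting
print(can_start(req, free_compute_nodes=10, free_vis_nodes=5))  # may run
```

The point of the sketch is that compute nodes and vis nodes are checked together, so a job never starts with only half of what it needs.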

10
Terascale Computing System Summary
  • 750 Compute Nodes
  • 3000 EV68 processors
  • 6 Tf (peak, est 4Tf on LSMS)
  • 3 TB memory
  • 27 TB local disk
  • Multi-rail fat-tree network
  • Redundant monitor/ctrl
  • WAN/LAN accessible
  • Parallel visualization
  • File servers 30TB, 32 GB/s
  • Mass store, 1 TB/hr
  • (many network details omitted)

[Diagram labels: Compute Nodes, Control, File Servers, /home, Quadrics, WAN/LAN, Switched ethernet, Viz, Mass Store, Archive, buffer]
11
Vis Node Hardware
  • 16 HP xw8000 workstations
      • 2.8 GHz Pentium 4 Xeon (single)
      • Elan card
      • 64-bit/66 MHz PCI slots (for Elan card)
      • 1 GB memory
      • Nvidia Quadro4 980 XGL
  • Located on same net as management server for RMS
    DB access

12
Vis Node Software
  • Red Hat 9
  • Kernel 2.4.20 from kernel.org
  • Intel AGP 3.0 (8x)
  • Nvidia 1.0-4496 drivers
  • QSW QsNet libraries
  • mSQL 2.0
      • Used to read RMS database on AlphaServer SC
        machine

14
Steps to Cluster Rendering
  • Transfer data from compute nodes
  • Render using commodity graphics
  • Read back image
  • Composite with other vis processor images

[Diagram labels: Quadrics (Elan) network; PC Graphics (repeated per vis node); Composite (repeated)]
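
The final compositing step can be illustrated with a small sketch. The slides do not say which compositing method is used; depth-based (nearest-pixel-wins) compositing is assumed here, and the pixel representation is hypothetical.

```python
def composite(images):
    """Depth-composite images read back from several vis nodes.

    Each image is a list of (depth, color) pixels of equal length.
    For every pixel position the fragment with the smallest depth wins.
    Depth compositing is an assumption, not taken from the slides.
    """
    result = list(images[0])
    for image in images[1:]:
        for i, (depth, color) in enumerate(image):
            if depth < result[i][0]:
                result[i] = (depth, color)
    return result


node_a = [(0.5, "red"), (1.0, "background")]
node_b = [(0.7, "blue"), (0.2, "green")]
print(composite([node_a, node_b]))
```

Each vis node only needs its own partial image plus depth values, which is why the read-back step precedes compositing.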
15
About OpenGL
  • Widely adopted 2D/3D graphics API
  • Not tied to any OS
  • Interacts with native graphics HW
  • Does not allow parallel rendering by itself
  • Problems
      • Not distributed
      • No parallelism
  • Use Chromium

16
About Chromium
  • Open source
  • Accepts OpenGL calls
  • Derived from WireGL project
  • Builds application stub library
  • Provides SPUs (stream processing units)
  • Creates OpenGL context without window
  • Provides distribution
  • Can perform parallel rendering

17
Parallel Rendering
  • More than 1 program or thread
  • Each renders part of entire model or scene
  • All send graphics commands to rendering system
  • Pieces combined at some point to form final image
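
As a rough sketch of "each renders part of the entire model", the function below splits a list of primitives into contiguous pieces, one per renderer. The `partition` helper and the contiguous split are illustrative assumptions; real applications may divide the scene spatially or by object.

```python
def partition(items, n_workers):
    """Split items into n_workers contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item.
        size = base + (1 if worker < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


triangles = list(range(10))  # stand-in for scene primitives
for worker, chunk in enumerate(partition(triangles, 3)):
    print(worker, chunk)
```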

18
Standard Chromium Components
  • Mothership
  • Chromium server
  • Chromium application faker

19
Chromium mothership
  • Handles all configuration requests from other
    Chromium components
  • Tells other Chromium components what they should
    be doing

20
Chromium application faker
  • crappfaker
      • Loads OpenGL replacement library rather than
        the system's
  • Pack SPU
      • Packs stream of OpenGL commands into buffer sent
        to Chromium server
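
To illustrate the idea of packing a command stream into a buffer, here is a minimal sketch. The opcode table and wire layout are invented for the example and are not Chromium's actual encoding.

```python
import struct

# Hypothetical opcode numbers; Chromium's real encoding differs.
OPCODES = {"glColor3f": 1, "glVertex3f": 2}


def pack_command(name, *args):
    """Encode one call as: 1-byte opcode, 1-byte arg count, float32 args."""
    return struct.pack("<BB%df" % len(args), OPCODES[name], len(args), *args)


def pack_stream(commands):
    """Concatenate encoded commands into one buffer to send downstream."""
    buf = bytearray()
    for name, args in commands:
        buf += pack_command(name, *args)
    return bytes(buf)


def unpack_stream(buf):
    """Decode the buffer back into (name, args) pairs, as a server would."""
    names = {code: name for name, code in OPCODES.items()}
    out, offset = [], 0
    while offset < len(buf):
        opcode, nargs = struct.unpack_from("<BB", buf, offset)
        offset += 2
        args = struct.unpack_from("<%df" % nargs, buf, offset)
        offset += 4 * nargs
        out.append((names[opcode], args))
    return out


calls = [("glColor3f", (1.0, 0.0, 0.0)), ("glVertex3f", (0.5, 0.25, -1.0))]
print(unpack_stream(pack_stream(calls)))
```

Batching many small calls into one buffer means the network sees a few large messages instead of one message per OpenGL call.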

21
Chromium server
  • Dispatches incoming blocks of encoded OpenGL
    commands to chain of SPUs
  • Render SPU
      • Hands OpenGL calls to the system's implementation
  • Serializes streams
      • Important in parallel rendering
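
Stream serialization can be pictured as merging blocks from several clients into one ordered sequence. The sketch below assumes a simple round-robin policy, which is an illustrative choice and not a description of how the Chromium server actually schedules streams.

```python
from collections import deque


def serialize(streams):
    """Merge per-client block lists into one sequence, round-robin.

    Returns (client_id, block) pairs. Blocks from one client keep their
    original relative order.
    """
    queues = [deque(stream) for stream in streams]
    ordered = []
    while any(queues):  # an empty deque is falsy
        for client_id, queue in enumerate(queues):
            if queue:
                ordered.append((client_id, queue.popleft()))
    return ordered


print(serialize([["a1", "a2"], ["b1"]]))
```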

22
Elements of Our System
  • Compute side stack (in the order shown): User code/Graphics App,
    Chromium, TEAC, Tru64, Quadrics Driver, HP ES45 Node(s), Quadrics
  • Vis side stack (in the order shown): Chromium Server, TEAC,
    RedHat Linux, Quadrics Driver, NV Driver, HP xw8000, NVIDIA, Quadrics
23
TEAC
  • Terascale Elan Asynchronous Communication
  • Local Elan transport library
  • Built on top of libelan
  • Requires control channel over TCP/IP network
  • Asynchronous nature eases control flow
  • Here's the process

24
Job Startup
RMS database provides info on allocated hosts via
mSQL over TCP/IP net
[Diagram labels: Simon; job exec node; job PEs; RMS DB; CR applications; CR mothership; vis nodes iamvis00 and iamvisNN; RMSwatchers; httpd]
25
Job Startup (2)
RMSwatchers spawn CRservers and create needed web
pages
[Diagram labels: Simon; job exec node; job PEs; RMS DB; CR applications; CR mothership; vis nodes iamvis00 and iamvisNN; RMSwatchers; CRservers; httpd]
26
Job Startup (3)
CR mothership provides rendezvous info to clients
and servers via TCP/IP
[Diagram labels: Simon; job exec node; job PEs; RMS DB; CR applications; CR mothership; vis nodes iamvis00 and iamvisNN; RMSwatchers; CRservers; httpd]
27
Job Startup (4)
Quadrics CR communication is established and
rendering happens.
[Diagram labels: Simon; job exec node; job PEs; RMS DB; CR applications; CR mothership; vis nodes iamvis00 and iamvisNN; RMSwatchers; CRservers; httpd]
28
Job Shutdown
RMS database provides info on deallocated hosts
via mSQL over TCP/IP net
[Diagram labels: Simon; job exec node; job PEs; RMS DB; vis nodes iamvis00 and iamvisNN; RMSwatchers; httpd]
29
Vis Node Resource allocation
  • Outside RMS
  • Two components
      • Node allocation
      • Communication parameter allocation
  • Parameters are limited!
  • Treat as schedulable resource
  • Let batch system handle it

30
OpenPBS
  • Integrated with RMS
  • All daemons run on interactive (login) nodes
    (none on compute nodes)
  • Simon
      • Locally developed scheduler
      • 2000 lines of Tcl
      • It just works!

31
Vis nodes under OpenPBS
  • qsub -l rmsnodes=3:12,other=visnodes=5
  • Simon takes care of the rest

32
Simon vis-related components
  • Vis node manager
  • Vis comms manager
  • Vis resources manager
  • Vis allocation/deallocation manager

33
Vis node manager
  • Returns free vis nodes
  • Vis nodes stored in RMS table, vis_nodes
      • hostid: numeric host id
      • host: host name
      • enabled: vis node enabled for scheduling? (0/1)
      • allocated_id: 0 or vis_resources.id
      • aliases - for now
      • comments
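
A query for free vis nodes over a table shaped like vis_nodes might look like the sketch below. It uses Python's built-in sqlite3 as a stand-in for mSQL, and it assumes that "free" means enabled = 1 and allocated_id = 0. The column types, host names and sample rows are illustrative only.

```python
import sqlite3


def make_db():
    """Build an in-memory stand-in for the vis_nodes table (sample data)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE vis_nodes ("
        "hostid INTEGER PRIMARY KEY, host TEXT, enabled INTEGER, "
        "allocated_id TEXT, aliases TEXT, comments TEXT)"
    )
    rows = [
        (0, "iamvis00", 1, "0", "", ""),
        (1, "iamvis01", 1, "1234", "", "in use by batch job 1234"),
        (2, "iamvis02", 1, "0", "", ""),
        (3, "iamvis03", 0, "0", "", "disabled for maintenance"),
    ]
    db.executemany("INSERT INTO vis_nodes VALUES (?, ?, ?, ?, ?, ?)", rows)
    return db


def free_vis_nodes(db):
    """Return host names that are enabled and not allocated to any job."""
    cursor = db.execute(
        "SELECT host FROM vis_nodes "
        "WHERE enabled = 1 AND allocated_id = '0' ORDER BY hostid"
    )
    return [row[0] for row in cursor]


print(free_vis_nodes(make_db()))
```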

34
Vis comms manager
  • Performs context/port management
  • Parameters stored in RMS table, vis_comms
      • slot: slot id
      • ctx_start: start context number
      • ctx_end: end context number (ctx_start+4)
      • port: unique port to talk to mothership
      • enabled: slot enabled? (0/1)
      • allocated_id: allocated? (0 or vis_resources.id)
      • comments
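
The arithmetic behind "parameters are limited" can be made concrete with a small sketch. It assumes each slot covers ctx_start through ctx_start + 4 inclusive (five contexts) and uses the 32-63 user context range mentioned later in the talk. The `build_slots` helper and the base port number are hypothetical.

```python
def build_slots(ctx_low, ctx_high, width=5, base_port=10000):
    """Carve a context range into fixed-width slots with unique ports.

    width=5 reflects the assumed ctx_end = ctx_start + 4 (inclusive).
    base_port is an arbitrary illustrative value.
    """
    slots = []
    start, slot_id = ctx_low, 0
    while start + width - 1 <= ctx_high:
        slots.append({
            "slot": slot_id,
            "ctx_start": start,
            "ctx_end": start + width - 1,
            "port": base_port + slot_id,
            "enabled": 1,
            "allocated_id": 0,
        })
        start += width
        slot_id += 1
    return slots


slots = build_slots(32, 63)
print(len(slots), slots[0], slots[-1])
```

Under these assumptions the 32-63 range yields only six slots, which is why the comms parameters are treated as a schedulable resource in their own right.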

35
Vis resources manager
  • Establishes a vis resource
  • Parameters stored in RMS table, vis_resources
      • id: PBS batch id
      • startTime: when started
      • endTime: when ended (or 0 if running)
      • rand_int: 4-byte communication key
      • ctx_start: from vis comms manager
      • ctx_end: from vis comms manager
      • port: from vis comms manager
      • exec_host: where mothership is running
      • vis_nodes: vis nodes allocated (ex. 1-4,6)
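
The compact node list format shown for vis_nodes (for example 1-4,6) can be expanded with a few lines of code. The `expand_nodes` name is hypothetical, and the sketch assumes the format is a comma-separated list of single ids and inclusive ranges.

```python
def expand_nodes(spec):
    """Expand a string such as '1-4,6' into a list of node numbers."""
    nodes = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            nodes.extend(range(int(low), int(high) + 1))  # inclusive range
        else:
            nodes.append(int(part))
    return nodes


print(expand_nodes("1-4,6"))
```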

36
Vis allocation/deallocation manager
  • Allocate
      • Get list of free vis nodes from Vis node manager
      • Mark nodes in use
      • Call Vis comms manager
      • Call Vis resources manager
  • Deallocate
      • Clear nodes in use
      • Call Vis comms manager
      • Call Vis resources manager
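
The allocate and deallocate sequences above can be sketched in memory as follows. Simon itself is written in Tcl and keeps this state in RMS tables. The `VisAllocator` class, its fields and the sample host names are hypothetical stand-ins for the three managers.

```python
import random
import time


class VisAllocator:
    """In-memory sketch of the allocate/deallocate flow (not Simon's code)."""

    def __init__(self, nodes, slots):
        self.node_owner = {node: 0 for node in nodes}  # 0 means free
        self.slot_owner = {slot: 0 for slot in slots}  # 0 means free
        self.resources = {}                            # job id -> record

    def allocate(self, job_id, count):
        # Vis node manager: list the free nodes.
        free_nodes = [n for n, owner in self.node_owner.items() if owner == 0]
        # Vis comms manager: find a free context/port slot.
        free_slots = [s for s, owner in self.slot_owner.items() if owner == 0]
        if len(free_nodes) < count or not free_slots:
            return None  # job keeps waiting
        chosen = free_nodes[:count]
        for node in chosen:
            self.node_owner[node] = job_id  # mark nodes in use
        self.slot_owner[free_slots[0]] = job_id
        # Vis resources manager: record the new vis resource.
        self.resources[job_id] = {
            "vis_nodes": chosen,
            "slot": free_slots[0],
            "rand_int": random.getrandbits(32),  # 4-byte communication key
            "startTime": time.time(),
            "endTime": 0,
        }
        return self.resources[job_id]

    def deallocate(self, job_id):
        for node, owner in self.node_owner.items():
            if owner == job_id:
                self.node_owner[node] = 0  # clear nodes in use
        for slot, owner in self.slot_owner.items():
            if owner == job_id:
                self.slot_owner[slot] = 0
        self.resources[job_id]["endTime"] = time.time()


alloc = VisAllocator(["iamvis00", "iamvis01", "iamvis02"], slots=[0])
print(alloc.allocate("101", 2)["vis_nodes"])
```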

37
The Vis Node Monitor
38
Future Challenges
  • Improve interactivity
  • Multiple OpenGL programs in one job
  • Would like to see more Elan user contexts
      • Limited to 32-63 starting with UK1
      • Used to have 32-1023!
      • Need 1 context/process
  • Interactive visualization
      • Display image on user's desktop
  • Real time visualization
      • Limited to about 1 frame/sec
  • Reservable vis resources
39
Summary
  • We have integrated this directly with AlphaServer
    SC
  • No roadblocks encountered in making this a seamless
    part of PSC's computing environment