Title: Self-configuring Condor Virtual Machine Appliances for Ad-Hoc Grids
- Renato Figueiredo, Arijit Ganguly, David Wolinsky, J. Rhett Aultman, P. Oscar Boykin
- ACIS Lab, University of Florida
- http://wow.acis.ufl.edu
Outline
- Motivations
- Background
- Condor Virtual Appliance features
- On-going and future work
Motivations
- Goal: plug-and-play deployment of Condor grids
  - High-throughput computing: LAN and WAN
  - Collaboration: file systems, messaging, ...
- Synergistic approach: VM + virtual network + Condor
- WOWs are wide-area NOWs, where
  - Nodes are virtual machines
  - Network is a virtual IP-over-P2P (IPOP) overlay
- VMs provide
  - Sandboxing, software packaging, decoupling
- Virtual network provides
  - A virtual private LAN over the WAN: self-configuring and capable of firewall/NAT traversal
- Condor provides
  - Match-making, reliable scheduling, unmodified applications
Condor WOWs - outlook
[Diagram: a Condor WOW overlay; nodes hold virtual IP addresses (e.g. 10.0.0.1)]
Condor WOW snapshot
[Map: WOW nodes in Gainesville, Zurich, and Long Beach]
Roadmap
- The basics
  - 1.1 VMs and appliances
  - 1.2 IPOP: IP-over-P2P virtual network
  - 1.3 Grid Appliance and Condor
- The details
  - 2.1 Customization, updates
  - 2.2 User interface
  - 2.3 Security
  - 2.4 Performance
- Usage experience
1.1 VMs and appliances
- System VMs
  - VMware, KVM, Xen
  - Homogeneous system
  - Sandboxing
  - Co-exist with unmodified hosts
- Virtual appliances
  - Hardware/software configuration packaged in easy-to-deploy VM images
  - Only dependencies: ISA (x86), VMM
1.2 IPOP virtual networking
- Key technique: IP-over-P2P tunneling
  - Interconnects VM appliances
  - WAN VMs perceive a virtual LAN environment
- IPOP is self-configuring
  - Avoids the administrative overhead of VPNs
  - NAT and firewall traversal
- IPOP is scalable and robust
  - P2P routing deals with node joins and leaves
- IPOP networks are isolated
  - One or more private IP address spaces
  - Decentralized DHCP serves addresses for each space
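The decentralized DHCP idea can be sketched as an atomic create-if-absent over the P2P DHT: a joining VM picks a random address in its namespace and claims it if no other node holds the key. This is a minimal illustration; the `DhtDhcp` class, the `create()` API, and the address layout are assumptions, not IPOP's actual implementation.

```python
import hashlib
import random

class InMemoryDht:
    """Stand-in for the P2P DHT, for local illustration only."""
    def __init__(self):
        self.store = {}
    def create(self, key, value):
        # Atomic create-if-absent: fails if another node holds the key
        if key in self.store:
            return False
        self.store[key] = value
        return True

class DhtDhcp:
    """Sketch of decentralized DHCP over a DHT (hypothetical API)."""

    def __init__(self, dht, namespace="ipop-ns1", prefix="10.0"):
        self.dht = dht
        self.namespace = namespace
        self.prefix = prefix

    def acquire_address(self, node_id, max_tries=10):
        for _ in range(max_tries):
            # Randomly probe the private space (skip .0 and .255 hosts)
            addr = f"{self.prefix}.{random.randint(0, 255)}.{random.randint(1, 254)}"
            key = hashlib.sha1(f"{self.namespace}:{addr}".encode()).hexdigest()
            if self.dht.create(key, node_id):
                return addr   # lease claimed; no central DHCP server involved
        raise RuntimeError("address space exhausted or too contended")
```

Keying leases by namespace keeps multiple isolated IPOP networks from colliding even when they reuse the same private address range.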
1.2 IPOP virtual networking (cont.)
- Structured overlay network topology
  - Bootstraps 1-hop IP tunnels on demand
  - Discovers NAT mappings: decentralized hole punching
- VM keeps its IPOP address even if it migrates on the WAN
- Ganguly et al., IPDPS 2006, HPDC 2006
1.3 Grid appliance and Condor
- Base: Debian Linux + Condor + IPOP
  - Works on x86 Linux/Windows/MacOS; VMware, KVM/QEMU
  - 157MB zipped
- Uses NAT and host-only NICs
  - No need to get an IP address on the host network
- Managed negotiator/collector VMs
- Easy to deploy schedd/startd VMs
  - Flocking is easy: virtual network is a LAN
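Because every appliance sits on the same virtual LAN, flocking reduces to ordinary Condor configuration. A sketch of the relevant condor_config fragment follows; the virtual IPs are hypothetical, and the slides do not show the appliance's actual settings:

```
# condor_config fragment (sketch; addresses are illustrative)
CONDOR_HOST = 10.0.0.1      # managed negotiator/collector inside the WOW
FLOCK_TO    = 10.0.1.1      # remote pool's central manager, by virtual IP
FLOCK_FROM  = 10.0.*        # pools allowed to flock jobs to this startd
# Since IPOP presents the WAN as one virtual LAN, no firewall- or
# NAT-specific Condor settings are needed on either side.
```

`FLOCK_TO` is read by the submitting schedd and `FLOCK_FROM` by the accepting pool, so both sides opt in explicitly.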
2.1 Customization and updates
- VM image: virtual disks
  - Portable medium for data
  - Growable after distribution
- Disks are logically stacked
  - Leverage the UnionFS file system
- Three stacks
  - Base: O/S, Condor, IPOP
  - Module: site-specific configuration (e.g. nanoHUB)
  - Home: user persistent data
- Major updates: replace base/module
- Minor updates: automatic, apt-based
2.2 User interface (Windows, Mac, and Linux hosts)
- VM console: X11 GUI
- Host-mounted loop-back Samba folder
- Loopback SSH
[Screenshots of the appliance UI on each host O/S]
2.3 Security
- Appliance firewall
  - eth0: block all outgoing Internet packets
    - Except DHCP, DNS, and IPOP's UDP port
    - Only traffic within the WOW is allowed
  - eth1 (host-only): allow ssh, Samba
- IPsec
  - X.509 host certificates
  - Authentication and end-to-end encryption
  - VM joins the WOW only with a signed certificate bound to its virtual IP
  - Private net/netmask: 10 lines of IPsec configuration for an entire class A network!
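One plausible shape of such a compact policy, using the ipsec-tools `setkey` syntax, is sketched below; the appliance's actual configuration is not shown in the slides, so treat this as an illustration of how a single net/netmask pair can require ESP for a whole class A network:

```
#!/usr/sbin/setkey -f
flush;
spdflush;
# Require ESP for all traffic within the 10.0.0.0/8 virtual network
spdadd 10.0.0.0/8 10.0.0.0/8 any -P out ipsec esp/transport//require;
spdadd 10.0.0.0/8 10.0.0.0/8 any -P in  ipsec esp/transport//require;
```

A key-exchange daemon (e.g. racoon) would then negotiate per-pair security associations, authenticating peers with their X.509 host certificates.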
2.4 Performance
- User-level C# IPOP implementation (UDP)
  - Link bandwidth: 25-30 Mbit/s
  - Latency overhead: 4 ms
- Connection times
  - 5-10s to join the P2P ring and obtain a DHCP address
  - 10s to create shortcuts (UDP hole-punching)
- Benchmark: SimpleScalar 3.0 (cycle-accurate CPU simulator)
Experiences
- Bootstrap WOW with VMs at UF and partners
  - Currently 300 VMs, IPOP overlay routers (PlanetLab)
- Exercised with 10,000s of Condor jobs from real users
  - nanoHUB: 3-week-long, 9,000-job batch (BioMoca) submitted via a Condor-G gateway
  - P2Psim, CH3D, SimpleScalar
- Pursuing interactions with users and the Condor community for broader dissemination
Time scales and expertise
- Development of baseline VM image
  - VM/Condor/IPOP expertise: weeks/months
- Development of custom module
  - Domain-specific expertise: hours/days/weeks
- Deployment of VM appliance
  - No previous experience with VMs or Condor needed
  - 15-30 minutes to download and install a VMM
  - 15-30 minutes to download and unzip the appliance
  - 15-30 minutes to boot the appliance, automatically connect to a Condor pool, run condor_status and a demo condor_submit job
On-going and future work
- Enhancing self-organization at the Condor level
  - Structured P2P for manager publish/discovery
    - Distributed hash table (DHT): primary and flocking
    - Condor integration via configuration files, DHT scripts
  - Unstructured P2P for matchmaking
    - Publish/replicate/cache ClassAds on the P2P overlay
    - Support for arbitrary queries
    - Condor integration: proxies for collector/negotiator
- Decentralized storage, cooperative caching
  - Virtual file systems (NFS proxies)
  - Distribution of updates, read-only code repositories
  - Caching and COW for diskless, net-boot appliances
Acknowledgments
- National Science Foundation: NMI, CI-TEAM
- SURA SCOOP (Coastal Ocean Observing and Prediction)
- http://wow.acis.ufl.edu: publications, Brunet/IPOP code (GPL'ed C#), Condor Grid appliance
Questions?
Self-organizing NAT traversal, shortcuts
[Diagram: Node A sends a CTM request toward Node B - "connect to me at my NAT IP:port"]
- A starts exchanging IP packets with B
- Traffic inspection triggers a request to create a shortcut: Connect-to-me (CTM)
- A tells B its known address(es)
  - A had learned its NATed public IP/port when it joined the overlay
Self-organizing NAT traversal, shortcuts (cont.)
[Diagram: Node B sends a link request to Node A's NAT endpoint (IP:port), and a CTM reply through the overlay carrying NAT (IP:port) of B]
- B sends a CTM reply, routed through the overlay
  - B tells A its address(es)
- B initiates the linking protocol by attempting to connect to A directly
Self-organizing NAT traversal, shortcuts (cont.)
[Diagram: A gets the CTM reply and initiates linking]
- B's linking protocol message to A pokes a hole in B's NAT
- A's linking protocol message to B pokes a hole in A's NAT
- CTM protocol establishes a direct shortcut
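The simultaneous-send step at the heart of this exchange can be sketched with plain UDP sockets: once A and B have learned each other's endpoints through the overlay, each sends a datagram to the other's public endpoint, so each NAT sees outbound traffic first and admits the reply. The demo below runs on localhost (no real NATs involved) and is an illustration of the technique, not IPOP's linking code.

```python
import socket

def make_endpoint():
    # An ephemeral localhost port stands in for a node's NAT mapping
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2.0)
    return s

def punch(sock, peer_addr, node_name):
    # The outbound packet "pokes the hole"; across real NATs the first
    # packet in each direction may be dropped, so production code retries.
    sock.sendto(f"link-request from {node_name}".encode(), peer_addr)

# A and B learned each other's endpoints via the CTM exchange
a, b = make_endpoint(), make_endpoint()
a_addr, b_addr = a.getsockname(), b.getsockname()

punch(a, b_addr, "A")           # A -> B opens A's mapping
punch(b, a_addr, "B")           # B -> A opens B's mapping

msg_at_b, _ = b.recvfrom(1024)  # B receives A's link request
msg_at_a, _ = a.recvfrom(1024)  # A receives B's link request
```

Once both datagrams get through, the pair has a direct shortcut and overlay routing is no longer on the data path.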
Performance considerations
- CPU-intensive application, Condor
  - SimpleScalar 3.0d: execution-driven computer architecture simulator
Performance considerations (cont.)
- I/O: PostMark
  - Version 1.51
  - Parameters
    - Minimum file size: 500 bytes
    - Maximum file size: 4.77 MB
    - Transactions: 5,000
Performance considerations (cont.)
- User-level C# IPOP implementation (UDP)
  - Link bandwidth: 25-30 Mbit/s (LAN)
  - Latency overhead: 4 ms
- Connection times
  - Fine-tuning has reduced the mean address acquire time to 6-10s, with degree of redundancy n=8
Condor Appliance on a desktop
[Diagram: VM hardware configuration with stacked virtual disks - Linux/Condor/IPOP (base), domain-specific tools (module), user files (home) - plus swap]
Related Work
- Virtual networking
  - VIOLIN
  - VNET: topology adaptation
  - ViNe
- Internet Indirection Infrastructure (i3)
  - Support for mobility, multicast, anycast
  - Decouples packet sending from receiving
  - Based on the Chord P2P protocol
- IPv6 tunneling
  - IPv6 over UDP (Teredo protocol)
  - IPv6 over P2P (P6P)