Condor in the University of Oxford OxGrid

1
Condor in the University of Oxford - OxGrid
  • David Spence, Tiejun Ma, Xin Xiong and David
    Wallom
  • Oxford e-Research Centre

2
Overview
  • Background
  • OxGrid Design
  • Linux Clients for Windows
  • Networking
  • Job management
  • Packaging
  • OxGrid Projects
  • Conclusion

3
Background
  • Oxford University has a complex structure, which
    makes rolling out a Campus Grid tricky
  • There are 38 colleges which are
    legally independent of the University...
  • ... and at least 63 academic departments which
    are semi-autonomous
  • The colleges provide the bulk of PCs accessible
    to students
  • Some departments tend to provide teaching
    laboratories
  • Each has its own independent IT support staff
  • OeRC is just one of the departments

4
OxGrid Architecture: Administration
[Diagram; only its labels survive in the transcript]
  • Oxford e-Research Centre
  • Resource Broker/Login (Condor)
  • Storage (SRB)
  • BDII, VOMS, SSO CA...
  • Department/College (three shown)
  • Condor pool (two shown)
  • Departmental Clusters
  • Other University/Institution (three shown)
  • Microsoft Cluster
  • National Grid Service Cluster
  • Super-computing centre
  • National Grid Service Resource

5
OxGrid Architecture: Technical
[Diagram; only its labels survive in the transcript]
  • OxGrid Resource Broker
  • Job Submit Scripts
  • RB Advertise Script
  • Adverts include dynamic information, e.g. FreeCPU
  • condor_schedd
  • condor_collector
  • condor_negotiator
  • condor_gridmanager
  • Globus
  • Glue
  • Condor
  • OxGrid BDII
  • NGS BDII
  • National Grid Service Resource
  • Oxford Condor Pool
  • Oxford Condor Pool Headnodes
  • OeRC Resources
  • Other Oxford Resources
  • Machines

6
User Interface
  • getcert
      • Obtain a local, low-assurance certificate
      • Uses University SSO architecture and MyProxy
  • submit-job
      • Test accessibility of resources
      • Matchmaking based on free CPUs
      • Submit to different resource types (Campus Grid,
        NGS...)
      • Flexible parameter/file/filename sweeps
      • Various waiting primitives for implementing
        workflows
  • job-submission-script
      • Simpler: one job
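
The "matchmaking based on free CPUs" step can be pictured with a small sketch. This is not the OxGrid submit-job implementation, which the slides do not show. The `Resource` fields, the resource kinds, and the `pick_resource` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str        # hypothetical resource label
    kind: str        # assumed categories, e.g. "campus" or "ngs"
    free_cpus: int   # dynamic value, as a resource might advertise it
    reachable: bool  # outcome of an accessibility test


def pick_resource(resources: List[Resource],
                  kind: Optional[str] = None) -> Optional[Resource]:
    """Return the reachable resource with the most free CPUs, or None."""
    candidates = [
        r for r in resources
        if r.reachable and r.free_cpus > 0 and (kind is None or r.kind == kind)
    ]
    # max() with default=None handles the "nothing suitable" case.
    return max(candidates, key=lambda r: r.free_cpus, default=None)
```

A caller would first test each resource for accessibility, then pass the list to `pick_resource`, optionally restricting the choice to one resource type.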

7
Departmental Condor pools
  • We are signing up Colleges and Departments
  • We set up a Head node for the department
  • Automatic installation script using VDT
  • Called condor.DEPT.ox.ac.uk
  • This is placed in their machine room
  • Firewalls only need to be open for Globus
    traffic between this headnode and the resource
    broker in OeRC
  • We then need to get IT staff to install clients
  • Linux installation instructions
  • But we really want Linux on Windows
  • Electricity cost is becoming an issue

8
Windows Client Requirements
  • We want to support Windows and Linux jobs
  • With mixed Condor pools
  • Desktop and Lab PCs are not centrally owned or
    managed
  • Most departments have Windows PCs
  • It is not easy to persuade departmental IT staff
    to support the Campus Grid
  • We need to provide an easy-to-install client
    which can be used with existing management frameworks

9
Using CoLinux
  • We decided to use the CoLinux system for running
    Linux Condor under Windows
  • CoLinux is the Linux kernel ported to Windows
  • It has proved to be reliable in our experience
  • It is nearly as fast as native (0.1 overhead,
    see http://www.ibm.com/developerworks/linux/library/l-colinux/)
  • It requires no porting of code
  • Appears as a normal Windows service to Windows
  • Can be stopped and started from Services
  • Or net start/stop colinux

10
CoLinux Networking
  • Must use the same interface/IP as Windows in
    the general case
  • Use Slirp, which makes all of CoLinux look like a
    Windows process
  • Internal 10.x.x.x network
  • Great for security: specify exactly which ports
    Linux can listen on.
  • There is a small range of outgoing ports
    configured in Condor with IN_LOWPORT and
    IN_HIGHPORT.

11
CoLinux Network Configuration
[Diagram; only its labels survive in the transcript]
  • Boxes: Windows, Internal Network, CoLinux
  • Labels: Win Sock API, Socket API, Condor,
    eth0 10.0.2.15, Connection REAL_IP, Gateway 10.0.0.2
  • Packet annotations, in transcript order:
      • From RealIP, To RemoteIP, ClassAd Contact 10.0.2.15:port
      • From 10.0.2.15, To RemoteIP, ClassAd Contact 10.0.2.15:port
      • From RealIP, To RemoteIP, ClassAd Contact RealIP:port
      • From 10.0.2.15, To RemoteIP, ClassAd Contact RealIP:port
      • From RealIP, To RemoteIP, ClassAd Contact RealIP:port

12
CoLinux Network Configuration
  • On start-up re-configure the networking in
    CoLinux
  • IP of Windows passed to CoLinux as kernel
    parameter
  • Create a new IP alias for eth0 with the Windows
    IP address.
      • ifconfig eth0:1 <ip> netmask 255.255.255.255
  • Set up IP Tables to re-route requests
      • iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 10.0.2.15
      • iptables -t nat -A PREROUTING -i eth0 -j DNAT --to <ip>
  • Make up a unique hostname: colinux.<windows IP>
      • hostname colinux.<ip>
      • Add to /etc/hosts
  • Export some variables to Condor, which are used
    in the Condor configuration files
      • HOST_OS_IP=<ip> used for NETWORK_INTERFACE
      • CONDOR_HOST=condor.<domain> used for CONDOR_HOST
      • CONDOR_DOMAIN=.<domain> used for access control
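
The start-up steps above can be sketched as a small helper that only builds the command lines and runs nothing. It is an illustration, not the OxGrid start-up script. The kernel parameter name `HOST_OS_IP` is an assumption, since the slides name only the exported variable and not the kernel parameter. The alias interface is read as `eth0:1`, and both function names are hypothetical.

```python
import ipaddress
from typing import List, Optional


def windows_ip_from_cmdline(cmdline: str,
                            key: str = "HOST_OS_IP") -> Optional[str]:
    """Find key=value on a kernel command line (the key name is assumed)."""
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name == key:
            # ip_address() raises ValueError on a malformed address.
            return str(ipaddress.ip_address(value))
    return None


def startup_commands(ip: str,
                     internal_ip: str = "10.0.2.15") -> List[List[str]]:
    """Build (not run) the commands listed on the slide for a Windows IP."""
    return [
        # Alias the Windows address onto eth0.
        ["ifconfig", "eth0:1", ip, "netmask", "255.255.255.255"],
        # Rewrite the source of outgoing packets to the internal address.
        ["iptables", "-t", "nat", "-A", "POSTROUTING",
         "-o", "eth0", "-j", "SNAT", "--to", internal_ip],
        # Rewrite the destination of incoming packets to the Windows address.
        ["iptables", "-t", "nat", "-A", "PREROUTING",
         "-i", "eth0", "-j", "DNAT", "--to", ip],
        # Unique hostname derived from the Windows address.
        ["hostname", f"colinux.{ip}"],
    ]
```

Keeping command construction separate from execution makes the logic checkable without root privileges; a real script would pass each list to something like `subprocess.run`.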

13
Controlling CoLinux Jobs
  • Providing feedback about user activity in Windows
      • University of Nebraska: monitoring scripts in
        Windows
      • University of Reading: start up service on
        power-on/log-off, stop at log-in (Group Policy)
      • Oxford Teaching labs: run in background at low
        priority
  • Many Oxford Departments support Linux AND
    Windows
  • Let Condor for Windows monitor the machine!
      • Periodically check Windows Condor status
      • Add HostState to the ClassAd for CoLinux
        Condor
      • Used to control when jobs can run
      • e.g. Start = (HostState != "Claimed")
      • Run job if there is at least one free core
        reported
      • Best to only use with multiple cores
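
The HostState policy can be modelled in a few lines. The slides do not say how HostState is derived from the Windows Condor status, so the aggregation rule here is an assumption: a core counts as free when its state is "Unclaimed", and the host is reported as "Claimed" when no core is free. The function names are hypothetical.

```python
from typing import Iterable


def host_state(core_states: Iterable[str]) -> str:
    """Collapse per-core states into one HostState value (assumed rule)."""
    free_cores = sum(1 for s in core_states if s == "Unclaimed")
    # With no free core (or no information at all) report "Claimed",
    # which is the conservative choice for a shared desktop.
    return "Unclaimed" if free_cores >= 1 else "Claimed"


def may_start(core_states: Iterable[str]) -> bool:
    """Mirror of the slide's policy: start unless HostState is Claimed."""
    return host_state(core_states) != "Claimed"
```

On a single-core machine this rule blocks CoLinux jobs whenever Windows Condor is busy, which fits the slide's advice to use the scheme on multi-core hosts.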

14
Packaging
  • For ease of installation we create an MSI
  • One-click installation and configuration of
    CoLinux
  • Can include Condor for Windows and set it up to
    work with the CoLinux Condor setup
  • Can be used with all Windows remote management
    systems
  • Will automatically set up configuration files
    based on machine characteristics
  • Debian-based filesystem image
  • 1 GB limit on cab files: compress image with
    bzip2 (2.2 GB -> 336 MB)
  • Once we have set up a head-node for the
    department/college then all the local IT staff need to
    do is run the installer
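
The compression step can be sketched with Python's standard `bz2` module. This illustrates the idea only and is not the actual packaging tooling. The function names, the chunk size, and treating the 1 GB cab limit as 1024**3 bytes are assumptions.

```python
import bz2
import os
import shutil

CAB_LIMIT_BYTES = 1024 ** 3  # assumed reading of the "1 GB" cab limit


def compress_image(src_path: str, dst_path: str,
                   chunk_size: int = 1024 * 1024) -> int:
    """Stream src_path into a bzip2 file and return the compressed size."""
    with open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        # Copy in chunks so a multi-gigabyte image never sits in memory.
        shutil.copyfileobj(src, dst, chunk_size)
    return os.path.getsize(dst_path)


def fits_in_cab(path: str, limit: int = CAB_LIMIT_BYTES) -> bool:
    """True when the file is small enough for a single cab file."""
    return os.path.getsize(path) <= limit
```

Calling `compress_image` on the filesystem image and then `fits_in_cab` on the result gives a quick check before building the installer.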

15
Ongoing work: OxGrid Projects
  • Low-Carbon ICT
      • Research and communications in low-carbon
        technologies: University-wide wake-on-LAN
        service and Condor integration (with OUCS,
        Oxford Environmental Change Institute).
  • GridBS
      • Extending the GridSAM middleware with Condor
        Matchmaking (with Imperial, OMII at
        Southampton). Basis of the current Resource
        Broker.
  • SARoNGS
      • Integrating the access to e-Research facilities
        with the Federated Access Management (Shibboleth)
        being rolled out across the UK (with STFC and
        Manchester).

16
Conclusion
  • Departmental and College Condor pools are
    accessed through a central pool of pools that
    also supports access to other resources
  • CoLinux is used to provide Linux support on
    Windows
  • No networking configuration required
  • Can be combined with Condor for Windows
  • Zero-configuration installer for all components
