Maui High Performance Computing Center (Presentation Transcript)

1
Maui High Performance Computing Center
Open System Support: An AFRL, MHPCC and UH Collaboration
December 18, 2007
Mike McCraney, MHPCC Operations Director
2
Agenda
  • MHPCC Background and History
  • Open System Description
  • Scheduled and Unscheduled Maintenance
  • Application Process
  • Additional Information Required
  • Summary and Q/A

3
An AFRL Center
  • An Air Force Research Laboratory Center
  • Operational since 1993
  • Managed by the University of Hawaii
  • Subcontractor Partners: SAIC / Boeing
  • A DoD High Performance Computing Modernization
    Program (HPCMP) Distributed Center
  • Task Order Contract: Maximum Estimated
    Ordering Value of $181,000,000
  • Performance Dependent: 10 Years
  • 4-Year Base Period with Two 3-Year Term
    Awards

4
A DoD HPCMP Distributed Center
Director, Defense Research and Engineering
DUSD (Science and Technology)
High Performance Computing Modernization Program
  • Distributed Centers
  • Allocated Distributed Centers
  • Army High Performance Computing Research Center
    (AHPCRC)
  • Arctic Region Supercomputing Center (ARSC)
  • Maui High Performance Computing Center (MHPCC)
  • Space and Missile Defense Command (SMDC)
  • Dedicated Distributed Centers
  • ATC
  • AFWA
  • AEDC
  • AFRL/IF
  • Eglin
  • FNMOC
  • JFCOM/J9
  • NAWC-AD
  • NAWC-CD
  • NUWC
  • RTTC
  • SIMAF
  • SSCSD
  • WSMR
  • Major Shared Resource Centers
  • Aeronautical Systems Center (ASC)
  • Army Research Laboratory (ARL)
  • Engineer Research and Development Center (ERDC)
  • Naval Oceanographic Office (NAVO)

5
MHPCC HPC History
  • 1994 - IBM P2SC Typhoon Installed
  • 1996 - 2000 IBM P2SC
  • 2000 - IBM P3 Tempest Installed
  • 2001 - IBM Netfinity Huinalu Installed
  • 2002 - IBM P2SC Typhoon Retired
  • 2002 - IBM P4 Tempest Installed
  • 2004 - LNXi Evolocity II Koa Installed
  • 2005 - Cray XD1 Hoku Installed
  • 2006 - IBM P3 Tempest Retired
  • 2007 - IBM P4 Tempest Reassigned

6
Hurricane Configuration Summary
Current Hurricane Configuration
  • Eight 32-processor/32GB nodes (IBM P690 POWER4)
  • Jobs may be scheduled across nodes for a total of
    288p
  • Shared memory jobs can span up to 32p and 32GB
  • 10TB Shared Disk available to all nodes
  • LoadLeveler Scheduling (see the example job file
    at the end of this slide)
  • One job per node: 32p chunks can only support
    8 simultaneous jobs
  • Issues
  • Old technology, reaching end of life,
    upgradability issues
  • Cost prohibitive: constant power consumption,
    roughly $400,000 annual power cost
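A minimal LoadLeveler job command file for the one-job-per-node model above might look like the sketch below; the wall-clock limit, output names, and executable are hypothetical illustrations, not taken from the slides.

    # hurricane.cmd - sketch of a LoadLeveler job command file requesting
    # one full 32-processor P690 node (limits and names are illustrative)
    # @ job_type         = parallel
    # @ node             = 1
    # @ tasks_per_node   = 32
    # @ wall_clock_limit = 01:00:00
    # @ output           = hurricane.$(jobid).out
    # @ error            = hurricane.$(jobid).err
    # @ queue
    poe ./my_app

Such a file would be submitted with llsubmit and monitored with llq.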

7
Dell Configuration Summary
Proposed Shark Configuration
  • Forty 4-processor/8GB nodes: Intel 3.0 GHz
    dual-core Woodcrest processors
  • Jobs may be scheduled across nodes for a total of
    160p
  • Shared memory jobs can span up to 8p and 16GB
  • 10TB Shared Disk available to all nodes
  • LSF Scheduler (see the example batch script at
    the end of this slide)
  • One job per node: 8p chunks can support up to
    40 simultaneous jobs
Features/Issues
  • Shared use as Open system and TDS (test and
    development system)
  • Much lower power cost: Intel power management
  • System already maintained and in use
  • System covered 24x7: UPS, generator
  • Possible short-notice downtime
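As a counterpart for Shark, the sketch below is an LSF batch script that packs MPI ranks 8 per node, matching the 8p chunks above; the job name, limits, and MPI launch line are illustrative assumptions, not site documentation.

    #!/bin/bash
    # shark.lsf - sketch of an LSF submission script (submit with: bsub < shark.lsf)
    #BSUB -J shark_test          # job name (hypothetical)
    #BSUB -n 16                  # total slots requested
    #BSUB -R "span[ptile=8]"     # at most 8 slots per node, matching the 8p chunks
    #BSUB -W 01:00               # wall-clock limit (hh:mm)
    #BSUB -o shark.%J.out        # stdout; %J expands to the LSF job ID
    #BSUB -e shark.%J.err        # stderr
    mpirun -np 16 ./my_app       # MPI launch; the actual launcher depends on the site stack
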
8
Jaws Architecture
Architecture diagram: Cisco 6500 Core switch and Head Node (interconnect labels summarized at the end of this slide)
  • Head Node for System Administration
  • Build Nodes
  • Running Parallel Tools (pdsh, pdcp, etc.; see the
    example commands at the end of this slide)
  • SSH Communications Between Nodes
  • Localized Infiniband Network
  • Private Ethernet
  • Dell Remote Access Controllers
  • Private Ethernet
  • Remote Power On/Off
  • Temperature Reporting
  • Operability Status
  • Alarms
  • 10 Blades Per Chassis
  • CFS Lustre Filesystem
  • Shared Access
  • High Performance
  • Using Infiniband Fabric

Diagram interconnect labels: 10 Gig-E Ethernet fibre (Gig-E to the nodes, with 10 Gig-E uplinks, 40 nodes per uplink), Fibre Channel, and Cisco InfiniBand (copper).
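The parallel tools above are typically driven from the head node over SSH; the commands below are a sketch, and the jaws-c1-b[1-10] host names are hypothetical placeholders for one chassis of blades.

    # Run a command on ten blades in parallel with pdsh (fans out over SSH)
    pdsh -w jaws-c1-b[1-10] 'uname -r; uptime'

    # Copy a file from the head node to the same blades with pdcp
    pdcp -w jaws-c1-b[1-10] /etc/ntp.conf /etc/ntp.conf
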
9
Shark Software
  • Systems Software
  • Red Hat Enterprise Linux v4
  • 2.6.9 Kernel
  • Infiniband
  • Cisco Software stack
  • MVAPICH
  • MPICH 1.2.7 over IB library (see the build/run
    example after this list)
  • GNU 3.4.6 C/C++/Fortran
  • Intel 9.1 C/C++/Fortran
  • Platform LSF HPC 6.2
  • Platform Rocks
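A hedged build-and-run example using the MVAPICH wrapper compilers over the MPICH 1.2.7 interface listed above; the file names, host list, and rank count are illustrative.

    # Build an MPI program with the MVAPICH compiler wrapper
    # (GNU 3.4.6 or Intel 9.1 backend, depending on the environment)
    mpicc -O2 -o hello hello.c

    # Launch 8 ranks; the machine file is a hypothetical list of node names
    # (under LSF the host list would normally come from the scheduler)
    mpirun -np 8 -machinefile ./hosts ./hello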

10
Maintenance Schedule
  • Current
  • 2:00pm - 4:00pm
  • 2nd and 4th Thursday (as necessary)
  • Check website (mhpcc.hpc.mil) for maintenance
    notices
  • New Proposed Schedule
  • 8:00am - 5:00pm
  • 2nd and 4th Wednesdays (as necessary)
  • Check website for maintenance notices
  • Only take maintenance on scheduled systems
  • Check on Mondays before submitting jobs

11
Account Applications and Documentation
  • Contact Helpdesk or website for application
    information
  • Documentation Needed
  • Account names, systems, special requirements
  • Project title, nature of work, accessibility of
    code
  • Nationality of applicant
  • Collaborative relevance with AFRL
  • New Requirements
  • Case File information
  • For use in AFRL research collaboration
  • Future AFRL applicability
  • Intellectual property shared with AFRL
  • Annual Account Renewals
  • September 30 is final day of the fiscal year

12
Summary
  • Anticipated migration to Shark
  • Should be more productive and able to support a
    wide range of jobs
  • Cutting-edge technology
  • Cost savings from Hurricane (~$400,000 annual)
  • Stay tuned for the timeline: likely end of
    January or early February

13
Mahalo