Transcript and Presenter's Notes

Title: The NERSC Global File System


1
  • The NERSC Global File System
  • NERSC
  • June 12th, 2006

2
Overview
  • NGF What/Why/How
  • NGF Today
  • Architecture
  • Who's Using It
  • Problems/Solutions
  • NGF Tomorrow
  • Performance Improvements
  • Reliability Enhancements
  • New File Systems (/home)

3
  • What is NGF?

4
NERSC Global File System - what
  • What do we mean by a global file system?
  • Available via standard APIs for file system
    access on all NERSC systems.
  • POSIX
  • MPI-IO (see the sketch at the end of this slide)
  • We plan to extend that access to remote sites via
    future enhancements.
  • High Performance
  • NGF is seen as a replacement for our current file
    systems, and is expected to meet the same high
    performance standards
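
As a concrete illustration of the POSIX and MPI-IO access paths listed
above, here is a minimal C sketch (not taken from the presentation); the
file names under /project are hypothetical and error handling is kept to
a minimum.

  /* Write a small file via POSIX, then via MPI-IO, on an NGF-mounted path. */
  #include <fcntl.h>
  #include <unistd.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* POSIX access: ordinary open/write/close calls work on /project. */
      if (rank == 0) {
          int fd = open("/project/example/posix.dat",
                        O_CREAT | O_WRONLY | O_TRUNC, 0644);
          if (fd >= 0) {
              const char msg[] = "hello from POSIX\n";
              write(fd, msg, sizeof msg - 1);
              close(fd);
          }
      }

      /* MPI-IO access: all ranks open the same file collectively and
         each writes one int at its own offset. */
      MPI_File fh;
      MPI_File_open(MPI_COMM_WORLD, "/project/example/mpiio.dat",
                    MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
      MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(int),
                        &rank, 1, MPI_INT, MPI_STATUS_IGNORE);
      MPI_File_close(&fh);

      MPI_Finalize();
      return 0;
  }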

5
NERSC Global File System - why
  • Increase user productivity
  • Reduce users' data management burden.
  • Enable/Simplify workflows involving multiple
    NERSC computational systems
  • Accelerate the adoption of new NERSC systems
  • Users have access to all of their data, source
    code, scripts, etc. the first time they log into
    the new machine
  • Enable more flexible/responsive management of
    storage
  • Increase Capacity/Bandwidth on demand

6
NERSC Global File System - how
  • Parallel
  • Network/SAN heterogeneous access model
  • Multi-platform (AIX/Linux for now)

7
NGF Today
8
NGF current architecture
  • NGF is a GPFS file system using GPFS
    multi-cluster capabilities
  • Mounted on all NERSC systems as /project
  • External to all NERSC computational clusters
  • Small Linux server cluster managed separately
    from the computational systems.
  • 70 TB of user-visible storage; 50 million inodes.
  • 3 GB/s aggregate bandwidth

9
NGF Current Configuration
10
/project
  • Limited initial deployment - no homes, no
    /scratch
  • Projects can include many users, potentially using
    multiple systems (mpp, vis, …), and seemed to be
    prime candidates to benefit from the NGF shared
    data access model
  • Backed up to HPSS bi-weekly
  • Will eventually receive nightly incremental
    backups.
  • Default project quota
  • 1 TB
  • 250,000 inodes

11
/project (cont.)
  • Current usage
  • 19.5 TB used (28% of capacity)
  • 2.2 M inodes used (5% of capacity)
  • NGF /project is currently mounted on all major
    NERSC systems (1240 clients)
  • Jacquard, LNXI Opteron System running SLES 9
  • Da Vinci, SGI Altix running SLES 9 Service Pack 3
    with direct storage access
  • PDSF, IA32 Linux cluster running Scientific Linux
  • Bassi, IBM Power5 running AIX 5.3
  • Seaborg, IBM SP running AIX 5.2

12
/project Problems and Solutions
  • /project has not been without its problems
  • Software bugs
  • 2/14/06 outage due to a Seaborg gateway crash:
    problem reported to IBM; new PTF with the fix
    installed.
  • GPFS on AIX 5.3 ftruncate() error on compiles:
    problem reported to IBM; efix now installed on
    Bassi.
  • Firmware bugs
  • Fibre Channel switch bug: firmware upgraded.
  • DDN firmware bug (triggered on rebuild): firmware
    upgraded.
  • Hardware Failures
  • Dual disk failure in a RAID array: more exhaustive
    monitoring of disk health, including soft errors,
    is now in place.

13
NGF Solutions
  • General actions taken to improve reliability.
  • Pro-active monitoring: see problems before
    they're problems
  • Procedural development: decrease time to problem
    resolution; perform maintenance without outages
  • Operations staff activities: decrease time to
    problem resolution
  • PMRs filed and fixes applied: prevent problem
    recurrence
  • Replacing old servers: remove hardware with
    demonstrated low MTBF
  • NGF availability since 12/1/05: 99% (total
    downtime 2,439 minutes)

14
Current Project Information
  • Projects using /project file system (46 projects
    to date)
  • narccap: North American Regional Climate Change
    Assessment Program (Phil Duffy, LLNL)
  • Currently using 4.1 TB
  • Global model with fine resolution in 3D and time
    will be used to drive regional models
  • Currently using only Seaborg
  • mp107: CMB Data Analysis (Julian Borrill, LBNL)
  • Currently using 2.9 TB
  • Concerns about quota management and performance
  • 16 different file groups

15
Current Project Information
  • Projects using /project file system (cont.)
  • incite6: Molecular Dynameomics (Valerie Daggett,
    UW)
  • Currently using 2.1 TB
  • snaz: Supernova Science Center (Stan Woosley,
    UCSC)
  • Currently using 1.6 TB

16
Other Large Projects
17
NGF Performance
  • Many users have reported good performance for
    their applications (little difference from
    /scratch).
  • Some applications show variability in read
    performance (MADCAP/MADbench); we are actively
    investigating this.

18
MADbench Results
19
Bassi Read Performance
20
Bassi Write Performance
21
Current Architecture Limitations
  • NGF performance is limited by the architecture of
    current NERSC systems
  • Most NGF I/O uses GPFS TCP/IP storage access
    protocol
  • Only Da Vinci can access NGF storage directly via
    FC.
  • Most NERSC systems have limited IP bandwidth
    outside of the cluster interconnect.
  • 1 GigE link per I/O node on Jacquard; each compute
    node uses only 1 I/O node for NGF traffic, and 20
    I/O nodes feed into a single 10 Gb Ethernet uplink.
  • Seaborg has 2 gateways with 4x GigE bonds; again,
    each compute node uses only 1 gateway.
  • Bassi nodes each have 1 Gb interfaces, all feeding
    into a single 10 Gb Ethernet link.

22
NGF Tomorrow (and beyond)
23
Performance Improvements
  • NGF Client System Performance upgrades
  • Increase client bandwidth to NGF via hardware and
    routing improvements.
  • NGF storage fabric upgrades
  • Increase Bandwidth and ports of NGF storage
    fabric to support future systems.
  • Replace old NGF Servers
  • New servers will be more reliable.
  • 10-gig ethernet capable.
  • New systems will be designed to support high
    performance access to NGF.

24
NGF /home
  • We will deploy a shared /home file system in 2007
  • Initially the home file system for only 1 system;
    it may be mounted on others.
  • All new systems thereafter will have their home
    directories on NGF /home.
  • Will be a new file system with tuning parameters
    configured for small file accesses.

25
/home layout decision
  • Two options
  • A user's login directory is the same for all
    systems
  • /home/matt/
  • A user's login directory is a different
    subdirectory of the user's directory for each
    system
  • /home/matt/seaborg
  • /home/matt/jacquard
  • /home/matt/common
  • /home/matt/seaborg/common -> ../common
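
A minimal C sketch (not from the slides) of how the second layout could
be created; the user name "matt" and the system names are just the
examples shown above.

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void)
  {
      /* One subdirectory per system, plus a shared "common" directory. */
      mkdir("/home/matt", 0755);
      mkdir("/home/matt/seaborg", 0755);
      mkdir("/home/matt/jacquard", 0755);
      mkdir("/home/matt/common", 0755);

      /* Relative symlink so the shared directory is reachable from inside
         each per-system directory: /home/matt/seaborg/common -> ../common */
      symlink("../common", "/home/matt/seaborg/common");
      symlink("../common", "/home/matt/jacquard/common");

      return 0;
  }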

26
One directory for all
  • Users see exactly the same thing in their home
    dir every time they log in, no matter what
    machine they're on.
  • Problems
  • Programs sometimes change the format of their
    configuration files (dotfiles) from one release to
    another without changing the file's name.
  • Setting HOME affects all applications, not just
    the one that needs different config files.
  • Programs have been known to use getpwnam() to
    determine the user's home directory, and to look
    there for config files rather than in HOME.
  • Setting HOME essentially emulates the effect of
    having separate home dirs for each system.
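
A minimal C sketch (not from the slides) of the mismatch described above;
it assumes USER is set in the environment, as it normally is at login.

  #include <stdio.h>
  #include <stdlib.h>
  #include <pwd.h>

  int main(void)
  {
      const char *user = getenv("USER");
      const char *home = getenv("HOME");
      struct passwd *pw = user ? getpwnam(user) : NULL;

      /* What a program sees if it honours the environment variable... */
      printf("HOME        = %s\n", home ? home : "(unset)");

      /* ...versus what it sees if it asks the password database directly;
         resetting HOME does not change this value. */
      if (pw != NULL)
          printf("passwd home = %s\n", pw->pw_dir);

      return 0;
  }

Running this with HOME pointed somewhere else (for example a per-system
subdirectory) would show the two values diverge, which is why resetting
HOME only partially emulates separate per-system home directories.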

27
One directory per system
  • By default users start off in a different
    directory on each system
  • Dotfiles are different on each system unless the
    user uses symbolic links to make them the same
  • All of a user's files are accessible from all
    systems, but a user may need to cd ../seaborg
    to get at files he created on Seaborg if he's
    logged into a different system.

28
NGF /home conclusion
  • We currently believe that the multiple
    directories option will result in fewer problems
    for users, but we are actively evaluating both
    options.
  • We would welcome user input on the matter.

29
NGF /scratch
  • We plan on deploying a shared /scratch to NERSC-5
    sometime in 2008