HPC At PNNL March 2004 - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
HPC At PNNL, March 2004
R. Scott Studham, Associate Director, Advanced Computing
April 13, 2004
2
HPC Systems at PNNL
  • Molecular Science Computing Facility
  • 11.8 TF Linux-based supercomputer using Intel
    Itanium2 processors and an Elan4 interconnect
  • Balanced for our users: 500 TB of disk, 6.8 TB
    of memory
  • PNNL Advanced Computing Center
  • 128-processor SGI Altix
  • NNSA-ASC Spray Cool cluster

3
William R. Wiley Environmental Molecular
Sciences Laboratory
  • Who are we?
  • A 200,000 square-foot U.S. Department of Energy
    national scientific user facility
  • Operated by Pacific Northwest National Laboratory
    in Richland, Washington
  • What we provide for you
  • Free access to over 100 state-of-the-art research
    instruments
  • A peer-review proposal process
  • Expert staff to assist or collaborate
  • Why use EMSL?
  • EMSL provides - under one roof - staff and
    instruments for fundamental research on physical,
    chemical, and biological processes.

4
HPCS2 Configuration
(System diagram.) Key figures:
  • 1,976 next-generation Itanium processors in 928 compute nodes
  • Elan4 and Elan3 interconnects
  • Lustre filesystem on a 2 Gb SAN with 53 TB of disk
  • 2 system management nodes and 4 login nodes with 4 Gb Ethernet
  • 11.8 TF peak and 6.8 TB of memory
The 11.8 TF system is in full operation now.
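As a rough cross-check of these figures, the sketch below multiplies the processor count by an assumed per-processor peak. The 1.5 GHz clock and 4 flops per cycle are assumptions (typical Itanium2 values of the period), not numbers from the slide, but they are consistent with the quoted 11.8 TF.

# Back-of-the-envelope check of the quoted HPCS2 figures.
# Assumption (not on the slide): 1.5 GHz Itanium2 parts issuing 4
# floating-point operations per clock, i.e. 6 Gflop/s peak per processor.

processors = 1976
peak_per_cpu_gflops = 1.5 * 4                       # GHz * flops/cycle = 6 Gflop/s
system_peak_tflops = processors * peak_per_cpu_gflops / 1000
memory_gb_per_cpu = 6.8 * 1024 / processors

print(f"aggregate peak ~ {system_peak_tflops:.1f} TF")       # ~11.9 TF, matching the quoted 11.8 TF
print(f"memory per processor ~ {memory_gb_per_cpu:.1f} GB")  # ~3.5 GB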
5
Who uses the MSCF, and what do they run?
(FY02 usage chart; Gaussian appears among the application labels.)
6
MSCF is focused on grand challenges
Fewer users, focused on longer, larger runs and big science.
More than 67% of the usage is for large jobs.
Demand for access to this resource is high.
7
World-class science is enabled by systems that
deliver the fastest time-to-solution for our science
  • Significant improvement (25-45% for moderate
    numbers of processors) in time to solution from
    upgrading the interconnect to Elan4 (see the
    sketch after this list)
  • Improved efficiency
  • Improved scalability
  • HPCS2 is a science-driven computer architecture
    with the fastest time-to-solution for our users'
    science of any system we have benchmarked.
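The toy model below is one hypothetical way to account for a 25-45% gain: treat a run as compute time plus communication time and assume, purely for illustration, that the Elan3-to-Elan4 upgrade roughly halves the communication cost. Neither the model nor the 2x factor comes from the presentation.

# Toy model: time = compute + communication; only communication gets faster.
# The 2x communication speedup for Elan3 -> Elan4 is an illustrative
# assumption, not a figure from the slides.

def improvement(comm_fraction, comm_speedup=2.0):
    """Fractional reduction in time to solution when only communication speeds up."""
    new_time = (1 - comm_fraction) + comm_fraction / comm_speedup
    return 1 - new_time

for f in (0.5, 0.7, 0.9):
    print(f"communication fraction {f:.0%} -> {improvement(f):.0%} faster")
# Under these assumptions, jobs spending 50-90% of their wall time in the
# interconnect land in the quoted 25-45% range.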

8
Accurate binding energies for large water
clusters
  • These results provide unique information on the
    transition from the cluster to the liquid and
    solid phases of water.
  • Code: NWChem
  • Kernel: MP2 (disk-bound)
  • Sustained performance: 0.6 Gflop/s per processor
    (10% of peak)
  • Choke point: sustained 61 GB/s of disk I/O and
    400 TB of scratch space used (see the derived
    figures below)
  • Took only 5 hours on 1,024 CPUs of the HP cluster.
    This is a capability-class problem that could
    not be completed on any other system.
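The derived figures below use only the numbers quoted on this slide (1,024 CPUs, 0.6 Gflop/s sustained per CPU, 61 GB/s of disk I/O, 5 hours of wall time); treating the I/O rate as sustained for the whole run is an assumption made only to show the scale of the data motion.

# Scale of the MP2 water-cluster run, derived from the quoted numbers.
cpus, gflops_per_cpu = 1024, 0.6
io_gb_per_s, hours = 61, 5

aggregate_gflops = cpus * gflops_per_cpu                 # ~614 Gflop/s sustained
total_pflop = aggregate_gflops * hours * 3600 / 1e6      # ~11 Pflop of arithmetic
data_moved_tb = io_gb_per_s * hours * 3600 / 1024        # ~1,070 TB if 61 GB/s were
                                                         # held for the full 5 hours
print(f"~{aggregate_gflops:.0f} Gflop/s aggregate, ~{total_pflop:.0f} Pflop total")
print(f"~{data_moved_tb:.0f} TB streamed through the 400 TB of scratch space")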

9
Energy calculation of a protein complex
  • The Ras-RasGAP protein complex is a key switch in
    the signaling network initiated by the epidermal
    growth factor (EGF). This signaling network
    controls cell death and differentiation, and
    mutations in the protein complex are responsible
    for 30% of all human tumors.
  • Code: NWChem
  • Kernel: Hartree-Fock
  • Time to solution: 3 hours for one iteration on
    1,400 processors
  • Computation of 107 residues of the full protein
    complex using approximately 15,000 basis
    functions. This is believed to be the largest
    calculation of its type.
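To give a sense of why roughly 15,000 basis functions calls for over a thousand processors, the sketch below counts the two-electron integrals a conventional Hartree-Fock step formally involves. The O(N^4) count and 8-byte storage figure are textbook estimates rather than numbers from the slide, and integral screening reduces the work in practice.

# Formal two-electron integral count for a Hartree-Fock calculation.
n_basis = 15_000
unique_integrals = n_basis**4 / 8                    # permutational symmetry gives ~N^4/8
petabytes_if_stored = unique_integrals * 8 / 1e15    # 8 bytes per value, indices ignored

print(f"~{unique_integrals:.1e} unique integrals")
print(f"~{petabytes_if_stored:.0f} PB if stored naively, far too large to keep on disk, "
      "which is why the work is recomputed and spread over ~1,400 processors")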

10
Biogeochemistry: Membranes for Bioremediation
  • HPCS1: Molecular dynamics of a lipopolysaccharide (LPS)
  • HPCS2: Classical molecular dynamics of the LPS membrane
    of Pseudomonas aeruginosa and mineral
  • HPCS3: Quantum mechanical/molecular mechanics molecular
    dynamics of membrane plus mineral
11
A new trend is emerging
(Chart: projected growth trend for biology storage, log scale.)
  • With the expansion into biology, the need for
    storage has drastically increased.
  • EMSL users have stored >50 TB in the past 8
    months. More than 80% of the data is from
    experimentalists.
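A minimal sketch of the ingest arithmetic: the >50 TB over 8 months gives the average rate below, while the doubling time is purely hypothetical and is included only to show how quickly exponential growth outruns a fixed archive.

# Average ingest rate from the quoted figure, plus a hypothetical projection.
ingested_tb, months = 50, 8
avg_tb_per_month = ingested_tb / months          # ~6 TB/month so far

doubling_months = 6                              # hypothetical, not a measured value
for years in (1, 2, 3):
    rate = avg_tb_per_month * 2 ** (12 * years / doubling_months)
    print(f"after {years} year(s): ~{rate:.0f} TB/month at that doubling rate")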

12
Storage Drivers
We support three different domains with different requirements
  • High-performance computing (chemistry)
  • Low storage volumes (10 TB)
  • High-performance storage (>500 MB/s per client,
    GB/s aggregate)
  • POSIX access
  • High-throughput proteomics (biology)
  • Large storage volumes (PBs) and exploding
  • Write once, read rarely if used as an archive
  • Modest latency okay (<10 s to data)
  • If analysis could be done in place, it would
    require faster storage
  • Atmospheric Radiation Measurement (climate)
  • Modest-sized storage requirements (100s of TB)
  • Shared with the community and replicated to ORNL

13
PNNL's Lustre Implementation
  • PNNL and the ASCI Tri-Labs are currently working
    with CFS and HP to develop Lustre.
  • Lustre has been in full production since last
    August and is used for aggressive I/O from our
    supercomputer.
  • Highly stable
  • Still hard to manage
  • We are expanding our use of Lustre to act as the
    filesystem for our archival storage.
  • Deploying a 400 TB filesystem
660 MB/s from a single client with a simple dd
is faster than any local or global filesystem we
have tested (a minimal analogue of the test is sketched below).
We are finally in the era where global
filesystems provide faster access than local ones.
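For reference, here is a minimal Python analogue of that simple dd streaming-write test (roughly dd if=/dev/zero of=TARGET bs=1M count=4096); the target path and transfer size are placeholders rather than PNNL's configuration, and buffered writes plus fsync only approximate what dd with direct I/O measures.

# Rough sequential-write throughput check, analogous to the dd test quoted above.
import os
import time

def write_throughput_mb_s(path, block_mb=1, total_mb=4096):
    """Write total_mb of zeros in block_mb chunks, fsync, and return MB/s."""
    block = b"\0" * (block_mb * 1024 * 1024)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb // block_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())                 # make sure the data actually hit the filesystem
    elapsed = time.time() - start
    os.remove(path)
    return total_mb / elapsed

if __name__ == "__main__":
    # On the slide's Lustre setup a single client sustained roughly 660 MB/s
    # on a comparable streaming write; /tmp here is just a placeholder target.
    print(f"{write_throughput_mb_s('/tmp/throughput_test.bin'):.0f} MB/s")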
14
Security
  • Open computing requires a trust relationship
    between sites.
  • A user logs into site A and sshes to site B. If
    site A is compromised, the attacker has probably
    sniffed the password for site B.
  • Reaction 1: Teach users to minimize jumping
    through hosts they do not personally know are
    secure (why did the user trust site A?)
  • Reaction 2: Implement one-time passwords
    (SecureID)
  • Reaction 3: Turn off open access (Earth
    Simulator?)

15
Thoughts about one-time-passwords
  • A couple of different hurdles to cross:
  • We would like to avoid forcing our users to carry
    a different SecureID card for each site they have
    access to.
  • However, the distributed nature of security (it is
    governed by local site policy) will probably end up
    with something like this for the short term.
  • As of April 8th, the MSCF has converted to the
    PNNL SecureID system for all remote ssh logins.
  • Lots of FedExed SecureID cards

16
Summary
  • HPCS2 is running well, and the I/O capabilities of
    the system are enabling chemistry and biology
    calculations that could not be run on any other
    system in the world.
  • Storage for proteomics is on a super-exponential
    trend.
  • Lustre is great: 660 MB/s from a single client,
    and we are building a 1/2 PB single filesystem.
  • We rapidly implemented SecureID authentication
    last week.