Interdisciplinary High-Performance Computing: Renaissance Computing Institute

Slides: 20
Provided by: ncsa
Transcript and Presenter's Notes



1
Interdisciplinary High-Performance Computing
Renaissance Computing Institute
  • Dan Reed
  • Director
  • Alan Blatecky
  • Deputy Director
  • Diane Pozefsky
  • Research Scientist and Professor
  • Duke University
  • North Carolina State University
  • University of North Carolina at Chapel Hill

2
Renaissance Computing Institute
  • Vision
  • a multidisciplinary institute
  • academe, commerce and society
  • broad in scope and participation
  • from art to zoology
  • Objectives
  • enrich and empower human potential
  • communities at all levels
  • create multidisciplinary partnerships
  • science, engineering and computing
  • commerce, humanities and the arts
  • develop and deploy world-leading infrastructure
  • computing, communications and data management
  • visualization, collaboration and manufacturing

3
The Big Questions
  • Life and nature
  • structures, processes and interactions
  • Matter and universe
  • origins, structure, manipulation and futures
  • interactions, systems, and context
  • Humanity
  • creativity, socialization and community
  • Answering big questions (usually) requires
  • boldness to engage opportunities
  • expandable approaches
  • world-leading infrastructure
  • collaborations and interdisciplinary partnerships

4
How Big Is Big?
  • Every 10X brings new challenges
  • 64 processors was once considered large
  • it's now a research cluster in a closet
  • 1024 processors is today's medium size
  • 2048-8096 processors is today's large
  • we're struggling even here
  • 10K-100K processors is in sight
  • we have fundamental challenges …
  • … and no integrated research program
  • Grids bring a complementary set of challenges
  • diversity
  • unreliable communication links
  • shared data stores
  • widely varying system support
  • maintenance and software stability

Norman et al
5
Big System Reliability
  • Facing the issues
  • ASCI Q boot time is 8 hours
  • not far from the system MTTF
  • Cost of frequent application checkpoints
  • It's time to take RAS seriously
  • systems do provide warnings
  • soft bit errors
  • disk read/write retries, packet loss
  • status and health provide guidance
  • node temperature/fan duty cycles
  • Potential software and algorithmic responses
  • diagnostic-mediated checkpointing
  • domain-specific fault tolerance
  • optimal system size for minimum execution time

Source: Jack Horner and Charng-da Lu
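The trade-off between checkpoint cost, system MTTF, and system size on this slide can be sketched numerically. The following is a minimal sketch, not the presenters' method. It assumes independent node failures with exponential lifetimes, uses Young's first-order approximation for the checkpoint interval, and assumes perfect parallel speedup. All function names are hypothetical, and any numbers fed to it (node MTTF, checkpoint cost, total work) are illustrative inputs, not measurements from the talk.

```python
import math


def system_mttf_hours(node_mttf_hours: float, nodes: int) -> float:
    """System MTTF, assuming independent nodes with exponential failure times."""
    return node_mttf_hours / nodes


def young_interval_hours(checkpoint_cost_hours: float, mttf_hours: float) -> float:
    """Young's first-order estimate of the checkpoint interval: sqrt(2 * cost * MTTF)."""
    return math.sqrt(2.0 * checkpoint_cost_hours * mttf_hours)


def overhead_fraction(interval_hours: float,
                      checkpoint_cost_hours: float,
                      mttf_hours: float) -> float:
    """First-order lost-time fraction: checkpoint writes plus expected rework."""
    return (checkpoint_cost_hours / interval_hours
            + interval_hours / (2.0 * mttf_hours))


def estimated_runtime_hours(work_node_hours: float,
                            nodes: int,
                            node_mttf_hours: float,
                            checkpoint_cost_hours: float) -> float:
    """Toy runtime model: perfect speedup, inflated by checkpoint and rework overhead."""
    mttf = system_mttf_hours(node_mttf_hours, nodes)
    interval = young_interval_hours(checkpoint_cost_hours, mttf)
    overhead = overhead_fraction(interval, checkpoint_cost_hours, mttf)
    if overhead >= 1.0:
        return math.inf  # the model predicts no forward progress at this size
    return (work_node_hours / nodes) / (1.0 - overhead)
```

Calling `estimated_runtime_hours` over a range of node counts with a fixed, made-up node MTTF and checkpoint cost shows the behavior the slide hints at: beyond some size, adding processors shortens the MTTF enough that checkpoint and rework overhead outweigh the extra parallelism, so the estimated runtime has a minimum at a finite system size.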
6
Representative Research Projects
  • VGrADS
  • (Virtual Grid Application Development Software)
  • LEAD
  • (Linked Environments for Atmospheric Discovery)
  • PERC
  • (Performance Evaluation Research Center)
  • LACSI
  • (Los Alamos Computer Science Institute)
  • NCSA
  • (National Computational Science Alliance)
  • Grids, HPC, biology, atmospheric science,
    astronomy, …
  • NC Bio Portal

7
Portals and Grids
[Architecture diagram; labels: Grid Resources, OGCE Portlets with Container, Service API, Grid Service Stubs, Grid Protocols, Grid Services, OGCE Science Portal, Local Portal Services, Open Source Tools, Remote Content Servers, Remote Content Services, HTTP, Apache Jetspeed Internal Services]
8
What's a Grid?
http://
Web: Uniform access to documents
http://
Software catalogs
Grid: Flexible, high-performance access to resources and services for distributed communities
Computers
Sensors and instruments
Colleagues
Data archives
9
Science and Engineering Grids
10
Web Services

Source: Globus Team
11
Grid and Web Services
Source: Globus/IBM
12
VGrADS
  • Virtual Grid Application Development Software
  • Goals
  • simplify and accelerate the development of Grid
    applications and services
  • high levels of performance and resource
    efficiency
  • expand the community of Grid users and developers
  • Contributions
  • Introduction of virtual grids (vgrids)
  • classification of grid types
  • language to define type of grid needed
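The idea of describing the type of grid an application needs, and letting the system find matching resources, can be illustrated with a small sketch. This is not the VGrADS description language or its API. The class names, the two grid kinds ("bag" for loosely coupled nodes anywhere, "cluster" for nodes at a single site), and the memory constraint are all hypothetical, chosen only to show the pattern.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    site: str
    memory_gb: int


@dataclass(frozen=True)
class VGridRequest:
    kind: str           # hypothetical kinds: "bag" or "cluster"
    count: int          # number of nodes requested
    min_memory_gb: int  # per-node constraint


def select_nodes(request: VGridRequest, nodes: List[Node]) -> Optional[List[Node]]:
    """Return nodes satisfying the request, or None if it cannot be met."""
    eligible = [n for n in nodes if n.memory_gb >= request.min_memory_gb]
    if request.kind == "bag":
        # A bag places no constraint on where the nodes live.
        return eligible[:request.count] if len(eligible) >= request.count else None
    if request.kind == "cluster":
        # A cluster requires every node to come from the same site.
        by_site = defaultdict(list)
        for node in eligible:
            by_site[node.site].append(node)
        for site_nodes in by_site.values():
            if len(site_nodes) >= request.count:
                return site_nodes[:request.count]
        return None
    raise ValueError(f"unknown virtual grid kind: {request.kind}")
```

The application states what it needs (`VGridRequest`) rather than naming machines, which is the separation the "virtual grid" bullets describe. A real system would also rank candidates and handle resources that change over time.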

13
VGrADS
Application Targets: LEAD and Encyclopedia of Life (EOL)
14
LEAD and Cyclic Tornado Genesis
  • LEAD (large NSF project) for atmospheric science
    Grid
  • Linked Environments for Atmospheric Discovery
  • UNC, Oklahoma, Indiana, National Center for
    Atmospheric Research, Alabama, Illinois, …
  • What vertical profiles of wind, temperature and
    humidity
  • lead to multiple mesocyclones and/or
    tornadoes?

15
How will LEAD help?
  • Allows the use of analysis tools, forecast
    models, and data repositories as dynamically
    adaptive, on-demand systems that can
  • change configuration rapidly and automatically in
    response to weather
  • continually be steered by new data (i.e., the
    weather)
  • respond to decision-driven inputs from users
  • initiate other processes automatically and
  • steer remote observing technologies to optimize
    data collection for the problem at hand.

16
LEAD System
Local Resources and Services
Grid Resources and Services
Tools Sub-System
17
Two-Level Instrumentation/Monitoring
  • SvPablo for modules
  • Graphical user interface tool for
  • Source code instrumentation
  • Browsing runtime performance data
  • Autopilot for workflow
  • Sensors and actuators
  • distributed measurement and software control
  • Fuzzy logic decision procedures
  • distributed performance control
  • Standard performance daemons
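The sensor, fuzzy-logic decision, and actuator loop listed above can be sketched as follows. This is a minimal illustration, not Autopilot's API. The metric (I/O latency), the controlled parameter (a buffer size), the membership thresholds, and every name in the code are assumptions made for the example. Defuzzification uses a simple weighted average of rule outputs.

```python
from typing import Callable


def ramp_up(x: float, lo: float, hi: float) -> float:
    """Fuzzy membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def decide_buffer_mb(latency_ms: float) -> float:
    """Apply two fuzzy rules and combine them by weighted average.

    Rule 1: IF latency is low  THEN buffer is small (4 MB).
    Rule 2: IF latency is high THEN buffer is large (64 MB).
    """
    high = ramp_up(latency_ms, 10.0, 50.0)  # hypothetical thresholds
    low = 1.0 - high
    return low * 4.0 + high * 64.0  # weights sum to 1, so no normalization


class BufferActuator:
    """Hypothetical actuator: stores the setting a real system would apply."""

    def __init__(self) -> None:
        self.buffer_mb = 4.0

    def apply(self, value_mb: float) -> None:
        self.buffer_mb = value_mb


def control_step(sensor: Callable[[], float], actuator: BufferActuator) -> float:
    """One pass of the loop: read the sensor, decide, drive the actuator."""
    reading = sensor()
    actuator.apply(decide_buffer_mb(reading))
    return actuator.buffer_mb
```

Because the memberships overlap, the output changes smoothly between the two rule outcomes as the measured latency moves, rather than flipping at a single threshold.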

18
AutoPilot
19
Research Opportunities
  • Large-scale Parallel Systems
  • fault-tolerance, performance analysis, scheduling
  • I/O, applications
  • Computational and Data Grids
  • resource management, policies, applications