The IEEE CS Task Force on Cluster Computing (TFCC)

William Gropp, Mathematics and Computer Science, Argonne National Lab, www.mcs.anl.gov/~gropp. Thanks to Mark Baker.

Slides: 33

Learn more at: https://conferences.fnal.gov

1
The IEEE CS Task Force on Cluster Computing (TFCC)
William Gropp
Mathematics and Computer Science
Argonne National Lab
www.mcs.anl.gov/~gropp
Thanks to Mark Baker
University of Portsmouth, UK
http://www.dcs.port.ac.uk/mab
2
A Little History

In 1998 there was obvious huge interest in
clusters, so it seemed natural to set up a
focused group in this area.
A Cluster Computing Task Force was proposed to
the IEEE CS.
The TFCC was approved and started operating in
February 1999; it has been going for just over 2 years.

3
Proposed Activities

Act as an international forum to promote cluster
computing research and education, and participate
in setting up technical standards in this area.
Be involved with issues related to the design,
analysis and development of cluster systems as
well as the applications that use them.
Sponsor professional meetings, produce
publications, set guidelines for educational
programs, and help co-ordinate academic, funding
agency, and industry activities.
Organize events and hold a number of workshops
that would span the range of activities sponsored
by the Task Force.
Publish a bi-annual newsletter to help the
community keep abreast of activities in the field.

4
IEEE CS Task Forces

A TF is expected to have a finite term of
existence, normally a period of 2-3 years -
continued existence beyond that point is
generally not appropriate.
A TF is expected either to increase its scope
of activities such that establishment of a
Technical Committee (TC) is warranted, or to
be merged into existing TCs.
TFCC will submit an application to the CS to become
a TC later this year.

5
Why a separate TFCC?

It brings together all the activities/technologies
used with Cluster Computing into one area - so
instead of tracking four or five IEEE TCs there
is one...
Cluster Computing is NOT just Parallel,
Distributed, OSs, or the Internet, it is a mix of
them all, and consequently different.
The TFCC is an appropriate body for focusing
activities and publications associated with
Cluster Computing.

6
http://www.ieeetfcc.org
7
TFCC Mailing Lists

Currently three email lists have been set up:
tfcc-l@bucknell.edu: a discussion list open to
anyone interested in the TFCC; see the TFCC page for
info on how to subscribe.
tfcc-exe@port.ac.uk: a closed executive
committee mailing reflector.
tfcc-adv@port.ac.uk: a closed advisory
committee mailing reflector.

8
Annual Conference: ClusterXY

1st IEEE International Workshop on Cluster
Computing (Cluster 1999), Melbourne, Australia,
December 1999, about 105 attendees from 16
countries.
http://www.clustercomp.org
2nd IEEE International Conference on Cluster
Computing (Cluster 2000), Chemnitz, Germany,
November, 2000, anticipate 160 attendees.
http://www.tu-chemnitz.de/cluster2000
3rd IEEE International Conference on Cluster
Computing (Cluster 2001), Newport Beach,
California, October 8-11, 2001, expect 250-300
attendees.
http://andy.usc.edu/cluster2001

9
Associated Events - GRIDXY

1st IEEE/ACM International Workshop on Grid
Computing (Grid2000), Bangalore, India, December
17, 2000 (attendees from 15 countries).
http://www.gridcomputing.org
2nd IEEE/ACM International Workshop on Grid
Computing (Grid2001), at SC2001, November 2001

10
Supercomputing

Birds of a Feather sessions at SC99 and SC2000.
Aims of the meetings are to gather together
interested parties and bring them up to date, but
also to put together a bunch of short talks and
start a discussion on a variety of topics.
There will probably be another at SC01, depending on
community interest.

11
Other Activities

Book donation program
Cluster Computing Archive
www.ieeetfcc.org/ClusterArchive.html
TopClusters Project
www.TopClusters.org
TFCC Whitepaper
www.dcs.port.ac.uk/mab/tfcc/WhitePaper
TFCC Newsletter
www.eg.bucknell.edu/hyde/tfcc

12
TopClusters Project

http://www.TopClusters.org
TFCC collaboration with Top500 project.
Numeric, I/O, Web, Database, and Application
level benchmarking of clusters.
Joint BOF with Top500 at SC2000 on Cluster-based
benchmarking.
Ongoing effort

13
TFCC Whitepaper

A Whitepaper on Cluster Computing, submitted to
the International Journal of High-Performance
Applications and Supercomputing, November 2000
Snapshot of the state of the art of Cluster
Computing.
Preprint: www.dcs.port.ac.uk/mab/tfcc/WhitePaper/

14
TFCC Membership

Over 300 registered members
Free membership open to all, but a few benefits may
be restricted (reduced registration fee for
IEEE members)
Over 450 on the TFCC mailing list
<tfcc-l@bucknell.edu>

15
Future Plans

We plan to submit an application to the IEEE CS
Technical Activities Board (TAB) to attain full
Technical Committee status.
The TAB sees the TFCC as a success and we hope
that our application will be successful.
Obviously if we achieve TC status, we will need
the continuing assistance and help of the TFCC's
current volunteers, and will need to encourage a
bunch of new ones.

16
Summary

Successful conference series has been started,
with commercial sponsorship.
Promoting Cluster-based technologies through TFCC
sponsorship.
Helping the community with our book donation
program.
Engendering debate and discussion through mailing
forum.
Keeping the community informed with our
information-rich TFCC Web site.

17
Scalable Clusters

TopClusters.org list
26 clusters with 128 or more nodes
8 with 500 or more nodes
34 with 64-127 nodes
Most run Linux
Most dedicated to applications
Where are scalable tools developed and tested?
Caveats
Does not include MPP-like systems (IBM SP, SGI
Origin, Compaq, Intel TFLOPs, etc.)
Not a complete list
Only clusters explicitly contributed to
TopClusters.org

18
What is Scalability?

Most common definition in use
Works for n+1 nodes if it works for n, for small n
Practical definition
Operations complete fast enough
0.5 to 3 seconds for interactive
Operations are reliable
Approach to scalability must not be fragile
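
The practical definition can be made concrete with a rough cost model. The sketch below is an illustration, not something taken from the slides: it assumes every node contact costs a fixed latency (`per_node_s`, a made-up figure) and compares a serial loop over the nodes with a tree fan-out (`fanout` is likewise an assumption), checking each against the 3-second upper end of the interactive range.

```python
# Upper end of the interactive range quoted on the slide (0.5 to 3 seconds).
INTERACTIVE_LIMIT_S = 3.0


def serial_time(nodes, per_node_s):
    """Estimated time when the nodes are contacted one after another."""
    return nodes * per_node_s


def tree_depth(nodes, fanout):
    """Fan-out rounds until at least `nodes` nodes have been reached."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    depth, reached = 0, 1
    while reached < nodes:
        reached *= fanout  # conservative: counts only the newest tree level
        depth += 1
    return depth


def tree_time(nodes, per_node_s, fanout):
    """Estimated time when each reached node contacts `fanout` others in turn."""
    return tree_depth(nodes, fanout) * fanout * per_node_s


def fits_interactive(elapsed_s, limit_s=INTERACTIVE_LIMIT_S):
    """True if the estimate stays within the interactive budget."""
    return elapsed_s <= limit_s


if __name__ == "__main__":
    nodes, per_node_s, fanout = 512, 0.05, 8  # illustrative numbers only
    for label, seconds in (
        ("serial", serial_time(nodes, per_node_s)),
        ("tree", tree_time(nodes, per_node_s, fanout)),
    ):
        print(f"{label}: {seconds:.1f} s, interactive: {fits_interactive(seconds)}")
```

With these illustrative numbers the serial loop needs about 25.6 s for 512 nodes, while the tree needs three rounds and about 1.2 s, which is the difference between a tool that merely works for n+1 nodes and one that stays interactive.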

19
Issues in Clusters and Scalability

Developing and Testing Tools
Requires convenient access to large-scale system
Can this co-exist with production computing?
Too many different tools
Why not adopt Unix philosophy?
Example solution: Scalable Unix Tools
Following slides thanks to Rusty Lusk and Emil Ong

20
What Are the Scalable Unix Tools?

Parallel versions of common Unix commands like
ps, ls, cp, ..., with appropriate semantics
A few new commands in the same spirit but without
a serial counterpart
Designed for users
New this spring: release of a high-performance
implementation based on MPI
One of the original official Ptools projects
Original definition published in
Proceedings of the Scalable High Performance
Computing Conference
http://www.mcs.anl.gov/~gropp/papers/1994/shpcc-paper.ps

21
Motivation

Basic Unix commands (ls, grep, find, ...) are
quintessential tools.
Simple syntax and semantics (except maybe find
syntax)
Have same component interface (lines of text,
stdin, stdout)
Unix redirection (<, >, and especially |) allows
tools to be easily combined into powerful command
lines
Old-fashioned: no GUI, little interactivity
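
The shared component interface (lines of text in, lines of text out) is what makes these tools composable. A minimal Python sketch of the same idea follows, using generator functions as stand-ins for commands. The function names and sample data are illustrative only; they do not reproduce the real commands' options, and `grep` here is a plain substring match.

```python
def grep(pattern, lines):
    """Pass through only the lines that contain `pattern` (substring match)."""
    return (line for line in lines if pattern in line)


def head(count, lines):
    """Pass through at most the first `count` lines."""
    for index, line in enumerate(lines):
        if index >= count:
            break
        yield line


def count_lines(lines):
    """Consume the stream and return how many lines it held."""
    return sum(1 for _ in lines)


if __name__ == "__main__":
    listing = ["a.c", "b.txt", "c.c", "d.c"]  # made-up directory listing
    # Stages nest the way a shell pipeline chains: listing, then grep, then head.
    for name in head(2, grep(".c", listing)):
        print(name)
    print(count_lines(grep(".c", listing)))
```

Because every stage consumes and produces the same kind of stream, any stage can be swapped or reordered without changing the others, which is the property the parallel tools aim to preserve.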

22
Motivation, continued

Many parallel machines have Unix and at least
partially distinct file systems on each node.
A user needs simple and familiar ways to:
Copy a file to local file space on each node
Find all processes running on all nodes
Test for conditions on all nodes
Avoid getting swamped with output
On large machines these commands are not useful
unless they take advantage of parallelism in
their execution.
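
The needs listed above (test a condition on all nodes, avoid getting swamped with output, use parallelism) can be sketched in Python. This is an illustration only, not the Scalable Unix Tools implementation, which the slides describe as based on MPI. `run_on_node` is a hypothetical stand-in that fakes a per-node result locally; a real version would use whatever remote-execution mechanism the cluster provides. The sketch runs the per-node work concurrently and groups nodes that returned identical output.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_on_node(node):
    """Hypothetical stand-in for remote execution: fake a check on `node`."""
    # Assumption for the demo: node3 is the only node with a missing scratch area.
    return "missing /scratch" if node == "node3" else "ok"


def run_everywhere(nodes, task, max_workers=16):
    """Run task(node) on all nodes concurrently; return {output: [nodes]}."""
    nodes = list(nodes)  # iterated twice below, so materialize it once
    groups = defaultdict(list)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so each group keeps node order.
        for node, output in zip(nodes, pool.map(task, nodes)):
            groups[output].append(node)
    return dict(groups)


if __name__ == "__main__":
    cluster = [f"node{i}" for i in range(8)]
    for output, members in run_everywhere(cluster, run_on_node).items():
        print(f"{len(members)} node(s): {output} [{', '.join(members)}]")
```

Grouping by output means the number of lines shown scales with the number of distinct answers rather than with the number of nodes, so one misbehaving node stands out instead of being buried.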

23
Design Goals

Familiar to Unix users
Similar names (we chose pt<Unix-name>)
Same arguments, similar semantics
Interact well with traditional Unix commands,
facilitating construction of powerful command
lines
Run at interactive speeds (requires scalability
in parallel process manager startup and handling
of I/O)

24
Part I: Parallel Versions of Traditional Commands