Title: Clusters Part 1 Definition of and motivation for clusters
1 Clusters Part 1 - Definition of and motivation
for clusters
- Lars Lundberg
- The slides in this presentation cover Part 1
(Chapters 1-4) in Pfister's book
2 Introduction
- There are three ways of doing anything faster:
  - Work harder (increased processor speed)
  - Work smarter (better algorithms)
  - Get help (parallel processing)
- This course is about clusters, and they are one
way of getting help, i.e. one way of obtaining
parallel processing
3 Work harder
- Processor speed is increasing by a factor
of two every 9-18 months (depending on who you
ask)
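To get a feel for what this growth rate means, the arithmetic can be sketched as below. The function name `speedup` and the three-year horizon are illustrative assumptions; the only figures taken from the slide are the 9 and 18 month doubling periods, and the growth is idealized as perfectly exponential.

```python
def speedup(months: float, doubling_period_months: float) -> float:
    """Relative processor speed after `months`, assuming speed doubles
    every `doubling_period_months` (idealized exponential growth)."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Three years of growth under the slow (18) and fast (9) estimate.
    for period in (18, 9):
        print(f"doubling every {period} months: "
              f"x{speedup(36, period):.0f} after 3 years")
```

Under these assumptions, three years of "working harder" give a factor of somewhere between 4 and 16, depending on which doubling period you believe.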
4 Getting help
- Parallel processing occurs on many levels, e.g.,
instruction parallelism inside the processor
(superscalar).
- We are focusing on parallelism that is visible in
the program in the form of multiple processes or
threads.
- It is cost-effective to build a large computer
based on a (large) number of cheap
microprocessors.
- It is relatively easy to build the multiprocessor
hardware, but much more difficult to build good
parallel software.
- Massive multiprocessors can potentially solve
challenging problems, e.g., global weather
simulation and full system simulation of cars and
airplanes.
5 Lowly Parallel Processing
- The current market for massively parallel
multiprocessors is small, but it is increasingly
interesting to connect a small number (e.g. 2-16)
of computers in a cluster.
- There are at least two reasons for this:
  - Microprocessors are getting faster, i.e. many
    problems can be solved without the aid of
    massively parallel computers.
  - Availability (i.e. non-stop operation) is
    becoming increasingly important.
6 Availability
- It is (almost) always desirable to build systems
that will not stop working, and cluster
technology makes it possible to obtain high
availability at a reasonable cost.
- In its simplest form, cluster availability is
obtained by having two computers: one active
(primary) computer and one stand-by (secondary)
computer. If the primary computer fails, you
simply switch to the secondary computer.
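As a rough illustration of why a second computer helps, the sketch below estimates how often at least one computer is up. It is not from the slides: the function name `cluster_availability`, the 99% example figure, and above all the assumptions of independent failures and instantaneous switch-over are simplifications made here for the sake of the arithmetic.

```python
def cluster_availability(node_availability: float, nodes: int = 2) -> float:
    """Probability that at least one of `nodes` computers is up.

    Assumes failures are independent and that switching to a stand-by
    computer takes no time; real systems satisfy neither exactly.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("at least one computer is required")
    # The system is down only if every computer is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    # Hypothetical per-computer availability of 99%.
    print(f"one computer:  {cluster_availability(0.99, 1):.4%}")
    print(f"two computers: {cluster_availability(0.99, 2):.4%}")
```

With the hypothetical 99% figure, the primary/secondary pair comes out at roughly 99.99% under these assumptions.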
7 Availability continued
- Instead of having one stand-by computer that
just sits getting dusty almost all the time, we
use both, and if either fails we move all the work
to the other until the failed one is fixed.
- We have now started to do (lowly) parallel
processing across those two computers.
- In order to obtain higher availability we may
want to use more than just two computers.
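The idea of using all computers and moving the work when one fails can be sketched as a toy model. Everything here (the `Cluster` class, its method names, and the least-loaded placement rule) is a hypothetical illustration, not how any particular cluster product works; real failover also has to detect the failure and recover state, which this sketch ignores.

```python
class Cluster:
    """Toy model: every node is active; a failed node's work moves to survivors."""

    def __init__(self, node_names):
        # Maps each live node to the list of tasks it currently runs.
        self.work = {name: [] for name in node_names}

    def assign(self, task):
        """Place a task on the least loaded live node and return that node's name."""
        if not self.work:
            raise RuntimeError("no live nodes left")
        target = min(self.work, key=lambda name: len(self.work[name]))
        self.work[target].append(task)
        return target

    def fail(self, name):
        """Remove a failed node and hand its tasks to the remaining nodes.

        Raises RuntimeError if the failed node was the last one.
        """
        orphaned = self.work.pop(name)
        for task in orphaned:
            self.assign(task)

    def repair(self, name):
        """Bring a repaired node back into the cluster, initially with no work."""
        self.work.setdefault(name, [])


if __name__ == "__main__":
    cluster = Cluster(["a", "b"])
    for task in ["t1", "t2", "t3", "t4"]:
        cluster.assign(task)
    cluster.fail("a")      # "b" now carries all four tasks
    print(cluster.work)
    cluster.repair("a")    # "a" is back and will receive new work first
    print(cluster.assign("t5"))
```

The same model works unchanged with more than two nodes, in which case the orphaned tasks are spread over all survivors rather than piled onto one.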
8 Motivation for Clusters
- Based on the discussion on the previous slides
we conclude that, at least from an industrial
perspective, the primary reason for using
clusters is availability and not processing
capacity.
- However, people from academia are generally
interested in clusters because they provide
inexpensive massively parallel computers, i.e.
clusters are popular in both industry and
academia but for different reasons.
9 Cluster Example - Brewery
If one system goes
down (e.g. Manufacturing) then this task is
picked up by another system (e.g. Administration).
[Figure: Administration, Manufacturing and Distribution systems with shared disks]
10 Cluster Example - Office Environment
If the
active server fails, the work will be picked up by
the standby server. The standby server's disk is
consistent with the active server's disk at all
times.
[Figure: Standby server and Active server]
11 Cluster Example - Web server
Some popular Internet
sites need more than one server in order to
handle all incoming requests. In that case one
can send all requests to a dispatcher and let the
dispatcher distribute the load among a number of
servers. When serving the Olympics in Nagano,
IBM had this kind of configuration with 53
servers.
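A dispatcher of this kind can be sketched in a few lines. The slide does not say which distribution policy was used, so the round-robin policy below is an assumption chosen for simplicity, and the `Dispatcher` class and the server names are hypothetical.

```python
from itertools import cycle


class Dispatcher:
    """Hands each incoming request to the next server in turn (round robin)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly in the same order.
        self._next_server = cycle(servers)

    def dispatch(self, request):
        """Return a (server, request) pair naming the server chosen for this request."""
        return next(self._next_server), request


if __name__ == "__main__":
    dispatcher = Dispatcher(["server-1", "server-2", "server-3"])
    for request in ["/results", "/schedule", "/medals", "/news"]:
        server, _ = dispatcher.dispatch(request)
        print(f"{request} -> {server}")
```

Round robin ignores how busy each server actually is; a production dispatcher would typically also track load and skip servers that have failed.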
12 Cluster Example - Our Beowulf cluster
This is
the system that you will use in the laboratory
exercises.
[Figure: Internet, Front-end (king), Clients]
13 Database cluster products
The database server
processes are equivalent to the clients.
[Figure: Clients, three DB servers, Disk]
14 The standard reasons for using parallel and
distributed systems in general
- Performance (always important)
- Availability (in most cases the most important
reason for using clusters)
- Price/performance (clusters consist of standard
computers, which generally have a good
price/performance ratio)
- Incremental growth (one can incrementally extend
the system by adding more computers)
- Scaling (there is no upper limit on the number of
computers in a cluster, as opposed to the maximum
number of processors in an SMP)
- Scavenging (turn the idle time on an organization's
computers into something useful)
15 Trends that promote clusters
- Very high-performance microprocessors, i.e. the
need for massive parallelism is decreasing
- The communication technology is improving
rapidly, e.g. fiber channels, Gbit networks etc.
- Standard tools and protocols for distributed
computing, e.g. TCP/IP
- The need for high availability is increasing
16 Problems with cluster systems
- Lack of single system image software. The
important exception in parallel processing is the
SMP. This is probably a major reason why SMPs
have been relatively successful.
- Limited exploitation. Only a limited number of
software products currently support clusters.
- Consequently, the problem with clusters is not
hardware; it is software.
17 The Need for High Availability
- There are a number of reasons why availability
is becoming increasingly important:
  - The Internet: if your site is down you will lose
    customers immediately
  - Remote access from employees, i.e. people
    working from their homes and sales personnel
    downloading presentations and price lists
  - Centralized server resources and thin clients
    (reduced maintenance cost)
  - Etc.
18 Definition of a Cluster
- A cluster is a type of parallel system that
  - consists of a collection of interconnected whole
    computers,
  - and is used as a single unified computing
    resource
- A whole computer could be a uni-processor or an
SMP.
19 Clusters versus SMPs
- Clusters are composed of whole computers, SMPs
are composed of processors
- Compared to an SMP it is easier to obtain high
availability in clusters
- Compared to an SMP it is easier to incrementally
increase the size of a cluster
- SMPs are easier to maintain from a system
administrator's point of view
- On an SMP you will often get away with only one
license for your favorite software
20 Clusters versus distributed systems
- The nodes in a distributed system have their own
identity; from the outside, the nodes of a cluster are
anonymous.
- The computers in a distributed system often have
dedicated roles, e.g. servers and clients; the
computers in a cluster are usually equal.
- A cluster can be one node in a distributed system.
21 System Size
- Due to the rapid growth in processor speed,
neither parallel nor distributed systems are
particularly interesting if they cannot scale to
thousands of processors/computers.
- Clusters, on the other hand, are also interesting (for
availability reasons) for systems with ten
or fewer computers.