Distributed Processing and Networking presentation

1
Distributed Processing and Networking

Chapter 13
A Brief Overview

2
Centralized Systems v. Distributed Systems

Centralized system (uniprocessor, SMP)
shared memory, tightly coupled hardware, single
clock
user applications run on the central computer
data storage is centralized as well
users may have a terminal or low-end PC for
communication with central computing facility
some processes run locally, others on the large
centralized system.
processes communicate using the shared memory
model

3
Centralized Systems v. Distributed Systems

Distributed Systems
multiple separate, whole computers; each has
its own memory and clock
individual computers (nodes) are connected by
some kind of communications network
nodes share resources: disk storage, I/O
facilities, etc.
processes at separate nodes communicate using the
message passing model
some processes run locally, some could benefit from
being distributed across several processors.

4
Software for Distributed Systems

Distributed systems can be connected by software
in a number of different ways.
A variety of connective techniques exist, as the
following examples show:
communications architecture
network operating system (NOS)
distributed operating system (DOS)

5
Communications Architecture

Software to connect computers that are primarily
stand-alone.
The connection is designed to support
applications such as email, file transfer, etc.
Computers on the network may be heterogeneous and
have different operating systems.
TCP/IP is the most common example

6
Network Operating System (NOS)

The distributed system consists of a network of
machines, including servers
Servers support file system, printers, etc.
The NOS is an add-on to the local OS
The user is fully aware that there are separate
machines in the system.
A common communications architecture supports the
NOS
NFS and Windows NT are examples

7
Distributed Operating System (DOS)

A DOS tries to make a distributed system look
like a centralized system
Users can transparently access all system
resources as if they were local - no need to name
remote sites explicitly.
DOS must still use some kind of communications
architecture.
True distributed operating systems are still
mostly experimental.

8
Review

Centralized versus distributed architectures
Architectural differences
System software for distributed architectures
Communications software (e.g., TCP/IP)
NOS (TCP/IP plus non-transparent resource sharing)
DOS (transparent resource sharing)

9
The Need for Communication Protocols

A protocol is a formal set of rules that governs
interaction between two entities (here, processes
or computers)
Issues include:
agreement on data format and message format
a negotiation to make sure the receiver is ready
to accept a message
a routing mechanism to forward the message across
the network and numerous other details

10
Communication Protocols

Communication protocols are designed as layered
systems.
The same set of protocols exists on each machine;
communication is between peer layers on the
communicating machines.
Layers on the sender side add header information to
the message; the corresponding layer on the
receiver side uses that information.
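
The header handling on this slide can be sketched in a few lines of Python. This is only an illustration, not real TCP/IP: the layer names and the text-based "header|payload" format are assumptions made for the example.

```python
# Hypothetical layer names, top of the stack first; not real protocol headers.
LAYERS = ["application", "transport", "internet", "network-access"]

def send_down(payload: bytes) -> bytes:
    """Sender side: each layer, from the top down, puts its own header in front."""
    for layer in LAYERS:
        payload = layer.encode() + b"|" + payload
    return payload

def receive_up(frame: bytes) -> bytes:
    """Receiver side: each peer layer, from the bottom up, reads and strips its header."""
    for layer in reversed(LAYERS):
        header, _, frame = frame.partition(b"|")
        if header != layer.encode():
            raise ValueError(f"{layer} layer got unexpected header {header!r}")
    return frame

frame = send_down(b"hello")
print(frame)              # the payload wrapped in one header per layer
print(receive_up(frame))  # the original payload again
```

Each layer on the receiving side looks only at the header written by its peer, which is what "communication between peer layers" means here.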

11
Protocol Architecture

A protocol architecture describes the functions
that must be performed to support computer
communication.
The architecture structures the functions into a
set of layered modules.
The next slide shows a simplified architecture
for accomplishing a file transfer.

12
File Transfer

13
The File Transfer Protocol

In the previous slide, each module incorporates
several logical functions.
For example, the file transfer application is concerned
with things like passwords and specific file
operations.
Each module provides services to, and receives services
from, other modules in the stack.

14
TCP/IP Protocol Architecture

TCP/IP (Transmission Control Protocol/Internet
Protocol) was developed by DARPA to provide
networking for defense-related
projects.
Today, TCP/IP is the basic communication
architecture for the Internet.

15
Packet Switching

TCP/IP is an example of a packet-switching
protocol.
The original message is broken up into small
units (packets) of bounded size.
A header is added to each packet by each layer
on the sender side.
The receiver uses the header to interpret the
message.
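
A minimal sketch of splitting a message into packets and putting it back together. The 4-byte header (sequence number and packet count) and the 8-byte payload limit are assumptions chosen for the example; real IP and TCP headers carry far more fields.

```python
import struct

# Assumed header layout: sequence number and total packet count, 2 bytes each.
HEADER = struct.Struct("!HH")

def packetize(message: bytes, max_payload: int = 8) -> list[bytes]:
    """Split a message into packets, each carrying a small header."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [HEADER.pack(seq, len(chunks)) + chunk
            for seq, chunk in enumerate(chunks)]

def reassemble(packets: list[bytes]) -> bytes:
    """Use the headers to rebuild the message, whatever order the packets arrived in."""
    parts, total = {}, 0
    for packet in packets:
        seq, total = HEADER.unpack(packet[:HEADER.size])
        parts[seq] = packet[HEADER.size:]
    if len(parts) != total:
        raise ValueError("some packets are missing")
    return b"".join(parts[seq] for seq in range(total))

packets = packetize(b"distributed systems")
print(len(packets))
print(reassemble(list(reversed(packets))))
```

Because every packet names its own position, the receiver does not depend on arrival order.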

16
TCP/IP Layers

One view of TCP/IP divides it into 5 layers, from
the bottom (hardware dependent) to the top
(abstractions):
Physical layer
Network access layer
Internet layer (IP)
Transport layer (TCP)
Application layer

17
TCP/IP Layers

Physical layer: governs the physical interface between
a computer and the network.
Network access layer: handles details of data exchange
between a computer and the network. This layer is
network dependent (e.g., Ethernet versus Myrinet).

18
TCP/IP Layers

Internet (network) layer: handles routing
(point-to-point transmission) of packets.
Packets are routed from sender to receiver,
possibly through multiple steps.
Routers are special processors that connect two
networks and switch a packet from one network to
the next.
At each step the network layer handles the
transmission.

19
TCP/IP Layers

Transport (host-to-host) layer: responsible for
providing reliable transmission.
Packets are numbered and transmitted; when they
are received, they are reassembled in the
original order.
Each packet must be acknowledged. If the sender
fails to get an ack, the transport layer will
retransmit the packet.
Applications that don't want the added overhead can
use UDP instead of TCP at this level.
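
The acknowledge-and-retransmit idea can be simulated without a network. The sketch below assumes the simplest possible scheme (stop-and-wait: send one packet, wait for its ack, resend on loss); real TCP uses sliding windows and cumulative acknowledgements, so this is an illustration of the principle only. The loss rate, retry limit, and seed are made-up parameters.

```python
import random

def send_reliably(packets, loss_rate=0.3, max_tries=20, seed=1):
    """Simulate stop-and-wait delivery over a channel that drops packets."""
    rng = random.Random(seed)            # seeded so the run is repeatable
    delivered = []                       # what the receiver ends up with
    transmissions = 0
    for seq, data in enumerate(packets):
        for _ in range(max_tries):
            transmissions += 1
            if rng.random() < loss_rate:
                continue                 # packet or ack lost: time out and resend
            delivered.append((seq, data))  # receiver got it and acknowledged
            break
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")
    return delivered, transmissions

delivered, transmissions = send_reliably([b"a", b"b", b"c"])
print(delivered, transmissions)
```

The receiver ends up with every packet in order, at the cost of extra transmissions; that cost is the overhead UDP avoids.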

20
TCP/IP Layers

Application layer contains the code needed to
support network applications.
There must be a separate module for each
application.
Applications that run on TCP/IP include SMTP
(Simple Mail Transfer Protocol), FTP, and TELNET.

21
Ports

Messages sent from one host machine to another
are associated with a specific process at each
end.
The network layer only needs to know the sending and
receiving hosts, to get the message to the right
computer.
The transport layer needs to know the process
identity.
A port is associated with a particular process,
so messages are actually sent from Host X, Port Y
to Host I, Port J.

22
Sockets

Socket concept developed at Berkeley.
Every message has a source port and a destination
port.
Host IP address + port value = socket;
consequently, a socket is unique throughout the
Internet.
Sockets act as communication endpoints.
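
A small loopback example using Python's standard socket module shows the two endpoints as (IP address, port) pairs. It assumes the local machine allows TCP connections to 127.0.0.1; the one-shot echo server is a stand-in for a real service.

```python
import socket
import threading

def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever arrives."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
server_endpoint = server.getsockname() # (host, port) of the server socket

thread = threading.Thread(target=echo_once, args=(server,))
thread.start()

with socket.create_connection(server_endpoint) as client:
    client_endpoint = client.getsockname()  # (host, port) of the client socket
    client.sendall(b"ping")
    reply = client.recv(1024)

thread.join()
server.close()
print(f"{client_endpoint} -> {server_endpoint}: {reply!r}")
```

Both ends share the same host address here, so only the port numbers tell the two processes apart.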

23
Distributed Processing, Client/Server, and
Clusters

Chapter 14

24
Distributed Processing

A category of processing in which various parts
of an application may be processed at different
nodes in a network.
Location of processing will ideally be determined
by such things as load balancing and the choice
of the most appropriate platform for a task.

25
Client-Server Computing

Client-server processing is based on the
following model: client processes request
services from server processes.
Client machines are often single user systems
Server machines support multiple users (clients)
and provide specific services

26
Generic Client/Server Environment

27
Client/Server Computing

Client machines are connected to servers through
some type of communication software, probably
TCP/IP.
Applications are divided into tasks; each task
executes where it can be handled most
efficiently.
For example, the client will provide the user
interface to the system.
The server may do all or part of the processing.

28
Fat Client/Thin Client

In the fat-client model, much of the processing
is done locally on the client.
Requires high-end PCs or workstations.
Maintaining large numbers of client machines is
hard: upgrades, etc. must be applied locally to
each machine.
Thin-client model does most processing on the
server
Maintenance is centralized, therefore simpler
Client machines can be much simpler

29
(No Transcript)

30
Client/server Applications

File servers store files for a distributed
system.
Print servers allow multiple users to share a set
of printers.
Web servers provide documents and forms
Database servers store and process data for large
applications.
Name servers map domain names to IP addresses.

31
Database Applications

In business applications the database is often
the primary computing application.
The server is a database server
Interaction between client and server is in the
form of transactions
the client makes a database request and receives
a database response
Server maintains the database

32
(No Transcript)

33
Cache Consistency

Clients may cache files/data from the server in
client caches to reduce network transmission
time.
Several clients may cache the same data.
Caches are consistent if they contain exact
copies of the remote (server-based) data.
If one client updates a file, other copies are
now stale (out of date).
The cache consistency problem: how to maintain
local caches in a consistent state.

34
(No Transcript)

35
Caches in a Distributed System

Client caches and server cache are all in primary
memory; no special hardware is needed. (On the client
side, caching may also use disk.)
When a client process accesses a file:
Check local cache(s)
If not present, check server cache
If not present, retrieve from server disk (the
primary copy)
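
The lookup order above can be sketched with plain dictionaries standing in for the caches and the server disk. Filling the caches on the way back is an assumption of this sketch; the slide only gives the order of the checks.

```python
def read_file(name, client_cache, server_cache, server_disk):
    """Return (data, where_found), checking the nearest copy first."""
    if name in client_cache:
        return client_cache[name], "client cache"
    if name in server_cache:
        data, source = server_cache[name], "server cache"
    else:
        data, source = server_disk[name], "server disk"
        server_cache[name] = data        # assumed: server caches what it reads
    client_cache[name] = data            # assumed: client caches what it receives
    return data, source

disk = {"notes.txt": b"v1"}              # hypothetical primary copy
server_cache, client_a, client_b = {}, {}, {}

print(read_file("notes.txt", client_a, server_cache, disk))
print(read_file("notes.txt", client_a, server_cache, disk))
print(read_file("notes.txt", client_b, server_cache, disk))
```

The first read goes all the way to disk, the repeat is served locally, and a second client is served from the server cache.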

36
Distributed File Caches

The advantage of client-side caching is a
reduction in network access time, and a reduction
of network load.
The disadvantage is the possibility of
inconsistency.
Note that if one client modifies cached data, the
server's copy is stale and so are any other
copies cached at other clients.
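
The staleness problem is easy to reproduce with dictionaries. The second half of the sketch shows one possible remedy, writing through to the server and invalidating other cached copies; that policy is an assumption added for illustration, not something these slides prescribe, and all names are hypothetical.

```python
server = {"report": "v1"}                      # primary copy
cache_a = {"report": server["report"]}         # client A's cached copy
cache_b = {"report": server["report"]}         # client B's cached copy

cache_a["report"] = "v2"                       # A edits only its own cache
stale = [name for name, copy in (("server", server), ("B", cache_b))
         if copy["report"] != cache_a["report"]]
print(stale)                                   # the copies that are now out of date

def write_through(key, value, writer_cache, server_copy, other_caches):
    """Hypothetical policy: update the server, drop everyone else's copy."""
    writer_cache[key] = value
    server_copy[key] = value
    for cache in other_caches:
        cache.pop(key, None)                   # next read must refetch

write_through("report", "v3", cache_a, server, [cache_b])
print(server["report"], "report" in cache_b)
```

After the write-through, the server holds the new value and client B no longer has a stale copy to read.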

37
Middleware

How do clients communicate with servers from
different vendors with different APIs?
One approach: middleware, software that glues
together two different applications.
Middleware becomes another layer in the
architecture of a client/server system

38
(No Transcript)

39
Middleware

Client/server products and communication
architectures are not standardized.
Ideally, developers should be able to design
applications that use uniform methods to access
data regardless of the platform or system that
supplies the data.
Middleware provides a standard programming
interface to support this uniformity.
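
As a rough illustration of a "standard programming interface", the sketch below hides two data sources with different APIs behind one fetch() call. Both vendor classes and their method names are invented for the example and do not correspond to any real product.

```python
class VendorAStore:
    """Hypothetical vendor API: lookup(key)."""
    def __init__(self, rows):
        self._rows = rows
    def lookup(self, key):
        return self._rows[key]

class VendorBStore:
    """Hypothetical vendor API: get_record(table, key)."""
    def __init__(self, tables):
        self._tables = tables
    def get_record(self, table, key):
        return self._tables[table][key]

class Middleware:
    """Uniform access layer: callers use fetch() whatever the backend is."""
    def __init__(self):
        self._adapters = {}
    def register(self, source, adapter):
        self._adapters[source] = adapter
    def fetch(self, source, key):
        return self._adapters[source](key)

store_a = VendorAStore({"42": "Ada"})
store_b = VendorBStore({"staff": {"42": "Grace"}})

middleware = Middleware()
middleware.register("a", store_a.lookup)
middleware.register("b", lambda key: store_b.get_record("staff", key))

names = [middleware.fetch("a", "42"), middleware.fetch("b", "42")]
print(names)
```

Application code calls fetch() the same way for both sources; only the adapters know the vendor-specific calls.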

40
Middleware

A set of tools that provide a uniform way to
access system resources across all platforms
Enable programmers to use the same method to
access data, regardless of where it is located.
Example: middleware products that link a database
system to a Web server.
Users can request data from the database using
forms displayed on a browser. The Web server can
return dynamic Web pages based on the user's
requests.

41
Logical View of Middleware
(Figure labels: APIs, Platform Interfaces)

42
Middleware

There are both client and server components to
the middleware (both client and server must be
able to interact with this level)
Objective: provide uniform access to different
systems.
Examples: CORBA, SOAP, DCOM
Middleware is typically based on either message
passing or remote procedure calls.

43
Peer to Peer (P2P) Processing

P2P is an alternative to client/server
processing.
Client/server has a non-symmetric structure:
different nodes have different capabilities.
In P2P processing every node has the capability
of acting as a client or a server.
Most familiar in music-sharing services, but not
limited to that application.

44
P2P

P2P systems are less structured than
client/server.
They distribute network load more evenly across
the network, don't suffer from congestion around
server nodes, etc.
Resources from many different host machines can
be shared

45
P2P

Napster made the term popular, although strictly
speaking it did not have a true P2P structure (it
used a central server to locate resources).
P2P systems are more loosely structured than
traditional C/S: they include nodes that are only
intermittently connected to the network, are not
as reliable as managed servers, and may even be
malicious.

46
P2P

A drawback to P2P is the difficulty of locating
resources (because of the lack of centralized
servers)

47
Distributed Communication

Message passing is the only communication
technique for processes in distributed systems,
since there is no shared memory.
Remote Procedure Calls (RPC) provide an interface
to message passing, so processes can interact
using call/return semantics, as in ordinary
procedure or function calls.
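
A minimal sketch of how a call/return interface can sit on top of message passing. Two in-process queues stand in for the network, JSON is an assumed message format, and the "add" procedure is hypothetical; a real RPC system would add naming, error handling, and a transport.

```python
import json
import queue
import threading

requests, replies = queue.Queue(), queue.Queue()   # stand-ins for the network

PROCEDURES = {"add": lambda a, b: a + b}           # hypothetical remote procedures

def server_loop():
    """Server side: receive a request message, run the procedure, send the reply."""
    while True:
        message = requests.get()
        if message is None:                        # shutdown signal for the demo
            break
        call = json.loads(message)
        result = PROCEDURES[call["proc"]](*call["args"])
        replies.put(json.dumps({"result": result}))

def remote_call(proc, *args):
    """Client stub: looks like a function call, but sends and receives messages."""
    requests.put(json.dumps({"proc": proc, "args": args}))
    return json.loads(replies.get(timeout=5))["result"]

worker = threading.Thread(target=server_loop)
worker.start()
total = remote_call("add", 2, 3)
requests.put(None)
worker.join()
print(total)
```

The caller never sees the messages; it only sees remote_call() return a value, which is the point of RPC.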

48
Message Passing

Message passing was covered in Chapter 5 as a
contrast to shared memory communication between
processes.
In some systems it merely provides an alternate
communication mode (e.g., client/server operating
systems support message passing between modules).
In a distributed system there is no other choice.

49
Basic Message-Passing Primitives

50
Message Passing Review

Message passing primitives
Send (message, destination)
Receive (message, source)
In a network, TCP/IP protocols typically govern
message formats.
Messages are typically broken into smaller pieces
(packets) which are transmitted over the network
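
The two primitives can be mimicked with one mailbox per node. The node names and the in-memory queues are assumptions of this sketch; in a real network the mailboxes would be replaced by sockets and the messages by packets.

```python
import queue

# One mailbox per hypothetical node; stands in for the network.
mailboxes = {"A": queue.Queue(), "B": queue.Queue()}

def send(message, destination, source):
    """Send(message, destination): deliver into the destination's mailbox."""
    mailboxes[destination].put((source, message))

def receive(node, timeout=1.0):
    """Receive at `node`: returns (source, message), waiting up to `timeout` seconds."""
    return mailboxes[node].get(timeout=timeout)

send("hello", destination="B", source="A")
send("hi back", destination="A", source="B")
print(receive("B"))
print(receive("A"))
```

Each node reads only its own mailbox, so the two processes share no memory beyond the message channel itself.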

51
Design Issues for Messages

Reliability versus unreliability
Reliable message passing guarantees that the
message will be received
Reliable message passing usually relies on a
reliable communication protocol, such as TCP
Unreliable message passing doesn't (which makes it faster)
However, since network communication is subject
to failure, results aren't guaranteed.

52
Design Issues for Messages

Blocking versus Nonblocking
Nonblocking (asynchronous) primitives return
control as soon as the OS has processed the
command
Sender regains control when the message has been
copied to kernel buffer (or queued for
transmission)
Blocking (synchronous)
Sender blocks until message has been sent
(unreliable) or acknowledged (reliable)
Receiver blocks until a message is received.
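
The difference is easiest to see on the receiving side. The sketch uses Python's queue.Queue as an assumed stand-in for a kernel message buffer; the function names are made up for the example.

```python
import queue

mailbox = queue.Queue()   # stand-in for a kernel message buffer

def receive_nonblocking(box):
    """Return a message if one is waiting, otherwise None, without waiting."""
    try:
        return box.get_nowait()
    except queue.Empty:
        return None

def receive_blocking(box, timeout=None):
    """Wait until a message arrives (raises queue.Empty if `timeout` expires)."""
    return box.get(timeout=timeout)

first = receive_nonblocking(mailbox)       # nothing has been sent yet
mailbox.put("job done")                    # returns at once: message is buffered
second = receive_blocking(mailbox, timeout=1.0)
print(first, second)
```

The put() call also behaves like a nonblocking send: the sender continues as soon as the message has been copied into the buffer.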

53
Remote Procedure Calls (RPC)

Allow programs on different machines to interact
using simple procedure call/return semantics
Widely accepted
Standardized
Client and server modules can be moved among
computers and operating systems easily

54
(No Transcript)

55
Synchronous vs. Asynchronous

Synchronous RPC
Behaves much like a subroutine call
Caller must wait for results before proceeding
Asynchronous RPC
Does not block the caller
Enables a client execution to proceed locally in
parallel with server invocation
Suitable for some applications (e.g., don't wait
for a print server)
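
The two calling styles can be compared with a thread pool from Python's standard library. The print_job function is a hypothetical stand-in for a remote print server, and the short sleep imitates network and printing delay.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def print_job(document):
    """Stand-in for a remote print server call."""
    time.sleep(0.1)
    return f"printed {document}"

# Synchronous: the caller waits for the result before doing anything else.
sync_result = print_job("report.txt")

# Asynchronous: the call is issued, the caller keeps working, and the
# result is collected later.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(print_job, "slides.txt")
    local_work = sum(range(1000))          # client work done while the call runs
    async_result = future.result()         # block only when the result is needed

print(sync_result, local_work, async_result)
```

In the asynchronous case the client's own work overlaps with the server's, which is the gain the slide describes.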

56
Cluster Computing

Alternative to symmetric multiprocessing (SMP)
Group of interconnected computers working
together as a unified computing resource
Illusion is one machine (ideally)
Individual nodes in a cluster may, themselves, be
multiprocessors.

57
Benefits of Clustering

Absolute scalability: a cluster can have much more
computing power than any standalone machine.
Incremental scalability: cluster size can
increase as needs increase; small clusters can
grow.
High availability: if one node fails, the others
can continue to process (fault tolerance).
Superior price/performance: a cluster can be built
more cheaply than a multiprocessor of equivalent
power.

58
Applications

Cluster servers
Provide redundancy for fault tolerance
Partition workload across several servers
Server clusters can share large RAID disk
clusters and/or have private disks
Parallel programming: large scientific or
engineering applications require huge amounts of
processing power

59
Cluster Computer Architecture

Machines in a cluster are generally connected by
a high-speed network which may or may not be
connected to the outside world
Each node runs its own OS, but also shares
software (middleware) to support internode
communication and interoperability.

60
Cluster Computer Architecture

61
Cluster Computing

The middleware layer may not provide full
transparency.
Many parallel programming applications are
structured using PVM or MPI (message passing
packages) as the support structure for managing
parallel operations.

62
Beowulf Clusters

A Beowulf cluster can be homemade; it is
characterized by being composed of off-the-shelf
components, both hardware and OS software.
For example, PCs running Linux
