Introduction to Distributed Systems - PowerPoint PPT Presentation

Provided by: CIT788

1
Introduction to Distributed Systems
  • What is a Distributed System?
  • Why do we want to have distributed systems?
  • Different Types of Distributed Computer Systems
  • Middleware: DOS Vs. NOS
  • Some Example Applications
  • Resource Sharing in a Distributed System
  • Challenges and Problems
  • New Developments in Distributed Systems

2
  • What is a distributed system?
  • A definition based on what we want to have
  • From Single Processor System ->
  • Distributed System Middleware
  • What is middleware?
  • A definition based on what we want to have and
    the current situation

3
What is a Distributed System?
  • What is a distributed system?
  • A system which is distributed
  • What is a system? Multiple components, connected,
    defined functions, etc
  • What is a centralized system? Standalone
    computer???
  • Centralized Vs. distributed (performance and
    structures)
  • What is the meaning of distributed? Separated +
    connected?
  • What are the implications (problems) of being
    distributed? Many!
  • A distributed system is a collection of
    independent computers that appears to its users
    as a single computer (physically distributed but
    logically centralized) (abstraction)
  • - Tanenbaum 2002
  • A distributed system is one in which components
    located at networked computers communicate and
    coordinate their actions only by passing messages
    (components) (connection)
  • - Dollimore et al. 2005

4
Abstraction Vs. Connection
  • Which definition is better?
  • A. Tanenbaum (functionally single but physically
    distributed)
  • B. Dollimore (physically distributed) (A or B?)
  • Why do we have these two definitions?
  • Any more? Any suggestions?
  • What are the differences between these two
    definitions in:
  • System development and implementation
  • Services provided
  • Performance

5
Why do we want to have distributed systems?
  • Two basic reasons for going distributed
  • Performance reasons
  • Reduce response time (better performance)
  • Distributed systems give better performance
    (normally)
  • More processing units, larger memory, more data
    for processing
  • Performance tradeoffs (security, reliability, ...)
  • Resource sharing
  • More resources (data, hardware, memory, computing
    units, ...) for sharing across the networks
  • E.g., sharing of printers, memory, disk
    storage, CPUs, etc.

6
Different Types of Computer Systems
  • What is a Computer (Computing system)?
  • Hardware + software (system boundary)
  • Functionally
  • A machine that can perform computation
  • What is the meaning of computation or
    compute?
  • Structurally
  • A specially designed machine: a CPU, memory
    devices, I/O devices, etc.
  • Mostly, a computer can be used for multiple
    (general) purposes (loading different programs to
    execute for different purposes)
  • Do we have computers for specific purposes? Yes?
    No?
  • Hardware
  • Single processor, multiple processors, multiple
    computers, loosely coupled, tightly coupled
    hardware
  • Software (supported by OS + applications)
  • Single process, multiple processes, concurrent
    processes

7
Performance Issues
  • In the old days, a computer had a single CPU and
    processed jobs sequentially, one by one, without
    interleaving
  • How to improve the performance of a computer
    system?
  • More and better (faster) hardware, operating
    environment, efficient coding, ...
  • How to measure the performance of a computer
    system?
  • E.g., response time, throughput (number of jobs
    completed per unit time), utilization
  • Limitations of machines: adding more resources
    (faster CPU, more CPUs, more memory, ...)
  • Performance is limited by the bottleneck resource
    when resources are used sequentially
  • Limitations of the operating environment:
    concurrent execution of processes may not be
    allowed or may be limited (only one job can be
    served at a time although multiple jobs may be
    active)
  • What are other considerations in addition to
    these performance measures? Reliability,
    security, availability, ...

8
Single Thread and Single CPU
What is the meaning of a thread here? A single
active process
$U = S / A$, $Q = U / (1 - U)$, $R = QS + S$
U: utilization, S: service time, A: inter-arrival
time, Q: queue length, R: response time
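The relations on this slide can be tried out with a few numbers. The sketch below reads them as U = S / A, Q = U / (1 - U) and R = QS + S for a single CPU in steady state, and takes R to be the response time; the function name and the sample values are illustrative assumptions, not part of the slides.

```python
def single_server_metrics(service_time, inter_arrival_time):
    """Return (U, Q, R) for a single-CPU queue, using the slide's relations."""
    u = service_time / inter_arrival_time      # U = S / A
    if u >= 1.0:
        raise ValueError("utilization must be below 1 for a stable queue")
    q = u / (1.0 - u)                          # Q = U / (1 - U)
    r = q * service_time + service_time        # R = QS + S
    return u, q, r


# Hypothetical workload: each job needs 0.5 s and one job arrives every 1 s.
u, q, r = single_server_metrics(service_time=0.5, inter_arrival_time=1.0)
print(u, q, r)
```

With these numbers the CPU is half busy and a job spends about twice its service time in the system. Pushing the service time towards the inter-arrival time makes Q, and therefore R, grow without bound.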
9
Multiple Threads and Single CPU
[Figure: a single CPU executing Process A, with
Processes A to E shown; the CPU switches among
Processes A, B and C (multiple active processes)]
How to determine the switching order and the
number of active processes?
What is the main benefit of having multiple
threads? If Process A is suspended, e.g., while
waiting for input data, the CPU may execute
Process B.
What are the overheads? Context switching.
10
Concurrent Processes: Multiple Threads and
Multiple CPUs
[Figure: CPU 1, CPU 2 and CPU 3, with Processes A
to E shown]
Processes A, B and C are executed concurrently.
Shorter response time (waiting time).
What is ignored in this figure? Data structure +
algorithm (modeling of the application environment
in the computer's virtual environment)
11
Different Types of Computer Systems
  • Centralized Computer Systems
  • Computing units are physically located at the
    same site
  • Note: the users may be distributed
  • No network delay in processing
    (communication/memory data access delay is
    close to 0)
  • What are the implications? Timing and
    synchronization are easier
  • Single processor or multiprocessors

12
Centralized Computer System
[Figure: users submit requests over a network
through a simple user interface; the computing
units (may be multi-processor and multithreading)
return the results]
13
Centralized Computer Systems
  • Performance problems of centralized computer
    systems
  • All requests (jobs) are performed at a
    centralized site
  • Workload at the site could be very heavy
    (overloaded) (unpredictable performance)
  • $Q = U / (1 - U)$
  • Transmission delay of job requests and results to
    and from the originating site and the centralized
    site (requests may be from remote users)
  • Scalability problem
  • Management of a very large amount of data (e.g.,
    a large database)
  • E.g., making a phone call requires location
    management of millions of mobile users (searching
    a tree)
  • Reliability problem: single point of failure
  • Price/performance: a powerful machine (mainframe,
    millions of HKD) Vs. several cheap machines (PCs,
    a few thousand HKD)

14
Multicomputer/Multiprocessor Systems
  • Multiprocessors aim to resolve the performance
    problems of centralized computer systems (CCS),
    i.e., to shorten response time and raise
    throughput
  • Note: the fact that a single processor completes
    a program in 10 sec does not mean that two
    processors can finish it in 5 sec (why?)
  • Different architectures of
    multiprocessor/multicomputer systems
  • Different degrees of sharing of hardware
    resources
  • Varieties in machine architecture and operating
    environment of different machines (how to
    organize the processors)
  • Multiprocessor system (tightly coupled)
  • All processors map to the same memory address
    space
  • Multicomputer system (loosely coupled)
  • Each processor (computer) has its own private
    memory
  • Heterogeneous and homogeneous
  • How to achieve the sharing?
  • Needs the redesign of the whole architecture of
    the processors and computer, and the support of
    the operating systems
  • What are the functions of an operating system?
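One common answer to the "(why?)" about the 10 sec program on this slide is Amdahl's law, which the slides do not name: only the part of a program that can run in parallel gets faster when processors are added. The sketch below assumes a hypothetical program in which 80% of the work is parallelizable and ignores communication overhead, which would make things worse.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a fixed-size program."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def run_time(single_cpu_seconds, parallel_fraction, processors):
    """Predicted run time on the given number of processors."""
    return single_cpu_seconds / amdahl_speedup(parallel_fraction, processors)


# Assumed: 10 s on one processor, 80% of the work can run in parallel.
print(run_time(10.0, 0.8, 2))
```

Under these assumptions two processors need about 6 seconds, not 5. Only a program that is 100% parallel reaches the ideal 5 seconds.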

15
Shared-Memory Architecture
16
Shared-Disk Architecture
17
Shared-Nothing Architecture
Are they distributed systems? Structurally YES
18
[Figure labels: single thread, single processor
system; multiple threads, single processor system;
multiple threads, multiple processors system;
tightly coupled system; loosely coupled system;
distributed computers]
19
What is distributed?
  • Hardware resources
  • Software resources (various types of services)
  • What is software? Specifying the functions to be
    performed, normally in steps
  • How to divide a single software program into
    several software programs to be executed by
    different computing units?
  • How to implement an algorithm as distributed
    processes? E.g., a searching algorithm becomes a
    distributed searching algorithm
  • Data
  • E.g., a large database is partitioned into
    several fragments to be maintained by different
    local database systems
  • How to process the distributed data? E.g., a
    SELECT statement that accesses a distributed
    database.
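The question of how a SELECT reaches partitioned data can be sketched as a scatter-gather query: the same condition is sent to every fragment and the partial results are merged. The site names, the table contents and the helper functions below are all hypothetical, and the "sites" are plain in-memory lists rather than real database systems.

```python
# Hypothetical fragments of one logical "accounts" table, each held by a
# different local database (here just lists of dicts).
FRAGMENTS = {
    "site_a": [{"id": 1, "balance": 50}, {"id": 2, "balance": 900}],
    "site_b": [{"id": 3, "balance": 1200}],
    "site_c": [{"id": 4, "balance": 10}, {"id": 5, "balance": 700}],
}


def local_select(rows, predicate):
    """What each site runs on its own fragment."""
    return [row for row in rows if predicate(row)]


def distributed_select(fragments, predicate):
    """Send the same predicate to every site and merge the partial results."""
    merged = []
    for site in sorted(fragments):      # fixed order, for repeatable output
        merged.extend(local_select(fragments[site], predicate))
    return merged


rich = distributed_select(FRAGMENTS, lambda row: row["balance"] >= 700)
print([row["id"] for row in rich])
```

A real system would send the sub-queries in parallel and would have to handle a site that does not answer; both are left out of this sketch.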

20
Operating Systems
  • How to connect the different machines together?
  • What are the tasks of an OS?
  • Distributed operating systems (DOS)
  • Network operating systems (NOS)
  • Middleware

21
Distributed Systems Services
  • DOS
  • An operating system for distributed computers
  • Not intended for independent computers (which
    can join and leave independently)
  • The computers have a high degree of coupling and
    similarity in structure, architecture and
    operating environment
  • NOS
  • An operating system for loosely connected
    computers that could be very different in
    structure, architecture and operating environment
  • Not intended to provide the view of a single
    coherent system
  • Add an additional layer (middleware) to achieve
    the two objectives
  • To hide the heterogeneity (differences) and
    provide a high degree of transparency

22
  • Why does DOS have these limitations, such as a
    high degree of coupling and no support for
    independent or heterogeneous computers?
  • If you are asked to design a new OS, will you
    choose to build a new OS which is a DOS or use a
    NOS and add a new layer as middleware?

23
Network Operating Systems
  • In principle, there are NO distributed operating
    systems (DOS). Why?
  • A DOS is an operating system that produces a
    single system image for all the resources in a
    distributed system
  • The DOS has total control over all the nodes in
    the system and it transparently locates new
    processes and resources at whatever node suits
    its scheduling policies
  • Examples of NOS: Unix and Windows
  • They provide networking capability and can
    access remote resources
  • Nodes in a NOS retain autonomy in managing their
    own resources. A process that creates a child
    process on another machine has no control over
    that child process

24
Middleware Positioning
  • A distributed system organized as middleware on
    top of a network operating system to hide the
    heterogeneity of the underlying platform from the
    applications
  • The middleware layer extends over multiple
    machines
  • Applications become operating system independent
    but middleware dependent
  • The primary function of the middleware is to
    provide various types of transparency services
    (What is the meaning of transparency?
    Transparent to whom? What are the benefits?)
  • To the user program, the machines are logically
    a single machine (Why?)
  • Each local operating system forming a part of the
    entire network operating system provides local
    resource management

25
Middleware Positioning
NOS
26
Transparencies Provided by Middleware
Different forms of transparency in a distributed
system
27
Middleware Services
In an open middleware-based distributed system,
the protocols used by each middleware layer
should be the same, as well as the interfaces
they offer to applications.
28
A Comparison of Different Architectures
29
Middleware Services
  • Some common services from middleware
  • Distributed file systems (accessing a remote file
    like accessing a local file)
  • Remote procedure calls (RPC) (calling a procedure
    supported by a remote node is similar to calling
    a local procedure)
  • Distributed objects
  • Distributed documents
  • High-level communication facilities that hide
    the low-level message passing
  • Naming services that allow the lookup of remote
    entities
  • Persistent storage of data
  • Distributed transaction management
  • Security
  • Note: many of them are resource management jobs
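The RPC idea in this list, a remote call that looks like a local one while the middleware hides the message passing, can be sketched with a client stub. This is a toy illustration, not any particular RPC library: the class and function names are made up, JSON is an arbitrary choice of message format, and the "network" is a direct in-process function call.

```python
import json


def add(a, b):
    return a + b


# Server side: procedures that remote callers are allowed to invoke.
PROCEDURES = {"add": add}


def server_handle(request_bytes):
    """Unpack a request message, run the procedure, pack the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["name"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Looks like a local object; every call becomes a request message."""

    def __init__(self, transport):
        self._transport = transport     # callable: request bytes -> reply bytes

    def __getattr__(self, name):
        def remote_call(*args):
            message = json.dumps({"name": name, "args": list(args)})
            reply = self._transport(message.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call


# Assumption: the transport is an in-process stand-in for a real network link.
stub = ClientStub(transport=server_handle)
print(stub.add(2, 3))
```

The caller writes `stub.add(2, 3)` exactly as it would for a local function. Swapping the transport for a socket connection would not change the calling code, which is the access transparency the middleware is meant to provide.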

30
  • What is a distributed system?
  • By connecting existing Computer Systems
  • A definition based on the existing
    architecture/structure

31
Distributed Systems: Concepts of Networked
Computers
  • Components => processes (communicating processes)
  • Networked computers => connected (loosely
    coupled) computers for sharing of resources
  • Networked computers
  • Similar to loosely coupled hardware
  • Spatially (physically) separated
  • Communication delays are long and unpredictable
    => how to choose the time-out (in case of
    network failure, worst-case estimation)?
  • Concurrent execution of processes is common
    (concurrency) => performance
  • No global clocks
  • Coordinating processes at different networked
    computers
  • What are the problems of lacking a global clock?
  • What is the main function of a global clock?
    Event sequencing
  • Independent failures
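Event sequencing without a global clock is often handled with logical clocks. The sketch below uses Lamport clocks, a well-known technique that these slides do not mention, so it is offered only as one possible answer; the class and method names are illustrative.

```python
class LamportClock:
    """Logical clock: a counter that orders events without physical time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Record a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Return the timestamp to attach to an outgoing message."""
        return self.tick()

    def receive(self, message_time):
        """Move past both the local time and the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                        # a is now at 1
stamp = a.send()                # a is now at 2; the message carries 2
b.tick()                        # b is now at 1
received = b.receive(stamp)     # b jumps to max(1, 2) + 1 = 3
print(stamp, received)
```

The receive event always gets a larger timestamp than the matching send, so cause precedes effect in the resulting order even though the two machines never compare wall-clock time.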

32
Examples of Distributed Systems
  • The Internet
  • Variety: a large number of different types of
    networked computers connected using a set of
    standard communication protocols
  • Mostly for sharing information and resources
  • A lot of read requests
  • Use the same interfaces and protocols to access
    remote resources
  • Intranets
  • A portion of the Internet that is separately
    administered and has a boundary
  • Configured with a local security policy
  • Connected to the Internet through a router and
    protected by a firewall
  • A firewall filters incoming and outgoing messages
  • Mobile computing and ubiquitous computing
  • Mobile computing provides computing services
    while the application is moving (also called
    nomadic computing)
  • Ubiquitous computing provides computing services
    everywhere (smart spaces) (also called pervasive
    computing)

33
Examples of Distributed Systems
A typical portion of the Internet
34
Examples of Distributed Systems
A typical intranet
35
Examples of Distributed Systems
Portable and handheld devices in a distributed
system
36
Examples of Distributed Systems
  • Note: computers are NOT just Internet computers
  • What are the applications of computer systems?
  • Personal, commercial, government and many others
  • Computers are not just for entertainment (e.g.,
    playing games, chatting with people); there are
    many other applications such as flight control,
    weather forecasting, stock trading, ...
  • Many of these applications are distributed in
    nature, e.g., stock trading systems and ticket
    booking systems
  • Our real world => virtual world in computers
  • Our world is distributed. We are the computing
    unit. Our brain is the memory unit and we have
    communication facilities
  • They are better supported by a distributed
    architecture than by a centralized
    architecture
  • Distributed users, distributed data and
    distributed resources
  • We used a single computer in the past mainly
    because building distributed computer systems
    was expensive

37
Some Benefits of Distributed Systems
  • Price/performance
  • Computers were expensive in the past
  • Easier to manage a centralized computer system
  • A cost-effective way to build a larger system
    (higher performance) is to use several cheap CPUs
    or to connect existing small computers to
    form a large system
  • Reliability
  • If one machine crashes, the system as a whole can
    still survive
  • What are the different types of failures?
    Different degrees of reliability => some
    functions fail, multiple components provide
    the same function
  • Nature of some applications
  • Some applications are inherently distributed
    (e.g. banking and supermarket chain)
  • Some applications are moving (Examples? Why?)

38
Example
39
Some Benefits of Distributed Systems
  • Communication
  • It provides communication facilities (i.e., the
    same communication protocol)
  • Sending emails and transmitting documents to
    different users
  • Flexibility
  • Load balancing
  • It spreads the workload over the available
    machines in the most cost-effective way
  • Dynamic workload management (performance Vs.
    workload)
  • Performance => response time
  • Given a workload, under what conditions is the
    response time the smallest?
  • Different nodes have similar utilization =>
    minimum response time
  • Note: these two are not benefits (Why?)
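The claim that similar utilization across nodes gives the minimum response time can be checked numerically. The sketch below assumes two identical nodes, each behaving like the single-CPU queue from the earlier slide, where Q = U / (1 - U) and R = QS + S combine to R = S / (1 - U). The function names and the utilization values are hypothetical.

```python
def response_time(service_time, utilization):
    """R = S / (1 - U), obtained from Q = U / (1 - U) and R = QS + S."""
    return service_time / (1.0 - utilization)


def mean_response_time(service_time, utilizations):
    """Average over all jobs; a node at utilization U handles a share of
    the jobs proportional to U (assumes identical nodes)."""
    total = sum(utilizations)
    return sum(u / total * response_time(service_time, u) for u in utilizations)


# Same total workload in both cases, spread differently over two nodes.
balanced = mean_response_time(1.0, [0.5, 0.5])
skewed = mean_response_time(1.0, [0.9, 0.1])
print(balanced, skewed)
```

The balanced split gives a mean response time of 2 service times, while the skewed split gives roughly 9, because most jobs land on the node that is close to saturation.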

40
Resource Sharing in a Distributed System
  • Many physical resources are distributed in nature
    (devices)
  • The sources for generating soft resources
    (information/data) are also distributed in nature
  • E.g., weather, news, sport results, ticket
    information, etc.
  • A natural trend to share resources
  • Data and software sharing
  • It allows many users to access a remote database
    or even download a program for local execution
  • Device sharing
  • It allows many users to share expensive
    peripherals
  • E.g., printers and other peripherals
  • Computation power
  • Computation may be performed by a remote computer
  • Incremental growth and scalability
  • Computing power can be added in small increments

41
Resource Sharing in a Distributed System
  • Note: resource sharing is NOT always good
  • Why would you want to share resources with
    other users?
  • Although you can access other users' resources,
    you also need to make your own resources
    available for other users to access
  • If you have all the resources you require, what
    do you want? Sharing? No sharing?
  • What are the problems associated with resource
    sharing?
  • Security, management problems, access problems,
    reliability, ...

42
Resource Sharing in a Distributed System
  • How to access remote resources? Through a
    resource manager
  • What is a resource manager?
  • A program that offers a communication interface
    enabling the resource to be accessed, manipulated
    and updated reliably and consistently
  • What should the resource manager do?
  • Provide resource name (naming services)
  • Identify resource location (distributed directory
    management)
  • Map resource name to communication address
    (naming directory management)
  • Coordinate concurrent accesses to ensure
    consistency (correctness)
  • Different scales in sharing
  • Internet and computer-supported cooperative
    working (CSCW)
  • Resource encapsulation (security)
  • Only the resource manager can access the resource
  • Other users send requests to the resource manager
    using a standard interface and protocol
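The resource manager described on this slide can be sketched as an object that owns its resources, looks them up by name, and serializes concurrent requests. The class name, the operations and the "printer_quota" resource are illustrative assumptions; a real manager would receive its requests over a network protocol rather than through direct method calls.

```python
import threading


class ResourceManager:
    """Sole owner of its resources; clients must go through request()."""

    def __init__(self):
        self._resources = {}              # name -> value (encapsulated)
        self._lock = threading.Lock()     # coordinates concurrent accesses

    def register(self, name, initial_value):
        with self._lock:
            self._resources[name] = initial_value

    def request(self, name, operation, amount=0):
        """Single entry point: look the resource up by name, then act on it."""
        with self._lock:
            if name not in self._resources:
                raise KeyError(f"unknown resource: {name}")
            if operation == "read":
                return self._resources[name]
            if operation == "add":
                self._resources[name] += amount
                return self._resources[name]
            raise ValueError(f"unsupported operation: {operation}")


manager = ResourceManager()
manager.register("printer_quota", 100)
manager.request("printer_quota", "add", -30)
print(manager.request("printer_quota", "read"))
```

Because every access passes through `request`, the manager is the only place where naming, access checks and consistency have to be enforced.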

43
Example: Association (Group Management)
  • Multiple objects
  • Multiple objects co-exist in a distributed
    environment. Some of them are service providers
    and the others are users
  • Association: at least one of a given pair of
    components communicates with the other within the
    system (to cooperatively perform a task or
    provide services)
  • After association => interoperation: the
    interaction during association
  • Association is spontaneous (without user
    intervention)
  • Network bootstrapping
  • Communication takes place over a local network
    within the system
  • The device acquires an address (ID and a name) on
    the local network
  • Who determines the assignment and manages the
    network?

44
Centralized Vs. Distributed Management
  • Management (algorithm) => centralized OR
    distributed
  • Centralized approach: use a powerful server to
    manage the space status and connection
    information
  • Distributed approach: multiple devices (service
    providers) manage the information
  • Comparisons
  • Problems in distributed computing
  • Perform operations at device level because of
    limited bandwidth
  • Due to the dynamic properties of the objects,
    many updates need to be generated
  • A distributed approach can make the management of
    objects localized and adaptive to the changing
    system status (in-network processing). But the
    communication overhead could be very heavy
  • A hierarchical approach: multiple levels with
    different levels of coordinators may be used

45
Example: Jini's Discovery System
  • Java-based system for mobile and pervasive
    computing systems
  • Components: lookup services (discovery services),
    Jini services and Jini clients
  • A Jini service provides services
  • The lookup service stores services
  • Jini clients request services
  • Lookup service allows Jini services to register
    the services they offer
  • A Jini service may be registered with one or more
    lookup services
  • Jini clients request services that match their
    requirements
  • If a match is found, the Jini client downloads an
    object that provides access to the service from
    the lookup service

46
Example: Jini's Discovery System
  • When a Jini client or service starts up, it sends
    a request to a well-known IP multicast address
  • Any lookup service that receives the request
    sends its address enabling the requester to
    perform a remote invocation to look up or
    register a service with it
  • Example: the client requires a lookup service in
    the finance group, so it multicasts a request
    with that group name
  • Only one lookup service is bound to the group
    name, and that service responds with its address
  • The client communicates directly with it using
    RMI to locate all services of type printing
  • Only one printing service has registered with the
    lookup service
  • The client then uses the printing service
    directly
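The discovery steps in this example can be mimicked in a few lines. The sketch below is plain Python and does not use the Jini API: the class, the `discover` function and the lambda standing in for the downloaded service object are all hypothetical, and the multicast request is replaced by a loop over known lookup services.

```python
class LookupService:
    """Registry of services, bound to one or more group names."""

    def __init__(self, groups):
        self.groups = set(groups)
        self._registry = {}               # service type -> list of proxies

    def register(self, service_type, proxy):
        self._registry.setdefault(service_type, []).append(proxy)

    def lookup(self, service_type):
        return list(self._registry.get(service_type, []))


def discover(lookup_services, group):
    """Stand-in for the multicast request: only lookup services bound to
    the requested group answer."""
    return [ls for ls in lookup_services if group in ls.groups]


admin_lookup = LookupService(groups=["admin"])
finance_lookup = LookupService(groups=["admin", "finance"])
finance_lookup.register("printing", lambda doc: f"printed {doc}")

found = discover([admin_lookup, finance_lookup], "finance")   # discovery
printers = found[0].lookup("printing")                        # request printing
print(len(found), printers[0]("report.txt"))                  # use the service
```

As in the slide, exactly one lookup service answers for the finance group and it returns the single registered printing service, which the client then calls directly.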

47
Service Discovery in Jini
[Figure: a network connecting clients, lookup
services, printing services and a corporate
infoservice, labelled with the groups admin and
finance. Steps shown:
1. finance lookup service
2. Here I am .....
3. Request printing
4. Use printing service]
48
  • How to satisfy the definitions (requirements) of
    a distributed system?
  • Tanenbaum's requirements and others
  • Challenges???

49
Challenges of Distributed Systems
  • Heterogeneity
  • Openness
  • Security
  • Scalability
  • Failure handling
  • Concurrency
  • Transparency

50
Heterogeneity
  • One of the most important aims of the middleware
    is to hide the differences in underlying systems
  • Applications access remote objects and resources
    in a standard way (interface and protocols), as
    if they were managed locally
  • Heterogeneity Vs. Transparency
  • Differences in
  • Networks (LAN, WAN, wireless LAN, GSM, etc.)
  • Computer hardware (different types of CPUs and
    machines)
  • Operating systems (Unix, Windows, WinCE, etc.)
  • Programming languages (C, Java, C++, etc.)
  • Implementations by different developers
  • Standardization: although the Internet consists
    of different types of networks, all the
    communications use the same set of Internet
    protocols

51
Openness
  • Expandability: the characteristic that
    determines whether the system can be extended in
    various ways and connected to other systems
    (interoperability)
  • New users can join the Internet at any time
  • New resources can be added and be made available
    for use
  • Portability: an existing application developed
    for a specific distributed system can be moved
    to work in another distributed system
  • Standardization of interface for accessing the
    resources
  • Flexibility: a distributed system should be
    easily configured (reconfigured) even when the
    system components are from different developers
  • Need to provide definitions not only for the
    high-level interface but also for interfaces to
    internal parts of the system, and describe how
    those parts interact
  • Monolithic systems tend to be closed

52
Security
  • Security for information has three components:
  • Confidentiality: protection against disclosure to
    unauthorized individuals
  • Integrity: protection against alteration or
    corruption (correctness)
  • Availability: protection against interference
    with the means to access the resources
  • Specification of what services and resources are
    provided to each user or each group of users
    (levels of access and authority)
  • Methods (encryption and decryption) for encoding
    the messages transmitted over the network
  • Identification of the right users
  • Denial of service
  • A user may wish to disrupt the service by
    bombarding it with a large number of
    pointless requests
  • Security of mobile code
  • E.g., receiving an executable program as an
    email attachment

53
Concurrency => Consistency
  • Processes access the same resources (or
    different resources) at the same time
  • The server serves the processes concurrently
    (why?)
  • Parallel executions occur for two reasons
  • Many users simultaneously invoke commands or
    interact with application programs
  • Many server processes run concurrently, each
    responding to different requests from client
    processes
  • Higher concurrency in general implies better
    performance (shorter waiting time for services)
  • In a distributed system with M computers, up to M
    processes can execute in parallel
  • However, this may not be true in many cases
    (why?)
  • Two processes may each alter resources that
    will be used by the other
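The last point, that concurrent processes may alter a resource the other one is using, is the classic lost-update problem. The sketch below replays one bad interleaving by hand instead of using real threads, so the outcome is repeatable; the account balance and the deposit amounts are made-up values.

```python
def interleaved_deposits(balance, amount_1, amount_2):
    """Both processes read before either writes: one update is lost."""
    read_1 = balance                 # process 1 reads
    read_2 = balance                 # process 2 reads the same old value
    balance = read_1 + amount_1      # process 1 writes
    balance = read_2 + amount_2      # process 2 overwrites process 1's result
    return balance


def serialized_deposits(balance, amount_1, amount_2):
    """Each process finishes its read-modify-write before the next starts."""
    balance = balance + amount_1
    balance = balance + amount_2
    return balance


print(interleaved_deposits(100, 10, 20), serialized_deposits(100, 10, 20))
```

The interleaved run ends at 120 and silently drops the first deposit, while the serialized run ends at the correct 130. This is why concurrency has to be paired with some form of coordination, at the cost of some of the parallelism.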

54
Example
[Figure: global data item X is duplicated; the
copies of X are kept consistent by data
synchronization]
55
Scalability
  • A distributed system is scalable if it remains
    effective (providing similar quality of
    service) when there is a significant increase in
    the number of resources and users
  • There are 3 scales
  • The smallest: 2 workstations + 1 file server
  • Local area network (LAN): up to hundreds of
    workstations and several file servers and print
    servers (fax servers, etc.)
  • Internetworking: several interconnected LANs may
    contain thousands of computers and share
    resources
  • The Internet
  • What will be the consequence of doubling the
    number of users?
  • Requesting the same set of data
  • Requesting to connect to the same server
  • Requesting to transmit data through the same
    segment of network

56
Scalability
  • To resolve the performance problem, the system
    configuration may need to be changed
  • Adding more servers to balance the workload
  • Duplicating data to resolve the problem in data
    synchronization
  • Caching data to reduce the transmission workload
  • Note: mostly, a solution creates another problem
  • The applications should not be affected due to
    the change in system configuration

57
Failure Handling
  • Failures are possible at any time (planning for
    the worst) (unavoidable)
  • Mostly the failures are partial in a distributed
    system and failures occur one by one
  • Failure handling consists of
  • Detection of failures
  • Masking failures
  • Recovery from failures
  • The design of fault-tolerant computer systems is
    based on redundancy
  • Hardware redundancy: the use of redundant
    components
  • Software redundancy and data redundancy
  • Software recovery: the design of programs to
    tolerate (process group) or recover from faults
  • Availability: measures the proportion of time
    that the system is available for service
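The availability measure and the effect of redundancy can be put into numbers. The sketch below assumes that replicas fail independently and that the service is up whenever at least one replica is up; the function names and the uptime figures are hypothetical.

```python
def availability(uptime_hours, downtime_hours):
    """Proportion of time the system is available for service."""
    return uptime_hours / (uptime_hours + downtime_hours)


def replicated_availability(single, replicas):
    """Availability when any one of several replicas can serve requests
    (assumes failures are independent)."""
    return 1.0 - (1.0 - single) ** replicas


one = availability(uptime_hours=99.0, downtime_hours=1.0)
two = replicated_availability(one, replicas=2)
print(one, two)
```

Under these assumptions a single machine that is up 99% of the time becomes a service that is up about 99.99% of the time once it is duplicated. Correlated failures, such as a shared power supply, would reduce that gain.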

58
Transparency
  • Hiding the separation of components from the
    user and the application programmer
  • Achieve a single system image to make everyone
    think that the collection of machines is
    simply an old-fashioned time-sharing system
  • Using the same access method even when the system
    configuration has been changed
  • Logical system design Vs. physical implementation
  • Layer structure to hide the details
  • Access transparency
  • Enable local and remote information to be
    accessed using identical operations
  • Location transparency
  • Enable the information objects to be accessed
    without knowledge of their location (users need
    not know where resources are located)
  • Who knows the locations?

59
Transparency
  • Concurrency transparency
  • Enable several processes to operate concurrently
    using shared information objects without
    interference (multiple users can share resources
    automatically)
  • Replication transparency
  • Enable multiple replicas to be used to increase
    reliability and performance without user
    knowledge of how many replicas exist
  • Why do we need replicated data?
  • Failure transparency
  • Enable concealment of faults, allowing users to
    complete their tasks despite the failure of
    hardware or software components
  • Migration transparency
  • Allow information objects to move within a system
    without changing their names or affecting users
  • Why do we need to migrate data objects?

60
Transparency
  • Performance transparency
  • Allow the system to be configured to improve
    (maintain the guaranteed) performance as loads
    vary
  • Scaling transparency
  • Allow the system and applications to expand in
    scale without change to the system structure or
    the application algorithms
  • Parallelism transparency
  • Allow the program to be executed in parallel
    without the users' knowledge

61
Some Basic Techniques for Building a Distributed
System
  • Replicate to increase availability
  • Trade off availability against consistency
  • Exploit cache locality to reduce access delay
  • Use time-out for revocation
  • Use a standard remote invocation mechanism
  • Use encryption for authentication and data
    security
  • Distributed Vs. centralized resource management
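Two of the techniques in this list, caching to reduce access delay and time-outs for revocation, combine naturally: a cached copy is trusted only until its time-out expires and is then fetched again. The sketch below is one way to express that. The class, the fake clock and the `fetch_from_server` function are illustrative stand-ins for a real remote source.

```python
class TimeoutCache:
    """Local cache whose entries are revoked after a fixed time-out."""

    def __init__(self, fetch, ttl_seconds, clock):
        self._fetch = fetch          # callable that reads from the remote source
        self._ttl = ttl_seconds
        self._clock = clock          # callable returning the current time
        self._entries = {}           # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                      # still valid: no remote access
        value = self._fetch(key)                 # missing or timed out: refetch
        self._entries[key] = (value, now + self._ttl)
        return value


remote_reads = []


def fetch_from_server(key):
    remote_reads.append(key)                     # count the remote accesses
    return f"value-of-{key}"


fake_time = [0.0]                                # controllable clock for the demo
cache = TimeoutCache(fetch_from_server, ttl_seconds=10.0,
                     clock=lambda: fake_time[0])
cache.get("x")
cache.get("x")                                   # served from the local copy
fake_time[0] = 11.0
cache.get("x")                                   # timed out, fetched again
print(len(remote_reads))
```

Three reads cause only two remote accesses here. A shorter time-out keeps the copy fresher but costs more remote reads, which is the availability and performance versus consistency trade-off named in the list.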

62
New Developments in Distributed Systems
  • Computing units are getting smaller and smaller
    but with higher computation power and energy
    supply
  • Extremely large memory storage units
  • Networks everywhere: both wired and wireless
    networks
  • Performance of mobile networks has improved
    greatly
  • Applications: both commercial and personal
    (the personal computer has become one of our
    essential units at home)
  • Computation everywhere (mobile games and mobile
    phones)
  • New applications
  • Real-time systems: distributed real-time
    multimedia systems
  • Many small computation units: peer-to-peer
    systems
  • Interaction with the environment: sensor network
    systems
  • Multiple information streams: information
    integration and filtering
  • What will be the FUTURE???

63
References
  • Readings
  • Dollimore Ch1
  • Tanenbaum Ch1 (except 1.5)