TCOM505 Networked MultiComputer Systems - PowerPoint PPT Presentation

Slides: 91
Provided by: ppa9

Transcript and Presenter's Notes


1
TCOM-505 Networked Multi-Computer Systems
  • Instructor: Dr. Peter W. Pachowicz
  • ST 2, R. 253
  • OH: Tuesdays 2:00-4:00 pm, and by appointment
  • phone: 993-1552, email: ppach_at_gmu.edu

2
CONTENTS
  • Introduction to multi-computer distributed
    systems
  • Client/Server model
  • Multi-server architectures
  • Network Operating System
  • Distributed File System and Principles of
    Replication
  • Scaling Up
  • Robust Computer Architectures
  • Federated DB and Data Warehousing
  • Replication architectures
  • C/S Distributed System Management
  • Other issues

3
MOTIVATION
  • Business-driven computing revolution
  • Shift in computing paradigm
  • Traditional computing paradigm vs.
    Internet-based computing paradigm
  • Top-level decomposition
  • Implications - Internet architect
  • Time table

4
DISTRIBUTED SYSTEMS
  • Distributed system
  • A system in which components located at networked
    computers communicate and coordinate their
    actions only by passing messages
  • A collection of independent computers that
    appears to its users as a single coherent system
  • Concurrency
  • Concurrent program execution at different
    locations
  • Coordination is needed
  • No global clock
  • No global synchronization by a clock
  • Synchronization by messages
  • Independent failures
  • Each component can fail independently
  • A system can fail in many new ways
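
The "no global clock, synchronization by messages" point can be illustrated with a logical clock. The sketch below uses Lamport's logical-clock rule, which the slides do not name; the class and method names are illustrative assumptions.

```python
class LamportClock:
    """Logical clock: orders events with message timestamps, not wall time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the local counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, move past both the local time and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                    # local event on computer A
stamp = a.send()            # A sends a message carrying its timestamp
b_time = b.receive(stamp)   # B's clock jumps ahead of the message timestamp
```

The receive event always gets a larger timestamp than the send event, so the two computers agree on the order of causally related events without sharing a clock.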

5
  • Examples of Distributed Systems
  • Internet (a global system)
  • Intranets (a subsystem)
  • Mobile computing (a system with a different type
    of dynamics)

A distributed system organized as
middleware. Note that the middleware layer
extends over multiple machines.
6
Challenges
  • Connecting users and resources
  • Heterogeneity
  • Integration of systems of different computer
    hardware, networks, operating systems,
    programming languages, implementations by
    different developers
  • Hardware, data representations, and software
    differ through computing platforms
  • Middleware
  • Provides a programming abstraction as well as
    masking the heterogeneity of the underlying
    networks, hardware, OSs, programming languages
  • Provides a uniform computational model for use by
    the programmers
  • Mobile code
  • Can be sent from one computer to another and run
    at a destination
  • Virtual machine concept - code executable on any
    machine

7
  • Openness
  • Defines system extension and re-implementation
    capabilities by adding new services or modifying
    existing ones
  • Systems are designed to support resource sharing
    in a way that these resources and services can be
    extended
  • Extension can be achieved at
  • Hardware level - by the addition of computer to
    the network
  • Software level - by the introduction of new
    services and re-implementation of the existing
    ones
  • Open system
  • Open system interfaces are published
  • Open distributed systems are based on the
    provision of a uniform communication mechanism
    and published interfaces
  • Open distributed systems can be constructed from
    heterogeneous hardware and software, possibly
    from different vendors !!!

8
  • Security
  • Confidentiality - protection against disclosure
    to unauthorized individuals
  • Integrity - protection against alteration or
    corruption
  • Availability - protection against interference
    with the means to access the resources

9
  • Scalability
  • Scalability is a dominant theme in the
    development of distributed systems. Three
    dimensions: size, distance, administration
  • A system is scalable if it will remain effective
    when there is a significant increase in the
    number of resources and the number of users
  • Controlling the cost of physical resources e.g.,
  • The need for new resources should be proportional
    to the number of users
  • Controlling the performance loss e.g.,
  • Algorithms using hierarchical structures scale
    better than those using linear structures
  • Preventing software resources from running out
    e.g.,
  • Design must consider the demand growth years
    ahead
  • Avoiding performance bottlenecks e.g.,
  • Algorithms should be decentralized
  • Preventing total failures

10
  • Failure handling
  • Failures in a distributed system are partial
  • Many failure scenarios exist (due to the
    combination of partial failures). It is
    difficult to predict them at the design phase.
  • Dealing with failures
  • Failure detection
  • Failure masking
  • Failure toleration
  • Recovery from failures
  • Redundancy
  • Concurrency
  • Concept of sharing resources - availability to
    others
  • Multi-threading
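
Failure detection, masking, and redundancy can be combined in a few lines. This is a minimal sketch, assuming replicas are plain callables that raise a hypothetical `ReplicaDown` exception when unreachable; none of these names come from the slides.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request."""


def call_with_failover(replicas, request):
    """Try each redundant replica in turn; the caller sees only the first success."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)      # success masks any earlier failures
        except ReplicaDown:              # failure detection
            failures += 1                # fall through to the next replica
    raise ReplicaDown(f"all {failures} replicas failed")


def broken(request):
    raise ReplicaDown("node unreachable")


def healthy(request):
    return f"ok:{request}"


result = call_with_failover([broken, healthy], "read x")
```

The partial failure of one replica stays invisible to the caller; only when every replica fails does the failure surface.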

11
  • Transparency
  • Concealment from the user and the application
    programmer --- they do not have to know about the
    separation of components in a distributed system

12
  • Another look at transparency
  • Access transparency - access to local and remote
    resources using identical operations
  • Location transparency - no knowledge of the
    location is needed
  • Concurrency transparency - several processes can
    operate concurrently and share resources without
    interference
  • Replication transparency - multiple instances of
    resources can be used without knowledge of the
    replicas
  • Failure transparency - user does not need to know
    about faults
  • Mobility transparency - allows movement of
    resources and system elements
  • Performance transparency - allows for system
    reconfiguration to improve performance as loads
    vary
  • Scaling transparency - allows the system and
    applications to expand in scale without change to
    the system structure or the application algorithm

13
HARDWARE CONCEPTS
  • Basic organizations of processors and memories
    (RAM) in distributed computer systems
  • Multi-processor systems (have shared memories)
  • Multi-computer systems (do not share memory)

14
Bus-Based Multiprocessor
  • Pros
  • Simple configuration
  • Improved speed by added caches (hit ratio may
    reach 90% for a larger cache)
  • Cons
  • Incoherent memory (data can be changed by another
    processor)
  • Scalability
  • A large number of processors cannot be added
  • A bus becomes a system bottleneck
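
The effect of the cache hit ratio on the shared bus can be estimated with simple arithmetic. The sketch below uses the standard weighted-average formula; the 2 ns cache and 100 ns memory timings are assumed values chosen only for illustration.

```python
def bus_load_fraction(hit_ratio):
    """Fraction of memory references that still have to cross the bus."""
    return 1.0 - hit_ratio


def effective_access_time(hit_ratio, cache_ns, memory_ns):
    """Average access time: hits served by the cache, misses by main memory."""
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


# Assumed timings: 2 ns cache, 100 ns memory reached over the bus.
load = bus_load_fraction(0.90)
average_ns = effective_access_time(0.90, cache_ns=2.0, memory_ns=100.0)
```

With a 90% hit ratio only about one reference in ten uses the bus, and the average access time is about 11.8 ns under these assumed timings. Each added processor still contributes its own misses, which is why the bus eventually becomes the bottleneck.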

15
Switch-Based Multiprocessor
  • Cross-bar
  • Memory divided into modules and accessed through
    a crossbar
  • N^2 crosspoint switches are needed (for N
    processors and N memory modules)
  • Omega switching network
  • Built from 2x2 switches (2 inputs and 2 outputs
    each), arranged so that every processor can
    reach every memory
  • Fewer switches needed
  • Switching is more time consuming
  • Higher cost
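
The switch counts of the two designs can be compared directly. For N processors and N memory modules a crossbar needs N^2 crosspoint switches, while an omega network built from 2x2 switches needs log2(N) stages of N/2 switches each. The function names below are illustrative.

```python
def crossbar_switches(n):
    """Crosspoint switches for n processors and n memory modules."""
    return n * n


def omega_switches(n):
    """2x2 switches in an omega network: log2(n) stages of n/2 switches."""
    stages = n.bit_length() - 1
    if n < 2 or 2 ** stages != n:
        raise ValueError("n must be a power of two")
    return (n // 2) * stages


crossbar_64 = crossbar_switches(64)   # 64 * 64
omega_64 = omega_switches(64)         # 32 switches per stage * 6 stages
```

For 64 processors this is 4096 crosspoints against 192 2x2 switches. The price is that each request passes through log2(N) switching stages instead of one.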

16
ARCHITECTURAL MODELS
  • The architecture of a system is its structure in
    terms of separately specified components and
    their interactions
  • Considerations
  • Placement of system components across a network
  • Patterns for the distribution of data and
    workload
  • Interrelationships between components
  • Component functional roles
  • Patterns of communication
  • Additional considerations for dynamic systems
  • Moving code from one process (computer) to
    another
  • Dynamic connection and removal from a net, search
    for services

17
CLIENT-SERVER ARCHITECTURE / MODEL
  • The most important and basic architecture of a
    distributed system
  • Communication: request/reply, invocation/result

[Figure: a client sends a request to a server and receives a reply; a server may itself act as a client of another server]
18
C/S Characteristics
  • Service
  • A relationship between two computers
  • The server is a provider of services
  • The client is a consumer of services
  • Shared resources
  • A server can serve many clients providing the
    same resources
  • Asymmetrical protocols
  • Clients always initiate the dialog
  • Servers are passively awaiting requests from
    clients
  • Callback - a client can pass a reference to a
    callback object and a server can invoke it. So,
    the client becomes a server.
  • Transparency of location
  • Computers can be anywhere but connected to the
    network
  • Users do not need to know the server's location
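
The callback case, where a client temporarily takes the server role, can be sketched in-process. This is a simplified illustration with no network involved; the `Server` and `Client` classes and their methods are hypothetical names.

```python
class Server:
    def __init__(self):
        self.callbacks = []

    def handle(self, request, callback=None):
        # Normal asymmetric exchange: the client initiates, the server replies.
        if callback is not None:
            self.callbacks.append(callback)   # keep the client's callback reference
        return f"reply to {request}"

    def notify_all(self, event):
        # Role reversal: the server invokes the objects the clients passed in.
        for callback in self.callbacks:
            callback(event)


class Client:
    def __init__(self):
        self.events = []

    def on_event(self, event):
        self.events.append(event)


server = Server()
client = Client()
reply = server.handle("subscribe", callback=client.on_event)
server.notify_all("price changed")
```

After `notify_all`, the client has received an event it never explicitly requested, which is the sense in which the client becomes a server.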

19
  • Mix-and-match
  • Independence from hardware and software platforms
  • Message-based exchanges
  • Clients and servers are loosely coupled systems
  • Interactions through message-passing mechanism
  • Encapsulation of services
  • The server is a specialist and decides how to
    get the job done
  • Servers can be upgraded without affecting clients
    as long as the published interface is not changed
  • Scalability
  • C/S systems can be scaled horizontally (more
    clients) or vertically (faster servers or
    distributed processes)
  • Integrity
  • Servers are centrally managed
  • Clients remain personal and independent

20
Basic C/S Systems
  • File Servers
  • File transfer across the network - file sharing
  • The server functions as a storage of files
  • Primitive form of C/S data service
  • Generation of a lot of network traffic

[Figure: an application sending a request to a file server]
21
  • DB Servers
  • The client passes SQL requests as messages (it
    looks like sending instructions one after another
    for the execution on a remote computer)
  • The server executes requests in SQL statements
    and returns result (data)
  • Application code on the client
  • Data and data retrieval controlled by SQL
    statements on the server
  • Efficient use of distributed processing power
  • Vendor support for DBMS servers (Oracle, MS)

[Figure: the application sends SQL statements to the DBMS server, which returns data]
22
  • Transaction Servers
  • The client invokes remote procedures that are on
    the server (a procedure is a collection of SQL
    statements). The SQL statements either all
    succeed or fail as a unit.
  • Creation of a distributed application by writing
    the code for the client and the server
  • OLTP (On-Line Transaction Processing) - Mission
    critical applications requiring 1-3 sec. response
    time 100% of the time
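
The "all succeed or fail as a unit" behavior can be demonstrated with Python's built-in `sqlite3` module. This is only a sketch: SQLite is an embedded database rather than a remote transaction server, and the `accounts` table and `transfer` procedure are invented for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Procedure grouping several SQL statements into one unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)        # both updates are committed
failed = transfer(conn, 1, 2, 500)   # both updates are rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The second call leaves both balances untouched even though its first UPDATE had already run, which is the property a transaction server provides to its clients.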

[Figure: applications send invocations to TP objects in front of the DBMS and receive results]
23
  • Groupware Servers
  • Putting people in direct contact - group project
    management, access to email, etc.
  • Groupware software is distributed through the
    network
  • Vendor specific software

[Figure: applications connected to a groupware server]
24
  • Object Application Servers
  • Application software is written as a set of
    communicating objects
  • Client objects communicate with server objects
    using an Object Request Broker (ORB)
  • The client invokes a method on a remote object
  • ORB locates an instance of that object server
    class, invokes the requested method, and returns
    the results to the client object
  • Server objects must provide support for
    concurrency and sharing
  • CORBA - an emerging technology for distributed
    systems
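
The locate-invoke-return cycle of an ORB can be mimicked in a few lines. This is a toy in-process stand-in and not the CORBA API; `Broker`, `Inventory`, and the method names are hypothetical.

```python
class Broker:
    """Toy stand-in for an ORB: locates a server object and invokes a method on it."""

    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        self._objects[name] = obj

    def invoke(self, name, method, *args):
        target = self._objects[name]             # locate the server object
        return getattr(target, method)(*args)    # invoke and return the result


class Inventory:
    """Server object shared by all clients of the broker."""

    def __init__(self):
        self.stock = {"widget": 5}

    def reserve(self, item, quantity):
        if self.stock.get(item, 0) < quantity:
            return False
        self.stock[item] -= quantity
        return True


orb = Broker()
orb.register("inventory", Inventory())
first = orb.invoke("inventory", "reserve", "widget", 3)
second = orb.invoke("inventory", "reserve", "widget", 3)
```

The client only knows the object's name and method; where the object lives and how it is implemented stay hidden behind the broker. A real ORB adds marshalling, network transport, and concurrency control, which this sketch omits.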

[Figure: an application invokes server objects through ORBs, and the results are returned through the ORBs]
25
  • Web Application Servers
  • Thin, portable, universal client idea
  • Superfat servers with stored documents
  • Communication through the HTTP protocol (an
    RPC-like request/reply protocol)
  • Clients have extended GUI capabilities
  • Evolving model - bringing web and objects
    together

[Figure: HTML forms and Java clients reach the application (via CGI) and HTML documents using HTTP over TCP/IP across the Internet]
26
Fat Servers or Fat Clients
  • Fat client
  • More traditional form of C/S configuration
  • Main processing on the client
  • Fat server
  • Easier to manage and deploy on the network
  • Main processing on the server
  • Ultra-thin clients
  • Mobile communication devices
  • Groupware, transaction, and web servers are
    examples of fat servers
  • Database and file servers are examples of fat
    clients
  • Distributed objects can be either

27
MULTISERVER ARCHITECTURES
  • Splitting the C/S application into functional
    units and distributing them over multiple
    computers
  • Functional units
  • User interface - GUI
  • Business logic - Application
  • Shared data - DB
  • Multi-tier architectures depend on
  • The split of the application into functional
    units and their distribution
  • The middleware used to communicate between the
    tiers

28
Alternative Client-Server Organizations
29
1-Tier Architecture
  • All modules on one computer - there is no
    distribution
  • Advantage - simplicity, cost
  • Problems relate to the old computing paradigm -
    for example - no scaling up

Application
GUI
DB
30
2-Tier Architecture
  • Remote presentation architecture
  • Distributed presentation architecture

31
  • Distributed programs architecture
  • Remote data architecture
  • Distributed data architecture

32
  • Advantages
  • Simplicity
  • Fast development of 2-tier application systems
  • Problems
  • Systems do not scale up
  • Difficulty in managing fat clients

33
3-Tier Architecture
  • Basic 3-tier architecture
  • Thin-client 3-tier architecture

34
  • Thick-client 3-tier architecture
  • Hybrid 3-tier architecture

35
3-Tier Architecture Summary
  • The most popular and flexible configuration
  • Configuration
  • 1st tier - Presentation devices and software
    (GUI)
  • 2nd tier - Mission critical server (Web server,
    Application server)
  • Gateway to back-end application servers
  • 3rd tier - Data, software applications,
    additional software
  • Less complex system administration
  • Good performance and excellent scaling up
  • Excellent application reuse
  • Legacy applications integration
  • Excellent Internet support - clients on
    low-bandwidth
  • Heterogeneous data base support
  • Excellent hardware and software flexibility

36
n-Tier Architecture
  • The most recent trend
  • Architecture for serious application systems
    (large-scale systems) with many clients and
    real-time DBs
  • Bridging several application systems into a large
    enterprise and inter-enterprise infrastructure
  • Inter-enterprise transactions
  • E-business: inventory, planning, ordering,
    accounting, banking, production, delivery
  • E-bidding
  • Challenge
  • Bridging distributed software components
    altogether
  • Synchronization in a distributed environment
  • Fault-tolerant computing - fault detection and
    recovery

37
  • Architecture 1 - Middle-tier is a
    gateway/dispatcher/scheduler
  • Architecture 2 - Middle-tier is a cluster of
    cooperating services
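
A middle tier acting as a gateway/dispatcher can be sketched as a round-robin router in front of back-end servers. Round-robin is an assumed policy chosen for simplicity; the slides do not specify one, and the names below are illustrative.

```python
from itertools import cycle


class Dispatcher:
    """Middle tier that forwards each request to the next back-end server in turn."""

    def __init__(self, servers):
        self._next_server = cycle(servers)

    def handle(self, request):
        server = next(self._next_server)
        return server(request)


def make_server(name):
    # Back-end servers are modeled as plain functions for this sketch.
    return lambda request: f"{name} handled {request}"


gateway = Dispatcher([make_server("app1"), make_server("app2")])
replies = [gateway.handle(f"r{i}") for i in range(4)]
```

Clients talk only to the gateway, so back-end servers can be added or removed without changing the clients.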

38
  • Benefits of n-tier architecture
  • Embedded load balancing
    (gateway/dispatcher/scheduler)
  • Easy scaling up
  • Development of large systems / applications in
    small steps
  • A cluster of small applications
  • Easy gradual testing
  • Gradual deployment of large systems
  • Small development teams
  • Incremental development
  • Risk reduction of system development (53% of IT
    projects fail)
  • Component reuse
  • Can be sold separately
  • Can be used by the other applications
  • Integration with off-the-shelf components
  • Suite of applications
  • Enterprise system (a win-win situation)
  • Component environments don't get older - they
    only get better

39
  • Problems
  • Providing robust communication between components
  • Component integration through middleware
  • Difficult plug-and-play capability
  • Traffic increase
  • Many fault scenarios

40
Proxy Server and Caches
  • Cache memory - analogy to the microprocessor
    system design
  • A storage of recently used data
  • Small size storage
  • Elimination of unnecessary bus transfer cycles
  • Explain the process

[Figure: a microprocessor (µP) with cache memory (CM) and controller (C), connected to main memory over a bus]
41
  • Cache distribution
  • Within a client
  • On a proxy server
  • Caches are used extensively in practice
  • Traffic reduction
  • Web browser maintains a cache of recently visited
    web pages
  • Web proxy servers
  • They provide a shared cache of web resources for
    the client machines at a site or across several
    sites
  • The purpose of proxy servers
  • Increase availability and performance of the
    service by reducing the load on the wide-area
    network and web servers
  • Access remote web servers through a firewall
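
A shared proxy cache can be sketched as a small least-recently-used store in front of the origin server. The LRU eviction policy, the capacity of 2, and the `fetch` callable are assumptions made for the example.

```python
from collections import OrderedDict


class CachingProxy:
    """Serves repeated requests from a shared cache instead of the origin server."""

    def __init__(self, fetch, capacity=2):
        self.fetch = fetch                 # callable that contacts the origin server
        self.capacity = capacity
        self.cache = OrderedDict()
        self.origin_requests = 0

    def get(self, url):
        if url in self.cache:
            self.cache.move_to_end(url)    # mark as most recently used
            return self.cache[url]
        self.origin_requests += 1          # cache miss: go across the wide-area network
        page = self.fetch(url)
        self.cache[url] = page
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used page
        return page


proxy = CachingProxy(lambda url: f"<html>{url}</html>")
for url in ["/a", "/b", "/a", "/a", "/b"]:
    proxy.get(url)
```

Five client requests result in only two requests to the origin server, which is the traffic reduction the proxy is meant to deliver. A real proxy must also expire stale pages, which this sketch ignores.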

42
Peer Architecture
  • All processes play similar role - no distinction
    between clients and servers
  • Peers interact cooperatively to perform a
    distributed activity
  • Peers are distributed components that monitor a
    common resource (a blackboard) to view and
    interactively modify data posted and shared
  • Major problem - coordination

43
Other Concepts - Mobile Agents
  • Mobile agent is a running program (code and data)
  • Mobile agent travels from one computer to another
    over a network with a given mission
  • May make many invocations to local resources
  • May go back to the source point of its journey
  • Problems
  • They are a potential security threat - secret
    sniffers, silent viruses, etc.
  • They may not accomplish their mission due to
    faults, data constraints, or mission specification

44
Other Concepts - Mobile Devices
  • A form of distribution - spontaneous networking
    or dynamic networking
  • Connecting mobile and non-mobile devices to
    networks
  • A form of a very flexible network architecture
    and inter-device communication possibilities
  • Require special type of middleware to build
    distributed applications
  • Key features
  • Easy connection to a local network - no cabling
  • Easy integration with local services - no special
    configuration is needed - services are
    broadcast
  • Serious problems
  • Limited connectivity, security and privacy,
    handling dynamic change of location and IP
    address, robustness, etc.

45
C/S BUILDING BLOCKS
  • The three basic building blocks
  • Figure II-3-6
  • Server
  • Server side of the application - typically
  • SQL database servers, TP Monitors, Groupware
    servers, Object servers, the Web
  • Support for Distributed System Management (DSM)
  • A simple agent
  • A managed PC

[Figure: the three building blocks - Client, Middleware, Server]
46
  • Client
  • Client side of the application - for example, web
    browser
  • Non-GUI clients
  • Without multitasking - ATM machines, barcode
    reader, cellular phones, fax machines, etc.
  • Requiring multitasking - robots, testers, etc.
  • GUI clients
  • Graphical windows with dialog boxes - Figure
    II-5-8
  • Object Oriented User Interfaces (OOUI)
  • More sophisticated environments, highly iconic,
    with interactive manipulation of graphical
    components
  • A visual desktop metaphor - Figure II-5-9
  • The GUI/OOUI evolution - Figure II-5-10
  • From Web pages to Shippable Places (virtual
    world) - Figure II-5-11

47
  • Middleware
  • The nervous system of a C/S infrastructure
  • Three categories - Figure II-3-6
  • Transport Stack
  • Network Operating System
  • Service Specific Middleware
  • Also contains DSM components
  • Middleware for n-tier architecture must provide a
    platform for
  • running server-side components, balancing loads,
    managing the integrity of transactions,
    maintaining high availability, supporting
    security, providing C/S communication pipes
  • Figure II-3-7
  • Platforms
  • The application servers that run the server-side
    components
  • Used across different OSs to provide a unified
    view of the distributed environment - Web server,
    Object Transaction server, TP Monitor
  • Pipes
  • Provide the intercomponent communication

48
NETWORK OPERATING SYSTEM
  • Task of an OS
  • Provide problem-oriented abstractions of the
    underlying physical resources - the processors,
    memory, communications, and storage
  • System layers - Figure I-6-1
  • Kernels and processes are executable components
    (programs) that manage resources
  • Encapsulation - provide a useful service
    interface to their resources
  • Protection - protection from illegitimate access
  • Concurrency - provide access by many clients
  • One of the kernels processes executes
    application code

49
Core OS functionality
  • Figure I-6-2
  • OS components
  • Process manager
  • A process is a unit of resource management,
    including address space and one or more threads
  • Thread manager
  • Creation, synchronization and scheduling of
    threads
  • Communication manager
  • Communication between threads attached to
    different processes on the same computer. A
    kernel may support remote thread communication.
  • Memory manager
  • Management of physical and virtual memory of a
    computer
  • Supervisor
  • Dispatching of interrupts, system call traps
    and other exceptions

50
Processes and Threads
  • The division into processes and threads is driven
    by multitasking and concurrency requirements
    on a single computer
  • Process creation is expensive but secure
  • Thread creation is fast, but threads work within
    the same execution environment
  • A process consists of
  • An execution environment
  • An address space
  • Thread synchronization and communication
    resources such as semaphores and communication
    interfaces (sockets)
  • Higher-level resources such as open files and
    windows
  • One or more threads
  • A thread is the operating system abstraction of
    an activity
  • Threads can be created or destroyed dynamically
    as needed to maximize the degree of concurrency
  • The future belongs to multi-threaded processes

51
[Figure: a kernel and processes (application, computation, in/out); a process shown as an execution environment containing Threads 1-3]
52
  • Benefits of multi-threaded systems
  • Creating a new thread within an existing process
    is cheaper than creating a process
  • Switching to a different thread within the same
    process is cheaper than switching between threads
    belonging to different processes
  • Process switching is executed by a system call to
    the OS
  • Switching processes causes an extensive OS
    overhead for memory management, etc
  • Threads within a process may share data and other
    resources conveniently and efficiently compared
    with separate processes
  • But by the same token, threads within a process
    are not protected from one another
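
Because threads in one process share data without protection from one another, shared state needs explicit synchronization. A minimal sketch with Python's standard `threading` module follows; `SharedCounter` and `run` are illustrative names.

```python
import threading


class SharedCounter:
    """State shared by all threads of one process."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could interleave the read and the write.
        with self._lock:
            self.value += 1


def run(counter, times):
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=run, args=(counter, 10_000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
total = counter.value
```

All four threads update the same object directly, with no copying or messaging, which is the convenience of threads. The lock is the price of that convenience.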

53
Multi-Thread Architectures
  • Threads on a single processor server
  • Help to maximize the throughput
  • Architectures for multi-threaded servers
  • The worker pool architecture - the simplest
    architecture
  • Advantage: prioritized queue
  • Problems: inflexibility
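
The worker pool with a prioritized request queue can be sketched with the standard `queue` and `threading` modules. As a simplification, the queue is filled up front instead of by a separate I/O thread, and the request names and priorities are made up.

```python
import queue
import threading


def run_worker_pool(requests, num_workers=3):
    """Serve (priority, name) requests with a fixed pool of worker threads.

    Lower priority numbers are dequeued first. Returns names in completion order.
    """
    pending = queue.PriorityQueue()
    for request in requests:           # stands in for the I/O thread queuing requests
        pending.put(request)

    completed = []
    completed_lock = threading.Lock()

    def worker():
        while True:
            try:
                _priority, name = pending.get_nowait()
            except queue.Empty:
                return                 # queue drained: the worker exits
            with completed_lock:
                completed.append(name)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return completed


requests = [(2, "report"), (1, "login"), (3, "backup")]
single = run_worker_pool(requests, num_workers=1)
pooled = run_worker_pool(requests, num_workers=3)
```

With one worker the completion order follows the priorities exactly. With several workers the dequeue order is still prioritized, but completion order may vary. The pool size is fixed when the pool is created, which reflects the inflexibility noted on the slide.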

[Figure: an in/out thread places incoming requests on a queue served by worker threads]
54
  • Thread-per-request architecture
  • Each request generates a thread
  • Thread is destroyed after the execution is
    finished
  • Advantages: throughput is potentially maximized
  • Problems: the overhead of thread creation
    and destruction

[Figure: requests arrive at an I/O thread, which creates worker threads that access remote objects]
55
  • Thread-per-connection architecture
  • Associates a thread with each connection - a new
    worker thread to service a single client - the
    thread is destroyed when the connection is closed
  • Advantages: low thread-management overhead

[Figure: requests on each connection are served by a dedicated worker thread that accesses remote objects]
56
  • Thread-per-object architecture
  • A thread associated with each remote object
  • Per-object queuing by I/O thread
  • Advantages: resource-driven design, very low
    thread-management overhead
  • Problems: scaling up

[Figure: requests are queued per object and served by one worker thread per remote object]
57
DISTRIBUTED FILE SYSTEMS
  • Distributed file system supports the sharing of
    information in the form of files throughout the
    Internet
  • To enable programs to store and access remote
    files exactly as they do local ones - allows
    users to access files from any computer in the
    Internet
  • File transparency feature
  • A part of NOS
  • Sun Network File System - NFS - a case study
  • Basic distributed file systems provide an
    essential support for organizational computing
    based on Intranets
  • File caching concept - similar to memory caching
  • Caching on the server and client side
  • Caching on the client side is more important

58
File System Requirements
  • Transparency
  • Usually the most heavily loaded service on the
    Internet
  • Access transparency
  • Clients should be unaware of file distribution
  • The same file access operations are used for
    local and remote files
  • Location transparency
  • Can be relocated without changes to the pathnames
  • Mobility transparency
  • Neither client programs nor system administration
    tables in client nodes need to be changed when
    files are moved to another location
  • Scaling transparency
  • The service can be expanded by incremental growth
    to deal with a wide range of loads and network
    sizes

59
  • Concurrent file updates
  • Changes to a file by one client should not
    interfere with the operations of other clients
    simultaneously accessing or changing the same
    file
  • File replication
  • A file may be represented by several copies of
    its contents at different locations
  • Hardware and OS heterogeneity
  • Fault tolerance
  • The server will operate in the face of client and
    server failures
  • Stateless server (web server)
  • Consistency
  • Across multiple copies/replicas - a delay exists
    in replica update

60
File Service Architecture
  • Figure I-8-5
  • Components
  • Flat file service
  • Operations on the contents of files
  • Unique File Identifiers (UFIDs) are used to refer
    to files
  • Directory service
  • Provides mapping between file names and their
    UFIDs
  • Provides hierarchical organization of a file
    system
  • Client module
  • Integrating and extending the operations of the
    flat file service and the directory service under
    a single application programming interface
  • Holds information about network resources
    (locations of the flat file server and directory
    server processes)
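
The division of labor among the three components can be sketched in-process. In this simplified illustration UFIDs are plain counters (a real service would make them unique and hard to forge), there is no RPC layer, and all class and method names are assumptions.

```python
import itertools


class FlatFileService:
    """Operates on file contents; files are known only by their UFIDs."""

    def __init__(self):
        self._files = {}
        self._ids = itertools.count(1)

    def create(self):
        ufid = next(self._ids)
        self._files[ufid] = bytearray()
        return ufid

    def write(self, ufid, position, data):
        buf = self._files[ufid]
        if position > len(buf):                       # pad any gap with zero bytes
            buf.extend(b"\0" * (position - len(buf)))
        buf[position:position + len(data)] = data

    def read(self, ufid, position, length):
        return bytes(self._files[ufid][position:position + length])


class DirectoryService:
    """Maps text names to UFIDs."""

    def __init__(self):
        self._names = {}

    def add_name(self, name, ufid):
        self._names[name] = ufid

    def lookup(self, name):
        return self._names[name]


class ClientModule:
    """Combines both services behind a single name-based interface."""

    def __init__(self, files, directory):
        self.files = files
        self.directory = directory

    def write_file(self, name, data):
        try:
            ufid = self.directory.lookup(name)
        except KeyError:
            ufid = self.files.create()
            self.directory.add_name(name, ufid)
        self.files.write(ufid, 0, data)

    def read_file(self, name, length):
        return self.files.read(self.directory.lookup(name), 0, length)


files, directory = FlatFileService(), DirectoryService()
client = ClientModule(files, directory)
client.write_file("/notes.txt", b"hello")
content = client.read_file("/notes.txt", 5)
ufid = directory.lookup("/notes.txt")
```

Applications see only names. The flat file service never sees a name, and the directory service never touches file contents.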

61
  • Flat file service interface
  • RPC service used by client modules - it is not
    used directly by the user-level programs
  • Typical operations
  • Read and Write
  • Create and Delete
  • GetAttributes and SetAttributes
  • In comparison with UNIX OS it does not have Open
    and Close operations - files can be accessed
    immediately
  • Access control
  • Access rights checks are implemented on the server
    using the user's ID
  • Two approaches - both support stateless server
    implementation
  • Access check when a file name is converted into a
    UFID
  • Access check with every client request (any
    operation on a file)
  • Directory service interface
  • Translation of file names into UFIDs
  • Hierarchic file system - a tree structure of file
    directories
  • File groups - for file moving purpose across
    servers, etc.

62
Sun NFS
  • Architecture - Figure I-8-8
  • Client integration
  • User programs can access files via UNIX system
    calls without recompilation or reloading
  • The encryption key is used to authenticate user
    IDs passed to the server
  • Buffering and caching in/out data
  • Virtual file system
  • Provides access transparency
  • Distinguishes between local and remote files
  • Integration between UNIX and non-UNIX remote file
    servers

63
  • Mount service
  • Allows for mounting the remote file system on a
    given machine
  • Figure I-8-10
  • Server caching
  • Used for improved performance
  • Read-ahead concept
  • Delayed-write concept
  • UNIX sync operation flushes altered cache pages
    every 30 seconds
  • Client caching
  • Used to reduce the number of requests across the
    network

64
SCALING UP
  • Scaling up is one of the strategic requirements
  • Goals
  • Serve more clients
  • Reduce traffic
  • Protect the system against a crash
  • Grow the service
  • Increase the productivity / decrease the costs
  • Scaling up must target
  • Infrastructure growth (architectural issues)
  • Single computer hardware modifications
  • Application software architecture, design and
    implementation

65
Server Scalability
  • Goal: to extend the upper limits of a server
  • Evolution - Figure II-5-3
  • PC Server
  • Asymmetric Multiprocessing Superserver
  • Symmetric Multiprocessing Superserver
  • Multiserver Cluster
  • Multiprocessor servers
  • Multiple processors in one box
  • High-speed disk arrays for intensive I/O
  • Fault-tolerant processing features

66
  • Asymmetric multiprocessing - Figure II-5-4 -
    Simple solution
  • Only one processor (master processor) runs OS
  • Coordinates all processing
  • Divides the tasks among specialized processors
  • Other processors are used as workers (slaves)
  • Problems
  • Some processors can be temporarily overloaded
  • Symmetric multiprocessing (SMP) - Advanced
    solution
  • All processors are equal
  • Applications are divided into threads that can
    run concurrently on any available processor
  • Any processor in the pool can run the OS kernel
  • OS should support OS kernel, global scheduler,
    shared I/O structure
  • Fully multiprocessor hardware is needed with
    shared memory and local instruction caches
  • Applications must be written in a way that
    supports multithreaded processing

67
  • Clusters
  • Made of a group of interconnected SMP machines
    behaving like a single system
  • High-speed LAN is frequently used as the
    interconnection
  • Types of clusters - Figure II-5-5
  • Shared-disk cluster
  • Shared-nothing cluster - provide a very
    high-level of parallelism
  • Clusters provide high-availability because
  • SMP machines do not share memory or synchronized
    caches - so -
  • Failures are contained within a single node
  • Some form of "I'm alive" mechanism is needed to
    monitor the health of cluster components
  • Advanced OSs support server clusters - Figure
    II-6-2
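
An "I'm alive" mechanism is commonly built from periodic heartbeats and a timeout. The sketch below passes timestamps in explicitly so it stays deterministic; the 3-second timeout and the node names are assumed values.

```python
class HeartbeatMonitor:
    """Tracks the last "I'm alive" message received from each cluster node."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def beat(self, node, now):
        self.last_seen[node] = now

    def failed_nodes(self, now):
        # A node is suspected once it has been silent for longer than the timeout.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("node-a", now=0.0)
monitor.beat("node-b", now=0.0)
monitor.beat("node-a", now=2.0)          # node-b has gone silent
suspects = monitor.failed_nodes(now=4.0)
```

Only the silent node is reported, so the cluster can fail its work over to another node while the healthy ones keep running. A timeout can only suspect a failure: a slow network looks the same as a dead node.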

68
Scaling-Up vs. Scaling-Out
Scaling Up
Scaling Out
69
  • Scaling up is the traditional approach - get a
    bigger and bigger server (cluster)
  • A complex solution
  • Time consuming solution to implement
  • Best when applied to DB servers
  • Scaling out - a new approach - use multiple small
    servers
  • Simple and inexpensive solution ??? - not always
  • Fast deployment
  • Typical for web servers and e-commerce
  • Temporary solution

70
Other Solutions To Scalability
  • Dynamic-caching (on a Proxy server)
  • Applies updates to portions of a page that
    actually changed
  • Condenser sits between Content Site and the ISP
  • Condenser is an intelligent agent
  • Understands page content (sections)
  • Can self-adjust to network conditions -
    regulating frequency of updates, etc.
  • Condenser can reduce the traffic by 90% !!!

[Figure: a proxy server sits between the client and the content provider]
71
  • Replication of the same resources
  • Replication of service
  • Replication of data
  • Mainly used for global systems with users spread
    over multiple and/or larger geographic regions
  • US server (East coast, West coast), Europe
    server, Asia server
  • Clear advantage for information providers
  • Problems
  • Update and synchronization of replicas for
    systems with frequent data updates

72
New Architectures - Storage-Area-Network (SAN)
  • Crossing boundaries of storage limits (size and
    access)
  • Architecture

[Figure: servers, each with users on a LAN, share disk arrays through a Fibre Channel hub/switch]
73
  • Provides very fast data access for widely
    distributed users
  • High-speed (>1 Gbit/sec) dedicated subnetwork
    connecting storage disks or tapes with their
    associated servers
  • Long-distance fibre connections (<10 km)
  • Provides the most efficient use of storage
    devices (sharing over a number of LANs)
  • Designed to support
  • Disk mirroring
  • Backup and restoration
  • Archiving and retrieving
  • Data migration among storage devices
  • Great solution to data warehousing
  • Problems
  • Complex technology, high cost, experienced
    people needed

74
FAULT-TOLERANT COMPUTING
  • Fault tolerant computing describes an environment
    that provides continuous, uninterrupted service -
    access to data and application programs - even
    when a hardware, software or network component
    fails
  • Fault tolerance is about true redundancy
  • Provided by
  • hardware
  • software
  • combination of hardware and software
  • Typical users
  • Financial institutions
  • Airlines
  • E-commerce

75
Fault-Tolerance vs. High-Availability
  • Both
  • Designed to maximize application and server
    availability
  • Use of backup resources - like mirrored servers
    and disks for recovering from failure
  • Goal of availability
  • To recover from a crash quickly
  • Goal of fault-tolerance
  • To eliminate the recovery time completely
  • Less than 5 min of downtime a year
  • Fault-tolerant configurations feature a high
    degree of built-in hardware redundancy,
    serviceability and remote management capabilities
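To put the "less than 5 min of downtime a year" figure in perspective, the arithmetic below converts between yearly downtime and availability. It is a small worked example, assuming a 365-day year; the function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability(downtime_minutes_per_year):
    """Fraction of the year the service is up."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


def allowed_downtime(availability_fraction):
    """Minutes of downtime per year permitted by a given availability."""
    return (1.0 - availability_fraction) * MINUTES_PER_YEAR


five_min = availability(5)               # roughly 0.99999 ("five nines")
five_nines = allowed_downtime(0.99999)   # roughly 5.26 minutes per year
```

In other words, the 5-minute target corresponds to roughly 99.999% availability.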

76
Architecture 1: Process pair
  • A primary process and a backup process run on
    separate processors
  • The backup process mirrors all the information in
    the primary process and can take over if the
    primary processor fails
  • Comment
  • This is not the best architecture for
    fault-tolerant computing - due to potential
    server failure
  • Trend - multiserver architectures
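The process-pair idea can be sketched in a few lines. This is a toy model, assuming the whole process state is a single counter and that mirroring happens after every request; `CounterProcess`, `ProcessPair`, and `fail_primary` are hypothetical names, and both "processors" live in one Python process here.

```python
class CounterProcess:
    """Toy process whose whole state is one counter."""

    def __init__(self):
        self.count = 0

    def apply(self, amount):
        self.count += amount
        return self.count


class ProcessPair:
    def __init__(self):
        self.primary = CounterProcess()
        self.backup = CounterProcess()

    def request(self, amount):
        result = self.primary.apply(amount)
        if self.backup is not None:
            # Checkpoint: mirror the primary's state in the backup.
            self.backup.count = self.primary.count
        return result

    def fail_primary(self):
        # Takeover: the backup becomes the primary; no backup remains.
        if self.backup is None:
            raise RuntimeError("no backup left to take over")
        self.primary, self.backup = self.backup, None


pair = ProcessPair()
pair.request(5)
pair.request(2)
pair.fail_primary()             # simulated primary processor failure
after_takeover = pair.request(1)
```

After the takeover the service continues from the mirrored state, but a second failure cannot be survived, which is one way to read the comment that a process pair is not the best fault-tolerant architecture.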

77
Architecture 2: A four-server architecture
[Figure: four-server architecture. Labels: two Computational Processors (CPU + Memory), High-Speed Links, two I/O Processors, 100Base-T, two Disk Storage units (Mirrored Storage), LAN]
78
DB SERVER ARCHITECTURES
  • SQL server
  • Manages the control and execution of SQL commands
  • Provides logical and physical views of the data
    and generates optimized access plans for
    executing the SQL commands
  • Server administration features and utilities that
    help manage the data
  • All other functions related to concurrency,
    security, and consistency
  • Process-per-Client Architecture - Figure II-10-2
  • Provides maximum separation - separate address
    spaces
  • Advantages
  • Users are directly protected from each other
  • Processes can be assigned to separate processors
    (SMP machine)
  • Problems
  • Uses more memory and CPU - potentially slower
    solution
  • Relies on OS supporting SMP

79
  • Multithreaded Architecture - Figure II-10-3
  • Single multithreaded process
  • Advantages
  • Best performance by running all the user
    connections, applications, and the database in
    the same address space
  • Does not rely on OS
  • Conserves memory and CPU cycles - does not
    require frequent context switching
  • Problems
  • It is easier to bring it down by a misbehaving
    application
  • Hybrid Architecture - Figure II-10-3
  • Consists of three components: multithreaded
    network listeners, dispatcher tasks, and reusable
    shared server worker processes
  • Advantage
  • Protected environment without requiring
    significant memory
  • Difficult to bring down
  • Problems
  • Queue latencies can be a problem
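The hybrid design (listener, dispatcher, shared workers) can be sketched with a request queue feeding a small worker pool. This is only an illustration: the workers here are threads standing in for the shared server worker processes, the "SQL execution" is a placeholder string, and all names are hypothetical.

```python
import queue
import threading


def worker(requests, responses):
    """Reusable shared worker: serves any client's request from the queue."""
    while True:
        item = requests.get()
        if item is None:  # shutdown signal
            break
        client_id, sql = item
        responses.put((client_id, f"executed: {sql}"))  # placeholder work


def run_server(incoming, n_workers=2):
    requests, responses = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(requests, responses))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    # Listener and dispatcher: accept each request and queue it
    # for whichever worker becomes free first.
    for client_id, sql in incoming:
        requests.put((client_id, sql))
    for _ in workers:
        requests.put(None)
    for w in workers:
        w.join()
    results = {}
    while not responses.empty():
        client_id, result = responses.get()
        results[client_id] = result
    return results


results = run_server([(1, "SELECT 1"), (2, "SELECT 2"), (3, "SELECT 3")])
```

Three clients are served by two workers, so a request may wait in the queue until a worker is free; that waiting time is the queue latency noted as the drawback of this design.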

80
DATA WAREHOUSES
  • Motivation
  • Explosion of data stored on computers and data
    sources
  • Business change - intelligent and fast decisions
    based on advanced analysis of available data
  • Data warehousing provides foundation technology
    for creating intelligent clients
  • An emerging technology with exceptional
    growth
  • 2nd-tier (or 3rd-tier) of multi-computer
    architecture taken by
  • OLTP - On-line Transaction Processing
  • Time critical systems
  • Mission critical systems
  • DSS - Decision Support Systems
  • Analyzing and finding the right information and
    presenting it to a decision maker

81
Elements of Data Warehouse
  • Data warehouse
  • An active intelligent store of data that can
    manage and aggregate information from many
    sources, distribute it where needed, and activate
    business policies
  • Top level architecture - Figure II-12-1
  • Operational Data
  • Data Replication Manager
  • Manages the copying and distribution of data
    across databases
  • Transformation, cleansing, replication
  • Informational Database
  • Goal specific subset of operational data
  • Metadata - data about data - describes contents
    of the DB
  • Information Directory
  • Information hound element - defines what kind of
    data should be collected
  • EIS/DSS Tool Support - intelligent data analysis

82
Warehouse Hierarchies - The Datamarts
  • Architecture - Figure II-12-2
  • All data extracts from production databases are
    first applied against an enterprise data
    warehouse
  • Data from the enterprise warehouse can be
    distributed / replicated (as needed) to
    departmental (goal-oriented) warehouses also
    known as datamarts
  • Datamarts are organized by subject (sales data,
    product data, etc.)
  • Why such an organization?
  • Data warehousing is a large project - carried out
    in increments - one datamart at a time
    (collectively called the data warehouse)
  • A business case to order the data

83
REPLICATION ARCHITECTURES
  • Motivation
  • A key in providing high availability and fault
    tolerance in distributed systems
  • Used to remove capacity, performance, and
    organizational roadblocks of centralized data
    access
  • Data replication and transformation process
  • Figure II-12-5
  • Specialized middleware provides the glue in a
    multi-server DB replication environment
  • Replication transparency
  • Client should not have to be aware that multiple
    physical copies of data exist

84
Refresh and Updates
  • Refresh
  • Architecture Figure II-12-6
  • Replace the entire target with data from the
    source
  • Update
  • Architecture Figure II-12-7
  • Send the changed data only to the target
  • Synchronous update - high availability
    applications
  • Asynchronous update - data warehousing
    applications
  • Staging - Figure II-12-8
  • Broadcasting
  • Can be considered as a form of replication
    service
  • Provides no feedback on whether the update arrived
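The difference between refresh and update can be sketched with two small functions. This is a minimal model, assuming source and target are in-memory dictionaries keyed by row id; real replication middleware would capture changes from logs or triggers rather than by comparing copies.

```python
def refresh(source, target):
    """Refresh: replace the entire target with data from the source."""
    target.clear()
    target.update(source)


def update(source, target):
    """Update: apply only the changed rows; returns how many were shipped."""
    upserts = {k: v for k, v in source.items() if target.get(k) != v}
    deletes = [k for k in target if k not in source]
    target.update(upserts)
    for k in deletes:
        del target[k]
    return len(upserts) + len(deletes)


source = {1: "a", 2: "b", 3: "c"}
target = {}
refresh(source, target)          # ships all three rows

source[2] = "B"                  # one modified row
del source[3]                    # one deleted row
source[4] = "d"                  # one new row
shipped = update(source, target)
```

The update ships three changes regardless of how large the table is, whereas a refresh always ships every row.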

85
Passive Replication Architecture
  • Architecture
  • Figure I-14-4
  • Front ends for clients - communication function
    only
  • A single primary replica
  • One or more secondary replicas (backups)
  • Characteristics
  • Clients communicate with the primary replica only
  • Primary replica executes operations and sends
    copies of updated data to the backups
  • If the primary fails, one of the backups is
    promoted to act as the primary

86
  • Communication sequence
  • Request
  • Made by a client to the primary replica with
    attached request identifier
  • Coordination
  • The primary checks whether the request has
    already been executed. If it is a duplicate, the
    primary re-sends the stored response
  • Execution
  • The primary executes the request and stores the
    response
  • Agreement
  • If the request is an update then the primary
    sends the updated state, the response and the
    identifier to all backups
  • The backups send an acknowledgment
  • Response
  • The primary responds to the client
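The five phases above can be sketched as a single request handler on the primary. This is a simplified in-memory model, assuming updates are key/value writes and that backups always acknowledge; the class and method names are hypothetical.

```python
class Replica:
    def __init__(self):
        self.state = {}
        self.responses = {}  # request identifier -> stored response


class PassiveReplicationService:
    def __init__(self, n_backups=2):
        self.primary = Replica()
        self.backups = [Replica() for _ in range(n_backups)]

    def handle(self, request_id, key, value):
        primary = self.primary
        # Coordination: a duplicate request gets the stored response again.
        if request_id in primary.responses:
            return primary.responses[request_id]
        # Execution: apply the update and store the response.
        primary.state[key] = value
        response = f"ok {key}={value}"
        primary.responses[request_id] = response
        # Agreement: send updated state, response, and identifier to backups.
        for backup in self.backups:
            backup.state = dict(primary.state)
            backup.responses[request_id] = response
        # Response: reply to the client.
        return response

    def fail_over(self):
        # Promote one of the backups to act as the primary.
        self.primary = self.backups.pop(0)


service = PassiveReplicationService()
first = service.handle("req-1", "x", 1)
duplicate = service.handle("req-1", "x", 1)   # re-sent by the client
service.fail_over()                           # simulated primary failure
after_failover = service.handle("req-1", "x", 1)
```

Because the backups also store the response and identifier, a request retried after failover is recognized as a duplicate and is not executed twice.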

87
Active Replication Architecture
  • Architecture
  • Figure I-14-5
  • Characteristics
  • Replicas are state machines that play equivalent
    roles and are organized as a group
  • Front-ends multicast their requests to the group
    of replicas
  • Replicas process the request independently and
    reply
  • Front-ends collect and compare replies
  • This architecture can tolerate many failures
  • More replicas are needed to support a voting
    process at a front end

88
  • Communication sequence
  • Request
  • Multicast a request with its identifier to the
    group of replicas
  • Coordination
  • The group communication system delivers requests
    to all available replicas
  • Execution
  • Every replica executes the request independently
  • Agreement
  • No agreement phase is needed
  • Response
  • Each replica sends its response to the front end
  • Front end collects responses, checks their
    consistency and replies to the client
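A front end that collects and compares replies can be sketched with a simple majority vote. This is an illustrative model only: the multicast is a plain loop, one replica is deliberately made faulty, and the names `StateMachineReplica` and `front_end_request` are hypothetical.

```python
from collections import Counter


class StateMachineReplica:
    """Deterministic state machine: same requests, same state and replies."""

    def __init__(self, faulty=False):
        self.total = 0
        self.faulty = faulty

    def execute(self, amount):
        self.total += amount
        # A faulty replica returns a wrong answer (simulated failure).
        return -1 if self.faulty else self.total


def front_end_request(replicas, amount):
    # Request and coordination: deliver the request to every replica.
    replies = [replica.execute(amount) for replica in replicas]
    # Response: compare the replies and accept the majority value.
    value, votes = Counter(replies).most_common(1)[0]
    if votes <= len(replicas) // 2:
        raise RuntimeError("no majority among replica replies")
    return value


group = [StateMachineReplica(), StateMachineReplica(),
         StateMachineReplica(faulty=True)]
first = front_end_request(group, 10)
second = front_end_request(group, 5)
```

With three replicas the vote masks one wrong reply; tolerating more faulty replicas requires a larger group.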

89
The Gossip Architecture
  • Architecture
  • Figure I-14-6
  • Highly available service but of weaker
    consistency
  • Application: bulletin boards - slightly
    out-of-date information is acceptable
  • Characteristics
  • Replicated data is close to the points where
    groups of clients need it
  • Two basic types of operations
  • Queries - read-only (read-only replicas)
  • Updates - change without reading (update
    replicas)
  • Replicas exchange gossip messages periodically
    in order to pass on the updates they received
    from clients
  • All replicas eventually receive all updates -
    convergence over time
  • The architecture has weaker consistency, due to
    the casual nature of updates, but updates are
    less costly

90
  • Scalability of the gossip architecture
  • A problematic issue - if the number of replicas
    grows, then the traffic of gossip messages grows too
  • Solution - increase the number of read-only
    replicas and place them closer to the clients
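The convergence behaviour of gossip can be sketched with replicas that periodically pass their known updates to a neighbour. This is a toy model, assuming a ring of three replicas and update identifiers chosen by hand; it omits the timestamps and ordering guarantees a real gossip service would need.

```python
class GossipReplica:
    def __init__(self, name):
        self.name = name
        self.updates = {}  # update id -> posted message

    def post(self, update_id, message):
        """Update operation: changes state without reading."""
        self.updates[update_id] = message

    def query(self):
        """Query operation: read-only."""
        return [self.updates[k] for k in sorted(self.updates)]

    def gossip_to(self, other):
        """Send a gossip message carrying every update this replica knows."""
        for update_id, message in self.updates.items():
            other.updates.setdefault(update_id, message)


def gossip_round(replicas):
    # Each replica gossips to its neighbour in a ring.
    for i, replica in enumerate(replicas):
        replica.gossip_to(replicas[(i + 1) % len(replicas)])


a, b, c = GossipReplica("a"), GossipReplica("b"), GossipReplica("c")
a.post(1, "hello")
c.post(2, "world")
stale = b.query()            # b has not heard any gossip yet
gossip_round([a, b, c])
gossip_round([a, b, c])
converged = [r.query() for r in (a, b, c)]
```

A client reading from `b` before the gossip rounds sees out-of-date data, which is the weaker consistency the architecture accepts in exchange for availability.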