Title: Distributed Systems


1
Distributed Systems
  • Lecture on

Hochschule der Medien
Walter Kriha
2
Overall Goals
  • Learn the basic concepts of Distributed Systems
    like concurrency and remoteness
  • Understand different programming models for
    Distributed Systems
  • Understand interdependencies between technical
    means of distribution and distribution as a
    business or social model

3
Goal for today
  • Give an overview of distributed systems. Later
    lectures will dig into the gory details like
    security, transactions, remote calling mechanisms
    etc.

4
Introduction
  • What is a Distributed System (DS)
  • Why distribute?
  • Types of DS
  • Characteristics of DS
  • Middleware for DS
  • Resources
  • Exercises

5
Definition of a Distributed System
Independent agents repeatedly interacting in a
way that a coherent behavior (system) emerges
6
Why learn about Distributed Systems?
Because most IT systems ARE ALREADY DISTRIBUTED
SYSTEMS (and not only the IT Systems)
7
Types of Distributed Systems
  • Energy grid, telecom network
  • Villages, towns and big cities
  • IT infrastructure of large companies
  • High-performance clusters
  • The WWW
  • The human body, organizations, states
  • A flock of birds

8
The energy grid now: hub and spoke
[Diagram: a central power plant feeding an office, a factory and several homes]
Electricity flows in one direction only, with a
lot of it lost during transport. Control resides
with the power plant.
(www.wired.com/wired/archive/9.07/juice.html)
9
The future grid: micropower
[Diagram: power plant, office, factory, homes and cars, interconnected with many fuel cells]
Power flows in many directions, controlled by
independent sensors in the grid. A tenfold
increase in transactions. Modern GRID computing
allows users to tap into a wealth of distributed
computing resources. http://www.thegridreport.com/
10
The power law of settlements
There are many villages, quite a few towns but
only a small number of big cities
11
IT-Infrastructure of large corporations
Un-trusted clients
Customer data zone
Processing zone
12
World Wide Web
Internet
DNS
Intranet
Dialup clients live at the edges of the
internet (no fixed IP address, slow upload). How
many graphs are layered on top of the physical
network structure? (hyperlinks, search-engines,
DNS)
13
The New Web
P2p overlay
Social network overlay
Mobile PAN network
mashups
Internet
Location based service
Sensors
Cams
Real world
Aggregation of external information and
collaboration based on social networks will bring
new forms of content production and consumption,
and consumer areas will influence companies
(consumerization, Gartner Group). More
interconnection of different net types brings
more emergent phenomena.
14
Why Distribute?
  • Risk: avoid single points of failure (e.g. use
    hot stand-by data centers)
  • Performance: run tasks on several nodes
  • Security: create different security domains

15
Application Structure
16
Views of Distributed Systems
  • Enterprise View: the role within an enterprise
  • Information View: the flow of information in the
    DS (information architecture)
  • Computational View: the processing of information
    in the DS (logical architecture)
  • Engineering View: the system infrastructure
    (nodes, connections, system management, replicas
    etc.) (physical or distribution architecture)
  • Technology View: the specific technology used to
    build the DS

(From Open Distributed Processing plus my
architecture categories)
17
Characteristics of Distributed Systems
  • re-definition of programming language concepts
  • distribution topology
  • emergent behavior
  • autonomous components
  • heterogeneous components
  • a strong need for security
  • Concurrency
  • Scale
  • Remoteness
  • global naming and addressing
  • ownership and control
  • transactions across nodes
  • no global state
  • no centralized control
  • many points of failure
  • Asynchronous communication

Don't worry, we'll dig into all this another day!
18
Programming Languages and Distributed Systems
  • DS: the system defines security; objects are
    versioned; global identity; components

PL and DS are orthogonal.
  • PL: design-time security (private, protected
    etc.); no versioning, no components as types;
    memory address used for identity;
    platform-dependent basic type size (int)

19
Distribution Topology
  • The small world effect

It takes only a small number of intermediate
persons to connect any person in the world to
any other. (A knows B, B knows C, ..., F
knows G.)
From The Milgram experiments on social networks.
(Andy Oram, Peer-To-Peer, Harnessing the power of
disruptive technologies). OpenBC or LinkedIn
create a social network from distributed
participants.
20
Systems showing the small world effect
High local clustering
How efficient can this DS transport messages?
Queries? How robust is it against random attacks
on nodes, targeted attacks on the important
connecting nodes?
21
The power law of DS: a law of nature?
Cities, companies, power, social networks etc.
seem to exhibit the power law. Each size is a
tenth of the next bigger size but has ten times
more instances. (In defense of cities, Clay
Shirky, www.openp2p.com)
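The rule of thumb on this slide can be turned into a small worked example. The sketch below is only an illustration: the function name, the starting size of one million and the factor of 10 are assumptions chosen to mirror the "a tenth of the size, ten times the instances" wording.

```python
def settlement_levels(largest_size, levels, factor=10):
    """Return (size, count) pairs for an idealized power law.

    Each level is `factor` times smaller than the one above it
    and has `factor` times more instances.
    """
    return [(largest_size // factor**k, factor**k) for k in range(levels)]


for size, count in settlement_levels(1_000_000, 4):
    print(f"{count:>5} settlements of about {size:>9,} inhabitants")
```

In this idealized form every level holds the same total (size times count), which is one way to see why the many small instances matter as much as the few big ones.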
22
Metcalfe's law - Network Effects
  • The usefulness of a network grows with the square
    of the number of users (think about a fax machine:
    how useful is one?)
  • The adoption rate of a network increases in
    proportion to the utility provided by the
    network. (That's why companies give away
    software, for example.)
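A quick way to see where the "square" comes from is to count the possible pairwise connections between users. This is a minimal sketch; the function name is an assumption, and counting pairs is only one common reading of Metcalfe's law.

```python
def pairwise_links(users):
    """Number of distinct user pairs: n * (n - 1) / 2, which grows like n squared."""
    return users * (users - 1) // 2


for n in (1, 2, 10, 100):
    print(f"{n:>4} users -> {pairwise_links(n):>5} possible connections")
```

A single fax machine has nobody to talk to (zero connections), while ten times more users give roughly a hundred times more possible connections.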

23
Emergent Behavior: a flock of birds
There is no central controller, no super-bird.
No bird has a representation of the figure in its
head. Instead, every bird follows very simple
rules. The resulting figure shows EMERGENT
behavior. Many distributed systems show it as
well, for good or for bad. (Kevin Kelly, Out of
Control: The biology of the new machines. Peter
Wegner, Interaction vs Algorithm.)
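The flock idea can be sketched with one very simple local rule. This is a toy model, not a real flocking simulation: headings are treated as plain numbers (no wrap-around at 360 degrees), the birds sit on a ring, and all names and starting values are assumptions.

```python
def step(headings):
    """Every bird averages its heading with its two ring neighbours.

    No bird sees the whole flock; each one uses local information only.
    """
    n = len(headings)
    return [
        (headings[(i - 1) % n] + headings[i] + headings[(i + 1) % n]) / 3
        for i in range(n)
    ]


def spread(headings):
    """Difference between the largest and the smallest heading in the flock."""
    return max(headings) - min(headings)


flock = [0.0, 90.0, 10.0, 45.0, 80.0, 20.0]
print("spread before:", spread(flock))
for _ in range(50):
    flock = step(flock)
print("spread after: ", spread(flock))
```

The common direction is never computed by any single bird; it emerges from repeated local interaction.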
24
Heterogeneous Components
One system: unreliable hardware, frequent
downtimes, little-endian byte order, Java data
types, no callbacks, slow, no access control
Another system: fault-tolerant hardware, system
management, big-endian byte order, C data types,
fast, access controlled
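The byte order difference mentioned above is easy to demonstrate. The following sketch uses Python's standard `struct` module; the sample value is an arbitrary assumption.

```python
import struct

value = 0x0A0B0C0D

little = struct.pack("<I", value)  # little endian: least significant byte first
big = struct.pack(">I", value)     # big endian: most significant byte first

print(little.hex())
print(big.hex())

# A receiver that assumes the wrong byte order gets a different number
# without any error being raised.
misread = struct.unpack("<I", big)[0]
print(hex(misread))
```

This is why middleware usually fixes one wire format or tags messages with the byte order used.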
25
Security in Distributed Systems
Authentication Authorization
Integrity, Confidentiality
But sometimes anonymity is needed!! (peer-to-peer
systems)
Authentication Authorization
26
Security Topics
  • Firewalls
  • Certificates, Public Key Infrastructure, Digital
    Signature
  • Encryption (methods and devices)
  • Software Architecture
  • Intrusion Detection
  • Sniffing
  • PGP, SSL etc.
  • Denial of Service attacks
  • Authentication (who are you?)
  • Authorization (what can you do?)
  • Confidentiality (can someone spy on us?)
  • Integrity (Did somebody change your message?)
  • Non-repudiation (It was you who ordered X)
  • Privacy/Anonymity
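As an illustration of the integrity question ("Did somebody change your message?"), here is a minimal sketch using a keyed hash (HMAC) from the Python standard library. The key and the messages are made-up values. Note that a shared-key HMAC covers integrity and authenticity between two parties, but not non-repudiation, which needs digital signatures.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"demo-key-not-for-production"  # assumption: both sides already share this key
order = b"order 10 units of X"
tag = sign(shared_key, order)

print(verify(shared_key, order, tag))                    # unchanged message
print(verify(shared_key, b"order 99 units of X", tag))   # tampered message
```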

27
Important Programming Terms for DS
  • Identity
  • Value vs. Reference
  • Exception
  • Interface vs. Implementation
  • Interface Definition Language (IDL)
  • Quality of Service (QOS)
  • Stubs/Proxies

28
Distributed System Design
  • Common Problems (performance, fail-over,
    maintenance, policies, security integration)
  • Information Architecture (define and qualify the
    information fragments and flows)
  • Distribution Architecture (create a map of all
    participating systems and their quality of
    service)
  • Policy-Driven Architectures

29
Middleware for Distributed Systems
30
What is Middleware?
"Software that helps two separate systems
communicate seamlessly."
(www.knownow.com/middleware/lexicon.html)
In a strict sense middleware is transport
software that is used to move information from
one program to one or more other programs,
shielding the developer from dependencies on
communication protocols, operating systems and
hardware platform (plumbing) (www.talarian.com)
31
Positioning Middleware
  • General structure of a distributed system as
    middleware.

(Figure 1-22 from van Steen/Tanenbaum)
32
The Transparency Dogma
  • Middleware is supposed to hide remoteness and
    concurrency by hiding distribution behind local
    programming language constructs

Critique (Jim Waldo, Sun): full transparency is
impossible and the price is too high.
33
Distribution Transparencies
  • Access: mask differences in data representation
    and invocation mechanisms between heterogeneous
    systems
  • Failure: mask failures to enable fault tolerance
    (e.g. intelligent load-balancing)
  • Location: use logical, not physical names to
    access services
  • Migration: hide the true location of a service or
    object from clients. If the location changes, the
    client won't notice it.
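Location and migration transparency can be sketched with a tiny name service: clients only know a logical name, and the binding behind it may change. The class, the service name "quotes" and the addresses are all hypothetical.

```python
class NameService:
    """Maps logical service names to physical addresses (toy example)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, logical_name, address):
        # Rebinding an existing name models a service that has migrated.
        self._bindings[logical_name] = address

    def resolve(self, logical_name):
        return self._bindings[logical_name]


names = NameService()
names.bind("quotes", ("10.0.0.5", 8080))
before = names.resolve("quotes")

names.bind("quotes", ("10.0.0.9", 8080))  # the service moves to another host
after = names.resolve("quotes")

print(before, "->", after)  # the client code kept using the name "quotes" throughout
```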

34
Distribution Transparencies (cont'd)
  • Replication: hide a group of equal objects behind
    an interface (performance, availability)
  • Persistence: hide the storage mechanisms and
    internal policies from a client. Make a remote
    object look like it is persistently activated.
  • Transaction: hide the complex coordination
    necessary to achieve consistency.

Source: ISO/IEC 10746-1 Open Distributed
Processing, www.iso.org
35
Where do we find Middleware?
LDAP or DCE
Quotes
Distributed Cache
Directory
WebService
JMS
JNDI
Application Server Web- Tier
Application Server EJB Tier
Web Server
JDBC
RMI
XML-RPC
CORBA
News
E-bank
Part of a Portal running on a Web Cluster.
36
Classification
  • Socket Based Services
  • Remote Procedure Calls (RPCs)
  • Object Request Brokers (CORBA, RMI)
  • Message Oriented Middleware (MOMs)
  • Web Services (XML-RPC, SOAP, UDDI)
  • Component Systems (Enterprise Java Beans, J2EE)
  • Peer-To-Peer (Napster, Gnutella, Freenet,
    SETI@home)
  • Agent based (Jini, Aglets)

37
The "ilities"
  • Reliability
  • Availability
  • Security
  • Scalability
  • Quality
  • Performance
  • Maintainability

Before using a specific middleware, always make
sure that the "ilities", aka non-functional
requirements, are met. The implementation quality
of middleware almost always differs between vendors.
38
Real-World Problems
  • Skills/understanding: best practice patterns?
  • Single points of failure: replication,
    load-balancing etc.
  • Tooling: generators, deployment tools
  • Brittleness if interfaces change (compiler
    illusion)

39
RPC-type Middleware
  • E.g. Sun RPC, OSF DCE
  • Main idea: distribute functions, use concurrent
    processing
  • On top of it: distributed directory, file system,
    security (cells, principals)
  • XML-RPC over HTTP (www.userland.com)

Layer foundations: UUIDs, value vs. reference,
marshaling, versioning etc.
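Marshaling, the core of every RPC layer, can be shown with the XML-RPC support in Python's standard library. The sketch below stays inside one process and leaves out the HTTP transport. The procedure `add`, the `PROCEDURES` table and `handle_request` are assumptions for illustration. Only `xmlrpc.client.dumps` and `xmlrpc.client.loads` are real library calls.

```python
import xmlrpc.client


def add(a, b):
    """The procedure offered by the (simulated) server."""
    return a + b


PROCEDURES = {"add": add}  # hypothetical lookup table on the server side


def handle_request(request_xml: str) -> str:
    """Server side: unmarshal the call, run it, marshal the result."""
    params, method_name = xmlrpc.client.loads(request_xml)
    result = PROCEDURES[method_name](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)


# Client side: marshal the call into XML, "send" it, unmarshal the reply.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")
response_xml = handle_request(request_xml)
(result,), _ = xmlrpc.client.loads(response_xml)

print(request_xml)
print("result:", result)
```

Everything that crosses the boundary is passed by value as text, so references to local objects cannot be sent this way.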
40
Distributed Objects
RMI
  • Java only (e.g. introspection used)
  • Lightweight method call semantics
  • Java implementations
  • Wire protocol: now mostly RMI over IIOP

CORBA
  • Object Request Broker
  • Multi-language support (platform independence)
  • Interface Definition Language
  • Wire protocol: IIOP, GIOP

Both try to preserve object semantics.
Interface/implementation separation
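How a stub (proxy) preserves object semantics on the client side can be sketched as follows. This is not how CORBA or RMI are implemented. It is a toy with JSON as a stand-in wire format and a direct function call as a stand-in for the network. All class names are hypothetical.

```python
import json


class Account:
    """The real object, living on the 'server'."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class Skeleton:
    """Server side: turns wire messages back into method calls."""

    def __init__(self, target):
        self._target = target

    def dispatch(self, wire_message: str) -> str:
        call = json.loads(wire_message)
        result = getattr(self._target, call["method"])(*call["args"])
        return json.dumps({"result": result})


class Stub:
    """Client side: looks like the object, but forwards every call."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking and returning a string

    def __getattr__(self, method_name):
        def remote_call(*args):
            request = json.dumps({"method": method_name, "args": list(args)})
            return json.loads(self._transport(request))["result"]

        return remote_call


account = Stub(Skeleton(Account()).dispatch)
print(account.deposit(50))
print(account.deposit(25))
```

The client calls `account.deposit(...)` as if the object were local, which is exactly the transparency that the Waldo critique questions: latency and partial failure are hidden here only because nothing can fail.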
41
Distributed Components
  • Objects are too granular: performance and
    maintenance problems
  • Programmers need more help: separation of
    concerns and context
  • Solutions:
  • Enterprise Java Beans
  • CORBA Components
  • COM

42
Example: Enterprise Java Beans
EJB Framework (separation of concerns)
  • Automatic transaction management
  • Concurrency control
  • Automatic, method-level security

Deployment (separation of context)
  • System management defines data sources and
    containers
  • System management defines pool sizes
  • System management defines role/user binding
43
EJB Container
Client
Entity Bean
invoke
Load/ persist
delegate
At the point of interception the container
provides the following services to the bean:
resource management, life-cycle,
state management, transactions, security,
persistence
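The interception idea can be sketched in a few lines: the client never calls the bean directly, the container wraps each call with security and transaction handling. This is a toy in Python, not the EJB API. The class `Container`, the role names and the log entries are assumptions.

```python
import functools


class Container:
    """Toy container adding security and 'transactions' around business methods."""

    def __init__(self, allowed_roles):
        self.allowed_roles = allowed_roles
        self.log = []

    def managed(self, method):
        @functools.wraps(method)
        def intercepted(caller_role, *args):
            # Security check before the bean is ever reached.
            if caller_role not in self.allowed_roles:
                raise PermissionError(f"role {caller_role!r} may not call {method.__name__}")
            self.log.append("begin")
            try:
                result = method(*args)
            except Exception:
                self.log.append("rollback")
                raise
            self.log.append("commit")
            return result

        return intercepted


container = Container(allowed_roles={"teller"})


@container.managed
def transfer(amount):
    """Business logic only: no security or transaction code in here."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    return f"transferred {amount}"


print(transfer("teller", 100))
print(container.log)
```

The business method stays free of infrastructure code, which is the separation of concerns the EJB slides refer to.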
44
Distributed Messages (MOM)
Asynchronous, loosely coupled (fault tolerant),
persistent messages with either publish/subscribe
(topics) or queuing semantics. Scales well.
Delivery guarantees differ.
[Diagram: publishers (Pub) publish to a topic that is read by subscribers (Sub); senders put messages (M1, M2) on a queue from which receivers get them]
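The difference between queuing and publish/subscribe semantics can be sketched in memory. Real MOM products add persistence, transactions and delivery guarantees, none of which is modelled here. Class and message names are assumptions.

```python
from collections import deque


class MessageQueue:
    """Queuing semantics: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def put(self, message):
        self._messages.append(message)

    def get(self):
        return self._messages.popleft()


class Topic:
    """Publish/subscribe semantics: every subscriber gets its own copy."""

    def __init__(self):
        self._inboxes = []

    def subscribe(self):
        inbox = deque()
        self._inboxes.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self._inboxes:
            inbox.append(message)


queue = MessageQueue()
queue.put("M1")
queue.put("M2")
first, second = queue.get(), queue.get()  # two gets consume two different messages

topic = Topic()
sub_a, sub_b = topic.subscribe(), topic.subscribe()
topic.publish("M1")

print(first, second)
print(list(sub_a), list(sub_b))
```

Sender and receiver never call each other directly and do not have to be active at the same time, which is where the loose coupling comes from.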
45
Distributed Code I (Agents, Aglets)
The problem: who wants a new runtime system?
[Diagram: an agent is packed into a serialized agent, sent over a channel between two agent runtimes (each on its own OS), and unpacked on the other side]
Perform work, come back with results
46
Distributed Code II (Jini): The End of Protocols?
Jini Lookup Service
Proxy moves to lookup service during registration
Proxy moves to client during service lookup
Jini Client
Jini Service
Service private protocol
Service Proxy Code
47
Peer 2 Peer
SETI@home, Freenet, JXTA etc.
[Diagram: peers connected through several ISPs to the Internet and DNS]
Nodes have no fixed IP address and frequent
down-times
P2P uses cycles, provides file sharing and
anonymity because no central servers are used
Problems: How do you version files? Overhead?
48
WebServices
Promises de-coupling of service provider and
requester, document interfaces,
machine-to-machine communication and ease of use
compared to distributed objects.
Core services: security, transactions etc.
Registry (advertise): Universal Description,
Discovery and Integration (UDDI)
Service features: Web Services Description
Language (WSDL)
Exchange messages: SOAP
Wire format/transport: XML syntax/HTTP
[Diagram: web server, broker]
Service Granularity? Application, Component,
Object or Request?
Use your de-hyper generously!
49
GRID Computing
A Grid providing OGSA
The problem: requires a massive amount of system
resources
[Diagram: resources from different companies]
The abstraction: single system image
Grid computing promises the just-in-time
availability of vast amounts of computing
resources, easily accessible through a single
system image. Scientific simulation or even game
construction (www.butterfly.net) are possible
applications. See
http://www.globus.org/research/papers/anatomy.pdf,
The Anatomy of the Grid.
50
(Tuple) Spaces
A space providing tuple storage
users or agents storing or finding tuples
users or agents interacting through the space
The abstraction: anything can be stored as long
as it is addressable
The world's largest space is the WWW. Other spaces
are WIKI-WIKI collaboration systems or more
traditional tuple spaces like tspaces or jspaces.
The principle is always the same: a few simple
methods (put/take/find) which let users or
machines store or find content. The content
itself is returned as a representation of a
resource. That's why some people call those
systems REST (Representational State Transfer
Architecture), after a thesis by Roy Fielding,
one of the authors of HTTP.
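The put/take/find principle can be sketched as a tiny in-memory tuple space. This is a minimal illustration, not the API of tspaces or jspaces. Using `None` as a wildcard in templates is an assumption of this sketch.

```python
class TupleSpace:
    """Toy tuple space: store tuples, retrieve them by template matching."""

    def __init__(self):
        self._tuples = []

    def put(self, tup):
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        # None in the template matches any value at that position.
        return len(template) == len(tup) and all(
            want is None or want == have for want, have in zip(template, tup)
        )

    def find(self, template):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def take(self, template):
        """Return the first matching tuple and remove it from the space, or None."""
        found = self.find(template)
        if found is not None:
            self._tuples.remove(found)
        return found


space = TupleSpace()
space.put(("quote", "IBM", 81.5))
space.put(("quote", "SUN", 4.2))

print(space.find(("quote", "SUN", None)))  # read without removing
print(space.take(("quote", "IBM", None)))  # read and remove
print(space.find(("quote", "IBM", None)))  # nothing left to find
```

Producers and consumers only share the space and the shape of the tuples. They never need to know about each other.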
51
Others
  • Internet Games
  • Portal Architectures
  • Java Communicating Sequential Processes (JCSP). A
    library implementing Hoare's CSP.
  • Pi-Calculus for mobility
  • Mozart/OZ, http://www.mozart-oz.org/
  • E-language, http://www.erights.org/
  • Erlang language for distributed telco systems
    with asynchronous message passing,
    http://www.erlang.org/
  • Parallel Processing: PVM, MPI (e.g. for Linux
    Beowulf cluster)
  • Wireless mobile communication, Bluetooth
  • System Management
  • Jiro/FMA/JMX
  • Group Computing (virtual synchrony): Horus, iBus,
    javagroups (www.javagroups.org, good for
    building distributed caches or HA
    infrastructures)
  • Distribution Subsystem (DSS) middleware library
  • Simjava (discrete event simulation)
  • Gridsim (grid simulation package)
  • Teatime (www.opencroquet.org)

52
Future Applications
  • Collective Intelligence: collaborative production
    of content
  • Mashups: dynamic integration of external sources
  • Social Network Analysis: the use of information
    and knowledge from many people and their personal
    networks.
  • Sensor Mesh Networks: ad hoc (self-organizing)
    networks formed by dynamic meshes of peer nodes,
    each of which includes simple networking,
    computing and sensing capabilities.
    (Real-world web)
  • Event-driven Applications: an architectural style
    for distributed applications, in which certain
    discrete functions are packaged into modular,
    encapsulated, shareable components, some of which
    are triggered by the arrival of one or more event
    objects.
  • Web 2.0: represents a broad collection of recent
    trends in Internet technologies and business
    models. Particular focus has been given to
    user-created content, lightweight technology,
    service-based access and shared revenue models.
    (Technical base: AJAX, RubyOnRails etc.)

Roughly taken from Gartner's 2006 Emerging
Technologies, http://www.gartner.com/it/page.jsp?id=495475
53
Resources (Technologies)
  • Jim Waldo, End of Protocols
  • Java Data Objects (JDO) specification
    (www.java.sun.com)
  • ObjectSpectrum 7/2001, WebServices
  • Jim Waldo, A note on distributed computing
    (please read before the next session)
  • Java Magazine 7/2001, Java Message Service
  • www.theserverside.com on Enterprise Java Beans
  • Clay Shirky, What is P2P and what isn't
    (www.openp2p.com)
  • S.Tai, I.Rouvellou, Strategies for Integrating
    Messaging and Distributed Object Transactions

54
Resources (Programming)
  • Wolfgang Emmerich, Engineering Distributed
    Objects (www.distributed-objects.com) With slides
    and tests.
  • Marco Boger, Java in verteilten Systemen
  • Mastering Enterprise Java Beans
    (www.theserverside.com) free!
  • Ted Neward, Java Server Side Programming
    (sockets, servlets etc.) www.manning.com/neward
  • www.swarm.org, portal for swarm programming. Used
    also as simulation tools for research in
    economics and finance

55
Resources (Systems)
  • Coulouris et al., Distributed Systems
  • Andrew Tanenbaum, Maarten van Steen, Distributed
    Systems. Get this one or Coulouris for a
    long-term effect.
    (http://ajax.prenhall.com/divisions/esm/app/author_tanenbaum/custom/dist_sys_1e/index.html,
    slides and book chapters)
  • Ken Birman, Building Secure and Reliable Network
    Applications (a good draft existed once)
  • Gray/Reuter, Transaction Processing
  • Jiro/Federated Management Architecture (FMA)
  • Open Grid Service Architecture,
    http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf,
    explains the services needed in the new GRID
    computing paradigm

56
Resources (Theory)
  • Designing Distributed Systems, A Conversation
    with Ken Arnold, Part III,
    http://www.artima.com/intv/distribP.html, shows
    the importance of failures and state in DS
  • The Paradigm Shift from Algorithms to
    Interaction, Peter Wegner, 1996, a provocative
    short essay on why interactive systems are much
    more powerful than Turing machines. Shows that DS
    is more than just concurrency and remoteness. The
    basics of emergence and non-algorithmic behavior.
    Good for agent systems as well.
  • Kenneth P. Birman, Reliable Distributed Systems:
    Technologies, Web Services, and Applications
  • Phillip J. Windley, Digital Identity. Contains the
    architecture of identity repositories including
    federation aspects. Network effects and their
    impact on bilateral identity management.

57
Resources (Scale-Free)
  • Stability and topology of scale-free networks
    under attack and defense strategies, Lazaros K.
    Gallos et al.,
    http://xxx.lanl.gov/pdf/cond-mat/0505201
  • Albert-Laszlo Barabasi, Linked. Investigates
    small worlds, scale free networks etc. Basically
    moves from random networks to hub/spoke
    architectures. Discovered how the WWW space is
    organized (in/out/core/islands etc.). A must read
    for everybody interested in the effects of
    topology (e.g. on virus spreads)

58
Resources (Web)
  • Tim O'Reilly's famous article on Web 2.0,
    http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
  • Gartner's 2006 Emerging Technologies Hype Cycle
    Highlights Key Technology Themes,
    http://www.gartner.com/it/page.jsp?id=495475
  • Mashups: Duane Merrill,
    http://www-128.ibm.com/developerworks/library/x-mashups.html?ca=dnw-727

59
Resources (Events, Simulation)
  • Simjava, discrete event simulation package.
    Tutorial at
    http://www.dcs.ed.ac.uk/home/simjava/tutorial/
  • GridSim, Grid Simulation Package,
    http://www.gridbus.org/gridsim/gridsim2.2/