Bran Selic, Rational Software Canada, bselic@rational.com

Transcript and Presenter's Notes

1
Physical Programming: Beyond Mere Logic
  • Bran Selic, Rational Software Canada, bselic@rational.com

2
What I am Hoping For
[Slide graphic; surviving text fragment: "E THEORY AND PRACTICE OF SOFTWARE"]
3
The Ideal and the Real
  • By focussing on the imperfect world of physical
    reality we may miss the essence
  • Software seems much closer to the ideal world

4
The Software World
  • Fundamental design principle: separate program
    logic from the underlying implementation
    technology
  • separation of concerns
  • software portability

[Figure labels: Program Logic; HL Programming Languages; Computing Environment Technology]
5
The Real-Time Software World
  • Key question: How long will it take?
  • The quantitative characteristics of the computing
    environment encroach upon the purity of the logic
  • software design involves engineering tradeoffs

6
A Simple Programming Application
  • Traverse a transactions log database and print
    all transactions pertaining to a specific account

open (DB)
for i := 1 to DB.size do
  record := read (DB)
  if (record.acctNo = myAccount) then print (record)
enddo
close (DB)
7
Porting to a Distributed Environment
  • Can it really be this simple?

[Figure label: Network]
Local version:
open (DB)
for i := 1 to DB.size do
  record := read (DB)
  if (record.acctNo = myAccount) then print (record)
enddo
close (DB)
Distributed version:
RPC_open (DB)
for i := 1 to DB.size do
  record := RPC_read (DB)
  if (record.acctNo = myAccount) then print (record)
enddo
RPC_close (DB)
8
Some (Unstated!) Assumptions
  • The CPU and database are fast enough for the
    needs of the application
  • e.g. random access database hardware
  • The CPU and database fail as a unit
  • i.e., no need to contend with failures of the
    database
  • Communications is reliable
  • order preserving
  • exactly once semantics
  • A system never has anything more important to do
    than what it is doing at the moment

9
Partial Failures
  • Distributed systems can exhibit partial failures
  • fault tolerance: the ability to recover from partial
    failures
  • Issue: failure recovery strategy
  • fault detection
  • failure recovery
  • fault diagnosis
  • Issue: how do other sites detect that a site has
    failed?
  • (apparent) lack of activity/response
  • how do we distinguish between a failed site and a
    lost message?
  • Timeout is the only general mechanism available
  • how long do we wait?
  • Tradeoff between responsiveness and degree of
    certainty
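
The timeout tradeoff can be made concrete with a minimal sketch. The class name, site name, and timings below are assumptions chosen for illustration; time is passed in explicitly so the behaviour is deterministic.

```python
class TimeoutFailureDetector:
    """Suspects a site once nothing has been heard from it for `timeout` time units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}  # site name -> time of the last message received

    def heard_from(self, site, now):
        self.last_heard[site] = now

    def suspects(self, site, now):
        # A suspicion is not proof: the site may be alive but slow,
        # or its message may have been lost in transit.
        last = self.last_heard.get(site)
        return last is None or (now - last) > self.timeout


# Hypothetical scenario: site "B" last replied at t=0.0 and is merely slow.
impatient = TimeoutFailureDetector(timeout=1.0)
patient = TimeoutFailureDetector(timeout=5.0)
for detector in (impatient, patient):
    detector.heard_from("B", now=0.0)

false_alarm = impatient.suspects("B", now=2.0)       # short timeout: quick, but wrong here
still_trusted = not patient.suspects("B", now=2.0)   # long timeout: right here, but slow to notice a real failure
```

Neither setting is correct in general: the short timeout reacts quickly and raises a false alarm, while the long one avoids the false alarm at the cost of a slower reaction to a genuine failure.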

10
A More Realistic Distribution Scenario
  • Dealing with partial failures

DB := locate_database (Network) exception abort
RPC_open (DB) exception do
  DB := locate_database (Network) exception abort
enddo
for i := 1 to DB.size do
  record := RPC_read (DB) exception do
    DB := locate_database (Network) exception abort
    for j := 1 to (i-1) do RPC_read (DB) exception abort
    retry
  enddo
  if (record.acctNo = myAccount) then print (record)
enddo
RPC_close (DB)
Most of the code is in the exception handlers!
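
The same recovery pattern can be sketched in runnable form. This is an illustration under stated assumptions, not the presenter's code: `FlakyLogDB` and `SiteFailure` are hypothetical stand-ins for the remote database and its failure signal, and `max_recoveries` is an added bound on retries.

```python
class SiteFailure(Exception):
    """Raised by the hypothetical remote database when the serving site fails."""


class FlakyLogDB:
    """In-memory stand-in for a remote transaction log (hypothetical)."""

    def __init__(self, records, fail_on_reads=()):
        self.records = records
        self.fail_on_reads = set(fail_on_reads)  # read-call numbers that fail
        self.reads_attempted = 0
        self.cursor = 0

    def reopen(self):
        # Plays the role of locate_database + RPC_open: a fresh replica starts at the beginning.
        self.cursor = 0

    def read(self):
        self.reads_attempted += 1
        if self.reads_attempted in self.fail_on_reads:
            raise SiteFailure("site stopped responding")
        record = self.records[self.cursor]
        self.cursor += 1
        return record


def matching_transactions(db, account, max_recoveries=3):
    found, recoveries, i = [], 0, 0
    while i < len(db.records):
        try:
            record = db.read()
        except SiteFailure:
            recoveries += 1
            if recoveries > max_recoveries:
                raise              # corresponds to "abort"
            db.reopen()
            for _ in range(i):     # replay the reads already consumed
                db.read()          # a failure here propagates: "exception abort"
            continue               # corresponds to "retry"
        if record["acctNo"] == account:
            found.append(record)
        i += 1
    return found


log = [{"acctNo": 7, "amt": 10}, {"acctNo": 9, "amt": 5}, {"acctNo": 7, "amt": -3}]
db = FlakyLogDB(log, fail_on_reads={2})   # the second read call fails
result = matching_transactions(db, account=7)
```

Three records cost five read attempts here, and the recovery branch is longer than the main path, which is the point the slide makes.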
11
Asynchronous Events and Fault Tolerance
  • Partial system failures are only one kind of
    event that may need to be handled in the course
    of execution of a distributed program
  • Others
  • high-priority situations (e.g., imminent
    deadlines)
  • aborts
  • These events are often unpredictable
  • may occur at any point in the execution of a
    program
  • fault tolerance requires that whenever they occur
    and whatever they are, we need to deal with them

12
Revisiting An Old Assumption
  • Is the traditional main-path-focussed
    programming style appropriate when exceptions are
    the rule?

13
Asynchronous Event Handling
  • This is nicely captured by the state-event matrix
    of finite state machines

[Figure: state-event matrix; surviving labels: Event A, etc., Event S; Handler AN, Handler AN1, Handler AN2]
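
A state-event matrix can be sketched as a table keyed by (state, event), with one handler per cell. The states, events, and handlers below are hypothetical, chosen to echo the remote-database client from the earlier slides.

```python
STATES = ("IDLE", "WAITING", "RECOVERING")
EVENTS = ("request", "reply", "timeout", "located")

def start_read(ctx):
    ctx["pending"] = True
    return "WAITING"

def deliver(ctx):
    ctx["pending"] = False
    ctx["delivered"] += 1
    return "IDLE"

def recover(ctx):
    ctx["recoveries"] += 1
    return "RECOVERING"

def resume(ctx):
    return "WAITING" if ctx["pending"] else "IDLE"

def ignore(ctx):
    return None  # stay in the current state

# One handler for every (state, event) pair: no event is unexpected in any state.
MATRIX = {
    ("IDLE", "request"): start_read, ("IDLE", "reply"): ignore,
    ("IDLE", "timeout"): ignore,     ("IDLE", "located"): ignore,
    ("WAITING", "request"): ignore,  ("WAITING", "reply"): deliver,
    ("WAITING", "timeout"): recover, ("WAITING", "located"): ignore,
    ("RECOVERING", "request"): ignore,  ("RECOVERING", "reply"): ignore,
    ("RECOVERING", "timeout"): recover, ("RECOVERING", "located"): resume,
}

def dispatch(state, event, ctx):
    next_state = MATRIX[(state, event)](ctx)
    return state if next_state is None else next_state

ctx = {"pending": False, "delivered": 0, "recoveries": 0}
state = "IDLE"
for event in ["request", "timeout", "reply", "located", "reply"]:
    state = dispatch(state, event, ctx)

matrix_is_complete = all((s, e) in MATRIX for s in STATES for e in EVENTS)
```

Filling in every cell forces a decision about each event in each state, including the late reply that arrives during recovery.
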
14
A Conclusion
  • In an event-driven and deadline-based
    application, a state machine-based programming
    model may be more appropriate than the
    traditional algorithmic (main path) programming
    model
  • The environment strikes back
  • the program logic is strongly affected by the
    environment

15
Communication Media Failures
  • Message loss
  • due to hardware failures
  • due to software failures (e.g., buffer overflow)
  • Message reordering
  • due to different paths
  • due to variable delays (e.g., due to variable
    message lengths)
  • retransmission due to fault-tolerant protocols
  • Message duplication
  • due to faulty hardware
  • retransmission due to fault-tolerant protocols
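
One common way to mask reordering and duplication, and to detect loss, is to number the messages at the sender. The receiver below is a minimal sketch of that idea, not a full protocol; the class name and the message trace are assumptions.

```python
class OrderedReceiver:
    """Delivers each message once, in sequence-number order."""

    def __init__(self):
        self.next_seq = 0
        self.held = {}        # out-of-order messages, keyed by sequence number
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self.next_seq or seq in self.held:
            return  # duplicate: already delivered or already buffered
        self.held[seq] = payload
        while self.next_seq in self.held:
            self.delivered.append(self.held.pop(self.next_seq))
            self.next_seq += 1

    def missing(self):
        """Sequence numbers that are blocking delivery (candidates for retransmission)."""
        if not self.held:
            return []
        return [s for s in range(self.next_seq, max(self.held)) if s not in self.held]


# The sender emitted a..e as 0..4; the network reorders, duplicates 1, and loses 3.
rx = OrderedReceiver()
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (4, "e"), (2, "c")]:
    rx.receive(seq, payload)

before_retransmit = list(rx.delivered)
gaps = rx.missing()
rx.receive(3, "d")  # retransmission fills the gap and releases the held message
```

The retransmission that repairs the loss is itself a source of duplicates and reordering, which is why the receiver has to tolerate both.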

16
Transmission Delays
  • Possibility of out of date status information

17
Relativistic Effects
  • Relativistic effects
  • different observers see different event orderings
    (due to different and variable transmission
    delays)
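
A few lines of arithmetic show how this happens. The sites, observers, send times, and delays below are assumed values picked for illustration.

```python
# Two events occur at sites X and Y: (name, origin, time sent).
events = [("x_updated", "X", 0.0), ("y_updated", "Y", 1.0)]

# Assumed one-way transmission delays from each origin to each observer.
delay = {
    ("X", "P"): 1.0, ("Y", "P"): 5.0,
    ("X", "Q"): 9.0, ("Y", "Q"): 2.0,
}

def observed_order(observer):
    # Each observer sees an event when its notification arrives, not when it happened.
    arrivals = sorted((sent + delay[(origin, observer)], name)
                      for name, origin, sent in events)
    return [name for _, name in arrivals]

order_at_p = observed_order("P")   # x arrives at 1.0, y at 6.0
order_at_q = observed_order("Q")   # y arrives at 3.0, x at 9.0
```

Observer P sees the update at X first, while observer Q sees the update at Y first, even though both are watching the same two events.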

18
Distribution Transparencies
  • Providing supporting layers of functionality that
    shield the application from the undesirable
    effects of distribution
  • e.g., reliable communication protocols

[Figure labels: client, server]
19
Impossibility Result No.1
  • It is not possible to guarantee that agreement
    can be reached in finite time over an
    asynchronous communication medium, if the medium
    is lossy or one of the distributed sites can fail
  • Fischer, M., Lynch, N., and Paterson, M.,
    "Impossibility of Distributed Consensus with One
    Faulty Process," Journal of the ACM, 32(2), April
    1985.

20
Impossibility Result No.2
  • Even when communication is fully reliable, it is
    not possible to guarantee common knowledge if
    communication delays are unbounded
  • Halpern, J.Y., and Moses, Y., "Knowledge and
    Common Knowledge in a Distributed Environment,"
    Journal of the ACM, 37(3), 1990.

21
The End-To-End Argument
  • Transparency mechanisms are intended to protect
    the application from observing the undesirable
    effects of distribution
  • Most transparency types require distributed
    agreement!
  • The end-to-end argument (Saltzer et al.):
  • if transparency cannot be guaranteed, the
    application is not really shielded from the
    effects of distribution
  • the overhead of introducing transparency
    mechanisms may not be justified

22
Stepping Back...
  • Most distribution problems are a consequence of
    the encroachment of the physical world into the
    pliable and limitless logical world of software
  • the problem is fundamental (e.g., the end-to-end
    argument)
  • Traditional Programming = Logic
  • Physical Programming = Logic + Physics
  • like traditional engineers, software designers
    must take into account the raw material out of
    which they spin their logic
  • finite resources, finite delays, finite
    reliability...

23
Quality of Service Concepts
  • The physical characteristics of software can be
    specified using the general notion of Quality of
    Service (QoS)
  • a specification of how well a service is (to be)
    performed
  • e.g. throughput, capacity, response time
  • usually a quantitative measure
  • QoS specifications are two-sided
  • offered QoS: the QoS that is offered to clients
  • required QoS: the QoS required by a client

24
Resources and Quality of Service
  • Resource: an element whose functional capacity is
    limited, directly or indirectly, by the finite
    capacities of the underlying physical computing
    environment
  • The services of a resource are characterized by
    one or more QoS attributes
  • capacity, reliability, availability, response
    time, etc.

[Figure: a Client places a Resource Demand on a Resource; the client's RequiredQoS is checked against the resource's OfferedQoS]
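
Checking a required QoS against an offered QoS can be sketched as an attribute-by-attribute comparison. The attribute names, the direction of each comparison, and the numbers are assumptions for illustration.

```python
# For each attribute: True if larger offered values are better, False if smaller are better.
LARGER_IS_BETTER = {
    "response_time_ms": False,
    "throughput_per_s": True,
    "availability": True,
}

def unmet_requirements(required, offered):
    """Return the names of the required attributes that the offer does not satisfy."""
    unmet = []
    for attr, required_value in required.items():
        offered_value = offered[attr]
        if LARGER_IS_BETTER[attr]:
            ok = offered_value >= required_value
        else:
            ok = offered_value <= required_value
        if not ok:
            unmet.append(attr)
    return unmet

# Hypothetical client demand and resource offer.
required = {"response_time_ms": 5.0, "throughput_per_s": 100.0}
offered = {"response_time_ms": 8.0, "throughput_per_s": 250.0, "availability": 0.999}

shortfalls = unmet_requirements(required, offered)
```

The resource offers more than enough throughput but responds too slowly, so the demand is not satisfied. The client places no requirement on availability, so that attribute is not checked.
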
25
Simple Example
  • Concurrent tasks accessing a monitor with known
    response time characteristics

[Figure labels: Required QoS; Deadline 3 ms; MaxExecutionTime 4 ms; Offered QoS]
26
Types and Physical Types
  • The purpose of types is to tell us about the
    externally relevant properties of software
    components so that we can validate whether they
    are being used appropriately
  • Physical types: type specifications that
    incorporate QoS characteristics
  • Answer two key engineering questions
  • can this component support the load intended
    for it?
  • what does this component require to support its
    offered QoS?

27
Physical Type Example
  • A semaphore type
  • class Semaphore
  • heap 10 bytes -- required QoS
  • CPU? 5 MIPS -- required QoS
  • get() proc ? 0.4CPU us stack 4 bytes
  • rel() proc ? 0.4CPU us stack 4 bytes
  • Usage
  • mySema Semaphore
  • mySema.get() proc? 3 us -- req. QoS

28
Violation of Encapsulation?
  • Aren't the offered QoS characteristics a
    consequence of the implementation?
  • Not necessarily...
  • The offered QoS characteristics can and should be
    defined independently of the implementation
  • the worst-case numbers of traditional
    engineering
  • The contractual obligations that the component
    designer is willing to assume

29
Physical Type Checking
  • Can physical types be statically checked?
  • The good news: yes, they can (in most cases)
  • The bad news: this typically requires complex analysis
    methods (queueing network analysis,
    schedulability analysis, etc.)
  • but then, model checking and theorem proving are
    not simple either
  • Some issues
  • Typically, QoS-based analyses cannot be done
    incrementally -- the full system context is
    required
  • but then, the same holds for many formal
    verification methods
  • Each type of QoS (e.g., bandwidth, CPU
    performance) combines differently
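
As an illustration of QoS types combining differently, consider a request that crosses several hops in series. The per-hop figures are assumed, and the availability rule additionally assumes that the hops fail independently.

```python
import math

# Assumed figures for a path of three hops in series (illustrative values).
hops = [
    {"delay_ms": 0.5, "bandwidth_mbps": 1000.0, "availability": 0.999},
    {"delay_ms": 20.0, "bandwidth_mbps": 10.0, "availability": 0.99},
    {"delay_ms": 2.0, "bandwidth_mbps": 100.0, "availability": 0.995},
]

end_to_end = {
    # Delays add along the path.
    "delay_ms": sum(h["delay_ms"] for h in hops),
    # The slowest hop limits the whole path.
    "bandwidth_mbps": min(h["bandwidth_mbps"] for h in hops),
    # Every hop must be up at once (independent failures assumed).
    "availability": math.prod(h["availability"] for h in hops),
}
```

Three different combination rules (sum, minimum, product) are needed for three attributes of the same path, so a single generic checking rule does not cover all QoS types.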

30
Required QoS
  • Like all guarantees, the offered QoS is
    contingent on the component getting what it needs
    to do its job
  • There are two distinct dimensions to this
  • the peer dimension
  • the layering dimension

31
Logical Viewpoint
  • Example: logical view of aircraft simulator
    software

[Figure: logical components: INSTRUCTOR STATION, AIRFRAME, ATMOSPHERE MODEL, PILOT CONTROLS, CONTROL SURFACES, GROUND MODEL, ENGINES]
32
Engineering (Realization) Viewpoint
  • The realization of a specific set of logical
    components using facilities of the run-time
    environment

33
Viewpoints and Mappings
Realization mappings
34
The Engineering Viewpoint
  • The engineering viewpoint represents the raw
    material out of which we construct the logical
    viewpoint
  • the quality of the outcome is only as good as the
    quality of the ingredients that are put in
  • as in all true engineering, the quantitative
    aspects of the logical model are often crucial
    (How long will it take? How much will be
    required?)

35
Distributed Systems Dilemma
  • Dilemma: How can we account for the engineering
    characteristics of the system without prematurely
    and possibly unnecessarily committing to a
    specific technology?
  • Proposed solution: Include in the logical model a
    generic (technology-neutral) specification of the
    required/expected characteristics of the
    engineering environment

36
Viewpoint Separation
  • Required Environment: a technology-neutral
    environment specification required by the logical
    elements of a model

Logical Viewpoint
37
Required Environment Specifications
  • What a logical component needs in order to
    perform its function according to spec

realization mapping
38
Required Environment Partitions
  • Logical elements often share common QoS
    requirements

QoS domain (e.g., failure unit, uniform comm properties)
39
QoS Domains
  • Specify a domain in which certain QoS values
    apply throughout
  • failure characteristics (failure modes,
    availability, reliability)
  • CPU speeds
  • communications characteristics (delay,
    throughput, capacity)
  • etc.
  • The QoS values of a domain can be compared
    against those of a concrete engineering
    environment to see if a given environment is
    adequate for a specific model

40
Physical Programming
  • The notions of QoS and QoS domains enable the
    design of distributed systems that properly
    account for the effects of distribution and other
    non-transparent physical phenomena, while
    allowing for a high degree of portability and
    technology independence
  • They are also the basis for formal verification
    of realization mappings
  • required QoS compared against the QoS of the
    proposed engineering environment
  • May also be used to automatically synthesize
    engineering environments that satisfy a given QoS
    specification of a logical model
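
Both uses can be sketched with a small brute-force example: checking one realization mapping, and enumerating the mappings that satisfy the specification. The domain names, node names, attributes, and numbers are hypothetical; realistic analyses are far more involved.

```python
from itertools import product

# Required environment per QoS domain (technology-neutral; hypothetical values).
required = {
    "flight_dynamics": {"min_cpu_mips": 50, "max_comm_delay_ms": 2, "min_availability": 0.999},
    "instructor_ui": {"min_cpu_mips": 10, "max_comm_delay_ms": 50, "min_availability": 0.99},
}

# What each node of a candidate engineering environment offers (hypothetical values).
nodes = {
    "rack_server": {"cpu_mips": 200, "comm_delay_ms": 1, "availability": 0.9995},
    "desktop": {"cpu_mips": 80, "comm_delay_ms": 10, "availability": 0.995},
}

def violations(mapping):
    """Return (domain, attribute) pairs where the mapped node falls short."""
    problems = []
    for domain, node_name in mapping.items():
        need, have = required[domain], nodes[node_name]
        if have["cpu_mips"] < need["min_cpu_mips"]:
            problems.append((domain, "cpu"))
        if have["comm_delay_ms"] > need["max_comm_delay_ms"]:
            problems.append((domain, "comm_delay"))
        if have["availability"] < need["min_availability"]:
            problems.append((domain, "availability"))
    return problems

def adequate_mappings():
    """Enumerate every domain-to-node mapping and keep those with no violations."""
    domains = sorted(required)
    candidates = (dict(zip(domains, choice))
                  for choice in product(sorted(nodes), repeat=len(domains)))
    return [m for m in candidates if not violations(m)]

rejected = violations({"flight_dynamics": "desktop", "instructor_ui": "desktop"})
accepted = adequate_mappings()
```

The logical model never names a technology: it states what each domain needs, and any environment whose nodes meet those needs is acceptable.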

41
Conclusions and an Appeal...
  • The physical aspects of software will not go away
  • ignoring them can be perilous, especially when
    working with distributed systems
  • most interesting software systems of the future
    will be distributed and will have stringent
    dependability requirements ('cannot reboot the
    Internet')
  • What is needed is a proper theoretical framework
    for dealing with physical types
  • The QoS framework described here is currently
    being incorporated into a profile of UML for
    real-time applications