Title: Applying RAMS to Design of Safety- and Mission-Critical Java Standards
1Applying RAMS to Design of Safety- and
Mission-Critical Java Standards
- Kelvin Nilsen, Ph.D., CTO
2The Lost Art of Programming Language Design
- Brief History of Programming Language Introductions
  - 1957 Fortran
  - 1959 LISP, COBOL
  - 1960 Algol
  - 1962 APL, SNOBOL
  - 1972 C, Prolog
  - 1975 Pascal, Basic, Scheme
  - 1980 Smalltalk-80, Modula-2, C++ precursors
  - 1983 Ada
  - 1986 C++, Smalltalk-V
- Expressive Power (enable writing of code)
- Abstraction and Encapsulation (make code readable)
- Object Orientation, Maintenance (make code extensible)
Where are we today with real-time Java?
3What is Real Time?
- Soft Real-Time sometimes connotes uncertainty regarding deadlines, resource requirements, budgeting, and enforcement. Here, we assume it means:
  - Awareness of resource requirements and timing constraints
  - Disciplined approach to allocating resources so as to satisfy timing constraints
  - Resource needs analysis, budgeting, and enforcement may use empirical and/or heuristic techniques
- Hard Real-Time means resource needs are determined analytically and budgets enforced algorithmically, guaranteeing 100% compliance with timing constraints.
4Our Focus
- In this talk, we are specifically considering DO-178B levels A, B, and C.
- For levels A and B, we are primarily considering hard real-time problems
- RAMS is a European Space Agency (ESA) term representing Reliability, Availability, Maintainability, and Safety
- ESA expects subcontractors to apply RAMS principles in all software developed for use within the European space program
- Note that developers of real-time software are responsible for analyzing and proactively managing more details than traditional IT developers (i.e., memory, CPU time, blocking times, and real time)
5Not RAMS, But Also Very Relevant
- Generality
  - java.util.collections and other standard Java APIs
  - Interrupt handling
- Efficiency
  - Is the technology relevant and interesting to more than a small specialized niche?
- Efficiency
  - How much memory required?
  - How much CPU performance required?
  - How long will it run on a single battery?
  - How many cents does it add to the cost of each unit manufactured?
6Preliminary Performance Metrics for JRTK
Benchmark                JRTK performance vs. C    JRTK performance vs. Java HotSpot
Sieve (standalone)       120                       307
Sieve (CaffeineMark)     90                        148
Loop (CaffeineMark)      114                       326
Logic (CaffeineMark)     65                        148
Method (CaffeineMark)    71                        108
- Note: most other real-time Java technologies (including PERC) run slower than traditional Java
7Reliability Concerns
- Traditional Java does not fully constrain the initialization of global (static) variables, so that initial (and final) values of these variables may depend on various race conditions
  - Can we be sure that global variables are initialized before they are used?
  - Can we be sure that global final variables are constants?
- The Ravenscar Java proposal suggests that during initialization, we apply different operating semantics than during the mission phase
  - How long will it take to complete initialization?
  - How much memory will be required to perform initialization?
  - How much memory will be available for execution of mission code after initialization has completed?
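The first concern on this slide can be reproduced even without threads. The sketch below is a minimal illustration, not taken from the talk: the class names `InitOrderSketch`, `A`, `B` and the helper `next` are hypothetical. Two classes have static finals that depend on each other, so one of them is read while it still holds its default value of 0.

```java
public class InitOrderSketch {
    // Hypothetical helper: a method call keeps the initializers from being
    // compile-time constants, so they run during class initialization.
    static int next(int v) {
        return v + 1;
    }

    static class A {
        // Reads B.Y while A is still being initialized.
        static final int X = next(B.Y);
    }

    static class B {
        // If A's initialization is already in progress, A.X still reads as 0 here.
        static final int Y = next(A.X);
    }

    // Touches A first, then B, and returns the values observed.
    static int[] observe() {
        return new int[] { A.X, B.Y };
    }

    public static void main(String[] args) {
        int[] v = observe();
        System.out.println("A.X=" + v[0] + " B.Y=" + v[1]); // prints "A.X=2 B.Y=1"
    }
}
```

Had `B` been touched first, the same program would yield A.X=1 and B.Y=2, so the "final" values depend on which class happens to be initialized first.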
8Reliability Solutions
- The Scalable Java proposal requires that:
  - All initialization of global variables adhere to particular style guidelines that allow the initialization expressions to be analyzed prior to run time
  - Intelligent linking tools precompute the initial values of (most) global variables so that these initial values are encoded as part of the load image, and placed in ROM for final variables.
  - Static analysis tools enforce that every static variable is initialized before use, and that initialization expressions are free of race conditions and circular dependencies.
9Reliability Concerns
- The RTSJ programming abstractions are difficult to use and error prone. Consider:
  - MemoryAccessError thrown if NoHeapRealtimeThread fetches a reference to heap memory
  - IllegalAssignmentError if RealtimeThread attempts to write heap reference into a ScopedMemory object
  - IllegalAssignmentError if RealtimeThread or NoHeapRealtimeThread attempts to write ScopedMemory reference to ImmortalMemory object
  - IllegalAssignmentError if RealtimeThread or NoHeapRealtimeThread attempts to write reference to inner-nested ScopedMemory object to an outer-nested ScopedMemory object
- Note: code that is perfectly valid in one context is considered erroneous in other contexts, but there are no syntactic markers on the code to indicate where it is valid
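The last two assignment rules on this slide can be illustrated with a toy model. This sketch does not use the RTSJ (javax.realtime) API; `ScopeRuleSketch`, `Area`, and `storeAllowed` are hypothetical names, and immortal memory is modeled simply as the root of the nesting chain. The assumption is that a store is legal only when the referenced object lives in the holder's own area or in an area that encloses it.

```java
public class ScopeRuleSketch {
    /** Hypothetical stand-in for a memory area; a null parent models immortal memory. */
    static final class Area {
        final String name;
        final Area parent;

        Area(String name, Area parent) {
            this.name = name;
            this.parent = parent;
        }

        /** True if this area is 'other' itself or one of the areas enclosing 'other'. */
        boolean isSameOrOuterOf(Area other) {
            for (Area a = other; a != null; a = a.parent) {
                if (a == this) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Models the check behind "holder.field = referent" for objects in the given areas. */
    static boolean storeAllowed(Area holder, Area referent) {
        return referent.isSameOrOuterOf(holder);
    }

    public static void main(String[] args) {
        Area immortal = new Area("immortal", null);
        Area outer = new Area("outer", immortal);
        Area inner = new Area("inner", outer);

        System.out.println(storeAllowed(inner, outer));    // inner object holds outer reference: legal
        System.out.println(storeAllowed(outer, inner));    // outer object holds inner reference: illegal
        System.out.println(storeAllowed(immortal, outer)); // immortal object holds scoped reference: illegal
    }
}
```

The same assignment statement is legal or illegal depending only on where the two objects happen to live, which is why nothing in the source text marks the difference.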
10Reliability Concerns
- More RTSJ run-time exceptions:
  - InaccessibleAreaException if I attempt to allocate memory within an area that this thread has not entered
  - MemoryScopeException if a wait-free queue is constructed with the ends of the queue in inappropriate areas
  - ScopedCycleException if a thread's attempt to enter a memory scope would violate the single-parent rule
  - ThrowBoundaryError if a thrown exception attempts to propagate beyond departure from the scope within which it was allocated
- In an ideal world, programmers would not be allowed to write programs that are vulnerable to these problems, but enforcement of this guideline is intractable
11Reliability Solutions
- The Scalable Java proposal forbids all behaviors that could result in MemoryAccessError, InaccessibleAreaException, MemoryScopeException, and ScopedCycleException.
- The Scalable Java proposal requires syntactic markers and checked exception handling in the rare contexts that might possibly throw IllegalAssignmentError. Further, it enables system managers to forbid code that might throw IllegalAssignmentError, and to enforce prohibition with standard tools.
- Open issue: The Scalable Java proposal may prohibit throwing of scoped exceptions, or may establish standards for tools that would provide project-specific optional enforcement of this restriction.
12Reliability Concerns
- The RTSJ does not guarantee constant-time entry into a new inner-nested LTMemory scope, and does not guarantee that there will exist sufficient defragmented memory to allow entry into that scope
- The RTSJ is not able to guarantee that a previously entered, exited, and subsequently re-entered memory scope is in its virgin state when the scope is re-entered. Consequently, reliability of subsequent allocations performed within the scope is compromised
13Reliability Solution
- The Scalable Java proposal requires LIFO entry and exit of memory scopes and forbids overriding of Object's finalize() method. This guarantees:
  - absence of fragmentation,
  - constant-time entry and exit of each nested memory scope, and
  - that all newly entered scopes are in their virgin state
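Why LIFO discipline yields these guarantees can be seen in a small stack-allocator model. This is a sketch under stated assumptions, not the proposal's implementation: `ScopeStackSketch` and its methods are hypothetical, and the model tracks byte offsets only rather than real objects.

```java
public class ScopeStackSketch {
    private final int capacity; // total bytes available to all nested scopes
    private final int[] marks;  // saved allocation pointer for each entered scope
    private int depth = 0;      // number of scopes currently entered
    private int top = 0;        // next free byte offset

    ScopeStackSketch(int capacityBytes, int maxDepth) {
        this.capacity = capacityBytes;
        this.marks = new int[maxDepth];
    }

    /** Constant time: remember where the new scope starts. */
    void enter() {
        if (depth == marks.length) {
            throw new IllegalStateException("scope nesting too deep");
        }
        marks[depth++] = top;
    }

    /** Constant time: discard everything allocated since the matching enter(). */
    void exit() {
        if (depth == 0) {
            throw new IllegalStateException("no scope entered");
        }
        top = marks[--depth];
    }

    /** Bump allocation inside the innermost scope; returns the start offset. */
    int allocate(int bytes) {
        if (depth == 0) {
            throw new IllegalStateException("allocation outside any scope");
        }
        if (bytes < 0 || bytes > capacity - top) {
            throw new IllegalStateException("scope memory exhausted");
        }
        int offset = top;
        top += bytes;
        return offset;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        ScopeStackSketch s = new ScopeStackSketch(64, 4);
        s.enter();
        s.allocate(16);
        s.enter();
        int first = s.allocate(32);
        s.exit();
        s.enter();
        int second = s.allocate(32); // re-entered scope starts at the same offset
        s.exit();
        s.exit();
        System.out.println(first == second); // prints "true"
        System.out.println(s.used());        // prints "0"
    }
}
```

Because scopes are released strictly in reverse order of entry, free memory is always one contiguous region above `top`, so there is nothing to defragment.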
14Reliability Concerns
- The RTSJ provides no standard mechanisms for determining worst-case CPU time or memory requirements
- Ad hoc techniques require significant tedious effort that is prone to human error
- Code that is properly configured for a particular execution environment will not run properly in other environments
- If resource needs are underestimated, system reliability suffers
15Reliability Solution
- The Scalable Java proposal introduces standard annotations to enable static analysis of memory and CPU time requirements for particular software components
- System managers can use standard tools to enforce that particular components are fully analyzable
- Open issue: must every compliant implementation of the Scalable Java standard provide CPU-time and memory analyzers?
16General Strategies to Improve Availability
- Improved reliability extends MTBF
- Reliability issues already discussed
- Minimize downtime
- Fast, deterministic restart
- Fast, deterministic reconfiguration
- Hot-swap failed hardware components (requires support for dynamic reconfiguration of software, device drivers)
- Support for redundant computation
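The two strategies on this slide correspond to the two terms of the usual steady-state availability formula. MTTR (mean time to repair, here taken to include restart and reconfiguration time) does not appear on the slide and is introduced as an assumption for this sketch:

```latex
A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
```

Improved reliability raises MTBF; fast, deterministic restart, reconfiguration, and hot-swap lower MTTR. As a purely illustrative example, MTBF = 1000 hours with MTTR = 1 hour gives A = 1000/1001, or about 0.999.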
17Availability Concerns
- Some proposals for safety-critical Java suggest that initialization involves dynamic class loading, byte-code verification, JIT compilation, garbage collection, etc.
- This is neither fast nor deterministic
18Availability Solutions
- The Scalable Java proposal enforces that initialization order be fully deterministic
- Byte-code verification rejects program components that introduce circular dependencies, and rejects components that fail to initialize static globals
- Most initialization is performed at link time, and initial values are stored in the load image
- There is no garbage collection during initialization; initialization code implements the same virtual machine model as mission code
19Availability Concerns (Level C or lower)
- In the case that system field maintenance requires replacement of certain hardware devices, some proposals for safety-critical Java require that the effort to rebuild the kernel involve cumbersome trial-and-error experimentation with source code and/or searches for new static analysis techniques in order to certify that the revised system configuration satisfies the definition of legal program
- "Regarding only the memory management, a program that can be proven not to contain memory-related runtime errors is such a 'legal safety-critical Java program'. This definition leaves open what tools may be used for this analysis, since very different analysis techniques exist. These techniques have very different characteristics with respect to the degree to which the tools can operate automatically and the accuracy of the results." (Fridtjof Siebert, Feb. 2, 2005)
- This is neither fast nor deterministic
20Availability Solutions
- The Scalable Java proposal provides a definition of legal program that is enforced by the byte-code verifier of every compliant implementation
- Enforcement of these rules is modular, in the traditional sense of object orientation
  - Method implementations are independently verified to conform with their declared interfaces
  - Method compositions (invoking one method from another) are verified compatible by examining only the interface declarations
- A verified Scalable Java device driver that satisfies the interface requirements defined for the device driver is guaranteed to integrate cleanly into an existing safety-critical system
21Availability Concerns
- In many high-availability systems, it is necessary to hot-swap failed hardware components. In the most general case, this requires that we:
  - Unload the classes that represent the device driver for the failed components
  - Load new classes that represent the device driver for the replacement component
- Problem: the RTSJ and some proposals for safety-critical Java don't specifically allow class unloading and don't require support for deterministic and reliable integration of new classes
22Availability Solutions
- We are designing into the Scalable Java proposal the ability to dynamically unload classes that have been loaded into scoped memory with custom class loaders
- The Scalable Java proposal allows deterministic loading and reliable integration of independently verified classes into a running system
23Availability Concerns
- The redundancy of computation and information that is required to achieve high availability in the face of occasional hardware failures requires network communication, but the memory model restrictions proposed by certain safety-critical Java proposals make it very difficult if not impossible to implement network stacks, RMI, or CORBA.
- In some proposals, temporary objects must be periodically destroyed. How do we represent domain name server caches, routing tables, RMI handles, temporaries for serialization and deserialization, etc.?
24Availability Solutions
- The Scalable Java proposal supports reliable and very efficient allocation and reclamation of temporary objects using stacked scope abstractions
- We are designing into the Scalable Java proposal the ability to support statically analyzable collections (to represent loaded classes, domain name server information caches, routing data structures, etc.)
25Availability Concerns
- In fault-tolerant systems, it is sometimes necessary to migrate redundant computations to new computation servers when particular servers or networking infrastructure fail, but certain proposals for safety-critical Java do not support reliable, deterministic class loading and unloading
26Availability Solutions
- As discussed above, the Scalable Java proposal
will support both dynamic class loading and
dynamic class unloading
27Maintainability
- Maintenance consists of activities such as:
  - Minor modifications to existing software in order to fix a bug, improve performance, or add incremental new functionality
  - Combining an existing collection of software with an independently developed separate collection of software to yield a new composite system that combines the capabilities of both of the smaller systems
  - Porting an existing software system to a new OS or CPU platform
28Maintainability
- Maintainability is primarily an economic consideration, but there is subtle interplay with reliability, availability, and safety
- What is the probability that a maintenance action will reduce system reliability?
- If maintenance upgrades are required (e.g. space shuttle), how long will this impact availability to fulfill mission objectives?
- Can the impact of an incremental maintenance change be addressed with an incremental change to safety certification artifacts, or will I need to completely recertify all aspects of the system?
- Clear encapsulation of control and data, properly abstracted by interface definitions
- Portability
29Maintainability Concerns
- Some proposals for safety-critical Java suggest that the definition of legal program depends on which tools you use from which vendor to analyze your program.
- Inherent in this approach: the programmer cannot proactively create legal programs. He must discover which code is legal by simply trying to compile it.
- Furthermore, code that is legal in one context will be illegal in others.
- This makes it very difficult to port a large system to a new tool set, or to make incremental refinements to an existing system, or to combine one complex subsystem with another
30Maintainability Solutions
- The Scalable Java proposal carefully defines the notion of legal program in terms that a programmer can readily understand
- Enforcement of legality is performed one method at a time, so programmers receive immediate feedback if they write code that is considered illegal
- It is straightforward to determine through examination of interface declarations (and annotations) whether it is appropriate to invoke particular methods from specific contexts
31Maintainability of Object-Oriented Code
- The motivation for object orientation is to combine strong encapsulation of control and data with ease of extensibility through polymorphism and inheritance
- Critical to satisfying these objectives: all of the semantic information required to determine whether particular components compose must be readily available through examination of the interface declarations
32Maintainability Concerns
- If I am called upon to modify an existing RTSJ component (method), it is essential that I understand:
  - Whether my incoming reference arguments point to immortal or scoped memory
  - If the incoming scoped arguments are known to nest in a particular order
  - Which externally created ScopedMemory region sizes must be adjusted if I need to allocate new scope-compatible temporary objects, or if I find it possible to decrease my need for temporary object allocations
  - Which worst-case CPU time calculations must be recomputed if my modifications alter the execution time of this method
- Unfortunately, none of this information is represented in the interface specification
- Furthermore, the technique of searching for all invocations of this method, studying the contexts from which the method is invoked, and tracing the possible global impact of any changes made in each of those contexts, scales exponentially with program size
33Maintainability Concerns
- If I am called upon to combine two independently developed components (invoke one method from another), it is essential that I understand:
  - Whether I can safely pass references to scoped objects
  - Whether I can safely pass references to ImmortalMemory objects
  - Whether the invoked method might perform operations that would cause the thread to block
  - Whether the invoked method might perform ImmortalMemory allocation
  - Whether the invoked method might perform temporary memory allocations in the current scope
  - Whether the invoked method is known to execute in bounded time and memory
- Unfortunately, none of this information is represented in the interface specification
34Maintainability Concerns
- If I am called upon to extend an existing class, overriding one of the existing methods, it is essential that I understand:
  - Whether I can assume that my incoming reference arguments point to immortal memory or scoped memory
  - Whether I can assume that incoming scoped arguments are known to nest in a particular order
  - Whether I am allowed to allocate memory in the current scope, in ImmortalMemory, or in a newly created scope
  - Whether I am allowed to invoke services that might cause the current thread to block
  - Whether I am required to restrict myself to control structures that are bounded in execution time
  - Which worst-case CPU time calculations must be recomputed if my new method has different execution time than the overridden method
  - Which ScopedMemory sizes need to be adjusted if my new method allocates different amounts of memory than the overridden method
- Unfortunately, none of this information is represented in the overridden method's interface specification
35Maintainability Solutions
- The Scalable Java proposal introduces standard annotations to represent the required information in method interface specifications
- The Scalable Java proposal requires byte-code verification tools to assure consistency between interface description and method implementation, and between interface requirements and interface invocations
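To make the idea of interface annotations concrete, here is a minimal sketch. The annotation `@Budget`, its fields, and the checker `overridesConform` are hypothetical and are not the annotations defined by the Scalable Java proposal. The sketch also checks conformance with run-time reflection purely for illustration, whereas the slide describes enforcement by byte-code verification tools.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class InterfaceAnnotationSketch {
    /** Hypothetical annotation: what a caller may assume about a method. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Budget {
        int maxScopeBytes(); // upper bound on temporary allocation in the current scope
        boolean mayBlock();  // whether the method may block the calling thread
    }

    static class Sensor {
        @Budget(maxScopeBytes = 64, mayBlock = false)
        int read() { return 0; }
    }

    static class GoodSensor extends Sensor {
        @Budget(maxScopeBytes = 32, mayBlock = false)
        @Override int read() { return 1; }
    }

    static class BadSensor extends Sensor {
        @Budget(maxScopeBytes = 128, mayBlock = true)
        @Override int read() { return 2; }
    }

    /** True if every override in 'sub' stays within the budget its direct superclass declares. */
    static boolean overridesConform(Class<?> sub) {
        Class<?> sup = sub.getSuperclass();
        if (sup == null) {
            return true;
        }
        for (Method m : sub.getDeclaredMethods()) {
            Method parent;
            try {
                parent = sup.getDeclaredMethod(m.getName(), m.getParameterTypes());
            } catch (NoSuchMethodException e) {
                continue; // not an override of a method declared in the direct superclass
            }
            Budget declared = parent.getAnnotation(Budget.class);
            if (declared == null) {
                continue; // the superclass promises nothing for this method
            }
            Budget actual = m.getAnnotation(Budget.class);
            if (actual == null
                    || actual.maxScopeBytes() > declared.maxScopeBytes()
                    || (actual.mayBlock() && !declared.mayBlock())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(overridesConform(GoodSensor.class)); // prints "true"
        System.out.println(overridesConform(BadSensor.class));  // prints "false"
    }
}
```

With the budget stated on the declaration, a caller of `Sensor.read()` can size its scope and reason about blocking without inspecting any subclass.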
36Safety
- Note that the safest airplane is the one that never leaves the hangar, but this is not:
  - Very reliable in fulfilling its mission
  - Very available for doing useful work
- Sometimes, I wonder if our safety-critical Java standardization efforts will succeed only in this regard.
37Safety Fundamentals
- In many regards, safety is a combination of reliability and availability (once you're in the air)
- However, it is special in that regulatory authorities impose certain certification practices that must be followed
- Look specifically at DO-178B Level A certification requirements
38Traceability Analysis
- All high-level requirements map to low-level requirements
- All low-level requirements map to design, source code, and test plan
- Trace each line of source code to corresponding object code
- Run all tests and perform coverage analysis
- If object code is not 100% covered, you've got dead code, unstated requirements, or an incomplete test plan. Fix the problem!
- Note: the test plan must derive from high-level requirements, not language semantics (more later)
39Level-A Testing Requirements
- Code coverage analysis must be performed on the machine language translation of programs
- Must provide MCDC (modified condition/decision coverage):
  - every condition in a decision in the program has taken all possible outcomes at least once,
  - every decision in the program has taken all possible outcomes at least once, and
  - each condition in a decision has been shown to independently affect that decision's outcome. A condition is shown to independently affect a decision's outcome by varying just that condition while holding fixed all other possible conditions.
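The independence requirement above can be checked mechanically for a given decision and set of test vectors. The following is a minimal sketch, assuming the strict "vary one condition, hold the rest fixed" reading on this slide; `McdcSketch` and its method names are hypothetical and do not come from any certification tool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class McdcSketch {
    /** True if x and y differ in condition 'cond' and agree on every other condition. */
    static boolean differOnlyAt(boolean[] x, boolean[] y, int cond) {
        for (int i = 0; i < x.length; i++) {
            boolean differs = x[i] != y[i];
            if (differs != (i == cond)) {
                return false;
            }
        }
        return true;
    }

    /** True if some pair of tests varies only 'cond' and flips the decision's outcome. */
    static boolean showsIndependence(Predicate<boolean[]> decision, List<boolean[]> tests, int cond) {
        for (boolean[] x : tests) {
            for (boolean[] y : tests) {
                if (differOnlyAt(x, y, cond) && decision.test(x) != decision.test(y)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** True if every condition has an independence pair within the test set. */
    static boolean achievesMcdc(Predicate<boolean[]> decision, List<boolean[]> tests, int conditions) {
        for (int c = 0; c < conditions; c++) {
            if (!showsIndependence(decision, tests, c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Decision with three conditions, shaped like (p && q && r).
        Predicate<boolean[]> decision = c -> c[0] && c[1] && c[2];
        List<boolean[]> full = Arrays.asList(
            new boolean[] { true, true, true },
            new boolean[] { false, true, true },
            new boolean[] { true, false, true },
            new boolean[] { true, true, false });
        List<boolean[]> partial = full.subList(0, 3); // no pair isolates the third condition

        System.out.println(achievesMcdc(decision, full, 3));    // prints "true"
        System.out.println(achievesMcdc(decision, partial, 3)); // prints "false"
    }
}
```

An independence pair for each condition also implies the first two requirements: each condition and the decision itself take both outcomes. If some required vector can never reach the decision at run time, no test set can make the check pass.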
40Sample Program (in C)
    /* Return the maximum of its 4 integer arguments */
    int max(int a, int b, int c, int d) {
        if ((a > b) && ((a > c) && (a > d)))
            return a;
        else if ((b > a) && (b > c) && (b > d))
            return b;
        else if ((c > b) && ((c > a) && (c > d)))
            return c;
        else
            return d;
    }
41Test Vectors
Coverage            Vector  Ra  Rb  Rc  Rd  Code Sequence (line numbers)
Statement Coverage  A       5   4   3   2   1, 2, 3, 4, 5, 6, 7
Statement Coverage  B       4   5   3   2   1, 2, 8, 9, 10, 11, 12, 13
Statement Coverage  C       5   4   6   3   1, 2, 3, 4, 5, 6, 14, 15, 16, 17, 18, 19, 20, 21
Statement Coverage  D       2   2   4   5   1, 2, 8, 9, 16, 17, 18, 19, 22, 23
Decision Coverage   E       4   5   3   6   1, 2, 8, 9, 10, 11, 22, 23
Decision Coverage   F       5   4   3   6   1, 2, 3, 4, 5, 6, 14, 15, 22, 23
42Control Flow of Test Vectors
43Control Flow of Test Vectors
Consider the branch condition at instructions 16-17. TT and TF demonstrate alternative branches. To satisfy MCDC requirements, we must demonstrate that FT branches differently than TT. But there's no way to deliver this condition to label Lz. This code cannot be tested to Level-A certification requirements.
44Safety (Certification) Concerns
- How do we perform MCDC testing of assignment checks that, by design, always succeed?
- How do we perform MCDC testing of stack overflow tests that we've (hopefully) arranged to always fail?
- How do we perform MCDC testing of class initialization tests that always succeed following the first test for each class, which always fails?
- How do we perform MCDC testing of array subscript out-of-bounds checks that always fail?
- And so on
45Safety Solutions
- The Scalable Java proposal establishes guidelines that enable static analysis tools to guarantee absence of many common error conditions that would need to be tested at run-time in a traditional fully compliant RTSJ implementation
- Static initialization is performed by the intelligent static linker rather than by run-time checks
- Having proven through static analysis that no run-time checks are necessary, there is no need to emit untestable run-time checks
46Summary
- Language design must address a breadth of important issues
- While ability to achieve safety certification is important, the question of whether this technology appeals to industry users depends also on ease of software development and maintenance, cost of deployment, etc.
- Many of these issues are addressed (at least partially) in the Scalable Java proposal