
1
ISTORE Software Runtime Architecture
  • Aaron Brown, David Oppenheimer, Kimberly Keeton,
    Randi Thomas, Jim Beck, John Kubiatowicz, and
    David Patterson
  • http://iram.cs.berkeley.edu/istore
  • 1999 Winter IRAM Retreat

2
ISTORE Runtime Software Architecture
  • Runtime system goals for the ISTORE
    meta-appliance
  • (1) Provide mechanisms that allow network service
    applications to exploit introspection (monitor
    and adapt)
  • (2) Allow the appliance designer to tailor runtime
    system policies and interfaces
  • How the goals are achieved
  • (1) Introspection: layered local and global
    runtime system libraries that manipulate and
    react to monitoring data
  • (2) Specialization: the runtime system is extensible
    using domain-specific languages (DSLs)

3
Roadmap
  • Layered software structure
  • Example of introspection
  • Runtime system extensibility using DSLs
  • Conclusion

4
Layered software structure
[Diagram: layered software stack on HW devices, including a NIC]
5
Device Interface Layer
[Diagram: layered stack, highlighting the device interface layer]
6
Device interface layer
  • Microkernel OS modules
  • Traditional OS services
  • Networking, memory management, process scheduling,
    threads, ...
  • Device-specific monitoring
  • Raw access patterns
  • Utilization statistics
  • Environmental parameters
  • Indications of impending failure
  • Self-characterization of performance, functional
    capabilities
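The per-device monitoring described above could be sketched as a small stats record exposed by the device interface layer. The structure, field names, and threshold below are illustrative assumptions, not part of the ISTORE design:

```python
from dataclasses import dataclass

@dataclass
class DeviceStats:
    """Hypothetical per-device monitoring record (all names invented)."""
    reads: int = 0
    writes: int = 0
    utilization: float = 0.0    # fraction of time the device is busy
    temperature_c: float = 0.0  # environmental parameter
    ecc_retries: int = 0        # indicator of impending failure

    def failure_suspected(self, retry_threshold: int = 10) -> bool:
        # A rising ECC retry count is one "indication of impending failure".
        return self.ecc_retries > retry_threshold

stats = DeviceStats(reads=500, writes=120, utilization=0.72, ecc_retries=14)
print(stats.failure_suspected())  # True: retries exceed the threshold
```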

7
Local runtime layer
[Diagram: layered stack, highlighting the local runtime layer]
8
Local runtime layer
  • Non-distributed mechanisms needed by network
    service applications
  • Feeds information to global layer or performs
    local operations on behalf of global layer
  • Example mechanisms
  • Application-specific filtering/aggregation of
    device monitoring data
  • Example: OLTP server vs. DSS server
  • Data layout and naming
  • Example: record-based interface for DB,
    file-based for web server
  • Device scheduling
  • Example: maximize TPS vs. maximize disk bandwidth
    utilization
  • Caching
  • Example: coherence essential vs. coherence
    unnecessary
  • A more efficient caching implementation is
    possible in the second case
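As a hedged sketch of the application-specific filtering mechanism, the same raw access trace can be summarized two ways: an OLTP server cares about small-request latency, a DSS server about scan bandwidth. The trace format and summary functions are invented for illustration:

```python
# Invented raw disk-access trace, as the device layer might report it.
accesses = [
    {"op": "read",  "bytes": 512,     "latency_ms": 2},
    {"op": "read",  "bytes": 1 << 20, "latency_ms": 40},
    {"op": "write", "bytes": 512,     "latency_ms": 3},
]

def oltp_summary(trace):
    # OLTP view: average latency of small (transactional) requests.
    small = [a for a in trace if a["bytes"] <= 4096]
    return sum(a["latency_ms"] for a in small) / len(small)

def dss_summary(trace):
    # DSS view: overall bandwidth across large sequential scans.
    total_bytes = sum(a["bytes"] for a in trace)
    total_ms = sum(a["latency_ms"] for a in trace)
    return total_bytes / total_ms  # bytes per millisecond

print(oltp_summary(accesses))  # 2.5
print(dss_summary(accesses))
```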

9
Global runtime layer
[Diagram: layered stack, highlighting the global runtime layer]
10
Global runtime layer
  • Aggregate, process, react to monitoring data
  • Relies on local per-device runtime mechanisms to
    provide monitoring data, implement control
    actions
  • Provides application interface that hides
    distributed implementation of runtime services
  • Example services
  • High-level services
  • Load balancing: replicate and/or migrate heavily
    used data objects when a disk becomes
    over-utilized
  • Availability: replicate data from a failed or
    failing component to restore required redundancy
  • Plug-and-play: integrate new devices into the
    system
  • Low-level services used to implement high-level
    global services
  • Distributed directory tracks data and metadata
    objects
  • Migration, replication, caching
  • Inter-brick communication
  • Distributed transactions
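The load-balancing service above can be sketched as a simple policy: when a disk is over-utilized, replicate its hottest object onto the least-loaded disk. The data structures and the 0.8 threshold are illustrative assumptions:

```python
def rebalance(disks, threshold=0.8):
    """disks: {disk_id: {"util": float, "objects": {obj: heat}}} (invented format).
    Returns a list of replication actions for over-utilized disks."""
    actions = []
    for disk_id, d in disks.items():
        if d["util"] <= threshold:
            continue
        hottest = max(d["objects"], key=d["objects"].get)   # heaviest-used object
        target = min(disks, key=lambda k: disks[k]["util"])  # least-loaded disk
        if target != disk_id:
            actions.append(("replicate", hottest, disk_id, target))
    return actions

disks = {
    "d0": {"util": 0.95, "objects": {"A": 90, "B": 5}},
    "d1": {"util": 0.20, "objects": {"C": 10}},
}
print(rebalance(disks))  # [('replicate', 'A', 'd0', 'd1')]
```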

11
Distributed application worker code
[Diagram: layered stack, highlighting the distributed application worker code]
12
Distributed application worker code
  • Runs on top of global runtime system
  • Written by appliance designer
  • Application-specific
  • Database
  • scan, sort, join, aggregate, update record,
    delete record, ...
  • Transformational web proxy
  • fetch web page (from disk or remote site), apply
    transformation filter, update user preferences
    database, ...
  • System administration tools implemented at this
    level
  • Customized runtime system defines administrative
    interface tailored to application

13
Application front-end code
[Diagram: layered stack, highlighting the application front-end code]
14
Application front-end code
  • Runs on front-end interface bricks
  • Accepts requests from LAN/WAN connection
  • Incoming requests made using standard high-level
    protocols
  • HTTP, NFS, SQL, ODBC, ...
  • Invokes and coordinates appropriate worker code
    components that execute on internal blocks
  • Takes into account locality and load balancing
  • Database front-end performs SQL query
    optimization, invokes distributed relational
    operators on data storage devices
  • Transformational proxy front-end invokes
    distiller thread on appropriate device brick
  • if data is cached, invoke on disk node
  • otherwise, fetch data from web and invoke on
    compute node or disk node
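The proxy front-end's dispatch rule in the last bullets can be sketched as follows; the cache table, load map, and action names are hypothetical:

```python
def route_request(url, cache, node_loads):
    """Route a transformational-proxy request per the rule above:
    cached pages are distilled on the disk node holding them; otherwise
    the page is fetched and work goes to the least-loaded node."""
    if url in cache:
        return ("invoke_on_disk_node", cache[url])
    node = min(node_loads, key=node_loads.get)  # locality/load-aware choice
    return ("fetch_then_invoke_on", node)

cache = {"/index.html": "disk3"}          # url -> disk node caching the page
loads = {"compute1": 0.4, "disk3": 0.9}   # current load per node

print(route_request("/index.html", cache, loads))  # ('invoke_on_disk_node', 'disk3')
print(route_request("/new.html", cache, loads))    # ('fetch_then_invoke_on', 'compute1')
```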

15
Roadmap
  • Layered software structure
  • Example of introspection
  • Runtime system extensibility using DSLs
  • Conclusion

16
From introspection to adaptation
[Diagram: intelligent HW components + continuous monitoring + an
extensible, application-tailored runtime system yield an adaptive,
self-maintaining appliance]
  • Example: slowly-failing data disk in large DB
    system
  • (1) Detect problem
  • (2) Repair problem while continuing to handle
    incoming requests
  • (3) Return to normal system operation

17
Failing disk detection
  • Microkernel monitoring module continuously
    monitoring disks' health detects an exceptional
    condition, e.g.
  • ECC failures
  • Media errors
  • Increased rates of ECC retries
  • Notifies global fault handling mechanism
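The detection step might look like the sliding-window check below, which flags an increased ECC retry rate; the window size, threshold, and class names are invented parameters, not from the ISTORE design:

```python
from collections import deque

class DiskHealthMonitor:
    """Illustrative sketch of the monitoring module's detection logic."""
    def __init__(self, window=100, max_retries=5):
        self.max_retries = max_retries
        self.events = deque(maxlen=window)  # 1 = ECC retry, 0 = clean I/O

    def record_io(self, had_retry: bool) -> bool:
        """Returns True when the global fault handler should be notified."""
        self.events.append(1 if had_retry else 0)
        return sum(self.events) > self.max_retries

mon = DiskHealthMonitor(window=10, max_retries=2)
# Every third I/O needs an ECC retry: the rate soon crosses the threshold.
alerts = [mon.record_io(had_retry=(i % 3 == 0)) for i in range(10)]
print(alerts[-1])  # True: retries in the window exceed the threshold
```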

18
Failing disk reaction
  • Global fault handling mechanism
  • Prevents system from sending more work to failed
    device
  • Modifies global directory to remove entries
    corresponding to the failed component's data
  • Application-specific response to impending
    failure
  • Transactional system: discard work currently in
    progress on the failing device, reissue to another
    data replica
  • Non-transactional system w/o coherent replicas:
    checkpoint computation, restore on another data
    replica
  • Transformational web proxy: do nothing
  • Instruct disk runtime system to shut disk down
  • Disk device is considered failed

19
Failing disk return to normal operation
  • Global fault handling mechanism...
  • Rebuilds data redundancy
  • By allocating space for a new replica on a
    functioning disk and copying data to it from
    existing replicas
  • Using an application-specific data replication
    mechanism
  • Where to allocate new replicas, how to copy data,
    how to lay out data for new replicas, how to
    update global directory
  • Example in upcoming slide
  • Life returns to normal
  • Degree of fault-tolerance has been restored
  • Failed component can be replaced during
    regularly-scheduled maintenance
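The redundancy-rebuild step can be sketched as: for each object on the failed disk, pick a surviving replica as the copy source and allocate a new replica on the functioning disk with the most free space. The directory format and placement rule are invented for illustration:

```python
def rebuild_redundancy(directory, failed_disk, free_space):
    """directory: {obj: set of disks holding a replica} (invented format).
    Returns a copy plan and updates the directory, mirroring the steps above."""
    plan = []
    for obj, replicas in directory.items():
        if failed_disk not in replicas:
            continue
        replicas.discard(failed_disk)            # drop failed replica's entry
        source = next(iter(replicas))            # still-accessible copy
        target = max((d for d in free_space if d not in replicas),
                     key=free_space.get)         # allocate space on a live disk
        plan.append((obj, source, target))       # copy source -> target
        replicas.add(target)                     # update global directory
    return plan

directory = {"A": {"d0", "d1"}, "B": {"d1", "d2"}}
plan = rebuild_redundancy(directory, "d0", {"d1": 10, "d2": 50, "d3": 80})
print(plan)  # [('A', 'd1', 'd3')]
```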

20
Roadmap
  • Layered software structure
  • Example of introspection
  • Runtime system extensibility using DSLs
  • Conclusion

21
Runtime system extensibility
  • Two ways of looking at system
  • Partitioned on functional/mechanism boundaries
  • Collection of libraries: failure detection,
    transactions, ...
  • Mechanisms are isolated

[Diagram: application on top of isolated libraries (libfail, librepl,
libtrxn, libcache) running on the OS]
22
Runtime system extensibility
  • Two ways of looking at system
  • Partitioned on functional/mechanism boundaries
  • Collection of libraries: failure detection,
    transactions, ...
  • Mechanisms are isolated
  • Partitioned on global system properties
  • This is how the programmer thinks about the
    system (high-level)
  • e.g. application-specific data availability
    policy
  • Failure detection (which devices to monitor, ...)
  • Replication (used to restore redundancy)
  • Transactions (how to restart work in progress)
  • Caching (how to handle dirty cached objects
    during failure)

[Diagram: application on top of isolated libraries (libfail, librepl,
libtrxn, libcache) running on the OS]
23
Runtime system extensibility
  • Two ways of looking at system
  • Partitioned on functional/mechanism boundaries
  • Collection of libraries: failure detection,
    transactions, ...
  • Mechanisms are isolated
  • Partitioned on global system properties
  • This is how the programmer thinks about the
    system (high-level)
  • e.g. application-specific data availability
    policy
  • Failure detection (which devices to monitor, ...)
  • Replication (used to restore redundancy)
  • Transactions (how to restart work in progress)
  • Caching (how to handle dirty cached objects
    during failure)

[Diagram: a policy is compiled into a customized runtime system library
that sits between the application and the base libraries (libfail,
librepl, libtrxn, libcache) on the OS]
24
Extensibility using DSLs
  • DSLs are languages specialized for a particular
    task
  • Each ISTORE DSL
  • Encapsulates high-level semantics of one system
    behavior
  • Allows declarative specification of
  • Behavior of one aspect of the system (a policy)
  • Interfaces to coordinated mechanisms that
    implement the policy
  • Is compiled into an implementation that might
    coordinate several local and/or global base
    runtime system mechanisms
  • May be implemented as background and/or
    foreground tasks
  • Analysis tools can potentially infer unspecified
    emergent system behaviors from the specifications
  • e.g. what impact will a new redundancy policy
    have on transaction commit time?
  • Extensions compiled together with local and
    global base mechanisms form the distributed
    runtime system

25
Extensibility using DSLs: Example
  AvailFailureDetected(Device d) {
    Object o; ObjList objs;
    Transaction t; TxnList txns;
    Replica x, c, r;
    Directory::MarkDeviceDisabled(d);
    Admin::AlertFailure(d);
    objs = Directory::GetObjects(d);          // objs stored on failed device
    foreach o (objs) {
      x = Directory::GetReplica(o, d);        // find o's replica on d
      Directory::DeleteReplica(x);            // delete from global directory
      txns = Txn::GetActiveTxns(x);
      foreach t (txns)
        Txn::AbortTxn(t);                     // abort pending txns for o on d
      c = Directory::GetReplica(o);           // find still-accessible copy
      r = LoadBalancer::AllocateReplica(o);   // get space for new replica
      LocalRuntime::CopyObject(c, c->device,
                               r, r->device); // copy it
      Directory::AddReplica(r, r->device);    // update directory
      foreach t (txns)
        Txn::IssueTxn(t, r);                  // reissue txns on new replica
    }
  }

26
Extensibility using DSLs (cont.)
  • Similar specification written for each extension
    to base library
  • Other examples of extensible system behaviors
  • Transaction response time requirements
  • Prioritizing operations based on type of data
    processed
  • Resource allocation
  • Backup policy
  • Exported administrative interface

27
Why use DSLs?
  • Possible choices
  • Each appliance designer writes runtime system
    from scratch
  • Similar to exokernel operating systems
  • All designers use single parameterized runtime
    system library
  • Similar to tunable kernel parameters in modern
    OSs
  • Designer writes high-level specification of
    system behavior
  • DSL compiler automatically translates
    specification into runtime system extensions that
    coordinate base mechanisms
  • Advantages include
  • Programmability
  • Performance
  • Reliability, verifiability, safety
  • Artificial diversity

28
DSL advantages (cont.)
  • Programmability
  • High-level specification close to designer's
    abstraction level
  • Easier to write, reason about, maintain, modify
    runtime system code
  • Simple enough to allow site-specific
    customization at installation time
  • Performance
  • Aggressive DSL compiler can take advantage of
    high-level semantics of specification language
  • Base library mechanisms can be highly optimized;
    optimization complexity is hidden from the
    appliance designer
  • Web example: infer that TCP checksums should be
    stored with web pages
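The web example amounts to computing a page's checksum once at store time instead of on every send. The RFC 1071-style 16-bit ones'-complement sum below stands in for the real TCP checksum (which would also cover the pseudo-header and TCP header); the store layout is invented:

```python
def inet_checksum(data: bytes) -> int:
    """RFC 1071-style Internet checksum over a byte string."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a 16-bit boundary
    s = sum(int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2))
    while s >> 16:                            # fold carries back in
        s = (s & 0xFFFF) + (s >> 16)
    return ~s & 0xFFFF

page_store = {}

def store_page(url: str, body: bytes) -> None:
    # Checksum is cached alongside the page, as the DSL compiler might arrange.
    page_store[url] = (body, inet_checksum(body))

store_page("/index.html", b"<html>hello</html>")
body, cksum = page_store["/index.html"]
print(hex(cksum))
```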

29
DSL advantages (cont.)
  • Reliability
  • Automatically generate code that's easy to forget
    or get wrong
  • Example: synchronization operations to serialize
    accesses to a distributed data structure
  • Verifiability
  • Of input code (DSL specification)
  • More abstract form of semantic checking
  • e.g. DSL supports types natural to behavior being
    specified => type-checking verifies some semantic
    constraints
  • e.g. ensure no unencrypted objects are written
    to disk
  • Of output code (coordinated use of base
    mechanisms)
  • Once DSL compiler writer is satisfied the DSL
    compiler is correct => appliance designer
    inherits verification effort
  • Safety (prevent runtime errors)
  • Whole classes of general programming errors not
    possible
  • DSLs hide details: runtime memory management,
    IPC, ...
  • Compiler automatically adds code:
    synchronization, ...

30
DSL advantages (cont.)
[Diagram: one Specification fed through the DSL compiler produces
Implementation 1, Implementation 2, Implementation 3, ...]
  • Artificial diversity
  • Potentially allow system to continue operation in
    face of internal bugs or malicious attack
  • Multiple implementations of component run
    simultaneously on different data replicas
  • Continuously check each other with respect to
    high-level behavior
  • Non-traditional fault-tolerance, but related to
    process pairs
  • Potentially usable to enhance performance
  • Select best-performing implementation(s) for
    future use; periodically reevaluate the choice
  • Examples of possible implementation differences
  • Low-level runtime memory layout, code ordering
    and layout
  • High-level system resource usage (recompute vs.
    use stored data, general space/time/bandwidth
    tradeoffs)
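Artificial diversity can be sketched as running several implementations of one specification and cross-checking their high-level behavior. The two trivially diverse sorts below stand in for DSL-compiled variants; everything here is illustrative:

```python
def impl_a(xs):
    return sorted(xs)              # one compiled variant: library sort

def impl_b(xs):
    out = []                       # another variant: insertion sort
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

def voted_result(xs, impls):
    """Run all implementations and cross-check: divergence signals an
    internal bug or a corrupted replica."""
    results = [impl(list(xs)) for impl in impls]
    assert all(r == results[0] for r in results), "implementations diverge"
    return results[0]

print(voted_result([3, 1, 2], [impl_a, impl_b]))  # [1, 2, 3]
```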

31
ISTORE software summary
  • ISTORE software architecture provides an
    extensible runtime environment for distributed
    network service application code
  • Layered local and global mechanism libraries
    provide introspection and self-maintenance
  • Mechanisms can be customized using DSL-based
    specifications of application policy
  • DSL code coordinates base mechanisms to implement
    application semantics and interfaces
  • DSL-based extension offers significant advantages
    in programmability, performance, reliability,
    safety, diversity

32
ISTORE summary
  • Network services are increasing in importance
  • Self-maintaining, scalable storage appliances
    match the needs of these services
  • ISTORE provides a flexible architecture for
    implementing storage-based network service apps
  • Modular, intelligent, fault-tolerant hardware
    platform is easy to configure, scale, and
    administer
  • Runtime system allows applications to leverage
    intelligent hardware, achieve introspection, and
    provide self-maintenance through
  • Layered runtime software structure
  • DSL-based extensibility that allows easy
    application-specific customization

33
Agenda
  • Overview of ISTORE: Motivation and Architecture
  • Hardware Details and Prototype Plans
  • Software Architecture
  • Discussion and Feedback

34
Backup slides
35
What ISTORE is not
  • An extensible operating system
  • Use commodity OS, only add hardware monitoring
    module
  • MM could just be a device driver => no need for
    microkernel OS
  • ISTORE could be built on top of an extensible
    operating system for even greater flexibility
  • An attempt to make commodity OSs extensible
  • Extensible runtime system allows designer to
    customize higher-level operations than OS
    extensions do
  • Closest to an extensible distributed operating
    system built on top of a commodity single-node
    operating system
  • A multiple-protection-domain system
  • Assumes non-malicious programmer
  • If user-downloaded code permitted, sandbox must
    be implemented as part of (trusted) application
  • DSLs specify resource allocation/scheduling
    policies, appliance designer responsible for
    ensuring fairness
  • A framework for building generic servers

36
ISTORE boot process
  • (1) Initially, undifferentiated ISTORE system
  • (2) On boot, each device block contacts system
    boot server
  • (3) Device blocks download customized runtime
    system and application worker code
  • Front-end blocks also download application
    front-end code
  • Runtime system libraries structured as shared
    libraries => hot upgrade

37
Example Appliances
  • E-commerce
  • Web search engine
  • Transformational web/PDA proxy
  • Election server
  • Mail server
  • News server
  • NFS server
  • Database server OLTP, DSS, mixed OLTP-DSS
  • Video server