Fault Detection in a HW/SW CoDesign Environment

Provided by: tekno1

1
Fault Detection in a HW/SW CoDesign Environment
  • Prepared by A. Gaye Soykök

2
Outline
  • Introduction
  • System Specification
  • Fault model
  • Some terminology
  • Methodology Analysis
  • Reliable communication
  • HW/SW Partitioning

3
Introduction
  • System reliability aspects are generally
    considered towards the end of the design process,
    at low abstraction levels
  • Working at low abstraction levels introduces more
    overhead
  • Not all systems can be considered at low levels
  • It is better to handle fault detection at higher
    levels
  • It is better to assess whether fault detection
    should be done in HW or SW for system performance

4
Introduction
  • At system level, several parameters are considered
    and one design is chosen among several
    alternatives:
  • Time constraints
  • Power consumption
  • Testability
  • Area

5
Introduction
  • Fault detection facilities are introduced at
    system level
  • HW/SW binding of components is affected
  • System Specification: which parts are critical
    and need fault detection
  • Design methodologies: how these detection
    facilities are applied, either in HW or SW
  • HW/SW partitioning: which parts are in SW and
    which are in HW, guided by the methodologies

6
System Specification
  • Language must support .. The user should be able
    to specify which sections require reliability
    aspects
  • For example, SystemC or OCCAM
  • Architecture: CPU (DSP or general purpose),
    coprocessors (ASIC or FPGA)

7
FAULT MODEL
  • Single Functional Failure
  • Any number of physical faults causes a functional
    model to perform incorrectly
  • HW is faulty, software is affected by hardware
  • The CPU, communication channels, one of the
    coprocessors, or memory may fail
  • A module failure is detected before any other
    module fails
  • Temporal, architectural and informational
    redundancy is adopted

8
Some Terminology
  • Nominal: original system function elements
  • Checking: redundant elements for fault detection
  • Checker: element that compares checking and nominal
  • Each of these elements can be independently
    implemented in either HW or SW
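
The three roles can be sketched in software as follows. This is only an illustrative sketch: the function names and the sum-of-squares computation are assumptions chosen for the example and do not come from the presentation.

```python
def nominal(values):
    """Nominal element: the original system function (hypothetical example)."""
    return sum(v * v for v in values)


def checking(values):
    """Checking element: a redundant computation of the same result,
    written differently from the nominal one."""
    total = 0
    for v in reversed(values):
        total += v ** 2
    return total


def checker(nominal_result, checking_result):
    """Checker element: compares the nominal and checking results."""
    return nominal_result == checking_result


def run_with_detection(values):
    """Run the nominal function and raise if the checker sees a mismatch."""
    result = nominal(values)
    if not checker(result, checking(values)):
        raise RuntimeError("fault detected: nominal and checking results differ")
    return result
```

Here all three elements are in SW. Any of them could instead be mapped to HW, which is the choice the following slides enumerate.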

9
HW or SW
  • Nominal SW, Checker SW, Checking SW
  • Checking and checker are either executed by
    system processor or a dedicated processor
  • Ex: self-checking SW, assertions, dual processor
    and VLIW
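
As one possible illustration of the assertion approach, the sketch below checks a sort routine by asserting properties of its output instead of recomputing it. The function name `checked_sort` and the choice of sorting as the nominal function are assumptions made for the example.

```python
from collections import Counter


def checked_sort(items):
    """Sort a list, then verify the result with executable assertions."""
    result = sorted(items)  # nominal function
    # Assertion 1: the output is in non-decreasing order.
    assert all(a <= b for a, b in zip(result, result[1:])), "output not ordered"
    # Assertion 2: the output is a permutation of the input.
    assert Counter(result) == Counter(items), "output lost or gained elements"
    return result
```

Python drops `assert` statements when run with `-O`, so a deployed checker would more likely raise an exception explicitly.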

10
HW or SW (Contd)
  • Nominal SW, checker HW and checking SW
  • Ex: interface for functional redundancy check,
    VLIW with hardware, DMA checker
  • Nominal SW, checker HW and checking HW
  • CED solutions are implemented totally in HW. Ex:
    dynamically configurable checker

11
HW or SW (Contd)
  • Nominal HW, Checker HW, Checking HW
  • Classical approach. Ex: duplication, TSC devices

12
Methodologies Analysis - Concepts
  • Number and type of processing elements
  • Whether special architecture is necessary
  • Synchronization issues between processing
    elements
  • Allocation of checker memory space
  • Checker structure and complexity
  • Selection of a checker methodology to raise errors
    in case of mismatches

13
Methodologies Analysis - Metrics
  • Detection latency: the time between the instant
    an error occurs and the instant it is detected
  • Coverage: how many of the existing faults can be
    detected
  • Performance degradation: overhead caused by fault
    detection facilities compared to nominal functions
14
Methodologies Analysis - Metrics (Contd)
  • Material cost: cost of physical components
  • Design cost: effort needed to design the system

15
Reliable Communication
  • Apart from data processing, communication needs to
    be reliable
  • Hardware redundancy: line duplication
  • Information redundancy: data encoding
  • Most effective when data encoding is used where SW
    is involved and hardware sections employ
    dedicated lines (duplicated, encoded)
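
A minimal sketch of information redundancy through data encoding, assuming a single even-parity bit per byte. The presentation does not name a specific code, so the parity scheme and the function names are illustrative assumptions.

```python
def encode_with_parity(byte):
    """Append an even-parity bit to an 8-bit value, giving a 9-bit word."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    parity = bin(byte).count("1") % 2
    return (byte << 1) | parity


def check_parity(word):
    """Return True if the received word has an even number of 1 bits."""
    return bin(word).count("1") % 2 == 0


def decode(word):
    """Strip the parity bit, raising if the check fails."""
    if not check_parity(word):
        raise ValueError("communication fault detected: parity mismatch")
    return word >> 1
```

A single parity bit only detects an odd number of flipped bits; stronger codes trade more redundant bits for better coverage.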

16
HW/SW Partitioning
  • After the system is specified, the methodologies
    have been assessed, and different alternatives have
    been produced with cost functions, the partitioning
    step takes place.
  • Evaluate the cost functions and the constraints of
    the user
  • Reliability aspects make it more complex
  • Make partitioning in two stages!

17
HW/SW Partitioning (Contd)
  • First level: classical aspects and functions are
    taken into account
  • Second level: given the first solution,
    reliability aspects are introduced, and the
    solution with the best trade-off that still
    satisfies the first-level constraints is chosen
    from the solution set.
  • If no reliability constraints are given, the
    second level is not carried out
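
The two-level flow might be sketched as below. Everything here is a simplifying assumption made for illustration: the `Candidate` fields, the single timing constraint at the first level, and the trade-off formula at the second level are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float            # classical cost function value (lower is better)
    meets_timing: bool     # stands in for the classical constraints
    fault_coverage: float  # reliability metric in [0, 1]
    degradation: float     # relative overhead of the detection facilities


def first_level(candidates):
    """First level: keep candidates meeting classical constraints, cheapest first."""
    feasible = [c for c in candidates if c.meets_timing]
    return sorted(feasible, key=lambda c: c.cost)


def second_level(feasible, min_coverage=None):
    """Second level: apply reliability constraints to the first-level solutions."""
    if not feasible:
        return None
    if min_coverage is None:
        # No reliability constraint given: the second level is skipped.
        return feasible[0]
    reliable = [c for c in feasible if c.fault_coverage >= min_coverage]
    if not reliable:
        return None  # no solution: the first level may need to be repeated
    # Hypothetical trade-off: cost inflated by the detection overhead.
    return min(reliable, key=lambda c: c.cost * (1 + c.degradation))
```

Returning `None` signals that no candidate survives the reliability constraints, leaving the caller to decide whether to rerun the first level with different inputs.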

19
HW/SW Partitioning (Contd)
  • If a specific architecture is required for
    reliability (for example, dual processor), the
    first level benefits from earlier partitioning
    solutions
  • A solution may not exist after reliability
    constraints are introduced and first level may
    need to be repeated

20
HW/SW Partitioning (Contd)
  • Reliability constraints, which drive the second
    stage, may be:
  • Hard, ex: 100% fault coverage
  • Soft, ex: any fault coverage
  • Parameters considered
  • Fault coverage
  • Performance degradation
  • Detection latency
  • Area overhead
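
One way to read the hard/soft distinction: a hard constraint rejects a design outright, while soft constraints only influence its ranking. The sketch below assumes designs are plain dictionaries keyed by the four parameters listed above and that a weighted sum is used for ranking. Both choices are illustrative and not from the presentation.

```python
def satisfies_hard(design, required_coverage=1.0):
    """Hard constraint: reject the design unless coverage reaches the bound."""
    return design["fault_coverage"] >= required_coverage


def soft_score(design, weights):
    """Soft constraints: combine the four parameters into one figure of merit.
    Coverage is rewarded; the other three parameters are penalized."""
    return (
        weights["fault_coverage"] * design["fault_coverage"]
        - weights["performance_degradation"] * design["performance_degradation"]
        - weights["detection_latency"] * design["detection_latency"]
        - weights["area_overhead"] * design["area_overhead"]
    )
```

With a hard 100% coverage requirement, only designs that pass `satisfies_hard` would be ranked by `soft_score`.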

21
Conclusion
  • Design for reliability has been merged into HW/SW
    codesign process resulting in a final design that
    has on-line fault detection properties
  • Future work is introducing fault tolerance into
    the HW/SW codesign process