Torture tests

Perry Groot, Frank van Harmelen, and Annette Ten Teije. Perry Groot. EKAW 2000.


1
Torture tests
  • A quantitative analysis for the robustness of
    Knowledge-Based Systems

Perry Groot, Frank van Harmelen, and Annette Ten Teije
2
Motivation
Belief: The ability of KBSs to deal with missing or invalid data is an essential dimension of KBS validation.
Our claim: A quantitative analysis of the robustness of KBSs is both possible and useful.
3
Two informal definitions
Robustness: The degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions (IEEE, 1990).
Degradation study
4
Two informal definitions
Robustness
Degradation study: In a degradation study we gradually decrease the quality of the KBS input and measure how the KBS output quality decreases as a result.
5
Two quality measures
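\[ \mathrm{Recall}(I) = \frac{|\mathrm{correct}(I) \cap \mathrm{output}(I)|}{|\mathrm{correct}(I)|} \]

\[ \mathrm{Precision}(I) = \frac{|\mathrm{correct}(I) \cap \mathrm{output}(I)|}{|\mathrm{output}(I)|} \]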
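The two measures can be sketched in a few lines of Python, assuming the answers are represented as sets. The function names `recall` and `precision`, the plant names, and the conventions chosen for empty sets are illustrative assumptions, not taken from the presentation.

```python
def recall(correct: set, output: set) -> float:
    """Fraction of the correct answers that the system actually computes."""
    if not correct:
        return 1.0  # assumption: with nothing to find, recall counts as perfect
    return len(correct & output) / len(correct)


def precision(correct: set, output: set) -> float:
    """Fraction of the computed answers that are actually correct."""
    if not output:
        return 1.0  # assumption: an empty output contains no wrong answers
    return len(correct & output) / len(output)


# Hypothetical example: one correct plant, three candidates returned.
correct = {"daisy"}
output = {"daisy", "dandelion", "chamomile"}
print(recall(correct, output))     # 1.0
print(precision(correct, output))  # 0.3333333333333333
```

Here the system finds the correct answer (recall 1) but buries it among two wrong candidates (precision 1/3).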
6
Recall: Fraction of correct answers that the system actually computes.
[Figure: diagram relating Output(I), Correct(I), and Recall]
7
Precision: The fraction of computed answers that are actually correct.
[Figure: diagram relating Output(I), Correct(I), Recall, and Precision]
8
Recall corresponds to completeness, precision to soundness.
[Figure: diagram relating Output(I), Correct(I), Recall, and Precision]
9
Two quality measures
  • Well known in information retrieval.
  • No commitment to task or domain.
  • Geared to KBSs with discrete answers.
  • Correct answer has to be known beforehand.
  • Answer set has to be finite.

10
Case study
  • System classifies plants from a part of Germany.
  • Input: Observations (flower, leaves, stem)
  • Output: Plant
  • Internals: ??? (feature of methodology!)
  • Our degradation study uses the number of
    observations as gradual input measure.
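
A degradation study of this kind can be sketched as a loop that feeds the system a growing subset of a case's observations and records precision at each step. Everything below is a hypothetical illustration: the three-plant knowledge base `TOY_KB`, the classifier `toy_classify`, and the helper `degradation_curve` are invented stand-ins, not the system from the case study. The classifier is only ever called as a black box, matching the "Internals: ???" point.

```python
import random
from typing import Callable, Dict, List, Set

# Hypothetical toy knowledge base, invented for illustration only.
TOY_KB: Dict[str, Dict[str, str]] = {
    "daisy": {"flower": "white", "leaf": "spoon", "stem": "hairy"},
    "dandelion": {"flower": "yellow", "leaf": "toothed", "stem": "hollow"},
    "buttercup": {"flower": "yellow", "leaf": "lobed", "stem": "hairy"},
}


def toy_classify(obs: Dict[str, str]) -> Set[str]:
    """Return every plant consistent with all given observations."""
    return {
        plant
        for plant, feats in TOY_KB.items()
        if all(feats.get(name) == value for name, value in obs.items())
    }


def degradation_curve(
    classify: Callable[[Dict[str, str]], Set[str]],
    observations: Dict[str, str],
    correct: Set[str],
    seed: int = 0,
) -> List[float]:
    """Precision after 0, 1, ..., n observations, added in a random order."""
    rng = random.Random(seed)
    order = list(observations)
    rng.shuffle(order)
    curve = []
    for k in range(len(order) + 1):
        partial = {name: observations[name] for name in order[:k]}
        output = classify(partial)  # the system is treated as a black box
        hits = len(correct & output)
        # assumption: an empty output counts as precision 1
        curve.append(hits / len(output) if output else 1.0)
    return curve


curve = degradation_curve(toy_classify, TOY_KB["buttercup"], {"buttercup"})
print(curve)  # first entry is 1/3 (no observations), last entry is 1.0
```

Averaging such curves over many cases and many observation orders would give the kind of aggregate precision curve the study reports on.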

11
Some observations
Both average precision and average recall grow almost monotonically when adding observations. (Only 58 of the individual cases have a monotonically increasing output set.)
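
The gap between the averaged trend and individual cases can be illustrated with a small check. This is a sketch under stated assumptions: each case is represented as a list of quality values (for example precision) after 0, 1, 2, ... observations, the helper names are hypothetical, and the numbers are invented rather than taken from the study.

```python
from typing import List, Sequence


def is_monotone_non_decreasing(values: Sequence[float]) -> bool:
    """True if no value is lower than the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def monotone_fraction(curves: List[Sequence[float]]) -> float:
    """Fraction of per-case curves that never decrease."""
    if not curves:
        return 0.0
    return sum(is_monotone_non_decreasing(c) for c in curves) / len(curves)


def average_curve(curves: List[Sequence[float]]) -> List[float]:
    """Point-wise mean of equally long curves."""
    return [sum(step) / len(step) for step in zip(*curves)]


# Invented data: four cases, three measurement points each.
curves = [
    [0.1, 0.5, 1.0],
    [0.2, 0.1, 1.0],  # dips at the second point
    [0.0, 0.0, 0.5],
    [0.3, 0.6, 0.4],  # dips at the third point
]
print(monotone_fraction(curves))                          # 0.5
print(is_monotone_non_decreasing(average_curve(curves)))  # True
```

In this invented data the average grows monotonically even though only half of the individual cases do.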
12
Some observations
After about 12 observations, adding more observations does not increase the precision. (Most cases contain 19-30 observations.)
13
Some observations
The region in which additional observations actually contribute to an increase in precision is surprisingly small, namely between 6 and 12 observations.
14
Some observations
When aiming for the maximum precision of 1, there is no need to use any more than 12 observations (out of a maximum of 30!).
15
Some observations
No increase in precision can be gained from the first 6 observations.
16
Some observations
Whatever the final precision that is ultimately obtained by the system, this level of precision is already obtained after at most 20 observations. (98 of the cases contained more than 20 observations.)
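
The point at which a case reaches its final precision can be read off a per-case curve with a small helper. This is a sketch only: `saturation_point` is a hypothetical name, the curve is assumed to hold the precision after 0, 1, 2, ... observations, and the example values are invented.

```python
from typing import Sequence


def saturation_point(curve: Sequence[float], tol: float = 1e-9) -> int:
    """Smallest number of observations at which the final value is first reached."""
    if not curve:
        raise ValueError("curve must contain at least one value")
    final = curve[-1]
    for k, value in enumerate(curve):
        if value >= final - tol:
            return k
    return len(curve) - 1  # not reachable: the last value always matches itself


# Invented curve: precision after 0..6 observations.
curve = [0.0, 0.0, 0.2, 0.5, 1.0, 1.0, 1.0]
print(saturation_point(curve))  # 4
```

Taking the maximum of this value over all cases gives a bound of the form "after at most N observations".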
17
The conclusion
  • Need for quantitative analysis of KBSs.
  • Degradation studies are a good approach.
  • Recall and precision are appropriate measures.
  • Independence of the underlying problem-solving method (PSM).
  • Approach shown with case study.
