P1253553562mXHzc - PowerPoint PPT Presentation

1
Supporting Creativity in Science: Cooperative Knowledge Acquisition & Knowledge Refinement Systems
Derek Sleeman
Department of Computing Science, The University, ABERDEEN AB24 3FX
Tel: +44 (0)1224 272296
Email: d.sleeman_at_abdn.ac.uk
WWW: http://www.csd.abdn.ac.uk
Acknowledgements: EPSRC support for the AKT Consortium
Students: Eugenio Alberdi, David Corsar, Andy Aiken, Mark Winter
2
OVERVIEW of TALK
I Context: Advanced Knowledge Technologies (AKT) Consortium
II Co-operative Knowledge Acquisition & Knowledge Refinement Systems
III ReTAX system
IV The REFINER System
Questions / Discussion
3
I AKT's CHALLENGES
Knowledge Acquisition
Knowledge Maintenance
Knowledge Modelling
Life Cycle, Integration Issues & Testbeds
Knowledge Reuse
Knowledge Publishing
Knowledge Retrieval
4
II Co-operative KA & Knowledge Refinement Systems
  • Knowledge-Based systems inevitably require a
    sizeable amount of domain knowledge. This can
    be acquired from:
  • domain experts (KA)
  • detailed examples (using ML techniques) etc
  • However, for complex tasks these KBs are
    inevitably:
  • incomplete, when further Knowledge Acquisition
    is needed
  • inconsistent, when the KB needs to be refined
  • Also, it is likely that background knowledge
    will be incomplete, thus requiring an expert
    to act as an oracle.
  • Hence the need for Co-operative (Problem
    Solving) Knowledge Acquisition & Knowledge
    Refinement Systems

5
II Co-operative KA & Knowledge Refinement Systems
KRUST (Classical KB, Classification) (Susan Craw)
STALKER (Efficient Truth Maintenance based system, Classification) (Leo Carbonara)
REFINER / Refiner / R5 (Case-base, Classification) (Sunil Sharma, Mark Winter, Andy Aiken)
RETAX (Revision of Taxonomies) (Eugenio Alberdi, David Corsar)
CRIMSON (Refinement of Constraints) (Mark Winter)
TIGON (Time Series Data / Causal Model, Diagnosis) (Fraser Mitchell)
SALT (Rules & Constraints, Propose & Revise) (Piero Leo)
References: see WWW http://www.csd.abdn.ac.uk
6
II Co-operative KA & Knowledge Refinement Systems
KRUST: Wine Adviser
STALKER / REFINER: Attendance at Medical Clinics; Stock control
CRIMSON/ConRef: Stock control
RETAX: Botanical Taxonomies
TIGON: Turbines (Fault Detection & Diagnosis)
SALT: Elevators/Lifts
References: see WWW http://www.csd.abdn.ac.uk
7
III RETAX
  • The heuristics in RETAX are based on a study to
    determine how botanists reacted to rogue
    item(s).
  • There are 2 (principal) rules which determine
    whether a taxonomy is well formed:
  • each child node must be more specialized than
    its parent
  • each of a node's siblings must be unique.
  • RETAX was used to replicate the revision of a
    major botanical taxonomy done manually in
    Aberdeen's Botany dept in the 1990s.
  • References: Middleton & Wilcox (1990) Edinburgh
    Journal of Botany, revision of taxonomy for
    Pernettya / Gaultheria
  • Alberdi & Sleeman (1997) AI Journal, p257-279.
  • Alberdi, Sleeman & Korpi (1999) Cognitive
    Science Journal
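
The two well-formedness rules can be sketched in a few lines of Python. This is an illustrative reading, not the RETAX implementation: it assumes "more specialized" means every descriptor of the child lies within the parent's and at least one is strictly narrower, and the names `node`, `covers` and `violations` are hypothetical. The data are the top two levels of the vehicle example used in this talk.

```python
from itertools import combinations

FIELDS = ("wheels", "size", "motor", "power")

def node(parent, wheels, size, motor, power):
    """Build one taxonomy node; ranges are (low, high), sets are frozensets."""
    return {"parent": parent, "wheels": wheels, "size": frozenset(size),
            "motor": frozenset(motor), "power": power}

# Top two levels of the vehicle example.
TREE = {
    "vehicle": node("root", (2, 8), {"low", "medium", "large", "high"}, {"yes", "no"}, (0, 20)),
    "train": node("vehicle", (6, 8), {"medium", "large"}, {"yes"}, (15, 20)),
    "car": node("vehicle", (3, 6), {"low", "medium", "high"}, {"yes"}, (2, 10)),
    "cycle": node("vehicle", (2, 3), {"low"}, {"yes", "no"}, (0, 3)),
    "lorry": node("vehicle", (4, 8), {"medium", "high", "large"}, {"yes"}, (5, 20)),
}

def covers(parent_value, child_value):
    """True if the parent's range or set contains the child's."""
    if isinstance(parent_value, tuple):
        return parent_value[0] <= child_value[0] and child_value[1] <= parent_value[1]
    return child_value <= parent_value  # frozenset subset test

def violations(tree):
    """Return (rule, node, other) triples for every broken rule."""
    found = []
    # Rule 1 (assumed reading): child within parent and not identical to it.
    for label, child in tree.items():
        parent = tree.get(child["parent"])
        if parent is None:  # the root has no parent to compare with
            continue
        subsumed = all(covers(parent[f], child[f]) for f in FIELDS)
        identical = all(parent[f] == child[f] for f in FIELDS)
        if not subsumed or identical:
            found.append(("not-specialised", label, child["parent"]))
    # Rule 2: no two siblings may share the same description.
    for a, b in combinations(tree, 2):
        same_parent = tree[a]["parent"] == tree[b]["parent"]
        if same_parent and all(tree[a][f] == tree[b][f] for f in FIELDS):
            found.append(("duplicate-siblings", a, b))
    return found

print(violations(TREE))  # the five nodes above satisfy both rules
```

Adding a copy of `lorry` under a second label makes the check report a duplicate-siblings pair, which is the situation in which two nodes would be merged.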

8
Label        Wheels               Size                                   Motor                   Engine-Power          Parent      Depth
string ANY   integer-range (2 8)  ordered-set 4 (low medium large high)  ordered-set 2 (yes no)  integer-range (0 20)  string ANY  integer-range (0 3)

vehicle      2 - 8                (low medium large high)                (yes no)                0 - 20                root        0

train        6 - 8                (medium large)                         (yes)                   15 - 20               vehicle     1
car          3 - 6                (low medium high)                      (yes)                   2 - 10                vehicle     1
cycle        2 - 3                (low)                                  (yes no)                0 - 3                 vehicle     1
lorry        4 - 8                (medium high large)                    (yes)                   5 - 20                vehicle     1
9
sports-car   4                    (low)                                  (yes)                   5 - 10                car         2
salon-car    4                    (medium)                               (yes)                   3 - 5                 car         2
bicycle      2                    (low)                                  (no)                    0                     cycle       2
motor-cycle  2                    (low)                                  (yes)                   1 - 3                 cycle       2
large-lorry  4 - 8                (large)                                (yes)                   6 - 20                lorry       2
small-van    4                    (medium)                               (yes)                   5 - 10                lorry       2

smaller-van  4                    (medium)                               (yes)                   6                     small-van   3

10
Vehicle
  Train
  Car
    Sports Car
    Salon Car
  Cycle
    Bicycle
    Motorbike
  Lorry
    Large Lorry
    Small Van
      Smaller Van
11
RETAX
  • Let's refer to a new object/node as N, the
    existing hierarchy/tree as T, and the potential
    parent node as P. Then possible operations are:
  • Is T well formed? (If not, report nodes which
    violate the rules.)
  • E.g., if sibling nodes N1 & N2 are
    equal, then merge the 2 nodes.
  • Is N already in T?
  • Assuming T is well-formed, to which parent node,
    P, can N be attached without causing T to be
    rearranged or N modified? (Answer could be none)
  • What changes have to be made to N to make it a
    legal child of node P?
  • What changes have to be made to T so that N can
    be a child of P?
  • Combinations of the last 2 operations
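
Two of these operations, "Is N already in T?" and "to which P can N be attached unchanged?", can be sketched as follows. This is a minimal sketch under stated assumptions, not the RETAX code: a parent is taken to be legal when its description strictly contains N's and N does not duplicate one of its existing children. The function names and the `MOPED` node are hypothetical; the tree is a subset of the vehicle example.

```python
# Each entry: label -> (parent, wheels range, sizes, motor values, engine-power range)
TREE = {
    "vehicle": ("root", (2, 8), {"low", "medium", "large", "high"}, {"yes", "no"}, (0, 20)),
    "car": ("vehicle", (3, 6), {"low", "medium", "high"}, {"yes"}, (2, 10)),
    "cycle": ("vehicle", (2, 3), {"low"}, {"yes", "no"}, (0, 3)),
    "lorry": ("vehicle", (4, 8), {"medium", "high", "large"}, {"yes"}, (5, 20)),
    "bicycle": ("cycle", (2, 2), {"low"}, {"no"}, (0, 0)),
    "motor-cycle": ("cycle", (2, 2), {"low"}, {"yes"}, (1, 3)),
}

def subsumes(parent_desc, child_desc):
    """True if every range and set of the child lies within the parent's."""
    (pw, ps, pm, pe), (cw, cs, cm, ce) = parent_desc, child_desc
    return (pw[0] <= cw[0] and cw[1] <= pw[1] and cs <= ps and cm <= pm
            and pe[0] <= ce[0] and ce[1] <= pe[1])

def already_present(tree, new_desc):
    """Is a node with exactly this description already in the tree?"""
    return any(tuple(desc) == new_desc for _, *desc in tree.values())

def legal_parents(tree, new_desc):
    """Labels of nodes that could take the new node as a child unchanged."""
    result = []
    for label, (_, *desc) in tree.items():
        desc = tuple(desc)
        children = [tuple(d) for parent, *d in tree.values() if parent == label]
        if subsumes(desc, new_desc) and desc != new_desc and new_desc not in children:
            result.append(label)
    return result

# Hypothetical new node N: two wheels, small, motorised, engine power 1 - 2.
MOPED = ((2, 2), {"low"}, {"yes"}, (1, 2))
print(already_present(TREE, MOPED))  # False
print(legal_parents(TREE, MOPED))    # ['vehicle', 'cycle', 'motor-cycle']
```

Returning every legal parent leaves the choice to the expert; picking the deepest candidate (here `motor-cycle`) would be one possible policy for a most-specific placement.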

12
ReTAX
  • Family: Ericaceae
  • Genera: Arctostaphylos, Arbutus, Pernettya,
    Leucothoe, Gaultheria, Agauria, Andromeda
  • Species: A. uva-ursi, A. unedo, P. tasmanica,
    G. oppositifolia, G. rupestris, G. antipoda,
    A. polifolia

13
ReTAX
  • Historical: In Bentham & Hooker's (1876)
    classification the main differences detected
    between the Pernettya & Gaultheria genera were
    type of fruit and succulence of the calyx
    features.
  • G. Bentham & J.D. Hooker (1876). Genera Plantarum,
    Vol II, Part 2. (Publ: Reeves & Co, London)
  • Subsequent botanical investigations in the
    20th Century challenged this analysis, but did
    not suggest any further distinguishing features
    for the 2 genera; hence the 2 genera were
    combined (Middleton & Wilcox, 1990).

14
ReTAX
  • Simulation (Simplified)
  • The descriptions of several species of the
    Pernettya & Gaultheria genera were replaced by
    others with revised features (descriptors) which
    affect the definitions of the parent nodes (P & G)
  • When the parent nodes (Pernettya & Gaultheria) are
    found to be the same, the system checks a set of
    other features (a further facility of ReTAX) to see
    if they are distinctive; when no differences are
    found, the 2 nodes (P & G) are collapsed
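
The simulation can be sketched as below. Everything here is an assumption made for illustration: a genus description is taken to be the set of values each feature takes across its species, the function names are hypothetical, and the feature values (and the "habit" feature) are placeholders, not botanical data from the study.

```python
def genus_description(species, features):
    """Values each feature takes across the species of one genus."""
    return {f: frozenset(s[f] for s in species) for f in features}

def revise(genera, primary, further):
    """Collapse two sibling genera if no primary or further feature separates them."""
    (name_a, a), (name_b, b) = genera.items()
    for features in (primary, further):
        if genus_description(a, features) != genus_description(b, features):
            return genera  # still distinguishable, keep both nodes
    return {f"{name_a}/{name_b}": a + b}

PRIMARY = ("fruit", "calyx")   # the features used in the 1876 classification
FURTHER = ("habit",)           # placeholder for the "other features" checked

# Placeholder species descriptions before the revision.
BEFORE = {
    "Pernettya": [{"fruit": "berry", "calyx": "dry", "habit": "shrub"}],
    "Gaultheria": [{"fruit": "capsule", "calyx": "succulent", "habit": "shrub"}],
}
# After revised species descriptions are added, both genera span the same values.
AFTER = {
    "Pernettya": BEFORE["Pernettya"] + [{"fruit": "capsule", "calyx": "succulent", "habit": "shrub"}],
    "Gaultheria": BEFORE["Gaultheria"] + [{"fruit": "berry", "calyx": "dry", "habit": "shrub"}],
}

print(list(revise(BEFORE, PRIMARY, FURTHER)))  # ['Pernettya', 'Gaultheria']
print(list(revise(AFTER, PRIMARY, FURTHER)))   # ['Pernettya/Gaultheria']
```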

15
RETAX Current / Future activities
  • Use with other experts to help them formulate /
    refine taxonomies (eg other aspects of botany,
    microbiology)
  • Use RETAX, or a variant, to formulate / refine
    ontologies (eg medical terminologies). This has
    resulted in the Protégé RepairTAB, which detects
    inconsistencies in OWL ontologies & gives advice
    about removing them. (Lam, Sleeman, Pan &
    Vasconcelos (2008) Journal on Data Semantics)

16
IV REFINER System
  • The Refiner algorithm
  • Sample dataset
  • Interaction with experts
  • Current / future work

17
The Sample Dataset
Case  Age  DBP  Associated Disease  Category
1     50   90   D1                  A
2     56   90   D2                  A
3     52   101  D3                  A
4     50   95   D3                  B
5     56   97   D3                  B
6     -    89   D5                  A
7     52   97   D3                  A
18
The Refiner Algorithm
  • Each case is assigned to a category
  • Category descriptions are inferred from the
    case values
  • When a case matches a category to which the
    expert did not assign it, this is an
    inconsistency
  • While inconsistencies exist:
  • A selection of disambiguation strategies is
    suggested
  • The user chooses a strategy to be performed
  • The list of inconsistencies is re-evaluated
  • The refined dataset is now consistent

19
Generating Descriptions
  • Generalise each field:
  • Numeric: range from lowest to highest
  • String: set of all unique items
  • Taxon: nearest common parent
  • Boolean: set of all unique items from the set
    {true, false, any}
  • Combine to get category description
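
The generalisation step can be sketched for the numeric and string cases, applied to the sample dataset. This is an illustration, not the Refiner code: the function names are hypothetical, missing values (shown as "-" in the dataset) are assumed to be skipped, booleans fall through to the set rule, and the taxon rule is omitted because it needs a taxonomy.

```python
# Sample dataset: case -> (age, DBP, disease, category); None marks a missing value.
CASES = {
    1: (50, 90, "D1", "A"), 2: (56, 90, "D2", "A"), 3: (52, 101, "D3", "A"),
    4: (50, 95, "D3", "B"), 5: (56, 97, "D3", "B"), 6: (None, 89, "D5", "A"),
    7: (52, 97, "D3", "A"),
}

def generalise(values):
    """Numeric field -> (lowest, highest); any other field -> set of unique items."""
    known = [v for v in values if v is not None]  # assumption: ignore missing values
    numeric = known and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in known
    )
    if numeric:
        return (min(known), max(known))
    return frozenset(known)

def describe(cases):
    """Combine the generalised fields into one description per category."""
    rows_by_category = {}
    for *fields, category in cases.values():
        rows_by_category.setdefault(category, []).append(fields)
    return {
        category: tuple(generalise(column) for column in zip(*rows))
        for category, rows in rows_by_category.items()
    }

for category, description in describe(CASES).items():
    print(category, description)
```

For category A this yields Age 50 - 56, DBP 89 - 101 and the disease set {D1, D2, D3, D5}; for category B, Age 50 - 56, DBP 95 - 97 and {D3}.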

20
Category Descriptions
Category  Age      DBP       Disease
A         50 - 56  89 - 101  All
B         50 - 56  95 - 97   D3
  • There are inconsistencies
  • Cases 4 and 5 match A
  • Case 7 matches B
  • We need to remove the overlap
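
The overlap can be checked mechanically: a case is inconsistent when it matches the description of a category other than the one it was assigned. The sketch below hard-codes the category descriptions from the table above; the function names are hypothetical, and treating a missing value or an "All" field as matching anything is an assumption.

```python
# Sample dataset: case -> (age, DBP, disease, assigned category); None = missing.
CASES = {
    1: (50, 90, "D1", "A"), 2: (56, 90, "D2", "A"), 3: (52, 101, "D3", "A"),
    4: (50, 95, "D3", "B"), 5: (56, 97, "D3", "B"), 6: (None, 89, "D5", "A"),
    7: (52, 97, "D3", "A"),
}

# Category descriptions: (age range, DBP range, diseases); None stands for "All".
DESCRIPTIONS = {
    "A": ((50, 56), (89, 101), None),
    "B": ((50, 56), (95, 97), {"D3"}),
}

def field_matches(value, allowed):
    """Does one case value fall inside one field of a category description?"""
    if value is None or allowed is None:  # assumption: unknown or "All" always matches
        return True
    if isinstance(allowed, tuple):        # numeric range
        return allowed[0] <= value <= allowed[1]
    return value in allowed               # set of permitted items

def inconsistencies(cases, descriptions):
    """(case, category) pairs where a case matches a category it was not assigned."""
    return [
        (case_id, category)
        for case_id, (*values, assigned) in sorted(cases.items())
        for category, description in descriptions.items()
        if category != assigned
        and all(field_matches(v, a) for v, a in zip(values, description))
    ]

print(inconsistencies(CASES, DESCRIPTIONS))  # [(4, 'A'), (5, 'A'), (7, 'B')]
```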

21
Disambiguation Strategies
  • Change values for certain cases
  • Remove values from a category (eg, create a
    disjunction)
  • Reclassify a case
  • Make a case match an additional category
  • Shelve a problem case
  • Add a new field

22
Refiner
[Figure: categories C1, C2 and C3]
23
Strategies for this problem
  • Change value of DBP in case 7 to 90
  • Change value of DBP in case 5 to 95
  • Reclassify case 7 to category B
  • Add case 7 to category B
  • Shelve case 7
  • Change value of Disease in cases 3 and 7 to D3
  • Reclassify cases 4 and 5 to category A
  • Add cases 4 and 5 to category A
  • Shelve cases 4 and 5
  • Add a new field

24
Strategy Ordering
  • Typically, many strategies are suggested
  • We need heuristics to order them
  • Ordered by number of times suggested: prefer
    strategies which are suggested many times
  • Ordered by number of cases affected: prefer
    strategies which affect fewer cases
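
One way to apply both heuristics is a single sort. This is a sketch only: the slide does not say how the two orderings are combined, so using the number of cases affected as a tie-break is an assumption, the `Strategy` type is hypothetical, and the `times_suggested` counts are made-up illustration values.

```python
from typing import NamedTuple

class Strategy(NamedTuple):
    description: str
    times_suggested: int   # how often Refiner proposed this strategy
    cases_affected: int    # how many cases it would change

def order_strategies(strategies):
    """Most frequently suggested first; among equals, fewest cases affected first."""
    return sorted(strategies, key=lambda s: (-s.times_suggested, s.cases_affected))

# Illustrative counts only; the descriptions are strategies listed for the sample problem.
SUGGESTED = [
    Strategy("Reclassify cases 4 and 5 to category A", 1, 2),
    Strategy("Change value of DBP in case 7 to 90", 2, 1),
    Strategy("Shelve case 7", 1, 1),
]

for strategy in order_strategies(SUGGESTED):
    print(strategy.description)
```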

25
The Refiner Main Screen
26
Scalability
  • Measured the time taken to perform validation on
    randomly-generated datasets with varying numbers
    of cases, fields and categories
  • For most datasets, time taken is under 1 second

27
Use of REFINER by Experts
  • Refiner has been used with various experts
    including:
  • Pain Control Expert (Anaesthesiology)
  • Child psychologist
  • High Dependency Unit (HDU) Physician
  • KCAP-2003 paper (Aiken & Sleeman)

28
Pain Control
  • Pre-existing Access dataset on epidural patients
  • Many cases, lots of fields / descriptors
  • Refiner imported the data (almost) perfectly
  • Expert categorised cases based on the length of
    the epidural (in days)
  • REFINER took only a few seconds to create
    category descriptions and validate
  • But

29
Pain Control
  • Hundreds of inconsistencies found
  • Hundreds of strategies suggested
  • Almost all of which were "change value"
  • Why did it not work better?
  • Subjective nature of the subject domain.
  • Categories were contiguous

30
Child Psychology
  • The session was a series of anecdotes and
    outlines of specific cases
  • Three types of cases were identified
  • Severely autistic
  • Mildly autistic
  • Difficulties with language development

31
Child Psychology
  • The expert stated that autistic children usually
    had the following characteristics:
  • Problems with language and verbal communication
  • Problems with social interaction
  • Obsessive behaviour
  • These characteristics were abstracted by the
    knowledge engineers and subsequently confirmed
    with the expert
  • The expert showed no inclination to use
    REFINER, but a case set was created by the
    knowledge engineers

32
HDU
  • Task posed by domain expert: when to move high
    dependency unit (HDU) patients to a general ward,
    or the intensive care unit (ICU), or leave them
    in the HDU.
  • Used Refiner with three datasets, one for each
    condition (cardiac, neuro & respiratory)
  • Expert did not use the system but did dictate the
    descriptors & the sets of cases to the knowledge
    engineers, who typed this information into
    REFINER.
  • Refiner found 2 of the datasets were consistent;
    in the third it identified inconsistencies

33
Inconsistent Dataset
Case  HR   RR  AVPU  Sat O2  Cat.
1     105  27  1     94      Higher
2     120  35  2     88      Higher
3     140  45  3     80      Higher
4     105  28  1     94      Same
5     90   22  1     95      Same
6     80   18  1     96      Lower
7     70   15  1     98      Lower
34
Category Descriptions
Category  HR       RR     AVPU  Sat O2
higher    105-140  27-45  1-3   80-94
same      90-105   22-28  1     94-95
lower     70-80    15-18  1     96-98
  • There are inconsistencies
  • Case 1 matches Category SAME
  • Case 4 matches Category HIGHER
  • We need to remove the overlap
  • Refiner suggested lower and upper danger
    zones for each field

35
Future Work: Use with Domain Experts
  • Make the system's GUI more intuitive (some
    changes already made)
  • Ask the expert to come along to the session with a
    document which summarizes the main features of
    the dataset they wish to discuss. (In session ask
    them to highlight principal concepts)
  • For each domain expert contacted, record an AVI
    session of a simple but related domain (eg simple
    childhood diseases before approaching a
    paediatrician) (demo)

36
Current Work (ICU domain)
  • Developed a system which is statistically based, so
    given a case description it returns the
    likelihood of that case belonging to one of the
    predefined categories (R5, Andy Aiken)
  • Acquired a data set of patients' physiological
    parameters from an ICU DB, and have clinicians
    assign patients on a day-by-day / hour-by-hour
    basis to a 5-point severity score. (Developed in
    conjunction with Glasgow Royal Infirmary)
  • Using R5 with the above data set to assign new
    patient reports to a severity class. (Practically
    important as the descriptors include clinical
    interventions, which standard scales don't.)
  • Identify & analyse (explain) anomalous / unusual
    cases (segments of cases)

37
VI Dimensional Analysis ??
  • Outline issue
  • Pointer to TR
  • Pointer to WWW systems / sources

38
Questions/Comments
39
V (Causal) Explanations for Anomalous Medical
cases
  • Discuss ICU context
  • Experiment to detect Anomalous cases / sections
    of cases
  • Outline a typical investigation

40
V Seeking to Explain an anomalous Observation
  • EXPECTED: An injection of X will cause the heart
    (Organ, O) to increase its contraction rate
    within T seconds.
  • SUPPOSE that does not happen; then here are some
    of the investigations which might be performed:
  • Is the injection being given effectively?
  • IF so, then check whether the drug X is being
    transported to Organ O
  • Is the transport path physically /
    bio-chemically blocked?
  • Is the transport mechanism inhibited / slowed down?
  • IF the drug is actually arriving at Organ O & the
    conc is OK, then investigate:
  • Is the drug mechanism within the organ being
    blocked?
  • Is the organ for some reason unable to respond in
    the usual way (eg weakened heart muscle)?