Multi-agent architectures that facilitate apprenticeship learning for real-time decision making: Minerva and Gerona


Learn more at: http://lac.gmu.edu


1
Multi-agent architectures that facilitate
apprenticeship learning for real-time decision
making: Minerva and Gerona

David C. Wilkins
Center for Study of Language and Expertise
Stanford University
David Fried
Department of Computer Science
University of Illinois at U-C
November 5, 2005
Supported by ONR N00014-00-1-0660,
N00014-02-1-0731

2
Outline

Goal
Expert shells -> multi-agent capabilities
Minerva: medical diagnosis (1992-1994)
Apprentice program observes expert, improves agent
Gerona: ship damage control (2002-2005)
Apprentice program observes student, improves student
Summary and conclusions

3
Expert Shells -> Multi-Agent Capabilities

Traditional performance capabilities
Correct solution, Efficient problem solving
Multi-agent capabilities
Critiquing
Expert agent watches and finds errors of omission/commission
Apprenticeship Learning
Expert agent watches expert, improves expert
agent
Expert agent watches student, improves student
Research philosophy
Critiquing and apprenticeship should be a natural artifact of the shell architecture
The same apprenticeship method should support both learning and tutoring
A unified architecture for the dimensions of expertise is an approach to cognitive modeling

4
Apprenticeship Learning Paradigm

[Diagram: Problem -> Human Expert and Problem Solver Agent; Actions from both -> Learning Program -> KN Differences]
Situated learning within the context of problem solving
Good for knowledge refinement of a human or an expert agent

5
Apprenticeship Learning Challenges

Global credit assignment
Does a good explanation of the human action exist?
Challenge: some explanation usually exists
Local credit assignment
What KN difference creates a good explanation?
Challenge: many repairs will create an explanation
Variance among human problem solvers
How to distinguish between allowable variations among human problem solvers (who, among other things, often disagree) and variations that suggest knowledge errors?
Solution
Minerva shell architecture
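
The two credit-assignment questions above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the rule format, the `explains` check, and the single-finding repairs are hypothetical and are not Minerva's actual representation.

```python
# Hypothetical toy knowledge base: each rule pairs a set of required findings
# with an action that the problem solver would consider justified.
KB = [
    ({"headache", "stiff_neck"}, "ask_fever"),
    ({"headache"}, "ask_onset"),
]


def explains(kb, findings, action):
    """Global credit assignment: does any rule justify the observed action?"""
    return any(premise <= findings and act == action for premise, act in kb)


def candidate_repairs(kb, findings, action):
    """Local credit assignment: list single-rule additions that explain the action.

    Several repairs usually qualify, which is the difficulty the slide names.
    """
    if explains(kb, findings, action):
        return []  # nothing to repair
    repairs = []
    for finding in sorted(findings):
        new_rule = ({finding}, action)
        if explains(kb + [new_rule], findings, action):
            repairs.append(new_rule)
    return repairs
```

With findings `{"headache", "stiff_neck"}` and an unexplained expert action, the sketch proposes one repair per known finding, so a further criterion is needed to choose among them.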

6
Minerva-Based Apprenticeship Learning: Domain of
Neurology Diagnosis

1. Debra Arbed, a 39 year old black female.
2. Chief complaint is headache, nausea, vomiting,
stiff neck.
3. Headache duration? 6 hours.
4. Headache severity? 4 on scale of 0-4.
5. Fever? No.
6. Recent seizures? No.
7. Visual problems? No.
8. Headache onset? Abrupt.
30. Final diagnosis is subarachnoid hemorrhage.
31. Secondary dx is acute bacterial meningitis.

7
Evolution of Decision-Making Expert Shells: Separation of Different Knowledge Types
Mycin (1972), Guidon, Teiresias (1978): Inference; Program; Domain Kn
Neomycin (1982), Guidon2 (1987), Odysseus (1988): Inference; Task Kn; Domain Kn
Minerva (1992), Odysseus2 (1994): Inference; Sched Kn; Task Kn; Domain Kn
8
Domain, Task, and Scheduling KN are Distinct

Domain KN: vocabulary and predicates mention the domain
Task KN: no mention of the domain (e.g., medicine)

strategy(differentiate-hypotheses(Hyp1, Hyp2)) :-
    active-hypothesis(Hyp1),
    active-hypothesis(Hyp2),
    different(Hyp1, Hyp2),
    evidence-for(Finding1, Hyp1, Rule1, Cf1),
    evidence-for(Finding1, Hyp2, Rule2, Cf2),
    same-sign-cfs(Cf1, Cf2),
    get-premise(Rule1, Finding, Premise1),
    get-premise(Rule2, Finding, Premise2),
    premises-contradicting(Premise1, Premise2),
    not rule-applied(Rule1),
    strategy(apply-rule(Rule1))

Scheduling KN: chains (G -> SG ... -> A) created by unification. But which action A is best?

9
Recursive Classification Use in Scheduler
[Figure: two layered architectures, Minerva-Scheduler and Minerva-Medicine; level labels in slide order]
Inference Level: Scheduler BBoard; Domain BBoard
Scheduler Level: Recursive HC; FIFO
Strategy Level: Exhaustive-Chaining; Hypothesis-Directed
Domain Level: Scheduling knowledge; Medical knowledge
10
Recursive Classification: Induction of Embedded
Knowledge Base of Scheduler Rules

Induction of Scheduling rules
10-70 (39 avg.) classes, 42 features
286 scheduling rules
Disjoint training and validation sets.
Critiquing evaluation
Expert's action in upper 10%: 52.2%
Expert's action in upper 20%: 67.4%
Expert's action in upper 50%: 84.8%
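
One way to read the critiquing evaluation is that, at each decision point, the scheduler ranks its candidate actions and the score records how often the expert's actual action falls in the upper k percent of that ranking. The sketch below assumes that reading; the function names and the data layout are hypothetical.

```python
import math


def in_upper_percent(ranked_actions, expert_action, percent):
    """True if expert_action lies within the top `percent` of the ranking."""
    cutoff = max(1, math.ceil(len(ranked_actions) * percent / 100))
    return expert_action in ranked_actions[:cutoff]


def hit_rate(cases, percent):
    """Percentage of cases where the expert's action is in the upper `percent`.

    `cases` is a list of (ranked_actions, expert_action) pairs, best action first.
    """
    hits = sum(in_upper_percent(ranking, expert, percent) for ranking, expert in cases)
    return 100.0 * hits / len(cases)
```

Under this reading, the hit rate can only grow as the percentile widens, which matches the rising figures on the slide.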

11
Minerva Related Research

Blackboard Architectures (BB1, Hearsay III)
Opaque code or hard-wired scheduler; not learnable.
Classification Shells (Mole, Neomycin, Protos, Internist)
Scheduler is mostly hard-wired.
Advanced Classification Shells (Ask/Mu)
Scheduler knowledge specialized to one expert.
Critiquing Systems (Disciple, Oncocin/Protégé)
Classification vs. task reduction vs. therapy plans

12
The Problem of Ship Damage Control

Ship crises
Fire, smoke, flooding, pipe rupture
Primary and secondary damage
Damage Control Assistant (DCA)
Responsible for overall crisis management
Makes damage control decisions
Coordinates investigation and repair teams

13
Damage Control Assistant Expertise: How to get
decision-making practice?!

Expertise requires practice
Time-critical decision-making
High stress, information overload
Uncertain and incomplete information
Whole task practice difficult to acquire
Actual ship crises infrequent
Realistic practice expensive and dangerous
Rotation cycle is 2-3 years

14
The DCA Decision-Making Task: Fires, Smoke, Floods, Ruptures, etc.

Event to DCA: fire observed in compartment 1-174-0-L
Event to DCA: pipe rupture observed in compartment 1-191-0-Q
Action by DCA: send repair party to compartment 1-174-0-L
Action by DCA: go to General Quarters (GQ)
Action by DCA: start fire pump 3 on port side
Critique to DCA: Error of omission. Must request permission of CO to turn on fire pump during GQ
Action by DCA: close firemain valve 3-274-2
Critique to DCA: Error of commission. Valve 3-274-2 does not isolate pipe rupture
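
The two critique types in the trace above can be sketched as a pass over the student's action sequence. This is only an illustration under stated assumptions: the `required_before` and `valid_actions` tables, the action names, and the `Critique` record are hypothetical stand-ins, not DC-Train's critiquing model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Critique:
    kind: str    # "omission" or "commission"
    action: str
    reason: str


def critique(student_actions, required_before, valid_actions):
    """Compare a student's action sequence against two hypothetical tables.

    required_before: action -> prerequisite that must already have been taken
    valid_actions:   set of actions the expert model considers useful here
    """
    critiques = []
    done = set()
    for action in student_actions:
        prerequisite = required_before.get(action)
        if prerequisite is not None and prerequisite not in done:
            # A required earlier step was skipped: error of omission.
            critiques.append(Critique("omission", action, f"missing prior step: {prerequisite}"))
        if action not in valid_actions:
            # The action itself does not help: error of commission.
            critiques.append(Critique("commission", action, "action does not address the crisis"))
        done.add(action)
    return critiques
```

Starting a fire pump without first requesting permission would be flagged as an omission, and closing a valve that is not in the valid set would be flagged as a commission.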

15
DC-Train 4.0 Simulation Capabilities

Physical ship simulation
Primary and secondary damage
Fire, smoke, flooding, rupture, firemain
Intelligent agent personnel simulation
67 ship personnel
Commanding officer
Engineering Officer of the Watch
Investigator Teams, Repair Teams, etc.

16
DC-Train and SCoT-DC: Post-Scenario Spoken Dialogue Tutoring
[Diagram labels]
Spoken Dialogue Interface; Interactive Visualization Interface
DCA student solves problem presented by DC-Train Simulator
Correct Expert Solution; Critique of Student Actions
Expert Critiquing Modules
Tutoring Dialogue Modules
University of Illinois: DC-Train 4.0 w/ Critiquing
Stanford University: Spoken Dialogue Tutoring
17
Whole-Task Simulation-Based Training of Crisis Decision Making Skills
[Diagram labels]
Expert, Critiquing, Explanation Models: Graph Modification Operators (GMOs, Meta-GMOs)
Causal Story Graph (CSG)
DC-Train Physical Simulator and Intelligent Agents
Events; WorldState; WorldInfo; Actions
Text-Based and Spoken Dialogue Tutors
DCA Student
Event Communication Language (ECL) is used along all arrows
18
Gerona Expert Agent Overview

Goal
Agent architecture to support multiple uses: expert model, critiquing, question-answering, explanations, spoken dialogue tutoring, etc.
Solution
Explicit Knowledge Representation
ECL (vocabulary)
GMOs, G-Clauses (expert and student critique models)
Meta-GMOs (question-answering, explanations)
CSGs (structured ECLs that represent all models)
Good for knowledge acquisition from experts
Gerona representation can be executed by an interpreter

19
Event Communication Language (ECL)

Event Communication Language (ECL) statements encode communication to and from the DCA, and communication about the state of the world.
Example
English: Boundaries set. RL5 Talker to DCA: "DCA, Repair 5 reports fire boundaries set for compartment 4-220-0-E, auxiliary machinery room 2."
ECL message: 6310 Boundaries set
ECL-6310 (to, from, reports, problem, boundaries set for compartment, compartment)
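
The ECL-6310 template reads like a record with fixed wording between its variable slots. A minimal sketch, assuming that `to`, `from`, `problem`, and `compartment` are the slots and the remaining words are fixed text; the class name and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ecl6310:
    """Hypothetical record for ECL message 6310, 'Boundaries set'."""
    to: str
    sender: str        # 'from' is a Python keyword, so the slot is renamed here
    problem: str
    compartment: str
    number: int = 6310

    def to_english(self) -> str:
        # Fill the fixed template wording around the variable slots.
        return (f"{self.to}, {self.sender} reports {self.problem} "
                f"boundaries set for compartment {self.compartment}.")
```

For example, `Ecl6310("DCA", "Repair 5", "fire", "4-220-0-E").to_english()` reproduces the core of the spoken report on the slide, without the compartment's descriptive name.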

20
Event Communication Language (ECL)

ECL 2000: WorldInfo (81)
E.g., Contents of compartments, location of bulkheads
ECL 3000: WorldState Predicates (29)
E.g., Boundaries contain compartment
ECL 4000: WorldState Functions (22)
E.g., Compartment to Jurisdiction
ECL 5000: Actions from the DCA (48)
E.g., Send firefighters, Start fire pump, Request permiss
ECL 6000: Events reported to DCA (88)
E.g., Fire alarm, firemain pressure low, desmoking space
ECL 7000: Goals (36)
E.g., Identify fire, contain fire, patch pipe rupture
ECL 8000: Crises (7)
E.g., Fire, hot mags, flood, smoke, pipe rupture, low fp
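
The numbering scheme above suggests that an ECL number's category can be read off its thousands block. A small sketch, assuming each category occupies one block of 1000 numbers (the slide gives only the block starts); the function name is hypothetical.

```python
# Block starts and category names as listed on the slide.
ECL_CATEGORIES = {
    2000: "WorldInfo",
    3000: "WorldState Predicates",
    4000: "WorldState Functions",
    5000: "Actions from the DCA",
    6000: "Events reported to DCA",
    7000: "Goals",
    8000: "Crises",
}


def ecl_category(number: int) -> str:
    """Map an ECL number to its category, assuming 1000-wide blocks."""
    block = (number // 1000) * 1000
    try:
        return ECL_CATEGORIES[block]
    except KeyError:
        raise ValueError(f"ECL {number} is outside the blocks listed on the slide") from None
```

Under this assumption, message 6310 ("Boundaries set") falls in the events block and 5120 ("Fight fire") in the actions block.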

21
Causal Story Graph (CSG)
Crisis: Fire
Active Goal: Control Fire
Active Goal: Extinguish Fire
Error of Commission: Fight Fire in Space
Satisfied Goal: Identify Fire
Active Goal: Apply Fire Suppressant
Addressed Goal: Contain Fire
Active Goal: Isolate Space
Error of Omission: Electrically Isolate Space
Event: Set Fire Boundaries in progress
Justification: Why Error of Commission?
Event: Fire Report
Correct Action: Set Fire Boundaries
22
Graph Modification Operators (GMO)
GMO 5120
FOR ECL 5120 Fight Fire
compartment -> Compartment
target -> Station
RULE 5120.fight-fire.critique.1
IF
goal(find, unaddressed, 7118, Apply fire suppressant, compartment Compartment, _, G)
AND action(find, pending, 5120, Fight fire in space, compartment Compartment, _, A)
AND goal(find, satisfied, 7116, Isolate compartment if necessary, compartment Compartment, _, _)
AND goal(find, satisfied, 7117, Active desmoke if necessary, compartment Compartment, _, _)
AND ship-state(find, _, 4302, Best repair locker for compartment, compartment Compartment, station Station, _, _)
23
Graph Modification Operators (cont)
THEN
action(modify, correct, 5120, Fight fire in space, compartment <- Compartment, station <- Station, _, A)
goal(modify, addressed, 7118, Apply fire suppressant, compartment <- Compartment, _, G)
END RULE
END GMO
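
The find-then-modify pattern of rule 5120.fight-fire.critique.1 can be mimicked in a few lines. This is a sketch under stated assumptions, not the Gerona interpreter: the graph is reduced to a list of dictionaries, the `best_locker` mapping stands in for ship-state function 4302, and all function names are hypothetical.

```python
def find(nodes, kind, status, ecl, compartment):
    """Return the first node matching kind, status, ECL number and compartment."""
    for node in nodes:
        key = (node["kind"], node["status"], node["ecl"], node["compartment"])
        if key == (kind, status, ecl, compartment):
            return node
    return None


def apply_fight_fire_critique(nodes, best_locker, compartment):
    """Apply the IF/THEN of the fight-fire rule to a list-of-dicts stand-in for a CSG."""
    goal = find(nodes, "goal", "unaddressed", 7118, compartment)
    action = find(nodes, "action", "pending", 5120, compartment)
    isolated = find(nodes, "goal", "satisfied", 7116, compartment)
    desmoked = find(nodes, "goal", "satisfied", 7117, compartment)
    station = best_locker.get(compartment)  # stands in for ship-state 4302
    if any(item is None for item in (goal, action, isolated, desmoked, station)):
        return False  # some IF clause failed, so the graph is left unchanged
    # THEN part: mark the pending action correct and the goal addressed.
    action.update(status="correct", station=station)
    goal["status"] = "addressed"
    return True
```

Because the rule changes the status of the nodes it matched, applying it a second time to the same graph finds no unaddressed goal and does nothing.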
24
Meta-GMO Question Types

About 100 templates cover all past instructor-student Q&As
Why questions for justifying CSG nodes (12)
Why should I have ordered firefighting?
What questions for retrieving expert recommendations (32)
What should I have done after I got the fire report?
What-if questions to get critiques on hypothetical actions (4)
What if I ordered fire boundaries to be set?
When/How questions to explain domain rules (9)
How do you determine what repair locker has jurisdiction?
When/What/Is questions to evaluate conditions and relations (26)
Is there a starboard fire pump on at 300?
More complex questions involving chaining and inference (14)
How can I satisfy the preconditions for dewatering?
If I ordered smoke boundaries, what could I do then?

25
Meta-GMO Example

When is it appropriate to order firefighting?
Question: ECL 9300 When Action
MGMO 9300
FOR ECL 9300 When Action
LET action-ecl-number -> ActionECL
IF
g-clause(find, action(create, pending, ActionECL, _, _, _, _), GClauses)
g-clause(justify, GClauses, Justifications)
THEN
answer(create, _, 9300, When Action, action-ecl-number <- ActionECL, justification <- Justification, miscellaneous-questions, JustificationNode)
END IF
END MGMO

26
In English (direct translation)

There are two conditions under which you should
order firefighting.
First, when you receive a report that electrical
and mechanical isolation has completed, you still
need to extinguish the fire in that compartment,
you have either active desmoked the compartment
or do not need to active desmoke the compartment,
and either there is no halon or halon has failed,
find the best repair locker for that compartment,
and order that repair locker to fight the fire in
the compartment.
Second, when you receive a report that halon has
failed, you have either isolated the compartment
or the compartment cannot be isolated, and you
have either active desmoked the compartment or do
not need to active desmoke the compartment, find
the best repair locker for that compartment, and
order that repair locker to fight the fire in the
compartment.

27
In English (intelligent translation)
There are two things that might trigger ordering
firefighting. The first is a report of
electrical and mechanical isolation achieved, and
the second is a report that halon has
failed. The first case only applies when you
need to extinguish a fire. You also need to have
active desmoked the compartment, if necessary,
and if the compartment has halon, it has to
already have failed. In the second case, you
must have active desmoked if necessary and
isolated the compartment if possible. In both
cases, you should send the best repair locker for
the compartment to fight the fire.
28
Meta-Graph Modification Operators (M-GMOs)

MGMO 9002
FOR ECL 9002 "Why Sub-Optimal Action?"
LET action-node -> ActionNode
RULE 9002.1 "Explain why the action isn't correct."
IF g-clause(find, action(create, modify, correct, ActionNode.ecl, _, _, _, _), _, CorrectGClauses)
AND roll-back(before, ActionNode, _)
AND g-clause(justify-and-evaluate, CorrectGClauses, ActionNode, Justification)
THEN answer(create, _, 9002, "Why Sub-Optimal Action?", action-node <- ActionNode, justification <- Justification, ActionNode, A)
END RULE
END MGMO

29
Power and Learnability

A Gerona system can respond to an incoming message from an agent using an efficiently parallelizable algorithm.
Total space complexity is O(n) and time complexity is low-order polynomial.
GMO rules are PAC-learnable using the "learning to take actions" paradigm, given certain constraints on length.

30
Current Research Direction

Extend the SCoT-DC/DC-Train Spoken Tutor to allow user-initiated tutoring.
Approach is to map user-initiated questions in natural language to Gerona question classes
QABLE for Story Comprehension Q/A (Grois and Wilkins, IJCAI-05 and ICML-05)
Use Gerona domain model to constrain interpretations (Fried et al., 2003)

31
Summary

Ability to critique and learn is facilitated by
agent KRI
KN factorization, explicitness, modularity, being
able to reason over static and dynamic knowledge
Two examples
Minerva: separation of domain, task, and scheduling knowledge; use of Recursive Heuristic Classification for scheduling.
Gerona: graph operators construct a dynamic task-centered representation