Transfer Learning in Jean

1
Transfer Learning in Jean
  • Paul R. Cohen
  • Clayton T. Morrison
  • Yu-Han Chang
  • Joshua Moody

2
Outline
  • How Jean Does Transfer
  • ESS: Experimental State Splitting
  • The Domain: ISIS
  • The Games Lattice
  • Experiment and Evaluation
    • Experiment
    • Metrics
    • Results

3
Jean, ESS and Transfer Learning
  • Incrementally build knowledge of world over time
  • Procedural knowledge as Finite State Machines
  • Transitions occur when we choose to execute a new
    action
  • Experimental State Splitting (ESS)
    differentiates, or splits, states to increase
    predictive power of our model
  • ESS experiments with many hypotheses to find
    causal accounts for the observed results
  • A theory of transfer learning includes both what
    to transfer as well as when it is appropriate to
    transfer.

4
Domain: Military tactics
  • Goal: Eliminate all enemy troops
  • Enemy evades our troops if they can sight us

5
Sensors to States to Knowledge
  • Data from many sensors, often continuous
  • States determined by regions where some sensors
    stay constant, or fall into certain regions
  • FSMs control behavior based on state
  • Use ESS to expand FSM models and make them more
    predictive
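
As a rough illustration of the sensors-to-states step, the sketch below thresholds one continuous distance reading into discrete states and collapses a stream of readings into segments. The thresholds, the state labels and the function names are hypothetical choices for illustration, not Jean's actual representation.

```python
from itertools import groupby

# Hypothetical thresholds on a single distance sensor; not taken from Jean.
NEAR_LIMIT = 200.0
FAR_LIMIT = 1000.0

def state_of(distance):
    """Map one continuous distance reading to a discrete state label."""
    if distance < NEAR_LIMIT:
        return "NEAR"
    if distance < FAR_LIMIT:
        return "FAR"
    return "VERY_FAR"

def segment(readings):
    """Collapse a stream of readings into (state, run_length) segments."""
    labels = [state_of(d) for d in readings]
    return [(label, len(list(run))) for label, run in groupby(labels)]

# Invented sensor trace: the unit closes in on its target.
trace = [1500.0, 1200.0, 900.0, 600.0, 250.0, 150.0, 90.0]
segments = segment(trace)
```

Each segment is a region where the thresholded sensor stays constant, which is the kind of region a state machine could use as a state.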

[Figure: sensor data segmented into states labeled Approaching, Enemy evading, and Steady state chasing]
6
Experimental State Splitting
  • Building up a model of the world

[Figure: state machine over the states ¬GOAL and GOAL. Edge labels: fire; crawl or run; run, crawl, or fire. Transition probabilities shown: 1/2, 1/2, 1, 1.]
7
Experimental State Splitting
  • Splitting

[Figure: the ¬GOAL state, with a fire transition of probability 1/2 to GOAL]
8
Experimental State Splitting
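  • Splitting

[Figure: before the split, fire leads from ¬GOAL to GOAL with probability 1/2. After the split on D, ¬GOAL becomes two states, ¬GOAL (D ≥ 200) and ¬GOAL (D < 200). Probabilities shown on the fire transitions after the split: 4/5, 1/5, 0, 1.]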
9
Transferring Causal Knowledge
  • Create splits in order to decrease entropy of the
    next-state distributions
  • Transfer learned state machines (or causal
    sub-components) between domains / test problems
  • Store in memory a repository of causal state
    machines or components
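
The entropy criterion can be made concrete with the probabilities shown on the splitting slides (1/2 before the split; 4/5, 1/5, 0 and 1 after). The sketch below is an assumption-laden illustration, not the ESS implementation: it assumes one sub-state reaches GOAL with probability 1/5 and the other with probability 1, and it assumes visit weights of 5/8 and 3/8, chosen because they average back to the original 1/2.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def split_entropy(branches):
    """Expected next-state entropy after a split.

    branches: list of (weight, next_state_distribution) pairs, where weight
    is the fraction of visits that fall into that sub-state.
    """
    return sum(weight * entropy(dist) for weight, dist in branches)

# Before the split: firing from the unsplit state reaches GOAL half the time.
before = entropy([0.5, 0.5])

# After the split (assumed pairing and weights, see the lead-in):
# one sub-state stays out of GOAL 4/5 of the time, the other always reaches it.
after = split_entropy([(5 / 8, [4 / 5, 1 / 5]), (3 / 8, [0.0, 1.0])])

# A positive gain means the split made next-state predictions less uncertain.
gain = before - after
```

Under these assumptions the split lowers the expected next-state entropy, so it would be worth keeping.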

[Figure: two state machines. Evasive Enemy: ¬GOAL FAR, crawl, ¬GOAL NEAR, fire, GOAL. Enemy in Hilly Terrain: ¬GOAL NOT VISIBLE, find unit, ¬GOAL FAR, crawl, ¬GOAL NEAR, fire, GOAL.]
10
The Domain: ISIS
11
ISIS
  • What is ISIS?
    • Real-time tactical and strategic military
      simulation
    • Allows first-person perspective
    • Military scenarios, simulated robot
  • Why is ISIS good for TL?
    • Able to configure scenarios of varying
      complexity, ranging from single-unit maneuvers
      to complex coordinated operations
    • Scenarios require different types and
      combinations of knowledge

12
Game/Scenario Lattice
Schemas (static, dynamic, action) learned in
one game are transferred to another. Not all
transfer is relevant, and sometimes it may be
detrimental (in the absence of other knowledge).
Positive transfer speeds learning in the new game.
Lattice levels: Full, Intermediate, Basic
Link absence: little that is relevant to
transfer.
Blue: useful transfer
Red: if SOLE transfer, then detrimental
13
Experiment
14
Scenarios
  • Restrained Mobile
  • < range iii
  • > range iii
  • Full Mobile: 50 < range, 50 > range
  • Mountains
  • Entrenched

Dependent Variables: final unit strength,
time to complete task
[Figure label: Engagement Ranges]
15
Learning
Early failure: Running at the opponent allows them
to see you, and they escape.
Later success: Sneak up on the opponent until
close, then attack.
16
Scenario/Experiment Relations
17
Protocol
A Protocol for Generating Learning Curves
  • Tick - Jean gets the ISIS state, then selects and
    runs a controller for a fixed time in ISIS.
  • Trial - a series of 100 ticks in a given
    scenario.
  • Training Phase - a set of 20 learning trials.
  • Testing Phase - a fixed set of 10 test trials
    (no learning). Mean and variance of performance
    on the test are recorded.
  • Performance Unit - one training phase followed by
    one test phase. The test from a performance unit
    = 1 point on a learning curve.
  • Replication - a series of 10 performance units.
    A complete replication = 1 learning curve (with
    10 points).
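
The nesting of ticks, trials, phases, performance units and replications can be sketched as plain loops. The counts (100, 20, 10, 10) come from the protocol; everything else is assumed for illustration: the `StubScenario` and `StubAgent` classes, their method names, and the use of sample variance are stand-ins, not the ISIS or Jean interfaces.

```python
from statistics import mean, variance

TICKS_PER_TRIAL = 100
TRAINING_TRIALS = 20
TEST_TRIALS = 10
UNITS_PER_REPLICATION = 10

class StubScenario:
    """Hypothetical stand-in for ISIS: counts ticks, reports a constant score."""
    def __init__(self):
        self.ticks = 0
    def reset(self):
        pass
    def observe(self):
        return None
    def step(self, controller):
        self.ticks += 1
    def score(self):
        return 1.0

class StubAgent:
    """Hypothetical stand-in for Jean: always selects the same controller."""
    def select_controller(self, state, learning):
        return "noop"

def run_trial(agent, scenario, learning):
    """One trial: a fixed number of ticks, then the trial's score."""
    scenario.reset()
    for _ in range(TICKS_PER_TRIAL):
        state = scenario.observe()
        controller = agent.select_controller(state, learning)
        scenario.step(controller)
    return scenario.score()

def performance_unit(agent, scenario):
    """One training phase, then one test phase: one learning-curve point."""
    for _ in range(TRAINING_TRIALS):
        run_trial(agent, scenario, learning=True)
    scores = [run_trial(agent, scenario, learning=False)
              for _ in range(TEST_TRIALS)]
    return mean(scores), variance(scores)

def replication(agent, scenario):
    """Ten performance units: one learning curve with ten points."""
    return [performance_unit(agent, scenario)
            for _ in range(UNITS_PER_REPLICATION)]

scenario = StubScenario()
curve = replication(StubAgent(), scenario)
```

With these counts, one replication costs 10 × (20 + 10) × 100 = 30,000 ticks, which the stub scenario's tick counter confirms.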

[Figure labels: Performance Unit, Replication]
18
Protocol
A Protocol for Generating Learning Curves
  • Based on BEP
  • Test Condition 1
    • Administer B scenario for one replication
      epoch: B, test B
    • Copy eval data, copy Jean's memory, wipe Jean's
      memory
  • Test Condition 2
    • Training Phase
      • Administer A scenario for one replication
        epoch: A, test with A
      • Copy eval data, copy Jean's memory, do NOT
        wipe Jean's memory
    • Testing Phase
      • Administer B scenario for one replication
        epoch: B, test B
      • Copy eval data, copy Jean's memory, wipe
        Jean's memory
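
The two test conditions differ only in what Jean's memory holds when the B scenario starts. The sketch below captures that bookkeeping under stated assumptions: `ToyAgent`, `toy_replication` and the list-valued memory are hypothetical stand-ins, and the "eval data" is reduced to a record of what the agent knew going in.

```python
import copy

class ToyAgent:
    """Hypothetical stand-in for Jean; memory holds names of learned scenarios."""
    def __init__(self):
        self.memory = []

def toy_replication(agent, scenario):
    """Hypothetical replication: note prior knowledge, then 'learn' the scenario."""
    eval_data = {"scenario": scenario, "prior_knowledge": list(agent.memory)}
    agent.memory.append(scenario)
    return eval_data

def condition_1(run_replication):
    """Test Condition 1: learn B starting from an empty memory."""
    agent = ToyAgent()
    eval_b = run_replication(agent, "B")
    saved = copy.deepcopy(agent.memory)   # copy Jean's memory
    agent.memory.clear()                  # wipe Jean's memory
    return eval_b, saved

def condition_2(run_replication):
    """Test Condition 2: learn A, keep the memory, then learn B."""
    agent = ToyAgent()
    eval_a = run_replication(agent, "A")
    saved_a = copy.deepcopy(agent.memory)  # copy, but do NOT wipe
    eval_ab = run_replication(agent, "B")
    saved_ab = copy.deepcopy(agent.memory)
    agent.memory.clear()                   # wipe only after the B phase
    return eval_a, eval_ab, saved_a, saved_ab

eval_b, memory_b = condition_1(toy_replication)
eval_a, eval_ab, memory_a, memory_ab = condition_2(toy_replication)
```

Comparing the B curve from condition 1 with the B curve from condition 2 isolates the effect of having learned A first.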

19
Experiment 1 Results
20
Experiment 2 Results
21
Experiment 3 Results
22
Results Summary
23
END
24
Y1 Internal Results
  • Metric: ratio of the areas below the learning curves
  • Area(B) / Area(AB)
  • Experiment 1: 1.704
  • P-value: 0.035
  • Sampling distribution for null hypothesis (AB is
    the same as B) generated by randomization-bootstrap
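
A minimal sketch of the area-ratio metric and a randomization test for it follows. It is not the authors' procedure: it assumes trapezoidal areas at unit spacing, averages areas over replications, and shuffles condition labels without replacement (a plain randomization test rather than the randomization-bootstrap named above). The toy curves are invented and are not results from the experiments.

```python
import random

def area_under(curve):
    """Trapezoidal area under a learning curve sampled at unit spacing."""
    return sum((a + b) / 2 for a, b in zip(curve, curve[1:]))

def area_ratio(curves_b, curves_ab):
    """Area(B) / Area(AB), using the mean area over each group's curves."""
    mean_b = sum(map(area_under, curves_b)) / len(curves_b)
    mean_ab = sum(map(area_under, curves_ab)) / len(curves_ab)
    return mean_b / mean_ab

def randomization_p_value(curves_b, curves_ab, n_resamples=2000, seed=0):
    """One-sided p-value under the null that B and AB curves are exchangeable."""
    rng = random.Random(seed)
    observed = area_ratio(curves_b, curves_ab)
    pooled = list(curves_b) + list(curves_ab)
    n_b = len(curves_b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign condition labels at random
        if area_ratio(pooled[:n_b], pooled[n_b:]) >= observed:
            hits += 1
    return observed, hits / n_resamples

# Invented toy curves (think "time to complete", lower is better).
toy_b = [[10.0, 10.0]] * 4
toy_ab = [[5.0, 5.0]] * 4
observed, p_value = randomization_p_value(toy_b, toy_ab)
```

The p-value is the fraction of label shufflings whose ratio is at least as large as the observed one, so a small value suggests the AB curves really do differ from the B curves.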

25
Y1 Internal Results
  • Sampling distribution for null hypothesis
  • Vertical line marks our observed statistic
  • P-value: 0.035
  • Sampling distribution of the difference between
    the areas of the B and AB curves

[Figure: sampling distribution with the observed statistic and the 2.5 and 97.5 quantiles marked]
26
Publications
  • St. Amant, R., Morrison, C. T., Chang, Y., Mu,
    W., Cohen, P. R. and Beal, C. (2006). An Image
    Schema Language. In Proceedings of the
    International Conference on Cognitive Modeling
    (ICCM 2006).
  • Chang, Y., Morrison, C. T., Kerr, W., Galstyan,
    A., Cohen, P. R., Beal, C., St. Amant, R. and
    Oates, T. (2006). The Jean System. In
    Proceedings of the 5th International Conference
    on Development and Learning (ICDL 2006).
  • Chang, Y., Cohen, P., Morrison, C. T. and St.
    Amant, R. (2006). Piagetian Adaptation Meets
    Image Schemas: The Jean System. In Proceedings
    of the Ninth International Conference on the
    Simulation of Adaptive Behavior (SAB 2006).
  • Morrison, C. T., Chang, Y., Cohen, P. R. and
    Moody, J. (2006). Transfer Learning with the Jean
    System. In Proceedings of the ICML 2006 Workshop
    on Structural Knowledge Transfer for Machine
    Learning.