Transcript and Presenter's Notes

Title: Learning to Support Constraint Programmers


1
Learning to Support Constraint Programmers
  • Susan L. Epstein¹
  • Gene Freuder² and Rick Wallace²
  • ¹Department of Computer Science, Hunter College and The Graduate Center of The City University of New York
  • ²Cork Constraint Computation Centre

2
Facts about ACE
  • Learns to solve constraint satisfaction problems
  • Learns search heuristics
  • Can transfer what it learns on simple problems to
    solve more difficult ones
  • Can export knowledge to ordinary constraint
    solvers
  • Both a learner and a test bed
  • Heuristic but complete: will find a solution, eventually, if one exists
  • Guarantees high-quality, not optimal, solutions
  • Begins with substantial domain knowledge

3
Outline
  • The task: constraint satisfaction
  • Performance results
  • Reasoning mechanism
  • Learning
  • Representations

4
The Problem Space
  • Constraint satisfaction problem: ⟨X, D, C⟩
  • Solution: assign a value to every variable consistent with the constraints
  • Many real-world problems can be represented and
    solved this way (design and configuration,
    planning and scheduling, diagnosis and testing)

Variables: A, B, C, D
Domains: A ∈ {1, 2, 3}, B ∈ {1, 2, 4, 5, 6}, C ∈ {1, 2}, D ∈ {1, 3}
Constraints: A = B, A > D, C ≠ D
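A minimal sketch that makes these definitions concrete by brute-force enumeration (illustration only, not ACE's method; constraint solvers search far more cleverly):

```python
from itertools import product

# The running example from this slide.
domains = {
    "A": [1, 2, 3],
    "B": [1, 2, 4, 5, 6],
    "C": [1, 2],
    "D": [1, 3],
}

def consistent(s):
    # The three constraints: A = B, A > D, C != D.
    return s["A"] == s["B"] and s["A"] > s["D"] and s["C"] != s["D"]

names = list(domains)
for values in product(*(domains[v] for v in names)):
    assignment = dict(zip(names, values))
    if consistent(assignment):
        print(assignment)  # {'A': 2, 'B': 2, 'C': 2, 'D': 1}
```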
5
A Challenging Domain
  • Constraint solving is NP-hard
  • Problem class parameters: ⟨n, k, d, t⟩
  • n: number of variables
  • k: maximum domain size
  • d: edge density (% of possible constraints)
  • t: tightness (% of value pairs excluded)
  • Complexity peak: values of d and t that make problems hardest
  • Heavy-tailed difficulty distribution [Gomes et al., 2002]
  • A problem may have multiple solutions or none
  • Unexplored choices may be good
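A sketch of one standard way to generate a problem in class ⟨n, k, d, t⟩ (a common random-binary-CSP model; whether ACE's generator matches it exactly is an assumption):

```python
import random
from itertools import combinations, product

def random_csp(n, k, d, t, seed=0):
    """Random binary CSP in class <n, k, d, t>: n variables with domains
    of size k; each possible edge included with probability d; a
    fraction t of value pairs excluded on each included edge."""
    rng = random.Random(seed)
    domains = {v: list(range(k)) for v in range(n)}
    constraints = {}
    for u, v in combinations(range(n), 2):
        if rng.random() < d:                       # edge density
            pairs = list(product(range(k), repeat=2))
            excluded = set(rng.sample(pairs, int(t * len(pairs))))  # tightness
            constraints[(u, v)] = [p for p in pairs if p not in excluded]
    return domains, constraints

domains, constraints = random_csp(n=30, k=8, d=0.24, t=0.66)
print(len(constraints), "constraints")
```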

6
Finding a Path to a Solution
  • Sequence of decision pairs (select variable, assign value)
  • Optimal length: 2n decisions for n variables
  • For n variables with domain size d, there are (d + 1)^n possible states, since each variable is either unassigned or bound to one of its d values

7
Solution Method
Domains: A ∈ {1, 2, 3}, B ∈ {1, 2, 4, 5, 6}, C ∈ {1, 2}, D ∈ {1, 3}
Constraints: A = B, A > D, C ≠ D
  • Search from initial state to goal

8
Consistency Maintenance
  • Some values may initially be inconsistent
  • Value assignment can restrict domains

Constraints: A = B, A > D, C ≠ D
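A sketch of that restriction as forward checking, one of the consistency methods named on slide 25 (the constraint encoding here is illustrative):

```python
def forward_check(var, value, domains, constraints):
    """After assigning value to var, prune the domains of var's
    neighbours. constraints maps a variable pair to a predicate on
    their values. Returns pruned domains, or None on a wipeout."""
    pruned = {v: list(d) for v, d in domains.items()}
    pruned[var] = [value]
    for (u, v), ok in constraints.items():
        if u == var:
            pruned[v] = [b for b in pruned[v] if ok(value, b)]
            if not pruned[v]:
                return None   # a domain emptied: inconsistency
        elif v == var:
            pruned[u] = [a for a in pruned[u] if ok(a, value)]
            if not pruned[u]:
                return None
    return pruned

# The slides' example: assigning A = 2 forces B = 2 and D = 1.
domains = {"A": [1, 2, 3], "B": [1, 2, 4, 5, 6], "C": [1, 2], "D": [1, 3]}
constraints = {("A", "B"): lambda a, b: a == b,
               ("A", "D"): lambda a, d: a > d,
               ("C", "D"): lambda c, d: c != d}
print(forward_check("A", 2, domains, constraints))
```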
9
Retraction
  • When an inconsistency arises, a retraction method
    removes a value and returns to an earlier state

Domains: A ∈ {1, 2, 3}, B ∈ {1, 2, 4, 5, 6}, C ∈ {1, 2}, D ∈ {1, 3}
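A sketch of chronological backtracking built on the forward_check and example problem from the previous sketch (again illustrative; chronological backtracking is the retraction method named on slide 25):

```python
def backtrack(domains, constraints, assignment=None):
    """Depth-first search with forward checking; on a wipeout,
    chronological backtracking retracts the most recent assignment."""
    assignment = assignment or {}
    unbound = [v for v in domains if v not in assignment]
    if not unbound:
        return assignment           # all variables bound: a solution
    var = unbound[0]                # naive ordering; see the next slides
    for value in domains[var]:
        pruned = forward_check(var, value, domains, constraints)
        if pruned is not None:      # consistent so far: descend
            result = backtrack(pruned, constraints,
                               {**assignment, var: value})
            if result is not None:
                return result
        # otherwise retract this value and try the next one
    return None                     # every value failed: back up a level

print(backtrack(domains, constraints))  # {'A': 2, 'B': 2, 'C': 2, 'D': 1}
```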
10
Variable Ordering
  • A good variable ordering can speed search

11
Value Ordering
  • A good value ordering can speed search too

Domains: A ∈ {1, 2, 3}, B ∈ {1, 2, 4, 5, 6}, C ∈ {1, 2}, D ∈ {1, 3}
Solution: A = 2, B = 2, C = 2, D = 1
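Sketches of one standard heuristic of each kind (MinDomain for variables, a least-constraining-value rule for values; the slides do not say these are ACE's defaults):

```python
def min_domain(domains, assignment):
    """Variable ordering: choose the unbound variable with the smallest
    remaining domain (fail-first; the MinDomain baseline of slide 16)."""
    unbound = [v for v in domains if v not in assignment]
    return min(unbound, key=lambda v: len(domains[v]))

def order_values(var, domains, constraints):
    """Value ordering: prefer values that leave the most surviving
    values in neighbouring domains."""
    def survivors(value):
        total = 0
        for (u, v), ok in constraints.items():
            if u == var:
                total += sum(ok(value, b) for b in domains[v])
            elif v == var:
                total += sum(ok(a, value) for a in domains[u])
        return total
    return sorted(domains[var], key=survivors, reverse=True)
```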
12
Constraint Solvers Know
  • Several consistency methods
  • Several retraction methods
  • Many variable ordering heuristics
  • Many value ordering heuristics

but the interactions among them are not well
understood, nor is one combination best for all
problem classes.
13
Goals of the ACE Project
  • Characterize problem classes
  • Learn to solve classes of problems well
  • Evaluate mixtures of known heuristics
  • Develop new heuristics
  • Explore the role of planning in solution

14
Outline
  • The task: constraint satisfaction
  • Performance results: ACE
  • Reasoning mechanism
  • Learning
  • Representation

15
Experimental Design
  • Specify problem class, consistency and retraction
    methods
  • Average performance across 10 runs
  • Learn on L problems (halt at 10,000 steps)
  • To-completion testing on T new problems
  • During testing, use only heuristics judged
    accurate during learning
  • Evaluate performance on:
  • Steps to solution
  • Constraint checks
  • Retractions
  • Elapsed time

16
ACE Learns to Solve Hard Problems
  • ⟨30, 8, .24, .66⟩, near the complexity peak
  • Learn on 80 problems
  • 10 runs, binned in sets of 10 learning problems
  • Discards 26 of 38 heuristics
  • Outperforms MinDomain, an off-the-shelf
    heuristic

[Chart: steps to solution per bin of 10 learning problems (bins 1-8); means in blue, medians in red]
17
ACE Rediscovers Brélaz Heuristic
  • Graph coloring: assign different colors to adjacent nodes.
  • Graph coloring is a kind of constraint satisfaction problem.
  • Brélaz: minimize dynamic domain, break ties with maximum forward degree.
  • ACE learned this consistently on different
    classes of graph coloring problems.

Color each vertex red, blue, or green so that each pair of adjacent vertices gets different colors.
[Epstein & Freuder, 2001]
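A sketch of the Brélaz rule as a variable-ordering procedure (the data layout, a neighbors map, is an assumption):

```python
def brelaz(domains, assignment, neighbors):
    """Brelaz ordering: minimize dynamic domain size, breaking ties by
    maximum forward degree (number of unbound neighbours)."""
    unbound = [v for v in domains if v not in assignment]
    def key(v):
        forward_degree = sum(u not in assignment for u in neighbors[v])
        return (len(domains[v]), -forward_degree)  # ties: larger degree wins
    return min(unbound, key=key)
```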
18
ACE Discovers a New Heuristic
  • Maximize the product of degree and forward
    degree at the top of the search tree
  • Exported to several traditional approaches:
  • Min Domain
  • Min Domain/Degree
  • Min Domain, degree preorder
  • Learned on small problems but tested in 10 runs on n = 150, domain size 5, density .05, tightness .24
  • Reduced search tree size by 25-96%

[Epstein, Freuder, Wallace, Morozov & Samuels, 2002]
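A sketch of that discovered rule as a variable ordering (the depth cutoff for "top of the search tree" and the MinDomain fallback are assumptions for illustration):

```python
def degree_product(domains, assignment, neighbors, depth, cutoff=3):
    """Near the root (depth < cutoff), pick the unbound variable that
    maximizes static degree * forward degree; deeper in the tree, fall
    back to MinDomain (an assumed fallback)."""
    unbound = [v for v in domains if v not in assignment]
    if depth >= cutoff:
        return min(unbound, key=lambda v: len(domains[v]))
    def score(v):
        degree = len(neighbors[v])                        # static degree
        forward = sum(u not in assignment for u in neighbors[v])
        return degree * forward
    return max(unbound, key=score)
```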
19
Outline
  • The task constraint satisfaction
  • Performance results
  • Reasoning mechanism
  • Learning
  • Representation

20
Constraint-Solving Heuristic
  • Uses domain knowledge
  • What problem classes does it work well on?
  • Is it valid throughout a single solution?
  • Can its dual also be valid?
  • How can heuristics be combined?

and where do new heuristics come from?
21
FORR (For the Right Reasons)
  • General architecture for learning and problem
    solving
  • Multiple learning methods, multiple
    representations, multiple decision rationales
  • Specialized by domain knowledge
  • Learns useful knowledge to support reasoning
  • Specify whether a rationale is correct or
    heuristic
  • Learns to combine rationales to improve problem
    solving

[Epstein, 1992]
22
An Advisor Implements a Rationale
  • Class-independent action-selection rationale
  • Supports or opposes actions by comments
  • Expresses the direction of its opinion by strengths
  • Limitedly-rational procedure
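A minimal sketch of that interface (the Comment layout and the 0-10 strength scale are assumptions drawn from the description above):

```python
from dataclasses import dataclass

@dataclass
class Comment:
    advisor: str     # which rationale produced this comment
    action: object   # e.g., a variable to select, or a (variable, value) pair
    strength: float  # high values support, low values oppose (0-10 assumed)

def min_domain_advisor(actions, domains):
    """An Advisor maps the current choices to comments; this one
    supports selecting variables with the smallest remaining domain."""
    smallest = min(len(domains[a]) for a in actions)
    return [Comment("MinDomain", a,
                    10.0 if len(domains[a]) == smallest else 0.0)
            for a in actions]
```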

23
Advisor Categories
  • Tier 1: rationales that correctly select a single action
  • Tier 2: rationales that produce a set of actions directed to a subgoal
  • Tier 3: heuristic rationales that select a single action

24
Choosing an Action
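The diagram for this slide is not in the transcript; below is a sketch of the decision cycle the tier definitions imply, with tier 3 resolved by the weighted vote described on slide 27 (tier 2, the subgoal tier, is omitted; Advisors produce Comments as in the earlier sketch):

```python
def choose_action(state, actions, tier1, tier3, weights):
    """Tiered action selection, sketched from slides 23 and 27."""
    # Tier 1: correct rationales; the first supporting comment decides.
    for advisor in tier1:
        for comment in advisor(state, actions):
            if comment.strength > 0:
                return comment.action
    # Tier 3: heuristic rationales vote; comment strengths are combined
    # in a learned weighted linear function.
    scores = {a: 0.0 for a in actions}
    for advisor in tier3:
        for comment in advisor(state, actions):
            scores[comment.action] += weights[comment.advisor] * comment.strength
    return max(scores, key=scores.get)
```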
25
ACE's Domain Knowledge
  • Consistency maintenance methods: forward checking, arc consistency
  • Backtracking methods: chronological
  • 21 variable ordering heuristics
  • 19 value ordering heuristics
  • 3 languages whose expressions have
    interpretations as heuristics
  • Graph theory knowledge, e.g., connected, acyclic
  • Constraint solving knowledge, e.g., only one arc
    consistency pass is required on a tree

26
An Overview of ACE
  • The task: constraint satisfaction
  • Performance results: ACE
  • Reasoning mechanism
  • Learning
  • Representation

27
What ACE Learns
  • Weighted linear combination for comment strengths
  • For voting in tier 3 only
  • Includes only valuable heuristics
  • Indicates relative accuracy of valuable
    heuristics
  • New, learned heuristics
  • How to restructure tier 3
  • When random choice is the right thing to do
  • Knowledge that supports heuristics (e.g., typical solution path length)

28
Digression-based Weight Learning
  • Learn from trace of each solved problem
  • Reward decisions on perfect solution path
  • Shorter paths reward variable ordering
  • Longer paths reward value ordering
  • Blame digression-producing decisions in
    proportion to error
  • Valuable Advisors: weight > baseline
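A sketch of such an update (only the reward/blame structure comes from the slide; the learning rate and the exact reward form are assumptions):

```python
def update_weights(weights, on_path, digressions, rate=0.1):
    """Digression-based weight learning, sketched.
    on_path: (advisor, supported) pairs for decisions on the solution
    path, where supported says the Advisor endorsed that decision.
    digressions: (advisor, error) pairs for decisions that produced
    digressions, with error proportional to the digression's size."""
    for advisor, supported in on_path:
        if supported:
            weights[advisor] += rate          # reward on-path support
    for advisor, error in digressions:
        weights[advisor] -= rate * error      # blame in proportion to error
    return weights
```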

29
Learning New Advisors
  • Advisor grammar on pairs of concerns:
  • Maximize or minimize
  • Product or quotient
  • Stage
  • Monitor all expressions
  • Use good ones collectively
  • Use best ones individually
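A sketch of enumerating candidate heuristics from such a grammar (the concern set is hypothetical; the stage component, which says where in search an expression applies, is omitted):

```python
from itertools import combinations

# Hypothetical concerns a learned heuristic might combine.
concerns = ["domain_size", "static_degree", "forward_degree"]

def candidate_expressions():
    """Enumerate grammar expressions: (maximize | minimize) the
    (product | quotient) of a pair of concerns."""
    for a, b in combinations(concerns, 2):
        for direction in ("maximize", "minimize"):
            for op in ("product", "quotient"):
                yield (direction, op, a, b)

for expression in candidate_expressions():
    print(expression)  # e.g., ('maximize', 'product', 'static_degree', 'forward_degree')
```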

30
Outline
  • The task: constraint satisfaction
  • Performance results: ACE
  • Reasoning mechanism
  • Learning
  • Representation

31
Representation of Experience
  • State describes variables and value assignments, impossible future values, the prior state, connected components, constraint checks incurred, dynamic edges, and trees
  • The history of successful decisions, plus other significant decisions, becomes training examples

[Example state: per-variable table of "Is / Can be / Cannot be" values for A-D; checks incurred: 4; one acyclic component {A, C, D}; dynamic edges: A-D, C-D]
32
Representation of Learned Knowledge
  • Weights for Advisors
  • Solution size distribution
  • Latest error: greatest number of variables bound at retraction

33
ACE's Status Report
  • 41 Advisors in tiers 1 and 3
  • 3 languages in which to express additional
    Advisors
  • 5 experimental planners
  • Problem classes: random, coloring, geometric, logic, n-queens, small world, and quasigroup (with and without holes)
  • Learns to solve hard problems
  • Learns new heuristics
  • Transfers to harder problems
  • Divides and conquers problems
  • Learns when not to reason

34
Current ACE Research
  • Further weight-learning refinements
  • Learn appropriate restart parameters
  • More problem classes, consistency methods,
    retraction methods, planners, and Advisor
    languages
  • Learn appropriate consistency checking methods
  • Learn appropriate backtracking methods
  • Learn to bias initial weights
  • Metaheuristics to reformulate the architecture
  • Modeling strategies

and, coming soon, ACE on the Web
35
Acknowledgements
  • Continued thanks for their ideas and efforts go
    to
  • Diarmuid Grimes
  • Mark Hennessey
  • Tiziana Ligorio
  • Anton Morozov
  • Smiljana Petrovic
  • Bruce Samuels
  • Students of the FORR study group
  • The Cork Constraint Computation Centre
  • and, for their support, to
  • The National Science Foundation
  • Science Foundation Ireland

36
Is ACE Reinforcement Learning?
  • Similarities
  • Unsupervised learning through trial and error
  • Delayed rewards
  • Learns a policy
  • Primary differences
  • Reinforcement learning learns a policy represented as the estimated values of states it has experienced repeatedly, but ACE is unlikely to revisit a state; instead, it learns how to act in any state
  • Q-learning learns state-action preferences, but ACE learns a policy that combines action preferences

37
How is ACE like STAGGER?
  • STAGGER vs. ACE:
  • Learns: STAGGER, a Boolean classifier; ACE, a search-control preference function for a sequence of decisions in a class of problems
  • Represents: STAGGER, weighted Booleans; ACE, a weighted linear function
  • Supervised: STAGGER, yes; ACE, no
  • New elements: STAGGER, failure-driven; ACE, success-driven
  • Initial bias: STAGGER, yes; ACE, under construction
  • Real-valued attributes: STAGGER, yes; ACE, no
[Schlimmer, 1987]
38
How is ACE like SAGE.2?
  • Both learn search control from unsupervised experience, reinforce decisions on a successful path, gradually introduce new factors, specify a threshold, and transfer to harder problems, but:
  • Learns on: SAGE.2, the same task; ACE, different problems in a class
  • Represents: SAGE.2, symbolic rules; ACE, a weighted linear function
  • Reinforces: SAGE.2, repeating rules; ACE, correct comments
  • Failure response: SAGE.2, revise; ACE, reduce weight
  • Proportional to error: SAGE.2, no; ACE, yes
  • Compares states: SAGE.2, yes; ACE, no
  • Random benchmarks: SAGE.2, no; ACE, yes
  • Subgoals: SAGE.2, no; ACE, yes
  • Learns during search: SAGE.2, yes; ACE, no

[Langley, 1985]