Integration%20of%20Artificial%20Intelligence%20and%20Operations%20Research%20Techniques%20for%20Combinatorial%20Problems%20%20Carla%20P.%20Gomes%20Cornell%20University%20gomes@cs.cornell.edu%20Ken%20McAloon%20and%20Carol%20Tretkoff%20ILOG%20{mcaloon,tretkoff}@ilog.com - PowerPoint PPT Presentation

About This Presentation
Title:

Integration%20of%20Artificial%20Intelligence%20and%20Operations%20Research%20Techniques%20for%20Combinatorial%20Problems%20%20Carla%20P.%20Gomes%20Cornell%20University%20gomes@cs.cornell.edu%20Ken%20McAloon%20and%20Carol%20Tretkoff%20ILOG%20{mcaloon,tretkoff}@ilog.com

Description:

Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl ... Nutritional Requirements. At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron ... – PowerPoint PPT presentation

Number of Views:149
Avg rating:3.0/5.0

less

Transcript and Presenter's Notes

Title: Integration%20of%20Artificial%20Intelligence%20and%20Operations%20Research%20Techniques%20for%20Combinatorial%20Problems%20%20Carla%20P.%20Gomes%20Cornell%20University%20gomes@cs.cornell.edu%20Ken%20McAloon%20and%20Carol%20Tretkoff%20ILOG%20{mcaloon,tretkoff}@ilog.com


1
Integration of Artificial Intelligence and
Operations Research Techniques for Combinatorial
ProblemsCarla P. GomesCornell University
gomes_at_cs.cornell.eduKen McAloon and Carol
TretkoffILOGmcaloon,tretkoff_at_ilog.com

2
AI, OR, and CS
AI
OR
CS
3
Integration of Artificial Intelligence
Operations ResearchTechniques
  • AI

OR
Representations Constraint Languages Logic
Formalisms Object-Oriented Prog. Bayesian
Nets Rule Based Systems     Tools Constraint
Propagation Systematic Search Stochastic
Search      Pros / Cons Rich
Representations Computational Complexity
Representations Mathematical Modeling
Languages Linear Non-linear (In)Equalities  
  Tools Linear Programming Mixed-Integer
Prog. Non-linear Models      Pros / Cons More
Tractable (LP) Primarily Complete Info Limited
Representations
Combinatorial Problems
Planning
Scheduling
THE CHALLENGE AI OR UNIFY
APPROACHES TO
SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE
COMPLEXITY (phase transition)
EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS
4
Outline
  • I. Short Overview of OR
  • II. Disjunctive Programming and Hybrid Solvers
  • III. Exploiting Randomization to Solve Hard
    Combinatorial Problems
  • IV. Conclusions

5
I. Short OR Overview

6
Outline for Linear Programming and Integer
Programming
  • Standard Form of LP and a Simple Example
  • Geometric Interpretation of LP
  • Complexity issues
  • MIP
  • Example Fast Food
  • Example Capacitated Warehouse
  • Example 911

7
Outline
  • 1. Short Overview of OR
  • 2. Constraint Programming
  • 3. Cooperating Solvers
  • 4. Disjunctive Programming
  • 5. Exploiting Randomization to Solve Hard
    Combinatorial Problems
  • 6. Conclusions

8
Optimization Technology Evolution
1960
1970
1980
1990
1998
1947
Shifting Bottleneck
Primal Simplex LP
Interior Point
Dispatch Rules
CPM PERT
First CP Systems
Dual Simplex Implementation
Constraint Propagation
SA, GA, Tabu
Constraint-based Scheduling
Dual Simplex
Barrier LP
Barrier Crossover
Cooperating Solvers (LP/CP)
Global constraints
MIP
Concurrent Scheduling
Parallel LP/MIP
Large IPs
9
1. Short OR Overview

10
Outline for Linear Programming and Integer
Programming
  • Standard Form of LP and a Simple Example
  • Geometric Interpretation of LP
  • Complexity issues
  • MIP
  • Example Fast Food
  • Example Capacitated Warehouse
  • Example 911

11
An LP Story
  • A factory can produce n products from m parts
  • For product j it needs aij units of part i
  • There are bi units of part i available
  • Each unit of product j sold earns cj
  • Amount of each product to make is unknown xj ?
    0
  • Each part i determines a constraint
  • ai1 x1 ain xn ? bi
  • Obvious solution do nothing
  • Better maximize c1 x1 cn xn

12
Standard Forms of LP
  • A linear program (LP) in standard form
    (Dantzig 1947)
  • max cTx
  • subject to Ax ? b
  • x ? 0
  • Input data c (n x 1), A (m x n), b (m x 1).
  • Variables x (n x 1)

13
Standard Forms of LP
  • // The objective function
  • max c1 x1 cn xn
  • // The constraints
  • subject to
  • a11 x1 a1n xn ? b1
  • ...
  • am1 x1 amn xn ? bm
  • x1 ? 0 , , xn ? 0

14
Standard Forms of LP
  • In OR emphasis is on optimality
  • Solution means optimal solution
  • Feasible solution means solution in the ordinary
    sense

15
Standard Forms of LP
  • Interpretation of standard form
  • xj amount of product j to make
  • cj revenue per unit product j
  • bi available amount of component i
  • aij units of i used per unit of j produced
  • The constraints say
  • ? aijxj ? units of i used by j
  • units of i used
  • ? bi

16
What are models?
  • A model is a data-independent abstraction of a
    problem
  • A model lets you write down the mathematical
    representation of a model independently of the
    data

One Problem Instance
17
Products Could be Jewelry
Products Rings and Earrings Components Gold
and Diamonds One ring requires 3 units of Gold,
and 1 Diamond One set of earrings requires 2
units of Gold, and 2 Diamonds Total Gold and
Diamonds are limited Profit is different for
Rings than for Earrings
Products rings, earrings Components
Gold, Diamonds demand 3, 1, 2, 2
stock 150, 180 profit 60, 40
18
Products Could be Chemicals
Products Ammonium Gas NH3 Ammonium
Chloride NH4Cl Components Nitrogen,
Hydrogen, Chlorine One unit of Gas requires 1
unit of Nitrogen, 3 units Hydrogen One unit of
Chloride requires 1 unit of Nitrogen, 4 units
Hydrogen, and 1 unit of Chlorine Total Nitrogen,
Hydrogen, Chlorine is limited Profit is different
for Gas than Chloride
Products gas, chloride Components
nitrogen, hydrogen, chlorine demand 1,
3, 0, 1, 4, 1 stock 50, 180,
40 profit 30, 40
19
The Problems Have One Model
enum Products ... enum Components ... float
demandProducts, Components ... float
profitProducts ... float stockComponents
... var float productionProducts maximize
sum (p in Products) profitp
productionp subject to forall (c in
Components) sum (p in Products)
demandp, c productionp lt
stockc
20
OR Modeling Systems
  • OPL
  • AMPL
  • 2LP
  • AIMMS
  • GAMS
  • MPL
  • ILOG Planner
  • etc

21
The Dual
  • The dual linear program (von Neumann 1947)
  • min yTb
  • subject to yTA ? c
  • y ? 0
  • Variables y (m x 1)
  • Awesome Symmetry -
  • The dual of the dual is the primal

22
Rows and Columns Exchanged
  • min b1 y1 bm yn
  • subject to
  • a11 y1 am1 ym ? c1
  • ...
  • a1n y1 amn ym ? cn
  • y1 ? 0 , , ym ? 0

23
Duality Theorem
  • Theorem min yTb max cTx
  • Consequence This turns optimality problem into a
    feasibility problem in x and y
  • Ax ? b
  • x ? 0
  • yTA ? cT
  • y ? 0
  • yTb cTx
  • Consequence Enumeration not needed to verify
    optimality

24
Duality Theorem
  • Sensitivity Analysis
  • Consequence The solution values y for the y
    variables yield the Lagrange multipliers of the
    primal constraints which measure the rate of
    change of the objective function with respect to
    the right hand side bounds b
  • yi ? Z / ? bi where Z is the
    optimum
  • Reference McAloon and Tretkoff 1996
    Wiley

25
Duality
  • Two different views of the same phenomenon

Point vs Set
Arc vs Node
Momentum vs Position
Vector vs Hyperplane
Landlord vs Renter
26
Simplex and Barrier
  • The simplex algorithm turns the feasibility
    problem into a iterative repair process with a
    powerful evaluation function
  • The barrier method transforms the LP into a
    system of differential equations that describe a
    vector field of flow on the polytope

27
Geometric Interpretation of LP

Max X subject to -X Y lt 4 X 4y lt 36 2X
y lt 23 X Y gt 4 Y gt X 10
Y
(4,8)
(0,4)
Simplex
(10,3)
Barrier
(4,0)
(8,0)
X
28
Complexity of Linear Programming
  • Simplex Method
  • Worst-case --- exponential (Klee and Minty 72)
  • Practice --- good performance
  • Ellipsoid Method
  • Khachians Ellipsoid Method
  • Worst-case --- polynomial
  • Practice --- poor performance

29
Complexity of Linear Programming
  • Interior Point Methods or Barrier Methods
  • Karmarkars (and variants) Method
  • Worst-case --- polynomial
  • Practice --- good performance

30
Complexity of Linear Programming
  • Despite its worst case exponential time
    complexity, the simplex method is usually the
    method of choice since it provides tools for
    sensitivity analysis and its performance is very
    competitive in practice.
  • Which method performs best is problem dependent.

31
Success Stories
  • Industrial Planning
  • Given current resources, decide what to produce
    in what quantity
  • Supply Chain Management
  • Multiperiod planning models that link flow from
    one period to the next
  • Network Flow
  • How best to route goods across a network

32
Assumptions of Linear Programming
  • Linearity
  • when violated ( xy 50)
  • Nonlinear programming
  • Continuity
  • when violated (x integral)
  • (Mixed) Integer programming

33
Assumptions of Linear Programming - continued
  • No Disjunctive Constraints
  • when violated (x ? 100 or x ? 0)
  • Disjunctive programming
  • Additional 0-1 variables and Big M
    constraints
  • Certainty
  • when violated (cost c is a random variable)
  • Stochastic programming

34
Search and MIP
  • In order to deal with variables that must have
    integer values in the solution, a search must be
    performed.
  • Mixed Integer Programming problems are
    combinatorial optimization problems and are NP
    hard
  • feasibility is NP-Complete
  • verifying optimality is co-NP-Complete

35
MIP and Combinatorial Optimization
  • These problems have been attacked by both the AI
    and OR communities.
  • In AI, these problems are attacked as CSPs or as
    Planning Problems.
  • In OR, they are done as MIPs and use linear
    relaxation to help guide the search.
  • The overriding idea in each case is to limit
    search.

36
Integer Program All Integer Points in Region
37
Cut to Create Integer Vertex
Integer Vertex
38
Example - Fast Food
  • Question Is it possible for a male college
    student to eat at the local fast food outlet and
    still meet the requirements of a balanced diet?
  • If so, what is the least he can do it for?

39
Nutritional Requirements
  • At least 100 of vitamins A, C, B1, B2, niacin,
    calcium and iron
  • At least 55 grams of protein
  • At most 3000 milligrams of sodium
  • At most 30 of the calories can come from fat
  • Nutritional information is available from fast
    food outlets

40
College Students Requirements
  • At least 2000 calories a day
  • No more than 3 servings of any one food
  • Milk only with cereal and not as a stand-alone
    drink

41
Fast Food - MIP Model
  • We will have variables Servk to represent the
    number of servings of item k in the plan.
  • The variable Servk will have to take an integer
    value for the solution to be valid.
  • The objective function Z for cost

42
Fast Food - MIP Model
  • Let foodk,j represent the percent of RDA of
    nutrient j in a serving of item k
  • The for each nutrient j, we have a constraint
  • ? foodk,j Servk ? 100
  • k

43
Fast Food - MIP Model
  • Let sodiumk represent the amount of salt in a
    serving of item k
  • For salt we have the constraint
  • ? sodiumk Servk ? 3000
  • k
  • Similarly for fat

44
Fast Food - MIP Model
  • Let costk represent the cost of a serving of item
    k
  • For the objective function we have the defining
    constraint
  • ? costk Servk Z
  • k

45
Fast Food - Solution
  • With a MIP solver and a way to input these
    constraints we ask for
  • a solution that makes the variables Servk
    integral
  • and which minimizes Z

46
MIP Solution Technique
  • What the MIP solver does is to carry out a branch
    and bound search guided by
  • the linear relaxation
  • the solution to the problem with the integrality
    requirements relaxed
  • Initialize the global variable best_so_far to
    1000 (or something else very big).

47
At a Node
  • Compute a solution to the linear relaxation which
    minimizes Z yielding z. Prune this node if
  • z ? best_so_far ,
  • If all values of Servk are integral, this is a
    solution. Set best_so_far z. Save this node.

48
Branching at a node
  • Choose a variable Servk whose value s is not
    integral.
  • Typical heuristic most non-integral variable
  • Create two child nodes,
  • add Servk ? floor(s)
  • add Servk ? ceil(s)

49
Good News
  • The linear relaxation can prune nodes before all
    variables Servk are forced to be integral.
  • Surprisingly often a node high in the tree will
    turn up with all relevant variables integer.
    Heres why
  • A solution to the LP is at a vertex
  • A vertex is defined as the simultaneous solution
    of the equality form of n linearly independent
    constraints
  • Many of these constraints are integer bounding
    constraints yielding X integer

50
Arboreally Speaking
  • Breadth first search is often preferred - it
    visits the smallest number of nodes needed to
    find and verify the optimal solution - analogous
    to A
  • If the linear relaxation is tight
  • zlinear - zintegral is relatively small
  • then zlinear is an excellent evaluation
    function

51
Answer - Fast Food
  • Total cost is 8.71
  • Buy 3 burgers
  • Buy 2 fries
  • Buy 3 honeys
  • Buy 1 yogurt
  • ...

52
Example - Fixed Cost
  • Warehouses must be rented in order to supply
    stores and we must decide which to use
  • For each store j we know its monthly demand dj
  • For each warehouse i we know its capacity ki
  • For each warehouse i we know the fixed cost to
    run it each month fci
  • For each pair i, j we know the monthly cost cij
    of supplying j from i

53
Example - Fixed Cost
  • Xij is the fraction of store js demand met by i
  • Xij ? 1
  • Yi is a fuzzy boolean
  • it will be 1 if the warehouse is rented
  • 0 if it is not rented
  • Yi ? 1

54
Example - Fixed Cost
  • Each store must be supplied
  • ? X ij 1
  • i
  • Warehouse capacity can not be exceeded
  • ? dj Xij ? ki
  • j
  • Tighter
  • ? dj Xij ? ki Yi
  • j

55
Example - Fixed Cost
  • Objective function
  • ? fci Yi ? ? cij Xij
  • This yields a MIP with 0-1 variables Yi

56
Branch and Cut An Enhanced Solution Method
  • Cuts - redundant constraints for the MIP model
    but not redundant for the linear relaxation
  • Xij ? Yi
  • Add at a node if violated by solution to linear
    relaxation
  • Powerful method - will solve the Imperial College
    OR lib CW problems very easily

57
Example - Call 911
  • PCTs answer the phone 24 hours a day, 7 days a
    week.
  • It is known how many PCTs should be on duty
    during each of the 168 hours during the week in
    order to assure the necessary response rate.
  • Workers can arrive at any hour and they work for
    8 hours except for a one hour break after 4 hours.

58
Example - Call 911
  • Each PCT has a work week of 5 days followed by 2
    days off.
  • Want to meet the demand with minimal or
    near-minimal number of PCTs.
  • So need to determine how many PCTs start their
    work week at each hour h of the week

59
Modeling 911
  • A continuous variable Pcth will represent the
    number of workers who start their work week at
    hour h, 0 ? h lt 168.

60
Modeling 911
  • A continuous variable Z will represent the
    objective function
  • ? Pcth Z
  • h
  • There will be a constraint for each hour h to
    assert that there are enough workers on duty at
    that time. The rhs of this constraint is bh the
    number of workers needed.

61
Modeling 911
  • For this constraint we need to represent the
    number of workers who are on duty at time h
  • Certainly, those who start the week at time h are
    here, as are those who started the week at time h
    - 1
  • And so on back to time h - 7 with the exception
    of those who started at time h - 4 and who are
    now on break.

62
Modeling 911
  • This also applies to the previous 4 days. When
    the smoke clears, we sum over the workers w who
    are working at time h
  • ? Pctw ? bh
  • w

63
Call 911 solved with progressive roundoff
  • int b168 // New York City 911
  • 30,24,18,15,14,14,15,25,34,36,38,40,
  • 41,43,46,57,57,59,61,59,55,50,45,38,
  • 32,25,20,17,15,13,17,25,32,35,38,40,
  • 42,43,47,58,57,57,59,57,55,52,47,41,
  • 33,25,20,17,15,13,15,25,32,33,37,39,
  • 42,43,47,57,56,57,57,56,53,50,47,41,
  • 34,27,22,19,16,15,16,25,31,35,37,40,
  • 44,45,48,57,57,56,58,56,53,53,46,41,
  • 34,28,23,19,16,15,17,25,33,37,39,42,
  • 45,47,51,59,58,60,61,61,57,56,57,55,
  • 48,41,35,30,26,20,18,22,26,32,42,46,
  • 49,53,54,56,56,56,59,59,57,57,56,56,
  • 52,46,41,34,29,23,18,19,25,31,36,41,
  • 46,50,52,53,52,53,54,53,50,49,45,40

64
Modeling 911
  • Subject to these constraint we want to find a
    solution which makes the Pcth integer and which
    makes Z small.
  • The naïve approach is to compute the minimal
    linear solution and to round up all the values of
    Pcth to the nearest integer.
  • The linear relaxation yields Z 204.67 fuzzy
    workers but rounding yields a mediocre integral
    solution of 259 workers.

65
Modeling 911
  • For this and many other applications, heuristics
    can be used to develop good solutions
  • Progressive Roundoff - solve the linear
    relaxation, round up first variable and freeze
    it, re-solve etc.

66
Solving the Integer Problem
  • main() // Planner Code
  • IlcInitFloat()
  • IlcManager m(IlcNoEdit)
  • IlcLinOpt simplex(m)
  • IlcFloatVarArray Pct(m,168,0,1000)
  • IlcFloatArray coeffs(m,168)
  • int i,j,k,h,n

67
Solving the Integer Problem
  • // ? Pctw ? bh
  • w
  • for(h0hlt168h) // for each hour of 168 in
    week
  • for(j0jlt168j)
  • coeffsj 0
  • for(k0klt5k) // for each of 5 days
  • for(jk24jltk248j) // for each of 8
  • if (j!(k244)) // hours
  • coeffs(h168-j)168 1
  • simplex.add(IlcScalProd(coeffs,Pct) gt bh)

68
Solving the Integer Linear Problem
  • IlcFloatVar Z IlcSum(Pct)// Objective
  • simplex.setObjMin(Z)
  • for(i0ilt168i) //Progressive roundoff
  • n ceil(simplex.getCurrentValue(Pcti))
  • // Fix variable and re-optimize
  • simplex.add(Pcti n)
  • m.out() ltlt Number of Pcts needed is ltlt Z ltlt
    endl
  • m.end()

69
Solution
  • This code finds a solution with 208 workers in a
    couple of seconds. The optimum is 207.
  • The heuristic works well in part because if there
    were no lunch breaks, it would find the
    guaranteed optimal solution
  • Bartholdi,Ratliffe,Orlin

70
2. Constraint Programming
71
LP/MIP is Beautiful, except when
  • Variable domain information is important to the
    search strategy
  • especially critical in scheduling
  • The problem variables range over symbolic
    entities and there are lots of symmetries
  • timetabling
  • The MIP representation can be too verbose or
    awkward
  • configuration
  • There are just too many constraints
  • e.g. vehicle routing

72
Mathematical Basis of Constraint Programming (CP)
  • The Constraint Satisfaction Problem
  • Suppose a finite set of variables is given and
    with each variable is associated a non-empty
    finite domain.
  • A constraint on k variables X1,,Xk is a relation
    R(X1,,Xk) ? D1 x x Dk.
  • A constraint satisfaction problem (CSP) is given
    by a finite set of constraints.
  • A solution to a CSP is an assignment of values to
    all the variables so that the constraints are
    satisfied.

73
Domain Reduction
  • In CP, each constraint of a CSP is considered as
    a subproblem and techniques are developed for
    handling frequently encountered constraints.
  • With each constraint is associated a domain
    reduction algorithm which reduces the domains of
    the variables that occur in the constraint.
  • Accelerates convergence toward a solution
  • Detects infeasibility

74
Constraint Propagation
  • The other key issue is communication among the
    constraints or subproblems.
  • The basic method used is called constraint
    propagation which links the constraints through
    their shared variables.
  • The important thing about this setup is that it
    is very modular and independent of the particular
    structure of the individual constraints.

75
Monsieur Jordan Phenomenon
  • Like prose, you have been doing constraint
    propagation all your life.
  • Crossword puzzles
  • Incomplete and so backtracking is needed
  • NY Times Sunday Crossword
  • Optical Illusions
  • Origin Vision analysis (Marr,Waltz et al)

76
Strengths of Constraint Programming
  • Constraint Programming provides a rich Rich
  • Rich representation language.
  • CP variables naturally represent problem entities
    and the constraints do not have to be translated
    into a specific problem format such as MIP or
    SAT.
  • Opportunity to choose a good heuristic for the
    solution strategy.

77
Which Method for Which App?
LP
MIP
MIP
Constraint Based Scheduling
CP Local search
CP
Technology
Product Mix
Production Planning
Distribution Planning
Scheduling
Dispatching
Configuration
Application
Linear gt Disjunctive Constraints Strategic
gt Operational Optimization
78
3. Cooperating Solvers

79
First Stop
  • CP/CP

80
Mother of All Examples - N Queens
  • Do we think in terms of queens
  • Where do we place this queen ?
  • Do we think in terms of squares
  • Will this square contain a queen ?
  • These views are dual to one other

81
The Primal View
  • For each queen assign it a square

Place this queen in this square ?
82
The Dual View
  • For each square decide whether it will have a
    queen

Place a queen in this square ?
83
The Primal Model
  • In which row do we place qj - the queen in
    column j

The constraints qi ! qj qi - qj ! i -
j qi - qj ! j - i
Note no alldifferent constraint
84
Yet Another duality - rows vs columns
  • In which column do we place qqi the queen in
    row i

The constraints are the same qqi ! qqj qqi
- qqj ! i - j qqi - qqj ! j - i
85
The Relationship
  • Can link them as inverse functions
  • qqqi i
  • qqqj j

The constraint propagation i leaves domain of
qj iff j leaves domain of qqi
86
In this primal/dual model
  • Apply first-fail to qi
  • Lo and behold
  • one-third fewer fails
  • (Example from Jean Jordans thesis)

87
(No Transcript)
88
Q
X
X
X
X
X
Q
X
X
X
X
X
X
X
X
X
X
89
Q
X
X
X
X
X
Q
X
X
X
X
X
Q
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
90
Q
Q
X
X
X
X
X
X
X
X
X
X
X
Q
X
X
X
X
X
X
X
X
X
X
X
Q
X
X
X
X
X
X
X
91
So
  • The cooperating primal-dual formulation
    captured the generalized arc consistency of the
    alldifferent constraint

The arc consistency of this global constraint is
non-trivial to maintain Network flow
algorithms flow goes from values to
variables each variable has unit demand and
capacity
92
Remarks
  • An IP model will encode the first dual solution
  • Will this square contain a queen
  • xij 0 or xij 1
  • A disaster beyond 30 queens
  • network structure on rows and columns lost
  • Another example - sports scheduling

93
Constraints and Indices
  • In IP symbols are represented by indices as
    opposed to values for variables.
  • nurses, teams
  • Paris-St-Germain plays Manchester United on day k
  • xijk 0-1 to represent team i plays team j on
    day k
  • You cant put symmetry breaking and other
    constraints on indices.

94
Second Stop
  • CP/IP

95
CP Is Powerful, But .
  • Sometimes, inconsistencies can be overlooked
  • X - Y ? 12
  • X Y ? 10
  • X in 1..20
  • Y in 1..20
  • Domain reduction on each constraint and
    constraint propagation will not reduce the
    domains although the system has no linear
    solution
  • but an LP solver would spot this

96
2 Dimensional Bin Packing
  • Application for the Automobile Industry built by
    Greg Glockner

97
2 Dimensional Bin Packing
  • The problem here is to put as many small
    rectangles in a big rectangle with 90 degree
    rotation allowed.
  • The actual application involves circuit boards
  • There are two complete models, one a CP model and
    the other an IP model.
  • The CP model directs the search
  • The LP relaxation prunes the search space by
    detecting infeasible nodes

98
2-D Bin Packing
  • Arrange circuit boards onto raw material
  • Boards may be rotated
  • Use same number of each board
  • Objective minimize scrap

Classic combinatorial optimization problem
99
Solving 2-D Bin Packing
  • Use CP to generate partial solutions (nodes)
  • Restrict placement to reduce fragmentation of
    blank space
  • Use tight LP to test feasibility
  • If any partial solution is infeasible in the LP,
    prune the tree immediately

CP constraints reduce the tree width LP allows us
to prune quickly
100
2 Dimensional Bin Packing
  • As the search tree is traversed, the two models
    are in sync.
  • Note that the variables used in the 2 models are
    disjoint
  • The two models are dual to each other
  • The IP sees the model from the point of view of
    the board, the large rectangle
  • The CP sees the model from the point of view of
    the small rectangles
  • Solutions are obtained in minutes

101
2DBP Basic CP Formulation
  • Let (xi, yi) be the location and (wi, hi) be the
    dimensions of the ith tile
  • Basic constraints
  • Disjunctive constraints to prevent overlapping
    tilesxi wi xj Ú yi hi yj Úxj wj
    xi Ú yj hj yi
  • Constraints to count the number of each tile type
  • Tile-oriented formulation

102
2DBP Basic IP Formulation
  • Let xijnt 1 if tile n of type t is in position
    (i, j)
  • The constraints are

å
"

t
n
x
,
1
ijnt
,
j
i
å
"

j
i
x
,
1


nt
j
i



,
,
,
t
n
j
i





lt

lt


h
j
j
w
i
i
j
j
i
i
,
,
,
t
t
"
Î
t
n
j
i
x
,
,
,

1
,
0

ijnt
Grid-oriented formulation
103
2DBP LP Issues
  • The LP is large
  • The LP exhibits significant primal degeneracy
  • The LP exhibits significant dual degeneracy

104
2DBP LP Issues
  • The simplex algorithm cannot solve the LP
  • There is no way for a MIP solver to solve the IP
    as such
  • The barrier method can solve the LP

105
2DBP Summary
  • CP as master problem
  • Orders tiles
  • Places tiles by position, then type
  • Selects tile type by frequency to scatter tiles
    throughout the bin
  • Uses a one-ply lookahead constraint to limit the
    position of following tile
  • LP relaxation prunes the CP search space
  • Checks whether the partial solution will lead to
    an infeasible instance
  • Use idiomatic formulations for CP and IP

106
2DBP Remarks
  • The CP fixes significant numbers of variables at
    each node
  • The LP pre-processor greatly simplifies the LP
  • Therefore, the lack of incrementality of the
    barrier method does not cost us

107
2DBP Cooperative Algorithm Demo
108
Last Stop
  • Constraint Programming and Local Search
    cooperation
  • Another example of duality in action

109
CP/LS
  • Parallel machines with set-up times

Ready times Dues dates Splittable jobs Rogue
machines
Objectives meet due dates minimize setup
costs
110
Two Phase Cooperation
  • Phase I - the Primal (Work on first objective)
  • Configure and schedule the jobs
  • Use constraint based scheduling

111
Two Phase Cooperation
Machines morph into trucks
112
Two Phase Cooperation
  • Phase 2 - the Dual (Work on second objective)
  • Schedule the trucks
  • Use Lin, Lin-Kernighan, tabu etc

113
Parallel Machines Cooperative Algorithm Demo
114
IC Park Example
  • Hoist Scheduling (Rodosek and Wallace)
  • The original model is an IP
  • The CP model is the same
  • CP guides search, LP relaxation and CP share
    pruning duties
  • No apparent duality

115
Remarks
  • One can get great benefit with CP/IP algorithms
    CP/CP algorithms and CP/LS algorithms
  • IP/LS is just around the corner
  • IP/IP cooperation is hard because one cant
    formulate truly dual views
  • either simply not there
  • or too verbose
  • counterexamples welcome

116
4. DISJUNCTIVE PROGRAMMING

117
Disjunctive Linear Programming
  • An extension of Mixed Integer Programming
  • A union of polyhedral sets (feasible regions) is
    called a disjunctive set.

118
Disjunctive Set
119
Disjunctive Linear Programming
  • The problem of determining whether the
    intersection of a family of disjunctive sets is
    non-empty is called the disjunctive linear
    programming problem or simply disjunctive
    programming problem.
  • The solution set of the disjunctive programming
    problem is
  • Ç È Fij
  • iltM jltN

120
Disjunctive Linear Programming Examples
  • Semi-continuous variables
  • either X gt 100
  • or X 0
  • Rather than
  • X lt BigMY ,
  • X gt100Y,
  • Y a 0-1 variable

121
Solution Set Inside Initial Region
122
Disjunctive Linear Programming Examples -
continued
  • Bollapragada, Ghattas and Hooker
  • Truss structure design problem
  • Branches directly on alternatives dictated by
    Hookes Law
  • Wyatt
  • Disjunctive programming and mean absolute
    deviation models (MAD) for portfolio
    optimization
  • Extends Benders decomposition to disjunctive
    linear programs

123
Disjunctive Linear Programming continued
  • Balas, Cornuejols and Ceria
  • Generating cuts for disjunctive programming
    problems.
  • McAloon and Tretkoff
  • Basic mathematical results Optimization and
    Computational Logic, Wiley

124
Disjunctive Linear Programming continued
  • Dealing with the disjunctive part requires
    search.
  • This requires an engine which is not available in
    MIP packages
  • Also the linear relaxation is not as tight and
    the evaluation function is not as faithful
  • The solution is to use a CSP solver and an LP
    based solver in tandem - cooperating solvers
  • Beringer and DeBacker for MIP

125
To Keep It Simple
  • GERALD DONALD ROBERT

An AI classic Newell and Simon
Assignment problem 1 constraint
Surprisingly hard for MIP solvers CPLEX MIP takes
1 minute and 29048 nodes (on Sun Enterprise) to
find a feasible integer solution
126
The Disjunctive Program
  • One constraint for the equation
  • 100000 G D 100000 R T
  • For each variable X among G,,T
  • X 0 or X 1 or or X 9
  • For each pair X, Y
  • X ? Y-1 or Y ? X-1

127
Solution Set SOME of the Integer Points in the
Region
128
The Twin Variables for Cooperating Solvers
  • Integer variables for the letters
  • 0 ? g, e, r, a, l, d, o, n, b, t ? 9
  • With continuous doppelgangers
  • 0 ? G, E, R, A, L, D, O, N, B, T ? 9

129
The Variables
  • One multi-variable constraint on the continuous
    doppelgangers posted to an LP solver and to the
    CSP solver
  • 100000 G 10000 E 1000 R D
  • 100000 D 10000 O 1000 N D
  • 100000 R 10000 O 1000 B T

130
The Variables
  • One CSP constraint on the integer variables
    posted to a discrete constraint propagation
    engine
  • AllDifferent(g, e, r, a, l, d, r, n, b, t )

131
The Search
  • Bounding information from the discrete variables
    is passed to the continuous doppelgangers and
    conversely
  • The branching strategy is guided by the linear
    relaxation on the continuous variables
  • if there is a non-integral variable X, branch on
    it
  • X ? floor(X)
  • or
  • X ? ceil(X)

132
The Search
  • If the AllDifferent constraint, the initial
    bounding constraints and the bounding constraints
    from branching detect a contradiction on the
    discrete variables, both sides backtrack
  • If the linear relaxation is made infeasible by
    the bounding constraints that come from the
    discrete computation or from branching, both
    sides backtrack

133
The Search
  • New wrinkle
  • The solution to the linear relaxation might have
    all variables integral - but the AllDifferent
    constraint can be violated by this set of values
  • In this case, branch to keep them apart
  • either X ? Y - 1
  • or Y ? X - 1

134
The Variables
  • void main()
  • IlcInitFloat()
  • IlcManager m(IlcNoEdit)
  • IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9),
    A(m, 0, 9), L(m, 0, 9),
  • G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9),
    T(m, 0, 9)
  • IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E,
    R, B, T)
  • // Continued on next slide

135
The Constraints
  • m.add(IlcAllDiff(vars,IlcWhenValue))
  • IlcLinOpt simplex(m)
  • simplex.add(
  • 100000R 10000O 1000B 100E 10R T
  • 100000G 10000E 1000R 100A 10L D
  • 100000D 10000O 1000N 100A 10L D
    ,
  • IlcTrue // Post to Solver as well
  • )

136
The Search for solutions
  • m.add(Generate(m,simplex,vars)) // Search
    strategy
  • if (m.nextSolution()) // Find a solution
  • m.out() ltlt " solution found " ltlt endl
  • m.printInformation()
  • m.end()

137
Branch if a variable is non-integer
  • ILCGOAL2(Generate, IlcSimplex, simplex,
    IlcIntVarArray, vars)
  • IlcInt varIndex MostNotInteger(vars,
    simplex)
  • if (varIndex gt 0) // There is a non-integer
    variable
  • return IlcAnd(IlcTryUpwardFirst(varsvarIndex,
    simplex), this)

138
Is integer relaxation a solution ?
  • IlcManager m getManager()
  • if(m.solve(TestIntegerRelaxation(m,simplex)))
  • return 0

139
Find two variables with same value
  • IlcInt j
  • for(i0iltvars.getSize()-1i)
  • if (varsi.isBound()) continue // Cant both
    be bound
  • IlcInt n simplex.nearest(simplex.getCurrentVal
    ue(varsi))
  • for(ji1jltvars.getSize()j)
  • IlcInt m simplex.nearest(simplex.ge
    tCurrentValue(varsj))
  • if (m n) break
  • if (jlt vars.getSize()) break

140
Branch to push them apart
  • // j and i are the indices of two variables
    with same current value
  • return
  • IlcAnd( IlcOr(
  • Smaller(m,varsi,varsj,simplex),
  • Smaller(m,varsj,varsi,simplex)),
  • this // Recursion
  • )

141
Pushing two variables apart
  • ILCGOAL3(Smaller,IlcIntVar,x,IlcIntVar,y,IlcSimple
    x,simplex)
  • simplex.add(x lt y-1,IlcTrue)
  • return 0

142
Testing the integer relaxation
  • ILCGOAL1(TestIntegerRelaxation, IlcSimplex,
    simplex)
  • simplex.trySolution()
  • return 0

143
Results
  • ILOG Solver/Planner finds a solution in 6 nodes
    (.29 seconds on laptop)
  • Straightforward ILOG Solver finds a solution in
    8024 nodes (1.8 seconds on a laptop)
  • Again, CPLEX MIP takes 1 minute and 29048 nodes
    (on Sun Enterprise) to find a feasible integer
    solution

144
Example The Dutch Trains
  • Scheduling intercity trains
  • Amsterdam,Rotterdam,Roosendaal,Vlissengen

Without coupling constraints, multi-commodity
integer flow problem With coupling constraints, a
DLP with an integer relaxation
Additional logic handled directly in 2LP with
CPLEX Disjunctive Programming and Cooperating
Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)
145
Conclusions
  • CP and MIP are powerful techniques that can solve
    many combinatorial problems
  • Each has preferred formulations
  • Can get even greater benefits when combining CP
    and IP algorithms

146
Recent and Current Work
  • Beaumont
  • Beringer, DeBacker
  • Balas, Ceria, Cornuejols.
  • Wallace, Rodosek, Schrimpf
  • Heipke, Colombani
  • Bockmayr
  • McAloon, Tretkoff, Wetzel

147
III. Exploiting Randomization to Solve Hard
Combinatorial Problems

148
Background
  • Combinatorial search methods often exhibit
  • a remarkable variability in performance. It is
    common to observe significant differences
    between
  • - different heuristics
  • - same heuristic on different instances
  • - different runs of same heuristic with
    different seeds (stochastic methods)

149
Main Claim
  • One can take advantage of the extreme
    variability of combinatorial search methods
  • One can improve the performance of a
    deterministic complete method, by introducing a
    stochastic element, while maintaining
    completeness.
  • Well explain WHY that is the case.

150
  • A Structured Benchmark Domain for Studying
    the Distributions of Search Methods
  • Stochasticity in Search Procedures
  • Intriguing Properties of Complete
    Backtrack Style Algorithms
  • Consequences for Algorithm Design - Rapid
    Randomized Restarts
  • Portfolio of Algorithms

151
Structured Benchmark Domain
152
Background
  • Study of local and systematic search methods has
    been driven by
  • Random instance distributions (Hogg et al. 96).
    Limitation lack of structure that characterizes
    realistic problems
  • Highly structured problems (Fujita at al. 93).
    Limitation too much structure.
  • We propose a benchmark domain that bridges the
    gap between purely random instances and highly
    structured problems.

153
Quasigroups
Defn. a pair (Q, ) where Q is a set, and is a
binary operation on Q such that
a x b y a b are uniquely
solvable for every pair of elements a,b in
Q. The multiplication table of its binary
operation defines a latin square (i.e., each
element of Q appears exactly once in each
row/column). Example
Quasigroup of order 4
154
Quasigroup Completion Problem (QCP)
Given a partial latin square, can it be
completed? Example
155
Quasigroup Completion Problem A Framework for
Studying Search
  • NP-Complete (Colbourn 1983, 1984 Anderson 1985).
  • Has a structure not found in random instances.
  • Leads to interesting search problems when
    structure is perturbed.
  • The study of this problem led us to identify
  • the unusual distributions of combinatorial search
    (Gomes, Selman Crato --- CP97)

156
Aside Applications of Quasigroups
  • Design of statistical experiments
  • eliminating data dependencies
  • Scheduling/Timetabling (Anderson 1992)
  • completing a schedule given a set of pre-defined
    events
  • Automated theorem proving (Fujita et al. 1993)
  • existence vs. non-existence of quasigroups with
    intricate mathematical properties

157
Example Scheduling of Drug Experiment
  • Given 5 different drugs, test the effects of the
  • different medications on 5 different subjects
    over
  • different days of the week.
  • Use constraint
  • No two people get same brand on the same day
  • (eliminate bias for day of the week).

158
Quasigroup Completion
DAY
Mon. Tues. Wed.
Thurs. Fri.
Tylenol Aleve Bayer Exhedrin
Advil Aleve Bayer Exhedrin
Advil Tylenol Bayer Exhedrin
Advil Tylenol Aleve Exhedrin
Advil Tylenol Aleve
Bayer Advil Tylenol Aleve
Bayer Exhedrin
Tim Sue Frank Teresa Todd
SUBJECT
() Pre-assigned
159
  • QCP has a natural formulation as a Constraint
  • Satisfaction Problem
  • variable for each NxN entry
  • constraints capture row/column requirement
  • variable assignments capture pre-assigned values

160
  • How does the difficulty of
  • QCP vary with the fraction
  • of pre-assignment?

161
Median number of backtracks (log)
Fraction of pre-assignment
162
  • Complexity Graph shows (up to order 20)
  • curve peaks around 42 of pre-assignment ---
  • critically constrained area.
  • under-constrained and over-constrained areas are
    easier.

163
  • Directly related to the peak in
  • computational difficulty is the so-
  • called phase transition graph for
  • the QCP problem.

164
Fraction of Unsolved cases
Fraction of pre-assignment
165
Phase Transition
  • QCP Phase Transition --- threshold phenomenon
    from almost all solvable to almost all unsolvable
    --- occurs around 42 of preassignment.
  • Its called a phase transition because of the
    close
  • relation to state transition phenomena studied in
  • physics, such as the melting of a solid into a
  • liquid.

166
Exploiting Structure
167
Exploiting Structure in QCP
Arc Consistency on binary constraints
Forward Checking
168
Further Exploiting Structure in QCP
General Arc Consistency on all different
constraints
Arc Consistency on Binary Constraints
169
Enforcing General Arc Consistency on All
Different Constraints
  • Beautiful example of integration of AI/OR
    techniques for a well defined sub-problem
  • Propagation uses Maximum Matching problem
    (particular case of Network Flow problems which
    have polynomial time complexity)

170
Further Exploiting Structure in QCP
  • By enforcing general arc consistency on all
    different constraints problems up to order 50
    could be solved!

171
Stochasticity in Search Procedures
172
Background
  • Stochastic strategies have been very successful
    in the area of local search.
  • Limitation inherent incomplete nature of local
    search methods.
  • We want to explore the addition of a stochastic
    element to a systematic search procedure without
    losing completeness.

173
  • We introduce stochasticity in a
  • backtrack search method by randomly
  • breaking ties in variable and/or value
  • selection.
  • Compare with standard lexicographic
  • tie-breaking.

174
Randomized Strategies

175
(No Transcript)
176
(No Transcript)
177
(No Transcript)
178
  • Lesson
  • Randomized tie-breaking can
  • improve performance over a purely
  • deterministic strategy.
  • Next
  • But we can obtain a more dramatic
  • advantage from randomization ...

179
Cost Distributions
  • Key Properties
  • I Erratic behavior of mean.
  • II Distributions have heavy tails.

180
3500!
sample mean
Median 1!
number of runs
181
1
182
75lt30
5gt100000
Proportion of cases Solved
183
Heavy-Tailed Distributions
  • infinite variance infinite mean
  • Introduced by Pareto in the 1920s
  • --- probabilistic curiosity.
  • Mandelbrot established the use of heavy-tailed
    distributions to model real-world fractal
    phenomena.
  • Examples stock-market, earth-quakes, weather,...

184
Decay of Distributions
  • Standard --- Exponential Decay
  • e.g. Normal
  • Heavy-Tailed --- Power Law Decay
  • e.g. Pareto-Levy

185
(No Transcript)
186
Normal, Cauchy, and Levy
187
Tail Probabilities (Standard Normal, Cauchy,
Levy)

188
How to Check for Heavy Tails?
  • Log-Log plot of tail of distribution
  • should be approximately linear.
  • Slope gives value of
  • infinite mean and
    infinite variance
  • infinite variance

189
Example of Heavy Tailed Model(Random Walk)
  • Random Walk
  • Start at position 0
  • Toss a fair coin
  • with each head take a step up (1)
  • with each tail take a step down (-1)

X --- number of steps the random walk takes
to return to position 0.
190
(No Transcript)
191
Heavy-tails vs. Non-Heavy-Tails
Normal (2,1000000)
1-F(x) Unsolved fraction
O,1gt200000
Normal (2,1)
X - number of steps the walk takes to return to
zero (log scale)
192
Heavy-tails in QCP Domain
1-F(x) Unsolved fraction
Number backtracks (log)
193
  • The Log-Log plot shows a linear relation
  • over many orders of magnitude. This is
  • clear evidence of heavy-tailed behavior.

194
(No Transcript)
195

196
Heavy Tailed Cost Distribution
197
  • The Log-Log plot shows a linear relation
  • over many orders of magnitude. This is
  • clear evidence of heavy-tailed behavior.

198
  • By studying larger problems we discovered that
    not only does the heavy tail phenomenon occur at
    the right-hand side of the distribution, but we
    also observed a high frequency of data points on
    the left-hand side of the distribution.
  • Right-hand side non-negligible fraction of very
    long runs
  • Left-hand side non-negligible fraction of
    very short runs

199
Sports Scheduling
70gt 250000
Cumulative Distribution Function
15!
Number backtracks (log)
200
Power Law Decay
Exponential Decay
Standard Distribution (finite mean variance)
201
  • Consequence for algorithm design
  • Use rapid restarts or parallel / inter-leaved
    runs

Super linear speedups!!!
202
Super-linear Speedups
Interleaved (1 machine) 10 x 1 10 seconds
5 x speedup
203
  • Rapid Restarts work particularly well on hard
    computational problems because of the Heavy
    Tailed Phenomena in the run time distribution.
  • RAPID RANDOMIZED RESTARTS strategy avoids the
    tail on the right and exploits the short runs on
    the left.
  • Restarts provably eliminate heavy tails (Gomes,
    Selman Crato )

204
Sketch of proof of elimination of heavy tails
  • Lets truncate the search procedure
  • after m backtracks.
  • Probability of solving problem with truncated
    version
  • Run the truncated procedure and restart it
    repeatedly.

205

Y - does not have Heavy Tails
206
Restarts
70 unsolved
1-F(x) Unsolved fraction
Number backtracks (log)
207
Example of Rapid Restart Speedup(planning)
Number backtracks (log)
Cutoff (log)
208
Summary Results
Deterministic

() not found after 2 days
209
  • Our results provide the first indication of
    heavy-tailed distri-butions in a computational
    model.
  • Overall insight Randomized tie-breaking with
    rapid restarts gives powerful search strategy.

210
Heavy-Tailed Distributionsin Other Domains
  • Quasigroup Completion Problem
  • Graph Coloring
  • Logistic Planning
  • Circuit Synthesis

211
Summary Results
Deterministic

() not found after 2 days
212
Rapid Restart Speedup
213
  • Our results provide the first indication of
    heavy-tailed distri-butions in a computational
    model.
  • Overall insight
  • Randomized tie-breaking with
  • rapid restarts gives powerful
  • search strategy.

214
Heavy-Tailed Distributionsin Other Domains
  • Quasigroup Completion Problem
  • Graph Coloring
  • Logistic Planning
  • Circuit Synthesis

215
Algorithm Portfolio Design

216
Motivation
  • The runtime and performance of randomized
    algorithms can vary dramatically on the same
    instance and on different instances.
  • Goal Improve the performance of different
    algorithms by combining them into a portfolio to
    exploit their relative strengths.

217
Branch BoundBest Bound vs. Depth First
Search
218
Branch Bound(Randomized)
  • Standard OR approach for solving Mixed Integer
    Programs (MIPs)
  • Solve linear relaxation of MIP
  • Branch on the integer variables for which the
    solution of the LP relaxation is non-integer
  • apply a good heuristic (e.g., max infeasibility)
    for variable selection ( randomization ) and
    create two new nodes (floor and ceiling of the
    fractional value)
  • Once we have found an integer solution, its
    objective value can be used to prune other nodes,
    whose relaxations have worse values

219
Branch BoundDepth First vs. Best bound
  • Critical in performance of Branch Bound
  • the way in which the next node to be expanded
    is selected.
  • Best-bound - select the node with the
    best LP bound
  • (standard OR approach) ---gt
  • this case is equivalent to A, the LP
    relaxation provides an admissible search
    heuristic
  • Depth-first - often quickly reaches an integer
    solution
  • (may take longer to produce an overall optimal
    value)

220
Portfolio of Algorithms
  • A portfolio of algorithm is a collection of
    algorithms and / or copies of the same
    algorithm running interleaved or on different
    processors.
  • Goal to improve on the performance of the
    component algorithms in terms of
  • expected computational cost
  • risk (variance)
  • Efficient Set or Efficient Frontier set of
    portfolios that are best in terms of expected
    value and risk.

221
Depth-first vs. Best-bound(logistics planning)
Cumulative Frequencies
Number of nodes
222
  • Depth-First and Best and Bound do not dominate
    each other overall.

223
Heavy-tailed behavior of Depth-first
224
Portfolio for heavy-tailed search procedures (2
processors)
2 DF / 0 BB
Expected run time of portfolios
0 DF / 2 BB
Standard deviation of run time of portfolios
225
Portfolio for heavy-tailed search procedures (6
processors)
0 DF / 6 BB
Expected run time of portfolios
6 DF / 0BB
Standard deviation of run time of portfolios
226
Portfolio for heavy-tailed search procedures (20
processors)
0 DF / 20 BB
Expected run time of portfolios
20 DF / 0 BB
Standard deviation of run time of portfolios
227
Portfolio for heavy-tailed search procedures
(2-20 processors)
228
  • A portfolio approach can lead to substantial
    improvements in the expected cost and risk of
    stochastic algorithms, especially in the presence
    of heavy-tailed phenomena.

229
Summary of Randomization
  • Considered randomized backtrack search.
  • Showed Heavy-Tailed Distributions.
  • Suggests Rapid Restart Strategy.
  • --- cuts very long runs
  • --- exploits ultra-short runs
Write a Comment
User Comments (0)
About PowerShow.com