Title: Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems. Carla P. Gomes, Cornell University, gomes@cs.cornell.edu. Ken McAloon and Carol Tretkoff, ILOG, {mcaloon,tretkoff}@ilog.com
1Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems
Carla P. Gomes, Cornell University, gomes@cs.cornell.edu
Ken McAloon and Carol Tretkoff, ILOG, {mcaloon,tretkoff}@ilog.com
2AI, OR, and CS
AI
OR
CS
3Integration of Artificial Intelligence and Operations Research Techniques
AI
- Representations: Constraint Languages, Logic Formalisms, Object-Oriented Prog., Bayesian Nets, Rule Based Systems
- Tools: Constraint Propagation, Systematic Search, Stochastic Search
- Pros / Cons: Rich Representations, Computational Complexity
OR
- Representations: Mathematical Modeling Languages, Linear and Non-linear (In)Equalities
- Tools: Linear Programming, Mixed-Integer Prog., Non-linear Models
- Pros / Cons: More Tractable (LP), Primarily Complete Info, Limited Representations
Combinatorial Problems
- Planning
- Scheduling
THE CHALLENGE: UNIFY AI AND OR APPROACHES TO
- SCALE UP SOLUTIONS
- HANDLE UNCERTAINTY
- ANALYZE COMPLEXITY (phase transition)
- EXPLOIT PROBLEM STRUCTURE
- INCREASE ROBUSTNESS
4Outline
- I. Short Overview of OR
- II. Disjunctive Programming and Hybrid Solvers
- III. Exploiting Randomization to Solve Hard Combinatorial Problems
- IV. Conclusions
5I. Short OR Overview
6Outline for Linear Programming and Integer Programming
- Standard Form of LP and a Simple Example
- Geometric Interpretation of LP
- Complexity issues
- MIP
- Example: Fast Food
- Example: Capacitated Warehouse
- Example: 911
7Outline
- 1. Short Overview of OR
- 2. Constraint Programming
- 3. Cooperating Solvers
- 4. Disjunctive Programming
- 5. Exploiting Randomization to Solve Hard Combinatorial Problems
- 6. Conclusions
8Optimization Technology Evolution
[Timeline figure spanning 1947 to 1998, with ticks at 1947, 1960, 1970, 1980, 1990 and 1998. Milestones shown, in extraction order rather than by date: Shifting Bottleneck; Primal Simplex LP; Interior Point; Dispatch Rules; CPM, PERT; First CP Systems; Dual Simplex Implementation; Constraint Propagation; SA, GA, Tabu; Constraint-based Scheduling; Dual Simplex; Barrier LP; Barrier Crossover; Cooperating Solvers (LP/CP); Global constraints; MIP; Concurrent Scheduling; Parallel LP/MIP; Large IPs.]
91. Short OR Overview
11An LP Story
- A factory can produce n products from m parts
- For product j it needs $a_{ij}$ units of part i
- There are $b_i$ units of part i available
- Each unit of product j sold earns $c_j$
- Amount of each product to make is unknown: $x_j \ge 0$
- Each part i determines a constraint
- $a_{i1} x_1 + \dots + a_{in} x_n \le b_i$
- Obvious solution: do nothing
- Better: maximize $c_1 x_1 + \dots + c_n x_n$
12Standard Forms of LP
- A linear program (LP) in standard form (Dantzig 1947)
- max $c^T x$
- subject to $Ax \le b$
- $x \ge 0$
- Input data: c (n x 1), A (m x n), b (m x 1)
- Variables: x (n x 1)
13Standard Forms of LP
- // The objective function
- max $c_1 x_1 + \dots + c_n x_n$
- // The constraints
- subject to
- $a_{11} x_1 + \dots + a_{1n} x_n \le b_1$
- ...
- $a_{m1} x_1 + \dots + a_{mn} x_n \le b_m$
- $x_1 \ge 0, \dots, x_n \ge 0$
14Standard Forms of LP
- In OR emphasis is on optimality
- Solution means optimal solution
- Feasible solution means solution in the ordinary
sense
15Standard Forms of LP
- Interpretation of standard form
- $x_j$ = amount of product j to make
- $c_j$ = revenue per unit of product j
- $b_i$ = available amount of component i
- $a_{ij}$ = units of i used per unit of j produced
- The constraints say
- $\sum_j a_{ij} x_j$ = sum over j of the units of i used by j = units of i used $\le b_i$
16What are models?
- A model is a data-independent abstraction of a problem
- A model lets you write down the mathematical representation of a problem independently of the data
One Problem Instance
17Products Could be Jewelry
- Products: Rings and Earrings
- Components: Gold and Diamonds
- One ring requires 3 units of Gold and 1 Diamond
- One set of earrings requires 2 units of Gold and 2 Diamonds
- Total Gold and Diamonds are limited
- Profit is different for Rings than for Earrings
Products = { rings, earrings }
Components = { Gold, Diamonds }
demand = [ [3, 1], [2, 2] ]
stock = [150, 180]
profit = [60, 40]
18Products Could be Chemicals
- Products: Ammonia Gas (NH3) and Ammonium Chloride (NH4Cl)
- Components: Nitrogen, Hydrogen, Chlorine
- One unit of Gas requires 1 unit of Nitrogen and 3 units of Hydrogen
- One unit of Chloride requires 1 unit of Nitrogen, 4 units of Hydrogen and 1 unit of Chlorine
- Total Nitrogen, Hydrogen and Chlorine is limited
- Profit is different for Gas than for Chloride
Products = { gas, chloride }
Components = { nitrogen, hydrogen, chlorine }
demand = [ [1, 3, 0], [1, 4, 1] ]
stock = [50, 180, 40]
profit = [30, 40]
19The Problems Have One Model
    enum Products ...;
    enum Components ...;
    float demand[Products, Components] = ...;
    float profit[Products] = ...;
    float stock[Components] = ...;
    var float production[Products];
    maximize
        sum (p in Products) profit[p] * production[p]
    subject to {
        forall (c in Components)
            sum (p in Products) demand[p, c] * production[p] <= stock[c];
    };
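To see one model serve two data sets, here is a minimal Python sketch rather than the OPL model itself. It assumes exactly two products and nonnegative production, and it swaps in plain vertex enumeration for a real LP solver, which is only reasonable at this toy size. The function name `solve_product_mix` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve_product_mix(demand, stock, profit):
    """Maximize profit for a two-product mix by enumerating polygon vertices.

    demand[p][c] is the amount of component c used by one unit of product p.
    Assumes production is nonnegative and the feasible region is bounded.
    Returns (best_profit, best_plan) using exact rational arithmetic.
    """
    assert len(profit) == 2, "this sketch handles exactly two products"
    # Each row (a0, a1, rhs) stands for a0*x0 + a1*x1 <= rhs.
    rows = [(demand[0][c], demand[1][c], stock[c]) for c in range(len(stock))]
    rows += [(-1, 0, 0), (0, -1, 0)]  # x0 >= 0 and x1 >= 0

    def feasible(x):
        return all(a0 * x[0] + a1 * x[1] <= rhs for a0, a1, rhs in rows)

    best = None
    for (a0, a1, r), (b0, b1, s) in combinations(rows, 2):
        det = a0 * b1 - a1 * b0
        if det == 0:
            continue  # parallel boundary lines do not define a vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x = (Fraction(r * b1 - a1 * s, det), Fraction(a0 * s - r * b0, det))
        if feasible(x):
            value = profit[0] * x[0] + profit[1] * x[1]
            if best is None or value > best[0]:
                best = (value, x)
    return best


# The same function, two data sets (numbers taken from the two instance slides).
jewelry = solve_product_mix([[3, 1], [2, 2]], [150, 180], [60, 40])
chemicals = solve_product_mix([[1, 3, 0], [1, 4, 1]], [50, 180, 40], [30, 40])
```

In both instances the profit vector happens to be parallel to one resource constraint, so a whole edge of the feasible region is optimal and the sketch reports just one of its endpoints.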
20OR Modeling Systems
-
- OPL
- AMPL
- 2LP
- AIMMS
- GAMS
- MPL
- ILOG Planner
- etc
-
21The Dual
- The dual linear program (von Neumann 1947)
- min $y^T b$
- subject to $y^T A \ge c^T$
- $y \ge 0$
- Variables: y (m x 1)
- Awesome Symmetry
- The dual of the dual is the primal
22Rows and Columns Exchanged
- min $b_1 y_1 + \dots + b_m y_m$
- subject to
- $a_{11} y_1 + \dots + a_{m1} y_m \ge c_1$
- ...
- $a_{1n} y_1 + \dots + a_{mn} y_m \ge c_n$
- $y_1 \ge 0, \dots, y_m \ge 0$
23Duality Theorem
- Theorem: min $y^T b$ = max $c^T x$
- Consequence: This turns the optimality problem into a feasibility problem in x and y
- $Ax \le b$
- $x \ge 0$
- $y^T A \ge c^T$
- $y \ge 0$
- $y^T b = c^T x$
- Consequence: Enumeration is not needed to verify optimality
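The feasibility system of the duality theorem can be checked directly. The sketch below is an illustration, not part of the original deck: it tests whether a primal point x and a dual point y certify each other's optimality. The function name `certifies_optimality` is hypothetical, and the data are the jewelry numbers (rows of A are Gold and Diamonds, columns are rings and earrings).

```python
def certifies_optimality(A, b, c, x, y):
    """Return True if (x, y) satisfies Ax <= b, x >= 0, yA >= c, y >= 0, yb = cx."""
    m, n = len(A), len(A[0])
    primal_ok = all(xj >= 0 for xj in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m))
    dual_ok = all(yi >= 0 for yi in y) and all(
        sum(y[i] * A[i][j] for i in range(m)) >= c[j] for j in range(n))
    same_value = (sum(y[i] * b[i] for i in range(m))
                  == sum(c[j] * x[j] for j in range(n)))
    return primal_ok and dual_ok and same_value


A = [[3, 2],   # Gold used by one ring, one set of earrings
     [1, 2]]   # Diamonds used by one ring, one set of earrings
b = [150, 180]
c = [60, 40]

# 50 rings and no earrings, with a price of 20 per unit of Gold and 0 per Diamond.
certified = certifies_optimality(A, b, c, x=[50, 0], y=[20, 0])
# Producing nothing is feasible but not optimal, so the same y cannot certify it.
not_certified = certifies_optimality(A, b, c, x=[0, 0], y=[20, 0])
```

No search over other production plans is needed: one matching dual point is enough to prove that the plan is optimal.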
24Duality Theorem
- Sensitivity Analysis
- Consequence: The solution values for the y variables yield the Lagrange multipliers of the primal constraints, which measure the rate of change of the objective function with respect to the right-hand-side bounds b
- $y_i = \partial Z / \partial b_i$, where Z is the optimum
- Reference: McAloon and Tretkoff, 1996, Wiley
25Duality
- Two different views of the same phenomenon
Point vs Set
Arc vs Node
Momentum vs Position
Vector vs Hyperplane
Landlord vs Renter
26Simplex and Barrier
- The simplex algorithm turns the feasibility problem into an iterative repair process with a powerful evaluation function
- The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope
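27Geometric Interpretation of LP
Max X subject to
- $-X + Y \le 4$
- $X + 4Y \le 36$
- $2X + Y \le 23$
- $X + Y \ge 4$
- $Y \ge X$ ? 10
[Figure: feasible region in the X-Y plane with labeled points (0,4), (4,8), (10,3), (4,0) and (8,0); one path is labeled Simplex and another Barrier.]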
28Complexity of Linear Programming
- Simplex Method
- Worst-case --- exponential (Klee and Minty 72)
- Practice --- good performance
- Ellipsoid Method
- Khachiyan's Ellipsoid Method
- Worst-case --- polynomial
- Practice --- poor performance
29Complexity of Linear Programming
- Interior Point Methods or Barrier Methods
- Karmarkar's Method (and variants)
- Worst-case --- polynomial
- Practice --- good performance
30Complexity of Linear Programming
- Despite its worst-case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice.
- Which method performs best is problem dependent.
31Success Stories
- Industrial Planning
- Given current resources, decide what to produce in what quantity
- Supply Chain Management
- Multiperiod planning models that link flow from one period to the next
- Network Flow
- How best to route goods across a network
32Assumptions of Linear Programming
- Linearity
- when violated ($xy = 50$)
- Nonlinear programming
- Continuity
- when violated (x integral)
- (Mixed) Integer programming
33Assumptions of Linear Programming - continued
- No Disjunctive Constraints
- when violated ($x \ge 100$ or $x \le 0$)
- Disjunctive programming
- Additional 0-1 variables and Big M constraints
- Certainty
- when violated (cost c is a random variable)
- Stochastic programming
34Search and MIP
- In order to deal with variables that must have integer values in the solution, a search must be performed.
- Mixed Integer Programming problems are combinatorial optimization problems and are NP-hard
- feasibility is NP-Complete
- verifying optimality is co-NP-Complete
35MIP and Combinatorial Optimization
- These problems have been attacked by both the AI and OR communities.
- In AI, these problems are attacked as CSPs or as Planning Problems.
- In OR, they are done as MIPs and use linear relaxation to help guide the search.
- The overriding idea in each case is to limit search.
36Integer Program All Integer Points in Region
37Cut to Create Integer Vertex
Integer Vertex
38Example - Fast Food
- Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet?
- If so, what is the least he can do it for?
39Nutritional Requirements
- At least 100% of the RDA of vitamins A, C, B1, B2, niacin, calcium and iron
- At least 55 grams of protein
- At most 3000 milligrams of sodium
- At most 30% of the calories can come from fat
- Nutritional information is available from fast food outlets
40College Students Requirements
- At least 2000 calories a day
- No more than 3 servings of any one food
- Milk only with cereal and not as a stand-alone
drink
41Fast Food - MIP Model
- We will have variables $Serv_k$ to represent the number of servings of item k in the plan.
- The variable $Serv_k$ will have to take an integer value for the solution to be valid.
- The objective function Z stands for cost
42Fast Food - MIP Model
- Let $food_{k,j}$ represent the percent of RDA of nutrient j in a serving of item k
- Then for each nutrient j, we have a constraint
- $\sum_k food_{k,j} \, Serv_k \ge 100$
43Fast Food - MIP Model
- Let $sodium_k$ represent the amount of sodium in a serving of item k
- For sodium we have the constraint
- $\sum_k sodium_k \, Serv_k \le 3000$
- Similarly for fat
44Fast Food - MIP Model
- Let $cost_k$ represent the cost of a serving of item k
- For the objective function we have the defining constraint
- $\sum_k cost_k \, Serv_k = Z$
45Fast Food - Solution
- With a MIP solver and a way to input these constraints we ask for
- a solution that makes the variables $Serv_k$ integral
- and which minimizes Z
46MIP Solution Technique
- What the MIP solver does is to carry out a branch and bound search guided by
- the linear relaxation
- the solution to the problem with the integrality requirements relaxed
- Initialize the global variable best_so_far to 1000 (or something else very big).
47At a Node
- Compute a solution to the linear relaxation which minimizes Z, yielding z. Prune this node if
- $z \ge$ best_so_far
- If all values of $Serv_k$ are integral, this is a solution. Set best_so_far = z. Save this node.
48Branching at a node
- Choose a variable $Serv_k$ whose value s is not integral.
- Typical heuristic: most non-integral variable
- Create two child nodes,
- add $Serv_k \le$ floor(s)
- add $Serv_k \ge$ ceil(s)
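The node logic above can be sketched in a few lines of Python. To stay self-contained, this sketch swaps the diet model for a 0-1 knapsack, whose linear relaxation can be solved greedily without an LP library. It is also a maximization, so the pruning test is the mirror image of the one on the slide (prune when the relaxation bound is no better than the incumbent). The function names are hypothetical and positive integer weights are assumed.

```python
from fractions import Fraction


def lp_relaxation(values, weights, capacity, fixed):
    """Greedy solution of the knapsack LP relaxation with some variables fixed.

    Returns None if the fixed choices are infeasible, otherwise
    (bound, index_of_fractional_variable_or_None).
    """
    cap = capacity - sum(weights[i] for i, v in fixed.items() if v == 1)
    if cap < 0:
        return None
    bound = Fraction(sum(values[i] for i, v in fixed.items() if v == 1))
    free = [i for i in range(len(values)) if i not in fixed]
    free.sort(key=lambda i: Fraction(values[i], weights[i]), reverse=True)
    for i in free:
        if weights[i] <= cap:          # item fits entirely
            cap -= weights[i]
            bound += values[i]
        else:                          # item fits only partially (or not at all)
            bound += Fraction(values[i] * cap, weights[i])
            return bound, (i if cap > 0 else None)
    return bound, None


def branch_and_bound(values, weights, capacity):
    """Return (optimal value, number of nodes visited)."""
    best_so_far = 0                    # maximization, so start from a low value
    nodes = 0
    stack = [{}]                       # each node is a dict of fixed variables
    while stack:
        fixed = stack.pop()
        nodes += 1
        relaxed = lp_relaxation(values, weights, capacity, fixed)
        if relaxed is None:
            continue                   # infeasible node
        bound, fractional = relaxed
        if bound <= best_so_far:
            continue                   # prune: relaxation cannot beat incumbent
        if fractional is None:
            best_so_far = bound        # relaxation is integral: new incumbent
            continue
        stack.append({**fixed, fractional: 0})   # child with x_i <= 0
        stack.append({**fixed, fractional: 1})   # child with x_i >= 1
    return int(best_so_far), nodes


best, visited = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
```

For 0-1 variables the two children "x <= floor(s)" and "x >= ceil(s)" reduce to fixing the fractional variable at 0 or at 1.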
49Good News
- The linear relaxation can prune nodes before all variables $Serv_k$ are forced to be integral.
- Surprisingly often a node high in the tree will turn up with all relevant variables integer. Here's why
- A solution to the LP is at a vertex
- A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints
- Many of these constraints are integer bounding constraints, yielding X = integer
50Arboreally Speaking
- Best-first search is often preferred - it visits the smallest number of nodes needed to find and verify the optimal solution - analogous to A*
- If the linear relaxation is tight
- $|z_{linear} - z_{integral}|$ is relatively small
- then $z_{linear}$ is an excellent evaluation function
51Answer - Fast Food
- Total cost is 8.71
- Buy 3 burgers
- Buy 2 fries
- Buy 3 honeys
- Buy 1 yogurt
- ...
52Example - Fixed Cost
- Warehouses must be rented in order to supply stores and we must decide which to use
- For each store j we know its monthly demand $d_j$
- For each warehouse i we know its capacity $k_i$
- For each warehouse i we know the fixed cost $fc_i$ to run it each month
- For each pair i, j we know the monthly cost $c_{ij}$ of supplying j from i
53Example - Fixed Cost
- $X_{ij}$ is the fraction of store j's demand met by i
- $X_{ij} \le 1$
- $Y_i$ is a fuzzy boolean
- it will be 1 if the warehouse is rented
- 0 if it is not rented
- $Y_i \le 1$
54Example - Fixed Cost
- Each store must be supplied
- $\sum_i X_{ij} = 1$
- Warehouse capacity can not be exceeded
- $\sum_j d_j X_{ij} \le k_i$
- Tighter
- $\sum_j d_j X_{ij} \le k_i Y_i$
55Example - Fixed Cost
- Objective function
- $\sum_i fc_i Y_i + \sum_i \sum_j c_{ij} X_{ij}$
- This yields a MIP with 0-1 variables $Y_i$
56Branch and Cut: An Enhanced Solution Method
- Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation
- $X_{ij} \le Y_i$
- Add at a node if violated by solution to linear relaxation
- Powerful method - will solve the Imperial College OR lib CW problems very easily
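A small numeric sketch shows why the cut is redundant for the MIP but not for the relaxation. For a single warehouse, it computes the smallest value of the rental variable Y that the relaxation allows, with and without the cuts. The function name and the numbers are made up for illustration.

```python
from fractions import Fraction


def min_open_fraction(demands, capacity, x, use_cuts):
    """Smallest Y in [0, 1] the relaxation allows for one warehouse.

    x[j] is the fraction of store j's demand served by this warehouse.
    Returns None when even Y = 1 does not satisfy the constraints.
    """
    load = sum(d * xj for d, xj in zip(demands, x))
    lower = Fraction(load) / capacity                      # sum_j d_j X_j <= k Y
    if use_cuts:
        lower = max([lower] + [Fraction(xj) for xj in x])  # X_j <= Y for every j
    return lower if lower <= 1 else None


# One warehouse of capacity 100 fully serving a store with demand 10.
without_cut = min_open_fraction([10, 20], 100, [1, 0], use_cuts=False)
with_cut = min_open_fraction([10, 20], 100, [1, 0], use_cuts=True)
```

Without the cut the relaxation rents only a tenth of the warehouse and so pays a tenth of its fixed cost. With the cut it must pay the full fixed cost, which is what any integer solution pays anyway.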
57Example - Call 911
- PCTs answer the phone 24 hours a day, 7 days a week.
- It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate.
- Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.
58Example - Call 911
- Each PCT has a work week of 5 days followed by 2 days off.
- Want to meet the demand with minimal or near-minimal number of PCTs.
- So need to determine how many PCTs start their work week at each hour h of the week
59Modeling 911
- A continuous variable $Pct_h$ will represent the number of workers who start their work week at hour h, $0 \le h < 168$.
60Modeling 911
- A continuous variable Z will represent the objective function
- $\sum_h Pct_h = Z$
- There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is $b_h$, the number of workers needed.
61Modeling 911
- For this constraint we need to represent the number of workers who are on duty at time h
- Certainly, those who start the week at time h are here, as are those who started the week at time h - 1
- And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.
62Modeling 911
- This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h
- $\sum_w Pct_w \ge b_h$
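The index set of that sum can be spelled out in a short Python sketch. It lists, for an hour h, the start hours w whose workers are on duty, and checks a staffing plan against the demand. The helper names are hypothetical; the shift pattern (5 days, 8 hours, break during the fifth hour) is the one described above.

```python
HOURS_PER_WEEK = 168


def starts_covering(h):
    """Start hours w whose workers are on duty (and not on break) at hour h."""
    covering = set()
    for day in range(5):              # 5 consecutive working days
        for t in range(8):            # 8 hours on site per day
            if t == 4:                # the break comes after 4 hours of work
                continue
            offset = day * 24 + t     # hours elapsed since the work week began
            covering.add((h - offset) % HOURS_PER_WEEK)
    return covering


def on_duty(pct, h):
    """Number of workers on duty at hour h, given pct[w] workers starting at w."""
    return sum(pct[w] for w in starts_covering(h))


def meets_demand(pct, b):
    """True if the plan pct covers the demand b[h] in every hour of the week."""
    return all(on_duty(pct, h) >= b[h] for h in range(HOURS_PER_WEEK))


one_per_hour = [1] * HOURS_PER_WEEK   # one worker starting at every hour
```

Each hour is covered by 35 different start hours (5 days times 7 working hours), so every constraint row of the LP has exactly 35 nonzero coefficients.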
63Call 911 solved with progressive roundoff
    int b[168] = { // New York City 911
        30,24,18,15,14,14,15,25,34,36,38,40,
        41,43,46,57,57,59,61,59,55,50,45,38,
        32,25,20,17,15,13,17,25,32,35,38,40,
        42,43,47,58,57,57,59,57,55,52,47,41,
        33,25,20,17,15,13,15,25,32,33,37,39,
        42,43,47,57,56,57,57,56,53,50,47,41,
        34,27,22,19,16,15,16,25,31,35,37,40,
        44,45,48,57,57,56,58,56,53,53,46,41,
        34,28,23,19,16,15,17,25,33,37,39,42,
        45,47,51,59,58,60,61,61,57,56,57,55,
        48,41,35,30,26,20,18,22,26,32,42,46,
        49,53,54,56,56,56,59,59,57,57,56,56,
        52,46,41,34,29,23,18,19,25,31,36,41,
        46,50,52,53,52,53,54,53,50,49,45,40
    };
64Modeling 911
- Subject to these constraints we want to find a solution which makes the $Pct_h$ integer and which makes Z small.
- The naïve approach is to compute the minimal linear solution and to round up all the values of $Pct_h$ to the nearest integer.
- The linear relaxation yields Z = 204.67 fuzzy workers but rounding yields a mediocre integral solution of 259 workers.
65Modeling 911
- For this and many other applications, heuristics can be used to develop good solutions
- Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.
66Solving the Integer Problem
    main() // Planner Code
    {
        IlcInitFloat();
        IlcManager m(IlcNoEdit);
        IlcLinOpt simplex(m);
        IlcFloatVarArray Pct(m, 168, 0, 1000);
        IlcFloatArray coeffs(m, 168);
        int i, j, k, h, n;
67Solving the Integer Problem
        // sum over w of Pct[w] >= b[h]
        for (h = 0; h < 168; h++) { // for each hour of 168 in week
            for (j = 0; j < 168; j++)
                coeffs[j] = 0;
            for (k = 0; k < 5; k++) // for each of 5 days
                for (j = k*24; j < k*24 + 8; j++) // for each of 8
                    if (j != (k*24 + 4)) // hours
                        coeffs[(h + 168 - j) % 168] = 1;
            simplex.add(IlcScalProd(coeffs, Pct) >= b[h]);
        }
68Solving the Integer Linear Problem
        IlcFloatVar Z = IlcSum(Pct); // Objective
        simplex.setObjMin(Z);
        for (i = 0; i < 168; i++) { // Progressive roundoff
            n = ceil(simplex.getCurrentValue(Pct[i]));
            // Fix variable and re-optimize
            simplex.add(Pct[i] == n);
        }
        m.out() << "Number of Pcts needed is " << Z << endl;
        m.end();
    }
69Solution
- This code finds a solution with 208 workers in a couple of seconds. The optimum is 207.
- The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution
- Bartholdi, Ratliff, Orlin
702. Constraint Programming
71 LP/MIP is Beautiful, except when
- Variable domain information is important to the search strategy
- especially critical in scheduling
- The problem variables range over symbolic entities and there are lots of symmetries
- timetabling
- The MIP representation can be too verbose or awkward
- configuration
- There are just too many constraints
- e.g. vehicle routing
72Mathematical Basis of Constraint Programming (CP)
- The Constraint Satisfaction Problem
- Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain.
- A constraint on k variables $X_1, \dots, X_k$ is a relation $R(X_1, \dots, X_k) \subseteq D_1 \times \dots \times D_k$.
- A constraint satisfaction problem (CSP) is given by a finite set of constraints.
- A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.
73Domain Reduction
- In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints.
- With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint.
- Accelerates convergence toward a solution
- Detects infeasibility
74Constraint Propagation
- The other key issue is communication among the constraints or subproblems.
- The basic method used is called constraint propagation which links the constraints through their shared variables.
- The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
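A minimal Python sketch of domain reduction plus propagation for binary constraints follows. It is a plain fixpoint loop, not the algorithm of any particular CP system, and the name `propagate` and the example constraints are made up for illustration.

```python
def propagate(domains, constraints):
    """Shrink domains until no constraint can remove another value.

    domains maps a variable name to a set of values.
    constraints is a list of (x, y, pred); pred(value_of_x, value_of_y) must hold.
    Returns False as soon as some domain becomes empty, True otherwise.
    """
    changed = True
    while changed:
        changed = False
        for x, y, pred in constraints:
            # Domain reduction for one constraint: keep only supported values.
            keep_x = {vx for vx in domains[x]
                      if any(pred(vx, vy) for vy in domains[y])}
            keep_y = {vy for vy in domains[y]
                      if any(pred(vx, vy) for vx in keep_x)}
            if keep_x != domains[x] or keep_y != domains[y]:
                domains[x], domains[y] = keep_x, keep_y
                changed = True          # other constraints sharing x or y must be rechecked
            if not keep_x or not keep_y:
                return False            # infeasibility detected
    return True


def less(a, b):
    return a < b


doms = {v: set(range(1, 6)) for v in "XYZ"}
ok = propagate(doms, [("X", "Y", less), ("Y", "Z", less)])

clash = {v: set(range(1, 4)) for v in "XY"}
clash_ok = propagate(clash, [("X", "Y", less), ("Y", "X", less)])
```

The constraint X < Y knows nothing about Z, yet the domain of X loses the value 4 because Y < Z first removes 5 from the shared variable Y.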
75Monsieur Jourdain Phenomenon
- Like prose, you have been doing constraint propagation all your life.
- Crossword puzzles
- Incomplete and so backtracking is needed
- NY Times Sunday Crossword
- Optical Illusions
- Origin: Vision analysis (Marr, Waltz et al)
76Strengths of Constraint Programming
- Constraint Programming provides a rich representation language.
- CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT.
- Opportunity to choose a good heuristic for the solution strategy.
77Which Method for Which App?
Application | Technology
Product Mix | LP
Production Planning | MIP
Distribution Planning | MIP
Scheduling | Constraint Based Scheduling
Dispatching | CP + Local search
Configuration | CP
Axes: Linear -> Disjunctive Constraints; Strategic -> Operational Optimization
783. Cooperating Solvers
79First Stop
80Mother of All Examples - N Queens
- Do we think in terms of queens
- Where do we place this queen ?
- Do we think in terms of squares
- Will this square contain a queen ?
- These views are dual to one another
81The Primal View
- For each queen assign it a square
Place this queen in this square ?
82The Dual View
- For each square decide whether it will have a
queen
Place a queen in this square ?
83The Primal Model
- In which row do we place $q_j$ - the queen in column j
The constraints: $q_i \ne q_j$, $q_i - q_j \ne i - j$, $q_i - q_j \ne j - i$
Note: no alldifferent constraint
84Yet Another duality - rows vs columns
- In which column do we place $qq_i$ - the queen in row i
The constraints are the same: $qq_i \ne qq_j$, $qq_i - qq_j \ne i - j$, $qq_i - qq_j \ne j - i$
85The Relationship
- Can link them as inverse functions
- $qq_{q_i} = i$
- $q_{qq_j} = j$
The constraint propagation: i leaves domain of $q_j$ iff j leaves domain of $qq_i$
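The two views and the inverse-function link can be checked on actual solutions. The sketch below is plain backtracking in Python with 0-based indices, not a constraint propagation engine; the function names are made up for illustration.

```python
def queens(n):
    """Yield solutions q of the column view: q[j] is the row of the queen in column j."""
    q = []

    def extend():
        j = len(q)
        if j == n:
            yield list(q)
            return
        for row in range(n):
            # q_i != q_j and |q_i - q_j| != j - i for all earlier columns i
            if all(q[i] != row and abs(q[i] - row) != j - i for i in range(j)):
                q.append(row)
                yield from extend()
                q.pop()

    yield from extend()


def inverse(q):
    """Row view: qq[i] is the column of the queen in row i."""
    qq = [None] * len(q)
    for j, row in enumerate(q):
        qq[row] = j
    return qq


def is_solution(q):
    n = len(q)
    return all(q[i] != q[j] and abs(q[i] - q[j]) != j - i
               for i in range(n) for j in range(i + 1, n))


six = list(queens(6))
```

Every solution of one view maps to a solution of the other, which is what lets the two models share pruning through the channeling constraints.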
86In this primal/dual model
- Lo and behold
- one-third fewer fails
- (Example from Jean Jordan's thesis)
87(No Transcript)
88[Board diagram: 2 queens placed (Q), 15 squares crossed out (X)]
89[Board diagram: 3 queens placed (Q), 25 squares crossed out (X)]
90[Board diagram: 4 queens placed (Q), 29 squares crossed out (X)]
91So
- The cooperating primal-dual formulation captured the generalized arc consistency of the alldifferent constraint
- The arc consistency of this global constraint is non-trivial to maintain
- Network flow algorithms: flow goes from values to variables; each variable has unit demand and capacity
92Remarks
- An IP model will encode the first dual solution
- Will this square contain a queen
- $x_{ij} = 0$ or $x_{ij} = 1$
- A disaster beyond 30 queens
- network structure on rows and columns lost
- Another example - sports scheduling
93Constraints and Indices
- In IP symbols are represented by indices as opposed to values for variables.
- nurses, teams
- Paris-St-Germain plays Manchester United on day k
- $x_{ijk}$ 0-1 to represent team i plays team j on day k
- You can't put symmetry breaking and other constraints on indices.
94Second Stop
95CP Is Powerful, But .
- Sometimes, inconsistencies can be overlooked
- X - Y ? 12
- X Y ? 10
- X in 1..20
- Y in 1..20
- Domain reduction on each constraint and
constraint propagation will not reduce the
domains although the system has no linear
solution - but an LP solver would spot this
962 Dimensional Bin Packing
- Application for the Automobile Industry built by
Greg Glockner
972 Dimensional Bin Packing
- The problem here is to put as many small rectangles as possible in a big rectangle, with 90 degree rotation allowed.
- The actual application involves circuit boards
- There are two complete models, one a CP model and the other an IP model.
- The CP model directs the search
- The LP relaxation prunes the search space by detecting infeasible nodes
982-D Bin Packing
- Arrange circuit boards onto raw material
- Boards may be rotated
- Use same number of each board
- Objective: minimize scrap
Classic combinatorial optimization problem
99Solving 2-D Bin Packing
- Use CP to generate partial solutions (nodes)
- Restrict placement to reduce fragmentation of blank space
- Use tight LP to test feasibility
- If any partial solution is infeasible in the LP, prune the tree immediately
CP constraints reduce the tree width; LP allows us to prune quickly
1002 Dimensional Bin Packing
- As the search tree is traversed, the two models are in sync.
- Note that the variables used in the 2 models are disjoint
- The two models are dual to each other
- The IP sees the model from the point of view of the board, the large rectangle
- The CP sees the model from the point of view of the small rectangles
- Solutions are obtained in minutes
1012DBP Basic CP Formulation
- Let $(x_i, y_i)$ be the location and $(w_i, h_i)$ be the dimensions of the ith tile
- Basic constraints
- Disjunctive constraints to prevent overlapping tiles: $x_i + w_i \le x_j \lor y_i + h_i \le y_j \lor x_j + w_j \le x_i \lor y_j + h_j \le y_i$
- Constraints to count the number of each tile type
- Tile-oriented formulation
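1022DBP Basic IP Formulation
- Let $x_{ijnt} = 1$ if tile n of type t is in position (i, j)
- The constraints are
[Equations lost in extraction. Surviving fragments: a sum of $x_{ijnt}$ over n, t compared with 1 for all i, j; a sum of $x_{ijnt}$ over i, j compared with 1 for all n, t; index ranges on i and j bounded by w and h; and $x_{ijnt} \in \{0, 1\}$ for all i, j, n, t.]
Grid-oriented formulation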
1032DBP LP Issues
- The LP is large
- The LP exhibits significant primal degeneracy
- The LP exhibits significant dual degeneracy
1042DBP LP Issues
- The simplex algorithm cannot solve the LP
- There is no way for a MIP solver to solve the IP as such
- The barrier method can solve the LP
1052DBP Summary
- CP as master problem
- Orders tiles
- Places tiles by position, then type
- Selects tile type by frequency to scatter tiles throughout the bin
- Uses a one-ply lookahead constraint to limit the position of following tile
- LP relaxation prunes the CP search space
- Checks whether the partial solution will lead to an infeasible instance
- Use idiomatic formulations for CP and IP
1062DBP Remarks
- The CP fixes significant numbers of variables at each node
- The LP pre-processor greatly simplifies the LP
- Therefore, the lack of incrementality of the barrier method does not cost us
1072DBP Cooperative Algorithm Demo
108Last Stop
- Constraint Programming and Local Search
cooperation
- Another example of duality in action
109CP/LS
- Parallel machines with set-up times
- Ready times
- Due dates
- Splittable jobs
- Rogue machines
Objectives: meet due dates, minimize setup costs
110Two Phase Cooperation
- Phase I - the Primal (Work on first objective)
-
- Configure and schedule the jobs
- Use constraint based scheduling
111Two Phase Cooperation
Machines morph into trucks
112Two Phase Cooperation
- Phase 2 - the Dual (Work on second objective)
- Schedule the trucks
- Use Lin, Lin-Kernighan, tabu etc
113Parallel Machines Cooperative Algorithm Demo
114IC Park Example
- Hoist Scheduling (Rodosek and Wallace)
- The original model is an IP
- The CP model is the same
- CP guides search, LP relaxation and CP share pruning duties
- No apparent duality
115Remarks
- One can get great benefit with CP/IP algorithms, CP/CP algorithms and CP/LS algorithms
- IP/LS is just around the corner
- IP/IP cooperation is hard because one can't formulate truly dual views
- either simply not there
- or too verbose
1164. DISJUNCTIVE PROGRAMMING
117Disjunctive Linear Programming
- An extension of Mixed Integer Programming
- A union of polyhedral sets (feasible regions) is
called a disjunctive set.
118Disjunctive Set
119Disjunctive Linear Programming
- The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem.
- The solution set of the disjunctive programming problem is
- $\bigcap_{i<M} \bigcup_{j<N} F_{ij}$
120Disjunctive Linear Programming Examples
- Semi-continuous variables
- either $X \ge 100$
- or $X = 0$
- Rather than
- $X \le BigM \cdot Y$,
- $X \ge 100 \cdot Y$,
- Y a 0-1 variable
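The two formulations can be compared directly in a few lines of Python. This sketch assumes an illustrative bound BigM = 500 on X; the function names are hypothetical.

```python
from fractions import Fraction

BIG_M = 500  # assumed upper bound on X, chosen only for this illustration


def in_disjunction(x):
    """Direct disjunctive statement: X = 0 or X >= 100 (and X within its bound)."""
    return x == 0 or 100 <= x <= BIG_M


def in_big_m_model(x, y):
    """Big M statement: X <= BigM * Y and X >= 100 * Y, for a given Y in [0, 1]."""
    return 0 <= y <= 1 and 100 * y <= x <= BIG_M * y


def integer_y_exists(x):
    """True if some 0-1 value of Y makes the Big M constraints hold."""
    return any(in_big_m_model(x, y) for y in (0, 1))


# With Y restricted to 0 or 1 the two statements agree on every integer X.
agree = all(in_disjunction(x) == integer_y_exists(x) for x in range(0, 601))

# With Y relaxed to a fraction, X = 50 slips through although it is in neither branch.
relaxation_admits_50 = in_big_m_model(50, Fraction(1, 2))
```

The second check illustrates why the linear relaxation of a Big M model can be loose: it accepts points that lie strictly between the two branches of the disjunction.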
121Solution Set Inside Initial Region
122Disjunctive Linear Programming Examples - continued
- Bollapragada, Ghattas and Hooker
- Truss structure design problem
- Branches directly on alternatives dictated by Hooke's Law
- Wyatt
- Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization
- Extends Benders decomposition to disjunctive linear programs
123Disjunctive Linear Programming continued
- Balas, Cornuejols and Ceria
- Generating cuts for disjunctive programming problems.
- McAloon and Tretkoff
- Basic mathematical results: Optimization and Computational Logic, Wiley
124Disjunctive Linear Programming continued
- Dealing with the disjunctive part requires search.
- This requires an engine which is not available in MIP packages
- Also the linear relaxation is not as tight and the evaluation function is not as faithful
- The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers
- Beringer and DeBacker for MIP
125To Keep It Simple
An AI classic Newell and Simon
Assignment problem 1 constraint
Surprisingly hard for MIP solvers CPLEX MIP takes
1 minute and 29048 nodes (on Sun Enterprise) to
find a feasible integer solution
126The Disjunctive Program
- One constraint for the equation
- $100000 G + \dots + D = 100000 R + \dots + T$
- For each variable X among G, ..., T
- $X = 0$ or $X = 1$ or ... or $X = 9$
- For each pair X, Y
- $X \le Y - 1$ or $Y \le X - 1$
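As a reference point for the puzzle itself (GERALD + DONALD = ROBERT, as spelled out in the solver code later in the deck), here is a small Python sketch. It is neither the disjunctive program nor the cooperating-solver search: it enumerates digits column by column from the right and lets each column's sum determine the next letter. The function names are hypothetical.

```python
def solve_gerald_donald():
    """Return all digit assignments with GERALD + DONALD = ROBERT.

    Letters get distinct digits and the leading letters G, D, R are nonzero.
    """
    solutions = []
    digits = range(10)
    for D in range(1, 10):                          # units: D + D = T
        T, c1 = (2 * D) % 10, (2 * D) // 10
        for L in digits:                            # tens: L + L + carry = R
            R, c2 = (2 * L + c1) % 10, (2 * L + c1) // 10
            for A in digits:                        # hundreds: A + A + carry = E
                E, c3 = (2 * A + c2) % 10, (2 * A + c2) // 10
                for N in digits:                    # thousands: R + N + carry = B
                    B, c4 = (R + N + c3) % 10, (R + N + c3) // 10
                    for O in digits:                # ten-thousands: E + O + carry = O
                        if (E + O + c4) % 10 != O:
                            continue
                        c5 = (E + O + c4) // 10
                        G = R - D - c5              # top column: G + D + carry = R
                        if not 1 <= G <= 9:
                            continue
                        letters = (G, E, R, A, L, D, O, N, B, T)
                        if len(set(letters)) == 10:
                            solutions.append(dict(zip("GERALDONBT", letters)))
    return solutions


def value(word, digit):
    """Numeric value of a word under a letter-to-digit assignment."""
    return int("".join(str(digit[ch]) for ch in word))


found = solve_gerald_donald()
```

One assignment it returns is G=1, E=9, R=7, A=4, L=8, D=5, O=2, N=6, B=3, T=0, that is 197485 + 526485 = 723970.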
127Solution Set SOME of the Integer Points in the
Region
128The Twin Variables for Cooperating Solvers
- Integer variables for the letters
- $0 \le g, e, r, a, l, d, o, n, b, t \le 9$
- With continuous doppelgangers
- $0 \le G, E, R, A, L, D, O, N, B, T \le 9$
129The Variables
- One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver
- $100000 G + 10000 E + 1000 R + \dots + D$
- $+\ 100000 D + 10000 O + 1000 N + \dots + D$
- $=\ 100000 R + 10000 O + 1000 B + \dots + T$
130The Variables
- One CSP constraint on the integer variables posted to a discrete constraint propagation engine
- AllDifferent(g, e, r, a, l, d, o, n, b, t)
131The Search
- Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely
- The branching strategy is guided by the linear relaxation on the continuous variables
- if there is a non-integral variable X, branch on it
- $X \le$ floor(X)
- or
- $X \ge$ ceil(X)
132The Search
- If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack
- If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack
133The Search
- New wrinkle
- The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values
- In this case, branch to keep them apart
- either $X \le Y - 1$
- or $Y \le X - 1$
134The Variables
    void main()
    {
        IlcInitFloat();
        IlcManager m(IlcNoEdit);

        IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9),
                  G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9);
        IlcIntVarArray vars(m, 10, D, O, N, A, L, G, E, R, B, T);

        // Continued on next slide
135The Constraints
        m.add(IlcAllDiff(vars, IlcWhenValue));
        IlcLinOpt simplex(m);
        simplex.add(
            100000*R + 10000*O + 1000*B + 100*E + 10*R + T
            ==
            100000*G + 10000*E + 1000*R + 100*A + 10*L + D
            +
            100000*D + 10000*O + 1000*N + 100*A + 10*L + D
            , IlcTrue // Post to Solver as well
        );
136The Search for solutions
        m.add(Generate(m, simplex, vars)); // Search strategy
        if (m.nextSolution()) // Find a solution
            m.out() << " solution found " << endl;

        m.printInformation();
        m.end();
    }
137Branch if a variable is non-integer
    ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars)
    {
        IlcInt varIndex = MostNotInteger(vars, simplex);
        if (varIndex >= 0) // There is a non-integer variable
            return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this);
138Is integer relaxation a solution ?
        IlcManager m = getManager();
        if (m.solve(TestIntegerRelaxation(m, simplex)))
            return 0;
139Find two variables with same value
        IlcInt i, j;
        for (i = 0; i < vars.getSize() - 1; i++) {
            if (vars[i].isBound()) continue; // Can't both be bound
            IlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));
            for (j = i + 1; j < vars.getSize(); j++) {
                IlcInt m = simplex.nearest(simplex.getCurrentValue(vars[j]));
                if (m == n) break;
            }
            if (j < vars.getSize()) break;
        }
140Branch to push them apart
        // j and i are the indices of two variables with same current value
        return
            IlcAnd(IlcOr(
                       Smaller(m, vars[i], vars[j], simplex),
                       Smaller(m, vars[j], vars[i], simplex)),
                   this // Recursion
            );
    }
141Pushing two variables apart
    ILCGOAL3(Smaller, IlcIntVar, x, IlcIntVar, y, IlcSimplex, simplex)
    {
        simplex.add(x <= y - 1, IlcTrue);
        return 0;
    }
142Testing the integer relaxation
    ILCGOAL1(TestIntegerRelaxation, IlcSimplex, simplex)
    {
        simplex.trySolution();
        return 0;
    }
143Results
- ILOG Solver/Planner finds a solution in 6 nodes (.29 seconds on a laptop)
- Straightforward ILOG Solver finds a solution in 8024 nodes (1.8 seconds on a laptop)
- Again, CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution
144Example: The Dutch Trains
- Scheduling intercity trains
- Amsterdam, Rotterdam, Roosendaal, Vlissingen
Without coupling constraints: a multi-commodity integer flow problem
With coupling constraints: a DLP with an integer relaxation
Additional logic handled directly in 2LP with CPLEX
Reference: Disjunctive Programming and Cooperating Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)
145Conclusions
- CP and MIP are powerful techniques that can solve many combinatorial problems
- Each has preferred formulations
- Can get even greater benefits when combining CP and IP algorithms
146Recent and Current Work
- Beaumont
- Beringer, DeBacker
- Balas, Ceria, Cornuejols.
- Wallace, Rodosek, Schrimpf
- Heipke, Colombani
- Bockmayr
- McAloon, Tretkoff, Wetzel
147III. Exploiting Randomization to Solve Hard
Combinatorial Problems
148 Background
- Combinatorial search methods often exhibit a remarkable variability in performance. It is common to observe significant differences between:
  - different heuristics
  - the same heuristic on different instances
  - different runs of the same heuristic with different seeds (stochastic methods)
149 Main Claim
- One can take advantage of the extreme variability of combinatorial search methods.
- One can improve the performance of a deterministic complete method by introducing a stochastic element, while maintaining completeness.
- We'll explain WHY that is the case.
150
- A Structured Benchmark Domain for Studying the Distributions of Search Methods
- Stochasticity in Search Procedures
- Intriguing Properties of Complete Backtrack Style Algorithms
- Consequences for Algorithm Design
  - Rapid Randomized Restarts
  - Portfolio of Algorithms
151 Structured Benchmark Domain
152 Background
- Study of local and systematic search methods has been driven by:
  - Random instance distributions (Hogg et al. 96). Limitation: lack of the structure that characterizes realistic problems.
  - Highly structured problems (Fujita et al. 93). Limitation: too much structure.
- We propose a benchmark domain that bridges the gap between purely random instances and highly structured problems.
153 Quasigroups
Defn.: a pair $(Q, \cdot)$ where $Q$ is a set and $\cdot$ is a binary operation on $Q$ such that the equations $a \cdot x = b$ and $y \cdot a = b$ are uniquely solvable for every pair of elements $a, b$ in $Q$.
The multiplication table of its binary operation defines a latin square (i.e., each element of $Q$ appears exactly once in each row and in each column).
Example: quasigroup of order 4
154 Quasigroup Completion Problem (QCP)
Given a partial latin square, can it be completed?
Example
155 Quasigroup Completion Problem: A Framework for Studying Search
- NP-Complete (Colbourn 1983, 1984; Anderson 1985).
- Has a structure not found in random instances.
- Leads to interesting search problems when structure is perturbed.
- The study of this problem led us to identify the unusual distributions of combinatorial search (Gomes, Selman & Crato, CP97).
156 Aside: Applications of Quasigroups
- Design of statistical experiments
  - eliminating data dependencies
- Scheduling/Timetabling (Anderson 1992)
  - completing a schedule given a set of pre-defined events
- Automated theorem proving (Fujita et al. 1993)
  - existence vs. non-existence of quasigroups with intricate mathematical properties
157 Example: Scheduling of Drug Experiment
- Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.
- Use constraint:
  - No two people get the same brand on the same day (eliminate bias for day of the week).
158 Quasigroup Completion
                                DAY
SUBJECT   Mon.       Tues.      Wed.       Thurs.     Fri.
Tim       Tylenol    Aleve      Bayer      Excedrin   Advil
Sue       Aleve      Bayer      Excedrin   Advil      Tylenol
Frank     Bayer      Excedrin   Advil      Tylenol    Aleve
Teresa    Excedrin   Advil      Tylenol    Aleve      Bayer
Todd      Advil      Tylenol    Aleve      Bayer      Excedrin
(*) Pre-assigned
159 QCP has a natural formulation as a Constraint Satisfaction Problem:
- a variable for each entry of the N x N table
- constraints capture the row/column requirement
- variable assignments capture the pre-assigned values
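The formulation above can be sketched as a small chronological backtracking search. This is only an illustration under stated assumptions: the names `Grid`, `allowed`, and `complete` are hypothetical, cells are visited in plain row-major order, and the pre-assigned cells are assumed to be mutually consistent. It is not the solver used for the experiments in these slides.

```cpp
#include <cassert>
#include <vector>

// Partial latin square: 0 marks an empty cell, 1..n are the symbols.
using Grid = std::vector<std::vector<int>>;

// Row/column constraint: v may go in (r, c) only if it does not
// already occur in row r or in column c.
bool allowed(const Grid& g, int r, int c, int v) {
    const int n = static_cast<int>(g.size());
    for (int k = 0; k < n; ++k)
        if (g[r][k] == v || g[k][c] == v) return false;
    return true;
}

// One variable per cell, visited in row-major order.
// Fills g in place and returns true if a completion exists.
bool complete(Grid& g, int cell = 0) {
    const int n = static_cast<int>(g.size());
    if (cell == n * n) return true;                   // every cell assigned
    const int r = cell / n, c = cell % n;
    if (g[r][c] != 0) return complete(g, cell + 1);   // pre-assigned value
    for (int v = 1; v <= n; ++v) {
        if (!allowed(g, r, c, v)) continue;
        g[r][c] = v;
        if (complete(g, cell + 1)) return true;
        g[r][c] = 0;                                  // undo, try next value
    }
    return false;
}
```

Calling `complete` on a grid with some cells pre-filled either completes it in place or reports that no completion exists.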
160 How does the difficulty of QCP vary with the fraction of pre-assignment?
161 (Figure: median number of backtracks (log) vs. fraction of pre-assignment)
162 Complexity graph shows (up to order 20):
- the curve peaks around 42% of pre-assignment, the critically constrained area
- under-constrained and over-constrained areas are easier
163 Directly related to the peak in computational difficulty is the so-called phase transition graph for the QCP problem.
164 (Figure: fraction of unsolved cases vs. fraction of pre-assignment)
165 Phase Transition
- QCP phase transition: a threshold phenomenon, from almost all solvable to almost all unsolvable, occurs around 42% of pre-assignment.
- It's called a phase transition because of the close relation to state transition phenomena studied in physics, such as the melting of a solid into a liquid.
166 Exploiting Structure
167 Exploiting Structure in QCP
(Figure; curves: Arc Consistency on binary constraints, Forward Checking)
168 Further Exploiting Structure in QCP
(Figure; curves: General Arc Consistency on all-different constraints, Arc Consistency on binary constraints)
169 Enforcing General Arc Consistency on All-Different Constraints
- Beautiful example of integration of AI/OR techniques for a well-defined sub-problem
- Propagation uses the Maximum Matching problem (a particular case of Network Flow problems, which have polynomial time complexity)
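To illustrate the link to matching, here is a minimal sketch that only tests whether an all-different constraint is still satisfiable: it is satisfiable exactly when a maximum bipartite matching between variables and values covers every variable. The sketch uses the textbook augmenting-path method; the function names are hypothetical, values are assumed to be numbered from 0, and the value-pruning step of a full general-arc-consistency propagator is not shown.

```cpp
#include <cassert>
#include <vector>

// Search for an augmenting path starting at variable `var`.
bool augment(int var, const std::vector<std::vector<int>>& domains,
             std::vector<int>& ownerOfValue, std::vector<bool>& seen) {
    for (int val : domains[var]) {
        if (seen[val]) continue;
        seen[val] = true;
        // Take a free value, or re-route the variable that currently holds it.
        if (ownerOfValue[val] == -1 ||
            augment(ownerOfValue[val], domains, ownerOfValue, seen)) {
            ownerOfValue[val] = var;
            return true;
        }
    }
    return false;
}

// Size of a maximum matching between variables and the values 0..numValues-1.
int maxMatching(const std::vector<std::vector<int>>& domains, int numValues) {
    std::vector<int> ownerOfValue(numValues, -1);
    int matched = 0;
    for (int var = 0; var < static_cast<int>(domains.size()); ++var) {
        std::vector<bool> seen(numValues, false);
        if (augment(var, domains, ownerOfValue, seen)) ++matched;
    }
    return matched;
}

// All-different is satisfiable iff every variable can be matched.
bool allDifferentFeasible(const std::vector<std::vector<int>>& domains,
                          int numValues) {
    return maxMatching(domains, numValues) ==
           static_cast<int>(domains.size());
}
```

Three variables that all have domain {0, 1} are a useful test case: each pairwise "not equal" constraint is arc consistent on its own, yet the matching has size 2, so the global constraint detects the failure.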
170 Further Exploiting Structure in QCP
- By enforcing general arc consistency on all-different constraints, problems up to order 50 could be solved!
171 Stochasticity in Search Procedures
172 Background
- Stochastic strategies have been very successful in the area of local search.
  - Limitation: inherent incomplete nature of local search methods.
- We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.
173 We introduce stochasticity in a backtrack search method by randomly breaking ties in variable and/or value selection.
- Compare with standard lexicographic tie-breaking.
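A minimal sketch of the two tie-breaking rules, assuming each candidate variable has an integer heuristic score where smaller is better (for example, its current domain size) and that the score list is non-empty. The function names and the scoring convention are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Lexicographic tie-breaking: the first candidate with the minimum score.
std::size_t pickFirstBest(const std::vector<int>& score) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < score.size(); ++i)
        if (score[i] < score[best]) best = i;
    return best;
}

// Randomized tie-breaking: collect all candidates with the minimum score,
// then choose one of them uniformly at random.
std::size_t pickRandomBest(const std::vector<int>& score, std::mt19937& rng) {
    std::vector<std::size_t> best;
    for (std::size_t i = 0; i < score.size(); ++i) {
        if (best.empty() || score[i] < score[best.front()]) {
            best.clear();
            best.push_back(i);
        } else if (score[i] == score[best.front()]) {
            best.push_back(i);
        }
    }
    std::uniform_int_distribution<std::size_t> pick(0, best.size() - 1);
    return best[pick(rng)];
}
```

Only the choice among equally ranked candidates is randomized, so the heuristic itself is unchanged and the backtrack search stays complete.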
174 Randomized Strategies
178
- Lesson: randomized tie-breaking can improve performance over a purely deterministic strategy.
- Next: but we can obtain a more dramatic advantage from randomization ...
179 Cost Distributions
- Key Properties:
  - I. Erratic behavior of mean.
  - II. Distributions have heavy tails.
180 (Figure: sample mean vs. number of runs; annotations: 3500!, Median = 1!)
182 (Figure: proportion of cases solved; annotations: 75% < 30, 5% > 100000)
183 Heavy-Tailed Distributions
- Infinite variance, infinite mean
- Introduced by Pareto in the 1920s as a probabilistic curiosity.
- Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
- Examples: stock market, earthquakes, weather, ...
184 Decay of Distributions
- Standard: exponential decay, e.g. Normal
- Heavy-tailed: power law decay, e.g. Pareto-Levy
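The formulas on this slide did not survive extraction. As a reminder of the usual way the contrast is stated (the notation below is an assumption, not recovered from the slide), the tail of a standard normal variable decays exponentially, while a heavy-tailed variable of the Pareto-Levy type decays as a power law:

```latex
\[
\text{Standard normal:}\qquad
\Pr[X > x] \;\sim\; \frac{1}{x\sqrt{2\pi}}\, e^{-x^{2}/2}
\qquad (x \to \infty)
\]
\[
\text{Pareto-Levy:}\qquad
\Pr[X > x] \;\sim\; C\, x^{-\alpha},
\qquad x > 0,\quad 0 < \alpha < 2,\quad C > 0
\]
```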
186 Normal, Cauchy, and Levy
187 Tail Probabilities (Standard Normal, Cauchy, Levy)
188 How to Check for Heavy Tails?
- The log-log plot of the tail of the distribution should be approximately linear.
- The slope gives the value of $\alpha$ (the slope is $-\alpha$):
  - $\alpha \le 1$: infinite mean and infinite variance
  - $1 < \alpha < 2$: infinite variance
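The check above can be sketched as a least-squares fit of log(1 - F(x)) against log(x). The function name is hypothetical, the samples are assumed to be positive, and the empirical tail uses the plotting position (n - i - 0.5)/n for the i-th smallest sample, which is one common convention rather than anything stated in the slides. In practice the fit is usually restricted to the upper tail of the data; this sketch fits all points.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares slope of log(1 - F(x)) against log(x).
// For a power-law tail Pr[X > x] ~ C x^(-alpha) the slope is about -alpha.
double logLogTailSlope(std::vector<double> xs) {
    std::sort(xs.begin(), xs.end());
    const double n = static_cast<double>(xs.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double lx = std::log(xs[i]);
        // Empirical survival probability of the i-th smallest sample.
        const double ly = std::log((n - static_cast<double>(i) - 0.5) / n);
        sx += lx;
        sy += ly;
        sxx += lx * lx;
        sxy += lx * ly;
    }
    return (n * sxy - sx * sy) / (n * sxx - sx * sx);
}
```

An estimated slope between -2 and 0 points to infinite variance, and a slope between -1 and 0 to an infinite mean as well.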
189 Example of Heavy-Tailed Model (Random Walk)
- Random Walk:
  - Start at position 0
  - Toss a fair coin:
    - with each head take a step up (+1)
    - with each tail take a step down (-1)
- X: number of steps the random walk takes to return to position 0.
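The random walk above is easy to simulate. In this sketch the function name, the step cap `maxSteps`, and the use of the low bit of `std::mt19937` as the fair coin are all assumptions. The cap is needed because the return time has an infinite mean, so an uncapped run can take extremely long.

```cpp
#include <cassert>
#include <random>

// Number of steps until a +1/-1 walk started at 0 first returns to 0,
// or -1 if it has not returned within maxSteps.
long returnTime(std::mt19937& rng, long maxSteps) {
    long position = 0;
    for (long step = 1; step <= maxSteps; ++step) {
        position += (rng() & 1u) ? 1 : -1;   // fair coin from the low bit
        if (position == 0) return step;
    }
    return -1;
}
```

Half of all walks return after exactly 2 steps, while a small but non-negligible fraction runs into any cap one chooses. This is the same mix of very short and very long runs described for backtrack search.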
191 Heavy-tails vs. Non-Heavy-Tails
(Figure: 1-F(x), unsolved fraction, vs. X, the number of steps the walk takes to return to zero (log scale); curves: Normal(2,1), Normal(2,1000000); annotation: 0.1% > 200000)
192 Heavy-tails in QCP Domain
(Figure: 1-F(x), unsolved fraction, vs. number of backtracks (log))
193 The log-log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.
196 Heavy Tailed Cost Distribution
198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution.
- Right-hand side: non-negligible fraction of very long runs
- Left-hand side: non-negligible fraction of very short runs
199 Sports Scheduling
(Figure: cumulative distribution function vs. number of backtracks (log); annotations: 70% > 250000, 15!)
200 (Figure: power law decay vs. exponential decay; standard distribution (finite mean & variance))
201 Consequence for algorithm design
- Use rapid restarts or parallel / interleaved runs
- Super-linear speedups!!!
202 Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds
5x speedup
203
- Rapid restarts work particularly well on hard computational problems because of the heavy-tailed phenomena in the run time distribution.
- The RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.
- Restarts provably eliminate heavy tails (Gomes, Selman & Crato)
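A restart strategy is a thin wrapper around a randomized search. In the sketch below the interface is an assumption made for illustration: `run(cutoff)` performs one randomized attempt and returns the number of backtracks used if it succeeds within the cutoff, or -1 otherwise. A fixed cutoff is used; to keep the overall method complete, the cutoff would have to grow over time.

```cpp
#include <cassert>
#include <functional>

struct RestartResult {
    bool solved;
    long totalBacktracks;   // backtracks summed over all attempts
    int restarts;           // number of restarts before the final attempt
};

// Repeats a randomized attempt, abandoning it after `cutoff` backtracks.
RestartResult solveWithRestarts(const std::function<long(long)>& run,
                                long cutoff, int maxRestarts) {
    long total = 0;
    for (int attempt = 0; attempt <= maxRestarts; ++attempt) {
        const long used = run(cutoff);
        if (used >= 0) return {true, total + used, attempt};
        total += cutoff;    // a failed attempt consumed its whole budget
    }
    return {false, total, maxRestarts};
}
```

Each attempt that would have drifted into the long right tail costs at most `cutoff` backtracks, and every fresh attempt has another chance of being one of the very short runs.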
204 Sketch of proof of elimination of heavy tails
- Let's truncate the search procedure after m backtracks.
- Probability of solving the problem with the truncated version
- Run the truncated procedure and restart it repeatedly.
205 Y does not have heavy tails
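The formulas of this sketch were lost in extraction. One standard way to fill in the argument is given below. The symbols are assumptions: $X$ is the number of backtracks of the original procedure, $p_m$ the success probability of one truncated run (assumed to satisfy $0 < p_m < 1$), and $Y$ the total number of backtracks used by the restarted procedure. If $Y > x$, then each of the first $\lfloor x/m \rfloor$ truncated runs must have failed, so the tail of $Y$ decays exponentially:

```latex
\[
p_m = \Pr[X \le m]
\]
\[
\Pr[Y > x] \;\le\; (1 - p_m)^{\lfloor x/m \rfloor}
\;\le\; (1 - p_m)^{x/m - 1}
\;=\; \frac{1}{1 - p_m}\, e^{-c x},
\qquad c = -\frac{\ln(1 - p_m)}{m} > 0
\]
```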
206 Restarts
(Figure: 1-F(x), unsolved fraction, vs. number of backtracks (log); annotation: 70% unsolved)
207 Example of Rapid Restart Speedup (planning)
(Figure: number of backtracks (log) vs. cutoff (log))
208 Summary Results
Deterministic
(*) not found after 2 days
209
- Our results provide the first indication of heavy-tailed distributions in a computational model.
- Overall insight: randomized tie-breaking with rapid restarts gives a powerful search strategy.
210 Heavy-Tailed Distributions in Other Domains
- Quasigroup Completion Problem
- Graph Coloring
- Logistic Planning
- Circuit Synthesis
212 Rapid Restart Speedup
215 Algorithm Portfolio Design
216 Motivation
- The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.
- Goal: improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.
217 Branch & Bound: Best Bound vs. Depth First Search
218 Branch & Bound (Randomized)
- Standard OR approach for solving Mixed Integer Programs (MIPs)
- Solve the linear relaxation of the MIP
- Branch on the integer variables for which the solution of the LP relaxation is non-integer:
  - apply a good heuristic (e.g., max infeasibility) for variable selection (+ randomization) and create two new nodes (floor and ceiling of the fractional value)
- Once we have found an integer solution, its objective value can be used to prune other nodes whose relaxations have worse values
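The branching step can be sketched as follows, assuming that every entry of the LP solution belongs to an integer-constrained variable and reading "max infeasibility" as "the value farthest from an integer". The struct and function names are hypothetical, and the randomization of ties is left out to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Branch {
    int var;            // index of the branching variable, -1 if none
    double downUpper;   // "floor" child adds:   x[var] <= downUpper
    double upLower;     // "ceiling" child adds: x[var] >= upLower
};

// Max-infeasibility rule: choose the variable whose LP value is farthest
// from an integer. Returns var == -1 if the solution is integral within eps.
Branch chooseBranch(const std::vector<double>& lpValue, double eps = 1e-6) {
    Branch best{-1, 0.0, 0.0};
    double bestDist = eps;
    for (int i = 0; i < static_cast<int>(lpValue.size()); ++i) {
        const double frac = lpValue[i] - std::floor(lpValue[i]);
        const double dist = std::min(frac, 1.0 - frac);
        if (dist > bestDist) {
            bestDist = dist;
            best = Branch{i, std::floor(lpValue[i]), std::ceil(lpValue[i])};
        }
    }
    return best;
}
```

For the LP solution (2.0, 3.9, 1.4) the rule selects the third variable and creates the two children x <= 1 and x >= 2.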
219 Branch & Bound: Depth First vs. Best Bound
- Critical in the performance of Branch & Bound: the way in which the next node to be expanded is selected.
- Best-bound: select the node with the best LP bound (standard OR approach). This case is equivalent to A*: the LP relaxation provides an admissible search heuristic.
- Depth-first: often quickly reaches an integer solution (may take longer to produce an overall optimal value).
220 Portfolio of Algorithms
- A portfolio of algorithms is a collection of algorithms and/or copies of the same algorithm running interleaved or on different processors.
- Goal: to improve on the performance of the component algorithms in terms of:
  - expected computational cost
  - risk (variance)
- Efficient Set or Efficient Frontier: the set of portfolios that are best in terms of expected value and risk.
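To make "expected cost" and "risk" concrete, the sketch below computes the mean and standard deviation of a portfolio's run time. It assumes each component runs on its own processor, runs are independent, each run time follows a known discrete distribution, and the portfolio stops as soon as the first run finishes. All type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Outcome { double time; double prob; };
using Dist = std::vector<Outcome>;   // run-time distribution of one run
struct Stats { double mean; double stddev; };

// Pr[T > t] for a single run.
double survival(const Dist& d, double t) {
    double s = 0.0;
    for (const Outcome& o : d)
        if (o.time > t) s += o.prob;
    return s;
}

// Run time of the portfolio = minimum over its independent runs.
Stats portfolioStats(const std::vector<Dist>& runs) {
    std::vector<double> times;
    for (const Dist& d : runs)
        for (const Outcome& o : d) times.push_back(o.time);
    std::sort(times.begin(), times.end());
    times.erase(std::unique(times.begin(), times.end()), times.end());

    double prevSurvival = 1.0, mean = 0.0, second = 0.0;
    for (double t : times) {
        double s = 1.0;                      // Pr[min > t]
        for (const Dist& d : runs) s *= survival(d, t);
        const double p = prevSurvival - s;   // Pr[min == t]
        mean += p * t;
        second += p * t * t;
        prevSurvival = s;
    }
    return {mean, std::sqrt(std::max(0.0, second - mean * mean))};
}
```

With a run that takes 1 or 100 time units with equal probability, a single copy has mean 50.5, while two independent copies have mean 25.75 and a smaller standard deviation. Comparing such pairs of values for different mixes of algorithms traces out the efficient frontier.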
221 Depth-first vs. Best-bound (logistics planning)
(Figure: cumulative frequencies vs. number of nodes)
222 Depth-First and Best-Bound do not dominate each other overall.
223 Heavy-tailed behavior of Depth-first
224 Portfolio for heavy-tailed search procedures (2 processors)
(Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 2 DF / 0 BB and 0 DF / 2 BB)
225 Portfolio for heavy-tailed search procedures (6 processors)
(Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 0 DF / 6 BB and 6 DF / 0 BB)
226 Portfolio for heavy-tailed search procedures (20 processors)
(Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 0 DF / 20 BB and 20 DF / 0 BB)
227 Portfolio for heavy-tailed search procedures (2-20 processors)
228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.
229 Summary of Randomization
- Considered randomized backtrack search.
- Showed Heavy-Tailed Distributions.
- Suggests Rapid Restart Strategy:
  - cuts very long runs
  - exploits ultra-short runs