Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems. Carla P. Gomes, Cornell University, gomes@cs.cornell.edu; Ken McAloon and Carol Tretkoff, ILOG, {mcaloon,tretkoff}@ilog.com

Learn more at: http://www.cs.cornell.edu

1
Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems
Carla P. Gomes, Cornell University, gomes@cs.cornell.edu
Ken McAloon and Carol Tretkoff, ILOG, {mcaloon,tretkoff}@ilog.com

2
AI, OR, and CS
AI
OR
CS
3
Integration of Artificial Intelligence and Operations Research Techniques

AI
Representations: Constraint Languages, Logic Formalisms, Object-Oriented Prog., Bayesian Nets, Rule Based Systems
Tools: Constraint Propagation, Systematic Search, Stochastic Search
Pros / Cons: Rich Representations, Computational Complexity
OR
Representations: Mathematical Modeling Languages, Linear and Non-linear (In)Equalities
Tools: Linear Programming, Mixed-Integer Prog., Non-linear Models
Pros / Cons: More Tractable (LP), Primarily Complete Info, Limited Representations
Combinatorial Problems: Planning, Scheduling
THE CHALLENGE: AI + OR
UNIFY APPROACHES TO: SCALE UP SOLUTIONS, HANDLE UNCERTAINTY, ANALYZE COMPLEXITY (phase transition), EXPLOIT PROBLEM STRUCTURE, INCREASE ROBUSTNESS
4
Outline

I. Short Overview of OR
II. Disjunctive Programming and Hybrid Solvers
III. Exploiting Randomization to Solve Hard
Combinatorial Problems
IV. Conclusions

5
I. Short OR Overview

6
Outline for Linear Programming and Integer
Programming

Standard Form of LP and a Simple Example
Geometric Interpretation of LP
Complexity issues
MIP
Example Fast Food
Example Capacitated Warehouse
Example 911

7
Outline

1. Short Overview of OR
2. Constraint Programming
3. Cooperating Solvers
4. Disjunctive Programming
5. Exploiting Randomization to Solve Hard
Combinatorial Problems
6. Conclusions

8
Optimization Technology Evolution
(Timeline figure with marks at 1947, 1960, 1970, 1980, 1990 and 1998. Milestones shown: Shifting Bottleneck, Primal Simplex LP, Interior Point, Dispatch Rules, CPM / PERT, First CP Systems, Dual Simplex Implementation, Constraint Propagation, SA / GA / Tabu, Constraint-based Scheduling, Dual Simplex, Barrier LP, Barrier Crossover, Cooperating Solvers (LP/CP), Global constraints, MIP, Concurrent Scheduling, Parallel LP/MIP, Large IPs.)
9
1. Short OR Overview

10
Outline for Linear Programming and Integer
Programming

Standard Form of LP and a Simple Example
Geometric Interpretation of LP
Complexity issues
MIP
Example Fast Food
Example Capacitated Warehouse
Example 911

11
An LP Story

A factory can produce n products from m parts
For product j it needs $a_{ij}$ units of part i
There are $b_i$ units of part i available
Each unit of product j sold earns $c_j$
Amount of each product to make is unknown: $x_j \ge 0$
Each part i determines a constraint
$a_{i1} x_1 + \dots + a_{in} x_n \le b_i$
Obvious solution: do nothing
Better: maximize $c_1 x_1 + \dots + c_n x_n$

12
Standard Forms of LP

A linear program (LP) in standard form
(Dantzig 1947)
max $c^T x$
subject to $Ax \le b$
$x \ge 0$
Input data: c (n x 1), A (m x n), b (m x 1).
Variables: x (n x 1)

13
Standard Forms of LP

// The objective function
max $c_1 x_1 + \dots + c_n x_n$
// The constraints
subject to
$a_{11} x_1 + \dots + a_{1n} x_n \le b_1$
...
$a_{m1} x_1 + \dots + a_{mn} x_n \le b_m$
$x_1 \ge 0, \dots, x_n \ge 0$

14
Standard Forms of LP

In OR emphasis is on optimality
Solution means optimal solution
Feasible solution means solution in the ordinary
sense

15
Standard Forms of LP

Interpretation of standard form
$x_j$: amount of product j to make
$c_j$: revenue per unit of product j
$b_i$: available amount of component i
$a_{ij}$: units of i used per unit of j produced
The constraints say
$\sum_j a_{ij} x_j = \sum_j (\text{units of } i \text{ used by } j) = \text{units of } i \text{ used} \le b_i$

16
What are models?

A model is a data-independent abstraction of a problem
A model lets you write down the mathematical representation of a problem independently of the data

One Problem Instance
17
Products Could be Jewelry
Products: Rings and Earrings
Components: Gold and Diamonds
One ring requires 3 units of Gold and 1 Diamond
One set of earrings requires 2 units of Gold and 2 Diamonds
Total Gold and Diamonds are limited
Profit is different for Rings than for Earrings
Products = { rings, earrings }
Components = { Gold, Diamonds }
demand = [ [3, 1], [2, 2] ]
stock = [ 150, 180 ]
profit = [ 60, 40 ]
18
Products Could be Chemicals
Products: Ammonium Gas = NH3, Ammonium Chloride = NH4Cl
Components: Nitrogen, Hydrogen, Chlorine
One unit of Gas requires 1 unit of Nitrogen and 3 units of Hydrogen
One unit of Chloride requires 1 unit of Nitrogen, 4 units of Hydrogen and 1 unit of Chlorine
Total Nitrogen, Hydrogen, Chlorine is limited
Profit is different for Gas than for Chloride
Products = { gas, chloride }
Components = { nitrogen, hydrogen, chlorine }
demand = [ [1, 3, 0], [1, 4, 1] ]
stock = [ 50, 180, 40 ]
profit = [ 30, 40 ]
19
The Problems Have One Model
enum Products ...;
enum Components ...;
float demand[Products, Components] = ...;
float profit[Products] = ...;
float stock[Components] = ...;
var float+ production[Products];
maximize
  sum (p in Products) profit[p] * production[p]
subject to
  forall (c in Components)
    sum (p in Products) demand[p, c] * production[p] <= stock[c];
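As a rough illustration of one model serving both data sets, the following Python sketch (not OPL, and not part of the original slides) uses a single function as the model and passes the jewelry and chemicals data as two instances. Because each instance has only two products, it finds the optimum by checking every vertex of the feasible polygon instead of running simplex. The function name `solve_product_mix` and the vertex-enumeration shortcut are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations


def solve_product_mix(demand, stock, profit):
    """Maximize total profit for exactly two products.

    demand[p][c] is the amount of component c used per unit of product p.
    Returns (best_profit, (x0, x1)), found by checking every vertex of the
    feasible polygon, which suffices for a bounded two-variable LP.
    """
    # Each row (a0, a1, rhs) encodes a0*x0 + a1*x1 <= rhs.
    rows = [(demand[0][c], demand[1][c], stock[c]) for c in range(len(stock))]
    rows += [(-1, 0, 0), (0, -1, 0)]  # x0 >= 0 and x1 >= 0
    best = None
    for (a0, a1, r), (b0, b1, s) in combinations(rows, 2):
        det = a0 * b1 - a1 * b0
        if det == 0:
            continue  # parallel lines meet in no vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x = (Fraction(r * b1 - a1 * s, det), Fraction(a0 * s - b0 * r, det))
        if all(p * x[0] + q * x[1] <= rhs for p, q, rhs in rows):
            value = profit[0] * x[0] + profit[1] * x[1]
            if best is None or value > best[0]:
                best = (value, x)
    return best


# Same model, two data sets taken from the slides.
jewelry = solve_product_mix([[3, 1], [2, 2]], [150, 180], [60, 40])
chemicals = solve_product_mix([[1, 3, 0], [1, 4, 1]], [50, 180, 40], [30, 40])
print(jewelry[0], chemicals[0])
```

In both instances the profit vector is proportional to one of the constraint rows, so several vertices tie for the optimum; only the optimal value is unambiguous.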
20
OR Modeling Systems

OPL
AMPL
2LP
AIMMS
GAMS
MPL
ILOG Planner
etc

21
The Dual

The dual linear program (von Neumann 1947)
min $y^T b$
subject to $y^T A \ge c^T$
$y \ge 0$
Variables: y (m x 1)
Awesome Symmetry:
The dual of the dual is the primal

22
Rows and Columns Exchanged

min $b_1 y_1 + \dots + b_m y_m$
subject to
$a_{11} y_1 + \dots + a_{m1} y_m \ge c_1$
...
$a_{1n} y_1 + \dots + a_{mn} y_m \ge c_n$
$y_1 \ge 0, \dots, y_m \ge 0$

23
Duality Theorem

Theorem: $\min y^T b = \max c^T x$
Consequence: This turns the optimality problem into a feasibility problem in x and y
$Ax \le b$
$x \ge 0$
$y^T A \ge c^T$
$y \ge 0$
$y^T b = c^T x$
Consequence: Enumeration is not needed to verify optimality
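The feasibility system above can be checked mechanically. The sketch below is a minimal Python illustration, not taken from the slides: `is_optimal_pair` is a hypothetical helper, the matrix is the chemicals instance written with one row per component, and the dual values y = (0, 10, 0) were worked out by hand for this example.

```python
def is_optimal_pair(A, b, c, x, y):
    """Return True if x and y satisfy the primal-dual feasibility system.

    A is m x n, b has length m, c has length n. Passing this check certifies
    that x is optimal for max c.x s.t. Ax <= b, x >= 0, with no enumeration.
    """
    m, n = len(b), len(c)
    primal_feasible = all(v >= 0 for v in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m))
    dual_feasible = all(v >= 0 for v in y) and all(
        sum(y[i] * A[i][j] for i in range(m)) >= c[j] for j in range(n))
    equal_objectives = (sum(y[i] * b[i] for i in range(m))
                        == sum(c[j] * x[j] for j in range(n)))
    return primal_feasible and dual_feasible and equal_objectives


# Rows: nitrogen, hydrogen, chlorine. Columns: gas, chloride.
A = [[1, 1], [3, 4], [0, 1]]
b = [50, 180, 40]
c = [30, 40]
certified = is_optimal_pair(A, b, c, x=[20, 30], y=[0, 10, 0])
not_certified = is_optimal_pair(A, b, c, x=[50, 0], y=[0, 10, 0])
print(certified, not_certified)
```

The second call uses a feasible but suboptimal production plan, so the objective values differ and the certificate is rejected.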

24
Duality Theorem

Sensitivity Analysis
Consequence: The solution values for the y variables yield the Lagrange multipliers of the primal constraints, which measure the rate of change of the objective function with respect to the right-hand-side bounds b
$y_i = \partial Z / \partial b_i$ where Z is the optimum
Reference: McAloon and Tretkoff 1996, Wiley

25
Duality

Two different views of the same phenomenon

Point vs Set
Arc vs Node
Momentum vs Position
Vector vs Hyperplane
Landlord vs Renter
26
Simplex and Barrier

The simplex algorithm turns the feasibility
problem into an iterative repair process with a
powerful evaluation function
The barrier method transforms the LP into a
system of differential equations that describe a
vector field of flow on the polytope

27
Geometric Interpretation of LP

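Max X
subject to
$-X + Y \le 4$
$X + 4Y \le 36$
$2X + Y \le 23$
$X + Y \ge 4$
$Y \ge X - 10$
(Figure: the feasible region in the X-Y plane, with points labeled (0,4), (4,8), (10,3), (8,0) and (4,0), and paths labeled Simplex and Barrier.)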
28
Complexity of Linear Programming

Simplex Method
Worst-case --- exponential (Klee and Minty 72)
Practice --- good performance
Ellipsoid Method
Khachiyan's Ellipsoid Method
Worst-case --- polynomial
Practice --- poor performance

29
Complexity of Linear Programming

Interior Point Methods or Barrier Methods
Karmarkar's Method (and variants)
Worst-case --- polynomial
Practice --- good performance

30
Complexity of Linear Programming

Despite its worst case exponential time
complexity, the simplex method is usually the
method of choice since it provides tools for
sensitivity analysis and its performance is very
competitive in practice.
Which method performs best is problem dependent.

31
Success Stories

Industrial Planning
Given current resources, decide what to produce
in what quantity
Supply Chain Management
Multiperiod planning models that link flow from
one period to the next
Network Flow
How best to route goods across a network

32
Assumptions of Linear Programming

Linearity
when violated ( xy 50)
Nonlinear programming
Continuity
when violated (x integral)
(Mixed) Integer programming

33
Assumptions of Linear Programming - continued

No Disjunctive Constraints
when violated ($x \ge 100$ or $x = 0$)
Disjunctive programming
Additional 0-1 variables and Big M
constraints
Certainty
when violated (cost c is a random variable)
Stochastic programming

34
Search and MIP

In order to deal with variables that must have
integer values in the solution, a search must be
performed.
Mixed Integer Programming problems are
combinatorial optimization problems and are
NP-hard:
feasibility is NP-Complete
verifying optimality is co-NP-Complete

35
MIP and Combinatorial Optimization

These problems have been attacked by both the AI
and OR communities.
In AI, these problems are attacked as CSPs or as
Planning Problems.
In OR, they are done as MIPs and use linear
relaxation to help guide the search.
The overriding idea in each case is to limit
search.

36
Integer Program All Integer Points in Region
37
Cut to Create Integer Vertex
Integer Vertex
38
Example - Fast Food

Question Is it possible for a male college
student to eat at the local fast food outlet and
still meet the requirements of a balanced diet?
If so, what is the least he can do it for?

39
Nutritional Requirements

At least 100% of vitamins A, C, B1, B2, niacin,
calcium and iron
At least 55 grams of protein
At most 3000 milligrams of sodium
At most 30% of the calories can come from fat
Nutritional information is available from fast
food outlets

40
College Student's Requirements

At least 2000 calories a day
No more than 3 servings of any one food
Milk only with cereal and not as a stand-alone
drink

41
Fast Food - MIP Model

We will have variables $Serv_k$ to represent the
number of servings of item k in the plan.
The variable $Serv_k$ will have to take an integer
value for the solution to be valid.
The objective function Z for cost

42
Fast Food - MIP Model

Let $food_{k,j}$ represent the percent of RDA of
nutrient j in a serving of item k
Then for each nutrient j, we have a constraint
$\sum_k food_{k,j} \, Serv_k \ge 100$

43
Fast Food - MIP Model

Let $sodium_k$ represent the amount of salt in a
serving of item k
For salt we have the constraint
$\sum_k sodium_k \, Serv_k \le 3000$
Similarly for fat

44
Fast Food - MIP Model

Let $cost_k$ represent the cost of a serving of item
k
For the objective function we have the defining
constraint
$\sum_k cost_k \, Serv_k = Z$

45
Fast Food - Solution

With a MIP solver and a way to input these
constraints we ask for
a solution that makes the variables $Serv_k$
integral
and which minimizes Z

46
MIP Solution Technique

What the MIP solver does is to carry out a branch
and bound search guided by
the linear relaxation
the solution to the problem with the integrality
requirements relaxed
Initialize the global variable best_so_far to
1000 (or something else very big).

47
At a Node

Compute a solution to the linear relaxation which
minimizes Z, yielding z. Prune this node if
$z \ge$ best_so_far.
If all values of $Serv_k$ are integral, this is a
solution. Set best_so_far = z. Save this node.

48
Branching at a node

Choose a variable $Serv_k$ whose value s is not
integral.
Typical heuristic: most non-integral variable
Create two child nodes,
add $Serv_k \le$ floor(s)
add $Serv_k \ge$ ceil(s)
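The node and branching rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the solver used in the talk: the instance is a hypothetical two-food diet invented for the example, the data must be integers, and the linear relaxation is solved by enumerating vertices (workable only for two variables and a bounded region) as a stand-in for a real LP solver.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil, floor


def lp_min(cost, rows):
    """Solve the two-variable relaxation by vertex enumeration.

    rows holds integer triples (a0, a1, rhs) meaning a0*x0 + a1*x1 <= rhs.
    Returns (z, x) for the cheapest vertex, or None if the node is infeasible.
    """
    best = None
    for (a0, a1, r), (b0, b1, s) in combinations(rows, 2):
        det = a0 * b1 - a1 * b0
        if det == 0:
            continue
        x = (Fraction(r * b1 - a1 * s, det), Fraction(a0 * s - b0 * r, det))
        if all(p * x[0] + q * x[1] <= rhs for p, q, rhs in rows):
            z = cost[0] * x[0] + cost[1] * x[1]
            if best is None or z < best[0]:
                best = (z, x)
    return best


def branch_and_bound(cost, rows):
    best_so_far, best_x = None, None
    open_nodes = [list(rows)]
    while open_nodes:
        node = open_nodes.pop()
        relaxed = lp_min(cost, node)
        if relaxed is None:
            continue                    # infeasible node
        z, x = relaxed
        if best_so_far is not None and z >= best_so_far:
            continue                    # prune: cannot beat the incumbent
        fractional = [k for k in (0, 1) if x[k].denominator != 1]
        if not fractional:
            best_so_far, best_x = z, x  # integral: new incumbent
            continue
        # Most non-integral variable: fractional part closest to 1/2.
        k = min(fractional,
                key=lambda v: abs(x[v] - floor(x[v]) - Fraction(1, 2)))
        unit = (1, 0) if k == 0 else (0, 1)
        open_nodes.append(node + [(unit[0], unit[1], floor(x[k]))])    # x_k <= floor(s)
        open_nodes.append(node + [(-unit[0], -unit[1], -ceil(x[k]))])  # x_k >= ceil(s)
    return best_so_far, best_x


# Hypothetical toy diet: 2*x0 + x1 >= 7, x0 + 3*x1 >= 9, 0 <= x0, x1 <= 3.
toy_rows = [(-2, -1, -7), (-1, -3, -9), (1, 0, 3), (0, 1, 3), (-1, 0, 0), (0, -1, 0)]
cost, servings = branch_and_bound((3, 2), toy_rows)
print(cost, servings)
```

Here the root relaxation is fractional, one child yields an integral incumbent, and the other child is either pruned by the bound or superseded, depending on the order in which nodes are explored.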

49
Good News

The linear relaxation can prune nodes before all
variables $Serv_k$ are forced to be integral.
Surprisingly often a node high in the tree will
turn up with all relevant variables integer.
Here's why:
A solution to the LP is at a vertex
A vertex is defined as the simultaneous solution
of the equality form of n linearly independent
constraints
Many of these constraints are integer bounding
constraints, yielding X integer

50
Arboreally Speaking

Best-first search is often preferred - it
visits the smallest number of nodes needed to
find and verify the optimal solution - analogous
to A*
If the linear relaxation is tight
($|z_{linear} - z_{integral}|$ is relatively small)
then $z_{linear}$ is an excellent evaluation
function

51
Answer - Fast Food

Total cost is 8.71
Buy 3 burgers
Buy 2 fries
Buy 3 honeys
Buy 1 yogurt
...

52
Example - Fixed Cost

Warehouses must be rented in order to supply
stores and we must decide which to use
For each store j we know its monthly demand $d_j$
For each warehouse i we know its capacity $k_i$
For each warehouse i we know the fixed cost to
run it each month, $fc_i$
For each pair i, j we know the monthly cost $c_{ij}$
of supplying j from i

53
Example - Fixed Cost

$X_{ij}$ is the fraction of store j's demand met by i
$X_{ij} \le 1$
$Y_i$ is a fuzzy boolean
it will be 1 if the warehouse is rented
0 if it is not rented
$Y_i \le 1$

54
Example - Fixed Cost

Each store must be supplied
$\sum_i X_{ij} = 1$
Warehouse capacity cannot be exceeded
$\sum_j d_j X_{ij} \le k_i$
Tighter
$\sum_j d_j X_{ij} \le k_i Y_i$

55
Example - Fixed Cost

Objective function
$\sum_i fc_i Y_i + \sum_i \sum_j c_{ij} X_{ij}$
This yields a MIP with 0-1 variables $Y_i$

56
Branch and Cut An Enhanced Solution Method

Cuts - redundant constraints for the MIP model
but not redundant for the linear relaxation
$X_{ij} \le Y_i$
Add at a node if violated by the solution to the linear
relaxation
Powerful method - will solve the Imperial College
OR-Library CW problems very easily

57
Example - Call 911

PCTs answer the phone 24 hours a day, 7 days a
week.
It is known how many PCTs should be on duty
during each of the 168 hours during the week in
order to assure the necessary response rate.
Workers can arrive at any hour and they work for
8 hours except for a one hour break after 4 hours.

58
Example - Call 911

Each PCT has a work week of 5 days followed by 2
days off.
Want to meet the demand with minimal or
near-minimal number of PCTs.
So need to determine how many PCTs start their
work week at each hour h of the week

59
Modeling 911

A continuous variable $Pct_h$ will represent the
number of workers who start their work week at
hour h, $0 \le h < 168$.

60
Modeling 911

A continuous variable Z will represent the
objective function
$\sum_h Pct_h = Z$
There will be a constraint for each hour h to
assert that there are enough workers on duty at
that time. The rhs of this constraint is $b_h$, the
number of workers needed.

61
Modeling 911

For this constraint we need to represent the
number of workers who are on duty at time h
Certainly, those who start the week at time h are
here, as are those who started the week at time h
- 1
And so on back to time h - 7 with the exception
of those who started at time h - 4 and who are
now on break.

62
Modeling 911

This also applies to the previous 4 days. When
the smoke clears, we sum over the workers w who
are working at time h
$\sum_w Pct_w \ge b_h$
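The set of start hours w that contribute to the constraint for hour h can be computed directly from the shift rules. The Python sketch below is an illustration only; the helper names `starts_on_duty` and `covers` are invented for this example, and hours are numbered 0 to 167 with wrap-around across the week.

```python
def starts_on_duty(h):
    """Return the start-of-week hours w whose workers are on duty at hour h."""
    on_duty = set()
    for day in range(5):             # five consecutive working days
        for offset in range(8):      # eight hours from arrival to departure
            if offset == 4:          # the one-hour break after four hours
                continue
            on_duty.add((h - (day * 24 + offset)) % 168)
    return on_duty


def covers(pct, demand):
    """True if start counts pct[w] meet demand[h] at every hour h of the week."""
    return all(sum(pct[w] for w in starts_on_duty(h)) >= demand[h]
               for h in range(168))


print(len(starts_on_duty(0)))
```

Each hourly constraint therefore involves 35 of the 168 variables: 7 working hours on each of 5 days.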

63
Call 911 solved with progressive roundoff

int b[168] = { // New York City 911
  30,24,18,15,14,14,15,25,34,36,38,40,
  41,43,46,57,57,59,61,59,55,50,45,38,
  32,25,20,17,15,13,17,25,32,35,38,40,
  42,43,47,58,57,57,59,57,55,52,47,41,
  33,25,20,17,15,13,15,25,32,33,37,39,
  42,43,47,57,56,57,57,56,53,50,47,41,
  34,27,22,19,16,15,16,25,31,35,37,40,
  44,45,48,57,57,56,58,56,53,53,46,41,
  34,28,23,19,16,15,17,25,33,37,39,42,
  45,47,51,59,58,60,61,61,57,56,57,55,
  48,41,35,30,26,20,18,22,26,32,42,46,
  49,53,54,56,56,56,59,59,57,57,56,56,
  52,46,41,34,29,23,18,19,25,31,36,41,
  46,50,52,53,52,53,54,53,50,49,45,40
};

64
Modeling 911

Subject to these constraints we want to find a
solution which makes the $Pct_h$ integer and which
makes Z small.
The naïve approach is to compute the minimal
linear solution and to round up all the values of
$Pct_h$ to the nearest integer.
The linear relaxation yields Z = 204.67 fuzzy
workers, but rounding yields a mediocre integral
solution of 259 workers.

65
Modeling 911

For this and many other applications, heuristics
can be used to develop good solutions
Progressive Roundoff - solve the linear
relaxation, round up first variable and freeze
it, re-solve etc.

66
Solving the Integer Problem

main() { // Planner Code
  IlcInitFloat();
  IlcManager m(IlcNoEdit);
  IlcLinOpt simplex(m);
  IlcFloatVarArray Pct(m,168,0,1000);
  IlcFloatArray coeffs(m,168);
  int i,j,k,h,n;

67
Solving the Integer Problem

  // sum over w of Pct[w] >= b[h]
  for(h=0;h<168;h++) {            // for each hour of 168 in week
    for(j=0;j<168;j++)
      coeffs[j] = 0;
    for(k=0;k<5;k++)              // for each of 5 days
      for(j=k*24;j<k*24+8;j++)    // for each of 8 hours
        if (j!=(k*24+4))          // skip the break hour
          coeffs[(h+168-j)%168] = 1;
    simplex.add(IlcScalProd(coeffs,Pct) >= b[h]);
  }

68
Solving the Integer Linear Problem

  IlcFloatVar Z = IlcSum(Pct); // Objective
  simplex.setObjMin(Z);
  for(i=0;i<168;i++) {         // Progressive roundoff
    n = ceil(simplex.getCurrentValue(Pct[i]));
    // Fix variable and re-optimize
    simplex.add(Pct[i] == n);
  }
  m.out() << "Number of Pcts needed is " << Z << endl;
  m.end();
}

69
Solution

This code finds a solution with 208 workers in a
couple of seconds. The optimum is 207.
The heuristic works well in part because if there
were no lunch breaks, it would find the
guaranteed optimal solution
(Bartholdi, Ratliff, Orlin)

70
2. Constraint Programming
71
LP/MIP is Beautiful, except when

Variable domain information is important to the
search strategy
especially critical in scheduling

The problem variables range over symbolic
entities and there are lots of symmetries
timetabling

The MIP representation can be too verbose or
awkward
configuration

There are just too many constraints
e.g. vehicle routing

72
Mathematical Basis of Constraint Programming (CP)

The Constraint Satisfaction Problem
Suppose a finite set of variables is given and
with each variable is associated a non-empty
finite domain.
A constraint on k variables $X_1, \dots, X_k$ is a relation
$R(X_1, \dots, X_k) \subseteq D_1 \times \dots \times D_k$.
A constraint satisfaction problem (CSP) is given
by a finite set of constraints.
A solution to a CSP is an assignment of values to
all the variables so that the constraints are
satisfied.

73
Domain Reduction

In CP, each constraint of a CSP is considered as
a subproblem and techniques are developed for
handling frequently encountered constraints.
With each constraint is associated a domain
reduction algorithm which reduces the domains of
the variables that occur in the constraint.
Accelerates convergence toward a solution
Detects infeasibility

74
Constraint Propagation

The other key issue is communication among the
constraints or subproblems.
The basic method used is called constraint
propagation which links the constraints through
their shared variables.
The important thing about this setup is that it
is very modular and independent of the particular
structure of the individual constraints.
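A minimal Python sketch of this setup follows, assuming binary constraints given as predicates; it is an AC-3 style loop written for illustration, not the engine used in ILOG Solver, and the names `revise` and `propagate` are hypothetical. Each constraint only knows how to reduce the domains of its own two variables, and a queue re-wakes the constraints that share a variable whose domain changed.

```python
from collections import deque


def revise(domains, x, y, allowed):
    """Domain reduction for one constraint: drop values of x with no support in y."""
    keep = {vx for vx in domains[x] if any(allowed(vx, vy) for vy in domains[y])}
    changed = keep != domains[x]
    domains[x] = keep
    return changed


def propagate(domains, constraints):
    """Run domain reduction to a fixpoint.

    constraints is a list of (x, y, allowed) with allowed(value_x, value_y).
    Returns False as soon as some domain becomes empty (infeasibility detected).
    """
    arcs = []
    for x, y, allowed in constraints:
        arcs.append((x, y, allowed))
        # Same constraint seen from y's side, with arguments swapped.
        arcs.append((y, x, lambda vy, vx, f=allowed: f(vx, vy)))
    queue = deque(arcs)
    while queue:
        x, y, allowed = queue.popleft()
        if revise(domains, x, y, allowed):
            if not domains[x]:
                return False
            # Wake the other constraints that share variable x.
            queue.extend(arc for arc in arcs if arc[1] == x and arc[0] != y)
    return True


less_than = lambda a, b: a < b
chain = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
chain_ok = propagate(chain, [("X", "Y", less_than), ("Y", "Z", less_than)])
print(chain_ok, chain)
```

In this small chain propagation alone fixes every variable; in general it only shrinks domains, and search is still needed.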

75
Monsieur Jourdain Phenomenon

Like prose, you have been doing constraint
propagation all your life.
Crossword puzzles
Incomplete and so backtracking is needed
NY Times Sunday Crossword
Optical Illusions
Origin Vision analysis (Marr,Waltz et al)

76
Strengths of Constraint Programming

Constraint Programming provides a rich
representation language.
CP variables naturally represent problem entities
and the constraints do not have to be translated
into a specific problem format such as MIP or
SAT.
Opportunity to choose a good heuristic for the
solution strategy.

77
Which Method for Which App?
Application: Technology
Product Mix: LP
Production Planning: MIP
Distribution Planning: MIP
Scheduling: Constraint Based Scheduling
Dispatching: CP + Local search
Configuration: CP
Going down the list: Linear => Disjunctive Constraints, Strategic => Operational Optimization
78
3. Cooperating Solvers

79
First Stop

CP/CP

80
Mother of All Examples - N Queens

Do we think in terms of queens
Where do we place this queen ?

Do we think in terms of squares
Will this square contain a queen ?

These views are dual to one other

81
The Primal View

For each queen assign it a square

Place this queen in this square ?
82
The Dual View

For each square decide whether it will have a
queen

Place a queen in this square ?
83
The Primal Model

In which row do we place $q_j$ - the queen in
column j

The constraints: $q_i \ne q_j$, $q_i - q_j \ne i - j$, $q_i - q_j \ne j - i$
Note: no alldifferent constraint
84
Yet Another duality - rows vs columns

In which column do we place $qq_i$ - the queen in
row i

The constraints are the same: $qq_i \ne qq_j$, $qq_i - qq_j \ne i - j$, $qq_i - qq_j \ne j - i$
85
The Relationship

Can link them as inverse functions
$q_{qq_i} = i$
$qq_{q_j} = j$

The constraint propagation: i leaves the domain of
$q_j$ iff j leaves the domain of $qq_i$
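The two views and their inverse relationship can be sketched in plain Python. This is a simple backtracking enumeration written for illustration, with hypothetical names `solve_queens` and `dual_view`; it does not reproduce the first-fail search or the propagation between the two models described in the slides.

```python
def solve_queens(n):
    """Yield every solution q, where q[j] is the row of the queen in column j."""
    def extend(q):
        j = len(q)
        if j == n:
            yield tuple(q)
            return
        for row in range(n):
            # The three primal constraints against every earlier column i.
            if all(q[i] != row and q[i] - row != i - j and q[i] - row != j - i
                   for i in range(j)):
                q.append(row)
                yield from extend(q)
                q.pop()
    yield from extend([])


def dual_view(q):
    """Return qq, where qq[i] is the column of the queen in row i."""
    qq = [None] * len(q)
    for column, row in enumerate(q):
        qq[row] = column
    return qq


q = next(solve_queens(8))
qq = dual_view(q)
print(q, qq)
```

For any solution, `qq[q[j]] == j` and `q[qq[i]] == i` hold for all columns j and rows i, which is the inverse-function link between the two models.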
86
In this primal/dual model

Apply first-fail to qi

Lo and behold
one-third fewer fails
(Example from Jean Jordan's thesis)

87
(No Transcript)
88
(Figure: chessboard with 2 queens placed, marked Q, and other squares marked X.)
89
(Figure: chessboard with 3 queens placed, marked Q, and other squares marked X.)
90
(Figure: chessboard with 5 queens placed, marked Q, and other squares marked X.)
91
So

The cooperating primal-dual formulation
captured the generalized arc consistency of the
alldifferent constraint

The arc consistency of this global constraint is
non-trivial to maintain
Network flow algorithms: flow goes from values to
variables; each variable has unit demand and
capacity
92
Remarks

An IP model will encode the first dual solution
Will this square contain a queen?
$x_{ij} = 0$ or $x_{ij} = 1$

A disaster beyond 30 queens
network structure on rows and columns lost

Another example - sports scheduling

93
Constraints and Indices

In IP, symbols are represented by indices as
opposed to values for variables.
nurses, teams
Paris-St-Germain plays Manchester United on day k
$x_{ijk}$ is a 0-1 variable that represents team i playing team j on
day k
You can't put symmetry breaking and other
constraints on indices.

94
Second Stop

CP/IP

95
CP Is Powerful, But ...

Sometimes, inconsistencies can be overlooked
X - Y ? 12
X Y ? 10
X in 1..20
Y in 1..20
Domain reduction on each constraint and
constraint propagation will not reduce the
domains although the system has no linear
solution
but an LP solver would spot this

96
2 Dimensional Bin Packing

Application for the Automobile Industry built by
Greg Glockner

97
2 Dimensional Bin Packing

The problem here is to put as many small
rectangles in a big rectangle with 90 degree
rotation allowed.
The actual application involves circuit boards
There are two complete models, one a CP model and
the other an IP model.
The CP model directs the search
The LP relaxation prunes the search space by
detecting infeasible nodes

98
2-D Bin Packing

Arrange circuit boards onto raw material
Boards may be rotated
Use same number of each board
Objective minimize scrap

Classic combinatorial optimization problem
99
Solving 2-D Bin Packing

Use CP to generate partial solutions (nodes)
Restrict placement to reduce fragmentation of
blank space
Use tight LP to test feasibility
If any partial solution is infeasible in the LP,
prune the tree immediately

CP constraints reduce the tree width LP allows us
to prune quickly
100
2 Dimensional Bin Packing

As the search tree is traversed, the two models
are in sync.
Note that the variables used in the 2 models are
disjoint
The two models are dual to each other
The IP sees the model from the point of view of
the board, the large rectangle
The CP sees the model from the point of view of
the small rectangles
Solutions are obtained in minutes

101
2DBP Basic CP Formulation

Let $(x_i, y_i)$ be the location and $(w_i, h_i)$ be the
dimensions of the ith tile
Basic constraints
Disjunctive constraints to prevent overlapping
tiles:
$x_i + w_i \le x_j \;\lor\; y_i + h_i \le y_j \;\lor\; x_j + w_j \le x_i \;\lor\; y_j + h_j \le y_i$
Constraints to count the number of each tile type
Tile-oriented formulation
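The four-way disjunction can be read as a simple predicate on a pair of tiles. The Python sketch below is an illustration only; the tuple layout (x, y, w, h) and the helper names are assumptions made for this example.

```python
from itertools import combinations


def no_overlap(a, b):
    """True if tiles a and b, each given as (x, y, w, h), do not overlap.

    One disjunct per relative position: a left of b, a below b,
    b left of a, b below a.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax + aw <= bx or ay + ah <= by or bx + bw <= ax or by + bh <= ay


def placement_is_valid(tiles):
    """Check the disjunctive constraint for every pair of tiles."""
    return all(no_overlap(a, b) for a, b in combinations(tiles, 2))


print(placement_is_valid([(0, 0, 2, 2), (2, 0, 1, 1), (0, 2, 3, 1)]))
```

Tiles that merely touch along an edge satisfy one of the inequalities with equality and are therefore allowed.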

102
2DBP Basic IP Formulation

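Let $x_{ijnt} = 1$ if tile n of type t is in position
(i, j)
The constraints are
(Equations lost in extraction. The surviving fragments show sums of $x_{ijnt}$ with right-hand side 1, quantified over positions (i, j) and tiles (n, t); index ranges bounded by the tile width $w_t$ and height $h_t$; and the domain $x_{ijnt} \in \{0, 1\}$ for all i, j, n, t.)
Grid-oriented formulation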
103
2DBP LP Issues

The LP is large
The LP exhibits significant primal degeneracy
The LP exhibits significant dual degeneracy

104
2DBP LP Issues

The simplex algorithm cannot solve the LP
There is no way for a MIP solver to solve the IP
as such
The barrier method can solve the LP

105
2DBP Summary

CP as master problem
Orders tiles
Places tiles by position, then type
Selects tile type by frequency to scatter tiles
throughout the bin
Uses a one-ply lookahead constraint to limit the
position of following tile
LP relaxation prunes the CP search space
Checks whether the partial solution will lead to
an infeasible instance
Use idiomatic formulations for CP and IP

106
2DBP Remarks

The CP fixes significant numbers of variables at
each node
The LP pre-processor greatly simplifies the LP
Therefore, the lack of incrementality of the
barrier method does not cost us

107
2DBP Cooperative Algorithm Demo
108
Last Stop

Constraint Programming and Local Search
cooperation

Another example of duality in action

109
CP/LS

Parallel machines with set-up times

Ready times
Due dates
Splittable jobs
Rogue machines
Objectives: meet due dates, minimize setup costs
110
Two Phase Cooperation

Phase I - the Primal (Work on first objective)
Configure and schedule the jobs
Use constraint based scheduling

111
Two Phase Cooperation
Machines morph into trucks
112
Two Phase Cooperation

Phase 2 - the Dual (Work on second objective)
Schedule the trucks
Use Lin, Lin-Kernighan, tabu etc

113
Parallel Machines Cooperative Algorithm Demo
114
IC Park Example

Hoist Scheduling (Rodosek and Wallace)
The original model is an IP
The CP model is the same
CP guides search, LP relaxation and CP share
pruning duties
No apparent duality

115
Remarks

One can get great benefit with CP/IP algorithms
CP/CP algorithms and CP/LS algorithms
IP/LS is just around the corner

IP/IP cooperation is hard because one can't
formulate truly dual views
either simply not there
or too verbose

counterexamples welcome

116
4. DISJUNCTIVE PROGRAMMING

117
Disjunctive Linear Programming

An extension of Mixed Integer Programming
A union of polyhedral sets (feasible regions) is
called a disjunctive set.

118
Disjunctive Set
119
Disjunctive Linear Programming

The problem of determining whether the
intersection of a family of disjunctive sets is
non-empty is called the disjunctive linear
programming problem or simply disjunctive
programming problem.
The solution set of the disjunctive programming
problem is
$\bigcap_{i<M} \bigcup_{j<N} F_{ij}$

120
Disjunctive Linear Programming Examples

Semi-continuous variables
either $X \ge 100$
or $X = 0$
Rather than
$X \le BigM \cdot Y$,
$X \ge 100 \cdot Y$,
Y a 0-1 variable
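A small Python check makes the difference between the disjunction and its Big M encoding visible. The value 1000 for BigM is an arbitrary assumption for this example, and `satisfies_big_m` is a hypothetical helper.

```python
def satisfies_big_m(x, y, big_m=1000, threshold=100):
    """True if (x, y) satisfies threshold*y <= x <= big_m*y."""
    return threshold * y <= x <= big_m * y


def in_disjunctive_set(x, threshold=100):
    """True if x satisfies the original disjunction: x = 0 or x >= threshold."""
    return x == 0 or x >= threshold


# With Y restricted to 0 or 1 the encoding agrees with the disjunction
# (up to the artificial upper bound big_m).
integral_cases = [(0, 0), (100, 1), (650, 1)]
print([satisfies_big_m(x, y) for x, y in integral_cases])

# With Y relaxed to 0.5, X = 50 becomes feasible although it is not
# in the disjunctive set, which is why the relaxation is weak.
print(satisfies_big_m(50, 0.5), in_disjunctive_set(50))
```

Branching directly on the two alternatives avoids both the auxiliary variable and the choice of BigM.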

121
Solution Set Inside Initial Region
122
Disjunctive Linear Programming Examples -
continued

Bollapragada, Ghattas and Hooker
Truss structure design problem
Branches directly on alternatives dictated by
Hooke's Law
Wyatt
Disjunctive programming and mean absolute
deviation (MAD) models for portfolio
optimization
Extends Benders decomposition to disjunctive
linear programs

123
Disjunctive Linear Programming continued

Balas, Cornuejols and Ceria
Generating cuts for disjunctive programming
problems.
McAloon and Tretkoff
Basic mathematical results Optimization and
Computational Logic, Wiley

124
Disjunctive Linear Programming continued

Dealing with the disjunctive part requires
search.
This requires an engine which is not available in
MIP packages
Also the linear relaxation is not as tight and
the evaluation function is not as faithful
The solution is to use a CSP solver and an LP
based solver in tandem - cooperating solvers
Beringer and DeBacker for MIP

125
To Keep It Simple

GERALD + DONALD = ROBERT

An AI classic (Newell and Simon)
Assignment problem + 1 constraint
Surprisingly hard for MIP solvers: CPLEX MIP takes
1 minute and 29048 nodes (on Sun Enterprise) to
find a feasible integer solution
126
The Disjunctive Program

One constraint for the equation
$100000\,G + \dots + D = 100000\,R + \dots + T$
For each variable X among G, ..., T
$X = 0$ or $X = 1$ or ... or $X = 9$
For each pair X, Y
$X \le Y - 1$ or $Y \le X - 1$
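For reference, the puzzle itself can be solved with a short column-by-column backtracking search in Python. This is a plain search written for illustration, not the cooperating LP/CP method discussed in the slides; the helper names are hypothetical, and the leading letters G, D and R are assumed to be non-zero.

```python
# Columns of GERALD + DONALD = ROBERT, right to left: (top, bottom, result).
COLUMNS = list(zip("GERALD"[::-1], "DONALD"[::-1], "ROBERT"[::-1]))
LEADING = {"G", "D", "R"}  # assumed: leading letters may not be 0


def bind(assign, letter, digit):
    """Return an assignment extended with letter=digit, or None on a clash."""
    if letter in assign:
        return assign if assign[letter] == digit else None
    if digit in assign.values() or (digit == 0 and letter in LEADING):
        return None  # violates all-different or the leading-digit rule
    return {**assign, letter: digit}


def candidates(assign, letter):
    return [assign[letter]] if letter in assign else range(10)


def search(col=0, carry=0, assign=None):
    """Yield every digit assignment that makes the addition correct."""
    assign = {} if assign is None else assign
    if col == len(COLUMNS):
        if carry == 0:
            yield assign
        return
    top, bottom, result = COLUMNS[col]
    for a in candidates(assign, top):
        with_top = bind(assign, top, a)
        if with_top is None:
            continue
        for b in candidates(with_top, bottom):
            with_bottom = bind(with_top, bottom, b)
            if with_bottom is None:
                continue
            total = a + b + carry
            with_result = bind(with_bottom, result, total % 10)
            if with_result is not None:
                yield from search(col + 1, total // 10, with_result)


def value(word, assign):
    return int("".join(str(assign[ch]) for ch in word))


solutions = list(search())
first = solutions[0]
print(value("GERALD", first), "+", value("DONALD", first), "=", value("ROBERT", first))
```

The column-wise order lets the carry arithmetic prune most branches early, which is a different source of pruning from the linear relaxation and AllDifferent propagation used in the slides.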

127
Solution Set SOME of the Integer Points in the
Region
128
The Twin Variables for Cooperating Solvers

Integer variables for the letters
$0 \le g, e, r, a, l, d, o, n, b, t \le 9$
With continuous doppelgangers
$0 \le G, E, R, A, L, D, O, N, B, T \le 9$

129
The Variables

One multi-variable constraint on the continuous
doppelgangers posted to an LP solver and to the
CSP solver
$100000\,G + 10000\,E + 1000\,R + \dots + D$
$+\; 100000\,D + 10000\,O + 1000\,N + \dots + D$
$=\; 100000\,R + 10000\,O + 1000\,B + \dots + T$

130
The Variables

One CSP constraint on the integer variables
posted to a discrete constraint propagation
engine
AllDifferent(g, e, r, a, l, d, o, n, b, t)

131
The Search

Bounding information from the discrete variables
is passed to the continuous doppelgangers and
conversely
The branching strategy is guided by the linear
relaxation on the continuous variables
if there is a non-integral variable X, branch on
it
$X \le$ floor(X)
or
$X \ge$ ceil(X)

132
The Search

If the AllDifferent constraint, the initial
bounding constraints and the bounding constraints
from branching detect a contradiction on the
discrete variables, both sides backtrack
If the linear relaxation is made infeasible by
the bounding constraints that come from the
discrete computation or from branching, both
sides backtrack

133
The Search

New wrinkle
The solution to the linear relaxation might have
all variables integral - but the AllDifferent
constraint can be violated by this set of values
In this case, branch to keep them apart
either $X \le Y - 1$
or $Y \le X - 1$

134
The Variables

void main() {
  IlcInitFloat();
  IlcManager m(IlcNoEdit);
  IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9),
            A(m, 0, 9), L(m, 0, 9),
            G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9),
            T(m, 0, 9);
  IlcIntVarArray vars(m, 10, D, O, N, A, L, G, E,
                      R, B, T);
  // Continued on next slide

135
The Constraints

  m.add(IlcAllDiff(vars,IlcWhenValue));
  IlcLinOpt simplex(m);
  simplex.add(
    100000*R + 10000*O + 1000*B + 100*E + 10*R + T
    == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D
     + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D
    ,
    IlcTrue // Post to Solver as well
  );

136
The Search for solutions

  m.add(Generate(m,simplex,vars)); // Search strategy
  if (m.nextSolution())            // Find a solution
    m.out() << " solution found " << endl;
  m.printInformation();
  m.end();
}

137
Branch if a variable is non-integer

ILCGOAL2(Generate, IlcSimplex, simplex,
         IlcIntVarArray, vars) {
  IlcInt varIndex = MostNotInteger(vars, simplex);
  if (varIndex >= 0) // There is a non-integer variable
    return IlcAnd(IlcTryUpwardFirst(vars[varIndex],
                                    simplex), this);

138
Is integer relaxation a solution ?

  IlcManager m = getManager();
  if (m.solve(TestIntegerRelaxation(m,simplex)))
    return 0;

139
Find two variables with same value

  IlcInt j;
  for(i=0;i<vars.getSize()-1;i++) {
    if (vars[i].isBound()) continue; // Can't both be bound
    IlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));
    for(j=i+1;j<vars.getSize();j++) {
      IlcInt m = simplex.nearest(simplex.getCurrentValue(vars[j]));
      if (m == n) break;
    }
    if (j < vars.getSize()) break;
  }

140
Branch to push them apart

  // j and i are the indices of two variables
  // with same current value
  return
    IlcAnd( IlcOr(
              Smaller(m,vars[i],vars[j],simplex),
              Smaller(m,vars[j],vars[i],simplex)),
            this // Recursion
    );
}

141
Pushing two variables apart

ILCGOAL3(Smaller,IlcIntVar,x,IlcIntVar,y,IlcSimplex,simplex) {
  simplex.add(x <= y-1,IlcTrue);
  return 0;
}

142
Testing the integer relaxation

ILCGOAL1(TestIntegerRelaxation, IlcSimplex,
         simplex) {
  simplex.trySolution();
  return 0;
}

143
Results

ILOG Solver/Planner finds a solution in 6 nodes
(.29 seconds on laptop)
Straightforward ILOG Solver finds a solution in
8024 nodes (1.8 seconds on a laptop)
Again, CPLEX MIP takes 1 minute and 29048 nodes
(on Sun Enterprise) to find a feasible integer
solution

144
Example The Dutch Trains

Scheduling intercity trains
Amsterdam, Rotterdam, Roosendaal, Vlissingen

Without coupling constraints: multi-commodity
integer flow problem
With coupling constraints: a
DLP with an integer relaxation
Additional logic handled directly in 2LP with
CPLEX
Reference: Disjunctive Programming and Cooperating
Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)
145
Conclusions

CP and MIP are powerful techniques that can solve
many combinatorial problems
Each has preferred formulations
Can get even greater benefits when combining CP
and IP algorithms

146
Recent and Current Work

Beaumont
Beringer, DeBacker
Balas, Ceria, Cornuejols.
Wallace, Rodosek, Schrimpf
Heipke, Colombani
Bockmayr
McAloon, Tretkoff, Wetzel

147
III. Exploiting Randomization to Solve Hard
Combinatorial Problems

148
Background

Combinatorial search methods often exhibit
a remarkable variability in performance. It is
common to observe significant differences
between
- different heuristics
- same heuristic on different instances
- different runs of same heuristic with
different seeds (stochastic methods)

149
Main Claim

One can take advantage of the extreme
variability of combinatorial search methods
One can improve the performance of a
deterministic complete method, by introducing a
stochastic element, while maintaining
completeness.
We'll explain WHY that is the case.

150

A Structured Benchmark Domain for Studying
the Distributions of Search Methods
Stochasticity in Search Procedures
Intriguing Properties of Complete
Backtrack Style Algorithms
Consequences for Algorithm Design - Rapid
Randomized Restarts
Portfolio of Algorithms

151
Structured Benchmark Domain
152
Background

Study of local and systematic search methods has been driven by:
Random instance distributions (Hogg et al. 96). Limitation: lack of the structure that characterizes realistic problems.
Highly structured problems (Fujita et al. 93). Limitation: too much structure.
We propose a benchmark domain that bridges the gap between purely random instances and highly structured problems.

153
Quasigroups
Defn.: a pair $(Q, \cdot)$ where $Q$ is a set and $\cdot$ is a binary operation on $Q$ such that the equations $a \cdot x = b$ and $y \cdot a = b$ are uniquely solvable for every pair of elements $a, b$ in $Q$.
The multiplication table of its binary operation defines a latin square (i.e., each element of $Q$ appears exactly once in each row and in each column).
Example: quasigroup of order 4
154
Quasigroup Completion Problem (QCP)
Given a partial latin square, can it be
completed? Example
155
Quasigroup Completion Problem: A Framework for Studying Search

NP-complete (Colbourn 1983, 1984; Anderson 1985).
Has a structure not found in random instances.
Leads to interesting search problems when structure is perturbed.
The study of this problem led us to identify the unusual distributions of combinatorial search (Gomes, Selman & Crato, CP97).

156
Aside: Applications of Quasigroups

Design of statistical experiments: eliminating data dependencies
Scheduling/Timetabling (Anderson 1992): completing a schedule given a set of pre-defined events
Automated theorem proving (Fujita et al. 1993): existence vs. non-existence of quasigroups with intricate mathematical properties

157
Example: Scheduling of Drug Experiment

Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.
Constraint: no two subjects get the same brand on the same day (eliminates bias for day of the week).

158
Quasigroup Completion
                    DAY
SUBJECT   Mon.      Tues.     Wed.      Thurs.    Fri.
Tim       Tylenol   Aleve     Bayer     Excedrin  Advil
Sue       Aleve     Bayer     Excedrin  Advil     Tylenol
Frank     Bayer     Excedrin  Advil     Tylenol   Aleve
Teresa    Excedrin  Advil     Tylenol   Aleve     Bayer
Todd      Advil     Tylenol   Aleve     Bayer     Excedrin
() Pre-assigned
159

QCP has a natural formulation as a Constraint
Satisfaction Problem
variable for each NxN entry
constraints capture row/column requirement
variable assignments capture pre-assigned values
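
The formulation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `qcp_as_csp` and `solve` are hypothetical, empty cells are marked with `None`, values are `0..N-1`, and the row/column requirement is expressed as pairwise "not equal" constraints searched by plain chronological backtracking.

```python
from itertools import combinations


def qcp_as_csp(grid):
    """Build the CSP for a partial latin square (None marks an empty cell)."""
    n = len(grid)
    # One variable per cell; a pre-assigned cell gets a singleton domain.
    domains = {
        (r, c): {grid[r][c]} if grid[r][c] is not None else set(range(n))
        for r in range(n)
        for c in range(n)
    }
    # Binary "not equal" constraints between cells sharing a row or a column.
    constraints = []
    for k in range(n):
        constraints += combinations([(k, c) for c in range(n)], 2)
        constraints += combinations([(r, k) for r in range(n)], 2)
    return domains, constraints


def solve(domains, constraints, assignment=None):
    """Chronological backtracking with lexicographic variable and value order."""
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in sorted(domains) if v not in assignment)
    for val in sorted(domains[var]):
        neighbours = (b if a == var else a for a, b in constraints if var in (a, b))
        # A value is allowed if no already-assigned neighbour uses it.
        if all(assignment.get(other) != val for other in neighbours):
            result = solve(domains, constraints, {**assignment, var: val})
            if result is not None:
                return result
    return None
```

Calling `solve(*qcp_as_csp(grid))` returns a dictionary mapping each cell `(row, column)` to its value, or `None` when the partial square cannot be completed.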

160

How does the difficulty of
QCP vary with the fraction
of pre-assignment?

161
(Figure: median number of backtracks (log scale) vs. fraction of pre-assignment.)
162

The complexity graph shows (up to order 20):
the curve peaks around 42% of pre-assignment, the critically constrained area;
under-constrained and over-constrained areas are easier.

163

Directly related to the peak in computational difficulty is the so-called phase transition graph for QCP.

164
(Figure: fraction of unsolved cases vs. fraction of pre-assignment.)
165
Phase Transition

QCP phase transition: a threshold phenomenon, from almost all solvable to almost all unsolvable, that occurs around 42% of pre-assignment.
It's called a phase transition because of the close relation to state transition phenomena studied in physics, such as the melting of a solid into a liquid.

166
Exploiting Structure
167
Exploiting Structure in QCP
Arc Consistency on binary constraints
Forward Checking
168
Further Exploiting Structure in QCP
General Arc Consistency on all different
constraints
Arc Consistency on Binary Constraints
169
Enforcing General Arc Consistency on All
Different Constraints

Beautiful example of integration of AI/OR
techniques for a well defined sub-problem
Propagation uses Maximum Matching problem
(particular case of Network Flow problems which
have polynomial time complexity)
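
To make the matching idea concrete, here is a deliberately naive Python sketch. It is an assumption-laden illustration, not the propagation algorithm used in the experiments: the function names are hypothetical, and instead of an efficient filtering algorithm it simply re-runs an augmenting-path bipartite matching for every variable/value pair. A value is kept only if some matching that covers all variables uses it.

```python
def covers_all(domains):
    """True if every variable can be matched to a distinct value."""
    match = {}  # value -> index of the variable currently using it

    def augment(var, seen):
        # Try to give `var` a value, re-routing earlier variables if needed.
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            if val not in match or augment(match[val], seen):
                match[val] = var
                return True
        return False

    return all(augment(var, set()) for var in range(len(domains)))


def alldifferent_gac(domains):
    """Remove every value that appears in no solution of all-different."""
    pruned = []
    for var, dom in enumerate(domains):
        keep = set()
        for val in dom:
            # Force var = val, forbid val elsewhere, and test for a full matching.
            trial = [{val} if i == var else d - {val} for i, d in enumerate(domains)]
            if covers_all(trial):
                keep.add(val)
        pruned.append(keep)
    return pruned
```

For the domains `[{1, 2}, {1, 2}, {1, 2, 3}]` this reduces the third domain to `{3}`. Arc consistency on the three pairwise "not equal" constraints removes nothing here, which is the gap between the two levels of propagation.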

170
Further Exploiting Structure in QCP

By enforcing general arc consistency on all-different constraints, problems up to order 50 could be solved!

171
Stochasticity in Search Procedures
172
Background

Stochastic strategies have been very successful
in the area of local search.
Limitation: the inherently incomplete nature of local search methods.
We want to explore the addition of a stochastic
element to a systematic search procedure without
losing completeness.

173

We introduce stochasticity in a
backtrack search method by randomly
breaking ties in variable and/or value
selection.
Compare with standard lexicographic
tie-breaking.
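
A small Python sketch of the two tie-breaking policies follows. The slides do not say which heuristic produces the ties, so the "fewest remaining values" rule, the function name `select_variable`, and the dictionary-of-sets representation are all assumptions made for illustration.

```python
import random


def select_variable(domains, rng=None):
    """Pick an undecided variable with the fewest remaining values.

    With rng=None ties are broken lexicographically (the deterministic
    baseline); otherwise a tied variable is drawn at random from rng.
    """
    undecided = [v for v in sorted(domains) if len(domains[v]) > 1]
    if not undecided:
        return None
    fewest = min(len(domains[v]) for v in undecided)
    ties = [v for v in undecided if len(domains[v]) == fewest]
    return ties[0] if rng is None else rng.choice(ties)
```

Passing `rng=random.Random(seed)` makes a run reproducible, while different seeds can send the search down different branches of the same tree.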

174
Randomized Strategies

178

Lesson: randomized tie-breaking can improve performance over a purely deterministic strategy.
Next: we can obtain a more dramatic advantage from randomization ...

179
Cost Distributions

Key Properties:
I. Erratic behavior of the mean.
II. Distributions have heavy tails.

180
(Figure: sample mean vs. number of runs. Annotations: sample mean around 3500, median 1.)
182
(Figure: proportion of cases solved. Annotations: 75% < 30, 5% > 100000.)
183
Heavy-Tailed Distributions

Infinite variance, infinite mean
Introduced by Pareto in the 1920s as a probabilistic curiosity.
Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
Examples: stock market, earthquakes, weather, ...

184
Decay of Distributions

Standard: exponential decay (e.g., Normal)
Heavy-tailed: power-law decay (e.g., Pareto-Levy)

186
Normal, Cauchy, and Levy
187
Tail Probabilities (Standard Normal, Cauchy,
Levy)

188
How to Check for Heavy Tails?

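A log-log plot of the tail of the distribution should be approximately linear.
The slope gives the value of $\alpha$, the exponent of the power-law tail $P[X > x] \sim C x^{-\alpha}$:
$\alpha \le 1$: infinite mean and infinite variance
$1 < \alpha < 2$: infinite variance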
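
The log-log check can be sketched in Python as a least-squares fit. This is a rough illustration rather than a proper tail estimator: the function name `tail_slope` is hypothetical, the fit uses every sample rather than only the tail, and the empirical survival function is taken as the fraction of samples greater than or equal to each point so that the logarithm is always defined.

```python
import math


def tail_slope(samples):
    """Least-squares slope of log(survival) against log(x), positive samples."""
    xs = sorted(samples)
    n = len(xs)
    # (n - i) / n is the fraction of samples >= the i-th smallest one.
    pts = [(math.log(x), math.log((n - i) / n)) for i, x in enumerate(xs)]
    mean_x = sum(px for px, _ in pts) / n
    mean_y = sum(py for _, py in pts) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in pts)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pts)
    return sxy / sxx
```

For data with a power-law tail the returned slope is close to $-\alpha$. Feeding it samples drawn with `random.paretovariate(0.5)`, for instance, should give a value near -0.5, in the range where both the mean and the variance are infinite.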

189
Example of Heavy-Tailed Model (Random Walk)

Random walk:
Start at position 0.
Toss a fair coin:
with each head take a step up (+1)
with each tail take a step down (-1)

X: the number of steps the random walk takes to return to position 0.
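
The random walk is easy to simulate. The Python sketch below is illustrative only: the function names and the `max_steps` cap are assumptions, and walks that have not returned within the cap are reported as `None` and counted as unsolved.

```python
import random


def return_time(rng, max_steps):
    """Steps until a fair +1/-1 walk from 0 first returns to 0 (None if cut off)."""
    position = 0
    for step in range(1, max_steps + 1):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            return step
    return None


def unsolved_fraction(times, x):
    """Empirical 1 - F(x): fraction of walks that have not returned within x steps."""
    return sum(1 for t in times if t is None or t > x) / len(times)


if __name__ == "__main__":
    rng = random.Random(0)
    times = [return_time(rng, 100_000) for _ in range(1000)]
    for x in (2, 20, 200, 2000, 20000):
        print(x, unsolved_fraction(times, x))
```

Half of the walks return after only 2 steps, yet the unsolved fraction shrinks very slowly as x grows. For this walk the tail is known to decay like $x^{-1/2}$, so $\alpha = 1/2$ and the mean return time is infinite.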
191
Heavy-tails vs. Non-Heavy-Tails
(Figure: 1-F(x), the unsolved fraction, against X, the number of steps the walk takes to return to zero (log scale). Curves labeled Normal(2,1) and Normal(2,1000000). Annotation: 0.1% > 200000.)
192
Heavy-tails in QCP Domain
(Figure: 1-F(x), the unsolved fraction, vs. number of backtracks (log).)
193

The Log-Log plot shows a linear relation
over many orders of magnitude. This is
clear evidence of heavy-tailed behavior.

196
Heavy Tailed Cost Distribution
197

The Log-Log plot shows a linear relation
over many orders of magnitude. This is
clear evidence of heavy-tailed behavior.

198

By studying larger problems we discovered that
not only does the heavy tail phenomenon occur at
the right-hand side of the distribution, but we
also observed a high frequency of data points on
the left-hand side of the distribution.
Right-hand side: a non-negligible fraction of very long runs
Left-hand side: a non-negligible fraction of very short runs

199
Sports Scheduling
(Figure: cumulative distribution function vs. number of backtracks (log). Annotations: 70% > 250000; 15%.)
200
Power Law Decay
Exponential Decay
Standard Distribution (finite mean & variance)
201

Consequence for algorithm design:
Use rapid restarts or parallel / interleaved runs.

Super-linear speedups!
202
Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds
5x speedup
203

Rapid restarts work particularly well on hard computational problems because of the heavy-tailed phenomena in the run time distribution.
The RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.
Restarts provably eliminate heavy tails (Gomes, Selman & Crato).
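
A minimal Python sketch of rapid randomized restarts on the quasigroup completion problem is shown below. It is not the authors' code, and several choices are assumptions made for illustration: the names `randomized_run` and `solve_with_restarts`, the smallest-domain heuristic with random tie-breaking, and the doubling cutoff schedule (the slides do not specify one). The pre-assigned cells are assumed to be mutually consistent.

```python
import random


class CutoffReached(Exception):
    """Raised when a run exceeds its backtrack budget."""


def randomized_run(grid, rng, cutoff):
    """One randomized backtrack search; gives up after `cutoff` backtracks."""
    n = len(grid)
    grid = [row[:] for row in grid]  # work on a copy; None marks an empty cell
    backtracks = 0

    def candidates(r, c):
        used = set(grid[r]) | {grid[i][c] for i in range(n)}
        return [v for v in range(n) if v not in used]

    def search():
        nonlocal backtracks
        empty = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
        if not empty:
            return True
        # Smallest remaining domain first; ties are broken at random.
        fewest = min(len(candidates(r, c)) for r, c in empty)
        r, c = rng.choice([cell for cell in empty if len(candidates(*cell)) == fewest])
        values = candidates(r, c)
        rng.shuffle(values)
        for v in values:
            grid[r][c] = v
            if search():
                return True
            grid[r][c] = None
            backtracks += 1
            if backtracks > cutoff:
                raise CutoffReached
        return False

    return grid if search() else None


def solve_with_restarts(grid, cutoff=1, seed=0):
    """Restart the randomized run until it finishes within its cutoff."""
    rng = random.Random(seed)
    cutoff = max(1, cutoff)
    while True:
        try:
            return randomized_run(grid, rng, cutoff)
        except CutoffReached:
            cutoff *= 2  # a growing cutoff keeps the overall method complete
```

Because the cutoff grows, some run is eventually allowed to explore the whole search tree, so an unsolvable instance is still reported as such (`None`). Short cutoffs early on give the search many chances to land on one of the very short runs on the left of the distribution.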

204
Sketch of proof of elimination of heavy tails

Let's truncate the search procedure after m backtracks.
Probability of solving the problem with the truncated version: $p_m = P[B \le m]$, where $B$ is the number of backtracks of the untruncated procedure.
Run the truncated procedure and restart it repeatedly.

205

Y - does not have Heavy Tails
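
The argument sketched on the last two slides can be written out as follows. The notation is partly assumed, since the formulas did not survive in the transcript: $B$ is the number of backtracks of the untruncated randomized procedure, $p_m = P[B \le m] > 0$ is the success probability of one truncated run, $R$ is the number of independent runs up to and including the first success, and $Y$ is taken to be the total number of backtracks of the restart strategy.

```latex
\begin{align*}
P[R > k] &= (1 - p_m)^k
  && \text{(each run succeeds independently with probability } p_m\text{)} \\
Y &\le m R
  && \text{(every run uses at most } m \text{ backtracks)} \\
P[Y > y] &\le P\!\left[R > \lfloor y/m \rfloor\right]
  = (1 - p_m)^{\lfloor y/m \rfloor}
  \le (1 - p_m)^{y/m - 1}
\end{align*}
```

The tail of $Y$ therefore decays exponentially in $y$, not as a power law, so all moments of $Y$ are finite.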
206
Restarts
(Figure: 1-F(x), the unsolved fraction, vs. number of backtracks (log). Annotation: 70% unsolved.)
207
Example of Rapid Restart Speedup (planning)
(Figure: number of backtracks (log) vs. cutoff (log).)
208
Summary Results
Deterministic

() not found after 2 days
209

Our results provide the first indication of heavy-tailed distributions in a computational model.
Overall insight: randomized tie-breaking with rapid restarts gives a powerful search strategy.

210
Heavy-Tailed Distributions in Other Domains

Quasigroup Completion Problem
Graph Coloring
Logistic Planning
Circuit Synthesis

212
Rapid Restart Speedup
215
Algorithm Portfolio Design

216
Motivation

The runtime and performance of randomized
algorithms can vary dramatically on the same
instance and on different instances.
Goal: improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.

217
Branch & Bound: Best Bound vs. Depth First Search
218
Branch & Bound (Randomized)

Standard OR approach for solving Mixed Integer Programs (MIPs):
Solve the linear relaxation of the MIP.
Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection (plus randomization) and create two new nodes (floor and ceiling of the fractional value).
Once we have found an integer solution, its objective value can be used to prune other nodes whose relaxations have worse values.
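
The steps above can be illustrated on a tiny integer program. The Python sketch below makes several assumptions that are not in the slides: the MIP is a 0/1 knapsack with positive values and weights (so its LP relaxation can be solved exactly by the greedy value/weight rule instead of a simplex code), nodes are explored depth-first with the ceiling branch first, and the function names are hypothetical. A knapsack relaxation has at most one fractional variable, so there is no tie to randomize here.

```python
def lp_relaxation(values, weights, capacity, fixed):
    """LP relaxation of 0/1 knapsack with some variables fixed to 0 or 1.

    Returns (bound, fractional_index, items_at_one), or None if infeasible.
    """
    ones = {i for i, x in fixed.items() if x == 1}
    room = capacity - sum(weights[i] for i in ones)
    if room < 0:
        return None
    bound = sum(values[i] for i in ones)
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
            ones.add(i)
        elif room > 0:
            # Item i is taken fractionally: the relaxation is non-integer.
            return bound + values[i] * room / weights[i], i, ones
        else:
            break
    return bound, None, ones


def branch_and_bound(values, weights, capacity):
    """Depth-first branch and bound; returns (best value, chosen items)."""
    best_value, best_items = 0, set()
    stack = [{}]  # each node is a dict of fixed variables
    while stack:
        fixed = stack.pop()
        relaxed = lp_relaxation(values, weights, capacity, fixed)
        if relaxed is None:
            continue  # infeasible node
        bound, frac, ones = relaxed
        if bound <= best_value:
            continue  # pruned: relaxation cannot beat the incumbent
        if frac is None:
            best_value, best_items = bound, ones  # integer solution found
            continue
        stack.append({**fixed, frac: 0})  # floor branch
        stack.append({**fixed, frac: 1})  # ceiling branch, explored first
    return best_value, sorted(best_items)
```

With values `[60, 100, 120]`, weights `[10, 20, 30]` and capacity 50, the root relaxation has bound 240 with the third item fractional. Branching on it leads to the integer optimum 220 (second and third items), after which the remaining nodes are pruned by their bounds.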

219
Branch & Bound: Depth First vs. Best Bound

Critical to the performance of Branch & Bound: the way in which the next node to be expanded is selected.
Best-bound: select the node with the best LP bound (standard OR approach) --> this case is equivalent to A*; the LP relaxation provides an admissible search heuristic.
Depth-first: often quickly reaches an integer solution (may take longer to produce an overall optimal value).

220
Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and/or copies of the same algorithm running interleaved or on different processors.
Goal: to improve on the performance of the component algorithms in terms of:
expected computational cost
risk (variance)
Efficient Set or Efficient Frontier: the set of portfolios that are best in terms of expected value and risk.
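
As a hedged illustration of how the expected cost and risk of a portfolio can be evaluated, the Python sketch below assumes one processor per copy, independent runs, and a portfolio that stops as soon as any copy finishes, so its run time is the minimum over the copies. The function names and the use of observed run times as empirical distributions are assumptions, not the method used for the plots in this deck.

```python
from math import sqrt


def survival(samples, t):
    """Empirical P[run time > t] for one algorithm."""
    return sum(1 for s in samples if s > t) / len(samples)


def portfolio_stats(runs_a, runs_b, copies_a, copies_b):
    """Mean and standard deviation of min over independent parallel copies.

    runs_a / runs_b are observed run times of algorithms A and B; the
    portfolio runs copies_a copies of A and copies_b copies of B
    (at least one copy in total).
    """
    points = sorted(set(runs_a) | set(runs_b))
    mean = second = 0.0
    prev_tail = 1.0
    for t in points:
        # P[min > t] is the product of the individual survival probabilities.
        tail = survival(runs_a, t) ** copies_a * survival(runs_b, t) ** copies_b
        prob = prev_tail - tail  # P[min == t]
        mean += prob * t
        second += prob * t * t
        prev_tail = tail
    return mean, sqrt(max(0.0, second - mean * mean))
```

With made-up run times `A = [1, 100]` (half the runs very short, half very long) and `B = [10, 10]`, two processors give an expected run time of 25.75 for two copies of A, 10 for two copies of B, and 5.5 for the mixed portfolio. Evaluating every split of the processors in this way and keeping the portfolios that are not beaten on both mean and standard deviation gives the efficient set.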

221
Depth-first vs. Best-bound (logistics planning)
(Figure: cumulative frequencies vs. number of nodes.)
222

Depth-First and Best-Bound do not dominate each other overall.

223
Heavy-tailed behavior of Depth-first
224
Portfolio for heavy-tailed search procedures (2 processors)
(Figure: axes are expected run time of portfolios and standard deviation of run time of portfolios; endpoints labeled 2 DF / 0 BB and 0 DF / 2 BB.)
225
Portfolio for heavy-tailed search procedures (6 processors)
(Figure: axes are expected run time of portfolios and standard deviation of run time of portfolios; endpoints labeled 0 DF / 6 BB and 6 DF / 0 BB.)
226
Portfolio for heavy-tailed search procedures (20 processors)
(Figure: axes are expected run time of portfolios and standard deviation of run time of portfolios; endpoints labeled 0 DF / 20 BB and 20 DF / 0 BB.)
227
Portfolio for heavy-tailed search procedures
(2-20 processors)
228

A portfolio approach can lead to substantial
improvements in the expected cost and risk of
stochastic algorithms, especially in the presence
of heavy-tailed phenomena.

229
Summary of Randomization