Towards Efficient Sampling: Exploiting Random Walk Strategy - PowerPoint PPT Presentation

About This Presentation

Slides: 29
Provided by: wei8151

Transcript and Presenter's Notes

Title: Towards Efficient Sampling: Exploiting Random Walk Strategy


1
Towards Efficient Sampling: Exploiting Random
Walk Strategy
  • Wei Wei, Jordan Erenrich, and Bart Selman

2
Motivations
  • Recent years have seen tremendous improvements in
    SAT solving: from formulas with up to 300 variables
    (1992) to formulas with one million variables.
  • Various techniques exist for answering:
  • Does a satisfying assignment exist for a
    formula?
  • But there are harder questions to be answered:
  • How many satisfying assignments does a formula
    have? Or, closely related, can we sample from the
    satisfying assignments of a formula?

3
Complexity
  • SAT is NP-complete. 2-SAT is solvable in linear
    time.
  • Counting assignments (even for 2-CNF) is
    #P-complete, and is NP-hard to approximate
    (Valiant, 1979).
  • Approximate counting and sampling are equivalent
    if the problem is downward self-reducible.
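To make the counting question concrete, here is a minimal brute-force sketch. The clause representation (lists of signed integers, DIMACS-style) and the name count_models are assumptions made for illustration, not part of the talk. It enumerates all 2^n assignments, so it illustrates the problem rather than a practical method.

```python
from itertools import product

def count_models(clauses, n_vars):
    """Count satisfying assignments by exhaustive enumeration (2**n_vars cases)."""
    count = 0
    for bits in product([False, True], repeat=n_vars):
        # Literal v > 0 means variable v is true; v < 0 means variable -v is false.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            count += 1
    return count

# (x1 or x2) and (not x1 or x3): a small 2-CNF formula
example = [[1, 2], [-1, 3]]
print(count_models(example, 3))  # prints 4
```

Deciding satisfiability only needs one such assignment; counting needs all of them, which is why the enumeration cannot stop early.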

4
Challenge
  • Can we extend SAT techniques to solve harder
    counting/sampling problems?
  • Such an extension would lead us to a wide range
    of new applications.

SAT testing
5
Standard Methods for Sampling - MCMC
  • Based on setting up a Markov chain with a
    predefined stationary distribution.
  • Draw samples from the stationary distribution by
    running the Markov chain for sufficiently long.
  • Problem: for interesting problems, the Markov
    chain takes exponential time to converge to its
    stationary distribution.

6
Simulated Annealing
  • Simulated Annealing uses the Boltzmann
    distribution as the stationary distribution.
  • At low temperature, the distribution concentrates
    around minimum energy states.
  • For the satisfiability problem, each satisfying
    assignment (with cost 0) gets the same
    probability.
  • Again, reaching such a stationary distribution
    takes exponential time for interesting problems
    (shown in a later slide).
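A single fixed-temperature SA move on a SAT formula can be sketched as follows. This assumes the energy of an assignment is its number of unsatisfied clauses and that moves are accepted by the Metropolis rule, which has the Boltzmann distribution as its stationary distribution. The function names and the dict-based assignment are hypothetical choices for this sketch.

```python
import math
import random

def num_unsat(clauses, assign):
    """Energy of an assignment: the number of clauses it leaves unsatisfied."""
    return sum(
        not any(assign[abs(l)] == (l > 0) for l in c) for c in clauses
    )

def sa_step(clauses, assign, temperature, rng=random):
    """One Metropolis move: flip a random variable, accept with the Boltzmann ratio."""
    var = rng.choice(list(assign))
    before = num_unsat(clauses, assign)
    assign[var] = not assign[var]
    delta = num_unsat(clauses, assign) - before
    # Downhill and sideways moves are always accepted; uphill moves are accepted
    # with probability exp(-delta / T). At T == 0, uphill moves are always rejected.
    if delta > 0 and (temperature <= 0 or rng.random() >= math.exp(-delta / temperature)):
        assign[var] = not assign[var]  # reject: undo the flip
    return assign
```

Because sideways moves (delta == 0) are always accepted, a chain that sits in a region of satisfying assignments keeps moving among them instead of freezing in place.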

7
Standard Methods for Counting
  • Current solution counting procedures extend DPLL
    methods with component analysis.
  • Two counting procedures are available: relsat
    (Bayardo and Pehoushek, 2000) and cachet (Sang,
    Beame, and Kautz, 2004). Both count the exact
    number of solutions.

8
  • Question: Can state-of-the-art local search
    procedures be used for SAT sampling/counting (as
    alternatives to standard Markov chain Monte Carlo
    and DPLL methods)?

Yes! Shown in this talk
9
Our approach: biased random walk
  • Biased random walk = greedy bias + pure random
    walk. Example: WalkSat (Selman et al., 1994),
    effective on SAT.
  • Can we use it to sample from solution space?
  • Does WalkSat reach all solutions?
  • How uniform is the sampling?
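The combination of a greedy bias with pure random walk can be sketched as below. This is a simplified WalkSat-style loop written for illustration, not the original implementation: the noise parameter, the cost function (total unsatisfied clauses after a flip) and all names are assumptions.

```python
import random

def walksat(clauses, n_vars, noise=0.5, max_flips=100_000, rng=None):
    """Return a satisfying assignment (dict var -> bool), or None if none is found."""
    rng = rng or random.Random()
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(l)] == (l > 0) for l in clause)

    def cost_after_flip(v):
        # Number of unsatisfied clauses if variable v were flipped.
        assign[v] = not assign[v]
        cost = sum(not satisfied(c) for c in clauses)
        assign[v] = not assign[v]
        return cost

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # pure random walk move
        else:
            # Greedy move: flip the variable that leaves the fewest unsatisfied clauses.
            var = min((abs(l) for l in clause), key=cost_after_flip)
        assign[var] = not assign[var]
    return None
```

Running this repeatedly from fresh random starts and tallying which solution each run ends in is one way to probe the two questions above: whether every solution is reached, and how evenly.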

10
WalkSat
Hamming distance
11
Probability Ranges in Different Domains
12
Improving the Uniformity of Sampling
WalkSat
SA
  • SampleSat
  • With probability p, the algorithm makes a biased
    random walk move
  • With probability 1-p, the algorithm makes an SA
    (simulated annealing) move
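A minimal sketch of this mixed move, under stated assumptions: the walk move here is a plain focused random walk (flip a variable from a violated clause) rather than WalkSat's greedy-biased choice, the SA move uses the Metropolis rule on the number of unsatisfied clauses, and the parameter defaults are arbitrary. All names are hypothetical.

```python
import math
import random

def unsat_clauses(clauses, assign):
    """Clauses not satisfied by `assign` (dict var -> bool)."""
    return [c for c in clauses if not any(assign[abs(l)] == (l > 0) for l in c)]

def samplesat_step(clauses, assign, p=0.5, temperature=0.1, rng=random):
    """Apply one SampleSat-style move to `assign` in place."""
    unsat = unsat_clauses(clauses, assign)
    if unsat and rng.random() < p:
        # Random walk move: flip a variable taken from a violated clause.
        var = abs(rng.choice(rng.choice(unsat)))
        assign[var] = not assign[var]
        return assign
    # SA move: propose a uniformly random flip, accept by the Metropolis rule.
    var = rng.choice(list(assign))
    assign[var] = not assign[var]
    delta = len(unsat_clauses(clauses, assign)) - len(unsat)
    if delta > 0 and (temperature <= 0 or rng.random() >= math.exp(-delta / temperature)):
        assign[var] = not assign[var]  # rejected: undo the flip
    return assign

def samplesat(clauses, n_vars, steps=2000, seed=None, **kw):
    """Run `steps` moves from a random start and return the final assignment."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for _ in range(steps):
        samplesat_step(clauses, assign, rng=rng, **kw)
    return assign
```

Note that when every clause is satisfied there is no violated clause to walk on, so only SA moves remain. The final assignment is not guaranteed to be a solution; a caller would check it before using it as a sample.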

13
Comparison Between WalkSat and SampleSat
WalkSat
SampleSat
14
SampleSat
Hamming Distance
15
(No Transcript)
16
Analysis
17
Property of F
  • Proposition 1: SA with fixed temperature takes
    exponential time to find a solution of F.
  • This shows that even for some simple formulas in
    2-CNF, SA cannot reach a solution in polynomial
    time.

18
Analysis, cont.
19
SampleSat
  • In the SampleSat algorithm, we can divide the
    search into 2 stages. Before SampleSat reaches its
    first solution, it behaves like WalkSat.

20
SampleSat, cont.
  • After reaching a solution, the random walk
    component is turned off because all clauses are
    satisfied. SampleSat then behaves like SA.
  • Proposition 3: SA at zero temperature samples all
    solutions within a cluster uniformly.
  • This 2-stage model explains why SampleSat samples
    more uniformly than random walk algorithms alone.

21
Verification on Larger Formulas - ApproxCount
  • Small formulas -> figures, solution frequencies.
    How to verify on large formulas? ApproxCount.
  • ApproxCount approximates the number of solutions
    of Boolean formulas, based on the SampleSat
    algorithm.
  • Besides using it to justify the accuracy of our
    sampling approach, ApproxCount is interesting in
    its own right.

22
Algorithm
  • The algorithm works as follows (Jerrum and
    Valiant, 1986):
  • 1. Pick a variable X in the current formula.
  • 2. Draw K samples from the solution space.
  • 3. Set variable X to its most sampled value t; the
    multiplier for X is K / #(X = t), where #(X = t)
    is the number of samples with X = t. Note that
    1 ≤ multiplier ≤ 2.
  • 4. Repeat steps 1-3 until all variables are set.
  • The estimated number of solutions of the original
    formula is the product of all multipliers.
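The counting loop above can be sketched as follows. To keep the example self-contained and testable, the sampler is an exact uniform sampler built on brute-force enumeration; it is only a stand-in for SampleSat, and every function name and the default K are assumptions of this sketch. Each multiplier estimates the ratio between the solution count before and after fixing X = t, so the product telescopes to an estimate of the total count.

```python
import random
from itertools import product

def solutions(clauses, variables, fixed):
    """All satisfying assignments extending `fixed` (brute force, small formulas only)."""
    free = [v for v in variables if v not in fixed]
    out = []
    for bits in product([False, True], repeat=len(free)):
        a = {**fixed, **dict(zip(free, bits))}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            out.append(a)
    return out

def uniform_sampler(clauses, variables, fixed, k, rng):
    """Stand-in for SampleSat: draws k exactly uniform solutions by enumeration."""
    sols = solutions(clauses, variables, fixed)
    return [rng.choice(sols) for _ in range(k)] if sols else []

def approx_count(clauses, n_vars, k=200, sampler=uniform_sampler, seed=None):
    """Estimate the number of solutions as a product of per-variable multipliers."""
    rng = random.Random(seed)
    variables = list(range(1, n_vars + 1))
    fixed, estimate = {}, 1.0
    for x in variables:  # pick the next unset variable
        samples = sampler(clauses, variables, fixed, k, rng)  # draw K samples
        if not samples:
            return 0.0  # the sampler found no solution
        n_true = sum(s[x] for s in samples)
        t = n_true * 2 >= len(samples)  # most sampled value of x (ties -> True)
        n_t = n_true if t else len(samples) - n_true
        estimate *= len(samples) / n_t  # multiplier K / #(x = t), between 1 and 2
        fixed[x] = t  # set x and continue with the restricted formula
    return estimate
```

For example, approx_count([[1, 2], [-1, 3]], 3) should return a value close to 4, the exact count for that formula. Swapping in an approximate sampler changes only the sampler argument.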

23
Accumulation of Errors
24
Within the Capacity of Exact Counters
  • We compare the results of ApproxCount with those
    of the exact counters.

25
And beyond
  • We developed a family of formulas whose solutions
    are hard to count.
  • The formulas are based on SAT encodings of the
    following combinatorial problem:
  • Given n different items, choose from them a list
    (order matters) of m items (m < n). Let P(n,m)
    represent the number of different lists that can
    be constructed. Then P(n,m) = n!/(n-m)!.
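Because P(n,m) = n!/(n-m)! has a closed form, the exact solution count of such an encoding is known in advance and can serve as a reference value for an approximate counter. A small sketch (the function name is hypothetical):

```python
import math

def num_lists(n, m):
    """Number of ordered lists of m distinct items chosen from n: n! / (n - m)!."""
    return math.factorial(n) // math.factorial(n - m)

print(num_lists(20, 10))  # prints 670442572800
```

Integer division is exact here because (n - m)! always divides n!.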

26
Hard Instances
  • The encoding of P(20,10) has only 200 variables,
    but neither cachet nor relsat was able to count it
    in 5 days in our experiments.
  • On the other hand, ApproxCount is able to finish
    in 2 hours, and estimates the number of solutions
    of even larger instances.

27
Summary
  • Small formulas -> complete analysis of the
    search space
  • Larger formulas -> compare ApproxCount results
    with results of exact counting procedures
  • Harder formulas -> handcrafted formulas, compare
    with analytic results

28
Conclusion and Future Work
  • This work shows a good opportunity to extend SAT
    solvers to develop algorithms for sampling and
    counting tasks.
  • Next step: use our methods in probabilistic
    reasoning and Bayesian inference domains.