1
Randomized algorithms
  • Instructor: YE, Deshi
  • yedeshi@zju.edu.cn

2
Probability
  • We define probability in terms of a sample space
    S, which is a set whose elements are called
    elementary events. Each elementary event can be
    viewed as a possible outcome of an experiment.
  • An event is a subset of the sample space S.
  • Example: flipping two distinguishable coins.
  • Sample space S = {HH, HT, TH, TT}.
  • Event: the event of obtaining one head and one
    tail is {HT, TH}.
  • Null event: ø. Two events A and B are mutually
    exclusive if A ∩ B = ø.
  • A probability distribution Pr on a sample space
    S is a mapping from events of S to real numbers
    such that
  • Pr[A] ≥ 0 for any event A.
  • Pr[S] = 1.
  • Pr[A ∪ B] = Pr[A] + Pr[B] for any two mutually
    exclusive events A and B.
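As a quick illustration (a Python sketch of my own, not part of the original slides), the two-coin sample space and the axioms above can be checked directly:

```python
import itertools

# Enumerate the sample space for flipping two distinguishable coins.
sample_space = [''.join(flips) for flips in itertools.product('HT', repeat=2)]
# sample_space == ['HH', 'HT', 'TH', 'TT']

def pr(event):
    """Probability of an event (a subset of S) under the uniform distribution."""
    return len(set(event)) / len(sample_space)

one_head_one_tail = {s for s in sample_space if s.count('H') == 1}
two_heads = {'HH'}

print(pr(one_head_one_tail))  # 0.5
# The two events are mutually exclusive, so Pr[A u B] = Pr[A] + Pr[B].
print(pr(one_head_one_tail | two_heads) ==
      pr(one_head_one_tail) + pr(two_heads))  # True
```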

3
Axioms of probability
  • Using Ā to denote the event S - A (the complement
    of A),
  • we have Pr[Ā] = 1 - Pr[A].
  • For any two events A and B,
    Pr[A ∪ B] = Pr[A] + Pr[B] - Pr[A ∩ B] ≤ Pr[A] + Pr[B].
  • Discrete probability distributions
  • A probability distribution is discrete if it is
    defined over a finite or countably infinite
    sample space. Let S be the sample space. Then for
    any event A, Pr[A] = Σ_{s ∈ A} Pr[s].
  • Uniform probability distribution on S: Pr[s] =
    1/|S| for every s ∈ S.
  • Continuous uniform probability distribution: for
    any closed interval [c, d], where a ≤ c ≤ d ≤ b,
    Pr[[c, d]] = (d - c)/(b - a).

4
Probability
  • Conditional probability of an event A given that
    another event B occurs is defined to be
    Pr[A | B] = Pr[A ∩ B] / Pr[B]  (whenever Pr[B] ≠ 0).
  • Two events are independent if
    Pr[A ∩ B] = Pr[A] Pr[B].
  • Bayes's theorem:
    Pr[A | B] = Pr[A] Pr[B | A] / Pr[B],
  • which follows since Pr[B] Pr[A | B] = Pr[B ∩ A] =
    Pr[A ∩ B] = Pr[A] Pr[B | A].

5
Discrete random variables
  • For a random variable X and a real number x, we
    define the event X = x to be {s ∈ S : X(s) = x}.
    Thus Pr[X = x] = Σ_{s ∈ S : X(s) = x} Pr[s].
  • Probability density function of random variable
    X: f(x) = Pr[X = x].
  • Pr[X = x] ≥ 0 and Σ_x Pr[X = x] = 1.
  • If X and Y are random variables, the function
    f(x, y) = Pr[X = x and Y = y]
    is their joint probability density function.
  • For a fixed value y,
    Pr[Y = y] = Σ_x Pr[X = x and Y = y].

6
Expected value of a random variable
  • Expected value (or, synonymously, expectation or
    mean) of a discrete random variable X is
    E[X] = Σ_x x · Pr[X = x].
  • Example: Consider a game in which you flip two
    fair coins. You earn $3 for each head but lose $2
    for each tail. The expected value of the random
    variable X representing your earnings is
  • E[X] = 6 · Pr[2 H's] + 1 · Pr[1 H, 1 T] - 4 ·
    Pr[2 T's] = 6(1/4) + 1(1/2) - 4(1/4) = 1.
  • Linearity of expectation:
    E[X + Y] = E[X] + E[Y], even when X and Y are
    dependent.
  • When n random variables X1, X2, ..., Xn are
    mutually independent,
    E[X1 X2 ··· Xn] = E[X1] E[X2] ··· E[Xn].
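The expected earnings can be checked by enumerating the four equally likely outcomes (a Python sketch of my own, not from the slides):

```python
from itertools import product

# Each of the four outcomes HH, HT, TH, TT has probability 1/4.
# Earnings: +3 per head, -2 per tail.
expected = 0.0
for flips in product('HT', repeat=2):
    earnings = sum(3 if f == 'H' else -2 for f in flips)
    expected += earnings * (1 / 4)

print(expected)  # 1.0
```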

7
First success
  • Waiting for a first success. Coin is heads with
    probability p and tails with probability 1-p.
    How many independent flips X until first heads?
    E[X] = Σ_{j ≥ 1} j · (1-p)^(j-1) p = 1/p.
  • Useful property. If X is a 0/1 random variable,
    E[X] = Pr[X = 1].
  • Pf. E[X] = 0 · Pr[X = 0] + 1 · Pr[X = 1] =
    Pr[X = 1].
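A Monte Carlo sketch of my own (with an arbitrary p = 0.25) of the waiting-time fact E[X] = 1/p:

```python
import random

def flips_until_first_heads(p, rng):
    """Number of independent flips until the first heads."""
    count = 1
    while rng.random() >= p:  # tails with probability 1 - p
        count += 1
    return count

rng = random.Random(42)
p = 0.25
trials = 100_000
avg = sum(flips_until_first_heads(p, rng) for _ in range(trials)) / trials
print(avg)  # close to 1/p = 4
```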

8
Variance and standard deviation
  • The variance of a random variable X with mean
    EX is
  • If n random variables X1, X2,..., Xn are pairwise
    independent, then
  • The standard deviation of a random variable X is
    the positive square root of the variance of X.

9
Randomization
  • Randomization. Allow fair coin flip in unit
    time.
  • Why randomize? Can lead to simplest, fastest, or
    only known algorithm for a particular problem.
  • Ex. Symmetry breaking protocols, graph
    algorithms, quicksort, hashing, load balancing,
    Monte Carlo integration, cryptography.

10
Maximum 3-Satisfiability
  • MAX-3SAT. Given 3-SAT formula, find a truth
    assignment that satisfies as many clauses as
    possible.
  • Remark. NP-hard problem.
  • Simple idea. Flip a coin, and set each variable
    true with probability ½, independently for each
    variable.

11
Maximum 3-Satisfiability Analysis
  • Claim. Given a 3-SAT formula with k clauses, the
    expected number of clauses satisfied by a random
    assignment is 7k/8.
  • Pf. Consider the random variable Zj = 1 if clause
    Cj is satisfied, and Zj = 0 otherwise.
  • Let Z = Z1 + Z2 + ... + Zk be the number of
    clauses satisfied by the assignment. By linearity
    of expectation, E[Z] = E[Z1] + ... + E[Zk].
12
E[Zj]
  • E[Zj] is equal to the probability that Cj is
    satisfied.
  • Cj is not satisfied only if each of its three
    variables is assigned the value that fails to
    make it true; since the variables are set
    independently, the probability of this is (1/2)³
    = 1/8. Thus Cj is satisfied with probability
    1 - 1/8 = 7/8.
  • Thus E[Zj] = 7/8.
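The 7/8 claim is easy to check empirically. Below is a sketch of my own: a clause is three literals, each a pair (variable index, sign) with three distinct variables per clause, and the sample formula is made up for illustration.

```python
import random

def satisfied(clause, assignment):
    """A clause (a disjunction of 3 literals) is satisfied if any literal is true."""
    return any(assignment[var] == sign for var, sign in clause)

rng = random.Random(0)
clauses = [((0, True), (1, False), (2, True)),
           ((1, True), (2, False), (3, True)),
           ((0, False), (3, False), (1, True))]

trials = 20_000
total = 0
for _ in range(trials):
    assignment = [rng.random() < 0.5 for _ in range(4)]  # random truth assignment
    total += sum(satisfied(c, assignment) for c in clauses)

print(total / trials)  # close to (7/8) * 3 = 2.625
```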

13
Maximum 3-SAT Analysis
  • Q. Can we turn this idea into a
    7/8-approximation algorithm? In general, a
    random variable can almost always be below its
    mean.
  • Lemma. The probability that a random assignment
    satisfies ≥ 7k/8 clauses is at least 1/(8k).
  • Pf. Let pj be the probability that exactly j
    clauses are satisfied; let p be the probability
    that ≥ 7k/8 clauses are satisfied.

14
Analysis con.
  • Let k' denote the largest natural number that is
    strictly smaller than 7k/8.
  • Then 7k/8 - k' ≥ 1/8, i.e., k' ≤ 7k/8 - 1/8,
    because 8k' is a natural number strictly smaller
    than 7k and the remainder of 7k mod 8 is at
    least 1.
  • Hence 7k/8 = E[Z] = Σj j · pj ≤ k'(1 - p) + kp ≤
    (7k/8 - 1/8) + kp.
  • Rearranging terms yields p ≥ 1/(8k).

15
Maximum 3-SAT Analysis
  • Johnson's algorithm. Repeatedly generate random
    truth assignments until one of them satisfies ≥
    7k/8 clauses.
  • Theorem. Johnson's algorithm is a
    7/8-approximation algorithm.
  • Pf. By the previous lemma, each iteration
    succeeds with probability at least 1/(8k). By the
    waiting-time bound, the expected number of trials
    to find a satisfying assignment is at most 8k. ▪
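A runnable sketch of Johnson's algorithm (my own code; the clause encoding, with each literal a pair of variable index and sign, and the sample formula are made up for illustration):

```python
import random

def johnson_max3sat(clauses, n_vars, rng):
    """Generate random assignments until one satisfies >= 7k/8 clauses."""
    k = len(clauses)
    while True:
        assignment = [rng.random() < 0.5 for _ in range(n_vars)]
        sat = sum(any(assignment[v] == sign for v, sign in c) for c in clauses)
        if 8 * sat >= 7 * k:   # at least 7k/8 clauses satisfied
            return assignment, sat

rng = random.Random(1)
clauses = [((0, True), (1, False), (2, True)),
           ((1, True), (2, False), (3, True)),
           ((0, False), (3, False), (1, True))]
assignment, sat = johnson_max3sat(clauses, 4, rng)
print(sat, len(clauses))  # here sat must reach ceil(7k/8) = 3
```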

16
Maximum Satisfiability
  • Extensions.
  • Allow one, two, or more literals per clause.
  • Find max weighted set of satisfied clauses.
  • Theorem. Asano-Williamson 2000: There exists a
    0.784-approximation algorithm for MAX-SAT.
  • Theorem. Karloff-Zwick 1997, Zwick+computer
    2002: There exists a 7/8-approximation algorithm
    for the version of MAX-3SAT where each clause has
    at most 3 literals.
  • Theorem. Håstad 1997: Unless P = NP, no
    ρ-approximation algorithm for MAX-3SAT (and hence
    MAX-SAT) for any ρ > 7/8.

very unlikely to improve over simple
randomized algorithm for MAX-3SAT
17
Randomized Divide-and-Conquer
18
Finding the Median
  • We are given a set of n numbers S = {a1, a2, ...,
    an}.
  • The median is the number that would be in the
    middle position if we were to sort them.
  • The median of S is equal to the kth largest
    element in S, where
  • k = (n+1)/2 if n is odd, and k = n/2 if n is
    even.
  • Remark. O(n log n) time if we simply sort the
    numbers first.
  • Question: Can we improve it?

19
Selection problem
  • Selection problem. Given a set of n numbers S and
    a number k between 1 and n. Return the kth
    largest element in S.

Select(S, k):
  Choose a splitter ai ∈ S uniformly at random
  foreach (a ∈ S)
    if (a < ai) put a in S-
    else if (a > ai) put a in S+
  If |S-| = k-1 then
    ai was the desired answer
  Else if |S-| ≥ k then
    The kth largest element lies in S-
    Recursively call Select(S-, k)
  Else suppose |S-| = l < k-1 then
    The kth largest element lies in S+
    Recursively call Select(S+, k-1-l)
  Endif
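A runnable Python transcription of the pseudocode (my own, assuming distinct values). Note that with S- holding the elements smaller than the splitter, the test |S-| = k-1 returns the kth smallest element; the kth-largest version is symmetric (swap the comparisons).

```python
import random

def select(S, k, rng=random):
    """Return the kth smallest element of S (1 <= k <= len(S), distinct values)."""
    splitter = rng.choice(S)
    s_minus = [a for a in S if a < splitter]
    s_plus = [a for a in S if a > splitter]
    if len(s_minus) == k - 1:
        return splitter                        # the splitter is the answer
    elif len(s_minus) >= k:
        return select(s_minus, k, rng)         # answer lies in S-
    else:
        l = len(s_minus)
        return select(s_plus, k - 1 - l, rng)  # answer lies in S+

print(select([5, 1, 9, 3, 7], 3))  # 5, the median
```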
20
Analysis
  • Remark. Regardless of how the splitter is chosen,
    the algorithm above returns the kth largest
    element of S.
  • Choosing a good splitter.
  • A good splitter should produce sets S- and S+
    that are approximately equal in size.
  • For example, suppose we could always choose the
    median as the splitter. Then in each iteration
    the size of the problem shrinks by half.
  • Let cn be the running time of a single call
    (choosing a splitter and splitting the set).
  • Then the running time satisfies
  • T(n) ≤ T(n/2) + cn.
  • Hence T(n) = O(n).

21
Analysis con.
  • Funny!! The median is just what we want to find.
  • However, if for any fixed constant b > 0 the
    size of the set in the recursive call shrinks
    by a factor of at least (1 - b) each time, then
    the running time T(n) is bounded by the
    recurrence T(n) ≤ T((1-b)n) + cn.
  • We would again get T(n) = O(n).
  • A bad choice. If we always chose the minimum
    element as the splitter, then
  • T(n) ≤ T(n-1) + cn,
  • which implies that T(n) = O(n²).

22
Random Splitters
  • Instead, we choose the splitter randomly.
  • How should we analyze the expected running time
    of this algorithm?
  • Key idea. We expect the size of the set under
    consideration to go down by a fixed constant
    fraction every iteration, so we get a convergent
    series and hence a linear bound on the running
    time.

23
Analyzing the randomized algorithm
  • We say that the algorithm is in phase j when the
    size of the set under consideration is at most
    n(3/4)^j but greater than n(3/4)^(j+1).
  • So, after finishing phase j-1, the randomized
    algorithm keeps running until it enters phase j.
    How many calls (or iterations) are spent in each
    phase?
  • Central: an element is central if at least a
    quarter of the elements are smaller than it and
    at least a quarter of the elements are larger
    than it.

24
Analyzing
  • Observe. If a central element is chosen as a
    splitter, then at least a quarter of the set will
    be thrown away, so the set shrinks to 3/4 of its
    size or better.
  • Moreover, at least half of the elements are
    central, so the probability that our random
    choice of splitter produces a central element is
    1/2.
  • Q: when will a central element be found in each
    phase?
  • A: by the waiting-time bound, the expected number
    of iterations before a central element is found
    is 2.
  • Remark. The running time of one iteration of the
    algorithm is at most cn.

25
Analyzing
  • Let X be a random variable equal to the number of
    steps taken by the algorithm. We can write it as
    the sum
  • X = X0 + X1 + X2 + ...,
  • where Xj is the number of steps spent by the
    algorithm in phase j.
  • In phase j, the set has size at most n(3/4)^j and
    the expected number of iterations is 2; thus,
    E[Xj] ≤ 2cn(3/4)^j.
  • So, E[X] = Σj E[Xj] ≤ Σj 2cn(3/4)^j = 8cn = O(n).

Theorem. The expected running time of Select(n,k)
is O(n).
26
Quick Sort
  • Quick-Sort(A, p, r)
  • 1. if p < r
  • 2.   then q ← Partition(A, p, r)
  • 3.     Quick-Sort(A, p, q-1)
  • 4.     Quick-Sort(A, q+1, r)

27
Partition
  • PARTITION(A, p, r)
  • x ← A[r]
  • i ← p - 1
  • for j ← p to r - 1
  •   do if A[j] ≤ x
  •     then i ← i + 1
  •       exchange A[i] ↔ A[j]
  • exchange A[i+1] ↔ A[r]
  • return i + 1
  • Partition takes Θ(r - p + 1) time.
  • Partition always selects the last element A[r]
    in the subarray A[p..r] as the pivot (the
    element around which to partition).

28
Example of Partition
29
Worst-case of Quick sort
30
Randomized Quick-sort
  • RANDOMIZED-PARTITION(A, p, r)
  • i ← RANDOM(p, r)
  • exchange A[r] ↔ A[i]
  • return PARTITION(A, p, r)
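The CLRS-style pseudocode above translates directly to Python (0-indexed, in-place; a sketch of mine, not from the slides):

```python
import random

def partition(A, p, r):
    x = A[r]                        # pivot: last element of A[p..r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_partition(A, p, r, rng=random):
    i = rng.randint(p, r)           # RANDOM(p, r), inclusive bounds
    A[r], A[i] = A[i], A[r]
    return partition(A, p, r)

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

data = [3, 1, 4, 1, 5, 9, 2, 6]
randomized_quicksort(data, 0, len(data) - 1)
print(data)  # [1, 1, 2, 3, 4, 5, 6, 9]
```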

31
Analysis of Randomized Quick sort
  • Recurrence for the worst-case running time of
    QUICKSORT:
  • T(n) = max over 0 ≤ q ≤ n-1 of (T(q) + T(n-q-1))
    + Θ(n).
  • Worst case: T(n) = Θ(n²), attained when the
    partition is maximally unbalanced every time.

32
Expected running time
  • The element chosen by RANDOM is called the pivot
    element.
  • Each number can be a pivot element at most once.
  • So in total there are at most n calls to the
    Partition procedure.
  • So the total number of steps is bounded by a
    constant factor times the number of comparisons
    made in Partition.

33
Compute the total number of comparison in calls
to Partition
  • When does the algorithm compare two elements?
  • When does the algorithm not compare two elements?
  • Suppose (Z1, Z2, ..., Zn) is the sorted order of
    the elements in A;
  • that is, Zk is the kth smallest element of A.

34
Compute the total number of comparison in calls
to Partition
  • The reason to use Z rather than A directly is
    that it is hard to locate elements in A during
    Quick-Sort, because elements move around.
  • But it is easy to identify Zk, because the Zk
    are in sorted order.
  • We call this type of analysis scheme backward
    analysis.

35
Compute the total number of comparisons in calls
to Partition
  • Under what condition does Quick-Sort compare Zi
    and Zj?
  • What is the probability of a comparison?
  • First: Zi and Zj are compared at most once!
  • Let Eij be the random event that Zi is compared
    to Zj.
  • Let Xij be the indicator random variable of Eij:
  • Xij = I{Zi is compared to Zj}.

36
Compute the total number of comparisons in calls
to Partition
  • So the total number of comparisons is
    X = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Xij.
  • We are interested in E[X].

37
Compute the total number of comparisons in calls
to Partition
  • By linearity of expectation, we have
    E[X] = Σ_{i<j} E[Xij] = Σ_{i<j} Pr(Zi is
    compared to Zj).
  • So what is Pr(Zi is compared to Zj)?

38
Compute the total number of comparisons in calls
to Partition
  • So what is Pr(Zi is compared to Zj)?
  • What is the condition that
  • Zi is compared to Zj?
  • What is the condition that
  • Zi is not compared to Zj?
  • Answer: Zi is compared to Zj exactly when no
    element is chosen from Zi+1, ..., Zj-1 before Zi
    or Zj is chosen as a pivot in Quick-Sort;
  • therefore ...

39
Compute the total number of comparisons in calls
to Partition
  • Therefore Pr(Zi is compared to Zj) =
    Pr(Zi or Zj is the first pivot chosen from
    {Zi, ..., Zj}) = 2/(j-i+1).

40
Compute the total number of comparisons in calls
to Partition
  • By linearity of expectation, we have
    E[X] = Σ_{i<j} 2/(j-i+1) = O(n log n).
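The displayed formulas lost from this transcript can be reconstructed from the argument above. Zi and Zj are compared exactly when the first pivot drawn from {Zi, ..., Zj} is one of the two endpoints, i.e. 2 favorable choices out of j - i + 1 equally likely ones, so:

```latex
\Pr[Z_i \text{ is compared to } Z_j] = \frac{2}{j-i+1},
\qquad
\mathrm{E}[X] = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \frac{2}{j-i+1}
< \sum_{i=1}^{n-1}\sum_{k=1}^{n} \frac{2}{k} = O(n \log n),
```

where the last step bounds the inner sum by twice the harmonic number H_n = O(log n).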

41
Original Quick-Sort(Tony Hoare)
  • Partition with the first element.
  • Average-case complexity:
  • Assume inputs come from uniform random
    permutations.
  • Our expected-time analysis of randomized
    Quick-Sort extends directly.
  • Notice the difference between a randomized
    algorithm and the average-case complexity of a
    deterministic algorithm.

42
Contention Resolution in a Distributed System
  • Contention resolution. Given n processes P1, ...,
    Pn, each competing for access to a shared
    database. If two or more processes access the
    database simultaneously, all processes are locked
    out. Devise a protocol to ensure all processes
    get through on a regular basis.
  • Restriction. Processes can't communicate.
  • Challenge. Need a symmetry-breaking paradigm.

43
Contention Resolution Randomized Protocol
  • Protocol. Each process requests access to the
    database at time t with probability p = 1/n.
  • Claim. Let S[i, t] = event that process i
    succeeds in accessing the database at time t.
    Then 1/(e·n) ≤ Pr[S(i, t)] ≤ 1/(2n).
  • Pf. By independence, Pr[S(i, t)] = p(1-p)^(n-1).
  • Setting p = 1/n, we have Pr[S(i, t)] = (1/n)(1 -
    1/n)^(n-1). ▪
  • Useful facts from calculus. As n increases from
    2, the function:
  • (1 - 1/n)^n converges monotonically from 1/4 up
    to 1/e;
  • (1 - 1/n)^(n-1) converges monotonically from 1/2
    down to 1/e.

(p: process i requests access; (1-p)^(n-1): none of
the remaining n-1 processes requests access; p = 1/n
is the value that maximizes Pr[S(i, t)]; the factor
(1 - 1/n)^(n-1) lies between 1/e and 1/2.)
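A quick numeric check (my own Python, not in the slides) of the two calculus facts:

```python
import math

# (1 - 1/n)^n increases from 1/4 toward 1/e;
# (1 - 1/n)^(n-1) decreases from 1/2 toward 1/e.
for n in [2, 10, 100, 10_000]:
    lower = (1 - 1 / n) ** n
    upper = (1 - 1 / n) ** (n - 1)
    print(n, round(lower, 4), round(upper, 4))

print(round(1 / math.e, 4))  # 0.3679
```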
44
Contention Resolution
Randomized Protocol
  • Claim. The probability that process i fails to
    access the database in ⌈en⌉ rounds is at most
    1/e. After ⌈en⌉·⌈c ln n⌉ rounds, the probability
    is at most n^(-c).
  • Pf. Let F[i, t] = event that process i fails to
    access the database in rounds 1 through t. By
    independence and the previous claim, we have
    Pr[F(i, t)] ≤ (1 - 1/(en))^t.
  • Choose t = ⌈e·n⌉:
    Pr[F(i, t)] ≤ (1 - 1/(en))^(en) ≤ 1/e.
  • Choose t = ⌈e·n⌉·⌈c ln n⌉:
    Pr[F(i, t)] ≤ (1/e)^(c ln n) = n^(-c).

45
Contention Resolution
Randomized Protocol
  • Claim. The probability that all processes succeed
    within 2e·n·ln n rounds is at least 1 - 1/n.
  • Pf. Let F[t] = event that at least one of the n
    processes fails to access the database in any of
    the rounds 1 through t. By the union bound and
    the previous slide,
    Pr[F[t]] ≤ Σ_{i=1}^{n} Pr[F[i, t]].
  • Choosing t = 2⌈en⌉⌈ln n⌉ yields Pr[F[t]] ≤ n ·
    n^(-2) = 1/n. ▪
  • Union bound. Given events E1, ..., En,
    Pr[E1 ∪ ... ∪ En] ≤ Pr[E1] + ... + Pr[En].
46
Global Minimum Cut
47
Global Minimum Cut
  • Global min cut. Given a connected, undirected
    graph G = (V, E), find a cut (A, B) of minimum
    cardinality.
  • Applications. Partitioning items in a database,
    identifying clusters of related documents,
    network reliability, network design, circuit
    design, TSP solvers.
  • Network flow solution.
  • Replace every edge (u, v) with two antiparallel
    edges (u, v) and (v, u).
  • Pick some vertex s and compute the min s-v cut
    separating s from each other vertex v ∈ V.
  • False intuition. Global min-cut is harder than
    min s-t cut.

48
Contraction Algorithm
  • Contraction algorithm. Karger 1995:
  • Pick an edge e = (u, v) uniformly at random.
  • Contract edge e:
  • replace u and v by a single new super-node w;
  • preserve edges, updating endpoints of u and v to
    w;
  • keep parallel edges, but delete self-loops.
  • Repeat until the graph has just two nodes v1 and
    v2.
  • Return the cut (all nodes that were contracted to
    form v1).

[figure: contracting edge u-v merges nodes u and v
into super-node w; parallel edges are kept,
self-loops are deleted]
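A compact sketch of my own of the contraction algorithm on an edge-list multigraph, using union-find to track super-nodes; contracting an edge merges its endpoints, parallel edges survive implicitly, and self-loops are skipped:

```python
import random

def contract(n, edges, rng):
    """Run Karger's contraction on nodes 0..n-1; return the crossing edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    alive = n
    while alive > 2:
        u, v = rng.choice(edges)      # pick a random edge
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                  # self-loop: deleted, pick again
        parent[ru] = rv               # contract: merge the two super-nodes
        alive -= 1
    # Edges whose endpoints are in different super-nodes form the cut.
    return [(u, v) for u, v in edges if find(u) != find(v)]

# 4-cycle: the global min cut has size 2.
rng = random.Random(1)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cut = min((contract(4, edges, rng) for _ in range(20)), key=len)
print(len(cut))  # 2
```

Taking the best of several independent runs mirrors the amplification idea on the later slides.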
49
Contraction Algorithm
  • Claim. The contraction algorithm returns a min
    cut with probability ≥ 2/n².
  • Pf. Consider a global min-cut (A, B) of G. Let F
    be the set of edges with one endpoint in A and
    the other in B. Let k = |F| = size of the min
    cut.
  • In the first step, the algorithm contracts an
    edge in F with probability k/|E|.
  • Every node has degree ≥ k, since otherwise (A, B)
    would not be a min-cut. Hence |E| ≥ ½kn.
  • Thus, the algorithm contracts an edge in F with
    probability ≤ 2/n.

[figure: min cut (A, B) with crossing edge set F]
50
Contraction Algorithm
  • Claim. The contraction algorithm returns a min
    cut with probability ≥ 2/n².
  • Pf. Consider a global min-cut (A, B) of G. Let F
    be the set of edges with one endpoint in A and
    the other in B. Let k = |F| = size of the min
    cut.
  • Let G' be the graph after j iterations. There are
    n' = n - j supernodes.
  • Suppose no edge in F has been contracted. The
    min-cut in G' is still k.
  • Since the value of the min-cut is k, |E'| ≥ ½kn'.
  • Thus, the algorithm contracts an edge in F with
    probability ≤ 2/n'.
  • Let Ej = event that an edge in F is not
    contracted in iteration j.
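The concluding computation, lost from this transcript, multiplies the conditional probabilities over the n - 2 contractions and telescopes:

```latex
\Pr[E_1 \cap E_2 \cap \cdots \cap E_{n-2}]
\;\ge\; \prod_{j=0}^{n-3}\Bigl(1 - \frac{2}{n-j}\Bigr)
\;=\; \frac{n-2}{n}\cdot\frac{n-3}{n-1}\cdot\frac{n-4}{n-2}
\cdots\frac{2}{4}\cdot\frac{1}{3}
\;=\; \frac{2}{n(n-1)} \;\ge\; \frac{2}{n^2}.
```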

51
Contraction Algorithm
  • Amplification. To amplify the probability of
    success, run the contraction algorithm many
    times.
  • Claim. If we repeat the contraction algorithm
    n² ln n times with independent random choices,
    the probability of failing to find the global
    min-cut is at most 1/n².
  • Pf. By independence, the probability of failure
    is at most (1 - 2/n²)^(n² ln n) ≤ 1/n².
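Using the hint (1 - 1/x)^x ≤ 1/e with x = n²/2, the omitted failure bound works out as:

```latex
\left(1 - \frac{2}{n^2}\right)^{n^2 \ln n}
= \left[\left(1 - \frac{2}{n^2}\right)^{n^2/2}\right]^{2\ln n}
\le \left(\frac{1}{e}\right)^{2\ln n}
= \frac{1}{n^2}.
```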

(1 - 1/x)^x ≤ 1/e
52
Global Min Cut Context
  • Remark. Overall running time is slow, since we
    perform Θ(n² log n) iterations and each takes
    Ω(m) time.
  • Improvement. Karger-Stein 1996: O(n² log³ n).
  • Early iterations are less risky than later ones:
    the probability of contracting an edge in the min
    cut hits 50% when n/√2 nodes remain.
  • Run the contraction algorithm until n/√2 nodes
    remain.
  • Run the contraction algorithm twice on the
    resulting graph, and return the best of the two
    cuts.
  • Extensions. Naturally generalizes to handle
    positive edge weights.
  • Best known. Karger 2000: O(m log³ n).

faster than the best known max-flow algorithm
or deterministic global min cut algorithm