Integrating Advanced Algorithms into Undergraduate Computer Science Curriculum

About This Presentation
Slides: 42
Provided by: yana3
Learn more at: https://cs.widener.edu
Transcript and Presenter's Notes

1
Integrating Advanced Algorithms into
Undergraduate Computer Science Curriculum
  • Yana Kortsarts
  • Widener University
  • Computer Science Department

2
Plan
  • Randomized Algorithms
  • Teaching the Power of Randomization Using a Simple
    Game
  • Additional Examples of Advanced Algorithms in
    Undergraduate Computer Science Curriculum

3
Algorithm
  • An algorithm is a sequence of instructions for
    solving a problem
  • A deterministic algorithm runs in the same way on
    the same input every time, so its behavior is
    predictable
  • A randomized algorithm is an algorithm that makes
    random choices during execution

4
Deterministic Algorithm
THE SAME INPUT → THE SAME BEHAVIOR → OUTPUT

5
Randomized Algorithm
THE SAME INPUT → DIFFERENT BEHAVIOR → OUTPUT

6
Why Should We Teach Randomized Algorithms?
  • Randomization is a general tool that applies in
    various computer science areas and not just a
    subject by itself.
  • Significance: many of the breakthroughs in
    various algorithmic areas have used
    randomization.
  • Example: primality testing
  • A simple polynomial-time Monte Carlo algorithm
    with one-sided error: Rabin's algorithm (1980)
  • A deterministic polynomial-time algorithm was
    given by Agrawal, Kayal, and Saxena (2002).
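
As an illustration of a one-sided-error Monte Carlo primality test, here is a minimal sketch of the Miller-Rabin test, the form in which Rabin's algorithm is usually taught. The function name `is_probable_prime` and the default of 20 rounds are assumptions made for this example, not something taken from the slides.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test with one-sided error.

    False means n is certainly composite.
    True means n is prime except with probability at most 4**(-rounds).
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random candidate witness
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is composite
    return True
```

A prime input is always accepted; only composite inputs can be misclassified, and each extra round cuts that chance by at least a factor of four.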

7
Advantages of Randomized Algorithms
  • Performance: for many problems, randomized
    algorithms run faster than the best known
    deterministic algorithm
  • Simplicity: many randomized algorithms are
    simpler to describe and implement than
    deterministic algorithms of comparable
    performance.

8
Challenges and Solutions
  • The concept of a randomized algorithm can be
    difficult to understand.
  • Usually, there is no separate course on
    randomized algorithms in the undergraduate CS
    curriculum
  • The idea of a randomized algorithm is clearer for
    students when presented as a game.
  • The topic can be integrated into introductory
    courses

9
Algorithm as Part of the Game
Design of an Algorithm for a Combinatorial Problem: GAME
  • Algorithm Player: designs the algorithm.
    Goal: minimize the running time of the algorithm
  • Input Player: selects a test input for the selected algorithm.
    Goal: maximize the running time of the algorithm

10
Deterministic Algorithms
Algorithm Player: Deterministic Strategy (Deterministic Algorithm)
  • Reveals an entire strategy (algorithm) first
Input Player: Best Strategy: finding the worst input for the
algorithm produced by the Algorithm Player
  • Input Player can pick the worst example for
    the suggested algorithm

11
Deterministic Algorithms
  • The problem facing the algorithm player is that
    if it uses a deterministic strategy, then, since
    in a sense it "moves first", the second (input)
    player can indeed pick the worst example for the
    suggested algorithm

12
Randomized Algorithms
Algorithm Player: Randomized Strategy (Randomized Algorithm)
  • A randomized algorithm can be seen as a
    distribution over all possible deterministic
    algorithms
  • Doesn't reveal his cards fully in advance
  • Tells the second player the probability with
    which it selects any one of the possible
    deterministic algorithms
  • The coins have not fallen yet, and the game
    only begins after the input player chooses
    its adversarial input.
Input Player
  • A bad input for a randomized algorithm has to be
    an input which is bad for several algorithms
    simultaneously

13
Game Description
  • Player 1: decides on an integer $x > 0$
  • Player 2: has to find a number $y_n$ so that
    $y_n \ge x$
  • Rules
  • Player 2 makes increasing guesses $y_1 < y_2 < \dots < y_n$
  • $y_1 < y_2 < \dots < y_{n-1} < x$ and $y_n \ge x$
  • On a guess $y_j$, Player 1 either says
  • "smaller than x, please provide a next guess", or
  • "larger than or equal to x", and reveals x, stopping
    the game

14
Optimization Criteria
  • Let the guesses be $y_1, y_2, \dots, y_n$, so that
  • $y_n \ge x$
  • $y_j < x$ for all $j \le n - 1$
  • The optimization criterion is the
    performance ratio $(y_1 + y_2 + \dots + y_n)/x$
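
The rules and the criterion can be sketched in a few lines of code. This sketch takes the performance ratio to be the sum of all guesses divided by x, which is how the worked examples in this deck compute it; the function name `play` is hypothetical.

```python
def play(guesses, x):
    """Feed increasing guesses to Player 1 until one is >= x.

    Returns (guesses actually made, performance ratio), where the
    ratio is the sum of the guesses made divided by x.
    """
    made = []
    for y in guesses:
        if made and y <= made[-1]:
            raise ValueError("guesses must be strictly increasing")
        made.append(y)
        if y >= x:  # Player 1 reveals x and the game stops
            return made, sum(made) / x
    raise ValueError("ran out of guesses before reaching x")
```

For instance, `play([1, 3, 10, 28, 76], 37)` stops at the guess 76 and charges the player for all five guesses.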

15
EXAMPLE
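Player 1: x is chosen, please provide a guess
Player 2: 1
Player 1: Smaller than x, next guess
Player 2: 3
Player 1: Smaller than x, next guess
Player 2: 10
Player 1: Smaller than x, next guess
Player 2: 28
Player 1: Smaller than x, next guess
Player 2: 76
Player 1: STOP! x = 37
Performance Ratio: (1 + 3 + 10 + 28 + 76) / 37 = 118 / 37 ≈ 3.189189
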
16
EXAMPLE
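Player 1: x is chosen, please provide a guess
Player 2: 1
Player 1: Smaller than x, next guess
Player 2: 2
Player 1: Smaller than x, next guess
Player 2: 3
Player 1: Smaller than x, next guess
Player 2: 4
Player 1: Smaller than x, next guess
Player 2: 23
Player 1: STOP! x = 5
Performance Ratio: (1 + 2 + 3 + 4 + 23) / 5 = 33 / 5 = 6.6
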
17
Teaching the Game
  • Discussion of the choice of optimization
    criterion
  • Why not choose $y_i/x$, where $y_i \ge x$?
  • Answer: then the simple strategy 1, 2, 3, ... is optimal
  • Discussion of possible strategies
  • Why not start with some large number?
  • Why do we not benefit from increasing the next
    guess only a little compared to the previous
    guess?
  • What is the disadvantage of making the next
    guess, say, 100 times larger than the previous guess?

18
The Powers of 2 Strategy for the Second Player
  • It turns out that the simple strategy that
    selects powers of 2, $y_0 = 1, y_1 = 2, y_2 = 4, \dots, y_i = 2^i$,
  • is an optimal deterministic strategy for
    this game
  • The worst case for the strategy is when the
    number selected by the first player is $x = 2^j + 1$
  • In this case the game is played until the second
    player suggests $2^{j+1}$
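
A short sketch can let students search for the worst case themselves. It assumes the performance ratio is the sum of the guesses divided by x, as in the worked examples; the function name and the search range 1..1024 are choices made for this illustration.

```python
def powers_of_two_ratio(x):
    """Performance ratio of the guesses 1, 2, 4, ... against a secret x > 0."""
    total, y = 0, 1
    while True:
        total += y
        if y >= x:  # first guess that reaches x ends the game
            return total / x
        y *= 2

# Brute-force search for the worst secret number in a small range.
worst = max(range(1, 1025), key=powers_of_two_ratio)
```

In this range the search lands on a number of the form 2^j + 1, one more than a power of 2, and the ratio there stays below 4.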

19
EXAMPLE
  • x = 13
  • guesses: 1, 2, 4, 8, 16
  • sum: 1 + 2 + 4 + 8 + 16 = 31
  • performance ratio is 31 / 13 ≈ 2.384615
  • x = 65
  • guesses: 1, 2, 4, 8, 16, 32, 64, 128
  • sum: 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255
  • performance ratio is 255 / 65 ≈ 3.923

20
The Powers of 2 Strategy Analysis
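  • The strategy $y_0 = 1, y_1 = 2, y_2 = 4, \dots, y_i = 2^i$
    gives the following performance ratio: if $2^{i-1} < x \le 2^i$,
    the ratio is $(1 + 2 + \dots + 2^i)/x = (2^{i+1} - 1)/x$
  • Worst case: $x = 2^j + 1$, where the performance ratio is
    $(2^{j+2} - 1)/(2^j + 1)$, which is just below 4 for large j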
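
The worst-case value can be worked out directly. The derivation below assumes the performance ratio is the sum of all guesses divided by x, as in the worked examples. For $x = 2^j + 1$ the guesses are $1, 2, \dots, 2^{j+1}$:

```latex
\[
\frac{\sum_{k=0}^{j+1} 2^k}{2^j + 1}
  = \frac{2^{j+2} - 1}{2^j + 1}
  = \frac{4\,(2^j + 1) - 5}{2^j + 1}
  = 4 - \frac{5}{2^j + 1} \;<\; 4 .
\]
```

For example, $j = 6$ gives $x = 65$ and a ratio of $4 - 5/65 \approx 3.923$. The ratio approaches 4 as j grows but never reaches it.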

21
Teaching the Game
  • Encourage students to find by themselves the
    worst case for the powers of 2 strategy.
  • This example serves well in illustrating the
    strict notion of the worst case input.
  • The bad instance for the powers of 2 strategy is
    a very specific and rare number (for example,
    x = 17 in the sequence 1, 2, 4, 8, 16, 17, 32).

22
Teaching the Game
  • If x is some random number, the powers of 2
    strategy performs much better.
  • A good place to discuss the difference between
    random strategy and random inputs.
  • The input is sometimes not within our control,
    while the randomized algorithm is within our
    control as the designers of the algorithms.

23
A Deterministic Worst-Case Lower Bound
  • Let $\epsilon > 0$ be an arbitrarily small constant.
  • We show that any deterministic strategy has
    examples with performance ratio at least $4 - \epsilon$
  • Hence the powers of 2 strategy is an optimal
    deterministic strategy.

24
A Randomized Strategy
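  • The following simple randomized strategy gives an
    improved expected value
  • Let $\alpha \in_R [0, 1)$ be chosen uniformly at
    random from the interval $[0, 1)$
  • Define $y_j = \lfloor \exp(j + \alpha) \rfloor$
  • Let i be such that
    $\lfloor \exp(i - 1 + \alpha) \rfloor < x \le \lfloor \exp(i + \alpha) \rfloor$
  • The expected performance ratio is approximately
    $e \approx 2.718$ (ignoring rounding), below the
    deterministic bound of 4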
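
The expected ratio of this strategy can be estimated by simulation. The sketch below assumes the guesses are $\lfloor \exp(j + \alpha) \rfloor$ for j = 0, 1, 2, ... with the random offset written as `alpha`, which matches the numbers in the worked examples of this deck. The function names, the trial count, and the seed are choices made for this illustration.

```python
import math
import random

def randomized_ratio(x, rng=random):
    """One play of the strategy y_j = floor(exp(j + alpha)), alpha uniform in [0, 1)."""
    alpha = rng.random()
    total, j = 0, 0
    while True:
        y = math.floor(math.exp(j + alpha))
        total += y
        if y >= x:  # first guess that reaches x ends the game
            return total / x
        j += 1

def expected_ratio(x, trials=20000, seed=0):
    """Monte Carlo estimate of the expected performance ratio for a fixed x."""
    rng = random.Random(seed)
    return sum(randomized_ratio(x, rng) for _ in range(trials)) / trials
```

The input x is fixed before the random offset is drawn, which mirrors the game: the input player commits to x without seeing the coins. Comparing `expected_ratio(513)` with the deterministic ratio for the same x makes the gain from randomization concrete.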

25
EXAMPLE
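Player 1: x is chosen, please provide a guess
Player 2: picks $\alpha = 0.419$ and guesses $\lfloor \exp(\alpha) \rfloor = 1$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 4$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 11$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 3) \rfloor = 30$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 4) \rfloor = 83$
Player 1: STOP! x = 48
Performance Ratio: (1 + 4 + 11 + 30 + 83) / 48 = 129 / 48 = 2.6875
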
26
EXAMPLE
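Player 1: x is chosen, please provide a guess
Player 2: picks $\alpha = 0.866$ and guesses $\lfloor \exp(\alpha) \rfloor = 2$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 6$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 17$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 3) \rfloor = 47$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 4) \rfloor = 129$
Player 1: STOP! x = 63
Performance Ratio: (2 + 6 + 17 + 47 + 129) / 63 = 201 / 63 ≈ 3.190476
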
27
EXAMPLE
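Player 1: x is chosen, please provide a guess
Player 2: picks $\alpha = 0.195$ and guesses $\lfloor \exp(\alpha) \rfloor = 1$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 3$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 8$
Player 1: STOP! x = 6
Performance Ratio: (1 + 3 + 8) / 6 = 12 / 6 = 2.0000
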
28
A Deterministic Worst-Case Lower Bound: Intuitive
Explanation
  • The second player cannot choose $y_{i+1}$ to be
    too large compared to $y_i$: for
    $x = y_i + 1 \ll y_{i+1}$, the ratio
    $(y_{i+1} + y_i)/x$ is already a large number
  • Suppose instead that the second player always
    selects $y_{i+1}$ to be not much larger than $y_i$.
    If the second player makes such choices many
    times, then for some $y_j$ we get
    $y_1 + y_2 + \dots + y_{j-1} \gg y_j$
  • The choice $x = y_j$ is then bad for the second
    player, since $(y_1 + y_2 + \dots + y_j)/y_j$ is large

29
Summary
  • Simple game illustrating the power of
    randomization.
  • The full analysis of the game is presented in
    "Teaching the Power of Randomization Using Simple
    Game", SIGCSE 2006 (Y. Kortsarts, J. Rufinus)
  • Teaching the game:
  • Introduction to Computer Science I and II
  • Design and Analysis of Algorithms
  • Undergraduate Research Projects

30
Summary
  • The game is well-motivated from the point of view
    of modern scheduling research
  • Even though this specific game seems not to have
    been studied before, the techniques illustrated
    here have been used in a series of papers on
    approximating scheduling problems [11, 7, 8, 4].
    These papers study the fast scheduling of
    conflicting jobs with the goal of minimizing the
    sum of finish times of these jobs. Hence, the
    suggested game is at the heart of modern research.

31
Advanced Algorithms in Introductory CS Curriculum
  • Las Vegas - always gives the correct solution.
  • Monte Carlo - may sometimes produce an incorrect
    solution
  • How (and why) to Introduce Monte Carlo
    Randomized Algorithms Into a Basic Algorithms
    Course?, Y. Kortsarts, J. Rufinus, Journal of
    Computing Sciences in Colleges, December 2005
  • Integrating a real-world scheduling problem into
    the basic algorithms course, Yana Kortsarts,
    Journal of Computing Sciences in Colleges, June
    2007
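
The difference between the two classes can be shown on a toy problem: find the index of a 1 in a list of bits. This is an illustrative sketch, not an example taken from the cited papers; the function names and the default of 10 tries are assumptions, and the Las Vegas version assumes the list contains at least one 1.

```python
import random

def find_one_las_vegas(bits, rng=random):
    """Las Vegas: sample until a 1 is found.

    The answer is always correct; only the running time is random.
    Assumes bits contains at least one 1.
    """
    while True:
        i = rng.randrange(len(bits))
        if bits[i] == 1:
            return i

def find_one_monte_carlo(bits, tries=10, rng=random):
    """Monte Carlo: sample at most `tries` times.

    The running time is bounded, but the search may fail (returns None)
    even when a 1 exists.
    """
    for _ in range(tries):
        i = rng.randrange(len(bits))
        if bits[i] == 1:
            return i
    return None
```

When half of the bits are 1, the Monte Carlo version fails with probability 2^(-tries), so a modest number of tries already makes failure very unlikely.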

32
Advanced Algorithms in Introductory CS Curriculum
  • Merkle-Hellman Knapsack Cryptosystem [31]
  • Elegant and beautiful underlying mathematics
  • Due to its simple structure, the knapsack
    cryptosystem is an ideal model for introducing
    algorithmic techniques and the concept of a
    public-key cryptosystem to computer science students
  • Sequence Alignment [32, 33]
  • Needleman and Wunsch Algorithm (Global Alignment)
  • Smith-Waterman Algorithm (Local Alignment)

33
Knapsack Cryptosystem in Computer Science
Curriculum
Cryptology
Design and Analysis of Algorithms
Introduction to Computer Science
Concept of Public Key Cryptosystem
  • Knapsack Problem
  • Subset-Sum Problem
  • Algorithmic Techniques
  • Concept of Public Key Cryptosystem
  • Computational Problems
  • Prime Numbers
  • GCD, Euclidean Algorithm
  • Modular Exponentiation
  • Primitive Roots for Primes

Undergraduate Student Research Projects
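
A toy version of the knapsack cryptosystem fits in a few lines and touches several of the topics listed above (subset sum, GCD, modular inverses). This is a minimal teaching sketch: the function names are hypothetical, the modular inverse uses `pow(r, -1, q)`, which needs Python 3.8 or later, and the scheme is known to be insecure, so it is for classroom use only.

```python
from math import gcd

def make_public_key(w, q, r):
    """Private key: superincreasing w, modulus q > sum(w), multiplier r coprime to q."""
    if any(w[i] <= sum(w[:i]) for i in range(len(w))):
        raise ValueError("w must be superincreasing")
    if q <= sum(w) or gcd(r, q) != 1:
        raise ValueError("need q > sum(w) and gcd(r, q) == 1")
    return [(r * wi) % q for wi in w]  # disguised (hard-looking) knapsack

def encrypt(bits, public):
    """Ciphertext is the subset sum of public weights selected by the bits."""
    return sum(b * p for b, p in zip(bits, public))

def decrypt(c, w, q, r):
    """Undo the multiplication by r, then solve the easy superincreasing knapsack."""
    s = (c * pow(r, -1, q)) % q
    bits = []
    for wi in reversed(w):  # greedy choice is correct for superincreasing w
        if s >= wi:
            bits.append(1)
            s -= wi
        else:
            bits.append(0)
    return bits[::-1]
```

With illustrative values such as `w = [2, 7, 11, 21, 42, 89, 180, 354]`, `q = 881`, and `r = 588`, encrypting a list of 8 bits and decrypting the result returns the original bits.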
34
Sequence Alignment
  • Global Alignment: compare two sequences in their
    entirety; the gap penalty is assessed regardless
    of whether gaps are located internally within a
    sequence, or at the end of one or both sequences.
  • The Needleman and Wunsch Algorithm.
  • Local Alignment: find the best matching subsequences
    within the two search sequences.
  • The Smith-Waterman Algorithm.
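
Global alignment is a compact dynamic-programming exercise. The sketch below computes only the Needleman-Wunsch alignment score with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, not values from the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of strings a and b with a linear gap penalty."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # gaps at the ends are penalized too
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[n][m]
```

The Smith-Waterman local alignment uses the same recurrence but never lets a cell drop below zero and reports the maximum cell instead of the bottom-right corner, which makes a natural follow-up exercise.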

35
REFERENCES
  • [1] S. Arora, C. Lund, R. Motwani, M. Sudan, and
    M. Szegedy. Proof verification and the hardness of
    approximation problems. Journal of the ACM,
    45(3):501-555, 1998.
  • [2] G. J. Brebner and L. G. Valiant. Universal
    schemes for parallel communication. Proceedings
    of the Thirteenth Annual ACM Symposium on Theory
    of Computing, pages 263-277, 1981.
  • [3] T. H. Cormen, C. E. Leiserson, and R. L.
    Rivest. Introduction to Algorithms. The MIT
    Press, 2nd edition, 2001.
  • [4] S. Chakrabarti, C. A. Phillips, A. S. Schulz,
    D. B. Shmoys, C. Stein, and J. Wein. Improved
    scheduling algorithms for minsum criteria. ICALP
    '96, 875-886.
  • [5] A. Fiat, R. M. Karp, M. Luby, L. A. McGeoch,
    D. D. Sleator, and N. E. Young. Competitive
    paging algorithms. Journal of Algorithms,
    12(4):685-699, 1991.
  • [6] O. Goldreich, S. Micali, and A. Wigderson.
    Proofs that yield nothing but their validity or
    all languages in NP have zero-knowledge proof
    systems. Journal of the ACM, 38(3):690-728,
    1991.

36
REFERENCES
  • [7] L. A. Hall, D. B. Shmoys, and J. Wein.
    Scheduling to minimize average completion time:
    off-line and on-line algorithms. SODA '96,
    142-151, Jan 1996.
  • [8] L. A. Hall, A. Schulz, D. B. Shmoys, and J.
    Wein. Scheduling to minimize average completion
    time: off-line and on-line approximation algorithms.
    Math. Operations Research, 22:513-544, 1997.
  • [9] G. Kalai. A subexponential randomized simplex
    algorithm. Proceedings of the Twenty-Fourth
    Annual ACM Symposium on Theory of Computing,
    475-482, 1992.
  • [10] R. M. Karp, E. Upfal, and A. Wigderson.
    Constructing a perfect matching is in random NC.
    Combinatorica, 6(1):35-48, 1986.
  • [11] M. Queyranne and M. Sviridenko. Approximation
    algorithms for shop scheduling problems with
    minsum objective. J. Scheduling, 5:287-305, 2002.
  • [12] R. L. Rivest, A. Shamir, and L. M. Adleman. A
    Method for Obtaining Digital Signatures and
    Public-Key Cryptosystems. Commun. ACM,
    21(2):120-126, 1978.

37
REFERENCES
  • [13] N. Alon, R. Yuster, and U. Zwick.
    Color-coding. Journal of the ACM, 42(4):844-856.
  • [14] A. Bjorklund, T. Husfeldt, and S. Khanna.
    Approximating Longest Directed Path. Symposium on
    Automata, Languages and Programming (ICALP) 2004,
    to appear.
  • [15] D. Dor and U. Zwick. Selecting the Median. SIAM
    J. Comput., 28(5):1722-1758, 1999.
  • [16] D. Dor and U. Zwick. Median Selection
    Requires $(2+\epsilon)n$ Comparisons. SIAM Journal
    on Discrete Mathematics, 14(3):312-325.

38
REFERENCES
  • [17] R. W. Floyd and R. L. Rivest. Expected time
    bounds for selection. Communications of the ACM,
    18(3):165-172, 1975.
  • [18] T. Feder, R. Motwani, and C. Subi. Finding long
    paths and cycles in sparse Hamiltonian graphs.
    Proceedings of the ACM Symposium on Theory of
    Computing, pages 524-529, 1999.
  • [19] H. Gabow. Finding paths and cycles of
    superpolylogarithmic size. Proceedings of the ACM
    Symposium on Theory of Computing, pages 407-416,
    2004.
  • [20] M. T. Goodrich and R. Tamassia. Using
    randomization in the teaching of data structures
    and algorithms. Proceedings of the Thirtieth
    SIGCSE Technical Symposium on Computer Science
    Education, 53-57, 1999.
  • [21] D. Karger, R. Motwani, and G. D. S. Ramkumar.
    On Approximating the Longest Path in a Graph.
    Algorithmica, 18:82-98, 1997.

39
REFERENCES
  • [22] R. M. Karp. Reducibility among combinatorial
    problems. In R. E. Miller and J. W. Thatcher, eds.,
    Complexity of Computer Computations, Plenum
    Press, New York, 1972, pp. 85-103.
  • [23] M. O. Rabin. Probabilistic algorithm for
    testing primality. J. Number Theory, 12:128-138,
    1980.
  • [24] N. Robertson and P. Seymour. Graph minors.
    II. Algorithmic aspects of tree-width. J.
    Algorithms, 7, 1986.
  • [25] R. Motwani and P. Raghavan. Randomized
    Algorithms. Cambridge University Press, 1995.
  • [26] R. M. Karp. An introduction to randomized
    algorithms. Discrete Applied Mathematics,
    34:165-201, 1991.

40
REFERENCES
  • [27] D. R. Karger. Global min-cuts in RNC, and
    other ramifications of a simple min-cut
    algorithm. In Proceedings of the 4th Annual
    ACM-SIAM Symposium on Discrete Algorithms, pp.
    21-30, 1993.
  • [28] M. J. Quinn. Parallel Programming in C with
    MPI and OpenMP. McGraw-Hill, 2004.
  • [29] Y. Kortsarts and J. Rufinus. Teaching the Power
    of Randomization Using Simple Game. SIGCSE 2006.
  • [30] Y. Kortsarts and J. Rufinus. How (and why) to
    Introduce Monte Carlo Randomized Algorithms Into
    a Basic Algorithms Course? Journal of Computing
    Sciences in Colleges, 2005.

41
REFERENCES
  • [31] R. C. Merkle and M. E. Hellman. Hiding
    Information and Signatures in Trapdoor Knapsacks.
    IEEE Transactions on Information Theory, vol.
    IT-24, 1978, pp. 525-530.
  • [32] N. C. Jones and P. A. Pevzner. An Introduction
    to Bioinformatics Algorithms. The MIT Press, 2004.
  • [33] D. E. Krane and M. L. Raymer. Fundamental
    Concepts of Bioinformatics. Benjamin Cummings,
    2002.