PPT – Algorithm design and Analysis PowerPoint presentation | free to download - id: 5e77c3-YWJmY

Algorithm Design and Analysis

? ? ? ????? yedeshi_at_zju.edu.cn

??????

- ???? 21190120
- ???????2012? ???
- ??(9?10)??? ?101 ???(34)??? ?101
- ????(11?12)??????
- ????6?11?(hard deadline)
- ????????????
- ??/??4-2/? ??/ 2-1 ??

Office Homepage

- Office??? 215
- My Homepage: http://www.cs.zju.edu.cn/people/yedeshi/
- Course Home

Examination

- Grading Policies
- 1) Homework: 15%
- 2) Quiz: 10% (5th week, May 15)
- 3) Presentation / poster session: 20% (slides submitted to the TA; videos may be taken; posters are presented in lab time)
- 4) Programming project: 20% (2-3 programs)
- 5) Final survey: 35%

TA information

- ??? roles
- Guide to group the students
- Collect the homework, program, and survey
- Review of homework, programs
- Collect the scores for each presentation/poster

Algorithms

Programming

What is the position of algorithms in CS

- 1. Linguists: how shall we talk to the machines?
- 2. Algorithms: what is a good method for solving a problem fast on my computer?
- 3. Architects: can I build a better computer?
- 4. Sculptors of machine intelligence: can I write a computer program that can find its own solution?

Algorithms in Computer Science

Algorithms

Hardware

Compilers, Programming languages

Networking, Distributed systems, Fault tolerance, Security

Machine learning, Statistics, Information retrieval, AI

Bioinformatics, ...

MIT Undergraduate Programs

What is an algorithm?

- (Oxford Dict.) Algorithm:
- A set of rules that must be followed when solving a particular problem.
- From MathWorld:
- A specific set of instructions for carrying out a procedure or solving a problem, usually with the requirement that the procedure terminate at some point.
- An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.

What will CS be?

- Computer Science Past, Present, and Future
- Ed Lazowska (Washington)
- Computer Science is the new Math
- Christos H. Papadimitriou (Berkeley)

Algorithm

- Problem definition
- Objective (very important)
- Evaluation
- Methods

Algorithm evaluation

- Quality
- how far away from the optimal solution ?
- Cost
- Running time
- Space needed
- Our goal is to design algorithms of high quality at low cost

Reasonable times

- Poly(|I|): time polynomial in |I|, where |I| is the size of the problem instance I
- Input size: size(x) of an instance x with rational data is the total number of bits needed for its binary representation.

Time complexity

- logarithmic time if $T(n) = O(\log n)$
- sub-linear time if $T(n) = o(n)$
- linear time, or $O(n)$ time
- linearithmic time if $T(n) = O(n \log n)$; quasilinear time if $T(n) = O(n \log^k n)$
- polynomial time if $T(n) = O(n^k)$ for some constant $k$
- strongly polynomial time:
- the number of operations in the arithmetic model of computation is bounded by a polynomial in the number of integers in the input instance, and
- the space used by the algorithm is bounded by a polynomial in the size of the input.
- weakly polynomial time: polynomial time (in P) but not strongly polynomial

Time complexity

- Quasi-polynomial time if $T(n) = 2^{O(\log^c n)}$ for some fixed $c$.
- Sub-exponential time if $T(n) = 2^{o(n)}$
- Exponential time if $T(n)$ is upper bounded by $2^{\mathrm{poly}(n)}$

Hardness of problems

Easy

- Polynomial (e.g. $n^2$, $n \log n$, $n^3$, $n^{1000}$).
- Quasi-polynomial (e.g. $n^{\log n}$, $n^{\log^2 n}$, $c^{\log^7 n}$).
- Sub-exponential (e.g. $2^{\sqrt{n}}$, $5^{(n^{0.98})}$).
- Exponential (e.g. $2^n$, $8^n$, $n!$, $n^n$).

Hard

Running time

- Computer A is 100 times faster than computer B
- Sort $n$ numbers
- Computer A requires $2n^2$ instructions
- Computer B requires $50 n \lg n$ instructions
- $n = 1{,}000{,}000$
- Computer A: $2 \cdot (10^6)^2 / 10^9 = 2000$ seconds
- Computer B: $50 \cdot 10^6 \lg 10^6 / 10^7 \approx 100$ seconds
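The arithmetic of this comparison can be checked with a few lines of Python. This is only a sketch: the speeds of 10^9 and 10^7 instructions per second are read off the denominators on the slide, and the function name is made up for illustration.

```python
import math


def run_time_seconds(instruction_count, instructions_per_second):
    """Time needed to execute the given number of instructions."""
    return instruction_count / instructions_per_second


n = 1_000_000

# Computer A: fast machine (assumed 10^9 instr/s) running the 2n^2 algorithm.
time_a = run_time_seconds(2 * n**2, 10**9)

# Computer B: slow machine (assumed 10^7 instr/s) running the 50 n lg n algorithm.
time_b = run_time_seconds(50 * n * math.log2(n), 10**7)

print(f"Computer A: {time_a:.0f} s, Computer B: {time_b:.0f} s")
```

Even though computer A is 100 times faster, the slower machine with the better algorithm finishes about 20 times sooner.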

Running time

10 | < 1 s | < 1 s | < 1 s | < 1 s | < 1 s | 4 s
100 | < 1 s | < 1 s | < 1 s | 1 s | 18 min | year
1,000 | < 1 s | < 1 s | 1 s | 18 min | Very long | Very long
10,000 | < 1 s | < 1 s | 2 min | 12 days | Very long | Very long
 | 1 s | 20 s | 12 days | 31,710 years | Very long | Very long

Sorting

- Input: a sequence of $n$ numbers $\langle a_1, a_2, \ldots, a_n \rangle$
- Output: a permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ such that $a'_1 \le a'_2 \le \cdots \le a'_n$

Example

Input: 8 2 4 9 3 6

Output: 2 3 4 6 8 9

EX. of insertion sort

8 2 4 9 3 6

2 8 4 9 3 6

2 4 8 9 3 6

2 4 8 9 3 6

2 3 4 8 9 6

2 3 4 6 8 9 done

Insertion sort

[Figure: array A[1..n] with indices i and j, the current key, and the sorted prefix]
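The procedure traced in the example can be sketched in Python as follows. This is a minimal illustration, not the slide's own pseudocode: it uses 0-based indices, and the names `key`, `i` and `j` simply mirror the labels of the figure.

```python
def insertion_sort(a):
    """Sort the list a in place in non-decreasing order and return it."""
    for j in range(1, len(a)):
        key = a[j]          # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift every element larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key      # drop key into the gap
    return a


print(insertion_sort([8, 2, 4, 9, 3, 6]))
```

Running it on the example input 8 2 4 9 3 6 yields the sorted sequence 2 3 4 6 8 9.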

Analyzing algorithms

- Need a computational model
- Random-access machine (RAM) model
- Instructions are executed one after another. No concurrent operations.
- Arithmetic: add, subtract, multiply, divide, remainder, floor, ceiling
- Data movement: load, store, copy
- Control: conditional/unconditional branch, subroutine call and return.
- Each of these instructions takes a constant amount of time.

Running time

- Running time
- The running time of an algorithm on a particular input is the number of primitive operations or steps executed.
- Each line consists only of primitive operations and takes constant time
- Input size
- number of items
- the total number of bits
- more than one number, e.g. for a graph: the number of vertices and the number of edges

Example

- The input size of the sorting problem is $n$. The worst-case running time of insertion sort is $O(n^2)$.

Running time

- The running time depends on the input: an already sorted sequence is easier to sort.
- Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
- Generally, we seek upper bounds on the running time, because everybody likes a guarantee.

Map of Algorithm Design

New problem

Off-line problem

On-line problem

Polynomial

Polynomial

NP-C problem

Quality Appro. ratio

Exact Algorithm

Heuristic

Approximate Algorithm

Improve cost running time

Improve cost running time

Quality Appro. ratio

????

- 1. ????
- 1.1 ????
- 1.2 ? (Sums) ???? (Sets)
- 1.3 ??? (Stirling numbers, Harmonic numbers, Eulerian numbers, et al.)
- 2. ????
- 2.1 Divide-and-Conquer
- 2.1.1 Mergesort
- 2.1.2 Multiplication
- 2.1.3 Matrix multiplication
- 2.1.4 Discrete Fourier transform and Fast Fourier transform

- 2.2 Dynamic Programming
- 2.2.1 Knapsack problem
- 2.2.2 Longest increasing subsequence
- 2.2.3 Sequence alignment
- 2.2.4 Longest common subsequence
- 2.2.5 Matrix-chain multiplication
- 2.2.6 Max independent set in tree

- 2.3 Greedy
- 2.3.1 Interval scheduling
- 2.3.2 Set cover
- 2.3.3 Matroids
- 2.4 NP-completeness
- 2.4.1 The classes P and NP
- 2.4.2 NP-completeness and reducibility
- 2.4.3 NP-complete problems

- 2.5 Approximation Algorithm
- 2.5.1 Vertex cover
- 2.5.2 Load balancing
- 2.5.3 Traveling salesman problem
- 2.5.4 Subset sum problem

- 3. ?????
- 3.1 Local Search
- 3.1.1 The Metropolis Algorithm and Simulated Annealing
- 3.1.2 Local Search to Hopfield Neural Networks (Nash Equilibria)
- 3.1.3 Maximum Cut Approximation via Local Search

- 3.2 Graph Theory
- 3.2.1 Fundamentals
- 3.2.2 Linear Programming
- Network Flow, ???, ??????
- 3.3 Computational Geometry
- 3.3.1 Line-segment
- 3.3.2 Segment intersection
- 3.3.3 Convex Hull
- 3.3.4 The closest pair of points
- 3.3.5 Polygon Triangulation

- 3.4 Randomized Algorithm
- 3.4.1 ???????
- 3.4.2 A Randomized MAX-3-SAT
- 3.4.3 Randomized Divide-and-Conquer
- 3.5 Online Algorithm
- 3.5.1 Online Skiing
- 3.5.2 Online Hiring

????

????

??

- Textbook: Introduction to Algorithms, Second Edition. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. The MIT Press, 2001. ISBN 0262032937.
- Recommended: Algorithm Design. Jon Kleinberg, Éva Tardos. Addison Wesley, 2005. ISBN 0-321-29535-8.

Rolf Nevanlinna Prize, 06

????

- Algorithms. S. Dasgupta, C.H. Papadimitriou, and U. V. Vazirani. May 2006.
- Combinatorial Algorithms. Jeff Erickson. University of Illinois, Urbana-Champaign. Lecture Notes. Fall 2002.
- Concrete Mathematics. Ronald L. Graham, Donald E. Knuth, Oren Patashnik. Addison-Wesley Publishing Company, 2005. ISBN 0-201-14236-8.

Algorithms in Computer Science

- P = NP?
- Can we solve a problem efficiently?
- Tradeoff between the quality of the solution and the running time
- Solve a problem with an optimal solution, but it might take a long time
- Solve a problem approximately in a short time

The 1,000,000-dollar problem

- P = NP? http://www.claymath.org/millennium/
- Solved???!!!!

Algorithms in Computer Science

- Selfish Routing
- Privacy preservation in databases
- TSP
- Ad auction

Perspective

- We can find algorithms everywhere
- They have been developed to ease our daily life
- Train/Airplane timetable schedule
- Routing
- We live in the age of information
- Text, numbers, images, video, audio

Selfish routing

- Pigou's Example
- Suburb s, a nearby train station t.
- Assuming that all drivers aim to minimize the

driving time from s to t

[Figure: two parallel roads from s to t. Upper road: $C(x) = 1$. Lower road: $C(x) = x$, with $x \in [0, 1]$.]

Selfish routing

- We have good reason to expect all traffic to follow the lower road
- Social optimal? ½ to the long, wide highway, ½ to the lower road.
- Selfish behavior need not produce a socially optimal outcome
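The gap between the selfish and the socially optimal outcome in Pigou's example can be made concrete with a small calculation. The sketch below assumes one unit of traffic in total, a highway of constant cost 1 and a lower road whose cost equals the fraction of traffic using it; the function name is hypothetical.

```python
def average_travel_time(fraction_on_lower_road):
    """Average cost per driver when the given fraction takes the lower road."""
    p = fraction_on_lower_road
    highway_cost = 1.0       # C(x) = 1, independent of the load
    lower_road_cost = p      # C(x) = x, grows with the load
    return (1 - p) * highway_cost + p * lower_road_cost


selfish = average_travel_time(1.0)   # everyone takes the lower road
optimal = average_travel_time(0.5)   # half and half

print(selfish, optimal, selfish / optimal)
```

Under these assumptions the selfish outcome costs 1 per driver while the half-and-half split costs 0.75, so selfish routing is worse by a factor of 4/3.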

Braess's Paradox

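[Figure: network with nodes s, v, w, t. Upper path through v: $C(x) = x$ on s-v, $C(x) = 1$ on v-t. Lower path through w: $C(x) = 1$ on s-w, $C(x) = x$ on w-t.]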

Braess's Paradox

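[Figure: the same network with an added link of cost $C(x) = 0$ between v and w. Upper path through v: $C(x) = x$ on s-v, $C(x) = 1$ on v-t. Lower path through w: $C(x) = 1$ on s-w, $C(x) = x$ on w-t.]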

Braess's Paradox

- The paradox thus shows that the intuitively helpful action of adding a new zero-cost link can negatively impact all of the traffic!
- With selfish routing, network improvements can degrade network performance.
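The effect can be verified numerically. The sketch below assumes the standard Braess network: one unit of traffic, edges s-v and w-t with cost $C(x) = x$, edges v-t and s-w with cost $C(x) = 1$, and a new zero-cost link from v to w. The path names and the function are illustrative only.

```python
def path_costs(flow_svt, flow_swt, flow_svwt):
    """Cost of each s-t path given the traffic on the three possible paths."""
    load_sv = flow_svt + flow_svwt   # traffic on edge s-v, cost C(x) = x
    load_wt = flow_swt + flow_svwt   # traffic on edge w-t, cost C(x) = x
    return {
        "s-v-t": load_sv + 1,
        "s-w-t": 1 + load_wt,
        "s-v-w-t": load_sv + 0 + load_wt,   # uses the zero-cost link v-w
    }


# Without the link, traffic splits evenly; both existing paths cost 1.5.
before = path_costs(0.5, 0.5, 0.0)

# With the link, every driver switches to s-v-w-t; all paths then cost 2,
# so no single driver can gain by deviating.
after = path_costs(0.0, 0.0, 1.0)

print(before["s-v-t"], after["s-v-w-t"])
```

In `before`, the entry for s-v-w-t (cost 1.0) is what a driver would pay if the link existed, which is why each driver is tempted to use it once it is built; after everyone switches, the travel time rises from 1.5 to 2.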

Link attack example

- Re-identify the medical record of the governor of Massachusetts
- MA collects and publishes sanitized medical data for state employees (microdata): left circle
- voter registration list of MA (publicly available data): right circle

- looking for the governor's record
- join the tables
- 6 people had his birth date
- 3 were men
- 1 in his zipcode

- regarding the US 1990 census data
- 87% of the population are unique based on (zipcode, gender, dob)

Privacy in microdata

- the role of attributes in microdata
- explicit identifiers are removed
- quasi identifiers can be used to re-identify individuals
- sensitive attributes (may not exist!) carry sensitive information

identifier | quasi identifier | quasi identifier | quasi identifier | sensitive
Name | Birthdate | Sex | Zipcode | Disease
Andre | 21/1/79 | male | 53715 | Flu
Beth | 10/1/81 | female | 55410 | Hepatitis
Carol | 1/10/44 | female | 90210 | Bronchitis
Dan | 21/2/84 | male | 02174 | Sprained Ankle
Ellen | 19/4/72 | female | 02237 | AIDS

k-anonymity

- k-anonymity: intuitively, hide each individual among k-1 others
- each QI set of values should appear at least k times in the released microdata
- linking cannot be performed with confidence > 1/k
- sensitive attributes are not considered (going to revisit this...)
- how to achieve this?
- generalization and suppression
- value perturbation is not considered (we should remain truthful to original values)
- privacy vs utility tradeoff
- do not anonymize more than necessary

Advertisement Auction

- Auction
- Dutch auction
- Vickrey auction
- Ad placement

k-anonymity example

- tools for anonymization
- generalization
- publish more general values, i.e., given a domain hierarchy, roll-up
- suppression
- remove tuples, i.e., do not publish outliers
- often the number of suppressed tuples is bounded

original microdata

Birthdate | Sex | Zipcode
21/1/79 | male | 53715
10/1/79 | female | 55410
1/10/44 | female | 90210
21/2/83 | male | 02274
19/4/82 | male | 02237

2-anonymous data

 | Birthdate | Sex | Zipcode
group 1 | */1/79 | person | 5****
group 1 | */1/79 | person | 5****
suppressed | 1/10/44 | female | 90210
group 2 | */*/8* | male | 022**
group 2 | */*/8* | male | 022**
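Whether a released table is k-anonymous can be checked by counting how often each combination of quasi-identifier values occurs. The following is a minimal sketch: the function name is made up, and the `*`-masked values are an assumed reading of the generalized entries in the example table (the masking characters appear to have been lost in the transcript).

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(row) for row in rows)
    return all(count >= k for count in counts.values())


# Quasi identifiers (birthdate, sex, zipcode) of the original microdata.
original = [
    ("21/1/79", "male", "53715"),
    ("10/1/79", "female", "55410"),
    ("1/10/44", "female", "90210"),
    ("21/2/83", "male", "02274"),
    ("19/4/82", "male", "02237"),
]

# Released data after generalization; the outlier row is suppressed,
# i.e. it is simply not published. The '*' masks are an assumption.
released = [
    ("*/1/79", "person", "5****"),
    ("*/1/79", "person", "5****"),
    ("*/*/8*", "male", "022**"),
    ("*/*/8*", "male", "022**"),
]

print(is_k_anonymous(original, 2), is_k_anonymous(released, 2))
```

The original rows are all distinct, so they are not 2-anonymous, while each generalized combination in the released rows appears twice.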

TSP

- Trucking company with a central warehouse
- Each day, it loads up the truck at the warehouse and sends it around to several locations to make deliveries.
- At the end of the day, the truck must end up back at the warehouse so that it is ready to be loaded for the next day.
- To reduce the costs, the company wants to select an order of delivery stops that yields the lowest overall distance traveled by the truck.
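For a handful of stops, the best delivery order can be found by simply trying every order. The sketch below is a brute-force illustration only (its running time grows like (n-1)!, so it is hopeless for large n); the distance matrix is made up, with location 0 playing the role of the warehouse.

```python
from itertools import permutations


def shortest_tour(dist):
    """Try every order of stops 1..n-1, starting and ending at location 0."""
    n = len(dist)
    best_length, best_tour = float("inf"), None
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        # Sum the distances of consecutive legs of the tour.
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


# Hypothetical symmetric distances between the warehouse (0) and three stops.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

length, tour = shortest_tour(dist)
print(length, tour)
```

On this small instance the shortest round trip has length 18.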

Pizza delivery

- One can order a pizza for dinner by phone or via the internet
- We want hot, fresh and tasty pizzas
- How should they deliver the pizzas upon receiving orders?
- Immediately, or wait some minutes for the next orders from nearby places?

The Ski problem

- The Ski problem [Karp 92]: A skier must decide every day she goes skiing whether to rent or buy skis, unless or until she decides to buy them. The skier doesn't know how many days she will go skiing before she gets tired of this hobby. The cost to rent skis for a day is 1 unit, while the cost to buy the skis is B units.
- How can she save money?
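One well-known answer, sketched here as an illustration and not necessarily the strategy presented in the lecture, is the break-even rule: rent for the first B-1 days and buy on day B. The function names are hypothetical, and B is assumed to be a positive integer.

```python
def break_even_cost(ski_days, buy_price):
    """Cost of renting for buy_price - 1 days and buying on day buy_price."""
    if ski_days < buy_price:
        return ski_days                   # she quit before ever buying
    return (buy_price - 1) + buy_price    # rentals paid so far plus the purchase


def offline_optimum(ski_days, buy_price):
    """Best possible cost if the number of skiing days were known in advance."""
    return min(ski_days, buy_price)


B = 10
worst_ratio = max(
    break_even_cost(days, B) / offline_optimum(days, B) for days in range(1, 51)
)
print(worst_ratio)
```

With this rule she never pays more than 2B - 1, so her cost is at most 2 - 1/B times what she would have paid knowing the future.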

Lost cow problem

- A short-sighted cow (or assume it's dark, or foggy, or ...) is standing in front of a fence and does not know in which direction the only gate in the fence might be. How can the cow find the gate without walking too great a detour?
- How can two soldiers get together when lost on a battlefield?
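A classical strategy for this kind of search is doubling: walk 1 step to one side, return, walk 2 steps to the other side, return, then 4, and so on. The sketch below simulates it under the assumption that the gate is at least one step away; the function name and the signed-position convention are illustrative, and the slide itself does not say which strategy the lecture uses.

```python
def doubling_search_distance(gate_position):
    """Total distance walked until the gate is reached.

    gate_position is the signed offset of the gate from the start
    (positive = right, negative = left), assumed to satisfy |offset| >= 1.
    """
    if abs(gate_position) < 1:
        raise ValueError("this sketch assumes the gate is at least 1 step away")
    walked = 0.0
    reach = 1.0
    direction = 1  # the first excursion goes to the right
    while True:
        on_this_side = gate_position * direction > 0
        if on_this_side and abs(gate_position) <= reach:
            return walked + abs(gate_position)
        walked += 2 * reach        # walk out to the turning point and back
        reach *= 2                 # double the reach for the next excursion
        direction = -direction     # and switch sides


for gate in (1, -1, 4.0001, -100):
    total = doubling_search_distance(gate)
    print(gate, total, total / abs(gate))
```

Under these assumptions the cow walks less than 9 times the distance to the gate, whichever side the gate is on.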

Erdős project: shortest path

- Paul Erdős (1913-1996) has an Erdős number of zero. If the lowest Erdős number of a coauthor is X, then the author's Erdős number is X + 1.
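This definition is exactly a shortest-path distance in the coauthorship graph, so it can be computed with breadth-first search. The sketch below uses a made-up graph with placeholder author names "A" to "D"; only the recursive rule itself comes from the slide.

```python
from collections import deque


def erdos_numbers(coauthors, source="Erdos"):
    """Breadth-first search from the source over an undirected coauthor graph."""
    numbers = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in coauthors.get(author, ()):
            if other not in numbers:
                # First visit happens via a coauthor with the lowest number.
                numbers[other] = numbers[author] + 1
                queue.append(other)
    return numbers


# Hypothetical coauthorship graph (adjacency lists).
coauthors = {
    "Erdos": ["A", "B"],
    "A": ["Erdos", "C"],
    "B": ["Erdos", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

numbers = erdos_numbers(coauthors)
print(numbers)
```

Authors not connected to the source never enter the dictionary, which corresponds to an undefined (infinite) Erdős number.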

Nevanlinna Prize winners

NAME | YEAR | COUNTRY | ERDŐS NUMBER
Robert Tarjan | 1982 | USA | 2
Leslie Valiant | 1986 | Hungary/Gt Brtn | 3
Alexander Razborov | 1990 | Russia | 2
Avi Wigderson | 1994 | Israel | 2
Peter Shor | 1998 | USA | 2
Madhu Sudan | 2002 | India/USA | 2
Jon Kleinberg | 2006 | USA | 3
Daniel Alan Spielman | 2010 | USA |

Other famous people

- Albert Einstein 1921 Physics 2
- Chen Ning Yang 1957 Physics 4
- Tsung-dao Lee 1957 Physics 5
- John F. Nash 1994 Economics 4
- Edmund S. Phelps 2006 Economics 4
- Shing-Tung Yau 1982 China 2
- Shiing Shen Chern 1983-84 China 2
- Alan Turing computer science 5
- John von Neumann mathematics 3
- David Hilbert mathematics 4
- Donald E. Knuth 2

Extensions of shortest path

- On k-skip Shortest Paths (SIGMOD 2011)

History of Algorithm

- The word "algorithm" comes from the name of the 9th century Persian mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi, whose works introduced Arabic numerals and algebraic concepts.
- The word "algorism" originally referred only to the rules of performing arithmetic using Arabic numerals but evolved into "algorithm" by the 18th century. The word has now evolved to include all definite procedures for solving problems or performing tasks.

History (cont.)

- The first case of an algorithm written for a computer was Ada Byron's notes on the analytical engine written in 1842, for which she is considered by many to be the world's first programmer. However, since Charles Babbage never completed his analytical engine, the algorithm was never implemented on it.
- This problem was largely solved with the description of the Turing machine, an abstract model of a computer formulated by Alan Turing, and the demonstration that every method yet found for describing "well-defined procedures" advanced by other mathematicians could be emulated on a Turing machine (a statement known as the Church-Turing thesis).

Why do you come here?

Requirement

- Come to the class
- Ask questions
- Thinking
- Why is it OK now?
- How about other methods?

Kinds of analyses

- Worst-case (usually)
- T(n) = maximum time of algorithm on any input of size n.
- Average-case (sometimes)
- T(n) = expected time of algorithm over all inputs of size n.
- Need assumption of statistical distribution of inputs.
- Best-case (bogus)
- Cheat with a slow algorithm that works fast on some input.
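The three kinds of analysis can be compared on a small scale by counting key comparisons of insertion sort over every input of a given size. The sketch below assumes a uniform distribution over all permutations of n distinct keys for the average case; the counting function is written for this illustration.

```python
from itertools import permutations


def count_comparisons(values):
    """Number of key comparisons insertion sort performs on the given input."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1          # compare key with a[i]
            if a[i] <= key:
                break                 # key is already in place
            a[i + 1] = a[i]           # shift the larger element right
            i -= 1
        a[i + 1] = key
    return comparisons


n = 5
counts = [count_comparisons(p) for p in permutations(range(n))]

worst = max(counts)                   # reverse-sorted input: n(n-1)/2
best = min(counts)                    # already sorted input: n-1
average = sum(counts) / len(counts)   # uniform over all n! permutations

print(best, average, worst)
```

For n = 5 the best case needs 4 comparisons and the worst case 10, with the average strictly in between; the best case says little about the algorithm, which is why it is labelled bogus.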

Uniform distribution

Performance Measures for On-line Algorithms

- Competitive ratio
- Max/Max ratio
- Smoothed Competitiveness