
Chapter 1 Introduction


Title: Author: BAI II Last modified by: Will Created Date: 1/4/2007 2:05:12 PM Document presentation format: Company – PowerPoint PPT presentation

Slides: 67
Provided by: BAI89



Chapter 1 Introduction
  • Instructors
  • C. Y. Tang and J. S. Roger Jang

All the material is integrated from the textbook
"Fundamentals of Data Structures in C", with some
supplements from the slides of Prof. Hsin-Hsi Chen.
How to create programs
  • Requirements
  • Analysis: bottom-up vs. top-down
  • Design: data objects and operations
  • Refinement and coding
  • Verification: program proving, testing, debugging

Data Structure: how data are organized and hence operated on
  • You are already very familiar with arrays from your
    programming assignments. They are basic (yet
    powerful!) data structures.
  • They can hold data (objects), e.g., integers.
  • They are structured: structured in a way that the
    data held inside can be operated on.
  • Each element in an array has an index. With that,
    you can store or retrieve an element.

Learning Data Structures, and Algorithms
  • You want your tasks to be performed efficiently.
    You need good methods (algorithms).
  • Data must be structured in some manner to be
    operated on.
  • Good structures can be operated on efficiently.

  • For example, suppose that you are building a
    database storing the data of all (past, present,
    and future) students of NTHU, which is growing
    in size.
  • You'll need to find anybody's data in the
    database (to search/retrieve).
  • You'll need to enter new entries into it (to insert).
  • Etc.

  • You can use an array to implement the database.
  • To insert a new entry, simply add it to the first
    empty array cell.
  • If the current array is full, allocate a new
    array twice the size of the current one. Then
    copy all the original entries from the old array
    to the new one.
  • To search for an entry, simply look at all
    entries in the array, one by one.
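
The array-based database described above can be sketched in C as follows. This is only an illustrative sketch: the names `Database`, `db_insert`, and `db_search` are hypothetical, each student record is reduced to a single `int` ID, and `realloc` is used to perform the "allocate a larger array and copy" step.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array of student IDs (hypothetical: one int per record). */
typedef struct {
    int *data;        /* storage for the entries */
    size_t size;      /* number of cells in use */
    size_t capacity;  /* number of cells allocated */
} Database;

/* Start empty; the first insert allocates storage. */
void db_init(Database *db) {
    db->data = NULL;
    db->size = 0;
    db->capacity = 0;
}

/* Insert into the first empty cell, doubling the array when it is full.
   Returns 1 on success, 0 if memory could not be allocated. */
int db_insert(Database *db, int id) {
    assert(db != NULL);
    if (db->size == db->capacity) {
        size_t new_capacity = db->capacity ? 2 * db->capacity : 1;
        /* realloc allocates the larger array and copies the old entries */
        int *bigger = realloc(db->data, new_capacity * sizeof *bigger);
        if (bigger == NULL) return 0;
        db->data = bigger;
        db->capacity = new_capacity;
    }
    db->data[db->size++] = id;
    return 1;
}

/* Sequential search: look at every entry, one by one.
   Returns the index of id, or -1 if it is absent. */
int db_search(const Database *db, int id) {
    for (size_t i = 0; i < db->size; i++)
        if (db->data[i] == id) return (int)i;
    return -1;
}

/* Release the storage and return to the empty state. */
void db_free(Database *db) {
    free(db->data);
    db_init(db);
}
```

Inserting five entries into an empty database grows the capacity through 1, 2, 4, and 8 cells; every search still inspects up to `size` entries.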

  • Your programming experience, however, tells you
    that this is not a good method.
  • In Chapter 10, you'll see sophisticated data
    structures that can be operated on (searched,
    inserted into, deleted from, etc.) efficiently.
  • Then you'll get a feel for how data can be cleverly
    structured to serve as the basis of fast
    algorithms.

Example 2
  • This time you want to work with polynomials,
    i.e., functions of the form
    $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$.
  • You'll need to store them, as well as to multiply
    or add them.
  • You may allocate an array to store the
    coefficients of a polynomial.
  • E.g., $A[i]$ stores $a_i$.

Example 2
  • Alternatively, you can use a linked list to store
    a polynomial.
  • Each node needs to store the coefficient and the
    exponent. E.g., to store $3x^2 + 5$, a linked list
    like (3, 2) -> (5, 0) is constructed.
  • To represent polynomials, both data structures
    have their relative advantages and drawbacks in
    time and space considerations, etc. (see later).
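
As a rough illustration of the linked-list representation, the sketch below stores one (coefficient, exponent) pair per node and evaluates the polynomial at a point. The names `PolyNode`, `poly_push`, `poly_eval`, and `poly_free` are hypothetical, coefficients are assumed to be integers, and allocation failure is handled only by an `assert`.

```c
#include <assert.h>
#include <stdlib.h>

/* One term of a polynomial: coef * x^expon. */
typedef struct PolyNode {
    int coef;
    int expon;
    struct PolyNode *next;
} PolyNode;

/* Put a new term in front of the list and return the new head. */
PolyNode *poly_push(PolyNode *head, int coef, int expon) {
    PolyNode *node = malloc(sizeof *node);
    assert(node != NULL);  /* sketch only: no recovery on failure */
    node->coef = coef;
    node->expon = expon;
    node->next = head;
    return node;
}

/* Evaluate the polynomial at x by summing coef * x^expon over all terms. */
long poly_eval(const PolyNode *p, long x) {
    long total = 0;
    for (; p != NULL; p = p->next) {
        long power = 1;
        for (int k = 0; k < p->expon; k++) power *= x;
        total += p->coef * power;
    }
    return total;
}

/* Free every node of the list. */
void poly_free(PolyNode *p) {
    while (p != NULL) {
        PolyNode *next = p->next;
        free(p);
        p = next;
    }
}
```

Pushing (5, 0) and then (3, 2) builds the list (3, 2) -> (5, 0). Only non-zero terms use memory, whereas the array version needs one cell per exponent up to the degree.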

Searching in Arrays
  • You are able to write, in minutes, a program to
    search sequentially in an array.
  • However, when the numbers in an array are sorted,
    you can do much better using binary search, which
    you are probably also familiar with.
  • You may say that a sorted array is not the same
    structure as a general array.

Binary Search
  • Let $A[0..n-1]$ be an array of $n$ integers.
  • We want to ask: is some integer $x$ stored in $A$?
  • Suppose that $A$ is sorted (say, in ascending
    order), i.e., (for simplicity assume that the
    numbers are distinct) $A[0] < A[1] < \cdots < A[n-1]$.
  • To be more concrete, let $n$ be 5.

Binary Search
  • Observation: if $x > A[2]$, then $x$ must fall in
    $A[3..4]$ (or $x$ is not in $A$). If $x < A[2]$, then
    $x$ must fall in $A[0..1]$.
  • Example: let $x = 11$.
  • $A = [3, 5, 7, 11, 13]$
  • $x\ (= 11) > A[2]\ (= 7)$, so we need not consider
    $A[0..2]$.

Binary Search
  • Example: let x = 17. Let A contain the following
    8 integers. Initially, let left = 0 and right = 7.
  • left and right define the range to be searched.
  • 2 3 5 7 11 13 17 19
  • mid = (0 + 7) / 2 = 3 (rounded down).
  • A[mid] = 7 < x, so let left = mid + 1 = 4, and
    continue the search.

  • 2 3 5 7 11 13 17 19
  • mid = (4 + 7) / 2 = 5, A[mid] = 13, x = 17.
  • A[mid] < x, so let left = mid + 1 = 6, and
    continue the search.
  • 2 3 5 7 11 13 17 19
  • mid = (6 + 7) / 2 = 6, A[mid] = 17, x = 17.
  • A[mid] = x, so return mid = 6, the desired
    position, and we're done!

Binary Search
  • Binsearch(A, x, left, right) /* finds x in
    A[left..right] */
  • while left <= right do
  •   mid = (left + right) / 2
  •   if x < A[mid] then right = mid - 1
  •   else if x = A[mid] then return mid
  •   else left = mid + 1
  • return -1 /* not found */

Recursive Functions
  • A function that invokes (or is defined in terms
    of) itself directly or indirectly is a recursive
    function.
  • Fibonacci sequence: $F_n = F_{n-1} + F_{n-2}$
  • Summation: sum(n) = sum(n - 1) + A[n]
  • where sum(i) is the sum of the first i items in A
  • Binomial coefficient (n choose k):
    $C(n, k) = C(n-1, k) + C(n-1, k-1)$
  • The combinatorial interpretation of this is
    profound; try it out if you've not learned it yet!
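
The binomial recurrence translates directly into a recursive C function. In this sketch the name `binomial` and the base cases C(n, 0) = C(n, n) = 1 are supplied here (the slide states only the recurrence); the function is meant to show the recursion, not to be efficient.

```c
#include <assert.h>

/* C(n, k) via C(n, k) = C(n-1, k) + C(n-1, k-1),
   with base cases C(n, 0) = C(n, n) = 1. */
unsigned long binomial(unsigned n, unsigned k) {
    assert(k <= n);
    if (k == 0 || k == n) return 1;
    return binomial(n - 1, k) + binomial(n - 1, k - 1);
}
```

The two terms correspond to the subsets that exclude, and the subsets that include, one fixed element.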

Recursive Binary Search
  • Binsearch(A, x, left, right)
  • if left > right then return -1
  • mid = (left + right) / 2
  • if x < A[mid] then return Binsearch(A, x,
    left, mid - 1)
  • else if x = A[mid] then return mid
  • else return Binsearch(A, x, mid + 1, right)

Data Abstraction
  • Before implementation, we first need to know the
    specification of the objects to be stored as well
    as the operations that should be supported.
  • In our previous polynomial example, we first have
    the demand to store polynomials (the objects),
    supporting multiplication and other operations.
    The specification is independent of how it is
    implemented (e.g., using arrays or linked lists).

Abstract Data Type (ADT)
  • An abstract data type (ADT) is a data type that
    is organized in such a way that the specification
    of the objects and the specification of the
    operations on the objects are separated from the
    representation of the objects and the
    implementation of the operations.
  • No fixed syntax to describe them. The
    specifications need only be clear.

ADT for Natural Numbers (an example)
  • Structure Natural_Number
  • Objects: an ordered subrange of the integers
    starting at 0 and ending at the maximum integer
    (INT_MAX) on the computer.
  • Functions:
  • Nat_Num Zero() ::= 0
  • Nat_Num Add(x, y) ::= if (x + y) <= INT_MAX then
    return x + y, else return INT_MAX.
  • And so on. This example is so simple that you may
    feel that the implementation of the functions has
    already been stated. But this is generally not
    the case.

Structure 1.1: Abstract data type Natural_Number (p. 17)
structure Natural_Number is
  objects: an ordered subrange of the integers starting at
    zero and ending at the maximum integer (INT_MAX)
    on the computer
  functions:
    for all x, y ∈ Nat_Number; TRUE, FALSE ∈ Boolean
    and where +, -, <, and == are the usual integer
    operations.
    Nat_No Zero()           ::= 0
    Boolean Is_Zero(x)      ::= if (x) return FALSE
                                else return TRUE
    Nat_No Add(x, y)        ::= if ((x + y) <= INT_MAX) return x + y
                                else return INT_MAX
    Boolean Equal(x, y)     ::= if (x == y) return TRUE
                                else return FALSE
    Nat_No Successor(x)     ::= if (x == INT_MAX) return x
                                else return x + 1
    Nat_No Subtract(x, y)   ::= if (x < y) return 0
                                else return x - y
end Natural_Number
::= is defined as
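
One possible C implementation of this specification is sketched below. It assumes `Nat_No` is represented by a plain `int` restricted to non-negative values. The test `(x + y) <= INT_MAX` from the specification cannot be written literally in C, because the addition itself would overflow, so the sketch uses the equivalent `x <= INT_MAX - y`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

typedef int Nat_No;  /* assumption: values are kept in 0..INT_MAX */

Nat_No Zero(void) { return 0; }

bool Is_Zero(Nat_No x) { return x == 0; }

/* Saturating addition: the result never exceeds INT_MAX. */
Nat_No Add(Nat_No x, Nat_No y) {
    assert(x >= 0 && y >= 0);
    /* x + y <= INT_MAX, rearranged so the test itself cannot overflow */
    return (x <= INT_MAX - y) ? x + y : INT_MAX;
}

bool Equal(Nat_No x, Nat_No y) { return x == y; }

/* The successor of INT_MAX stays at INT_MAX. */
Nat_No Successor(Nat_No x) { return (x == INT_MAX) ? x : x + 1; }

/* Subtraction never goes below zero. */
Nat_No Subtract(Nat_No x, Nat_No y) { return (x < y) ? 0 : x - y; }
```

This shows the point of the ADT: callers rely only on the specification, while details such as the overflow-safe comparison stay inside the implementation.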
Performance Analysis
  • We're most interested in the time and space
    requirements of an algorithm.
  • The space complexity of a program is the amount
    of memory that it needs to run to completion. The
    time complexity is the amount of computer time
    that it needs to run to completion.

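What is an Algorithm?
  • A number of rules, which are to be followed in a
    prescribed order, for solving a specific type of
    problem.
  • A computer algorithm must also satisfy:
  • Finiteness (a finite number of steps)
  • Definiteness (each step is precisely defined)
  • Effectiveness (each step can actually be carried out)
  • Input/Output (an O.S. does not terminate, so it is a
    computational procedure rather than an algorithm)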
Algorithms are everywhere!
  • Operating Systems
  • System Programming
  • Numerical Applications
  • Non-numerical Applications
  • ???field???Algorithm????field???????
  • Algorithm implementation:
  • Software
  • Hardware
  • Firmware

?????Algorithm ?
  • 1. ????,?????????(Time, Space)?Algorithm?????
  • Life-time Job
  • ????????????Algorithm???
  • ?????????????paper,update???algorithms??????????

?????Algorithm ?
  • 2. ????,?????NP-Complete?????efficient??
  • Life-time Job
  • ????????NP-Complete?
  • Real Application
  • Average Performance??
  • ?Approximating??
  • TSP
  • n 20 771??
  • $n^3 \log n$ on average (B&B)
  • Planar Graph Coloring (at most 4 colors)

  • Criteria
  • Is it correct?
  • Is it readable?
  • Performance Analysis (machine independent)
  • Space complexity: storage requirement
  • Time complexity: computing time
  • Performance Measurement (machine dependent)

Space Complexity
  • The space needed includes two parts.
  • (1) Fixed space requirement: not dependent on the
    number and size of the program's inputs and
    outputs.
  • Instruction space, simple variables (e.g., int),
    fixed-size structure variables (such as struct),
    and constants.

Space Complexity
  • (2) Variable space requirement $S(I)$: depends on
    the instance $I$ involved.
  • In recursive calls, many copies of simple
    variables (e.g., many ints) may exist. Such space
    requirement is included in $S(I)$.
  • $S(I)$ may depend on some characteristics of $I$. The
    characteristic we'll most often encounter is $n$,
    the size of the instance.
  • In this case we denote $S(I)$ as $S(n)$.

Space Complexity
  • float abc(float a, float b, float c) {
  •   return a + b + b * c + (a + b - c) / (a + b) + 4;
  • }
  • $S_{abc}(I) = 0$.
  • Only has a fixed space requirement.

Space Complexity
  • float sum(float list[], int n) { /* adds up list */
  •   float tempsum = 0;
  •   int i;
  •   for (i = 0; i < n; i++) tempsum += list[i];
  •   return tempsum;
  • }
  • In C, the array is passed by address. So
    $S_{sum}(n) = 0$.
  • In Pascal, the array may be passed by copying
    values. If this is the case, then $S_{sum}(n) = n$.

Space Complexity
  • float rsum(float list[], int n) {
  •   if (n > 0) return rsum(list, n - 1) + list[n - 1];
  •   return 0;
  • }
  • For each recursive call, the OS must save
    parameters, local variables, and the return
    address.
  • In this example, two parameters (list and n)
    and the return address (internally) are saved for
    each recursive call.

  • To add a list of n numbers, there are n recursive
    calls in total.
  • So $S_{rsum}(n) = (c_1 + c_2 + c_3)\,n$, where $c_1$, $c_2$,
    and $c_3$ are the number of bytes (or other unit of
    interest) needed for each of these types (list,
    n, and return address).
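
To see the linear growth in stack usage, the sketch below instruments `rsum` with two hypothetical counters, `live_frames` and `peak_frames`, that track how many activations exist at once. Real per-frame byte counts depend on the compiler and machine, so only the number of frames is measured.

```c
#include <assert.h>

static int live_frames = 0;  /* activations of rsum currently on the stack */
static int peak_frames = 0;  /* largest value live_frames has reached */

/* Recursive sum of list[0..n-1], instrumented to record stack depth. */
float rsum(const float list[], int n) {
    assert(n >= 0);
    float result = 0;
    if (++live_frames > peak_frames) peak_frames = live_frames;
    if (n > 0) result = rsum(list, n - 1) + list[n - 1];
    live_frames--;
    return result;
}
```

For a list of n numbers the peak is n + 1 frames: the initial call plus the n recursive calls counted above.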

Time Complexity
  • We're interested in the number of steps taken by
    a program. But what is a step?
  • A program step is a syntactically or semantically
    meaningful program segment whose execution time
    is independent of the instance characteristics.
  • For a program, everybody can define his/her own
    steps. The important thing is that a step is
    independent of the instance size, etc.

Statement                                  s/e   Freq.   Total
float rsum(float list[], int n)
  if (n > 0)                               1     n+1     n+1
    return rsum(list, n-1) + list[n-1];    1     n       n
  return 0;                                1     1       1
Total                                                    2n+2
s/e: steps per execution.
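
The step count can be checked by adding a counter to the function. In this sketch the global `step_count` is a hypothetical name, and one step is charged per executed statement, exactly as in the table.

```c
#include <assert.h>

static long step_count = 0;  /* number of program steps executed so far */

/* rsum with one counter increment per executed statement. */
float rsum(const float list[], int n) {
    assert(n >= 0);
    step_count++;            /* the test "if (n > 0)" */
    if (n > 0) {
        step_count++;        /* the recursive return statement */
        return rsum(list, n - 1) + list[n - 1];
    }
    step_count++;            /* "return 0" */
    return 0;
}
```

With n = 4 the test runs 5 times, the recursive return 4 times, and `return 0` once, giving 2n + 2 = 10 steps.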
Time Complexity
  • For some programs, for a fixed n, the time taken
    by different instances may still be different.
  • For example, the time taken by binary search depends
    on the position of x in array A.
  • Let A = <3, 5, 7, 11>.
  • If x = 5, then we need fewer steps than if x
    takes some other value.
  • But in both cases, n = 4.

Time Complexity
  • So we may consider the worst case, average case,
    or best case time complexity of an algorithm.
  • Worst case: the maximum number of steps needed
    for any possible instance of size n.
  • Most commonly used. It provides a guarantee.
  • Average case: under some assumption on the instance
    distribution, the expected number of steps.
  • Useful, but usually the most complicated to analyze.
  • Best case: the opposite of the worst case.
  • Rarely seen.

Asymptotic Notations
  • Making exact step counts is often not necessary.
  • The concept of a step is itself inexact.
  • It is more impressive to obtain a functional
    improvement in the time complexity than an
    improvement by a constant multiple.
  • It is good to improve $2n^2$ to $n^2$.
  • It is even better to improve $2n^2$ to $1000n$.
  • For $n$ large enough, $n^2$ is much
    larger than $1000n$.

Figure 1.7: Function values (p. 38)
Figure 1.9: Times on a 1 billion instructions per
second computer (p. 40)
Asymptotic Notations
  • Therefore, in many cases we do not worry about
    the coefficients in the time complexity.
  • Not in all cases, since when we cannot have a
    functional improvement, we'll still want
    improvements in the coefficients.
  • We regard $2n^2$ as equivalent to $n^2$. They belong to
    the same class in this sense.
  • Similarly, $2n^2 + 100n$ and $n^2$ belong to the same
    class, since the quadratic term is dominant.
  • $n^2$ and $n$ belong to different classes.

Asymptotic Notation (O)
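  • Definition: $f(n) = O(g(n))$ iff there exist
    positive constants $c$ and $n_0$ such that
    $f(n) \le c\,g(n)$ for all $n$, $n \ge n_0$.
  • Examples
  • $3n + 2 = O(n)$ /* $3n + 2 \le 4n$ for $n \ge 2$ */
  • $3n + 3 = O(n)$ /* $3n + 3 \le 4n$ for $n \ge 3$ */
  • $100n + 6 = O(n)$ /* $100n + 6 \le 101n$ for $n \ge 10$ */
  • $10n^2 + 4n + 2 = O(n^2)$ /* $10n^2 + 4n + 2 \le 11n^2$ for $n \ge 5$ */
  • $6 \cdot 2^n + n^2 = O(2^n)$ /* $6 \cdot 2^n + n^2 \le 7 \cdot 2^n$ for $n \ge 4$ */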

  • Complexity of $c_1 n^2 + c_2 n$ and $c_3 n$
  • For sufficiently large values of $n$, $c_3 n$ is
    faster than $c_1 n^2 + c_2 n$.
  • For small values of $n$, either could be faster.
  • $c_1 = 1$, $c_2 = 2$, $c_3 = 100$ --> $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 98$
  • $c_1 = 1$, $c_2 = 2$, $c_3 = 1000$ --> $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 998$
  • Break-even point
  • No matter what the values of $c_1$, $c_2$, and $c_3$,
    there is an $n$ beyond which $c_3 n$ is always faster
    than $c_1 n^2 + c_2 n$.
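
The break-even point can be located by direct search. The sketch below uses a hypothetical helper `break_even` and assumes small positive integer constants so that the products do not overflow a `long`.

```c
#include <assert.h>

/* Smallest n >= 1 at which the linear cost c3*n becomes strictly smaller
   than the quadratic cost c1*n*n + c2*n. */
long break_even(long c1, long c2, long c3) {
    assert(c1 > 0 && c2 >= 0 && c3 > 0);
    long n = 1;
    while (c1 * n * n + c2 * n <= c3 * n)
        n++;
    return n;
}
```

With c1 = 1 and c2 = 2 the quadratic cost is no larger up to n = 98 when c3 = 100, and up to n = 998 when c3 = 1000, so the function returns 99 and 999 respectively.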

  • $O(1)$: constant
  • $O(n)$: linear
  • $O(n^2)$: quadratic
  • $O(n^3)$: cubic
  • $O(2^n)$: exponential
  • $O(\log n)$: logarithmic
  • $O(n \log n)$

Asymptotic Notations
  • To make these concepts more precise, asymptotic
    notations are introduced.
  • $f(n) = O(g(n))$ if there exist positive constants
    $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n \ge n_0$.
  • $1000n = O(2n^2)$. This is to say that $1000n$ is no
    larger than ($\le$) $2n^2$ in this sense.
  • You can choose $c = 500$ and $n_0 = 1$.
  • Then $1000n \le 500 \cdot 2n^2 = 1000n^2$ for all $n \ge 1$.

Asymptotic Notations
  • $1000n = O(n)$ since you can choose $c = 1000$ and
    $n_0 = 1$.
  • $1000 = O(1)$.
  • For constant functions, the choice of $n_0$ is
    arbitrary.
  • $n^2 \ne O(n)$ since for any $c > 0$ and any $n_0 > 0$,
    there always exists some $n \ge n_0$ such that $n^2 > cn$.

Asymptotic Notations
  • $2n^2 + 100n = O(n^2)$, since $2n^2 + 100n \le 200n^2$
    for all $n \ge 1$.
  • That is, the definition is satisfied by choosing
    $c = 200$ and $n_0 = 1$.
  • So you can see that in asymptotic notations we
    only care about the most dominant term. Simply
    throw out the minor terms. Throw out the
    constants, too.
  • $\log_2 n = O(n)$. May choose $c = 1$ and $n_0 = 2$.
  • $n \log_2 n = O(n^2)$.

Asymptotic Notations
  • $n^{100} = O(2^n)$. You may choose $c = 1$ and $n_0 = 1000$.
  • $O(1)$: constant, $O(n)$: linear, $O(n^2)$: quadratic,
    $O(n^3)$: cubic, $O(\log n)$: logarithmic, $O(n \log n)$,
    $O(2^n)$: exponential.
  • These are the most commonly encountered time
    complexities.
  • The base of the log is not relevant
    asymptotically, since $\log_b a = (1/\log_2 b) \log_2 a$,
    which differs only by a constant multiple $1/\log_2 b$.

Asymptotic Notations
  • $f(n) = O(g(n))$ is just a notation. $f(n)$ and
    $O(g(n))$ are not the same thing.
  • So you can't write $O(g(n)) = f(n)$.
  • It is also common to view $O(g(n))$ as a set of
    functions, and $f(n) = O(g(n))$ actually means
    $f(n) \in O(g(n))$.
  • $O(g(n)) = \{ f(n) : \text{for some } c > 0 \text{ and } n_0 > 0,\ f(n) \le c\,g(n) \text{ for all } n \ge n_0 \}$

Asymptotic Notations
  • $n = O(n)$, $n = O(n^2)$, $n = O(n^3)$, ...
  • Choosing the function $g(n)$ closer to $f(n)$ is more
    informative.
  • You may ask, why not use $f(n)$ itself? $f(n) =
    O(f(n))$ is the best choice. The difficulty is
    that when we analyze an algorithm, we may not even
    know the exact $f(n)$. We can only obtain an upper
    bound in some cases.

Asymptotic Notations
  • Thm 1.2: If $f(n) = a_m n^m + \cdots + a_1 n + a_0$, then
    $f(n) = O(n^m)$.
  • Pf: Let $a = \max\{|a_m|, \ldots, |a_0|\} + 1$. Then
  • $f(n) \le a n^m + \cdots + a n + a \le (m+1)\,a\,n^m$
    for $n \ge 1$.
  • So choosing $c = (m+1)a$ and $n_0 = 1$ just works.
  • So, again, drop the constants and minor terms.

Asymptotic Notations
  • We also have a notation for lower bounds.
  • $f(n) = \Omega(g(n))$ if for some $c > 0$ and $n_0 > 0$,
    $f(n) \ge c\,g(n)$ for all $n \ge n_0$.
  • $n^2 = \Omega(2n)$: choose $c = 1/2$ and $n_0 = 1$.
  • $2^n = \Omega(n^{100})$: choose $c = 1$ and $n_0 = 1000$.
  • You can prove that if $f(n) = O(g(n))$, then
    $g(n) = \Omega(f(n))$.

Asymptotic Notations
  • Thm 1.3: If $f(n) = a_m n^m + \cdots + a_1 n + a_0$ and
    $a_m > 0$, then $f(n) = \Omega(n^m)$.
  • Pf: Exercise.
  • Note that, if $a_m < 0$, then $f(n) \ne \Omega(n^m)$. For
    example, $-n^2 + 1000n \ne \Omega(n^2)$ since for any $c > 0$
    and $n_0 > 0$, we have $-n^2 + 1000n < c\,n^2$ for all
    sufficiently large $n$.

Asymptotic Notations
  • We also have a notation for equivalence.
  • $f(n) = \Theta(g(n))$ if there exist $c_1 > 0$, $c_2 > 0$, and
    $n_0 > 0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for
    all $n \ge n_0$.
  • $2n^2 + 100n = \Theta(n^2)$.
  • $n^2 \ne \Theta(n)$.
  • You can prove that $f(n) = \Theta(g(n))$ if and only if
    $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.

Asymptotic Notations
  • Thm 1.4: If $f(n) = a_m n^m + \cdots + a_1 n + a_0$ and
    $a_m > 0$, then $f(n) = \Theta(n^m)$.
  • Pf: Immediate from Thm 1.2 and 1.3.

Time Complexity of Binary Search
  • At each iteration, the search range for binary
    search is reduced by about a half.
  • So, in any case, the number of iterations needed
    cannot exceed $\lfloor \log_2 n \rfloor + 1$. The time
    complexity is $O(\log n)$ in any case (each iteration
    takes constant time).
  • The worst case time complexity is $\Omega(\log n)$, which
    occurs when, e.g., the number to search for is not in
    the array.
  • Therefore, the worst case time complexity is
    $\Theta(\log n)$.
  • In the best case, one iteration suffices. The
    best case time complexity is $\Theta(1)$.

Time Complexity of Binary Search
  • Note that the worst case time complexity of a
    sequential search is $\Theta(n)$, which occurs when, e.g.,
    the number to search for is not in the array.
  • So binary search is faster than sequential
    search, but it requires the array to be sorted.

Time Complexity of Binary Search
  • The worst case time complexity of the recursive
    binary search can be stated elegantly as
  • $T(n) = T(n/2) + \Theta(1)$ for $n > 1$; $T(1) = \Theta(1)$.
  • The $\Theta(1)$ term means some anonymous function $f(n)$
    s.t. $f(n) = \Theta(1)$, which accounts for the time needed
    in addition to the time taken by the recursive call.

Time Complexity of Binary Search
  • Note that any function $f(n) = \Theta(1)$ is bounded
    above by a constant, and is bounded below by a
    constant.
  • By the definition of $\Theta$, there exist $c > 0$ and $n_0 > 0$
    such that $f(n) \le c$ for $n \ge n_0$. So
    $f(n) \le \sup\{f(n) : n \le n_0\} + c$ for all $n$.
    Similarly $f(n)$ is bounded below by a constant.
  • So $T(n) \le T(n/2) + c$ for some constant $c$.

Time Complexity of Binary Search
  • For simplicity let $n = 2^m$, so $m = \log_2 n$. Then
  • $T(n) = T(2^m) \le T(2^{m-1}) + c \le T(2^{m-2}) + 2c
    \le \cdots \le T(1) + mc = O(m)$
  • So $T(n) = O(\log n)$.
  • This may be seen as an initial guess. To confirm
    the answer we may use induction.

Time Complexity of Binary Search
  • Ind. Hyp.: $T(m) \le d \log_2 m$ for $2 \le m < n$.
  • We are free in choosing the constant $d$, as long as it
    satisfies the base case. So we can choose $d$ to be
    at least $c$.
  • Induction: $T(n) \le T(n/2) + c \le d \log_2(n/2) + c
    = d \log_2 n - d + c \le d \log_2 n$.
  • Base case: $T(2) \le T(1) + c \le d \log_2 2 = d$.
  • So we need to choose $d \ge T(1) + c$, which also
    gives $d \ge c$.
  • So $T(n) \le d \log_2 n$, implying $T(n) = O(\log n)$.

Selection Sort
  • Given several numbers, how do we sort them?
  • You can find the smallest number, set it aside,
    find the next smallest number, and so on,
    continue until all numbers are done.

Selection Sort
  • sort(A, n) /* Assume that A is indexed by 1..n */
  • for i = 1 to n - 1 do: find the index of the
    min. elem. in A[i..n]; swap the min. elem. with
    A[i].
  • Observe that, at the end of the i-th iteration,
    A[j] holds the j-th smallest element of A, for
    all j <= i.
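
In C, where arrays are indexed from 0 rather than 1, the procedure might be sketched as follows. The function name `selection_sort` and the `int` element type are assumptions.

```c
#include <assert.h>

/* Sort A[0..n-1] into ascending order by repeatedly selecting the minimum
   of the unsorted part A[i..n-1] and swapping it into position i. */
void selection_sort(int A[], int n) {
    assert(n >= 0);
    for (int i = 0; i < n - 1; i++) {
        int min = i;                    /* index of the smallest so far */
        for (int j = i + 1; j < n; j++)
            if (A[j] < A[min]) min = j;
        int tmp = A[i];                 /* swap A[i] and A[min] */
        A[i] = A[min];
        A[min] = tmp;
    }
}
```

After the pass for a given `i`, the prefix `A[0..i]` holds the `i + 1` smallest elements in order, which is the invariant stated above shifted to 0-based indexing.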

Time Complexity of Selection Sort
  • Finding the minimum in $A[i..n]$ takes $\Theta(n - i)$ time.
  • Total time: $\sum_{i=1}^{n-1} \Theta(n - i) = \Theta(n^2)$.

Comparison of Two Strategies
  • Suppose that you want to do searches in A.
  • If the search will be performed only once, then a
    sequential search is good.
  • If the search is to be done very frequently (much
    more than $n$ times), then it is worth paying $\Theta(n^2)$
    time to sort the array first (preprocessing), so
    that binary search can be used subsequently.
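
A rough cost comparison makes "much more than $n$ times" concrete. The sketch below assumes $k$ searches on an array of $n \ge 2$ elements, ignores constant factors, and charges $n^2$ for sorting, $n$ per sequential search, and $\log_2 n$ per binary search.

```latex
\begin{align*}
\text{sequential search only:} \quad & k \cdot n \\
\text{sort, then binary search:} \quad & n^2 + k \log_2 n \\
n^2 + k \log_2 n < k\,n \iff{} & k\,(n - \log_2 n) > n^2
  \iff k > \frac{n^2}{n - \log_2 n}
\end{align*}
```

Since $n^2 / (n - \log_2 n)$ is only slightly larger than $n$, sorting first pays off once the number of searches exceeds roughly $n$.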