BITS, PILANI – K. K.
BIRLA GOA CAMPUS
Foundations of Data Structures and Algorithms
                  (BITS F232 )
                   Lecture No. 2
                Sorting Problem
• Input:
  – A sequence of n numbers a1, a2, . . . , an
• Output:
  – A permutation (reordering) a1’, a2’, . . . , an’ of the
    input sequence such that a1’ ≤ a2’ ≤ · · · ≤ an’
               Insertion Sort
• [Figure] To insert 12 into the sorted part of the array, we make
  room for it by moving first 36 and then 24.
                    Insertion sort
• Idea: like sorting a hand of playing cards
   – Start with an empty left hand and the cards facing
     down on the table.
   – Remove one card at a time from the table, and insert
     it into the correct position in the left hand
      • compare it with each of the cards already in the hand, from
        right to left
   – The cards held in the left hand are sorted
      • these cards were originally the top cards of the pile on the
        table
             Insertion Sort – Example
 [Figure: worked example of insertion sort]
                    Insertion Sort
INSERTION-SORT(A)
  for j ← 2 to n
      do key ← A[j]
         ▷ Insert A[j] into the sorted sequence A[1 . . j−1]
         i ← j − 1
         while i > 0 and A[i] > key
             do A[i + 1] ← A[i]
                i ← i − 1
         A[i + 1] ← key
• Insertion sort sorts the elements in place.
• Invariant: at the start of each iteration of the for loop, the
  elements in A[1 . . j−1] are in sorted order.
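The pseudocode above translates directly to Python (a sketch using 0-based indices, so the pseudocode's j = 2..n becomes j = 1..n−1):

```python
def insertion_sort(a):
    """Sort list a in place, mirroring INSERTION-SORT(A)."""
    for j in range(1, len(a)):           # pseudocode's j = 2 to n
        key = a[j]
        # Insert a[j] into the sorted prefix a[0 .. j-1]
        i = j - 1
        while i >= 0 and a[i] > key:     # shift larger elements one step right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

For example, `insertion_sort([24, 36, 12])` shifts 36 and then 24 to make room for 12, exactly as in the figure.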
           Insertion Sort - Analysis
• Best Case Analysis
  The best case for insertion sort occurs when the list is already
  sorted.
  In this case, insertion sort requires n-1 comparisons
  i.e., O(n) complexity.
• Worst Case Analysis
  The worst case occurs when the list is in reverse sorted order.
  For each value of j, what is the maximum number of key
  comparisons possible?
  - Answer: j − 1
• Thus, the total time in the worst case is
  T(n) = 1 + 2 + 3 + …… + (n−1)
       = n(n−1)/2
       = O(n²)
                       Big O
Formal mathematical definition of Big O.
Let T(n) and f(n) be two positive functions.
We write T(n) = O(f(n)), and say that T(n) has order of
f(n) or T(n) is of order f(n),
if there are positive constants M and n₀ such that
T(n) ≤ M·f(n) for all n ≥ n₀.
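As a concrete illustration (the constants M = 23 and n₀ = 1 are one choice, not the only one): T(n) = 20n² + 3n − 4 is O(n²), since 20n² + 3n − 4 ≤ 23n² for all n ≥ 1. A quick numerical sanity check:

```python
def T(n):
    """Example running-time function: T(n) = 20n^2 + 3n - 4."""
    return 20 * n * n + 3 * n - 4

# Check T(n) <= M * n^2 for M = 23, n0 = 1, over a large range of n
M, n0 = 23, 1
assert all(T(n) <= M * n * n for n in range(n0, 10_000))
```

The inequality reduces to 3n − 4 ≤ 3n², which holds for every n ≥ 1; the loop merely spot-checks it.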
          Asymptotic Notation
By dropping the less significant terms and the
constant coefficients,
we can focus on the important part of an
algorithm's running time—its rate of growth
—without getting mired in details that
complicate our understanding.
            Comparing Algorithms
Comparison between algorithms can be done easily.
Let's take an example to understand this:
If we have two algorithms with the following expressions representing
the time required by them for execution,
Expression 1: 20n² + 3n − 4
Expression 2: n³ + 100n − 2
Then, as per asymptotic notation, we should only worry about how
the function grows as the value of n (the input size) grows.
That growth depends entirely on n² for Expression 1
and on n³ for Expression 2.
Hence, we can clearly say that the running time represented by
Expression 2 will grow faster than the other one,
simply by comparing the highest power of n in each expression.
            Good & Bad Solutions
When analysing algorithms we will often come across the
following time complexities.
  Complexity          Verdict
  O(1)                Good news
  O(log n)            Good news
  O(n)                Good news
  O(n log n)          Good news
  O(n^k), k ≥ 2       Bad news
  O(k^n), k ≥ 2       Bad news
  O(n!)               Bad news
        Other Asymptotic Notations
Big Omega (Ω)
Used to give a lower bound.
Important for knowing how many computations are
necessary to solve the problem;
equivalently, any solution to the problem has to
perform at least that many computations.
Lower bounds are generally not easy to prove; we will
return to them later.
Big Theta (Θ)
Used to indicate that a function is bounded both from
above and below.
      How to Determine Complexities
Complexity
How to determine the running time of a piece of code?
•The answer is that it depends on what kinds of statements are used.
Sequence of statements
statement 1;
statement 2;
...
statement k;
The total time is found by adding the times for all statements:
total time = time(statement 1) + time(statement 2) + ... +
time(statement k)
If each statement is "simple" (only involves basic operations) then the
time for each statement is constant and the total time is also constant:
O(1)
   How to Determine Complexities
If-Then-Else
        if (cond)
                then block 1 (sequence of statements)
        else
                block 2 (sequence of statements)
        end if;
Here, either block 1 will execute, or block 2 will execute.
Therefore, the worst-case time is the slower of the two
possibilities: max(time(block 1), time(block 2))
If block 1 takes O(1) and block 2 takes O(N), the if-then-else
statement would be O(N).
   How to Determine Complexities
Loops
      for I in 1 .. N loop
              sequence of statements
      end loop;
• The loop executes N times, so the sequence of
  statements also executes N times.
• If we assume the statements are O(1),
• then total time for the for loop is N * O(1), which is
  O(N) overall
   How to Determine Complexities
Nested loops
       for I in 1 .. N loop
                for J in 1 .. M loop
                         sequence of statements
                end loop;
       end loop;
• The outer loop executes N times.
• Every time the outer loop executes, the inner loop executes
  M times.
• As a result, the statements in the inner loop execute a total
  of N * M times.
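The N * M count can be verified directly in Python (a sketch; the counter stands in for the "sequence of statements"):

```python
def count_nested(N, M):
    """Count executions of the innermost statement in the nested loops."""
    count = 0
    for i in range(N):         # outer loop: N iterations
        for j in range(M):     # inner loop: M iterations per outer iteration
            count += 1         # the "sequence of statements"
    return count

# count_nested(3, 4) returns 12, i.e. N * M executions, hence O(N * M)
```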
                Recurrence Relations
The Tower of Hanoi
The puzzle consists of three pegs mounted on a board, together
with disks of different sizes. Initially these disks are placed
on the first peg in order of size, with the largest on the
bottom.
[Figure: the initial position in the Tower of Hanoi]
The rules of the puzzle allow disks to be moved one at a time
from one peg to another, as long as a disk is never placed on
top of a smaller disk.
[Figure: an intermediate position in the Tower of Hanoi]
The goal of the puzzle is to have all the disks on the 2nd peg
in order of size, with the largest on the bottom.
            Recurrence Relations
Let Hn denote the number of moves needed to solve the
Tower of Hanoi problem with n disks.
We set up a recurrence relation for the sequence {Hn}.
Begin with n disks on peg 1.
We can transfer the top n − 1 disks, following the rules of
the puzzle, to peg 3 using Hn-1 moves
We keep the largest disk fixed during these moves.
Then, we use one move to transfer the largest disk to the
second peg.
We can transfer the n − 1 disks on peg 3 to peg 2 using Hn-1
additional moves, placing them on top of the largest disk,
which always stays fixed on the bottom of peg 2.
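The three steps above (Hn-1 moves, then 1 move, then Hn-1 more) can be sketched as a move-counting function in Python (`hanoi_moves` is an illustrative name, not from the slides):

```python
def hanoi_moves(n):
    """Moves needed to transfer n disks, following the strategy above."""
    if n == 1:
        return 1                    # a single disk moves directly to peg 2
    smaller = hanoi_moves(n - 1)    # transfer top n-1 disks to peg 3
    return smaller + 1 + smaller    # + largest disk to peg 2, + n-1 disks onto it
```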
      Recurrence Relations - Solution
This shows that Hn = 2Hn-1 + 1.
The initial condition is H1 = 1,
because one disk can be transferred from peg 1 to peg 2,
according to the rules of the puzzle, in one move.
How to solve the recurrence relation?
Iterative approach: use the recurrence relation repeatedly to
express Hn in terms of previous terms of the sequence.
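Unrolling the recurrence in this way (a sketch of the standard iterative expansion) gives a closed form:

```latex
\begin{aligned}
H_n &= 2H_{n-1} + 1 \\
    &= 2(2H_{n-2} + 1) + 1 = 2^2 H_{n-2} + 2 + 1 \\
    &\;\;\vdots \\
    &= 2^{n-1} H_1 + 2^{n-2} + \cdots + 2 + 1 \\
    &= 2^{n-1} + 2^{n-2} + \cdots + 2 + 1 \qquad (H_1 = 1) \\
    &= 2^n - 1.
\end{aligned}
```

So solving the puzzle with n disks requires 2^n − 1 moves.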
                     Binary Search
Binary Search Algorithm
Given: Sorted Array, and a key
• Divide the search space into two halves by finding the middle
   index “mid”.
• Compare the middle element of the search space with the
   key.
• If the key is found at the middle element, the process is
   terminated.
• If the key is not found at the middle element, choose which half
   will be used as the next search space.
    – If the key is smaller than the middle element, then the left
       side is used for next search.
    – If the key is larger than the middle element, then the right
       side is used for next search.
• This process is continued until the key is found or the total
   search space is exhausted.
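The steps above can be sketched as an iterative function in Python (names are illustrative):

```python
def binary_search(a, key):
    """Return an index of key in sorted list a, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:                # search space a[lo..hi] is non-empty
        mid = (lo + hi) // 2       # middle index of the search space
        if a[mid] == key:
            return mid             # key found: terminate
        elif key < a[mid]:
            hi = mid - 1           # key is smaller: search the left half
        else:
            lo = mid + 1           # key is larger: search the right half
    return -1                      # search space exhausted

# binary_search([3, 7, 12, 24, 36], 24) returns 3
```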
                 Running time
Running Time for Binary Search
W(n) = 1 + W(n/2)
W(1) = 1
Solve by iteration/substitution.
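For n a power of 2, say n = 2^k, the recurrence unrolls as (a sketch):

```latex
\begin{aligned}
W(n) &= 1 + W(n/2) \\
     &= 2 + W(n/4) \\
     &\;\;\vdots \\
     &= k + W(n/2^k) = \log_2 n + W(1) = \log_2 n + 1,
\end{aligned}
```

so binary search runs in O(log n) time.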
                        Exercises
1. Find a recurrence relation and give initial conditions for
the number of bit strings of length n that do not have two
consecutive 0s.
2. A computer system considers a string of decimal digits a
valid codeword if it contains an even number of 0 digits. Let
Hn be the number of valid n-digit codewords. Find a
recurrence relation for Hn.
3. Find a recurrence relation for Cn, the number of ways to
parenthesize the product of n + 1 numbers, x0· x1· x2· ··· · xn,
to specify the order of multiplication.