Divide
Divide and
                               and Conquer
                                   Conquer
                           (Merge
                           (Merge Sort)
                                   Sort)
Comp 122, Spring 2004
               Divide and Conquer
      Recursive in structure
        Divide the problem into sub-problems that are
         similar to the original but smaller in size
        Conquer the sub-problems by solving them
         recursively. If they are small enough, just solve
         them in a straightforward manner.
        Combine the solutions to create a solution to
         the original problem
-2                          Comp 122
            Which are
           more difficult?
     Divided    Conquer      Combine
                                       3
-3
                                    Sort
      Obviously application
           Sort a list of names.
           Organize an MP3 library.
           Display Google PageRank results.
           List RSS news items in reverse chronological order.
      Problems become easy once items are in sorted order
           Find the median.
           Find the closest pair.
           Binary search in a database.
           Identify statistical outliers.
           Find duplicates in a mailing list.
                                                                  4
-4
      Non-obvious applications
          Data compression.
          Computer graphics.
          Computational biology.
          Supply chain management.
          Book recommendations on Amazon.
          Load balancing on a parallel computer.
          ....
                                                    5
-5
         Divide-and-Conquer Examples
      Sorting: mergesort and quicksort
      Binary tree traversals
      Binary search (?)
      Multiplication of large integers
      Matrix multiplication: Strassen’s algorithm
      Closest-pair and convex-hull algorithms
-6
          An Example: Merge Sort
     Sorting Problem: Sort a sequence of n elements into
       non-decreasing order.
      Divide: Divide the n-element sequence to be
       sorted into two subsequences of n/2 elements each
      Conquer: Sort the two subsequences recursively
       using merge sort.
      Combine: Merge the two sorted subsequences to
       produce the sorted answer.
-7                          Comp 122
                           Mergesort
      Split array A[1….n] into about equal halves and
       make copies of each half in arrays B and C
      Sort arrays B and C recursively
      Merge sorted arrays B and C into array A as
       follows:
        Repeat the following until no elements remain in one of the
         arrays:
          • compare the first elements in the remaining unprocessed
            portions of the arrays
          • copy the smaller of the two into A, while incrementing the
            index indicating the unprocessed portion of that array
        Once all elements in one of the arrays are processed, copy the
         remaining unprocessed elements from the other array into A.
-8
                       Merge Sort
      John von Neumann 1945.
      MERGE-SORT A[1 . . n]                     Divide
         1. If n = 1, done.
         2. Recursively sort A[ 1 . .
            n/2 ] and A[ n/2+1 . . n ] .
         3. “Merge” the 2 sorted lists.
      Conque
         r                                     Combine
      Key subroutine: “Merge”
                                                         9
-9
                     Merge-Sort (A, p, r)
       INPUT: a sequence of n numbers stored in array A
       OUTPUT: an ordered sequence of n numbers
       MergeSort (A, p, r) // sort A[p..r] by divide & conquer
       1 if p < r
       2    then q  (p+r)/2
       3       MergeSort (A, p, q)
       4       MergeSort (A, q+1, r)
       5       Merge (A, p, q, r) // merges A[p..q] with A[q+1..r]
        Initial Call: MergeSort(A, 1, n)
- 10                                  Comp 122
                Merge Sort – Example
            Original Sequence                      Sorted Sequence
       18 26 32 6 43 15 9           1      1   6    9   15 18 26 32 43
   18 26 32 6       43 15 9         1     6 18 26 32          1      9   15 43
                                                                            43
  18 26 32 6        43 15       9   1     18 26     6 32     15 43        1   9
 18 26 32      6    43 15       9   1    18 26 32       6    43 15       9    1
18 26 32       6    43 15       9   1
- 12                                Comp 122
                     Merge-Sort (A, p, r)
       INPUT: a sequence of n numbers stored in array A
       OUTPUT: an ordered sequence of n numbers
       MergeSort (A, p, r) // sort A[p..r] by divide & conquer
       1 if p < r
       2    then q  (p+r)/2
       3       MergeSort (A, p, q)
       4       MergeSort (A, q+1, r)
       5       Merge (A, p, q, r) // merges A[p..q] with A[q+1..r]
        Initial Call: MergeSort(A, 1, n)
- 13                                  Comp 122
                                 Procedure Merge
       Merge(A, p, q, r)
       1 n1  q – p + 1
       2 n2  r – q                               Input: Array containing
       3    for i  1 to n1
       4      do L[i]  A[p + i – 1]
                                                  sorted subarrays A[p..q]
       5    for j  1 to n2                       and A[q+1..r].
       6      do R[j]  A[q + j]
       7    L[n1+1]                             Output: Merged sorted
       8    R[n2+1]                             subarray in A[p..r].
       9    i1
       10 j  1
       11 for k p to r
       12     do if L[i]  R[j]
       13         then A[k]  L[i]
       14               ii+1                      Sentinels, to avoid having to
       15         else A[k]  R[j]                 check if either subarray is
       16               jj+1
                                                   fully copied at each step.
- 14                                   Comp 122
                           Merge – Example
       A           …       61   86 26
                                    8 32 1 32
                                       9 26 9 42 43                            …
                           k    k k     k       k k        k       k       k
           L       6   8 26 32             R      1   9 42 43                 
               i       i   i    i   i              j   j       j       j       j
- 15                                    Comp 122
                           Correctness of Merge
       Merge(A, p, q, r)
                                        Loop Invariant for the for loop
       1 n1  q – p + 1
       2 n2  r – q
                                        At the start of each iteration of the
       3    for i  1 to n1             for loop:
       4      do L[i]  A[p + i – 1]                   Subarray A[p..k – 1]
       5    for j  1 to n2             contains the k – p smallest elements
       6      do R[j]  A[q + j]        of L and R in sorted order.
       7    L[n1+1]                   L[i] and R[j] are the smallest elements of
       8    R[n2+1]                   L and R that have not been copied back into
       9    i1
                                        A.
       10 j  1
       11 for k p to r
       12     do if L[i]  R[j]
       13         then A[k]  L[i]      Initialization:
       14               ii+1           Before the first iteration:
       15         else A[k]  R[j]      •A[p..k – 1] is empty.
       16               jj+1           •i = j = 1.
                                        •L[1] and R[1] are the smallest
                                         elements of L and R not copied to A.
- 16                                   Comp 122
                           Correctness of Merge
       Merge(A, p, q, r)                Maintenance:
       1 n1  q – p + 1
       2 n2  r – q                     Case 1: L[i]  R[j]
       3    for i  1 to n1             •By LI, A contains p – k smallest elements
       4      do L[i]  A[p + i – 1]     of L and R in sorted order.
       5    for j  1 to n2             •By LI, L[i] and R[j] are the smallest
       6      do R[j]  A[q + j]        elements of L and R not yet copied into A.
       7    L[n1+1]                   •Line 13 results in A containing p – k + 1
       8    R[n2+1]                   smallest elements (again in sorted order).
       9    i1
                                        Incrementing i and k reestablishes the LI
       10 j  1
       11 for k p to r                 for the next iteration.
       12     do if L[i]  R[j]         Similarly for L[i] > R[j].
       13         then A[k]  L[i]      Termination:
       14               ii+1           •On termination, k = r + 1.
       15         else A[k]  R[j]
                                        •By LI, A contains r – p + 1 smallest
       16               jj+1
                                         elements of L and R in sorted order.
                                        •L and R together contain r – p + 3 elements.
                                         All but the two sentinels have been copied
                                         back into A.
- 17                                   Comp 122
                Analysis of Merge Sort
        Running time T(n) of Merge Sort:
        Merge Sort work correctly with any input i.e even or
         odd. But our analysis assume that input size is a power
         of 2.
        Merge sort on one element takes constant time
         whereas when we have n > 1 eleements, break down
         the running time as follows.
        When an algorithm contains a recursive call to itself,
         we describe its running time by a recurrence equation
- 18                            Comp 122
                Analysis of Merge Sort
        Divide: computing the middle of subarray, takes
         D(n) = (1)
        Conquer: solving 2 subproblems of size n/2 takes
         T(n) = 2T(n/2)
        Combine: merging n elements takes C(n) = (n)
        Total:
                    T(n) = (1)              if n = 1
                    T(n) = 2T(n/2) + (n) if n > 1
           T(n) = (n lg n) (can be proved using master theorem ,
           as logrithmic function grow more slowly than any linear
           function. Chapter 4)
- 19                             Comp 122
                 Recurrence Relations
        Solution Methods (Chapter 4)
           Substitution Method.
           Recursion-tree Method.
           Master Method.
        Recurrence relations arise when we analyze the
         running time of iterative or recursive algorithms.
           Ex: Divide and Conquer.
             T(n) = (1)                     if n  c
             T(n) = a T(n/b) + D(n) + C(n)   otherwise
- 20                             Comp 122
                  Substitution Method
        Guess the form of the solution, then
         use mathematical induction to show it correct.
           Substitute guessed answer for the function when the
            inductive hypothesis is applied to smaller values –
            hence, the name.
        Works well when the solution is easy to guess.
        No general way to guess the correct solution.
- 21                              Comp 122
           Example – Exact Function
       Recurrence: T(n) = 1                          if n = 1
                   T(n) = 2T(n/2) + n                 if n > 1
         Guess: T(n) = n lg n + n.
         Induction:
            •Basis: n = 1  n lgn + n = 1 = T(n).
            •Hypothesis: T(k) = k lg k + k for all k < n.
            •Inductive Step: T(n) = 2 T(n/2) + n
                                     = 2 ((n/2)lg(n/2) + (n/2)) + n
                                     = n (lg(n/2)) + 2n
                                       = n (lg(n) – lg(2)) + 2n
                                       = n lg n – nlg(2) + 2n
                                       = n lg n – n + 2n ; log2 2 = 1
- 22                                 = n lg n + n
                Recursion-tree Method
        Making a good guess is sometimes difficult
         with the substitution method.
        Use recursion trees to devise good guesses.
        Recursion Trees
          Show successive expansions of recurrences using
           trees.
          Keep track of the time spent on the subproblems of a
           divide and conquer algorithm.
          Help organize the algebraic bookkeeping necessary
           to solve a recurrence.
- 23                            Comp 122
             Recursion Tree – Example
        Running time of Merge Sort:
                     T(n) = (1)               if n = 1
                     T(n) = 2T(n/2) + (n)     if n > 1
        Rewrite the recurrence as
                     T(n) = c                   if n = 1
                     T(n) = 2T(n/2) + cn        if n > 1
            c > 0: Running time for the base case and
               time per array element for the divide and
               combine steps.
- 24                            Comp 122
            Recursion Tree for Merge Sort
       For the original problem,               Each of the size n/2 problems
       we have a cost of cn,                   has a cost of cn/2 plus two
       plus two subproblems                    subproblems, each costing
       each of size (n/2) and                  T(n/4).
       running time T(n/2).                                  cn
                  cn          Cost of divide and
                              merge.
                                                       cn/2       cn/2
       T(n/2)           T(n/2)
                                                  T(n/4) T(n/4) T(n/4) T(n/4)
                          Cost of sorting
                          subproblems.
- 25                                        Comp 122
             Recursion Tree for Merge Sort
       Continue expanding until the problem size reduces to 1.
                        cn                                              cn
                        cn/2        cn/2                                cn
   lg n
                 cn/4        cn/4 cn/4     cn/4                         cn
             c      c    c               c c c                           cn
                                                             Total   : cnlgn+cn
- 26                                              Comp 122
             Recursion Tree for Merge Sort
       Continue expanding until the problem size reduces to 1.
                   cn
                                      •Each level has total cost cn.
                                      •Each time we go down one level,
                                      the number of subproblems
                                      doubles, but the cost per
             cn/2      cn/2
                                      subproblem halves  cost per
                                      level remains the same.
                                      •There are lg n + 1 levels, height is
        cn/4    cn/4 cn/4   cn/4      lg n. (Assuming n is a power of 2.)
                                          •Can be proved by induction.
                                      •Total cost = sum of costs at each
                                      level = (lg n + 1)cn = cnlgn + cn =
                                      (n lgn).
       c c c              c c c
- 27                                   Comp 122
                      Other Examples
        Use the recursion-tree method to determine a
         guess for the recurrences
          T(n) = 3T(n/4) + (n2).
          T(n) = T(n/3) + T(2n/3) + O(n).
- 28                            Comp 122
          Recursion Trees – Caution Note
        Recursion trees only generate guesses.
          Verify guesses using substitution method.
        A small amount of “sloppiness” can be
         tolerated. Why?
        If careful when drawing out a recursion tree and
         summing the costs, can be used as direct proof.
- 29                            Comp 122
                    The Master Method
        Based on the Master theorem.
        “Cookbook” approach for solving recurrences
         of the form
           T(n) = aT(n/b) + f(n)
            • a  1, b > 1 are constants.
            • f(n) is asymptotically positive.
            • n/b may not be an integer, but we ignore floors and
              ceilings. Why?
        Requires memorization of three cases.
- 30                                Comp 122
                       The Master Theorem
 Theorem
  Theorem4.1    4.1
  Letaa  11 and
 Let            and bb >> 11be     beconstants,
                                      constants,let
                                                 letf(n)
                                                     f(n)be     beaafunction,
                                                                     function,and
                                                                               and
      Let
       LetT(n)
             T(n)bebedefined
                       definedon    onnonnegative
                                       nonnegativeintegers
                                                       integersby     bythe
                                                                         therecurrence
                                                                             recurrence
      T(n)
       T(n)==aT(n/b)
                aT(n/b)++f(n),  f(n),where
                                      wherewe wecan
                                                  canreplace
                                                         replacen/b  n/bbybyn/b  orn/b.
                                                                             n/bor  n/b.
      T(n)
       T(n)cancanbebebounded
                       boundedasymptotically
                                     asymptoticallyin     inthree
                                                                threecases:
                                                                      cases:
                              a–) for some constant  > 0, then T(n) = (nlog
 1.1. IfIf f(n) =  O(n            ) for some constant  > 0, then T(n) = (n ).         logaa).
                        log
                          loga–
            f(n) = O(n    b
                              b
                                                                                          b
                                                                                              b
            f(n)==
 2.2. IfIf f(n)     (n(nlog a),),then  T(n)==(
                                   thenT(n)      (nnloglogaalglgn).
                         log a
                           b
                               b
                                                        b
                                                            b     n).
            f(n)==(n
 3.3. IfIf f(n)     (nlog a+)) forforsome    constant>>0,0,
                                        someconstant
                        log a+
                          b
                              b
            and
             andif,
                  if,for
                      forsome
                            someconstant
                                    constantcc<<11andandall    allsufficiently
                                                                   sufficientlylarge
                                                                                largen,n,
            we
             wehave     a·f(n/b)ccf(n),
                 havea·f(n/b)            f(n),then  T(n)==
                                               thenT(n)            (f(n)).
                                                                     (f(n)).
         We’ll return to recurrences as we need them…
- 31                                         Comp 122