Module II
                                   DIVIDE AND CONQUER
                                       Contents
  1. Introduction to Divide and Conquer: General method
  2. Algorithms: Binary Search, Finding Maximum and Minimum, Merge Sort, Quick Sort
  3. Algorithm: Strassen's matrix multiplication
  4. Advantages and Disadvantages of divide and conquer
  5. Decrease and Conquer Approach: Introduction; Algorithm: Topological Sort
  Divide and Conquer is one of the best-known general algorithm design techniques. It
  works according to the following general plan:
      Given a function to compute on 'n' inputs, the divide-and-conquer strategy
      suggests splitting the inputs into 'k' distinct subsets, 1 < k ≤ n, yielding 'k'
      subproblems.
      These subproblems must be solved, and then a method must be found to
      combine the sub-solutions into a solution of the whole.
      If the subproblems are still relatively large, then the divide-and-conquer
      strategy can possibly be reapplied.
      Often the subproblems resulting from a divide-and-conquer design are of the
      same type as the original problem. For those cases the reapplication of the
      divide-and-conquer principle is naturally expressed by a recursive algorithm.
      Now smaller and smaller subproblems of the same kind are generated until
      eventually subproblems that are small enough to be solved without splitting
      are produced.
      A typical case with k=2 is diagrammatically shown below
       Small(P) is a Boolean-valued function that determines whether the input size
      is small enough that the answer can be computed without splitting. If Small(P)
      is true, then the function S is invoked.
       Otherwise the problem P is divided into subproblems. These subproblems
      are solved by recursive applications of DAndC.
      Finally, the solutions of the k subproblems are combined to obtain the solution
      of the given problem using a Combine function. (A Python sketch of this
      control abstraction follows.)
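  The control abstraction can be sketched in Python as follows. This is a minimal
  illustration of the plan above; the parameter names small, solve_directly, divide
  and combine are stand-ins for Small(P), S, the splitting step and Combine, and are
  not part of the original pseudocode.

    def d_and_c(p, small, solve_directly, divide, combine):
        # Control abstraction for divide and conquer: solve directly when the
        # instance is small, otherwise split, recurse on each piece, and
        # combine the sub-solutions.
        if small(p):
            return solve_directly(p)
        subproblems = divide(p)
        sub_solutions = [d_and_c(q, small, solve_directly, divide, combine)
                         for q in subproblems]
        return combine(sub_solutions)

    # Example use: summing a list by splitting it in half.
    total = d_and_c(
        [3, 1, 4, 1, 5, 9],
        small=lambda p: len(p) <= 1,
        solve_directly=lambda p: p[0] if p else 0,
        divide=lambda p: [p[:len(p)//2], p[len(p)//2:]],
        combine=sum)
    print(total)  # 23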
        If the size of problem P is n and the sizes of the k subproblems are n1, n2,
  ..., nk, respectively, then the computing time of divide and conquer is described by
  the recurrence relation

        T(n) = g(n)                                  if n is small,
        T(n) = T(n1) + T(n2) + ... + T(nk) + f(n)    otherwise,

where,
         T(n) is the time for divide and conquer on any input of size n,
         g(n) is the time to compute the answer directly for small inputs, and
         f(n) is the time for dividing the problem P and combining the solutions
         to the subproblems.
      For divide-and-conquer-based algorithms that produce subproblems of the
      same type as the original problem, it is very natural to describe them by using
      recursion.
  The Master Theorem states that, for the recurrence equation T(n) = aT(n/b) + f(n),
  if f(n) ∈ Θ(n^d) where d ≥ 0, then

        T(n) ∈ Θ(n^d)             if a < b^d,
        T(n) ∈ Θ(n^d log n)       if a = b^d,
        T(n) ∈ Θ(n^(log_b a))     if a > b^d.
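  To see how the theorem is applied, consider the comparison recurrence of mergesort,
  C(n) = 2C(n/2) + n (this worked example is mine, not from the original notes): here
  a = 2, b = 2 and f(n) ∈ Θ(n^1), so d = 1. Since a = b^d (2 = 2^1), the second case
  applies and C(n) ∈ Θ(n log n).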
  Binary Search
  The binary search problem can be solved with either a recursive or a
  non-recursive (iterative) algorithm.
   Algorithm BinSearch(A, low, high, x)
         // Given an array A[low : high] of elements in nondecreasing order, 1 ≤ low ≤ high,
         // determine whether x is present, and if so return j such that x = A[j]; else return 0.
          {
                  if (low = high) then      // If Small(P)
                  {
                      if (x = A[high]) then return high;
                      else return 0;
                  }
                  else
                  {   // Reduce P into a smaller subproblem.
                      mid := ⌊(low + high)/2⌋;
                      if (x = A[mid]) then return mid;
                      else if (x < A[mid]) then return BinSearch(A, low, mid - 1, x);
                      else return BinSearch(A, mid + 1, high, x);
                  }
          }
  The above algorithm is a recursive binary search algorithm.
   Explanation:
  The problem is subdivided into smaller problems until only one single element is left.
      1. If low = high, then there is only one element. Compare it with the search
          element x; if they are equal, return the index, else return 0.
      2. If low ≠ high, then divide the problem into smaller subproblems:
              a. Calculate mid = ⌊(low + high)/2⌋.
              b. Compare x with the element at mid; if they are equal, return the index mid.
              c. Else, if x is less than A[mid], the element x would be in the first
                 partition A[low : mid-1]. So make a recursive call to BinSearch with
                 low as low and high as mid-1.
              d. Else, x is greater than A[mid], so the element x would be in the
                 second partition A[mid+1 : high]. So make a recursive call to
                 BinSearch with low as mid+1 and high as high. (A runnable sketch
                 of these steps follows.)
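  The following is a minimal Python rendering of the recursive algorithm above. It
  uses 0-based indexing (the pseudocode is 1-based) and returns -1 rather than 0 for
  a failed search, since 0 is a valid Python index; both adjustments are mine, not
  part of the original.

    def bin_search(a, low, high, x):
        # Recursive binary search on the sorted slice a[low..high].
        if low > high:           # empty range: x is absent
            return -1
        if low == high:          # Small(P): one element left
            return low if a[low] == x else -1
        mid = (low + high) // 2
        if a[mid] == x:
            return mid
        if x < a[mid]:
            return bin_search(a, low, mid - 1, x)
        return bin_search(a, mid + 1, high, x)

    a = [-15, -6, 0, 7, 9, 23, 54, 82]
    print(bin_search(a, 0, len(a) - 1, 9))    # 4
    print(bin_search(a, 0, len(a) - 1, 10))   # -1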
   The below algorithm is an iterative algorithm for Binary search.
      Algorithm BinSearch(a, n, x)
      // Given an array a[1 : n] of elements in nondecreasing order, n ≥ 0,
      // determine whether x is present, and if so return j such that x = a[j]; else return 0.
      {
           low := 1; high := n;
           while (low ≤ high) do
           {
                  mid := ⌊(low + high)/2⌋;
                  if (x < a[mid]) then high := mid - 1;
                  else if (x > a[mid]) then low := mid + 1;
                  else return mid;
           }
           return 0;
       }
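  An equivalent rendering of the iterative version in Python (again 0-based and
  returning -1 on failure; both conventions are my adaptation):

    def bin_search_iter(a, x):
        # Iterative binary search over the whole sorted list a.
        low, high = 0, len(a) - 1
        while low <= high:
            mid = (low + high) // 2
            if x < a[mid]:
                high = mid - 1
            elif x > a[mid]:
                low = mid + 1
            else:
                return mid
        return -1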
  Example:
  Consider the elements:
      -15, -6, 0, 7, 9, 23, 54, 82, 101, 112, 125, 131, 142, 151.
   Place them in a[1:14], and simulate the steps BinSearch goes through as it
  searches for different values of x. Only the variables low, high and mid need to
  be traced as we simulate the algorithm. We try the following values for x: 151,
  -14 and 9, giving two successful searches and one unsuccessful search.
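  A small tracing harness (my addition, using 0-based indexing) prints low, high and
  mid at each step, which reproduces the kind of simulation asked for above:

    def trace_search(a, x):
        # Print (low, high, mid) at each iteration of binary search.
        low, high = 0, len(a) - 1
        while low <= high:
            mid = (low + high) // 2
            print(low, high, mid)
            if x < a[mid]:
                high = mid - 1
            elif x > a[mid]:
                low = mid + 1
            else:
                return mid
        return -1

    a = [-15, -6, 0, 7, 9, 23, 54, 82, 101, 112, 125, 131, 142, 151]
    for x in (151, -14, 9):
        print("searching", x, "->", trace_search(a, x))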
        We observe that each time through the loop the possible elements to be
        checked for equality with x are a[low], a[low+1], ..., a[mid], ..., a[high].
        If x = a[mid], then the algorithm terminates successfully.
        Otherwise, the range is narrowed by either increasing low to mid+1 or
        decreasing high to mid-1.
        Clearly, this narrowing of the range does not affect the outcome of the
        search. If low becomes greater than high, then x is not present and hence
        the loop is exited.
         Each internal node represents a successful search, and the values denoted
      by the internal nodes are the various values taken by the variable mid.
         If the value of mid is at level zero, then the number of comparisons for an
      element present at that position is one.
       The number of comparisons for a successful search when the mid value is at
      level one is two, and the maximum number of comparisons for a successful
      search in the above decision tree is four; i.e., the maximum number of
      comparisons is proportional to the height of the tree. Therefore the time
      complexity for a successful search is O(log2 n).
        Each unsuccessful search terminates at an external node. The number of
      comparisons needed for an unsuccessful search of an element less than -15
      is 3. For all the remaining unsuccessful searches, the number of comparisons
      made is 4. Therefore the average number of element comparisons for an
      unsuccessful search is (3 + 14 · 4)/15 = 59/15 ≈ 3.93.
      Except for one case, the number of comparisons for an unsuccessful search
      is constant and equal to the height of the tree. Therefore the time for an
      unsuccessful search is O(log2 n).
  Finding the Maximum and Minimum
  Explanation:
        StraightMaxMin requires 2(n-1) element comparisons in the best, average
          and worst cases.
        By realizing that the comparison a[i] < min is necessary only when
          a[i] > max is false, the algorithm can be improved. Hence we can replace
          the contents of the for loop by:

                 if (a[i] > max) then max := a[i];
                 else if (a[i] < min) then min := a[i];

        Now the best case occurs when the elements are in increasing order: only
          the first if condition is tested (i.e., it is true every time), so the
          number of comparisons is (n-1).
        The worst case occurs when the elements are in decreasing order: the
          first condition is always evaluated as false and the second condition is
          then evaluated as well, so the number of comparisons is 2(n-1). (A
          counting sketch follows.)
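  A small Python sketch (my construction) of the improved straightforward scan,
  instrumented to count element comparisons so the best- and worst-case counts
  above can be verified:

    def straight_max_min(a):
        # One pass; counts element comparisons to illustrate the analysis.
        comparisons = 0
        mx = mn = a[0]
        for x in a[1:]:
            comparisons += 1
            if x > mx:
                mx = x
            else:
                comparisons += 1
                if x < mn:
                    mn = x
        return mx, mn, comparisons

    print(straight_max_min([1, 2, 3, 4, 5]))   # (5, 1, 4)  best case: n-1
    print(straight_max_min([5, 4, 3, 2, 1]))   # (5, 1, 8)  worst case: 2(n-1)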
       Let P = (n, a[i], ..., a[j]) denote an arbitrary instance of the problem. Here
         n is the number of elements in the list a[i], ..., a[j], and we are interested
         in finding the maximum and minimum of the list.
       Let Small(P) be true when n ≤ 2. In this case, the maximum and minimum
         are both a[i] if n = 1, and if n = 2 the problem can be solved by making
         one comparison.
       If the list has more than 2 elements, P has to be divided into smaller
         instances. For example, we might divide P into the two instances
         P1 = (⌊n/2⌋, a[1], ..., a[⌊n/2⌋]) and P2 = (n - ⌊n/2⌋, a[⌊n/2⌋ + 1], ..., a[n]).
       After having divided P into two smaller subproblems, we can solve them
         by recursively invoking the same divide-and-conquer algorithm.
       When the maxima and minima of these subproblems are determined, the
         two maxima are compared and the two minima are compared to obtain the
         solution for the entire problem. (A Python sketch follows.)
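  A minimal recursive MaxMin in Python (0-based indexing; my adaptation of the idea
  in the pseudocode):

    def max_min(a, i, j):
        # Return (max, min) of a[i..j] by divide and conquer.
        if i == j:                      # Small(P): one element
            return a[i], a[i]
        if j == i + 1:                  # Small(P): two elements, one comparison
            return (a[i], a[j]) if a[i] > a[j] else (a[j], a[i])
        mid = (i + j) // 2
        max1, min1 = max_min(a, i, mid)
        max2, min2 = max_min(a, mid + 1, j)
        return max(max1, max2), min(min1, min2)

    a = [22, 13, -5, -8, 15, 60, 17, 31, 47]
    print(max_min(a, 0, len(a) - 1))    # (60, -8)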
   We see that the root node contains 1 and 9 as the values of i and j,
      corresponding to the initial call to MaxMin. This execution produces two new
      calls to MaxMin, where i and j have the values 1, 5 and 6, 9, respectively,
      splitting the set into two subsets of roughly the same size. From the tree we
      can immediately see that the maximum depth of recursion is four (including
      the first call). The circled numbers in the upper left corner of each node
      represent the orders in which max and min are assigned values.
  Time Complexity:
   What is the number of element comparisons needed for MaxMin? If T(n)
      represents this number, then the resulting recurrence relation is

        T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 2   for n > 2,
        T(2) = 1,
        T(1) = 0.

   Note that 3n/2 - 2 is the best-, average- and worst-case number of comparisons
      when n is a power of two.
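  When n is a power of 2, say n = 2^k, the recurrence can be solved by repeated
  substitution (a standard derivation, written out here for completeness):

        T(n) = 2T(n/2) + 2
             = 2(2T(n/4) + 2) + 2
             = 4T(n/4) + 4 + 2
             ...
             = 2^(k-1) T(2) + (2^(k-1) + ... + 4 + 2)
             = 2^(k-1) + 2^k - 2
             = 3n/2 - 2.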
  Comparisons with Straight Forward Method:
       Compared with the 2n – 2 comparisons for the Straight Forward method, this is
  a saving of 25% in comparisons. It can be shown that no algorithm based on
  comparisons uses less than 3n/2 – 2 comparisons.
Space Complexity
     Compared with the straightforward method, the recursive MaxMin needs extra
     stack space: the depth of recursion is about log2 n, and at each level the
     values of i, j and the returned maxima and minima must be saved, so the space
     overhead is O(log n).

Merge Sort
     Mergesort sorts a given array A[1], ..., A[n] by dividing it into two halves
     A[1], ..., A[⌊n/2⌋] and A[⌊n/2⌋+1], ..., A[n], sorting each of them recursively,
     and then merging the two smaller sorted arrays into a single sorted one.
       The following algorithm describes the process using recursion, and the
     algorithm Merge merges two sorted sets.
           The merging of two sorted arrays can be done as follows. Two pointers
      (array indices) are initialized to point to the first elements of the arrays
      being merged.
      The elements pointed to are compared, and the smaller of them is added to a new
      array being constructed; after that, the index of the smaller element is incremented
      to point to its immediate successor in the array it was copied from.
      This operation is repeated until one of the two given arrays is exhausted, and then
      the remaining elements of the other array are copied to the end of the new array.
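  A compact Python sketch of mergesort following this description (my rendering;
  the notes' original pseudocode is not reproduced here):

    def merge(left, right):
        # Merge two sorted lists into one sorted list.
        result = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:        # <= keeps the sort stable
                result.append(left[i]); i += 1
            else:
                result.append(right[j]); j += 1
        result.extend(left[i:])            # copy whichever tail remains
        result.extend(right[j:])
        return result

    def merge_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

    print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]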
 Analysis
       Here the basic operation is key comparison. As mergesort's execution does
      not depend on the order of the data, the best-case and average-case runtimes
      are the same as the worst-case runtime.
        Assuming for simplicity that the total number of elements n is a power of 2,
      the recurrence relation for the number of key comparisons C(n) is

        C(n) = 2C(n/2) + Cmerge(n)   for n > 1,   C(1) = 0,

      where Cmerge(n) is the number of key comparisons made during the merging stage.
        Let us analyze Cmerge(n). At each step, exactly one comparison is made,
      after which the total number of elements in the two arrays still needing to
      be processed is reduced by 1.
      In the worst case, neither of the two arrays becomes empty before the other one
      contains just one element (e.g., smaller elements may come from the alternating
      arrays). Therefore, for the worst case, Cmerge(n) = n - 1, and we have the
      recurrence

        Cworst(n) = 2Cworst(n/2) + n - 1   for n > 1,   Cworst(1) = 0,

      whose solution (exactly, for n = 2^k) is Cworst(n) = n log2 n - n + 1 ∈ Θ(n log n).
       For large n, the number of comparisons made by this algorithm in the average
      case turns out to be about 0.25n less and hence is also in Θ (n log n).
            Mergesort is a stable algorithm.
  Limitations
      The principal shortcoming of mergesort is the linear amount of extra storage
      it requires: merging is not done in place, and this space overhead can be a
      problem for very large arrays.
Quick Sort
        Quicksort is the other important sorting algorithm that is based on the
      divide-and-conquer approach. Unlike mergesort, which divides its input
      elements according to their position in the array, quicksort divides them
      according to their value.
      A partition is an arrangement of the array's elements so that all the elements to the
      left of some element A[s] are less than or equal to A[s], and all the elements to the
      right of A[s] are greater than or equal to it.
       Obviously, after a partition is achieved, A[s] will be in its final position in the sorted
      array, and we can continue sorting the two subarrays to the left and to the right of
      A[s] independently (e.g., by the same method).
         Note the difference with mergesort: there, the division of the problem into two
      subproblems is immediate and the entire work happens in combining their solutions;
      here, the entire work happens in the division stage, with no work required to
      combine the solutions to the subproblems.
Partitioning
         Partitioning starts by selecting a pivot; for simplicity we use the
      subarray's first element, p = A[l]. We now scan the subarray from both ends,
      comparing the subarray's elements to the pivot.
          The left-to-right scan, denoted below by index pointer i, starts with the
             second element. Since we want elements smaller than the pivot to be in the
             left part of the subarray, this scan skips over elements that are smaller than
             the pivot and stops upon encountering the first element greater than or equal
             to the pivot.
          The right-to-left scan, denoted below by index pointer j, starts with the last
             element of the subarray. Since we want elements larger than the pivot to
             be in the right part of the subarray, this scan skips over elements that
             are larger than the pivot and stops on encountering the first element
             smaller than or equal to the pivot.
      After both scans stop, three situations may arise, depending on whether or not the
      scanning indices have crossed.
   If the scanning indices i and j have not crossed, i.e., i < j, we simply exchange
      A[i] and A[j] and resume the scans by incrementing i and decrementing j,
      respectively.
   If the scanning indices have crossed over, i.e., i > j, we will have partitioned the
      subarray after exchanging the pivot with A[j].
   Finally, if the scanning indices stop while pointing to the same element, i.e., i = j,
      the value they are pointing to must be equal to p (why?). Thus, we have the
      subarray partitioned, with the split position s = i = j. We can combine the last
      case with the case of crossed-over indices (i > j) by exchanging the pivot with
      A[j] whenever i ≥ j. (A Python sketch of this partitioning scheme appears after
      the figure below.)
     Figure: Example of quicksort's operation (the step-by-step trace is not
     reproduced legibly here). The array being sorted is 5, 3, 1, 9, 8, 2, 4, 7;
     the first partition around the pivot 5 produces 2, 3, 1, 4, 5, 8, 9, 7 with
     split position s = 4, after which the left subarray 2, 3, 1, 4 and the right
     subarray 8, 9, 7 are sorted recursively.
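  The partitioning scheme and the recursive structure above can be rendered in
  Python roughly as follows (my sketch of a Hoare-style two-scan partition with the
  first element as the pivot; it reproduces the trace in the figure, but it is an
  illustration rather than the notes' original pseudocode):

    def partition(a, l, r):
        # Partition a[l..r] around the pivot p = a[l]; return the split position.
        p = a[l]
        i, j = l + 1, r
        while True:
            while i <= r and a[i] < p:      # left-to-right scan
                i += 1
            while a[j] > p:                 # right-to-left scan
                j -= 1
            if i >= j:                      # scans crossed (or met): place pivot
                a[l], a[j] = a[j], a[l]
                return j
            a[i], a[j] = a[j], a[i]         # exchange and resume the scans
            i += 1
            j -= 1

    def quicksort(a, l=0, r=None):
        if r is None:
            r = len(a) - 1
        if l < r:
            s = partition(a, l, r)
            quicksort(a, l, s - 1)          # sort the left subarray
            quicksort(a, s + 1, r)          # sort the right subarray

    nums = [5, 3, 1, 9, 8, 2, 4, 7]
    quicksort(nums)
    print(nums)  # [1, 2, 3, 4, 5, 7, 8, 9]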
Analysis
Best Case
   If all the splits happen in the middle of the corresponding subarrays, we have
      the best case.
 The number of key comparisons in the best case satisfies the recurrence

        Cbest(n) = 2Cbest(n/2) + n   for n > 1,   Cbest(1) = 0.

   According to the Master Theorem, Cbest(n) ∈ Θ(n log2 n); solving it exactly for
      n = 2^k yields Cbest(n) = n log2 n.
  Worst Case
   In the worst case, all the splits will be skewed to the extreme: one of the two
      subarrays will be empty, and the size of the other will be just 1 less than the
      size of the subarray being partitioned. This unfortunate situation will happen,
      in particular, for increasing arrays, i.e., for inputs for which the problem is
      already solved.
   Indeed, if A[0..n-1] is a strictly increasing array and we use A[0] as the pivot,
      the left-to-right scan will stop on A[1] while the right-to-left scan will go all
      the way to reach A[0], indicating the split at position 0. So, after making n + 1
      comparisons to get to this partition and exchanging the pivot A[0] with itself,
      the algorithm will be left with the strictly increasing array A[1..n-1] to sort.
      This sorting of strictly increasing arrays of diminishing sizes will continue
      until the last one, A[n-2..n-1], has been processed. The total number of key
      comparisons made will be equal to

        Cworst(n) = (n + 1) + n + ... + 3 = (n + 1)(n + 2)/2 - 3 ∈ Θ(n²).
 Average Case
   Let Cavg(n) be the average number of key comparisons made by quicksort on a
      randomly ordered array of size n. Assuming that after the n + 1 comparisons
      needed to achieve a partition, the split can happen in each position s
      (0 ≤ s ≤ n-1) with the same probability 1/n, we get the following recurrence
      relation:

        Cavg(n) = (1/n) Σ (s = 0 to n-1) [(n + 1) + Cavg(s) + Cavg(n - 1 - s)]   for n > 1,
        Cavg(0) = 0,  Cavg(1) = 0.

   Its solution, which is much trickier than the worst- and best-case analyses, turns
      out to be

        Cavg(n) ≈ 2n ln n ≈ 1.39 n log2 n.
   Thus, on the average, quicksort makes only 39% more comparisons than in
      the best case.
   Moreover, its innermost loop is so efficient that it usually runs faster than
      mergesort on randomly ordered arrays of nontrivial sizes. This certainly justifies
      the name given to the algorithm by its inventor.
Variations
  Because of quicksort's importance, there have been persistent efforts over the
  years to refine the basic algorithm. Among several improvements discovered by
  researchers are:
       Better pivot selection methods, such as randomized quicksort, which uses a
         random element, or the median-of-three method, which uses the median of
         the leftmost, rightmost, and middle elements of the array (a small sketch
         follows this list)
       Switching to insertion sort on very small subarrays (between 5 and 15
         elements for most computer systems), or not sorting small subarrays at
         all and finishing the algorithm with insertion sort applied to the entire
         nearly sorted array
       Modifications of the partitioning algorithm, such as the three-way partition
         into segments smaller than, equal to, and larger than the pivot
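  As an illustration of the median-of-three idea, here is a hypothetical helper (my
  construction, not from the notes) that moves the median of A[l], A[mid], A[r] into
  the pivot position so the first-element partitioning above can be reused unchanged:

    def median_of_three_pivot(a, l, r):
        # Swap the median of a[l], a[mid], a[r] into position l,
        # so the existing first-element-pivot partition can be applied.
        mid = (l + r) // 2
        trio = sorted([(a[l], l), (a[mid], mid), (a[r], r)])
        _, m = trio[1]                  # index of the median value
        a[l], a[m] = a[m], a[l]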
  Limitations
   It is not stable.
   It requires a stack to store the parameters of subarrays that are yet to be sorted.
   Its performance on randomly ordered arrays is known to be sensitive not only
      to implementation details of the algorithm but also to both computer
      architecture and data type.
  3. Strassen's Matrix Multiplication
  Direct Method
      Suppose we want to multiply two n × n matrices, A and B. Their product,
      C = AB, is an n × n matrix whose entries are

        C(i, j) = Σ (k = 1 to n) A(i, k) · B(k, j),

      so the direct method performs n multiplications per entry, i.e., n³
      multiplications in all.
  Strassen's Method
      Strassen observed that the product of two 2 × 2 matrices can be computed with
      only seven multiplications instead of eight:

        | C00  C01 |   | A00  A01 |   | B00  B01 |
        |          | = |          | * |          |
        | C10  C11 |   | A10  A11 |   | B10  B11 |

      where
        C00 = M1 + M4 - M5 + M7,    C01 = M3 + M5,
        C10 = M2 + M4,              C11 = M1 - M2 + M3 + M6,
      and
        M1 = (A00 + A11)(B00 + B11),    M2 = (A10 + A11)B00,
        M3 = A00(B01 - B11),            M4 = A11(B10 - B00),
        M5 = (A00 + A01)B11,            M6 = (A10 - A00)(B00 + B01),
        M7 = (A01 - A11)(B10 + B11).

      For n × n matrices with n a power of 2, A, B and C can each be partitioned
      into four (n/2) × (n/2) submatrices and the same scheme applied blockwise.
  It is not difficult to verify that one can treat these submatrices as numbers to get
  the correct product. For example, C00 can be computed either as A00 * B00 + A01 *
  B10 or as M1 + M4 - M5 + M7, where M1, M4, M5, and M7 are found by Strassen's
  formulas, with the numbers replaced by the corresponding submatrices. If the
  seven products of (n/2) × (n/2) matrices are computed recursively by the same
  method, we have Strassen's algorithm for matrix multiplication.
  Analysis
      Here the basic operation is multiplication.
      If M(n) is the number of multiplications made by Strassen's algorithm in
      multiplying two n × n matrices (where n is a power of 2), we get the following
      recurrence relation for it:

        M(n) = 7M(n/2)   for n > 1,   M(1) = 1.

   Since M(n) = 7M(n/2) = 7²M(n/4) = ... = 7^(log2 n) M(1) = n^(log2 7), this
    implies M(n) ∈ Θ(n^2.807), which is smaller than the n³ required by the
    brute-force algorithm.
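  A compact Strassen sketch in Python using NumPy for the block arithmetic (my
  rendering; it assumes n is a power of 2 and falls back to direct multiplication
  for small blocks):

    import numpy as np

    def strassen(A, B, cutoff=2):
        # Multiply square matrices A and B (n a power of 2) by Strassen's method.
        n = A.shape[0]
        if n <= cutoff:
            return A @ B                      # direct method for small blocks
        h = n // 2
        A00, A01, A10, A11 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B00, B01, B10, B11 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A00 + A11, B00 + B11, cutoff)
        M2 = strassen(A10 + A11, B00, cutoff)
        M3 = strassen(A00, B01 - B11, cutoff)
        M4 = strassen(A11, B10 - B00, cutoff)
        M5 = strassen(A00 + A01, B11, cutoff)
        M6 = strassen(A10 - A00, B00 + B01, cutoff)
        M7 = strassen(A01 - A11, B10 + B11, cutoff)
        C = np.empty_like(A)
        C[:h, :h] = M1 + M4 - M5 + M7
        C[:h, h:] = M3 + M5
        C[h:, :h] = M2 + M4
        C[h:, h:] = M1 - M2 + M3 + M6
        return C

    A = np.random.randint(0, 10, (8, 8))
    B = np.random.randint(0, 10, (8, 8))
    print(np.array_equal(strassen(A, B), A @ B))   # True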
Advantages
   The divide and conquer paradigm often makes efficient use of memory caches:
      once a subproblem is small enough to fit in the cache, it is solved inside the
      cache, and the algorithm reuses the cached data until the sub-problem has been
      completely solved.
   It allows us to solve difficult and often impossible-looking problems, such as
      the Tower of Hanoi, which is a mathematical game or puzzle. Being given a
      difficult problem can often be discouraging if there is no idea how to go about
      solving it. However, the divide and conquer method reduces the degree of
      difficulty, since it divides the problem into subproblems that are easily
      solvable, and it usually runs faster than other algorithms would.
   Another advantage of this paradigm is that it often plays a part in finding other
      efficient algorithms; in fact, it played the central role in finding the quick
      sort and merge sort algorithms.
Disadvantages
   One of the most common issues with this sort of algorithm is the fact that
      the recursion is slow, which in some cases outweighs any advantages of
      this divide and conquer process.
   Sometimes it can become more complicated than a basic iterative approach,
      especially in cases with a large n. In other words, if someone wanted to add a
      large list of numbers together, simply writing a loop to add them up would be
      a much simpler approach than dividing the numbers into two groups, adding
      each group recursively, and then adding the sums of the two groups together.
   Another downfall is that sometimes, once the problem is broken down into sub
      problems, the same sub problem can occur many times. In cases like these, it
      can often be easier to identify and save the solution to the repeated sub
      problem, a technique commonly referred to as memoization.
  Decrease and Conquer Approach
  The decrease-and-conquer technique is based on exploiting the relationship between
  a solution to a given instance of a problem and a solution to its smaller instance.
  There are three major variations of decrease-and-conquer:
         1. Decrease by a constant (usually by 1). Examples:
                 a. Insertion sort
                 b. Graph traversal algorithms (DFS and BFS)
                 c. Topological sorting
                 d. Algorithms for generating permutations and subsets
         2. Decrease by a constant factor (usually by half). Example: binary search
         3. Variable-size decrease. Example: Euclid's algorithm
 Decrease by a Constant Factor (usually by 2):
  The decrease-by-a-constant-factor technique suggests reducing a problem
  instance by the same constant factor on each iteration of the algorithm. In most
  applications, this constant factor is equal to two. The decrease-by-half idea is
  illustrated in the following figure.
 For an example, let us revisit the exponentiation problem. If the instance of size n
 is to compute a^n, the instance of half its size is to compute a^(n/2), with the
 obvious relationship between the two: a^n = (a^(n/2))².
  But since we consider here instances with integer exponents only, the former does
  not work for odd n. If n is odd, we have to compute a^(n-1) by using the rule for
  even-valued exponents and then multiply the result by a. To summarize, we have the
  following formula:

        a^n = (a^(n/2))²            if n is even and positive,
        a^n = (a^((n-1)/2))² · a    if n is odd,
        a^n = 1                     if n = 0.
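  A direct Python transcription of this formula (my sketch):

    def power(a, n):
        # Decrease-by-half exponentiation: a**n for integer n >= 0.
        if n == 0:
            return 1
        half = power(a, n // 2)    # a^(n/2) or a^((n-1)/2)
        if n % 2 == 0:
            return half * half
        return half * half * a

    print(power(2, 10))  # 1024
    print(power(3, 7))   # 2187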
    Topological Sort
  Background
  A directed graph, or digraph for short, is a graph with directions specified for all its
  edges. The adjacency matrix and adjacency lists are still two principal means of
  representing a digraph.
  There are only two notable differences between undirected and directed graphs
  in representing them: (1) the adjacency matrix of a directed graph does not have
  to be symmetric; (2) an edge in a directed graph has just one (not two)
  corresponding nodes in the digraph's adjacency lists.
  Depth-first search and breadth-first search are principal traversal algorithms for
  traversing digraphs as well, but the structure of the corresponding forests can be
  more complex than for undirected graphs. Thus, even for the simple example
  (Figure a), the depth-first search forest (Figure b) exhibits all four types of
  edges possible in a DFS forest of a directed graph:
                 tree edges (ab, bc, de),
                 back edges (ba) from vertices to their ancestors,
                 forward edges (ac) from vertices to their descendants in the tree
                 other than their children, and
                 cross edges (dc), which are none of the aforementioned types.
  Note that a back edge in a DFS forest of a directed graph can connect a vertex to its
  parent. Whether or not it is the case, the presence of a back edge indicates that the
  digraph has a directed cycle. A directed cycle in a digraph is a sequence of three
  or more of its vertices that starts and ends with the same vertex and in which every
  vertex is connected to its immediate predecessor by an edge directed from the
  predecessor to the successor. For example, a, b, a is a directed cycle in the
  digraph of Figure (a) above.
      Let us consider a set of five required courses {C1, C2, C3, C4, C5} a
      part-time student has to take in some degree program.
      The courses can be taken in any order as long as the following course
      prerequisites are met:
      -   C1 and C2 have no prerequisites,
      -   C3 requires C1 and C2,
      -   C4 requires C3,
      -   and C5 requires C3 and C4.
      The student can take only one course per term. In which order should the
      student take the courses?
          In terms of this digraph, the question is whether we can list its vertices in
      such an order that for every edge in the graph, the vertex where the edge
      starts is listed before the vertex where the edge ends. (Can you find such an
      ordering of this digraph's vertices?) This problem is called topological
      sorting.
           It can be posed for an arbitrary digraph, but it is easy to see that the problem
      cannot have a solution if a digraph has a directed cycle.
          Thus, for topological sorting to be possible, a digraph in question must be a
      directed acyclic graph (dag).
      Two algorithms for obtaining a topological sort of a digraph are as follows:
       The first algorithm is a simple application of depth-first search:
      -     perform a DFS traversal and note the order in which vertices become dead-
            ends (i.e., popped off the traversal stack).
      -     Reversing this order yields a solution to the topological sorting problem,
            provided, of course, no back edge has been encountered during the traversal.
            If a back edge has been encountered, the digraph is not a dag, and
            topological sorting of its vertices is impossible.
     Figure: (a) Digraph for which the topological sorting problem needs to be solved.
          (b) DFS traversal stack with the subscript numbers indicating the popping-off
                               order. (c) Solution to the problem.
       The second algorithm is based on a direct implementation of the
      decrease-by-one technique, and is called the source-removal algorithm:
      -     repeatedly, identify in the remaining digraph a source, which is a vertex
            with no incoming edges, and delete it along with all the edges outgoing
            from it (if there are several sources, break the tie arbitrarily; if there
            are none, stop because the problem cannot be solved).
      -     The order in which the vertices are deleted yields a solution to the
            topological sorting problem.
  Note: The solution obtained by the source-removal algorithm is different from the
  one obtained by the DFS-based algorithm. Both of them are correct, because the
  topological sorting problem may have several alternative solutions.
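  A Python sketch of the DFS-based method (my rendering): perform a DFS, append each
  vertex to a list when it becomes a dead end (i.e., is popped off the traversal
  stack), and reverse that order; a back edge, i.e., an edge to a vertex still on
  the recursion stack, means the digraph is not a dag.

    def topological_sort(graph):
        # graph: dict mapping each vertex to a list of its successors.
        # Returns vertices in topological order, or raises on a directed cycle.
        WHITE, GRAY, BLACK = 0, 1, 2     # unvisited / on stack / finished
        color = {v: WHITE for v in graph}
        order = []

        def dfs(v):
            color[v] = GRAY
            for w in graph[v]:
                if color[w] == GRAY:              # back edge: directed cycle
                    raise ValueError("not a dag: topological sort impossible")
                if color[w] == WHITE:
                    dfs(w)
            color[v] = BLACK
            order.append(v)                       # v popped off: dead end

        for v in graph:
            if color[v] == WHITE:
                dfs(v)
        return order[::-1]                        # reverse the popping-off order

    # The five-course example: C3 requires C1 and C2, C4 requires C3,
    # and C5 requires C3 and C4.
    courses = {'C1': ['C3'], 'C2': ['C3'], 'C3': ['C4', 'C5'], 'C4': ['C5'], 'C5': []}
    print(topological_sort(courses))   # ['C2', 'C1', 'C3', 'C4', 'C5']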