Libya free
High study academy / Misrata
Design and Analysis of Algorithms
Report of address:-
The Standard Matrix Multiplicationn & Strassens Algorithm for Matrix Multiplication
preface from student :Basma Mohammed Algubii Lecturer:Dr. Adel Smeda
Autumn 2011-2012
content
Introduction
The multiplication of two matrices is one of the most basic operations of linear algebra and scientific computing and has provided an important focus in the search for methods to speed up scientific computation. any speedup in matrix multiplication can improve the performance of a wide variety of numerical algorithms. Much less effort has been given towards the investigation of alternative algorithms whose asymptotic complexity is less than the 0(m3) operations required by the conventional algorithm to multiply m x m matrices. Standard algorithm for multiplying two square matrices of size requires n n multiplications and n 2 n 1additions Strassens algorithm reduces the number of multiplications to required
n log 2 7 if
n is a power of 2
Standard Matrix Multiplication
Given two n x n matrices A = (aij )ni;j=1 and B = (bij )ni ;j=1, their product is defined as follows:
Therefore, to compute the matrix product, we need to compute n2 matrix entries. A naive approach takes n multiplications and n 1 additions for each entry.
IDEA :
n x n matrix = 2 x 2 matrix of ( n / 2 ) x ( n / 2 ) submatrices :[ C r =a s= a t=c u =c .e +b. .f +b. .e+d. .f +d. g h h g ]=[ = ].[ A . B ]
8 recursive mults of (n /2 )x(n/2) submatrices 4 add of (n/2)x(n/2) submatrices
Pseudo code
for i:=0..n-1 do for j:=0..n-1 do C[i, j]:=0; for k:=0..n-1 do C[i, j]:=C[i, j] + A[i, k] B[k, j]; end end end return C; Running time is o(n3).
Divide-and-Conquer Multiplication Algorithm For simplicity assume that n is a power of 2. To compute the product of matrices, we subdivide each of the matrices into four (n/2) x (n/2) submatrices so that the equation C = A . B takes form:
This matrix equation corresponds to the following four equations on the submatrices:
Divide-and-Conquer Multiplication Running Time
Using index calculation, we can execute Step 5 in O(1) time (in contrast to O(n2) that would be required if we created submatrices and copied their entries). However, that does not make a difference asymptotically. The running time T(n) for Matrix-Multiply-Recursive on n x n matrices satisfy the recurrence:
By Master Theorem, T(n) = o(n3) which is unfortunately not faster than the naive method Matrix-Multiply. Divide-and-Conquer Multiplication Drawback Each time we split matrix sizes in half, but do not actually reduce the total amount of time. If we assume that naive matrix multiplication takes c _ n3 time. Then computing each product of submatrices takes c .(n/2)3 = c . n3 /8 and we need eight such products, resulting in total time of 8 . c . n3 8 = c . n3 (plus overhead) that is no better than simply doing multiplication in the naive way. In contrast, let us consider Merge-Sort with the running time recurrence T(n) = 2T(n/2)+ O(n): Even if we did naive quadratic (that is, of time c . n2) sorting for each of the subproblems, the total time would be 2 . c . (n/2)2 = c . n2/2 (plus overhead of O(n)) that is faster than naive sorting of the whole problem by factor of 2. This tells us that Divide-andConquer sorting may be more efficient than naive sorting (and it is indeed such as Master Theorem proves). Strassen's Method The idea behind Strassen's method is to reduce the number of multiplications at each recursive call from eight to seven. That makes the recursion tree slightly less bushy. Strassen's method has four steps: 1- Divide the input matrices A and B into submatrices as before, using index calculations in O(1) time.
2- Create ten (n/2) x (n/2) matrices S1; S2; : : : ; S10, each equal the sum or difference of two submatrices created in Step 1. This step takes O(n2) time. 3- Using the submatrices created in Steps 1 and 2, compute seven products P1; P2; : : : ; P7, each of size (n/2) x (n/2). 4- Compute the submatrices of C by adding or subtracting various combinations of the matrices Pi . This step takes O(n2). The running time for Strassen's Method satisfies the recurrence:
Strassens algorithm 1. Divide: Partition A and B into (n/2)(n/2) submatrices. Form P-terms to be multiplied using + and . 2. Conquer: Perform 7 multiplications of (n/2)(n/2) submatrices recursively. 3. Combine: Form C using + and on (n/2)(n/2) submatrices. T(n) = 7 T(n/2) + O(n2) Analysis of Strassen T(n) = 7 T(n/2) + (n2)
nlogba = nlog27 n2.81 CASE 1 T(n) = O(nlog 7). The number 2.81may not seem much smaller than 3, but because the difference is in the exponent, the impact on running time is significant. In fact, Strassens algorithm beats the ordinary algorithm on todays machines for n 30 or so. Best to date (of theoretical interest only): O(n2.376).
Pseudo code
Comparison Multi Standrad Strassen 8 7 Add/Sub 4 18 T. Complexity
O( N 3 ) O( N 2.801 )
Notes Strassen's algorithm was the first to beat O(n3) time, but it is not the asymptotically fastest known. A method by Coppersmith and Winograd runs in O(n2.376) time. Practical issues against Strassen's algorithm: Higher constant factor than the obvious O(n3)-time method. Not good for sparse matrices. Not numerically stable: larger errors accumulate than in the naive method. Submatrices consume space, especially if copying. Various researchers have tried to find the crossover point, where Strassen's algorthm runs faster than the nave O(n3)-time method. Theoretical analyses (that ignore caches and hardware pipelines) have produced crossover points as low as n = 8, and practical experiments have found crossover points as low as n = 400.