Basic concepts
Data structures
Text book:
E. Horowitz, S. Sahni, S. Anderson-freed, Fundamentals of data structures in C, Computer Science Press, 2nd edition, 2008. L. Nyhoff, C++ An introduction to data structures, Prentice-Hall, 1999. R. F. Gilberg and B. A. Forouzan, Data structures A pseudocode approach with C++, Brooks/Cole, 2001. Midterm test: 30% Final test: 30% Quiz 15% Homework 25%
Reference books
Grading:
Three or four homework assignments with computer programs.
http://140.121.196.191/ds.htm
Basic concepts
Data structures and algorithms
The design of a program include deciding
how to organize the data involved in a problem, and how to design algorithms for the operations on the
data.
Basic concepts
How to represent these data in your computer program?
We are here!!
Road map Adopted from GoogleEarth
Can you find the shortest path from Keelung to Yungho?
Basic concepts 4
A graph representation
How to represent this problem in your program? (contd)
a11 a 21 M a1000000,1
a12
a1000000, 2
x1 c1 a2,1000000 x2 c2 = M M M L a1000000,1000000 x1000000 c1000000 a1,1000000
How to efficiently represent and manipulate a huge system of linear equations
Basic concepts
How to represent these data in your computer program? (contd)
How to organize a huge amount of text data for efficient web searching
Basic concepts
Important data structures
Arrays Linked Lists Stacks Queues Graphs
Priority queue
Binary search trees Optimal binary search trees Selection trees 2-3 trees 2-3-4 trees Red-black trees AVL trees B/B+ trees Splay trees Tries Digital search trees
Adjacency matrices Adjacency lists Min-max heaps Deaps Leftist trees Binomial heaps Fibonacci heaps
Trees
Heaps
Sets Hash tables
Basic concepts
Algorithm specification
An algorithm is a finite set of instructions that satisfies the following criteria.
1. 2. 3. 4. 5.
Input
Zero or more quantities are externally supplied. At least one quantity is produced. Each instruction must be clear and unambiguous. For all cases, the algorithm always terminates after a finite number of steps. The algorithm uses finite sources. Every instruction must be basic enough to be carried out.
Output Definiteness Finiteness
Effectiveness
A program does not have to satisfy the 4th criterion.
e.g. an operating system.
Basic concepts 9
Example
Problem: Sort a collection of n integers. Simple solution:
From those unsorted integers, find the smallest and place it next in the sorted list. Selection sort
Transform this idea to an algorithm
void sort(int* a, const int n) { int i,j,k,temp; for(i = 0; i<n; i++) { j=i;
/* find the smallest integer in a[i] to a[n-1] */
for(k=i+1; k<n; k++) { if (a[k]<a[j]) j = k; } temp=a[i]; a[i] = a[j]; a[j]=temp; /* interchange */ } }
Basic concepts 10
Recursive algorithms
A function calls itself directly or indirectly.
The recursive algorithm often expresses a complex process very clearly.
Problem: Compute the binomial coefficient
n n! = m m!(n m)!
A recursive formula:
n n n n 1 n 1 , where = = 1 + = n 0 m m m 1
Basic concepts 11
Computing binomial coefficient by recursion
int binomial(int n, int m) { if (m==0 || (n==m)) { return 1; } else { return binomial(n-1,m)+binomial(n-1,m-1); } }
Can you design an iterative version?
Basic concepts 12
Performance analysis and measurement
Space complexity
The amount of memory needed to run to
completion.
Time complexity
The amount of computer time needed to
run to completion.
Basic concepts
13
Space complexity
The space needed by Program P is c+Sp(I) where
c: A fixed part independent of the characteristics of the inputs and outputs Sp(I):A variable part dependent on the problem instance I. The space needed by referenced variables. The recursion stack space. We are interested in estimating this part.
Basic concepts
14
Space complexity (contd)
float sum(float *a, const int n) float abc(float a,float b, float c) { { float s=0; return a+b+b*c+(a+b-c)/(a+b)+4.0; int i; } for(i = 0; i < n; i++) { c=sizeof(a)+sizeof(b)+sizeof(c) s += a[i]; Sabc(I)=0 } return s; } c=sizeof(n)+sizeof(a) Ssum(n)=0 float rsum(float *a, const int n) { if (n<=0) return 0; return (rsum(a,n-1)+a[n-1]); } Srsum(n)= (sizeof(n)+sizeof(a)+sizeof(return address))*(n+1)
Basic concepts 15
Time complexity
The run time of a program p, denoted by tp (instance characteristics).
tp may be defined as the number of program steps executed by Program P. A program step is a segment of a program that has an execution time independent of the instance characteristics.
float abc(float a,float b, float c) { return a+b+b*c+(a+b-c)/(a+b)+4.0; }
a program step
Basic concepts
16
Time complexity (contd)
The time complexity of a program is given by the number of steps taken by the program to compute the function it was written for.
The number of steps is a function of the
instance characteristics.
Basic concepts
17
Time complexity (contd)
steps per execution float sum(float *a, const int n) 1{ line s/e frequency total steps 2 float s=0; 1 0 1 0 3 int i; 4 for(i = 0; i < n; i++) 2 1 1 1 5 s += a[i]; 3 0 1 0 6 return s; 7} 1 n+1 n+1 4
5 6 7 1 1 0 n 1 1 n 1 0 2n+3
18
total number of steps
Basic concepts
Time complexity (contd)
float rsum(float *a,const int n) 1{ 2 if(n<=0) return 0; 3 else return (rsum(a,n-1)+a[n-1]); 4}
line s/e 0 1 1
frequency n=0 1 n>0 1 1 0 1 1
total steps n=0 0 1 1 0 0 2 n>0 0 1 0 1+trsum(n-1) 0 2+trsum(n-1)
19
2(a)
2(b)
2(a) 2(b) 3 4
1 1
1+trsum(n-1) 0 0 1
total number of steps
Basic concepts
n=0 2 t rsum (n) = 2 + t rsum (n 1) n > 0 t rsum (n) = 2 + t rsum (n 1) = 2 + 2 + t rsum (n 2) = 2 + 2 + ... + 2 + t rsum (0) 14243
n
= 2n + 2
10
Time complexity (contd)
void add(int a[][MAX_SIZE], int b[][MAX_SIZE], int c[][MAX_SIZE], int m,int n) 1{ 2 int i,j; line s/e frequency total steps 3 for(i=0; i<m; i++) 1 0 1 0 4 for(j=0; j<n; j++) 2 0 1 0 5 c[i][j] = a[i][j]+b[i][j]; 6}
3 4 5 6 1 1 1 0 m+1 m(n+1) mn 1 m+1 mn+m mn 0 2mn+2m+1
21
total number of steps
Basic concepts
Time complexity (contd)
Example [Fibonnaci numbers]. The Fibonnaci sequence of numbers starts as 0,1,1,2,3,5,.... Each new item is obtained by taking the sum of the two previous items. In general, Fn=Fn-1+Fn-2 with F0=0,F1=1.
Basic concepts
22
11
Time complexity (contd)
iterative version recursive version int fibonacci(int n) int fibonacci(int n) { { if (n<=1) return n; /* F0=0 and F1=1 */ else { if (n<=1) return n; /* F0=0 and F1=1*/ int fn,i; int fnm2=0, fnm1=1; else { for(i=2; i<= n; i++) { return fn=fnm1+fnm2; fibonacci(n-1)+ fnm2=fnm1; fibonacci(n-2); fnm1=fn; } } } return fn; }
Basic concepts
23
iterative version int fibonacci(int n) { 1 if (n<=1) return n; /* F0=0 and F1=1 */ 2 else { 3 int fn,i; int fnm2=0, fnm1=1; 4 for( i=2; i<= n; i++) { 5 fn=fnm1+fnm2; 6 fnm2=fnm1; 7 fnm1=fn; 8 } 9 return fn; }
line 1(a) 1(b) 2 3 4 5 6 7 8 9
s/e 1 1 0 2 1 1 1 1 0 1
frequency n<=1 1 1 1 0 0 0 0 0 0 1 1 0 1 1 n n-1 n-1 n-1 1 1
total steps n>1 1 0 0 2 n n-1 n-1 n-1 0 1 4n+1 1 1 0 0 0 0 0 0 0 1 3
n>1 n<=1
total number of steps
12
recursive version int fibonacci(int n) { 1 if (n<=1) return n; /* F0=0 and F1=1 */ 2 else { 3 return fibonacci(n-1)+ fibonacci(n-2); } } line s/e 1(a) 1(b) 2 3 1 1 0 t(n-1)+t(n-2)+1
frequency n<=1 n>1 1 1 1 0 1 0 1 1 n<=1 1 1 0 0 2
total steps n>1 1 0 0 t(n-1)+t(n-2)+1 t(n-1)+t(n-2)+2
total number of steps
2 n 1 t ( n) = t (n 1) + t (n 2) + 2 n > 1
13
Time complexity (contd)
Three types of step counts:
The best-case step count is the minimum number of steps that can be executed with the given parameters. The worst-case step count is the maximum number of steps that can be executed with the given parameters. The average step count is the average number of steps executed on instances with the given parameters.
It is difficult to compare the time complexities of two programs that compute the same function by their exact step counts.
Basic concepts
27
Asymptotic notation (O,,)
For most situations, It is not necessary to know the exact step count.
Something like it is about 80n, or 75n is
adequate to arrive at the same conclusion.
Basic concepts
28
14
Asymptotic notation (contd)
Example If we have two programs that compute the same function with a complexity of c1n2+c2n and c3n, respectively, where c1,c2, and c3 are nonnegative. Which program is the faster?
The program with complexity c3n is faster than the one with complexity c1n2+c2n for sufficiently large value of n. c1n2+c2n>c3n(n-(c3-c2)/c1)n>0
Basic concepts
29
Asymptotic notation (contd)
n2+2n
40000
30000
20000
100n
10000
n0
0 50 100 n
100n n2+2n when n n0
150 200
30
Basic concepts
15
Asymptotic notation (contd)
Definition. f(n)=O(g(n)) if and only if there exist positive constants c and n0 such that f(n) cg(n) for all n, nn0.
g(n) is an upper bound of f(n).
Basic concepts 31
Asymptotic notation (contd)
Example (1) 3n+2=O(n) as 3n+2 4n for all n2. (2) 100n+6=O(n) as 100n+6 101n for all n10. (3) 10n2+4n+2 =O(n2) as 10n2+4n+2 11n2 for all n5. (4) 6*2n+n2=O(2n) as 6*2n+n2 7*2n for all n4. (5) 3n+3=O(n2) as 3n+33n2 for all n2. (6) 3n+2O(1) as 3n+2 >c*1 for all n>(c-2)/3. (7) 10n2+4n+2 O(n).
Basic concepts 32
16
Asymptotic notation (contd)
O(n) is called linear. O(n2) is called quadratic. O(n3) is called cubic. O(2n) is called exponential. O(1)<O(logn)<O(n)<O(nlogn)<O(n2)<O(n3)< O(2n)
Basic concepts
33
Asymptotic notation (contd)
Theorem 1.2: If f(n)=amnm++a1n+a0, then f(n)=O(nm). Proof: m
f (n) | ai | n i
i =0
| a | n
i =0 m i i =0
i m
n m | ai |, for n 1.
So, f(n)=O(nm).
Basic concepts 34
17
Asymptotic notation (contd)
Definition. f(n)= (g(n)) if and only if there exist positive constants c and n0 such that f(n) cg(n) for all n, nn0.
g(n) is a lower bound of f(n).
Basic concepts
35
Asymptotic notation (contd)
Example (1) 3n+2= (n) as 3n+2 3n for all n1. (2) 100n+6= (n) as 100n+6 100n for all n1. (3) 10n2+4n+2 = (n2) as 10n2+4n+2 10n2 for all n1. (4) 6*2n+n2= (2n) as 6*2n+n2 6*2n for all n1. (5) 3n+2=(1). (6) 10n2+4n+2 = (n). (7) 6*2n+n2= (n100). (8) 3n+3 (n2) as 3n+33n2 for all n2. (9) 10n2+4n+2 (n3).
Basic concepts 36
18
Asymptotic notation (contd)
Theorem 1.3: If f(n)=amnm++a1n+a0 and am>0, then f(n)= (nm).
Basic concepts
37
Asymptotic notation (contd)
Definition. f(n)= (g(n)) if and only if there exist positive constants c1, c2, and n0 such that c1g(n) f(n) c2g(n) for all n, nn0.
g(n) is both an upper bound and lower bound of f(n).
Basic concepts 38
19
Asymptotic notation (contd)
Example (1) 3n+2= (n) as 3n+2 3n and 3n+2 4n for all n2. So, c1=3, c2=4, and n0=2. (2) 10n2+4n+2 = (n2) as 10n2+4n+2 10n2 and 10n2+4n+2 11n2 for all n5. (3) 6*2n+n2= (2n). (4) 10logn= (logn). (5) 6*2n+n2 (n2). (6) 10n2+4n+2 (1).
Basic concepts
39
Asymptotic notation (contd)
Theorem 1.4: If f(n)=amnm++a1n+a0 and am>0, then f(n)= (nm).
Basic concepts
40
20
Asymptotic notation (contd)
float sum(float *a, const int n) line 1{ 2 float s=0; 1 3 int i; 2 4 for( i = 0; i < n; i++) 5 s += a[i]; 3 6 return s; 4 7}
5 6 7
s/e 0 1 0 1 1 1 0
frequency 1 1 1 n+1 n 1 1
total steps 0= (0) 1= (1) 0= (0) n+1= (n) n= (n) 1= (1) 0= (0) 2n+3=(n)
41
total number of steps
1i 7
t sum (n) = (max{g i (n)}) = (n)
Basic concepts
Asymptotic notation (contd)
float rsum(float *a,const int n) 1{ 2 if(n<=0) return 0; 3 else return (rsum(a,n-1)+a[n-1]); 4} frequency
line 1 s/e 0 1 1 n=0 1 1 1 n>0 1 1 0 1 1
total steps n=0 n>0 0 1 1 0 0 2 (0) (1) (0) (1+trsum(n-1)) (0) (2+trsum(n-1))
42
2(a)
2(b)
2(a) 2(b) 3 4
1+trsum(n-1) 0 0 1
total number of steps
Basic concepts
21
Asymptotic notation (contd)
void add(matrix a, matrix b, matrix c, int m,int n) 1{ 2 for(int i=0; i<m; i++) 3 for(int j=0; j<n; j++) 4 c[i][j] = a[i][j]+b[i][j]; 5}
line 1 2 3 4 5 s/e 0 1 1 1 0 frequency 1 m+1 m(n+1) mn 1 total steps 1= (1) m+1= (m) mn+m= (mn) mn= (mn) 0= (0) 2mn+2m+1= (mn)
43
total number of steps
Basic concepts
Asymptotic notation (contd)
void perm(char* a,const int k,const int n) { char temp; if (k==n-1) { for(int i=0; i<n;i++) printf(%c ,a[i]); printf(\n); } else { for(i=k;i<n;i++) { temp=a[k];a[k]=a[i];a[i]=temp; perm(a,k+1,n); temp=a[k]; a[k]=a[i]; a[i] = temp; } } }
abcd bacd cbad dbca permute the remaining three elements
Basic concepts
44
22
Asymptotic notation (contd)
( n ) k = n 1 t perm (k , n) = ((n k )t perm (k + 1, n)) k < n 1 t perm (0, n) = (n t perm (1, n)) = (n (n 1) t perm ( 2, n)) = ... = (n ( n 1) ... 2 t perm (n 1, n 1)) = (n n!)
void perm(char* a,const int k,const int n) { if (k==n-1) { for(int i=0; i<n;i++)printf(%c ,a[i]); printf(\n); } else { for(i=k;i<n;i++) { char temp=a[k];a[k]=a[i];a[i]=temp; perm(a,k+1,n); temp=a[k]; a[k]=a[i]; a[i] = temp; } } }
Basic concepts
45
Asymptotic notation (contd)
iterate at most log 2 (n + 1) times
int binary_search(int *a,const int x,const int n) { int left,right,middle; for(left=0,right=n-1;left<=right;) { middle=(left+right)/2; switch(compare(x,a[middle])) { case <:left=middle+1;break; case >:right=middle+1;break; case =:return middle; } } return 1;/*not found*/ (log n) in the worst - case }
Basic concepts 46
23
2 0 10 1 20 2 30 3 40 4 50 0 3 4
3 0 10 1 20 2 30 3 40 4 50 5 60 6 70 0 2 4 6
Practical complexity
2n
60 50 40 30
n2
nlogn
20 10 0
n logn
2 4 n 6 8 10
48 Basic concepts
24
Practical complexity (contd)
Basic concepts
49
25