Intro To Dsa
notes of data structure
Uploaded by ishnkhan123

Introduction to Data Structures and Algorithms
COA3600
Recommended Books

Seymour Lipschutz, Data Structures, TMH
Aaron M. Tenenbaum, Langsam, Data Structures Using C, Pearson, 2008
What is a Computer Program?

To know exactly what a data structure is, we must first know:

What is a computer program?

Input -> Processing -> Output
Data

Data is the plural of datum: a piece of information
Data are values or a set of values
A data item refers to a single unit of values
Group item:
A data item that can be subdivided into sub-items
Ex: Name: First Name, Middle Initial and Last Name
Elementary item:
A data item that cannot be subdivided into sub-items
Ex: A PAN card number or a bank passbook number is treated as a single item
Elementary Data Organization

Collections of data are frequently organized into a hierarchy of fields, records and files
Entity:
Something that has certain attributes or properties which may be assigned values, e.g. an employee
Values may be numeric or non-numeric
Ex: An employee of an organization

Attributes: Name   Age   Employee Code
Values:     John   33    13472
Elementary Data Organization

Entities with similar attributes (e.g. all employees of an organization) form an entity set
Each attribute of an entity set has a range of values [the set of possible values that could be assigned to the particular attribute]
Information: data with a given attribute, or processed data

Field: a single elementary unit of information representing an attribute of an entity
Record: the collection of related data about a single entity
Fixed length
Variable length
File: the collection of records of the entities in a given entity set
Example
Issues in Data Structures

The study of data structures includes the following three steps:

Logical or mathematical description of the structure of data
Implementation of the structure on a computer
Quantitative analysis of the structure, which includes determining the amount of memory needed to store the structure and the time required to process the structure

The second and third steps depend on whether the data is stored in main memory or secondary storage
In this course, we are concerned only with data stored in main memory
Data Structure

The logical or mathematical model of a particular organization of data
The choice of a model depends on two factors:
It must be rich enough in structure to mirror the actual relationships of the data in the real world
The structure should be simple enough that one can effectively process the data when necessary
A data structure is a way to logically organize data that specifies:
A set of data elements, i.e., a data object, and
A set of operations which may legally be applied to elements of this data object
Operations

Data appearing in a data structure are processed by means of certain operations
The particular data structure one chooses for a given situation depends largely on the frequency with which specific operations are performed
Main operations:
Traversal
Search
Insertion
Deletion
Sort

Issues:
Space needed
Operation efficiency (time required to complete operations)
Categorization
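As a concrete illustration of the five main operations, here is a minimal sketch in C on a fixed-capacity integer array. The function names and capacity are illustrative, not taken from the notes.

```c
#include <stdio.h>

#define CAP 10  /* illustrative capacity */

/* Traversal: visit every element once. */
void traverse(const int a[], int n) {
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

/* Search: linear scan; returns the index of key, or -1 if absent. */
int search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key) return i;
    return -1;
}

/* Insertion at position pos: shift elements right; returns new length. */
int insert(int a[], int n, int pos, int val) {
    for (int i = n; i > pos; i--)
        a[i] = a[i - 1];
    a[pos] = val;
    return n + 1;
}

/* Deletion at position pos: shift elements left; returns new length. */
int delete_at(int a[], int n, int pos) {
    for (int i = pos; i < n - 1; i++)
        a[i] = a[i + 1];
    return n - 1;
}

/* Sort: simple bubble sort (any sorting method would do here). */
void sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
            }
}
```

Note how insertion and deletion each move elements, which is exactly why their cost matters when choosing a structure for a given mix of operations.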
Pictorial Representation

[Figure: diagrams of an array, linked list, queue, tree and stack]
Algorithms

A list of well-defined instructions for solving a particular problem

High-level and language-independent instructions
Each instruction must be clear
Finite number of instructions
There may be many algorithms for solving a problem
Good algorithm = efficient algorithm:
Time
Space
Time-space trade-off

Each algorithm involves a particular data structure
The choice of data structure affects the efficiency of the algorithm
Time-space trade-off:
Increasing the amount of space may reduce the time needed, and vice versa
Example: Search Algorithms

Suppose we have a file containing information about employees:
Name, Age, Telephone Number, Address
We are given the name of an employee and want to search for his/her telephone number
Linear Search:
Traverse the file linearly, starting from the first record
Number of comparisons is n in the worst case
What if the file is very large?
Binary Search:
Binary search can be used if the names are sorted alphabetically
General idea of binary search:
Compare the given name with the name in the middle of the list
This tells which half of the list contains the given name
Repeat this process until the given name is found
Number of comparisons is about log2 n
Number of comparisons in Binary Search: log2 N
Binary Search

More efficient than linear search

Drawbacks:
Requires a sorted list
Requires direct access to the middle element
Insertion and deletion in a sorted array are difficult
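The two searches discussed above can be sketched in C as follows; searching an integer key stands in for searching a name (the record lookup itself is omitted).

```c
/* Linear search: scan from the first record; worst case n comparisons. */
int linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key) return i;
    return -1;
}

/* Binary search: requires a sorted array with direct (index) access to
   the middle element; about log2(n) comparisons. */
int binary_search(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;  /* middle of the current half */
        if (a[mid] == key) return mid;
        else if (a[mid] < key) lo = mid + 1;  /* key is in upper half */
        else hi = mid - 1;                    /* key is in lower half */
    }
    return -1;
}
```

Note the precondition: binary_search silently returns wrong answers on unsorted input, which is the first drawback listed above.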
Example 1: time space trade-off
Example 2: time space trade-off

Roll No   Name
55        Clark
712       Hill
3         Drew
124       White
943       Brown
345       Adams
231       Bill

Roll No can be used as an address:
No movement of data on insertion and deletion
Instant access to any record
Mostly empty locations: memory is allocated for non-existing records
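A minimal sketch of this direct-addressing idea in C; MAX_ROLL and the record layout are assumptions for illustration, not from the notes.

```c
#include <string.h>

#define MAX_ROLL 1000  /* assumed upper bound on roll numbers */

struct Record {
    int used;        /* 0 = empty slot, 1 = occupied */
    char name[20];
};

/* The table is mostly empty: space is traded for O(1) access time. */
static struct Record table[MAX_ROLL];

/* Insert: O(1), no movement of other records. */
void insert_rec(int roll, const char *name) {
    table[roll].used = 1;
    strncpy(table[roll].name, name, sizeof table[roll].name - 1);
}

/* Lookup: O(1), instant access via the roll number itself. */
const char *lookup(int roll) {
    return table[roll].used ? table[roll].name : NULL;
}
```

With only 7 students, 993 of the 1000 slots stay empty: exactly the space cost the slide points out.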
Properties of Algorithms

Finiteness: The algorithm must always terminate after a finite number of steps.
Definiteness: Each step must be precisely defined; the actions to be carried out must be rigorously and unambiguously specified for each case.
Input: An algorithm has zero or more inputs, taken from a specified set of objects.
Output: An algorithm has one or more outputs, which have a specified relation to the inputs.
Effectiveness: All operations to be performed must be sufficiently basic that they can be done exactly and in a finite length of time.
Algorithmic Notation
Pseudocode
An informal high-level description of the operating principle of an algorithm
Intended for human reading rather than machine reading
Uses natural language descriptions with some structural conventions of a normal programming language
There is no standard pseudocode syntax, as a program in pseudocode is not an executable program
Flowcharts can be thought of as a graphical alternative to pseudocode
Format of Pseudocode

Two parts:
First part: a paragraph in natural language which states the purpose of the algorithm, identifies the variables which occur in the algorithm, and lists the input data
Second part: a list of instructions
Primitive operations

READ and WRITE
Assigning a value to a variable
Arithmetic operations
Comparing two numbers
Array access
Calling a sub-algorithm
Returning from a sub-algorithm
Control Structures in Pseudocode
Sequence Logic
It is used to perform instructions in a sequence, that is, one after another.
Pseudocode instructions are written in an order in which they are
to be performed.
Control Structures in Pseudocode
Selection Logic
It is used for making decisions and for selecting the proper path out of
two or more alternative paths in program logic
It is also known as decision logic
Single Alternative
Control Structures in Pseudocode
Selection Logic
Double Alternative
Control Structures in Pseudocode
Selection Logic
Multiple Alternative
Control Structures in Pseudocode
Iteration Logic (repetitive)
It is used to produce loops when one or more instructions may be executed several times depending on some condition.
It uses structures called REPEAT FOR and REPEAT WHILE.
Repeat For
Repeat While
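A sketch of how the two iteration structures map onto C; the loop bodies are illustrative examples, with the pseudocode form shown in comments.

```c
/* REPEAT FOR K = 1 TO N: a counted loop. */
int sum_1_to_n(int n) {
    int sum = 0;
    for (int k = 1; k <= n; k++)   /* REPEAT FOR K = 1 TO N */
        sum += k;                  /*     SUM := SUM + K    */
    return sum;                    /* [END OF LOOP]         */
}

/* REPEAT WHILE condition: repeat as long as the condition holds. */
int count_halvings(int n) {
    int count = 0;
    while (n > 1) {                /* REPEAT WHILE N > 1 */
        n /= 2;                    /*     N := N / 2     */
        count++;                   /*     COUNT := COUNT + 1 */
    }
    return count;                  /* [END OF LOOP]      */
}
```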
Subalgorithm

A subalgorithm is a complete and independently defined algorithm which is used by some other algorithm (or subalgorithm)
It receives values called arguments (or parameters), performs computations, and sends back results to the calling algorithm
A subalgorithm has a heading:
NAME(par1, par2, ..., park)
It uses a Return statement instead of Exit
Conventions in Pseudocode

Write only one statement per line.
Each statement in your pseudocode should express just one action for the computer.
Capitalize keywords:
READ, WRITE, IF, ELSE, ENDIF, WHILE, ENDWHILE, REPEAT, UNTIL
Indent to show hierarchy:
For sequence, selection and looping
Conventions in Pseudocode

End multi-line structures.
The initial keyword must always align with the keyword that ends the structure (e.g. IF with ENDIF, WHILE with ENDWHILE).
Keep statements language independent.
Number each instruction with an identifying number:
Algorithm 2, Algorithm 4.3
Write comments for each statement.
Largest Number in Array
An array DATA of numerical values is in memory. We want to find index LOC and value
MAX of the largest element of DATA.
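The algorithm stated above can be rendered in C as the following sketch (n >= 1 is assumed; the function name is illustrative).

```c
/* Find the index LOC and value MAX of the largest element of DATA:
   start with the first element, then compare each remaining element
   against the current maximum. */
void largest(const int data[], int n, int *loc, int *max) {
    *loc = 0;
    *max = data[0];
    for (int k = 1; k < n; k++) {   /* one comparison per remaining element */
        if (data[k] > *max) {
            *loc = k;
            *max = data[k];
        }
    }
}
```

Because the comparison is strict (>), ties keep the first occurrence, so LOC is the index of the first maximum.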
Complexity of Algorithms
There are often many different algorithms which can be used to solve the same problem. Thus, it makes sense to develop techniques that allow us to:
compare different algorithms with respect to their "efficiency"
choose the most efficient algorithm for the problem
The complexity of any algorithmic solution to a problem is a measure of several factors. Two important and general factors:
Time complexity: the time it takes to execute.
Space complexity: the space (primary memory) it uses.
We will focus on an algorithm's efficiency with respect to time.
Running Time of a Program
Experimental study:
Write a program that implements the algorithm
Run the program with data sets of varying size and composition
Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
Factors affecting running time:
Hardware
Operating System
Compiler
Size of input
Nature of input
Algorithm
Which of these should be improved?
Limitations

Experimental studies have several limitations:

It is necessary to implement and test the algorithm in order to determine its running time.
Experiments can be done only on a limited set of inputs, and may not be indicative of the running time on other inputs not included in the experiment.
In order to compare two algorithms, the same hardware and software environments should be used.
Running Time of an Algorithm
Depends upon:
Input size
Nature of input
To find the running time of an algorithm:
Count the number of basic/key/primitive operations/steps the algorithm performs
Ex: comparisons in searching and sorting
Calculate how this number depends on the size of the input
A basic operation is an operation which takes a constant amount of time to execute.
The time for other operations is much less than, or at most proportional to, the time for basic operations
This measure is machine independent
Complexity of Algorithms
The complexity (or efficiency) of an algorithm is the number of basic operations it performs. This number is a function of the input size n.
Definition: The time complexity of an algorithm is the function f(n) which gives the running time requirement of the algorithm in terms of the size n of the input data.
The complexity function f(n) is found for three cases:
Best: minimum value of f(n) for any possible input
Worst: maximum value of f(n) for any possible input
Average: expected value of f(n)
Ex: Linear search
Best: 1, Worst: n, Average: (n+1)/2
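The linear-search figures above can be checked empirically by instrumenting the search to count its comparisons; this sketch is illustrative, not from the notes.

```c
/* Linear search that also reports how many key comparisons it made,
   so best (1), worst (n) and average ((n+1)/2) cases can be observed. */
int linear_search_count(const int a[], int n, int key, int *comparisons) {
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;           /* one basic operation per element */
        if (a[i] == key) return i;
    }
    return -1;
}
```

For n = 5 the successful-search counts are 1, 2, 3, 4, 5, whose mean is 3 = (n+1)/2, matching the average-case figure.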
WORST CASE COMPLEXITY

We are usually interested in the worst case complexity: the most operations that might be performed for a given problem size.
We usually focus on worst case analysis because:
Best case complexity has little use
Average case is difficult to compute
Worst case complexity is easier to compute
It provides an upper bound on complexity, i.e. the complexity in all other cases is lower than the worst case complexity
It is usually close to the actual running time
It is crucial to real-time systems (e.g. air-traffic control, surgery)
Ex: Largest Number in Array
An array DATA of numerical values is in memory. We want to find the index LOC and value MAX of the largest element of DATA.

Number of basic operations = 4n + 3
Rate of Growth of a Function

Estimated running time for different values of N in 4N + 3:

N = 10     => 43 steps
N = 100    => 403 steps
N = 1,000  => 4,003 steps
N = 10,000 => 40,003 steps
As N grows, the number of steps grows in linear proportion to N
What about the +3 and the 4* in 4N + 3?
As N gets large, both the 4* and the +3 become insignificant
Rate of Growth of a Function

Estimated running time for different values of N in 3N^2 + 5N + 8:

N = 10     => 358 steps
N = 100    => 30,508 steps
N = 1,000  => 3,005,008 steps
N = 10,000 => 300,050,008 steps
As N grows, the number of steps grows in quadratic proportion to N
The second and third terms become insignificant for large values of N
What about the 3*, the 5* and the +8?
As N gets large, all constants become insignificant
ASYMPTOTIC COMPLEXITY

The 4N + 3 time bound is said to "grow asymptotically" like N
This gives us an approximation of the complexity of the algorithm

ASYMPTOTIC NOTATION
Big Oh notation: upper bound
Omega notation: lower bound
Theta notation: tighter bound
BIG OH NOTATION

If f(N) and g(N) are two complexity functions, we say

f(N) = O(g(N))

(read "f(N) is order of g(N)", or "f(N) is big-O of g(N)") if there are constants c and N0 such that for N > N0,

f(N) <= c * g(N)
Example

For the functions f(n) = 2n + 6 and g(n) = n (shown in the figure), there are positive constants c and n0 such that f(n) <= c*g(n) for n >= n0.

Conclusion: 2n + 6 is O(n).
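One valid choice of constants for this example is c = 3 and n0 = 6 (an assumption here, since the figure's exact constants are not reproduced): 2n + 6 <= 3n exactly when n >= 6. A numeric check:

```c
/* Checks the Big-Oh inequality f(n) <= c*g(n) for f(n) = 2n + 6,
   g(n) = n, with the assumed constants c = 3, n0 = 6. */
int bound_holds(long n) {
    return 2 * n + 6 <= 3 * n;
}
```

The inequality first holds at n = 6 and then for every larger n, which is all the definition requires.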
Another Example

On the other hand, n^2 is not O(n), because there are no c and n0 such that n^2 <= c*n for all n >= n0.

(As the graph illustrates, no matter how large a c is chosen, there is an n big enough that n^2 > c*n.)
Comparing Functions

As inputs get larger, any algorithm of a smaller order will be more efficient than an algorithm of a larger order.

Time complexity growth: number of data items processed per

f(n)     | 1 minute | 1 day   | 1 year    | 1 century
n        | 10       | 14,400  | 5.26*10^6 | 5.26*10^8
n log n  | 10       | 3,997   | 883,895   | 6.72*10^7
n^1.5    | 10       | 1,275   | 65,128    | 1.40*10^6
n^2      | 10       | 379     | 7,252     | 72,522
n^3      | 10       | 112     | 807       | 3,746
2^n      | 10       | 20      | 29        | 35
Increasing order of complexity
These functions are in increasing order of complexity:
logarithmic: O(log n)
linear: O(n)
quadratic: O(n^2)
polynomial: O(n^k), k >= 1
exponential: O(a^n), a > 1
Big-Oh vs. Actual Running Time

Example 1: let algorithms A and B have running times
TA(n) = 20n ms and TB(n) = 0.1 n log2(n) ms
In the "Big-Oh" sense, A is better than B...
But: on which data volumes does A outperform B?
TA(n) < TB(n) if 20n < 0.1 n log2(n), or
log2(n) > 200, that is, when n > 2^200 ≈ 10^60 !
Thus, in all practical cases B is better than A...
Big-Oh vs. Actual Running Time

Example 2: let algorithms A and B have running times
TA(n) = 20n ms and TB(n) = 0.1 n^2 ms
In the "Big-Oh" sense, A is better than B...
But: on which data volumes does A outperform B?
TA(n) < TB(n) if 20n < 0.1 n^2, or n > 200
Thus A is better than B in most practical cases, except for n < 200, when B becomes faster...
Useful Rules of BIG-OH NOTATION
If the function f can be written as a finite sum of other functions, then the fastest growing one determines the order of f(n).
Drop lower order terms and constant factors:
7n + 3 is O(n)
8n^2 log n + 5n^2 + n is O(n^2 log n)
Useful Rules of BIG-OH NOTATION

Multiplication by a constant: if f(n) is O(g(n)), then c*f(n) is O(g(n))
Product: if f1(n) is O(g1(n)) and f2(n) is O(g2(n)), then f1(n)*f2(n) is O(g1(n)*g2(n))
Sum: if f1(n) is O(g1(n)) and f2(n) is O(g2(n)), then f1(n) + f2(n) is O(max(g1(n), g2(n)))

O(log n) is exactly the same as O(log(n^c)), since log(n^c) = c log n
Logs with different constant bases are equivalent
Example
Big-Omega: asymptotic lower bound

The function f(n) is Ω(g(n)) iff there exist a real positive constant c > 0 and a positive integer n0 such that f(n) >= c*g(n) for all n >= n0
Big Omega is just the opposite of Big Oh
It generalises the concept of "lower bound" (>=) in the same way as Big Oh generalises the concept of "upper bound" (<=)
If f(n) is O(g(n)) then g(n) is Ω(f(n))
Big-Omega
Little-o notation
Even though it is correct to say "7n + 3 is O(n^3)", a better statement is "7n + 3 is O(n)"; that is, one should make the approximation as tight as possible
Little-ω notation
4n^2 = Ω(n) is not asymptotically tight, but 4n^2 = Ω(n^2) is asymptotically tight
Big-Theta: asymptotic tight bound

The function f(n) is Θ(g(n)) iff there exist two real positive constants c1 > 0 and c2 > 0 and a positive integer n0 such that:
c1*g(n) >= f(n) >= c2*g(n) for all n >= n0
Whenever two functions f and g are of the same order, f(n) is Θ(g(n)), they are each Big-Oh of the other: g(n) is O(f(n)) AND f(n) is O(g(n))
g(n) is an asymptotic tight bound for f(n).
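Applying this to the earlier growth-rate example: f(n) = 3n^2 + 5n + 8 is Θ(n^2). One valid (assumed, not from the notes) choice of constants is c1 = 4, c2 = 3, n0 = 7, since 3n^2 + 5n + 8 <= 4n^2 exactly when n^2 >= 5n + 8, which first holds at n = 7. A numeric check:

```c
/* Checks c2*g(n) <= f(n) <= c1*g(n) for f(n) = 3n^2 + 5n + 8 and
   g(n) = n^2, with the assumed constants c1 = 4, c2 = 3 (n0 = 7). */
int theta_holds(long n) {
    long f = 3 * n * n + 5 * n + 8;
    long g = n * n;
    return 3 * g <= f && f <= 4 * g;
}
```

The lower bound 3n^2 <= f(n) holds for all n >= 0; only the upper bound forces n0 up to 7.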
Big-Theta
