AI & DS Dept - DSD Important Questions & Answers
Unit - 1
1) Asymptotic notation
2) Inheritance
3) ADT - Abstract data type
4) Types of copy (Deep copy / shallow copy)
5) Recursion, analysing recursive algorithms
Unit - 2
1) Singly linked list
2) Doubly linked list
3) Circularly linked list
4) Implementation of stack and queue (Insertion and deletion)
5) Double ended queue
6) Various operations of linked lists
Unit - 3
1) Bubble sorting
2) Selection sorting
3) Merge Sorting
4) Quick sorting
5) Insertion sorting
6) Linear searching
7) Binary searching
8) Collision handling techniques
Unit - 4
1) Binary search tree
2) Tree Traversal - inorder, preorder, postorder
3) Tree ADT, Binary Tree ADT
4) Heaps
5) AVL Trees
Unit - 5
1) Breadth First Search
2) Depth first Search
3) Minimum spanning tree - Kruskal's algorithm
4) Prim’s algorithm
5) Topological ordering
6) Graph representations
Answers :
(Answers are given without diagrams, so add a diagram wherever the answer
requires one.)
UNIT - 1
1. Asymptotic Notation
Definition:
Asymptotic Notation is a method used to describe the running time or space requirement of
an algorithm as a function of input size n as n approaches infinity. It ignores constants and
lower-order terms to focus on the algorithm's growth rate.
Explanation:
In algorithm analysis, it's more meaningful to understand how performance scales rather
than exact timings. Asymptotic notations provide a mathematical framework to compare
algorithms independent of hardware.
There are three primary notations:
● Big O (O): Upper bound; represents the worst-case scenario.
E.g., O(n^2) means time grows at most quadratically.
● Omega (Ω): Lower bound; often associated with the best case.
● Theta (Θ): Tight bound; used when the upper and lower bounds coincide (note:
this is not the same thing as average-case analysis).
Other notations:
● Little o: Strictly less than a bound.
● Little ω: Strictly greater than a bound.
Algorithm (Conceptual Steps):
1. Count the number of significant operations based on input size.
2. Remove constants and low-order terms.
3. Represent growth rate using Big O, Ω, or Θ.
Code Example:
def search(arr, x):
    for i in range(len(arr)):
        if arr[i] == x:
            return i
    return -1
# Time Complexity: O(n)
Advantages:
● Gives general performance across all inputs.
● Platform-independent analysis.
● Helps compare algorithms easily.
Disadvantages:
● Ignores constant factors which may matter in practice.
● Doesn’t give exact execution time.
● Doesn’t account for hardware or compiler optimizations.
Applications:
● Choosing the best algorithm for large-scale applications.
● Theoretical foundation for data structure and algorithm design.
● Used in software development, AI, database indexing, etc.
2. Inheritance (OOP Concept)
Definition:
Inheritance is a key concept in Object-Oriented Programming (OOP) that allows a class
(child) to inherit attributes and methods from another class (parent).
Explanation:
Inheritance promotes code reuse and logical hierarchy. The parent class (superclass)
contains common features, while child classes (subclasses) extend or customize these
features.
Types of inheritance:
● Single: One child inherits from one parent.
● Multiple: One child inherits from more than one parent.
● Multilevel: Child inherits from a class which is also a child of another.
● Hierarchical: Multiple children inherit from the same parent.
● Hybrid: Combination of multiple types.
Algorithm (Conceptual Steps):
1. Define a base class with methods and properties.
2. Create a subclass using class Sub(Base):.
3. Call or override methods of the base class in the subclass.
Code:
class Vehicle:
    def start(self):
        print("Vehicle started")

class Car(Vehicle):
    def drive(self):
        print("Car is moving")

c = Car()
c.start()
c.drive()
Advantages:
● Promotes code reuse and scalability.
● Easier maintenance due to modular code.
● Enhances readability through hierarchy.
Disadvantages:
● Tightly coupled classes.
● Changes in base class affect subclasses.
● In multiple inheritance, ambiguity (Diamond Problem) can occur.
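To illustrate the Diamond Problem mentioned above (this example is an addition to the notes, not part of them), Python resolves the ambiguity deterministically using its Method Resolution Order (MRO):

```python
# Diamond inheritance: D inherits from B and C, both of which inherit from A.
class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self):
        return "B"

class C(A):
    def greet(self):
        return "C"

class D(B, C):
    pass

d = D()
# Python's C3 linearization searches D, B, C, A in order,
# so B.greet is found first.
print(d.greet())                            # B
print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
```

Languages without such a rule (e.g. C++ without virtual inheritance) leave this ambiguity to the programmer.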
Applications:
● GUI frameworks, where all elements inherit from a common Widget class.
● Game development (e.g., Enemy and BossEnemy).
● Web frameworks like Django and Flask use inheritance extensively.
3. ADT – Abstract Data Type
Definition:
An Abstract Data Type (ADT) is a theoretical model that defines a data structure by its
behavior (operations) rather than its implementation.
Explanation:
ADT provides a logical description of how data is organized and manipulated without
revealing how it's stored or implemented. For example, a Stack ADT allows push/pop
operations, but whether it's implemented using arrays or linked lists is abstracted.
Common ADTs:
● Stack
● Queue
● List
● Tree
● Graph
● Hash Table
Algorithm (Conceptual):
Example: Stack ADT
1. push(x): Add x to the top.
2. pop(): Remove top element.
3. isEmpty(): Check if stack is empty.
Code Example:
class Stack:
    def __init__(self):
        self.stack = []
    def push(self, val):
        self.stack.append(val)
    def pop(self):
        if self.stack:
            return self.stack.pop()
        return None
Advantages:
● Separation of logic and implementation.
● Allows flexible and efficient implementations.
● Promotes modular programming.
Disadvantages:
● Not language-specific; harder for beginners to visualize.
● Requires good abstraction skills.
Applications:
● Software design (e.g., compilers, parsers).
● OS (task scheduling with queues).
● AI (state space search with trees/graphs).
4. Types of Copy – Deep Copy vs Shallow Copy
Definition:
Copying refers to creating duplicates of objects in memory. There are two types:
● Shallow Copy: Creates a new outer object but shares references to nested
objects, so changes to shared elements affect both copies.
● Deep Copy: Copies the object and its nested objects recursively, producing a
fully independent duplicate.
Explanation:
● Shallow copy creates a new object but inserts references to the same elements.
● Deep copy creates a completely independent object with new memory allocations for
nested elements.
Algorithm:
1. Shallow copy: Use copy.copy(obj)
2. Deep copy: Use copy.deepcopy(obj)
Code:
import copy
lst1 = [[1, 2], [3, 4]]
shallow = copy.copy(lst1)
deep = copy.deepcopy(lst1)
lst1[0][0] = 100
print(shallow)  # [[100, 2], [3, 4]] (reflects the change)
print(deep)     # [[1, 2], [3, 4]] (unchanged)
Advantages:
● Shallow copy: Faster and memory-efficient.
● Deep copy: Safe for independent object manipulation.
Disadvantages:
● Shallow copy: Can cause bugs due to shared references.
● Deep copy: Slower due to recursive copying.
Applications:
● Creating backups of data structures.
● Cloning objects in simulations or games.
● Managing configurations/settings in software.
5. Recursion and Analysing Recursive Algorithms
Definition:
Recursion is a programming technique where a function calls itself to solve a problem. It is
used when a problem can be divided into similar sub-problems.
Explanation:
Recursive problems must have:
● A base case to stop recursion.
● A recursive case where the function calls itself.
Types of recursion:
● Direct vs Indirect
● Tail vs Non-tail
● Linear vs Tree recursion
Analyzing recursive functions involves identifying the recurrence relation and solving it using
substitution or the Master Theorem (for divide-and-conquer algorithms).
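As a worked example (an addition to these notes), merge sort's recurrence falls under case 2 of the Master Theorem:

```latex
T(n) = 2\,T(n/2) + cn,
\qquad a = 2,\; b = 2,\; f(n) = cn = \Theta\!\left(n^{\log_b a}\right) = \Theta(n)
\;\Rightarrow\; T(n) = \Theta(n \log n)
```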
Algorithm (Generic Recursive Steps):
1. Define base case.
2. Break the problem into sub-problems.
3. Solve using recursive call.
Code (Factorial & Fibonacci):
def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n-1)

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
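The naive fibonacci above makes an exponential number of recursive calls (its recurrence is T(n) = T(n-1) + T(n-2) + O(1)). A memoized variant, added here for illustration, computes each value once and brings the cost down to O(n):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # Each distinct n is computed once, then served from the cache.
    if n <= 1:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(30))  # 832040
```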
Advantages:
● Reduces code length.
● Solves problems naturally expressed recursively (e.g., trees).
Disadvantages:
● Higher memory use (stack frames).
● Can cause stack overflow.
● Sometimes less efficient than iteration.
Applications:
● Mathematical problems (factorials, permutations).
● Tree/graph traversal (DFS).
● Backtracking (N-Queens, Sudoku).
● Divide-and-conquer algorithms (Merge Sort, Quick Sort).
UNIT - 2
1. Singly Linked List
Definition:
A Singly Linked List is a linear data structure where each element (node) contains data and
a pointer (reference) to the next node in the sequence. It is unidirectional.
Explanation:
In a singly linked list, the nodes are connected in a single direction — from the first node
(head) to the last node (tail). Each node contains:
● Data: The value stored.
● Next: A reference to the next node.
Since there’s no back-reference, traversal is only possible in one direction, from head to tail.
The last node's next is set to None.
Algorithm (Insertion at Beginning):
1. Create a new node with data.
2. Set new node’s next to current head.
3. Update head to new node.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None
    def insert_front(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node
Advantages:
● Dynamic size; memory efficient.
● Easy insertion/deletion at the beginning.
Disadvantages:
● No backward traversal.
● Accessing elements requires traversal from head.
Applications:
● Memory-efficient stacks/queues.
● Image viewers (next-only navigation).
● Dynamic memory allocation.
2. Doubly Linked List
Definition:
A Doubly Linked List is a linear data structure where each node contains a reference to both
the next and previous nodes, allowing bidirectional traversal.
Explanation:
Unlike singly linked lists, doubly linked lists provide two-way navigation. Each node
contains:
● Data
● Next: Points to the next node.
● Prev: Points to the previous node.
This makes operations like deletion and backward traversal efficient.
Algorithm (Insertion at End):
1. Create a new node.
2. Traverse to the last node.
3. Set last node’s next to new node.
4. Set new node’s prev to last node.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None
    def insert_end(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        temp = self.head
        while temp.next:
            temp = temp.next
        temp.next = new_node
        new_node.prev = temp
Advantages:
● Can be traversed both ways.
● Efficient deletion from any position.
Disadvantages:
● Requires extra memory for prev pointer.
● More complex to implement.
Applications:
● Browser history (forward and back).
● Music playlists.
● Navigation systems.
3. Circularly Linked List
Definition:
A Circular Linked List is a variation where the last node points back to the first node, forming
a circle.
Explanation:
Circular linked lists can be singly or doubly linked. There’s no NULL at the end — the list
loops continuously. It allows circular traversal without null checks.
Algorithm (Insertion at End):
1. Create a new node.
2. If list is empty, set next of new node to itself.
3. Else, traverse to last node.
4. Point last node’s next to new node.
5. Point new node’s next to head.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class CircularLinkedList:
    def __init__(self):
        self.head = None
    def insert_end(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = new_node
            return
        temp = self.head
        while temp.next != self.head:
            temp = temp.next
        temp.next = new_node
        new_node.next = self.head
Advantages:
● Efficient circular navigation.
● Can be used to cycle through lists repeatedly.
Disadvantages:
● Extra care needed to avoid infinite loops.
● Slightly more complex than singly linked list.
Applications:
● Round-robin scheduling.
● Multiplayer board games.
● Circular buffers.
4. Implementation of Stack and Queue (Insertion &
Deletion)
Stack (LIFO)
Definition:
A stack is a linear data structure that follows the Last In First Out (LIFO) principle.
Explanation:
In a stack, the most recently added item is removed first. Operations:
● push: Insert element at top.
● pop: Remove element from top.
Code (Using list):
stack = []
def push(val):
    stack.append(val)
def pop():
    if stack:
        return stack.pop()
Queue (FIFO)
Definition:
A queue is a linear data structure that follows the First In First Out (FIFO) principle.
Explanation:
The first inserted item is removed first. Operations:
● enqueue: Insert at rear.
● dequeue: Remove from front.
Code (Using list):
queue = []
def enqueue(val):
    queue.append(val)
def dequeue():
    if queue:
        return queue.pop(0)  # O(n): remaining elements shift left
Advantages:
● Stack: Simple to use in recursion and expression parsing.
● Queue: Fair ordering (scheduling).
Disadvantages:
● Stack: Not suitable for FIFO problems.
● Queue: In list, dequeue is slow (O(n)).
Applications:
● Stack: Undo feature, recursion, syntax parsing.
● Queue: Print queue, CPU scheduling, data buffering.
5. Double Ended Queue (Deque)
Definition:
A Deque is a data structure that allows insertion and deletion from both front and rear ends.
Explanation:
A deque combines features of stacks and queues. It supports:
● insert_front()
● insert_rear()
● delete_front()
● delete_rear()
Algorithm:
1. Maintain front and rear pointers.
2. Update pointers appropriately for each operation.
Code (Using collections.deque):
from collections import deque
dq = deque()
dq.append(10)     # insert rear
dq.appendleft(5)  # insert front
dq.pop()          # delete rear
dq.popleft()      # delete front
Advantages:
● Flexible operations from both ends.
● Efficient in real-time applications.
Disadvantages:
● More complex to implement manually.
● May require more memory.
Applications:
● Palindrome checking.
● Task scheduling.
● Browser history navigation.
6. List out and explain various operations of Linked List
Definition:
Linked List operations are the fundamental actions that can be performed on a linked list
data structure to manipulate its elements, including insertion, deletion, traversal, searching,
updating, and reversing.
Explanation:
A Linked List is a dynamic data structure consisting of nodes. Each node has two parts:
● Data: Stores the actual value.
● Pointer: References the next node in the list.
Linked lists allow efficient memory usage and dynamic resizing, unlike arrays. Here are the
main operations commonly performed on singly linked lists (though similar concepts apply
to doubly and circular linked lists too):
🔹 1. Insertion
Purpose: Add a new element at a specific position in the list.
Types:
● At beginning
● At end
● At a specific position
Algorithm (Insert at beginning):
1. Create a new node.
2. Set its next to point to the current head.
3. Make the new node the new head.
def insert_beginning(self, data):
    new_node = Node(data)
    new_node.next = self.head
    self.head = new_node
🔹 2. Deletion
Purpose: Remove an element from the list.
Types:
● From beginning
● From end
● By value/key
Algorithm (Delete from beginning):
1. Check if the list is empty.
2. If not, point head to the next node.
def delete_beginning(self):
    if self.head:
        self.head = self.head.next
🔹 3. Traversal
Purpose: Visit and display all nodes of the list.
Algorithm:
1. Start at head.
2. Loop through each node until None is reached.
def traverse(self):
    temp = self.head
    while temp:
        print(temp.data, end=" -> ")
        temp = temp.next
🔹 4. Search
Purpose: Find whether a specific value exists in the list.
Algorithm:
1. Start at head.
2. Compare each node’s data with the target.
3. Return True if found, else False.
def search(self, key):
    temp = self.head
    while temp:
        if temp.data == key:
            return True
        temp = temp.next
    return False
🔹 5. Update
Purpose: Modify the value of a specific node.
Algorithm:
1. Traverse the list to the target index or value.
2. Change the data field.
def update(self, old, new):
    temp = self.head
    while temp:
        if temp.data == old:
            temp.data = new
            return
        temp = temp.next
🔹 6. Reverse
Purpose: Reverse the order of nodes in the list.
Algorithm:
1. Initialize three pointers: prev = None, curr = head, next = None
2. Traverse the list, reversing the next pointer at each step.
def reverse(self):
    prev = None
    curr = self.head
    while curr:
        nxt = curr.next  # save the next node (avoids shadowing built-in next)
        curr.next = prev
        prev = curr
        curr = nxt
    self.head = prev
Advantages:
● Dynamic memory allocation: Grows and shrinks at runtime.
● Efficient insertion/deletion: Especially at the beginning.
● No memory wastage: Unlike fixed-size arrays.
Disadvantages:
● No direct access: Must traverse nodes to access elements.
● Extra memory: Each node requires pointer storage.
● Slower access time: Compared to arrays (O(n) vs O(1) for access).
Applications:
● Implementation of stacks and queues
● Symbol tables in compilers
● Dynamic memory allocation
● Undo/redo functionality
● Navigation systems (forward/back)
UNIT - 3
1. Bubble Sort
Definition:
Bubble Sort is a simple comparison-based sorting algorithm where adjacent elements are
compared and swapped if they are in the wrong order. This process continues until the list is
sorted.
Explanation:
In each pass, the largest element "bubbles up" to the end. The number of passes needed is
n-1 for n elements. It's inefficient for large data sets.
Algorithm:
1. Repeat for i = 0 to n-1
2. For j = 0 to n-i-2:
○ If arr[j] > arr[j+1], swap them
Code:
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
Advantages:
● Easy to implement
● No extra memory required
Disadvantages:
● Time complexity: O(n²)
● Inefficient for large datasets
Applications:
● Small datasets
● Teaching sorting concepts
2. Selection Sort
Definition:
Selection Sort is a comparison-based algorithm that selects the smallest (or largest)
element and places it at the correct position in each pass.
Explanation:
In every iteration, the smallest element is selected and swapped with the first unsorted
element.
Algorithm:
1. Repeat for i = 0 to n-1:
○ Find the minimum element from i to n-1
○ Swap with element at index i
Code:
def selection_sort(arr):
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
Advantages:
● Simple logic
● Works well on small data
Disadvantages:
● O(n²) time
● Not adaptive or stable
Applications:
● Teaching selection principles
● When memory writes are costly
3. Merge Sort
Definition:
Merge Sort is a divide-and-conquer algorithm that divides the array into halves, sorts them
recursively, and then merges them.
Explanation:
It divides the array until each sub-array has one element, then merges them in sorted order.
Algorithm:
1. Divide the array into two halves
2. Recursively sort each half
3. Merge the two sorted halves
Code:
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left = arr[:mid]
        right = arr[mid:]
        merge_sort(left)
        merge_sort(right)
        i = j = k = 0
        while i < len(left) and j < len(right):
            if left[i] < right[j]:
                arr[k] = left[i]
                i += 1
            else:
                arr[k] = right[j]
                j += 1
            k += 1
        while i < len(left):
            arr[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            arr[k] = right[j]
            j += 1
            k += 1
Advantages:
● Time complexity: O(n log n)
● Stable sort
● Works well for large data
Disadvantages:
● Requires extra memory
● Not in-place
Applications:
● Sorting linked lists
● External sorting (disk-based data)
4. Quick Sort
Definition:
Quick Sort is a divide-and-conquer sorting algorithm that selects a pivot and partitions the
array into two halves around it.
Explanation:
Elements smaller than the pivot go to the left, and larger to the right. The process is
repeated recursively.
Algorithm:
1. Choose a pivot
2. Partition array into two parts
3. Recursively apply quick sort to each part
Code:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    less = [x for x in arr[1:] if x <= pivot]
    greater = [x for x in arr[1:] if x > pivot]
    # Note: this version builds new lists; the classic variant partitions
    # within the array itself.
    return quick_sort(less) + [pivot] + quick_sort(greater)
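The list-comprehension version above is concise but allocates new lists at each level. An in-place sketch using the Lomuto partition scheme (an illustrative addition, not from the notes):

```python
def quick_sort_inplace(arr, low=0, high=None):
    # Sorts arr[low..high] in place; no auxiliary lists are created.
    if high is None:
        high = len(arr) - 1
    if low < high:
        # Lomuto partition: the last element is the pivot.
        pivot = arr[high]
        i = low - 1
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1  # pivot's final position
        quick_sort_inplace(arr, low, p - 1)
        quick_sort_inplace(arr, p + 1, high)

data = [5, 2, 9, 1, 5, 6]
quick_sort_inplace(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```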
Advantages:
● Fast on average: O(n log n); the classic partition-based version also sorts
in place
● Efficient for large data
Disadvantages:
● Worst-case: O(n²)
● Not stable
Applications:
● Fast sorting for in-memory data
● System-level sorting (e.g., C’s qsort())
5. Insertion Sort
Definition:
Insertion Sort is a comparison-based sorting method that builds the final sorted array one
element at a time.
Explanation:
Each new element is compared with previous elements and inserted in its correct place.
Algorithm:
1. Start from index 1 to n-1
2. Compare current with all elements before it
3. Shift larger elements and insert current at correct position
Code:
def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
Advantages:
● Easy to implement
● Efficient for nearly sorted data
Disadvantages:
● O(n²) in worst case
● Not suitable for large datasets
Applications:
● Small datasets
● Real-time systems (due to low overhead)
6. Linear Search
Definition:
Linear Search is a sequential search method where each element is compared with the
target one by one.
Explanation:
It does not require a sorted list and works by iterating over all elements until the target is
found.
Algorithm:
1. Loop through each element.
2. If match is found, return index.
3. Else, return -1.
Code:
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1
Advantages:
● Simple to implement
● Works on unsorted lists
Disadvantages:
● O(n) time
● Inefficient for large data
Applications:
● Small datasets
● Unsorted arrays
7. Binary Search
Definition:
Binary Search is a search algorithm that works on sorted arrays by repeatedly dividing the
search interval in half.
Explanation:
It compares the target with the middle element and eliminates half the array at each step.
Algorithm:
1. Set low and high pointers
2. While low <= high:
○ Find mid
○ Compare mid with target
○ Narrow the search range
Code:
def binary_search(arr, target):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
Advantages:
● O(log n) time
● Efficient for large, sorted arrays
Disadvantages:
● Requires sorted array
● More complex than linear search
Applications:
● Searching in databases
● Lookup in dictionaries/maps
8. Collision Handling Techniques (Hashing)
Definition:
In hashing, collisions occur when two keys hash to the same index. Collision handling
techniques manage these cases.
Explanation:
Since collisions are inevitable in hashing, strategies are needed to store and retrieve data
correctly.
Types of Collision Handling:
🔹 1. Chaining (Separate Chaining):
Each index of the hash table points to a linked list. All items hashing to the same index are
stored in that list.
Code:
hash_table = [[] for _ in range(10)]
def insert(key, value):
    index = key % 10
    hash_table[index].append((key, value))
🔹 2. Open Addressing:
If a collision occurs, find another empty slot in the table.
● Linear Probing: Check the next cell (i+1, i+2, ...)
● Quadratic Probing: Check i+1², i+2², i+3², ...
● Double Hashing: Use a second hash function to calculate a step size
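A minimal linear-probing sketch (added for illustration; the notes describe open addressing only in prose):

```python
SIZE = 10
table = [None] * SIZE  # None marks an empty slot

def probe_insert(key, value):
    # Linear probing: on collision, try the next slot, wrapping around.
    idx = key % SIZE
    for _ in range(SIZE):
        if table[idx] is None:
            table[idx] = (key, value)
            return idx
        idx = (idx + 1) % SIZE
    raise RuntimeError("hash table full")

def probe_search(key):
    idx = key % SIZE
    for _ in range(SIZE):
        if table[idx] is None:
            return None          # hit an empty slot: key is absent
        if table[idx][0] == key:
            return table[idx][1]
        idx = (idx + 1) % SIZE
    return None

probe_insert(7, "a")
probe_insert(17, "b")    # collides with 7, lands in slot 8
print(probe_search(17))  # b
```

Deletion under open addressing needs tombstone markers, which this sketch omits.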
Advantages:
● Efficient memory usage
● Fast search with proper collision handling
Disadvantages:
● Collisions slow down performance
● Complex to implement correctly
Applications:
● Hash tables (dictionaries, sets)
● Databases
● Symbol tables in compilers
UNIT - 4
1. Binary Search Tree (BST)
Definition:
A Binary Search Tree is a special type of binary tree where each node follows the property:
Left child < Parent < Right child
Explanation:
In a BST:
● All left descendants must be less than the node.
● All right descendants must be greater than the node.
This allows fast searching, insertion, and deletion in O(log n) time (for balanced
trees).
Algorithm (Insertion):
1. Compare the value to insert with the root.
2. If smaller, move left; if greater, move right.
3. Repeat until the correct leaf position is found.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.left = self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.data:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root
Advantages:
● Fast searching, insertion, deletion (O(log n) for balanced trees)
● Maintains sorted order
Disadvantages:
● Can become unbalanced (e.g., on sorted input), degrading to O(n) time
● Needs self-balancing (like AVL) to guarantee O(log n)
Applications:
● Searching and sorting
● Map and Set implementations
● Indexing in databases
2. Tree Traversal (Inorder, Preorder, Postorder)
Definition:
Tree traversal is a method of visiting all the nodes in a tree. Three common types for binary
trees are:
● Inorder (LNR)
● Preorder (NLR)
● Postorder (LRN)
Explanation:
● Inorder: Left → Root → Right
Used to get sorted order in BST.
● Preorder: Root → Left → Right
Used to copy the tree structure.
● Postorder: Left → Right → Root
Used to delete/free the tree.
Algorithm:
def inorder(root):
    if root:
        inorder(root.left)
        print(root.data, end=' ')
        inorder(root.right)

def preorder(root):
    if root:
        print(root.data, end=' ')
        preorder(root.left)
        preorder(root.right)

def postorder(root):
    if root:
        postorder(root.left)
        postorder(root.right)
        print(root.data, end=' ')
Advantages:
● Systematic way to visit nodes
● Useful for expression trees, memory management, file systems
Disadvantages:
● Recursion may cause stack overflow for deep trees
Applications:
● Expression parsing (Pre/Post)
● Sorting and searching (Inorder in BSTs)
● Compiler design
3. Tree ADT & Binary Tree ADT
Definition:
A Tree ADT (Abstract Data Type) defines a hierarchical structure where each node has a
value and a list of child nodes.
A Binary Tree ADT restricts each node to have at most two children: left and right.
Explanation:
● Tree ADT: General tree with any number of children.
● Binary Tree ADT: Each node has at most 2 children.
● Operations: insert, delete, traverse, find height, find level, etc.
Algorithm (General insertion for binary tree):
1. Start from the root.
2. Use level-order (BFS) to find the first empty spot.
3. Insert new node there.
Code (Binary Tree):
class Node:
    def __init__(self, data):
        self.data = data
        self.left = self.right = None
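The level-order (BFS) insertion described in the algorithm above can be sketched as follows (an illustrative addition; the Node class is repeated so the sketch runs on its own):

```python
from collections import deque

class Node:
    def __init__(self, data):
        self.data = data
        self.left = self.right = None

def insert_level_order(root, data):
    # BFS for the first node missing a child; attach the new node there.
    new_node = Node(data)
    if root is None:
        return new_node
    q = deque([root])
    while q:
        curr = q.popleft()
        if curr.left is None:
            curr.left = new_node
            return root
        q.append(curr.left)
        if curr.right is None:
            curr.right = new_node
            return root
        q.append(curr.right)

root = None
for v in [1, 2, 3, 4]:
    root = insert_level_order(root, v)
print(root.data, root.left.data, root.right.data, root.left.left.data)  # 1 2 3 4
```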
Advantages:
● Flexible, hierarchical structure
● Enables efficient algorithms (e.g., heaps, trees, tries)
Disadvantages:
● Complex implementation compared to arrays
● Recursion may lead to stack issues
Applications:
● Decision trees, syntax trees, file systems
● Used in expression evaluation
● Routing algorithms
4. Heaps
Definition:
A Heap is a complete binary tree that satisfies the heap property:
● Max Heap: Parent ≥ children
● Min Heap: Parent ≤ children
Explanation:
● Always a complete binary tree (every level filled left to right).
● Implemented using arrays.
● Used for priority queues and heap sort.
Algorithm (Insertion in Max Heap):
1. Insert element at end.
2. Compare with parent; if larger, swap.
3. Repeat until heap property is restored.
Code (Heapify – Max Heap):
def heapify(arr, n, i):
    largest = i
    l = 2*i + 1
    r = 2*i + 2
    if l < n and arr[l] > arr[largest]:
        largest = l
    if r < n and arr[r] > arr[largest]:
        largest = r
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)
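The insertion steps listed above (append at the end, then sift up toward the root) can be sketched separately from heapify (a self-contained illustration, not from the notes):

```python
def heap_insert(heap, val):
    # Step 1: place the new element at the end of the array.
    heap.append(val)
    i = len(heap) - 1
    # Steps 2-3: swap with the parent while the max-heap property is violated.
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] > heap[parent]:
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break

h = []
for v in [3, 9, 2, 1, 7]:
    heap_insert(h, v)
print(h[0])  # 9, the largest element sits at the root
```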
Advantages:
● Efficient priority queue (O(log n))
● Fast sorting: Heap Sort (O(n log n))
Disadvantages:
● Slower than BST for searching
● Requires complete tree structure
Applications:
● Priority queues
● CPU scheduling
● Heap sort
5. AVL Trees
Definition:
AVL Tree is a self-balancing binary search tree where the difference in heights of the left
and right subtrees (balance factor) is at most 1 for all nodes.
Explanation:
● Ensures balanced tree for efficient operations (O(log n)).
● Performs rotations (left, right, double) after insertions/deletions to maintain balance.
Balance Factor = height(left) - height(right)
Valid range: -1, 0, +1
Algorithm (Insertion with balancing):
1. Perform normal BST insertion.
2. Update heights and balance factor.
3. If unbalanced, apply:
○ Left Rotation (Right-heavy)
○ Right Rotation (Left-heavy)
○ Left-Right Rotation
○ Right-Left Rotation
Code (Basic Insertion with Balancing Check):
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.height = 1

def get_height(node):
    return node.height if node else 0

def get_balance(node):
    return get_height(node.left) - get_height(node.right)

def right_rotate(z):
    y = z.left
    T3 = y.right
    y.right = z
    z.left = T3
    z.height = max(get_height(z.left), get_height(z.right)) + 1
    y.height = max(get_height(y.left), get_height(y.right)) + 1
    return y

def left_rotate(z):
    y = z.right
    T2 = y.left
    y.left = z
    z.right = T2
    z.height = max(get_height(z.left), get_height(z.right)) + 1
    y.height = max(get_height(y.left), get_height(y.right)) + 1
    return y
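A hedged sketch of an insertion routine that ties the rotations above together, covering the four rebalancing cases from the algorithm (self-contained; the node class and helpers are repeated in compact form so it runs on its own):

```python
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.height = 1

def h(n):
    return n.height if n else 0

def balance(n):
    return h(n.left) - h(n.right)

def rot_right(z):
    y = z.left
    z.left, y.right = y.right, z  # rotate: y becomes the new subtree root
    z.height = 1 + max(h(z.left), h(z.right))
    y.height = 1 + max(h(y.left), h(y.right))
    return y

def rot_left(z):
    y = z.right
    z.right, y.left = y.left, z
    z.height = 1 + max(h(z.left), h(z.right))
    y.height = 1 + max(h(y.left), h(y.right))
    return y

def avl_insert(root, key):
    # 1. Normal BST insertion.
    if root is None:
        return AVLNode(key)
    if key < root.key:
        root.left = avl_insert(root.left, key)
    else:
        root.right = avl_insert(root.right, key)
    # 2. Update height and balance factor.
    root.height = 1 + max(h(root.left), h(root.right))
    b = balance(root)
    # 3. Four rebalancing cases.
    if b > 1 and key < root.left.key:    # Left-Left
        return rot_right(root)
    if b < -1 and key > root.right.key:  # Right-Right
        return rot_left(root)
    if b > 1 and key > root.left.key:    # Left-Right
        root.left = rot_left(root.left)
        return rot_right(root)
    if b < -1 and key < root.right.key:  # Right-Left
        root.right = rot_right(root.right)
        return rot_left(root)
    return root

root = None
for k in [10, 20, 30]:  # sorted input would skew a plain BST
    root = avl_insert(root, k)
print(root.key)  # 20, a left rotation rebalanced the tree
```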
Advantages:
● Always balanced → guaranteed O(log n)
● Fast insertion/deletion with balancing
Disadvantages:
● Complex to implement
● Rotations add overhead
Applications:
● High-performance searching
● Real-time systems (where time constraints matter)
● Indexing in databases
UNIT - 5
1. Breadth-First Search (BFS)
Definition:
BFS is a graph traversal algorithm that visits nodes level by level starting from a selected
source vertex, using a queue.
Explanation:
● Visits all immediate neighbors of a node before moving to the next level.
● Uses a queue (FIFO).
● Works well for unweighted graphs to find shortest path (in terms of number of
edges).
Algorithm:
1. Mark the source node as visited.
2. Add it to the queue.
3. While the queue is not empty:
○ Remove a node from queue.
○ Visit all its unvisited neighbors, mark them as visited and add to the queue.
Code:
from collections import deque

def bfs(graph, start):
    visited = set()
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex not in visited:
            print(vertex, end=" ")
            visited.add(vertex)
            queue.extend(n for n in graph[vertex] if n not in visited)
Advantages:
● Finds shortest path in unweighted graphs.
● Simple and easy to implement.
Disadvantages:
● Requires more memory (due to queue).
● Not suitable for deep graphs.
Applications:
● Web crawlers
● Social networks (friend suggestions)
● GPS navigation (shortest path)
2. Depth-First Search (DFS)
Definition:
DFS is a graph traversal algorithm that explores as far along a branch as possible before
backtracking, using a stack (or recursion).
Explanation:
● Uses stack (LIFO) or recursion.
● Useful for cycle detection, topological sort, and connected components.
Algorithm:
1. Start from a source node, mark as visited.
2. Visit one neighbor and go deeper.
3. Backtrack when no unvisited neighbor remains.
Code:
def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    print(start, end=" ")
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
Advantages:
● Uses less memory than BFS.
● Good for deep traversal and solving puzzles.
Disadvantages:
● May get stuck in deep paths.
● Doesn’t guarantee shortest path.
Applications:
● Solving mazes
● Topological sorting
● Strongly connected components (Tarjan's, Kosaraju's)
3. Minimum Spanning Tree – Kruskal’s Algorithm
Definition:
Kruskal’s Algorithm finds a minimum spanning tree (MST) by adding the next smallest edge
that doesn’t form a cycle, using the disjoint-set (union-find) data structure.
Explanation:
● Sorts all edges by weight.
● Adds edges to MST in ascending order if they don’t form a cycle.
● Suitable for sparse graphs.
Algorithm:
1. Sort all edges by weight.
2. Initialize MST as empty.
3. For each edge:
○ If adding it doesn’t form a cycle (using union-find), add it to MST.
Code (simplified):
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def kruskal(n, edges):
    ds = DisjointSet(n)
    mst = []
    edges.sort(key=lambda x: x[2])  # sort edges by weight
    for u, v, w in edges:
        if ds.find(u) != ds.find(v):  # adding this edge forms no cycle
            ds.union(u, v)
            mst.append((u, v, w))
    return mst
Advantages:
● Simple and efficient for sparse graphs.
● Always gives correct MST.
Disadvantages:
● Needs sorting and disjoint-set.
● Slower for dense graphs.
Applications:
● Network design (roads, pipes, electrical)
● Cluster analysis
● Image segmentation
4. Prim’s Algorithm
Definition:
Prim’s Algorithm builds the MST by expanding from a starting node and adding the smallest
edge that connects to the MST.
Explanation:
● Grows MST one node at a time.
● Uses a priority queue (min heap) for efficiency.
● Best for dense graphs.
Algorithm:
1. Start with one node.
2. Add the smallest edge that connects MST to a new node.
3. Repeat until all nodes are included.
Code (using heapq):
import heapq

def prim(graph, start):
    visited = set()
    min_heap = [(0, start)]
    total_cost = 0
    while min_heap:
        weight, node = heapq.heappop(min_heap)
        if node not in visited:
            visited.add(node)
            total_cost += weight
            for neighbor, w in graph[node]:
                if neighbor not in visited:
                    heapq.heappush(min_heap, (w, neighbor))
    return total_cost
Advantages:
● Fast for dense graphs.
● Greedy and optimal.
Disadvantages:
● Needs priority queue.
● Slower on sparse graphs compared to Kruskal.
Applications:
● Network routing
● Telecommunication
● Road map optimization
5. Topological Ordering
Definition:
Topological ordering is a linear ordering of vertices in a Directed Acyclic Graph (DAG) such
that for every edge u → v, u comes before v.
Explanation:
● Only possible for DAGs.
● Used in scheduling and dependency resolution.
Algorithm (Using DFS):
1. Perform DFS.
2. On return from recursion, push the node onto a stack.
3. Reverse the stack for topological order.
Code:
def topological_sort(graph):
    # Assumes every vertex appears as a key in graph.
    visited = set()
    stack = []
    def dfs(v):
        visited.add(v)
        for neighbor in graph[v]:
            if neighbor not in visited:
                dfs(neighbor)
        stack.append(v)
    for vertex in graph:
        if vertex not in visited:
            dfs(vertex)
    return stack[::-1]
Advantages:
● Helps in scheduling tasks
● Useful for detecting cycles in graphs
Disadvantages:
● Only works on DAGs
● Requires full traversal of graph
Applications:
● Course scheduling
● Build systems (Make, Gradle)
● Job/task dependency management
6. Graph Representations
Definition:
Graph representation is the method used to store and model a graph in a computer’s
memory.
Explanation:
There are 3 main ways to represent graphs:
🔹 1. Adjacency Matrix
● 2D array n x n, where matrix[i][j] = 1 if edge exists
● Easy edge lookup: O(1)
● Space: O(n²)
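A small adjacency-matrix sketch for a 4-vertex undirected graph (added for illustration):

```python
n = 4
# matrix[i][j] = 1 means there is an edge between vertices i and j.
matrix = [[0] * n for _ in range(n)]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
for u, v in edges:
    matrix[u][v] = 1
    matrix[v][u] = 1  # undirected: mirror the entry

print(matrix[0][1])  # 1, edge lookup is O(1)
print(matrix[1][3])  # 0, no edge between 1 and 3
```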
🔹 2. Adjacency List
● Dictionary or list of lists
● Each vertex stores a list of its neighbors
● Space: O(V + E)
● Efficient for sparse graphs
🔹 3. Edge List
● List of all edges as (u, v, w)
● Used in Kruskal's algorithm
● Easy for sorting
Code (Adjacency List):
graph = {
    0: [1, 2],
    1: [2],
    2: [0, 3],
    3: [3]
}
Advantages:
● Adjacency list: Efficient for sparse graphs
● Adjacency matrix: Fast for dense graphs
● Edge list: Compact and ideal for edge-centric algorithms
Disadvantages:
● Matrix wastes space for sparse graphs
● List slower for edge lookups
● Edge list lacks node relationship clarity
Applications:
● Used in BFS, DFS, Dijkstra, Prim, Kruskal, etc.
● Modeling social networks, road networks, maps
● Routing and scheduling
Credit : RK