Data Structure
Definition:
A data structure is a specific way to organize and store data in the computer’s memory, so
this data can be used efficiently.
Types:
Primitive Data Structures:
These are predefined ways of storing data by the system. e.g., Integer, Float, Character,
Boolean
Non-Primitive Data Structures:
These emphasize on grouping the same(homogeneous) or different(heterogeneous) data
items with a relationship between each data item. There are two types
1. Linear Data Structures
2. Non Linear Data Structures
Linear Data Structures:
Here the data elements are organized in a specific or definite sequence.
1. Arrays
2. Linked List
3. Stack
4. Queue
Non Linear Data Structures:
Here data elements are organized in some arbitrary function without a specific sequence.
1. Trees
2. Graphs
Abstract Data Types:
The process of providing only specifications and hiding the full implementation is called
abstraction. In a similar way the Abstract Data Type (ADT) is a specification of behaviour of
the given Data Structures.
ADTs define the different operations that can be performed on the given data structure. Note
that it only defined the operations but not the full implementations of the same. So, it is kind
of an interface on the Data Structures.
How to measure Data Structures:
Technically a data structure is some kind of an algorithm that works with some principles. So
to measure DS we have to properties:
1. Time Complexity
2. Space Complexity
Time Complexity:
The time complexity of an algorithm quantifies the amount of time taken by an algorithm to
run as a function of the length of the input. Note that the time to run is a function of the
length of the input and not the actual execution time of the machine on which the algorithm is
running on.
The most common metric to calculate the time complexity is Big O notation i.e., O(n).
e.g., printf("Hello World\n");
Here this is a single statement and hence the time complexity will always be a constant.
for(i=0; i< n; i++) {
printf("The element is %d\n", i);
}
Here the running time is directly proportional to the length of a loop that is n
The next is the nested loops. Nested loops form the quadratic or non-linear behaviour. And
hence the time complexity will be generally to the proportional of n.
for(i=0; i< n; i++) {
for(j = 0; j < n; j++) {
printf("The element is %d * %d\n", i, j);
}
}
So, here there is a loop inside another loop and hence time complexity would be n*n or n2
Space Complexity:
It determines the amount of memory required by the given algorithm.
1. Fixed size Memory: Space required for simple variables, constants, or array.
2. Variable size Memory: Here the memory is allocated at run-time.
Space complexity also uses the Big O notation i.e., O(n)
Note: Always algorithm is traded off between space and time depending on the requirements
and the place where it is run.
Asymptotic Notations for Complexity of Algorithms:
Asymptotic analysis refers to computing the running time of any operation in mathematical
units of computation. Using this we conclude the best case, average case, and worst case of
an algorithm. Following are the commonly used asymptotic notations:
1. Big O Notation (O)
2. Omega Notation (Ω)
3. Theta Notation (Θ)
Strings
Definition:
A string is a sequence of characters that is treated as a single data item. Each character
within the string is stored as an element of the array in contiguous.
ADTs:
1. Concatenation
2. Length
3. Substring
4. Reverse
5. Indexing
Concatenation:
Concatenation is adding one string to another string.
e.g.,
#include <stdio.h>
#include <string.h>
int main() {
char str1[100] = "Hello, ";
char str2[] = "World!";
// Concatenate str2 to str1
strcat(str1, str2);
printf("Concatenated String: %s\n", str1);
return 0;
}
As in the above example, there is a built-in library in C via string.h import which uses
strcat(src, dest) as the function to concatenate 2 strings. Note that str2 is appended to the
str1 and hence str1 will contain the whole concatenated string.
Length:
It gives the total number of characters in a string excluding the null terminator `\0`.
e.g.,
#include <stdio.h>
int main() {
char str[] = "Hello, World!";
int length = 0;
// Loop until null terminator is found
while (str[length] != '\0') {
length++;
}
printf("Length of string: %d\n", length);
return 0;
}
So, manually one can iterate over a string as it is technically an array of characters, until the
null terminator character `\0` is encountered. The above code is written in the same logic via
while loop.
Substring:
A substring is a subset of a mainstring.
e.g., “Hello World” The word “World” is a subset of a string “Hello World”.
#include <stdio.h>
void substring(char *source, char *target, int start, int length) {
for (int i = 0; i < length; i++) {
target[i] = source[start + i];
}
target[length] = '\0'; // Null-terminate
}
int main() {
char str[] = "Hello, World!";
char sub[20];
substring(str, sub, 7, 5); // Extract "World"
printf("Substring: %s\n", sub);
return 0;
}
● A loop copies length characters from source[start] to target[i].
● The substring is null-terminated to avoid garbage values.
Pattern Matching:
A pattern matching is basically a string searching, where a sequence of characters is
searched for occurrences in the main string.
e.g.,
#include<stdio.h>
int search(char *s, char *p, char l) {
int i = 0, j = 0, startInd = -1;
while(s[i] != '\0') {
if(s[i] == p[j]) {
j++;
if(startInd == -1) {
startInd = i;
}
if(p[j] == '\0') {
return startInd;
}
} else {
j = 0;
startInd = -1;
}
i++;
}
return startInd;
}
int main() {
char ms[] = "hello geeks";
char ps[] = "geeks";
char s = 'l';
int res = -1;
clrscr();
res = search(ms, ps, s);
if (res >= 0) {
printf("Found at the index = %d", res);
} else {
printf("Not Found");
}
getch();
return 0;
}
The main logic is each character of a sample string is compared against the main string until
it matches 100%, else it is a reset from that index position. During the match the start-index
is preserved by copying it in a separate variable i.e., startInd variable whose default value
is -1. So, when the method returns the value as -1 then it means the pattern has not
matched.
Indexing:
Indexing is the process of accessing a string’s individual character via its position or index
value.
e.g.,
#include <stdio.h>
int main() {
char str[] = "Hello, World!";
// Accessing individual characters using indexing
printf("First character: %c\n", str[0]);
printf("Fifth character: %c\n", str[4]);
printf("Last character: %c\n", str[12]); // '!' (before '\0')
return 0;
}
Note that in any programming the starting index is always 0, and hence the fifth character is
accessed via the 4th index as in the above example.
Reversing:
Reversing is printing the string from right to left order.
e.g.,
#include <stdio.h>
#include <string.h>
void reverseString(char *str) {
int left = 0, right = strlen(str) - 1;
while (left < right) {
// Swap characters
char temp = str[left];
str[left] = str[right];
str[right] = temp;
left++;
right--;
}
}
int main() {
char str[] = "Hello, World!"; // Make sure it's a mutable array
reverseString(str);
printf("Reversed String: %s\n", str);
return 0;
}
Explanation:
● Uses two pointers: one at the start and one at the end.
● Swaps characters and moves inward until the middle is reached.
Arrays
An array is a structured data type consisting of a group of elements of the same type. Some
important properties are:
1. All elements are of same type i.e., homogeneous
2. The individual data items can be characters, integers, floating point number etc.
3. Array elements are stored in contiguous memory locations
Linear Arrays:
A linear array is a list of a finite number of homogeneous elements. That is the elements of
the same data type. The general form is
int arrayname[size]
int array[3] = {10, 20, 30}
In the above example,
1. The index starts at 0.
2. Array length is 3 as mentioned within the square brackets.
3. Each element can be accessed via its index.
Representation of Linear Arrays in Memory:
In 1-D array location of an element A[i] can be determined using following equation:
LOC(A[i]) = Base Address + W * (i)
● Base Address is the virtual address in the heap memory (RAM)
● W is the word size. It means number of bytes required for each value, in this case
integer which needs 4bytes.
Arrays ADT:
Operations are:
1. Traverse
2. Insertion
3. Deletion
4. Update
5. Search
Traversing Linear Arrays:
A linear array can be traversed using the looping either with for or while, where with for loop
iteration is done till the length of the array, whereas using while loop it is iterated till the
increments index reaches the upper-bound which is length-1 in general.
#include <stdio.h>
int main() {
// Define an array
int arr[] = {10, 20, 30, 40, 50};
int length = sizeof(arr) / sizeof(arr[0]); // Calculate array length
// Traverse the array using a loop
printf("Array elements: ");
for (int i = 0; i < length; i++) {
printf("%d ", arr[i]); // Print each element
}
printf("\n");
return 0;
}
Output:
Array elements: 10 20 30 40 50
Note array can be defined without explicitly mentioning the size, however with the formula
the size can be determined.
int length = sizeof(arr) / sizeof(arr[0]); // Calculate array length
Insertion:
A new element can be added at the beginning, in between, at the end.
Position: 0 1 2 3 4
Item: 10 20 30 40 50
If the item 15 is to be inserted in position/index 2:
1. Create a new array of size 6, as the old array is of size 5, to add a new element size
has to be increased by 1. But in C this step is not required.
2. Next move the last item 50 by 1 position i.e., to 5.
3. Likewise move the next last item 40 by 1 position i.e., to 4. Continue till the index 2.
4. Now add the new element 15 to the index 2 and this array is inserted.
So, in general, the elements present towards the right of the index and the previous element
of the index are shifted right by 1 position and then the new element is inserted.
e.g.,
#include <stdio.h>
int main() {
int arr[100], n, i, element, position;
// Input: Size of the array
printf("Enter number of elements in array: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Input: Element to insert and position
printf("Enter element to insert: ");
scanf("%d", &element);
printf("Enter position (1 to %d): ", n+1);
scanf("%d", &position);
// Check if position is valid
if (position < 1 || position > n+1) {
printf("Invalid position!\n");
return 1;
}
// Shift elements to the right
for (i = n; i >= position; i--) {
arr[i] = arr[i - 1];
}
// Insert the new element
arr[position - 1] = element;
n++; // Increase array size
// Print the updated array
printf("Array after insertion: ");
for (i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
return 0;
}
Deletion:
A new element can be deleted at the beginning, in between, at the end.
Position: 0 1 2 3 4
Item: 10 20 30 40 50
Steps:
1. Find the element and its respective index.
2. Remove the element present at the index.
3. Move the elements present towards the right to left by 1 position.
#include <stdio.h>
int main() {
int arr[100], n, i, position;
// Input: Size of the array
printf("Enter number of elements in array: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Input: Position to delete
printf("Enter position to delete (1 to %d): ", n);
scanf("%d", &position);
// Check if position is valid
if (position < 1 || position > n) {
printf("Invalid position!\n");
return 1;
}
// Shift elements to the left to overwrite the deleted element
for (i = position - 1; i < n - 1; i++) {
arr[i] = arr[i + 1];
}
n--; // Reduce array size
// Print the updated array
printf("Array after deletion: ");
for (i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
return 0;
}
Update:
To update an element at a given index, it is done in the following way:
LA[<<index>>] = <<val>>;
Multidimensional Arrays:
Arrays which have elements with two subscripts are known as 2-D arrays. Matrix is one of
the good examples of 2D arrays.
The general form is:
datatype arrayname[size1][size2]
list [0] [1] [2] [3] [4]
[0]
[1] list[1][3]
[2] list[2][1]
[3]
Order Representation of Array:
Multi dimensional arrays can be stored in the memory mainly in 2 ways -
1. Row major
2. Column major
Row Major Order Representation of Array:
In this method elements of an array are arranged sequentially row by row. So, the elements
of the first row occupy the first set of memory locations and so on.
Note that each row takes the sequential memory address, in the above image observe from
the bottom to up.
Formula:
LOCATION(A[i,j]) = Base_Address/Start_Address + W[C(i) + (j)]
where,
● Base address is the address of the 1st element of the array.
● W is the word size. It means the number of bytes occupied by each element, e.g., int
will take 4bytes in C.
● R is the number of rows.
● C is the number of columns.
Column Major Order Representation of Array:
In the column major, each column value is stored in the sequential memory. Refer the above
image for both row and column major representation illustrations.
Formula:
LOCATION(A[i,j]) = Base_Address/Start_Address + W[R(i) + (j)]
where,
● Base address is the address of the 1st element of the array.
● W is the word size. It means the number of bytes occupied by each element, e.g., int
will take 4bytes in C.
● R is the number of rows.
● C is the number of columns.
Sparse Matrices:
A sparse matrix is a 2-d array with R rows and C columns, where most of the matrix values
are 0.
e.g.,
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
There are two common representations of sparse matrices in order to save the memory
consumed by the 0 values:
● Array representation
● Linked Representation
Array Representation:
Technically, only non zero elements are stored and 0 elements are formed at the run time.
Row → Index of row, where non-zero element is located.
Column → Index of column, where non-zero element is located.
Value → Value of the corresponding non-zero element.
Linked List Representation:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Row Column Value Address of next
node
0 2 3
↲
0 4 4
↲
1 2 5
↲
1 3 7
↲
3 1 2
↲
3 2 6
NULL ↲
Sorting and Searching
Sorting is a process of arranging data in a specified order.
Searching is a process of finding an element in the array or in any other data structure.
There are many sorting techniques and 3 main sorting techniques are
1. Bubble Sort
2. Insertion Sort
3. Selection Sort
Bubble Sort:
#include <stdio.h>
// Function to perform Bubble Sort
void bubbleSort(int arr[], int n) {
int i, j, temp;
for (i = 0; i < n - 1; i++) {
for (j = 0; j < n - i - 1; j++) {
if (arr[j] > arr[j + 1]) {
// Swap arr[j] and arr[j+1]
temp = arr[j];
arr[j] = arr[j + 1];
arr[j + 1] = temp;
}
}
}
}
// Function to print the array
void printArray(int arr[], int n) {
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
}
// Main function
int main() {
int arr[100], n, i;
// Input: Number of elements
printf("Enter number of elements: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Sort the array
bubbleSort(arr, n);
// Print sorted array
printf("Sorted array: ");
printArray(arr, n);
return 0;
}
Complexity Analysis:
● Best Case Time Complexity → O(n)
● Best Case Time Complexity → O(n2)
● Worst Case Time Complexity → O(n2)
● Space Complexity → O(1)
Insertion Sort:
#include <stdio.h>
// Function to perform Insertion Sort
void insertionSort(int arr[], int n) {
int i, j, key;
for (i = 1; i < n; i++) {
key = arr[i]; // Current element to insert
j = i - 1;
// Move elements that are greater than key one position ahead
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key; // Insert key in its correct position
}
}
// Function to print the array
void printArray(int arr[], int n) {
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
}
// Main function
int main() {
int arr[100], n, i;
// Input: Number of elements
printf("Enter number of elements: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Sort the array
insertionSort(arr, n);
// Print sorted array
printf("Sorted array: ");
printArray(arr, n);
return 0;
}
● Best Case Time Complexity → O(n)
● Best Case Time Complexity → O(n2)
● Worst Case Time Complexity → O(n2)
● Space Complexity → O(1)
Selection Sort:
#include <stdio.h>
// Function to perform Selection Sort
void selectionSort(int arr[], int n) {
int i, j, minIndex, temp;
for (i = 0; i < n - 1; i++) {
minIndex = i; // Assume the first element is the smallest
// Find the index of the smallest element in the unsorted part
for (j = i + 1; j < n; j++) {
if (arr[j] < arr[minIndex]) {
minIndex = j;
}
}
// Swap the found minimum element with the first element of the
unsorted part
temp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = temp;
}
}
// Function to print the array
void printArray(int arr[], int n) {
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\n");
}
// Main function
int main() {
int arr[100], n, i;
// Input: Number of elements
printf("Enter number of elements: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Sort the array
selectionSort(arr, n);
// Print sorted array
printf("Sorted array: ");
printArray(arr, n);
return 0;
}
● Best Case Time Complexity → O(n2)
● Best Case Time Complexity → O(n2)
● Worst Case Time Complexity → O(n2)
● Space Complexity → O(1)
Searching:
It is a process of finding an element in the array. There are mainly 2 search techniques:
1. Linear Search
2. Binary Search
Linear Search:
#include <stdio.h>
// Function to perform Linear Search
int linearSearch(int arr[], int n, int key) {
for (int i = 0; i < n; i++) {
if (arr[i] == key) {
return i; // Return index if found
}
}
return -1; // Return -1 if not found
}
// Main function
int main() {
int arr[100], n, key, i, result;
// Input: Number of elements
printf("Enter number of elements: ");
scanf("%d", &n);
// Input: Elements of the array
printf("Enter %d elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Input: Element to search
printf("Enter element to search: ");
scanf("%d", &key);
// Perform Linear Search
result = linearSearch(arr, n, key);
// Output result
if (result != -1) {
printf("Element found at index %d\n", result);
} else {
printf("Element not found in the array\n");
}
return 0;
}
Complexity:
● Best Case Time Complexity → O(1) : When the element is found in the 0th index.
● Best Case Time Complexity → O(n) : When the element is found in the middle or
extreme right.
● Worst Case Time Complexity → O(n) : When the element is found in the extreme
right or the last element.
● Space Complexity → O(1)
Binary Search:
The minimum criteria for binary search to work is that the array shall be sorted in a specific
order.
#include <stdio.h>
// Function to perform Binary Search
int binarySearch(int arr[], int left, int right, int key) {
while (left <= right) {
int mid = left + (right - left) / 2;
// Check if the key is at mid
if (arr[mid] == key)
return mid;
// If key is greater, ignore the left half
if (arr[mid] < key)
left = mid + 1;
else // If key is smaller, ignore the right half
right = mid - 1;
}
return -1; // Key not found
}
// Main function
int main() {
int arr[100], n, key, i, result;
// Input: Number of elements
printf("Enter number of elements: ");
scanf("%d", &n);
// Input: Sorted array elements
printf("Enter %d sorted elements: ", n);
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
// Input: Element to search
printf("Enter element to search: ");
scanf("%d", &key);
// Perform Binary Search
result = binarySearch(arr, 0, n - 1, key);
// Output result
if (result != -1) {
printf("Element found at index %d\n", result);
} else {
printf("Element not found in the array\n");
}
return 0;
}
Complexity:
● Best Case Time Complexity → O(1) : When the element is found in the middle index.
● Best Case Time Complexity → O(log n) : When the element is found in the middle or
extreme right.
● Worst Case Time Complexity → O(log n) : When the element is found in the extreme
right or the last element.
● Space Complexity → O(1)