UNIT V
SORTING, SEARCHING AND HASH TECHNIQUES
Searching - Linear search –Binary Search. Sorting- Bubble sort - Selection sort - Insertion
sort - Shell sort –Merge sort – Hash and Hash Functions – Separate Chaining – Open
Addressing – Rehashing – Extendible Hashing.
PART A
1. What is meant by Sorting?
Sorting is ordering of data in an increasing or decreasing fashion according to some
linear relationship among the data items.
2. List the different sorting algorithms.
• Bubble sort
• Selection sort
• Insertion sort
• Shell sort
• Quick sort
• Radix sort
• Heap sort
• Merge sort
3. Why bubble sort is called so?
The bubble sort gets its name because as array elements are sorted they gradually
“bubble” to their proper positions, like bubbles rising in a glass of soda.
4. State the logic of bubble sort algorithm.
The bubble sort repeatedly compares adjacent elements of an array. The first and second
elements are compared and swapped if out of order. Then the second and third elements are
compared and swapped if out of order. This sorting process continues until the last two
elements of the array are compared and swapped if out of order.
5. What number is always sorted to the top of the list by each pass of the Bubble
sort algorithm?
Each pass through the list places the next largest value in its proper place. In essence, each item
“bubbles” up to the location where it belongs.
6. When does the Bubble Sort Algorithm stop?
The bubble sort stops when it examines the entire array and finds that no "swaps" are
needed. The bubble sort keeps track of the occurring swaps by the use of a flag.
7. State the logic of selection sort algorithm.
It finds the lowest value from the collection and moves it to the left. This is repeated
until the complete collection is sorted.
8. What is the output of selection sort after the 2nd iteration given the following sequence?
16 3 46 9 28 14
Ans: 3 9 46 16 28 14
9. How does insertion sort algorithm work?
In every iteration an element is compared with all the elements before it. While comparing if
it is found that the element can be inserted at a suitable position, then space is created for it by
shifting the other elements one position up and inserts the desired element at the suitable
position. This procedure is repeated for all the elements in the list until we get the sorted
elements.
10. What operation does the insertion sort use to move numbers from the unsorted section
to the sorted section of the list?
The Insertion Sort uses the swap operation since it is ordering numbers within a single list.
11. How many key comparisons and assignments an insertion sort makes in its worst case?
The worst case performance in insertion sort occurs when the elements of the input array are in
descending order. In that case, the first pass requires one comparison, the second pass requires two
comparisons, third pass three comparisons….kth pass requires (k-1), and finally the last pass
requires (n-1) comparisons. Therefore, total numbers of comparisons are:
f(n) = 1+2+3+………+(n-k)+…..+(n-2)+(n-1) = n(n-1)/2 = O(n2)
12. Which sorting algorithm is best if the list is already sorted? Why?
Insertion sort as there is no movement of data if the list is already sorted and
complexity is of the order O(N).
13. Which sorting algorithm is easily adaptable to singly linked lists? Why?
Insertion sort is easily adaptable to singly linked list. In this method there is an array link
of pointers, one for each of the original array elements. Thus the array can be thought of as a
linear link list pointed to by an external pointer first initialized to 0. To insert the k th element the
linked list is traversed until the proper position for x[k] is found, or until the end of the list is
reached. At that point x[k] can be inserted into the list by merely adjusting the pointers without
shifting any elements in the array which reduces insertion time.
14. Why Shell Sort is known diminishing increment sort?
The distance between comparisons decreases as the sorting
algorithm runs until the last phase in which adjacent elements are compared. In each step, the
sortedness of the sequence is increased, until in the last step it is completely sorted.
15. Which of the following sorting methods would be especially suitable to sort a list L
consisting of a sorted list followed by a few “random” elements?
Quick sort is suitable to sort a list L consisting of a sorted list followed by a few
“random” elements.
rd
16. What is the output of quick sort after the 3 iteration given the following sequence?
24 56 47 35 10 90 82 31
Pass 1:- (10) 24 (56 47 35 90 82 31)
Pass 2:- 10 24 (56 47 35 90 82 31)
Pass 3:- 10 24 (47 35 31) 56 (90 82)
17. Mention the different ways to select a pivot element.
The different ways to select a pivot element are
• Pick the first element as pivot
• Pick the last element as pivot
• Pick the Middle element as pivot
• Median-of-three elements
• Pick three elements, and find the median x of these elements
• Use that median as the pivot.
• Randomly pick an element as pivot.
18. What is divide-and-conquer strategy?
• Divide a problem into two or more sub problems
• Solve the sub problems recursively
• Obtain solution to original problem by combining these solutions
19. Compare quick sort and merge sort.
Quick sort has a best-case linear performance when the input is sorted, or nearly sorted. It has a
worst-case quadratic performance when the input is sorted in reverse, or nearly sorted in reverse.
Merge sort performance is much more constrained and predictable than the performance of quick sort.
The price for that reliability is that the average case of merge sort is slower than the average case of quick
sort because the constant factor of merge sort is larger.
20. Define Searching.
Searching for data is one of the fundamental fields of computing. Often, the difference between a fast
program and a slow one is the use of a good algorithm for the data set. Naturally, the use of a hash table or
binary search tree will result in more efficient searching, but more oftenthan not an array or linked list will
be used. It is necessary to understand good ways of searchingdata structures not designed to support
efficient search.
21. What is linear search?
In Linear Search the list is searched sequentially and the position is returned if the key element to be
searched is available in the list, otherwise -1 is returned. The search in Linear Search starts at the beginning
of an array and move to the end, testing for a match at each item.
22. What is Binary search?
A binary search, also called a dichotomizing search, is a digital scheme for locating a specific object in a
large set. Each object in the set is given a key. The number of keys is always a power of 2. If there are 32
items in a list, for example, they might be numbered 0 through 31 (binary 00000 through 11111). If there
are, say, only 29 items, they can be numbered 0 through 28 (binary 00000 through 11100), with the numbers
29 through31 (binary 11101, 11110, and 11111) as dummy keys.
23. Define hash function?
Hash function takes an identifier and computes the address of that identifier in the hash tableusing some
function.
PART B
1. Explain the algorithm to perform Bubble sort
Bubble sort
Consecutive adjacent pairs of elements in the array are compared with each other.
If the element at the lower index is greater than the element at the higher index the two
elements are interchanged so that the element is placed before the bigger one.
This process will continue till the list of unsorted elements exhausts
void bubble_sort( int A[ ], int n )
{
int temp;
for(int k = 0; k< n-1; k++)
{
// (n-k-1) is for ignoring comparisons of elements which have already been compared in earlier iterations
for(int i = 0; i < n-k-1; i++)
{
if(A[ i ] > A[ i+1] )
{
temp = A[ i ];
A[ i ] = A[ i+1 ];
A[ i + 1] = temp;
}
}
}
}
2. Explain the algorithm to perform Insertion sort
Insertion Sort Algorithm
Procedure:
Step 1: The second element of an array is compared with the elements that appear before it (only first
element in this case). If the second element is smaller than first element, second element is inserted in the
position of first element. After first step, first two elements of an array will be sorted.
Step 2: The third element of an array is compared with the element that appears before it (first and second
element). If third element is smaller than first element, it is inserted in the position of first element. If third
element is larger than first element but, smaller than second element, it is inserted in the position of second
element. If third element is larger than both the elements, it is kept in the position as it is. After second step,
first three elements of an array will be sorted.
Step 3: Similarly, the fourth element of an array is compared with the elements that appear before it (first,
second and third element) and the same procedure is applied and that element is inserted in the proper
position. After third step, first four elements of an array will be sorted. If there are n elements to be sorted.
Then, this procedure is repeated n-1 times to get sorted list of array.
void insertion_sort ( int A[ ] , int n)
{
for( int i = 0 ;i < n ; i++ )
{
int temp = A[ i ]; //storing current element
int j = i;
while( j > 0 && temp < A[ j -1])
{
// moving the left side element to one position forward.
A[ j ] = A[ j-1];
j= j - 1;
}
// moving current element to its correct position.
A[ j ] = temp;
}
}
3. Explain the algorithm to perform Selection sort
Selection Sort
The algorithm maintains two subarrays in a given array.
i) The subarray which is already sorted.
ii) Remaining subarray which is unsorted.
The selection sort algorithm sorts an array by repeatedly finding the minimum element from unsorted
part and putting it at the beginning.
In every iteration of selection sort, the minimum element from the unsorted subarray is picked and
moved to the sorted subarray
void selection_sort (int A[ ], int n)
{
// temporary variable to store the position of minimum element
int minimum;
for(int i = 0; i < n-1 ; i++)
{
//finds the minimum element
minimum = i ;
for(int j = i+1; j < n ; j++ )
{
if(A[ j ] < A[ minimum ])
{
minimum = j ;
}
}
// putting minimum element on its proper position.
swap ( A[ minimum ], A[ i ]) ; } }
4. Explain the algorithm to perform Shell sort
Shell Sort
o In shell sort, elements at a specific interval are sorted.
o The interval between the elements is gradually decreased based on the sequence used.
o Sequence
Shell's original sequence: N/2 , N/4 , …, 1
Knuth's increments
Sedgewick's increments
o The performance of the shell sort depends on the type of sequence used for a given input array
int shellSort(int arr[], int n)
{
for (int gap = n/2; gap > 0; gap /= 2)
{
for (int i = gap; i < n; i += 1)
{
int temp = arr[i];
// shift earlier gap-sorted elements up until the correct
// location for a[i] is found
int j;
for (j = i; j >= gap && arr[j - gap] > temp; j -= gap)
arr[j] = arr[j - gap];
// put temp (the original a[i]) in its correct location
arr[j] = temp;
}
}
return 0;
}
5. State and explain the algorithms for linear and binary search with example?
Searching:
Searching is an algorithm, to check whether a particular element is present in the list.
Types of searching:-
Linear search
Binary Search
Linear Search
Linear search or sequential search is a method for finding a particular value in a list, that
consists of checking every one of its elements, one at a time and in sequence, until the desired one is
found.
Linear search is the simplest search algorithm. For a list with n items, the best case is when the
value is equal to the first element of the list, in which case only one comparison is needed. The worst
case is when the value is not in the list (or occurs only once at the end of the list), in which case n
comparisons are needed.
The worst case performance scenario for a linear search is that it has to loop through the entire
collection, either because the item is the last one, or because the item is not found.
Binary Search
It is possible to take greater advantage of the ordered list .Instead of searching the list in
sequence, a binary search will start by examining the middle item. If that item is the one we are
searching for, we are done. If it is not the correct item, we can use the ordered nature of the list to
eliminate half of the remaining items. If the item we are searching for is greater than the middle item,
we know that the entire lower half of the list as well as the middle item can be eliminated from further
consideration. The item, if it is in the list, must be in the upper half.
We can then repeat the process with the upper half. Start at the middle item and compare it against
what we are looking for. Again, we either find it or split the list in half, therefore eliminating another
large part of our possible search space.
Working principle
Algorithm is quite simple. It can be done either recursively or iteratively:
1. Get the middle element;
2. If the middle element equals to the searched value, the algorithm stops;
3. Otherwise, two cases are possible:
o Searched value is less, than the middle element. In this case, go to the step 1 for
the part of the array, before middle element.
o Searched value is greater, than the middle element. In this case, go to the step 1
for the part of the array, after middle element.
Now we should define when iterations should stop. First case is when searched element is found.
Second one is when subarray has no elements. In this case, we can conclude that search value is
not present in the array.
Example
Find 6 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}.
Step 1 (middle element is 19 > 6): -1 5 6 18 19 25 46 78 102 114
Step 2 (middle element is 5 < 6): -1 5 6 18 19 25 46 78 102 114
Step 3 (middle element is 6 == 6): -1 5 6 18 19 25 46 78 102 114
6. Explain about hash function?
Types of Hash Functions
1. Division Method
2. Mid Square Method
3. Multiplicative Hash Function
4. Digit Folding
1. Division Method:
It depends on remainder of division.
Divisor is Table Size.
Formula is ( H ( key ) = key % table size )
E.g. consider the following data or record or key (36, 18, 72, 43, 6) table size = 8
2. Mid Square Method:
We first square the item, and then extract some portion of the resulting digits. For
example, if the item were 44, we would first compute 442=1,936. Extract the middle two digit 93
from the answer. Store the key 44 in the index 93.
.
44
93
99
3. Multiplicative Hash Function:
Key is multiplied by some constant value.
Hash function is given by,
H(key)=Floor (P * ( key * A ))
P = Integer constant [e.g. P=50]
A = Constant real number [A=0.61803398987]
E.g. Key 107
H(107)=Floor(50*(107*0.61803398987))
=Floor(3306.481845)
H(107)=3306
4. Digit Folding Method
The folding method for constructing hash functions begins by dividing the item into equal-
size pieces (the last piece may not be of equal size). These pieces are then added together to give
the resulting hash key value. For example, if our item was the phone number 436-555- 4601, we
would take the digits and divide them into groups of 2 (43, 65, 55, 46, 01). After the addition,
43+65+55+46+01, we get 210. If we assume our hash table has 11 slots, then we need to perform
the extra step of dividing by 11 and keeping the remainder. In this case 210 % 11 is 1, so the phone
number 436-555-4601 hashes to slot 1.
1 436-555-4601
.8
10
7. Write short notes on hashing and the various collision resolution techniques
SEPARATE CHAINING
Collision handled by Elements with same hash value are kept in a list
Each cell of the hash table points to a linked list of elements mapped withsame hash value 20
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining
The hash function is key % 10
LINEAR PROBING
Easiest method to handle collision.
Apply the hash function H (key) = key % table size
QUADRATIC PROBING
To resolve the primary clustering problem, quadratic probing can be used. With quadratic probing,
rather than always moving one spot, move i2 spots from the point of collision, where i is the number of
attempts to resolve the collision.
DOUBLE HASHING
Double hashing uses the idea of applying a second hash function to the key when a collision occurs.
The result of the second hash function will be the number of positions forms the point of collision to
insert.
There are a couple of requirements for the second function:
It must never evaluate to 0 must make sure that all cells can be probed
A popular second hash function is: Hash2 (key) = R - (key % R) where R is a prime number that is
smaller than the size of thetable.