04 Mergesort

The document discusses mergesort, a classic sorting algorithm, detailing its basic plan, implementation, and performance analysis. It covers both top-down and bottom-up approaches, as well as practical improvements and variations like natural mergesort and Timsort. The analysis includes the algorithm's complexity, memory usage, and enhancements for efficiency in specific scenarios.

Data Structures    ROBERT SEDGEWICK | KEVIN WAYNE

CSE 247 – Fall 2024

MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability

Slides adapted from Sedgewick and Wayne


COS226 Algorithms – Princeton University
https://algs4.cs.princeton.edu/lectures/
Two classic sorting algorithms: mergesort and quicksort

Critical components in the world’s computational infrastructure.


・Full scientific understanding of their properties has enabled us
to develop them into practical system sorts.
・Quicksort honored as one of the top 10 algorithms of the 20th century
in science and engineering.

Mergesort. [this lecture]

...

Quicksort. [next lecture]

...

2
MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability
Mergesort

Basic plan.
・Divide array into two halves.
・Recursively sort each half.
・Merge two halves.

4
Abstract in-place merge demo

Goal. Given two sorted subarrays a[lo] to a[mid] and a[mid+1] to a[hi],
replace with sorted subarray a[lo] to a[hi].

before (a[lo..mid] and a[mid+1..hi] each sorted):

        lo          mid  mid+1         hi
  a[]   E  E  G  M  R    A  C  E  R  T

after (a[lo..hi] sorted):

        lo                              hi
  a[]   A  C  E  E  E  G  M  R  R  T

5
Merging

Q. How to combine two sorted subarrays into a sorted whole?

A. Use an auxiliary array.

6
Merging: C++ implementation

template<typename T>
void merge(vector<T>& a, vector<T>& aux, int lo, int mid, int hi) {

    // copy a[lo..hi] to the auxiliary array
    for (int k = lo; k <= hi; k++)
        aux[k] = a[k];

    // merge back to a[lo..hi]
    int i = lo, j = mid+1;
    for (int k = lo; k <= hi; k++) {
        if      (i > mid)          a[k] = aux[j++];   // left half exhausted
        else if (j > hi)           a[k] = aux[i++];   // right half exhausted
        else if (aux[j] < aux[i])  a[k] = aux[j++];   // smaller item on the right
        else                       a[k] = aux[i++];   // smaller (or equal) item on the left
    }
}

trace of a merge in progress:

         lo     i        mid  j              hi
  aux[]  A  G   L  O  R       H  I  M  S  T

                              k
  a[]    A  G   H  I  L  M    .  .  .  .  .
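For concreteness, a minimal driver for the merge() above (the array values follow the trace on this slide; the main() wrapper is illustrative, not part of the lecture code):

#include <iostream>
#include <vector>
using std::vector;

// merge() as defined above

int main() {
    vector<char> a {'A','G','L','O','R','H','I','M','S','T'};   // two sorted halves
    vector<char> aux(a.size());
    merge(a, aux, 0, 4, 9);                    // lo = 0, mid = 4, hi = 9
    for (char c : a) std::cout << c << ' ';    // prints: A G H I L M O R S T
    std::cout << '\n';
}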
7
Mergesort: C++ implementation

template<typename T>
void merge(vector<T>& a, vector<T>& aux, int lo, int mid, int hi)
{ /* as before */ }

template<typename T>
void sort(vector<T>& a, vector<T>& aux, int lo, int hi) {
    if (hi <= lo) return;
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);         // sort left half
    sort(a, aux, mid+1, hi);       // sort right half
    merge(a, aux, lo, mid, hi);    // merge the two sorted halves
}

template<typename T>
void sort(vector<T>& a) {
    int N = std::size(a);
    vector<T> aux(N);
    sort(a, aux, 0, N - 1);
}
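A minimal usage sketch for the sort() wrapper above (the driver and sample data are illustrative, not from the slides):

#include <iostream>
#include <string>
#include <vector>
using std::vector;

// merge() and the two sort() overloads as defined above

int main() {
    vector<std::string> a {"M","E","R","G","E","S","O","R","T"};
    sort(a);                                          // top-down mergesort
    for (const auto& s : a) std::cout << s << ' ';    // prints: E E G M O R R S T
    std::cout << '\n';
}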
8
Mergesort: trace

[trace: contents of a[] shown after each recursive merge call]

9
Mergesort: animation

50 random items

[animation: current subarray highlighted; legend shows algorithm position, items in order / not in order]
http://www.sorting-algorithms.com/merge-sort

10
Mergesort: animation

50 reverse-sorted items

[animation: current subarray highlighted; legend shows algorithm position, items in order / not in order]
http://www.sorting-algorithms.com/merge-sort

11
Mergesort: empirical analysis

Running time estimates:

・Laptop executes 10⁸ compares/second.
・Supercomputer executes 10¹² compares/second.

                    insertion sort (𝑁²)                 mergesort (𝑁 log 𝑁)
  computer    thousand    million     billion      thousand   million    billion
  home        instant     2.8 hours   317 years    instant    1 second   18 min
  super       instant     1 second    1 week       instant    instant    instant

Bottom line. Good algorithms are better than supercomputers.


12
Mergesort: number of compares

Proposition. Mergesort uses ≤ 𝑁 lg 𝑁 compares to sort an array of length 𝑁.

Pf sketch. The number of compares 𝐶(𝑁) to mergesort an array of length 𝑁
satisfies the recurrence:

    𝐶(𝑁) ≤ 𝐶(⌈𝑁/2⌉) + 𝐶(⌊𝑁/2⌋) + 𝑁   for 𝑁 > 1, with 𝐶(1) = 0.
           left half    right half    merge

We solve the recurrence when 𝑁 is a power of 2 (the analysis is cleaner in
this case); the result holds for all 𝑁:

    𝐷(𝑁) = 2 𝐷(𝑁/2) + 𝑁   for 𝑁 > 1, with 𝐷(1) = 0.

13
Divide-and-conquer recurrence: proof by picture

Proposition. If 𝐷(𝑁) satisfies 𝐷(𝑁) = 2 𝐷(𝑁/2) + 𝑁 for 𝑁 > 1, with
𝐷(1) = 0, then 𝐷(𝑁) = 𝑁 lg 𝑁.

Pf 1. [assuming N is a power of 2]

  𝐷(𝑁)                                                          𝑁        = 𝑁
  𝐷(𝑁/2) 𝐷(𝑁/2)                                                 2 (𝑁/2)  = 𝑁
  𝐷(𝑁/4) 𝐷(𝑁/4) 𝐷(𝑁/4) 𝐷(𝑁/4)                                   4 (𝑁/4)  = 𝑁
  𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8) 𝐷(𝑁/8)      8 (𝑁/8)  = 𝑁
   ⋮                                                             ⋮

  lg 𝑁 levels, each contributing 𝑁  ⇒  𝐷(𝑁) = 𝑁 lg 𝑁
14
Divide-and-conquer recurrence: proof by induction

Proposition. If 𝐷(𝑁) satisfies 𝐷(𝑁) = 2 𝐷(𝑁/2) + 𝑁 for 𝑁 > 1, with
𝐷(1) = 0, then 𝐷(𝑁) = 𝑁 lg 𝑁.

Pf 2. [assuming N is a power of 2]
・Base case: 𝑁 = 1.
・Inductive hypothesis: 𝐷(𝑁) = 𝑁 lg 𝑁.
・Goal: show that 𝐷(2𝑁) = (2𝑁) lg(2𝑁).

    𝐷(2𝑁) = 2 𝐷(𝑁) + 2𝑁               given
          = 2 𝑁 lg 𝑁 + 2𝑁             inductive hypothesis
          = 2 𝑁 (lg(2𝑁) – 1) + 2𝑁     algebra
          = 2 𝑁 lg(2𝑁)                QED
15
Mergesort: number of array accesses

Proposition. Mergesort uses ≤ 6 𝑁 lg 𝑁 array accesses to sort an array of
length 𝑁.

Pf sketch. The number of array accesses 𝐴(𝑁) satisfies the recurrence:

    𝐴(𝑁) ≤ 𝐴(⌈𝑁/2⌉) + 𝐴(⌊𝑁/2⌋) + 6𝑁   for 𝑁 > 1, with 𝐴(1) = 0.

Key point. Any algorithm with the following structure takes 𝑁 log 𝑁 time:

void linearithmic(int N)
{
    if (N == 0) return;
    linearithmic(N/2);   // solve two problems
    linearithmic(N/2);   // of half the size
    linear(N);           // do a linear amount of work
}

Notable examples. FFT, hidden-line removal, Kendall-tau distance, …


16
Mergesort analysis: memory

Proposition. Mergesort uses extra space proportional to 𝑁.

Pf. The array aux[] needs to be of length 𝑁 for the last merge.

two sorted subarrays
  A C D G H I M N U V     B E F J O P Q R S T

merged result
  A B C D E F G H I J M N O P Q R S T U V

Def. A sorting algorithm is in-place if it uses ≤ 𝑐 log 𝑁 extra memory.

Ex. Insertion sort, selection sort, shellsort.

Challenge 1 (not hard). Use aux[] array of length ~ ½ 𝑁 instead of 𝑁.

Challenge 2 (very hard). In-place merge. [Kronrod 1969]
17
Mergesort: practical improvements

Use insertion sort for small subarrays.
・Mergesort has too much overhead for tiny subarrays.
・Cutoff to insertion sort for ≈ 10 items.

template<typename T>
void sort(vector<T>& a, vector<T>& aux, int lo, int hi) {
    if (hi <= lo + CUTOFF - 1) {
        insertion_sort(a, lo, hi);   // sort small subarray directly
        return;
    }
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);
    sort(a, aux, mid+1, hi);
    merge(a, aux, lo, mid, hi);
}
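The code above assumes a CUTOFF constant and an insertion_sort(a, lo, hi) overload that sorts the subrange a[lo..hi]; neither is shown on this slide, so the following is one possible sketch:

// requires <vector> and <utility> (std::swap)
const int CUTOFF = 10;   // empirically, around 10 items

template<typename T>
void insertion_sort(vector<T>& a, int lo, int hi) {
    // standard insertion sort, restricted to a[lo..hi]
    for (int i = lo + 1; i <= hi; i++)
        for (int j = i; j > lo && (a[j] < a[j-1]); j--)
            std::swap(a[j], a[j-1]);
}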

18
Mergesort with cutoff to insertion sort: visualization

19
Mergesort: practical improvements

Stop if already sorted.
・Is largest item in first half ≤ smallest item in second half?
・Helps for partially-ordered arrays.

two sorted halves:     A B C D E F G H I J   M N O P Q R S T U V
merge can be skipped:  a[mid] = J ≤ M = a[mid+1]

template<typename T>
void sort(vector<T>& a, vector<T>& aux, int lo, int hi) {
    if (hi <= lo) return;
    int mid = lo + (hi - lo) / 2;
    sort(a, aux, lo, mid);
    sort(a, aux, mid+1, hi);
    if (!(a[mid+1] < a[mid])) return;   // halves already in order: skip the merge
    merge(a, aux, lo, mid, hi);
}

20
Mergesort: practical improvements

Eliminate the copy to the auxiliary array. Save time (but not space)
by switching the role of the input and auxiliary array in each recursive call.
template<typename T>
void merge(vector<T>& a, vector<T>& aux, int lo, int mid, int hi) {
    // merge from a[] to aux[]
    int i = lo, j = mid+1;
    for (int k = lo; k <= hi; k++) {
        if      (i > mid)      aux[k] = a[j++];
        else if (j > hi)       aux[k] = a[i++];
        else if (a[j] < a[i])  aux[k] = a[j++];
        else                   aux[k] = a[i++];
    }
}

template<typename T>
void sort(vector<T>& a, vector<T>& aux, int lo, int hi) {
    // assumes aux[] is initialized to a copy of a[] once, before the recursive calls
    if (hi <= lo) return;
    int mid = lo + (hi - lo) / 2;
    sort(aux, a, lo, mid);         // switch roles of aux[] and a[]
    sort(aux, a, mid+1, hi);
    merge(a, aux, lo, mid, hi);
}
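One way to satisfy that assumption is a top-level wrapper (a sketch; not shown on the slide) that makes aux a copy of a and starts the recursion with the roles pre-switched, so the sorted result lands back in a[]:

template<typename T>
void sort(vector<T>& a) {
    vector<T> aux(a);                          // aux[] starts as a copy of a[]
    sort(aux, a, 0, (int) std::size(a) - 1);   // result ends up in a[] after the final merge
}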
21
MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability
Bottom-up mergesort

Basic plan.
・Pass through array, merging subarrays of size 1.
・Repeat for subarrays of size 2, 4, 8, ....

23
Bottom-up mergesort: C++ implementation

template<typename T>
void mergesort_bu(vector<T>& a) {
    int N = std::size(a);
    vector<T> aux(N);
    for (int sz = 1; sz < N; sz = sz+sz)               // subarray size: 1, 2, 4, 8, ...
        for (int lo = 0; lo < N-sz; lo += sz+sz)       // leftmost index of each pair of subarrays
            merge(a, aux, lo, lo+sz-1, std::min(lo+sz+sz-1, N-1));
}

Bottom line. Simple and non-recursive version of mergesort
(but about 10% slower than recursive, top-down mergesort on typical systems).
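For example (an illustrative trace, not shown on the slide), with N = 10 the loops issue these merge(a, aux, lo, mid, hi) calls; note how std::min clips the final ragged subarray and how lo < N-sz skips a leftover run that has no right half:

sz = 1:  (0,0,1) (2,2,3) (4,4,5) (6,6,7) (8,8,9)
sz = 2:  (0,1,3) (4,5,7)        // lo = 8 skipped: 8 < 10-2 is false (no right half)
sz = 4:  (0,3,7)                // a[8..9] untouched this pass
sz = 8:  (0,7,9)                // final merge: hi = min(15, 9) = 9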


24
Mergesort: visualizations

[visual traces: top-down mergesort (cutoff = 12) vs. bottom-up mergesort (cutoff = 12)]


25
Natural mergesort

Idea. Exploit pre-existing order by identifying naturally-occurring runs.

input             1 5 10 16 3 4 23 9 13 2 7 8 12 14

first run        [1 5 10 16] 3 4 23 9 13 2 7 8 12 14

second run        1 5 10 16 [3 4 23] 9 13 2 7 8 12 14

merge two runs   [1 3 4 5 10 16 23] 9 13 2 7 8 12 14

Tradeoff. Fewer passes vs. extra compares per pass to identify runs.
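A minimal sketch of one natural-mergesort pass structure, assuming the merge() from earlier in the lecture (this code is illustrative, not from the slides):

// Find the end of the non-decreasing run starting at position lo.
template<typename T>
int find_run_end(const vector<T>& a, int lo, int N) {
    int hi = lo;
    while (hi + 1 < N && !(a[hi + 1] < a[hi])) hi++;   // extend while non-decreasing
    return hi;
}

template<typename T>
void natural_mergesort(vector<T>& a) {
    int N = (int) std::size(a);
    if (N <= 1) return;
    vector<T> aux(N);
    while (true) {
        int lo = 0;
        bool merged_any = false;
        while (lo < N) {
            int mid = find_run_end(a, lo, N);       // first run: a[lo..mid]
            if (mid == N - 1) break;                // run reaches the end; nothing left to merge
            int hi = find_run_end(a, mid + 1, N);   // second run: a[mid+1..hi]
            merge(a, aux, lo, mid, hi);             // merge the two adjacent runs
            merged_any = true;
            lo = hi + 1;                            // continue after the merged block
        }
        if (!merged_any) return;                    // a full pass found a single run: sorted
    }
}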
26
Timsort

・Natural mergesort.
・Use binary insertion sort to make initial runs (if needed).
・A few more clever optimizations.
Tim Peters
Intro
-----
This describes an adaptive, stable, natural mergesort, modestly called
timsort (hey, I earned it <wink>). It has supernatural performance on many
kinds of partially ordered arrays (less than lg(N!) comparisons needed, and
as few as N-1), yet as fast as Python's previous highly tuned samplesort
hybrid on random arrays.

In a nutshell, the main routine marches over the array once, left to right,
alternately identifying the next run, then merging it into the previous
runs "intelligently". Everything else is complication for speed, and some
hard-won measure of memory efficiency.
...

Consequence. Linear time on many arrays with pre-existing order.


Now widely used. libc++, Python, Java 7, GNU Octave, Android, ….
27
The Zen of Python

http://www.python.org/dev/peps/pep-0020/

28
MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability
Complexity of sorting

Computational complexity. Framework to study efficiency of algorithms


for solving a particular problem X.

Model of computation. Allowable operations.
Cost model. Operation count(s).
Upper bound. Cost guarantee provided by some algorithm for X.
Lower bound. Proven limit on cost guarantee of all algorithms for X.
Optimal algorithm. Algorithm with best possible cost guarantee for X.
                   [optimal when lower bound ~ upper bound]

Example: sorting.
・Model of computation: decision tree.
  [can access information only through compares, e.g., the Java Comparable framework]
・Cost model: # compares.
・Upper bound: ~ 𝑁 lg 𝑁 from mergesort.
・Lower bound: ?
・Optimal algorithm: ?
30
Decision tree (for 3 distinct keys a, b, and c)

                             a < b ?
                   yes /               \ no
               b < c ?                   a < c ?
           yes /      \ no           yes /      \ no
           abc         a < c ?       bac          b < c ?
                   yes /    \ no             yes /    \ no
                    acb      cab              bca      cba

・height of tree = worst-case number of compares
・code between compares (e.g., sequence of exchanges)
・each leaf corresponds to one (and only one) ordering;
  (at least) one leaf for each possible ordering
31
Compare-based lower bound for sorting

Proposition. Any compare-based sorting algorithm must use at least
lg(N!) ~ N lg N compares in the worst case.

Pf.
・Assume array consists of N distinct values a1 through aN.
・Worst case dictated by height h of decision tree.
・Binary tree of height h has at most 2^h leaves.
・N! different orderings ⇒ at least N! leaves.

32
Compare-based lower bound for sorting

Proposition. Any compare-based sorting algorithm must use at least
lg(𝑁!) ~ 𝑁 lg 𝑁 compares in the worst case.

Pf.
・Assume array consists of 𝑁 distinct values 𝑎1 through 𝑎𝑁.
・Worst case dictated by height ℎ of decision tree.
・Binary tree of height ℎ has at most 2^ℎ leaves.
・𝑁! different orderings ⇒ at least 𝑁! leaves.

    2^ℎ ≥ # leaves ≥ 𝑁!
    ⇒ ℎ ≥ lg(𝑁!) ~ 𝑁 lg 𝑁      [by Stirling's formula]
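For reference, the Stirling step spelled out (standard asymptotics, written here in LaTeX):

\lg(N!) \;=\; \sum_{k=1}^{N} \lg k
        \;=\; N \lg N \;-\; N \lg e \;+\; O(\lg N)
        \;\sim\; N \lg N,
\qquad\text{since } \ln(N!) = N\ln N - N + O(\ln N).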

33
Complexity of sorting

Model of computation. Allowable operations.


Cost model. Operation count(s).
Upper bound. Cost guarantee provided by some algorithm for X.
Lower bound. Proven limit on cost guarantee of all algorithms for X.
Optimal algorithm. Algorithm with best possible cost guarantee for X.

Example: sorting.
・Model of computation: decision tree.
・Cost model: # compares.
・Upper bound: ~ 𝑁 lg 𝑁 from mergesort.
・Lower bound: ~ 𝑁 lg 𝑁.
・Optimal algorithm = mergesort.

First goal of algorithm design: optimal algorithms.

34
Complexity results in context

Compares? Mergesort is optimal with respect to the number of compares.


Space? Mergesort is not optimal with respect to space usage.

Lessons. Use theory as a guide.


Ex. Design sorting algorithm that guarantees ½ 𝑁 lg 𝑁 compares?
Ex. Design sorting algorithm that is both time- and space-optimal?
35
Complexity results in context (continued)

Lower bound may not hold if the algorithm can take advantage of:

・The initial order of the input.


Ex: insertion sort requires only a linear number of compares on partially-
sorted arrays.

・The distribution of key values.


Ex: 3-way quicksort requires only a linear number of compares on
arrays with a constant number of distinct keys. [stay tuned]

・The representation of the keys.


Ex: radix sort requires no key compares — it accesses the data via
character/digit compares.

36
MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability
Sort countries by gold medals

38
Sort countries by total medals

39
Sort music library by artist

40
Sort music library by song name

41
Operator overloading: review

operator<(): sort using a type's natural order.

class Date {
private:
    const int month, day, year;
public:
    Date(int m, int d, int y) : month {m}, day {d}, year {y} { }

    // natural order: chronological
    bool operator<(const Date& that) const {
        if (this->year  < that.year ) return true;
        if (this->year  > that.year ) return false;
        if (this->month < that.month) return true;
        if (this->month > that.month) return false;
        if (this->day   < that.day  ) return true;
        return false;
    }
};
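A quick check of the natural order defined above (illustrative values; standard algorithms that only compare, such as std::min_element, can use this operator directly):

#include <algorithm>
#include <vector>

// class Date as defined above

int main() {
    Date moon_landing(7, 20, 1969), y2k(1, 1, 2000);
    bool earlier = moon_landing < y2k;               // true: 1969 precedes 2000

    std::vector<Date> a { y2k, moon_landing, Date(6, 15, 1215) };
    const Date& first = *std::min_element(a.begin(), a.end());   // Date(6, 15, 1215)
}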

44
Custom comparators

Custom comparators in C++: sort using an alternate order.


Required property. Must be a total order.
  string order             example
  natural order            Now   is    the   time
  case insensitive         is    Now   the   time
  reverse natural order    time  the   is    Now

auto iless = [](const std::string& a, const std::string& b) {
    auto ichar_less = [](unsigned char a, unsigned char b) {
        return std::tolower(a) < std::tolower(b);
    };
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end(), ichar_less);
};

case insensitive string comparator


45
Custom comparators: std::sort

To use with C++ std::sort:
・Create custom comparator.
・Pass as third argument to std::sort().

vector<string> a;
...
std::sort(a.begin(), a.end());                    // uses natural order
...
std::sort(a.begin(), a.end(), std::greater{});    // uses alternate order (descending)
...
std::sort(a.begin(), a.end(), iless);             // uses alternate order (case insensitive)
...

Bottom line. Decouples the definition of the data type from the
definition of what it means to compare two objects of that type.
46
Custom comparators: using with our sorting libraries

To support comparators in our sort implementations:
・Pass a comparator and use it instead of the < operator.

insertion sort using a custom comparator

template<typename T, typename C>
void insertion_sort(vector<T>& a, C comp) {
    int N = std::size(a);
    for (int i = 0; i < N; i++)
        for (int j = i; j > 0 && comp(a[j], a[j-1]); j--)
            std::swap(a[j], a[j-1]);
}
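For example (an illustrative call, reusing the iless comparator from the previous slide):

vector<string> words {"Time", "now", "The", "is"};
insertion_sort(words, iless);    // words: "is", "now", "The", "Time"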

47
Custom comparators: implementation

auto by_name = [](const student& x, const student& y) {
    return x.name < y.name;
};
auto by_section = [](const student& x, const student& y) {
    return x.section < y.section;
};

std::sort(a.begin(), a.end(), by_name);          std::sort(a.begin(), a.end(), by_section);

sorted by name:                              sorted by section:
Andrews  3  A  664-480-0023  097 Little      Furia    1  A  766-093-9873  101 Brown
Battle   4  C  874-088-1212  121 Whitman     Rohde    2  A  232-343-5555  343 Forbes
Chen     3  A  991-878-4944  308 Blair       Andrews  3  A  664-480-0023  097 Little
Fox      3  A  884-232-5341  11 Dickinson    Chen     3  A  991-878-4944  308 Blair
Furia    1  A  766-093-9873  101 Brown       Fox      3  A  884-232-5341  11 Dickinson
Gazsi    4  B  766-093-9873  101 Brown       Kanaga   3  B  898-122-9643  22 Brown
Kanaga   3  B  898-122-9643  22 Brown        Battle   4  C  874-088-1212  121 Whitman
Rohde    2  A  232-343-5555  343 Forbes      Gazsi    4  B  766-093-9873  101 Brown
48
MERGESORT
‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators
‣ stability
Stability

A typical application. First, sort by name; then sort by section.

selection_sort(a, by_name); selection_sort(a, by_section);

sorted by name:                              then sorted by section (selection sort):
Andrews  3  A  664-480-0023  097 Little      Furia    1  A  766-093-9873  101 Brown
Battle   4  C  874-088-1212  121 Whitman     Rohde    2  A  232-343-5555  343 Forbes
Chen     3  A  991-878-4944  308 Blair       Chen     3  A  991-878-4944  308 Blair
Fox      3  A  884-232-5341  11 Dickinson    Fox      3  A  884-232-5341  11 Dickinson
Furia    1  A  766-093-9873  101 Brown       Andrews  3  A  664-480-0023  097 Little
Gazsi    4  B  766-093-9873  101 Brown       Kanaga   3  B  898-122-9643  22 Brown
Kanaga   3  B  898-122-9643  22 Brown        Gazsi    4  B  766-093-9873  101 Brown
Rohde    2  A  232-343-5555  343 Forbes      Battle   4  C  874-088-1212  121 Whitman

@#%&@! Students in section 3 no longer sorted by name.

A stable sort preserves the relative order of items with equal keys.
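For example (an illustrative sketch using std::stable_sort, which the standard library guarantees to be stable, with the by_name/by_section comparators and the student vector a from the previous part):

std::stable_sort(a.begin(), a.end(), by_name);      // pass 1: order by name
std::stable_sort(a.begin(), a.end(), by_section);   // pass 2: ties (same section) keep name order
// within each section, students now remain sorted by name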
50
Stability

Q. Which sorts are stable?


A. Need to check algorithm (and implementation).

51
Stability: insertion sort

Proposition. Insertion sort is stable.

template<typename T>
void insertion_sort(vector<T>& a) {
    int N = std::size(a);
    for (int i = 0; i < N; i++)
        for (int j = i; j > 0 && (a[j] < a[j-1]); j--)
            std::swap(a[j], a[j-1]);
}

trace (equal keys keep their relative order):

  i  j    0   1   2   3   4
  0  0    B1  A1  A2  A3  B2
  1  0    A1  B1  A2  A3  B2
  2  1    A1  A2  B1  A3  B2
  3  2    A1  A2  A3  B1  B2
  4  4    A1  A2  A3  B1  B2
          A1  A2  A3  B1  B2

Pf. Equal items never move past each other.


52
Stability: selection sort

Proposition. Selection sort is not stable.

template<typename T>
void selection_sort(vector<T>& a) {
    int N = std::size(a);
    for (int i = 0; i < N; i++) {
        int min = i;
        for (int j = i+1; j < N; j++)
            if (a[j] < a[min])
                min = j;
        std::swap(a[i], a[min]);
    }
}

trace:

  i  min    0   1   2
  0  2      B1  B2  A
  1  1      A   B2  B1
  2  2      A   B2  B1
            A   B2  B1

Pf by counterexample. Long-distance exchange can move one equal item
past another one.
53
Stability: shellsort

Proposition. Shellsort is not stable.

template<typename T>
void shellsort(vector<T>& a) {
    int N = std::size(a);
    int h = 1;
    while (h < N/3) h = 3*h + 1;
    while (h >= 1) {
        for (int i = h; i < N; i++) {
            for (int j = i; j >= h && (a[j] < a[j-h]); j -= h)
                std::swap(a[j], a[j-h]);
        }
        h = h/3;
    }
}

trace:

  h    0   1   2   3   4
       B1  B2  B3  B4  A1
  4    A1  B2  B3  B4  B1
  1    A1  B2  B3  B4  B1
       A1  B2  B3  B4  B1

Pf by counterexample. Long-distance exchanges.
54
Stability: mergesort

Proposition. Mergesort is stable.

template<typename T>
void mergesort(vector<T>& a, vector<T>& aux, int lo, int hi) {
    if (hi <= lo) return;
    int mid = lo + (hi - lo) / 2;
    mergesort(a, aux, lo, mid);
    mergesort(a, aux, mid+1, hi);
    merge(a, aux, lo, mid, hi);
}

Pf. Suffices to verify that merge operation is stable.


55
Stability: mergesort

Proposition. Merge operation is stable.

template<typename T>
void merge(vector<T>& a, vector<T>& aux, int lo, int mid, int hi) {
    for (int k = lo; k <= hi; k++)
        aux[k] = a[k];
    int i = lo, j = mid+1;
    for (int k = lo; k <= hi; k++) {
        if      (i > mid)          a[k] = aux[j++];
        else if (j > hi)           a[k] = aux[i++];
        else if (aux[j] < aux[i])  a[k] = aux[j++];   // strictly smaller: take from the right
        else                       a[k] = aux[i++];   // ties: take from the left
    }
}

        0   1   2   3   4   5   6   7   8   9   10
        A1  A2  A3  B   D   A4  A5  C   E   F   G

Pf. Takes from left subarray if equal keys.
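As a concrete check (an illustrative sketch; the Item type and values are assumptions), equal keys drawn from the two halves keep their left-before-right order:

#include <vector>
using std::vector;

// merge() as defined above

struct Item { char key; int id; bool operator<(const Item& o) const { return key < o.key; } };

int main() {
    // left half: A1 B2    right half: A3 B4    (the digits record original positions)
    vector<Item> a { {'A',1}, {'B',2}, {'A',3}, {'B',4} };
    vector<Item> aux(a.size());
    merge(a, aux, 0, 1, 3);
    // result: A1 A3 B2 B4 -- each tie keeps the left item first
}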


56
Sorting summary

  algorithm   inplace?   stable?   best        average    worst       remarks

  selection   yes        no        ½ 𝑁²        ½ 𝑁²       ½ 𝑁²        𝑁 exchanges
  insertion   yes        yes       𝑁           ¼ 𝑁²       ½ 𝑁²        use for small 𝑁 or partially ordered
  shell       yes        no        𝑁 log₃ 𝑁    ?          𝑐 𝑁^(3/2)   tight code; subquadratic
  merge       no         yes       ½ 𝑁 lg 𝑁    𝑁 lg 𝑁     𝑁 lg 𝑁      𝑁 log 𝑁 guarantee; stable
  timsort     no         yes       𝑁           𝑁 lg 𝑁     𝑁 lg 𝑁      improves mergesort when preexisting order
  ?           yes        yes       𝑁           𝑁 lg 𝑁     𝑁 lg 𝑁      holy sorting grail

57
