Binary Merge-Sort
Merge-Sort(A,i,j) 01 if (i < j) then 02 m = (i+j)/2;
03 Merge-Sort(A,i,m);
04 Merge-Sort(A,m+1,j);
05 Merge(A,i,m,j) Merge-Sort(A,i,j)
01 if (i < j) then 02 m = (i+j)/2;
03 Merge-Sort(A,i,m);
04 Merge-Sort(A,m+1,j);
05 Merge(A,i,m,j)
Divide
Conquer
Combine
1 2 8 10 7 9 13 19 1 2 7
Merge is linear in the
#items to be merged
Few key observations
Items = (short) strings = atomic...
On english wikipedia, about 10
9tokens to sort
(n log n) memory accesses (I/Os ??)
[5ms] * n log
2n ≈ 3 years
In practice it is a “faster”, why?
Recursion
10 2
10 2
5 1
5 1
13 19 13 19
9 7
9 7
15 4
15 4
8 3
8 3
12 17 12 17
6 11
6 11 10 2 5 1 13 19 9 7 15 4 8 3 12 17 6 11 10 2 5 1 13 19 9 7 15 4 8 3 12 17 6 11
10 2 5 1 13 19 9 7 15 4 8 3 12 17 6 11
log
2N
Implicit Caching…
10 2
2 10
5 1
1 5
13 19 13 19
9 7
7 9
15 4
4 15
8 3
3 8
12 17 12 17
6 11
6 11 1 2 5 10 7 9 13 19 3 4 8 15 6 11 12 17
1 2 5 7 9 10 13 19 3 4 6 8 11 12 15 17 1 2 3 4 5 6 7 8 9 10 11 12 13 15 17 19
log
2N
N/M runs, each sorted in internal memory (no I/Os) M
2 passes (one Read/one Write) = 2 * (N/B) I/Os
— I/O-cost for binary merge-sort is ≈ 2 (N/B) log
2(N/M) Lo g
2( N /M )
2 passes (R/W)
2 passes (R/W)
B
A key inefficiency
1 2 4 7 9 10 13 19 3 5 6 8 11 12 15 17
B
After few steps, every run is longer than B !!!
B
We are using only 3 pages
But memory contains M/B pages ≈ 2
30/2
15= 2
15B
Output
Buffer
Disk
1, 2, 3
1, 2, 3
Output Run
4, ...
Multi-way Merge-Sort
Sort N items with main-memory M and disk-pages B:
Pass 1: Produce (N/M) sorted runs.
Pass i: merge X = M/B-1 runs log
XN/M passes
Main memory buffers of B items
Pg for run1
Pg for run X
Out Pg
Disk Disk
Pg for run 2
. . . . .
.
. . .
How it works
1 2 5 10 7 9 13 19 1 2 5 7….
1 2 3 4 5 6 7 8 9 10 11 12 13 15 17 19
M
N/M runs, each sorted in internal memory = 2 (N/B) I/Os
2 passes (one Read/one Write) = 2 * (N/B) I/Os
— I/O-cost for X-way merge is ≈ 2 (N/B) I/Os per level Lo g
X( N /M )
M
X
X
Cost of Multi-way Merge-Sort
Number of passes = log
XN/M log
M/B(N/M)
Total I/O-cost is ( (N/B) log
M/BN/M ) I/Os
Large fan-out (M/B) decreases #passes In practice
M/B ≈ 10
5 #passes = 1 few mins
Tuning depends on disk features