Sorting
Keeping data in “order” allows it to be searched more efficiently
Example: Phone Book
– Sorted by Last Name (“lots” of work to do this)
• Easy to look someone up if you know their last name
• Tedious (but straightforward) to find by first name or address
Important if data will be searched many times
Two algorithms for sorting today
– Bubble Sort
– Merge Sort
Searching: next lecture
Bubble Sort (“Sink” sort here)
Pass 1:
If A(1)>A(2) switch
If A(2)>A(3) switch
If A(3)>A(4) switch
If A(4)>A(5) switch
…
If A(N-3)>A(N-2) switch
If A(N-2)>A(N-1) switch
If A(N-1)>A(N) switch
Result: A(N) is now largest entry

Pass 2:
If A(1)>A(2) switch
If A(2)>A(3) switch
If A(3)>A(4) switch
If A(4)>A(5) switch
…
If A(N-3)>A(N-2) switch
If A(N-2)>A(N-1) switch
Result: A(N-1) is now 2nd largest entry; A(N) is still largest entry

Pass 3:
If A(1)>A(2) switch
If A(2)>A(3) switch
If A(3)>A(4) switch
If A(4)>A(5) switch
…
If A(N-3)>A(N-2) switch
Result: A(N-2) is now 3rd largest entry; A(N-1) is still 2nd largest entry; A(N) is still largest entry

…

Last pass:
If A(1)>A(2) switch
Result: A(1) is now Nth largest entry. A(2) is still (N-1)th largest entry. A(3) is still (N-2)th largest entry. … A(N-3) is still 4th largest entry. A(N-2) is still 3rd largest entry. A(N-1) is still 2nd largest entry. A(N) is still largest entry.
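A small trace can make the passes concrete. The sketch below runs the same compare-and-switch passes on a five-element array and records the array after each pass; the sample values and the variable name history are illustrative assumptions, not part of the lecture.

```matlab
% Trace the passes on a small example (values chosen for illustration).
A = [5 2 4 1 3];
N = length(A);
history = zeros(N-1, N);          % row p holds the array after pass p
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        if A(i) > A(i+1)
            A([i i+1]) = A([i+1 i]);   % switch the two entries
        end
    end
    history(N - lastcompare, :) = A;
end
disp(history)
```

After pass 1 the largest entry sits in A(N), after pass 2 the second largest sits in A(N-1), and so on, which is what each row of history shows.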
Bubble Sort (“Sink” sort here)
Pass 1: N-1 steps
Pass 2: N-2 steps
Pass 3: N-3 steps
…
Last pass: 1 step

$$\#\text{ of steps} = \sum_{i=1}^{N-1} i = \frac{N(N-1)}{2} \approx \frac{N^2}{2}$$
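The step count can be checked by counting directly. This is a minimal sketch that assumes one "step" means one compare (with a possible switch); N = 10 is an arbitrary choice.

```matlab
% Count the compare steps made by the two nested loops.
N = 10;
count = 0;
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        count = count + 1;        % one compare (and possible switch)
    end
end
formula = N*(N-1)/2;              % closed form of the sum 1 + 2 + ... + (N-1)
fprintf('counted %d steps, formula gives %d\n', count, formula);
```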
Bubble Sort (“Sink” sort here)
The passes written as two nested loops:
for lastcompare=N-1:-1:1
    for i=1:lastcompare
        if A(i)>A(i+1)
            % switch A(i) and A(i+1)
        end
    end
end
Matlab code for Bubble Sort
function S = bubblesort(A)
% Assume A is a row or column vector; copy A to S
S = A;
N = length(S);
for lastcompare=N-1:-1:1       % the last index compared shrinks each pass
    for i=1:lastcompare
        if S(i)>S(i+1)         % out of order: switch the two entries
            tmp = S(i);
            S(i) = S(i+1);
            S(i+1) = tmp;
        end
    end
end
What about returning an index vector Idx with the property that S = A(Idx)?
Matlab code for Bubble Sort
function [S,Idx] = bubblesort(A)
% Assume A is a row or column vector; copy A to S
N = length(A);
S = A; Idx = 1:N;              % A(Idx) equals S
for lastcompare=N-1:-1:1
    for i=1:lastcompare
        if S(i)>S(i+1)
            tmp = S(i);     tmpi = Idx(i);
            S(i) = S(i+1);  Idx(i) = Idx(i+1);
            S(i+1) = tmp;   Idx(i+1) = tmpi;
        end
    end
end
Whenever we switch two entries of S, we exchange the same two entries of Idx. This keeps A(Idx) equal to S.
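One use of the index vector is to reorder a second array in step with the sorted one, as in the phone book example. The sketch below inlines the same loops on made-up data; the names ages and names and their values are hypothetical.

```matlab
% Sort ages and carry a companion list of names along via Idx.
ages  = [34 21 47];
names = {'Ann', 'Bob', 'Cy'};
N = length(ages);
S = ages; Idx = 1:N;               % ages(Idx) equals S throughout
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        if S(i) > S(i+1)
            S([i i+1])   = S([i+1 i]);     % switch entries of S
            Idx([i i+1]) = Idx([i+1 i]);   % exchange the same entries of Idx
        end
    end
end
sortedNames = names(Idx);          % names in order of increasing age
disp(sortedNames)
```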
Merging two already sorted arrays
Suppose A and B are two sorted arrays (possibly of different lengths).
How do you “merge” these into a sorted array C?
Chalkboard…
Pseudo-code: Merging two already sorted arrays
function C = merge(A,B)
nA = length(A); nB = length(B);
iA = 1; iB = 1;                % index of smallest unused element in A and in B
C = zeros(1,nA+nB);
for iC=1:nA+nB
    % Use A if B is used up, or if A has entries left and its
    % smallest unused entry is smaller than that of B
    if iB>nB || (iA<=nA && A(iA)<B(iB))
        C(iC) = A(iA); iA = iA+1;   % use A
    else
        C(iC) = B(iB); iB = iB+1;   % use B
    end
end
$\#\text{ of “steps”} = n_A + n_B$
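A worked merge on two short arrays shows which array supplies each entry of C. This sketch inlines the merge loop with an explicit check for a used-up array; the sample arrays and the bookkeeping string src are assumptions added for illustration.

```matlab
% Merge two sorted arrays and record where each entry came from.
A = [1 4 9];
B = [2 3 10 11];
nA = length(A); nB = length(B);
iA = 1; iB = 1;
C = zeros(1, nA + nB);
src = repmat(' ', 1, nA + nB);     % 'A' or 'B' for each entry of C
for iC = 1:nA+nB
    if iB > nB || (iA <= nA && A(iA) < B(iB))
        C(iC) = A(iA); iA = iA + 1; src(iC) = 'A';
    else
        C(iC) = B(iB); iB = iB + 1; src(iC) = 'B';
    end
end
disp(C)
disp(src)
```

Once A is used up (after its 9 is taken), every remaining entry comes from B, which is the case the bounds check guards.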
MergeSort
function S = mergeSort(A)
n = length(A);
if n<=1                            % Base case (empty or single entry)
    S = A;
else
    hn = floor(n/2);               % Split in half
    S1 = mergeSort(A(1:hn));       % Sort 1st half
    S2 = mergeSort(A(hn+1:end));   % Sort 2nd half
    S = merge(S1,S2);              % Merge 2 sorted arrays
end
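To see how the recursion splits the work, the sketch below tracks only the lengths of the subarrays at each level of recursion, using the same floor(n/2) split. It does no sorting; the starting length 5 and the variable levels are illustrative assumptions.

```matlab
% Track subarray lengths level by level under the floor(n/2) split.
sizes = 5;                         % length of the original array
levels = {sizes};
while any(sizes > 1)
    next = [];
    for n = sizes
        if n == 1
            next = [next, 1];              % base case: not split further
        else
            hn = floor(n/2);
            next = [next, hn, n - hn];     % 1st half, 2nd half
        end
    end
    levels{end+1} = next;
    sizes = next;
end
for k = 1:numel(levels)
    disp(levels{k})
end
```

For a length-5 array the lengths go 5, then 2 and 3, then 1 1 1 2, then all ones, so the splitting stops after three levels.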
Rough Operation Count for MergeSort
Let R(n) denote the number of operations necessary to
sort (using mergeSort) an array of length n.
function S = mergeSort(A)
n = length(A);
if n<=1
    S = A;                         % R(1) = 0
else
    hn = floor(n/2);
    S1 = mergeSort(A(1:hn));       % R(n/2) to sort array of length n/2
    S2 = mergeSort(A(hn+1:end));   % R(n/2) to sort array of length n/2
    S = merge(S1,S2);              % n steps to merge two sorted arrays of total length n
end
Recursive relation: R(1)=0, R(n) = 2*R(n/2) + n
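Unrolling the relation a few times suggests where the count is heading. This is a sketch that assumes n is a power of 2, written n = 2^m, so that the halving ends exactly at R(1):

```latex
\begin{aligned}
R(n) &= 2\,R(n/2) + n \\
     &= 4\,R(n/4) + 2n \\
     &= 8\,R(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^{m} R(n/2^{m}) + m\,n \\
     &= n\,R(1) + n\log_2(n) \;=\; n\log_2(n)
\end{aligned}
```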
Rough Operation Count for MergeSort
The recursive relation for R
R(1)=0, R(n) = 2*R(n/2) + n
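Claim: For $n = 2^m$, it is true that $R(n) \le n \log_2(n)$
Case (m=0): true, since R(1)=0 and log2(1)=0
Case (m=k+1 from m=k):
$$\begin{aligned}
R(2^{k+1}) &= R(2\cdot 2^{k}) \\
&= 2\,R(2^{k}) + 2\cdot 2^{k} && \text{(recursive relation)} \\
&\le 2\cdot 2^{k}\log_2(2^{k}) + 2\cdot 2^{k} && \text{(induction hypothesis)} \\
&= 2^{k+1}\,(k+1) \\
&= 2^{k+1}\log_2(2^{k+1})
\end{aligned}$$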
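The claim can also be checked numerically by iterating the recursive relation for powers of 2 and setting the result beside the bubble sort count N(N-1)/2. This is a rough sketch; the range m = 0..10 and the variable names are arbitrary choices.

```matlab
% Iterate R(n) = 2*R(n/2) + n for n = 2^m and compare with n*log2(n).
m = 0:10;
n = 2.^m;
R = zeros(size(n));                % R(1) = 0 is the first entry
for k = 2:numel(n)
    R(k) = 2*R(k-1) + n(k);        % n(k-1) equals n(k)/2
end
bound  = n .* m;                   % n*log2(n), since log2(2^m) = m
bubble = n .* (n-1) / 2;           % bubble sort step count for the same n
disp([n; R; bound; bubble]')
```

For n = 1024 the relation gives 10240 operations, against 523776 steps for bubble sort.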
Matlab command: sort
Syntax is
[S] = sort(A)
If A is a vector, then S is a vector in ascending order.
The indices which rearrange A into S are also available.
[S,Idx] = sort(A)
S is the sorted values of A, and A(Idx) equals S.
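A short check of the two-output form, using a made-up vector:

```matlab
% Built-in sort with the index vector as second output.
A = [3 1 2];
[S, Idx] = sort(A);
disp(S)                            % sorted values, ascending
disp(Idx)                          % positions in A of the sorted values
disp(isequal(A(Idx), S))           % A(Idx) reproduces S
```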
