Stupid Columnsort Tricks





• Geeta Chaudhry
• Tom Cormen
• Dartmouth College
• Department of Computer Science

What Do We Know About Columnsort?
• Sorts N values on an r × s mesh
• Uses 8 steps
• Each step either sorts each column or performs a
fixed permutation
• Divisibility restriction: s divides r
• Height restriction: r ≥ 2s^2
• Relaxed height restriction: r ≥ 4s^(3/2)
  • Exponent of s goes from 2 to 3/2
  • Mesh need not be quite so tall and skinny
• Can simultaneously remove the divisibility
restriction and relax the height restriction to
r ≥ 6s^(3/2)

Why Relax the Conditions?
• Columnsort applies in more circumstances
• Our motivation: out-of-core sorting
• Column height r is limited by amount of memory
  • Either per processor or in entire system
• N = rs, r ≥ 2s^2 ⇒ N ≤ r^(3/2) / 2^(1/2)
• N = rs, r ≥ 4s^(3/2) ⇒ N ≤ r^(5/3) / 4^(2/3)
• Reducing the exponent of s in the bound for r
allows us to sort more values with a given amount
of memory
• A similar technique works for applying columnsort
to in-core sorting
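The effect of the two height restrictions on problem size can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, it ignores the divisibility restriction, and it simply searches for the largest s allowed for a given column height r under each bound.

```python
def largest_s(r, allowed):
    """Largest s >= 0 such that allowed(r, s) holds (linear search)."""
    s = 0
    while allowed(r, s + 1):
        s += 1
    return s


def max_problem_sizes(r):
    """Return (N under r >= 2s^2, N under r >= 4s^(3/2)) for column height r.

    The second bound is tested in squared form, r^2 >= 16 s^3, so that
    only integer arithmetic is needed.
    """
    s_classic = largest_s(r, lambda r, s: 2 * s * s <= r)
    s_relaxed = largest_s(r, lambda r, s: 16 * s ** 3 <= r * r)
    return r * s_classic, r * s_relaxed
```

For example, with r = 2^20 the first bound allows s = 724 columns, while the relaxed bound allows s = 4096, so N grows from about 7.6 × 10^8 to 2^32 values.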

This Talk
• Slabpose columnsort
  • r ≥ 4s^(3/2)
  • Requires divisibility restriction
• Also in the paper
  • Subblock columnsort
    • r ≥ 4s^(3/2) with divisibility restriction
    • r ≥ 6s^(3/2) without divisibility restriction
  • Proof that the divisibility restriction is
    unnecessary in the basic columnsort algorithm

Columnsort Steps
• Sort each column
• Transpose entire mesh
• Sort each column
• Untranspose entire mesh
• Sort each column
• Shift down by half a column
• Sort each column
• Shift up by half a column
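The eight steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the authors' out-of-core implementation. It assumes the mesh is stored as a flat list in column-major order, that "transpose" means reading the mesh in column-major order and writing it back in row-major order, and that r is even. The function names are hypothetical.

```python
def sort_columns(flat, r):
    """Sort each length-r column of a column-major flat list."""
    return [x for j in range(0, len(flat), r) for x in sorted(flat[j:j + r])]


def transpose(flat, r, s):
    """Read in column-major order, write in row-major order (mesh stays r x s)."""
    out = [None] * (r * s)
    for k, x in enumerate(flat):      # k-th element in column-major reading order
        row, col = divmod(k, s)       # where it lands when written row by row
        out[col * r + row] = x
    return out


def untranspose(flat, r, s):
    """Inverse of transpose: read row-major, write column-major."""
    out = [None] * (r * s)
    for k in range(r * s):
        row, col = divmod(k, s)
        out[k] = flat[col * r + row]
    return out


def columnsort(values, r, s):
    """Sort r*s values; assumes r even, s divides r, and r >= 2*s*s."""
    assert len(values) == r * s and r % 2 == 0 and r % s == 0 and r >= 2 * s * s
    half = r // 2
    a = sort_columns(list(values), r)       # step 1
    a = transpose(a, r, s)                  # step 2
    a = sort_columns(a, r)                  # step 3
    a = untranspose(a, r, s)                # step 4
    a = sort_columns(a, r)                  # step 5
    # step 6: shift down by half a column, padding to s + 1 columns
    a = [float("-inf")] * half + a + [float("inf")] * half
    a = sort_columns(a, r)                  # step 7
    return a[half:-half]                    # step 8: shift up, drop the padding
```

The result is the sorted sequence in column-major order. The padding values stay at the top of the first column and the bottom of the last column, so dropping them in step 8 leaves exactly the original values.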

Slabpose Columnsort Steps
• Sort each column
• Slabpose: transpose within vertical slabs
• Sort each column
• Shuffle columns
• Slabpose
• Sort each column
• Untranspose entire mesh
• Sort each column
• Shift down by half a column
• Sort each column
• Shift up by half a column

Oblivious!

Slabpose Columnsort Steps
• Sort each column
• Slabpose: transpose within vertical slabs
• Sort each column
• Shuffle columns and slabpose (as a single permutation step)
• Sort each column
• Untranspose entire mesh
• Sort each column
• Shift down by half a column
• Sort each column
• Shift up by half a column

Oblivious!
Why Work With Vertical Slabs?
• In regular columnsort, the matrix needs to be
tall and skinny
• Working with vertical slabs allows us to change
the aspect ratio to use tall and skinny slabs
• We'll use slabs that are √s columns wide
• The mesh will have √s slabs
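To make the slab idea concrete, here is a small sketch of a slabpose permutation. It rests on assumptions the slides do not spell out: the mesh is a flat column-major list, s is a perfect square, and "transpose within vertical slabs" means applying columnsort's transpose step (read column-major, write row-major) independently to each slab of √s adjacent columns. The names are hypothetical.

```python
import math


def transpose(flat, r, s):
    """Read an r x s column-major mesh in column-major order, write it row-major."""
    out = [None] * (r * s)
    for k, x in enumerate(flat):
        row, col = divmod(k, s)
        out[col * r + row] = x
    return out


def slabpose(flat, r, s, width=None):
    """Transpose each vertical slab of `width` columns independently.

    By default the slab width is sqrt(s), which gives sqrt(s) slabs.
    """
    if width is None:
        width = math.isqrt(s)
        assert width * width == s, "assumes s is a perfect square"
    assert s % width == 0
    out = []
    for start in range(0, r * s, r * width):   # a slab is contiguous in column-major order
        out.extend(transpose(flat[start:start + r * width], r, width))
    return out
```

With a slab width of s the permutation reduces to the ordinary transpose of the whole mesh, which is one way to see slabpose as the same operation applied to narrower, relatively taller pieces.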

0-1 Principle
• If an oblivious algorithm sorts all input sets
consisting solely of 0s and 1s, then it sorts all
input sets with arbitrary values
• Use the 0-1 Principle by looking at portions of
the r × s mesh
  • Clean: all 0s or all 1s
  • Dirty: may be mixed 0s and 1s
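The 0-1 Principle can be seen at work on any small oblivious algorithm. The sketch below uses an odd-even transposition comparator network as a stand-in (it is not columnsort, and the function names are hypothetical): it checks every 0-1 input exhaustively and then confirms that arbitrary values are sorted as well.

```python
from itertools import permutations, product


def transposition_network(n):
    """Comparators of an n-round odd-even transposition network on n wires."""
    return [(i, i + 1) for t in range(n) for i in range(t % 2, n - 1, 2)]


def run_network(network, values):
    """Apply the comparators in order; the sequence of comparisons is fixed (oblivious)."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v


def sorts_all_01_inputs(network, n):
    """Check the network on all 2^n inputs consisting solely of 0s and 1s."""
    return all(run_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


def sorts_all_permutations(network, n):
    """Cross-check on every permutation of n distinct values."""
    return all(run_network(network, p) == sorted(p)
               for p in permutations(range(n)))
```

Checking 2^n binary inputs is much cheaper than checking n! permutations, and for columnsort the same idea lets the analysis track only clean and dirty regions of the mesh.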

Step 1: Sort Each Column
[Figure: r × s mesh with regions labeled 0, dirty, and 1]

Step 2: Slabpose
[Figure labels: √s-slab, column, s, √s slabs]

Step 3: Sort Each Column
[Figure label: √s rows]

Step 4: Shuffle
[Figure labels: √s-slabs, √s rows, √s slabs]

Step 5: Slabpose
[Figure labels: √s-slabs, r/√s rows, 2 rows, √s slabs, √s sets of dirty rows]

Step 6: Sort Each Column
• 2√s rows = 2s^(3/2) elements

Step 7: Untranspose Entire Mesh
• 2s^(3/2) elements
• r ≥ 4s^(3/2) ⇒ 2s^(3/2) ≤ r/2 ⇒ dirty area ≤ half a column
• Once the size of the dirty area is at most half a
column, the last four steps will finish up

Step 8: Sort Each Column
• Dirty area resides in one column ⇒ done
• Dirty area resides in two columns ⇒ no change

Step 9: Shift Down by Half a Column
• Dirty area resides in one column

Step 10: Sort Each Column
• Dirty area resides in one column

Step 11: Shift Up by Half a Column
• Sorted

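The claim that the last four steps finish the job once the dirty area is at most half a column can be checked directly on 0-1 inputs. The sketch below assumes a flat column-major list and an even column height r. The function names are hypothetical.

```python
def sort_columns(flat, r):
    """Sort each length-r column of a column-major flat list."""
    return [x for j in range(0, len(flat), r) for x in sorted(flat[j:j + r])]


def last_four_steps(flat, r):
    """Steps 8-11: sort, shift down by half a column, sort, shift up."""
    assert r % 2 == 0 and len(flat) % r == 0
    half = r // 2
    a = sort_columns(flat, r)                                # step 8
    a = [float("-inf")] * half + a + [float("inf")] * half   # step 9
    a = sort_columns(a, r)                                   # step 10
    return a[half:-half]                                     # step 11


def mesh_with_dirty_window(r, s, start, window):
    """Column-major 0-1 mesh: 0s, then an arbitrary `window` of bits, then 1s."""
    assert start + len(window) <= r * s
    return [0] * start + list(window) + [1] * (r * s - start - len(window))
```

If the window lies inside one column, step 8 already cleans it. If it straddles two columns, the shift in step 9 moves the whole window into a single column, and step 10 cleans it.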
Subblock Columnsort
• Adds two steps to columnsort
  • Sort each column
  • A fixed permutation
• The permutation is any one that distributes all
elements of each √s × √s subblock to all s
columns
• Like slabpose columnsort, the size of the dirty
area is at most 2s^(3/2) entering the last four steps
• As long as 2s^(3/2) ≤ r/2 (half a column), the last
four steps complete the sorting
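The distribution property of the subblock permutation can be stated as a simple check. The sketch below is an illustration under assumptions: s is a perfect square, √s divides r, and the example permutation is a hypothetical one chosen only because it satisfies the property. It is not necessarily the permutation used in the paper.

```python
import math


def example_subblock_permutation(r, s):
    """Return a map (row, col) -> (row, col) sending each subblock to all s columns."""
    q = math.isqrt(s)
    assert q * q == s and r % q == 0

    def target(i, j):
        block = (i // q) * q + (j // q)   # index of the subblock, in 0..r-1
        col = (i % q) * q + (j % q)       # position inside the subblock, in 0..s-1
        return block, col                 # the subblock's elements fill one row

    return target


def has_subblock_property(target, r, s):
    """True if `target` is a bijection on the r x s mesh and every
    sqrt(s) x sqrt(s) subblock has its s elements sent to s distinct columns."""
    q = math.isqrt(s)
    seen = set()
    for a in range(r // q):
        for b in range(q):
            cols = set()
            for u in range(q):
                for v in range(q):
                    dest = target(a * q + u, b * q + v)
                    seen.add(dest)
                    cols.add(dest[1])
            if cols != set(range(s)):
                return False
    return seen == {(i, j) for i in range(r) for j in range(s)}
```

The identity permutation fails this check, since each subblock stays within its own √s columns.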

Removing the Divisibility Restriction from
Columnsort
• With the divisibility restriction, the dirty rows
after the transpose step have only 0-1
transitions
• Without the divisibility restriction, there may
also be 1-0 transitions
• The proof shows that even with the 1-0
transitions, the size of the dirty area entering
the last four steps does not increase
• Thus r ≥ 2s^2 suffices, even without the
divisibility restriction
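The difference between 0-1 and 1-0 transitions is easy to observe on a 0-1 input. The sketch below assumes that the transpose step reads the mesh in column-major order and writes it in row-major order, so that row i of the transposed mesh is simply elements i·s through i·s + s − 1 of the column-major sequence. The function name is hypothetical.

```python
def rows_have_10_transition(r, s, zeros_per_column):
    """After sorting columns and transposing, does any row contain a 1 followed by a 0?

    zeros_per_column[j] is the number of 0s in column j of a 0-1 input;
    after the first sort, column j is that many 0s followed by 1s.
    """
    assert len(zeros_per_column) == s
    flat = []
    for z in zeros_per_column:
        assert 0 <= z <= r
        flat += [0] * z + [1] * (r - z)      # one sorted 0-1 column
    for i in range(r):
        row = flat[i * s:(i + 1) * s]        # row i of the transposed mesh
        if any(x == 1 and y == 0 for x, y in zip(row, row[1:])):
            return True
    return False
```

When s divides r, every row of the transposed mesh comes from a single original column, so only 0-1 transitions can occur. When s does not divide r, a row can straddle two original columns, for instance with r = 20, s = 3, and ten 0s in each column.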

Conclusion
• We can get around the restrictions of columnsort
• Reduce the exponent in the height restriction
from 2 to 3/2
• The mesh need not be quite so tall and skinny
• Cost: two extra steps
• In out-of-core implementation, slabpose