Saturday, August 8, 2015


The superscalarsort whitepaper by Agarwal is a good read.

It's a pretty novel idea. Modern superscalar processors have the ability to gain massive throughput increases when they can correctly predict branches. The issue is that many conventional sorting algorithms don't have conditionals that are overwhelmingly either true or false. If you picture QuickSort, the optimal case has the conditional (if the current element is larger than the pivot) true 50% of the time. This is the worst-case for modern branch predictors.

Superscalarsort avoids the branch prediction problem by eliminating conditional branching probabilistically. After partitioning the data into buckets, a number of tricks are used to make a radix sort not require any comparisons unless there are ties. There is a 0.002 probability of a tie given random numbers. The conditionless moves are very amiable to ILP as well.

I wonder if this 1996 algorithm might find new life on modern GPUs and vector processors. For these massively parallel processors, a conditional with 50/50 probability will see half of the computation threads sit idle in each branch. Conditionals are incredibly expensive in a SIMD world.

No comments:

Post a Comment