0.3.9 Bucket and Radix Sorting
The bucket sort is a non-comparison-based sorting algorithm in which
we allocate one storage location for each item to be sorted and
proceed to process the data set to be sorted assigning each item into
its corresponding bucket. In order to bucket sort n unique items in
the range of 1 through m, allocate m buckets and then iterate
over the n items assigning each one to the proper bucket. Finally
loop through the buckets and collect the items putting them into final
order.
The bucket sort is a good choice when items to be sorted are from a
small data range that is known in advance.
One problem with the bucket sort is that, if the range of items to be
sorted is very large, an unreasonable amount of memory is required to
allocate enough buckets. A method very similar to bucket sorting
called radix sorting elegantly handles this problem. In a radix sort
the data to be sorted is broken down into several buckets, like in the
bucket sort. The difference is that many items are assigned to each
bucket in the radix sort because to assign an item to a bucket the
radix sort only considers a subset of the item key. Often the bucket
to which an item is assigned with the radix sort is based on a certain
digit or subset of the item's key value. The radix sort recursively
processes the contents of each bucket by allocating sub-buckets and
assigning items into sub-buckets by considering a different subrange
of the items' keys. This process continues until there is only one
item per bucket at which point items are recollected in order.
For example, sorting social security numbers with a radix method one
might divide the entire data space into ten sets. The members of set
number one would be the social security numbers beginning with the
digit 1, etc. The numbers in set one would be subdivided based on
the second digit in each social security number, and so on until the
entire set was sorted.
|