next up previous contents index Search
Next: 0.3.9.1 Analysis Up: 0.3 Sorting Algorithms Previous: 0.3.8 Bubble Sort

   
0.3.9 Bucket and Radix Sorting

 

The bucket sort is a non-comparison-based sorting algorithm in which we allocate one storage location for each item to be sorted and proceed to process the data set to be sorted assigning each item into its corresponding bucket. In order to bucket sort n unique items in the range of 1 through m, allocate m buckets and then iterate over the n items assigning each one to the proper bucket. Finally loop through the buckets and collect the items putting them into final order.

The bucket sort is a good choice when items to be sorted are from a small data range that is known in advance.

One problem with the bucket sort is that, if the range of items to be sorted is very large, an unreasonable amount of memory is required to allocate enough buckets. A method very similar to bucket sorting called radix sorting elegantly handles this problem. In a radix sort the data to be sorted is broken down into several buckets, like in the bucket sort. The difference is that many items are assigned to each bucket in the radix sort because to assign an item to a bucket the radix sort only considers a subset of the item key. Often the bucket to which an item is assigned with the radix sort is based on a certain digit or subset of the item's key value. The radix sort recursively processes the contents of each bucket by allocating sub-buckets and assigning items into sub-buckets by considering a different subrange of the items' keys. This process continues until there is only one item per bucket at which point items are recollected in order.

For example, sorting social security numbers with a radix method one might divide the entire data space into ten sets. The members of set number one would be the social security numbers beginning with the digit 1, etc. The numbers in set one would be subdivided based on the second digit in each social security number, and so on until the entire set was sorted.



 
Scott Gasch
1999-07-09