A binary search on an array is
O(log2 n) because at each test you
can ``throw out'' one half of the search space. If we assume n, the
number of items to search, is a power of two (i.e. n = 2x) then,
given that n is cut in half at each comparison, the most comparisons
needed to find a single item in n is x. It is noteworthy that for
very small arrays a linear search
can prove faster than a binary search. However as the size of the
array to be searched increases the binary search is the clear victor
in terms of number of comparisons and therefore overall speed.
Still, the binary search has some drawbacks. First of all, it
requires that the data to be searched be in sorted order. If there is
even one element out of order in the data being searched it can throw
off the entire process. When presented with a set of unsorted data
the efficient programmer must decide whether to sort the data and
apply a binary search or simply apply the less-efficient linear
search. Even the best sorting algorithm is a complicated process. Is
the cost of sorting the data is worth the increase in search speed
gained with the binary search? If you are searching only once, it is
probably better to do a linear search in most cases.
Once the data is sorted it can also prove very expensive to add or
delete items. In a sorted array, for instance, such operations
require a ripple-shift
of array elements to open or close a ``hole'' in the array. This is
an expensive operation as it requires, in worst case, log2 ncomparisons and n item moves.
The binary search assumes easy random-access to the data space it is
searching. An array is the data structure that is most often used
because it is easy to jump from one index to another in an array. It
is difficult, on the other hand, to efficiently compute the midpoint
of a linked list
and then traverse there inexpensively. The binary search tree
data structure and algorithm, which we discussed later,
attempt to solve these array-based binary search weaknesses.
|