0.2.8 Skip lists
In a paper entitled Skip Lists: A Probabilistic Alternative to
Balanced Trees William Pugh at the University of Maryland proposed a
linked-list like data structure with some additional features which
improved traversal speed. Binary trees work well when the elements
which they are to contain are inserted in a random order. As we have
seen in the previous sections, BSTs perform very poorly when data is
inserted in order. Red-Black trees, which do not suffer from
degenerate performance manifested by BSTs, must constantly rotate and
re-balance as data is inserted in order. A skip list is an ordered
linked list in which every node contains a variable number of links to
other nodes. The nth link of a given node points to subsequent nodes
in the list skipping over some number of intermediary nodes. These
skipped nodes are common in that they have fewer than n links
associated with them.
Because most nodes have a variable number of links, a skip list is
really a collection of linked lists of different levels. In order to
quickly traverse the structure looking for some target key we search
on the upper level list until either the target data is encountered or
we find a node with a key smaller than the target which links to a
subsequent node with a larget value than the target. In the second
case we continue by repeating the same procedure beginning at the node
with the smaller value than the target and continuing on the list one
level down.
In the following quotation William Pugh describes the performance of
his skip list data structure:
Skip lists are a probabilistic alternative to balanced trees. Skip
lists are balanced by consulting a random number generator. Although
skip lists have bad worst-case performance, no input sequence
consistently produces the worst-case performance (much like quicksort
when the pivot element is chosen randomly). It is very unlikely a
skip list data structure will be significantly unbalanced (e.g., for a
dictionary of more than 250 elements, the chance that a search will
take more than 3 times the expected time is less than one in a
million). Skip lists have balance properties similar to that of search
trees built by random insertions, yet do not require insertions to be
random.
Balancing a data structure probabilistically is easier than
explicitly maintaining the balance. For many applications, skip lists
are a more natural representation than trees, also leading to simpler
algorithms. The simplicity of skip list algorithms makes them easier to
implement and provides significant constant factor speed improvements
over balanced tree and self-adjusting tree algorithms. Skip lists are
also very space efficient. They can easily be configured to require an
average of 1 1/3 pointers per element (or even less) and do not require
balance or priority information to be stored with each node.
The above article appeared in the June 1990 issue of the
Communications of the ACM. A Postscript format version is available
for anonymous ftp from ftp.cs.umd.edu. The author, William Pugh,
can be contacted at pugh@cs.umd.edu.
|