Imagine the distance from a given node down to the most distant leaf
node below it is represented by d. In the process of
``heapification,'' the number of comparisons needed to process a given
node is, at most, 2d. This is because, in the worst case, when the
value at said node is compared to it's two children it will have to be
demoted. To determine that the node must be demoted two comparisons
are needed: one to compute the greatest child and one to compare the
parent node value with the greatest child. Once the values have been
swapped it becomes necessary to continue this two-comparison cost
process until we reach a leaf node, in the worst case.
Thus the total complexity of the entire ``heapify'' process is, at
most, two times the sum of the heights of all nodes in the heap. The
goal now is to evaluate this sum.
Consider complete trees (where all nodes have either two children or
none). Let H(i) denote the sum of the heights of all nodes a
complete tree of height i. A tree of height one has one node with
height one. A tree of height i consists of two trees of height
(i-1) plus one root node. Hence,
H(i) = 2H(i - 1) + i
Of course H(1) = 1.
The closed form solution of this recurrence relation is:
H(i) = 2i+1 - (i + 2)
Since a complete binary tree has more nodes than an incomplete one,
the total of all node heights in any tree of height i will be
less-than or equal to the total node height of the complete binary
tree with height i.
Thus the total number of comparisons needed to ``heapify'' a tree of
height i will be, at most,
O(2(2i+1 - (i + 2))). This linear
complexity can be simplified to O(i) where i is the tree height.
Thus, heapification is a linear process.
|