(algorithm)
Definition: Merge n sorted sequences of different lengths into one output while minimizing reads. Only two sequences can be merged at once. At each step, the two shortest sequences are merged.
Formal Definition: Let D={n1, ... , nk} be the set of lengths of sequences to be merged. Take the two shortest sequences, ni, nj∈ D, such that n≥ ni and n≥ nj ∀ n∈ D. Merge these two sequences. The new set D is D' = (D - {ni, nj}) ∪ {ni+nj}. Repeat until there is only one sequence.
See also simple merge, ideal merge, Huffman coding, greedy algorithm.
Note:
Merging sequences by length is the same as joining trees by frequency in Huffman coding. For example, let there be a set of sorted sequences of the following lengths: D={3,5,7,9,12,14,15,17}. Building the optimal merge tree goes as follows. Note that merged sequences are replaced by the sum of their lengths. For instance, the first step merges the sequence of length 3 and the sequence of length 5 to get a sequence of length 8.
3 5 7 9 12 14 15 17
8 7 9 12 14 15 17
/ \
3 5 15 9 12 14 15 17
/ \
8 7
/ \
3 5 15 21 14 15 17
/ \ / \
8 7 9 12
/ \
3 5 29 21 15 17
/ \ / \
14 15 9 12
/ \
8 7
/ \
3 5 29 21 32
/ \ / \ / \
14 15 9 12 15 17
/ \
8 7
/ \
3 5 50 32
/ \ / \
/ \ 15 17
29 21
/ \ / \
14 15 9 12
/ \
8 7
/ \
3 5 82
/ \
/ \
/ \
50 32
/ \ / \
/ \ 15 17
29 21
/ \ / \
14 15 9 12
/ \
8 7
/ \
3 5
Author: AL
If you have suggestions, corrections, or comments, please get in touch with Paul Black.
Entry modified 14 August 2008.
HTML page formatted Mon Feb 2 13:10:40 2015.
Cite this as:
Alen Lovrencic, "optimal merge", in
Dictionary of Algorithms and Data Structures [online], Vreda Pieterse and Paul E. Black, eds. 14 August 2008. (accessed TODAY)
Available from: http://www.nist.gov/dads/HTML/optimalMerge.html