Huffman Coding
A lossless data compression algorithm which uses a small number of bits to encode common characters. Huffman coding approximates the probability for each character as a power of 1/2 to avoid complications associated with using a nonintegral number of bits to encode characters using their actual probabilities.
Huffman coding works on a list of weights
by building
an extended binary tree with minimum weighted
external path length and proceeds by finding
the two smallest
s,
and
, viewed as external
nodes, and replacing them with an internal node of weight
. The procedure
is them repeated stepwise until the root node is reached. An individual external
node can then be encoded by a binary string of 0s (for left branches) and 1s (for
right branches).
The procedure is summarized below for the weights 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, and 41 given by the first 13 primes, and the resulting tree is shown above (Knuth 1997, pp. 402-403). As is clear from the diagram, the paths to the larger weights are shorter than those to the smaller weights. In this example, the number 13 would be encoded as 1010.
| 2 | 3 | 5 | 7 | 11 | 13 | 17 | 19 | 23 | 29 | 31 | 37 | 41 |
| 5 | 5 | 7 | 11 | 13 | 17 | 19 | 23 | 29 | 31 | 37 | 41 | |
| 10 | 7 | 11 | 13 | 17 | 19 | 23 | 29 | 31 | 37 | 41 | ||
| 17 | 11 | 13 | 17 | 19 | 23 | 29 | 31 | 37 | 41 | |||
| 17 | 24 | 17 | 19 | 23 | 29 | 31 | 37 | 41 | ||||
| 24 | 34 | 19 | 23 | 29 | 31 | 37 | 41 | |||||
| 24 | 34 | 42 | 29 | 31 | 37 | 41 | ||||||
| 34 | 42 | 53 | 31 | 37 | 41 | |||||||
| 42 | 53 | 65 | 37 | 41 | ||||||||
| 42 | 53 | 65 | 78 | |||||||||
| 95 | 65 | 78 | ||||||||||
| 95 | 143 | |||||||||||
| 238 |
The following Wolfram Language code can be used to construct the list of internal nodes and table of iterations:
HuffmanStep[l0_List] := Module[
{
l = l0,
s2 = Take[Select[Sort[l0], Positive], 2]
},
l[[Take[Flatten[Position[l, #]& /@ s2], 2]]] = 0;
l[[Last[Position[l, 0]]]] = Plus @@ s2;
{l, s2}
]
HuffmanList[l_List] := Module[{},
Plus @@@ Last /@ NestWhileList[
HuffmanStep[First[#]]&,
HuffmanStep[l], Length[Union[First[#]]] > 2&]
]
HuffmanTable[l_List] :=
NestWhileList[First[HuffmanStep[#]]&, l,
Length[Union[#]] > 2&]
BCH code



