Allows to create a Look-Up-Table (LUT) of a Huffman tree.
A Huffman tree allows to compress data at the bits level, by creating a dictionary that will translate a certain pattern of bits, into another, often smaller. It is often represented as a binary tree, with each code's 0 and 1 encoding left or right until getting to a symbol at the designated leaf. However, traversing the tree everytime is not ideal for performance reasons. This LUT aims to flatten the lookup time into a simple indexing operation.
However, this comes with some constraints :
-
The LUT takes more space in memory, up to 65 536 bytes.
-
Indexing the LUT should be done by padding any shorter code with 0s on the right, to be the same size as the longest possible code encoded.
For instance, if the tree features codes with 7 bits and at most 9 bits, then a code such as 0b1100011 should be padded to become 0b110001100 before looking up.
A typical usage is to use a BitStream as such :
Constructor. The descriptor data is interpreted sequentially. As such :
-
The symbols to encode in the _table are mapped one by one to the bit counts as per the _perBitsCounts array.
-
The minimum bit count is 1, meaning that _perBitsCounts[0] encodes the number of symbols encoded using 1 bit, _perBitsCounts[1] is 2 bits, etc.
-
Currently, no checks are made to ensure the sanity of the data mapping between both members.
As an example, a simple tree can be created with :
This tree will feature 3 symbols, the first with a code of size 1, and the 2 next of size 2. More precisely : 128 (encoded by 0), 12 (encoded by 10), and 13 (encoded by 11).
- Parameters
-
descriptor | The tree descriptor to use for the tree, and ultimately LUT creation. |
Canonical constructor. The canonical description should be a series of numbers encoding the number of bits each symbol takes. The associated values start from 0, up to the buffer's size - 1. Providing a length of 0 for a symbol means that it won't be used in the tree representation.
For instance, a buffer of {2, 1, 3, 3} will output a tree associating symbols 0, 1, 2, 3 to their codes 0b10, 0b0, 0b110, 0b111.
- Parameters
-
canonicalDescription | The canonical description, each entry giving the number of bits a code for a symbol should have (each symbol is the index at which the code length is). |