API Documentation
Public Member Functions | List of all members
nkMemory::HuffmanLut Class Reference

Allows to create a Look-Up-Table (LUT) of a Huffman tree. More...

Public Member Functions

 HuffmanLut (const HuffmanTreeDescriptor &descriptor) noexcept
 
 HuffmanLut (nkMemory::BufferView< unsigned short > canonicalDescription) noexcept
 
BufferView< HuffmanSymbolgetLut () const
 
HuffmanSymbol get (unsigned int input) const
 
unsigned int getLutSize () const
 
unsigned int getMaxSymbolSize () const
 

Detailed Description

Allows to create a Look-Up-Table (LUT) of a Huffman tree.

A Huffman tree allows to compress data at the bits level, by creating a dictionary that will translate a certain pattern of bits, into another, often smaller. It is often represented as a binary tree, with each code's 0 and 1 encoding left or right until getting to a symbol at the designated leaf. However, traversing the tree everytime is not ideal for performance reasons. This LUT aims to flatten the lookup time into a simple indexing operation.

However, this comes with some constraints :


For instance, if the tree features codes with 7 bits and at most 9 bits, then a code such as 0b1100011 should be padded to become 0b110001100 before looking up.
A typical usage is to use a BitStream as such :

...
nkMemory::HuffmanLut lut (...) ;
nkMemory::BitStream stream (data) ;
nkMemory::HuffmanSymbol lookup = lut.get(stream.peekMsb(lut.getMaxSymbolSize())) ;
stream.moveForward(lookup._size) ;
// Use lookup._value as needed
...

Constructor & Destructor Documentation

◆ HuffmanLut() [1/2]

nkMemory::HuffmanLut::HuffmanLut ( const HuffmanTreeDescriptor descriptor)
noexcept

Constructor. The descriptor data is interpreted sequentially. As such :

  • The symbols to encode in the _table are mapped one by one to the bit counts as per the _perBitsCounts array.
  • The minimum bit count is 1, meaning that _perBitsCounts[0] encodes the number of symbols encoded using 1 bit, _perBitsCounts[1] is 2 bits, etc.
  • Currently, no checks are made to ensure the sanity of the data mapping between both members.

As an example, a simple tree can be created with :

nkMemory::HuffmanTreeDescriptor descriptor ;
descriptor._perBitsCounts[0] = 1 ;
descriptor._perBitsCounts[1] = 2 ;
descriptor._table = {128u, 12u, 13u} ;

This tree will feature 3 symbols, the first with a code of size 1, and the 2 next of size 2. More precisely : 128 (encoded by 0), 12 (encoded by 10), and 13 (encoded by 11).

Parameters
descriptorThe tree descriptor to use for the tree, and ultimately LUT creation.

◆ HuffmanLut() [2/2]

nkMemory::HuffmanLut::HuffmanLut ( nkMemory::BufferView< unsigned short >  canonicalDescription)
noexcept

Canonical constructor. The canonical description should be a series of numbers encoding the number of bits each symbol takes. The associated values start from 0, up to the buffer's size - 1. Providing a length of 0 for a symbol means that it won't be used in the tree representation.
For instance, a buffer of {2, 1, 3, 3} will output a tree associating symbols 0, 1, 2, 3 to their codes 0b10, 0b0, 0b110, 0b111.

Parameters
canonicalDescriptionThe canonical description, each entry giving the number of bits a code for a symbol should have (each symbol is the index at which the code length is).
Remarks
The length featured should not be larger than the maximum per bits cound supported, 16.

Member Function Documentation

◆ getLut()

BufferView<HuffmanSymbol> nkMemory::HuffmanLut::getLut ( ) const
Returns
The raw buffer used for the look up table.
Remarks
To index this buffer, the input needs to be padded on the right to be the same size as the longest possible input encoded. For instance, if a 7 bits code needs to index a LUT made with codes up to 9 bits, then it needs to be padded : 0b1100011 -> 0b110001100.

◆ get()

HuffmanSymbol nkMemory::HuffmanLut::get ( unsigned int  input) const
Parameters
inputThe input binary code to lookup for.
Returns
The symbol encoded at a given input's position.
Remarks
Because this operation works within the LUT, the input needs to be padded on the right to be the same size as the longest possible input encoded. For instance, if a 7 bits code needs to index a LUT made with codes up to 9 bits, then it needs to be padded : 0b1100011 -> 0b110001100.

◆ getLutSize()

unsigned int nkMemory::HuffmanLut::getLutSize ( ) const
Returns
The total LUT size, or entry count.

◆ getMaxSymbolSize()

unsigned int nkMemory::HuffmanLut::getMaxSymbolSize ( ) const
Returns
The maximum bit code size as encoded in the LUT.

The documentation for this class was generated from the following file: