home .. forth .. colorforth mail list archive ..

[colorforth] shifted huffman trees


This is similar to Bill Parker's shift character idea, except I use
shifted huffman trees.

The current character set is 48 characters = 1 space + 26 letters + 10
digits + 11 puctuation/symbols. The huffman tree looks like this:


    /\
   /  \
  8   /\
     /  \
    8   32

Left branches represents a '0' bit, right branches represents '1', the
terminal number is the number of terminals that follow. For example 8 says
there are 8 terminals for which 3 bits are needed to represent this
subtree. If I split the terminal 32, I can add new branches to the tree. I
choose to split it into a subtree of 16 terminals and another subtree since
8+8+16=32. 32 entries are enough to contain the 26 lower case letters. For
the remaining subtree, I choose to mirror the parent tree providing the
upper case letters. The justification for this is that upper case letters
are much less frequent than lower case, and the mirror might provide
implementation advantages, this is not based on huffman probabilities. To
the illustrate, the tree now looks like:

    /\
   /  \
  8   /\
     /  \
    8   /\
       /  \
      16   \
           /\
          /  \
         8   /\
            /  \
           8   /\
              /  \
             16   ?

'?' can be chosen to give the desired number of characters. This new tree
should be regarded as an alternative to the 'comment', 'Capitalized', 'all
caps' tags and the anti-space word. Huffman probabilities should influence
the shape and balance of tree, but I haven't done that yet, nor have I
considered all the implications of this change. One thing that I notice is
that this doesn't change the 4-bit, 5-bit, and 7-bit distribution of the
26 lower case characters. The 4-bit and 5-bit classes are unchanged, the
7-bit class becomes a 3-bit prefix and a 4-bit index.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: colorforth-unsubscribe@xxxxxxxxxxxxxxxxxx
For additional commands, e-mail: colorforth-help@xxxxxxxxxxxxxxxxxx
Main web page - http://www.colorforth.com