
Re: [colorforth] Unused character encodings available in Colorforth


The 0 0000 character code indicates end-of-word. With characters packed into
32-bit words, zeros are also used as fill when not all 28 bits are used.

But I had in mind to parse/unparse words as they move from/to disk or
Internet. That is, the parsed 32-bit words are converted to a bit stream.
Then the only zeros would be the end-of-word code. It is then the most
frequently used code and deserves a 5-bit length.

I've never gotten around to doing this, but I'd guess it gives maybe 50%
compression. Anyone want to try it?

After the end-of-word, I'd append the 4-bit tag. These could be
length-encoded, but it doesn't seem worth the effort.

I'd also use an empty word (an extra 0 0000) to indicate end-of-block or
end-of-text.
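
For anyone who wants to try it, here is a rough Python sketch of the unparse
direction only, as I read the description above. It assumes the characters of
each word have already been split out as variable-length bit strings; the
codes in the usage note are made-up placeholders, not the real colorForth
character table, and the names (unparse_word, unparse_block, to_bytes) are
hypothetical. It also ignores names that spill into extension words.

```python
# Sketch only: layout follows the message above, everything else is assumed.
EOW_BITS = "00000"   # end-of-word code, written "0 0000" above
TAG_WIDTH = 4        # the 4-bit tag appended after end-of-word


def unparse_word(char_codes, tag):
    """Return the bit string for one word.

    char_codes: list of bit strings, one per character (assumed input).
    tag: integer 0..15, the word's tag.
    """
    if not 0 <= tag < (1 << TAG_WIDTH):
        raise ValueError("tag must fit in 4 bits")
    # Characters first, then end-of-word, then the fixed-width tag.
    return "".join(char_codes) + EOW_BITS + format(tag, "04b")


def unparse_block(words):
    """Concatenate words and close with an empty word (an extra end-of-word)."""
    stream = "".join(unparse_word(codes, tag) for codes, tag in words)
    return stream + EOW_BITS


def to_bytes(bits):
    """Pad the bit string with zeros to a byte boundary and pack it."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
```

For example, a two-character word with placeholder codes "0001" and "10010"
and tag 3 takes 4 + 5 + 5 + 4 = 18 bits, and a block holding just that word
takes 23 bits, which packs into 3 bytes instead of one 32-bit word. Reading
the stream back needs the real character table, so that side is left out.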


Main web page - http://www.colorforth.com