Re: compressing short XML messages without including dictionary or Huffman table



Columns: input, input length, output bytes, output/input as percent,
dictionary string (between colons), comment

abcxyz*45  270 => 19 ( 7.0%)  ::         i.e. no dictionary
abcxyz*45  270 => 22 ( 8.1%)  :abc:      why should this increase the output size?
abcxyz*45  270 => 22 ( 8.1%)  :xyz:
abcxyz*45  270 => 19 ( 7.0%)  :abcxyz:
abcxyz*45  270 => 20 ( 7.4%)  :xyzabc:   surprisingly, not quite the same as using 'abcxyz'
abcxyz*45  270 => 21 ( 7.8%)  :abc xyz:  slightly worse than 'abcxyz' without the space

So, why is it that specifying a dictionary can actually increase
output size?

The parsing phase has many choices of which preceding substring
to match (both position and length). The parser does not attempt
to choose matches that result in an optimal (shortest) encoding.
The parser in zlib only tries to select a good match quickly.
A dictionary adds more candidate substrings, so that quick choice
can land on a slightly longer encoding than it would without one.
You must pay more (usually a _lot_ more) to get a parser that
tries to select the matches which give the shortest encoding.
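
For anyone who wants to repeat the measurement, here is a minimal sketch using Python's zlib module. It is not the program that produced the table above. The helper name deflate_size, the use of raw deflate (wbits=-15, so no zlib header or dictionary ID is counted), and compression level 9 are all my assumptions, so the byte counts may not match the ones quoted.

```python
import zlib

# Dictionaries from the table above; b"" means "no dictionary".
DICTIONARIES = [b"", b"abc", b"xyz", b"abcxyz", b"xyzabc", b"abc xyz"]


def deflate_size(data: bytes, dictionary: bytes = b"", level: int = 9) -> int:
    """Return the raw-deflate size of data, optionally using a preset dictionary.

    Hypothetical helper for illustration. Raw deflate (wbits=-15) is used so
    that the zlib wrapper bytes do not blur the comparison between runs.
    """
    if dictionary:
        comp = zlib.compressobj(level=level, wbits=-15, zdict=dictionary)
        decomp = zlib.decompressobj(wbits=-15, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=level, wbits=-15)
        decomp = zlib.decompressobj(wbits=-15)

    out = comp.compress(data) + comp.flush()

    # Round-trip check: the decompressor needs the same dictionary.
    restored = decomp.decompress(out) + decomp.flush()
    if restored != data:
        raise ValueError("round trip failed")
    return len(out)


def report(data: bytes) -> None:
    """Print one line per dictionary in roughly the layout of the table."""
    for d in DICTIONARIES:
        n = deflate_size(data, d)
        pct = 100.0 * n / len(data)
        print(f"{len(data)} => {n} ({pct:4.1f}%) :{d.decode()}:")


if __name__ == "__main__":
    report(b"abcxyz" * 45)
```

Changing the level argument is a cheap way to see the parser's effort setting at work, since zlib's levels trade match-search time against output size.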
