Re: compressing short XML messages without including dictionary or huffman table
- From: John Reiser <jreiser@xxxxxxxxxxxx>
- Date: Mon, 04 Jun 2007 11:33:34 -0700
Columns: input, input length => output bytes (output/input as percent), :dictionary string:

abcxyz*45  270 => 19 ( 7.0%)  ::         i.e. no dictionary
abcxyz*45  270 => 22 ( 8.1%)  :abc:      why should this increase the output size?
abcxyz*45  270 => 22 ( 8.1%)  :xyz:
abcxyz*45  270 => 19 ( 7.0%)  :abcxyz:
abcxyz*45  270 => 20 ( 7.4%)  :xyzabc:   surprisingly, not quite the same as using 'abcxyz'
abcxyz*45  270 => 21 ( 7.8%)  :abc xyz:  slightly worse than 'abcxyz' without the space
So, why is it that specifying a dictionary can actually increase
output size?
The parsing phase has many choices for which preceding substring
to match (both position and length). The parser does not attempt
to choose matches which will result in an optimal (shortest) encoding.
The parser in zlib tries only to select a good match quickly, and
a preset dictionary adds more candidate substrings to choose from.
You must pay more (usually a _lot_ more) to get a parser that
tries to select the matches which give a shortest encoding.
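
For anyone who wants to repeat the measurement, here is a minimal sketch
using Python's zlib module. The helper names deflate_raw and deflate_size
are made up for this example, and the choice of raw deflate (wbits=-15)
at level 9 is an assumption, not necessarily what produced the table
above. Raw deflate is used so that the zlib header, the Adler-32 trailer
and the dictionary ID do not count toward the output size.

```python
import zlib


def deflate_raw(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress data as a raw deflate stream, optionally with a preset dictionary.

    Hypothetical helper for this example. wbits=-15 selects raw deflate,
    so only the work of the parser and the Huffman coder is measured.
    """
    if zdict:
        comp = zlib.compressobj(level=9, wbits=-15, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9, wbits=-15)
    return comp.compress(data) + comp.flush()


def deflate_size(data: bytes, zdict: bytes = b"") -> int:
    """Return the number of output bytes for data with the given dictionary."""
    return len(deflate_raw(data, zdict))


if __name__ == "__main__":
    message = b"abcxyz" * 45
    # Same dictionary strings as in the table; b"" means no dictionary.
    for zdict in (b"", b"abc", b"xyz", b"abcxyz", b"xyzabc", b"abc xyz"):
        size = deflate_size(message, zdict)
        percent = 100.0 * size / len(message)
        print(f"{len(message)} => {size} ({percent:4.1f}%) :{zdict.decode()}:")
```

The exact sizes depend on the zlib version, the compression level and the
stream format, so they need not match the table. The thing to look at is
whether the output size changes from one dictionary to the next even
though the input is identical.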