Re: Pacbyte data compression patent
- From: boo@xxxxxxxxxxxxxxxx
- Date: 5 Sep 2006 17:34:00 -0700
Matt Mahoney wrote:
The other (less obvious) flaw is that the scheme requires each of the
256 possible byte values to occur at least once in the 300 byte
sequence. The number of such sequences is (300!*256^44)/44! out of a
possible 256^300 sequences, or one in 2.806 * 10^56. So applying one
of 64K transforms is unlikely to put the input in the right form. You
would need 187.5 bits to encode enough transforms to guarantee the
required form for all possible inputs. When you add this plus the 44
bytes, you end up with exactly the same size as you started with, 300
bytes.
Yes, that was my intuition also. And even if it were possible
to guarantee getting the needed number of unique entries
after 2^16 attempts it would be a glacial process to do this
for each sequence of bytes.
The "obvious" flaw is that any random or recursive compression scheme
violates the counting argument. You can't compress all n-bit strings
because there are 2^n of them, but only 2^n - 1 possible compressed
codes shorter than n bits. Yet it is amazing how many people there are
who can't understand this simple proof. Invariably, they have a weak
understanding of mathematics. Just like the perpetual motion designers
of the last century who didn't think conservation of energy applied to
That's an excellent, succinct description of why no lossless
compression algorithm can be guaranteed to compress all strings.
I'd heard this before, but your second sentence sums it up well.
It's so obvious that I feel stoopid for not realizing it :(
Cheers, Eric
.
- References:
- Pacbyte data compression patent
- From: Sportman
- Re: Pacbyte data compression patent
- From: boo
- Re: Pacbyte data compression patent
- From: Matt Mahoney
- Pacbyte data compression patent
- Prev by Date: Re: FAQ Maintenance?
- Next by Date: Re: FAQ Maintenance?
- Previous by thread: Re: Pacbyte data compression patent
- Next by thread: fast'ish BWT source code
- Index(es):
Relevant Pages
|