Re: A truely BIJECTIVE BWT is here!
- From: Mark Nelson <snorkelman@xxxxxxxxx>
- Date: Thu, 13 Dec 2007 12:56:36 -0800 (PST)
On Dec 13, 8:43 am, Phil Carmody <thefatphil_demun...@xxxxxxxxxxx>
wrote:
I suspect this is impossible. In order to achieve an
improvement in compression, BWT needs to map less-
compressible vectors onto more-compressible vectors.
However, there are fewer of the latter.
I don't think this is quite correct Phil. If you look at the BWT
alone, it is truly a bijective function. It is a one-to-one mapping of
the set of input messages to the set of output messages, for all input
and output.
The output vectors that we get are more compressible ONLY because the
input data has a specific type of correlation - the chances that
symbol X appears after a given sequence of symbols, {a, b, c, ... } is
larger than would be suggested by mere chance. And of course, most
interesting files have this property. Files that aren't interesting
won't have this property, but they will still be transformed by BWT -
into equally uncompressible output vectors.
The reason that I am suspicious of David's claim is just that I don't
see how the transform can be reversible without additional
information.
Of course, I'm reluctant to flag it as impossible. If you were naively
reading the original paper without having ever heard of it before, I
would be very surprised if you were to get through the description of
the transform and then say "Aha! This would be totally reversible if I
just had a pointer to the new position of the last character in the
input sequence!" Well, maybe you would, but I sure didn't, and it
still seems pretty remarkable to me that it works at all.
I haven't even looked at David's code yet but I was hoping he could
give a words-only description of what he did to achieve his claim. But
I'll give David a break, he says his attempts at exposition are rarely
successful, and that's not likely to change overnight. What worries me
is my own personal conviction that there is a bijection between clear
writing in spoken language and clear writing in code.
|
| Mark Nelson - http://marknelson.us
|
.
- Follow-Ups:
- Re: A truely BIJECTIVE BWT is here!
- From: Phil Carmody
- Re: A truely BIJECTIVE BWT is here!
- From: biject
- Re: A truely BIJECTIVE BWT is here!
- References:
- A truely BIJECTIVE BWT is here!
- From: biject
- Re: A truely BIJECTIVE BWT is here!
- From: Phil Carmody
- A truely BIJECTIVE BWT is here!
- Prev by Date: Re: A truely BIJECTIVE BWT is here!
- Next by Date: Re: A truely BIJECTIVE BWT is here!
- Previous by thread: Re: A truely BIJECTIVE BWT is here!
- Next by thread: Re: A truely BIJECTIVE BWT is here!
- Index(es):