Re: bit permutation (encoding)
- From: Josiah Carlson <josiah.carlson@xxxxxxxxxxxxx>
- Date: Sat, 09 Jun 2007 12:55:52 -0700
stdazi wrote:
i was thinking a lot about the way I could encode a permutation of N
bits. I've also found some code searching through Usenet archives, but
I haven't understood the code concept well enough.
I'll try to explain what I mean by bit encoding in hope someone can
explain , or give a hint on how to approach the problem I've faced.
Say I have a binary digit D, of size N. This digit can surely be
described as a permutation of p elements with value 1 and (N-p)
elements of value 0. So, if I know the value of N, and the number of
set bits in it (or unset ;-) ) , I can describe it as a permutation. I
was wondering, on how could I efficiently store that information to
gain some compression size. I know this method wouldn't be efficient
nor fast, but I'm doing all that just to learn something out of it.
The only way I could think of "encoding" the permutation is to
generate all the permutations of the given elements, and find the
matching permutation. That way I can store the consecutive number of
the seeking permutation which would give me
I'll save you the time and effort of continuing that direction. There is nothing to be gained by specifying which permutation of n elements you actually have. This can be seen by the fact that log(n!) is O(nlogn) (trivially upper bounded by log(n^n) and lower bounded by log(n/2^n/2) ).
That is to say, if you have n unique items, and you want to transmit the permutation of those items, then describing the permutation as 'the shortest bit string that uniquely represents the first item seen' followed by 'the shortest bit string that uniquely represents the second item seen', etc., is as good as you can do. When we are talking about a sequence of bits, this is the bit sequence itself*.
The only potential for improving this is if you were to transfer the total length of a bit sequence as well as the number of zeroes (or ones). However, this is (functionally) equivalent to sending the tree in a static huffman encoding, which tends to do a bit worse than adaptive huffman codings, which tends to do a bit worse than adaptive arithmetic coding.
But when we get around to predicting and coding data based on history, now we are talking about entropy coding and data modeling, and are back into the realm of classic data compression and Shannon information theory.
Now, to answer your question regarding "is there a method of choosing permutation X from a sequence of permutations Y quickly?" The answer is yes. It relies on realizing that if you have k items that you are permuting, the first (k-1)! permutations have the same first element and some permutation of the remaining elements. The second (k-1)! has the second element of the original sequence, with the remaining elements permuted. Etc. But, the effect of generating 'which permutation', really just gets you the equivalent of an arithmetic coding (where all elements have initial probability 1, which becomes 0 after they are seen), plus or minus a handful of bits*.
- Josiah
* There was a discussion here a few months ago about encoding permutations of the 256 ascii values, 65536 short values, etc., but this only makes sense if you are encoding, specifically, permutations of that data...which never really happens (except when transferring static huffman statistics, which isn't as good as other methods).
.
- Follow-Ups:
- Re: bit permutation (encoding)
- From: Steven Pigeon
- Re: bit permutation (encoding)
- References:
- bit permutation (encoding)
- From: stdazi
- bit permutation (encoding)
- Prev by Date: Re: bit permutation (encoding)
- Next by Date: Re: that two-three week wait is up, this is how I spent my summer (or last 12 years)
- Previous by thread: Re: bit permutation (encoding)
- Next by thread: Re: bit permutation (encoding)
- Index(es):
Relevant Pages
|
|