Re: bit permutation (encoding)



stdazi wrote:
i was thinking a lot about the way I could encode a permutation of N
bits. I've also found some code searching through Usenet archives, but
I haven't understood the code concept well enough.

I'll try to explain what I mean by bit encoding in hope someone can
explain , or give a hint on how to approach the problem I've faced.

Say I have a binary digit D, of size N. This digit can surely be
described as a permutation of p elements with value 1 and (N-p)
elements of value 0. So, if I know the value of N, and the number of
set bits in it (or unset ;-) ) , I can describe it as a permutation. I
was wondering, on how could I efficiently store that information to
gain some compression size. I know this method wouldn't be efficient
nor fast, but I'm doing all that just to learn something out of it.

The only way I could think of "encoding" the permutation is to
generate all the permutations of the given elements, and find the
matching permutation. That way I can store the consecutive number of
the seeking permutation which would give me

I'll save you the time and effort of continuing that direction. There is nothing to be gained by specifying which permutation of n elements you actually have. This can be seen by the fact that log(n!) is O(nlogn) (trivially upper bounded by log(n^n) and lower bounded by log(n/2^n/2) ).

That is to say, if you have n unique items, and you want to transmit the permutation of those items, then describing the permutation as 'the shortest bit string that uniquely represents the first item seen' followed by 'the shortest bit string that uniquely represents the second item seen', etc., is as good as you can do. When we are talking about a sequence of bits, this is the bit sequence itself*.

The only potential for improving this is if you were to transfer the total length of a bit sequence as well as the number of zeroes (or ones). However, this is (functionally) equivalent to sending the tree in a static huffman encoding, which tends to do a bit worse than adaptive huffman codings, which tends to do a bit worse than adaptive arithmetic coding.

But when we get around to predicting and coding data based on history, now we are talking about entropy coding and data modeling, and are back into the realm of classic data compression and Shannon information theory.

Now, to answer your question regarding "is there a method of choosing permutation X from a sequence of permutations Y quickly?" The answer is yes. It relies on realizing that if you have k items that you are permuting, the first (k-1)! permutations have the same first element and some permutation of the remaining elements. The second (k-1)! has the second element of the original sequence, with the remaining elements permuted. Etc. But, the effect of generating 'which permutation', really just gets you the equivalent of an arithmetic coding (where all elements have initial probability 1, which becomes 0 after they are seen), plus or minus a handful of bits*.

- Josiah

* There was a discussion here a few months ago about encoding permutations of the 256 ascii values, 65536 short values, etc., but this only makes sense if you are encoding, specifically, permutations of that data...which never really happens (except when transferring static huffman statistics, which isn't as good as other methods).
.



Relevant Pages

  • Re: A new lostless transformation algorithm
    ... The set could be coded by coding any permutation. ... > sequence) and 3) 20 bytes permutation number ... > you can achieve more than 50% compression of any permutation of the 40 ...
    (comp.compression)
  • Re: optimation, a black art?
    ... instructions may be completely subsumed by slower running ones. ... To do this the compiler has to estimate where the value lies in the ... permutation generate the actual sequence of instructions following ...
    (comp.lang.lisp)
  • Re: A new lostless transformation algorithm
    ... The set could be coded by coding any permutation. ... > you can achieve more than 50% compression of any permutation of the 40 ... is man only a blunder of God, or God only a blunder of man? ...
    (comp.compression)
  • Re: Mathematical formula to compare integer sequence
    ... Then you have to define the implementation language first :-). ... " and sequence comparison are built in): ... permutation of b. ... primes and it would be quite fast. ...
    (sci.math)
  • Permutations, Fractions, Primes (a "game")
    ... First, take a permutation of the first n positive integers, and ... numerators and denominators of the s's. ... For example, if, for n=6, we have the permutation ... as discussed on the Integer Sequence Fan ...
    (sci.math)