A new lostless transformation algorithm



I would like to share the idea I investigated in my thesis (Techical
University-Sofia 18.06.98)

The idea is similar to BWT or move-to-front: transform data before
starting to code it. If the reversible trannsformation leads to very
well codable data then coding is payed-off.

The Main Idea: Input data is just one of the permutations of the set of
symbols it contains.
If you have the set (S) and the permtation number (P) you can recreate
original permutation. The set could be coded by coding any permutation.
When decoding takes place you shoud only rearrnge the set (S) in a such
way that it forms (P)

Example: set (0, 1, 2, 3, 4)
coded sequence: 0, 1, 2, 4, 3 (second permuatation)
Coding: S = 0, 1, 2, 3, 4 / P = 1

Example2: Same set
coding sequence: 2, 4, 3, 0, 1 ( 32th permutation /not
sure/ )
Coding: S = 0, 1, 2, 3, 4 / P = 32


Best Coding with this method: We find all permutatios(m) of the
Original Input (I), and every permution is coded in (n)diifrent ways
We have (m x n) outputs. One of them is much smaller in size (has best
compression)
So we code the original message(sequence) with 3 strings:
1) Compression Method Used (Best of n) log2(n) bits
2) Best output data
3) The number of the original permutation (all are m) log2(m)

The interesting question is how much space 3) takes. In the worst case
(no repeating symbols in original alphabet) for 2 symbols 1bit, 8
symbols (2 bytes), 40 symbols (20 bytes)

So if we want to code 1 MB, we devide it into sequences of 40 symbols
and code every sequence with 1) X bytes (Y percents), 2) Z bytes (coded
sequence) and 3) 20 bytes permutation number

At first the transformation coding overhead seems too big - 50 % but if
you can achieve more than 50% compression of any permutation of the 40
symbols it pays off

EXAMPLE3: 8 bytes sequences: 4, 3, 1, 5, 8, 6, 2, 7 - pure chaos - no
way to compress.
(64 bits sequence)

=> Data 1) Only one compression method is used (bitmap coding), so == 0
bytes nloss here)
Data 2) Permuttion 1 coded (1, 2, 3, 4, 5, 6, 7, 8) with 32 bits
coding 256 chars in the set bitmap coding (256/8 = 32)
Data 3) Coding permutation number: Exacly 16 bits

So compression = (32+16)/64 = 75 % compression

If you have questions please ask (pnk@xxxxxx) Peter Koniarov

.



Relevant Pages

  • Re: A new lostless transformation algorithm
    ... The set could be coded by coding any permutation. ... > you can achieve more than 50% compression of any permutation of the 40 ... is man only a blunder of God, or God only a blunder of man? ...
    (comp.compression)
  • Re: A new lostless transformation algorithm
    ... The set could be coded by coding any permutation. ... > sequence) and 3) 20 bytes permutation number ... > you can achieve more than 50% compression of any permutation of the 40 ...
    (comp.compression)
  • Re: bit permutation (encoding)
    ... I'll try to explain what I mean by bit encoding in hope someone can ... The only way I could think of "encoding" the permutation is to ... When we are talking about a sequence of bits, this is the bit sequence itself*. ... However, this is equivalent to sending the tree in a static huffman encoding, which tends to do a bit worse than adaptive huffman codings, which tends to do a bit worse than adaptive arithmetic coding. ...
    (comp.compression)
  • Re: AES 256 key and anti-key
    ... the same permutation, and there is nothing wrong about that. ... Guaranteeing a surjective mapping from the set ... One doesn't even need to worry about anyting. ... Same with Burrows Wheeler compression systmes it bugged me ...
    (sci.crypt)
  • Re: Enhancement of the 3R algorithm
    ... Don't all compression algorithms expand data on average? ... people like to have fun with stuffs they can click... ... There is NO permutation or computation of that sort in the ... and it's not a general purpose algo either. ...
    (comp.compression)