A new lostless transformation algorithm
- From: "Peter N K" <peet@xxxxxx>
- Date: 29 Dec 2005 03:47:58 -0800
I would like to share the idea I investigated in my thesis (Techical
University-Sofia 18.06.98)
The idea is similar to BWT or move-to-front: transform data before
starting to code it. If the reversible trannsformation leads to very
well codable data then coding is payed-off.
The Main Idea: Input data is just one of the permutations of the set of
symbols it contains.
If you have the set (S) and the permtation number (P) you can recreate
original permutation. The set could be coded by coding any permutation.
When decoding takes place you shoud only rearrnge the set (S) in a such
way that it forms (P)
Example: set (0, 1, 2, 3, 4)
coded sequence: 0, 1, 2, 4, 3 (second permuatation)
Coding: S = 0, 1, 2, 3, 4 / P = 1
Example2: Same set
coding sequence: 2, 4, 3, 0, 1 ( 32th permutation /not
sure/ )
Coding: S = 0, 1, 2, 3, 4 / P = 32
Best Coding with this method: We find all permutatios(m) of the
Original Input (I), and every permution is coded in (n)diifrent ways
We have (m x n) outputs. One of them is much smaller in size (has best
compression)
So we code the original message(sequence) with 3 strings:
1) Compression Method Used (Best of n) log2(n) bits
2) Best output data
3) The number of the original permutation (all are m) log2(m)
The interesting question is how much space 3) takes. In the worst case
(no repeating symbols in original alphabet) for 2 symbols 1bit, 8
symbols (2 bytes), 40 symbols (20 bytes)
So if we want to code 1 MB, we devide it into sequences of 40 symbols
and code every sequence with 1) X bytes (Y percents), 2) Z bytes (coded
sequence) and 3) 20 bytes permutation number
At first the transformation coding overhead seems too big - 50 % but if
you can achieve more than 50% compression of any permutation of the 40
symbols it pays off
EXAMPLE3: 8 bytes sequences: 4, 3, 1, 5, 8, 6, 2, 7 - pure chaos - no
way to compress.
(64 bits sequence)
=> Data 1) Only one compression method is used (bitmap coding), so == 0
bytes nloss here)
Data 2) Permuttion 1 coded (1, 2, 3, 4, 5, 6, 7, 8) with 32 bits
coding 256 chars in the set bitmap coding (256/8 = 32)
Data 3) Coding permutation number: Exacly 16 bits
So compression = (32+16)/64 = 75 % compression
If you have questions please ask (pnk@xxxxxx) Peter Koniarov
.
- Follow-Ups:
- Re: A new lostless transformation algorithm
- From: Phil Carmody
- Re: A new lostless transformation algorithm
- From: David A. Scott
- Re: A new lostless transformation algorithm
- From: Lionel Delafosse
- Re: A new lostless transformation algorithm
- From: David A. Scott
- Re: A new lostless transformation algorithm
- From: Phil Frisbie, Jr.
- Re: A new lostless transformation algorithm
- Prev by Date: Defossé Guillaume DGS compression investor problems
- Next by Date: Re: huffman encoder
- Previous by thread: Defossé Guillaume DGS compression investor problems
- Next by thread: Re: A new lostless transformation algorithm
- Index(es):
Relevant Pages
|
|