Re: compression challenge
- From: "MisterE" <voids@xxxxxxxxxxxxxxx>
- Date: Wed, 5 Mar 2008 16:39:17 +1000
Initially I thought:
With a maximum of 256 outcomes, you should be able to easily manage
this. Represent first value. If remaining values are less than a given
binary representation, that representation is used next. Continue.
Worst case, you get 1, 2, 3, 4, 5, 6, 7, 8... this would mean 16
bytes. So therefore we need a command section to say less/more or we
can use this alternative where we deduce if most values are under
represented.
why the *** would you do that
So 1 bit to say before each if the following number is within
[remaining bit length available / 2] then to run it. However worst
case this still creates issues with some outcomes, but statistically a
very low number of them, where you have 4 bits + 1 bit for every one
of the 16 bits. This could however be resolved with a single bit more
at the front if "Most are inside the first 1/2" and then an additional
indicator if most are inside the first "1/4". This would then mean 2+1
bits per on whatever we define most as, and worst case of 4+1 on the
remaining, plus 2 for a command section. If they landed in the 2nd
quarter we would spend 3+1 with a command of 2 bits.
stupid
We also could use a 'closer to, farther from' representation, and
normally compress, where 1 bit prior to each stored area would
indicate if the amount is up to 8 bits from the current location, or
if it is less than 5 bits from the current location. This would cover
32 outcomes close to the origin, but would still have issues overall
due to the addition of a bit to each individual.
stupid
There are other methods... I had made a count to system, where you had
4 bits that provided a count, then a bit to say "continue or end?"
where continue would add 4 more bits to the whole and then it would be
representing your number. This would handle only numbers within 16
points of the last number however, and would generally add a bit per
to the whole, not resulting in measurable compression... but the funny
thing about that is the counting can be compressed if commonly it
defaults one way or the other :P
stupid
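For reference, here is a minimal sketch of one possible reading of the count-and-continue scheme quoted above: each group is 4 payload bits followed by 1 flag bit, where the flag says whether another group follows. The function names, the low-nibble-first ordering, and the use of a bit string are assumptions made for illustration; the quoted description does not pin any of them down.

```python
def encode_count(n: int) -> str:
    """Encode a non-negative integer as 5-bit groups: 4 payload bits + 1 continue bit.

    Assumed layout (not specified in the quoted text): low nibble first,
    continue bit '1' means another group follows, '0' means end.
    """
    if n < 0:
        raise ValueError("only non-negative values are supported")
    groups = []
    while True:
        groups.append(n & 0xF)  # take the lowest 4 bits
        n >>= 4
        if n == 0:
            break
    out = []
    for i, g in enumerate(groups):
        more = "1" if i < len(groups) - 1 else "0"
        out.append(format(g, "04b") + more)
    return "".join(out)


def decode_count(bits: str) -> int:
    """Inverse of encode_count: read 5-bit groups until a continue bit of '0'."""
    value = 0
    shift = 0
    for i in range(0, len(bits), 5):
        value |= int(bits[i:i + 4], 2) << shift
        shift += 4
        if bits[i + 4] == "0":
            break
    return value
```

Under this reading, a value below 16 costs 5 bits and a value from 16 to 255 costs 10 bits, against a flat 8 bits per byte. So it only wins when small values clearly dominate, which is a property of the data that has to be measured first.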
Statistically this is somewhat a challenge, but not too much of a
challenge.
You haven't done anything statistically.
You need to derive representations from the statistical distribution of the
data, not the other way around.
A good place for him to start would be to get the distribution of the 256
values for each byte.
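
A rough sketch of that first step, assuming the data is already in memory as a `bytes` object. The function names are made up for illustration. It counts how often each of the 256 byte values occurs and computes the zeroth-order Shannon entropy, which is a lower bound on the average bits per byte for any code that treats each byte independently.

```python
import math
from collections import Counter


def byte_distribution(data: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 possible byte values."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields ints 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def entropy_bits_per_byte(data: bytes) -> float:
    """Zeroth-order Shannon entropy of the byte distribution, in bits per byte."""
    return -sum(p * math.log2(p) for p in byte_distribution(data) if p > 0)
```

If the result is close to 8 bits per byte, no fixed per-byte representation will help, and any gain has to come from modelling dependencies between bytes. If it is well below 8, the distribution itself says which values deserve the short codes.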