Re: Need a bit of information about Compression
- From: "cr88192" <cr88192@xxxxxxxxxxxxxxxxxx>
- Date: Tue, 11 Apr 2006 06:46:17 +1000
"Nishu" <naresh.attri@xxxxxxxxx> wrote in message
news:1144667277.765377.196120@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
the last byte doesn't matter.
cr88192 wrote:
"Nishu" <naresh.attri@xxxxxxxxx> wrote in message
news:1144416807.604959.239040@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
this approach is often used by arithmetic coders as well. when the lengthunless we encode the EOF character (end of file) how can wetypical approaches:
practically
determine its end or decode it or store it?
either encode a full length, or a length mod some constant.
Yes, full Length coding, arithmetic coding which is certainly better
option.
is
reached, we know the data is done.
The question is"when the length is reached !!"
In arithmetic coders, we can take care of whole string at a time and
can convert it into a single number which lies between a certain range.
hence there cant be any problem about whether all of bits in last byte
are filled or not, but this doesnt hold for huffman coding.
you keep track of decoded length.
if you are decoding, say, 1024 symbols, then you stop as soon as you decode
those symbols, no problem, whatever is left is ignored.
But would u explain me how length mode CONST coding works?mod const usually involves some manner of guess, eg, one decodes until
the
end of the compressed data is reached. one knows that some amount of the
trailing data is garbage. we then know that the true end of the data (mod
const) is less than the current end (mod the same const) (this may not
hold
per se, if it doesn't it means that the decoded data wrapped around), one
can then back up the requisite amount, and have the decoded data (at the
correct length).
One need to know which data is garbage and which is not !!
because you know where the end "should" be.
Problem is
to find out the proper end of the bit stream...no, it is not.
you don't actually need to know this in this case.
and this can be done if we know how many '/', '1' and '0' are there.storing coded bits is one way (possible with 3 bits, if one knows the
but not otherwise. It seems as if we need extra counter to find out
them and then we know how many total no. of actual coded bits are
present.
location of the last byte).
likewise, one can store how large it should be when decoded, and maybe use
the mod approach for this.
well, in my case I don't personally care all that much about bijectiveyeah, but with only 3 symbols, having a code dedicated to eof is veryI didnt understand this. Isnt that "111" would be the special case..
also common is to know from the data how long it is.
less common is to set up some special condition, where the eof is
implied
by
an otherwise impossible occurance (less likely in this case).
that last case. and hence forth we know the eof and so will stop the
decoding.
expensive, better to look for a cheaper special case (an encoder
impossibility or somesuch...).
yes, coded EOF is certainly expensive, but, then theres not a good
alternative suggested here right now. I'm finding myself helpless in
following Mr. Davids point of view. If u understand that, plz explain
me.
coding...
easy enough to do things the good ol-fashioned way, and unless major
additional compression or speed gains pop up, I am not all that interested
personally.
and what special character or impossibilty??? we are dealing in bits!!!
so we need bits to assign special case and in case u dont encode eof, u
simply cant put it as it is in the bit stream. right? we dont have any
exception to point out eof.
note, we transform some input to some output. we define impossibility in
terms of the rules of the transform. if the data located can't possibly be
generated by the encoder (say, a run of the same symbol that is too long and
not handled by RLE), it could be used as an eof marker, with no additional
costs.
a simple case would be, eg, defining an RLE scheme where any run >=2 of the
last symbol is replaced with a run, so 3 occurances of the same symbol in a
row is impossible.
AAAABBBCAABBBB
A#3B#2CAAB#3AAA
in this case, the rules don't allow the AAA on the end, so decoding the AAA
would flag that an eof has been reached (likewise, to save space one could
do this with the highest probability symbol, rather than needing an extra,
low probability symbol).
Just that we can make a counter to count all the char and then encodeno need to store the bit counts. this wont really work with AC anyways...
them.
so say, then we know, '/' == ten, '1' == 4, '2' == 2. then we have 1 x
10 + 2 x4 + 2 x2 = 22 bits. and hence we found the no. of coded bits.
Its a one alternative i can think of.
most of my approaches would work with both AC and huffman coding.
so, we can encode it. that is what I was getting at.this depends...say, for example, we know that all runs of 0 greater than or equal to
5
will
be encoded as an rle run. if we then encode, say, 8 consecutive 0's,
then
we
know that the eof is reached (some fudging would be involved with this
one,
a final 1 bit could be used, eg, to properly align the eof).
Yes, after doing Huffman, RLE too can be envoled for even better
compression. And in that if we mention the impossible case as the eof,
then yes, problem can be solved. and we can ignore eof while doing
huffman. Its certainly better option as you pointed out.
But still, if we just opt for huffman encoding, i'm not able find out
how can we determine the eof ?
the eof symbol approach works, if the codespace is large enough to
justify
it. with, eg, >50 symbols, the eof symbol doesn't really matter, with 3
it
does.
Yes, eof is a good choice in case of large set of char and when u dont
know the length of the stream. And in this case, we dont know the
length of the stream.
if one, up front, is like: this stream is 31690 symbols, ok, we can decode
it.
likewise, if we state that the length mod 256 is 202, it can still be
decoded, as once a known end for the coded stream is reached, we can use the
mod to find the end of the decoded stream.
Plz help me, if i'm not able to follow you.
Regards,
Nishu
.
- References:
- Need a bit of information about Compression
- From: Dr.Whacked
- Re: Need a bit of information about Compression
- From: cr88192
- Re: Need a bit of information about Compression
- From: Earl Colby Pottinger
- Re: Need a bit of information about Compression
- From: Dr.Whacked
- Re: Need a bit of information about Compression
- From: cr88192
- Re: Need a bit of information about Compression
- From: Nishu
- Re: Need a bit of information about Compression
- From: cr88192
- Re: Need a bit of information about Compression
- From: Nishu
- Re: Need a bit of information about Compression
- From: cr88192
- Re: Need a bit of information about Compression
- From: Nishu
- Need a bit of information about Compression
- Prev by Date: Re: New Compression Method
- Next by Date: Re: Need a bit of information about Compression
- Previous by thread: Re: Need a bit of information about Compression
- Next by thread: Re: Need a bit of information about Compression
- Index(es):
Relevant Pages
|