Re: Arithmetic coder input
- From: "Matt Mahoney" <matmahoney@xxxxxxxxx>
- Date: 25 Dec 2005 18:17:49 -0800
Learning_boy wrote:
> Still struggling with the difference in values with hand calculations and
> coder calculatons.
>
> Can someone point me to a reference that will explain how the bits are
> shifted in with relation to the probability examples. I can't seem to be
> able to get a handle on this. I need this understanding to try and work out
> the underflow influence.
>
> Thanks
All arithmetic coders have a state representing a range, initially lo=0
to hi=1, which is repeatedly divided in proportion to the next symbol
probabilities. They differ primarily in how the range is represented.
These are high precision numbers having 3 or 4 parts, in order from
most to least significant:
1. Where the most significant bits of lo and hi are the same, the bits
are written to disk.
2. (optional) where the bits of lo are 0 followed by a string of 1s and
the bits of hi are 1 followed by a string of 0s, the number of such
bits is stored in a carry counter.
3. The next 16, 32, or 64 bits are stored in a pair of integers, or a
smaller number of bits is stored as a single integer state representing
both lo and hi.
4. The least signficiant bits of lo and hi are implicitly all 0s and
all 1s respectively. This requires some rounding of the symbol
probabilities.
Older arithmetic coders used 16 bit integers and a carry counter. They
wrote out bits one at a time.
PAQ6v2, PAQ7, and FPAQ0 use 32 bit integers and no carry counter.
Output is a byte at a time. PAQAR and PAsQDa output one bit a a time.
The symbols to be coded are binary. This code is very close to the
Shannon limit.
PPMD uses a 32 bit coder with no carry counter. The symbols are bytes.
Output is a byte at a time. There is a bit more loss of precision,
but still very efficient. The actual values represented are lo and
range (hi=lo+range).
JPEG in arithmetic coding mode uses a state representing both lo and hi
to code binary symbols. This is very fast because each coding step is
just a state table lookup. However the code is less efficient because
of the limited number of states forces more severe rounding of
probabilities.
Take a look at the source code and the JPEG spec for these different
coders.
-- Matt Mahoney
.
- Follow-Ups:
- Re: Arithmetic coder input
- From: Learning_boy
- Re: Arithmetic coder input
- References:
- Arithmetic coder input
- From: Learning_boy
- Arithmetic coder input
- Prev by Date: Arithmetic coder input
- Next by Date: embarrassing question on lzo decompression buffer corruption error
- Previous by thread: Arithmetic coder input
- Next by thread: Re: Arithmetic coder input
- Index(es):
Relevant Pages
|