Re: QI and MQ Coder: First real-life experiences



[reposted with better formatted line breaks]

> typedef union _swi { // SW Integer
> struct {
> dword m; // mantissa
> int32 e; // exponent
> };
> qword q; // alias for 64-bit compares
> } SWI;

> That is a specialized floating number. What information
> is used to align the mantissa?

Regarding the 64 bit "precision", there are two "precision"
conventions or concepts, (a) and (b) described below, that are
being mixed up in these questions.

a) If you look at SWI as a floating point number, than the SWI
structure does use 64 bits, which would be equivalent to a 64
bit FP number (with some non-standard partition of bits between
mantissa and exponent). So for that concept of "precision", call
it "precision-fp", the 32-bit SWI "precision-fp" is the same as
for 64 bit FP numbers.

b) Coding "precision", "precision-c" -- This is convention that
is used to describe AC (or Huffman code) precision, where the
"precision" means the width of "addend" (which in QI is called
SW mantissa and in AC the "range"). With a g-bit Huffman coding,
these addends are just the Huffman codes with max length of g
bits per code, and these codes are added to the output code on
exact bit boundaries i.e. the Huffman addends don't overlap with
the current sum or the encoded ouput (hence this "addition" in
Huffman coding is equivalent to concatenation of addends to the
current encoded output). With QI and AC, each working in 32 bit
"precision-c", which code symbols with bit fraction accuracy,
the addends are 32 bit values and these addends do overlap with
the current sum (the cumulative probability in AC or enumerative
index with QI).

The convention I follow when saying that QI uses "precision" g=
32 bits is (b), since we are comparing QI to AC with the same
"precision-c" (of its mantissa, the "range").

> That is a specialized floating number.

Structurally, yes. The FP and SW numeric types are not specified
by their structure alone, but also by the set of operators and
how these operators work. The operators work differently on SW
operands than on FP operands. That should be also obvious if you
check [T3] p. 7, the definition (i) which defines how the
operators work.

If you wished to perform that kind of operation on existent FP
numbers (hence reusing the structural similarity), you would
need to implement it on your own, since the FP operations would
not work the way (i) specifies. The principal difference is that
SW arithmetic operators are decoupled from rounding operator,
while with FP numbers the rounding is an integral part of the
arithmetic operators.

Note that the [QIC] source kit includes conversion functions
between SW and FP numbers in Swi.c, functions: fp2swi(),
swi2fp() as well as for logs of fp number to swi of the fp
number itself: lfp2swi() and lfp2swx(). All these are used in
QIC only for various stats, where logs of exact binomials &
factorials are computed via FP's 53 bit mantissa, in order to
measure quantization effects on the QI's quantized binomials,
powers, factorials, etc.

> What information is used to align the mantissa?

The SW exponent, of course, as shown e.g. in EncDec.c, function
qiEnc(). Note that AC, due to the historical accidents of its
evolution, never developed similar clean formalism as SW for
that type of numbers and their operations, even though it does
arithmetically the same thing as QI with SW integers (AC's are
g-bit SW fractions).

The AC's mantissa is the "range" and its exponent is -(# of
"range" scaling steps), i.e. AC exponent is a negative number
(scaling step is a way of looking at SW fraction multiplication
with 2, and forgetting of exponent). The kludgy way of handling
the arithmetic of, what amounts to simple SW multiplications of
32 bit mantissas for SW fractions (which are similar to FP
multiplications of 32 bit mantissas, except that the full 64 bit
product of two mantissas is not truncated to 32 bits, as FP
"multiply" operator would do), results in the AC's arithmetic
leaving the randomly fluctuating number of significant bits in
its mantissa ("range") and thus to further precision loss.

The other precision loss (also absent with QI) is the loss of
"range" precision after multiplication of the "range" with p=
Probability(x) where x is the current symbol, which for
unlimited precision AC would gain the number of significant bits
from p, but which are not kept in a finite precision value for
"range" of a regular 32 bit AC.

Hence, even reformulating the AC arithmetic via SW fractions,
with explictly maintained exponent, which would help keep its
mantissa precision more stable (using up its max value of g bits
throughout), would not solve its entire excess redundancy vs
QI's integer based arithmetic and index (with AC the "index" is
its "cummulative probability", hence a number smaller than 1.0).
But it would certainly simplify and streamline its rounding,
zooming and scaling logistic (which amounts to a "poor man's SW
arithmetic") making it much more transparent and mathematically
more sound.

.



Relevant Pages

  • Re: Semi OT: Binary representation of floating point numbers
    ... precision floating point number to double precision after running into ... set out to study the IEEE 754 representation of these numbers. ... and e is the biased exponent which can range from 0 to 255 ... The mantissa is assumed to be normalized which means the msb of its value is shifted into the msb of the mantissa. ...
    (comp.lang.c)
  • Re: What are the minimum numbers of bits required to represent C std float and double components?
    ... I guess the standards were written to accommodate some ... mantissa sizes to accurately mirror the C standard minimum widths. ... 8 bits for exponent, ... You need 36.54 bits to get 10 digit precision. ...
    (comp.lang.c)
  • Re: Another question about big floating point
    ... ULP over the entire input range, and still having decent performance. ... Say at least as many as the exponent size plus ... mantissa, discarding an equivalent amount of the original precision. ...
    (comp.programming)
  • Re: Another question about big floating point
    ... Say at least as many as the exponent size plus ... mantissa, discarding an equivalent amount of the original precision. ... Size consideration on shorter number may make you go the ...
    (comp.programming)
  • Re: Semi OT: Binary representation of floating point numbers
    ... precision floating point number to double precision after running into ... set out to study the IEEE 754 representation of these numbers. ... and e is the biased exponent which can range from 0 to 255 ... The mantissa is assumed to be normalized which means the msb of its value is shifted into the msb of the mantissa. ...
    (comp.lang.c)