Re: XQ and ->Qpi bug on large X



All this controversy must come from my lack of skill
in developing the topic clearly, I suppose; this post
was actually written last night, while our network was down,
but it seems to address other questions brought up today.

And since I (like any man) was made in the image of God,
and could not be uttering anything but what God put into me,
from the moment of my Creation, it must be the truth,
so pay more attention this time, and stop arguing with me! ;-)

On Thu, 06 Jul 2006 16:29:55 -0500, Eric Smith wrote:

"Round to even" will not introduce statistical bias
unless the data is already in a very small range...
if your data is all in [a small] range,
you shouldn't be rounding to only two digits

[it was Rodger who actually started illustrating
"rounding to even" (or any rounding) for only *two* digits,
which initially prompted my rant here :]

Yes, that's exactly what I meant to illustrate,
except that the term "very small range"
might be worth interpreting a bit broadly,
and "only two digits" might well deserve
to be broadly interpreted as "fewer than 12,"
at least in a calculator supporting 12-digit mantissas :)

But before my trying to justify that again,
let me digress for a moment to examine the results of RAND.

If we look at the leading digits of consecutive RAND
output values, we see little pattern,
but if we look at the trailing digits,
there is a very significant pattern
(or "lack of randomness"),
because of the algorithm used.

That is why we should always use
each complete (12-digit) real value returned by RAND
as a whole, rather than its individual digits:
any time we drop some of the leading digits,
or use any of the trailing digits independently,
what we're left with becomes "less randomly distributed,"
until, at the extreme,
the final few digits are strongly periodic.
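
A toy illustration of the trailing-digit problem, in Python rather
than UserRPL. It assumes a multiplicative congruential generator
modulo 10^15; the multiplier and seed are illustrative assumptions
only and are not claimed to be the constants RAND actually uses.

```python
# Toy multiplicative congruential generator (assumed form, see text).
MODULUS = 10**15
MULTIPLIER = 2851130928467   # illustrative constant; note it ends in 7
SEED = 123456789012343       # illustrative seed; ends in 3

def lcg_states(seed, count):
    """Return `count` successive 15-digit states of the toy generator."""
    state = seed
    states = []
    for _ in range(count):
        state = (MULTIPLIER * state) % MODULUS
        states.append(state)
    return states

states = lcg_states(SEED, 12)

# The last digit depends only on the previous last digit times 7 (mod 10),
# so it cycles with period 4 no matter what the other digits are doing.
last_digits = [s % 10 for s in states]
print(last_digits)           # [1, 7, 9, 3, 1, 7, 9, 3, 1, 7, 9, 3]

# The leading digits involve every digit of the state and show no such cycle.
print([s // 10**12 for s in states])
```

In any generator of this form the low-order digits are the weakest
ones, which is the reason for taking the value as a whole.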

It may not be a sudden leap from "extremely random"
to "extremely non-random" for each number of digits less than 12
that we might use from the output of RAND, but it remains a fact
that we lose *something* every time we shorten the value,
so why use anything less than the whole, whenever we can?

I am trying to make a similar point here about rounding;
when we consider all of the 15-digit "long" mantissas
that arise (in intermediate internal operations)
from the use of typical real-valued calculator functions,
we expect a very wide and somewhat uniform
distribution of that portion of the mantissa at the end
which gets "rounded off" to produce twelve-digit
final results that are carried forward thereafter,
so we accept as valid and useful that the calculator
automatically applies this rounding method internally, at least
as far as rounding all results to the standard twelve digits.

But when we take real data of fewer digits
and round it (for what purpose?), the parts being cut off
are likely to be "less randomly distributed," e.g.
"[in the] statistical distribution of the fraction parts
of randomly chosen [decimal] floating point numbers,...
the leading digit tends to be 1 over 30 percent of the time!"
[Knuth v.2, 4.2.4B]. Such trends persist, to progressively
lesser degrees, as a few more leading digits are taken,
and become least significant, of course, when we consider
all the digits that we can retain in the mantissa.
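
The "over 30 percent" figure can be checked against the logarithmic
leading-digit law usually quoted for this effect, under which
digit d leads with probability log10(1 + 1/d). A small Python sketch
(the variable name is just for illustration):

```python
import math

# Share of values whose leading digit is d, under the logarithmic law.
leading_digit_share = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for digit, share in leading_digit_share.items():
    print(digit, f"{share:.3f}")
# digit 1 gets log10(2), about 0.301; digit 9 gets about 0.046
```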

As such a tendency occurs, of course,
even the "round to even" method, which might be preferable
*if*we*were*to*round*at*all*, begins to lose that quality
which we think it may provide, because of the
progressively less uniform distribution of inputs,
and it actually may not be a good idea at all
to round input values of up to twelve digits to shorter ones
by *any* method (not only *just* "round to even,"
but *including* "round to even"),
prior to using the input values in calculations.

I would go so far as to say that one should
never round values prior to using them in calculations
(unless they don't fit in a 12-digit mantissa to begin with :)
and that rounding (of any sort) should in general
be reserved only for these purposes:

o For convenience in displaying
(not for further use of the data),
as is done in the stack *display* (for FIX/SCI/ENG modes),
while the values on the stack as saved in *memory*
(for further use) do *not* themselves get rounded!

o For necessity in limiting the storage occupied
(for example, we keep only twelve-digit mantissas
in all stored real-valued numeric quantities,
including components of complex numbers and arrays,
so the operating system itself must round
all internal results).

o To present final results for display or printing
(as a means of indicating lack of significance
of any remaining digits).

Of the above purposes, the only one that gains
significant value from the "round to even" method
is the internal automatic rounding done by the calculator itself,
and that is already taken care of in the RPL operating system;
it was basically the effect of the continuing re-rounding and
re-use of rounded data in subsequent calculations that Knuth analyzed,
in the section titled "4.2.2 Accuracy of Floating Point Arithmetic,"
the benefit of which is already built into the OS;
if you think he recommended it elsewhere, please let me know.

For the other purposes (which involve no further use of the data
in calculations), why would we emphasize "round to even"
as being particularly better? In most cases, it never
matters anyway, e.g. 1.45000000001 still rounds to 1.5 either way;
only the precise single value 1.45000000000 rounds differently
under "round to even" (to 1.4) than under the UserRPL RND command (to 1.5),
so I very much sympathize with the HP4x designers' decision
to let both the display rounding and UserRPL RND operate
conventionally, even though it would only have cost one or two
extra system flags to give it several additional modes :)
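
To see how rarely the two rules disagree, here is a sketch using
Python's decimal module as a stand-in for the calculator (an
assumption for illustration; ROUND_HALF_UP plays the part of the
conventional rounding, ROUND_HALF_EVEN the part of "round to even"):

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

TENTH = Decimal("0.1")   # round to one decimal place

def round_both_ways(text):
    """Return (conventional, round-to-even) results for a decimal string."""
    value = Decimal(text)
    return (value.quantize(TENTH, rounding=ROUND_HALF_UP),
            value.quantize(TENTH, rounding=ROUND_HALF_EVEN))

exact_tie = round_both_ways("1.45000000000")
just_above = round_both_ways("1.45000000001")

print(exact_tie)    # (Decimal('1.5'), Decimal('1.4'))
print(just_above)   # (Decimal('1.5'), Decimal('1.5'))
```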

For people who have original input data having only
three significant digits, the value 1.45 can be stored
*exactly* by the OS, so there is no need
to pre-round it before using it for calculations
(and strong reasons not to round it before such use),
even if you think that *final*results* won't be
more accurate than two digits (so then you round
your final results, not the original inputs!)

For display purposes, the built-in FIX/SCI/ENG modes
(and UserRPL RND command, which does the same thing)
happen to use conventional rounding (as in finance),
but we can always use a fairly simple "URND" program
for "round to even," if we are so determined.

If we want to get a much better idea of how significant
are the digits of final results of calculations, then
rather than round input values ahead of time (arggghhhhhhh!,
because it *does*not*improve* the results, to say the least),
why not look to the "interval arithmetic" section in Knuth
(v.2 4.2.2), and use that (at the *end* of calculation)
to decide how many digits are still valid?
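
A minimal sketch of the interval idea in Python, assuming 12-digit
decimal arithmetic: each quantity is carried as a (low, high) pair,
with the low end always rounded down and the high end always
rounded up. The helper names are hypothetical, and the division
assumes the divisor interval does not contain zero.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

DOWN = Context(prec=12, rounding=ROUND_FLOOR)    # for lower bounds
UP = Context(prec=12, rounding=ROUND_CEILING)    # for upper bounds

def interval(text):
    """Enclose a decimal string in a 12-digit (low, high) pair."""
    d = Decimal(text)
    return (DOWN.plus(d), UP.plus(d))

def multiply(x, y):
    """Interval product: take the extremes over all endpoint products."""
    lows = [DOWN.multiply(a, b) for a in x for b in y]
    highs = [UP.multiply(a, b) for a in x for b in y]
    return (min(lows), max(highs))

def divide(x, y):
    """Interval quotient; assumes y does not contain zero."""
    lows = [DOWN.divide(a, b) for a in x for b in y]
    highs = [UP.divide(a, b) for a in x for b in y]
    return (min(lows), max(highs))

one, three = interval("1"), interval("3")
third = divide(one, three)        # (0.333333333333, 0.333333333334)
back = multiply(third, three)     # should still enclose 1

print(third)
print(back)   # (Decimal('0.999999999999'), Decimal('1.00000000001'))
```

The width of the final interval shows how many digits the two
bounds still share, which is the number of digits worth reporting.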

This is where I really rest my case, but by way of
tapping on the outer shell of some further assumptions
that I see made, to see whether perhaps they are hollow inside:

But in a more typical situation, round to even introduces no bias

I remind you that you are still thinking of only one criterion,
which is the unweighted *averaging* of values; other statistics
(mean squares, mean absolute differences, etc.) remain affected
by rounding of any sort, and so does the progressively greater
likelihood that even an average value is affected by
rounding involving fewer digits, because the data ranges
expressed by fewer digits tend to be "less random,"
even though only a little less random for each digit less,
much like the outputs of RAND.
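
A quick check of the "other statistics" point, as a Python sketch
on made-up data (0.00 through 9.99 in steps of 0.01, about as
evenly spread as data can be). Rounding to one decimal by "round
to even" leaves the plain average untouched here, yet the mean
square still moves:

```python
from decimal import Decimal, ROUND_HALF_EVEN

TENTH = Decimal("0.1")
data = [Decimal(k) / 100 for k in range(1000)]          # 0.00 .. 9.99
rounded = [x.quantize(TENTH, rounding=ROUND_HALF_EVEN) for x in data]

def mean(values):
    """Arithmetic mean of a list of Decimals."""
    return sum(values) / len(values)

mean_shift = mean(rounded) - mean(data)
mean_square_shift = (mean([r * r for r in rounded])
                     - mean([x * x for x in data]))

print(mean_shift)          # zero: the average is preserved
print(mean_square_shift)   # positive: the mean square is not
```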

But all this is really moot, because we should never be using
UserRPL RND ourselves before the subsequent use of the value in
further internal computations (that's the job of the built-in OS,
and necessary only when results don't fit into the standard
12-digit mantissa); we should be using a rounding command ourselves
only when we need to abbreviate the data for display,
or to say that final results for external use
are only approximate to a given number of digits.

If some value happens to sit on the razor's edge,
exactly halfway between two displayable values,
so what if the stack display always picks the conventional,
larger value? What it displays is not used for further
computation (the internal value keeps its "already-evened-out"
twelve digit precision), so why should I care?

The point of "round to even" is that it is unbiased (just like
your random selection should be), but it is also deterministic.

Gee, did I forget to mention that "stochastic" (randomized)
rounding wasn't deterministic?
(and even why that's the very reason to examine it,
at least in the event that one insists on performing any rounding
at all, where it would otherwise be a really poor idea to round
certain kinds of data deterministically by *either* the conventional
or "round to even" methods, as I'd hoped to have illustrated).

"Stochastic" rounding doesn't happen to produce
increasing bias as the input data range shrinks,
as does almost any deterministic rounding, which was
just an interesting point that I wanted to bring out
(it sure has attracted more attention than I figured, however :)
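
A sketch of what is meant by "stochastic" rounding, in Python with
a seeded generator so the run is repeatable (the function name,
seed, and data are illustrative assumptions). A value is rounded up
with probability equal to how far it sits above the lower candidate,
so even data confined to a single value averages out correctly:

```python
import random
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_EVEN

TENTH = Decimal("0.1")

def stochastic_round(value, step, rng):
    """Round to a multiple of `step`, going up with probability equal to
    the fraction of a step by which `value` exceeds the lower multiple."""
    lower = (value / step).to_integral_value(rounding=ROUND_FLOOR) * step
    fraction = (value - lower) / step            # exact, in [0, 1)
    return lower + step if rng.random() < float(fraction) else lower

rng = random.Random(2006)                        # fixed seed for repeatability
data = [Decimal("1.45")] * 10000                 # the narrowest possible range

stochastic = [stochastic_round(x, TENTH, rng) for x in data]
to_even = [x.quantize(TENTH, rounding=ROUND_HALF_EVEN) for x in data]

stochastic_mean = sum(stochastic) / len(stochastic)
even_mean = sum(to_even) / len(to_even)

print(even_mean)         # 1.4 exactly: every value was pushed the same way
print(stochastic_mean)   # close to 1.45
```

The price is the one already noted: the same input no longer
always rounds to the same output.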

By way of summary:

"No rounding rule can be best for every application"
[Knuth v2, in 4.2.2A, following Theorem C]

Perhaps people think that I've been against those things
which "round to even" is good for, but what I've been trying
to do is bring out what rounding *in*general* isn't good for,
and even to sound a note of caution against thinking that "round to even"
is applicable to situations other than where it's indicated,
which is for further numerical floating-point calculations
when mantissa precision is limited to a fixed size
(is this not what Knuth has taken all this care to say?)

I've also explored a bit into other forms of rounding
that we probably never think of, which in some special cases,
if we were to round at all, would actually fit the bill just right.

When human beings use pencil and paper, there is a strong motivation
to round original input values, because to carry more digits
increases the manual labor of computation, which Leibniz
found quite distasteful ("It is unworthy of excellent men
to lose hours, like slaves, in the labor of computation");
however, the calculator doesn't mind, and isn't even spared
any efforts by rounding, so unless it were to improve *results*,
I don't see why we'd be eager to do it at all while we're
manipulating our numbers inside the calculator -- it does
save us some typing time to round "in our own heads,"
*before* we actually give data to the calculator,
and if we are doing integer calculations, say,
it might help to round intermediate results to integers,
but these are all very special exceptions.

If we were to persuade the UserRPL RND command, however
(which is meant for other purposes, most certainly
not for improving the accuracy of results, because *user*
rounding basically "fuzzies" otherwise distinctly preservable
results) to by default round 1.45 to 1.4, say, that would
raise quite a stink, especially in finance and education!

So I maintain that "round to even" is absolutely the way to go,
in all of the internal "round long(15) to real(12)" machinations
which the processor and the OS *have*to* do, and at the same time
is absolutely not the way to go as a standard for
either display rounding or UserRPL RND.

Extra pre-conditioning of inputs to shorter than 12 digits
by gratuitous user rounding (even if "to even") is not necessary;
it certainly does not improve upon the results
of the built-in real-valued arithmetic,
and it can sometimes worsen otherwise proper results;
these are all the reasons for not doing it.

That's the best I can do in this rough draft
(but just wait until next year :)

So have we exhausted "Round Two," even?

Over to you... Rodger?
.


