Re: Polynomial Function ...



On Tue, 1 Sep 2009 00:38:54 -0700 (PDT), Kappa <secureasm@xxxxxxxxx> wrote:

Hi,

I have one analog signal input. I want to apply at analog signal a
transfer function of the type :

y = a0 + a1 * x + a2 * x^2 + a3 * x^3 + a4 * x^4 + a5 * x^5 + a6 * x^6
+ a7 * x^7 + a8 * x^8 + a9 * x^9

Polynomial semplification :

y = a0 + x * (a1 + x * (a2 + x * (a3 + x * (a4 + x * (a5 + x (a6 + x *
(a7 + x * (a8 + x * a9))))))))

There are specific techniques and/or optimization for building this
polynomial on FPGA ?

Kappa.

There are any number of ways to implement it; as Jonathan said your pipelined
implementation is easy and cheap (though you will inevitably lose resolution
through the successive multiplications; see below)

His suggestion of interpolation is good, though I tend to add a quadratic term
to guarantee accuracy (this requires three multiplications rather than one; but
is accurate enough for single precision floating point; i.e. 24-bit or better
accuracy, on square root, reciprocal, etc)

Or if you prefer a direct implementation, consider the range of coefficient
values for your higher order terms; they are typically small. Consider the
rounding error imposed by truncating the product of x and the largest
coefficient; you can reduce the resolution of the smaller multiplications until
they introduce a similar (preferably still smaller) error with economic benefits
(reduced resources)

Or if latency is important, note that the slowest path,
a9 * x^9 = (a9 * x) * ((x*x) * (x*x)) * (ditto)
which takes 4 cycles to compute; plus a fifth to add it to the (parallel
computed) sum of the faster paths. (Implementation issues may require additional
cycles to move data between multipliers)

Only you can determine if the loss of resolution due to e.g. truncating
multiplier outputs,or economising on word widths is acceptable; i.e. within your
error budget.

This can be exhaustively tested in simulation (for a single variable input as
here) by a testbench implementing comparison with an exact solution, and logging
the magnitude and spread of errors. For a single-precision FP square root unit
(quadratic interpolator) I tested all 2^24 input values in simulation in about
an hour; you can afford to do this after any major design change if accuracy is
important; normally I only re-tested the most sensitive region.

But for prototyping the algorithm and testing e.g. the accuracy of an
interpolated LUT, I suggest starting with a simple spread***! You can truncate
values to 16 (or 19) bits and quickly investigate tradeoffs to get a pretty good
idea of what you are doing before moving to VHDL.

- Brian
.


Loading