Re: high level languages for synthesis
- From: "Robin Bruce" <robin.bruce@xxxxxxxxx>
- Date: 28 Aug 2006 07:01:38 -0700
There seems to be little attempt here to categorise the tools
themselves. Are we talking solely about cycle-accurate HLLs or not? The
differences between a tool like Handel-C and a tool like Impulse-C are
enormous. I would argue that the non-cycle-accurate high-level
languages are more distinct from their cycle-accurate cousins than
either of them is from HDLs.
The cycle-accurate languages seem to suit a spiral methodology that
begins with high levels of abstraction. Functional models precede a
hardware-software partition and the fine details are filled in
gradually, with functional testing at every stage. At the end of the
design process you are almost as close to the hardware as you would
have been if you designed in VHDL in the first place, still having to
understand what's happening on every clock cycle. Handel-C and
SystemC fit into this category. The advantage of these languages is
that you are working at the system level from the outset. It's not so
straightforward to follow a spiral design methodology in VHDL, which is
much better suited to building and testing all your components fully
before you bring them together. That's a nasty time to find that there
are problems with the system-level design, as the components themselves
may need to be redesigned. These languages make sense for large projects
where timescales are tight, budgets are large and you want to mitigate
the risks of design respins. See Jonathan Feifarak's paper at last
year's MAPLD conference for an example:
http://klabs.org/mapld05/papers/
Non-cycle-accurate languages are targeted at an entirely different
crowd, and are more suited for general-purpose reconfigurable
computing. Examples of these languages are Impulse-C, SRC's Carte
Programming Environment, Mitrionics' Mitrion-C, and Nallatech's DIME-C.
They get their performance speed-ups from a mixture of spatial and
temporal parallelism.
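To make the two kinds of parallelism a little more concrete, here is a rough sketch in plain C. It is not written in any particular vendor's dialect, the function names are invented for illustration, and how a given tool actually schedules these loops is an assumption rather than something taken from its documentation.

```c
#include <stddef.h>

/* Every iteration depends only on the inputs at index i.
 * - Spatial parallelism: the two multiplies are independent of each
 *   other, so a hardware compiler is free to build two multipliers
 *   that work side by side.
 * - Temporal parallelism: iteration i+1 does not need the result of
 *   iteration i, so successive iterations can overlap in a pipeline. */
void scale_and_add(const float *a, const float *b,
                   const float *c, const float *d,
                   float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float p = a[i] * b[i];
        float q = c[i] * d[i];
        out[i] = p + q;
    }
}

/* Loop-carried dependency: each iteration needs the previous value of
 * y, which limits how closely successive iterations can be overlapped. */
float running_recurrence(const float *a, size_t n, float seed)
{
    float y = seed;
    for (size_t i = 0; i < n; i++) {
        y = y * 0.5f + a[i];
    }
    return y;
}
```

The first loop is the shape these tools like best; the second is the kind of loop a user would typically restructure before expecting much of a speed-up.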
These languages are aimed at users that are not necessarily familiar
with FPGA development, or even low-level programming. High-Performance
Computing users are those who have most to gain from general-purpose
reconfigurable computing. Paradoxically these users, who have the
greatest need of high performance, are often not the power users one
would expect, because they are application-domain specialists first.
They may be used to writing C, but as an alternative to
FORTRAN and not as an alternative to assembly language. To expect such
people to be able to take quickly to VHDL is to misunderstand who these
users are. These people do not want to design PCI cores, but instead
want to compare genetic sequences, implement CFD simulations and
implement a wide variety of other scientific algorithms. These
algorithms exist in C and FORTRAN presently and people are now looking
to port them to FPGAs. The tools do not make it possible to simply
compile pre-existing C applications and receive a 500X speed-up, but they
instead offer an environment in which all the nasty complexities of
FPGA design are abstracted. Users can adapt the algorithms to best suit
the hardware compiler. No clock periods, no PCI cores to design, no
memory controllers to worry about, no pins to worry about. The tool
only makes sense as part of an integrated platform that makes all of
this possible. Many of these tools are now beginning to lean towards a
library-based development process, where the most frequently used
functions are implemented, ideally as pipelined cores, in a traditional
HDL process and then integrated into the tool.
This kind of development also has a place within High-Performance
Embedded Computing. Your application may interface to sensors and
actuators etc., but in the world of the billion-plus transistor FPGA
some seriously complicated algorithms can be implemented. At this stage
it makes sense to develop the external interfaces in a traditional HDL
manner, but develop the main algorithm in a high-level language. This
allows for a lot more experimentation with the main algorithm. You
don't give up half as much performance as you think you might in doing
things this way and you could react to market opportunities a lot more
quickly. Once you've settled on the structure of your algorithm, you
have the option of recreating this architecture in HDL for a
performance increase. You could either do this before you release the
first version of your design to users, or you could get there first
with the HLL version then follow it up later with the optimised
version.
A final point is that we really need to nail down this fallacy that an
FPGA-targeting C-syntax HLL is somehow inherently sequential. It's not.
The compiler determines that. FPGAs are best for algorithms where there
are large data sets. They offer the best performance when the
algorithms (or their main computational loops) are pipelinable. For
loops should be pipelined as a matter of course wherever possible.
Below is a quick example of what I mean. It's a DIME-C design that
implements the normal probability density function. The array sizes are 8192
here, as I've implemented the memory in BRAM, though I could have had
these arrays packed into SRAMs, so they could be a lot bigger. The
entire for loop is pipelined, so the whole thing will take N + latency
cycles to execute. In this case latency is about 85 cycles. I would
expect a clock rate of 120-150MHz on V4 for this, but I haven't
actually built this.
It took me 15 minutes to do this, and I'm a real slowcoach. I wonder
how long it would take me to do with VHDL...
/* Project to implement probability density function
   on V4 device
   Robin Bruce
   28/08/06 */
#include "math.h"
#define SIZE 8192
#define PI 3.1415926535897
#define ROOT_2PI 2.506628274

void probability_density(float x[SIZE], float mu[SIZE], float sigma[SIZE],
                         float phi[SIZE], int N)
{
    int i = 0;
    float sigma_sqr = 0.0;
    float dif_x_mu_sqr = 0.0;
    float dif_x_mu = 0.0;
    float sigma_local = 0.0;
    float x_local = 0.0;
    float mu_local = 0.0;

    for(i=0; i<N; i++){
        sigma_local = sigma[i];
        mu_local = mu[i];
        x_local = x[i];
        sigma_sqr = sigma_local * sigma_local;
        dif_x_mu = x_local - mu_local;
        dif_x_mu_sqr = dif_x_mu * dif_x_mu;
        phi[i] = (1.0 / (sigma_local * ROOT_2PI)) *
                 expf(-( dif_x_mu_sqr / (2 * sigma_sqr) ));
    }
}
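As a back-of-envelope check, the N + latency estimate for the example can be turned into rough wall-clock numbers. This is a sketch in ordinary host-side C, not DIME-C; the helper names are made up for illustration, and it assumes the pipeline accepts one new loop iteration per clock, which is what the N + latency figure implies. The 120-150MHz clock rates are the unbuilt estimates quoted with the example, not measurements.

```c
/* Cycles for a fully pipelined loop that starts one iteration per
 * clock: one cycle per iteration plus the time for the last
 * iteration to drain through the pipeline. */
static unsigned long pipelined_cycles(unsigned long n, unsigned long latency)
{
    return n + latency;
}

/* Convert a cycle count to microseconds at a given clock rate in MHz
 * (cycles / MHz = microseconds). */
static double cycles_to_microseconds(unsigned long cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}
```

With N = 8192 and a latency of 85 this gives 8277 cycles, which is roughly 69 microseconds at 120MHz and roughly 55 microseconds at 150MHz.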
KJ wrote:
> <fpga_toys@xxxxxxxxx> wrote in message
> news:1156625298.673980.153570@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> KJ wrote:
>>> 'Works as designed' is a given (except for the occasional bugs that
>>> pop up in the compiler/synthesis tool itself). 'Works as intended' is
>>> another thing entirely where the right language (whatever that may
>>> be) and a skilled user/designer are probably the biggest aids in
>>> minimizing the gap between 'as designed' and 'as intended'.
>>> I know, I'm just stating the obvious here ;)
>> yes, and it can not be said enough times. The turning point is a
>> complexity * (probability of mistake) product which predicts the
>> likelihood of making errors.
>> When you build circuits a bit at a time, then a 2M bit device will have
>> a development error rate of 2M times the probability of a bit programmer
>> error. If those 2M bits are better described as 50 lines of HLL, the
>> error rate is a different product based on 50 times the probability of
>> an HLL coding error. This second probability can be relatively low, or
>> in some cases very high too (such as a C programmer writing their first
>> complex Lisp). In addition, debug time is directly proportional to
>> expression size ... looking for a needle in a haystack problem, vs
>> sorting thru 50 pins.
>> HLL's tend to leverage smaller well defined complexity, to produce
>> lower overall system probability of error.
> I agree that every line of code is a probability for design error, but
> in my opinion, the various languages being bandied about for doing
> hardware design can be equally used to create either succinct code that
> describes the functionality concisely or misused to describe the
> functionality 'bit at a time' as you say (I'm interpreting that to
> mean....bloated....lots of code that could've been written more
> concisely).
> Whether you get the concise code or the bloated code for a particular
> hardware design I've found is simply a function of
> - The skill level of the person with the hardware design language that
>   they are using.
> - The skill level of the person in hardware design.
> I could be wrong, but I haven't seen anything to indicate that the
> actual language used itself is an important factor.
> KJ