Re: FPGA-based hardware accelerator for PC



In article <1147155282.274065.16140@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
JJ <johnjakson@xxxxxxxxx> wrote:

Phil Tomson wrote:
In article <1146975146.177800.163180@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
JJ <johnjakson@xxxxxxxxx> wrote:


snipping


FPGAs and standard cpus are a bit like oil & water, they don't mix very
well: very parallel or very sequential.

Actually, that's what could make it the perfect marriage.

General purpose CPUs for the things they're good at like data IO,
displaying information, etc. FPGAs for applications where parallelism is
key.


On c.a another Transputer fellow suggested the term "impedance
mismatch" to describe the idea of mixing low speed extreme parallel
logic with high speed sequential cpus in regard to the Cray systems
that have a bunch of Virtex Pro parts with Opterons on the same board,
a rich man's version of DRC (but long before DRC). I suggest tweening
them: put lots of softcore Transputer-like nodes into the FPGA and
customize them locally, so you can put software & hardware much closer
to each other. One can even model the whole thing in a common language
designed to run as code or be synthesized as hardware with suitable
partitioning, starting perhaps with occam or Verilog+C. Write as
parallel and sequential code and later move parts around to hardware or
software as needs change.

I think the big problem right now is conceptual: we've been living in a
serial, Von Neumann world for so long we don't know how to make effective
use of parallelism in writing code - we have a hard time picturing it.

I think the software guys have a huge problem with parallel, but not
the old schematic guys. I have more problems with serial, much of it
unnecessary but forced on us by a lack of language features, which
forces me to order statements that the OoO cpu will then try to
unorder. Why not let the language state "no order" or just plain "par"
with no communication between?
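
To make the "par" idea concrete, here is a minimal sketch in Python (not occam, and not anything proposed in this thread). The helper name par and the two worker functions are made up for illustration. It only shows the programmer declaring that two statements have no ordering between them and share no communication, leaving the runtime free to schedule them however it likes.

```python
from concurrent.futures import ThreadPoolExecutor


def par(*thunks):
    """Run independent zero-argument callables with no stated order.

    Hypothetical helper: results come back in argument order, but the
    callables themselves may execute in any order or overlap in time.
    Expects at least one callable.
    """
    with ThreadPoolExecutor(max_workers=len(thunks)) as pool:
        futures = [pool.submit(t) for t in thunks]
        return [f.result() for f in futures]


def sum_of_squares(xs):
    return sum(x * x for x in xs)


def count_evens(xs):
    return sum(1 for x in xs if x % 2 == 0)


data = list(range(10))

# Neither computation reads the other's result, so no order is implied.
squares, evens = par(lambda: sum_of_squares(data),
                     lambda: count_evens(data))
print(squares, evens)
```

In a serial language the two calls would have to be written one after the other even though nothing requires it; here the only ordering left is the join at the end.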

Read some software engineering blogs:
with the advent of things like multi-core processors, the Cell, etc. (and
most of them are blissfully unaware of the existence of FPGAs) they're
starting to wonder about how they are going to be able to model their
problems to take advantage of that kind of parallelism. They're looking

The problem with the Cell and other multicore cpus is that the cpu is
all messed up to start with. AFAIK the Transputer is the only credible
architecture that considers how to describe parallel processes and run
them based on formal techniques. These serial multi cpus have the
Memory Wall problem as well as no real support for concurrency except
at a very crude level: context switches need to cost closer to 100
instruction cycles to work well, not 1M. The Memory Wall only makes
threading much worse than it already was and adds more pressure to the
cache design as more thread contexts try to share it.

I wasn't singing the virtues of any particular parallel architecture (like
the Cell) - I brought it up to say that these architectures are now
becoming known in the software engineering world and there are a lot of
folks in that camp wondering how we're going to effectively develop
software for them.



for new abstractions (remember, software engineering [and even hardware
engineering these days] is all about creating and managing abstractions).
They're looking for and creating new languages (Erlang is often mentioned
in these sorts of conversations). Funny thing is that it's the hardware
engineers who hold part of the key: HDLs are very good at modelling
parallelism and dataflow. Of course HDLs as they are now would be pretty
crappy for building software, but it's pretty easy to see that some of the
ideas inherent in HDLs could be usefully borrowed by software engineers.



Yeh, try taking your parallel expertise to the parallel software
world. They seem to scorn the idea that hardware guys might actually
know more than they do about concurrency, while they happily reinvent
parallel languages that have some features we have had for decades
and still cling to semaphores and spinlocks.

You have to masquerade as a software guy ;-)

I came across
one such parallel language from U T Austin that even had always,
initial and assign constructs but no mention of Verilog or hardware
HDLs.

But there are more serious researchers in Europe who are quite
comfortable with concurrency as parallel processes like hardware, from
the Transputer days based on CSP, see wotug.org. The Transputer's
native language, occam, based on CSP, later got used to do FPGA design
and was then modified into Handel-C, so clearly some people are happy
to be in the middle.
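
For anyone who hasn't seen the CSP style, a rough sketch in Python of processes that share nothing and talk only over channels. This is an approximation, not occam: occam channels are unbuffered rendezvous, while queue.Queue(maxsize=1) used here is a one-slot buffer. The names producer, doubler, run_pipeline and the DONE sentinel are invented for the example.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (an assumption of this sketch)


def producer(out_ch, n):
    """Process 1: emit 0..n-1 on its output channel, then signal the end."""
    for i in range(n):
        out_ch.put(i)
    out_ch.put(DONE)


def doubler(in_ch, out_ch):
    """Process 2: read a value, double it, pass it on."""
    while True:
        item = in_ch.get()
        if item is DONE:
            out_ch.put(DONE)
            return
        out_ch.put(item * 2)


def run_pipeline(n):
    """Wire the two processes together, like connecting two modules."""
    a = queue.Queue(maxsize=1)  # channel: producer -> doubler
    b = queue.Queue(maxsize=1)  # channel: doubler -> collector
    procs = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=doubler, args=(a, b)),
    ]
    for p in procs:
        p.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for p in procs:
        p.join()
    return results


print(run_pipeline(5))
```

Note there are no semaphores or spinlocks in the user code; the only synchronization is the blocking put and get on each channel, which is the part that maps onto wires and handshakes in hardware.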

I have proposed taking a C++ subset and adding live signal ports to a
class definition as well as always, assign etc. It starts to look a lot
like a Verilog subset with C syntax, but it builds processes as
communicating objects (or module instances) which are nestable of
course, just like hardware. The runtime for it would look just like a
simulator with an event driven time wheel or scheduler. Of course in a
modern Transputer the event wheel or process scheduler is in the
hardware so it runs such a language quite naturally, well that's the
plan. Looking like Verilog means RTL type code could be "cleaned" and
synthesized with off the shelf tools rather than having to build that
as well, and the language could be open. SystemVerilog is going in the
opposite direction.
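
A toy illustration of what such a runtime might look like, written in Python rather than the C++ subset described above. It is only a sketch under my own assumptions: the Signal and Simulator classes, the always method and the one-tick default delay are all hypothetical, and a real time wheel would be considerably more involved. A priority queue stands in for the event wheel, and an "always" block is just a callback that runs when a signal changes.

```python
import heapq
import itertools


class Signal:
    """A live signal port: holds a value and wakes listeners on change."""

    def __init__(self, sim, value=0):
        self.sim = sim
        self.value = value
        self.listeners = []

    def set(self, value, delay=1):
        # Updates are never immediate; they are scheduled as future events.
        self.sim.schedule(delay, self, value)


class Simulator:
    """Event-driven scheduler; a heap stands in for the time wheel."""

    def __init__(self):
        self.now = 0
        self._queue = []
        self._seq = itertools.count()  # tie-breaker for same-time events

    def schedule(self, delay, signal, value):
        event = (self.now + delay, next(self._seq), signal, value)
        heapq.heappush(self._queue, event)

    def always(self, signal, callback):
        """Rough analogue of 'always @(signal)': run callback on change."""
        signal.listeners.append(callback)

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.now, _, signal, value = heapq.heappop(self._queue)
            if signal.value != value:
                signal.value = value
                for callback in signal.listeners:
                    callback()


def demo():
    sim = Simulator()
    a = Signal(sim, 0)
    y = Signal(sim, 1)
    # An inverter process: y follows NOT a, one time unit later.
    sim.always(a, lambda: y.set(1 - a.value))
    a.set(1, delay=5)
    sim.run(until=10)
    return y.value, sim.now


print(demo())
```

The processes never call each other directly; they only drive signals, and the scheduler decides what runs next. That is the piece that would live in hardware on the Transputer-like node.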

snip

The real money I think is in the problem space where the data rates are
enormous with modest processing between data points such as
bioinformatics. If you have lots of operations on little data, you can
do better with distributed computing and clusters.

Yes, bioinformatics can be a good application space for FPGAs (depending
on what you're doing)

snip


Transputers & FPGAs two sides of the same process coin


Are there any transputers still being made?

Phil


