Re: Systolic array architectures
- From: Kolja Sulimma <news@xxxxxxxxxx>
- Date: Thu, 06 Oct 2005 09:01:07 +0200
timotoole@xxxxxxxxx wrote:
> Systolic array architectures are commonly used for image/video
> compression hardware blocks (e.g. convolution filters, motion
> estimation, etc.). I loosely have an idea that this is because they are
> efficient at reusing the data, and thus reduce memory accesses in
> comparison to, say, a custom-designed high-throughput single processing
> element. Would this generally be considered the principal benefit, and
> are there other benefits?
>
> I have read that they are considered "i/o bandwidth efficient". I guess
> that's just another way of saying what I've just outlined above?
>
> Are there ever scenarios where the area and switching overhead of a
> systolic array would warrant a less bandwidth-efficient, more serial
> approach, or is that just plain ridiculous to consider?
In fact, only the serial solution has an area and switching overhead for
each computation, because it has to fetch operands and move intermediate
results around every operation. The systolic implementation only
computes, without doing anything else. See below.
> For example could you hope to trade less switching in the datapath for
> increased switching in the memory accesses but still make an overall
> reduction in switching?
If your algorithm dictates that you add two numbers, you cannot save
the switching of the adder. But you might be able to save the memory access.
Whenever you have an algorithm that can be implemented systolically
without increasing the net number of operations performed (filters,
Smith-Waterman, etc.), the systolic implementation must be a lot more
efficient.
If you have to perform the operations anyway, it is always more
efficient to perform them right away, when the data is available, than
to send intermediate results a long distance across the chip or even off
chip. The latter will slow down the process and consume a lot of power.
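
To make the memory-access argument concrete, here is a minimal Python sketch (not from the original post) that models a small FIR filter two ways: a single processing element that re-reads its operands from memory for every output, and a systolic-style chain of cells that keep samples in local registers. The function names `fir_serial` and `fir_systolic` and the read counters are hypothetical, and the model only counts sample reads; it says nothing about real area or power.

```python
def fir_serial(samples, taps):
    """Single processing element: every operand is fetched from memory again."""
    reads = 0
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, coeff in enumerate(taps):
            if n - k >= 0:
                acc += coeff * samples[n - k]
                reads += 1  # one memory read per multiply-accumulate
        out.append(acc)
    return out, reads


def fir_systolic(samples, taps):
    """Chain of cells: each sample is read once, then passed cell to cell."""
    regs = [0] * len(taps)  # one local register per cell (assumed model)
    reads = 0
    out = []
    for x in samples:
        reads += 1              # the only memory read in this cycle
        regs = [x] + regs[:-1]  # data shifts one cell along the chain
        out.append(sum(c * r for c, r in zip(taps, regs)))
    return out, reads


samples = [1, 2, 3, 4, 5, 6, 7, 8]
taps = [1, 2, 1]
serial_out, serial_reads = fir_serial(samples, taps)
systolic_out, systolic_reads = fir_systolic(samples, taps)
assert serial_out == systolic_out    # same results either way
print(serial_reads, systolic_reads)  # 21 8
```

Both versions compute the same outputs, but the serial one reads roughly one sample per tap per output, while the chain reads each sample exactly once.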
However, this is the overall area/delay efficiency (computations per
time per area). If you have a fixed bandwidth goal, perfect efficiency
doesn't help you much if you have enough data available to utilize the
hardware only 1% of the time. In that case you go to a more serial
solution to save hardware. But the area/time/energy per computation
increases in that case.
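
The utilization point can be illustrated with a rough sketch. The helper `utilization` and the numbers below are hypothetical, and the model is deliberately crude: it only shows how much of an array sits idle when the data rate is fixed, not the extra per-operation cost of a serial design.

```python
def utilization(cells, ops_needed_per_cycle):
    """Fraction of the available compute that a fixed data rate can keep busy."""
    return min(1.0, ops_needed_per_cycle / cells)


# Assumed workload: the data only arrives fast enough for 0.5 operations per cycle.
demand = 0.5

wide_array = utilization(cells=50, ops_needed_per_cycle=demand)
single_pe = utilization(cells=1, ops_needed_per_cycle=demand)

print(wide_array)  # 0.01, i.e. the 50-cell array is busy 1% of the time
print(single_pe)   # 0.5, one processing element is busy half the time
```

Under these assumptions the 50-cell array spends 99% of its area-time idle, which is why a more serial design can be the better choice for a fixed bandwidth target even though each computation costs more.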
More complicated are problems where the systolic algorithm requires more
operations than the serial algorithm. In that case you need to trade off
the gains per computation of the systolic implementation against the
smaller number of computations of the serial algorithm.
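
One way to frame that trade-off is to compare total cost as operation count times cost per operation. The sketch below is only an illustration: `total_cost`, `systolic_wins`, and all the figures are made up, and "cost" stands for whichever of area-time or energy you care about.

```python
def total_cost(num_ops, cost_per_op):
    """Total cost of a run: operation count times cost per operation."""
    return num_ops * cost_per_op


def systolic_wins(ops_serial, cost_serial, ops_systolic, cost_systolic):
    """True if the systolic version is cheaper overall despite doing more operations."""
    return total_cost(ops_systolic, cost_systolic) < total_cost(ops_serial, cost_serial)


# Hypothetical numbers: the serial operation is 10x more expensive
# because of memory traffic around each computation.
print(systolic_wins(1000, 10, 4000, 1))   # True: 4000 < 10000
print(systolic_wins(1000, 10, 20000, 1))  # False: 20000 > 10000
```

In other words, the systolic version stays ahead as long as its operation count grows by less than the factor it saves per operation.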
Kolja Sulimma