Re: Sorting large amounts of floats
- From: fpga_toys@xxxxxxxxx
- Date: 20 Jan 2006 23:49:47 -0800
Keith O'Conor wrote:
> Hi,
> I'm trying to figure out the best way to sort a large amount
> (thousands) of floats or fixed-point data on an FPGA. With this amount
> of data, it needs to be stored in RAM because it obviously won't fit in
> registers, but this means that there can only be one access to that RAM
> every clock cycle.
Great, you at least know where the bottleneck is. The first rule in
configurable computing is to stop using central memories, as they
are unavoidably single threaded. Using lut based rams suddenly
gives you thousands of little memories with concurrent accesses.
Frequently the next step is to stop thinking parallel, and go serial,
which immediately suggests divide and conquer in a massively
parallel way. Consider a 2 bit in, 2 bit out node with a tiny
statemachine
that resets on word clock, passes bits in to out till they are
different,
then latches one of those as the mux selector, and continues passing
bits till the next word clock in this selected state. These butterflies
can be cascaded 2^n deep to concurrently sort thousands of serial
data streams with a one clock latency per stage, and a log2(width)
latency for the whole sort.
> Since any sorting algorithm will need access to at
> least two pieces of data to compare, I can't figure out how to
> parallelize the sorting. Has anyone got any experience in this area? I
> would really appreciate any advice.
>
> Thanks,
> Keith
I hope this wasn't your class assignment in reconfigurable computing
for parallel processing applications. Monday or tuesday I will post
Beta-2
for FpgaC, which has LUT RAMs for small arrays along with structures
and a some other improvements. If you aren't currently taking a
parallel
programming class, it might be useful to at least review the course
notes
from a major university and pickup one of the texts, as it will provide
you
a lot of insight about approaching these problems for traditional
processors
which can be easily applied to reconfigurable computing.
http://sourceforge.net/projects/fpgac
When the beta-2 is available for downloading look at the users manual
in
docs, and the examples. Setting up multiple stages of concurrent
pipelines
is pretty easy in traditional C, with a cute trick of building the
procedure
block backwards ... bottom to top data flow, rather than top to bottom.
have fun,
John
.
- References:
- Sorting large amounts of floats
- From: Keith O'Conor
- Sorting large amounts of floats
- Prev by Date: Virtual Pin in Xilinx ISE
- Next by Date: EDK 8.1, Finally!
- Previous by thread: Re: Sorting large amounts of floats
- Next by thread: Security of Xilinx Virtex2 Pro
- Index(es):