Re: Sorting routine
- From: mhx@xxxxxx (Marcel Hendrix)
- Date: Sat, 6 Sep 2008 02:53:03 +0200
"dkelvey@xxxxxxxxxxx" <dkelvey@xxxxxxxxxxx> writes Re: Sorting routine
On Sep 5, 3:19 pm, m...@xxxxxx (Marcel Hendrix) wrote:
"dkel...@xxxxxxxxxxx" <dkel...@xxxxxxxxxxx> wrote Re: Sorting routine
On Sep 4, 12:36 pm, m...@xxxxxx (Marcel Hendrix) wrote:
"dkel...@xxxxxxxxxxx" <dkel...@xxxxxxxxxxx> writes Re: Sorting routine
[..]
\ FORTH> test ( 3 GHz PIV )
\ Creating 1,000,000 input values.
\ Sorting .. 0.126 seconds elapsed. ok
Hi Marcel
This looks right. There is room for improvements. Using RSHIFT may or
may not be the best way to parse out bytes but other methods would
be machine dependent. On a byte addressing machine it might be
more efficient to just fetch the byte at a time, rather than fetching
the entire value and then extracting the byte. On my NC4000 that
had no byte addressing, I made a hardware register at -1 that would
do a byte swap. The endian problem makes it less portable as well.
On a PC, the hardware fetches many 32-bit words at once.
As I mentioned, the two arrays, index and counts could be combined.
If kept separate, for the last pass to sort the signed values, one can
create the index array starting from the 128th entry and then
wrap back to 0 after 255, instead of starting from zero. One would
need to keep the two arrays in this method. This would only be done
as the last step when doing the byte with the sign for 2's complement.
I'll leave the complications for somebody else :-) I like the simplicity
of the current code.
I'm assuming that RSHIFT does a byte sized shift?
n RSHIFT shifts n bits.
One can do a small optimization bye building the counts and index
tables for each byte at once. This save needing to refetch the entries
for each pass to do this calculations. One still needs to run the
last step for each byte separately in order.
I am not so sure this will do much (besides decreasing size).
If there are a lot of bytes in the values, one can setup such a loop
that one does the counting at the same time as one does the placing
in the output array. This would make the first pass and last pass
unique. This would be handy for sign sorting of 2's complement.
Again, I'll gladly leave the complications to other enthusiasts.
Have you rerun any of the other benches with the faster processor
and using a 1M entries?
In all cases random numbers were used to initialize the arrays.
FORTH> simple-bench1M ( 3 GHz PIV )
Sedge sort - 1000000 elements, 174 ms (needs 128889 passes)
Quicker sort - 1000000 elements, 191 ms (needs 518998 passes)
Quick sort - 1000000 elements, 195 ms (needs 771816 passes)
Comb sort - 1000000 elements, 317 ms (needs 54 passes)
Heap sort - 1000000 elements, 913 ms (needs 19549621 passes)
So Radix/Distribution sort is about 1.4 times faster than Sedge's Sort.
However, no serious optimization effort has been done yet for either sort.
-marcel
.
- Follow-Ups:
- Re: Sorting routine
- From: Marcel Hendrix
- Re: Sorting routine
- From: Marcel Hendrix
- Re: Sorting routine
- References:
- Re: Sorting routine
- From: dkelvey@xxxxxxxxxxx
- Re: Sorting routine
- Prev by Date: Re: Sorting routine
- Next by Date: Re: Serial Ascii Editor
- Previous by thread: Re: Sorting routine
- Next by thread: Re: Sorting routine
- Index(es):
Relevant Pages
|
Loading