Re: A Fast sorting algorithm for almost sorted data
- From: moogie <budgetanime@xxxxxxxxxxxxxx>
- Date: 9 May 2007 20:10:18 -0700
The probability for entry i with occurrences j in list k is j/sum(all
entries in k). If you keep an explicit count (like the last number in
your example sequences), then you merely need to look up the count for
an item and divide it by your count for that sequence. The entire
probability updating and reordering algorithm is quite simple...
counts = [0,0,0,...,0]
count = sum(counts)
for posn in seen:
counts[posn] += 1
new = posn
while new and counts[new-1] < counts[posn]:
new -= 1
if new != posn:
counts[new], counts[posn] = counts[posn], counts[new]
The probability of the symbol in position i is merely counts[i]/count.
That will give you a value between 0 and 1, which you can then combine
as you do now.
I think i missunderstood your initial point... this is exactly what i
am doing now... i was under the impression that you thought by not
using the double value at all is better (i.e. faster) but i could not
see how to do so given that i need to combine the predictions of
different models.
are you saying i should sort each model and then combine them? I will
still need to sort the symbols based on the combined probabilities...
You can speed up the comparison phase with a binary search, or a
doubling followed by a binary search, and using the swapping method
above can get you down to an O(log(po-pn)) bubble mechanism, where po is
the original position and pn is the new position. It can have some
strange behavior...
[5_4, 5_3, 5_2, 5_1]
increment the last one
[5_4, 5_3, 5_2, 6_1]
shift it using the above algorithm...
[6_1, 5_3, 5_2, 5_4]
Note how the 5_4 and 6_1 swapped positions? Yeah.
Nice!, i will have to think on how this is actually working :) i dont
get it just at the moment. but sounds like it should be useful!
Making huffman coding reasonably fast can be difficult. That's one of
the reasons why I implemented a tree-based arithmetic coder. It is
fast, can handle a practically limitless number of entries, and is able
to save some fractions of a bit for each encoded symbol.
I had improved my huffman compressor/decompressor but as you said it
rapidly slowed down once my number of symbols started to get beyond
6000.
You don't necessarily need to use the arithmetic coding, but the
algorithms used to make arithmetic coding faster (see the algorithm
snippet that I included earlier in this message) are also able to
improve upon the algorithm you are currently using.
Thanks for all your input, it really has made me think and see how
little i know :P
Cheers
Nick
.
- Follow-Ups:
- Re: A Fast sorting algorithm for almost sorted data
- From: Sportman
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- References:
- A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: collection60
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: collection60
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- From: collection60
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- Re: A Fast sorting algorithm for almost sorted data
- From: moogie
- Re: A Fast sorting algorithm for almost sorted data
- From: Josiah Carlson
- A Fast sorting algorithm for almost sorted data
- Prev by Date: Re: A Fast sorting algorithm for almost sorted data
- Next by Date: Re: A Fast sorting algorithm for almost sorted data
- Previous by thread: Re: A Fast sorting algorithm for almost sorted data
- Next by thread: Re: A Fast sorting algorithm for almost sorted data
- Index(es):
Relevant Pages
|