Re: Speed issues iterating over chars
- From: Robert Klemme <shortcutter@xxxxxxxxxxxxxx>
- Date: Wed, 1 Sep 2010 09:27:17 -0500
On Wed, Sep 1, 2010 at 1:35 PM, Martin Hansen <mail@xxxxxxxxx> wrote:
The gsub solution seems to be reasonably efficient.
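seq.gsub!(/./) do |m|
  # $` is the text before the current match, so its length is the index of m.
  # Use downcase, not downcase!: the bang form returns nil when nothing changes.
  scores[$`.length].ord - BASE_SOLEXA < cutoff ? m.downcase : m
end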
But my original proposed naive loop is twice as fast:
i = 0
scores.each_char do |score|
  seq[i] = seq[i].downcase if score.ord - BASE_SOLEXA <= cutoff
  i += 1
end
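A rough way to check the "twice as fast" claim on your own machine is Ruby's standard Benchmark module. The sketch below is only an illustration: the value of BASE_SOLEXA (64), the cutoff, the test data, and the method names mask_gsub and mask_loop are assumptions that do not come from this thread, and both variants use <= so that they produce the same result.

```ruby
require 'benchmark'

BASE_SOLEXA = 64  # assumed quality offset; the thread does not state the value
CUTOFF      = 20  # hypothetical cutoff, for illustration only

# gsub variant: $` is the text before the current match,
# so its length is the index of the matched character.
def mask_gsub(seq, scores, cutoff)
  seq.gsub(/./) do |m|
    scores[$`.length].ord - BASE_SOLEXA <= cutoff ? m.downcase : m
  end
end

# Naive loop variant: walk the scores and lowercase the matching position.
def mask_loop(seq, scores, cutoff)
  out = seq.dup
  i = 0
  scores.each_char do |score|
    out[i] = out[i].downcase if score.ord - BASE_SOLEXA <= cutoff
    i += 1
  end
  out
end

# Made-up test data: 1000 bases with alternating low and high scores.
seq    = 'ACGT' * 250
scores = ((BASE_SOLEXA + 10).chr + (BASE_SOLEXA + 30).chr) * 500

Benchmark.bm(6) do |x|
  x.report('gsub:') { 100.times { mask_gsub(seq, scores, CUTOFF) } }
  x.report('loop:') { 100.times { mask_loop(seq, scores, CUTOFF) } }
end
```

The timings will depend on the Ruby version and on the sequence length, so treat the numbers as indicative only.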
I don't really know how gsub and tr compare to the Perl equivalents
speed-wise. In Perl, tr precompiles a lookup table that is evil fast,
and the regex engine is also primed at compile time and runs extremely fast.
Regexp is precompiled but I suspect that tr works at runtime only.
For a definitive answer you'll have to look at the source.
I suspect that you need some C extension to go faster than this, but I
don't really want to spend the time on that. I was exploring Inline C
but that appears very fragile - I cannot even get the example from the
cookbook up and running under Ruby 1.9.1/1.9.2.
Looking at your last proposal I see three iterations where two are
running narrow if loops. I have not tested it, but it looks suspicious.
Well, but the caching should prevent too many loops from being executed.
I do not know, however, how often you reuse values. If you need this
in several processes, you could save the current state in a file via
Marshal, which is quite fast.
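As a minimal sketch of the Marshal idea, assuming the cached state is a plain Hash: the cache contents and the file name below are hypothetical and only stand in for whatever state is actually reused.

```ruby
require 'tmpdir'

# Hypothetical cached state: score character => "is at or below the cutoff?"
cache = { 'J' => true, '^' => false }

# Hypothetical file location for the saved state.
path = File.join(Dir.tmpdir, 'score_cache.marshal')

# Save the current state; binary mode because Marshal output is not text.
File.open(path, 'wb') { |f| Marshal.dump(cache, f) }

# Another process could later restore the state like this.
restored = File.open(path, 'rb') { |f| Marshal.load(f) }
```

Marshal data should only be loaded from files you trust, and the format is tied to the Ruby Marshal version, so this fits best when the same Ruby installation writes and reads the file.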
remember.guy do |as, often| as.you_can - without end