Re: Quicker finding strings? Alternative for array, hash, set?
- From: Jesús Gabriel y Galán <jgabrielygalan@xxxxxxxxx>
- Date: Mon, 2 Feb 2009 06:22:23 -0500
On Mon, Feb 2, 2009 at 12:00 PM, Patrick Put <patrick.put@xxxxxxxxx> wrote:
I've been searching for this information already, but cannot really find
the answer I'm looking for.
I have to find values in a list of some 40,000 strings. Putting all
these in an array is not likely to be the quickest way.... and it isn't.
A set seems to be faster already. Maybe using a map where I only use the
keys is also faster. But I'm wondering: is there no such thing as a tree
to store and retrieve items in a fast way?
If I remember well, for n elements in a list it should take log(n) steps
then, where it takes n/2 just going through an array if I only want to
pick up the element. In my situation it is much closer to n because most
of the times the value is NOT in the list.
Any ideas?
If you are only concerned with lookup time, a hash has amortized O(1)
lookup time:
require 'benchmark'
words = ("aaaa".."zzzz").to_a # => 456,976 strings
hash = {}
words.each {|w| hash[w] = true}
TIMES = 10
Benchmark.bmbm do |x|
x.report("Array#find existing") do
TIMES.times do
words.find {|w| w == "mmmm"}
end
end
x.report("Array#find non-existing") do
TIMES.times do
words.find {|w| w == "mmmmm"}
end
end
x.report("Hash#[] existing") do
TIMES.times do
hash["mmmm"]
end
end
x.report("Hash#[] non-existing") do
TIMES.times do
hash["mmmmm"]
end
end
end
jesus@jesus-laptop:~/personal/programacion/ruby/Benchmarks/lib$ ruby
hash_array2.rb
Rehearsal -----------------------------------------------------------
Array#find existing 3.160000 0.700000 3.860000 ( 3.870061)
Array#find non-existing 6.280000 1.630000 7.910000 ( 7.935304)
Hash#[] existing 0.000000 0.000000 0.000000 ( 0.000019)
Hash#[] non-existing 0.000000 0.000000 0.000000 ( 0.000018)
------------------------------------------------- total: 11.770000sec
user system total real
Array#find existing 3.130000 0.710000 3.840000 ( 3.866618)
Array#find non-existing 5.820000 1.580000 7.400000 ( 9.195417)
Hash#[] existing 0.000000 0.000000 0.000000 ( 0.000030)
Hash#[] non-existing 0.000000 0.000000 0.000000 ( 0.000032)
Jesus.
.
- Follow-Ups:
- Re: Quicker finding strings? Alternative for array, hash, set?
- From: Patrick Put
- Re: Quicker finding strings? Alternative for array, hash, set?
- References:
- Quicker finding strings? Alternative for array, hash, set?
- From: Patrick Put
- Quicker finding strings? Alternative for array, hash, set?
- Prev by Date: Quicker finding strings? Alternative for array, hash, set?
- Next by Date: Re: Quicker finding strings? Alternative for array, hash, set?
- Previous by thread: Quicker finding strings? Alternative for array, hash, set?
- Next by thread: Re: Quicker finding strings? Alternative for array, hash, set?
- Index(es):
Relevant Pages
|
Loading