Re: Hash tables
- From: stephen@xxxxxxxxxxxxxxxxx (Stephen J. Bevan)
- Date: Tue, 21 Apr 2009 07:40:34 -0700
Jon Harrop <jon@xxxxxxxxxxxxxxxxx> writes:
I think it is fair to say that microsecond worst-case insertion time is not
a common requirement for a dictionary.
Maybe, I can only comment on the problems that I have to solve.
That is a tautology. My test program is designed solely to empower
programmers to be able to choose when to make the trade-offs themselves
based upon real performance measurements. My books and articles are often
unusually rich in such content because I believe it is very valuable.
Your program only tests two structures, uses a simple key and used a
relatively small number of entries. That does't mean the result
wasn't useful for that test but I don't think it is empowering
programmers to draw any conclusions about hash tables in general from
that test unless it is clear that the test results scale up or there
is a qualifier indicating that they don't.
Perhaps I can also qualify that keys are usually small constant-sized data
That would be a very good qualifier especially if small is quantified.
The simplest fixed size key in any problems I face is 8 bytes (20 for
IPv6) and typically much longer and and can be variable e.g. the from
tag, to tag and branch tag in a SIP call are all variable length
strings and collectively the minimum could be around 10 bytes but in
real calls typically would be upwards of 60 bytes.
Perhaps. Once you've loaded the first bit of your key then you've loaded a
whole cache line though. Furthermore, comparison is likely to terminate
early, particularly if you put the most randomized bytes first.
That's the purpose of inserting the hash first but that's a best case
optimization everthing still has to work consistently for the worst
case where the hacker sees the code and hits you with many packets
that hash to the same small set of values.
You originally said there were up to 8 million live sessions, which is 23
comparisons.
I agree using the lower bound of 10^6 gives a slight advantage to the
tree so let's be fair and use 10^8 and thus 23 comparisons. I don't
think it will change the outcome.
While the trie&hash _may_ win that one, I
for one could not call it either way without writing the code and
testing it since the winner may depend on which is most cache
friendly for the particular cache/size being used.
Agreed but I think these are very unusual requirements.
As noted above, we can only really speak to the problems we have to solve.
My problems may not be your problems or common when averaged over all
possible problems but it is a domain which whose services are being
used utilized as part of this conversation (at least at my end).
[1] The rehash on re-size + additional probes on collision
until/unless one is willing to re-size until the table is the same
size as the trie key in which case it isn't a hash table anymore.
I disagree.
Since you didn't elaborate I assume you don't want to argue the point.
.
- References:
- F# vs OCaml vs Python vs Haskell: hash table performance
- From: Jon Harrop
- Re: F# vs OCaml vs Python vs Haskell: hash table performance
- From: namekuseijin
- Re: F# vs OCaml vs Python vs Haskell: hash table performance
- From: Jon Harrop
- Re: F# vs OCaml vs Python vs Haskell: hash table performance
- From: namekuseijin
- Re: F# vs OCaml vs Python vs Haskell: hash table performance
- From: Jon Harrop
- Hash tables (was Re: F# vs OCaml vs Python vs Haskell: hash table performance)
- From: Bruce Stephens
- Re: Hash tables (was Re: F# vs OCaml vs Python vs Haskell: hash table performance)
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: stephen . bevan
- Re: Hash tables
- From: Jon Harrop
- Re: Hash tables
- From: Stephen J. Bevan
- Re: Hash tables
- From: Jon Harrop
- F# vs OCaml vs Python vs Haskell: hash table performance
- Prev by Date: Re: Haskell bashing (was Re: F# vs OCaml vs Python vs Haskell: hash table performance)
- Next by Date: Re: Haskell bashing (was Re: F# vs OCaml vs Python vs Haskell: hash table performance)
- Previous by thread: Re: Hash tables
- Next by thread: Haskell bashing (was Re: F# vs OCaml vs Python vs Haskell: hash table performance)
- Index(es):
Relevant Pages
|