Re: hash table size
- From: "Guest" <none@xxxxxxxxxxx>
- Date: Thu, 21 Feb 2008 21:24:56 -0600
"Andy Walker" <anw@xxxxxxxxx> wrote in message
news:fplbdt$ke4$1$8300dec7@xxxxxxxxxxxxxxxxxxx
In article <l5A*pO37r@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
David Richerby <davidr@xxxxxxxxxxxxxxxxxxxxxx> wrote:
In a well-written engine, the optimal hash table size really is `as
big as you can have it without swapping.'
Whoa, it's not *quite* as simple as that.
(a) Even in a well-written engine, you may want/need to "clear
out" the table from time to time. Initialising several terabytes
may take a little while even on a modern machine ....
Actually, current programs don't do that. They leave the hash alone. They
don't even initialize it when the program is first run.
Either they actually use the data from the previous search, or they
invalidate it at the time of access...
There are several ways to invalidate at access...
1) Change all the hash keys so you get a new random value. This isn't a
prefered choice because many people use the hash as a way to quickly check
for repetition.
2) The hash entry contains some sort of counter as part of the 'lock'.
3) At time of access into the hash, XOR in an additional key that changes
for every search.
This way old entries from the previous search (or uninitialized hash memory)
will very very rarely be considered valid.
The odds are so remote that you are more likely to encounter a normal false
hash hit during the search.
And if you actually check the returned move as valid (not everybody does
because it takes time and they feel the odds are acceptible), then the odds
are so remote that you are accessing old data as current data that you can
totally ignore it.
You have a better chance of getting random bit flip errors in the memory or
in the cpu.
As for terabytes... well, not too many people have access to machines with
terabytes of physical memory....
(b) Some of the standard operations on a hash table involve
indexing and finding remainders modulo the table size. These may
be more efficient if the table size is a power of 2, and if the
relevant numbers fit into [in C terms] ints rather than longs.
It is true that doing an AND is faster than a modulo. But the time
difference isn't really all that significant considering how long main
memory is going to take to access. And the savings can greatly outweigh the
few extra cycles a modulo would take.
But yes, many people do prefer a power of two size.
And as for the 'ints' vs. 'longs'... Sorry. That was true in the days of 16
bit compilers, but these days on 32 bit systems, both int & long will be 32
bits. And on a 32 bit system, 32 bits will be enough to access all of
physical memory. (With the exception of certain hacks that Intel did for
some servers to increase memory by paging it.) So there will be no need for
64 bit 'long long' which would involve a more complicated modulo.
And on 64 bit systems (like the powerful chess programs all use and even
many hobbiests use), int's and long's are both going to be 64 bits. (Well,
actually that needs to be qualified a bit, since the C99 standard is a
little vague on how to handle all 8, 16, 32, 64 and 128 bit data sizes that
can be done on a 64 bit system.)
And no 64 bit system is going to have more than 2^64 bytes of memory, so
there's no need to get into 128 bit 'long long'.
(c) If you manage to lock down all, or most, of the storage of
a machine, you may perhaps be unpopular, in a multi-user environment,
Well DUH.
However, on a multi-user environment you are likely to have memory
limitations imposed so you can't use more than your fair share. In fact,
you may only be given a couple meg of physical memory and all the rest of
your data will be paged in & out as needed. Definetly not a good thing for
large hash tables.
If you are on a system without that restriction, then it's probably not a
problem for other reasons, such as verbal agreements between users, etc.
etc.
And if you are on a system without that restriction and you don't have
agreements etc. and you do it anyway and cause problems... (shrug) Tough.
That's your problem when the sysadmin catches you.
with other users of the machine. *Your* program may not be swapping,
but *theirs* are! [Even on a home PC, the other users may be *you*,
trying to run other things, such as the engine that this program is
playing against.]
No.
If you are runing a chess program for a reason (such as a tournament), then
you know not to use it for other things.
If you intend to use it for other things, then just don't tell it to use all
your available memory for the hash. (Although your other programs will
reduce the cpu power available to the chess program, which may cost it a
full ply of search.)
And if you do want to run some other program and a chess program with
massive hash tables at the same time.... (shrug) Get a second computer.
It's not like they are expensive. A second computer can be bought for $400
that's more than good enough for basic usage, web surfing, etc. For just a
few dollars more, you could get a low end laptop instead.
.
- Follow-Ups:
- Re: hash table size
- From: Andy Walker
- Re: hash table size
- References:
- hash table size
- From: dkraft
- Re: hash table size
- From: Guest
- Re: hash table size
- From: Anders Thulin
- Re: hash table size
- From: David Richerby
- Re: hash table size
- From: Andy Walker
- hash table size
- Prev by Date: Re: hash table size
- Next by Date: Chess Tournament Software
- Previous by thread: Re: hash table size
- Next by thread: Re: hash table size
- Index(es):
Relevant Pages
|
Loading