Re: What’s the best general implementation of cust om #hash methods?



On Sep 10, 10:49 am, Rick DeNatale <rick.denat...@xxxxxxxxx> wrote:
On Thu, Sep 10, 2009 at 11:56 AM, Shot (Piotr Szotkowski)<s...@xxxxxx> wrote:
Hello, ruby-talk.

What’s the best (fast, yet with relatively few cases when equal hashes
are generated for non-eql objects) way to implement a hash method?

Is XOR-ing the hashes of an object’s instance variables (assuming that
two objects are eql if their ivars are eql) a good general approach?

I’m asking because I use this pattern in my code, and it seems quite
fast; I needed to work around a bug in MRI’s Set#hash and this solution¹
seems about as fast as the original MRI’s Set#hash. I’m just not sure
whether it’s hashy enough (so to speak) in that two un-eql objects
hashed in this way produce different hashes often enough.

Well, let's look at how Array#hash works here's the 1.8 C code:

static VALUE
rb_ary_hash(ary)
    VALUE ary;
{
    long i, h;
    VALUE n;

    h = RARRAY(ary)->len;
    for (i=0; i<RARRAY(ary)->len; i++) {
        h = (h << 1) | (h<0 ? 1 : 0);
        n = rb_hash(RARRAY(ary)->ptr[i]);
        h ^= NUM2LONG(n);
    }
    return LONG2FIX(h);

}

This looks like a typical cryptographic hash algorithm pattern
combining bit rotation and xor.  It seeds the value with the length of
the array, then for each element, rotates the value 1 bit to the left,
then xors in the elements hash.

This scrambles the bits more than just a straight xor

--
Rick DeNatale

I must take STRONG exception to your characterization of this
algorithm. This method performs parallel xors of the data. Change one
bit of one of the values, and you change one bit of the hash. A
fundamental property of cryptographic hashes is that changing one bit
of the input changes many bits of output. This is even true of crcs.
Cryptographic hashes do rotate & xor (sometimes), but only as part of
a larger operation.
.



Relevant Pages

  • Re: Hash/permutation function for object ID creation
    ... I plan to create a sequence counter ...      has roughly same number of objects. ... hash = ID+SOME_CONSTANT ... R' = R XOR H ...
    (comp.theory)
  • Re: GetHashCode for Objects that Compare Based on Value (Not reference equality)
    ... Further, for those sets of data, an XOR hash produces a sufficiently ... random distribution for hashes to work just fine. ... realm of developing a good generic algorithm; ... Saying that XOR produces collisions is not very useful. ...
    (microsoft.public.dotnet.framework)
  • Re: Problem with search
    ...   ... What I am currently doing is using a hash table for the lowest level and ... the FSM has 5 states. ... If you end up at an accepting state when the whole reverse ...
    (comp.lang.lisp)
  • Re: Ruby-based data language
    ... configuration information, I'd be interested to see them. ...   name "Joe Foo" ... Hash literals - one which isn't widely used. ... kind of object depends on the parser. ...
    (comp.lang.ruby)
  • Re: Help me break this stupid cipher I just invented. :P
    ... C= x xor b ... When I look at this, it looks a lot like the common XOR against a PRNG, ... But if you're using a cryptographically strong one-way hash, ... creating a secure key stream in this sort of crypto system. ...
    (sci.crypt)