[Zope-dev] zope.keyreference hashes vs. 32/64bit

Hanno Schlichting hanno at hannosch.eu
Sat Aug 28 12:17:19 EDT 2010


Hi.

I recently stumbled on some behavior in zope.keyreference that was
unexpected, at least to me. For a persistent object it generates a
unique key using:

hash((database_name, oid))

where hash is Python's built-in hash function.

Reading the documentation, I assumed that a keyreference for the same
object (as identified by database name and oid) would be stable and
always produce the same result. This isn't always true: if you persist
keyreference data, upgrade your software versions, and then compare
the stored values to a fresh calculation, they can differ.

Python's hash function is only stable within the same Python version
and 32/64-bit combination. The same input produces different results
in a 32-bit Python 2.6 and a 64-bit Python 2.6, as both try to use the
maximum available integer space, and thus a 64-bit Python generates
keys above the 32-bit integer range. As a simple example,
"hash(('main', 1)) > 2**32" is True in a 64-bit Python and False in a
32-bit Python.
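
A quick way to see which situation a given interpreter is in is
sketched below. The helper name fits_in_signed_32bit is made up for
illustration and is not part of zope.keyreference. Note that on
Python 3.3 and later, string hashes are additionally randomized per
process unless PYTHONHASHSEED is set, so the printed key can also vary
from run to run there.

```python
import sys


def fits_in_signed_32bit(value):
    """Hypothetical helper: True if value fits a signed 32-bit integer."""
    return -2**31 <= value < 2**31


# Same shape of input as the key computed for a persistent object.
key = hash(('main', 1))

# sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit one.
print("64-bit build:", sys.maxsize > 2**32)
print("key:", key)
print("fits in 32 bits:", fits_in_signed_32bit(key))
```

On a 32-bit build the last line always reports True; on a 64-bit build
it usually reports False, which is exactly the mismatch described
above.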

The internal hash implementation seems to have been pretty stable
across recent Python versions up to 3.1. So the algorithm produces
the same results for all 32-bit versions of Python 2.x to 3.1, and
likewise for all 64-bit versions. But as far as I understand, this
isn't guaranteed to be the case for future versions.

Does anyone else see a problem with this? Should keyreference use a
different hash algorithm?
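
To make the question concrete, here is a rough sketch of what a
platform-independent key could look like. This is only an illustration,
not what zope.keyreference does: the function name stable_key is
hypothetical, the oid is assumed to be the 8-byte string ZODB uses, and
the choice of SHA-1 truncated to a signed 32-bit integer is arbitrary.

```python
import hashlib
import struct


def stable_key(database_name, oid):
    """Hypothetical stand-in for hash((database_name, oid)).

    Assumes database_name is text and oid is an 8-byte byte string.
    """
    # A NUL separator keeps the database name and the oid from running
    # together in the hashed payload.
    payload = database_name.encode('utf-8') + b'\0' + oid
    digest = hashlib.sha1(payload).digest()
    # Interpret the first 4 digest bytes as a signed big-endian 32-bit
    # integer, so the result has the same range on every platform.
    return struct.unpack('>i', digest[:4])[0]


oid = b'\x00' * 7 + b'\x01'
print(stable_key('main', oid))
```

The result depends only on the input bytes, not on the Python version
or word size. The price is that truncating to 32 bits makes collisions
more likely than with a full-width hash, and existing persisted keys
would need to be migrated.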

Hanno

