[Zope-Coders] Fw: [Zope] sessions and zope2.6.0

Tim Peters tim@zope.com
Tue, 22 Oct 2002 11:47:13 -0400


[Chris McDonough]
> We're having some problems with BTrees and I'm wondering what the
> best way is to isolate them.
>
> In this bit of code:
>
>  1        # gc the stale buckets at the "beginning" of _data
>  2        # iterate over the keys in data that have no minimum value and
>  3        # a maximum value of delete_end (note: ordered set)
>  4        # XXX- fixme.  range search doesn't always work (btrees bug)
>  5        for k in data.keys(None, delete_end):
>  6            if k > delete_end:
>  7                DEBUG and TLOG(
>  8                    '_housekeep: broken range search (key %s > max %s)'
>  9                    % (k, delete_end)
> 10                    )
> 11                continue
> 12            bucket = data[k]
> 13            # delete the bucket from _data
> 14            del data[k]
> 15            DEBUG and TLOG('_housekeep: deleted data[%s]' % k)

Brrr.  Mutating an object *while* iterating over it is always dicey at best.
Note that data.keys() doesn't produce a distinct list of keys for an XYBTree
or XYTreeSet, it produces a tiny iterator object with pointers *into* data's
internals. If data mutates, the pointers into data held by the iterator
aren't magically updated to match.
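
To make the "pointers into data's internals" point concrete, here is a toy
sketch in present-day Python 3 (an assumption; it is not the interpreter
used elsewhere in this message).  LazyKeys is a made-up class, not anything
from the BTrees code, and a plain sorted list stands in for the tree.  The
iterator remembers only a position in the shared list, so removing the key
it just handed out shifts everything under it:

```python
class LazyKeys:
    """Toy stand-in for a lazy keys() result (hypothetical, not BTrees code).

    It holds a position into the owner's key list instead of a copy of
    the keys, so it sees every mutation the owner makes.
    """

    def __init__(self, keys):
        self._keys = keys   # shared reference, not a copy
        self._pos = 0       # the "pointer" into the shared list

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._keys):
            raise StopIteration
        key = self._keys[self._pos]
        self._pos += 1
        return key


keys = [1, 2, 3, 4, 5, 6]
visited = []
for k in LazyKeys(keys):
    visited.append(k)
    keys.remove(k)          # mutate the container mid-iteration

print(visited)  # [1, 3, 5]: every other key was silently skipped
print(keys)     # [2, 4, 6]: and those keys were never deleted
```

A real BTree is a tree of buckets rather than a flat list, so the actual
symptoms can differ (for instance, keys showing up outside the requested
range), but the root cause is the same kind of stale position.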

Note that in Python, it's impossible to get this kind of trouble by
iterating over a Python dict.keys():  dict.keys() materializes the full list
of keys as a distinct object.  Whether or not dict mutates after that point
is irrelevant, as the list returned by dict.keys() is a self-contained
object in its own right.

But starting in Python 2.2, it *is* possible to get this kind of trouble by
using the dict.iterkeys() method, which is much like XYBTree.keys() in doing
a lazy traversal:

>>> d = {1:1, 2:2}
>>> for k in d.iterkeys():
...     del d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: dictionary changed size during iteration
[8532 refs]
>>>

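As shown, Python tries to stop you from mutating the dict during an
iterkeys() traversal, because the results are unpredictable otherwise.  The
check is keyed to the dict's size, so it doesn't catch all possible mutations
(a deletion paired with an insertion leaves the size unchanged), but it
catches "most".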

BTrees are much more complicated structures than Python dicts, and it's
correspondingly much harder to try to stop insane usage (for example, BTrees
don't even know their own size, so Python's "changed size during iteration"
check isn't available).

Yadda yadda yadda:  short course:  try changing

>  5        for k in data.keys(None, delete_end):

to

>  5        for k in list(data.keys(None, delete_end)):

If the problems go away, then read this msg again from the top but for real
the second time around <wink>.
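
For the record, the snapshot idiom suggested above can be sketched without
BTrees at all.  This is present-day Python 3 (an assumption), a plain dict
stands in for data, and stale_keys() is a hypothetical helper that plays the
role of data.keys(None, delete_end) by lazily yielding the keys up to a
maximum:

```python
def stale_keys(mapping, max_key):
    # Hypothetical stand-in for data.keys(None, max_key): a lazy generator
    # that reads from the mapping only as the loop asks for each key.
    return (k for k in mapping if k <= max_key)


delete_end = 30

# Unsafe: deleting while the lazy traversal is still in progress.
broken = {10: "a", 20: "b", 30: "c", 40: "d"}
error = None
try:
    for k in stale_keys(broken, delete_end):
        del broken[k]
except RuntimeError as exc:
    error = exc             # the dict notices that its size changed

# Safe: list() drains the traversal into a separate list first, so the
# deletions below cannot disturb it.
data = {10: "a", 20: "b", 30: "c", 40: "d"}
for k in list(stale_keys(data, delete_end)):
    del data[k]

print(error is not None)  # True
print(data)               # {40: 'd'}
```

The price is a temporary list holding every key in the range, which is
usually small next to the cost of chasing a corrupted traversal.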