POSKeyErrors was Re: [Zope] Zope leaking memory?

Tim Peters tim.peters at gmail.com
Thu Sep 16 10:36:44 EDT 2004


[Richard Jones]
>>> Deleting the index had no impact.

[Tim Peters]
>> Then, in your example:
>>
>>     oid 0x0265be BTrees.IOBTree.IOBucket
>>     last updated: 2004-09-16 02:32:47.973507, tid=0x357DBF8CCAFDCCCL
>>     refers to invalid object:
>>            oid 0x02b6c2 missing: 'BTrees.IOBTree.IOBucket'
>>
>> oid 0x02b6c2 is not in your .fs.index, and so an attempt to load oid
>> 0x02b6c2 should cause a POSKeyError.  When you said:
>>
>>     but when I dig in there, the IOBucket appears to just have strings as
>>     the values. And they're all present
>>
>> it wasn't clear what "when I dig in there" meant.  What specifically
>> did you do to inspect oid 0x02b6c2?  Or were you looking at oid
>> 0x0265be?  ("the IOBucket" was ambiguous, since two distinct IOBuckets
>> are mentioned in the output).

[Richard]
> Sorry, by "dig in there" I meant that I loaded up the object with oid 0x0265be
> using:
>
> >>> from Zope.Startup.run import configure;configure('zope-19100/zope.conf')
> >>> from Zope import app; root = app()
> >>> from ZODB.utils import p64
> >>> o = root._p_jar[p64(0x0265be)]

Thanks!  That's clear.  To make more sense of what you're seeing, you
have to know that btrees are complicated data structures.  While you
see a single IOBTree B at the Python level, under the covers B is
actually a graph made up of any number of IOBTree and IOBucket nodes,
each a distinct persistent object.  That's why btrees scale well:  an
operation loads only the nodes it touches, not the whole structure.
*Normally* you only see the topmost IOBTree node, but digging into the
database by oid exposes the elaborate internal structure.  That
internal structure includes three distinct kinds of inter-node
references that have nothing to do with the keys or values.  Those
inter-node references are part of the btree's state too, but you're
normally not aware of them.

While it would take deeper analysis to be sure (there's not enough
info here to nail it), the evidence that is here suggests that oid
0x0265be is a leaf-level bucket that's an internal (normally
unexposed) detail of some higher-level IOBTree.  All the leaf-level
buckets in a BTree are in a singly-linked list, to support efficient
traversal from smallest key to largest.  Each bucket has a "next
bucket" pointer to support this.  This isn't exposed in Python -- it's
an internal detail of btree construction.  So the evidence here
suggests that oid 0x0265be has a next-bucket pointer to oid 0x02b6c2,
but the latter object doesn't exist in the database.
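
To make that shape concrete, here is a toy model in plain Python.  It
is only a sketch under stated assumptions:  ToyBucket, ToyPOSKeyError,
load() and the storage dict are made-up stand-ins, not the real BTrees
or ZODB code, and the real next-bucket pointer can't be read from
Python like this.  Only the two oids and the one key/value pair are
taken from this thread.

```python
# Toy model only: these names are hypothetical stand-ins, not the BTrees code.

class ToyPOSKeyError(KeyError):
    """Stand-in for ZODB's POSKeyError: the oid is not in the storage."""


class ToyBucket(object):
    def __init__(self, items, next_oid=None):
        self.items = dict(items)    # the keys and values you can see
        self.next_oid = next_oid    # hidden "next bucket" reference, or None


def load(storage, oid):
    """Look up an object by oid, roughly as loading from the database would."""
    try:
        return storage[oid]
    except KeyError:
        raise ToyPOSKeyError(oid)


# One bucket whose next-bucket reference names an oid that was never stored.
storage = {
    0x0265be: ToyBucket({1531753053: "/CGPublisher/publishers/12/messages/17"},
                        next_oid=0x02b6c2),
}

bucket = load(storage, 0x0265be)
for k, v in sorted(bucket.items.items()):
    print(k, v)                      # the bucket's own keys and values are fine

try:
    load(storage, bucket.next_oid)   # following the hidden reference fails
except ToyPOSKeyError as e:
    print("missing oid 0x%06x" % e.args[0])
```

In this model the bucket's own contents load and print without
trouble; only following next_oid raises.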

> and then I had a poke at that:
> 
> >>> for k,v in o.items():
> ...  print k, type(o[k]), o[k]
> ...

Probably would have been easier to do

     print k, type(v), v

at that point <wink>.

> 1531753053 <type 'str'> /CGPublisher/publishers/12/messages/17
> 1610364516 <type 'str'> /CGPublisher/works/171/messages/1
> 1610364517 <type 'str'> /CGPublisher/publishers/11/messages/31
> 1610364518 <type 'str'> /CGPublisher/publishers/11/messages/32
> 1610364519 <type 'str'> /CGPublisher/works/173/messages
> 1610364520 <type 'str'> /CGPublisher/publishers/11/messages/33
> 1610364521 <type 'str'> /CGPublisher/publishers/11/messages/34
> 1637779823 <type 'str'> /CGPublisher/publishers/11/messages/30
> 1655774688 <type 'str'> /CGPublisher/works/163/messages/4
> 1660892580 <type 'str'> /CGPublisher/publishers/11/messages/75
> 1660892581 <type 'str'> /CGPublisher/publishers/11/messages/76
> 1660892582 <type 'str'> /CGPublisher/publishers/11/messages/77
> [snip many similar lines]
> 1701534533 <type 'str'> /CGPublisher/publishers/13/messages/63
> 1701534534 <type 'str'> /CGPublisher/publishers/13/messages/64
> 1701534535 <type 'str'> /CGPublisher/publishers/13/messages/65
> 1701534536 <type 'str'> /CGPublisher/publishers/13/messages/66
> 1701534537 <type 'str'> /CGPublisher/publishers/13/messages/67
> 1701534538 <type 'str'> /CGPublisher/publishers/13/messages/68
> 1701534539 <type 'str'> /CGPublisher/publishers/13/messages/69
> 1708905051 <type 'str'> /CGPublisher/works/170/messages
> 1716432762 <type 'str'> /CGPublisher/publishers/13/messages/68/2
> 1716432763 <type 'str'> /CGPublisher/works/183/messages
> >>>

So the keys and values are fine.  Traversing a bucket object makes no
use of the next-bucket pointer, so a missing next-bucket object
wouldn't cause any problems here.

> and just to confirm I'm not going mad:
>
> >>> root._p_jar[p64(0x02b6c2)]
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/opt/zope/cgpublisher-prod/Zope/lib/python/ZODB/Connection.py", line 170, in __getitem__
>  File "/opt/zope/cgpublisher-prod/Zope/lib/python/ZEO/ClientStorage.py", line 749, in load
>  File "/opt/zope/cgpublisher-prod/Zope/lib/python/ZEO/ServerStub.py", line 82, in zeoLoad
>  File "/opt/zope/cgpublisher-prod/Zope/lib/python/ZEO/zrpc/connection.py", line 372, in call
> ZODB.POSException.POSKeyError: 0x02b6c2

Which is consistent with fsrefs.py saying that oid 0x02b6c2 is
"missing" -- it's not in the index, so trying to load it raises
POSKeyError.

> I guess one issue here is that I'm poking fsrefs.py directly at the Data.fs,
> whereas the above session is done through a ZEO connection. Not sure how ZEO
> could "hide" the erroneous data from me, but then I don't know the inner
> workings of ZEO and its caches...

The info so far is self-consistent, so let's assume ZEO isn't a factor.

There's no way to tell from what we have here what the "top level"
btree may be.  Trying to traverse the top-level btree would raise
POSKeyError when it got to the dangling next-bucket pointer.  It's
possible that running the checkbtrees.py tool would identify the bad
top-level btree in a helpful way.
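
As a rough illustration of why a full traversal trips over the problem
while a single bucket doesn't, here is a toy chain walk in plain
Python.  It is a sketch only:  find_dangling_next() and the dict
layout are hypothetical, the first two oids are invented, and this
makes no claim about how checkbtrees.py works internally.

```python
# Toy model only: plain dicts stand in for persistent bucket objects.

def find_dangling_next(storage, first_oid):
    """Follow "next" oids starting at first_oid.

    Return (holder_oid, missing_oid) for the first reference that names an
    oid absent from storage, or None if the chain ends cleanly.
    """
    oid = first_oid
    seen = set()
    while oid is not None and oid not in seen:
        seen.add(oid)                      # guard against accidental cycles
        next_oid = storage[oid]["next"]
        if next_oid is not None and next_oid not in storage:
            return oid, next_oid
        oid = next_oid
    return None


# Hypothetical three-bucket chain; only the last two oids come from the thread.
storage = {
    0x000001: {"keys": [1, 2], "next": 0x000002},
    0x000002: {"keys": [3, 4], "next": 0x0265be},
    0x0265be: {"keys": [5, 6], "next": 0x02b6c2},   # 0x02b6c2 is not stored
}

result = find_dangling_next(storage, 0x000001)
print(result and tuple("0x%06x" % o for o in result))
```

The walk gets through the first two buckets without complaint and
only fails at the bucket holding the dangling reference, which is the
same pattern as a traversal from the top-level btree.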

