[Zope-dev] [BUG] Quadratic ZODB bloat caused by "PathIndex"

Joachim Werner joe@iuveno.de
Fri, 21 Feb 2003 10:14:58 +0100


Andreas Jung wrote:
> 
> 
> --On Thursday, 20 February 2003 08:05 +0100 Dieter Maurer 
> <dieter@handshake.de> wrote:
> 
>> Zope 2.5.1
>>
>> A "PathIndex" maps (pathsegment,level) onto the "IISet" of document ids
>> with "pathsegment" at "level" in their path.
>>
>> An "IISet" is a single persistent object, written as a whole to
>> the ZODB each time it changes. Its size is proportional to the number
>> of entries. Therefore, a ZODB storage with undo support, which keeps
>> every old revision until the next pack, grows quadratically with
>> respect to the number of entries (between packs).
>>
>> The standard "path" index indexes based on the physical path.
>> Therefore, the index entry of (at least) one of the top-level
>> path segments has a size on the order of the number of all
>> indexed objects.
>>
>> Once you have lots of indexed objects, you will observe
>> significant ZODB growth between packs.
>>
>>
>> The fix would be easy: "PathIndex" should use "IITreeSet" rather
>> than "IISet" to store the document id lists (as do other indexes).
>> (There are more bugs in "PathIndex": e.g. it does not remove
>> old index information when a new "index_object" call brings in
>> new data. A code review would be appropriate.)
>>
>>
>> A quick workaround: delete the "path" index unless you really need it.
>>
>>
> 
> I am going to fix the problem for Zope 2.5, 2.6 and HEAD
> next week.

I don't know if that's related, but I had cases where an empty index 
(all cataloged items removed, all lexicons or vocabularies removed) 
still had a size of around 6 MB (when exported in *.zexp format). That 
suggests that some data is not being deleted correctly.

Joachim
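
To make the quadratic growth described in the quoted report concrete, here is
a rough back-of-the-envelope model in plain Python. It does not use ZODB or
BTrees at all. The function names, the 4 bytes per document id, and the bucket
size of 60 are assumptions chosen for illustration, not values taken from the
real implementation. It also assumes one transaction per indexed document,
ids arriving in increasing order, and it ignores pickle overhead, interior
tree nodes, and bucket splits.

```python
def growth_whole_set(n_docs, bytes_per_id=4):
    """Bytes appended between packs if every insert rewrites the whole set.

    Models an IISet-like object: one persistent record holding all ids,
    so a set with `size` entries costs `size * bytes_per_id` per write.
    """
    total = 0
    for size in range(1, n_docs + 1):
        total += size * bytes_per_id  # the complete set is stored again
    return total


def growth_bucketed(n_docs, bucket_size=60, bytes_per_id=4):
    """Bytes appended if every insert rewrites only the bucket it touches.

    Models an IITreeSet-like structure in a simplified way: ids are spread
    over fixed-capacity buckets, and only the last bucket is rewritten.
    """
    total = 0
    for size in range(1, n_docs + 1):
        in_last_bucket = (size - 1) % bucket_size + 1
        total += in_last_bucket * bytes_per_id  # only one bucket is stored
    return total


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, growth_whole_set(n), growth_bucketed(n))
```

Under these assumptions, doubling the number of indexed documents roughly
quadruples the bytes written in the whole-set model, while the bucketed model
only doubles. Real numbers will differ, but the shape of the curves is the
point.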