[Zope-dev] Data.fs corruption when creating lots of objects.

Phillip J. Eby pje@telecommunity.com
Fri, 28 Apr 2000 20:17:50 -0500


What's happening is pretty normal for the type of operation you describe.
Your 52 objects are BTree pages/buckets and IntSet objects being rewritten
for each transaction.  The expected number of objects rewritten by a
transaction is based on the depth of each BTree.  Each ZCatalog index has a
forward and backward BTree that gets updated.  The ZCatalog itself has from
one to three additional BTrees.  Each index's forward BTree ends in IntSet
objects, which get rewritten too.  And of course there are the objects
you're adding.  The rewritten objects
get larger each time because the pages and IntSets are filling up with
entries, and of course they are written out again at the end of every
transaction.
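
As a rough back-of-the-envelope illustration of that count, here is a
small Python sketch.  The function name, its parameters, and the sample
numbers are all made up for illustration; it assumes every BTree rewrites
one page per level, which is a simplification and not a statement of how
the BTree code actually behaves.

```python
def estimate_objects_written(num_indexes, btree_depth, catalog_btrees=1,
                             new_objects=1):
    """Rough count of objects written by one cataloging transaction.

    Assumption (not from the ZCatalog source): each BTree rewrites one
    page/bucket per level, and each forward BTree also rewrites one IntSet.
    """
    # Forward BTree path + backward BTree path + one IntSet per index.
    per_index = 2 * btree_depth + 1
    # The ZCatalog's own one to three BTrees.
    catalog = catalog_btrees * btree_depth
    return num_indexes * per_index + catalog + new_objects


if __name__ == "__main__":
    # Hypothetical figures: 8 indexes, BTrees two levels deep, 3 catalog BTrees.
    print(estimate_objects_written(num_indexes=8, btree_depth=2,
                                   catalog_btrees=3))
```

The point is only that a few dozen objects per added instance is the
expected order of magnitude once you have several indexes.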

To ease the storage burden, you should probably load your objects in larger
transactions.  ZODB only writes changed objects at the end of a
transaction, so if you do, say, 100 objects per transaction, you will write
nearly 100 times fewer objects and use far less disk space.  ZCatalog is
NOT designed for high-volume updates like this.  Your actual best bet would
be to load all 60,000 objects first, then ZCatalog them all in one single
long transaction, with subtransactions to prevent memory overflow.
ZCatalog+FileStorage has terrific speed and great space performance in that
configuration.  But if you have an application that requires frequent
indexing, you may be better off with an RDBMS for certain aspects.
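
A minimal sketch of the batching idea, in Python.  The helper
load_in_batches and its add_record/commit callbacks are hypothetical names
for illustration; in a real load, commit would presumably be whatever
commits the current ZODB transaction (for example something like
get_transaction().commit), and add_record would create and catalog one
object.  Treat it as an outline under those assumptions, not tested Zope
code.

```python
def load_in_batches(records, add_record, commit, batch_size=100):
    """Add records, committing once per batch_size records.

    add_record and commit are caller-supplied callables (hypothetical
    stand-ins for creating one object and committing the transaction).
    Returns the number of commits performed.
    """
    pending = 0
    commits = 0
    for record in records:
        add_record(record)
        pending += 1
        if pending >= batch_size:
            commit()          # changed pages/IntSets are written once here
            commits += 1
            pending = 0
    if pending:
        commit()              # flush the final partial batch
        commits += 1
    return commits


if __name__ == "__main__":
    stored = []
    # Stand-in callbacks: append to a list, and a commit that does nothing.
    n = load_in_batches(range(60000), stored.append, lambda: None,
                        batch_size=100)
    print(n, len(stored))
```

With 60,000 records and a batch size of 100 this commits 600 times instead
of 60,000, so each BTree page or IntSet that changes is written once per
batch rather than once per added object.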


At 07:08 PM 4/28/00 -0400, R. David Murray wrote:
>
>What's going on here?  The amount of data and objects being thrown
>around here seems *totally* crazy.  I'm especially puzzled by those
>larger and larger object sizes during a load, given that the size
>of the input records is consistent.  The objects that remain after
>packing are of a very reasonable size (around 350-400 bytes each).
>And why does the addition of a single new ZClass instance result
>in *52* objects in the transaction, when I would assume that only
>the Catalog and the containing folder are getting modified in the
>transaction?
>
>(For review, I'm adding a simple catalog aware ZClass object that
>has about a dozen properties, but I'm trying to add a little less
>than 60,000 of them).
>
>Can anyone with some knowledge of how the ZODB in general and FileStorage
>in particular works give me a clue what is going on here?  Is this
>really normal behavior, to need this huge amount of disk space to
>handle what seem to be simple updates?