Thanks for the explanation!
On Fri, 28 Apr 2000, Phillip J. Eby wrote:
> NOT designed for high-volume updates like this. Your actual best bet
> would be to load all 60,000 objects first, then ZCatalog them all in
> one single long transaction, with subtransactions to prevent memory
> overflow. ZCatalog+FileStorage has terrific speed and great space
> performance in that configuration. But if you have an application
> that requires frequent indexing, you may be better off with an RDBMS
> for certain aspects.
I came to this conclusion myself about two hours ago and am in the process of redoing the load that way. The object load without the Catalog was relatively fast and not disk-intensive. I do find it interesting to note that raw data occupying 8MB on disk balloons to 28MB once loaded into ZClass instance objects, *without* cataloging. That's a 3.5:1 expansion, which is not a good storage ratio, so I'm not sure what you mean by "great space performance"; but then again, ZODB's purpose is definitely not storage optimization! I'll be curious to see how much overhead the Catalog adds; I'm guessing the space performance you are referring to is in that overhead.
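For concreteness, the catalog-free load phase looks roughly like this (`records`, `make_instance`, and `folder` are stand-ins for my actual data and constructor):

    # Phase 1: load the objects without touching the Catalog.
    for record in records:
        rid = record['id']              # unique id for this object
        obj = make_instance(record)     # build the ZClass instance
        folder._setObject(rid, obj)     # store it; no indexing yet
    get_transaction().commit()          # one full commit at the end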
Do you have an example of doing subtransactions? I tried just calling get_transaction().commit(1), but it didn't seem to write anything to disk, and memory usage just kept growing.
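In case a concrete picture helps, here is more or less what I attempted (`objects` and `catalog` are stand-ins for my actual code, and the batch size of 1000 is arbitrary):

    import string

    # Phase 2: catalog everything, with subtransactions along the way.
    for i in range(len(objects)):
        obj = objects[i]
        uid = string.join(obj.getPhysicalPath(), '/')  # path used as uid
        catalog.catalog_object(obj, uid)
        if i % 1000 == 0:
            get_transaction().commit(1)  # subtransaction commit; this is
                                         # the call that wrote nothing to
                                         # disk and did not free memory
    get_transaction().commit()           # final full commit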
That still leaves the data corruption bug lurking somewhere in ZODB, which seems to be triggered only by the kind of massive update activity involved in the load-with-ZCatalog I tried to do. I just hope it doesn't bite me again when I do the cataloging run.
I have not been successful at narrowing the error down to an easily reproduced test case. When it strikes seems to depend on the update load, so debugging it is going to be an annoyingly long process, I think.
--RDM