[Zope-dev] 27 million objects.

Michael R. Bernstein webmaven@lvcm.com
Mon, 09 Apr 2001 10:33:50 -0700


Andy McKay wrote:
> 
> Any cataloguing and un-cataloguing of an object is expensive, c'mon you are
> changing all the indices, vocabulary and so on. You never notice it normally
> for 1 - 10 things, but run an import script of 10000 and catalog each object
> as it gets added (rather than all of them at the end) and you'll notice the
> difference. (This script was cataloguing 250,000 mail messages, one at a
> time. Big no-no)

Perhaps I expressed myself poorly.

What I am watching out for is evidence that adding,
indexing, reindexing, or retrieving *a single object* (or a
small set of objects) takes longer if there are more
objects stored/indexed already.

In other words, does the time to
store/index/reindex/retrieve an object change (for the
worse) depending on whether there are 10,000 objects,
100,000 objects, or 10,000,000 objects stored/cataloged in
the ZODB/ZCatalog?
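
To make the question concrete, here is a rough sketch of the kind
of measurement I have in mind: time one more insert into a store
that already holds N entries, for several values of N. Note that
this does not touch the ZODB or ZCatalog at all. The sorted Python
list is only a stand-in index, and the names time_single_insert
and measure are made up for illustration. To test the real thing,
the insort call would have to be replaced by whatever adds and
catalogs one object in your setup.

```python
import time
from bisect import insort


def time_single_insert(existing_count, trials=200):
    """Average seconds to add one key to an index holding existing_count keys."""
    # Stand-in index: a sorted list of even integers (NOT a ZCatalog index).
    index = list(range(0, existing_count * 2, 2))
    start = time.perf_counter()
    for i in range(trials):
        # Odd keys are never already present, so each call adds a new entry.
        # They sort near the front, so the list has to shift most of its items.
        insort(index, 2 * i + 1)
    elapsed = time.perf_counter() - start
    return elapsed / trials


def measure(sizes=(10_000, 100_000, 1_000_000)):
    """Map each starting size to the average per-insert time in seconds."""
    return {n: time_single_insert(n) for n in sizes}


if __name__ == "__main__":
    for n, secs in measure().items():
        print(f"{n:>10} existing entries: {secs * 1e6:.2f} microseconds per insert")
```

If the per-insert time stays roughly flat, or grows only slowly, as
N goes up by factors of ten, there is no gotcha of the kind I am
asking about. If it grows in step with N or faster, there is one.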

Previously, it came to light that search performance
suffered depending on a combination of the total number of
objects and the size of the result set (irrespective of the
batch size, apparently). That has apparently been fixed, and
search performance now scales with the number of cataloged
objects.

So, are there any non-linear gotchas waiting for me?

Michael Bernstein.