[Zope-CMF] Bulk Indexing

Shane Landrum srl at boston.com
Fri Jan 23 09:50:37 EST 2004


On Fri, Jan 23, 2004 at 09:37:57AM +0000, Chris Withers wrote:
> Shane Landrum wrote:
> 
> >I've poked at this a bit more and have a refinement of my
> >question. What's the best strategy for changes to indexes
> >on large catalogs in production systems? 
> 
> I do it in a separate ZEO client on another machine, usually in a 
> python-only ZEO client, rather than TTW...

How many ZEO clients do you have running in total when you do
this? Do you have trouble with database ConflictErrors?

It's quite possible that I'm dealing with a unique situation here.
We have 7 large Zope servers talking to a very large ZEO server,
and, unlike most Zope installs, we do a lot of database writes,
more or less all the time. We have an elaborate automatic news feed
system that's always dumping new content into the database, an
automatic workflow approval system for some of that new content,
and a staff of 2-15 editorial people on the system at any one
time, poking and prodding at content. As a result, it's relatively
likely that at least one object out of several hundred thousand
is being modified while the reindexing runs; hence, ConflictErrors.

We're mitigating this a bit by forcing the ZODB to commit a transaction
after every 100 objects reindexed, so each transaction touches fewer
objects and a conflict costs us at most one batch of work. It's not
ideal: I'm uncomfortable groping Zope's internals for optimization,
because I don't know all the implications of forced transaction
commits. But it seems not to work too horribly.
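
A rough sketch of the batching idea, in case it helps. This is not
our actual code; the function name and the `reindex` and `commit`
callables are placeholders I've made up so the example runs on its
own, without a ZODB:

```python
def reindex_in_batches(objects, reindex, commit, batch_size=100):
    """Reindex every object, committing after each full batch.

    `reindex` and `commit` are caller-supplied callables (hypothetical
    stand-ins for whatever the real system uses). Returns the number
    of commits performed.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    commits = 0
    pending = 0  # objects reindexed since the last commit
    for obj in objects:
        reindex(obj)
        pending += 1
        if pending >= batch_size:
            commit()
            commits += 1
            pending = 0

    # Commit whatever is left over from the final partial batch.
    if pending:
        commit()
        commits += 1
    return commits
```

In a real Zope setup, `commit` would presumably be something like
get_transaction().commit (or transaction.commit in newer ZODB
releases), and `reindex` would call each object's reindexObject().
Treat both as assumptions to check against your own version.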

srl
-- 
Shane Landrum, Software Engineer    srl at boston.com
boston.com / NY Times Digital


