[Zope-CMF] reindexing optimizations

Alec Mitchell apm13 at columbia.edu
Mon Nov 21 12:15:05 EST 2005


On Monday 21 November 2005 06:30 am, Chris Withers wrote:
> >                 *) And calls CMFCatalogAware.reindexObjectSecurity()
> > which reindexes the object only on the security index, and doesn't touch
> > metadata.
>
> Does reindexObjectSecurity do anything other than just reindex the
> security indexes? If not, it can go too ;-)

Yes, it recursively reindexes the security of all children.  It will be hard 
to make this go away IMO.
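
To make the recursion concrete, here is a rough toy sketch of the behaviour,
not the actual CMFCatalogAware code.  ToyCatalog, Node and reindex_security
are made-up names for illustration, and I'm assuming 'allowedRolesAndUsers'
is the only security index involved:

```python
class ToyCatalog:
    """Hypothetical stand-in for the catalog; it only records what it was asked to do."""

    def __init__(self):
        self.calls = []

    def reindex(self, path, idxs, update_metadata):
        self.calls.append((path, tuple(idxs), update_metadata))


class Node:
    """Hypothetical content object with a path and child objects."""

    def __init__(self, path, children=()):
        self.path = path
        self.children = list(children)


# Assumed name of the security index; adjust if your catalog differs.
SECURITY_INDEXES = ["allowedRolesAndUsers"]


def reindex_security(node, catalog):
    """Reindex only the security index for node and every descendant."""
    # Metadata is deliberately left alone: only the security index changes.
    catalog.reindex(node.path, SECURITY_INDEXES, update_metadata=False)
    for child in node.children:
        reindex_security(child, catalog)


tree = Node("/site/folder", [Node("/site/folder/a"), Node("/site/folder/b")])
catalog = ToyCatalog()
reindex_security(tree, catalog)
```

The cost grows with the size of the subtree, which is why the call is hard
to drop: a permission change on a folder really does affect what every
child should expose to the catalog.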
...
> > So we have two full reindexes, and three metadata updates.  The last
> > reindex appears to be there only to catch the change to 'portal_type' in
> > _finishConstruction.
>
> Well, it's the last one, so I'd argue it should be the _only_ one. Why
> do things need to be indexed before then?
>
> > Additionally, almost immediately before this last reindexObject call,
> > another reindexObject call has happened in notifyWorkflowCreated, which
> > included a full catalog metadata update.  As a result, updating the
> > catalog metadata here is certainly redundant.  Unfortunately, the
> > CMFCatalogAware.reindexObject method provides no means of avoiding the
> > duplicate metadata update, though it would be trivial to add and to use
> > here.
>
> That sounds like a good idea :-)
>
> > Another option, suggested by Sidnei on IRC, which would avoid the
> > potential issues with limiting the variables indexed in the final
> > reindex, would be to let CMFCatalogAware.manage_afterAdd know
> > (presumably via some state variable)
>
> Why a state variable rather than just a parameter?

How do you propose to pass parameters to manage_after*?  By monkey-patching 
_setObject?
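
A rough sketch of what I mean, using toy classes rather than the real CMF
ones.  The hook is called by the container with a fixed signature, so the
only thing the factory can do is leave a flag on the object beforehand.
ToyContent, ToyFolder, construct_instance and the _constructing flag are
all hypothetical names:

```python
class ToyContent:
    """Hypothetical content class; counts how often it gets indexed."""

    def __init__(self, id):
        self.id = id
        self.index_calls = 0
        self._constructing = False  # hypothetical state flag

    def reindexObject(self):
        self.index_calls += 1

    def manage_afterAdd(self, item, container):
        # Fixed signature: the container invokes this hook, so whoever
        # called _setObject has no way to hand extra arguments through.
        if self._constructing:
            return  # the factory will index once, at the very end
        self.reindexObject()


class ToyFolder:
    """Hypothetical container that fires the add hook itself."""

    def __init__(self):
        self.objects = {}

    def _setObject(self, id, obj):
        self.objects[id] = obj
        obj.manage_afterAdd(obj, self)


def construct_instance(folder, id):
    """Toy factory: suppress the hook's indexing, then index once when done."""
    obj = ToyContent(id)
    obj._constructing = True
    try:
        folder._setObject(id, obj)
    finally:
        obj._constructing = False
    obj.portal_type = "Toy"  # late change the final reindex must catch
    obj.reindexObject()
    return obj


folder = ToyFolder()
made = construct_instance(folder, "via_factory")
plain = ToyContent("direct")
folder._setObject("direct", plain)
```

Objects added directly still get indexed by the hook, so code that
bypasses the factory keeps working as before.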

> > that it is being invoked through constructInstance/invokeFactory,
> > in which case it could safely skip the initial indexing and allow
> > _finishConstruction to take care of indexing the object fully on its
> > own at the end.
>
> +1 from me.
>
> > In the long term we will probably be better served by delaying all
> > indexing to transaction boundaries, though it will be a fair bit harder
> > to implement, and may irk some developers who depend on immediate
> > changes to the catalog on reindex.
>
> Yeah, it also makes things harder to test. Unit tests require stuff to
> be indexed, so if this was the way to go, which apart from that one
> thing I think _should_ be the case, there should be a "flush all pending
> indexing" thing, which should keep everyone happy. Just have to make
> sure that then doesn't get misused and end up being called 100 times per
> operation ;-)

Ben Saller's Eventually product apparently has this type of functionality: 
delayed events, and methods for coercing them to run immediately.  It may be 
a good place to look.
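
For the sake of discussion, here is a minimal sketch of the general idea.
It is not Eventually's API (I haven't checked how it is implemented) and
it doesn't hook into the transaction machinery; DeferredIndexQueue and its
methods are hypothetical names:

```python
class DeferredIndexQueue:
    """Hypothetical queue that collects reindex requests until flushed."""

    def __init__(self, index_func):
        self._index_func = index_func  # callable doing the real indexing
        self._pending = {}

    def schedule(self, path, obj):
        # Keyed by path, so repeated requests for one object collapse.
        self._pending[path] = obj

    def flush(self):
        """Index everything pending; return how many objects were indexed."""
        pending, self._pending = self._pending, {}
        for path, obj in pending.items():
            self._index_func(path, obj)
        return len(pending)


indexed = []
queue = DeferredIndexQueue(lambda path, obj: indexed.append(path))

# Three reindex requests for the same object during one operation...
queue.schedule("/site/doc", object())
queue.schedule("/site/doc", object())
queue.schedule("/site/doc", object())
queue.schedule("/site/other", object())

# ...become a single indexing pass per object when flushed.
flushed = queue.flush()
```

In a real setup flush() would presumably run at the transaction boundary,
and unit tests could call it explicitly when they need the catalog to be
up to date.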

Alec

