[Zope-dev] ZCatalog caching with memcached

Roché Compaan roche at upfrontsystems.co.za
Mon Oct 27 08:32:40 EDT 2008


On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:
> On Oct 27, 2008, at 13:08 , Roché Compaan wrote:
> 
> > On Sun, 2008-10-26 at 14:07 -0400, Tres Seaver wrote:
> >> - Plone uses too many indexes, and in particular, uses multiple text
> >>   indexes.  Having extra indexes around "just in case" is a sure lose
> >>   at write time, and may even be expensive at query time (depending on
> >>   the query).
> >>
> >> - Particular indexes have performance characteristics based on their
> >>   designed purpose:  for instance, the stock FieldIndex implementation
> >>   assumes that the number of documents indexed will be >> the number of
> >>   discrete indexable values.  Using such an index in an application
> >>   domain with a very large set of indexable values probably loses, and
> >>   in ways which don't show up in early / small-scale testing.
> >>
> >> - I'm pretty sure that we haven't yet found the best data structure for
> >>   "hierarchy indexes" (e.g., the Plone EPI index, or the stock Zope2
> >>   PathIndex, etc.).  Something like a 'trie' might be optimal for
> >>   pure prefix searching of hierarchies.
> >>
> >> - I am confident that the TopicIndex is underutilized:  it does *all*
> >>   the work for a given query at write time, and can thus be blindingly
> >>   fast at query time.
> >>
> >> - Other special-purpose indexes (e.g., a "recent items" index) would
> >>   be worth a look, especially for applications with large volumes of
> >>   content.
> >
> > I agree that one should look at improving performance without caching as
> > well. But this is a lot harder and takes significantly more development
> > and debugging time than introducing some form of caching. So I'm not
> > convinced that it needs to happen in a certain order. If caching gives
> > you lots of performance with little effort now, then why shouldn't you
> > use it?
> 
> It's the typical trade-off. One course is expedient and fast for your
> use case now. The other requires more resources, but benefits
> everyone, including those who don't want to depend on yet another
> package, like memcached, for performance.

I'm not tied to memcached. We started out using module level caches like
zope.cache.ram but that has obvious problems when using ZEO.
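
A minimal sketch of that problem, assuming two ZEO clients modelled as
plain Python objects. All class and method names here are hypothetical
and stand in for real Zope processes; this is not the code we run:

```python
# Hypothetical model: each ZEO client is a separate process with its own
# in-process cache; SharedStorage stands in for the ZEO server.

class SharedStorage:
    """One copy of the catalog data, shared by all clients."""
    def __init__(self):
        self.catalog = {"portal_type=Document": ["doc1"]}


class ZeoClient:
    """One Zope process with a private (module-level style) cache."""
    def __init__(self, storage):
        self.storage = storage
        self.local_cache = {}

    def query(self, key):
        # Serve from the local cache when possible, else read the storage.
        if key not in self.local_cache:
            self.local_cache[key] = list(self.storage.catalog[key])
        return self.local_cache[key]

    def index(self, key, doc_id):
        # The write reaches shared storage, but only this process's
        # cache entry is invalidated.
        self.storage.catalog[key].append(doc_id)
        self.local_cache.pop(key, None)


storage = SharedStorage()
client_a = ZeoClient(storage)
client_b = ZeoClient(storage)

key = "portal_type=Document"
client_a.query(key)           # both clients warm their caches
client_b.query(key)
client_a.index(key, "doc2")   # the write goes through client A

fresh = client_a.query(key)   # A re-reads and sees the new document
stale = client_b.query(key)   # B still serves its old cached result
```

A cache shared by all clients, such as memcached, sidesteps this
because every client reads and invalidates the same entries.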

> When it comes to integrating anything in Zope itself I'd choose the
> latter.

Sure, we're not trying to get this into Zope, we're just sharing our
experience and exploring the territory so that one can produce a third
party package that really helps people with the same use case (which I
suspect is quite a common one).

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za


