[Zope-Annce] [Ann] Incremental filtering: a new way for catalog query optimization

Dieter Maurer dieter at handshake.de
Sun Apr 9 13:24:37 EDT 2006


Catalog searches are often slow.

IncrementalSearch[2] can lead to drastic gains in query time.
However, there is a large class of queries that cannot be significantly
sped up: queries that contain huge Or subqueries involving
bushy indexes, usually resulting
from a time-based subquery checking effectiveness and expiration (or
other similar use cases).


The newest versions of the companions "AdvancedQuery", "ManagableIndex"
and "IncrementalSearch[2]" support incremental filtering to
execute most such queries efficiently.

With incremental filtering, the index is not used in the usual way.
Instead, the remaining query parts determine a set of document
candidates which is then filtered by the filtering subqueries,
dropping documents not matched by these subqueries.

Let's look at an example:

  Suppose you search for news containing 'AdvancedQuery'
  which are effective and not expired.

  The standard query would look like
  
	Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery')
	& Le('effective', now) & Ge('expires', now)

  Internally, the "Le('effective', now)" subquery is
  expanded into "Or(*[Eq('effective', t) for t in 'effective' and t<=now])",
  which usually is huge. The "Ge" subquery is similarly expanded.

  The filtering query has the form

	Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery')
	& Ge('expires', now, filter=True) & Le('effective', now, filter=True)

  When this query is executed,
  "Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery')"
  determines a set of candidate objects.
  From this (probably small) set, objects not satisfying "expires >= now"
  and (then) "effective <= now" are filtered out.
  This way, we avoid the construction of huge Or subqueries.
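
The difference between the two evaluation strategies can be sketched
with a toy model. This is not the code of "IncrementalSearch[2]" or
"ManagableIndex": plain dicts and sets stand in for the catalog
indexes, integers stand in for times, and the names "le_by_expansion",
"le_by_filtering", "effective_index" and "effective_of" are made up
for the illustration.

```python
# Toy model only: plain dicts and sets stand in for catalog indexes.
# Inverted index: value -> set of document ids (the usual way an index is used).
effective_index = {1: {10, 11}, 2: {12}, 3: {13, 14}, 9: {15}}
# Forward mapping: document id -> value (what a filter checks per document).
effective_of = {doc: t for t, docs in effective_index.items() for doc in docs}


def le_by_expansion(index, bound):
    """Usual way: unite the document sets of every indexed value <= bound."""
    result = set()
    for value, docs in index.items():
        if value <= bound:
            result |= docs
    return result


def le_by_filtering(candidates, forward, bound):
    """Filtering: keep only the candidates whose own value is <= bound."""
    return {doc for doc in candidates if forward[doc] <= bound}


now = 3
# Assumed result of the portal_type and SearchableText subqueries.
candidates = {11, 14, 15}

expanded = candidates & le_by_expansion(effective_index, now)
filtered = le_by_filtering(candidates, effective_of, now)
assert expanded == filtered == {11, 14}
```

Both strategies give the same result; the filtering variant never
builds the large union over all values up to "now" and only looks at
the three candidates.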


Of course, filtering is only efficient in some circumstances:
usually, when the other query parts already guarantee a small
set of candidates and the filtering can avoid the construction
of large intermediaries. Otherwise, filtering may not reduce
the query time but increase it (maybe even drastically).
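
A rough way to see this tradeoff is to count the documents each
strategy has to touch. The following is only a back-of-the-envelope
sketch with made-up numbers and hypothetical helper names
("expansion_work", "filtering_work"); it counts items, not time, and
ignores that a per-document check and a set union have very different
costs per item in a real catalog.

```python
def expansion_work(index, bound):
    """Documents touched when uniting all index entries with value <= bound."""
    return sum(len(docs) for value, docs in index.items() if value <= bound)


def filtering_work(candidates):
    """Filtering performs one per-document check for each candidate."""
    return len(candidates)


# Made-up index: 1000 distinct values with 50 documents each.
index = {t: set(range(t * 50, (t + 1) * 50)) for t in range(1000)}
bound = 899  # 900 of the 1000 values satisfy "value <= bound"

small = set(range(20))      # other query parts are very selective
large = set(range(50000))   # other query parts select every document

print(expansion_work(index, bound))   # items touched by the expanded Or
print(filtering_work(small))          # items checked when filtering few candidates
print(filtering_work(large))          # items checked when filtering all documents
```

With the small candidate set, filtering touches far fewer items than
the expansion; with the large one it touches more, which is the case
where filtering makes the query slower.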

Incremental filtering is a powerful optimization tool,
which needs careful usage...


You need the complete bundle ("AdvancedQuery", "IncrementalSearch[2]"
and "ManagableIndex") when you want to use incremental filtering.


More information and download:

  <http://www.dieter.handshake.de/pyprojects/zope>

     

-- 
Dieter

