[Zope] ZCTextIndex - prefix wildcards not supported?

Casey Duncan casey at zope.com
Fri Nov 21 12:13:38 EST 2003


On Thu, 20 Nov 2003 12:38:24 -0500
"Small Business Services" <toolkit at magma.ca> wrote:

> Why are wildcards '?' and '*' not supported at the beginning of search terms in ZCTextIndex?  It would be very useful to search for terms using '*someterm'.
> 
> In the cvs for ZCTextIndex, Lexicon.py 
> (http://cvs.zope.org/Products/ZCTextIndex/Lexicon.py?annotate=1.17.10.2)
> 
> the code raises an exception for wildcards at the beginning of search terms (see line 113) and a related comment says:
> 
> 111                                  # The pattern starts with a globbing character.
> 112                                  # This is too efficient, so we raise an exception.
> 
> Why is this 'too efficient'?

I think it should say "too inefficient". The data structures in the lexicon as it is currently implemented cannot efficiently return all of the matching words for *foo. Answering that query would require iterating over every word in the lexicon.
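
To make the difference concrete, here is a rough sketch. It is not the ZCTextIndex code: a sorted Python list stands in for the lexicon's ordered word mapping, and the function names are made up for illustration. A trailing wildcard can be answered from one contiguous run of the sorted words, while a leading wildcard gets no help from the sort order and has to test every word.

```python
from bisect import bisect_left

def prefix_matches(sorted_words, prefix):
    """Words matching 'prefix*': one contiguous run in the sorted list."""
    start = bisect_left(sorted_words, prefix)  # binary search for the first candidate
    end = start
    while end < len(sorted_words) and sorted_words[end].startswith(prefix):
        end += 1
    return sorted_words[start:end]

def suffix_matches(sorted_words, suffix):
    """Words matching '*suffix': sort order is no help, so scan every word."""
    return [w for w in sorted_words if w.endswith(suffix)]

# Hypothetical toy lexicon.
words = sorted(["bar", "barfoo", "foo", "foobar", "food", "zoo"])
print(prefix_matches(words, "foo"))
print(suffix_matches(words, "foo"))
```

The first call touches only the words that match (plus a binary search); the second touches all of them, which is what hurts on a large lexicon.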

As Andreas said, it would be possible to implement this efficiently if the lexicon kept a separate head globbing index, but this would greatly increase the size of the lexicon and would make updates somewhat more expensive (although probably not too much in steady-state).
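
One way such a head globbing index could work, sketched under my own assumptions (the class and method names are hypothetical, and plain sorted lists stand in for whatever persistent structure a real lexicon would use): keep a second sorted collection of the words spelled backwards, so that *foo becomes a prefix search for "oof" in the reversed collection.

```python
from bisect import bisect_left, insort

class GlobLexicon:
    """Toy lexicon with a second index of reversed words (illustration only)."""

    def __init__(self):
        self.words = []           # sorted words, serves 'foo*'
        self.reversed_words = []  # sorted reversed spellings, serves '*foo'

    def add(self, word):
        i = bisect_left(self.words, word)
        if i < len(self.words) and self.words[i] == word:
            return  # already present
        self.words.insert(i, word)
        # The extra cost: every new word is stored and inserted a second time.
        insort(self.reversed_words, word[::-1])

    @staticmethod
    def _prefix_range(sorted_list, prefix):
        start = bisect_left(sorted_list, prefix)
        end = start
        while end < len(sorted_list) and sorted_list[end].startswith(prefix):
            end += 1
        return sorted_list[start:end]

    def tail_glob(self, prefix):
        """Match 'prefix*' against the forward index."""
        return self._prefix_range(self.words, prefix)

    def head_glob(self, suffix):
        """Match '*suffix' as a prefix search on the reversed index."""
        hits = self._prefix_range(self.reversed_words, suffix[::-1])
        return sorted(w[::-1] for w in hits)

# Hypothetical usage.
lex = GlobLexicon()
for w in ["bar", "barfoo", "foo", "foobar", "food", "zoo", "foo"]:
    lex.add(w)
print(lex.head_glob("foo"))
print(lex.tail_glob("foo"))
```

Both costs mentioned above show up here: the word list is stored twice, and each add does two ordered insertions instead of one.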

I'm curious, you said you had 700,000 some-odd documents in your catalog. How many words are in the lexicon(s) you have?

-Casey


