[Zope] RE: Confera reply error

Michel Pelletier michel@digicool.com
Thu, 28 Oct 1999 11:56:18 -0400


> -----Original Message-----
> From: Thomas B. Passin [mailto:tpassin@mitretek.org]
> Sent: Thursday, October 28, 1999 10:53 AM
> To: zope@zope.org
> Subject: Re: [Zope] RE: Confera reply error
> 
> 
> Thanks, Cayce.  This bug is gone in version 1.3.2, as you said.
> 
> Now, Confera-people, can we get some wild-card action for searches?

This is a vastly harder problem than it sounds.  Basically, there's no
good speed/space tradeoff in indexes to get this done well.  You can't
simply run a regular-expression search through the index, because that
would require looking at every single word in the index, which could
take a looooong time.
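
To make the cost concrete, here is a minimal sketch of the naive
approach.  It assumes a toy inverted index held in a plain dict; the
names `index` and `regex_search` are made up for illustration and are
not Confera or Zope APIs.

```python
import re

# Hypothetical toy inverted index: word -> set of document ids.
index = {
    "zope": {1, 2},
    "zebra": {3},
    "python": {1, 3},
    "pythonic": {2},
}

def regex_search(index, pattern):
    """Return (matching doc ids, number of index words examined)."""
    compiled = re.compile(pattern)
    hits = set()
    words_examined = 0
    # There is no shortcut here: every word in the index is tested.
    for word, doc_ids in index.items():
        words_examined += 1
        if compiled.fullmatch(word):
            hits |= doc_ids
    return hits, words_examined

hits, examined = regex_search(index, r"pyth.*")
```

The work grows with the size of the vocabulary, not with the number of
matches, which is why this gets slow on a large index.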

To give you an example, htdig is probably the most popular indexing
engine in the open source world.  It doesn't even do wildcard
searching; the best it can do is 'substring' searching, and it has this
to say about it:

substring
  Matches all words containing the queries as substrings. Since this
  requires checking every word in the database, this can really slow
  down searches considerably.

Due to the complexity of this problem, I don't know of any open source
indexer that does this.  There is a particularly interesting algorithm
called 'n-grams' that will let you do simple 'globbing' (DOS style
wildcards) but it does have an additional space requirement in the
index.
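
A rough sketch of the n-gram idea, under stated assumptions: trigrams
(N = 3), a "$" marker for word boundaries that never occurs in a real
word, and patterns that use only `*` and `?`.  All function names here
are hypothetical.  The gram index is the extra space; in exchange, a
glob only has to check the words that share its literal grams.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

N = 3  # gram size; an assumption, trigrams are a common choice

def grams(text, n=N):
    """All substrings of length n (empty set if text is shorter)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def build_gram_index(words, n=N):
    """Map each gram to the set of words containing it."""
    gram_index = defaultdict(set)
    for word in words:
        # "$" marks the start and end of the word.
        for g in grams("$" + word + "$", n):
            gram_index[g].add(word)
    return gram_index

def glob_search(words, gram_index, pattern, n=N):
    """Find words matching a glob that uses only '*' and '?'."""
    padded = "$" + pattern + "$"
    required = set()
    # Only the literal pieces between wildcards yield usable grams.
    for piece in padded.replace("?", "*").split("*"):
        required |= grams(piece, n)
    if required:
        candidates = set.intersection(
            *(gram_index.get(g, set()) for g in required))
    else:
        # Pattern too short to give any gram: fall back to a full scan.
        candidates = set(words)
    # The gram filter can over-match, so confirm each candidate.
    return sorted(w for w in candidates if fnmatchcase(w, pattern))

words = ["zope", "zebra", "python", "pythonic", "typhoon"]
gram_index = build_gram_index(words)
prefix_hits = glob_search(words, gram_index, "pyth*")
suffix_hits = glob_search(words, gram_index, "*on")
```

Note that a very short pattern such as `z*` produces no trigram at all,
so this sketch falls back to scanning every word in that case.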

-Michel