[Zope-Annce] [ANN] TextIndexNG 2.0 final released

Andreas Jung <andreas@andreas-jung.com>
Sun, 20 Jul 2003 16:39:12 +0200

I am pleased to announce the final release of TextIndexNG 2.0.

What's new in TextIndexNG 2.0?

  - Relevance ranking of search results added. Searches are now ranked
    using an extended cosine measure. The cosine measure is based on
    a vector model and calculates the document "score" based on the
    frequency of the query terms inside the document result set.

  - Much faster phrase/near search: the old implementation of TextIndexNG
    had to perform a very expensive job at query time whenever a phrase/near
    search was performed. Re-using the WidCode module of ZCTextIndex made
    this operation less expensive.

  - Left-truncation added: TextIndexNG can be configured at creation
    time to support left-truncation (meaning you can search for "*suffix")

  - optional auto-expansion support: this optional feature helps to get
    better search results when some of the query terms could not be found.
    The index expands a query term "foo" to "foo*" if there was no hit
    for "foo". This expansion is currently global for the index. It
    will be available on a per-query basis in a later version (and
    will be extended in a later version to search for similar terms).

  - improved HTML converter: now using Chris Withers' "Strip-o-Gram" module
    instead of the Strip-Tag-Parser

  - added converter for text/sgml

  - Similarity search (soundex, metaphone, doublemetaphone) dropped
    and replaced with a more general, language-independent
    approach using the Levenshtein distance.

  - internal code cleanup, more unittests

  - range searches like "Fi..Foo"

  - substring searches "*substring*"

  - reduced conflict errors caused by the lexicon/storage implementation

  - no longer conflicts with TextIndex V 1 installations
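
For readers unfamiliar with some of the terms above, here is a small
Python sketch of three of the ideas: cosine ranking, Levenshtein
distance and auto-expansion. It is not TextIndexNG code. The function
names (cosine_score, levenshtein, lookup) and the use of a plain set as
a "lexicon" are assumptions made for illustration only, and the cosine
function is the plain textbook measure, not the extended variant used
by the index.

```python
import math
from collections import Counter


def cosine_score(query_terms, doc_terms):
    """Plain cosine similarity between two term-frequency vectors.

    Textbook measure only; the "extended" variant mentioned in the
    announcement is not described there and is not reproduced here.
    """
    q = Counter(query_terms)
    d = Counter(doc_terms)
    # Counter returns 0 for missing terms, so unmatched terms add nothing.
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def lookup(term, lexicon):
    """Hypothetical lookup: exact match first; if there is no hit,
    behave as if the query had been "term*" (right truncation)."""
    if term in lexicon:
        return [term]
    return sorted(w for w in lexicon if w.startswith(term))
```

For example, levenshtein("kitten", "sitting") is 3, and
lookup("foo", {"foobar", "food", "bar"}) falls back to the prefix
matches because "foo" itself is not in the set.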




Project Wiki:


Note: there are currently no binary packages available
for the TextIndexNG extension modules. They will be provided
at a later time.