[Zope-dev] ZCatalog cannot support chinese?

Michel Pelletier michel@digicool.com
Thu, 24 Feb 2000 18:14:57 -0800


Victor.Zhai@ogilvy.com wrote:
> 
> Hi,all
>    In my project, I want to use ZCatalog to build up a search interface!
> But It doesnot support Chinese. Can some one give me some advice on it.

ZCatalog does not currently support Chinese for several reasons:

 1) I've never seen or worked with Chinese, and I have no environment to
debug it.

 2) Python itself is still working on complete internationalization.

 3) ZCatalog is very English-centric.

However, I am working on several enhancements to ZCatalog which will
help you here.  First, ZCatalog now supports the notion of
Vocabularies.  Vocabularies are separate objects from ZCatalogs.
Vocabularies separate all of the language-specific features from
ZCatalog.  Therefore, if you subclass and create your own kind of
Vocabulary (say, ChineseVocabulary), you can:

  1) create your own kind of 'Splitter', which is the object that splits
documents into words.  Currently Zope's splitter is very simple and only
knows how to split English (and some European) languages into words on
spaces.  Splitting Chinese probably requires a much different algorithm.

  2) control stop words and synonyms.  Right now, Zope has hard-coded
stop words that are English-only, and no synonym support.  In 2.2, Zope
Vocabularies will allow you to control these stop words and synonyms in
a language-neutral fashion.
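
To make the two points above concrete, here is a rough sketch in plain
Python of what a Chinese-aware splitter and a stop word/synonym pass
could look like.  Nothing here is Zope's actual API: is_cjk, split_words
and normalize are made-up names, and overlapping character bigrams are
just one common way to index text that has no spaces between words.

```python
# Hypothetical sketch: none of these names come from Zope itself.

def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return '\u4e00' <= ch <= '\u9fff'


def split_words(text):
    """Split text into index terms.

    Runs of CJK characters become overlapping bigrams (a lone character
    is kept as-is); other text is split on non-alphanumerics and
    lowercased.
    """
    words = []
    run = []    # current run of CJK characters
    token = []  # current run of non-CJK alphanumerics

    def flush_run():
        if len(run) == 1:
            words.append(run[0])
        for i in range(len(run) - 1):
            words.append(run[i] + run[i + 1])
        del run[:]

    def flush_token():
        if token:
            words.append(''.join(token).lower())
        del token[:]

    for ch in text:
        if is_cjk(ch):
            flush_token()
            run.append(ch)
        elif ch.isalnum():
            flush_run()
            token.append(ch)
        else:
            flush_run()
            flush_token()
    flush_run()
    flush_token()
    return words


def normalize(words, stop_words=(), synonyms=None):
    """Map each word through the synonym table, then drop stop words."""
    synonyms = synonyms or {}
    result = []
    for word in words:
        word = synonyms.get(word, word)
        if word not in stop_words:
            result.append(word)
    return result
```

The point of keeping the stop words and synonyms as arguments rather
than constants is that the same indexing code can then be reused for any
language by passing in different tables.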

These features are in the current CVS but they are still quite raw.
What would help is the currently unreleased ZCatalog User's Guide, the
latest version of which is currently on a Zip disk packed in a box
somewhere here in my apartment.  I should really dig that up.

But for Chinese support, you're going to have to roll up your sleeves a
little and subclass your own kind of Vocabulary object.  This is not
really so hard to do; it's just hard to understand without
documentation.
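
The general shape of that subclassing might look something like the
following.  This is only an illustration of the pattern: BaseVocabulary
is a toy stand-in written for this example, not the real Vocabulary
class in CVS, and its split and insert methods are assumed names.  The
per-character split is the crudest possible approach for Chinese.

```python
# Hypothetical stand-in: BaseVocabulary is NOT Zope's Vocabulary class.

class BaseVocabulary:
    """Toy vocabulary that maps words to integer ids."""

    def __init__(self):
        self._ids = {}

    def split(self, text):
        # Default behaviour: lowercase and split on whitespace,
        # which is good enough for English.
        return text.lower().split()

    def insert(self, text):
        """Split text and return word ids, assigning new ids as needed."""
        return [self._ids.setdefault(word, len(self._ids))
                for word in self.split(text)]


class ChineseVocabulary(BaseVocabulary):
    """Overrides only the language-specific part: how text is split."""

    def split(self, text):
        # Treat every non-space character as its own word.
        return [ch for ch in text if not ch.isspace()]
```

Everything language-neutral (here, the id bookkeeping in insert) stays
in the base class, so the subclass only has to supply the splitting
rule.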

-Michel