[Zope-dev] Re: Catalog and Unicode

Florent Guillaume fg at nuxeo.com
Thu Aug 11 05:48:46 EDT 2005

Dieter Maurer wrote:
> Florent Guillaume wrote at 2005-8-9 17:18 +0200:
>>We're seeing problems in one application here due to the catalog and  
>>interactions with Unicode. Here's what happens:
>>- an object is indexed with a Unicode title, so in the catalog the  
>>metadata tuple has for instance (u'cafe',)
>>- later that title is changed to latin-1, so the new metadata tuple  
>>would be ('caf\xe9',)
>>The problem is that Catalog.py has in updateMetadata() the code:
>>            if data.get(index, 0) != newDataRecord:
>>                data[index] = newDataRecord
>>[the comparison raises UnicodeDecodeError here; proposed replacement:]
>>            try:
>>                changed = data.get(index, 0) != newDataRecord
>>            except UnicodeDecodeError:
>>                changed = True
>>            if changed:
>>                data[index] = newDataRecord
>>Objections ?
> I fear, you will get similar problems in the indexes.
> You should avoid mixed unicode/non-unicode in fields or indexes
> (or set the "default encoding" appropriately).

For indexes I agree, and indeed my Title example was not ideal. But 
metadata fields need not have anything to do with indexes...

Suppose you're migrating your code from utf-8 encoded str to unicode. 
You then have no way to recatalog your objects: it will blow up in 
updateMetadata()...
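
To make the failure concrete, here is a minimal sketch, not the actual 
Catalog.py code. The names LegacyStr and update_metadata are made up for 
illustration. LegacyStr only imitates the implicit ASCII decoding that 
happens when an encoded str is compared with a unicode object, so that the 
snippet also runs on interpreters that no longer do that coercion:

```python
class LegacyStr(object):
    """Hypothetical stand-in for an encoded byte string (not a Zope class)."""

    def __init__(self, raw):
        self.raw = raw  # bytes, e.g. latin-1 encoded

    def __eq__(self, other):
        if isinstance(other, LegacyStr):
            return self.raw == other.raw
        if isinstance(other, type(u"")):
            # Imitates the implicit coercion: decode as ASCII, then compare.
            # Raises UnicodeDecodeError for any byte >= 0x80.
            return self.raw.decode("ascii") == other
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    __hash__ = None  # instances are never used as dict keys here


def update_metadata(data, index, new_record):
    """Store new_record under index if it differs; return True if stored."""
    try:
        changed = data.get(index, 0) != new_record
    except UnicodeDecodeError:
        # The records cannot even be compared, so treat them as different.
        changed = True
    if changed:
        data[index] = new_record
    return changed


data = {1: (u"cafe",)}                 # record cataloged with a unicode title
new_record = (LegacyStr(b"caf\xe9"),)  # same field, now latin-1 encoded

try:
    data[1] != new_record              # the unguarded comparison
    naive_raised = False
except UnicodeDecodeError:
    naive_raised = True

stored = update_metadata(data, 1, new_record)
```

With the guard, the first call stores the new record even though the two 
tuples cannot be compared, and a second call with the same record leaves 
the mapping alone.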


Florent Guillaume, Nuxeo (Paris, France)   CTO, Director of R&D
