[Zope-dev] ZCatalog: updateMetadata and comparing string and unicode

Dieter Maurer dieter at handshake.de
Thu Mar 6 13:41:55 EST 2008


Maurits van Rees wrote at 2008-3-5 23:57 +0000:
> ...
>I have an item in the portal_catalog of my Plone site that has some
>string as description.  The real object meanwhile has had a code
>change so the description field now returns unicode.  When I now
>recatalog that object it throws an error:
>
>  Module Products.ZCatalog.Catalog, line 359, in catalogObject
>  Module Products.ZCatalog.Catalog, line 318, in updateMetadata
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 159: ordinal not in range(128)
>> /home/maurits/buildout/projectdeploy/parts/zope2/lib/python/Products/ZCatalog/Catalog.py(318)updateMetadata()
>-> if data.get(index, 0) != newDataRecord:

You must not mix "unicode" and "str" as keys in the same index.
If you do, errors like the one above are very likely.
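
To illustrate where the error comes from: when a "str" and a "unicode"
meet, Python 2 decodes the "str" implicitly with the default encoding
(normally ASCII). The sketch below makes that step explicit. It is
written with Python 3 syntax so it can be run today, with "bytes"
standing in for the old "str"; only the byte 0xc2 comes from the
traceback, the second byte and the UTF-8 encoding are assumptions made
for illustration.

```python
# Only 0xc2 is taken from the traceback. The trailing 0xa0 is an assumed
# continuation byte; together they form the UTF-8 encoding of U+00A0.
stored = b"\xc2\xa0"

try:
    # Emulates the implicit conversion with the default (ASCII) encoding.
    stored.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# With the right encoding (assumed here to be UTF-8) the decode succeeds.
decoded = stored.decode("utf-8")

print(message)
print(repr(decoded))
```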

You can try the following approaches:

  *  If you know the encoding used by your "str" objects,
     you can set Python's default encoding to this encoding.
     Whenever "unicode" and "str" are combined or compared, the "str"
     is then converted to "unicode" using this encoding (which hopefully
     is the correct one in all such cases).

     "sys.setdefaultencoding" is only available at startup.
     Thus, setting "defaultencoding" must happen in a "sitecustomize"
     or "site" module.

  *  You switch completely to "unicode" for the given index
     and convert the BTrees used by the index.

     An index usually uses two BTrees: the so-called forward index
     (usually called "_index"), which maps each index term to the set
     of record ids indexed under that term, and the reverse index
     (usually called "_unindex"), which maps each record id to the
     value indexed for that object.

     You need to convert the keys of the forward index
     and the values of the reverse index. For a "FieldIndex",
     the value is the index term; for a "KeywordIndex", it is
     a sequence of index terms (all of which need to be converted).

     The forward index can be converted as follows:

	 self._index = OOBTree(((s.decode(<your encoding>), v) for (s,v) in self._index.items()))

     The reverse index uses an IOBTree and can be converted in a
     similar way, but the details depend on the index type.
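
The conversion described above might look roughly like the following
sketch. It is not the ZCatalog API: plain dicts stand in for the
OOBTree and IOBTree, the record-id sets are plain Python sets, and
Python 3 "bytes"/"str" stand in for the old "str"/"unicode". The helper
names and the UTF-8 encoding are hypothetical. Terms that are already
text are left alone, and record ids are merged when two old keys decode
to the same term.

```python
# Assumption: the old byte strings were encoded as UTF-8.
ENCODING = "utf-8"


def to_text(term, encoding=ENCODING):
    """Decode byte strings; leave terms that are already text untouched."""
    if isinstance(term, bytes):
        return term.decode(encoding)
    return term


def convert_forward(index, encoding=ENCODING):
    """Forward index (term -> set of record ids): convert the keys."""
    converted = {}
    for term, rids in index.items():
        # Two old keys may decode to the same term, so merge their ids.
        converted.setdefault(to_text(term, encoding), set()).update(rids)
    return converted


def convert_reverse_field(unindex, encoding=ENCODING):
    """FieldIndex-style reverse index (record id -> one term)."""
    return {rid: to_text(term, encoding) for rid, term in unindex.items()}


def convert_reverse_keyword(unindex, encoding=ENCODING):
    """KeywordIndex-style reverse index (record id -> sequence of terms)."""
    return {rid: [to_text(t, encoding) for t in terms]
            for rid, terms in unindex.items()}


# Made-up sample data mixing byte strings and text.
forward = {b"caf\xc3\xa9": {1}, "caf\u00e9": {2}, b"tea": {3}}
reverse = {1: b"caf\xc3\xa9", 2: "caf\u00e9", 3: b"tea"}

new_forward = convert_forward(forward)
new_reverse = convert_reverse_field(reverse)
```

On a real index the results would have to be written back into new
BTrees of the appropriate types, which this sketch does not attempt.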



-- 
Dieter

