[Zope] Problems with unicode and other encodings.

Alec Munro alecmunro at gmail.com
Tue Nov 16 11:18:55 EST 2004


Hi List,

I don't know much about encodings, so please excuse any ignorance I display.

I am having a problem with encodings. I have reStructuredText objects,
and text fields in a database that are interpreted as
reStructuredText. Either of these can be edited by lay people, and in
many cases the text will be copied wholesale from Word.
This produces a problem when these pages are viewed, because they
contain improper characters, such as the ones mentioned in this
article:
http://effbot.org/zone/unicode-gremlins.htm
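
For illustration, here is a minimal sketch of what those characters look
like at the byte level. It assumes a modern Python 3 interpreter and that
the pasted text arrives as Windows-1252 (cp1252) bytes; the sample string
is made up.

```python
# Hypothetical sample: bytes as they might arrive from a Word paste,
# with cp1252 curly quotes, an en dash, and a curly apostrophe.
raw = b"\x93Hello\x94 \x96 it\x92s"

# Read as Latin-1, bytes 0x80-0x9F become invisible C1 control characters.
as_latin1 = raw.decode("latin-1")
gremlins = [c for c in as_latin1 if 0x80 <= ord(c) <= 0x9F]

# Read as cp1252, the same bytes become the intended punctuation.
as_cp1252 = raw.decode("cp1252")

print(len(gremlins))      # number of problem characters found
print(repr(as_cp1252))    # the text with proper Unicode punctuation
```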

This article presents a way to convert these characters to Unicode,
which seems to work quite well in and of itself. However, if I
retrieve database fields, convert them, and then attempt to reinsert
them with the Unicode characters, I get a UnicodeEncodeError, because
something is attempting to encode these characters as ASCII before
inserting them in the database.
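
The failure can be reproduced outside the database. The sketch below
assumes Python 3 and a made-up sample string; whether a given database
adapter accepts UTF-8 bytes or wants character references instead is
also an assumption, not something established here.

```python
# Hypothetical sample: text after the Word characters have been
# converted to proper Unicode curly quotes.
text = "\u201cquoted\u201d"

# Encoding to ASCII fails, since curly quotes are outside the ASCII range.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly to UTF-8 succeeds and round-trips without loss.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# If the storage must stay pure ASCII, numeric character references
# are one possible fallback.
ascii_safe = text.encode("ascii", "xmlcharrefreplace")
```

The point of the sketch is only that the encoding step has to be made
explicit somewhere; which of the two options suits a particular
database is a separate decision.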

What are possible solutions to these problems? Is there a standard
practice that needs to be followed? Should I maintain the data as it
is, and simply convert it to Unicode before display? Alternatively,
should I enforce a policy where those characters cannot be used? Is
Unicode the encoding I should be using?

Thanks for any help.

Alec Munro

