[Zope-dev] Unicode treatment in 2.6b1

Florent Guillaume fg@nuxeo.com
Thu, 26 Sep 2002 23:58:42 +0200


Andreas Kostyrka  <andreas@kostyrka.priv.at> wrote:
> So how are these Unicode changes supposed to work? Are non-ascii
> characters forbidden now? And how do I get UTF-8 text into Zope?

If your code outputs only plain Python strings, ZPublisher passes them
as-is to the client.

If ZPublisher has to output a Unicode string, it has to decide how to
translate that into a byte string at the other end. What it does then is
encode the Unicode string into the charset defined in any 'Content-Type:
text/xxx; charset=thecharset' header you produced using
RESPONSE.setHeader (defaulting to latin-1).
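
As a rough illustration of that rule (this is not ZPublisher's actual
code), the sketch below models it in present-day Python 3, where bytes
stands in for the plain strings and str for the Unicode strings
discussed here. The helper names charset_from_content_type and
encode_body are made up for this example.

```python
import re

DEFAULT_CHARSET = "latin-1"  # the fallback described above


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or the default."""
    match = re.search(r'charset="?([\w.:-]+)', content_type or "", re.IGNORECASE)
    if match:
        return match.group(1)
    return DEFAULT_CHARSET


def encode_body(body, content_type=None):
    """Hypothetical helper: byte strings pass through, Unicode gets encoded."""
    if isinstance(body, bytes):
        # Plain strings are sent to the client untouched.
        return body
    # Unicode strings are encoded with the charset the header declares.
    return body.encode(charset_from_content_type(content_type))
```

So a Unicode body with 'text/html; charset=utf-8' comes out as UTF-8
bytes, while the same body with no charset in the header comes out as
latin-1.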

But how does ZPublisher get a Unicode string in the first place? Well it
gets it from the rendering of whatever method was called when publishing
the object.

For DTML, various blocks are joined together (function render_blocks()),
and if one of them happens to be Unicode then the join_unicode method
converts all the non-Unicode strings into Unicode using
unicode(s, 'latin-1'). So this assumes that plain strings are encoded in
latin-1. Note, WE MAY WANT TO PARAMETRIZE THIS. Basically there could be
an additional attribute on the DTML object saying what its native
encoding is.
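
To make the joining rule concrete, here is a minimal sketch of the
behaviour described above. It is not the real join_unicode from Zope;
it is written for present-day Python 3, with bytes standing in for
plain strings and str for Unicode strings. The native_encoding
parameter is hypothetical and only shows what the parametrization
suggested above might look like.

```python
def join_unicode(blocks, native_encoding="latin-1"):
    """Join rendered blocks, promoting to Unicode if any block is Unicode."""
    blocks = list(blocks)
    if not any(isinstance(block, str) for block in blocks):
        # Nothing is Unicode: the result stays a plain byte string.
        return b"".join(blocks)
    # At least one Unicode block: decode the plain ones, then join.
    return "".join(
        block if isinstance(block, str) else block.decode(native_encoding)
        for block in blocks
    )
```

With the latin-1 default, a plain string that actually holds UTF-8
bytes would be decoded wrongly as soon as a Unicode block shows up,
which is why a per-object native encoding would help.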

For PageTemplates, the various blocks produced by the template and
Python are sent to a StringIO-like object, which is responsible for
converting them into a coherent thing when its getvalue() method is
called. At the moment it doesn't deal very well with mixed Unicode and
non-Unicode strings, so the reported failures don't surprise me. WE NEED
TO FIX THIS BEFORE THE NEXT BETA, probably also by providing an explicit
native encoding. I believe that's what AltPT does.
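
One possible shape for such a fix, purely as a sketch and not the
actual PageTemplates or AltPT code: a StringIO-like collector that
takes an explicit native encoding and only reconciles the chunks in
getvalue(). The class name MixedOutput is invented for this example,
and the code is written for present-day Python 3 (bytes for plain
strings, str for Unicode strings).

```python
class MixedOutput:
    """Hypothetical StringIO-like collector with an explicit native encoding."""

    def __init__(self, native_encoding="latin-1"):
        self.native_encoding = native_encoding
        self._chunks = []

    def write(self, chunk):
        # Accept both plain (bytes) and Unicode (str) chunks as they come.
        self._chunks.append(chunk)

    def getvalue(self):
        if not any(isinstance(chunk, str) for chunk in self._chunks):
            # Only plain strings were written: return them joined as-is.
            return b"".join(self._chunks)
        # Mixed content: decode plain chunks with the native encoding.
        return "".join(
            chunk if isinstance(chunk, str) else chunk.decode(self.native_encoding)
            for chunk in self._chunks
        )
```

A template whose source is stored as UTF-8 would then create
MixedOutput('utf-8'), write its plain and Unicode chunks in any order,
and get a single Unicode string back from getvalue().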

Localizer 0.9, for instance, had to patch the StringIO-like object to
make it cope with joining non-Unicode and Unicode strings. Now that I
better understand the problem, I'll help fix this ASAP in core Zope.

Florent


-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:fg@nuxeo.com