[Zope-Coders] Re: [Zope-dev] Unicode treatment in 2.6b1
Toby Dickenson
tdickenson@geminidataloggers.com
Fri, 27 Sep 2002 07:57:47 +0100
cc: zope-coders
On Thursday 26 Sep 2002 10:58 pm, Florent Guillaume wrote:
> For PageTemplates, the various blocks produced by the template and
> python are sent to a StringIO-like object, which is responsible for
> converting them into a coherent thing when its getvalue() method is
> called. At the moment it doesn't deal very well with mixed Unicode and
> non-Unicode strings so the reported failures don't surprise me. WE NEED
> TO FIX THIS BEFORE THE NEXT BETA,
I agree. Is someone committed to working on this?
> probably also by providing an explicit
> native encoding.
That's not what DTML currently does, and I don't see an obvious reason why
page templates should be different. The DTML semantics have been worked out
carefully over the last few years.
The problem with this proposed approach is that it confuses the encoding of
the *document* with the encoding of the *attributes* of the objects which are
used to create the document. Page templates often deal with diverse objects
from different sources; how is a template to know that all objects use the
same character encoding for 8-bit strings?
New objects should be exposing these attributes as unicode objects, and legacy
objects would have had to expose them as latin-1 if they wanted them rendered
correctly in the ZMI.
A legacy page template using legacy non-latin-1 properties will continue to
work unchanged as long as it does not encounter a unicode object while the
template is being rendered.
I don't believe it is possible to introduce unicode without some form of pain.
The scheme implemented for DTML puts all the pain on users who are MIXING
non-ascii non-latin-1 8-bit string objects with unicode objects, which is
perhaps not a bad thing.
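
As a rough illustration of the semantics described above, here is a minimal sketch in present-day Python, where bytes stands in for 8-bit strings and str for unicode objects. The name join_chunks is hypothetical and this is not the actual Zope code; it only models the behaviour as stated in this message: pure 8-bit output passes through untouched, and 8-bit chunks are treated as latin-1 only once a unicode object turns up.

```python
def join_chunks(chunks):
    """Join rendered chunks (hypothetical helper, not Zope's API).

    bytes models a legacy 8-bit string; str models a unicode object.
    """
    chunks = list(chunks)
    if all(isinstance(c, bytes) for c in chunks):
        # No unicode seen: bytes pass through unchanged, whatever
        # encoding the legacy objects happened to use.
        return b"".join(chunks)
    # At least one unicode chunk: promote every 8-bit chunk to unicode,
    # assuming (as the scheme does) that it is latin-1.
    return "".join(
        c.decode("latin-1") if isinstance(c, bytes) else c for c in chunks
    )


# Legacy non-latin-1 property, stored as KOI8-R encoded bytes.
legacy = "привет".encode("koi8-r")

# Legacy template, legacy properties: output is byte-for-byte unchanged.
unchanged = join_chunks([b"<p>", legacy, b"</p>"])

# latin-1 bytes mixed with unicode: the result is correct text.
ok = join_chunks([b"caf\xe9 ", "\u20ac"])

# Non-latin-1 bytes mixed with unicode: decoded as latin-1, so the
# Cyrillic text is garbled. This is where the pain lands.
garbled = join_chunks([legacy, " \u20ac"])
```

Only the third case goes wrong, and it is exactly the case of mixing non-ascii non-latin-1 8-bit strings with unicode objects.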