[ZPT] Unicode and 8-bit string migration fix

Stuart Bishop zen at shangri-la.dropbear.id.au
Fri Oct 17 01:29:44 EDT 2003


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi.

Starting with Zope 2.6, Zope became capable of publishing Unicode.
However, Page Templates which mixed Unicode and 8-bit encoded strings
would raise a Unicode exception:

     <p tal:content="python:u'My 2\N{CENT SIGN}'" />
     <p tal:content="python:u'My 2\N{CENT SIGN}'.encode('latin1')" />

The reason for this is that it is impossible to join a Unicode string
containing 'high bit' characters to an 8-bit encoded string without
knowing what encoding to use. However, because we are publishing HTML
or XML documents, we can work around this problem by adding the
following method to TALInterpreter.FasterStringIO:

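     def getvalue(self):
         try:
             return StringIO.getvalue(self)
         except UnicodeDecodeError:
             # Mixed Unicode and 8-bit chunks: encode the Unicode ones as
             # ASCII, using XML character references for anything else.
             utype = type(u'')
             self.buflist = [
                 (type(b) is utype and
                  b.encode('ascii', 'xmlcharrefreplace')) or b
                 for b in self.buflist
                 ]
             return StringIO.getvalue(self)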
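
To make the idea easier to try outside Zope, here is a minimal standalone
sketch of the same technique. The helper name join_chunks is hypothetical
and is not part of TALInterpreter. It first shows that the latin1 chunk
cannot be decoded as ASCII (the implicit conversion Python 2 attempts when
joining str and unicode), then joins the chunks after encoding the Unicode
ones with 'xmlcharrefreplace'.

```python
# Sketch only: join_chunks is a hypothetical helper, not part of Zope.

def join_chunks(chunks):
    """Join a mix of Unicode and 8-bit chunks into one byte string."""
    text_type = type(u'')
    out = []
    for chunk in chunks:
        if isinstance(chunk, text_type):
            # Non-ASCII characters become numeric character references,
            # so the result is plain ASCII whatever the page encoding is.
            chunk = chunk.encode('ascii', 'xmlcharrefreplace')
        out.append(chunk)
    return b''.join(out)


unicode_part = u'<p>My 2\N{CENT SIGN}</p>'
latin1_part = u'<p>My 2\N{CENT SIGN}</p>'.encode('latin1')

# The 8-bit chunk is not valid ASCII, so an implicit ASCII decode fails.
try:
    latin1_part.decode('ascii')
    implicit_decode_ok = True
except UnicodeDecodeError:
    implicit_decode_ok = False

result = join_chunks([unicode_part, latin1_part])
```

Note that the 8-bit chunk passes through untouched, so it is still up to
the page to declare an encoding that matches it. Only the Unicode chunks
are rewritten, and the character references they become are valid in HTML
and XML under any ASCII-compatible encoding.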

This should have no effect on pages that currently render correctly,
but it gives Zope 2.6+ Unicode-aware Products, Zope 2.5 Unicode-aware
Products and non-Unicode-aware Products a way to interoperate. The
end result is that I can use Formulator widgets on the same page as my
own components that use Unicode strings.

Does this sound correct?

If so, any objections to me committing it to the Zope CVS tree?

If so, any objections to it going into the 2.7 branch?

Is there a better way?

- -- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (Darwin)

iD8DBQE/j35PAfqZj7rGN0oRAjQcAJ9AGBGeYHwMpxfWf2ahFSWuK2j1pwCeLn5A
fzbhUt+mZkbghD1vr/kcn5Y=
=2zT+
-----END PGP SIGNATURE-----



