[Zope-Coders] Re: [Zope-Checkins] CVS: Zope27/lib/python/TAL - TALInterpreter.py:1.68.26.2
Florent Guillaume
fg@nuxeo.com
09 Sep 2002 16:52:06 +0200
On Mon, 2002-09-09 at 16:28, Guido van Rossum wrote:
> > > Depends on what you want. Since Python has no standard API for
> > > writing Unicode to files, this is indeed nontrivial. I think the
> > > Python StringIO.py might accidentally support Unicode.
> >
> > It doesn't (python 2.2):
> > from StringIO import StringIO
> > s = StringIO()
> > s.write('é')
> > s.write(u'a')
> > s.getvalue()
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> >   File "/usr/lib/python2.2/StringIO.py", line 169, in getvalue
> >     self.buf += ''.join(self.buflist)
> > UnicodeError: ASCII decoding error: ordinal not in range(128)
>
> Depends on what you call "support". :-)
Well, yes :-)
This doesn't necessarily mean StringIO has to be changed, but rather
that those who call it have to ensure that they always pass the same
kind of strings to it.
> > But then it's not clear what should be done in any case. In this
> > example the first 'é' is in a "native" coding and shouldn't be
> > allowed by the application. But because TALES can get its values
> > from python code, it's conceivable that we can receive native
> > strings and have to decide what to do with them.
> >
> > Localizer's choice is to convert all Unicode strings to standard
> > strings in the desired output charset, and leave "native" strings
> > alone (supposing the application has generated them in the correct
> > way).
>
> OK, but then I don't understand why it needs to solve the problem you
> show above. Either it converts everything to an 8-bit encoding before
> it hits the StringIO object, and then you don't need a Unicode-aware
> StringIO object, or it *only* writes Unicode to the StringIO object
> (in which case StringIO.py is just fine).
Doing the conversion before it hits the StringIO would require changing
a number of places in Zope. So it was decided that it was simpler to
replace only the StringIO part and make it do the conversions.
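
A minimal sketch of that idea, under stated assumptions: it is written
for Python 3, where `str` stands in for a Python 2 Unicode string and
`bytes` for a "native" 8-bit string. `EncodingBuffer` and its `charset`
parameter are hypothetical names for illustration, not the actual Zope
or Localizer code.

```python
class EncodingBuffer:
    """Hypothetical StringIO replacement that converts on write."""

    def __init__(self, charset="utf-8"):
        self.charset = charset
        self.chunks = []

    def write(self, data):
        if isinstance(data, str):
            # Unicode text: convert to the desired output charset.
            data = data.encode(self.charset)
        # Native (8-bit) strings are left alone, on the assumption that
        # the application already produced them in the output charset.
        self.chunks.append(data)

    def getvalue(self):
        # Every chunk is bytes by now, so the join cannot fail on a mix.
        return b"".join(self.chunks)


buf = EncodingBuffer("latin-1")
buf.write(b"\xe9")   # native string, already Latin-1
buf.write("a")       # Unicode string, encoded on write
result = buf.getvalue()
```

Callers stay unchanged; only the buffer decides what to do with each
kind of string.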
> > 2.6's choice is to allow building a complete response using Unicode
> > strings, and do the conversion only upon publishing to the
> > client. But then we have to convert a mix of non-unicode strings and
> > unicode strings, which can cause the problems outlined above.
>
> I don't understand how your monkey patch helps you solve this
> problem.
>
> (And if you have a patch for StringIO.py, maybe you can make it
> available for the Python standard library? Others might need this.)
For the unicode-StringIO part I'm just experimenting here, it's too
early. Probably, a Unicode-aware StringIO would be one that takes an
additional join function as parameter, this function being responsible
for joining strings of arbitrary type and returning a sane result.
Basically join_unicode. And defaulting to ''.join.
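
A rough sketch of what such a class could look like, assuming Python 3
(`str` for Unicode strings, `bytes` for native ones). `JoinStringIO` is
a hypothetical name, and this `join_unicode` with its assumed `charset`
argument is only one possible policy for mixed input, not an existing
function.

```python
def join_unicode(chunks, charset="latin-1"):
    # Hypothetical join policy: decode 8-bit chunks with an assumed
    # charset, then join everything as Unicode text.
    return "".join(
        c.decode(charset) if isinstance(c, bytes) else c for c in chunks
    )


class JoinStringIO:
    """Write-only buffer whose joining strategy is pluggable."""

    def __init__(self, join="".join):
        self.join = join      # defaults to the plain ''.join
        self.buflist = []

    def write(self, s):
        self.buflist.append(s)

    def getvalue(self):
        # The join function alone decides how mixed types are combined.
        return self.join(self.buflist)


s = JoinStringIO(join=join_unicode)
s.write(b"\xe9")   # native string
s.write("a")       # Unicode string
mixed = s.getvalue()

plain = JoinStringIO()
plain.write("abc")
plain.write("def")
```

With the default join, mixing the two kinds of strings still fails, so
the permissive behaviour is opt-in.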
> > > > What do you think about it now. Should I revert them?
> > >
> > > Ask the Zope Pope.
> >
> > Jim, do you want those reverted?
> >
> > Again, for the record, my argument to leave those in is: they don't
> > harm, they'll be removed later, and in the meantime third-party
> > products can still function.
>
> At the very best, I propose something like this instead:
>
> from StringIO import StringIO
>
> CustomStringIOClass = StringIO
>
> def setCustomStringIO(C):
>     global CustomStringIOClass  # rebind the module-level name
>     CustomStringIOClass = C
>
> class C: # This is the class you patched
>
>     def StringIO(self):
>         return CustomStringIOClass()
Yes, that's much cleaner.
I can do that change if I'm given a go-ahead.
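
For illustration, here is how third-party code might use such a hook.
This is a sketch in Python 3 with `io.StringIO`; `Interpreter`,
`set_custom_string_io` and `UpperStringIO` are hypothetical names, not
the actual TALInterpreter API.

```python
from io import StringIO

# Module-level default, replaceable through the setter below.
_string_io_class = StringIO


def set_custom_string_io(cls):
    # 'global' is required; otherwise the assignment only creates a local.
    global _string_io_class
    _string_io_class = cls


class Interpreter:  # hypothetical stand-in for the patched class
    def StringIO(self):
        # Looked up at call time, so a later setter call takes effect.
        return _string_io_class()


class UpperStringIO(StringIO):
    # Toy replacement, only here to make the substitution visible.
    def write(self, s):
        return super().write(s.upper())


interp = Interpreter()
default_buf = interp.StringIO()

set_custom_string_io(UpperStringIO)
custom_buf = interp.StringIO()
custom_buf.write("abc")
```

A product registers its class once through the setter instead of
monkey-patching the method.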
Florent
-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87 http://nuxeo.com mailto:fg@nuxeo.com