[Zope] strange unicode behaviour

Toby Dickenson tdickenson@geminidataloggers.com
Thu, 24 Jul 2003 08:32:59 +0100


On Thursday 24 July 2003 00:17, Giuseppe Bonelli wrote:
> I have spent the last 30 minutes going crazy with this:
>
> in dtml:
> <html>
> <head>
> <meta http-equiv="content-type" content="text/html;charset=utf-8">
> </head>
> <dtml-call "getText()">
> </html>
>
> in python:
> def getText():
>     s=u'a string with some accented chars'
>     s=s.encode('utf-8')
>     return s
>
> the above works fine, but
> return s.lower()
>
> does not !!! (the accented chars are badly rendered in the browser).
>
> Can someone, please, explain this to me??
>
> I am on zope 2.6.1 (installed from binaries under win),
>
> From the python console everything is OK, so there should be something
> with Zope.

You have had lots of advice about why this effect is happening, but so far 
no one has recommended the best approach.

If you remove the s=s.encode('utf-8') line, then getText will return a unicode 
string (with or without s.lower()), and your dtml method will also return a 
unicode string.
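
As a rough illustration (not taken from the original code), here is how 
getText might look once the encode call is gone. The sample string and its 
accented characters are made up for the example, and the sketch is written to 
run on a current Python 3 rather than the Python that ships with Zope 2.6. It 
also shows that lower() applied to already-encoded bytes is not 
character-aware, which is one reason to do case conversion before encoding.

```python
def getText():
    # Hypothetical sample text; the escapes are A-grave, E-acute, U-grave.
    s = u'A String With Some Accented Chars: \u00c0\u00c9\u00d9'
    # No s.encode('utf-8') here: the function returns text, not bytes.
    return s.lower()


text = getText()

# For comparison: lower() on the encoded bytes works byte by byte.
# On Python 3 it only touches ASCII letters, so the accented capitals
# are left untouched instead of being lowercased.
byte_level = u'\u00c0\u00c9\u00d9'.encode('utf-8').lower()
```

Lowercasing the text first and encoding afterwards keeps the case conversion 
and the byte representation from interfering with each other.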

Add to your dtml:
<dtml-call "RESPONSE.setHeader('content-type','text/html;charset=utf-8')">
and ZPublisher will automatically encode the unicode response as utf-8 before 
sending it over http.
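
To make the idea concrete, here is a minimal sketch of "encode once, at the 
boundary, using the charset named in the header". This is not ZPublisher's 
actual code: charset_from_content_type and encode_body are hypothetical 
helpers invented for this illustration, and the utf-8 fallback is an 
arbitrary choice for the sketch, not a statement about Zope's default.

```python
import re


def charset_from_content_type(content_type, default='utf-8'):
    # Hypothetical helper: pull the charset parameter out of a header
    # value such as 'text/html;charset=utf-8'.
    match = re.search(r'charset=([-\w]+)', content_type, re.IGNORECASE)
    return match.group(1) if match else default


def encode_body(body, content_type):
    # Hypothetical helper: bytes pass through untouched; text is encoded
    # exactly once, using whatever charset the header declares.
    if isinstance(body, bytes):
        return body
    return body.encode(charset_from_content_type(content_type))


utf8_bytes = encode_body(u'caf\u00e9', 'text/html;charset=utf-8')
latin1_bytes = encode_body(u'caf\u00e9', 'text/html;charset=iso-8859-1')
```

The same text comes out as different bytes depending only on the header, 
which is why the application code never needs to call encode itself.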

The advantage of this approach is that your application code can work entirely 
in unicode. 

> I have utf-8 as sys.defaultencoding and I do not load any locale when
> starting Zope.

That is old advice that predates Zope 2.6. It was never a particularly good 
idea, because it affects all of Python's internals. You only need to encode 
your unicode as utf-8 (or another encoding) before sending it over the network, 
and ZPublisher is capable of doing that itself if you tell it the encoding in 
the header. 

-- 
Toby Dickenson - http://www.geminidataloggers.com/people/tdickenson