[Zope-dev] RFV: Unicode in Zope 2

Martijn Faassen faassen at infrae.com
Tue Dec 13 07:51:01 EST 2005


Jim Fulton wrote:
> I forgot a very important need:
> 
> - Common approach to Unicode
> 
> In particular, in Zope 3, all text is stored and managed as Unicode.
> The publisher decodes request data and encodes response data.  The vast
> majority of application and library code can ignore encoding issues.
> (The exceptions are applications and frameworks that need to exchange
> text with non-Unicode-aware external systems.)  This has provided
> great simplifications and allowed us to avoid common pitfalls from
> mixing Unicode and encoded text.
> 
> We need to migrate Zope 2 to use a similar strategy.  We need volunteers
> to brainstorm how this can be done and make one or more proposals.
> This is likely a prerequisite for finishing the publisher and ZPT
> work.

This is definitely a scary topic, and I speak from years of experience
with Zope 2 unicode here. This sounds like a very hard transition that
would touch *a lot* of code outside the Zope 2 core. How do you envision
all the form inputs suddenly producing unicode strings, for instance?

We've struggled hard to make Formulator work with unicode, for instance
(and it's still buggy, as I wanted to support the non-unicode scenarios
too). I can imagine any system in Zope that uses forms at all would need
to be touched.
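
To make the form-input problem concrete, here is a minimal sketch of
decoding raw form data once, at the boundary. It is written in
present-day Python, where `bytes` stands in for the encoded strings of
the time and `str` for unicode strings. The function name `decode_form`
and the `charset` parameter are hypothetical, not Zope or Formulator
API.

```python
def decode_form(raw_form, charset="utf-8"):
    """Return a copy of raw_form with byte keys and values decoded to text.

    Hypothetical helper: assumes the whole form was submitted in one
    known charset, which real-world forms do not always guarantee.
    """
    decoded = {}
    for key, value in raw_form.items():
        if isinstance(key, bytes):
            key = key.decode(charset)
        if isinstance(value, bytes):
            value = value.decode(charset)
        decoded[key] = value
    return decoded


# Raw form data as it might arrive from the browser (encoded bytes).
raw = {b"title": "Caf\u00e9".encode("utf-8"), b"count": b"3"}
form = decode_form(raw)
```

Code that reads `form` afterwards only ever sees text, but every piece
of code that used to read the raw values would have to be checked for
assumptions about encoded strings.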

I'll volunteer to help brainstorm on this, but right now my brainstorm 
is only very dark and full of lightning.

Anyway, in the basics, Zope 2 does have an approach to unicode for
*output* that's fairly similar to Zope 3's: if you use unicode strings,
your entire output (including page templates) will be unicode (as long
as you don't mix in non-unicode, non-ASCII strings). Then the response
encoding setting is read and the unicode output is encoded once into
that encoding. Silva uses this. It also struggles to make sure all its
input is transformed to unicode (among other ways, by using Formulator).
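
The output side described above can be sketched roughly as follows,
again in present-day Python. The names `render` and `encode_response`
and the `response_encoding` parameter are made up for illustration and
are not the Zope 2 publisher's actual API.

```python
def render(parts):
    """Join template fragments that are all expected to be text already."""
    return "".join(parts)


def encode_response(body, response_encoding="utf-8"):
    """Encode the finished text body exactly once, at the very end."""
    return body.encode(response_encoding)


# Everything stays text until the response is written out.
body = render(["<p>", "Gr\u00fc\u00dfe", "</p>"])
payload = encode_response(body, "utf-8")
latin1_payload = encode_response(body, "latin-1")
```

The same text body can be written out in whatever encoding the response
is configured for, because encoding happens in one place only.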

In Plone, the situation is quite different -- its
PlacelessTranslationService monkeypatches the page template engine so
that you can mix UTF-8 encoded strings and unicode strings together.
This then goes on to break the assumptions of code that uses the page
template engine in a unicode-pure environment (which is what happened
to Silva).
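
The clash between the two approaches can be illustrated with a small
sketch. This is not PlacelessTranslationService's code; `strict_join`
and `lenient_join` are hypothetical stand-ins for a unicode-pure engine
and a patched one that silently assumes UTF-8. In present-day Python,
mixing text and bytes raises a TypeError; in the Python 2 of the time,
the failure showed up as a UnicodeDecodeError from the implicit ASCII
decode.

```python
def strict_join(parts):
    """Unicode-pure behaviour: any encoded fragment is an error."""
    return "".join(parts)


def lenient_join(parts, assumed_encoding="utf-8"):
    """Patched behaviour: quietly decode byte fragments as UTF-8."""
    return "".join(
        p.decode(assumed_encoding) if isinstance(p, bytes) else p
        for p in parts
    )


# One text fragment and one UTF-8 encoded fragment mixed together.
parts = ["Title: ", "Caf\u00e9".encode("utf-8")]

try:
    strict_join(parts)
    strict_ok = True
except TypeError:
    strict_ok = False

mixed = lenient_join(parts)
```

Code written against the strict behaviour relies on the error to catch
stray encoded strings; once the engine is patched to be lenient for
everyone in the same process, that guarantee is gone.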

Regards,

Martijn

