[Zope-Coders] Re: [Zope-Checkins] CVS: Zope27/lib/python/TAL - TALInterpreter.py:1.68.26.2

Mon, 9 Sep 2002 15:37:52 +0000 (UTC)

Toby Dickenson  <tdickenson@geminidataloggers.com> wrote:
> On Monday 09 Sep 2002 3:22 pm, Florent Guillaume wrote:
> > The problem is that the default charset
> > has to depend on a number of factors, 
> 
> > especially the "native" charset
> > for the DocumentTemplate or PageTemplate being acted upon, 
> 
> That seems useful for encoding output, but not decoding input surely?

I'm sorry, I don't see what you mean here.

> > and the
> > "native" charset for strings generated by python code.
> 
> Yes, although new python code should be creating Unicode strings IMO.

For non-ASCII stuff, yes, but (see below) these things have a tendency to
crop up everywhere :-)

> Most legacy code in Zope would have to deal with the fact that, sooner or 
> later, its strings get embedded into 'text/html' pages with no charset 
> declaration. This means latin-1 in practice.

A lot of legacy code has things like
  <meta http-equiv="Content-Type" content="text/html; charset=koi8-r" />
that take care of it.

> > I don't know at
> > what level this should be configured.
> 
> IMO, at the level in which encoded strings are read from byte-streams, and 
> converted into Unicode at the earliest opportunity. General purpose 
> text-processing tools like dtml and zpt should not have to deal with 
> character encodings.

Agreed, but that's not always easy.

Consider for instance
   <span tal:replace="python:DateTime(date).strftime('%B')">January</span>
It returns a localized month name, which can contain accented characters.
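
To make the problem concrete, here is a minimal sketch (not Zope code).
It assumes the locale hands back the month name as a latin-1 byte string,
and simulates that with a hard-coded value instead of calling strftime;
the helper name to_unicode is made up for the illustration.

```python
# Simulated output of strftime('%B') under a French latin-1 locale:
# the month name "fevrier" with an e-acute, as raw bytes (assumption).
raw_month = "f\u00e9vrier".encode("latin-1")


def to_unicode(raw, charset):
    """Decode a byte string using the charset its producer is known to use."""
    return raw.decode(charset)


# Guessing the wrong charset fails: byte 0xE9 followed by 'v' is not valid UTF-8.
try:
    to_unicode(raw_month, "utf-8")
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True

# Only whoever knows the producing locale's charset can decode correctly.
month = to_unicode(raw_month, "latin-1")
```

The template engine itself has no way to know that latin-1 is the right
answer here; that knowledge lives with the locale that produced the bytes.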

Or consider reading strings from a database (but that one is obvious).

Of course all those will have to be migrated to a Unicode framework, but
I just wanted to point out that they're pervasive.
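
A rough sketch of what "convert at the earliest opportunity" could look
like for such sources. The function name decode_at_boundary and the idea
of passing the charset per source are assumptions for illustration, not
an existing Zope API.

```python
def decode_at_boundary(value, charset):
    """Return text: decode byte strings, pass already-decoded text through.

    `charset` is whatever the source (database, locale, legacy template)
    is configured to use; it is a hypothetical per-source setting.
    """
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


# A legacy source that stores koi8-r byte strings (simulated here).
legacy_bytes = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("koi8-r")

text = decode_at_boundary(legacy_bytes, "koi8-r")
already_text = decode_at_boundary(text, "koi8-r")  # passed through unchanged
```

Past that point, dtml and zpt would only ever see Unicode and would not
need to know where the string came from.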

Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:fg@nuxeo.com