[Zope-Coders] Re: [Zope-dev] Unicode treatment in 2.6b1

Toby Dickenson tdickenson@geminidataloggers.com
Mon, 30 Sep 2002 14:00:43 +0100


On Monday 30 Sep 2002 1:00 pm, Florent Guillaume wrote:

> Ok, here's something that occurred to me:
> Why not explicitly use "locale.getlocale()[1] or 'latin1'" as the
> default encoding for all str->unicode conversions?

Is this just 'for legacy support', or a new feature that we plan to support?

A new ugly environment variable is something we can take away eventually, in
principle. Even if we can't ever do that in practice, we can encourage more
people to move to Unicode strings by threatening to ;-)

Anything based on locale feels more permanent.
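
For concreteness, here is a minimal sketch of what the locale-based proposal
quoted above might look like. It is written for a modern Python, where byte
strings are `bytes`, and the helper names `legacy_default_encoding` and
`to_unicode` are made up for illustration; nothing like them exists in Zope
or in the standard library.

```python
import locale


def legacy_default_encoding():
    # Hypothetical helper: the encoding reported by the current locale,
    # falling back to latin1 when the locale does not name one.
    return locale.getlocale()[1] or 'latin1'


def to_unicode(value, encoding=None):
    # Hypothetical helper: decode byte strings with an explicit encoding
    # if one is given, otherwise with the locale-derived default.
    # Text that is already Unicode passes through untouched.
    if isinstance(value, bytes):
        return value.decode(encoding or legacy_default_encoding())
    return value
```

Note that the result of the default path depends on the environment the
process runs in, which is part of why it feels more permanent than a switch
we control ourselves.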

> (why can't user code
> change the sys default encoding after initialization BTW?).

In the original (pre-Python 2.0) implementation this restriction was needed
to support plain/unicode string comparisons. The default encoding affects
how cross-type comparisons behave. It therefore needs to affect the hash
value of strings, and hash values are required to be consistent within the
lifetime of one program so that, for example, dictionaries can be efficient.
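
The hash problem can be shown with a toy model. This is only a sketch for a
modern Python, not how the interpreter implements strings: `LegacyStr` and
`DEFAULT_ENCODING` are invented names standing in for the old plain string
type and for the sys default encoding.

```python
# Hypothetical stand-in for the sys default encoding in this sketch.
DEFAULT_ENCODING = 'latin1'


class LegacyStr:
    """Toy byte string that compares equal to text via DEFAULT_ENCODING."""

    def __init__(self, raw):
        self.raw = raw

    def _text(self):
        return self.raw.decode(DEFAULT_ENCODING)

    def __eq__(self, other):
        if isinstance(other, str):
            return self._text() == other
        if isinstance(other, LegacyStr):
            return self.raw == other.raw
        return NotImplemented

    def __hash__(self):
        # Equal objects must hash equally, so the hash has to follow
        # the decoded text, and therefore the default encoding.
        return hash(self._text())


key = LegacyStr(b'caf\xc3\xa9')      # UTF-8 bytes for 'caf\xe9'
table = {key: 'value'}               # hash is recorded under latin1

found_before = 'caf\xc3\xa9' in table    # matches the latin1 decoding

DEFAULT_ENCODING = 'utf-8'           # change the default mid-program
equal_after = (key == 'caf\xe9')     # the comparison now succeeds...
found_after = 'caf\xe9' in table     # ...but the dict still holds the old hash
```

After the switch the key compares equal to the Unicode string, yet the
dictionary lookup misses because the entry was filed under the hash computed
with the old encoding.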

In those early days sys.setdefaultencoding was needed because there was some
debate about what the default encoding should be (Guido originally favoured
utf8). The original idea was that this function would be removed before the
release of 2.0, once a global decision had been made. I'm not sure why that
didn't happen.