[Zope-dev] Non-ASCII characters in URLs

Jonathan dev101 at magma.ca
Mon Apr 7 08:32:17 EDT 2008


----- Original Message ----- 
From: "Martijn Pieters" <mj at zopatista.com>
To: "Alexander Limi" <limi at plone.org>
Cc: <zope-dev at zope.org>
Sent: Monday, April 07, 2008 4:39 AM
Subject: Re: [Zope-dev] Non-ASCII characters in URLs


> On Mon, Apr 7, 2008 at 1:37 AM, Alexander Limi <limi at plone.org> wrote:
>>  Is there a good technical explanation for why Zope doesn't allow
>> non-ASCII characters in URLs?
>
> Because URLs don't allow non-ASCII characters?
>
>>  I'd like to be able to let URLs work like this example from Wikipedia:
>>
>>  http://ja.wikipedia.org/wiki/メインページ
>
> Your browser translates that into
> http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
>
>>  Is there a fundamental reason (i.e. Python objects can only be ASCII),
>> or are these simply bugs that need to be fixed?
>
> RFC 1738 (http://www.ietf.org/rfc/rfc1738.txt) doesn't allow non-ASCII
> characters in URLs.
>
>   No corresponding graphic US-ASCII:
>
>   URLs are written only with the graphic printable characters of the
>   US-ASCII coded character set. The octets 80-FF hexadecimal are not
>   used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
>   control characters; these must be encoded.
>
> Now, Zope could well support UTF-8 ids, and translate URLs
> appropriately, but in the meantime you could use the same
> percent-encoding scheme?
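
The translation the browser performs in the Wikipedia example above can be
reproduced with a minimal Python sketch. It assumes Python 3 and the standard
library's urllib.parse; the variable names are illustrative only, and this is
not a description of how Zope itself handles ids.

```python
from urllib.parse import quote, unquote

# Path taken from the Wikipedia example quoted above.
path = "/wiki/メインページ"

# quote() encodes the text as UTF-8 and percent-escapes every byte outside
# the unreserved ASCII set; "/" is left alone by default.
encoded = quote(path)
print(encoded)
# /wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8

# unquote() reverses the process, decoding the escaped bytes as UTF-8.
decoded = unquote(encoded)
print(decoded == path)
```

The escaped form contains only US-ASCII characters, so it satisfies the
RFC 1738 restriction while still round-tripping to the original text.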

IDNA (http://www.ietf.org/rfc/rfc3490.txt) and Punycode 
(http://www.faqs.org/rfcs/rfc3492.html) may be of some use.
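
Note that IDNA and Punycode apply to the host name part of a URL, not to the
path. As a rough sketch of what they do, assuming Python 3 and its built-in
"idna" and "punycode" codecs (the host name below is a made-up example, not
one from this thread):

```python
# Hypothetical internationalized host name, used only for illustration.
hostname = "bücher.example"

# The built-in "idna" codec applies IDNA (RFC 3490) label by label and
# prefixes each converted label with "xn--".
ace = hostname.encode("idna")
print(ace)  # b'xn--bcher-kva.example'

# The raw "punycode" codec (RFC 3492) encodes a single label, no prefix.
label = "bücher".encode("punycode")
print(label)  # b'bcher-kva'

# Decoding restores the original Unicode host name.
restored = ace.decode("idna")
print(restored == hostname)
```

For the path portion of a URL, percent-encoding of the UTF-8 bytes remains
the relevant mechanism.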

Jonathan



