[Zope-dev] Non-ASCII characters in URLs

Martijn Pieters mj at zopatista.com
Mon Apr 7 04:39:01 EDT 2008


On Mon, Apr 7, 2008 at 1:37 AM, Alexander Limi <limi at plone.org> wrote:
>  Is there a good technical explanation for why Zope doesn't allow non-ASCII
> characters in URLs?

Because URLs don't allow non-ASCII characters?

>  I'd like to be able to let URLs work like this example from Wikipedia:
>
>  http://ja.wikipedia.org/wiki/メインページ

Your browser translates that into the percent-encoded UTF-8 form:
http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
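
To illustrate what the browser is doing there, here is a minimal sketch in Python 3 using the standard library's urllib.parse.quote (in 2008-era Python 2 the equivalent lived in urllib.quote and wanted a UTF-8 byte string). The variable names are only for illustration:

```python
from urllib.parse import quote

# The human-readable path segment from the Wikipedia example.
title = "メインページ"

# quote() encodes the text as UTF-8 by default and replaces every
# byte outside the safe ASCII set with a %XX escape.
encoded = quote(title)

url = "http://ja.wikipedia.org/wiki/" + encoded
print(url)
```

Each Japanese character becomes three UTF-8 bytes, hence three %XX escapes per character.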

>  Is there a fundamental reason (i.e. Python objects can only be ASCII), or
> are there simply bugs that need to be fixed?

RFC 1738 (http://www.ietf.org/rfc/rfc1738.txt) doesn't allow non-ASCII
characters in URLs.

   No corresponding graphic US-ASCII:

   URLs are written only with the graphic printable characters of the
   US-ASCII coded character set. The octets 80-FF hexadecimal are not
   used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
   control characters; these must be encoded.
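
As a rough sketch, the rule in that passage can be written as a check on a single octet. The function name must_encode is made up for this example, and it covers only the ranges quoted above; RFC 1738 separately lists further "unsafe" and "reserved" characters (space, for instance) that this check ignores:

```python
def must_encode(octet: int) -> bool:
    """Return True if the quoted RFC 1738 passage says this octet must be encoded.

    Covers only the ranges named in the passage: 00-1F and 7F (control
    characters) and 80-FF (not used in US-ASCII).
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0x00-0xFF")
    return octet <= 0x1F or octet == 0x7F or octet >= 0x80


# Every byte of a UTF-8 encoded non-ASCII character falls in 80-FF.
utf8_bytes = "メ".encode("utf-8")
print([must_encode(b) for b in utf8_bytes])
```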

Now, Zope could well support UTF-8 ids and translate URLs accordingly,
but in the meantime you could use the same percent-encoding scheme
yourself.
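
A sketch of what such a translation could look like, assuming ids are kept as Unicode text and converted at the URL boundary. The helper names id_to_url_segment and url_segment_to_id are hypothetical and are not Zope APIs; this uses Python 3's urllib.parse:

```python
from urllib.parse import quote, unquote


def id_to_url_segment(object_id: str) -> str:
    """Percent-encode an id as UTF-8 so it is a pure-ASCII path segment."""
    # safe="" also escapes "/", so an id can never span two path segments.
    return quote(object_id, safe="")


def url_segment_to_id(segment: str) -> str:
    """Decode a percent-encoded path segment back to the original id."""
    return unquote(segment)


segment = id_to_url_segment("メインページ")
print(segment)
print(url_segment_to_id(segment))
```

The URL on the wire stays within what RFC 1738 allows, while the id seen by the application is the readable one.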

-- 
Martijn Pieters

