[Grok-dev] Problem with character encoding

Luciano Ramalho luciano at ramalho.org
Tue Jul 8 17:41:22 EDT 2008


On Tue, Jul 8, 2008 at 6:29 PM, Sebastian Ware <sebastian at urbantalk.se> wrote:
> I know this is slightly off topic but maybe there is a simple answer.
>
> I have a unicode attribute [message] of an object stored in the ZODB and I
> want to encode it to "iso-8859-1" and then use urllib.urlencode to create
> parameters for a http post operation.
>
> The problem is that the characters "åäöÅÄÖ" are encoded to this:
>
>  '%E5%E4%F6%C5%C4%D6'
>
> but should be encoded to this:
>
>  '%C3%A5%C3%A4%C3%B6%C3%85%C3%84%C3%96'
>
> I notice the following (the first one is what I want):
>
>   >> u'å'.encode('iso-8859-1')
>   '\xc3\xa5'
>   >> self.context.message[0].encode('iso-8859-1')
>   '\xe5'
>
> Any hints?

Something is wrong in this picture, Sebastian: you say you want
to encode with iso-8859-1, but the result you call correct uses
two bytes per character. iso-8859-1 uses exactly one byte per
character. It is UTF-8 that uses two or more bytes for non-ASCII
characters. Did I misunderstand your message, or are you working too
late?
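
To make the difference concrete, here is a minimal sketch, assuming a
Python 3 interpreter, where urllib.parse takes the place of the
urllib.urlencode call mentioned above. The variable names and the
"message" parameter name are only illustrative. It encodes the same six
characters both ways and percent-encodes the resulting bytes:

```python
from urllib.parse import quote, urlencode

# The six characters from the original message (å ä ö Å Ä Ö), written as
# escapes so the example does not depend on the source file's encoding.
text = "\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6"

# iso-8859-1 maps each of these characters to exactly one byte.
latin1_bytes = text.encode("iso-8859-1")

# UTF-8 needs two bytes for each of these non-ASCII characters.
utf8_bytes = text.encode("utf-8")

# Percent-encode the raw bytes, as a URL or POST body would carry them.
latin1_quoted = quote(latin1_bytes)
utf8_quoted = quote(utf8_bytes)

# urlencode can do the text-to-bytes step itself if told which encoding to use.
params = urlencode({"message": text}, encoding="utf-8")

print(len(latin1_bytes), latin1_quoted)
print(len(utf8_bytes), utf8_quoted)
print(params)
```

The first printed string matches the '%E5%E4...' output reported in the
question, and the second matches the '%C3%A5...' output described as the
desired one, which suggests the wanted encoding is UTF-8 and not
iso-8859-1.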

Cheers,

Luciano

