[Zope-CMF] Charsets

Dieter Maurer dieter at handshake.de
Mon Jan 19 15:21:48 EST 2009


Charlie Clark wrote at 2009-1-18 22:30 +0100:
>On 18.01.2009 at 20:36, Dieter Maurer wrote:
> ...
> From the current HTML specification:
>
>"accept-charset = charset list [CI]
>This attribute specifies the list of character encodings for input  
>data that is accepted by the server processing this form. The value is  
>a space- and/or comma-delimited list of charset values. The client  
>must interpret this list as an exclusive-or list, i.e., the server is  
>able to accept any single character encoding per entity received."
>
>ie. exactly as you have suggested: it is possible to force a client to  
>encode data in a particular charset before sending it to the server.  
>All references I have come across suggest that this, together with the  
>meta tag content-type can and should be used to coerce browsers to use  
>UTF-8.

I fear that the "accept-charset" form control attribute
can only be used reliably with "method=post enctype=multipart/form-data",
as only then does the browser have a chance to specify how it has
encoded the values.

I am not sure whether Zope handles the "charset" information
in this case correctly.
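
To illustrate what "a chance to specify" means: in a multipart/form-data
body, each part may carry its own "Content-Type" header with a "charset"
parameter. The following is only a sketch using Python's standard "email"
package, not what Zope does; the helper name "decode_multipart_fields" and
the "fallback" parameter are assumptions made for this example, and many
browsers omit the per-part charset entirely.

```python
from email.parser import BytesParser
from email.policy import default


def decode_multipart_fields(content_type, body, fallback="utf-8"):
    """Return {field name: text}, honouring each part's own charset.

    Hypothetical helper: content_type is the request's Content-Type
    header value, body is the raw request body as bytes.
    """
    # Prepend the header so the email parser treats the body as MIME.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=default).parsebytes(raw)
    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        # Use the charset the browser declared for this part, if any.
        charset = part.get_content_charset() or fallback
        fields[name] = part.get_payload(decode=True).decode(charset)
    return fields
```

A part without a declared charset falls back to the fixed default, which is
exactly the case where the server has to guess.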


As the "Accept-Charset" request header has (almost) nothing to do
with the "accept-charset" form control attribute, it must of course
not be used to interpret form data that was encoded according to
an "accept-charset" attribute.


If the server chooses its output encoding based on the "Accept-Charset"
request header (and Yuppie indicated that the Zope 3 publisher does this),
then the same algorithm can be used for "normal" form data
(where "normal" means that you do not explicitly specify an "accept-charset"
form control attribute).
That's one sensible mode of operation.
Another one is choosing a fixed encoding and using it as input and
output encoding.
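
A rough sketch of the first mode, assuming a simplified reading of the
"Accept-Charset" header: pick one charset from it and use that same charset
to decode form data and to encode the response. This is not the Zope 3
publisher's actual algorithm; "pick_charset", "decode_form_value" and the
"preferred" list are names invented for this example, and the special
treatment HTTP/1.1 gives ISO-8859-1 is ignored.

```python
def pick_charset(accept_charset, preferred=("utf-8", "iso-8859-1"),
                 default="utf-8"):
    """Choose a charset from an Accept-Charset header value."""
    weights = {}
    for item in accept_charset.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        quality = 1.0  # a charset listed without "q=" has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weights[name] = quality
    candidates = []
    for position, name in enumerate(preferred):
        # "*" stands for every charset not mentioned explicitly.
        quality = weights.get(name, weights.get("*", 0.0))
        if quality > 0:
            candidates.append((-quality, position, name))
    # Highest quality wins; ties go to the server's own preference order.
    return min(candidates)[2] if candidates else default


def decode_form_value(raw, accept_charset):
    """Decode raw form bytes with the charset chosen for the response."""
    charset = pick_charset(accept_charset)
    return raw.decode(charset), charset
```

The second mode corresponds to ignoring the header and always using the
"default" value for both directions.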



-- 
Dieter

