[Zope] charset from forms input

Dieter Maurer dieter@handshake.de
Thu, 14 Dec 2000 21:27:11 +0100 (CET)


Matt writes:
 > ... browser does not send "charset" parameter for "form" data ...

 > POST /hi HTTP/1.0
 > ...
 > 
 > Content-type: multipart/form-data;
 > boundary=---------------------------17670043309955870831526446972
 > Content-Length: 180
You should not expect a "charset" parameter on the
"multipart/form-data" content type itself.
The parameter can appear on each individual part (when applicable),
not on the multipart wrapper.
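
To see which parts (if any) carry a "charset" parameter, one can parse
the request body and look at each part's own headers. The following is
only a rough sketch using Python's standard "email" package, not anything
Zope does itself; the function name "part_charsets", the boundary "XyZ"
and the sample body are made up for illustration.

```python
from email import message_from_bytes


def part_charsets(content_type, body):
    """Map each form field name to the charset declared on its part (or None)."""
    # The parser needs the outer Content-Type header to find the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw)
    charsets = {}
    if not msg.is_multipart():
        return charsets
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        # get_param returns None when the part has no Content-Type header
        # or when that header carries no charset parameter.
        charsets[name] = part.get_param("charset")
    return charsets


# Hypothetical request body: part "a" declares a charset, part "b" does not.
sample = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="a"\r\n'
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"\r\n"
    b"gr\xc3\xbc\xc3\x9f\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="b"\r\n'
    b"\r\n"
    b"plain\r\n"
    b"--XyZ--\r\n"
)

print(part_charsets("multipart/form-data; boundary=XyZ", sample))
```

A part without a charset reports None, so the application still has
to decide on an encoding for it by other means.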

HTML 4.0 specifies:
   As with all multipart MIME types, each part has an optional
   "Content-Type" header that defaults to "text/plain". User agents
   should supply the "Content-Type" header, accompanied by a
   "charset" parameter. 


 > ... detecting used charsets ...
We use UTF-8 and ISO-8859-1 encodings.

Our experience is that browsers encode form posts with the
encoding they used to display the form itself.
Of course, the browser must have been told explicitly which
encoding to use for rendering the form. Otherwise,
it uses its default encoding (defined by the user).

To be precise:
  If we send a page (containing a form) to a browser
  with a "Content-Type: text/html; charset=UTF-8" HTTP header,
  then we will get the form data back encoded in UTF-8.

  If the page instead has a
  "Content-Type: text/html; charset=ISO-8859-1" HTTP header,
  the delivered form data is encoded in ISO-8859-1.
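
  On the server side this amounts to decoding the posted bytes with
  the same charset the form page was served with. A minimal sketch in
  Python, assuming the application keeps track of that charset itself;
  "PAGE_CHARSET", "page_headers" and "decode_form_value" are
  hypothetical names, not part of Zope:

```python
# Hypothetical application-wide setting: the charset used to serve form pages.
PAGE_CHARSET = "UTF-8"


def page_headers(charset=PAGE_CHARSET):
    """HTTP headers for a page containing a form, with an explicit charset."""
    return [("Content-Type", "text/html; charset=%s" % charset)]


def decode_form_value(raw, charset=PAGE_CHARSET):
    """Decode one posted form value (bytes), assuming the browser answered
    in the same charset the form page was served with."""
    return raw.decode(charset)


# The same text arrives as different bytes depending on the page charset.
text = "gr\u00fc\u00df"
print(decode_form_value(text.encode("utf-8"), "UTF-8") == text)
print(decode_form_value(text.encode("iso-8859-1"), "ISO-8859-1") == text)
```

  If the browser fell back to its user-defined default encoding, this
  assumption breaks, which is why the page charset should always be
  sent explicitly.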

This is as I would expect it to be.



Dieter