[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

Yves Bastide ybastide at wanadoo.fr
Wed Jun 7 10:16:05 EDT 2006


yuppie wrote:
> Hi Yves!
> 
> 
> Yves Bastide wrote:
>> GenericSetup has problems handling non-ASCII data.
> 
> 1.) GenericSetup explicitly doesn't support non-UTF-8 XML in profiles. 
> UTF-8 is the default encoding for XML and I can't see a need to support 
> other XML encodings.

As output, right? Agreed.

> 
> 2.) GenericSetup explicitly doesn't support non-UTF-8 site settings. If 
> someone provides a good patch this feature can be added.

But with the problems you mention later ('default_charset', 
'management_page_charset', and so on), how would you envision it?

> 
> 3.) GenericSetup is not tested with non-ASCII UTF-8 site settings. AFAIK 
> import works, but not export. I consider this a bug.

Neither works for me. On CMF trunk, change the title of
portal_types/Document to 'Dôcument', then export:

Traceback (innermost last):
   Module ZPublisher.Publish, line 115, in publish
   Module ZPublisher.mapply, line 88, in mapply
   Module ZPublisher.Publish, line 41, in call_object
   Module Products.GenericSetup.tool, line 471, in manage_exportAllSteps
   Module Products.GenericSetup.tool, line 272, in runAllExportSteps
   Module Products.GenericSetup.tool, line 736, in _doRunExportSteps
   Module Products.CMFCore.exportimport.typeinfo, line 198, in exportTypesTool
   Module Products.GenericSetup.utils, line 728, in exportObjects
   Module Products.GenericSetup.utils, line 722, in exportObjects
   Module Products.GenericSetup.utils, line 501, in _exportBody
   Module xml.dom.minidom, line 62, in toprettyxml
   Module StringIO, line 271, in getvalue
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 20: ordinal not in range(128)
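
A minimal sketch of what seems to go wrong, written for Python 3 for
convenience (the traceback above comes from Python 2, where the ASCII
decode happens implicitly when byte strings and unicode strings are
joined). The byte 0xf4 is 'ô' in ISO-8859-1/-15. The helper name
export_title and the choice of ISO-8859-15 as the site encoding are
assumptions for illustration, not GenericSetup API:

```python
from xml.dom.minidom import Document

# 'Dôcument' stored as ISO-8859-15 bytes (assumed site encoding).
raw_title = b"D\xf4cument"

# Spell out the decode that Python 2 attempted implicitly.
try:
    raw_title.decode("ascii")
    implicit_decode_failed = False
except UnicodeDecodeError as exc:
    implicit_decode_failed = True
    failing_byte = raw_title[exc.start:exc.end]


def export_title(raw, site_encoding):
    """Hypothetical helper: decode to text first, then serialize as UTF-8."""
    doc = Document()
    node = doc.createElement("object")
    node.setAttribute("title", raw.decode(site_encoding))
    doc.appendChild(node)
    # With an explicit encoding, minidom returns bytes.
    return doc.toprettyxml(indent=" ", encoding="utf-8")


xml_bytes = export_title(raw_title, "iso-8859-15")
```

The point of the sketch is only that the DOM should be fed text, never
undecoded byte strings, and that the encoding is chosen once, at
serialization time.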


> 
>> It treats strings sometimes as ASCII, sometimes as UTF-8, yet it has 
>> access to two variables: its own ISetupContext.getEncoding() (whose 
>> use I didn't fully grok) and CMF's 
>> ISetupContext.getSite().getProperty('default_charset').
> 
> Sorry, but your assumptions are wrong:
> 
> - The default setup tool creates export contexts without specifying the 
> encoding, so ISetupContext.getEncoding() returns always None. And even 
> if it would be set it represents the encoding of the exported files, not 
> the site encoding.
> 
> - getSite().getProperty('default_charset') is CMF specific and should 
> not be used in GenericSetup.
> 
> - The adapters adapt ISetupEnviron, not ISetupContext. getEncoding() and 
> getSite() are not always available.

Thanks for setting me right. What is getEncoding() useful for, then? As
you say, exported files never need an encoding other than UTF-8.

> 
> First of all we need unit tests that make sure UTF-8 works and I think 
> this should be the default used by GenericSetup. Code that needs to know 
> how to find the site encoding can't be generic.

Yep.
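
Something along these lines could be a starting point for such tests. It
is only a sketch against plain xml.dom.minidom in Python 3, not against
the GenericSetup adapters; the class name, the PROFILE_XML constant and
the sample XML are made up for illustration:

```python
import unittest
from xml.dom.minidom import parseString

# A tiny stand-in for a profile file, encoded as UTF-8 (hypothetical content).
PROFILE_XML = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<object name="Document" title="Dôcument"/>\n'
).encode("utf-8")


class UTF8RoundTripTests(unittest.TestCase):

    def test_import_decodes_non_ascii_title(self):
        root = parseString(PROFILE_XML).documentElement
        self.assertEqual(root.getAttribute("title"), "Dôcument")

    def test_export_emits_utf8_bytes(self):
        exported = parseString(PROFILE_XML).toxml(encoding="utf-8")
        self.assertIsInstance(exported, bytes)
        self.assertIn("Dôcument".encode("utf-8"), exported)

    def test_export_can_be_reimported(self):
        exported = parseString(PROFILE_XML).toxml(encoding="utf-8")
        root = parseString(exported).documentElement
        self.assertEqual(root.getAttribute("title"), "Dôcument")
```

Real tests would of course have to go through the import and export
adapters with a non-ASCII property value instead of calling minidom
directly.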

> 
> There is an additional problem: If tools use the default property edit 
> page from OFS the properties might have a different encoding than 
> 'default_charset' of the site. Since the default 
> 'management_page_charset' is UTF-8 we have less trouble if we allow only 
> UTF-8.

D'oh! /manage is ISO-8859-15, /manage_menu is ISO-8859-1 and
manage_propertiesForm is UTF-8. No wonder Firefox sometimes gets
confused :-)
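
To illustrate why mixed page charsets hurt, here is a small Python 3
sketch showing the same text read under the wrong charset. The sample
string is made up; only the standard codecs are used:

```python
text = "Dôcument €"

as_utf8 = text.encode("utf-8")
as_latin9 = text.encode("iso-8859-15")

# Bytes written as ISO-8859-15 but read as ISO-8859-1: only the euro
# sign changes, because the two charsets differ in just a few positions.
latin9_read_as_latin1 = as_latin9.decode("iso-8859-1")

# Bytes written as UTF-8 but read as ISO-8859-1: every non-ASCII
# character turns into several wrong ones.
utf8_read_as_latin1 = as_utf8.decode("iso-8859-1")

# Bytes written as ISO-8859-15 but read as UTF-8: not even decodable.
try:
    as_latin9.decode("utf-8")
    latin9_is_valid_utf8 = True
except UnicodeDecodeError:
    latin9_is_valid_utf8 = False
```

A value saved through one form and displayed through another can thus
be silently altered or fail outright, depending on the pair of charsets
involved.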

Well, I think I can wriggle out of most of my problems using 
translation. And I'll try and write UTF-8 unit tests if nobody beats me 
to it.

Thanks!

> 
> 
> Cheers,
> 
>     Yuppie

yves


