[Grok-dev] how to write doctest in unicode?

Uli Fouquet uli at gnufix.de
Fri Aug 29 23:04:47 EDT 2008


Hi Brandon,

Brandon Craig Rhodes wrote:
> I am attempting to write a doctest file by creating a .txt file and
> putting ":Test-Layer: unit" up at the top, as outlined in:
> 
>    http://grok.zope.org/documentation/how-to/tests-with-grok-testing
> 
> But I cannot figure out how to make it Unicode-friendly; I am getting
> mismatch reports like:
> 
>     - u'1¼ years'
>     ?    ^^
>     + u'1\xbc years'
>     ?    ^^^^
> 
> Is there something like an ":Encoding: utf-8" that I can put at the top
> of my doctest to get things working?

I'm glad you asked this. It seems that several people have run into that
trap ;-)

Yes, you can, but there is no need to do so, because with
`z3c.testsetup` all doctests are registered automatically as utf-8.

If you want to override this (not needed in your case), you can pass the
'encoding' option to `register_all_tests`::

  test_suite = register_all_tests('mypackage', encoding='utf-8')

Furthermore I think (haven't checked this) you can write::

  # -*- coding: utf-8 -*-

in doctests, as in normal Python modules, to declare a certain encoding.
I don't know whether this still works when the encoding option is set on
the tests (which is done automatically).

Now for your real problem. It has nothing to do with the testing
machinery at all. You can do the following in any Python shell, not only
in doctests::

  >>> val = u'1¼'
  >>> val
  u'1\xbc'

What you got is the escaped representation of the string, which is the
result of a call to `val.__repr__()` that the interactive interpreter
performs automatically whenever it echoes a value. In it, character
codes from 128 to 255 are escaped as '\x' followed by the two-digit hex
character code. People from countries with umlauts in the language are
quite used to it :-)
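To make the difference between the string itself, its escaped form and its
encoded bytes concrete, here is a minimal sketch. It writes the character as
an escape so that the source stays pure ASCII, and it uses the
`unicode_escape` codec as a stand-in for what the interpreter echoes; the
variable names are only illustrative.

```python
# The same two-character string as above, written with an escape so
# that this snippet does not depend on any source-file encoding.
val = u'1\xbc'

# The string holds two characters: '1' and the one-quarter sign.
length = len(val)
code_point = ord(val[1])

# The escaped form mirrors what the interpreter echoes: code point
# 0xBC becomes a backslash, an 'x' and two hex digits.
escaped = val.encode('unicode_escape')

# A real byte encoding is something else again: UTF-8 needs two
# bytes for this one character.
utf8_bytes = val.encode('utf-8')
```

So the '\xbc' in the mismatch report is only a way of spelling the character in ASCII; the string itself is unchanged.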

If you want to see the 'real' encoded string, you have to use `print`
and friends::

  >>> print val
  1¼

This gives you what you want, and it also works in doctests.
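As a rough sketch of the doctest case, the following feeds a one-example
doctest through the standard `doctest` machinery directly, without
`z3c.testsetup`. It assumes a Python 3 interpreter (hence `print(val)` written
as a function call), and `run_doctest_text` is a helper name made up for this
illustration.

```python
import doctest

# Doctest text as it might appear in a .txt file. The character is
# written as \xbc only to keep this snippet independent of the
# file's encoding; the parsed text contains the real character.
DOCTEST_TEXT = u"""
    >>> print(val)
    1\xbc
"""


def run_doctest_text(text, namespace):
    """Run the doctest examples in `text`; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, namespace, 'sketch', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard any failure report; only the counts matter here.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


failed, attempted = run_doctest_text(DOCTEST_TEXT, {'val': u'1\xbc'})
```

Because the example prints the value instead of echoing it, the expected output can contain the plain character.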

Hope that helps,

-- 
Uli
