[Zope-Coders] Analysis: BTrees and Unicode and Python

Jeremy Hylton jeremy@zope.com
Fri, 19 Oct 2001 11:58:39 -0400 (EDT)


>>>>> "AJ" == Andreas Jung <andreas@zope.com> writes:

  AJ> After lots of debugging, here is an explanation for the behaviour
  AJ> we have seen in the unittest:

  AJ> - The BTrees code calls PyObject_Compare() several times before
  AJ>   the comparison that failed (unicode vs. unicode)

  AJ> - one of these earlier comparisons checks a Python string
  AJ>   (containing an accented character) against a unicode string
  AJ>   and raises a unicode exception (ASCII decoding error: ordinal
  AJ>   not in range(128)).  I assume this is because the default
  AJ>   encoding is ascii.

  AJ> - the BTree code does not check for an exception after
  AJ>   PyObject_Compare(), so this error was never cleared

Who's going to fix this code?
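
For what it's worth, here is a rough sketch of the kind of check that
is missing.  It is a toy model, not the BTrees code: pending_error
stands in for the interpreter's error indicator, toy_compare() stands
in for PyObject_Compare(), and toy_obj, find_unchecked() and
find_checked() are made-up names.  In the real code the test after
each compare would be PyErr_Occurred().

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the interpreter's error indicator (hypothetical). */
static const char *pending_error = NULL;

/* Toy object: a byte string or a "unicode" string (hypothetical). */
typedef struct {
    int is_unicode;
    const char *data;
} toy_obj;

static int has_non_ascii(const char *s)
{
    for (; *s; s++)
        if ((unsigned char)*s > 127)
            return 1;
    return 0;
}

/* Stand-in for PyObject_Compare(): returns -1, 0 or 1.  A mixed
   comparison against a non-ASCII byte string sets the error
   indicator and returns -1. */
static int toy_compare(const toy_obj *v, const toy_obj *w)
{
    int c;
    if (v->is_unicode != w->is_unicode) {
        const toy_obj *bytes = v->is_unicode ? w : v;
        if (has_non_ascii(bytes->data)) {
            pending_error = "ASCII decoding error";
            return -1;
        }
    }
    c = strcmp(v->data, w->data);
    return (c > 0) - (c < 0);
}

/* No error check: a failed compare looks like "less than", the search
   carries on, and the error indicator is left set for a later caller. */
static int find_unchecked(const toy_obj *keys, size_t n, const toy_obj *target)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (toy_compare(&keys[i], target) == 0)
            return (int)i;
    return -1;
}

/* With the check: stop at the first failed compare and report it
   (-2 here) so the caller can propagate or clear the error. */
static int find_checked(const toy_obj *keys, size_t n, const toy_obj *target)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int c = toy_compare(&keys[i], target);
        if (pending_error != NULL)
            return -2;
        if (c == 0)
            return (int)i;
    }
    return -1;
}
```

With a non-ASCII byte string stored ahead of a matching unicode key,
find_unchecked() still returns the index of the match but leaves
pending_error set, which is the stale-error situation described above;
find_checked() reports the failure at the comparison that caused it.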

  AJ> - when trying to compare two identical unicode strings,
  AJ>   Python calls default_3_way_compare() and runs into the
  AJ>   following code:

Is this with 2.1 or 2.2?

I'm looking at the code in CVS and can't figure out how we get there.
If we're comparing two Unicode objects, we should always take the
first test in do_cmp(), because unicode objects have tp_compare
defined:

	if (v->ob_type == w->ob_type
	    && (f = v->ob_type->tp_compare) != NULL) {
		c = (*f)(v, w);
		if (c != 2 || !PyInstance_Check(v))
			return c;
	}

Jeremy