[Zope-Coders] unicode question

Fred L. Drake, Jr. fdrake@acm.org
Fri, 5 Oct 2001 10:17:31 -0400


Guido van Rossum writes:
 > (I'm still thinking about whether to provide a common base for str and
 > unicode; the more I see this kind of examples, the more I think there
 > should be one.  But what to call it?  abstractstring?  String?  string?)

  This is a fairly important case, and the mixed use of 8-bit strings
as both byte arrays and character buffers doesn't help.  (Byte buffers
should generally *not* return true for isinstance(ob, AbstractString),
but where code currently tests for an abstract string with two
isinstance() calls or type(s) in (StringType, UnicodeType), that's
probably not an issue.)
  Using two isinstance() calls is more flexible than the type() test
and captures the abstraction correctly, since it also accepts
subclasses, but it has a lot of overhead.  Perhaps a new function
implemented in C could be used to solve the problem; isString(s)
would be the equivalent of:

----------------------------------------------------------------------
from types import StringType, UnicodeType

def isString(s):
    # True for 8-bit strings, Unicode strings, and subclasses of either.
    return isinstance(s, StringType) or isinstance(s, UnicodeType)
----------------------------------------------------------------------

  I'm not sure where the best place to put it is; there are a couple
of similar predicates in the operator module, but that generally seems
a (moderately) bad place for them.
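
  For comparison, the common-base idea from the quoted text can be
sketched as below.  This is only an illustration written for
present-day Python 3, where str is the sole text type and bytes is a
separate byte-buffer type; the AbstractString class and the use of
abc registration are assumptions made for the sketch, not an existing
or proposed API.

```python
from abc import ABC


class AbstractString(ABC):
    """Hypothetical common base for text string types."""


# Register the text type as a virtual subclass.  bytes and bytearray
# are deliberately not registered, so byte buffers do not count as
# abstract strings.
AbstractString.register(str)


def isString(s):
    # A single isinstance() call against the common base replaces the
    # pair of per-type checks.
    return isinstance(s, AbstractString)


class Name(str):
    """Example subclass; it is accepted because str is registered."""


print(isString("spam"))        # text string
print(isString(Name("spam")))  # subclass of a text string
print(isString(b"spam"))       # byte buffer
```

  With a shared base, callers need one isinstance() check, and a byte
buffer type is excluded simply by not registering it.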


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation