[Zope] eliminating dupes in a list

Tres Seaver tseaver@palladion.com
Sun, 09 Apr 2000 21:17:17 -0500


"Robert Roy" <rjroy@takingcontrol.com> wrote:
> 
> I wrote:
> 
> >sathya <linuxcraft@redspice.com> asked:
> >
> >> I have a list to pass in as a parameter to dtml-in but before doing that I
> >> would like to eliminate duplicates from the list.
> >> I.e., in ['1','2','1'] I want to skip the duplicate '1'. Is there a Zope hack
> >> for this, or do I have to use an external method?
> >
> > This requires some Python expression trickery which can't (currently) be
> > done within DTML (filter and map aren't available to DTML).  You are
> > probably better off using a PythonMethod for such logic.  For grins, I
> > used the Python interpreter to bang out the following Python expression:
> >
> >  filter( None, map( lambda i, d={}:
> >                     ( i, None )[ d.has_key(i) or d.update( {i: 1} ) or 0 ]
> >                   , foo ) )
> >
> >This is too convoluted to use in production code (and it strips out 0 values,
> >too) -- much better a nice, straightforward, "Pythonic" solution, a Python
> >method 'uniq' taking a single argument, 'items':
> >
> > d = {}
> > for item in items:
> >     if not d.has_key( item ):
> >        d.update( { item: 1 } )
> > return d.keys()
> >
> >Call from DTML:
> >
> >  <dtml-in "uniq( myItems )" sort>
> >    ...
> >  </dtml-in>
>
> Your uniq method is not as fast as it could be. The call to has_key is
> superfluous, and each update call has to create a dictionary which then gets
> thrown away.
> 
> All you need to do is:
> def uniq2(items):
>     d = {}
>     for item in items:
>         d[item]=1
>     return d.keys()
> 
> This saves creating a throwaway dictionary and hashing the key twice for
> every item. It runs about 2-3 times faster.

<timings proving this snipped>

Yes, if I had been doing it from native Python (i.e., within an ExternalMethod
or a Python product), that is how I would have done it.  But PythonMethods use
bytecodehacks to prevent several "unsafe" operations, one of which is assignment
into a mutable structure (i.e., calling __setitem__).  Try pasting the body of
your method into the body of a PythonMethod -- you get:

      Error Type: Python Method Error
      Error Value: Forbidden operation STORE_SUBSCR at line 4

That I can't even assign into a "local" mutable is a defect of PythonMethods,
but one I'm not competent to fix (I'm CC'ing Evan on this one in case he knows
of something better).  The "update()" version works because, to bytecodehacks,
it looks like an ordinary method call; the actual store happens in C, inside
the dictionary's update() method, where the check never sees it.  Ugly, but it
does the trick.  I should have noted the reason for the hack in the original
post.
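
To see the distinction at the bytecode level, here is a minimal sketch using
the standard "dis" module.  It assumes a current CPython 3, which is not what
PythonMethods ran on, and it does not reproduce the actual bytecodehacks
checks.  The function names are made up for illustration.  It shows only that
subscript assignment compiles to a STORE_SUBSCR opcode, while the update()
spelling does not:

```python
import dis


def assign_version(items):
    # Subscript assignment: the compiler emits STORE_SUBSCR for d[item] = 1.
    d = {}
    for item in items:
        d[item] = 1
    return list(d.keys())


def update_version(items):
    # Same effect, but spelled as a method call on a throwaway dict literal.
    d = {}
    for item in items:
        d.update({item: 1})
    return list(d.keys())


def opnames(func):
    # Collect the set of opcode names that appear in a function's bytecode.
    return {ins.opname for ins in dis.get_instructions(func)}


print("STORE_SUBSCR" in opnames(assign_version))
print("STORE_SUBSCR" in opnames(update_version))
```

Both functions return the same keys; only the opcodes a bytecode-level filter
would see are different.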

WRT the superfluity of has_key(), you are correct -- I think it is there as a
fossil from a previous incarnation.  So, for those watching at home, the minimal
form for doing this in a PythonMethod is:

    d = {}
    for item in items:
        d.update( { item: 1 } )
    return d.keys()
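
Outside a PythonMethod, on a modern Python (3.7 or later, which is an
assumption and not part of the original thread), the same dictionary-key idea
can be sketched with dict.fromkeys(), which also keeps first-seen order.  The
name "uniq_ordered" is hypothetical.  Note that has_key() no longer exists in
Python 3, and keys() returns a view there, hence the list() call:

```python
def uniq_ordered(items):
    # Dictionary keys are unique, and dicts preserve insertion order on 3.7+,
    # so each item survives once, at the position where it first appeared.
    return list(dict.fromkeys(items))


print(uniq_ordered(['1', '2', '1']))
```

As with the versions above, this only works for hashable items.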

Thanks!

Tres.
-- 
=========================================================
Tres Seaver  tseaver@digicool.com   tseaver@palladion.com