[Zope-dev] Runaway processes

Dieter Maurer dieter at handshake.de
Fri Dec 7 13:53:56 EST 2007


Stephan Richter wrote at 2007-12-5 17:47 -0500:
> ...
>On Unix-like systems, we can use `os.fork()`. The advantage of this approach 
>is that I can use OS system calls to kill the process. However, ZODB database 
>storages cannot be shared between processes. Nikolay Kim has done some 
>preliminary experiments and found that `db.open()` locks the system (for 
>both, `FileStorage` and `ZeoClientStorage`). I have not verified these 
>results or tried to figure out why it is hanging, but I can see the problem 
>for `FileStorage`.
>
>Are there any known side-effects on what happens, if I fork after the 
>connection has been made?

We are using this kind of architecture to generate our newsletters:

  A scheduling process periodically checks the ZODB for
  new work (newsletters to be published). It does this
  via ZCatalog queries.

  If the scheduler finds a newsletter to publish, it forks
  and lets the child produce the newsletter.

I had to do some tricks to get it working -- and new ZODB versions
tend to require more tricks.

My code currently looks like this:

    from os import fork

    pid = fork()
    if not pid:
      # the call below is necessary to prevent the child from
      # stealing messages destined for the parent
      clearParentZODBState()
      config.setup() # reopen the storage so as not to confuse the ZEO protocol
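
For the "runaway" part of the question, the parent side of such a fork could be sketched roughly as follows. This is only an illustration, not the code we run: "run_in_child", its "timeout" parameter and the polling interval are made-up names and values, and "work" stands for whatever the child does after it has cleared the parent's ZODB state and reopened the storage.

```python
import os
import signal
import time


def run_in_child(work, timeout=None):
    """Fork, run work() in the child, and return its exit code.

    Hypothetical helper: if `timeout` (seconds) elapses first, the
    child is killed with SIGKILL and the negative signal number is
    returned instead.
    """
    pid = os.fork()
    if pid == 0:
        # Child: never return into the parent's call stack.
        status = 1
        try:
            work()
            status = 0
        finally:
            # os._exit skips atexit handlers inherited from the parent.
            os._exit(status)

    # Parent: block if there is no timeout, otherwise poll.
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        flags = 0 if deadline is None else os.WNOHANG
        done, wait_status = os.waitpid(pid, flags)
        if done:
            break
        if time.monotonic() >= deadline:
            # Runaway child: kill it, then reap it to avoid a zombie.
            os.kill(pid, signal.SIGKILL)
            _, wait_status = os.waitpid(pid, 0)
            break
        time.sleep(0.05)

    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -os.WTERMSIG(wait_status)
```

A scheduler would more likely keep the pids around and reap them from its main loop instead of blocking in the helper, but the kill-and-reap step stays the same.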

"clearParentZODBState" looks like this (for ZODB 3.4):

def clearParentZODBState():
  '''called in the forked child to clear the parent's ZODB state
  in order to prevent the child from intercepting messages destined
  for the parent.

  Almost surely dependent on the ZODB version.
  '''
  # necessary for ZODB 3.2
  from asyncore import socket_map
  socket_map.clear() # get rid of any handlers for the parent's IO
  # necessary for ZODB 3.4
  try: from transaction import manager
  except ImportError: manager = None
  if manager is not None:
    manager._txns.clear() # get rid of the parent's transactions
    manager._synchs.clear() # get rid of the parent's synchronizers

"config.setup" looks like:
    ....
    s = ClientStorage((zeoServer, int(zeoPort)))
    db = DB(s,
            version_cache_size=2000,
            )
    db.setClassFactory(ClassFactory)
    c = db.open(temporary=1)
    ....

The approach is viable only when you have truly long-running
processes (and not for quick requests), as opening a new
connection is expensive (mainly because the cache is initially empty).


Currently, we occasionally see a non-deterministic LDAP problem.
I suspect that the LDAP connection is shared by the forked
processes -- and, understandably, the LDAP server does not
expect to get requests from different, unsynchronized sources
on the same connection.
Apart from that, our solution is working (at least until
the next ZODB upgrade).
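
One generic way to keep a forked child from reusing a connection it inherited is to tie the connection to the pid that created it. The sketch below is an assumption about how that could look, not something we have deployed; "PerProcess" and "factory" are hypothetical names, and "factory" stands for whatever opens the LDAP (or other) connection.

```python
import os


class PerProcess:
    """Lazily create a resource; recreate it when the pid changes."""

    def __init__(self, factory):
        self._factory = factory   # callable that opens a new connection
        self._pid = None          # pid that owns the current resource
        self._resource = None
        self._stale = []          # inherited resources, kept alive on purpose

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First use, or we are in a forked child. The inherited
            # object is neither closed nor released: a close or a
            # destructor might still talk on the socket shared with
            # the parent.
            if self._resource is not None:
                self._stale.append(self._resource)
            self._resource = self._factory()
            self._pid = pid
        return self._resource
```

Every caller then uses "get()" instead of holding on to the connection object, so the first access after a fork opens a fresh connection. Whether the inherited connection can be discarded safely in the child depends on the client library.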



-- 
Dieter

