[Zope-dev] RE: [ZODB-Dev] [Warning] Zope/ZEO clients: subprocesses can lead to non-deterministic message loss

Tim Peters tim.peters at gmail.com
Sun Jun 27 17:06:52 EDT 2004


[Dieter Maurer]
> The problem occurred in a ZEO client which called "asyncore.poll"
> in the forked subprocess. This "poll" deterministically
> stole ZEO server invalidation messages from the parent.

I'm sorry, but this is still too vague to guess what happened.

- Which operating system was in use?

- Which thread package?

- In the ZEO client that called fork(), did it call fork() directly, or
 indirectly as the result of a system() or popen() call?  Or what?
 I'd like to understand a specific failure before rushing to
 generalization.

- In the ZEO client that called fork() (whether directly or indirectly),
 was fork called *from* the thread running ZEO's asyncore loop,
 or from a different thread?
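
Whatever the answers turn out to be, the basic mechanism in the report
can be sketched in isolation: a forked child inherits the parent's open
sockets, so a read in the child consumes bytes the parent then never
sees.  This is only a minimal sketch, assuming a POSIX system.  The
socketpair stands in for a ZEO connection, the message text and the
function name are hypothetical, and nothing here uses ZEO or asyncore.

```python
import os
import select
import socket


def child_steals_message():
    """Return (bytes the child read, whether the parent still sees data)."""
    server_end, client_end = socket.socketpair()
    server_end.sendall(b"invalidate oid 42")  # hypothetical message

    # Pipe used only so the child can report what it read.
    report_r, report_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: client_end refers to the same underlying socket
        # as client_end in the parent.
        try:
            os.write(report_w, client_end.recv(1024))
        finally:
            os._exit(0)

    os.close(report_w)
    os.waitpid(pid, 0)
    stolen = os.read(report_r, 1024)
    os.close(report_r)

    # The parent polls its own copy of the socket: nothing is left.
    readable, _, _ = select.select([client_end], [], [], 0.1)
    server_end.close()
    client_end.close()
    return stolen, bool(readable)
```

In this sketch it is the forked child itself that reads from the
socket.  Whether a *thread* other than the one calling fork() can do
so in the child depends on the threading implementation, which is why
the questions above matter.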

> I read the Linux "fork" manual page and found:
> 
>  fork creates a child process that differs from the parent process
>  only in its PID and PPID, and in the fact that resource utilizations
>  are set to 0. File locks and pending signals are not inherited.
> 
>  ...
> 
>  The fork call conforms to SVr4, SVID, POSIX, X/OPEN, BSD 4.3

If it conforms to POSIX (as it says it does), then fork() also has to
satisfy the huge list of requirements I referenced before:

   http://www.opengroup.org/onlinepubs/009695399/functions/fork.html

That page is the current POSIX spec for fork().
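
One requirement on that page is that the child of a multithreaded
process contains a replica of only the thread that called fork().  A
rough way to check this on a given box is sketched below.  It assumes
Linux with /proc mounted (each entry under /proc/self/task is one
kernel-level thread), and the function names are made up for the
illustration.

```python
import os
import threading


def count_os_threads():
    # Linux-specific assumption: one directory entry per thread.
    return len(os.listdir("/proc/self/task"))


def thread_counts_across_fork():
    """Return (thread count in parent, thread count in forked child)."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)  # idle second thread
    worker.start()
    parent_count = count_os_threads()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report how many threads exist here, then exit at once.
        try:
            os.write(write_fd, str(count_os_threads()).encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    os.waitpid(pid, 0)
    child_count = int(os.read(read_fd, 16))
    os.close(read_fd)

    stop.set()
    worker.join()
    return parent_count, child_count
```

On a system with POSIX thread semantics the parent should report at
least two threads and the child exactly one.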

> I concluded that if the only difference is in the PID/PPID
> and resource utilizations, there is no difference in the threads between parent
> and child.  

Except that if you're running non-POSIX LinuxThreads, a thread *is* a
process (there's a one-to-one relationship under LinuxThreads, not the
many-to-one relationship in POSIX), in which case "no difference in
threads" is trivially true.

> This would mean that the widespread "asyncore.mainloop" threads could suffer
> the same message loss and message duplication.

That's why all sane <wink> threading implementations do what POSIX
does on a fork().  fork() and threading don't really mix well under
POSIX either:  the "fork+exec" model for starting a new process is
an historical burden that bristles with subtle problems in a
multithreaded world.  POSIX introduced posix_spawn() and posix_spawnp()
for sane(r) process creation, ironically moving closer to what most
non-Unix systems have always done to create a new process.
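
As an illustration of the spawn style, here is a minimal sketch using
os.posix_spawn(), which Python exposes from version 3.8 on platforms
that provide the call.  The helper name spawn_and_wait is hypothetical,
and the sketch assumes argv[0] is a full path to an executable.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv as a new process without calling fork() from Python.

    Returns the exit code, or None if the process did not exit normally.
    """
    pid = os.posix_spawn(argv[0], argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(7)"])
    print("child exit code:", code)
```

The new process starts from a fresh program image, so none of the
caller's Python code (an asyncore loop, for instance) ever runs in it.
Open file descriptors are still inherited unless they are marked
close-on-exec or closed via file actions.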

> I did not observe a message loss/duplication in any
> application with an "asyncore.mainloop" thread.

I don't understand.  You said that you *have* seen message
loss/duplication in a ZEO client, and I assume the ZEO client was
running an asyncore thread.  If so, then you have seen
loss/duplication in an application with an asyncore thread.

Or are you saying that you haven't seen loss/duplication under the
specific Linux flavor whose man page you quoted, but have seen it
under some other (so far unidentified) system?

> Maybe the Linux "fork" manual page is just not precise with respect
> to threads and the problem does not occur in applications
> with a standard "asyncore.mainloop" thread.

That "fork" manpage is clearly missing a mountain of crucial details
(or it's not telling the truth about being POSIX-compliant).  fork()
is historically poorly documented, though.

