[Zope-dev] Re: [ZODB-Dev] [Problem] strange state after SIGSEGV

Dieter Maurer dieter at handshake.de
Mon Mar 22 14:11:28 EST 2004


Sorry, this message was intended for "zope-dev".
I accidentally sent it to "zodb-dev". Redirecting...

Shane Hathaway wrote at 2004-3-22 11:03 -0500:
>Dieter Maurer wrote:
>> This problem report is for Zope 2.7.0, Python 2.3.3, Linux 2.4.19.
>> 
>> After an application provoked SIGSEGV (caused by a C runtime stack overflow),
>> my Zope process entered a strange (and unhealthy) state:
>> 
>>   Zope did not die completely (as it should have done) but only partially:
>>   One of the threads had disappeared, the others were in
>>   the following state:
>> 
>>     *  their parent pid has been set to "1"
>> 
>>     *  attaching with "GDB" was only allowed as "root"
>> 
>>     *  at least two of the three remaining processes were waiting in "accept"
>> 
>>     *  they would not die on SIGTERM but only SIGKILL
>> 
>>   Consequences:
>> 
>>     *  Zope no longer responded to requests
>> 
>>     *  "stop" did not work (as "SIGTERM" was ineffective)
>> 
>>     *  "start" did not work, as the dangling processes kept
>>        the HTTP port bound.
>> 
>> 
>> Does anyone understand what can cause such a strange state?
>
>While developing, this happens all the time for me.  The most reliable 
>way to get there is to Ctrl-C out of a 'pdb' session.
>
>I can explain some of it.  Python threads other than the main thread set 
>a mask that blocks most signals, but SIGKILL (9) can't be blocked.  You 
>can find out the signal mask for a process by looking at the SigBlk line 
>of /proc/(process_id)/status.  I think Python freezes because a lock 
>held by the dead thread never gets released--perhaps the storage's 
>commit lock.  The parent pid and gdb issues could be normal for Python 
>threads.
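
The SigBlk check described in the quoted text can be sketched as follows.
This is a minimal illustration in modern Python, not something taken from
Zope or zdaemon: the helper names "blocked_signals" and "read_status" and
the sample status text are hypothetical. It assumes the Linux convention
that SigBlk is a hexadecimal bitmask in which bit (n - 1) stands for
signal number n.

```python
import signal


def blocked_signals(status_text):
    """Return the signal numbers set in the SigBlk mask of a /proc status dump."""
    for line in status_text.splitlines():
        if line.startswith("SigBlk:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            # Bit (n - 1) of the mask corresponds to signal number n.
            return {n for n in range(1, 65) if mask & (1 << (n - 1))}
    raise ValueError("no SigBlk line found")


def read_status(pid):
    """Hypothetical helper: read /proc/<pid>/status on a Linux system."""
    with open("/proc/%d/status" % pid) as f:
        return f.read()


# Made-up status excerpt in which only bit 10 (signal 11) is set.
sample = (
    "Name:\tpython\n"
    "SigPnd:\t0000000000000000\n"
    "SigBlk:\t0000000000000400\n"
)
blocked = blocked_signals(sample)
print(int(signal.SIGSEGV) in blocked)
```

On a live system one would pass read_status(pid) instead of the sample
text, once for each of the dangling processes.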

Thank you for your response!

The commit lock is held only during the last phase of a request
(the commit). It is unlikely that it is being held.
It definitely was not held in my case.

The parent pid and gdb issues are not normal for Python threads.

It looks wrong that a SIGSEGV does not terminate the complete
process. Such a behaviour interferes with what "zdaemon" is supposed
to do (restart Zope in case of a problem).
Python threads indeed block signal 11 (SIGSEGV).
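
To illustrate what a blocked signal means for a thread, here is a small
sketch in modern Python (the "signal" module calls used here did not exist
in Python 2.3, and the function name "probe" is made up). It blocks
SIGTERM in a worker thread, directs SIGTERM at that thread, and shows that
the signal only becomes pending instead of terminating anything. It is an
illustration of signal masks in general, not a reproduction of the state
described above.

```python
import signal
import threading


def probe(result):
    # Block SIGTERM for this thread only; the previous mask is returned.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})
    try:
        # Direct SIGTERM at this very thread. Because it is blocked,
        # it is queued as pending rather than delivered.
        signal.pthread_kill(threading.get_ident(), signal.SIGTERM)
        result["pending"] = signal.SIGTERM in signal.sigpending()
        # Consume the pending signal so that restoring the mask
        # cannot terminate the process.
        result["consumed"] = signal.sigwait({signal.SIGTERM})
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


result = {}
worker = threading.Thread(target=probe, args=(result,))
worker.start()
worker.join()
print(result)
```

SIGKILL cannot be blocked this way, which matches the observation that
only SIGKILL got rid of the dangling processes.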

-- 
Dieter


