[Zope] Need help : Zope servers hanging.

Phil Harris phil.harris@zope.co.uk
Mon, 31 Jan 2000 10:24:45 -0000


Tone,

It looks as if the problems you are having are fixed in version 2.1.3.

<quote>
        - A race condition in the logic for managing Zope database
          connections caused Zope to hang on very busy sites.

	- A bug in the packing code that caused records to be
          unreadable after:

	  o  someone did work in a version

	  o  someone did an (unrelated) undo

	  o  the version was committed

	  and the database was packed to a time before the work was done
	  in the version.

	- Fixed a bug that caused packing to raise an
	  error in the following situation:

	    o someone modifies and then deletes an object
	      in a version.

	    o they commit the version

	    o the database is packed between the time the
	      object is deleted and the time the version
	      is committed.

        - Fixed a bug that caused Zope to sometimes hang instead of
          shutting down or restarting when accessed over a fast network.

        - It wasn't possible to use a ZClass instance as a method of a
          ZClass.

</quote>

Have you tried upgrading?  I'd recommend it.
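
Until you can upgrade, something along these lines might help you spot a
hang without having to watch the sites by hand. It is only a sketch in
current Python, not anything shipped with Zope: the function name
is_responsive and the default timeout are my own assumptions. It opens a TCP
connection, sends a bare HTTP request and waits for the first byte of a reply.
A server that accepts the connection but never answers (which is what your
hang sounds like) is reported as unresponsive once the timeout expires.

```python
import socket


def is_responsive(host, port, timeout=10.0):
    """Return True if the server sends any reply within `timeout` seconds.

    Hypothetical helper, not part of Zope. A refused connection, a
    connection timeout, or a server that accepts the connection but
    never replies all count as "not responsive".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # Minimal HTTP/1.0 request; any byte coming back is enough.
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            return bool(sock.recv(1))
    except OSError:
        # Covers socket timeouts, refused connections and resets.
        return False
```

You could call this from cron against whatever port your ZServer listens on
(I don't know which port that is on your machines) and restart the site when
it returns False.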

Phil
phil.harris@zope.co.uk

|>-----Original Message-----
|>From: zope-admin@zope.org [mailto:zope-admin@zope.org]On Behalf Of Tony
|>McDonald
|>Sent: Monday, January 31, 2000 9:25 AM
|>To: Zope List
|>Subject: [Zope] Need help : Zope servers hanging.
|>
|>
|>Hi all,
|>I need some help here. Over the past few days two different Zope
|>servers have gone into a 'hanging' state, where they don't reply to
|>further requests. When the first hang happened I didn't capture a
|>'top' listing, but today I managed to get one. The process causing
|>the problem is 12482. From previous messages, I believe the python
|>process can go up to 100% CPU, but that obviously isn't happening
|>here. This hang happened when I asked the Zope server to make a MySQL
|>query. The MySQL server is running fine and I can get to it from a
|>command line interface.
|>
|>Both servers are Zope 2.1.2 source distributions running under Solaris 5.6
|>
|>This server is running three different Zope sites with Apache as the
|>front end (ie I'm using pcgi to get to my servers). I can't get to it
|>using the pcgi route (ie a ReWrite Rule from Apache), nor from the
|>ZServer incarnation of the server.
|>
|>I also can't get to it from the monitor connection (telnet
|>localhost 8099).
|>
|>I can't let this situation continue as these are live sites. I need
|>to restart the server whenever this happens.
|>
|>Process list:
|>
|>load averages:  1.39,  1.11,  0.63
|>08:51:59
|>262 processes: 258 sleeping, 2 zombie, 2 on cpu
|>CPU states: 74.6% idle, 25.0% user,  0.4% kernel,  0.0% iowait,  0.0% swap
|>Memory: 512M real, 25M free, 560M swap in use, 736M swap free
|>
|>   PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
|>12482 nnle       9  -5    0   38M   20M cpu/0 267:57 24.94% python
|>23032 nnle       1  23    0 1992K 1456K cpu/2   0:00  0.32% top
|>  6736 nnle       8  33    0   12M 9360K sleep   3:55  0.00% roxen
|>15072 nnle       8  33    0   14M   11M sleep   0:59  0.00% python
|>  1848 nnle       7  33    0   10M 7728K sleep   0:40  0.00% python
|>15071 nnle       4 -25    0 4240K 1304K sleep   0:00  0.00% python
|>  1847 nnle       4 -25    0 4240K  856K sleep   0:00  0.00% python
|>   656 nnle       1 -25    0  928K  512K sleep   0:00  0.00% start
|>12481 nnle       4  -5    0 4240K  856K sleep   0:00  0.00% python
|>18305 nnle       1  23    0 2056K 1832K sleep   0:00  0.00% tcsh
|>19302 nnle       1  23    0 2000K 1040K sleep   0:00  0.00% tcsh
|>19481 nnle       1  33    0 1000K  672K sleep   0:00  0.00% grep
|>
|>
|>The only other data I have is that the pcgi for this site is shown as
|>running in the process list quite a few times.
|>   nobody 23059  4659  0 08:54:28 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>   nobody 23135  4716  0 09:03:18 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>   nobody 23069  4574  0 08:57:03 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>   nobody 23068  4753  0 08:56:52 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>   nobody 23144  4694  0 09:04:11 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>     nnle 12481     1  0   Jan 21 ?        0:00 /usr/local/bin/python /home/nnle/MED_DUR_NOTTS/z2.py
|>   nobody 23064  4737  0 08:55:59 ?        0:00 /home/nnle/MED_DUR_NOTTS/pcgi/pcgi-wrapper /home/nnle/MED_DUR_NOTTS/Zope.cgi
|>     nnle 12482 12481 25   Jan 21 ?       280:44 /usr/local/bin/python /home/nnle/MED_DUR_NOTTS/z2.py
|>
|>
|>*any* help at all on this would be really appreciated.
|>Tone
|>
|>------
|>Dr Tony McDonald,  FMCC, Networked Learning Environments Project
|>http://nle.ncl.ac.uk/
|>The Medical School, Newcastle University Tel: +44 191 222 5888
|>Fingerprint: 3450 876D FA41 B926 D3DD  F8C3 F2D0 C3B9 8B38 18A2
|>