[Zope-dev] Zope Server hanging :-(

Chris McDonough chrism@digicool.com
Wed, 2 May 2001 11:59:42 -0400


One of the more important bits is the "active" count when using
the --detailed option to requestprofiler.  This tells you how many *other*
requests were "unfinished" at the end of a particular request.

So for instance, if you run requestprofiler like this:

  python requestprofiler.py log.file.name --detailed

And you see a profile line like (header added for reference):

start                win wout wend tot S   osize  code act url
2001-04-19T01:19:42    0   12    0  12 E   26024  200    5
/Catalog/checkrep?P

This means that:

 - this request started at 2001-04-19 1:19:42 UTC ("start")

 - zserver waited 0 secs for input from the client ("win")

 - zserver waited 12 secs for output from the publisher ("wend")

 - the total time from start to end for the request was 12 secs ("tot")

 - the request reached the end ("E") stage ("S", one of "E", "A", "I", "B")

 - the output size in bytes was 26024 ("osize")

 - the http code was 200 ("code")

 - the number of pending requests at the end of this request was 5
("active")

 - the URL of the request ("url")

Hangs are also interesting.

Use detailed reporting to sort by end stage:

  requestprofiler.py your.log.file --detailed --sort=endstage

Any requests that don't show a stage of "E" for their endstage were "hung"
requests.

You can also use cumulative reporting to sort by "hangs":

  requestprofiler.py your.log.file --cumulative --sort=hangs

This will show the methods that "hung" most frequently.  Much of the time,
you can find problems this way by seeing methods that enter an infinte loop
or cause a segfault.

Total time is also interesting:

  requestprofiler.py your.log.file --detailed --sort=total

  Shows you individual calls to methods that took a long time.

  - AND -

  requestprofiler.py your.log.file --cumulative --sort=total

   Shows you where most of the request time is spent.

The goal is to indentify methods or requests that take a long time, don't
return, or are accessed frequently.  If you can match this data up with
specific problems you've experienced (possibly by way of times/dates), it
becomes easier to form postulations about what is happening to Zope under
load.  Then investigate those postulations by refactoring code or by doing
more detailed logging in the places where you think the problems are.

- C





> Hmmm, not really sure what I should be looking for :-S
>
> What ways should I look to run it and what should I do with the output?
>
> cheers,
>
> Chris
>