[Zope] Re: Zope hanging (poss. threads-related)

Tony Rossignol tonyr@ep.newtimes.com
Tue, 11 Apr 2000 10:13:11 -0700


Marcus Collins wrote:
>
> I've installed Zope 2.1.6 from source on the same machine, with as close to
> an identical setup as is possible on a live server. Even after really
> hammering 2.1.6, though, I've been unable to produce the same hanging
> behaviour observed in 2.1.3. The only differences, apart from the apache
> rewrite rules, are that the 2.1.6 install is using GenericUserFolder 1.2.2
> (vs. 1.1.0) and SiteAccess 1.0.1 (vs. 1.0.0). I doubt that these products
> are the cause of the problem.

I have been observing Zope restarts under moderate site load w/
management activity occurring.  I have three Linux servers running Zope;
two w/ RH6.2 and one w/ RH5.2(?).  The one on the older version of the
OS does not spontaneously restart; both other servers do, w/ the one
designated to receive the /manage activity restarting at least once an
hour during moderate use.

> It's also pretty unlikely that this is an OS threads problem, at least on
> our platform. We're running Zope on FreeBSD with pthreads; this is a
> different thread implementation to that used on Linux. If this is happening
> on Linux, FreeBSD and Solaris, then my hunch (dare I voice it?) is that this
> hanging occurs somewhere deep in the bowels of ZServer.

Since this problem has been reported on several OSs now I tend to go
along w/ you in looking for a ZServer/medusa cause for the problem.  It
still puzzles me why the older version of Linux/RH is for some reason
more stable.  I'm in the process of getting another machine for testing
that will have RH5.2 and be designated as the /manage server, but I'm
waiting on the HW.  I was originally figuring something changed in the
RH6.2 distribution, but with these restarts occurring on FreeBSD and
Solaris (I believe I've seen complaints from both OSs on this list at
one time or another), I'm wondering if speed may have something to do
w/ it.  Our older system is also a slower box: dual 400MHz vs dual
500MHz.

> Zope has in the past been fairly stable using four threads; it was only when
> the threads were increased to 20 that it began hanging repeatedly. We really
> *do* need to run Zope with a modest number of threads, as some database
> queries can be expected to take a couple of seconds to complete.

I also wonder if the DB connections might be coming into play here.  We
use MySQL heavily for several sections of our Zope site.  Has anyone
been seeing Zope restarts on a site that does not use any DB adapters?
If DB adapters turn out to be the culprit, the problem might be in the
Aqueduct code, not the ZServer/medusa code.  Hmmm, I haven't given that
much thought.
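
On the thread-count point quoted above, a rough illustration of why a
small worker pool hurts when each request blocks on a query for a couple
of seconds.  This is only a sketch in present-day Python, not Zope or
ZServer code; the names drain_time and simulate are made up for the
example, and sleep() stands in for the database call.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def drain_time(requests, threads, query_seconds):
    """Idealised time to serve `requests` blocking requests.

    Assumes every request blocks for exactly `query_seconds` and that
    `threads` workers each handle one request at a time.
    """
    if threads < 1:
        raise ValueError("need at least one worker thread")
    batches = math.ceil(requests / threads)
    return batches * query_seconds


def simulate(requests, threads, query_seconds):
    """Run the same workload on a real thread pool.

    Returns (number of completed requests, elapsed wall-clock seconds).
    """
    def fake_query(i):
        time.sleep(query_seconds)  # hypothetical stand-in for a DB query
        return i

    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(fake_query, range(requests)))
    return len(results), time.monotonic() - start


if __name__ == "__main__":
    # 20 simultaneous requests, each waiting 2 seconds on the database.
    for n in (4, 20):
        print(n, "threads:", drain_time(20, n, 2.0), "seconds (idealised)")
    # Scaled-down real run so it finishes quickly.
    print(simulate(20, 4, 0.05))
```

Under these assumptions, 20 requests that each wait 2 seconds take about
10 seconds to clear w/ 4 threads and about 2 seconds w/ 20, which is the
reason for wanting more threads in the first place.  It says nothing
about why ZServer hangs at the higher count.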

> I really do want to get to the root of this problem; if anyone out there
> has some suggestions or further information requirements, I'm listening!

Amen!  I've been suffering through this for several months now.  We
built several failsafe systems and use load balancing and static caching
heavily to mask these restart problems from our end-users.  But as we
try to start adding more of the interactive features our site demands,
caching and hiding these problems becomes more and more difficult.  I
hate to say it but at some point Zope may not be the solution for our
needs due solely to this stability issue :(

Any insights, help, or feedback would be greatly appreciated.

-- 
-------------------------------
tonyr@ep.newtimes.com
Director of Web Technology
New Times, Inc.
-------------------------------