[Zope] zope timeouts

Peter Sabaini peter at sabaini.at
Tue Jul 20 13:59:10 EDT 2010


Hm, just some quick ideas -- thread 3 is busy doing a stat64() , should
usually done quickly. And thread 4 is doing some kind of string
operation (comparison), which also shouldn't take long except if you
have a huge dataset. Is this backtrace repeatable, ie. does it hang in
the same syscalls?

Are you maybe running on NFS or something like that? 

hth,
peter.


On Tue, 2010-07-20 at 17:00 +0200, Michele Marcionelli wrote:
> Ciao Jens
> 
> thank you for your answer. Now since I'm new with "gdb" I have some questions.
> I did following. I created a file called "gdb_batch" with following content:
> 
>     info threads
>     thread 1
>     bt
>     thread 2
>     bt
>     thread 3
>     bt
>     thread 4
>     bt
>     thread 5
>     bt
> 
> and run the "gdb" command like this
> 
>     gdb python $(cat Z2.pid) --batch -x gdb_batch  > gdb_bt.log 2>&1
> 
> Can you maybe help me to "decode" the output (see attachment)?
> 
> For instance: what does this mean?
> 
> (gdb) info threads
>   5 Thread 0xb712bb90 (LWP 7532)  0x005fd410 in __kernel_vsyscall ()
>   4 Thread 0xb672ab90 (LWP 7533)  lookdict_string (mp=0xb78c4714, key=0xb7c19db8, hash=1293002269) at Objects/dictobject.c:331
>   3 Thread 0xb5d29b90 (LWP 7534)  0x005fd410 in __kernel_vsyscall ()
>   2 Thread 0xb5328b90 (LWP 7535)  0x005fd410 in __kernel_vsyscall ()
> * 1 Thread 0xb7fc16c0 (LWP 7174)  0x005fd410 in __kernel_vsyscall ()
> 
> What does mean the "*"? And what is the "difference" between the threads 1,2,3,5 (with the __kernel_vsyscall) and the 4?
> 
> What should I look for?
> 
> Thank you,
> Michele
> 
> 
> 
> On Jul 20, 2010, at 14:54 , Jens Vagelpohl wrote:
> 
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > Hi Michele,
> > 
> > I would consider the coincience with strace "reviving" the process a red
> > herring. When strace runs it does not influence the process you are
> > watching with it, except for telling you what OS-level calls are made.
> > 
> > The most common reason for such timeouts are problems with software that
> > tries to use other network resources during the running request to talk
> > to services like RDBMS systems, LDAP, etc. If these background
> > connections hang for some reason you'll end up with a hanging Zope
> > thread. Enough of them and Zope itself appears to hang.
> > 
> > When your process hangs you should be able to attach to it using "gdb"
> > and printing a stack trace for each thread. That may give you a big clue
> > where the problem is.
> > 
> > jens
> > 
> > 
> > On 7/20/10 13:53 , Michele Marcionelli wrote:
> >> Hello
> >> 
> >> I'm running an old version of Zope which is required by our corporate design using also an older version of the CMS Silva:
> >> 
> >>    Linux Distribution: Red Hat Enterprise Linux Server release 5.5
> >>    Linux Kernel: 2.6.18-194.8.1.el5PAE
> >>    Zope: 2.8.9.1
> >>    Python: 2.3.6 (with several modules, like: ldap, pil, mysql, pyxml, ...)
> >>    Silva: 1.5.13
> >> 
> >> My problem ist that my zope server hat timeouts on displaying pages after a while that it has been started (currently this "while" is approx. 15 minutes). The only thing that helps is restarting zope.
> >> 
> >> By accident I discovered about two month ago, that running an "strace -f" on the zope process, "reactivated" it... so I could live with this hack until yesterday.
> >> 
> >> Now I'm running "strace" permanently (with all the disadvantages) and zope works "fine" for about 1 to 1.5 hours.
> >> 
> >> Here is a short recurring output of "strace":
> >> 
> >>> [pid 13408] <... futex resumed> )       = 0
> >>> [pid 13406] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13405] futex(0xbcc12c0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> >>> [pid 13408] futex(0xbcc12c0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> >>> [pid 13407] <... futex resumed> )       = 0
> >>> [pid 13405] <... futex resumed> )       = 1
> >>> [pid 13408] <... futex resumed> )       = 1
> >>> [pid 13405] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13408] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13406] <... futex resumed> )       = 0
> >>> [pid 13406] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13407] futex(0xbcc12c0, FUTEX_WAKE_PRIVATE, 1) = 1
> >>> [pid 13405] <... futex resumed> )       = 0
> >>> [pid 13407] futex(0xbcc12c0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> >>> [pid 13405] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13408] <... futex resumed> )       = 0
> >>> [pid 13407] <... futex resumed> )       = 1
> >>> [pid 13407] futex(0xbcc12c0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
> >>> [pid 13405] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
> >> 
> >> 
> >> I need definitively help... Any idea?
> >> 
> >> Thx,
> >> Michele
> 
> 
> _______________________________________________
> Zope maillist  -  Zope at zope.org
> https://mail.zope.org/mailman/listinfo/zope
> **   No cross posts or HTML encoding!  **
> (Related lists - 
>  https://mail.zope.org/mailman/listinfo/zope-announce
>  https://mail.zope.org/mailman/listinfo/zope-dev )

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
Url : http://mail.zope.org/pipermail/zope/attachments/20100720/a983ad44/attachment.bin 


More information about the Zope mailing list