[Zope] Re: database conflicts and the _p_oid missing attribute bug

Jeremy Hylton jeremy at zope.com
Thu Jan 22 23:26:23 EST 2004


I'll be at LinuxWorld tomorrow and mostly aware from my email, but
here's a brief response that might offer some clues.

On Thu, 2004-01-22 at 21:45, Dennis Allison wrote:
> I have been having problems with Zope locking up (deadlock?) when 
> ZODB conflicts occur.  I am running Zope 2.6.2b3 with the 3.1.2b2
> version of the ZODB (that is, a vanilla install).  Python is 2.1.3.

It's possible that upgrading to a non-beta release will fix, although
most of the issues resolved in the 3.1.x maintenance releases were ZEO
issues.

> The diagnostic is always:
> 
> 2004-01-22T12:25:15 ERROR(200) ZODB A storage error occured during 
> transaction abort.  This shouldn't happen.
> Traceback (innermost last):
>   File /home/zopesys/lib/python/ZODB/Transaction.py, line 262, in commit
>   File /home/zopesys/lib/python/ZODB/Transaction.py, line 393, in 
> _commit_error
> AttributeError: Surrogate instance has no attribute '_p_oid'

There is more than one error here, which makes the log message very
confusing.  In particular, there is an exception that occurs during the
second phase of the two-phase commit, a second error that occurs trying
to recover from the first error, and a third error trying to log a
message about the second error.  It's a real mess.  It might help to
change the way logging is done so that you can see the first two error
messages.

In _commit_error(), change the self.log() call in these lines

            except:
                # nothing to do but log the error
                self.log("Failed to abort object %s" % repr(o._p_oid),
                         error=sys.exc_info())

to say this:
  self.log("Failed to abort object %s" % repr(o), error=sys.exc_info())


And change one line in _commit

                # Ugh, we got an got an error during commit, so we
                # have to clean up.  First save the original exception
                # in case the cleanup process causes another
                # exception.
                error = sys.exc_info()
                try:
                    self._commit_error(objects, ncommitted, jars,
subjars)
                except:
                    LOG('ZODB', ERROR,
                        "A storage error occured during transaction "
                        "abort.  This shouldn't happen.",
                        error=sys.exc_info())

The LOG() call should be passing error=error, not error=sys.exc_info().

I don't think those changes will introduce any new problems and will
tell us a lot about what is really going on.

> As I understand what's happening (after reading the logs carefully), I
> have one or more folks trying to use the same object.  Zope's ZODB
> serializes the access and aborts all but the oldest of the conflicting
> transactions.  The diagnostic occurs because a registered object is 
> missing the _p_oid attribute that's used to rollback the transaction.  The 
> big question is which of the several sub-transactions is the culprit--or 
> is it a bug as Dieter has suggested, soluable by providing a default 
> value for _p_oid -- 

That right in a rough sense.  ZODB uses optimistic concurrency control,
so later transactions are aborted if they conflict with
already-committed ones.  But the transactions actually run concurrently.

The original error in the second-phase of two-phase commit is part of
the problem.  Since it's in the second phase, it's not caused by a ZODB
conflict but by some more serious failure of one of the transactional
resource managers.  Unfortunately, we can't tell which one because the
buggy logging code has swallowed the error.

Jeremy





More information about the Zope mailing list