[Zope] Re: database conflicts and the _p_oid missing attribute bug

Casey Duncan casey at zope.com
Fri Jan 23 11:36:24 EST 2004


On Fri, 23 Jan 2004 09:56:23 -0500
Shane Landrum <srl at boston.com> wrote:

> On Thu, Jan 22, 2004 at 11:26:23PM -0500, Jeremy Hylton wrote:
> > 
> > That's right in a rough sense.  ZODB uses optimistic concurrency
> > control, so later transactions are aborted if they conflict with
> > already-committed ones.  But the transactions actually run
> > concurrently.
> 
> Could you explain this a bit more? Because we run relatively
> high-write operations on our Zope/ZEO setup, we hit ConflictErrors
> quite a bit. I'd like to understand the underlying machinery here,
> ideally so I can come up with a fix.

Here is a simple example:

Suppose you have an object 'somefile' that has an attribute 'data'. Now
suppose that there are two concurrent processes modifying 'somefile':

Process 1:
  somefile.data = 'Mary had a little lamb'

Process 2:
  somefile.data = 'The Quick Brown Fox...'

The concurrency control is "optimistic" because it allows both processes
to change the same file object (there are no locks). 

Now suppose that process 1 commits its transaction. This saves 'Mary had
a little lamb' as the new file data in the ZODB. Eventually, process 2
also tries to commit its transaction. The ZODB sees that the file was
modified after process 2's transaction began, and this is a conflict
(specifically a write conflict).
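
To make the sequence above concrete, here is a toy model in plain
Python. It is only a sketch of the idea, not ZODB's actual code: the
ToyStore class, its serial counter, and the dict used as a transaction
are all made up for illustration.

```python
class ConflictError(Exception):
    """Toy stand-in for ZODB's ConflictError."""


class ToyStore:
    """Holds one value plus a serial number that is bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.serial = 0

    def begin(self):
        # A transaction remembers the serial it started from. No lock is taken.
        return {"start_serial": self.serial, "new_value": None}

    def commit(self, txn):
        # Optimistic check: fail if anyone committed since this txn began.
        if txn["start_serial"] != self.serial:
            raise ConflictError("object changed since transaction began")
        self.value = txn["new_value"]
        self.serial += 1


store = ToyStore("original data")

# Both "processes" start before either one commits.
t1 = store.begin()
t2 = store.begin()
t1["new_value"] = "Mary had a little lamb"
t2["new_value"] = "The Quick Brown Fox..."

store.commit(t1)  # process 1 commits first and succeeds

try:
    store.commit(t2)  # process 2 commits later and hits the conflict
    outcome = "committed"
except ConflictError:
    outcome = "conflict"
```

Both transactions are free to make their change; the loser only finds
out at commit time.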

When the ZODB detects a write conflict, it checks whether the object
supports conflict resolution (via a special method, _p_resolveConflict).
If so, the method is called to let the object try to resolve the
conflict. Most objects do not have conflict resolution, however. In that
case ZODB raises a ConflictError (thereby aborting the transaction).
This error is caught by Zope. In response, Zope retries the web request,
which replays the transaction. If it gets another ConflictError it tries
again (up to 3 times). After 3 retries it gives up and returns an error
to the user.
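
The retry behaviour can be sketched roughly like this. This is not
Zope's publisher code; run_with_retries and make_flaky_request are
hypothetical names, ConflictError is a local stand-in, and I am assuming
"up to 3 times" means three retries after the first attempt.

```python
class ConflictError(Exception):
    """Toy stand-in for ZODB's ConflictError."""


MAX_RETRIES = 3  # assumption: three retries after the initial attempt


def run_with_retries(request, max_retries=MAX_RETRIES):
    """Call request(); replay it on ConflictError, up to max_retries times.

    Returns (result, number_of_attempts). Re-raises the ConflictError
    once the retries are used up.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return request(), attempts
        except ConflictError:
            if attempts > max_retries:
                raise  # give up; the user sees an error


def make_flaky_request(failures):
    """Build a fake request that conflicts `failures` times, then succeeds."""
    state = {"left": failures}

    def request():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConflictError("simulated conflict")
        return "OK"

    return request


# Conflicts twice, then succeeds on the third attempt.
result, attempts = run_with_retries(make_flaky_request(2))
```

Note that every replay runs the whole request again, which is where the
extra load on a busy system comes from.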

In most cases, the first retry is sufficient. More retries are necessary
as the system becomes busier (more concurrency). Every retry is extra
work, and that extra load can in turn cause more conflicts, which
exacerbates the problem and degrades performance.

Conflicts tend to center around "hotspots", objects that are changed
often by many requests. The Catalog is a classic example since it tends
to get changed every time new content is added to the system. Content
repository folders are another example.

Both of these "hotspots" have their own conflict resolution code (they
typically use BTree objects which can resolve many write conflicts
internally). Write conflicts tend to be pretty manageable if you
recognize hotspots and use the proper data types (like BTrees) or add
your own conflict resolution (not for the faint of heart, however).
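
For the curious, here is roughly what hand-written conflict resolution
looks like for the easiest case, a counter. ZODB calls
_p_resolveConflict with three states: the state both transactions
started from, the state already committed, and the state the current
transaction wants to write. The HitCounter class is hypothetical, and to
keep the sketch self-contained it is a plain class; in real code it
would subclass persistent.Persistent, and I am assuming the default
state, which is a dict of the object's attributes.

```python
class HitCounter:
    """Toy counter. In real code this would subclass persistent.Persistent."""

    def __init__(self):
        self.count = 0

    def hit(self, n=1):
        self.count += n

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # The arguments are state dicts, not live objects.
        # Work out what each transaction added, then apply both deltas.
        saved_delta = saved_state["count"] - old_state["count"]
        new_delta = new_state["count"] - old_state["count"]
        resolved = dict(old_state)
        resolved["count"] = old_state["count"] + saved_delta + new_delta
        return resolved


# Simulate the call ZODB would make on a write conflict:
counter = HitCounter()
old = {"count": 10}    # state both transactions started from
saved = {"count": 12}  # already committed by the other transaction (+2)
new = {"count": 11}    # what this transaction is trying to commit (+1)
resolved = counter._p_resolveConflict(old, saved, new)
```

A counter is easy because the two changes commute. For the 'somefile'
example there is no sensible way to merge two different strings, so
raising ConflictError is the right outcome there.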

The more insidious kind of conflict is the read conflict. It is caused
(in simple terms) when one transaction changes an object that another
concurrent transaction is about to use. ZODB raises the error so that
the second transaction never sees an inconsistent mix of old and new
state. You can't resolve these kinds of conflicts, and they can be very
tricky to prevent, especially in a busy system. Luckily, ZODB 3.3 has a
new feature: "Multi-version concurrency control" (MVCC), which resolves
read conflicts (or at worst changes them to write conflicts). Once a
Zope release ships with ZODB 3.3, we can look forward to many fewer
conflict errors.
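
As a rough picture of the multi-version idea (not how ZODB 3.3 actually
implements it), a store can keep older revisions around so that a
transaction keeps reading the data as it was when the transaction
started. ToyMVCCStore and ToySnapshot are made-up names for this sketch.

```python
class ToyMVCCStore:
    """Keeps every committed revision of every object."""

    def __init__(self, initial):
        self.serial = 0
        # object name -> list of (serial, value), oldest first
        self.revisions = {name: [(0, value)] for name, value in initial.items()}

    def commit(self, changes):
        self.serial += 1
        for name, value in changes.items():
            self.revisions.setdefault(name, []).append((self.serial, value))

    def snapshot(self):
        # A reader is pinned to the serial that was current when it began.
        return ToySnapshot(self, self.serial)


class ToySnapshot:
    def __init__(self, store, serial):
        self.store = store
        self.serial = serial

    def read(self, name):
        # Return the newest revision that is not newer than our start point.
        for serial, value in reversed(self.store.revisions[name]):
            if serial <= self.serial:
                return value
        raise KeyError(name)


store = ToyMVCCStore({"title": "old title", "body": "old body"})

reader = store.snapshot()
first_read = reader.read("title")

# Another transaction commits while the reader is still running.
store.commit({"title": "new title", "body": "new body"})

# The reader still gets the old body, consistent with the title it saw.
second_read = reader.read("body")

# A transaction that starts now sees the new data.
latest = store.snapshot().read("body")
```

Without the old revisions, the second read would have to either return
data that does not match the first read or fail with a read conflict.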

-Casey


