[Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....

Chris McDonough chrism@zope.com
29 May 2003 09:32:40 -0400


On Thu, 2003-05-29 at 01:08, Jeffrey P Shell wrote:
> Thanks for the information.  Is it safe at all to try to catch a 
> ConflictError during the critical part of the code, log some 
> information, and then reraise the error to let the system do what it 
> needs?

Sure, but I'm not sure what that buys you in your case.  The system will
still retry the request if you reraise a conflict error.  And it would
be spotty coverage at best; it's almost impossible to know where a
ConflictError might be raised.  The only reasonable "solution" would be
to change ZPublisher's default behavior to not retry requests on
conflict errors, which is probably not what you want either.
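
For what it's worth, a log-and-reraise wrapper might look roughly like the sketch below. The helper name run_critical and the logger name are made up for illustration, and the fallback class exists only so the snippet runs where ZODB isn't installed. It is a sketch of the idea, not a recommendation.

```python
import logging

try:
    from ZODB.POSException import ConflictError
except ImportError:
    # Stand-in (assumption) so the sketch runs without ZODB installed.
    class ConflictError(Exception):
        pass

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("critical_section")


def run_critical(step, *args):
    """Call step(*args); log a ConflictError if one surfaces, then re-raise."""
    try:
        return step(*args)
    except ConflictError:
        logger.warning(
            "ConflictError in %s; the request will still be retried",
            getattr(step, "__name__", repr(step)),
        )
        # Re-raise unchanged so the publisher's retry logic still sees it.
        raise
```

Note that this only covers conflicts raised inside the wrapped call; one raised at commit time, after your method returns, never passes through it.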

> I guess you're right though - it's hard to know when it will occur.
> 
> In the production system, in this particular method, there are only two 
> known persistent object interactions.  At the end of the entire method, 
> after a notification email has been sent, I have something like:
> 
> session['pieces'] = {}
> 
> (session['pieces'] was a dictionary of {item_id:integer} bits.  It 
> never gets large for an individual user).  I think that the one recent 
> case of desync'd data happened when we got to this point. Since it's 
> at the very end of the script (no more writes are expected beyond this 
> point), I imagine that a get_transaction().commit() might be OK to 
> precede this statement, just so that even if any conflicts happen when 
> trying to write back to the session, we at least have synchronized data 
> between the two systems.  Although, prior to this, there are a few 
> reads of this session data.  Might it be safer to do something like 
> this at the top of the method?:
> 
> pieces = session['pieces'].copy()

pieces = session.get('pieces', {})

..at the top of the method might be better, particularly because you'll
need to explicitly resave the dictionary into the session like so at the
end of the method anyway:

session['pieces'] = pieces

(standard persistence rules apply to session data as well, so you need
to re-save basic, non-persistent types such as dictionaries and lists
after you mutate them if you want the changes to persist).
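
Put together, the read-early / re-save-late pattern can be sketched as below. FakeSession and process are hypothetical names invented for this example. FakeSession is just a dict that counts item assignments, standing in for the real session data object so the snippet runs outside Zope.

```python
class FakeSession(dict):
    """Hypothetical stand-in for a Zope session data object.

    It counts item assignments, which is the only kind of change the
    persistence machinery would notice for plain dict values.
    """

    def __init__(self):
        dict.__init__(self)
        self.saves = 0

    def __setitem__(self, key, value):
        self.saves += 1
        dict.__setitem__(self, key, value)


def process(session, item_id, count):
    pieces = session.get('pieces', {})  # read early, with a default
    pieces[item_id] = count             # in-place mutation: goes unnoticed
    session['pieces'] = pieces          # explicit re-save: this is noticed
    return pieces
```

With a fresh FakeSession, one call to process() records exactly one save; drop the last assignment and it records none, which is the case where the change would be lost.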

We've also found that accessing session data early in the request can
help reduce the number of conflicts that happen later in the request. 
See http://mail.zope.org/pipermail/zope-dev/2003-March/019081.html for
more information.

> I apologize if this post is making little sense (or stupid sense) - 
> dealing with threads, locks, conflicts, etc, has been the part of Zope 
> I've understood the least.  I like that for the most part I don't have 
> to think about it, but I don't know where to go for [fairly] current 
> documentation on how to deal with it for those rare times I do.

FWIW, the Zope Book 2.6 edition session chapter speaks a bit to what
conflict errors are.  The ZDG persistence chapter talks a bit about
threading and concurrency.

> The other persistent data write occurs earlier in the method, an object 
> that generates serial numbers based off of some simple data in a 
> PersistentMapping gets updated.  I think that PersistentMapping has 
> become fairly large by now.  It maps the item_id referenced above to a 
> regular dictionary containing three key/value pairs each.  I make sure 
> to follow the rules of persistence when dealing with these 
> dictionaries-with-a-PersistentMapping, but I'm guessing that an OOBTree 
> might be better instead.  I still don't understand the potential 
> pitfalls of Zope/ZODB BTrees (I keep reading about 'bucket splits' 
> causing conflicts, and I don't know if that would be better or worse 
> than any pitfalls a PersistentMapping gives).

Know that a PersistentMapping is stored as a single record, so updating
or adding any key or value loads and repersists the entire data set in
the mapping.  It is very likely that this will cause a conflict,
particularly when two threads try to do this at once.

OTOH, a BTree is made up of many other persistent subobjects, and there
is less of a chance (but still a good chance) that two concurrent
accesses to a BTree will cause a conflict error.
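
To make the difference concrete, here is a deliberately crude toy model, not ZODB code. It assumes two concurrent writers conflict whenever they modify the same stored object. The function names and the fixed bucket_size are inventions for the example, integer keys are assumed, and real BTrees also split buckets and attempt conflict resolution, which this ignores.

```python
def objects_touched(keys, bucket_size=None):
    """Return the set of stored objects a transaction would write.

    bucket_size=None models a PersistentMapping: every key lives in
    the one object.  Otherwise integer keys are grouped into fixed-size
    buckets, a rough stand-in for BTree buckets.
    """
    if bucket_size is None:
        return {"mapping"}
    return {key // bucket_size for key in keys}


def would_conflict(keys_a, keys_b, bucket_size=None):
    """True if the two writers modify at least one common object."""
    a = objects_touched(keys_a, bucket_size)
    b = objects_touched(keys_b, bucket_size)
    return bool(a & b)
```

In this model, writers touching keys 1 and 500 always collide in the single-object case, but not with bucket_size=30; writers touching keys 1 and 2 collide either way because they land in the same bucket.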

> Finally, the system in question has a few (three?  four?) public Zope 
> sites using the same session storage.  Is there any documentation, 
> notes, etc, about fine tuning the default session storage set up to 
> handle large sites (or groups of sites) with less conflicts?

The best source of docs for sessions is the 2.6 Zope Book sessions
chapter.  The maillist thread that I mentioned above gives some
information from Toby Dickenson about accessing session data early in a
transaction to reduce the possibility of read conflicts.

> Thanks again for the help.  I'll take a look at MailDropHost.  Maybe 
> I'll have to wrap another gateway around the gateway to the external 
> system to try to catch these conflict situations.  Fortunately, the 
> critical area only occurs once in the current copy of the code.  
> Hopefully that will make it easier to protect.

Good luck!

- C


> 
> Thanks again,
> Jeffrey
>   
>