[ZODB-Dev] Re: BTrees strangeness (was [Zope-dev] Zope 2.X BIG Session problems - blocker - our site dies - need help of experience Zope developer, please)

Michael Dunstan michael at looma.co.nz
Mon Apr 26 18:34:04 EDT 2004


On 11/03/2004, at 7:25 PM, Chris McDonough wrote:

> On Wed, 2004-03-10 at 23:14, michael wrote:
>
> Thanks for your excellent isolation here.
>
>> I don't get a KeyError thrown until *after* at least one rendering of
>> standard_error_message that includes an access to a session variable.
>> And I'm only getting KeyError's for sessions that have had a chance to
>> expire. And that KeyError will be thrown whenever any access is made
>> for the session. (Previously I thought I was *only* seeing KeyError's
>> being thrown from standard_error_message.)
>
> I think after provide transaction isolation for the duration of an
> error, we will not see this symptom occur again.

Okay - I've gone ahead and implemented a flavour of transaction  
isolation for errors. (See below for patch.) That seems to have done  
the trick just nicely. I have not seen any more KeyError's thrown from  
Transience.py. Yay!

http://collector.zope.org/Zope/848 has a couple of examples of such  
KeyErrors thrown from Transience. These KeyError were a very painful  
user experience as either the server needed to be restarted or the user  
needed to delete their cookie for that server from their browser cookie  
jar.

However there is another KeyError reported in 848 too. This one is  
thrown from TempoaryStoarge. After some further tweaking of my test rig  
I was able to produce such KeyErrors quite frequently. Using Zope 2.7.0  
and recent the rewrite of Transience.py:

http://cvs.zope.org/*checkout*/Products/Transience/Transience.py? 
rev=1.32.12.2.2.2&only_with_tag=chrism-sessiongeddon

errors look like:

2004-04-27T08:24:16 BLATHER(-100) OFS Exception while rendering an  
error message
Traceback (most recent call last):
   File "/Users/michael/builds/zope/Zope/lib/python/OFS/SimpleItem.py",  
line 204, in raise_standardErrorMessage
     v = s(**kwargs)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Shared/DC/Scripts/ 
Bindings.py", line 306, in __call__
     return self._bindAndExec(args, kw, None)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Shared/DC/Scripts/ 
Bindings.py", line 343, in _bindAndExec
     return self._exec(bound_data, args, kw)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/PythonScripts/ 
PythonScript.py", line 318, in _exec
     result = f(*args, **kw)
   File "Script (Python)", line 12, in standard_error_message
   File  
"/Users/michael/builds/zope/Zope/lib/python/ZPublisher/HTTPRequest.py",  
line 1218, in __getattr__
     v = self.get(key, default, returnTaints=returnTaints)
   File  
"/Users/michael/builds/zope/Zope/lib/python/ZPublisher/HTTPRequest.py",  
line 1178, in get
     if callable(v): v = v()
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Sessions/ 
SessionDataManager.py", line 93, in getSessionData
     return self._getSessionDataObject(key)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Sessions/ 
SessionDataManager.py", line 180, in _getSessionDataObject
     ob = container.new_or_existing(key)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Transience/ 
Transience.py", line 504, in new_or_existing
     item = self.get(key, _marker)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Transience/ 
Transience.py", line 332, in get
     item = self._move_item(k, current_ts, _marker)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Transience/ 
Transience.py", line 208, in _move_item
     self._gc(current_ts)
   File  
"/Users/michael/builds/zope/Zope/lib/python/Products/Transience/ 
Transience.py", line 429, in _gc
     for key in list(self._data.keys(None, max_ts)):
   File "/Users/michael/builds/zope/Zope/lib/python/ZODB/Connection.py",  
line 561, in setstate
     p, serial = self._storage.load(oid, self._version)
   File  
"/Users/michael/builds/zope/Zope/lib/python/tempstorage/ 
TemporaryStorage.py", line 94, in load
     s=self._index[oid]
KeyError: '\x00\x00\x00\x00\x00\x00\x08\x82'

So far it looks like commenting out the construction of the to_notify  
list in _gc makes this go away. So _gc looks like:

     def _gc(self, now=None):
         if self._timeout_slices == 0:
             return # dont do gc if there is no timeout

         if now is None:
             now = getCurrentTimeslice() # for unit tests
         max_ts = now  - (PERIOD * (self._timeout_slices + 1))
         keys = self._data.keys(None, max_ts)

         #to_notify = []

         for key in list(self._data.keys(None, max_ts)):
             assert(key <= max_ts)
             STRICT and _assert(self._data.has_key(key))
             #for v in self._data[key].values():
             #    to_notify.append(v)
             del self._data[key]

         #for v in to_notify:
         #    self.notifyDel(v)

Note that the test rig does not set onAdd or onDelete.

Along the way there where a few things that I was not too sure about so  
I tried fiddling with to see what would happen:

(1) Use of hasattr in Transience.py. hasattr should be avoided? See  
http://mail.zope.org/pipermail/zodb-dev/2003-October/005968.html. I did  
change usage of hasattr to match the recipe given by Tim. And added  
some logging to explicitly look for ConflictError's at those points.  
Found a few of these in the logs. But that did not solve the KeyErrors.

(2) _p_resolveConflict of Increaser of Transience does not always  
return the max of the three given states. I replaced the implementation  
with return max(old, state1, state2). But that did not solve the  
KeyErrors.

(3) Use of try/except in TempoaryStorage. For example:

         try: del referenceCount[oid]
         except: pass

     I removed all such try/except wrappings of del. But that did not  
solve the KeyErrors. Nor did it raise any new errors in the test rig.

(4) Checking consistency of btrees. I added the following code to _gc  
just before the assignment of keys:

         from BTrees.check import check, display
         try:
             self._data._check()
             check(self._data)
         except AssertionError:
             LOG('Transience',ERROR,'yikes')
             display(self._data)
             raise

     I never saw any noise from this. Looks like self._data is healthy.

Just to be explicit... I removed all the above hacks leaving just the  
one that comments out the construction of the to_notify list. Reran the  
test rig. And have yet to see any KeyErrors.

Now that patch I used for transaction isolation for  
standard_error_message:

Index: lib/python/ZPublisher/Publish.py
===================================================================
RCS file: /cvs-repository/Zope/lib/python/ZPublisher/Publish.py,v
retrieving revision 1.164.2.2
diff -u -r1.164.2.2 Publish.py
--- lib/python/ZPublisher/Publish.py    17 Nov 2003 22:34:19 -0000       
1.164.2.2
+++ lib/python/ZPublisher/Publish.py    26 Apr 2004 22:17:56 -0000
@@ -121,19 +121,26 @@
          if err_hook is not None:
              if parents: parents=parents[0]
              try:
-                return err_hook(parents, request,
-                                sys.exc_info()[0],
-                                sys.exc_info()[1],
-                                sys.exc_info()[2],
-                                )
-            except Retry:
-                # We need to try again....
-                if not request.supports_retry():
+                try:
+                    if transactions_manager:  
transactions_manager.begin()
                      return err_hook(parents, request,
                                      sys.exc_info()[0],
                                      sys.exc_info()[1],
                                      sys.exc_info()[2],
                                      )
+                finally:
+                    if transactions_manager:  
transactions_manager.abort()
+            except Retry:
+                if not request.supports_retry():
+                    try:
+                        if transactions_manager:  
transactions_manager.begin()
+                        return err_hook(parents, request,
+                                        sys.exc_info()[0],
+                                        sys.exc_info()[1],
+                                        sys.exc_info()[2],
+                                        )
+                    finally:
+                        if transactions_manager:  
transactions_manager.abort()
                  newrequest=request.retry()
                  request.close()  # Free resources held by the request.
                  try:
Index: lib/python/Zope/App/startup.py
===================================================================
RCS file: /cvs-repository/Zope/lib/python/Zope/App/startup.py,v
retrieving revision 1.8.2.3
diff -u -r1.8.2.3 startup.py
--- lib/python/Zope/App/startup.py      20 Dec 2003 04:48:36 -0000       
1.8.2.3
+++ lib/python/Zope/App/startup.py      26 Apr 2004 22:17:57 -0000
@@ -176,8 +176,6 @@
              published=app.__bobo_traverse__(REQUEST).__of__(
                  RequestContainer(REQUEST))

-        get_transaction().begin() # Just to be sure.
-
          published=getattr(published, 'im_self', published)
          while 1:
              f=getattr(published, 'raise_standardErrorMessage', None)




More information about the Zope-Dev mailing list