[Zodb-checkins] CVS: ZODB3/bsddb3Storage/bsddb3Storage - Full.py:1.47

Barry Warsaw barry@wooz.org
Fri, 8 Nov 2002 14:35:52 -0500


Update of /cvs-repository/ZODB3/bsddb3Storage/bsddb3Storage
In directory cvs.zope.org:/tmp/cvs-serv7162

Modified Files:
	Full.py 
Log Message:
A new algorithm for packing which seems much more straightforward.
Here's how it works:

- On every store(), we write an entry to an objrev table containing the
  tuple of information (newserial, oid, oldserial).  We don't write
  this entry if the store is the first revision of an object on a new
  version.

  We do basically the same thing on restore() and transactionalUndo().

- On an abortVersion(), we write two entries to the objrev table: one
  with (newserial, oid, oldserial), which points to the old serial in
  the version, and one with (newserial, oid, nvserial), which points
  to the non-version revision of the version revision.

- On commitVersion(), we do the same as abortVersion() except that we
  don't write the non-version data if we're committing to a different
  version.

- Now, when we pack, all we need to do is cruise from the beginning of
  the objrev table until we find an entry with a newserial > packtime.
  If the oldserial is ZERO, it's an object creation event which we
  don't need to worry about because there's no previous revision.  But
  otherwise, we can delete the oid+oldserial revision because we know
  it's not current.  We do this, updating pickle refcounts and then
  collecting any objects that are left unreferenced.
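
The pack step above can be sketched roughly as follows.  This is not
the code in Full.py: the objrev table is modeled as a plain Python
list sorted by newserial, the names pack_objrevs and serial are
hypothetical, and pickle refcounting and the collection of
unreferenced objects are left out.  The sketch also assumes that
processed entries are dropped from the table, which is one way to
make "start at the beginning" work on the next pack.

```python
ZERO = b"\0" * 8  # serial meaning "no previous revision", as in the log message


def serial(n):
    """Hypothetical helper: build an 8-byte big-endian serial from an int."""
    return n.to_bytes(8, "big")


def pack_objrevs(objrevs, packtime):
    """Return (to_delete, remaining) for a pack up to `packtime`.

    `objrevs` is a list of (newserial, oid, oldserial) tuples sorted by
    newserial, standing in for the objrev table (an assumption).
    """
    to_delete = []
    consumed = 0
    for newserial, oid, oldserial in objrevs:
        if newserial > packtime:
            break  # everything from here on is newer than the pack time
        consumed += 1
        if oldserial == ZERO:
            continue  # object creation: there is no previous revision
        to_delete.append((oid, oldserial))  # superseded, so not current
    # Assumption: entries already processed are removed from the table.
    return to_delete, objrevs[consumed:]


oid_a = b"A" * 8
oid_b = b"B" * 8
table = [
    (serial(1), oid_a, ZERO),       # A created
    (serial(2), oid_a, serial(1)),  # A updated
    (serial(3), oid_b, ZERO),       # B created
    (serial(5), oid_a, serial(2)),  # A updated again, after the pack time
]
doomed, remaining = pack_objrevs(table, packtime=serial(4))
```

Here only the first revision of A is selected for deletion; the entry
written at serial 5 stays in the table for a later pack.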

The cute thing is that autopacking will use the same algorithm.  The
main difference between autopack and classic pack is that the latter
does a mark and sweep garbage collection phase after the normal objrev
collection phase.  Also, this algorithm means autopack needs only
three pieces of information:

- How often the thread should run (e.g. once per hour)

- How far in the past it should pack (e.g. pack to 4 hours ago).  We
  don't need a start time for the autopack window, because we'll
  always just start at the beginning of the objrev table.

- How often autopack should also do a classic pack (e.g. do a classic
  pack once per day).

Autopack isn't implemented in this checkin, but I believe it will be
nearly trivial to add.  That comes next.
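
Since autopack is not part of this checkin, the following is only a
sketch of how the three pieces of information listed above might fit
together.  AutopackConfig, plan_run, and all field names are
hypothetical, and times are plain seconds rather than storage
serials.

```python
from dataclasses import dataclass


@dataclass
class AutopackConfig:
    # All three fields are hypothetical names for the settings listed above.
    frequency: int      # seconds between autopack runs (e.g. one hour)
    packtime: int       # how far in the past to pack, in seconds
    classic_every: int  # seconds between classic (mark and sweep) packs


def plan_run(config, now, last_classic):
    """Return (pack_to, do_classic) for an autopack run at time `now`.

    No window start is needed: the objrev scan always begins at the
    start of the table, so only the cutoff time is computed here.
    """
    pack_to = now - config.packtime
    do_classic = (now - last_classic) >= config.classic_every
    return pack_to, do_classic


config = AutopackConfig(frequency=3600, packtime=4 * 3600,
                        classic_every=24 * 3600)
first = plan_run(config, now=100000, last_classic=0)
second = plan_run(config, now=100000, last_classic=50000)
```

A background thread would call something like plan_run once per
config.frequency seconds and then run the objrev pack up to pack_to,
adding the mark and sweep phase when do_classic is true.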


=== ZODB3/bsddb3Storage/bsddb3Storage/Full.py 1.46 => 1.47 === (881/981 lines abridged)
--- ZODB3/bsddb3Storage/bsddb3Storage/Full.py:1.46	Tue Nov  5 18:07:31 2002
+++ ZODB3/bsddb3Storage/bsddb3Storage/Full.py	Fri Nov  8 14:35:51 2002
@@ -24,7 +24,7 @@
 
 # This uses the Dunn/Kuchling PyBSDDB v3 extension module available from
 # http://pybsddb.sourceforge.net.  It is compatible with release 3.4 of
-# PyBSDDB3.
+# PyBSDDB3.  The only recommended version of BerkeleyDB is 4.0.14.
 from bsddb3 import db
 
 from ZODB import POSException
@@ -41,21 +41,15 @@
 # functionality.
 from BerkeleyBase import BerkeleyBase
 
-# Flags for transaction status in the transaction metadata table.  You can
-# only undo back to the last pack, and any transactions before the pack time
-# get marked with the PROTECTED_TRANSACTION flag.  An attempt to undo past a
-# PROTECTED_TRANSACTION will raise an POSException.UndoError.  By default,
-# transactions are marked with the UNDOABLE_TRANSACTION status flag.
-UNDOABLE_TRANSACTION = 'Y'
-PROTECTED_TRANSACTION = 'N'
-
 ABORT = 'A'
 COMMIT = 'C'
 PRESENT = 'X'
 ZERO = '\0'*8
+
+# Special flag for uncreated objects (i.e. Does Not Exist)
 DNE = '\377'*8
 # DEBUGGING
-#DNE = 'nonexist'                                  # does not exist
+#DNE = 'nonexist'
 
 try:
     # Python 2.2
@@ -91,7 +85,8 @@
         #
         # - Object ids (oid) are 8-bytes
         # - Objects have revisions, with each revision being identified by a
-        #   unique serial number.
+        #   unique serial number.  We sometimes refer to 16-byte strings of
+        #   oid+serial as a revision id.
         # - Transaction ids (tid) are 8-bytes
         # - Version ids (vid) are 8-bytes
         # - Data pickles are of arbitrary length
@@ -138,16 +133,9 @@
         #     prevrevid is the tid pointing to the previous state of the
         #     object.  This is used for undo.
         #

[-=- -=- -=- 881 lines omitted -=- -=- -=-]

-            return tid, status, user, desc, ext
+            packtime = self._last_packtime()
+            if tid <= packtime:
+                packedp = True
+            else:
+                packedp = False
+            userlen, desclen = unpack('>II', data[:8])
+            user = data[8:8+userlen]
+            desc = data[8+userlen:8+userlen+desclen]
+            ext = data[8+userlen+desclen:]
+            return tid, packedp, user, desc, ext
         finally:
             if c:
                 c.close()
@@ -1678,14 +1741,14 @@
         if self._closed:
             raise IOError, 'iterator is closed'
         # Let IndexErrors percolate up.
-        tid, status, user, desc, ext = self._storage._nexttxn(
+        tid, packedp, user, desc, ext = self._storage._nexttxn(
             self._tid, self._first)
         self._first = False
         # Did we reach the specified end?
         if self._stop is not None and tid > self._stop:
             raise IndexError
         self._tid = tid
-        return _RecordsIterator(self._storage, tid, status, user, desc, ext)
+        return _RecordsIterator(self._storage, tid, packedp, user, desc, ext)
 
     def close(self):
         self._closed = True
@@ -1715,14 +1778,14 @@
     description = None
     _extension = None
 
-    def __init__(self, storage, tid, status, user, desc, ext):
+    def __init__(self, storage, tid, packedp, user, desc, ext):
         self._storage = storage
         self.tid = tid
         # Impedence matching
-        if status == UNDOABLE_TRANSACTION:
-            self.status = ' '
-        else:
+        if packedp:
             self.status = 'p'
+        else:
+            self.status = ' '
         self.user = user
         self.description = desc
         self._extension = ext
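
The metadata parsing shown in the _nexttxn() hunk above can be read
as the following record layout: two big-endian 32-bit lengths,
followed by the user, description, and extension bytes.  This is a
sketch based only on the lines visible in the abridged diff; the
function names are hypothetical, and pack_txn_metadata is an assumed
inverse that does not appear in the checkin.

```python
from struct import pack, unpack


def pack_txn_metadata(user, desc, ext):
    """Assumed inverse of the parsing in the diff (not from the checkin)."""
    return pack(">II", len(user), len(desc)) + user + desc + ext


def unpack_txn_metadata(data):
    """Split a metadata record the way the new _nexttxn() code does."""
    userlen, desclen = unpack(">II", data[:8])
    user = data[8:8 + userlen]
    desc = data[8 + userlen:8 + userlen + desclen]
    ext = data[8 + userlen + desclen:]
    return user, desc, ext


def is_packed(tid, packtime):
    """Mirror of the packedp test: true for transactions at or before the pack."""
    return tid <= packtime


record = pack_txn_metadata(b"barry", b"new pack algorithm", b"")
fields = unpack_txn_metadata(record)
```

With this layout the packed status no longer has to be stored with
each transaction; it is derived by comparing the tid against the last
pack time.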