[Zodb-checkins] SVN: ZODB/trunk/ Collector #1330: repozo.py -R can create corrupt .fs.

Tim Peters tim.one at comcast.net
Fri May 21 14:03:09 EDT 2004


Log message for revision 24857:

Collector #1330:  repozo.py -R can create corrupt .fs.
When looking for the backup files needed to recreate a Data.fs file,
repozo could (unintentionally) include its meta .dat files in the list,
or random files of any kind created by the user in the backup directory.
These would then get copied verbatim into the reconstructed file, filling
parts with junk.  Repaired by filtering the file list to include only
files with the data extensions repozo.py creates (.fs, .fsz, .deltafs,
and .deltafsz).  Thanks to James Henderson for the diagnosis.



-=-
Modified: ZODB/trunk/NEWS.txt
===================================================================
--- ZODB/trunk/NEWS.txt	2004-05-21 17:52:09 UTC (rev 24856)
+++ ZODB/trunk/NEWS.txt	2004-05-21 18:03:09 UTC (rev 24857)
@@ -5,12 +5,21 @@
 ZODB
 ----
 
+Collector #1330:  repozo.py -R can create corrupt .fs.
+When looking for the backup files needed to recreate a Data.fs file,
+repozo could (unintentionally) include its meta .dat files in the list,
+or random files of any kind created by the user in the backup directory.
+These would then get copied verbatim into the reconstructed file, filling
+parts with junk.  Repaired by filtering the file list to include only
+files with the data extensions repozo.py creates (.fs, .fsz, .deltafs,
+and .deltafsz).  Thanks to James Henderson for the diagnosis.
+
 fsrecover.py couldn't work, because it referenced attributes that no
 longer existed after the MVCC changes.  Repaired that, and added new
 tests to ensure it continues working.
 
 Collector #1309:  The reference counts reported by DB.cacheExtremeDetails()
-for ghosts were one too small.
+for ghosts were one too small.  Thanks to Dieter Maurer for the diagnosis.
 
 Collector #1208:  Infinite loop in cPickleCache.
 If a persistent object had a __del__ method (probably not a good idea
@@ -21,7 +30,8 @@
 satsify the attribute reference, the cache would again decide that
 the object should be ghostified, and so on.  The infinite loop no longer
 occurs, but note that objects of this kind still aren't sensible (they're
-effectively immortal).
+effectively immortal).  Thanks to Toby Dickenson for suggesting a nice
+cure.
 
 
 What's new in ZODB3 3.3 alpha 3

Modified: ZODB/trunk/src/scripts/repozo.py
===================================================================
--- ZODB/trunk/src/scripts/repozo.py	2004-05-21 17:52:09 UTC (rev 24856)
+++ ZODB/trunk/src/scripts/repozo.py	2004-05-21 18:03:09 UTC (rev 24857)
@@ -27,7 +27,7 @@
     -r dir
     --repository=dir
         Repository directory containing the backup files.  This argument
-        is required.
+        is required.  The directory must already exist.
 
 Options for -B/--backup:
     -f file
@@ -74,12 +74,6 @@
 
 program = sys.argv[0]
 
-try:
-    True, False
-except NameError:
-    True = 1
-    False = 0
-
 BACKUP = 1
 RECOVER = 2
 
@@ -88,7 +82,6 @@
 VERBOSE = False
 
 
-
 def usage(code, msg=''):
     outfp = sys.stderr
     if code == 0:
@@ -107,7 +100,6 @@
         print >> sys.stderr, msg % args
 
 
-
 def parseargs():
     global VERBOSE
     try:
@@ -184,7 +176,6 @@
     return options
 
 
-
 # Read bytes (no more than n, or to EOF if n is None) in chunks from the
 # current position in file fp.  Pass each chunk as an argument to func().
 # Return the total number of bytes read == the total number of bytes
@@ -272,24 +263,27 @@
     return '%04d-%02d-%02d-%02d-%02d-%02d%s' % t
 
 
+# Return a list of files needed to reproduce state at time options.date.
+# This is a list, in chronological order, of the .fs[z] and .deltafs[z]
+# files, from the time of the most recent full backup preceding
+# options.date, up to options.date.
 def find_files(options):
     def rootcmp(x, y):
         # This already compares in reverse order
         return cmp(os.path.splitext(y)[0], os.path.splitext(x)[0])
-    # Return a list of files needed to reproduce state at time `when'
     when = options.date
     if not when:
         when = gen_filename(options, '')
-    log('looking for files b/w last full backup and %s...', when)
+    log('looking for files between last full backup and %s...', when)
     all = os.listdir(options.repository)
     all.sort(rootcmp)
-    # Find the last full backup before date, then include all the incrementals
-    # between when and that full backup.
+    # Find the last full backup before date, then include all the
+    # incrementals between that full backup and "when".
     needed = []
-    for file in all:
-        root, ext = os.path.splitext(file)
-        if root <= when:
-            needed.append(file)
+    for fname in all:
+        root, ext = os.path.splitext(fname)
+        if root <= when and ext in ('.fs', '.fsz', '.deltafs', '.deltafsz'):
+            needed.append(fname)
         if ext in ('.fs', '.fsz'):
             break
     # Make the file names relative to the repository directory
@@ -335,7 +329,6 @@
     return fn, startpos, endpos, sum
 
 
-
 def do_full_backup(options):
     # Find the file position of the last completed transaction.
     fs = FileStorage(options.file, read_only=True)
@@ -471,7 +464,6 @@
     do_full_backup(options)
 
 
-
 def do_recover(options):
     # Find the first full backup at or before the specified date
     repofiles = find_files(options)
@@ -493,7 +485,6 @@
     log('Recovered %s bytes, md5: %s', reposz, reposum)
 
 
-
 def main():
     options = parseargs()
     if options.mode == BACKUP:




More information about the Zodb-checkins mailing list