[Zodb-checkins] CVS: Packages/StorageGC - CyclicGC.py:1.2

tim@digicool.com tim@digicool.com
Thu, 19 Apr 2001 15:05:28 -0400 (EDT)


Update of /cvs-repository/Packages/StorageGC
In directory korak:/tmp/cvs-serv23673/StorageGC

Modified Files:
	CyclicGC.py 
Log Message:
Added user-level comment-docs for the GCable and CycleGC interfaces.  Barry,
Jim, does the writeup jibe with your understandings of our design mtg?



--- Updated File CyclicGC.py in package Packages/StorageGC --
--- CyclicGC.py	2001/04/19 17:37:12	1.1
+++ CyclicGC.py	2001/04/19 19:05:28	1.2
@@ -3,6 +3,80 @@
 # named "storage" to reflect its expected use.  storage must implement
 # the GCable interface, and objects must be hashable.
 
+###########################################################################
+# The GCable interface.
+# A storage implementing GCable implements the following methods.  "oid"
+# is short for object ID, and oids must be hashable (usable as Python dict
+# keys); no other assumption about oids is made by CycleGC.
+#
+# GCable.gcRefcount(oid) -> int
+#     Return reference count of oid.
+#
+# GCable.gcReferences(oid) -> oids (a list of oid)
+#     Return list of oid's immediate successors.
+#     y is in gcReferences(x) if and only if x accounts for at least one
+#     of y's refcounts; if x accounts for N of y's refcounts, then y must
+#     appear exactly N times in gcReferences(x).
+#
+# GCable.trash(oids) -> None
+#     This is a callback, invoked when CycleGC has discovered cyclic
+#     garbage.  oids is a list of objects in garbage cycles, and of objects
+#     reachable only from garbage cycles.
+#     trash() must be robust in the face of an oid that no longer exists,
+#     since CycleGC can run incrementally, and oids it was originally told
+#     about may no longer exist by the time it discovers they were trash.
+
+###########################################################################
+# The CycleGC interface.
+# The CycleGC class here supplies the following methods for use by
+# refcounted storages desiring detecting of unreachabe cycles.
+#
+# Typical use:
+#
+#     collector = CycleGC(self)
+#     collector.start(list_of_suspect_objects)
+#     while not collector.done():
+#         collector.resume()
+#
+# CycleGC.__init__(storage)
+#     The constructor takes a storage argument.  The storage must implement
+#     GCable (see above).
+#
+# CycleGC.start(oids) -> None
+#     The list of oids is taken as a root set.  All objects reachable from
+#     the root set (the "transitive closure", or TC) are examined, and
+#     all objects in the TC *not* reachable-- whether directly or
+#     indirectly --from *outside* the TC are (eventually) passed to
+#     storage's gcTrash() callback.  The intent is that the storage take
+#     its best guess at which objects *may* be in garbage cycles, and
+#     pass them to .start().
+#     .start() only begins the job.  .resume() continues it.  In effect,
+#     this is a by-hand coroutine approach.
+#     It's OK to call .start() again before CycleGC has completed analyzing
+#     the TC of the objects passed to preceding .start() calls.  In that
+#     case, the subsequent OIDs are queued internally for later processing.
+#     In particular, the gcTrash() callback may want to call .start() if
+#     seeing a list of actual trash helps it guess which other objects may
+#     be in garbage cycles.
+#
+# CycleGC.resume(effort=-1) -> None
+#     .resume() continues analyzing the TC.  Optional integer argument
+#     "effort" is a measure of how much work the caller wants .resume()
+#     to do before returning.  A value <= 0 means .resume() should not
+#     return until analysis is complete (i.e., until .done() returns true).
+#     It's OK to call .resume() when .done() returns true, although there's
+#     no point to doing so.
+#     XXX TODO All effort values are treated as -1 right now.
+#     XXX TODO Define what effort *means*; e.g., perhaps an approximation
+#     XXX      to the maximum number of .gcReferences() calls the storage
+#     XXX      wants to endure before regaining control?  milliseconds
+#     XXX      elapsed?  Not yet clear what would be most useful.
+#
+# CycleGC.done() -> bool
+#     Return true when analysis is complete, false if .resume() has more
+#     work to do.
+
+###########################################################################
 # The approach takes a bit from many techniques in the literature, but is
 # closest to:
 #
@@ -18,6 +92,7 @@
 # XXX recursion, and will make the algorithm 10x more brittle.  Need a great
 # XXX test suite first.
 
+###########################################################################
 # Indices into object descriptors.
 # An object descriptor is a 4-list of the form [OBJ, RC, ADJRC, ISFREE].
 # A list is used instead of an object to slash the space overhead; a tuple