[Zope-CVS] CVS: Products/ZCTextIndex - Lexicon.py:1.7

Guido van Rossum guido@python.org
Wed, 15 May 2002 11:24:21 -0400


Update of /cvs-repository/Products/ZCTextIndex
In directory cvs.zope.org:/tmp/cvs-serv1951

Modified Files:
	Lexicon.py 
Log Message:
Keep some statistics about indexing: total number of bytes and words
indexed (where the bytes are counted before entry into the pipeline,
and the words are counted after the pipeline is done).  To get the
numbers, use the _nbytes and _nwords instance variables directly.
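
For illustration only, here is a minimal sketch of how the two counters
behave. It is not the real ZCTextIndex code: SketchLexicon,
WhitespaceSplitter, LowerCaser and StopWordRemover are hypothetical
stand-ins that only mimic the process() protocol used by pipeline
elements in the diff below. Note that on Python 3 len() of a str counts
characters rather than bytes, so _nbytes is a byte count only for
byte strings (or pure-ASCII text).

```python
# Hypothetical stand-ins, assumed for illustration; not the ZCTextIndex classes.
class WhitespaceSplitter:
    def process(self, texts):
        # Split every input string into whitespace-separated words.
        return [word for text in texts for word in text.split()]


class LowerCaser:
    def process(self, words):
        return [word.lower() for word in words]


class StopWordRemover:
    def __init__(self, stop_words):
        self._stop = set(stop_words)

    def process(self, words):
        return [word for word in words if word not in self._stop]


class SketchLexicon:
    def __init__(self, *pipeline):
        self._wids = {}      # word -> word id
        self._nextwid = 1
        self._pipeline = pipeline
        # Statistics, mirroring the instance variables added in this checkin.
        self._nbytes = 0     # counted before the pipeline runs
        self._nwords = 0     # counted after the pipeline is done

    def length(self):
        """Return the number of unique terms in the lexicon."""
        return self._nextwid - 1

    def sourceToWordIds(self, text):
        last = [text] if isinstance(text, str) else list(text)
        for t in last:
            self._nbytes += len(t)
        for element in self._pipeline:
            last = element.process(last)
        self._nwords += len(last)
        return [self._getWordIdCreate(word) for word in last]

    def _getWordIdCreate(self, word):
        # Assign the next free id to a word seen for the first time.
        wid = self._wids.get(word)
        if wid is None:
            wid = self._nextwid
            self._nextwid += 1
            self._wids[word] = wid
        return wid


lexicon = SketchLexicon(WhitespaceSplitter(), LowerCaser(),
                        StopWordRemover(["the"]))
first = lexicon.sourceToWordIds("The quick brown fox")
second = lexicon.sourceToWordIds("the lazy fox")

# Read the statistics directly, as the log message suggests.
print(lexicon._nbytes, lexicon._nwords, lexicon.length())
```

The stop word is dropped by the pipeline, so it contributes to _nbytes
but not to _nwords; repeated words count toward _nwords each time they
are indexed, while length() still reports unique terms only.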


=== Products/ZCTextIndex/Lexicon.py 1.6 => 1.7 ===
         self._pipeline = pipeline
 
+        # Keep some statistics about indexing
+        self._nbytes = 0 # Number of bytes indexed (at start of pipeline)
+        self._nwords = 0 # Number of words indexed (after pipeline)
+
     def length(self):
         """Return the number of unique terms in the lexicon."""
         return self._nextwid - 1
@@ -45,8 +49,11 @@
 
     def sourceToWordIds(self, text):
         last = _text2list(text)
+        for t in last:
+            self._nbytes += len(t)
         for element in self._pipeline:
             last = element.process(last)
+        self._nwords += len(last)
         return map(self._getWordIdCreate, last)
 
     def termToWordIds(self, text):