[Zope] Experimental searchable mail list archive

Ben Leslie benno@sesgroup.net
Thu, 7 Oct 1999 11:54:28 +1000


Hi The!

On Tue, 05 Oct 1999, The Dragon De Monsyne wrote:

> On Tue, 28 Sep 1999, Michel Pelletier wrote:
> 
> > Greetings,
> > 
> > I finally got sick of paging through endless archive messages, so I
> > implemented an experimental searchable list archive:
> > 
> > http://www.zope.org:12080/archives/Catalog/S
> > 
> > will present you with a single text search box.  This is a very trivial
> > interface; it will be expanded upon.
> > 
> > Please try to use this over the next few days and see if it helps answer
> > your questions.
> > 
> > I used the fsimport script to import the entirety of the pipermail
> > archive, and then cataloged it with the 'Find objects' Catalog tab.  In
> > the process, I fixed a silly design flaw that improved the mass indexing
> > speed of catalog by at least 200% and greatly reduced the memory
> > overhead and thrashing.  The dataset of documents is 56MB, the total
> > dataset plus indexes is 64MB.  Not bad.  It took 6 minutes to index the
> > entire dataset with a 10000 word subtransaction threshold and the
> > process footprint grew to 85MB.  Catalog has come a long way in terms of
> > speed and memory usage.
> > 
> > Further improvements are to parse the documents into rfc822 Messages
> > (probably with a ZClass), index all interesting attributes (date,
> > author, etc.), and implement a simple ZPublisher.Client script that
> > mailman calls to 'push' a message up to the server, instantiate a new
> > message object, and incrementally index it in the Catalog.
> > 
> 	Hmmm! Whaddayaknow! this is exactly what I've been working on! 
> I've been planning out a product called MessageBase to do this.  I'm
> sketching out the Message class right now. I'm planning on it having full
> MIME support. (One of the things I have gotten done so far is an improved
> version of Python's mimetools module that is actually compliant with the
> MIME RFCs.)


Umm, any chance you could send me a copy of these? I have been working on
the NotMail product and have got it viewing MIME messages slightly better;
however, an improved mimetools would certainly make my life easier and my
code neater.
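
For reference, here is a rough sketch of the sort of thing discussed above:
parse a raw message, pull out the attributes a Catalog might index (date,
author, subject), and list the MIME parts.  It assumes Python's standard
email package as a stand-in; it is not the improved mimetools, and it is
not NotMail or MessageBase code.  The function name summarize_message and
the returned dictionary keys are made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def summarize_message(raw_text):
    """Return header fields and MIME parts that a catalog might index.

    Hypothetical helper: the name and the result keys are illustrative only.
    """
    msg = message_from_string(raw_text)

    # Split "Real Name <user@host>" into its two components.
    name, address = parseaddr(msg.get("From", ""))

    # A missing Date header yields None rather than raising.
    date_header = msg.get("Date")
    date = parsedate_to_datetime(date_header) if date_header else None

    # Walk the MIME tree and record only the leaf parts.
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # multipart containers carry no content of their own
        parts.append((part.get_content_type(), part.get_filename()))

    return {
        "author": name or address,
        "address": address,
        "subject": msg.get("Subject", ""),
        "date": date,
        "parts": parts,
    }
```

Each entry in "parts" is a (content type, filename) pair, with the filename
set to None for parts that are not attachments.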


Cheers,


Benno