[Zope] ZODB vs. Gadfly vs. ???

Michel Pelletier michel@digicool.com
Thu, 09 Sep 1999 19:56:05 -0400


John Goerzen wrote:
> 
> Hi,
> 
> First, thanks to everyone that has helped me along so far!  It is
> *tremendously* appreciated!  You can see what I've managed to hack
> together thus far at http://www.complete.org:8080/ACLUG/events
> 
> Now I need some advice rather than pointers on syntax (though I'm
> still learning that, too).  Here's my situation.  I want to put up
> information about some things (they happen to be single-time events,
> but that's not terribly relevant).  Kevin Dangoor and others suggested
> that I make an Event ZClass, then give it an index_html or whatever to
> fetch the events.  Pretty slick, and it was easier to implement than a
> gadfly thing, with which I had been failing due to not being aware of
> _.DateTime().
> 
> However, I am running into some problems.  I have only a couple dozen
> Event objects in my eventsDb folder, but already there is a noticeable
> performance hit.  It takes the server about 3-4 seconds to render the
> above page in calendar mode -- which is completely out of line on this
> server, which is a cream-of-the-crop 600MHz Alpha.  This is, no doubt, due to
> inefficiency.  To search for events on a given date, I have to iterate
> through *all* the objects in there, inspecting dates.  This must be
> done for each day in a given month to determine whether there is an
> event on that day (for displaying on a calendar square).  Ick.
> 

So you iterate over each object 28-31 times?  Then your problem is
purely algorithmic.
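
To make the algorithmic point concrete, here is a rough sketch in plain
Python (not Zope or DTML code).  The event list and both function names
are made up for illustration; I am assuming each event boils down to a
title and a date.  The first function rescans every event for every day
of the month, which is roughly what the calendar page is doing now.  The
second looks at each event once and drops it into a per-day bucket.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-ins for the Event objects: (title, date) pairs.
events = [
    ("Install fest", date(1999, 9, 11)),
    ("Monthly meeting", date(1999, 9, 20)),
    ("Key signing", date(1999, 9, 20)),
]

def events_by_day_slow(events, year, month, days_in_month):
    """Rescan every event for every day: days * events comparisons."""
    calendar = {}
    for day in range(1, days_in_month + 1):
        calendar[day] = [title for title, when in events
                         if when == date(year, month, day)]
    return calendar

def events_by_day_fast(events, year, month):
    """Look at each event once and drop it into its day's bucket."""
    buckets = defaultdict(list)
    for title, when in events:
        if (when.year, when.month) == (year, month):
            buckets[when.day].append(title)
    return buckets

slow = events_by_day_slow(events, 1999, 9, 30)
fast = events_by_day_fast(events, 1999, 9)
```

Both give the same answer for every calendar square, but the second does
one pass over the events instead of one pass per day.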

It would be better to catalog the date attributes of each object and
then ask the Catalog:

<dtml-in "Catalog.searchResults({'date_property' : [ZopeTime(),
(ZopeTime() + 1)], 'date_property_usage' : 'range:min:max'})">

This returns all objects with a 'date_property' whose date is between
now (ZopeTime()) and now+24 hours (ZopeTime()+1).  This may not be
exactly what you want, but you get the idea.  The lookup is very, very
fast, and will scale to thousands of events.  One of our customers uses
the Catalog to search over 10,000 objects in the blink of an eye.  In
fact, I think he uses the Catalog and Calendar together in a way like
this.  Jason?

In addition, you can teach your ZClass instances to automatically catalog
and uncatalog themselves without management intervention.

>  * Am I missing out on some whiz-bang way to do searches through
>    a directory full of custom objects?

Yes, the Catalog.  Imagine searching through millions of Oracle records
iteratively.  Catalog uses indexes just like relational databases do to
greatly speed up searches.
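
As a toy illustration of why an index helps, here is a sketch in plain
Python.  It is not how the Catalog is implemented; the DateIndex class,
its method names, and the day numbers are all invented for the example.
The idea is only that keys are kept sorted, so a range query becomes two
binary searches instead of a scan over every object.

```python
import bisect

class DateIndex:
    """Toy sorted index mapping a sortable key to object ids."""

    def __init__(self):
        self._keys = []   # keys, kept in sorted order
        self._ids = []    # object ids, parallel to _keys

    def index_object(self, key, object_id):
        # Find the slot that keeps _keys sorted, then insert both halves.
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._ids.insert(pos, object_id)

    def range_search(self, low, high):
        """Return ids whose key satisfies low <= key <= high."""
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_right(self._keys, high)
        return self._ids[start:stop]

index = DateIndex()
# Hypothetical events keyed by day of the month.
for object_id, day_number in [("e1", 5), ("e2", 12), ("e3", 12), ("e4", 27)]:
    index.index_object(day_number, object_id)

hits = index.range_search(10, 20)
```

The cost of the indexing work is paid once, when an object is added,
rather than on every page view.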

>  * Should I be using Gadfly instead?  Would it be faster?  Why is ZODB
>    so slow?

ZODB isn't slow.  It may be slowER than <insert your favorite database
here> but then again, it might be faster.  Gadfly might be faster, it
might not.  From what I understand, Gadfly keeps all or part of its
data in memory at all times and never writes to disk; if this is the case
then it will be faster, until you run out of memory (I could be wrong,
haven't used Gadfly in a bit).

>  * In essence, because of the ZODB architecture or my own ignorance
>    of how to do it better, I'm getting performance of less than 10
>    queries per second.  This is not acceptable.
>
> Also, I am having performance worries.  If the server chokes this fast
> with only a couple dozen items, I am concerned.  This server is
> normally capable of dishing out many thousands of documents a second,
> and even figuring worst-case here, (24 * 30), it's getting only 720
> (and those aren't even complete documents, just lookups).  Can someone
> help ease my mind on this one?

Your server is choking because of your choice of algorithm.  That is a
separate issue from the fact that Zope cannot serve up information as
fast as a static web server like Apache.  The latter is obvious when you
take into account that Apache is a C program that serves static files,
with gobs of optimizations to make that operation fast.  There is no
script evaluation, acquisition, advanced security model, etc.  Zope is
written in a higher-level language and serves up dynamic content; it
must go through a code path on every request that varies from simple to
complex.  I run Zope on a nifty little P75 with 32MB of RAM and it works
fairly dandy.  Bruce Perens runs his popular technocrat website
(http://www.technocrat.net) on a humble P120.  The technocrat site
survived a full-on Slashdot effect over the course of 24 hours with
flying colors.

We are addressing performance, but I don't think Zope itself has
anything like the kind of problem you are experiencing.  Granted, there
are many areas we can improve through cleaner design and conversion to C
code, but none of these optimizations would help your case.  I would
suggest looking into the Catalog.

-Michel
 
> Many thanks,
> 
> John Goerzen
>