[Zope] ZCatalog : Limiting number of records returned?

Edward Pollard pollej@uleth.ca
Fri, 21 Feb 2003 11:15:38 -0700


I'm having a problem with speed and my zCatalog. Since I only have 350
items in it, I think I must be doing something horribly wrong.

Here is my situation...

I'm using the zCatalog to sort department information. Documents have
properties for which department owns them (owner), and what kind of
document they are (doctype). Departments have a hierarchical
relationship to one another.

I have two needs:
1) When you request a list of documents of a certain doctype from a
specific owner, you get a list of that doctype from that owner and all
owners below that owner in the hierarchy.

2) Each owner has a page listing the types of documents available for
that owner. This list is dynamic based on what doctypes are in use for
that owner, and all owners below that owner in the hierarchy.

Need 2 is what is slowing me down, unacceptably so, because it constructs
an OR query that gets sluggish.

The Python that calculates active doctypes is this:

------
# One catalog query per doctype; the owner list is matched as an OR
searchresults = context.zIndex_Catalog(
    uofl_owner=context.get_dept_search_record(owner),
    uofl_doctype=str(category))
total = len(searchresults)
------
Things to Know:
* zIndex_Catalog is the zCatalog
* uofl_owner/uofl_doctype : the FieldIndexes of the documents. These
properties are string values that hold numbers. (Typing was a pain in the
early stages of getting this working, and it was easier to keep them as
strings even though the data is numeric. It's weird and should be fixed,
but I've got bigger fish to fry.)
* get_dept_search_record is a function that returns a collection of all
subdepartments. (E.g. if owner = 5, the results from this function are
['5', '2', '3', '6', '7', '8', '9', '10', '11', '13', '14', '15', '16',
'17', '18'], which represents owner 5 and all the owners below it in the
hierarchy. Owner 5 is still a few levels from the top of the hierarchy,
too. There is no > or < relationship between values in the hierarchy; it's
all expressed in a database, and owner keys are assigned when the owner is
created.)
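
As an illustration only, a collection like the one above could be gathered
from (owner, parent) pairs roughly as follows. The PARENT_OF table and the
name collect_subdepartments are made up for this sketch; the real
get_dept_search_record reads the hierarchy from a database and may work
differently.

```python
# Hypothetical owner -> parent pairs; the real hierarchy lives in a database.
PARENT_OF = {
    '5': '1',
    '2': '5',
    '3': '5',
    '6': '2',
    '7': '3',
}


def collect_subdepartments(owner, parent_of=PARENT_OF):
    """Return the owner itself plus every owner below it, as strings."""
    # Invert the table so each parent maps to its direct children.
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    # Walk down from the requested owner, collecting everything reachable.
    result = []
    stack = [owner]
    while stack:
        current = stack.pop()
        result.append(current)
        stack.extend(children.get(current, []))
    return result
```

The requested owner comes first in the result, matching the '5' at the
head of the list above; the order of the remaining keys carries no meaning.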

So for each doctype - and I think we have about 15 at the moment - that
query gets run to see if that doctype is active. It takes about 5
seconds to return the index for owner=5. That's *way* too slow.
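
To make the pattern concrete, the sketch below uses a plain Python list as
a stand-in for the catalog. It is not the ZCatalog API, and RECORDS,
active_doctypes and doctype_is_active are made-up names. It shows the two
behaviours being asked about: collecting the doctypes in use in a single
pass, and checking one doctype while stopping at the first match instead
of counting every hit.

```python
# Hypothetical stand-in for catalog records; NOT the ZCatalog API.
RECORDS = [
    {'uofl_owner': '5', 'uofl_doctype': '1'},
    {'uofl_owner': '2', 'uofl_doctype': '3'},
    {'uofl_owner': '9', 'uofl_doctype': '1'},
    {'uofl_owner': '40', 'uofl_doctype': '7'},
]


def active_doctypes(records, owners):
    """Collect the distinct doctypes used by any of the owners, in one pass."""
    wanted = set(owners)
    found = set()
    for record in records:
        if record['uofl_owner'] in wanted:
            found.add(record['uofl_doctype'])
    return found


def doctype_is_active(records, owners, doctype):
    """Return True at the first matching record rather than counting all."""
    wanted = set(owners)
    return any(
        record['uofl_owner'] in wanted and record['uofl_doctype'] == doctype
        for record in records
    )
```

With the sample data, active_doctypes(RECORDS, ['5', '2', '9']) yields the
doctypes '1' and '3', and doctype_is_active(RECORDS, ['5', '2', '9'], '7')
is false because doctype '7' belongs only to an owner outside the list.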

Now, I know caching is a possible resolution to this delay, but the poor
user who arrives just as the cache is being refreshed is still going to
get stuck with the 5-second delay. This will still cause problems.

Is there any way to search the catalog and stop as soon as one result is
found? Or can anyone else suggest an inspired solution to this problem?
I'd appreciate it.

Edward