[Zope-CMF] cmf zpt_generic/search - sequence_length is very slow for large collections, len () is much faster.

Henryk Paluch hpaluch at gitus.cz
Wed Dec 22 09:33:57 EST 2004


Hi folks!
   I'm currently running a few scalability tests on Zope 2.7.3 with 
CMF 1.5.0 (Fedora Core 3 Linux, 700MHz CPU, 640MB RAM, 2x IDE disks). 
The portal was filled with 66000 portal Document objects (3 CMF 
BTreeFolders, each containing about 20000 documents). Fulltext indexes 
are implemented with TextIndexNG and the storage is currently BDB. I 
started with simple profiling of the document search (using log and 
DateTime() ;-) and found that for large result sets the expression

length = batch_obj.sequence_length

took nearly the same time as the search itself, i.e.:
 items = ctool.searchResults(kw)

For example, when a search returns about 7000 items, 
ctool.searchResults(kw) takes about 30s, but 
batch_obj.sequence_length takes another 30 seconds!
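
For anyone who wants to repeat the measurement, here is a minimal timing sketch. It only illustrates the approach: `timed` and `fake_search` are hypothetical names, and `fake_search` is a stand-in for ctool.searchResults(kw), not the real catalog call.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    print("%s took %.3f s" % (label, elapsed))
    return result, elapsed


def fake_search(n):
    # Hypothetical stand-in for ctool.searchResults(kw); it just builds a list.
    return list(range(n))


# Time the search and the length computation as two separate steps,
# so the cost of each one shows up on its own line.
items, search_time = timed("search", fake_search, 7000)
length, length_time = timed("length", len, items)
```

In the real script, the second call would wrap whatever expression computes the length, so the two costs can be compared directly.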

I tried replacing
length = batch_obj.sequence_length
with:
length = len(items)
and wow! Now it is instant (about 100ms) - a significant speedup.
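
One possible explanation, not confirmed for batch_obj.sequence_length, is that the slow path touches every item of a lazy result set while len() only asks the container for its size. The sketch below illustrates that general effect. `LazyResults` and `length_by_iteration` are hypothetical and are not CMF or ZTUtils code.

```python
class LazyResults(object):
    """Hypothetical lazy result set: knows its size, builds items on demand."""

    def __init__(self, ids):
        self._ids = list(ids)
        self.loads = 0  # how many items have been materialized so far

    def __len__(self):
        # Cheap: no item is materialized.
        return len(self._ids)

    def __getitem__(self, index):
        item_id = self._ids[index]  # raises IndexError past the end
        self.loads += 1             # stands in for an expensive object load
        return {"id": item_id}


def length_by_iteration(seq):
    """Count the items by walking the sequence, which touches every item."""
    count = 0
    for _ in seq:
        count += 1
    return count


results = LazyResults(range(7000))

n_fast = len(results)
loads_after_len = results.loads        # no items were loaded by len()

n_slow = length_by_iteration(results)
loads_after_iteration = results.loads  # every item was loaded once
```

Both calls return the same number, but only the iterating version pays the per-item cost. That would be consistent with the timings above if loading each result is expensive.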

Also, when I tried a search for Document objects only (no fulltext), e.g.:
http://z154.gitus.dom:8080/portal/search?review_state=&SearchableText=&Title=&Subject%3Alist=&Description=&created.query%3Arecord%3Adate=1970%2F01%2F01+00%3A00%3A01+GMT&created.range%3Arecord=min&portal_type%3Alist=Document&listCreators=
The original code took 90s to display the results, but the modified version took just 3s.

Can anybody confirm/explain this, or even submit this optimization into 
the CMF tree (if it is correct ;-)?

Best regards

-- 
---(c)--------------------------------------------------
GITUS, s.r.o. Spitalska 2a, 190 00 Praha 9, CZ
Henryk Paluch - analytik/programator
mailto: hpaluch at gitus.cz  http://www.gitus.cz
--------------------------------------------------------


