[Zope-dev] catalog performance: query plan

Roché Compaan roche at upfrontsystems.co.za
Sun Nov 9 13:58:02 EST 2008


On Mon, 2008-10-27 at 11:32 -0500, Alan Runyan wrote:
> I agree with Tres.  A lot more can be done with Indexes and Catalog
> without caching.
> 
> The most exiciting development in Catalog optimizations comes out
> Jarn.  Helge Tesdal (iirc) did a buncha work  at a RDBMS company when
> he was in college.  He has a protoype of a query plan for ZCatalog.
> 
> http://www.jarn.com/blog/catalog-query-plan
> 
> I would like to ask Roche and others to look at the Query Plan.

We looked at query plan but it didn't help us in any way. Some catalog
indexes are performing very badly and most of our content is in a
published state which doesn't help the query plan much.

> Caching is a total PITA because invalidation machinery becomes
> overwhelming complex and unwieldly quickly in production.
> 

I agree but this was the only thing that we could do to even go into
production.

Since I'm in full agreement that we need to fix indexes that are
problematic, I started doing some benchmarks on the large data set that
gave us so many headaches. It is probably not surprising that the more
complex indexes are performing badly. DateRangeIndex, KeywordIndex and
Plone's ExtendedPathIndex performed the worst. Below are some stats
showing timings around the "apply_index" call in Catalog.py that was
done while testing the application with real data:

Index Name           |Type             |Avg Time |Calls/second
==============================================================
object_implements    |KeywordIndex     |0.2172234|         4.6
getEffective_or_creat|DateIndex        |0.1941770|        5.15
effectiveRange       |DateRangeIndex   |0.0086295|      115.88
allowedRolesAndUsers |KeywordIndex     |0.0069754|      143.36
path                 |ExtendedPathIndex|0.0040614|      246.22
portal_type          |FieldIndex       |0.0025984|      384.84
SearchableText       |ZCTextIndex      |0.0007645|     1308.04
sourceUID            |FieldIndex       |0.0004886|     2046.31
UID                  |FieldIndex       |0.0003070|      3257.1
targetUID            |FieldIndex       |0.0002287|     4372.12
exact_getUserId      |FieldIndex       |0.0001931|     5177.79
exact_getUserName    |FieldIndex       |0.0001816|     5504.39
relationship         |FieldIndex       |0.0000822|     12153.1
id                   |FieldIndex       |0.0000822|    12161.81
end                  |DateIndex        |0.0000623|    16027.48
getGroups            |FieldIndex       |0.0000278|    35973.45
getArtistTitle       |FieldIndex       |0.0000259|    38495.53
review_state         |FieldIndex       |0.0000259|    38582.22
Subject              |KeywordIndex     |0.0000253|    39413.57
getDaysOfTheWeek     |KeywordIndex     |0.0000247|    40465.98
meta_type            |FieldIndex       |0.0000199|    50116.64
exact_getGroupId     |FieldIndex       |0.0000162|    61417.51
getVideoURL          |FieldIndex       |0.0000155|     64447.5
year                 |FieldIndex       |0.0000155|    64460.43
Title                |FieldIndex       |0.0000136|    73381.01
getId                |FieldIndex       |0.0000131|    76056.97
Title                |ZCTextIndex      |0.0000128|    77809.46
startendrange        |DateRangeIndex   |0.0000127|    78485.82
expires              |DateIndex        |0.0000126|    79001.59
getObjPositionInParen|FieldIndex       |0.0000124|     80675.9
targetId             |FieldIndex       |0.0000122|    81418.68
effective            |DateIndex        |0.0000121|     82651.7
getProvince          |FieldIndex       |0.0000117|    85198.54
month                |FieldIndex       |0.0000116|    85762.56
Description          |ZCTextIndex      |0.0000116|    86241.39
Type                 |FieldIndex       |0.0000115|    86345.17
getLast_login_time   |DateIndex        |0.0000115|    86698.98
Creator              |FieldIndex       |0.0000113|    87840.03
getEmail             |ZCTextIndex      |0.0000113|    87849.05
cmf_uid              |FieldIndex       |0.0000113|    88352.13
getDuration          |FieldIndex       |0.0000113|    88454.29
SearchableText       |TextIndex        |0.0000113|    88466.69
sortable_title       |FieldIndex       |0.0000112|    88698.49
getRating            |FieldIndex       |0.0000112|     88747.5
getGenres            |KeywordIndex     |0.0000112|    88796.55
object_provides      |KeywordIndex     |0.0000112|    88919.43
getEventType         |KeywordIndex     |0.0000112|     88953.9
in_reply_to          |FieldIndex       |0.0000112|    89057.46
getReview_state      |FieldIndex       |0.0000112|    89124.63
is_folderish         |FieldIndex       |0.0000112|    89240.51
getRawRelatedItems   |KeywordIndex     |0.0000111|    89568.91
getThumbSize         |FieldIndex       |0.0000111|    89653.89
getStudioCamURL      |FieldIndex       |0.0000111|    89678.92
Date                 |DateIndex        |0.0000111|    89799.23
getHash              |FieldIndex       |0.0000111|    90111.54
getNumberOfComments  |FieldIndex       |0.0000110|    90141.88
start                |DateIndex        |0.0000110|    90400.59
getPercentage        |FieldIndex       |0.0000110|    90420.94
is_default_page      |FieldIndex       |0.0000110|     90446.4
modified             |DateIndex        |0.0000106|    93506.29
created              |DateIndex        |0.0000106|    93678.59
getGroupId           |ZCTextIndex      |0.0000105|    94962.39
getUserId            |ZCTextIndex      |0.0000105|    95165.88
getFullname          |ZCTextIndex      |0.0000104|    95313.06
getRoles             |FieldIndex       |0.0000104|    95385.31
getUserName          |ZCTextIndex      |0.0000103|    96692.46


-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za



More information about the Zope-Dev mailing list