-----Original Message-----
From: Chris Withers [mailto:chrisw@nipltd.com]
Sent: Friday, 26 April 2002 11:18 PM
To: tdickenson@geminidataloggers.com
Cc: Jay, Dylan; zope-dev@zope.org
Subject: Re: [Zope-dev] RE: [ZC] 365/ 2 Reject "metadata in Catalog is space inefficient"
Toby Dickenson wrote:
Note that this scheme may not necessarily give runtime performance benefits. Loading the reverse index data may not be any faster than loading metadata.
I'm betting in a lot of cases it'll be a damn sight slower.
MetaData is specifically designed to be real quick to load. For the small extra space usage (how much _does_ disk space, or RAM for that matter, cost nowadays?! ;-), I'm more than happy to take the speed win...
Fair enough, I hadn't considered the time trade-off of loading the reverse index. However, let me couch my suggestion in a different way. We've identified that the metadata can be located in possibly three different places, and each has different speed/space trade-offs. Each also has a different API, e.g. (from memory, so this won't be entirely correct):
getObjectForRID[object_id_].field
getIndexDataForRID[object_id_].field
field
All of these methods might result in the correct data being displayed for a given search (each with a different trade-off). Perhaps there should be a way of making this trade-off transparently. That way the report designer can be ignorant of any optimizations, e.g.:
<dtml-in Catalog> <dtml-var title> </dtml-in>
would work no matter whether the title was set as a metadata field or not. If it wasn't, the catalog might access the object and look it up (with the resulting time penalty). Then, in the same way as in the RDBMS world, the indexes (the Catalog) could be adjusted to make this operation more efficient without changing any code (just add metadata to the Catalog). The same trade-off works in the other direction: if an admin decided that the object was too big and the overhead of repeating the field in metadata was too high, they could choose to obtain the data from the FieldIndex data instead.
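
To make the idea concrete, here is a rough Python sketch of the kind of transparent lookup I mean. It is not the real ZCatalog code: the class name TransparentRecord, the plain dicts standing in for the metadata and index stores, and the load_object callable are all made up for illustration. The point is only the order of the fallbacks: metadata first, then index data, then the object itself.

```python
class TransparentRecord:
    """Hypothetical search-result record (not the real ZCatalog brain).

    Attribute access tries the cheapest source first and only loads
    the full object as a last resort.
    """

    def __init__(self, rid, metadata, index_data, load_object):
        self._rid = rid
        self._metadata = metadata        # assumed shape: {rid: {field: value}}
        self._index_data = index_data    # assumed shape: {rid: {field: value}}
        self._load_object = load_object  # assumed callable: rid -> object

    def __getattr__(self, name):
        # Only called when normal lookup fails; refuse private names so a
        # half-initialised instance cannot recurse forever.
        if name.startswith("_"):
            raise AttributeError(name)

        # 1. Metadata stored alongside the catalog record.
        meta = self._metadata.get(self._rid, {})
        if name in meta:
            return meta[name]

        # 2. Value recovered from index data for this record.
        indexed = self._index_data.get(self._rid, {})
        if name in indexed:
            return indexed[name]

        # 3. Last resort: load the real object and pay the time penalty.
        return getattr(self._load_object(self._rid), name)


if __name__ == "__main__":
    class Document:
        title = "Annual report"
        author = "dylan"

    record = TransparentRecord(
        rid=1,
        metadata={1: {"title": "Annual report"}},
        index_data={},
        load_object=lambda rid: Document(),
    )
    print(record.title)   # served from metadata
    print(record.author)  # falls through to the object
```

Report code that reads record.title stays the same whichever store ends up serving it, so moving a field into or out of metadata becomes purely a tuning decision.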