I changed the ZCatalog and indexes codebase for Zope 2.8 in the following way:
- ZCatalog and indexes derived from UnIndex used a nasty implementation of __len__ which led to problems in Zope 2.8 with new-style classes. The corresponding code has been cleaned up. For ZCatalog instances there is an implicit migration built in for the __len__ attribute. For all indexes I added a manage_convertIndexes() method that re-creates and reindexes all indexes of a given ZCatalog instance.
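As an illustration of why an instance-level __len__ breaks with new-style classes, here is a minimal sketch. It assumes the old code kept a callable counter in an instance attribute named __len__; the classes Length, Broken and Fixed are hypothetical stand-ins, not the actual ZCatalog code. It is written for Python 3, where every class is new-style.

```python
class Length:
    """Hypothetical stand-in for a callable length counter."""

    def __init__(self, value=0):
        self.value = value

    def change(self, delta):
        self.value += delta

    def __call__(self):
        return self.value


class Broken:
    def __init__(self):
        # Assumed old approach: a callable stored on the instance.
        # len() on a new-style instance looks up __len__ on the type,
        # so this attribute is ignored and len() raises TypeError.
        self.__len__ = Length()


class Fixed:
    def __init__(self):
        self._length = Length()

    def __len__(self):
        # Implicit migration: adopt a counter that an older instance
        # still carries under the old attribute name.
        if "_length" not in self.__dict__:
            old = self.__dict__.pop("__len__", None)
            self._length = old if old is not None else Length()
        return self._length()


catalog = Fixed()
catalog._length.change(3)
print(len(catalog))
```

The migration happens lazily on the first len() call, so instances created before the change need no explicit conversion step in this sketch.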
Outstanding issues:
- Some indexes show the number of indexed objects and others the number of indexed terms within the ZMI, which is totally inconsistent. I think the ZMI should show the number of indexed objects. Index-specific information, e.g. the number of indexed terms, should be shown within the index's default view (if necessary). Objections?
- Indexes derived from UnIndex also store information about objects for which they index nothing useful. E.g. if you have one thousand objects and a keyword index indexing only two of them (because only these two have the required property or method returning indexable values), then you still have the RIDs of all one thousand objects within the internal data structure of the index: a total waste of space. An optimised version of UnIndex would store only values evaluating to non-zero. Such an optimisation is already used in Dieter's ManagableIndexes. This might change the behaviour of some applications, but I am not completely sure about this issue. Thoughts?
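The optimisation described in the last point could look roughly like the following sketch. SparseKeywordIndex, its attribute names and num_objects() are hypothetical and only illustrate the idea; this is not the UnIndex or KeywordIndex implementation. The index keeps a RID only when the object actually yields keywords.

```python
class SparseKeywordIndex:
    """Toy keyword index that stores nothing for objects without keywords."""

    def __init__(self, attr):
        self.attr = attr
        self._index = {}    # keyword -> set of RIDs
        self._unindex = {}  # RID -> tuple of keywords

    def index_object(self, rid, obj):
        self.unindex_object(rid)  # drop any stale entry first
        keywords = getattr(obj, self.attr, ())
        if callable(keywords):
            keywords = keywords()
        keywords = tuple(keywords or ())
        if not keywords:
            return 0  # nothing to index: leave no trace of this RID
        for keyword in keywords:
            self._index.setdefault(keyword, set()).add(rid)
        self._unindex[rid] = keywords
        return 1

    def unindex_object(self, rid):
        for keyword in self._unindex.pop(rid, ()):
            rids = self._index[keyword]
            rids.discard(rid)
            if not rids:
                del self._index[keyword]

    def num_objects(self):
        return len(self._unindex)


class Doc:
    def __init__(self, subjects=None):
        if subjects is not None:
            self.subjects = subjects


index = SparseKeywordIndex("subjects")
for rid in range(1000):
    index.index_object(rid, Doc(["zope"] if rid < 2 else None))
print(index.num_objects())  # prints 2
```

With one thousand objects of which two carry keywords, the reverse mapping holds two entries instead of one thousand.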
Andreas
Andreas Jung wrote at 2005-1-30 15:30 +0100:
... Outstanding issues:
- Some indexes show the number of indexed objects and others the number of indexed terms within the ZMI, which is totally inconsistent. I think the ZMI should show the number of indexed objects. Index-specific information, e.g. the number of indexed terms, should be shown within the index's default view (if necessary). Objections?
This was already discussed (--> mailing list archives).
I (and others) find it more informative to get a feeling for the size of the index (rather than the number of indexed objects) in the overview.
- Indexes derived from UnIndex also store information about objects for which they index nothing useful. ... An optimised version of UnIndex would store only values evaluating to non-zero.
It would essentially change the "number of indexed objects" (and make it a bit more informative when one is interested in the size of the index) ;-)
You must be a bit careful with the "non-zero". Some indexes interpret (some) zero values in a special way, e.g. "DateRangeIndex". It interprets "None" as "no limit".
However, I agree with you: at least when an object "o" does not define a value for index "i", then "i" should not index "o".
I would also prefer it if "None" consistently meant: I do not have a value (in the current context). But this would interfere with some indexes.
--On Sunday, 30 January 2005 19:17 +0100 Dieter Maurer dieter@handshake.de wrote:
- Some indexes show the number of indexed objects and others the number of indexed terms within the ZMI, which is totally inconsistent. I think the ZMI should show the number of indexed objects. Index-specific information, e.g. the number of indexed terms, should be shown within the index's default view (if necessary). Objections?
This was already discussed (--> mailing list archives).
I (and others) find it more informative to get a feeling for the size of the index (rather than the number of indexed objects) in the overview.
There are two points of view: the "normal" user is confused if some indexes show the number of indexed objects and others show the size of the index within the same column. This should be consistent. I would prefer the number of indexed objects within the default ZMI view and would put index-specific size information into each index's own default view.
- Indexes derived from UnIndex also store information about objects for which they index nothing useful. ... An optimised version of UnIndex would store only values evaluating to non-zero.
It would essentially change the "number of indexed objects" (and make it a bit more informative when one is interested in the size of the index) ;-)
right.
You must be a bit careful with the "non-zero". Some indexes interpret (some) zero values in a special way, e.g. "DateRangeIndex". It interprets "None" as "no limit".
Yes, but DateRangeIndex overrides index_object(), so a change to UnIndex itself would do no harm.
I would also prefer it if "None" consistently meant: I do not have a value (in the current context). But this would interfere with some indexes.
"None" could be a problem with other indexes. At least there should be a unique marker saying: I have nothing of interest to be indexed.
Andreas
Andreas Jung wrote at 2005-1-31 18:50 +0100:
...
[AJ]
- Some indexes show the number of indexed objects and others the number of indexed terms within the ZMI, which is totally inconsistent. I think the ZMI should show the number of indexed objects. Index-specific information, e.g. the number of indexed terms, should be shown within the index's default view (if necessary). Objections?
[DM]
This was already discussed (--> mailing list archives).
I (and others) find it more informative to get a feeling for the size of the index (rather than the number of indexed objects) in the overview.
[AJ]
There are two points of view: the "normal" user is confused if some indexes show the number of indexed objects and others show the size of the index within the same column. This should be consistent. I would prefer the number of indexed objects within the default ZMI view and would put index-specific size information into each index's own default view.
[DM] All are with you (including myself) when you strive for consistency. The display should be consistent and correspond to the label in the table head.
I am not with you with respect to "number of indexed objects" versus "size of the index". In fact *BOTH* are index specific (otherwise, it would not make any sense to list them in an index-specific column).
Maybe a compromise would be to include both numbers?
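To make the two numbers concrete, here is a small sketch with a toy index made of two plain dictionaries. The structure and the index_sizes() helper are assumptions for illustration only, not the real index internals.

```python
def index_sizes(unindex, forward):
    """Return (number of indexed objects, number of distinct terms)."""
    return len(unindex), len(forward)


# Toy data: three objects, two distinct terms.
forward = {"zope": {1, 2, 3}, "python": {2}}                    # term -> RIDs
unindex = {1: ("zope",), 2: ("zope", "python"), 3: ("zope",)}   # RID -> terms

objects, terms = index_sizes(unindex, forward)
print("%d objects / %d terms" % (objects, terms))  # prints 3 objects / 2 terms
```

A column showing both values in this form would remain unambiguous as long as the table head names them in the same order.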
--On Monday, 31 January 2005 20:20 +0100 Dieter Maurer dieter@handshake.de wrote:
I am not with you with respect to "number of indexed objects" versus "size of the index". In fact *BOTH* are index specific (otherwise, it would not make any sense to list them in an index-specific column).
Maybe a compromise would be to include both numbers?
I am currently working on a solution for all these issues on a dedicated 2.8 branch.
Andreas