[Grok-dev] Re: declaring catalog indexes

Martijn Faassen faassen at startifact.com
Mon Apr 16 09:30:53 EDT 2007


Philipp von Weitershausen wrote:
> Martijn Faassen wrote:
>> The main drawback to this approach is that all indexes are set up 
>> centrally. Of course you can write code to adjust the catalog later 
>> (and have no setup_catalog), but are there patterns we can come up 
>> with to make the creation of indexes easier?
>>
>> Let's translate the above in another proposal:
>>
>> class App(grok.Application):
>>     grok.local_utility(IntIds, provides=IIntIds)
>>     grok.local_utility(Catalog, provides=ICatalog, name='my_catalog')
> 
> Why do you make the catalog a named utility? I've seen you like doing 
> that (hurry.query's API always expects a name for the catalog, which 
> makes for a bit of a confusing API if the catalog's an unnamed utility).

I think it's mostly because we started that way with the Document 
Library, which has multiple catalogs, where that makes sense. We do need 
to provide for the use case where the catalog is unnamed.

>> class AIndex(grok.FieldIndex):
>>    grok.catalog(App, 'my_catalog')
>>    grok.attribute('a_attribute', ISomeInterface)
> 
> As somebody who knows the AttributeIndex API I would've expected 
> something like this:
> 
>   class AIndex(grok.FieldIndex):
>       grok.catalog(App, 'my_catalog')
> 
>       field_name = 'a_attribute'
>       field_callable = False
>       interface = ISomeInterface

Right, I was looking exactly for people familiar with other Index APIs. 
:) Then again, this API looks equivalent to that of FieldIndex, which I 
tried to cover (with a hidden default for the 'callable' argument :).

We could also make this:

grok.index_parameters('a_attribute', ISomeInterface, False)

Thinking about this some more, it's a bit confusing that grok.FieldIndex 
in fact does not in any way subclass from the *real* FieldIndex. This 
would cause a normal FieldIndex (not a subclass thereof) to be installed 
in the catalog. What would be a better name for 'grok.FieldIndex'?

>> class BIndex(grok.TextIndex):
>>    grok.catalog(App, 'my_catalog')
>>    grok.attribute('b_attribute', ISomeInterface)
> 
> Somebody who's not entirely familiar with the ZODB and how we do 
> indexing using the catalog probably won't think that this is 
> straight-forward. It's also a lot of typing and repetition. I do 
> understand and appreciate that it's pretty flexible because you can add 
> additional indexes quite easily.

You're right, this doesn't look as straightforward as it should be.

> To reduce typing, the "grok.catalog(...)" line could be guessed if 
> everything is in the same module and there's only one catalog (an 
> unnamed utility) in the site.
> 
> Still, I wonder how many people (coming from Archetypes or other 
> libraries that mangle a bit more information together) would expect to 
> set the indexing flag directly in the model or in the schema. While this 
> would mangle up things badly, I could think of a more compact spelling: 
> write a data schema and then write an "index schema" next to it:
> 
>   class IMySchema(Interface):
> 
>       a_attribute = schema.Int()
>       b_attribute = schema.Datetime()
> 
>       def c_method():
>           pass
> 
>   class MySchemaIndexes(grok.Indexes):
>       grok.schema(IMySchema) # this is actually the default
> 
>       a_attribute = grok.FieldIndex()
>       b_attribute = grok.TextIndex(lexicon)  # additional parameters
>       c_method = grok.SetIndex()   # field_callable is automatically
>                                    # determined from the schema
> 
> What do you think?

It's an interesting approach. I think I like it. :) It wouldn't work for 
a flexible run-time configuration of a catalog, but neither would mine 
and this is more concise. It also allows one to define indexes all over 
the place.

When the grok.Indexes subclasses are grokked, they would set up a 
subscriber for Application. This subscriber would install the
intids/catalog if needed (either nameless or being informed by
grok.name() on MySchemaIndexes). It would then install the appropriate
indexes.
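
As a minimal sketch of that subscriber idea in plain Python (no Grok or
Zope imports; FakeSite, ensure_utility and make_subscriber are made-up
names, and plain dicts stand in for the intids and catalog utilities):

```python
# Hypothetical stand-ins; none of these names are real Grok or Zope APIs.

class FakeSite:
    """Plays the role of the Application's local utility registry."""

    def __init__(self):
        self.utilities = {}  # (kind, name) -> utility


def ensure_utility(site, kind, name, factory):
    """Return the utility registered under (kind, name), creating it once."""
    key = (kind, name)
    if key not in site.utilities:
        site.utilities[key] = factory()
    return site.utilities[key]


def make_subscriber(index_definitions, catalog_name=''):
    """Build the handler that grokking one Indexes subclass would register."""

    def on_application_added(site):
        # Install intids and the catalog only if they are not there yet.
        ensure_utility(site, 'intids', '', dict)
        catalog = ensure_utility(site, 'catalog', catalog_name, dict)
        for index_name, index_kind in index_definitions.items():
            # Do not clobber an index that is already installed.
            catalog.setdefault(index_name, index_kind)
        return catalog

    return on_application_added


site = FakeSite()
# Two separate Indexes classes, possibly defined in different modules.
first = make_subscriber({'a_attribute': 'field'})
second = make_subscriber({'b_attribute': 'text'})
first(site)
second(site)
print(sorted(site.utilities[('catalog', '')]))
```

Both handlers end up sharing the same unnamed catalog, which is the
"define indexes all over the place" property mentioned above. Passing a
catalog_name would play the role of grok.name() on the Indexes class.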

Calling these 'grok.SetIndex' and such would be a bit dangerous however, 
as people that set up a catalog manually would be tempted to instantiate 
that and plug it into a catalog, and get very confused. Perhaps we need 
to come up with something like 'grok.indexschema' to import these things 
from... Perhaps grok.indexschema.Text, grok.indexschema.Field, 
grok.indexschema.Set? It needs to be somehow clear that these things are 
*descriptions* and not the thing themselves, just like schema fields are 
descriptions of attributes and not used to implement the attributes 
themselves.
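
To illustrate the description-versus-index distinction, here is a rough
sketch in plain Python. Field, Text, Set and Indexes are hypothetical
names, not an existing grok.indexschema API, and nothing here touches a
real catalog:

```python
# Hypothetical sketch; these classes only *describe* indexes.

class IndexDefinition:
    """Describes an index; it is not itself an index."""

    kind = None

    def __init__(self, *args, **kw):
        # Extra parameters (for example a lexicon) are only remembered here.
        self.args = args
        self.kw = kw

    def describe(self, name):
        return (name, self.kind, self.args, self.kw)


class Field(IndexDefinition):
    kind = 'field'


class Text(IndexDefinition):
    kind = 'text'


class Set(IndexDefinition):
    kind = 'set'


class Indexes:
    """Base class whose subclasses list index definitions as attributes."""

    @classmethod
    def definitions(cls):
        # Collect only the definitions declared directly on this class.
        found = {}
        for name, value in vars(cls).items():
            if isinstance(value, IndexDefinition):
                found[name] = value
        return found


class MySchemaIndexes(Indexes):
    a_attribute = Field()
    b_attribute = Text('my_lexicon')  # additional parameter, kept as-is
    c_method = Set()


for name, definition in sorted(MySchemaIndexes.definitions().items()):
    print(definition.describe(name))
```

Since a definition object has no indexing methods at all, plugging one
into a catalog by hand would fail early instead of confusing people later.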

Also interesting would be something like an 'AutoIndex'. This would 
figure out from the schema what kind of index to install. If the schema 
defines a sequence field, we automatically create a SetIndex. Too far 
for the first phase though.
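
A minimal sketch of how such an 'AutoIndex' lookup could work, in plain
Python. The field classes are stand-ins rather than zope.schema imports,
and falling back to a field index for everything that is not a sequence
is an assumption made for the example:

```python
# Hypothetical stand-ins for schema field types; zope.schema is not imported.

class Int:
    pass


class Datetime:
    pass


class List:
    pass


class Tuple:
    pass


# Field types treated as sequences for the purpose of this sketch.
SEQUENCE_FIELDS = (List, Tuple)


def auto_index_kind(field):
    """Pick an index kind from a schema field.

    Sequence fields get a set index, as described above. The 'field'
    fallback for everything else is an assumption.
    """
    if isinstance(field, SEQUENCE_FIELDS):
        return 'set'
    return 'field'


schema = {
    'a_attribute': Int(),
    'b_attribute': Datetime(),
    'tags': List(),
}

for name in sorted(schema):
    print(name, auto_index_kind(schema[name]))
```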

In fact there are other use cases besides indexing that need some kind 
of schema annotation. More application specific, but if I want to use 
zc.table to display a whole lot of objects (that provide a schema), I 
might want to specify which fields should be displayed where. Anyway, 
those use cases are more vague, so not something to worry about too much 
right now.

>> The indexes are grokked automatically and get created in the catalog 
>> when 'App' is created using a subscriber. This allows extension 
>> modules to create new indexes in the core. Of course the app would 
>> still need to be reinstalled to enable those indexes, but with some 
>> clever event registration (IApplicationSetup event?) we might be able 
>> to handle that quite nicely as well.
> 
> I think after setting up the catalog, we want to send an event 
> (CatalogAddedEvent or something like that). Only that way we can ensure 
> that indexes are set up *after* the catalog (Zope doesn't guarantee 
> ordered execution of event handlers).

Right. Your approach sounds good, except that I don't know how to do 
named catalogs with that. There would be only a single catalog. My 
approach, though less nice in some ways, does allow this.
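
The ordering argument can be sketched with a toy event system in plain
Python. The event classes and the subscribe/notify helpers below are
hypothetical stand-ins for illustration, not zope.event or
zope.component calls:

```python
# Toy event system; 'ApplicationAdded' and 'CatalogAdded' are made-up names.

handlers = {}  # event class -> list of callables


def subscribe(event_class, handler):
    handlers.setdefault(event_class, []).append(handler)


def notify(event):
    for handler in list(handlers.get(type(event), [])):
        handler(event)


class ApplicationAdded:
    def __init__(self, app):
        self.app = app


class CatalogAdded:
    def __init__(self, catalog):
        self.catalog = catalog


log = []


def setup_catalog(event):
    catalog = event.app.setdefault('catalog', {})
    log.append('catalog created')
    # Index handlers listen for this second event, so they can only run
    # once the catalog exists, whatever order handlers were registered in.
    notify(CatalogAdded(catalog))


def setup_a_index(event):
    event.catalog['a_attribute'] = 'field'
    log.append('index installed')


# Registered "in the wrong order" on purpose.
subscribe(CatalogAdded, setup_a_index)
subscribe(ApplicationAdded, setup_catalog)

app = {}
notify(ApplicationAdded(app))
print(log)
```

The index handler never subscribes to the application event at all, so
the lack of ordering guarantees between handlers of one event stops
mattering.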

>> One issue that this approach has is if indexes take other parameters 
>> than the three I know about (attribute name, interface, boolean on 
>> whether to call attribute or not). How do we pass them in? Does anyone 
>> have experience with indexes that can use extra information?
> 
> The only indexes I've worked with so far were attribute indexes. They take 
> the three famous arguments but we could simply set them as attributes on 
> the object as outlined above.

Right, your approach looks quite nice for this, no matter what indexes 
we run into in the future.

> The TextIndex from zope.app.catalog.text also takes a lexicon as a 
> fourth argument, though that too can be set as an attribute like the 
> others.

Just pass it along to the index definition constructor, indeed.

I like your approach! I might just go ahead and start working on this in 
a branch.

Regards,

Martijn
