[Zope-dev] 100k+ objects, or...Improving Performance of BTreeFolder...

Phillip J. Eby pje@telecommunity.com
Mon, 10 Dec 2001 11:43:03 -0500


At 04:08 PM 12/10/01 +0000, Tony McDonald wrote:
>On 10/12/01 2:54 pm, "Phillip J. Eby" <pje@telecommunity.com> wrote:
>
> > I'm not sure if this is taken into consideration in your work so far/future
> > plans...  but just in case you were unaware, it is not necessary for you to
> > persistently store objects in the ZODB that you intend to index in a
> > ZCatalog.  All that is required is that the object to be cataloged is
> > accessible via a URL path.  ZSQL methods can be set up to be
> > URL-traversable, and to wrap a class around the returned row.  To load the
> > items into the catalog, you can use a PythonScript or similar to loop over
> > a multi-row query, passing the objects directly to the catalog along with a
> > path that matches the one they'll be retrievable from.  This approach would
> > eliminate the need for BTreeFolder altogether, although of course it
> > requires access to the RDBMS for retrievals.  This should reduce the number
> > of writes and allow for bigger subtransactions in a given quantity of
> > memory.
>
>Gad! - are you saying you don't need to store a 1MB .doc file in the ZODB,
>but can still index the thing, store the index information in the ZCatalog
>(presumably a lot smaller than 1Mb) and have the actual file accessible from
>a file system URL? If so, that's really neat!

Yep.  By "URL path", though, I meant a *Zope* path.  However, it would be
straightforward to create a Zope object that represents a filesystem path
and does traversal/retrieval, assuming that one of the 'FS'-products out
there doesn't already do this for you.
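
As a rough illustration of such an object, here is a minimal sketch in plain
Python.  The class names FSFolder and FSFile are hypothetical, and a real Zope
product would also need the usual base classes and security declarations,
which are omitted here.  The only Zope-specific piece is the
__bobo_traverse__(REQUEST, name) hook, which the Zope 2 publisher calls for
each path segment.

```python
import os


class FSFile:
    """Hypothetical wrapper for one file found by traversal."""

    def __init__(self, path):
        self.path = path

    def data(self):
        # Read the file contents on demand; nothing is stored in the ZODB.
        with open(self.path, "rb") as f:
            return f.read()


class FSFolder:
    """Hypothetical Zope-style object exposing one flat filesystem directory."""

    def __init__(self, root):
        self.root = os.path.realpath(root)

    def __bobo_traverse__(self, REQUEST, name):
        # Resolve the requested name relative to the directory root.
        target = os.path.realpath(os.path.join(self.root, name))
        # Refuse anything that is not a regular file directly inside the
        # root (this rejects '..', subdirectories, and escaping symlinks).
        if os.path.dirname(target) != self.root or not os.path.isfile(target):
            raise KeyError(name)
        return FSFile(target)
```

With something like this mounted at, say, /files, a document would be
retrievable at the Zope path /files/report.doc, and that same path is what
you would hand to the catalog.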

Chris Withers has pointed out that technically you don't even need the path
string to be valid; it just has to be unique.  However, the standard tools
and the method for getting the "real object" referred to by the catalog
record do expect it to be a valid path, IIRC.  I personally find it most
convenient, therefore, to use a real Zope path.
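
To make the loading loop concrete, here is a minimal sketch.  It assumes the
catalog exposes catalog_object(obj, uid), as ZCatalog does; StubCatalog is
only a stand-in so the example runs outside Zope, and Row, load_rows and the
/docs base path are hypothetical names chosen for illustration.

```python
class Row:
    """Hypothetical class wrapped around one row of a multi-row query."""

    def __init__(self, doc_id, title):
        self.doc_id = doc_id
        self.title = title


class StubCatalog:
    """Stand-in for a ZCatalog; records what was cataloged under each uid."""

    def __init__(self):
        self.records = {}

    def catalog_object(self, obj, uid):
        # Cataloging under an existing uid replaces the earlier record,
        # which is why the uid (path) has to be unique per object.
        self.records[uid] = {"title": obj.title}


def load_rows(catalog, rows, base_path="/docs"):
    """Catalog each row under the path it would later be traversable at."""
    for row in rows:
        uid = "%s/%s" % (base_path, row.doc_id)
        catalog.catalog_object(row, uid)
```

In a real site the loop would live in a PythonScript or similar, the rows
would come from the ZSQL method, and the base path would be whatever Zope
path the URL-traversable ZSQL method answers at.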