[Zope-CMF] Pythonish Questions

Jon Edwards jon@pcgs.freeserve.co.uk
Tue, 1 May 2001 12:36:54 +0100


Hi all, I've started digging into Python, and I have a couple of questions
from looking at the CMF source code -

(N.B. I plumped for Boa Constructor in the end -
http://boa-constructor.sourceforge.net/ - seems an excellent open-source
Python IDE and wxPython GUI Builder, written in Python, with a lot of Zope
functionality already built-in!)

1. SearchableText method (of documents, news-items, etc) - If the
text_format is HTML, could this be made to strip out the HTML tags? Something
along the lines of...

    def SearchableText(self):
        "text for indexing"
        text = self.text
        if self.text_format == 'html':
            # Strip tags from a local copy; do not overwrite the stored text.
            text = strip_htmltags(text)
        return "%s %s %s" % (self.title, self.description, text)

(my syntax is probably wrong, but you get the idea?) Is there a function
somewhere equivalent to 'strip_htmltags'? I guess this is something DC would
need to do, as if I change the code myself it will be overwritten when I
upgrade?

This would keep the Catalog tidier (no HTML bits to confuse search results),
and would mean SearchableText could be inserted into a document's HTML
metadata headers (to help search-engine optimisation), without the risk of
breaking things by including HTML tags!
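
For illustration, here is a minimal sketch of what a 'strip_htmltags' helper
might look like, using only the standard library's html.parser module. The
function name is the hypothetical one from question 1, not an existing CMF
function, and the _TextExtractor class is likewise made up for this example.
It keeps all text between tags, including the contents of script and style
elements, so treat it as a starting point rather than a finished solution.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text between tags, discards the tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text found outside of a tag.
        self.chunks.append(data)


def strip_htmltags(html):
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Join with spaces so text from adjacent elements does not run together,
    # then collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parser.chunks).split())
```

Calling strip_htmltags('<p>Hello <b>world</b></p>') gives 'Hello world'. One
trade-off of joining with spaces: a word that is split by inline markup
partway through comes back as two words.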

2. On a related note, I noticed the 'getMetaDataHeaders' method in
DublinCore, which would seem to be ideal for this - just append
SearchableText to the 'Description'. Would this break anything else? Is
there a way I can patch this change in my copy, without it being overwritten
when I upgrade? (Sorry for the newbie question, this is probably covered in
documentation somewhere, but I couldn't find it!)
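
One general Python technique that might fit here is a runtime patch
("monkey patch") kept in a module of your own, so that upgrading the original
package does not touch it. The sketch below uses a stand-in class rather than
the real DublinCore code, and it assumes that getMetaDataHeaders returns a
list of (name, value) pairs; both the stand-in and that return shape are
assumptions, so check the actual source before relying on this.

```python
class DublinCoreStandIn:
    """Stand-in for illustration only; this is NOT the real CMF class."""

    def __init__(self, title, description, text):
        self.title = title
        self.description = description
        self.text = text

    def SearchableText(self):
        return "%s %s %s" % (self.title, self.description, self.text)

    def getMetaDataHeaders(self):
        # Assumed shape: a list of (header name, header value) pairs.
        return [("Title", self.title), ("Description", self.description)]


def patch_metadata_headers(cls):
    """Wrap cls.getMetaDataHeaders so 'Description' also carries SearchableText.

    Returns the original method so the patch can be undone.
    """
    original = cls.getMetaDataHeaders

    def getMetaDataHeaders(self):
        headers = []
        for name, value in original(self):
            if name == "Description":
                value = "%s %s" % (value, self.SearchableText())
            headers.append((name, value))
        return headers

    cls.getMetaDataHeaders = getMetaDataHeaders
    return original
```

Calling patch_metadata_headers(DublinCoreStandIn) once at start-up changes
the behaviour for every instance of the class. The patch may still need
revisiting after an upgrade if the patched method itself changes.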

3. Also in DublinCore, there is a 'Contributors' property. This would seem
very useful for CompositeContent objects made up of docs contributed by
several different people - the Creator would be the editor/reviewer
responsible for the CompositeContent object, the Contributors would be a
list of the individual authors of the documents. But next to it there is a
comment saying "# XXX: Fixme!". Is it safe to use this property?

4. I'm starting to wrap my head round the CompositeContent issue, does
anybody have any code they wouldn't mind "sharing with the group" to get me
started? Or pointers to existing code that does similar things? Is there a
SIG (or Zope equivalent) that's working on this?

Cheers, Jon