[Zope] htmllib question

Oleg Broytmann phd@phd.russ.ru
Wed, 1 Dec 1999 11:05:19 +0000 (GMT)


On Tue, 30 Nov 1999, Sam Gendler wrote:
> I thought I had a pretty clean solution for extracting all the contents
> between the <body> </body> tags of an uploaded html file, using the
> htmllib.  Basically, in start_body, I call save_bgn(), and in end_body,
> I call save_end(), which was supposed to save all the contents between
> the two tags.  Unfortunately, it saves only the content that isn't in
> html tags.  All the subsequent tags get dropped.  Does anyone know an
> easy way around this?  The only method that I see is to overload the
> unknown tag functions to put the tags back into a buffer, which is
> WAY more effort than it is worth.

   Look in Zope-2.1.0b2, in the utils directory, at the file load_site.py.
My patch there does exactly this using sgmllib.
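
   For illustration only, here is a minimal sketch of the general idea:
instead of relying on save_bgn()/save_end(), which keep only character
data, re-emit every tag seen while inside <body>. This is not the code from
load_site.py. It assumes a current Python, where htmllib and sgmllib no
longer exist, and so uses html.parser.HTMLParser as a stand-in. The names
BodyExtractor and extract_body are hypothetical.

```python
from html.parser import HTMLParser


class BodyExtractor(HTMLParser):
    """Collect the markup found between <body> and </body> (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False makes the parser report entity and character
        # references separately, so they can be copied through unchanged.
        super().__init__(convert_charrefs=False)
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            # get_starttag_text() returns the start tag as written in the source.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, as written.
        if self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.in_body:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.in_body:
            self.parts.append("&#%s;" % name)

    def handle_comment(self, data):
        if self.in_body:
            self.parts.append("<!--%s-->" % data)


def extract_body(html):
    """Return everything between the <body> tags, with inner tags preserved."""
    parser = BodyExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

   Calling extract_body() on the text of the uploaded file returns the body
contents as one string. End tags are rebuilt from the tag name, so their
original case and spacing are not preserved.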

Oleg.
---- 
    Oleg Broytmann      Foundation for Effective Policies      phd@phd.russ.ru
           Programmers don't die, they just GOSUB without RETURN.