[Zope] Patch to check all pages with html-tidy

Andy McKay andy@agmweb.ca
Wed, 09 Apr 2003 11:33:15 +0100


I'm not sure I'd want to do this for every request, but maybe we could 
add something like this to ZChecker, which finds bugs and issues in ZPT, 
DTML, etc., including running ZPT through html-tidy?

http://www.zope.org/Members/andym/ZChecker
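
A minimal sketch of what such a standalone check might look like, outside
the request path. It assumes Python 3 and a `tidy` executable on PATH; the
names check_html and filter_tidy_output are hypothetical and are not part
of ZChecker or Zope. The ignore list and the "-q -errors" flags are taken
from the patch quoted below. Unlike the patch, it feeds the markup to tidy
on stdin rather than through a temporary file, and it drops empty lines.

```python
import subprocess

# Substrings of tidy messages to suppress (copied from the quoted patch).
IGNORE = (
    'Warning: <table> lacks "summary" attribute',
    "Can't open",
    "Warning: <nobr> is not approved by W3C",
    "Warning: inserting missing 'title' element",
)


def filter_tidy_output(lines, ignore=IGNORE):
    """Return stripped, non-empty lines that contain none of the ignore substrings."""
    kept = []
    for line in lines:
        line = line.strip()
        if line and not any(ign in line for ign in ignore):
            kept.append(line)
    return kept


def check_html(html, tidy="tidy"):
    """Run tidy on an HTML string and return the messages worth reporting.

    Assumption: `tidy` is installed and reads the document from stdin
    when no file name is given.
    """
    proc = subprocess.run(
        [tidy, "-q", "-errors"],
        input=html,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # tidy writes its diagnostics to stderr
        universal_newlines=True,
    )
    return filter_tidy_output(proc.stdout.splitlines())
```

Keeping the filter separate from the subprocess call means the ignore list
can be tested without tidy installed, and a checker could call
check_html(rendered_page) once per template instead of on every response.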

Thomas Guettler wrote:
> Hi!
> 
> Maybe someone is interested in this patch:
> 
> If you have html-tidy[1] installed, you can apply this patch to
> lib/python/ZPublisher/HTTPResponse.py to scan every HTML page with
> html-tidy.
> 
> Warnings from html-tidy will be displayed in the debug logs.
> 
> [1]: http://tidy.sourceforge.net/
> 
> You apply this patch like this:
> 
> cd zope/lib/python/ZPublisher
> cat html-tidy-patch.txt | patch
> 
>  thomas
> 
> 
> 
> ------------------------------------------------------------------------
> 
> --- HTTPResponse.py.orig	Wed Apr  9 08:37:36 2003
> +++ HTTPResponse.py	Wed Apr  9 08:26:52 2003
> @@ -176,6 +176,46 @@
>          self.stdout = stdout
>          self.stderr = stderr
>  
> +    def html_tidy(self):
> +        """
> +        Small hack to call html-tidy for every html
> +        page which is served by Zope.
> +        Call it from lib/python/ZPublisher/HTTPResponse.setBody()
> +        after self.body is set
> +        
> +        if content_type == 'text/html':
> +            self.html_tidy()
> +        """
> +        import tempfile
> +        import popen2
> +        ignore=[
> +            'Warning: <table> lacks "summary" attribute',
> +            "Can't open",
> +            "Warning: <nobr> is not approved by W3C",
> +            "Warning: inserting missing 'title' element"]
> +        htmlfile=tempfile.mktemp()
> +        fd=open(htmlfile, "wt")
> +        fd.write(self.body)
> +        fd.close()
> +        stdout, stdin = popen2.popen4("tidy -q -errors %s" % htmlfile)
> +        out=stdout.readlines()
> +        os.unlink(htmlfile)
> +        for line in out:
> +            line=line.strip()
> +            cont=0
> +            for ign in ignore:
> +                if line.find(ign)!=-1:
> +                    cont=1
> +                    break
> +            if cont:
> +                continue
> +            base="unknown base"
> +            if hasattr(self, "base"):
> +                base=self.base
> +            print "HTML-Tidy: %s %s" % (
> +                base, line)
> +        
> +
>      def retry(self):
>          """Return a response object to be used in a retry attempt
>          """
> @@ -329,6 +369,8 @@
>              body = '&gt;'.join(body.split('\233'))
>  
>          self.setHeader('content-length', len(self.body))
> +        if content_type == 'text/html':
> +            self.html_tidy()
>          self.insertBase()
>          if self.use_HTTP_content_compression and \
>              not self.headers.get('content-encoding',None):


-- 
   Andy McKay