[Zope-dev] DTML and REQUEST data changes about to be checked in

Martijn Pieters mj@zope.com
Fri, 2 Aug 2002 12:33:15 -0400


On Fri, Aug 02, 2002 at 08:55:13AM -0700, Andy McKay wrote:
> Likewise I'm trying to digest all that and I'm a little surprised. More magic
> in DTML? Not something I'd vote for normally.
> 
> I'm a little confused why this is suddenly an issue. Yeah, so we pull a string
> out of the REQUEST and thanks to the DTML stack we may not know where it came
> from. Well, that's always been there. And yeah, the string may contain nasty
> HTML. Again, that's always been there.
> In the past (and I can't find posts to show it) the party line was Zope is an
> application server and it's up to the person developing the application to
> worry about it. That's why ChrisW wrote stripogram and I use it in quite a
> few apps.

Yup. And that is still the case. However, the combination of implicit REQUEST
form interpolation and no HTML quoting turns out to be especially dangerous,
because of those situations where you *want* no HTML quoting for optional
information that normally should *not* come from the REQUEST.

An example is the Zope help system; there are API help pages that have
optional information, which when present is already HTML. But when it is not
present in the object hierarchy and *is* available in the REQUEST, the
REQUEST data is used instead. The way standard_error_message deals with
exceptions is another such situation. The DTML author didn't expect the
particular template slot to be filled with REQUEST data, the slot is
optional, and the author has no way of preventing REQUEST data from being
used.

The solution we chose fixes that problem, for all existing DTML as well as
future DTML. Note that ZPT does not have this problem, as it quotes by
default and doesn't use implicit namespaces.
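
To make the failure mode concrete, here is a minimal sketch in plain Python.
It is not Zope's implementation: the names Tainted, lookup and render_var
are hypothetical, and it assumes that every value found through the implicit
REQUEST fallback is marked as tainted and then HTML-quoted when rendered.

```python
from html import escape


class Tainted(str):
    """Hypothetical marker type for strings that came from the REQUEST."""


def lookup(name, namespaces, request_form):
    """Search the namespace stack first, then fall back to the REQUEST."""
    for namespace in namespaces:
        if name in namespace:
            return namespace[name]
    if name in request_form:
        # Implicit fallback: the template author never asked for REQUEST
        # data, so the value is marked as tainted (an assumption of this sketch).
        return Tainted(request_form[name])
    raise KeyError(name)


def render_var(name, namespaces, request_form, html_quote=False):
    """Render a slot; tainted values are quoted even if html_quote is off."""
    value = lookup(name, namespaces, request_form)
    if html_quote or isinstance(value, Tainted):
        return escape(str(value))
    return str(value)


request_form = {"extra_help": "<script>alert(1)</script>"}

# The object hierarchy supplies trusted HTML: rendered unquoted, as intended.
trusted = render_var("extra_help", [{"extra_help": "<b>Trusted</b>"}], request_form)

# The slot is missing from the hierarchy: the REQUEST value is used, but quoted.
fallback = render_var("extra_help", [{}], request_form)
```

In this sketch the trusted value passes through untouched, while the same
slot filled from the REQUEST comes out with its markup neutralised.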

> One other question: why does it matter that the string is implicitly called?
> Why don't you taint explicitly called ones too? It makes me think of Perl, where
> taint mode taints anything coming from the user.

Because, as explained above, it's the implicit case that is dangerous. In the
explicit case you are supposed to know you are working with unsafe data and
thus the old rules apply. If we quoted in the explicit case too, we would
hurt everyone who either did the right thing from the start or already knows
they are playing with fire.
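
One way forced quoting in the explicit case could hurt, offered as an
illustration under my own assumptions rather than as a description of Zope
internals, is double escaping. The function names careful_author and
forced_quote below are hypothetical.

```python
from html import escape


def careful_author(value):
    """An author who already quotes explicitly fetched REQUEST data."""
    return escape(value)


def forced_quote(rendered):
    """A hypothetical extra quoting pass applied to the explicit case."""
    return escape(rendered)


user_input = "Tom & Jerry <3"

once = careful_author(user_input)   # quoted once, displays correctly
twice = forced_quote(once)          # quoted again, entities become visible
```

Here once holds correctly quoted output, while twice would show literal
entity text such as "&amp;" to the reader.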

> This still doesn't solve the party line, and means I would like to suggest
> again (and this time I have the time to work on it) that we add something
> like stripogram or similar to the core, so that it is easy for an application
> developer to have access to strip HTML and other functions from products,
> DTML, Python Scripts etc. to easily alter, manage and make HTML safer.

The CMF now includes a basic HTML stripper. In future iterations, Tres
Seaver expects this to evolve into a CMF Tool that is more generally
configurable and usable.

-- 
Martijn Pieters
| Software Engineer  mailto:mj@zope.com
| Zope Corporation   http://www.zope.com/
| Creators of Zope   http://www.zope.org/
---------------------------------------------