[Zope] Zope + htdig indexing problem

Paul Erickson erickson@kaivo.com
Wed, 17 Jul 2002 15:09:34 -0600


The best bet is to stop using relative URLs in Zope,

i.e. use
<a href="&dtml.url-yourobject;">link</a>
instead of
<a href="yourobject">link</a>

but if that causes a lot of pain, here are a couple of other ideas:

Try putting a rule in your robots.txt file such as:

User-agent: *
Disallow: /org/org
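
As a rough check of what that rule would and would not block, here is a small sketch using Python's standard urllib.robotparser. The agent name "htdig" and the two test URLs are assumptions made up for illustration, and htdig's own matching may differ in detail from Python's prefix matching:

```python
from urllib.robotparser import RobotFileParser

# Parse the same two-line rule shown above, without fetching anything.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /org/org",
])

# Hypothetical URLs: one produced by the loop, one legitimate page.
looping = "http://dev.website.com/org/org/org/core/index_a.html"
normal = "http://dev.website.com/org/core/index_a.html"

blocked = not rules.can_fetch("htdig", looping)  # path starts with /org/org
allowed = rules.can_fetch("htdig", normal)       # path starts with /org/core

print(blocked, allowed)
```

Note that the rule is a plain path prefix, so it would also block any real object whose path happens to start with /org/org.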

Or, try setting the max_hop_count attribute in the htdig.conf file. You'd
still get some repeats, but the crawl would stop after a fixed number of
hops from the start URL. This is only reliable for complete indexes
rather than updates.

-Paul

Dieter Maurer wrote:

>Tiffany Webb writes:
> >    We are having a problem with htdig indexing Zope documents with 
> > multiple directory listings from htdig -i -vvv:
> >
> >href:http://dev.website.com/org/org/org/org/org/org/org/org/org/core/index_a.html 
>
>Looks like a non-trivial relative URL reference. A relative URL
>reference is non-trivial when it contains a "/" which is not preceded
>by "..".
>
>Due to acquisition, Zope resolves such URL references quite well.
>But, when you have a reference cycle containing one (or more)
>non-trivial URL references, then the URLs get longer and longer
>for each round through the circle. Humans finally stop
>turning around the circle, but spiders may be stupid...
>
>
>Dieter
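
To make the growth Dieter describes concrete, here is a minimal sketch using Python's standard urllib.parse.urljoin. The starting page and the href are hypothetical; the sketch assumes a page under /org that links to itself with the non-trivial relative reference "org/index.html", which Zope would still resolve through acquisition:

```python
from urllib.parse import urljoin

# Hypothetical starting page and a non-trivial relative link found on it.
url = "http://dev.website.com/org/index.html"
href = "org/index.html"

# Follow the same link three times, as a spider would on each round
# through the cycle, resolving it against the current URL each time.
seen = []
for _ in range(3):
    url = urljoin(url, href)
    seen.append(url)

for u in seen:
    print(u)
```

Each round adds one more "/org" segment, which is the pattern in the htdig output quoted above. An absolute URL (for example one produced by &dtml.url-yourobject;) resolves to the same address no matter which page it appears on, so the cycle does not grow.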