[Zope-dev] ZCatalog and cataloguing remote sites

Agata Cruickshank a.cruickshank@ccs.bbk.ac.uk
Mon, 9 Sep 2002 15:13:07 +0100


Has anyone successfully managed to use ZCatalog for cataloguing external
sites?
I followed the example from Spicklemire's book (Zope - Web Application
Development and Content Management, chapter 9), but didn't manage to get
it working.
The basic idea is that you have a TinyTablePlus called CatalogedURLs
storing the URLs to be catalogued (with one column of data called url).
You also create a ZCatalog called HTMLCatalog with an index called
contents, and the title and source fields in the metadata. Then you have
an external method file called HTMLForCatalog.py, which creates a dummy
object with title and content.

HTMLForCatalog.py:
-------------------------------
from ZPublisher import Client
import string

class dummyObject :
	pass

def createTitle(url,data) :
	aTitle = string.split(url,'/')[-1]
	firstIndex = string.find(string.upper(data), '<TITLE>')
	if (firstIndex <> -1):
		secondIndex = string.find(string.upper(data), '<TITLE>')
		if (secondIndex <> -1) :
			aTitle = data[firstIndex+7:secondIndex]
	return aTitle

def getHTMLForCatalog(self, url):
	x = dummyObject()
	theFile = Client.call(url)
	theData = theFile[1]
	x.title = createTitle(url, theData)
	x.content = theData
	aSource = string.split(url,'/')[-1]
	x.source = string.join(aSource, '/')
	return x
-----------------------------------------------
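
For comparison, here is a minimal standalone sketch of what I understand
the two string-handling steps are meant to do: take the title from between
the opening and closing TITLE tags (falling back to the last URL segment),
and build the source from the URL. It is written in current Python syntax
rather than the Python that ships with Zope. The names extract_title and
source_of are made up for this illustration, and treating "source" as the
URL minus its last path segment is my assumption, not something confirmed
by the book.

```python
import re

# Case-insensitive match of whatever sits between <title> and </title>.
TITLE_RE = re.compile(r'<title>(.*?)</title>', re.IGNORECASE | re.DOTALL)


def extract_title(url, data):
    """Return the page title, or the last URL segment if no title is found."""
    match = TITLE_RE.search(data)
    if match:
        return match.group(1).strip()
    return url.split('/')[-1]


def source_of(url):
    """Assumption: 'source' is the URL with its final path segment removed."""
    return '/'.join(url.split('/')[:-1])
```

With a page body of '<html><TITLE>Hello</TITLE></html>', extract_title
returns 'Hello', and source_of('http://example.org/a/b.html') returns
'http://example.org/a'.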
You connect the external method to the Zope site by creating a
GetHTMLForCatalog External Method in the ZMI.
You also have a DTML method called addFilesToCatalog, which calls the
external method.

addFilesToCatalog :
--------------------------------------------
<dtml-var standard_html_header>
<dtml-in CataloguedURLs>
   <dtml-let newObject="GetHTMLForCatalog(anUrl)">
     <dtml-call "HTMLCatalog.catalog_object(newObject,url)">
   </dtml-let>
</dtml-in>
<p>
<dtml-in "HTMLCatalog()">whatever
  <a href='<dtml-var "getPath()">'>
<dtml-var "getPath()">
<dtml-var title></a><br>
</dtml-in>
<dtml-var standard_html_footer>
---------------------------------
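
The cataloguing loop in that method can be sketched in plain Python, which
makes it easier to see which name each step uses. This is only an
illustration under stated assumptions: FakeCatalog is a made-up stand-in
that just records catalog_object(object, uid) calls, fetch is a
hypothetical replacement for the external method that returns canned
content, and each row is a dictionary with the single column url. Nothing
here touches a real ZCatalog or the network.

```python
class FakeCatalog:
    """Hypothetical stand-in for HTMLCatalog: stores objects keyed by uid."""

    def __init__(self):
        self.entries = {}

    def catalog_object(self, obj, uid):
        self.entries[uid] = obj


class DummyObject:
    """Plain attribute holder, like dummyObject in the external method."""
    pass


def fetch(url):
    """Hypothetical stand-in for GetHTMLForCatalog; no network access."""
    obj = DummyObject()
    obj.title = url.split('/')[-1]
    obj.content = '<html>page at %s</html>' % url
    return obj


def add_rows_to_catalog(rows, catalog):
    # Each row mimics one record of the URL table, whose only column is 'url'.
    for row in rows:
        url = row['url']
        # The same value serves as the fetch target and as the catalog uid.
        catalog.catalog_object(fetch(url), url)
    return catalog


rows = [{'url': 'http://example.org/index.html'},
        {'url': 'http://example.org/about.html'}]
catalog = add_rows_to_catalog(rows, FakeCatalog())
```

After this runs, catalog.entries holds one object per URL, each keyed by
the value of the url column.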

This doesn't work because line 3 passes 'anUrl', which is not defined.
If I change this to 'url', the URLs from the CatalogedURLs table are
inserted into the catalog, but the method fails to retrieve the title or
content. Any suggestions as to how to fix it, or where I'm going wrong,
are welcome. As, in fact, are any suggestions on how I can implement a
search engine across Zope and non-Zope sites.

Many thanks,
Agata