[Zope] finding the content_type of a subclass (Python newbie)

Giuseppe Bonelli g.bonelli@pn.itnet.it
Mon, 20 Aug 2001 22:50:32 +0200


Hi all,

I made an archive of PDF files searchable through the Catalog, but I
have a small glitch I cannot resolve myself.

The PDFs are stored in the file system using the ExtFile product, and I
am using pdftotext and ExtDocument to get them indexed by
PrincipiaSearchSource.

The code I use (or, better to say, borrowed ...) is the following:

In ExtDocument.py
	[...]
	class ExtDocument(ExtFile):
	[...]

	def PrincipiaSearchSource(self):
		"""Convert data to raw text (don't bother formatting)"""
		filename = self._get_filename(self.filename)

		if self.content_type == 'application/pdf':
			return popen('pdftotext -raw %s -' % filename).read()
		else:
			return 'abracadabra'

In ExtFile.py
	[...]
	class ExtFile(CatalogAware, SimpleItem, PropertyManager):
	[...]

	def _get_filename
	[...]

The problem is that any new instance of ExtDocument gets indexed in
PrincipiaSearchSource as 'abracadabra', meaning that it is not
recognized as 'application/pdf' at that point; but when I update the
Catalog, the ExtDocument gets indexed correctly!
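
To make the symptom concrete, here is a toy sketch of what I suspect is
happening. This is only a guess that I have not verified in the ExtFile
source: the object might be cataloged while it is being created, before
content_type has been set from the upload. ToyCatalog, ToyDocument and
upload are made-up names for illustration, and nothing here uses Zope.

```python
# Toy model of the symptom; all class and method names are hypothetical.

class ToyCatalog:
    """Stores the search text captured at the moment an object is indexed."""

    def __init__(self):
        self.indexed = {}

    def catalog_object(self, obj, uid):
        # The text is computed once, at indexing time, and then stored.
        self.indexed[uid] = obj.PrincipiaSearchSource()


class ToyDocument:
    def __init__(self, uid, catalog):
        self.content_type = ''             # not known yet
        catalog.catalog_object(self, uid)  # indexed too early (assumed cause)

    def upload(self, content_type):
        # The type is set only after the first indexing has happened.
        self.content_type = content_type

    def PrincipiaSearchSource(self):
        if self.content_type == 'application/pdf':
            return 'text extracted from the pdf'
        return 'abracadabra'


catalog = ToyCatalog()
doc = ToyDocument('doc1', catalog)
doc.upload('application/pdf')
first_pass = catalog.indexed['doc1']   # captured before content_type was set

catalog.catalog_object(doc, 'doc1')    # what a Catalog update would redo
second_pass = catalog.indexed['doc1']
```

In this toy, first_pass holds the fallback text and second_pass holds the
PDF text, which matches what I see; whether ExtFile really behaves this
way is exactly what I am unsure about.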

Does anyone have a clue?

TIA,

--peppo