[Zope] mysterious dataloss clues?

Martijn Faassen faassen@vet.uu.nl
Fri, 24 Aug 2001 20:23:19 +0200


Hi there,

I'm posting this in the hope someone will be able to help me with
some mysterious data lossage we've been experiencing. I basically am
completely clueless what could be the cause of this, so any hint
would be useful.

Some boring numbers first:

We have an application running on Zope 2.3.0, connecting to Postgresql
7.0 using ZPoPyDA 1.2. The version of PoPy is 2.0.2. This is on Debian
2.2 (with some upgraded packages such as Postgresql).

Now to the meat of the issue:
 
For some reason, at random times, Postgresql seems to magically lose some
records that were already there and should've been committed. This seems
to happen within a few hours of the time the records were first created.
All trace of the records disappears from the database; we can see the
holes in the autoincrement primary key fields (and we never do any delete
ourselves that could cause this).

The lossage seems internally consistent; a bunch of records that's interrelated
all disappears.

The records disappearing are not all part of a single Zope transaction;
instead there are bunch of ZSQL insert operations in a set of pages.
It's just as if there's some kind of long running transaction that
at some point magically expires. Very weird. 

We haven't been able to reproduce the problem; it just seems to happen.
Vaguely associated with it is another, possibly related problem; 
occasionally Zope does not see a record with an id that is definitely
in the database. After a few more tries, it works again. We haven't
been able to confirm a firm temporal relationship between this problem
and the one described, but it's possible one somehow affects the other.

Is it possible that due to some strange interaction postgresql's 
transaction machinery ends up in some flakey state, so that something is
never committed that really should be?

I'm now trying to change all the tires and the engine as well; upgrading
this and that to see if the problem will go away. But some clues in which
direction I should search would be greatly appreciated.

Anyone ever saw something like this? I really wish I could be more 
specific. :)

Regards,

Martijn