[ZWeb] Re: zope.org - serious caching issues

Jim Fulton jim at zope.com
Mon May 21 10:09:42 EDT 2007


I'm adding zope-web to the CC list.


On May 20, 2007, at 1:42 PM, Mark W. Alexander wrote:

> On Saturday 19 May 2007, Michael Haubenwallner wrote:

Michael, I'm sorry I dropped the ball on this.  I said I'd look into  
it and got distracted.

>> Mark W. Alexander schrieb:
>>> On Friday 18 May 2007, you wrote:
>>>> Hi, i experience problems with the news listing and rss feed on  
>>>> zope.org
>>>> : http://www.zope.org (right column)
>>>> http://www.zope.org/News/ (News listing)
>>>> http://www.zope.org/news.rss (news rss feed)
>>>> http://www.zope.org/products.rss (products rss feed)
>>>>
>>>> The pages / feeds are different for authenticated and anonymous  
>>>> users.
>>>> Refreshing (even forced) does not produce a correct page, same  
>>>> results
>>>> with non-browser based retrieval (wget, urllib).
>>>>
>>>> Adding a querystring to the URL returns updated data - but that for
>>>> logged in Users only.
>>>
>>> What do you mean by "different" for authenticated and anonymous  
>>> users? It
>>> looks the same to me both ways.  Pages will cache for 15 minutes
>>> _per_cache_ so when you are making many changes you'll see  
>>> differences
>>> depending on which cache you hit.
>>>
>>> Any query string will bust the cache once, but only once, as the
>>> url?string will produce a new, unique cache url.
>>>
>>> You can  use wget's -S option to see the X-Cache headers for the  
>>> caches
>>> the request is using as well as the Age (in seconds) of the cached
>>> object. That information may help your analysis.
>>>
>>> Mark
>>
>> Checking again this morning i see no difference - there is still (2
>> month) old data showing on frontpage and in the rss feeds ...
>>
>> I've looked into the scripts that compute the data
>>
>> /zopeorg/news.rss
>> /zopeorg/products.rss
>> /zopeorg/latestContentBySubject
>>
>> In ZMI all three objects are cached by an 'Accelerated HTTP Cache  
>> Manager'.
>>
>> I subsequently removed the 'Five minutes' cache from the objects and
>> checked that stats page for several minutes (see below)

I wish you hadn't done that yet.  If we keep changing things, it will
be hard to figure this out.

It would be helpful to show the results of, say, wget -S, as in:

jim at ds9:~/tmp$ wget -S http://www.zope.org/news.rss
--09:58:05--  http://www.zope.org/news.rss
            => `news.rss'
Resolving www.zope.org... 63.240.213.171
Connecting to www.zope.org|63.240.213.171|:80... connected.
HTTP request sent, awaiting response...
   HTTP/1.0 200 OK
   Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
   Date: Mon, 21 May 2007 13:57:28 GMT
   Content-Length: 4011
   Content-Type: text/xml
   Age: 4
   X-Cache: HIT from parent-ng2.zmh.zope.net
   X-Cache: MISS from cache2.zmh.zope.net
   Connection: close
Length: 4,011 (3.9K) [text/xml]

100%[====================================>] 4,011         --.--K/s

09:58:05 (365.80 KB/s) - `news.rss' saved [4011/4011]

jim at ds9:~/tmp$ wget --user jim --password xxxxxx -S http://www.zope.org/news.rss
--10:00:45--  http://www.zope.org/news.rss
            => `news.rss.1'
Resolving www.zope.org... 63.240.213.171
Connecting to www.zope.org|63.240.213.171|:80... connected.
HTTP request sent, awaiting response...
   HTTP/1.0 200 OK
   Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
   Date: Mon, 21 May 2007 13:45:20 GMT
   Content-Length: 4011
   Content-Type: text/xml
   X-Cache: HIT from parent-ng2.zmh.zope.net
   Age: 892
   X-Cache: HIT from cache4.zmh.zope.net
   Connection: close
Length: 4,011 (3.9K) [text/xml]

100%[====================================>] 4,011         --.--K/s

10:00:45 (1.85 MB/s) - `news.rss.1' saved [4011/4011]

Note that the second request is authenticated (except with a  
different password :)
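
For anyone comparing several such runs, here is a rough sketch of pulling the Age and X-Cache lines out of wget -S output so the results can be lined up side by side. The function name is made up for illustration, and the sample text is just a subset of the headers from the second request above.

```python
import re

def summarize_cache_headers(wget_output):
    """Collect the Age and X-Cache headers from `wget -S` style output.

    Hypothetical helper, not part of wget or Zope.
    """
    age = None
    x_cache = []  # (status, cache host), in the order the headers appear
    for line in wget_output.splitlines():
        line = line.strip()
        match = re.match(r"X-Cache:\s+(HIT|MISS)\s+from\s+(\S+)", line)
        if match:
            x_cache.append((match.group(1), match.group(2)))
        elif line.startswith("Age:"):
            # Age is the number of seconds the object has been in the cache.
            age = int(line.split(":", 1)[1])
    return {"age": age, "x_cache": x_cache}

# Subset of the headers from the authenticated request above.
sample = """\
  HTTP/1.0 200 OK
  Content-Type: text/xml
  X-Cache: HIT from parent-ng2.zmh.zope.net
  Age: 892
  X-Cache: HIT from cache4.zmh.zope.net
"""

summary = summarize_cache_headers(sample)
print(summary)
```

An Age of 892 seconds is just under the 15 minute per-cache lifetime Mark mentioned, so the age alone looks plausible; it is the content that is stale.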

...

>> The objects do not display any caching policy in ZMI, but the cache
>> manager still shows the entries in different variations.

Possibly because it doesn't know about the change.


> It looks like an issue in Zope.

How so?  If you look at the wget output above, there don't seem to be  
any cache headers set.  So, data would not be cached unless there is  
an overriding policy in squid.
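
To make that check repeatable, something like the following could be run against each response. It is only a sketch: which headers count as "cache headers" (Cache-Control, Expires, Pragma here) is my assumption, the function name is invented, and the header list is copied from the first wget run above.

```python
# Assumption: these are the headers by which the app server would
# state an explicit caching policy.
EXPLICIT_CACHE_HEADERS = ("cache-control", "expires", "pragma")

def explicit_cache_headers(headers):
    """Return the (name, value) pairs that state a caching policy.

    `headers` is a list of (name, value) tuples; header names are
    compared case-insensitively.
    """
    return [
        (name, value)
        for name, value in headers
        if name.lower() in EXPLICIT_CACHE_HEADERS
    ]

# Headers from the anonymous request shown earlier.
response_headers = [
    ("Server", "Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1"),
    ("Date", "Mon, 21 May 2007 13:57:28 GMT"),
    ("Content-Length", "4011"),
    ("Content-Type", "text/xml"),
    ("Age", "4"),
    ("X-Cache", "HIT from parent-ng2.zmh.zope.net"),
    ("X-Cache", "MISS from cache2.zmh.zope.net"),
    ("Connection", "close"),
]

found = explicit_cache_headers(response_headers)
print(found or "no explicit cache headers in this response")
```

Age and X-Cache are added by the caches themselves, so they are deliberately not counted as a policy coming from Zope.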

> If you see both the child and the parent MISS,
> then what you're getting is coming from the app server.

I'm getting a hit from the parent.  Also note that both hits give me
results for which the most recent entry is from March 29.  If I bust
the cache with a query string, the most recent entry is from May 15.
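
For reference, busting the cache this way can be scripted roughly as below. This is a sketch only: the "nocache" parameter name is arbitrary (any query string should do, per Mark's note), and since each url?string is itself cached once fetched, the token has to change on every request.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def bust_cache(url, token=None):
    """Append a throwaway query parameter so the URL misses the cache.

    The parameter name "nocache" is hypothetical; the caches only care
    that the full URL is one they have not stored yet.
    """
    if token is None:
        # Changes once per second; use a finer-grained token if you
        # need several distinct URLs within the same second.
        token = str(int(time.time()))
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("nocache", token))
    return urlunsplit(parts._replace(query=urlencode(query)))

busted = bust_cache("http://www.zope.org/news.rss", token="1")
print(busted)
```

Fetching the resulting URL with wget -S should then show MISS on both X-Cache lines the first time through.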


> That would also
> explain differences based on roles. There is nothing in squid that
> distinguishes if a user is authenticated, anonymous or manager.

I *think* Andrew Sawyers did something to arrange that non-anonymous  
users get non-cached results.  This doesn't seem to be working any  
more. This is bad. I'm hoping that whoever got this working properly
at some point can tell us what they did. :)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




