[Zope] Network Appliances NetCache vs. Squid

Marc Burgauer marc@sharedbase.com
Wed, 30 Apr 2003 12:36:16 +0100


Hi Scott, thanks for the reply!

This is going to be a bit lengthy, as I have tried to describe the problem
at hand in full.

Interestingly enough, I started out asking this list how best to use the
cache mechanisms built into Zope. There are two bottlenecks in the site.

1.) Images

We are only responsible for programming and hosting of the site. The
customer has chosen to give the HTML coding part of the work to a design
agency.

All pages, images and data are stored in the ZODB!

We trained the HTML coders two years ago and wrote some tip sheets
explicitly telling them to use absolute references for images from the
root-level images directory.
However, in a recent redesign of the site they ignored this advice and
all images are now referenced relatively. In any bog-standard web server
environment those paths wouldn't work, as they are simply wrong. But
thanks to Zope's acquisition, the lookup just climbs up the folder tree and
ultimately finds the images. So the coders never spotted their mistake.

Example:

The images are all stored in a root-level image directory or subdirectories
of it:

/images/navigation/home.gif
/images/mainlogo.gif
/images/navigation/news/nextitem.gif

The site contains 45 "microsites" with an identical page structure: they
all have the same pages, but different data. So each microsite has a news
page. On such a news page all three images above are required. The HTML code
for these images reads:

<img src=images/navigation/home.gif ...> or
<img src=images/mainlogo.gif ... > or
<img src=images/navigation/news/nextitem.gif>

The microsites are all root level directories. The news pages are in
subdirectories of each microsite. The templates use "includes" for most of
the page components and these includes are in the root. For example:

newslisting.dtml
/newyork/news/list
/london/news/list
/paris/news/list
/munich/news/list

A browser or proxy trying to cache things will resolve each of these image
references relative to the news directory of the page being viewed, so it
cannot tell that they all point to the same image. Caching will therefore
only work for pages within the same subdirectory, not across all pages on
the site, even though they all use the same logo.
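
The effect can be reproduced with standard URL resolution. The sketch below
is only an illustration: the host name is a placeholder, and the page paths
are the four example paths listed above. It resolves the same logo reference
from each news page, once in the relative form used by the coders and once
in the absolute form recommended in the tip sheets.

```python
from urllib.parse import urljoin

# Placeholder host; the real site name is not given in the post.
SITE = "http://www.example.com"

# The four example news pages listed above.
PAGES = [
    "/newyork/news/list",
    "/london/news/list",
    "/paris/news/list",
    "/munich/news/list",
]


def resolved_urls(src):
    """Return the set of URLs a browser would request for `src`
    when it appears on each of the example pages."""
    return {urljoin(SITE + page, src) for page in PAGES}


relative = resolved_urls("images/mainlogo.gif")   # as written by the coders
absolute = resolved_urls("/images/mainlogo.gif")  # as advised in the tip sheets

for url in sorted(relative):
    print(url)
print(len(relative), "distinct URLs for the relative reference")
print(len(absolute), "distinct URL for the absolute reference")
```

Each distinct URL is a separate cache entry for a browser or proxy, so the
relative form makes the same logo look like a different object in every
microsite.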

Obviously, I could go and edit the 100+ include files and fix the incorrect
references. However, that would only fix about half of the image-caching
problem, as some images are stored in image folders within each microsite.
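
If the include files were exported as text, the edit itself could be
scripted rather than done by hand. The following is a minimal sketch under
stated assumptions, not a tested migration tool: the function name is
hypothetical, it only touches `src` values that begin with `images/`, and it
assumes every such reference is meant to point at the root-level folder. It
would therefore be wrong for includes that rely on an image folder inside a
microsite, which is exactly the half of the problem it does not solve.

```python
import re

# Match <img ... src=images/...>, with or without quotes around the value
# (the HTML in question uses unquoted attribute values).
SRC_RE = re.compile(
    r"""(<img\b[^>]*?\bsrc\s*=\s*)(["']?)(images/)""",
    re.IGNORECASE,
)


def make_image_refs_absolute(html):
    """Prefix relative 'images/...' references with '/' so that they
    point at the root-level images folder."""
    return SRC_RE.sub(r"\1\2/\3", html)


before = "<img src=images/navigation/home.gif>"
after = make_image_refs_absolute(before)
print(after)
```

References that are already absolute are left unchanged, so the function can
be run over the same text more than once.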

This task is labour-intensive, so I am looking for a quicker fix. Also,
the person responsible for the site at the customer has changed, and the
new guy wants to change the site's look and feel in around 6 months (when he
has new budget) to be less image-rich. At the moment each page contains at
least 30 images, albeit most of them small. As they are mainly text links
rendered as images in a not-so-common font (Stone), the customer and I
agree that making them real text links in the next revamp will improve
usability and speed up page load time.

Is Zope's RAM Cache Manager going to help deliver the images here? Most of
them are under 10 KB in size. The time saved would be that of fetching
them from the ZODB on disk.

2.) Zope responding to requests

Zope responds slowly to any initial request. You can see the "connecting to
site" message in the browser's status bar for about 15-20 seconds before
anything happens; at peak usage time this can go up to nearly a minute. The
site is "busy" from around 6am to 1am, with peak usage between 4pm and 7pm.

I believe that the issue here is how many requests can be served in
parallel. The box itself could serve more: neither CPU, RAM nor network is
maxed out. CPU load is usually around 30% even at peak usage. So I was
wondering if running ZEO and 2 instances of Zope pointing at the same ZODB
on the same machine would improve things.
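
A back-of-the-envelope calculation shows why the number of parallel requests
could matter with this many images per page. Only the figure of 30 images
per page comes from the description above; the worker thread count and the
time per request are hypothetical values chosen for illustration and would
need to be replaced with measured ones.

```python
def max_pages_per_second(worker_threads, requests_per_page, seconds_per_request):
    """Upper bound on page views per second when every request ties up
    one worker thread for `seconds_per_request`."""
    requests_per_second = worker_threads / seconds_per_request
    return requests_per_second / requests_per_page


REQUESTS_PER_PAGE = 1 + 30   # one HTML page plus 30 images (from the post)
SECONDS_PER_REQUEST = 0.05   # hypothetical average, not measured
THREADS_PER_INSTANCE = 4     # hypothetical; check the actual start-up settings

one_instance = max_pages_per_second(
    THREADS_PER_INSTANCE, REQUESTS_PER_PAGE, SECONDS_PER_REQUEST)
two_instances = max_pages_per_second(
    2 * THREADS_PER_INSTANCE, REQUESTS_PER_PAGE, SECONDS_PER_REQUEST)

print("one instance:  %.2f pages/s" % one_instance)
print("two instances: %.2f pages/s" % two_instances)
```

Under these assumptions the ceiling scales with the number of worker
threads, and it also rises if images are answered from a cache and never
reach a Zope thread at all. Whether the real site behaves this way depends
on where the time per request is actually spent.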

Content changes frequently overall, but many pages don't change for weeks.
Most pages are truly dynamic (i.e. content managed). Packing the database
every night has not improved performance. And because of the truly dynamic
nature of the site, I wonder how much a cache like Squid would improve it.
(I have never done any caching, although I have recently read a lot about
it.)

We're using Zope 2.3 on a Solaris 8 server (Compaq, 2 P2 450MHz CPUs, RAID
disk array, 1 GB of RAM). The box and OS were chosen by the customer
and there's no room for manoeuvre currently. (The OS is company policy and
there is no money for another box.)

We have developed a few "lightweight, very easy to use" content management
products for Zope and have run into a few problems when testing them against
Zope 2.5/2.6, hence we are sticking with 2.3 for production sites until all
bugs are fixed.


Cheers

Marc



> Date: Mon, 28 Apr 2003 10:41:36 -0700
> From: Scott Burton <scott@posplugin.com>
> Subject: Re: [Zope] Network Appliances NetCache vs. Squid
> To: zope@zope.org
> 
> Why do you really need to accelerate the site?
> If you have a highly dynamic site, no acceleration appliances will work.
> There may be a few things you can do on your end first to see if there is
> something easy to do to your instance of Zope.
> 
> Have you run profiling to see what is the bottleneck? Are there some
> expensive scripts that may need some reworking for better performance?
> Profiling can help with that. Do you efficiently use Zope's caching
> mechanisms like caching templates and images in ram, or increasing the size
> of the ZODB cache? What OS is running Zope? Could you proxy Apache in front
> of Zope on the same machine using mod_cache to speed up images etc.? Could
> you build a simple 1u Linux box with ZEO and Squid on it as the accelerator
> for less than a netappliance?
> 
> I would look into those things (if you haven't already) before worrying about
> a caching/accelerator.