Sounds like you need something to distribute and/or balance the load; 3 suggestions:
1 - HTTP Proxy Load Distribution - Squid does this, round-robin, put squid in front of your ZEO farm (this is what I do). I don't think that mod_proxy is capable of doing this.
2 - Layer 4 switch (Cisco LocalDirector, Intel Netstructure, etc) - not cheap, but offers some features like a choice between true balancing, outage detection, and funky routing techniques like out-of-path return for quicker network performance.
3 - LVS - Linux Virtual Server Project, attempts to do in software what #2 does.
Personally, I'm in favor of squid because it is cheap and easy, if all your machines are of equal weight in performance terms. My company uses squid in front of multiple ZEO clients, bypassing Apache altogether for access to Zope; we use pyredir as a redirector to rewrite URLs like you would with mod_rewrite; Squid is very nice as a Zope virtual host front-end, and provides ACLs for security purposes, like blocking out the ZMI from public access. Getting load balancing working in squid is simply a matter of compiling it with a flag to specify you want external DNS resolution support and using squid's built-in name resolving program dnsserver, which will look at /etc/hosts and round-robin among multiple IPs that have the same hostname. One caveat: if you put Squid (or any high-volume TCP app) on Solaris, make sure to tune your TCP connection buffers, so Solaris doesn't choke; this is one (albeit trivial) reason we use Linux 2.4 for our proxies instead of Solaris.
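As a rough illustration of the round-robin idea described above (several IPs sharing one hostname in /etc/hosts, handed out in turn), here is a minimal Python sketch. It is not Squid's dnsserver code; the hostname zope-farm and the 10.0.0.x addresses are made up for the example.

```python
from itertools import cycle

# Hypothetical /etc/hosts content: one name mapped to several ZEO client IPs.
HOSTS_TEXT = """\
# ZEO client farm (addresses invented for this example)
10.0.0.11  zope-farm
10.0.0.12  zope-farm
10.0.0.13  zope-farm
127.0.0.1  localhost
"""


def addresses_for(hostname, hosts_text):
    """Return every IP listed for hostname, in file order."""
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            found.append(ip)
    return found


def round_robin(hostname, hosts_text):
    """Return an endless iterator that hands out the IPs in turn."""
    ips = addresses_for(hostname, hosts_text)
    if not ips:
        raise LookupError("no addresses for %r" % hostname)
    return cycle(ips)


rotation = round_robin("zope-farm", HOSTS_TEXT)
picks = [next(rotation) for _ in range(4)]
print(picks)
```

Every backend gets the same share of requests, which is why this only suits machines of roughly equal performance.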
Since squid only round-robins, it doesn't deal with node reliability problems, which means you will have to rely upon another HA mechanism on the ZEO client nodes themselves, like heartbeat from the Linux-HA project, which at some point will be ported to Solaris, from what I hear.
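To make the reliability gap concrete: plain round-robin keeps sending requests to a dead node unless something removes it from the rotation. The sketch below shows one simple way to filter a backend list with a TCP connect probe. It is an illustration only, not how heartbeat or any L4 switch works, and the addresses and ports are invented.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True


def live_backends(backends, probe=tcp_probe):
    """Keep only the (host, port) pairs that answer the probe."""
    return [(host, port) for (host, port) in backends if probe(host, port)]


# Demonstration with a fake probe so that no network access is needed.
farm = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]  # hypothetical nodes
down = {("10.0.0.12", 8080)}
usable = live_backends(farm, probe=lambda host, port: (host, port) not in down)
print(usable)
```

A connect probe only shows that the port is open; it says nothing about whether Zope behind it is answering requests sensibly.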
If you have a decent budget, but not a lot of time, look at an L4 switch (we use, and like, the Intel Netstructure 7140 for another application); otherwise, if you have budget constraints, consider LVS or Squid, with Squid likely being the easiest path in terms of setup time.
Sean
-----Original Message-----
From: Tony McDonald [mailto:tony.mcdonald@ncl.ac.uk]
Sent: Saturday, August 25, 2001 9:14 AM
To: Zope
Subject: [Zope] Advice needed: load balancing with ZEO and Apache on Solaris.
Hi, I've followed the instructions at http://www.zope.org/Members/dshaw/AdvancedSiteSetup to set up a ZEO server and client (on the same machine). These instructions are very clear and I'd recommend them to people wanting to experiment with ZEO.
We serve our content up through Apache (fast delivery of static content, CGI scripts and PHP4 served too).
I'm using the VHM method of routing Apache requests through to my ZServer installation so (only a snippet shown, and this is an example);
RewriteEngine on
RewriteRule ^/cgi-bin - [L]
RewriteRule ^/static - [L]
RewriteRule ^/ltsn_images - [L]
RewriteRule ^/(.*) http://localhost:18080/VirtualHostBase/http/myserver.ncl.ac.uk:80/VirtualHostRoot/$1 [P]
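The target of the last rule follows the VHM URL layout: backend address, then VirtualHostBase, the public scheme, the public host and port, VirtualHostRoot, and finally the requested path. A small Python sketch of how such a URL is put together (the function name vhm_url and the sample path are hypothetical, chosen for illustration):

```python
def vhm_url(backend, public_host, path, scheme="http", public_port=80):
    """Build the URL that Apache proxies to for one incoming request path."""
    return "%s/VirtualHostBase/%s/%s:%d/VirtualHostRoot/%s" % (
        backend.rstrip("/"),
        scheme,
        public_host,
        public_port,
        path.lstrip("/"),
    )


url = vhm_url("http://localhost:18080", "myserver.ncl.ac.uk", "/news/index_html")
print(url)
```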
Following the instructions from Dave Shaw's page, I have a ZEO server running on port 8080 and a client running on port 8081.
One idea I've had from reading the Apache rewrite guide at http://httpd.apache.org/docs/misc/rewriteguide.html is to use ProxyPassReverse (in some way), i.e.
ProxyPassReverse / http://localhost:8080/
ProxyPassReverse / http://localhost:8081/
My question is this;
What can I do to get requests from myserver.ncl.ac.uk farmed out to my ZEO server-client farm?
Any pointers would be really appreciated. Cheers Tone.
On 25/8/01 8:55 pm, "sean.upton@uniontrib.com" sean.upton@uniontrib.com wrote:
Sounds like you need something to distribute and/or balance the load; 3 suggestions:
1 - HTTP Proxy Load Distribution - Squid does this, round-robin, put squid in front of your ZEO farm (this is what I do). I don't think that mod_proxy is capable of doing this.
Sean, Phil, Many thanks for the help. I couldn't reply earlier due to family matters.
I downloaded squid and spent quite some time with it. Unfortunately, it seems way too complex for me. However, as it has the http-accelerator and other facilities, I'll probably look at it again once I get some free time.
2 - Layer 4 switch (Cisco LocalDirector, Intel Netstructure, etc) - not cheap, but offers some features like a choice between true balancing, outage detection, and funky routing techniques like out-of-path return for quicker network performance.
At the moment we're strapped for cash, so I can't use this option - although as we're going to be putting some firewalls in front of our boxes, I *may* be able to do this in the future.
3 - LVS - Linux Virtual Server Project, attempts to do in software what # 2 does.
Had a look at this, but we're a Solaris shop and there seemed too many kernel patches needed to do this.
Personally, I'm in favor of squid because it is cheap and easy, if all your machines are of equal weight in performance terms. My company uses squid in front of multiple ZEO clients, bypassing Apache altogether for access to Zope; we use pyredir as a redirector to rewrite URLs like you would with mod_rewrite; Squid is very nice as a Zope virtual host front-end, and provides ACLs for security purposes, like blocking out the ZMI from public access.
This is the main reason I spent that time looking at squid - the security aspect. But, we've built up quite a bit of experience with Apaches ReWriteRules and have a requirement for PHP and (cough) Perl scripts to run alongside our Zope sites as well.
Getting load balancing working in squid is simply a matter of compiling it with a flag to specify you want external DNS resolution support and using squid's built-in name resolving program dnsserver, which will look at /etc/hosts and round-robin among multiple IPs that have the same hostname. One caveat: if you put Squid (or any high-volume TCP app) on Solaris, make sure to tune your TCP connection buffers, so Solaris doesn't choke; this is one (albeit trivial) reason we use Linux 2.4 for our proxies instead of Solaris.
Ah - thanks for that - I'll do the /etc/system thing with the TCP/IP parameters.
Since squid only round-robins, it doesn't deal with node reliability problems, which means you will have to rely upon another HA mechanism on the ZEO client nodes themselves, like heartbeat from the Linux-HA project, which at some point will be ported to Solaris, from what I hear.
We're going to be getting some high availability software that does heartbeat monitoring so I think I'm ok there (but see *** below).
If you have a decent budget, but not a lot of time, look at an L4 switch (we use, and like, the Intel Netstructure 7140 for another application); otherwise, if you have budget constraints, consider LVS or Squid, with Squid likely being the easiest path in terms of setup time.
Sean
Thanks very much for the info Sean. Although I'll probably not be using squid at the moment - all this is very helpful. In the end I used the method that Phil mentioned;
Anyway on with the show:
Take a look at this document, http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html. Search for 'Randomised Plain Text' and there's your recipe.
Basically, you need to create a file that has these lines in, call it map.txt (the name doesn't really matter though):
localhost port1|port2|port3
Replace port1 .. port3 with the exact ports you are using:
localhost 8080|8081|8082
You can put as many options on this line as you need.
Then change your final rewrite rule to be like this:
RewriteMap servers rnd:/path/to/file/map.txt
RewriteRule ^/(.*) http://localhost:${servers:localhost}/VirtualHostBase/http/myserver.ncl.ac.uk:80/VirtualHostRoot/$1 [P]
This will put a random port number into the line thereby giving you pseudo 'round-robin' functionality. The ${servers:localhost} is the clever part, 'servers' is the name of the map, and localhost is a parameter telling the map which line of the map.txt file to choose from.
I went with this method as I have quite some experience with Apache and RewriteRules.
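For anyone wondering what the rnd: map lookup amounts to, the Python sketch below mimics it: find the line for the key, split the value on "|", and pick one alternative at random. This is an illustration of the behaviour described in the recipe, not Apache's implementation; the function names are made up.

```python
import random


def parse_rnd_map(text):
    """Parse 'key value1|value2|...' lines into a dict of lists."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, values = line.split(None, 1)
        table[key] = values.strip().split("|")
    return table


def lookup(table, key, rng=random):
    """Pick one of the alternatives for key at random."""
    return rng.choice(table[key])


MAP_TEXT = "localhost 8080|8081|8082\n"
table = parse_rnd_map(MAP_TEXT)

# Simulate 1000 requests and count how many land on each port.
rng = random.Random(0)
ports = [lookup(table, "localhost", rng) for _ in range(1000)]
counts = {port: ports.count(port) for port in table["localhost"]}
print(counts)
```

Because each pick is independent, the spread is only even on average; consecutive requests can land on the same port, which is why this is pseudo round-robin rather than the real thing.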
*** We're only going to have two machines in our 'cluster', but they are multi processor machines. The reason I'm having to go through all these hoops is basically down to the poor performance of Sparc chips. My pystone ratings on our new server are about 4500, whilst my own TiPB is 6500 and a PIII 700MHz is about 10,500. Therefore I'm trying to squeeze more performance out of our boxes.
Once again guys, thanks for the help.
Now all I need to do is figure out how to do Core Session Tracking with ZEO (I know there are HowTos - but they're not that transparent to my poor head! :)
Cheers, Tone.
Hi Tony,
I've achieved reasonable (though somewhat naive) ZEO load distribution (I won't call it balancing) this way:
In Apache:
<VirtualHost 192.xxx.yyy.zzz:80>
    ServerAdmin steve@spvi.com
    ServerName test_balance.spvi.net
    ErrorLog /var/log/spvi.net-error_log
    CustomLog /var/log/spvi.net-access_log common

    RewriteEngine on
    #RewriteLog /var/log/rewrite.log
    #RewriteLogLevel 10
    RewriteMap balance_load_ext prg:/usr/local/share/apache/conf/balance_load_ext.py
    RewriteRule ^/(.*)$ ${balance_load_ext:$1} [P,L]

</VirtualHost>
where balance_load_ext.py is:
#!/usr/bin/env python
# RewriteMap "prg:" helper: reads one lookup key per line on stdin and
# writes one rewritten URL per line on stdout.

import sys

count = 0

def translate(data):
    # Rotate through www0, www1 and www2 in turn.
    global count
    count = (count + 1) % 3
    return "http://www%i.spvi.net:14080/VirtualHostBase/http/test_balance.spvi.net:80/%s" % (count, data)

if __name__ == '__main__':
    while 1:
        line = sys.stdin.readline()
        if not line:
            # EOF: Apache closed the pipe. An empty key (a request for "/")
            # arrives as a bare newline and must still be answered.
            break
        print(translate(line.strip()))
        sys.stdout.flush()
This distributes load between three machines, using Apache only. You could make the python script smarter to achieve something closer to real load balancing with a little effort.
-steve
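One possible step towards the "smarter" script mentioned above is weighted round-robin, where a faster machine appears more often in the rotation. The sketch below is an assumption about how that could look, not part of the original script; the weights are invented, and the host names follow the www0 to www2 pattern of the script above.

```python
import io
from itertools import cycle

# Hypothetical weights: www1 is assumed to be twice as fast as the others.
WEIGHTS = {"www0.spvi.net": 1, "www1.spvi.net": 2, "www2.spvi.net": 1}
TEMPLATE = "http://%s:14080/VirtualHostBase/http/test_balance.spvi.net:80/%s"


def make_rotation(weights):
    """Repeat each host 'weight' times and cycle over the result forever."""
    slots = []
    for host in sorted(weights):
        slots.extend([host] * weights[host])
    return cycle(slots)


def serve(lines, out, rotation):
    """Answer each lookup key with one URL line, flushing after every answer."""
    for line in lines:
        out.write(TEMPLATE % (next(rotation), line.rstrip("\n")) + "\n")
        out.flush()


# Demonstration on in-memory data instead of Apache's pipe.
demo_out = io.StringIO()
serve(["a\n", "b\n", "c\n", "d\n", "e\n"], demo_out, make_rotation(WEIGHTS))
chosen = [url.split("/")[2].split(":")[0] for url in demo_out.getvalue().splitlines()]
print(chosen)
```

To run it as a prg: map, the call would be serve(iter(sys.stdin.readline, ""), sys.stdout, make_rotation(WEIGHTS)) after importing sys. Static weights still ignore the actual load on each node, so this remains distribution rather than true balancing.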
On Sunday, September 2, 2001, at 01:03 PM, Tony McDonald wrote:
[...]
Cheers, Tone. -- Dr Tony McDonald, Assistant Director, FMCC, http://www.fmcc.org.uk/ The Medical School, Newcastle University Tel: +44 191 243 6140 A Zope list for UK HE/FE http://www.fmcc.org.uk/mailman/listinfo/zope
Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
On 2/9/01 9:41 pm, "Steve Spicklemire" steve@spvi.com wrote:
[...]
That's an interesting method Steve (at least it's in Python and I can understand it). Thing is that the poor Python performance on our Solaris hardware is the driving force behind me trying to use ZEO!
Wouldn't having a python script sitting in front of every request slow things down a fair bit?
Cheers for the info though, Tone.
Hi Tony,
Hmm.. my benchmarking with 'ab' didn't show any significant slowdown. YMMV! It's easy enough to try, why not find out? The rewrite module keeps the program running and just sends to stdin and reads from stdout. You could write it in "C" if it turns out to be a bottleneck. I think however that if you have poor python performance, then Zope is going to be a much bigger problem than my little script. :-(
-steve
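A rough way to "find out" without Apache in the picture is to time the stdin/stdout round trip to a long-running helper process, which is the same pattern the rewrite module uses. The sketch below does that with a trimmed stand-in for the map script (the URL template is shortened for the example). It measures pipe and interpreter overhead only, so it is no substitute for an 'ab' run against the real setup.

```python
import subprocess
import sys
import time

# Stand-in for the map program: answers one line on stdout per line on stdin.
CHILD = r'''
import sys
count = 0
while True:
    line = sys.stdin.readline()
    if not line:
        break
    count = (count + 1) % 3
    sys.stdout.write("http://www%i.spvi.net:14080/%s\n" % (count, line.strip()))
    sys.stdout.flush()
'''


def time_lookups(n=200):
    """Return (average seconds per lookup, last answer) over n round trips."""
    with subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        bufsize=1,
    ) as proc:
        last = ""
        start = time.perf_counter()
        for i in range(n):
            proc.stdin.write("page%d\n" % i)
            proc.stdin.flush()
            last = proc.stdout.readline()
        elapsed = time.perf_counter() - start
        proc.stdin.close()  # EOF tells the child to exit
        proc.wait()
    return elapsed / n, last.strip()


per_lookup, last_answer = time_lookups()
print("%.1f microseconds per lookup" % (per_lookup * 1e6))
```

The process is started once and reused for every lookup, so interpreter start-up cost is not paid per request.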
On Monday, September 3, 2001, at 03:27 PM, Tony McDonald wrote:
[...]
On 3/9/01 10:38 pm, "Steve Spicklemire" steve@spvi.com wrote:
[...]
Oh, I quite agree that my Zope performance will be the bottleneck. Thing is, I managed to get Phil's Apache RewriteRules to work, i.e.
Basically, you need to create a file that has these lines in, call it map.txt (the name doesn't really matter though):
localhost port1|port2|port3
Replace port1 .. port3 with the exact ports you are using:
localhost 8080|8081|8082
You can put as many options on this line as you need.
Then change your final rewrite rule to be like this:
RewriteMap servers rnd:/path/to/file/map.txt
RewriteRule ^/(.*) http://localhost:${servers:localhost}/VirtualHostBase/http/myserver.ncl.ac.uk:80/VirtualHostRoot/$1 [P]
...and as I'm having lots of 'fun' with my CoreSessionTracking/BerkeleyDB/ZEO combination at the moment, I'm up to my eyes in it.
Many thanks for the pointers though Steve, Tone.