
Fronting Tomcat with Apache or IIS - Best practices 2/12/08 10:39 AM

Summary

Running a cluster of Tomcat servers behind a Web server can be a demanding task if you wish to achieve maximum performance and stability. This article describes best practices for accomplishing that.

By Mladen Turk

Fronting Tomcat
One might ask: why put a Web server in front of Tomcat at all? Thanks to the latest advances in Java Virtual Machine (JVM) technology and in the Tomcat core itself, standalone Tomcat is quite comparable in performance to native web servers. Even when delivering static content it is only about 10% slower than a recent Apache 2 web server.
The answer is: scalability.

Tomcat can serve many concurrent users by assigning a separate thread of execution to each client connection. It does that nicely, but a problem appears when the number of those concurrent connections rises. The time the operating system spends managing those threads degrades overall performance: the JVM spends more time managing and switching threads than doing the real job, serving the requests.

Besides connectivity there is one more significant problem, and it is caused by the applications running on Tomcat. A typical application will process client data, access the database, do some calculations and present the data back to the client. All of that can be a time-consuming job that in most cases must finish inside half a second to preserve the user's perception of a working application. Simple math shows that with a 10ms application response time you will be able to serve at most 50 concurrent users before your users start complaining. So what do you do if you need to support more users? The simplest thing is to buy faster hardware: add more CPUs or add more boxes. Two 2-way boxes are usually cheaper than one 4-way box, so adding more boxes is generally a cheaper solution than buying a mainframe.

The first thing to do to ease the load on Tomcat is to use the Web server for serving static content such as images.

Figure 1. Generic configuration

Figure 1 shows the simplest possible configuration scenario. Here the Web server is used to deliver static content while Tomcat does only the real job: serving the application. In most cases this is all you will need. With a 4-way box and a 10ms application time you will be capable of serving 200 concurrent users, or some 3.5 million hits per day, which is by all means a respectable number.

For that kind of load you generally do not need a Web server in front of Tomcat. But here comes the second reason to put the Web server in front: creating a DMZ (demilitarized zone).

http://people.apache.org/~mturk/docs/article/ftwai.html Page 1 of 11

Putting the Web server on a host inserted as a "neutral zone" between a company's private network and the Internet (or some other outside public network) gives the applications hosted on Tomcat the capability to access company private data, while securing access to other private resources.

Figure 2. Secure generic configuration

Besides the DMZ and secure access to a private network, there can be many other factors, such as the need for custom authentication.

If you need to handle more load you will eventually have to add more Tomcat application servers. The reason can be either that your client load simply cannot be handled by a single box, or that you need some sort of failover in case one of the nodes breaks.

Figure 3. Load balancing configuration

A configuration containing multiple Tomcat application servers needs a load balancer between the web server and Tomcat. For the Apache 1.3, Apache 2.0 and IIS Web servers you can use the Jakarta Tomcat Connector (also known as JK), because it offers both software load balancing and sticky sessions. For the upcoming Apache 2.1/2.2 use the new mod_proxy_balancer, a module designed for and integrated within the Apache httpd core.

Calculating Load
When determining the number of Tomcat servers that you will need to satisfy the client load, the first and major task is determining the Average Application Response Time (hereafter AART). As said before, to satisfy the user experience the application has to respond within half a second. The content received by the client browser usually triggers a couple of physical requests to the Web server (e.g. for images): a web page usually consists of HTML and image data, so the client issues a series of requests, and the time in which all of this is processed and delivered is the AART. To get the most out of Tomcat you should limit the number of concurrent requests to 200 per CPU.

So we arrive at a simple formula for the maximum number of concurrent connections a physical box can handle:

Concurrent requests = min( 500 / AART (ms), 200 ) * number of CPUs

The other thing you must take care of is the network throughput between the Web server and the Tomcat instances. This introduces a new variable, the Average Application Response Size (hereafter AARS): the size in bytes of all the content on a web page presented to the user. On a standard 100Mbps network card, at 8 bits per byte, the maximum theoretical throughput is 12.5 MBytes/s (12500 KBytes/s).

Concurrent requests = 12500 / AARS (KBytes)

For a 20KB AARS this gives a theoretical maximum of 625 concurrent requests. You can add more cards or use faster 1Gbps hardware if you need to handle more load.
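The two formulas can be sketched as a quick back-of-the-envelope calculation (a sketch of the article's arithmetic; the 500 ms budget, the 200-per-CPU cap and the 12500 KBytes/s figure come from the text, the function names are mine):

```python
def max_concurrent_by_cpu(aart_ms, cpus):
    """Concurrent requests a box can handle: the 500 ms response
    budget divided by AART, capped at 200 requests per CPU."""
    return min(500 / aart_ms, 200) * cpus

def max_concurrent_by_network(aars_kbytes, throughput_kbytes=12500):
    """Concurrent requests the network card can sustain
    (100Mbps at 8 bits per byte = 12500 KBytes/s)."""
    return throughput_kbytes / aars_kbytes

print(max_concurrent_by_cpu(10, 4))     # 10ms AART, 4-way box: prints 200.0
print(max_concurrent_by_network(20))    # 20KB AARS, 100Mbps card: prints 625.0
```

Whichever of the two limits is lower for your hardware is the one that governs the number of boxes you need.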

The formulas above give a rudimentary estimate of the number of Tomcat boxes and CPUs you will need to handle the desired number of concurrent client requests. If you have to plan the deployment without the actual hardware, the closest you can get is to measure the AART on a test platform and then compare the hardware vendors' Specmarks.

Fronting Tomcat with Apache


If you need to put Apache in front of Tomcat, use Apache 2 with the worker MPM. You can use Apache 1.3, or Apache 2 with the prefork MPM, for simple configurations like the one shown in Figure 1. If you need to front several Tomcat boxes and implement load balancing, use Apache 2 with the worker MPM compiled in.

An MPM, or Multi-Processing Module, is the Apache 2 core component responsible for binding to network ports on the machine, accepting requests, and dispatching children to handle them. The MPM must be chosen during configuration and compiled into the server. Compilers are capable of optimizing many functions if threads are used, but only if they know that threads are being used. Because some MPMs use threads on Unix and others don't, Apache will always perform better if the MPM is chosen at configuration time and built into Apache.

The worker MPM offers higher scalability than the standard prefork mechanism, where each client connection creates a separate Apache process. It combines the best of both worlds: a set of child processes, each with its own set of threads. There are sites running 10K+ concurrent connections using this technology.
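A minimal worker MPM sizing might look like the following sketch (the values are illustrative, chosen so that 8 processes of 25 threads each give the 200 connections used in the examples later in the article):

```apache
# httpd.conf: worker MPM sizing (illustrative values)
<IfModule worker.c>
    StartServers         2
    ServerLimit          8
    ThreadsPerChild     25
    MaxClients         200   # ServerLimit x ThreadsPerChild
    MinSpareThreads     25
    MaxSpareThreads     75
</IfModule>
```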

Connecting to Tomcat

In the simplest scenario, when you need to connect to a single Tomcat instance, you can use mod_proxy, which comes as part of every Apache distribution. However, using the mod_jk connector will provide approximately double the performance. There are several reasons for that; the major one is that mod_jk manages a persistent connection pool to Tomcat, thus avoiding opening and closing a connection to Tomcat for each request. The other reason is that mod_jk uses a custom protocol named AJP and thereby avoids, for each request, assembling and disassembling header parameters that have already been processed on the Web server. You can find more details about the AJP protocol on the Jakarta Tomcat Connectors site.

For those reasons you should use mod_proxy only for low-load sites or for testing purposes. From now on I will focus on mod_jk for fronting Tomcat with Apache, because it offers better performance and scalability.

One of the major design parameters when fronting Tomcat with Apache or any other Web server is synchronizing the maximum number of concurrent connections. Developers often leave the default configuration values on both Apache and Tomcat, and are then faced with spurious error messages in their log files. The reason is very simple: Tomcat and Apache can each accept only a predefined number of connections. If those two configuration parameters differ, usually with Tomcat having the lower configured number of connections, you will face sporadic connection errors. If the load gets even higher, your users will start receiving HTTP 500 server errors even though your hardware is capable of dealing with the load.

Which parameter determines the maximum number of connections to Tomcat in the case of the Apache web server depends on the MPM used:

MPM        Configuration parameter

Prefork    MaxClients
Worker     MaxClients
WinNT      ThreadsPerChild
Netware    MaxThreads

On the Tomcat side, the configuration parameter that limits the number of allowed concurrent requests is maxProcessors, with a default value of 20. This number needs to be equal to the corresponding MPM configuration parameter.
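As a sketch of that pairing (the port and the value 200 are illustrative), the two limits might be kept in sync like this:

```apache
# httpd.conf (worker MPM): Apache accepts at most 200 concurrent connections
MaxClients 200
```

```xml
<!-- server.xml: the AJP connector allows the same number of concurrent requests -->
<Connector port="8009" protocol="AJP/1.3" maxProcessors="200" />
```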

Load balancing

Load balancing is one way to increase the number of concurrent client connections to the application server. There are two types of load balancer you can use: hardware and software. If you are using load balancing hardware instead of mod_jk or a proxy, it must support a compatible passive or active cookie persistence mechanism, and SSL persistence.

Mod_jk has an integrated virtual load balancer worker that can contain any number of physical workers, i.e. particular physical nodes. Each node can have its own balance factor, or lbfactor: the worker's work quota, expressing how much work we expect that worker to do. This parameter usually depends on the hardware topology itself, and it makes it possible to create a cluster out of different hardware node configurations. Each lbfactor is compared to all the other lbfactors in the cluster, and their relationship gives the actual load distribution. If the lbfactors are equal, the workers' loads will be equal as well (e.g. 1-1, 2-2, 50-50, etc.). If the first node has lbfactor 2 while the second has lbfactor 1, the first node will receive twice as many requests as the second. This asymmetric load configuration enables clusters with nodes of different hardware architectures.

In the simplest load balancer topology, with only two nodes in the cluster, the number of concurrent connections on the web server side can be twice as high as on a particular node. But ...

1 + 1 != 2

The statement above means that the sum of the connections allowed on the particular nodes does not add up to the total number of connections allowed: each node has to allow a slightly higher number of connections than its proportional share of the desired total. This margin is usually 20%, which means that

1 * 1.2 + 1 * 1.2 == 2

So if you wish to have 100 concurrent connections with two nodes, each node will have to allow a maximum of 60 connections. The 20% margin factor is empirical, and depends on the Apache server used. For prefork MPMs it can rise up to 50%, while for NT or Netware its value is 0%. The reason is that each child process manages its own balance statistics, which produces this 20% error on web servers with multiple child processes.

worker.node1.type=ajp13
worker.node1.host=10.0.0.10
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.host=10.0.0.11
worker.node2.lbfactor=2

worker.node3.type=ajp13
worker.node3.host=10.0.0.12
worker.node3.lbfactor=1

worker.list=lbworker
worker.lbworker.type=lb
worker.lbworker.balance_workers=node1,node2,node3

The minimal configuration for a three-node cluster shown in the example above will give a 25%-50%-25% distribution of the load, meaning that node2 will get as much load as the other two members combined. It also implies the following maxProcessors values for each particular node in the case of MaxClients=200.

node1 :
<Connector ... maxProcessors="60" ... />
node2 :
<Connector ... maxProcessors="120" ... />
node3 :
<Connector ... maxProcessors="60" ... />

Using simple math the load should be 50-100-50, but we need to add the 20% load distribution error. In case this additional 20% is not sufficient, you will need to set a higher value, up to 50%. Of course, the average number of connections on each particular node will still follow the load balancer's distribution quota.
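The maxProcessors values above follow mechanically from MaxClients, the lbfactors and the 20% margin. A hypothetical helper (the function name and the rounding are mine):

```python
def max_processors(max_clients, lbfactors, margin=0.2):
    """Per-node connection limits: distribute MaxClients in proportion
    to each node's lbfactor, then add the distribution-error margin."""
    total = sum(lbfactors)
    return [round(max_clients * f / total * (1 + margin)) for f in lbfactors]

# MaxClients=200 on a 1-2-1 cluster: prints [60, 120, 60]
print(max_processors(200, [1, 2, 1]))
```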

Sticky sessions and failover

One of the major problems with having multiple backend application servers is maintaining the client-server relationship. Once a client makes a request to a server application that needs to track user actions over a designated time period, some sort of state has to be enforced on top of the stateless HTTP protocol. Tomcat issues a session identifier that uniquely distinguishes each user. The problem with that session identifier is that it does not carry any information about the particular Tomcat instance that issued it.

Tomcat therefore adds an extra configurable mark, the jvmRoute, to that session identifier. The jvmRoute can be any name that uniquely identifies the particular Tomcat instance in the cluster. On the other side of the wire, mod_jk uses that jvmRoute as the name of the worker in its load balancer list. This means that the name of the worker and the jvmRoute must be equal.
The jvmRoute is appended to the session identifier:
http://host/app;jsessionid=0123456789ABCDEF0123456789ABCDEF.jvmRouteName
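As an illustrative sketch, matching the article's earlier worker definitions, the jvmRoute set in Tomcat's server.xml must equal the mod_jk worker name in workers.properties (attributes abbreviated):

```xml
<!-- server.xml on the first Tomcat instance -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
```

```
# workers.properties on the Apache side:
# the worker name "node1" equals the jvmRoute above
worker.node1.type=ajp13
worker.node1.host=10.0.0.10
```

With this in place mod_jk can route every request whose session identifier ends in .node1 back to the instance that created the session.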

When you have multiple nodes in a cluster you can improve your application's availability by implementing failover. Failover means that if the elected node cannot fulfil the request, another node is selected automatically. With three nodes you effectively double your application's availability. The application response time will be higher during failover, but none of your users will be rejected. The mod_jk configuration has a special parameter called worker.retries, with a default value of 3, which needs to be adjusted to the actual number of nodes in the cluster.
...
worker.list=lbworker
worker.lbworker.type=lb
# Adjust to the number of workers
worker.retries=4
worker.lbworker.balance_workers=node1,node2,node3,node4

If you add more than three workers to the load balancer, adjust the retries parameter to reflect that number. This will ensure that even in the worst-case scenario the request gets served as long as there is a single operable node. Of course, the request will still be rejected if there are no free connections available on the Tomcat side, so you should increase the allowed number of connections on each Tomcat instance. In the three-node scenario (1-2-1), if one of the nodes goes down the other two will have to take over its load. So if the load is divided as above you will need the following Tomcat configuration:

node1 :
<Connector ... maxProcessors="120" ... />
node2 :
<Connector ... maxProcessors="160" ... />
node3 :
<Connector ... maxProcessors="120" ... />

This configuration ensures that 200 concurrent connections will always be available no matter which node goes down. The reason for doubling the number of processors on node1 and node3 is that they need to handle the additional load if node2 goes down (load 1-1). Node2 also needs adjustment, because if one of the other two nodes goes down the load will be 1-2. As you can see, the 20% load error is always factored in.
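The failover sizing can be sketched the same way: for each node, take its worst-case share of MaxClients over all single-node failures, then add the 20% margin (the helper name is illustrative):

```python
def failover_max_processors(max_clients, lbfactors, margin=0.2):
    """Per-node limit sized for failover: for each surviving node,
    its largest share of MaxClients when any one other node is down,
    plus the load balancer's distribution-error margin."""
    n = len(lbfactors)
    limits = []
    for i in range(n):
        worst = 0.0
        for down in range(n):
            if down == i:
                continue
            # lbfactors of the nodes still alive when `down` has failed
            remaining = sum(f for j, f in enumerate(lbfactors) if j != down)
            worst = max(worst, max_clients * lbfactors[i] / remaining)
        limits.append(round(worst * (1 + margin)))
    return limits

# 1-2-1 cluster with MaxClients=200: prints [120, 160, 120]
print(failover_max_processors(200, [1, 2, 1]))
```

This reproduces the 120-160-120 values above: node2's worst case is a 2-1 split of the 200 connections (133, rounded up with the margin to 160), while node1 and node3 each face a 1-1 split (100, with the margin 120).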

Figure 4. Three node example load balancer


Figure 5. Failover for node2

As shown in the two figures above, setting maxProcessors depends both on the 20% load balancer error and on the expected single-node failure. The calculation must treat the node with the highest lbfactor as the worst-case scenario.

Domain Clustering model

Since JK version 1.2.8 there is a new domain clustering model, which offers horizontal scalability and better performance for a Tomcat cluster.

A Tomcat cluster only allows session replication to all nodes in the cluster. Once you work with more than 3-4 nodes there is too much overhead and risk in replicating sessions to every node, so we split the nodes into clustered groups. The newly introduced worker attribute domain lets mod_jk know to which other nodes a session gets replicated (all workers with the same value in the domain attribute). A load balancing worker thus knows on which nodes the session is alive, and if a node fails or is taken down administratively, mod_jk chooses another node that has a replica of the session.

For example, if you have a cluster with four nodes you can create two virtual domains and replicate sessions only inside each domain. This will halve the replication network traffic.


Figure 6. Domain model clustering

For the above example the configuration would look like:

worker.node1.type=ajp13
worker.node1.host=10.0.0.10
worker.node1.lbfactor=1
worker.node1.domain=A

worker.node2.type=ajp13
worker.node2.host=10.0.0.11
worker.node2.lbfactor=1
worker.node2.domain=A

worker.node3.type=ajp13
worker.node3.host=10.0.0.12
worker.node3.lbfactor=1
worker.node3.domain=B

worker.node4.type=ajp13
worker.node4.host=10.0.0.13
worker.node4.lbfactor=1
worker.node4.domain=B

worker.list=lbworker
worker.lbworker.type=lb
worker.lbworker.balance_workers=node1,node2,node3,node4

Now assume you have multiple Apaches and Tomcats, the Tomcats are clustered, and mod_jk uses sticky sessions. When you shut down one Tomcat (for maintenance), all the Apaches will start opening connections to all the remaining Tomcats. You end up with every Tomcat getting connections from all the Apache processes, so the number of threads needed inside the Tomcats will explode. If you group the Tomcats into domains as explained above, the connections will normally stay inside the domain and you will need far fewer threads.

Fronting Tomcat with IIS


Just like the Apache Web server for Windows, Microsoft IIS maintains a separate child process and thread pool for serving concurrent client connections. For non-server products like Windows 2000 Professional or Windows XP the number of concurrent connections is limited to 10. This means that you cannot use workstation products for production servers, unless the 10-connection limit fulfils your needs. The server range of products does not impose that 10-connection limit, but just as with Apache, around 2000 connections is the point where thread context switching takes its share and lowers the effective number of concurrent connections. If you need a higher load you will need to deploy additional web servers and use the Windows Network Load Balancer (WNLB) in front of the Tomcat servers.

Figure 7. WNLB High load configuration

For topologies using the Windows Network Load Balancer the same rules apply as for Apache with the worker MPM: each Tomcat instance will have to allow a 20% higher connection load per node than its real lbfactor implies. The workers.properties configuration must be identical on each node that constitutes the WNLB, meaning that all four Tomcat nodes must be configured on every member.

Apache 2.2 and new mod_proxy


For the new Apache 2.1/2.2, mod_proxy has been rewritten and now has an AJP-capable protocol module (mod_proxy_ajp) and an integrated software load balancer (mod_proxy_balancer).

Because it can maintain a constant connection pool to the backend servers, it can replace the mod_jk functionality.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
...
<Proxy balancer://mycluster>
BalancerMember ajp://10.0.0.10:8009 min=10 max=100 route=node1 loadfactor=1
BalancerMember ajp://10.0.0.11:8009 min=20 max=200 route=node2 loadfactor=2
</Proxy>
ProxyPass /servlets-examples balancer://mycluster/servlets-examples


The above example shows how easy it is to configure a Tomcat cluster with the proxy load balancer. One of the major advantages of using the proxy is the integrated caching, and there is no need to compile an external module.

Mod_proxy_balancer has an integrated manager for dynamic parameter changes. It allows changing session routes or disabling a node for maintenance.

<Location /balancer-manager>
SetHandler balancer-manager
Order deny,allow
Allow from localhost
</Location>

Figure 8. Changing BalancerMember parameters

Future development of mod_proxy will include the option to dynamically discover the particular node topology, as well as dynamically updating load factors and session routes.


About the Author


Mladen Turk is a developer and consultant for JBoss Inc in Europe, where he is responsible for native integration. He is a long-time committer on the Jakarta Tomcat Connectors, Apache Httpd and Apache Portable Runtime projects.

Links and Resources


Jakarta Tomcat connectors documentation
Apache 2.0 documentation
Apache 2.1 documentation

