2. Editing Apache for Performance
From here on I will be writing about configuring Apache on the Linux platform unless stated otherwise. Once the hardware, OS selection and optimization are done, we have to set up Apache. Tuning Apache for performance happens at two levels: before installation and after installation.
2.1 Before Installation (Compile Time Tuning)
2.1.(a) Choosing the appropriate MPM
MPM stands for Multi-Processing Module. It is one of Apache's modules, with the following features:
- It changes the basic way the Apache server works with respect to multi-threading and multi-processing.
- It must be built into Apache at compile time, like the http_core and mod_so modules.
- Only one MPM can be loaded into the server at any time.
(Note: If you need to know more about the differences between processes and threads, you may want to refer to books on ‘Operating System Concepts’, which treat each in depth.)
It's simple to understand. The MPM you choose for Apache is responsible for binding to network ports on the machine, accepting requests, handling them, and so on. Choosing an MPM depends on various factors, such as whether the OS supports threads, how much memory is available, scalability versus stability, and whether non-thread-safe third-party modules are used. Linux systems can use a hybrid MPM (both multi-process and multi-threaded) like ‘worker’, or a non-threaded MPM (multi-process only) like ‘prefork’. Windows has only one choice, the ‘winnt’ MPM, which runs a single child process that handles requests with multiple threads. The key features of prefork and worker are discussed in section 2.2.
Once the required MPM is built in, we tune it with the Apache directives dedicated to configuring MPMs, such as MaxClients, StartServers and MaxRequestsPerChild, in the httpd.conf file.
2.1.(b) Loading only the required modules
Before installation, compile in only the modules that provide the basic functions of a web server. The more modules are compiled in statically, the larger the running httpd binary becomes (and once dynamic modules are loaded, the size increases again). Normally, the statically compiled modules are http_core, mod_so and the required MPM module.
If you have built the modules as DSOs, eliminating a module is a simple matter of commenting out the LoadModule directive for that module. If, on the other hand, you have modules statically linked into your Apache binary, you will need to recompile Apache in order to remove the unwanted modules.
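As a minimal sketch of the DSO case, assuming a stock httpd.conf with the modules under modules/ (the selection of modules here is only an illustration, not a recommendation), disabling a module looks like this:

```apache
# Modules still in use stay loaded.
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule expires_module modules/mod_expires.so

# Unwanted modules are commented out; they are no longer loaded
# after the next restart.
#LoadModule autoindex_module modules/mod_autoindex.so
#LoadModule userdir_module modules/mod_userdir.so
```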
DYNAMIC_MODULE_LIMIT is one of the compile-time flags you can give Apache. If you do not intend to use DSO support at all, compile Apache with -DDYNAMIC_MODULE_LIMIT=0. This saves the RAM that is allocated only for supporting dynamically loaded modules. The default value is 64, which is usually sufficient.
2.2 After Installation (Once the server is installed and ready to run or running)
This is where the bulk of Apache tuning comes in. But first I need to introduce the two MPMs most commonly used with Apache in a Linux environment.
The ‘prefork’ MPM:
It was the only mode of operation available in Apache 1.3. In this configuration, the main Apache process, also known as the ‘master server’ (the Apache process started by the ‘root’ user with full privileges), creates multiple child servers with fork() at startup. The children run as the less privileged user and group named in the User and Group directives of the Apache conf file. The pool of child servers can be pictured as a queue. The child at the front of the queue is the ‘Listener’, and all the rest, from the 2nd child onwards, are ‘Idle Workers’. Only the listener child is allowed to listen for connections on the sockets. When a request is received, this child changes its state from ‘Listener’ to ‘Worker’ and goes on to process the received request. In the meantime the child that was standing 2nd gets the ‘Listener’ status. When the 1st child is done processing the request, it changes its state back to ‘Idle Worker’ and joins the end of the same queue. This cycle repeats as each request arrives.
Each child server handles only one request at a time. When the master server detects that the number of available processes is running out, it creates additional child servers. There is a limit on the maximum number of child servers, set in the Apache conf file. When that limit has been reached and requests keep arriving, a client may instead receive an error because it cannot establish a connection with the web server. When the number of requests subsequently drops off, the excess child servers are shut down. Child processes may also be shut down after they have handled a set number of requests, depending on the directives in the conf file that I will discuss in the succeeding sections.
So the actual worker here is the child server and not the master server! This method gives a lot of stability because each request is handled by a separate ‘process’. If a process dies or is killed, it does not affect the other processes. Each process is an independent entity to which resources are allocated.
The ‘worker’ MPM:
The ‘worker’ MPM, introduced in version 2.0, is similar to ‘prefork’ except that each child process contains a number of worker threads, set by the ThreadsPerChild directive. A request is handled by a ‘thread’ within a child process, rather than by a separate child process as in the prefork MPM. If some of the threads in a process are already handling requests when a new request arrives, it is handed over to a thread that is idle in the same process. If all worker threads within a child process are busy when a new request arrives, the request is processed by an idle worker thread in another child process. If all the threads in all the running child processes are engaged, the Apache ‘master server’ (the server run with root privileges) may still create new child processes on demand. The master server may also still shut down excess child processes, or child processes that have handled more than a set number of requests.
Overall, using the ‘worker’ MPM results in fewer child processes being created, but the resource usage of each child process is greater. Where is the advantage then? It lies in avoiding the delay and overhead of creating a whole new child process whenever more requests have to be served at once; starting a thread is much cheaper.
Now let's discuss the directives, also known as Apache's performance directives. Much of what I discuss can be found at: http://httpd.apache.org/docs/2.0/misc/perf-tuning.html. But it may not be very comprehensible to a novice, so I shall try my best to break it down.
ThreadsPerChild applies only if Apache is compiled to use the ‘worker’ MPM or mpm_winnt (Windows). It sets the number of threads created in each child process at startup. This value multiplied by the number of child processes gives the total number of threads in the server. Once a child has started, its ThreadsPerChild value never changes. To change it, edit the conf file and restart the Apache master server; children created from then on use the new value, while any child that is still running keeps the old one. The default value is 25 for worker and 64 for mpm_winnt.
However, there is a limit to how far ThreadsPerChild can be increased. This limit is ThreadLimit. While setting this directive:
(i) If ThreadLimit is set to a value much higher than ThreadsPerChild, extra unused shared memory will be allocated.
(ii) If both ThreadLimit and ThreadsPerChild are set to values higher than the system can handle, Apache may not start or the system may become unstable.
So how do we set the value of this directive?
Set ThreadLimit to the greatest value of ThreadsPerChild that Apache might need at peak time. As shipped, the worker MPM defaults to a ThreadsPerChild of 25 and a ThreadLimit of 64. All of this again depends on how busy your Apache will be and the hardware resources you have allotted to it.
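A minimal sketch of how the two directives sit together in httpd.conf, assuming the worker MPM on Apache 2.0 (hence the worker.c test) and a hypothetical peak need of 50 threads per child:

```apache
<IfModule worker.c>
    # Upper bound for ThreadsPerChild; keep it close to the real
    # peak need, since a larger value allocates unused shared memory.
    ThreadLimit      50
    # Threads created in each child process (the default is 25).
    ThreadsPerChild  25
</IfModule>
```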
MaxClients sets the limit on the number of simultaneous requests that will be served. With prefork, each request is handled by one child, so MaxClients can be read as the maximum number of Apache child processes running at any one time. The default value is 256 (in effect, for prefork, 256 simultaneous requests). Connection attempts beyond this limit are queued, up to the number set by ListenBackLog (default 511).
For worker, since each request is handled by a single thread, MaxClients is the maximum number of threads serving requests at any one time, and it cannot exceed ServerLimit x ThreadsPerChild. ServerLimit, in short, is the maximum number of Apache child processes (child servers) that can run at once, be it a prefork or worker implementation. So for:
prefork, MaxClients = maximum number of child servers at any one time
worker, MaxClients = maximum number of threads at any one time
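To make the worker case concrete, here is a sketch that uses the worker MPM's default values as I understand them (ServerLimit 16, ThreadsPerChild 25); treat the numbers as an illustration rather than a recommendation:

```apache
<IfModule worker.c>
    # At most 16 child processes at any one time.
    ServerLimit      16
    # Each child runs 25 threads.
    ThreadsPerChild  25
    # 16 children x 25 threads = 400 simultaneous requests.
    MaxClients       400
</IfModule>
```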
Let's do some math now. I have an Apache running with the following specs: Apache/2.0.63, prefork MPM and multiple statically compiled-in modules.
MaxClients = RAM available to Apache / memory used by each Apache process
Let us first leave shared memory out of the picture; it is brought in further down. Assume 250M of RAM is available to Apache. Setting MaxClients always requires the administrator to look at the memory usage of the Apache processes at idle time and at peak usage time. I shall give a very short method for doing this here.
The best tool for this is the well known ‘ps’ command. Let's see how.
Run ps aux --sort -rss | grep httpd from the command line. Here is what I got for an Apache server installed via EasyApache.
# ps aux --sort -rss | grep httpd
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3902 0.0 0.0 9800 3312 ? Ss Nov25 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 1899 0.0 0.0 9936 2948 ? S Nov25 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 1903 0.0 0.0 9936 2948 ? S Nov25 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 3912 0.0 0.0 9936 2948 ? S Nov25 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 3913 0.0 0.0 9936 2948 ? S Nov25 0:00 /usr/local/apache/bin/httpd -k start -DSSL
In this context we need to look at the RSS and VSZ fields, both reported in kilobytes. Let's see.
VSZ (Virtual memory SiZe) – This is the total virtual address space of the process: the pages resident in RAM plus everything that is mapped but not resident, such as swapped-out pages and parts of the binary and shared libraries that have not been loaded.
RSS (Resident Set Size) – This is the portion of the process that is in RAM only. The rest, if any, is in swap or has not been loaded yet.
RSS can never exceed VSZ. The two are equal only when every page of the process is resident in RAM.
Here VSZ exceeds RSS by about 6-7M. Not all of that gap is necessarily swap, but it is the most that could be swapped out. An Apache process should not be allowed to swap: either allot the required memory or edit the conf so that the processes fit in RAM. Now let's get back to the math for MaxClients.
To stay on the safe side, take the full process size here, 9.936M, as the memory for each Apache process; that is the case where RSS = VSZ and the whole 9.936M is in RAM. Let's round it to 10M. We had assumed that the RAM available to Apache was 250M.
MaxClients = 250M / 10M = 25 Apache processes. This means that 25 is the highest advisable value for MaxClients. If the processes do swap at that setting, either allocate more memory or lower the value.
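The result of this first calculation would go into httpd.conf roughly as follows. This is a sketch that assumes the prefork MPM on Apache 2.0 and the rounded figures used above (250M available, 10M per process):

```apache
<IfModule prefork.c>
    # 250M available to Apache / 10M per process = 25 processes
    MaxClients  25
</IfModule>
```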
Now let's redo the calculation taking shared memory into account (OS support required). Let's say 4M of each process is shared memory. That 4M is held in RAM only once, so the RAM left for the private part of the processes is 250 - 4 = 246M, and each process adds only 10 - 4 = 6M of its own.
MaxClients = (250 - 4) / (10 - 4) = 246 / 6 = 41 processes. Compare the result with the above. We see that the number has increased significantly. Making good use of shared memory not only increases the number of processes but also improves IPC, so it should be taken into account while optimizing Apache.
The above calculation is only a guide to what we put in the httpd.conf file. Always try to set MaxClients a bit lower than the result. In the last calculation, 35 or so would be advisable. I say this because the actual accounting of shared memory is a bit complex for any process.
HostnameLookups controls the reverse lookup of the IP address of the machine initiating the connection. Enabling it adds a great delay, since the lookup is done for every new request. There are 3 options: Off, On and Double. When Double is set, the reverse lookup is followed by a forward lookup to ensure that the domain name is not being spoofed. It is usually set to Off.
In addition to setting this to Off, you should avoid hostnames while implementing ACLs. Otherwise the server will perform double DNS lookups anyway. See the example:
<Location /serverstat>
Order deny,allow
Deny from all
Allow from www.example.com
</Location>
Instead, use an IP address in the above. However, in our example the lookup will only be performed for URLs starting with /serverstat.
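A sketch of the same access rule written with an IP address, so that no DNS lookup is needed. The address 192.0.2.10 is a placeholder of mine, not one from the original example:

```apache
# Skip reverse DNS lookups for every request.
HostnameLookups Off

<Location /serverstat>
    Order deny,allow
    Deny from all
    # An IP address instead of a hostname avoids the double lookup.
    Allow from 192.0.2.10
</Location>
```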
2.2(e) FollowSymLinks and SymLinksIfOwnerMatch
Let's talk in general first. FollowSymLinks lets the server follow symbolic links in the directory where it is enabled, wherever they point. This is a security threat, since blindly enabling it can let a web client wander into parts of the filesystem it is not allowed to see. Say a user has created a symlink to /etc/passwd in his DocumentRoot. This can reveal the file to him. So should we disable this option everywhere? No. That would not only stop users from using safe and useful symbolic links within their permitted directories, but also add extra overhead for Apache. Extra overhead? Yes. When this option is disabled, whenever a resource is requested, Apache checks whether the requested resource and each of its parent directories, starting from the filesystem root ‘/’, is a symlink. Didn't get it? See the below:
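A minimal sketch of the kind of configuration meant here, assuming a DocumentRoot of /www/htdocs with FollowSymLinks turned off everywhere, and a request for /index.html:

```apache
DocumentRoot /www/htdocs

<Directory />
    # Symlinks are not followed anywhere, so every path component
    # has to be checked on every request.
    Options -FollowSymLinks
</Directory>
```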
When index.html is requested, Apache checks whether /www, /www/htdocs and /www/htdocs/index.html are symbolic links. Why does Apache check this? Because we have told it in the ‘Options’ directive not to serve content that is reached through symlinks. So if /www/htdocs/index.html, /www/htdocs or /www were a symlink to something else, Apache would return an error. The number of checks increases if the requested resource is in deeper sub-directories, and the results are not cached, so the checks are made on every request, even if the resource is in the same directory as the previous one. The key point is that if FollowSymLinks is enabled, Apache does not perform these checks but simply follows any link it finds. The extra checks are highly detrimental to performance, so disabling FollowSymLinks everywhere is not at all advisable.
Next, the SymLinksIfOwnerMatch option. It is a workaround for the security hole that FollowSymLinks opens, of Apache serving files outside users' directories. It ensures that Apache follows a symbolic link only if the owner of the destination file is the same as the owner of the link. This, again, is an overhead, since for each request Apache has to check the ownership of the link and its destination.
So now we are really stuck. What do we do then? A practical approach is to split up the directories over which these options have influence. We may deploy the following, taking the DocumentRoot as /var/www/htdocs.
Options -FollowSymLinks +SymLinksIfOwnerMatch
Now this is a good solution. Suppose a request is made for index.html in /var/www/htdocs. We have enabled FollowSymLinks for ‘/’ and disabled it for /var/www/htdocs only. This eliminates the need to check whether /var, /var/www and /var/www/htdocs are symbolic links. Apache only needs to check whether /var/www/htdocs/index.html is a symlink, since FollowSymLinks has been disabled under /var/www/htdocs and index.html is under it. +SymLinksIfOwnerMatch ensures that the destination file of a link is served only if it is owned by the same user as the link.
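Put together, the setup described above might look like the sketch below. The two <Directory> containers are my reading of the description (FollowSymLinks on for ‘/’, off for the DocumentRoot), so treat the exact layout as an assumption:

```apache
<Directory />
    # No symlink checks for /, /var and /var/www.
    Options FollowSymLinks
</Directory>

<Directory /var/www/htdocs>
    # Inside the DocumentRoot, follow a link only if its owner
    # also owns the target.
    Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>
```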
MaxRequestsPerChild can be set to either 0 or a value greater than 0. When set to 0, an Apache child process handles an unlimited number of requests and never terminates by itself. When set to a value greater than 0, the child process terminates voluntarily after handling that number of requests.
Setting this to zero can expose memory leaks. In simple words, a leaking process keeps memory it no longer needs instead of releasing it for other Apache processes to use, and the system gets that memory back only when the process terminates. So a process that runs for a long time with leaky modules loaded can lock up a hefty amount of memory. The fault need not lie with Apache; it can be a shortcoming of the underlying platform or poor programming in the loaded modules. One might need to monitor this carefully using pmap, ps and top. Check for buggy modules, fix them, and then set this to 0, since that is preferred if performance is sought.
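As a sketch of the compromise, a non-zero value recycles each child now and then while a leak is being tracked down. The figure 10000 is an arbitrary value I picked for illustration, not a recommended setting:

```apache
# Each child exits after serving 10000 requests, which returns any
# leaked memory to the system. Set to 0 once the leak is fixed.
MaxRequestsPerChild 10000
```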
If AllowOverride is enabled for a directory, whenever Apache serves content from this directory or its sub-directories, it looks for a ‘.htaccess’ file in each directory along the path and tries to read it. If one is found, Apache serves the contents of that directory according to the options in it, overriding the global server settings. As in 2.2(e), it is advisable to split this setting in two: one for the / directory and the other for the DocumentRoot of the users. If /var/www/htdocs is the DocumentRoot, it can be given like:
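A minimal sketch of such a split, assuming .htaccess files are wanted only under the DocumentRoot. The choice of All for the inner directory is my assumption; a narrower list of override types would work the same way:

```apache
<Directory />
    # No .htaccess lookups for /, /var and /var/www.
    AllowOverride None
</Directory>

<Directory /var/www/htdocs>
    # .htaccess files are read only from here downwards.
    AllowOverride All
</Directory>
```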
If your OS supports the sendfile system call, consider enabling EnableSendfile, since it lets Apache deliver files directly to sockets. It can also be set on a per-directory basis.
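A small sketch of both forms, assuming an Apache version that has the EnableSendfile directive (2.0.44 or later, as far as I know) and a hypothetical directory /var/www/htdocs/nfs that lives on a network mount, where sendfile is often better avoided:

```apache
# Use sendfile for static files served from local disk.
EnableSendfile On

<Directory /var/www/htdocs/nfs>
    # Hypothetical network-mounted directory: turn sendfile off here.
    EnableSendfile Off
</Directory>
```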
2.2.(i) StartServers, MinSpareServers, and MaxSpareServers
These set, respectively, the number of processes Apache creates at startup and the minimum and maximum number of idle servers to maintain. They should be set according to the server load. The default values are given below:
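For the prefork MPM on Apache 2.0 the defaults, as far as I know them from the Apache documentation, are as follows (the prefork.c test is my assumption about how the block is guarded):

```apache
<IfModule prefork.c>
    # Child processes created at startup.
    StartServers     5
    # Fewer idle children than this: the master forks more.
    MinSpareServers  5
    # More idle children than this: the master kills the excess.
    MaxSpareServers  10
</IfModule>
```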
If all the Apache servers are busy processing requests, new connections are not rejected but queued. ListenBackLog defines the maximum length of this queue. By default, it is 511. There is no need to change this value in most cases.
KeepAlive is a platform-independent, HTTP-related Apache performance directive. Persistent connections allow a client to send more than one request over the same connection. This is a very useful feature and highly beneficial to clients. The preferred choice is to keep this On.
KeepAliveTimeout specifies the time in seconds an Apache process (or thread) waits for the next request from the client once the previous one has been processed and delivered. If the client does not make the next request within this time, Apache closes the connection, and the same client has to open a new connection to the Apache server for its next request. That new connection may or may not be delayed, depending on whether idle Apache servers are available to accept it. The default value is 15 seconds.
MaxKeepAliveRequests limits the number of requests allowed on one persistent connection. When that number is reached, Apache closes the persistent connection even if KeepAliveTimeout has not expired. The value should be high; 100, the default, is the most used.
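Taken together, the three keep-alive directives might appear in httpd.conf like this. The values are the ones mentioned above, shown as a sketch rather than a tuned recommendation:

```apache
# Allow several requests per connection.
KeepAlive On
# Close the connection after 100 requests on it.
MaxKeepAliveRequests 100
# Close the connection after 15 idle seconds.
KeepAliveTimeout 15
```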
2.2.(n) I will now mention some HTTP-related directives that are seldom looked at and are left at their defaults in most cases.
LimitRequestBody specifies the number of bytes, from 0 (meaning unlimited) to 2GB, that are allowed in a request body. It can be set on a server, per-directory, per-file or per-location basis. If a client request exceeds the limit, the server returns an error response instead of servicing the request. It defaults to 0.
LimitRequestFields allows the server administrator to modify the limit on the number of request header fields allowed in an HTTP request. A server needs this value to be larger than the number of fields that a normal client request might include. The number of request header fields used by a client rarely exceeds 20. This directive gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks. It defaults to 100.
LimitRequestFieldSize allows the server administrator to reduce the limit on the allowed size of an HTTP request header field, given in bytes. This is also useful in preventing DoS attacks. The default is 8190 bytes.
LimitRequestLine allows the server administrator to reduce the limit on the allowed size of a client’s HTTP request-line. Since the request-line consists of the HTTP method, URI and protocol version, the LimitRequestLine directive restricts the length of the request-URI allowed for a request on the server. This also helps in preventing DoS attacks to an extent. The default is 8190 bytes.
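For reference, the four limits with the default values quoted above would be written like this. It is only a sketch of the syntax; whether to tighten any of them depends on your own traffic:

```apache
# 0 means no limit on the request body size.
LimitRequestBody 0
# At most 100 header fields per request.
LimitRequestFields 100
# At most 8190 bytes per header field.
LimitRequestFieldSize 8190
# At most 8190 bytes for the request line (method, URI, version).
LimitRequestLine 8190
```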
2.2.(o) If a proxy web server is used to supplement your main server, then we must consider including the Apache module mod_expires, since the directives it supplies are very useful in reducing the hits on the main server through caching. The cache is maintained by the proxy. It is the headers of the resource sent to the proxy that tell it whether to cache the page or not. The directives of concern are:
ExpiresActive: This directive enables or disables the generation of the Expires header for the documents that Apache serves to the proxy. Whether set to Off or On, this can be overridden in a .htaccess file.
ExpiresByType: This directive defines the value of the Expires header generated for documents of the specified type (e.g., text/html). The second argument sets the number of seconds that will be added to a base time to construct the expiration date. The base time is either the last modification time of the file, or the time of the client’s access to the document.
ExpiresDefault: This directive sets the default algorithm for calculating the expiration time for all documents in the affected realm.
I shall give an example:
# enable expirations
ExpiresActive On
# expire GIF images after a week in the client’s cache
ExpiresByType image/gif A604800
# HTML documents are good for a week from the time they were changed
ExpiresByType text/html M604800
By the 2nd line, we have set this feature to On. By the 4th and 6th lines, we specify the types of data for which caching should be enabled. We have set caching for image files in GIF format to a week. The character A before the time stands for ‘Access’. This tells Apache to send a header that makes the cached file expire 1 week after the file was accessed by the client. We have also set the caching of HTML text documents to 1 week. This time we have put ‘M’ before the time in seconds. M stands for ‘Modification’. This tells Apache to send an Expires header so that documents expire 604800 seconds after the date they were last modified. The only options available are M and A.
Thus caching brings down the hits to the web server tremendously.
When ExtendedStatus is set to On, Apache issues extra system calls to the OS on every request to get the time: gettimeofday(2) and time(2). This is done so that the status report can include timing information. It's better to set this to Off.
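As a sketch, the setting is a single line at server level. The mod_status.c guard is my addition, on the assumption that the directive is only accepted when mod_status is compiled in or loaded:

```apache
<IfModule mod_status.c>
    # Basic status information only; avoids the extra time calls
    # on every request.
    ExtendedStatus Off
</IfModule>
```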
2.2.(q) Scoreboard File
Since the Apache master server and its children communicate using the scoreboard file, it is always better to keep it in the shared memory area. Usually this will be on the disk. This change is made in
I would suggest that the reader go through the link below for some of the best suggestions from professionals at Google on overall optimization of the website and the web server. It's really worth it.
So I guess these are some of the main areas of Apache to look at while configuring the software. In the next section we will look a bit at testing or benchmarking the Apache server.