Quick & Dirty Guide to Apache

Introduction

There doesn't seem to be a really good book on Apache, ie. one that is to Apache what "DNS & BIND" is to DNS or the Red book is to Unix administration. The least bad I found is O'Reilly's "Apache: The Definitive Guide, 2nd Edition" by Ben Laurie & Peter Laurie, but it already dates back to February 1999. Other introductory books include "Linux Apache Web Server Administration" by Charles Aulds (Craig Hunt Linux Library), "Apache Server Bible" by Mohammed J. Kabir, and "Professional Apache" by Peter Wainwright.

Customizing httpd.conf

ServerName tells Apache which hostname to return to browsers

DocumentRoot: By default, Apache looks for htdocs/ under ServerRoot, but this setting allows you to set up the document root directory elsewhere

To outsource part of the configuration, take advantage of AccessConfig and Include, especially for virtual hosting:

<VirtualHost 192.168.1.1>
    ...
    AccessConfig conf/policy_one.conf
</VirtualHost>

Per-directory configuration files are set with AccessFileName

Conditional configuration can be done using command-line switches such as: httpd -D UseRewrite

<IfDefine UseRewrite>
    ...
</IfDefine>

- OR -

<IfModule mod_rewrite.c>
    ...
</IfModule>

Note: If you want to use other command-line switches, you must stop httpd and start it again; apachectl restart or HUPping httpd won't work.

Three levels of configuration:

When using the latter, AllowOverride is highly recommended.

The different available containers: <Limit> and <LimitExcept>, <Directory> (physical path in filesystem), <Files> (same, but dealing with specific files), <Location> (URL), <VirtualHost>.

Two kinds of directives: Those that are server-level only, and those that can be either general- or local-level, ie. they can be locally overridden. For instance:

<Directory />
    Options none
    AllowOverride none
    order allow,deny
    deny from all
</Directory>
 
<Directory /home/www/*>
    allow from all
</Directory>

Options are inherited, so use + or - to add or remove them, eg. Options -FollowSymLinks

To tell Apache which file to return when a user aims at a directory and not a specific file, use the DirectoryIndex directive. For instance:

DirectoryIndex    index.html index.htm /cgi-bin/fake404.cgi

To set icons and descriptions, use this:

DefaultIcon     /icons/blank.gif
Alias /icons/    /usr/local/apache/htdocs/icons/
AddDescription    "GIF image" *.gif

Environment variables

Different types of variables are available:

Use mod_env to set variables that can be read by CGI scripts:

SetEnv RESOURCE_PATH    /any/directory
PassEnv RESOURCE_PATH
UnsetEnv RESOURCE_PATH

Use mod_setenvif to set variables conditionally through BrowserMatch/BrowserMatchNoCase (eg. BrowserMatch Mozilla netscape=true) and SetEnvIf (eg. SetEnvIf User-Agent Mozilla netscape=true).

In response to clients, some headers are sent along with the document, such as the HTTP status line with the response code, the Content-Type header, and (optionally) one or more other HTTP response headers, eg. Cache-Control or Expires.

To return additional headers, use mod_headers, eg. Header set item value, or Header unset item.

To send information on when a document should expire in a caching server, use mod_expires:

ExpiresActive on
ExpiresDefault A2419200

(A = time of access; the number is a delay in seconds)
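As a quick sanity check on that value, here's a minimal shell sketch that turns a number of days into the seconds count that follows the A prefix. The days_to_seconds function is my own helper, not something Apache provides:

```shell
#!/bin/sh
# Convert a number of days into the number of seconds expected after
# the "A" (access) prefix of ExpiresDefault.
days_to_seconds() {
    # $1 = number of days
    echo $(( $1 * 24 * 60 * 60 ))
}

# Build the directive for a 28-day expiry.
echo "ExpiresDefault A$(days_to_seconds 28)"
```

In other words, A2419200 asks caches to keep the document for 28 days after it was accessed.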

If you want to create HTML documents with their own headers, ie. tell Apache not to send any header information itself, use mod_asis:

  1. Edit httpd.conf:

    <Directory /where/ever>
        SetHandler send-as-is
    </Directory>
     
  2. Add headers in the HTML document:

    Status: 404
    Content-Type: text/html

    <HTML>
    Hi there
    </HTML>

 

Modules

Standard modules

Apache comes with a bunch of modules. Just like third-party modules below, standard modules can be built statically so they are compiled inside the httpd binary, or built dynamically, in which case they exist as independent .so files.

AddModule: Used to load static modules in an order different from the one in which they were compiled in httpd.

ClearModuleList: Unload all static modules. Those needed must be reloaded with AddModule.

LoadModule: Used to load dynamic modules (.so files). Must be located before ClearModuleList.

Enabling modules

To enable modules statically, use "--enable-module=mymodule". To enable most or all standard modules, use (you guessed it) "--enable-module=most" or "--enable-module=all".

To enable a module dynamically, use "--enable-module=mymodule --enable-shared=module". To enable all standard modules dynamically, use "--enable-shared=max". You can always tell Apache not to load some of them by commenting them out in httpd.conf.

Note that when building any module dynamically, Apache will include the mod_so module automatically. If mod_so is not compiled, however, neither is the apxs tool that is needed to add dynamic third-party modules (which makes sense: Without mod_so, no dynamic modules can be loaded.)

Disabling modules

To disable those that you don't need, use "--disable-module=mymodule". If you know that you will never need to use dynamic modules, you can remove the mod_so module with "--disable-module=so".

Loading modules

The order in which modules are loaded is significant, since a URL can be handled successively by different modules.

Static modules are loaded in the order in which they were compiled (use "httpd -l" to check), but this can be changed using the ClearModuleList and AddModule settings in httpd.conf. Note that ClearModuleList really does unload all modules, so make sure you add those you need with AddModule.

Dynamic modules are loaded with the LoadModule setting in reverse order (ie. starting from the last LoadModule setting).

3rd party modules

There are two ways to build third-party modules: APACI or APXS. APACI is run with ./configure, is OK when the module consists of a single file, and is required when the module needs to patch the Apache source code (eg. mod_ssl). APXS is a tool (a Perl script) that is generated when Apache is built with mod_so, and lives under the bin/ sub-directory. APXS is used when the module consists of multiple source files, and requires the Apache header files. Note that since it is generated along with Apache, apxs contains host-specific information such as LIBEXECDIR, etc.

Apaci

The source code of the module must be copied in the src/modules/extra/ sub-directory. You can either copy the source file yourself and pass --activate-module=src/modules/extra/mymodule.c to ./configure, or pass --add-module=/home/mymodule.c instead: --add-module will copy the file to the right location, and tell Apache to include it when compiling.

Using either --activate-module or --add-module compiles a third-party module statically. If you want this module to be built as a dynamic module, add the familiar --enable-shared switch, eg. ./configure --activate-module=src/modules/extra/mymodule.c --enable-shared=mymodule .

Apxs

Apxs is used to build dynamic modules. To compile and install a module, and add the required LoadModule/AddModule settings in httpd.conf: apxs -c -i -a mod_mymodule.c . (AddModule is needed for dynamic modules too when httpd.conf uses ClearModuleList, since every module must then be re-activated.) Additional options are -n (to indicate the name of the module if it can't be inferred from the name of its source file), and -A (add the ad hoc LoadModule/AddModule settings, but comment them out.)

Examples

The following compiles PHP as a static module (as shown by the --with-apache switch). I chose PHP, but the procedure is identical for other modules, and involves copying compiled binaries into Apache's source tree so they are included when compiling Apache itself:

  1. tar xzvf apache_1.3.x.tar.gz
  2. tar xzvf php-x.x.x.tar.gz
  3. cd apache_1.3.x ; ./configure --prefix=/usr/local/apache
  4. cd ../php-x.x.x ; ./configure --with-mysql=/usr/local/mysql --with-apache=../apache_1.3.x --enable-track-vars ; make ; make install
  5. cd ../apache_1.3.x ; ./configure --prefix=/usr/local/apache --activate-module=src/modules/php4/libphp4.a ; make ; make install
  6. Edit Apache's httpd.conf configuration file in /usr/local/apache/conf/, and add or uncomment AddType application/x-httpd-php .php. Also edit the IfModule mod_dir.c section to add index.php
  7. Start Apache through /usr/local/apache/bin/apachectl start

Here's how to build PHP and mod_perl as dynamic modules (DSO). Once again, I chose PHP and mod_perl as examples, but you will have to proceed the same way to compile and launch any DSO module. You can tell whether your version of Apache supports DSO modules by running httpd -l, which should list mod_so.c:

  1. Compile Apache with ./configure --prefix=/usr/local/apache --enable-module=so ; make; make install
  2. tar xzvf mod_perl.tar.gz ; cd mod_perl ; perl Makefile.PL USE_APXS=1 WITH_APXS=/usr/local/apache/bin/apxs EVERYTHING=1 ; make ; make install
  3. tar xzvf php-x.x.x.tar.gz ; cd php ; ./configure --with-apxs=/usr/local/apache/bin/apxs  ; make  ; make install
  4. Check that Apache's httpd.conf includes the AddModule and LoadModule lines to handle the PHP and mod_perl modules
  5. Launch Apache through /usr/local/apache/bin/apachectl start, or if it is running, tell it to re-read its configuration file through killall -HUP httpd
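If you'd rather script the DSO check than eyeball the output, here's a minimal sketch. It assumes that httpd -l lists compiled-in modules by source file name (mod_so.c); supports_dso is a made-up helper, and the sample output is canned so the sketch runs without Apache installed:

```shell
#!/bin/sh
# Succeeds if the "httpd -l" output read on stdin lists mod_so.c,
# ie. if this httpd binary is able to load DSO modules.
supports_dso() {
    grep -q 'mod_so\.c'
}

# Canned sample of "httpd -l" output, used only for this demonstration.
sample='Compiled-in modules:
  http_core.c
  mod_so.c'

if printf '%s\n' "$sample" | supports_dso; then
    echo "DSO supported"
fi
```

Against a real install, this would be called as /usr/local/apache/bin/httpd -l | supports_dso && echo "DSO supported".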

Here's how to build Apache with SSL, mod_ssl, mod_perl, and PHP as static modules. I'll assume that OpenSSL is already installed through RPM in order to connect to the server through OpenSSH.

  1. Install the following Perl modules (needed by mod_perl), each with perl Makefile.PL ; make ; make test; make install: MIME-Base64, HTML-Tagset, HTML-Parser, URI, libwww-perl
  2. Untar mod_ssl, mod_perl, PHP, and Apache in /usr/src/ ; cd /usr/src/
  3. cd apache ; ./configure --prefix=/usr/local/apache
  4. cd ../mod_perl ; perl Makefile.PL EVERYTHING=1 APACHE_SRC=../apache/src USE_APACI=1 PREP_HTTPD=1 DO_HTTPD=1 ; make ; make install
  5. (If OpenSSL not already installed) cd ../openssl ; ./config --prefix=/usr/local/ssl ; make ; make test ; make install
  6. cd ../mod_ssl ; ./configure --with-apache=../apache
  7. cd ../php ; ./configure --enable-track-vars --with-apache=../apache --without-mysql ; make ; make install

    Note: Remember to not leave any trailing "/" in paths
     
  8. cd ../apache ; SSL_BASE=SYSTEM ./configure --prefix=/usr/local/apache --add-module=src/modules/standard/mod_auth.c --add-module=src/modules/standard/mod_auth_db.c --add-module=src/modules/standard/mod_usertrack.c --enable-module=ssl --enable-shared=ssl --activate-module=src/modules/php4/libphp4.a --enable-module=php4 --activate-module=src/modules/perl/libperl.a --enable-module=perl --disable-module=auth_dbm

    Note: If OpenSSL was not installed through RPM, substitute SSL_BASE=SYSTEM with SSL_BASE=../openssl
     
  9. make ; make certificate ; make install
  10. Edit /usr/local/apache/conf/httpd.conf, uncomment the PHP-related AddType lines, add index.php to the DirectoryIndex line
  11. Launch Apache in plain HTTP with /usr/local/apache/bin/apachectl start, and aim your browser at that host, eg. http://localhost/
  12. Stop Apache, and re-start it in SSL mode with /usr/local/apache/bin/apachectl startssl, and aim at https://localhost/ . You should be warned that the certificate was not signed by a recognized organization (ie. VeriSign et al.)

To relate handlers to URLs, Apache provides two directives: SetHandler and AddHandler. SetHandler is more basic, as it causes all files in or below its location to be interpreted with the specified handler; AddHandler is more flexible: it maps given file extensions to a handler, such as cgi-script.

Some samples:

<Location /type-maps>
    SetHandler type-map
</Location>
<Location /type-maps>
    AddHandler type-map .map
</Location>
<Location /cgi>
    AllowOverride None
    Options ExecCGI
    AddHandler cgi-script .cgi
</Location>
 

mod_proxy

Apache can act as a proxy server through mod_proxy. To enable proxying: ProxyRequests on, eg.

Listen 8080
<virtualhost 192.168.1.1:8080>
ServerName .net
ProxyRequests on
</virtualhost>

Mirroring: ProxyPass /Linuxdocs http://www.linuxdoc.org

Reverse proxy : outside -> proxy -> inside www

ProxyPass /CPAN http://inside.net/CPAN

ProxyPassReverse /CPAN http://inside.net/CPAN

mod_expires

To expire cached pages: mod_expires + HTTP headers (Expires in HTTP/1.0 & Cache-Control in HTTP/1.1)

To block access: ProxyBlock gambling sex

The connect method: allow connections to a remote server

kHTTPd : HTTPd in kernel

cgi

CGI: interface to run any program. Must either end in .cgi or be located in cgi-bin directory. To define a handler for cgi scripts: ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/". Watch out for permissions, eg. nobody.www 760

CGI.pm : big Perl module to perform web-related tasks + manage cgi interface to the www server

FastCGI

FastCGI: scripts loaded during HTTPd start

mod_cgi

Provides the cgi-script handler to execute a URL as a CGI script, and requires the Options ExecCGI directive.

mod_imap

Provides the imap-file handler to interpret URLs as imagemaps

mod_asis

Provides the send-as-is handler to send files without additional headers. The file is responsible for carrying its own headers to be interpreted correctly

mod_info

Provides the server-info handler. It generates an HTML page of server configuration. For security reasons, you should enforce access restriction

mod_include

Provides the server-parsed handler to parse files for server-side includes. Requires the Options Includes directive.

mod_status

Provides the server-status handler to generate an HTML page of server status. For security reasons, you should enforce access restriction

mod_negotiation

Provides the type-map handler to interpret URLs as a type map for content negotiation

mod_perl

mod_perl : better performance than plain CGI. Apache::Registry module -> if script hasn't changed, no recompile.

mod_perl requires a lot of Perl modules -> #cpan ; install mod_perl

To compile: perl Makefile.PL APACHE_SRC=/usr/src/apache/src USE_APACI=1 PREP_HTTPD=1 DO_HTTPD=1 EVERYTHING=1

Virtual hosting

Three types of virtual hosting are available:
  1. IP-based (required by SSL)
  2. Name-based (to save IPs; requires HTTP 1.1-capable browsers)
  3. Dynamic virtual host: Configuration not fixed

For name-based virtual hosts, if no name matches, the request is served by the primary virtual host. Note that the IP address used for name-based virtual hosting is not available to the primary server.

To avoid being dependent on DNS resolution, always specify IP address + servername

httpd -S parses the configuration file and dumps its output.

It's possible to define a default virtual server (* = all ports)

<VirtualHost _default_:*>
    DocumentRoot ...
</VirtualHost>

Controlling indexing from robots

You might want to forbid search engines from indexing all or part of your site. This can be achieved by using either a robots.txt at the highest level in the DocumentRoot directory, HTML meta tags in each HTML document, or directives in httpd.conf. Note that robots.txt implies that you trust search engines to follow its instructions.
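For the robots.txt route, the file is plain text: a User-agent line followed by one Disallow line per URL path to exclude. Here's a small shell sketch that writes one. The write_robots_txt helper and the /private/ and /drafts/ paths are hypothetical, and a scratch directory stands in for DocumentRoot:

```shell
#!/bin/sh
# Write a robots.txt that asks all robots to stay away from the given paths.
write_robots_txt() {
    # $1 = DocumentRoot directory; remaining arguments = URL paths to exclude
    docroot=$1
    shift
    {
        echo "User-agent: *"
        for path in "$@"; do
            echo "Disallow: $path"
        done
    } > "$docroot/robots.txt"
}

# Demonstration in a scratch directory standing in for DocumentRoot.
demo_root=$(mktemp -d)
write_robots_txt "$demo_root" /private/ /drafts/
cat "$demo_root/robots.txt"
rm -rf "$demo_root"
```

Remember that this is only a request: well-behaved robots honor it, others simply ignore the file.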

Sending data to clients

Triggering CGI scripts on events

Apache can be set up to run a CGI script depending on the type of resource requested, its file extension, or the HTTP method. All of these use the Action or Script directive supplied by the mod_actions module. Here's how to set up those three handlers:

By resource types

Action text/html /cgi-bin/parse-html.cgi

By file extension

Action my-handler    /cgi-bin/myhandler.cgi
AddHandler my-handler    .myextension

By HTTP Method

Script    PUT    /cgi-bin/put.cgi

Content handling and negotiation

The value returned in the Content-Type header when a user retrieves a document can be defined in different ways. By default, Apache reads this information from a two-column file usually called mime.types (this can be changed with the TypesConfig directive), eg. "text/html    html htm". This type-extension association can also be set in the client browser so it knows how to handle a file if the server didn't return a Content-Type which it understands.
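To make the mime.types format concrete, here's a rough sketch of the lookup in shell: first column is the type, remaining columns are the extensions. This only models the file format, not Apache's actual code; mime_type_for and the sample file are made up for the demonstration:

```shell
#!/bin/sh
# Print the MIME type associated with a file extension in a
# mime.types-style file.
mime_type_for() {
    # $1 = extension without the dot, $2 = path to the mime.types file
    awk -v ext="$1" '
        /^#/ { next }    # skip comment lines
        {
            # Fields 2..NF are the extensions mapped to the type in field 1.
            for (i = 2; i <= NF; i++)
                if ($i == ext) { print $1; exit }
        }
    ' "$2"
}

# Hypothetical sample file, used only for this demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# type          extensions
text/html       html htm
image/gif       gif
EOF

mime_type_for htm "$sample"
rm -f "$sample"
```

An extension that is not listed prints nothing, which is the case where Apache falls back on its default type.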

The information contained in mime.types can be supplemented by editing httpd.conf and adding the directives AddType or Action. For instance, "AddType text/mylanguage .myl .mylanguage" or "Action image/gif /cgi-bin/process-gif-image.cgi". The latter lets a script handle the request instead of sending a file directly.

If a file is encoded (eg. ZIP file, BinHex file, etc.), use the AddEncoding directive so that Apache will send a Content-Encoding header:

AddEncoding zip .zip

... will result in the following header:

Content-Encoding: zip

Conversely, four HTTP headers can be sent by the client to tell Apache which files it can handle: Accept, Accept-Charset, Accept-Encoding, and Accept-Language. Not all browsers use those correctly, so Apache needs to do a bit of guessing. More sophisticated content negotiation can be achieved with MultiViews and type maps.

To avoid supplementing mime.types, you can use the mod_mime_magic module which tries to guess the type of a file by looking for patterns inside it, just like the "file" command works in Unix. Obviously, this is more CPU- and disk-intensive, so mod_mime_magic is usually loaded after the lighter mod_mime module.

Error handling

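Status codes are organized into 5 categories, identified by the first digit of the code: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error), and 5xx (server error).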

Apache provides the ErrorDocument directive so you can customize how Apache responds to errors. Some examples (note the absence of quotes in the first line):

Aliases and redirection

If the client requests a document that doesn't exist, you can let Apache either try to rewrite the URI or redirect the client to another URI. Rewriting can be done by editing httpd.conf, either through some basic directives or by more advanced directives offered by two modules: mod_alias and mod_rewrite; the latter is more sophisticated but is much larger.

Alias /icons/ /usr/local/apache/icons/
AliasMatch /(.*)/images/(.*)\.gif$ /usr/local/apache/images/$1/$2.gif
ScriptAliasMatch ^/cgi-bin/(.*)\.cgi$ /usr/local/apache/cgibin/$1.cgi
 
Redirect permanent /archive http://www.acme.com/archive/temp

Besides "permanent", you can use "temp" (which is the default), "seeother", and "gone".

mod_rewrite

The main directive offered by mod_rewrite is RewriteRule, which is a more powerful alternative to AliasMatch. The real beauty of mod_rewrite is that it supports flow control:

RewriteCond %{HTTP_USER_AGENT} ^Lynx [NC]
RewriteRule ^/$ /lynx-index.html [L]

More information available at http://httpd.apache.org/docs/mod/mod_rewrite.html

Here's how to redirect the user to a new domain with the same docs/ tree:

RewriteEngine on
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]

R=301 tells the browser that it's being permanently redirected, which is a convenient way to tell search engines to update the URLs to point to the new domain.

An easier alternative in case the old and new servers have the exact same repository tree:

RedirectPermanent / http://www.new.com/

ie. replace the root of the URL with the new domain, and combine this with the rest of the URL, eg. http://www.old.com/mydoc.html is turned into http://www.new.com/mydoc.html .
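Here's a simplified model of that substitution in shell. It assumes that mod_alias strips the matching prefix from the requested path and appends what is left to the target URL verbatim (the real module also checks path-segment boundaries, which this sketch ignores). redirect_target is a made-up helper:

```shell
#!/bin/sh
# Print the URL a client would be redirected to, or fail if the
# requested path does not start with the prefix.
redirect_target() {
    # $1 = URL-path prefix from the directive
    # $2 = target URL from the directive
    # $3 = path requested by the client
    case "$3" in
        "$1"*) printf '%s%s\n' "$2" "${3#"$1"}" ;;
        *)     return 1 ;;
    esac
}

# With a trailing slash on the target, the result is what you expect.
redirect_target / http://www.new.com/ /mydoc.html

# Without it, host name and document name run together.
redirect_target / http://www.new.com /mydoc.html
```

The second call prints http://www.new.commydoc.html, which is why the target URL should end with a slash whenever the prefix does.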

mod_imap

Nothing to do with the IMAP mail protocol. This is used to create clickable images, and redirect the client to a different URL depending on where the user clicked on the image.

mod_speling

This corrects minor misspellings and capitalization errors in requested URLs, and is enabled by using the "CheckSpelling on" directive.

Server-side Scripting

Dynamic pages can be built with client-side scripts (JavaScript), but server-side scripts are much more powerful and can usually be used with basic web browsers on the client side. Apache supports several modules for server-side scripting, such as SSI and XSSI (eXtended SSI), CGI and FastCGI, PHP, JSP, ASP, and Perl.

Technically speaking, CGI is not a programming language but a protocol for scripts to gather information from a user request and respond accordingly. Scripts can be shell scripts, Perl or Python scripts, and binaries. Scripts retrieve information sent by the browser through either environment variables (GET method) or standard input (POST method.) The GET method has the advantage that users can bookmark the URL, but it also makes it easier for hackers to play tricks; the POST method is safer in that respect.

When using the GET method, the main variables that scripts can read are: REQUEST_METHOD, PATH_INFO, PATH_TRANSLATED, QUERY_STRING, and SCRIPT_NAME. Here's a simple way to print out all environment variables:

#!/bin/sh
echo "Content-type: text/plain"
echo
env
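To show the difference between the two methods from the script's side, here's a minimal sketch of a shell function that fetches the raw form data either way. It relies on the standard CGI variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH; read_form_data is my own name, and no URL-decoding is attempted:

```shell
#!/bin/sh
# Print the raw form data of the current request, whatever the method.
read_form_data() {
    case "$REQUEST_METHOD" in
        GET)
            # GET: the data follows the "?" in the URL and is passed
            # in the QUERY_STRING environment variable.
            printf '%s' "$QUERY_STRING"
            ;;
        POST)
            # POST: the data is the request body, available on standard
            # input; CONTENT_LENGTH says how many bytes to read.
            dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
            ;;
        *)
            return 1
            ;;
    esac
}

# Simulate a GET request from the shell.
REQUEST_METHOD=GET
QUERY_STRING="firstname=John&surname=Smith"
read_form_data
echo
```

Reading exactly CONTENT_LENGTH bytes matters for POST, since the server is not required to close standard input after the body.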

Basic CGI script

Shell script

 
#!/bin/sh
echo "Content-Type: text/html"
echo
echo "<HTML><HEAD><TITLE>My first CGI script</TITLE></HEAD>"
echo "<BODY>Hello, world!</BODY></HTML>"
 

Perl script

#!/usr/bin/perl -Tw
print "Content-Type: text/html\n\n";
print "<HTML><HEAD><TITLE>My first CGI script</TITLE></HEAD>";
print "<BODY>Hello, world!</BODY></HTML>";

Basic form and script

Form

<FORM METHOD=GET ACTION="/cgi-bin/askname.cgi">
    Please enter your name:<P>
    First name: <INPUT NAME="firstname" TYPE=TEXT><P>
    Last name: <INPUT NAME="surname" TYPE=TEXT><P>
    <INPUT NAME="OK" TYPE=SUBMIT>
</FORM>

Script

#!/usr/bin/perl -Tw
# askname.cgi
 
use CGI;
use strict;
 
my $cgi=new CGI;
 
print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>My first CGI script</TITLE></HEAD><BODY>";
print "Hello ",$cgi->param("firstname"), " ", $cgi->param("surname");
print "</BODY></HTML>";

Testing a CGI script manually

Since CGI scripts are run through Apache, they can be a pain to debug. A quick and dirty way is to set environment variables from the shell, and call the script:

# export QUERY_STRING="firstname=John&surname=Smith"
# export REQUEST_METHOD="GET"
# /usr/local/apache/cgi-bin/askname.cgi
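If you do this often, the exports can be wrapped in a pair of helpers so that the variables are set for the script only and don't linger in your shell. This is just a sketch: run_cgi_get and run_cgi_post are made-up names, and only a few of the variables Apache would normally set are provided:

```shell
#!/bin/sh
# Run a command as if Apache had received a GET request for it.
run_cgi_get() {
    # $1 = query string; remaining arguments = the script to run
    query=$1
    shift
    env REQUEST_METHOD=GET QUERY_STRING="$query" "$@"
}

# Same for a POST request: the form data is fed to the script on stdin.
run_cgi_post() {
    # $1 = form data (plain ASCII assumed, so that its length in
    # characters equals its length in bytes)
    data=$1
    shift
    printf '%s' "$data" |
        env REQUEST_METHOD=POST CONTENT_LENGTH="${#data}" "$@"
}

# Demonstration with a one-line stand-in for a real CGI script.
run_cgi_get "firstname=John&surname=Smith" \
    sh -c 'echo "QUERY_STRING=$QUERY_STRING"'
```

With the script from this section, the call would be run_cgi_get "firstname=John&surname=Smith" /usr/local/apache/cgi-bin/askname.cgi.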

Calling a URL manually

Using GET has the advantage that users can bookmark the URL, but URL length is limited (to be safe, assume 256 characters). POST is required when sending more data, but makes it impossible for the user to save the URL with all its parameters.

GET method

GET /cgi-bin/askname.cgi?firstname=John&surname=Smith HTTP/1.1
Host: www.acme.com

POST method

POST /cgi-bin/askname.cgi HTTP/1.1
Host: www.acme.com
Content-length: 28
 
firstname=John&surname=Smith
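Counting the bytes of the body by hand is error-prone, so here's a small sketch that builds such a POST request and computes the length itself. build_post_request is a made-up helper; the Content-Type line is my addition (it is what browsers send for a regular form), and the body is assumed to be plain ASCII:

```shell
#!/bin/sh
# Print a raw HTTP/1.1 POST request with a matching Content-Length.
build_post_request() {
    # $1 = URL path, $2 = host name, $3 = URL-encoded form data
    printf 'POST %s HTTP/1.1\r\n' "$1"
    printf 'Host: %s\r\n' "$2"
    printf 'Content-Type: application/x-www-form-urlencoded\r\n'
    printf 'Content-Length: %s\r\n' "${#3}"
    printf '\r\n'              # empty line separates headers from body
    printf '%s' "$3"
}

build_post_request /cgi-bin/askname.cgi www.acme.com \
    "firstname=John&surname=Smith"
echo
```

The length only covers the body itself, ie. the 28 characters of firstname=John&surname=Smith, not the line break you may type after it.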

Running CGI scripts

You need to configure Apache so it knows which directories contain CGI scripts, and which file extensions they use. CGI scripts are handled by the mod_cgi module, the ScriptAlias directive, and the ExecCGI option. ScriptAlias is useful to restrict CGI scripts to a single directory, outside DocumentRoot, so as to forbid users from uploading CGI scripts. You can have as many ScriptAlias directives as you wish.

Security can be enhanced by forbidding the use of .htaccess files in a CGI-only directory:

Alias /cgi-bin/    "/usr/local/apache/cgi-bin/"
<Directory /usr/local/apache/cgi-bin>
    AllowOverride None
    Options ExecCGI
    SetHandler cgi-script
</Directory>

To restrict the use of CGI scripts yet more, you can restrict this to a single file using the Files directive:

<Files "/home/web/alpha-complex/welcome">
    AllowOverride None
    Options ExecCGI
    SetHandler cgi-script
</Files>

As an alternative to using a SetHandler directive as above, use either "AddHandler cgi-script .cgi .exe" and an ExecCGI option, or MIME types using mod_cgi's "AddType    application/x-httpd-cgi    .cgi"

SSI

SSI is short for Server-Side Include, and is provided by the mod_include module. To tell Apache to handle SSI:

AddHandler server-parsed .shtml (alternative: AddType text/x-server-parsed-html .shtml)
 
<Location /ssidocs>
    Options +Includes
</Location>

An alternative to Includes is IncludesNOEXEC, which disables any command that causes script execution.

Improving performance with FastCGI

Using the mod_fastcgi module, which provides the fastcgi-script handler, scripts are run persistently, ie. no need for Apache to set up the script environment, start up the script, etc. Under FastCGI, scripts can have three roles (Responder: like regular CGI scripts, Filter: Convert between input and output media type, and Authorizer: to authenticate HTTP requests and users, and can be used with mod_auth and mod_auth_dbm) and three types (Dynamic: started when the URL is first accessed, Static: started up with Apache, and External: located on a different host). Apache talks to FCGI scripts through a socket.

You can change the behavior of FCGI scripts through the command line. For instance, here's how to tell Apache to restart FCGI scripts that exit after 10 seconds and to restrict scripts to 5 instances at any one time: FastCgiConfig -restart -restart-delay 10 -maxprocesses 5 . Here's how to start a static FCGI script: FastCgiServer /cgi-bin/askname.cgi -init-start-delay 5 . And here's how to tell Apache to run an external FCGI script on a remote host: FastCgiExternalServer /cgi-bin/external.fcgi -host fcgi.alpha-prime.com:2001 .

Here's a basic FCGI script:

#!/usr/bin/perl -Tw
# askname.fcgi
use CGI;
use FCGI;
use strict;
while (FCGI::accept()>=0) {
    my $cgi=new CGI;
    
    print $cgi->header();
    print $cgi->start_html("CGI Demo");
    print "Hello ",$cgi->param("firstname"), " ",$cgi->param("surname");
    print $cgi->end_html();
}

And here's how to use a FCGI script to authenticate users:

<Location /protected>
    AuthName "Top Secret"
    AuthType Basic
    FastCgiAuthenticator cgi-bin/authenticate.fcgi
    require user john jane
</Location>

Finally, FastCGI scripts can be run through a wrapper, either SuExec (default: FastCgiSuexec on), or another binary (FastCgiSuexec /path/to/different/wrapper).

suExec

If Apache is compiled with suExec enabled, CGI scripts run under a different user account from the one used by the main server (ie. not root.) To compile suExec in, run "./configure --enable-suexec --suexec-caller=nobody" (--suexec-caller is the user account that is allowed to call suExec, which should be the account set by the User directive in httpd.conf. It is "www" by default.)

Q&A

I can no longer compile!

If you've compiled Apache several times, with different options, and you can no longer compile successfully, just rm -Rf the source tree, and start from a clean install.

How to test a new configuration

Run httpd -t

Using apachectl

"apachectl restart" closes all active connections. "apachectl graceful" waits for existing connections to close

Testing a connection

Since it is one of the few HTTP methods that are available since release 0.9 of the HTTP protocol, you can connect in telnet mode to the TCP port on which Apache is listening, and issue "GET /" without the quotes.

What is the difference between activate-module, add-module, and enable-module?

Is enable-module used to activate the use of third-party modules whose source has been included in the Apache source tree (eg. SSL), while other third party modules which do not need to patch the Apache source first need to be set with activate-module?

Are activate-module (copy source into Apache's directory, and include module) and add-module (assume source was copied beforehand, and just include it) used for standard Apache modules, while enable-module is used for third-party modules?

Securing an Apache server

As the long list of defaced web sites over at Attrition or Alldas shows, web servers are a favorite target for hackers. To lower the risks of having your web server be defaced, follow those tips:

Hacking web servers

Checking the brand name and version number

Telnet to the server's port 80, and type HEAD / HTTP/1.0 . Chances are this will return at least the type of server it is running and its version:

Server: Netscape-Enterprise/3.5.1I  
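To pull just that line out of a response, here's a minimal shell sketch. server_banner is a made-up helper that reads a raw HTTP response on standard input; the response below is canned so the sketch runs without a network connection:

```shell
#!/bin/sh
# Print the value of the Server header found in an HTTP response
# read on standard input.
server_banner() {
    # Strip carriage returns, then keep what follows "Server:".
    tr -d '\r' | sed -n 's/^[Ss]erver: *//p' | head -n 1
}

# Canned response, used only for this demonstration.
printf 'HTTP/1.0 200 OK\r\nServer: Netscape-Enterprise/3.5.1I\r\n\r\n' |
    server_banner
```

Against a live host, and assuming nc is installed, the response could come from printf 'HEAD / HTTP/1.0\r\n\r\n' | nc www.acme.com 80 instead.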

Development

SSI = server-side includes, ie. dynamic HTML. Requires mod_include

XSSI = extended SSI

Note: SHTML is often short for SSI HTML pages

AddType Application/x-httpd-php-source .phps

(.phps shows the source)

Aliasing = mapping a client's URL into a non-standard location and automatically retrieving the resource from this location eg. alias /icons "/usr/local/apache/icons"

Redirection: mod_alias (for aliasing) and mod_rewrite (for redirection).

apachectl configtest

mod_status -> www.mysrv.com/server-status

To track users: mod_usertrack + mod_session

Cookies

The server sends a Set-Cookie header in its HTTP response (Set-Cookie: name=value; expires=date; path=path; domain=domain-name; secure).
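Outside of mod_usertrack, a CGI script can send that header itself. Here's a minimal sketch in shell; set_cookie_header and the "visitor" cookie are hypothetical, and only the name, value, and path attributes are handled (no expires, domain, or secure):

```shell
#!/bin/sh
# Print a Set-Cookie header line.
set_cookie_header() {
    # $1 = cookie name, $2 = value, $3 = path (optional)
    printf 'Set-Cookie: %s=%s' "$1" "$2"
    if [ -n "$3" ]; then
        printf '; path=%s' "$3"
    fi
    printf '\n'
}

# A complete CGI response that sets the cookie: headers first,
# then an empty line, then the document.
set_cookie_header visitor 12345 /
echo "Content-Type: text/plain"
echo
echo "Cookie sent."
```

The browser then returns the cookie in a Cookie header on later requests to the same path.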

mod_cookies -> mod_usertrack (is a standard module)

Security

Controlling access

.htaccess

By default, Apache looks for a file called .htaccess. Here's a sample:

AuthUserFile /etc/httpd/.htpasswd
AuthGroupFile /dev/null
AuthName "Acme Secret Section"
AuthType Basic
<Limit GET POST>
    require valid-user
</Limit>

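You can improve performance by telling Apache to not look for .htaccess files outside the DocumentRoot directory (assumed here to be /usr/local/apache/htdocs). Note that AllowOverride only takes effect in <Directory> sections, not in <Location>:

<Directory />
    AllowOverride None
</Directory>
 
<Directory /usr/local/apache/htdocs>
    AllowOverride All
</Directory>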

You can tell Apache which directives can be overridden in an .htaccess file using the AllowOverride directive, eg. "AllowOverride FileInfo Indexes". The order in which the directives allow, deny, and Satisfy are checked can be overridden in .htaccess if the Limit override is enabled (it is by default, as AllowOverride defaults to All). For higher security, list only the overrides you need and leave out Limit.

Indexing

If a user aims at a directory for which no default document is available, as set by the DirectoryIndex directive, you can tell Apache to list the content of this directory by using the mod_autoindex module. For security reasons, it is recommended to disable this feature:

<Location />
    Options -Indexes
</Location>
 
<Location /ftp/>
    Options +Indexes
    IndexOptions FancyIndexing ScanHTMLTitles
</Location>

Note: ScanHTMLTitles tells Apache to open any file ending in .HTML or .HTM, read its Title section (if any), and display its content in the Description column. It's CPU and harddisk-intensive, so you might want to only use this on small Intranet servers.

Access control can be host-based, user-based, or both. With "Satisfy any", a client is let in if it passes either check; with "Satisfy all" (the default), it must pass both:

<Location ...>
    AuthName "Registered Only"
    AuthType Basic
    AuthDBUserFile    /usr/local/apache/conf/password.dbm
    require valid-user
    order deny,allow
    deny from all
    allow from 192.168.1 192.168.2
    Satisfy any
</Location>

X.509 Certificates

In this scheme, the server's public key is stamped ("certified") by a well-known organization such as VeriSign or Thawte to make sure that the public key does indeed belong to the server to which you are trying to connect. Note that a certificate is granted to a specific host, so that you'll need to get a new certificate should the web server's IP address/hostname change. Browsers come with the list of public keys of major CAs (Certificate Authorities) so that they can check that the certificate was indeed signed by a reliable authority without prompting the user for any validation.

For information, the SSL package is available through either Apache-SSL or mod_ssl (better documented, and easier to install.) Two open-source packages implement SSL: SSLeay (discontinued), and OpenSSL.

mod_access

mod_access is used to forbid access based on the client's hostname or IP address. Access can be set in either httpd.conf or .htaccess.

SetEnvIf Referer "someremotehost\.com" deny_this
<Directory />
    order allow,deny
    allow from 192.168.1
    deny from env=deny_this
</Directory>
 
Other example:
<Directory />
    order deny,allow
    deny from all
    allow from .acme.com 192.168.0.0/16
</Directory>

Note that you should use IP addresses instead of host or domain names to improve performance and lower the consequences of losing access to the DNS.

Multiple authentication modules can be specified in httpd.conf. Watch out for the fact that the task of authenticating user requests is based on the reverse of the order in which AddModule directives are specified, ie. the last authentication module has the highest priority. If a module is said to be authoritative, authentication will not be passed to lower-priority modules (to disable authoritativeness, use eg. AuthDBMAuthoritative off).

mod_setenvif

This module lets you allow or deny access based on information returned by the client browser. It provides two directives: BrowserMatch and SetEnvIf. Some examples:

BrowserMatch ^Mozilla netscape_browser
<Directory />
    order deny,allow
    deny from all
    allow from env=netscape_browser
</Directory>

SetEnvIf Request_Protocol ^HTTP/1\.1 http_11_OK
<Directory />
    order deny,allow
    deny from all
    allow from env=http_11_OK
</Directory>
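
To make the mechanism concrete, here is a rough Python sketch of what BrowserMatch followed by "allow from env=" amounts to: a regular expression is tried against the User-Agent header, and a variable is set when it matches. The function names and the dictionary standing in for Apache's per-request environment are assumptions made for this illustration.

```python
import re


def browser_match(user_agent, pattern, variable, env):
    """Set env[variable] when the User-Agent header matches the pattern."""
    if re.search(pattern, user_agent):
        # A variable given without an explicit value is set to "1".
        env[variable] = "1"
    return env


def allowed_by_env(env, variable):
    """Equivalent of 'deny from all' plus 'allow from env=variable'."""
    return variable in env


if __name__ == "__main__":
    env = browser_match("Mozilla/4.7 [en] (X11; Linux)", r"^Mozilla",
                        "netscape_browser", {})
    print(allowed_by_env(env, "netscape_browser"))
```

Keep in mind that the User-Agent header is supplied by the client and is trivial to forge, so this is a convenience filter rather than a security measure.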

mod_auth

Authentication is performed against a text file created by htpasswd, eg. htpasswd -c /usr/local/apache/auth/userfile jdoe (omit -c when adding users to an existing file). By default, hashing is done through crypt(), but the -m, -s, and -p switches can be used to tell htpasswd to use MD5, SHA, and plain text, respectively.
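
To show what ends up in the user file, here is a small Python sketch that builds an entry in the format produced by htpasswd -s: the user name, a colon, the literal prefix {SHA}, and the Base64-encoded SHA-1 digest of the password. The function name is made up for this example, and the other schemes (crypt(), MD5) are not covered.

```python
import base64
import hashlib


def htpasswd_sha_line(user, password):
    """Build a 'user:{SHA}...' line as written by 'htpasswd -s'."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return "%s:{SHA}%s" % (user, encoded)


if __name__ == "__main__":
    # The password here is a placeholder for illustration only.
    print(htpasswd_sha_line("jdoe", "password"))
```

Note that this scheme is unsalted: two users with the same password get the same hash.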

Access is controlled through a .htaccess text file saved in the directory to be protected. For example,

AuthName "mod_auth Test Realm"
AuthType Basic
AuthUserFile /usr/local/apache/auth/userfile
AuthGroupFile /usr/local/apache/auth/groupfile
require user jdoe
require group admins
... where groupfile is:
admins: jdoe janedoe

mod_digest & mod_auth_digest

mod_auth_digest is the more recent of the two. The password is never sent in plain text; instead, the browser uses it to compute a checksum that must match the checksum computed on the server for authentication to succeed. By default, digest authentication uses MD5.

To create a user database for MD5 authentication, run eg. htdigest -c /usr/local/apache/auth/password.MD5 "Just testing" jdoe.

AuthName "Digest Authentication"
AuthType Digest
AuthDigestFile /usr/local/apache/auth/password.MD5
AuthDigestGroupFile /usr/local/apache/auth/groups
AuthDigestDomain /MD5Protected/ /private/
Require group WebAdmins
Note: AuthDigestDomain contains URLs; Digest authentication is always authoritative.
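
The checksum exchange can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses the basic form of HTTP digest authentication (no qop, cnonce or nonce count), the function names are invented for this example, and the nonce value is a placeholder. The first function builds a line in the format stored by htdigest (user:realm:hash); the second computes the response the browser sends back.

```python
import hashlib


def md5_hex(text):
    """Hex-encoded MD5 of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def htdigest_line(user, realm, password):
    """Line as stored in the password file: user:realm:MD5(user:realm:password)."""
    return "%s:%s:%s" % (user, realm, md5_hex("%s:%s:%s" % (user, realm, password)))


def digest_response(ha1, nonce, method, uri):
    """Response hash in the basic scheme: MD5(HA1:nonce:MD5(method:uri))."""
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))


if __name__ == "__main__":
    # Server side: the stored hash comes from the password file.
    stored_ha1 = htdigest_line("jdoe", "Just testing", "secret").split(":")[2]
    # Client side: the browser derives the same hash from the typed password.
    client_ha1 = md5_hex("jdoe:Just testing:secret")
    nonce = "placeholder-nonce"  # normally generated by the server
    server = digest_response(stored_ha1, nonce, "GET", "/private/")
    client = digest_response(client_ha1, nonce, "GET", "/private/")
    print(server == client)
```

Because the realm is part of the stored hash, changing the AuthName of a protected area invalidates the existing entries in the password file.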

mod_auth_db

Note: Under Linux, use mod_auth_db (Berkeley DB) instead of mod_auth_dbm (Unix DBM), as DBM support is provided by the GnuDBM library, which provides an imperfect emulation of DBM schemes like NDBM.

To activate the use of mod_auth_db, compile it either as a static or dynamic module, and add the following directives to httpd.conf:

# LoadModule is only needed for dynamic modules (DSO)
LoadModule db_auth_module libexec/mod_auth_db.so
AddModule mod_auth_db.c

Build an .htaccess file in the directory you want to protect:

AuthName "DB Authentication Realm"
AuthType basic
AuthDBUserFile /usr/local/apache/auth/dbpasswds
AuthDBGroupFile /usr/local/apache/auth/groups.dbm
require group WebAdmins
AuthDBAuthoritative On

A Berkeley DB file is created through the Perl script /usr/bin/dbmmanage (where is it?):

# rpm -ql db3-utils
/usr/bin/berkeley_db_svc
/usr/bin/db_archive
/usr/bin/db_checkpoint
/usr/bin/db_deadlock
/usr/bin/db_dump
/usr/bin/db_dump185
/usr/bin/db_load
/usr/bin/db_printlog
/usr/bin/db_recover
/usr/bin/db_stat
/usr/bin/db_upgrade
/usr/bin/db_verify

Third-party authentication modules

mod_auth_kerb (Kerberos), mod_auth_ldap, mod_auth_nds, mod_auth_notes, mod_auth_nt_lm (NT domain controller), mod_auth_radius, mod_auth_samba, mod_auth_smb, mod_auth_sys (Unix system files), mod_auth_tacacs, mod_auth_yp (NIS), mod_auth_securid (SecurID token authentication).

User tracking

The session key of every request is recorded by mod_session as an internal Apache data structure called a note, which is referenced like a system environment variable.

Log analyzers: Analog/Getstats, Webalizer

CGI Scripts

Just like with any program, you must set environment variables such as PATH, and check all user input to avoid buffer overflows and other major security breaches.
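
As a hedged illustration of these two precautions, here is a Python sketch of the defensive part of a CGI script: it builds a minimal environment with a fixed PATH for any helper program it runs, and it only accepts a query parameter that matches a strict pattern. The parameter name "user", the pattern, and the PATH value are assumptions chosen for this example.

```python
import re
from urllib.parse import parse_qs

# Fixed search path, so the script never inherits a PATH chosen by someone else.
SAFE_PATH = "/bin:/usr/bin"

# Accept only short alphanumeric names; everything else is rejected outright.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")


def sanitized_environment():
    """Environment to hand to any helper program started by the script."""
    return {"PATH": SAFE_PATH}


def get_username(query_string):
    """Return the 'user' parameter if it is present once and well-formed, else None."""
    values = parse_qs(query_string).get("user", [])
    if len(values) != 1 or not USERNAME_RE.fullmatch(values[0]):
        return None
    return values[0]


if __name__ == "__main__":
    print(get_username("user=jdoe"))
    print(get_username("user=../../etc/passwd"))
```

Checking input against a list of allowed characters is generally safer than trying to strip out the dangerous ones.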

To improve security, you can use CGI wrappers to run scripts under a different UID: suEXEC and CgiWrap. suEXEC must be included when compiling the Apache binary, and is activated by adding User and Group directives in a <VirtualHost> section that are different from those used in the main section of httpd.conf. CgiWrap is more flexible, as it runs scripts using the UID/GID of the owner of the script file. Once built, CgiWrap must be enabled by adding handlers in httpd.conf:

ScriptAlias /CGIWrapDir/    /usr/local/apache/cgiwrap-bin/
AddHandler cgi-wrap     .cgi
Action cgi-wrap    /CGIWrapDir/cgiwrap

Security Checklist

SSL

The certificate offered by the server must match the URL used to make the request.

Two ways to add SSL to Apache: Apache-SSL, and mod_ssl

OpenSSL is an open-source alternative to Netscape's SSL, and is derived from SSLeay. It consists of two libraries: libcrypto.a and libssl.a

To test OpenSSL, run /usr/local/ssl/bin/openssl version

The private key is server.key (back it up, and protect it with chmod 0400 server.key). The certificate signing request is server.csr, and the x509 certificate is server.crt.

/usr/local/apache/conf/ssl.csr/ = certificate signing request files
/usr/local/apache/conf/ssl.crt/ = x509 certificates
/usr/local/apache/conf/ssl.key/ = private keys

Note: The private key is also left in the Apache source directory!

To start Apache in SSL mode, run either httpd -DSSL, or apachectl startssl.

The x509 certificate is stored in PEM (Privacy Enhanced Mail) format. To view it, run openssl x509 -in server.crt -noout -text
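
A PEM file is simply the binary (DER) certificate encoded in Base64 and wrapped between BEGIN and END marker lines. The Python sketch below shows the round trip. The function names are made up for this example, and the bytes used in the demonstration are a placeholder, not a real certificate.

```python
import base64

HEADER = "-----BEGIN CERTIFICATE-----"
FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der):
    """Wrap binary DER data into PEM text with 64-character Base64 lines."""
    b64 = base64.b64encode(der).decode("ascii")
    body = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([HEADER] + body + [FOOTER]) + "\n"


def pem_to_der(pem):
    """Strip the marker lines and decode the Base64 body back to DER."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if len(lines) < 2 or lines[0] != HEADER or lines[-1] != FOOTER:
        raise ValueError("not a PEM certificate")
    return base64.b64decode("".join(lines[1:-1]))


if __name__ == "__main__":
    placeholder = bytes(range(100))  # stand-in for real DER data
    pem = der_to_pem(placeholder)
    print(pem)
    print(pem_to_der(pem) == placeholder)
```

This is why a .crt file can be opened in a text editor, while its actual content only becomes readable through openssl x509 -text.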

To have the server's public key certified, send server.csr to a Certificate Authority.

To self-certify: openssl req -x509 -key ./ssl.key/server.key -in ./ssl.csr/server.csr -out ./ssl.crt/server.crt, then copy the result to the SSL certificate file, eg. /usr/local/apache/conf/ssl.crt/server.crt

Client-to-server authentication is also possible, by copying the certificates into /usr/local/apache/conf/ssl.crt/ and running make update

All certificates can be concatenated into ca-bundle.crt.

Certificate revocation list (crl)

Commercial SSL servers: Stronghold, RH Secure Web Server (SWS), Covalent Technology's Raven

To upload files: POST (sent as stream -> parsing in CGI script), and PUT (mod_put)

TEMP STUFF

HTTP 0.9
HTTP 1.0: content type sent to browser
HTTP 1.1: hostname identification -> which virtual host should answer request; content negotiation to match capabilities; uploading files

GET = msg body
HEAD = header info

Very small and fast httpd's: thttpd (acme.com), Mathopd, Boa. iPlanet = ex-Netscape Enterprise Server
Apache spawns processes instead of threads.

APACI = Apache's equivalent of GNU's autoconf

--enable-module=all/most
--activate-module //to compile 3rd party modules statically into Apache from source placed into the src location
--add-module //To copy module source file into directory before compiling & linking it statically into Apache
--enable-shared=max -> all modules as DSO's ; must be last directive
./config.status //saved configure output

config.layout
./configure --with-layout=RedHat

strip ... to remove debug info from a binary

httpd -t to test config
httpd -D to define a parameter (eg. httpd -D SSL -> enables the <IfDefine SSL> sections)

In httpd.conf, there are core directives (global) and those that are context-dependent. Some directives can only be used in a given context.
global env't : directives for Apache server process as a whole
default srv section: , , 
virtual hosts
.htaccess

.htaccess
allow from 192.168.0

An .htaccess file can be located anywhere in the file system, not just in the DocumentRoot directory. Use AllowOverride to specify which directives can be overridden in an .htaccess file.

Per-user directories are served through mod_userdir and its UserDir directive. The default directory is ~/public_html/, but it can be changed, eg. UserDir /home/*/www

UserDir disabled root webmaster

CGI scripts can be run in a CGI wrapper to change UID/GID before being run: SuEXEC or virtual hosts

Redirection to allow access to docs in directories outside DocumentRoot:
1. chown -R nobody:nobody /usr/doc/MySQL
2. ln -s manual_toc.html index.html
3. Alias /MySQL/ "/usr/doc/MySQL/"

mod_dir: to add trailing "/", and return index.html if not specified through DirectoryIndex

Fancy directory listing through mod_autoindex + IndexOptions FancyIndexing

Modules : core_module and mod_so (always static); standard Apache modules; 3rd-party modules

Modules register callback functions called by Apache

mod_perl allows for development of modules in Perl instead of C.

Modules can be built in two ways: in Apache's source directory (use APACI to add them), or as DSO modules (better kept outside the Apache source directory, using the Perl script apxs to build the .so). APACI = Apache AutoConf-style Interface. apxs can compile and install most 3rd-party modules, which often have more than one source file. Some modules like mod_ssl make extensive changes to the Apache source directory, so they cannot be installed through apxs.

apxs -c -i -a mymodule.c

http://modules.apache.org

With the exception of core_module and mod_so, all Apache standard modules can be built as DSO modules.

LoadModule //To load DSO module (not used by static modules)
AddModule //To enable module (either static or DSO)

A DSO module can be known by Apache under a name different from its actual filename: 
LoadModule firewall_module libexec/mod_firewall.so
AddModule mod_firewall.c

Modules loaded last are processed first!
APXS is a Perl script to compile and install 3rd-party DSO modules. Unlike APACI, it can handle modules consisting of more than one source file.

Some modules like mod_ssl make extensive changes to Apache, so they cannot be installed by APXS. The standard use of APXS is: apxs -c -i -a mymodule.c

Resources