Dec 27, 2006 12:00 AM CST

Using .htaccess Files with Apache

Categories: Coders, Design

Customising Apache Functionality at the Directory Level

One of the most common needs Webmasters have is to cause the Web server to handle all the documents in a particular directory, or tree of directories, in the same way -- such as requiring a password before granting access to any file in the directory, or allowing (or disallowing) directory listings. However, this need often extends to more than just the Webmaster; consider students on a departmental Web server at a university, or individual customers of an ISP, or clients of a Web-hosting company. This article describes how the Webmaster can extend permission to tailor Apache's behaviour to users, allowing them to have some control over how it handles their own sub-areas of its total Web-space.

This article shows how you can use per-directory configuration files, called .htaccess files, to customise Apache behaviour -- or allow your users to do so for their own documents.

Per-Directory Settings

Apache's configuration system addresses the need to group documents by directory in a straightforward manner. To apply controls to a particular directory tree, for instance, you can use the <Directory> container directive in the server's configuration files:

    <Directory "C:/Program Files/Apache Group/Apache/htdocs">
        AllowOverride None
        Options None
    </Directory>

This has the advantage of keeping control in the Webmaster's hands; there's no need to worry about any of the server's users being able to change the settings, since the server configuration files are generally not modifiable by anyone except the admin. Unfortunately, it has two disadvantages: Apache must be restarted any time the config file is changed, and it can become truly burdensome to add all the <Directory> containers that might be needed for all the users who have special requirements.

An alternative method for supplying the desired granularity of Apache configuration -- down to the directory level -- is to use special partial config files in each directory with special requirements.

So What's an .htaccess File?

An .htaccess file is simply a text file containing Apache directives. Those directives apply to the documents in the directory where the .htaccess file is located, and to all subdirectories under it as well. Other .htaccess files in subdirectories may change or nullify the effects of those in parent directories; see the section on merging for more information.
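
As a rough illustration, an .htaccess file covering the two needs mentioned in the introduction (a password prompt and no directory listings) might look like the sketch below. The realm name and the password file location are assumptions made up for this example, and the directives only take effect if the server's config files permit these overrides for the directory.

    # Hypothetical .htaccess for a members-only directory

    # Turn off automatic directory listings here and below
    Options -Indexes

    # Ask for a user name and password before serving anything
    AuthType Basic
    AuthName "Members Only"
    # Assumed location of a password file created with htpasswd
    AuthUserFile "C:/Program Files/Apache Group/Apache/passwd/users"
    Require valid-user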

Because they are plain text files, you can use whatever text editor you like to create or change .htaccess files.

These files are called '.htaccess files' because that's what they're typically named. This naming scheme has its roots in the NCSA Web server and the Unix file system; files whose names begin with a dot are often considered to be 'hidden' and aren't displayed in a normal directory listing. The NCSA developers chose the name '.htaccess' so that a control file in a directory would have a fairly reasonable name ('ht' for 'hypertext') and not clutter up directory listings. Plus, there's a long history of Unix utilities storing their preferences information in such 'hidden' files.

The name '.htaccess' isn't universally acceptable, though. Sometimes it can be quite difficult to persuade a system to let you create or edit a file with such a name. For this reason, you can change the name that Apache will use when looking for these per-directory config files by using the AccessFileName directive in your server's httpd.conf file. For instance,

    AccessFileName ht.acl

will cause Apache to look for files named ht.acl instead of .htaccess. They'll be treated the same way, though, and they're still called '.htaccess files' for convenience.
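
AccessFileName also accepts more than one name, which may help while moving from one naming scheme to another. In the sketch below, the choice of names is only an example; in each directory, Apache uses the first name in the list for which a file exists.

    # Look for ht.acl first, then fall back to .htaccess
    AccessFileName ht.acl .htaccess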

Locating and Merging .htaccess Files

When Apache determines that a requested resource actually represents a file on the disk, it starts a process called the 'directory walk.' This involves checking through its internal list of <Directory> containers to find those that apply, and possibly searching the directories on the filesystem for .htaccess files.

Each time the directory walk finds a new set of directives that apply to the request, they are merged with the settings already accumulated. The result is a collection of settings that apply to the final document, culled from all of its ancestor directories and the server's config files.
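
A small sketch of how this accumulation can play out, assuming two hypothetical .htaccess files and a server configuration that allows the Options directive to be overridden:

    # File: htdocs/.htaccess (hypothetical)
    # Allow automatic directory listings from here down
    Options +Indexes

    # File: htdocs/foo/.htaccess (hypothetical)
    # Switch listings off again for this subtree
    Options -Indexes

With these two files in place, a request for a directory directly under htdocs could get a listing, while one under htdocs/foo would not, because the setting found deeper in the tree is merged in last.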

When searching for .htaccess files, Apache starts at the top of the filesystem. (On Windows, that usually means 'C:\'; otherwise, the root directory '/'.) It then walks down the directories to the one containing the final document, processing and merging any .htaccess files it finds that the config files say should be processed. (See the section on overrides for more information on how the server determines whether an .htaccess file should be processed or not.)

This can be an intensive process. Consider a request for <URI:http://your.host.com/foo/bar/gritch/x.html> which resolves to the file

    C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\x.html

Unless instructed otherwise, Apache is going to look for each of the following .htaccess files, and process any it finds:

  1. C:\.htaccess
  2. C:\Program Files\.htaccess
  3. C:\Program Files\Apache Group\.htaccess
  4. C:\Program Files\Apache Group\Apache\.htaccess
  5. C:\Program Files\Apache Group\Apache\htdocs\.htaccess
  6. C:\Program Files\Apache Group\Apache\htdocs\foo\.htaccess
  7. C:\Program Files\Apache Group\Apache\htdocs\foo\bar\.htaccess
  8. C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\.htaccess

That's a lot of work just to return a single file! And the server will repeat this process each and every time the file is requested. See the overrides section for a way to reduce this overhead with the AllowOverride