Apache is a Web server, that is, a software system that accepts requests from users over the Web and returns the appropriate response. It does this either by sending back the file identified in the request, or by running the program identified in the request and sending back that program's output.
1.1 Serving a file
http://myhost.com/somefile.html
where "myhost.com" is the Internet address of a host computer where an Apache server is installed.
If the Apache server runs with its default settings, it fetches the "somefile.html" file and sends it back to your browser for display on your screen.
Usually, the requested file is a script that takes some input data and produces an output result. In a realistic situation, Apache passes the request to an application processor, which puts the data received from Apache into some conventional format and runs the script with the formatted data as input. The output of the script is returned to Apache, which sends it back to the browser for display on the user's screen.
The application processor to be called is generally indicated by the file name extension of the requested file, or by the directory in which this file is located. Examples:
http://myhost/phppage.php | the .php extension indicates that the request is to be passed on to the PHP processor |
http://myhost/cgi-bin/calculus.pl | the cgi-bin directory is known by Apache to contain files to be processed by a CGI processor |
All of this is controlled by Apache's settings (see section 4, Apache configuration, below).
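The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the two tables are hypothetical stand-ins for what Apache's configuration would define, and the function name is invented for this example.

```python
import posixpath

# Hypothetical settings, standing in for what Apache's configuration defines.
EXTENSION_PROCESSORS = {".php": "PHP processor"}
DIRECTORY_PROCESSORS = {"/cgi-bin/": "CGI processor"}


def select_processor(request_path):
    """Return the processor name for a request path, or None for a plain file."""
    # Directory rule: any file under a listed directory goes to that processor.
    for prefix, processor in DIRECTORY_PROCESSORS.items():
        if request_path.startswith(prefix):
            return processor
    # Extension rule: look at the file name extension.
    _, extension = posixpath.splitext(request_path)
    return EXTENSION_PROCESSORS.get(extension.lower())


print(select_processor("/phppage.php"))          # PHP processor
print(select_processor("/cgi-bin/calculus.pl"))  # CGI processor
print(select_processor("/somefile.html"))        # None
```

A request that matches neither rule is served as a plain file.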
Some examples of application processors are:
PHP | (PHP: Hypertext Preprocessor) |
JSP | (JavaServer Pages) |
Earlier generations of Apache were dedicated to running CGI (Common Gateway Interface) programs. As its name implies, CGI is not a processor, but an interface. Many languages can be used to create programs that run as CGI applications: Perl, C, C++, Basic, etc. Under UNIX, shell scripts can also be used. The language most often used for CGI is Perl.
A script run under these processors generally produces an HTML page that contains the results computed from the input data. This HTML page is the result sent back to the requesting browser which then displays it for the user to see.
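As an illustration, here is a minimal sketch of a CGI-style script. It is written in Python rather than Perl purely for the sake of the example, and the input fields `a` and `b` are hypothetical. Under CGI, the server hands the URL-encoded input data to the program in the QUERY_STRING environment variable, and the program writes a header line, an empty line, then the HTML page.

```python
import os
from urllib.parse import parse_qs


def render_page(query_string):
    """Build an HTML page showing the sum of the hypothetical fields a and b."""
    fields = parse_qs(query_string)
    # parse_qs returns a list of values per field; take the first, default to 0.
    a = int(fields.get("a", ["0"])[0])
    b = int(fields.get("b", ["0"])[0])
    return "<html><body><p>{} + {} = {}</p></body></html>".format(a, b, a + b)


if __name__ == "__main__":
    # Header line, empty line, then the page itself.
    print("Content-Type: text/html")
    print()
    print(render_page(os.environ.get("QUERY_STRING", "")))
```

Called with the input data `a=3&b=4`, `render_page` returns a page containing `3 + 4 = 7`. The sketch does no error handling: non-numeric input would make it fail.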
To develop an application running under one of these processors (PHP, JSP, Perl, etc.), a programmer has to know the language appropriate for this processor. Such a language is independent of Apache. An application developer need not know how Apache or any other Web server works.
Most databases nowadays can be distributed among multiple hosts. Using a distributed database, a user connected to an Apache system can access data held on different hosts. This capability belongs to the database management system; it is not related to Apache. Connection to a database usually involves a database driver, which is a piece of software that pertains to the database system and is installed on the host of the requesting script.
2.1.1 A request entered by the user
A sample HTTP request is this line entered in the address bar of your browser (Netscape Navigator, Internet Explorer, etc.):
http://www.hata.com:4080/profile/people.php?name=Fanny&surname=Adams
Your browser translates this line into a message that looks like this:
GET http://www.hata.com:4080/profile/people.php?name=Fanny&surname=Adams HTTP/1.1CRLF
Accept: text/html, image/gif, image/jpegCRLF
Accept-Language: en-us, fr;q=0.5CRLF
Accept-Encoding: gzip, deflateCRLF
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)CRLF
Host: www.hata.com:4080CRLF
Connection: Keep-AliveCRLF
CRLF
The CRLF symbol represents the "Carriage Return / Line Feed" sequence that marks the end of a line of text. The rest of the message is composed of readable characters: HTTP is a text-based protocol.
As you can see, the first line contains the request that was entered by the user:
http://www.hata.com:4080/profile/people.php?name=Fanny&surname=Adams
In this:
www.hata.com | is the server's host Internet name |
4080 | is the port at the server's host where the message is expected |
/profile/people.php | is the requested file URI |
name=Fanny&surname=Adams | is the data sequence to be passed as input data to the people.php script |
The remaining lines are the message headers, which contain information on how to handle the message.
This message is sent by the browser to the Apache system in the www.hata.com host.
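The breakdown of the request line can be reproduced with Python's standard `urllib.parse` module. This is only a sketch of the decomposition, not of what Apache does internally.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.hata.com:4080/profile/people.php?name=Fanny&surname=Adams"
parts = urlsplit(url)

print(parts.hostname)  # www.hata.com
print(parts.port)      # 4080
print(parts.path)      # /profile/people.php
print(parts.query)     # name=Fanny&surname=Adams

# The data sequence decodes into named fields, each with a list of values.
fields = parse_qs(parts.query)
print(fields)          # {'name': ['Fanny'], 'surname': ['Adams']}
```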
In the above example, the HTTP message conveys the data to be processed (name=Fanny&surname=Adams) on the line entered by the user. Nothing follows the headers except the CRLF of the empty line that ends them. However, the browser can also generate messages where the data is placed after the headers. The usual case is a message generated from an HTML form. A sample message is as follows:
POST http://www.hata.com:4080/profile/people.php HTTP/1.1CRLF
Accept: text/html, image/gif, image/jpegCRLF
Accept-Language: en-us, fr;q=0.5CRLF
Accept-Encoding: gzip, deflateCRLF
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)CRLF
Host: www.hata.com:4080CRLF
Connection: Keep-AliveCRLF
CRLF
name=Fanny&surname=Adams
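The following Python sketch shows that the same encoded data sequence can travel either on the request line or after the headers. The variable names are invented for this illustration. Note that a browser posting a form also sends Content-Type and Content-Length headers describing the data; the sketch computes the length the latter would carry.

```python
from urllib.parse import urlencode, parse_qs

form_fields = {"name": "Fanny", "surname": "Adams"}
encoded = urlencode(form_fields)
print(encoded)  # name=Fanny&surname=Adams

# GET: the data follows a question mark on the request line.
get_target = "/profile/people.php?" + encoded

# POST: the request line carries only the path; the data comes after the headers.
post_target = "/profile/people.php"
post_body = encoded.encode("ascii")
print(len(post_body))  # 24, the number of bytes placed after the headers

# Either way, the receiving script decodes the same fields.
decoded = parse_qs(post_body.decode("ascii"))
print(decoded)  # {'name': ['Fanny'], 'surname': ['Adams']}
```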
There are 65,535 ports (2 to the power 16, minus 1), numbered from 1 up. Each program that communicates with the outside world is assigned one of these ports. Port 80 is assigned to HTTP communications. If Apache uses it, this number may be omitted from the request, as it is implied. Here 4080 is used, possibly because 80 is assigned to another HTTP server, for example IIS (Internet Information Services) on Windows.
The document root can contain files and other directories. The figure on the left shows two of these, called docs and profile. The path information (in the above sample request: profile/people.php) is the part of a request that comes between the host name and port information and the question mark, if any, or the end of the line. It specifies the relative path of the requested file. Except for references to an alias or a user directory (discussed below), this path is evaluated relative to the document root.
In the above example, the profile/people.php file is to be found in the profile directory, under the document root.
If somefile.html is a file contained in the document root, a request for this file, in the same host as above, is:
http://www.hata.com:4080/somefile.html
Names that come directly after the host and port identifications identify files contained in the document root.
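A minimal Python sketch of this evaluation follows. The document root location is a hypothetical value chosen for the example, and the function is an illustration of the rule, not Apache's actual code.

```python
import posixpath

DOCUMENT_ROOT = "/usr/local/apache/htdocs"  # hypothetical location


def resolve(request_path):
    """Map the path part of a request to a file under the document root."""
    # Normalizing from "/" collapses any ".." so the result cannot leave the root.
    normalized = posixpath.normpath("/" + request_path.lstrip("/"))
    return posixpath.join(DOCUMENT_ROOT, normalized.lstrip("/"))


print(resolve("/profile/people.php"))
# /usr/local/apache/htdocs/profile/people.php
print(resolve("/somefile.html"))
# /usr/local/apache/htdocs/somefile.html
```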
2.3.2 Alias and user directories
It is possible to set up directories outside the document root to contain files that users can request. These are:
- alias directories
- user directories
In the structure shown in the figure above, the "other" directory is known to users by an alias defined in Apache's settings, for example "appl". A request starting with http://www.hata.com:4080/appl/... then refers to a file found under the "other" directory. The use of aliases keeps users from knowing the true organization of the Apache file system. It also makes the users independent of this organization.
The figure below shows one of the possible user directory organizations. The users have the names of "joe", "sam" and "willy". They have a directory named after them.
A SAMPLE USER DIRECTORY ORGANIZATION
A file named phppage.php, contained in the scrip directory of user sam, is requested by:
http://www.hata.com:4080/~sam/scrip/phppage.php
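The three cases (user directory, alias, document root) can be sketched together in Python. All the file system locations below are hypothetical, as is the layout with one directory per user under a common parent. Path normalization is left out to keep the sketch short.

```python
import posixpath

# Hypothetical locations, for illustration only.
DOCUMENT_ROOT = "/www/htdocs"
ALIASES = {"/appl/": "/www/other/"}  # alias "appl" stands for the "other" directory
USER_DIR_ROOT = "/www/users"         # assumed: one sub-directory per user


def map_to_file(request_path):
    """Map a request path to a file, trying user directories, then aliases."""
    # User directory: the path starts with ~ followed by the user name.
    if request_path.startswith("/~"):
        user, _, rest = request_path[2:].partition("/")
        return posixpath.join(USER_DIR_ROOT, user, rest)
    # Alias: the leading part of the path is replaced by the real directory.
    for prefix, target in ALIASES.items():
        if request_path.startswith(prefix):
            return posixpath.join(target, request_path[len(prefix):])
    # Otherwise the path is relative to the document root.
    return posixpath.join(DOCUMENT_ROOT, request_path.lstrip("/"))


print(map_to_file("/~sam/scrip/phppage.php"))  # /www/users/sam/scrip/phppage.php
print(map_to_file("/appl/report.html"))        # /www/other/report.html
print(map_to_file("/somefile.html"))           # /www/htdocs/somefile.html
```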
How the processing program is selected is generally determined by the extension of the requested file. Another method is to assign the files contained in one or more directories to a processing program, regardless of their extensions. Both methods are based on the settings of Apache.
A simple treatment is filtering: the requested file is passed to a filter program which modifies, then returns it to Apache for sending to the client. An example of such a treatment is the Server Side Include (SSI) procedure. In this, the requested file is an HTML page which has special elements inserted in it. The SSI processor reads through the page and replaces these elements by the result of the operations they describe (such a result is for instance the current date and time). The resulting transformed HTML page is sent to the client. Files to be so treated are usually characterized by the .shtml extension ("usually", because this can be changed by Apache settings).
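The filtering idea can be sketched in Python as follows. The sketch handles a single SSI element, `<!--#echo var="DATE_LOCAL" -->`, and the date format and the "(none)" fallback for unknown variables are choices made for this example, not a description of the real SSI processor.

```python
import re
from datetime import datetime

# Matches elements of the form <!--#echo var="NAME" -->
ECHO_ELEMENT = re.compile(r'<!--#echo\s+var="(\w+)"\s*-->')


def expand_includes(page, now):
    """Replace each echo element in the page by the value of its variable."""
    variables = {"DATE_LOCAL": now.strftime("%Y-%m-%d %H:%M:%S")}

    def replace(match):
        # Unknown variables are shown as "(none)" in this sketch.
        return variables.get(match.group(1), "(none)")

    return ECHO_ELEMENT.sub(replace, page)


page = '<p>It is now <!--#echo var="DATE_LOCAL" -->.</p>'
print(expand_includes(page, datetime(2024, 1, 15, 9, 30, 0)))
# <p>It is now 2024-01-15 09:30:00.</p>
```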
A more elaborate treatment involves the execution of a program that runs under an application processor such as PHP, JSP, Perl, etc. The requested file is then a program (often a script) developed in the language required by the application processor. Apache passes the HTTP request message on to the application processor, which retrieves the requested file and runs it. The data contained in the request message is made available to the program, in the form defined by the application processor language.
The output from this treatment is generally an HTML page which contains the sought-after results. This page is sent to the client to be displayed by the browser on the user's screen. HTML pages are widely used today, because they are currently the type of document that browsers know best how to handle. In the near future, application programs may generate XML pages instead.
- host authorization | which accepts or rejects requests based on their originating host |
- user authentication | which requires that users enter their name and password, to access certain directories |
- per-user grouping | which assigns a separate file structure to each user |
The user authentication function is supported by software components which provide for encrypting the passwords to be stored in disk files. At an elaborate level, it is also possible to encrypt passwords sent in by the users, which involves an agreement on a key (usually emitted by the server) prior to the procedure.
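The idea of storing passwords in a protected form can be sketched in Python. This sketch assumes a salted one-way hash (PBKDF2 from the standard `hashlib` module); it does not reproduce the format of Apache's own password files, and the iteration count is an arbitrary value chosen for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # arbitrary work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest); the pair is what would be stored on disk."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest for the entered password and compare it safely."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("secret")
print(check_password("secret", salt, stored))  # True
print(check_password("wrong", salt, stored))   # False
```

The password itself is never written to disk, only the salt and the digest.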
Apache can accept new modules that carry out new ways of handling requests. These modules can be copied into the Apache environment or stay outside it. One or more file extensions are to be defined to identify the files to be processed by such a module. This capability makes Apache a system open to new functions.
Such modules can vary in size. They can be complete application systems like PHP.
The options that govern the inclusion of dynamic modules, the directory organization and the functioning of Apache are defined by a set of directives. The bulk of these are contained in the Apache configuration file. Directives that affect the access to a specific directory can be set in a .htaccess file contained in that directory.
Alias | defines a user accessible directory structure independent of the document root |
UserDir | defines a directory organization, partitioned on a per-user basis |
LoadModule | locates a module to be loaded when Apache starts -- among these modules are those to which Apache passes incoming requests for processing, such as the SSI processor, the PHP or JSP application processors, and those which implement the security procedures. |
AddType | assigns one or more file extensions to a data type |
AddHandler | assigns one or more file extensions to a handling requirement |
Action | assigns a data type or handling requirement to a processing module |
Some of the directives are discussed in the next chapter.
Directives fall into two categories:
- those pertaining to the Apache core, which are always available
- those pertaining to a specific module, available only when the module is installed
An exhaustive directive index is found in the file manual/mod/directives.html.en, where a hyperlink leads to the full description of each directive. See also: Appendix 4. The Apache manual
This is a text file which can be displayed and edited using a simple text editor. In this, the lines that start with a # are comments.
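To illustrate the line-oriented layout of such a file, here is a Python sketch that reads a small fragment and skips the comment lines. The fragment is hypothetical and its paths are invented. The sketch ignores features such as enclosed sections and continuation lines.

```python
import shlex

# Hypothetical configuration fragment, for illustration only.
SAMPLE_CONFIG = '''
# Lines starting with # are comments
Alias /appl/ "/www/other/"
UserDir public_html
AddType text/html .shtml
AddHandler cgi-script .cgi .pl
'''


def parse_directives(text):
    """Return a list of (directive name, argument list) pairs."""
    directives = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and comments
        # shlex.split keeps a quoted argument together as one item.
        name, *arguments = shlex.split(line)
        directives.append((name, arguments))
    return directives


for name, arguments in parse_directives(SAMPLE_CONFIG):
    print(name, arguments)
```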
Directives controlling access to a specific directory can be set in a file named .htaccess contained in that directory. These directives affect the handling of requests to access that directory and all of the directories contained therein.
4.3 Directives restricted to specific directories, locations or files
4.3.1 Directives restricted to specific directories
Directives related to a specific directory are set up in the configuration file, within a section delimited by the <Directory> and </Directory> tags. The syntax is:
Example:
In this example, the directives enclosed within the <Directory> and </Directory> tags apply to the C:/Apache/htdocs/ directory and its descendants, if any. The directives to be applied to a directory can also be set in a file named .htaccess contained in that directory.
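The scope rule stated above (a section covers the named directory and its descendants) can be sketched in Python. The function name is invented for this illustration, and the comparison is a plain path-prefix test, not Apache's actual matching code.

```python
import posixpath


def applies_to(section_directory, requested_file):
    """True if requested_file lies in section_directory or one of its descendants."""
    base = posixpath.normpath(section_directory)
    target = posixpath.normpath(requested_file)
    # Compare whole path components, so "htdocs-old" does not match "htdocs".
    return target == base or target.startswith(base.rstrip("/") + "/")


print(applies_to("C:/Apache/htdocs/", "C:/Apache/htdocs/profile/people.php"))  # True
print(applies_to("C:/Apache/htdocs/", "C:/Apache/htdocs-old/page.html"))       # False
print(applies_to("C:/Apache/htdocs/", "C:/Apache/other/page.html"))            # False
```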