HTML - User's Guide
An introduction to the HTTP protocol
1. OVERVIEW
The communication is effected through an exchange of request and response messages. In HTTP terminology, the partner that sends the request is the client side of the communication; the one that sends back the response is the server. This does not imply any assumption as to the capabilities of the partners: they can switch roles over time.
In HTML parlance, however, the side where the user is, with the HTML page, is called the client side, and the other side, where a Web Server usually is, is called the server side. This is not quite consistent with what was just said, but confusion is unlikely, because the HTTP operations are normally invisible to the user.
2. MESSAGE FORMAT
2.1 Generic message format
Requests and responses are messages. HTTP messages are sequences of characters, in one of the character sets standardized by the ISO (International Organization for Standardization), such as ISO-10646. No binary data is allowed.
A message is composed of lines that are sequences of characters terminated by a carriage return and a line feed character. This 2-character sequence is usually denoted by the acronym CRLF.
A message is composed of the following lines:
start-line CRLF
message-header CRLF
...
message-header CRLF
CRLF
message-body
The start-line is:
- a request-line in a request
- a status-line in a response
Both are discussed in their respective sections below.
The message header group is terminated by an empty line, that is, a line with a CRLF alone, after which comes the message body. This contains what the message is intended to carry, in its original or encoded form. The original text to be transmitted is called the entity body. To ensure safe transport through the network, the entity body can be encoded. This is called transfer coding. The result is the message body transported in the message. If there is no transfer coding, the message body is identical to the entity body.
Message headers in both request and response messages describe such message characteristics as the message-body length (the Content-Length header) or the method used for encoding the entity-body (the Transfer-Encoding header).
In a request message they tell the server whether to perform the requested action (for instance: return information only if the resource was modified since a certain date), or describe the acceptable characteristics of the response message, such as the permissible character set (the Accept-Charset header) or language (the Accept-Language header).
Headers in a response message may convey a warning (the Warning header) or contain any indication useful to the requester, for example, where to go for further information (the Location header).
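The generic layout described above can be illustrated with a short Python sketch. It is only an illustration under simple assumptions: the function name split_message and the sample response are hypothetical, header continuation lines are not handled, and the message is treated as plain text.

```python
CRLF = "\r\n"

def split_message(raw):
    """Split an HTTP message into (start_line, headers, body)."""
    # The first empty line (two consecutive CRLFs) ends the header group.
    head, _, body = raw.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

# Hypothetical sample response used only for this illustration.
raw = (
    "HTTP/1.1 200 OK" + CRLF
    + "Content-Type: text/html" + CRLF
    + "Content-Length: 13" + CRLF
    + CRLF
    + "<p>Hello!</p>"
)
start_line, headers, body = split_message(raw)
print(start_line)
print(headers)
print(body)
```

Here the body is returned untouched; decoding a transfer coding, if one was applied, would be a separate step.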
2.2 Request format
A request message is composed of the following lines:
request-line CRLF
message-header CRLF
...
message-header CRLF
CRLF
message-body
The request-line is composed of the following items, separated by one space:
- method: defines the action to be performed on the resource identified by the request_uri
- request_uri: identifies the resource to which the request applies. In a very common situation the request is addressed to a Web Server and the resource is an application to be scheduled.
- version: indicates the HTTP version to be used.
The syntax of the request-line is:
method request_uri version CRLF
For example:
GET applis/jsp/MyApps.jsp?first=John&last=Blow&push=ENTER HTTP/1.1
- The method is GET
- The request URI is applis/jsp/MyApps.jsp?first=John&last=Blow&push=ENTER
- The version is HTTP/1.1
This tells the Web Server to schedule the JSP application located in the file whose path is applis/jsp/MyApps.jsp and to pass it the data first=John, last=Blow and push=ENTER.
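As an illustration, the request-line of the example can be taken apart with a few lines of Python. This is a minimal sketch; the function name parse_request_line is hypothetical, and it assumes a well-formed line whose three items are separated by exactly one space.

```python
def parse_request_line(line):
    """Return (method, request_uri, version) from a request-line."""
    # Drop the terminating CRLF, then split on the single separating spaces.
    method, request_uri, version = line.rstrip("\r\n").split(" ")
    return method, request_uri, version

method, request_uri, version = parse_request_line(
    "GET applis/jsp/MyApps.jsp?first=John&last=Blow&push=ENTER HTTP/1.1\r\n"
)
print(method)
print(request_uri)
print(version)
```

A line with a missing or extra item makes the unpacking fail with a ValueError, which a real server would turn into a 4xx response.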
2.3 Response format
A response message is composed of the following lines:
status-line CRLF
message-header CRLF
...
message-header CRLF
CRLF
message-body
The status line is composed of the following items, separated by one space:
- version: indicates the HTTP version being used.
- status_code: indicates how the requested operation was performed
- reason_phrase: comments on the status code, for example, to explain what the error was.
The syntax of the status line is:
version status_code reason_phrase CRLF
The first digit of the status code defines the class of the response:
Code | Class | Meaning
1xx | Informational | Request received, continuing process
2xx | Success | The action was successfully received, understood, and accepted
3xx | Redirection | Further action must be taken in order to complete the request
4xx | Client Error | The request contains bad syntax or cannot be fulfilled
5xx | Server Error | The server failed to fulfill an apparently valid request
The reason phrase explains the meaning of the numeric code.
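The status line and the five classes of status codes can be sketched in Python as follows. The names parse_status_line and status_class are hypothetical; the only assumption is that the reason phrase may itself contain spaces, so the line is split at the first two spaces only.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def parse_status_line(line):
    """Return (version, status_code, reason_phrase) from a status line."""
    # Split at most twice: everything after the code is the reason phrase.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Map a status code to its class using its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(version, code, reason, status_class(code))
```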
3. MESSAGE ELEMENTS
3.1 The request URI
The request URI identifies the resource to process. The resource is identified by the address part, which is to the left of the question mark. This is the relative URI of the resource.
To the right of the question mark is the data string. It is present in the request URI when the method is GET.
Individual data items are separated by the ampersand (&) sign.
In the request URI, there can be special characters. They are usually represented by their hexadecimal code, preceded by the % sign. For example, a space is considered a special character; it is represented by %20.
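The structure of the request URI can be explored with Python's standard urllib.parse module. The sketch below is illustrative only; the helper name split_request_uri is hypothetical, while parse_qsl, quote and unquote are standard library functions.

```python
from urllib.parse import parse_qsl, quote, unquote

def split_request_uri(request_uri):
    """Return (address, data_items) from a request URI."""
    # The address is to the left of the question mark, the data to the right.
    address, _, data = request_uri.partition("?")
    # parse_qsl splits the items at '&' and decodes %XX sequences.
    items = parse_qsl(data, keep_blank_values=True)
    return address, items

address, items = split_request_uri(
    "applis/jsp/MyApps.jsp?first=John&last=Blow&push=ENTER"
)
print(address)
print(items)

# A space is a special character and is represented by %20.
encoded = quote("John Blow")
print(encoded)
print(unquote(encoded))
```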
3.2 HTTP methods
The method tells the Web Server how to process the resource. The methods are:
- GET: requests the information represented or produced by the resource identified by the request URI
- POST: requests the server to post the entity body to the resource identified by the request URI
- OPTIONS: enquires about the options of the resource identified by the request URI, or of the server
- PUT: requests that the transmitted entity be stored under the identification supplied by the request URI. The entity can be data and the request URI a file name (complete with its path); the PUT method then requests the server to create the file with this name, to contain the transmitted entity
- HEAD: requests a response with headers only (no response body)
- DELETE: requests the deletion of the resource identified by the request URI
- TRACE: requests a loop back of the message from the server side
The GET method requests the server to retrieve and send back the information identified by the request URI. If the resource is an application, it is passed the data part (to the right of the question mark), and its result is returned to the client as an entity encoded into the body of the response message. The result can be cached, that is, saved with the request on the client side; if a subsequent request is identical to the saved request, the saved result is returned to the user from the cache, without going all the way to the server.
The POST method, on the other hand, posts the request entity body to the resource identified by the request URI; in the words of the referenced HTTP specification, it asks the server to accept this entity as a "new subordinate" of this resource. In the most frequent situation, where the resource is an application, this means that the server schedules the application, passes it the entity as data to process, then returns the result to the client. No caching is allowed, that is, the request is always sent all the way to the destination specified by the request URI.
When the resource identified by the request URI is an application, the results returned to the user by the GET and POST methods are identical. The differences are:
- in how the data is placed in the request message: with GET, the data is appended to the request URI, to the right of the question mark; with POST, it is carried in the entity body
- in the caching capability: the request results can be cached when using the GET method; they cannot with the POST method
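The difference in data placement can be made concrete by building the same request with both methods. This is a sketch under stated assumptions: the helper names build_get and build_post, the host name www.example.com and the form-style encoding of the POST body (Content-Type application/x-www-form-urlencoded) are choices made for the illustration, not something prescribed by this guide.

```python
from urllib.parse import urlencode

CRLF = "\r\n"

def build_get(path, data, host):
    """GET: the data travels in the request URI, after the question mark."""
    request_uri = path + "?" + urlencode(data)
    return f"GET {request_uri} HTTP/1.1{CRLF}Host: {host}{CRLF}{CRLF}"

def build_post(path, data, host):
    """POST: the data travels in the message body."""
    body = urlencode(data)
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length counts bytes, so measure the encoded body.
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    return CRLF.join(lines) + CRLF + CRLF + body

data = {"first": "John", "last": "Blow"}
get_msg = build_get("applis/jsp/MyApps.jsp", data, "www.example.com")
post_msg = build_post("applis/jsp/MyApps.jsp", data, "www.example.com")
print(get_msg)
print(post_msg)
```

In both cases the application receives the same data items; only their position in the message changes.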
3.3 Message headers
A message header field has the following syntax:
field-name:field-value CRLF
The field value can be composed of multiple items separated by one space.
Some headers can be used in both requests and responses. Others are only for requests, whereas still others are only valid for responses. The entity headers form one more category.
3.3.1 General headers
Some of the general headers are:
- Allow: specifies the list of methods applicable to the resource identified by the request URI. An example:
  Allow: GET,POST
- Cache-Control: specifies the caching directives to be obeyed by the caching mechanisms along the way from the origin to the destination of the message.
- Connection: specifies the options for the connection. Examples are:
  Connection: keep-alive (the connection is to be persistent)
  Connection: close (the connection is not to be persistent)
- Date: indicates the date and time at which the message is generated. An example is:
  Date: Wed, 10 Dec 2003 22:17:54 GMT
- Transfer-Encoding: indicates the type of encoding that has been applied to the message-body
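The date format used by the Date header can be read and produced with Python's standard email.utils module, as in the following sketch. The variable names are arbitrary; parsedate_to_datetime and formatdate are standard library functions.

```python
from email.utils import formatdate, parsedate_to_datetime

# Parse the value of a Date header into a timezone-aware datetime.
dt = parsedate_to_datetime("Wed, 10 Dec 2003 22:17:54 GMT")
print(dt.isoformat())

# Format the same instant back into the header form, expressed in GMT.
round_trip = formatdate(dt.timestamp(), usegmt=True)
print(round_trip)
```

Calling formatdate(usegmt=True) without a timestamp would give the current time in the same form, which is what a server places in the Date header of a response.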
3.3.2 Request headers
Some of the request headers are:
- Accept: specifies the media types acceptable in the response; a q parameter (called quality factor), with a value from 0 to 1, indicates the user or user-agent preference for the media type (the default value is 1). Example:
  Accept: text/html;q=1, text/plain;q=0.3
- Accept-Charset: indicates the character sets acceptable in the response; a q parameter can be used to indicate the preference level of a character set. An example is:
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
  Here ISO-8859-1 is acceptable with preference 1, utf-8 with preference 0.7, and any other character set with preference 0.7.
- Accept-Encoding: restricts the acceptable encoding operations of the response to the specified list; if no Accept-Encoding header is present, the server may suppose that all encodings are acceptable. Example:
  Accept-Encoding: gzip,deflate
- Accept-Language: specifies the languages acceptable in the response. Example:
  Accept-Language: en-us,en;q=0.5
- Authorization: contains the credentials that give access to the requested resource
- From: gives the e-mail address of the user who caused the request to be sent. Example:
  From: www.information@HatayServices.com
- Host: specifies the host and port number of the requested resource; this information is as defined by the original request URI
- User-Agent: gives information on the user-agent that originates the request. Example:
  User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
- If-Modified-Since: used with the GET method to specify that the operation is to be performed, and information returned to the client, only if the requested resource has been modified since the specified date. An example:
  If-Modified-Since: Mon, 08 Dec 2003 17:30:00 GMT
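The way a q parameter ranks the alternatives in the Accept family of headers can be sketched in Python. The function name parse_accept is hypothetical, and the sketch assumes a well-formed header value; it ignores any parameter other than q.

```python
def parse_accept(value):
    """Return (item, q) pairs from an Accept-style value, best first."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        name, q = parts[0], 1.0  # the default quality factor is 1
        for param in parts[1:]:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                q = float(val)
        prefs.append((name, q))
    # sorted() is stable, so items with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)

charsets = parse_accept("ISO-8859-1,utf-8;q=0.7,*;q=0.7")
media = parse_accept("text/html;q=1, text/plain;q=0.3")
print(charsets)
print(media)
```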
3.3.3 Response headers
Some of the response headers are:
- Age: used when the response goes through proxies on its way to the final recipient. The Age header is generated by an intermediate proxy and indicates the number of seconds elapsed since the original server emitted the response
- Location: specifies a location, different from the one defined by the request URI, where the client can find further information
- Retry-After: used with the 503 (Service Unavailable) status to indicate the delay to wait before renewing the request
- Server: describes the server which sends back the response
- Warning: contains a warning message
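How a client might react to some of these headers can be sketched as follows. This is a simplified illustration, not a complete client: the function name next_action and its return values are hypothetical, header names are assumed to be stored in lower case, and only the form of Retry-After that gives a number of seconds is handled.

```python
def next_action(status_code, headers):
    """Decide what to do next from the status code and response headers."""
    if status_code == 503 and "retry-after" in headers:
        value = headers["retry-after"].strip()
        if value.isdigit():
            # Wait the indicated number of seconds, then renew the request.
            return ("wait", int(value))
    if 300 <= status_code < 400 and "location" in headers:
        # The Location header says where to go instead.
        return ("redirect", headers["location"])
    return ("done", None)

print(next_action(503, {"retry-after": "120"}))
print(next_action(302, {"location": "applis/jsp/Other.jsp"}))
print(next_action(200, {"server": "ExampleServer/1.0"}))
```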
3.3.4 Entity headers
The entity headers describe the properties of the entity body enclosed in a message. They can be found in requests or in responses, since both requests and responses can transport an entity. Some of the entity headers are:
- Content-Base: specifies the URI base for the relative URIs found in the entity-body
- Content-Encoding: indicates the encoding that has been applied to the entity-body. Example:
  Content-Encoding: gzip
- Content-Language: indicates the natural language of the entity-body. Example:
  Content-Language: en-us
- Content-Length: indicates the number of bytes contained in the message-body
- Content-Location: specifies the resource location of the entity-body; this is specified as an absolute or relative URI.
- Content-Type: indicates the media type of the entity-body. An example:
  Content-Type: text/html
- ETag: specifies an entity-tag, a string of characters assigned to the entity-body of the message; this entity-tag is associated with the entity-body in the cache at the destination; later on, a request from this destination can specify the entity-tag of the entity-body, which can then be retrieved from the cache instead of from the remote server.
- Expires: specifies the date and time at which the entity-body is to expire
- Last-Modified: specifies the date and time the resource was last modified
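The relation between the entity body and some of its headers can be illustrated with a short Python sketch. The helper name make_entity and the sample text are hypothetical, and UTF-8 is assumed as the character encoding; gzip.compress and gzip.decompress are standard library functions.

```python
import gzip

def make_entity(text, compress=False):
    """Return (headers, body) for a text entity, optionally gzip-encoded."""
    body = text.encode("utf-8")
    headers = {"Content-Type": "text/html", "Content-Language": "en-us"}
    if compress:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Content-Length counts the bytes actually sent, not the characters.
    headers["Content-Length"] = str(len(body))
    return headers, body

text = "<p>caf\u00e9</p>"  # 11 characters, 12 bytes in UTF-8
plain_headers, plain_body = make_entity(text)
gzip_headers, gzip_body = make_entity(text, compress=True)
print(plain_headers)
print(gzip_headers)
# The receiver reverses the encoding named in Content-Encoding.
print(gzip.decompress(gzip_body).decode("utf-8"))
```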
Reference
R. Fielding (UC Irvine); J. Gettys, J. Mogul (DEC); H. Frystyk, T. Berners-Lee (MIT/LCS). Hypertext Transfer Protocol -- HTTP/1.1. RFC 2068, Network Working Group.