-
Configurability of available
features by configuration files or even by an
external user interface.
-
Authentication: an optional
authorization request (a request
for a user name and password) before
allowing access to some or all kinds of resources.
-
Handling of not only static
content (file content stored in the server's
filesystems) but also dynamic content,
by supporting one or more related interfaces
(SSI, CGI, SCGI, FastCGI, JSP, PHP,
ASP, ASP.NET, and server APIs
such as NSAPI and ISAPI).
-
Module support, in order
to allow the extension of server capabilities
by adding or modifying software modules which
are linked to the server software or that are
dynamically loaded (on demand) by the core server.
-
HTTPS
support (by SSL or
TLS) in order to allow
secure (encrypted) connections to the server on
the standard port 443
instead of the usual port 80.
-
Content compression
(e.g. by gzip encoding)
to reduce the size of responses (to lower
bandwidth usage, etc.).
-
Virtual
hosting to serve many web sites using
one IP address.
-
Large
file support to be able to serve files
whose size is greater than 2 GB on a 32-bit OS.
-
Bandwidth
throttling to limit the speed of responses
in order to not saturate the network and to be
able to serve more clients.
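The content compression feature listed above can be illustrated with a minimal sketch. It assumes the server looks at the client's Accept-Encoding header and compresses only bodies above a size threshold. The function name `maybe_compress` and the `min_size` parameter are hypothetical, and quality values such as `gzip;q=0` are ignored for brevity.

```python
import gzip

def maybe_compress(body: bytes, accept_encoding: str, min_size: int = 256):
    """Return (payload, headers) for a response body.

    The body is gzip-compressed only when the client lists gzip in its
    Accept-Encoding header and the body is at least min_size bytes long.
    Assumption: q-values in the header are not interpreted.
    """
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    if "gzip" in encodings and len(body) >= min_size:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip",
                   "Content-Length": str(len(payload))}
        return payload, headers
    # Client did not ask for gzip, or the body is too small to benefit.
    return body, {"Content-Length": str(len(body))}
```

Very small bodies are left uncompressed here because the gzip framing overhead can outweigh the savings; the threshold value is only an illustrative choice.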
Origin of returned content
The origin of the content
sent by the server is called:
- static if it comes from an existing
file lying on
a filesystem;
- dynamic
if it is dynamically generated by some other
program or script
or API called
by the Web server.
Serving static content
is usually much faster (from 2 to 100 times)
than serving dynamic content, especially
if the latter involves data pulled from a database.
Path translation
Web servers usually translate
the path component of a Uniform
Resource Locator (URL) into a local
file system resource. The
URL path specified by the client is relative to the
Web server's root directory.
Consider the following URL as it
would be requested by a client:
http://www.example.com/path/file.html
The client's Web browser will translate
it into a connection to www.example.com
with the following HTTP/1.1 request:
GET /path/file.html HTTP/1.1
Host: www.example.com
The Web server on www.example.com
will append the given path to the path of its root
directory. On Unix machines,
this is commonly /var/www/htdocs.
The result is the local file system resource:
/var/www/htdocs/path/file.html
The Web server will then read the
file, if it exists, and send a response to the client's
Web browser. The response will describe the content
of the file and contain the file itself.
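The translation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the root directory is the /var/www/htdocs example from the text, the function name `translate_path` is hypothetical, and real servers apply further rules (aliases, index files, access checks) that are not shown.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www/htdocs"  # example root directory taken from the text

def translate_path(request_target: str, root: str = DOCUMENT_ROOT) -> str:
    """Map the path of an HTTP request target to a path under the root."""
    # Drop any query string and decode percent-escapes such as %20.
    path = unquote(urlsplit(request_target).path)
    # Collapse "." and ".." segments so the result cannot leave the root.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return posixpath.join(root, normalized.lstrip("/"))

print(translate_path("/path/file.html"))
# prints /var/www/htdocs/path/file.html
```

Normalizing the path before joining it to the root is a design choice made here so that a request such as /../../etc/passwd still resolves to a location inside the root directory.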
Performance
Web servers (programs) are supposed
to serve requests quickly from more than one TCP/IP
connection at a time.
The key performance parameters
(measured under a varying load of clients and requests
per client) are:
- number of requests per second (depending
on the type of request, etc.);
- latency time in milliseconds for each
new connection or request;
- throughput in bytes per second (depending
on file size, cached or non cached content, available
network bandwidth, etc.).
These three parameters vary noticeably
depending on the number of active connections, so
a fourth parameter is the concurrency level
supported by a Web server under a specific configuration.
Last but not least, the specific
server model used to implement a Web server
program can affect the performance and scalability
level that can be reached.
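As a rough illustration, the first three parameters can be computed from the samples recorded during one measurement run. The sketch below assumes each sample stores a latency in seconds and the number of bytes sent; the names `RequestSample` and `summarize` are hypothetical and do not come from any benchmarking tool.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class RequestSample:
    latency_s: float   # time from sending the request to the full response
    bytes_sent: int    # size of the response in bytes

def summarize(samples, duration_s):
    """Summarize one measurement run that lasted duration_s seconds."""
    return {
        "requests_per_second": len(samples) / duration_s,
        "mean_latency_ms": mean(s.latency_s for s in samples) * 1000.0,
        "throughput_bytes_per_second":
            sum(s.bytes_sent for s in samples) / duration_s,
    }
```

Repeating the same run at several concurrency levels and comparing the summaries gives a view of the fourth parameter.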
Load limits
A web server (program) has
defined load limits, because it can handle only
a limited number of concurrent client connections
(usually between 2 and 60,000, by default between
500 and 1,000) per IP address
(and IP port) and it can serve only a certain
maximum number of requests per second depending
on:
- its own settings;
- the HTTP request
type;
- content origin (static or dynamic);
- whether the served content is cached
or not;
- the hardware
and software
limits of the OS
on which it is running.
When a web server is near to or over its limits, it
becomes overloaded and thus unresponsive.
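A limit on concurrent client connections can be sketched as a simple counter, shown here only to make the idea concrete. The class `ConnectionLimiter` and the function `status_for_new_connection` are hypothetical, and answering with a 503 status when the limit is reached is one possible policy, not the behaviour of any particular server.

```python
import threading

class ConnectionLimiter:
    """Track open connections against a fixed maximum (hypothetical helper)."""

    def __init__(self, max_connections: int):
        self.max_connections = max_connections
        self._open = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Reserve a slot; return False when the limit is already reached."""
        with self._lock:
            if self._open >= self.max_connections:
                return False
            self._open += 1
            return True

    def release(self) -> None:
        """Free a slot when a connection closes."""
        with self._lock:
            if self._open > 0:
                self._open -= 1

def status_for_new_connection(limiter: ConnectionLimiter) -> int:
    """Return 200 if the connection is accepted, 503 if the server is full."""
    return 200 if limiter.try_acquire() else 503
```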
Overload causes
At any time Web servers can be overloaded because
of:
- too much legitimate Web traffic (e.g.
thousands or even millions of clients hitting
the Web site in a short interval of time);
- DDoS (Distributed
Denial of Service) attacks;
- Computer worms,
which sometimes cause abnormal traffic because
of millions of infected computers (not coordinated
among themselves);
- XSS viruses,
which can cause high traffic because of millions of
infected browsers and/or web
servers;
- traffic from Internet web robots
that is not filtered or limited on large web sites
with very few resources (bandwidth, etc.);
- Internet (network)
slowdowns, so that client requests are served
more slowly and the number of connections increases
so much that server limits are reached;
- partial unavailability of Web servers (computers),
which can happen because
of required or urgent maintenance or upgrades, hardware
or software failures, back-end
(e.g. database) failures,
etc.; in these cases the remaining web servers
get too much traffic and become
overloaded.
Overload symptoms
The symptoms of an overloaded
Web server are:
- requests are served with noticeably long delays
(from 1 second to a few hundred seconds);
- 500, 502, 503, 504 HTTP
errors are returned to clients (sometimes
an unrelated 404 error
or even a 408 error may
also be returned);
- TCP connections are
refused or reset before any content is sent to
clients.
Anti-overload techniques
To partially overcome the above load
limits and to prevent overload,
most popular Web sites use common techniques such as:
- managing network traffic, by using:
- Firewalls
to block unwanted traffic coming from bad
IP sources or having bad patterns;
- HTTP traffic managers to drop, redirect
or rewrite requests having bad HTTP
patterns;
- Bandwidth management
and Traffic shaping,
in order to smooth down peaks in network usage;
- deploying Web cache
techniques;
- using different domain names
to serve different (static and dynamic) content
from separate Web servers, e.g.:
- http://images.example.com
- http://www.example.com
- using different domain names
and / or computers
to separate big files from small and medium sized
files; the idea is to be able to fully cache
small and medium sized files and to efficiently
serve big or huge (over 10 to 1000 MB) files by
using different settings;
- using many Web servers (programs) per computer,
each one bound to its own network
card and IP address;
- using many Web servers (computers) that are
grouped together so that they act or are seen
as one big Web server, see also: Load
balancer;
- adding more HW resources
(i.e. RAM, disks)
to each computer;
- tuning OS parameters
for HW capabilities
and usage;
- using more efficient computer
programs for Web servers, etc.;
- using other workarounds,
especially if dynamic content is involved.
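Grouping many Web servers so that they act as one, as in the load balancer item above, can be illustrated with a round-robin sketch. Round-robin is only one of several possible selection policies and is chosen here for simplicity; the class name `RoundRobinBalancer` and the backend host names are hypothetical.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)

# Hypothetical backend names, used only for illustration.
balancer = RoundRobinBalancer(["web1.example.com", "web2.example.com"])
first = balancer.pick()   # "web1.example.com"
second = balancer.pick()  # "web2.example.com"
third = balancer.pick()   # back to "web1.example.com"
```

This sketch does not check whether a backend is healthy or how loaded it is, so it spreads requests evenly but does not react to the partial unavailability described earlier.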
Software
The four most common HTTP serving
programs are:
- Apache HTTP Server
from the Apache Software
Foundation.
- Internet Information Services
(IIS) from Microsoft.
- Sun Java System Web Server
from Sun Microsystems,
formerly Sun ONE
Web Server, iPlanet
Web Server, and Netscape
Enterprise Server.
- Zeus Web Server
from Zeus Technology.
There are thousands of different
Web server programs available, many of which are
specialized for very specific purposes.
See Category:Web
server software for a longer list of HTTP
server programs.
See also
- HTTP, HTTPS
- comparison of web servers
- tiny web servers
- SSI, CGI,
SCGI, FastCGI,
PHP, Java
Servlet, JavaServer
Pages,
ASP, ASP.NET,
Server API
- Virtual hosting
- LAMP (software bundle)
- Web browser
- Web log analysis software
- Web hosting service
- Application server
- Mac OS X Server
- HTTP compression