An HTTP server should do more than just serve files. It should play an active role in both navigation and presentation issues. It is my hope that this server provides better tools for the creative webmaster. - John Franks
WN is a server for the Hypertext Transfer Protocol HTTP/1.1. Its primary design goals are security, robustness, and flexibility, in that order. One of its objectives is to provide functionality usually available only with complex CGI programs without the necessity of writing or using these programs. (Of course, CGI/1.1 is fully supported for those who want it.) Despite this extensive functionality, the WN executable is substantially smaller than the CERN httpd, NCSA httpd or Apache servers.
WN was planned with a focus on serving HTML documents. This means such things as enabling full text searching of a single logical HTML document which may consist of many files on the server, or allowing users to search all titles on the server and obtain a menu of matching items, or allowing users to download a total logical document for printing which, in fact, consists of many linked files on the server. All of these are done in a way which is transparent to the user (and largely transparent to the maintainer)! The "User's Guide for the WN Server", which this chapter is part of, provides a good example of many of these features.
Another feature not found in many other servers is conditionally served text. Often a server maintainer may wish to serve different versions of a document to different clients. By adding simple HTML comments to documents and marking those documents to be "parsed" by the server, the maintainer can arrange that different sections or entirely different documents are sent to clients, based on such things as the client's domain name, IP address, browser type, browser "Accept" header, "Cookie" header, etc. This feature is described in more detail in the section "Conditional Text: If, Else, and Endif" in this guide.
But these are only examples of many new tools WN makes available to webmasters.
The design and security mechanisms of WN differ substantially from those of the httpd servers available from CERN and NCSA so a brief description of how they work is useful.
Files served by an HTTP server may have many attributes relevant to their serving. These attributes include content-type, optional title, optional expiration date, optional keywords, whether the file should be parsed for server-side includes, access restrictions, etc. Some servers try to encode this information in ad hoc ways, in a file name suffix, or in a global configuration file. The approach of WN is to keep this information in small databases, one for each directory in the document hierarchy.
The WN maintainer never needs to understand the format of these database files (named index.cache by default), but this format is very simple and a brief description will indicate how WN works. When the server receives a request, say for /dir/foo.html, it looks in the file /dir/index.cache which contains lines like:
file=foo.html&content=text/html&title=whatever...
If the server finds a line starting with "file=foo.html" then the file will be served. If such a line does not exist the file will not be served (unless special permission to serve all files in the directory has been granted). This is the basis of WN security.
Unlike other servers, the default action for WN is to deny access to a file. A file can only be served if explicit permission to do so has been granted by entering it in the index.cache database or if explicit permission to serve all files in /dir has been given in the index.cache file in /dir. This database also provides other security functions. For example, restricting the execution of CGI/1.1 programs can be done on the basis of the ownership (or group ownership) of their index.cache files. There is no need to limit execution to programs located in particular designated directories. The location of a file in the data hierarchy should be orthogonal to security restrictions on it and this is the case with the WN server.
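The lookup and the deny-by-default rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout is inferred from the single sample line shown earlier, the second sample record and the function names parse_cache_line and lookup are hypothetical, and the "serve all files in the directory" permission is left out.

```python
def parse_cache_line(line):
    """Split one 'key=value&key=value' record into a dict.

    Assumption: fields are separated by '&' and contain no escaped
    characters; the real index.cache format may differ in detail.
    """
    fields = {}
    for pair in line.strip().split("&"):
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


def lookup(cache_text, filename):
    """Return the record for filename, or None, which means 'do not serve'."""
    for line in cache_text.splitlines():
        if not line.strip():
            continue
        record = parse_cache_line(line)
        if record.get("file") == filename:
            return record
    return None  # no entry: access is denied by default


# Hypothetical index.cache contents modelled on the sample line above.
CACHE = """\
file=foo.html&content=text/html&title=whatever
file=clap.au&content=audio/basic&title=Sound of one hand clapping
"""

record = lookup(CACHE, "foo.html")
print(record["content"] if record else "denied")
print(lookup(CACHE, "secret.txt"))
```

A file that exists on disk but has no record, such as secret.txt here, is simply never served.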
The index.cache database file has a number of other functions beyond its security role. Attributes of foo.html which can be computed before it is served and which don't often change are stored in the fields of the line starting file=foo.html. For example, the MIME content type "text/html" must be deduced from the filename suffix ".html". This is done once at the time index.cache is created and need not be done every time the file is served.
The title of a file is another example. With the WN server every file served has a title (even binaries) and optionally has a list of keywords, an expiration date, and other fields associated with it. For an HTML document the title and the keywords are automatically extracted from the header of the document and stored in fields of that file's line in its index.cache file. These are used for the built-in keyword and title searches which the server supports. The maintainer also has the option of adding his own fields to this database file. They could contain such things as document author, document id number, etc. These user-defined fields can be searched with the built-in WN searches, or their contents can be inserted into the document, on the fly, as it is served.
So how are the index.cache databases created? Their format is quite simple and a maintainer is free to create them any way she chooses, but normally they are created by the utility wndex (pronounced "windex"). This program, which is part of the WN distribution, is designed to produce the index.cache file from a file with a friendlier format with the default name "index.wn". A very simple index.wn file might look like:
File=foo.html
File=clap.au
Title=Sound of one hand clapping
File=hand
Title=Picture of one hand clapping
Content-type=image/png
Of course if the file hand were named hand.png the content-type line would not be necessary, as wndex could deduce the type from the .png suffix. Likewise it is not necessary to give a title for foo.html because wndex will read the HTML header from that file and extract the title and perhaps other things like keywords and expiration date.
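As a rough illustration of the suffix deduction just described, the Python sketch below groups index.wn-style lines into records and fills in a content type only when none was given. It is not the wndex implementation: the suffix table, the fallback type, and the names parse_index_wn and fill_content_type are all assumptions made for the example.

```python
import os

# Hypothetical suffix table; the table wndex really uses is not shown here.
SUFFIX_TYPES = {
    ".html": "text/html",
    ".png": "image/png",
    ".au": "audio/basic",
}


def parse_index_wn(text):
    """Group 'Key=value' lines into records; each File= line starts a new one."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        key = key.strip().lower()
        if key == "file":
            records.append({"file": value.strip()})
        elif records:
            records[-1][key] = value.strip()
    return records


def fill_content_type(record):
    """Keep an explicit Content-type, otherwise deduce one from the suffix."""
    if "content-type" not in record:
        suffix = os.path.splitext(record["file"])[1].lower()
        # The fallback type is an assumption made for this sketch.
        record["content-type"] = SUFFIX_TYPES.get(suffix, "application/octet-stream")
    return record


INDEX_WN = """\
File=foo.html
File=clap.au
Title=Sound of one hand clapping
File=hand
Title=Picture of one hand clapping
Content-type=image/png
"""

for rec in parse_index_wn(INDEX_WN):
    print(fill_content_type(rec))
```

Because the deduction happens once, when the database is built, the server never has to repeat it at request time.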
The WN server has several features which are not available with other servers or only available through the use of CGI/1.1 programs.
One of the design goals of WN is to provide the maintainer with tools to create extensive navigational aids for the server. A variety of search mechanisms are available.
Title searches: In response to a URL like <http://host/dir/search=title> the server will provide an HTML form (automatically generated or prepared by the maintainer) asking for a regular expression search term. When one is supplied, the server will search the index.cache files in /dir and designated subdirectories for items whose titles contain a match for the search term. An HTML document with a menu of these items is returned.
Keyword searches: For HTML documents the keywords are taken automatically from the document's <META> headers. For other documents (or HTML documents) they can be manually supplied in the index.wn file.
User-defined field searches: These fields are supplied by the maintainer in the index.wn file. Their purpose is to include items like a document id number or document author in the index.cache database. A field search could then produce all documents by a given author, for example. Or, using regular expressions in the search term, it could produce a list of all documents whose id numbers satisfy certain criteria.
Context searches: These are full text searches of all the text/* documents in one directory (not subdirectories). The returned HTML document contains a list of all the titles of documents containing a match together with a sublist of the lines from those documents containing the match. This provides one line of context for the match. For HTML documents the matched expression in each of these lines will be a highlighted anchor. Selecting one takes you to the document with your viewer focused on the matching location. The primary intent of this feature is to provide full text searching for an HTML "document" which might consist of a substantial number of files.
grep searches: A grep search returns a text/html document containing the lines in the file that match the regular expression.
wn_mkdigest utility which creates HTML documents to be searched in this way from files with internal structure like mail or news digests, mailing lists, etc.
All of the searching methods listed above except the index searches are built into the server and require no additional effort for the maintainer. They are simply referenced with URLs like <http://host/dir/search=context> where /dir is any directory containing files to be served and an index.cache listing them. Of course search permission can be denied for any directory or any file contained in that directory.
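To make the title search concrete, here is a minimal Python sketch that matches a regular expression against the title field of already-parsed index.cache records and renders the hits as an HTML menu. The sample records, the case-insensitive matching, and the names title_search and render_menu are assumptions for illustration; the server's own matching rules and output markup may differ.

```python
import re
from html import escape

# Hypothetical records, as they might look after parsing an index.cache file.
RECORDS = [
    {"file": "foo.html", "title": "Overview of the server"},
    {"file": "clap.au", "title": "Sound of one hand clapping"},
    {"file": "hand.png", "title": "Picture of one hand clapping"},
]


def title_search(records, pattern):
    """Return the records whose title contains a match for the pattern."""
    # Case-insensitive matching is an assumption of this sketch.
    regex = re.compile(pattern, re.IGNORECASE)
    return [r for r in records if regex.search(r.get("title", ""))]


def render_menu(matches):
    """Render matching records as an unordered list of links."""
    items = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(
            escape(r["file"], quote=True), escape(r["title"])
        )
        for r in matches
    )
    return "<ul>\n" + items + "\n</ul>"


print(render_menu(title_search(RECORDS, r"one\s+hand")))
```

Since only the small database files are scanned, a search of this kind never needs to open the documents themselves.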
The WN server has extensive capabilities for automatically including files in one which is being served or "wrapping" a served file with another, i.e. pre-pending and post-pending information to a file being served. The latter is useful if you wish to place a standard message at the beginning or end (or both) of a large collection of files. For security, all files included in a file or used as a wrapper for it are listed in that file's index.cache file. This, combined with various available security options, like requiring that a served file and all its includes and wrappers have the same owner (or group owner) as the index.cache file listing them, provides a safe and productive Web environment.
One important application of wrappers is to customize the HTML documents returned listing the successful search matches. If a search item is given a wrapper the server assumes that it contains text describing the search and it merely inserts an unordered list of links to the matching items.
In addition to including files, the output of programs may be inserted, as may the value of any user-defined field in the index.cache database entry for a file. Parsed text may also conditionally insert items with a simple if - else - endif construct based on Accept headers, User-Agent headers, Referer headers, etc.
An arbitrary filter can be assigned to any file to be served. A filter is a program which reads the file and has the program output served rather than the content of the file. The name of the filter is another field in the file's line in its index.cache file. One common use of this feature is for on-the-fly decompression. For example, a file can be stored in its compressed form and assigned a filter like the UNIX zcat(1) utility which uncompresses it. Then the client is served the uncompressed file but only the compressed version is stored on disk. As another example, you might use the UNIX nroff(1) utility, "nroff -man", as a filter to process UNIX man files before serving. There are many other interesting uses of filters. Be creative!
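The filter idea can be sketched in Python as follows. This is an assumption-laden illustration, not the server's code: it supposes the filter receives the file on standard input and that whatever it writes to standard output is served, and the name serve_through_filter is hypothetical.

```python
import subprocess


def serve_through_filter(path, filter_argv):
    """Run the filter with the file on stdin and return what it writes to stdout.

    Assumption: the filter reads the stored file from standard input.
    check=True raises CalledProcessError if the filter exits with an error,
    so a failed filter never yields a partial response.
    """
    with open(path, "rb") as source:
        result = subprocess.run(
            filter_argv,
            stdin=source,
            stdout=subprocess.PIPE,
            check=True,
        )
    return result.stdout
```

With a compressed file on disk, a call such as serve_through_filter("guide.txt.gz", ["zcat"]) would return the uncompressed bytes, and ["nroff", "-man"] could be used the same way for a man page; the file name here is only an example.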
An arbitrary range of a file can be served. If the server is accessed via a URL like <http://host/dir/foo;lines=20-30> and foo is any text/* document, it will return a text/plain document consisting of lines 20 through 30 of file foo. This is very useful for structured text files like address lists or digests of mail and news. A WN utility called wn_mkdigest will produce an HTML document with a list of links to separate sections (line ranges) of the structured file. The wn_mkdigest utility is executed with two regular expressions as arguments: one to match the section separator and the other to match the section title. For a mail digest, for example, these could be "^From" and "^Subject:" respectively. Then the sections of the virtual documents would be delimited by a line starting with "From" and would have the message subject as their title. A similar mechanism provides byte ranges from files.
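The line-range and digest ideas can be sketched together in Python. The code below is an illustration only: line_range and digest_sections are hypothetical names, the sample mailbox is invented, and stripping the matched prefix from the title line is an assumption about how a digest might present section titles, not a description of wn_mkdigest itself.

```python
import re


def line_range(text, first, last):
    """Return lines first..last (1-based, inclusive) of text."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[first - 1:last])


def digest_sections(text, separator, title):
    """List (title, first_line, last_line) for each section of a structured file."""
    sep_re = re.compile(separator)
    title_re = re.compile(title)
    lines = text.splitlines()
    # 1-based numbers of the lines that open a section.
    starts = [n for n, line in enumerate(lines, 1) if sep_re.search(line)]
    sections = []
    for i, start in enumerate(starts):
        end = starts[i + 1] - 1 if i + 1 < len(starts) else len(lines)
        # First line in the section matching the title expression, with the
        # matched prefix removed (an assumption of this sketch).
        name = next(
            (title_re.sub("", l).strip() for l in lines[start - 1:end] if title_re.search(l)),
            "(untitled)",
        )
        sections.append((name, start, end))
    return sections


# Invented sample data in the style of a mail digest.
MAILBOX = """\
From alice Mon Jan  1
Subject: First message
Hello there.
From bob Tue Jan  2
Subject: Second message
Hi again.
"""

for name, first, last in digest_sections(MAILBOX, r"^From", r"^Subject:"):
    print("{};lines={}-{}".format(name, first, last))
```

Each (title, first, last) triple maps directly onto a link of the ";lines=first-last" form, which is how a digest page can point into one large file without splitting it.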