This is a list of all parsing instructions recognized by WN
while parsing an HTML document. Note that only documents with MIME type
text/html can be parsed for the purposes described here.
All parsing instructions use one of the two equivalent forms:
<!-- #something -->
or:
<?WN something>
There is a maximum allowed size of 2K bytes for the entire
"<!-- #something -->" expression. Current versions of WN no longer
require this expression to be on a line by itself.
The second form is considered more SGML/XML friendly by many, since
"<?WN something >" indicates a processing instruction specific to WN
rather than a comment. For historical reasons this manual describes
the first form, but either may be used. With the first form the '#'
is required, but with the second you may use either:
<?WN #something>
or:
<?WN something>
Also "<?wn #something>" is fine. The case of the WN is not significant.
#if and #elif
This section describes the use of conditionally included text of the form:
<!-- #if some_condition -->
Some conditional text goes here.
<!-- #elif another_condition -->
Some other conditional text goes here.
<!-- #else -->
Alternate text.
<!-- #endif -->
This inserts the first conditional text only if some_condition is
satisfied. The "<!-- #elif another_condition -->" and
"<!-- #else -->" lines are optional. There may be multiple "#elif"
lines.
In all the examples below, the two characters '=~', which indicate a
matching regular expression, can be replaced with the two characters
'!~', in which case the if clause is true when the regular expression
fails to match.
Also, in examples of the form
<!-- #if accept file="foo" -->
the file foo is assumed to be relative to the current directory unless
it begins with a '/', in which case it is taken relative to the WN data
hierarchy root. The format of these files is a list of grep(1)-like
regular expressions, one per line, with any white space being taken as
part of the expression. Lines beginning with '#' are taken to be
comments. If a regular expression is preceded by the character '!'
then that character is skipped but the truth value of any match with
the expression is reversed.
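The per-line rules for these pattern files can be sketched in Python. This is only a model of the description above, not WN's own code: the function name is hypothetical, Python's re module stands in for the server's grep-like matcher, and how the results of several lines are combined into one truth value is deliberately not modeled.

```python
import re


def evaluate_pattern_line(line, value):
    """Evaluate one line of a pattern file against a header value.

    Returns None for a comment line, otherwise True or False.
    Assumption: Python's re.search stands in for WN's grep-like matcher.
    """
    line = line.rstrip("\n")  # drop only the newline; other white space is kept
    if line.startswith("#"):
        return None  # comment line, no truth value
    negated = line.startswith("!")
    pattern = line[1:] if negated else line  # the '!' itself is skipped
    matched = re.search(pattern, value) is not None
    return (not matched) if negated else matched
```

With this model, the line "!image/" is true for the value "text/html", and a leading space in a line is treated as part of the expression.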
The regular expressions recognized by the WN server are the same
as those of the UNIX grep(1) utility (though this utility is not used,
as the server has its own regular expression functions). The more
general regular expressions used, for example, in the UNIX egrep(1)
utility are not supported by WN.
The condition in the "#if" or "#elif" tags can be made more complex
than those described above by combining simple conditions using the
logical operations '&&' for 'and', '||' for 'or' and '!' for 'not'.
Parentheses may be used for grouping. For example:
<!-- #if cond_1 && cond_2 -->
Text to show if cond_1 and cond_2 are satisfied.
<!-- #endif -->
Other examples are:
<!-- #if cond_1 || cond_2 -->
<!-- #if !cond_1 -->
<!-- #if (cond_1 || cond_2) && !cond_3 -->
The '&&' and '||' operations have equal precedence and associate from
right to left.
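Because this grouping differs from C-like languages, a small Python sketch may help. It reads the rule above literally for a flat chain of conditions without parentheses or '!'; the function name is hypothetical and this is not WN's parser.

```python
def eval_right_to_left(values, operators):
    """Evaluate v0 op0 v1 op1 v2 ... with equal precedence, grouping from the right.

    values    -- list of booleans, one per simple condition
    operators -- list of "&&" / "||" strings, one fewer than values
    """
    result = values[-1]
    # Fold from the rightmost operator toward the left:
    # a op0 b op1 c  is grouped as  a op0 (b op1 c).
    for value, op in zip(reversed(values[:-1]), reversed(operators)):
        if op == "&&":
            result = value and result
        elif op == "||":
            result = value or result
        else:
            raise ValueError("unknown operator: " + op)
    return result
```

Under this reading "a && b || c" groups as "a && (b || c)", so with a and b false and c true the result is false, whereas C-style precedence would give true. Explicit parentheses avoid any doubt.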
#if and #elif Conditions
#if accept -- Match client's Accept headers
The lines:
<!-- #if accept =~ "regexp" -->
or:
<!-- #if accept file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches any of
the Accept headers supplied by the client, or (in the second case) if
the file "foo" contains a regular expression matching any of the
Accept headers.
#if accept_charset -- Match client's Accept-Charset headers
The lines:
<!-- #if accept_charset =~ "regexp" -->
or:
<!-- #if accept_charset file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches any of
the Accept-Charset headers supplied by the client, or (in the second
case) if the file "foo" contains a regular expression matching any of
the Accept-Charset headers.
#if accept_encoding -- Match client's Accept-Encoding headers
The lines:
<!-- #if accept_encoding =~ "regexp" -->
or:
<!-- #if accept_encoding file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches any of
the Accept-Encoding headers supplied by the client, or (in the second
case) if the file "foo" contains a regular expression matching any of
the Accept-Encoding headers.
#if accept_language -- Match client's Accept-Language headers
The lines:
<!-- #if accept_language =~ "regexp" -->
or:
<!-- #if accept_language file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches any of
the Accept-Language headers supplied by the client, or (in the second
case) if the file "foo" contains a regular expression matching any of
the Accept-Language headers.
#if after and #if before -- Select text based on date
The lines:
<!-- #if after "date" -->
or:
<!-- #if before "date" -->
specify that this text segment should be served if the current time is
after (or before) the specified date. That is, the line:
<!-- #if after "22 Oct 1996 17:41:26" -->
will cause the text segment to be served only after
"22 Oct 1996 17:41:26" local time. The date format is rather rigid.
It must be in precisely the format shown above (specified by RFC 1123)
and with a single space between each field. Only the local time of
the server is supported.
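To check a date string against the rigid format before putting it in a document, something like the following Python sketch could be used. It is an illustration only: the function names are hypothetical, the strptime format string is an assumed equivalent of the example shown above, and treating "after" and "before" as strict comparisons is also an assumption.

```python
from datetime import datetime

# Assumed equivalent of the "22 Oct 1996 17:41:26" layout.
# %b expects English month abbreviations in the default C locale.
DATE_FORMAT = "%d %b %Y %H:%M:%S"


def parse_wn_date(text):
    """Parse a date in the rigid layout, insisting on single spaces."""
    fields = text.split(" ")
    if len(fields) != 4 or "" in fields:
        raise ValueError("expected four fields separated by single spaces")
    return datetime.strptime(text, DATE_FORMAT)


def is_after(text, now):
    """True if the local time 'now' is later than the given date."""
    return now > parse_wn_date(text)


def is_before(text, now):
    """True if the local time 'now' is earlier than the given date."""
    return now < parse_wn_date(text)
```

For example, is_after("22 Oct 1996 17:41:26", datetime.now()) is true on any current server clock, while a string with two spaces between fields is rejected.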
#if cookie -- Match client's Cookie headers
The lines:
<!-- #if cookie =~ "regexp" -->
or:
<!-- #if cookie file= "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches any of the Cookie
headers supplied by the client (in the first case) or if the file
"foo" contains a matching regular expression (in the second case).
More information about the proposed HTTP Set-Cookie header is
available at http://home.netscape.com/newsref/std/cookie_spec.html.
#if environ VAR -- Match client's environment variable VAR
The lines:
<!-- #if environ VAR =~ "regexp" -->
or:
<!-- #if environ VAR; file= "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the contents of the
server's environment variable VAR (in the first case) or if the file
"foo" contains a matching regular expression (in the second case).
#if field -- Match document's user defined field
The lines:
<!-- #if field3 =~ "regexp" -->
or:
<!-- #if field3 file= "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the contents of the
user defined field number 3 (in the first case) or if the file "foo"
contains a matching regular expression (in the second case). Any
valid field number may be used in place of 3.
#if hostname -- Match client's hostname
The lines:
<!-- #if hostname =~ "regexp" -->
or:
<!-- #if hostname file= "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the hostname of the
client (in the first case) or if the file "foo" contains a matching
regular expression (in the second case). For an alternate method of
doing this see the "#if accessfile" syntax described below.
Be aware that the character '.' (dot) has a special meaning in regular
expressions and must be escaped with a '\' to have its usual meaning.
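The effect of an unescaped dot can be seen in a short Python sketch. Python's re module is used here only as a stand-in for the server's matcher (the dot, the backslash escape and '$' behave the same way in both), and the hostnames are made up for illustration.

```python
import re


def matches(pattern, hostname):
    """True if the pattern matches anywhere in the hostname."""
    return re.search(pattern, hostname) is not None


# Escaped dot: only a literal '.' before "uk" at the end of the name matches.
escaped_hits = matches(r"\.uk$", "www.example.ac.uk")
escaped_miss = matches(r"\.uk$", "www.examplexuk")

# Unescaped dot: any single character may stand before "uk".
unescaped_hits = matches(r".uk$", "www.examplexuk")
```

Here escaped_hits and unescaped_hits are true while escaped_miss is false, so the unescaped pattern accepts a hostname that was probably not intended.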
#if host_header -- Match server's virtual hostname from client's HTTP Host header
The line:
<!-- #if host_header =~ "regexp" -->
specifies that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the contents of the
HTTP "Host:" header supplied by the client in its request.
Be aware that the character '.' (dot) has a special meaning in regular
expressions and must be escaped with a '\' to have its usual meaning.
#if IP -- Match client's IP address
The lines:
<!-- #if IP =~ "regexp" -->
or:
<!-- #if IP file= "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the IP address of the
client (in the first case) or if the file "foo" contains a matching
regular expression (in the second case). For an alternate method of
doing this see the "#if accessfile" syntax described below.
Be aware that the character '.' (dot) has a special meaning in regular
expressions and must be escaped with a '\' to have its usual meaning.
#if language -- Match client's Accept-Language headers
The lines:
<!-- #if language =~ "regexp" -->
or:
<!-- #if language file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches any of
the Accept-Language headers supplied by the client, or (in the second
case) if the file "foo" contains a regular expression matching any of
the Accept-Language headers.
These forms are deprecated. The preferred form is to use
#if accept_language in place of #if language.
#if query -- Match query string supplied in request URL
The lines:
<!-- #if query =~ "regexp" -->
or:
<!-- #if query file = "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the query string
supplied by the client in the URL (in the first case) or if the file
"foo" contains a matching regular expression (in the second case).
#if referer -- Match client supplied Referer: header
The lines:
<!-- #if referer =~ "regexp" -->
or:
<!-- #if referer file = "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the contents of the
Referer: header supplied by the client (in the first case) or if the
file "foo" contains a matching regular expression (in the second
case). The Referer: header contains the URL of the document
containing the link accessed to obtain the current document.
#if request -- Match client's request
The lines:
<!-- #if request =~ "regexp" -->
or:
<!-- #if request file = "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the full text of the
request supplied by the client (in the first case) or if the file
"foo" contains a matching regular expression (in the second case).
The full request contains the "method" (GET or POST) followed by the
URL requested with the "http://host" part having been removed (by the
client).
#if TE -- Match client's TE header
The lines:
<!-- #if TE =~ "regexp" -->
or:
<!-- #if TE file = "foo" -->
specify that this text segment should be served if (in the first
case) the UNIX grep(1)-like regular expression "regexp" matches the TE
header supplied by the client, or (in the second case) if the file
"foo" contains a regular expression matching the TE header. The TE
header indicates which transfer codings the client is willing to
accept.
#if true and #if false -- Include or exclude text segment
The line:
<!-- #if false -->
specifies that the corresponding text segment should not be served.
It may be useful for "commenting out" a part of a document which is
under construction. The "#if true" construct is present for logical
completeness.
#if UA -- Match client's User-Agent: header
The lines:
<!-- #if UA =~ "regexp" -->
or:
<!-- #if UA file = "foo" -->
specify that this text segment should be served if the UNIX
grep(1)-like regular expression "regexp" matches the User-Agent:
header supplied by the client (in the first case) or if the file "foo"
contains a matching regular expression (in the second case).
The normal access control files used by WN to limit access to a directory can also be used to conditionally permit or deny access to text segments.
#if accessfile="filename" -- Check access control file
The line:
<!-- #if accessfile="/dir/accessfile" -->
specifies that the file /dir/accessfile is to be used to determine
access privileges (by hostname or IP address) for this text segment.
The path /dir/accessfile is relative to the server root directory.
If this path does not begin with a '/' then the path is relative to
the directory containing the file with this text. See the chapter
"Limiting Access to Your WN Hierarchy" in this guide.
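The path rule above can be sketched in Python as follows. The function name and the example directories are hypothetical, and the sketch models only the two cases described (leading '/' or not); it is not taken from the WN source.

```python
import posixpath


def resolve_wn_path(path, root_dir, current_dir):
    """Resolve a path as described above.

    root_dir    -- the server root directory (hypothetical example value)
    current_dir -- the directory containing the document being parsed
    """
    if path.startswith("/"):
        # A leading '/' means "relative to the server root", not the
        # file system root.
        return posixpath.join(root_dir, path.lstrip("/"))
    return posixpath.join(current_dir, path)
```

With a root of /usr/local/wn and a document in /usr/local/wn/docs, "/dir/accessfile" resolves to /usr/local/wn/dir/accessfile and "accessfile" resolves to /usr/local/wn/docs/accessfile.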
#include -- Insert the contents of a file
The line:
<!-- #include -->
specifies that the contents of the next file listed in the includes or
wrappers should be inserted at this point. It is permissible to add
the name of this file, as in:
<!-- #include foo.txt -->
but this acts only as a comment. The actual file inserted depends
only on the "Includes=" and "Wrappers=" directives in the index.wn
file (or more precisely the index.cache file created from it).
#section -- Insert part of the contents of a file
The line:
<!-- #section -->
specifies that part of the contents of the next file listed in the
includes or wrappers should be inserted at this point. It is
permissible to add the name of this file, as in:
<!-- #section foo.txt -->
but this acts only as a comment. The actual file inserted depends
only on the "Includes=" and "Wrappers=" directives in the index.wn
file (or more precisely the index.cache file created from it).
The part of the file actually included is that portion of the
document between the special comments "<!-- #start -->" and
"<!-- #end -->" inserted in that document. This requires that these
starting and ending comments occur in the HTML document on lines by
themselves. For more information see the section "More on Including:
the section Marker" in this guide.
#start and #end -- Mark the beginning and end of text to be included
The lines:
<!-- #start -->
and:
<!-- #end -->
mark the beginning and end of the portion of the text to be inserted
from an include or wrapper in response to encountering
"<!-- #section -->" in the text of a document being parsed. There can
be more than one "#start/#end" pair in a document. For more
information see the section "More on Including: the section Marker"
in this guide.
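As a rough model of what the markers select, the following Python sketch collects the lines lying between "#start" and "#end" comments. The function name is hypothetical, and two points are assumptions rather than documented behavior: that the text of every pair is included in order, and that white space around a marker line is ignored.

```python
START_MARKER = "<!-- #start -->"
END_MARKER = "<!-- #end -->"


def extract_marked_text(text):
    """Return the lines found between #start and #end marker lines."""
    kept = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()  # assumption: surrounding white space ignored
        if stripped == START_MARKER:
            inside = True
        elif stripped == END_MARKER:
            inside = False
        elif inside:
            kept.append(line)
    return "\n".join(kept)
```

Text outside every pair, such as a page header or footer in the included file, is dropped by this model.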
#title, #query, and #field -- Insert the title, current search string, or a user defined field
The lines:
<!-- #title -->
<!-- #query -->
or:
<!-- #field 3 -->
in a parsed document instruct the server to include, respectively,
the title of the current document, the current search term from the
client, or the value of user defined "field #3" for the current
document. All of these markers must occur on a line by themselves.
For more information see the section "Including Title, Query, Fields
and Environment Variables" in this guide.
#environ -- Insert the contents of an environment variable
The line:
<!-- #environ = "WHATEVER" -->
in a parsed document instructs the server to include the contents of
the environment variable WHATEVER. Remember to use an
"Attributes=parse" line when using this construct and to use an
"Attributes=cgi" line when it is a CGI variable like HTTP_REFERER
which is to be included.
#redirect -- Redirect to a different URL
The line:
<!-- #redirect = "url" -->
specifies that if no text has yet been sent the server should send an
HTTP redirect to the given URL. This might be used as follows. If the
text:
<!-- #if hostname =~ "\.uk$" -->
<!-- #redirect = "UK_mirror_url" -->
<!-- #endif -->
is included at the beginning of an HTML document then any request
from a uk host will automatically be redirected to the specified URL,
the UK_mirror_url in this case. This mechanism could also be used to
redirect text only browsers to a text only alternative page, etc.
There must be no text sent before the
'<!-- #redirect = "url" -->'
is encountered (not even blank lines) since the server cannot send an
HTTP redirect while in the middle of transmitting a document. Thus
the example above would be an error if there are any blank lines
before the "#if hostname" line or any blank lines after it before the
"#redirect" line. When such an error occurs it is logged in the error
file and the "#redirect" line is ignored.
Note however that:
<!-- #if hostname =~ "\.uk$" -->
[Lots of text here]
<!-- #else -->
<!-- #redirect = "some_URL" -->
<!-- #endif -->
is correct since when the #redirect line is encountered no text has
been sent.
Normally the URL in the
"<!-- #redirect = "URL" -->"
line is fully qualified, like "http://host/path/foo". However, it can
also be simply "foo", referring to a file in the same directory as the
file being parsed. In this case an HTTP redirection is not sent, and
instead the file "foo" is returned immediately to the client.
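The three outcomes described for "#redirect" can be summarized in a small Python sketch. The function name and the returned labels are hypothetical, and recognizing a fully qualified URL by the presence of "://" is an assumption made for illustration; targets of other shapes are not covered by the text above.

```python
def redirect_action(target, text_already_sent):
    """Classify what happens when a #redirect line is reached.

    Returns a (label, target) pair; the labels are illustrative only.
    """
    if text_already_sent:
        # Too late for an HTTP redirect: the error is logged and the
        # line is ignored.
        return ("ignored", target)
    if "://" in target:
        # Fully qualified URL: an HTTP redirect is sent.
        return ("http_redirect", target)
    # Plain name such as "foo": the file in the same directory is
    # returned directly, with no HTTP redirection.
    return ("serve_file", target)
```

For instance, redirect_action("foo", False) yields ("serve_file", "foo"), while any target reached after text has been sent yields "ignored".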