makeinfo generates Info output by default, but given the ‘--html’ option, it will generate HTML, for web browsers and other programs. This chapter gives some details on such HTML output.
makeinfo can also write in XML and Docbook format, but we do not as yet describe these further. See section Output Formats, for a brief overview of all the output formats.
22.1 HTML Translation | Details of the HTML output.
22.2 HTML Splitting | How HTML output is split.
22.3 HTML CSS | Influencing HTML output with Cascading Style Sheets.
22.4 HTML Cross-references | Cross-references in HTML output.
22.1 HTML Translation
makeinfo will include segments of Texinfo source between @ifhtml and @end ifhtml in the HTML output (but not any of the other conditionals, by default). Source between @html and @end html is passed without change to the output (i.e., suppressing the normal escaping of input ‘<’, ‘>’ and ‘&’ characters which have special significance in HTML). See section Conditional Commands.
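As a rough illustration of the escaping that @html suppresses, the following sketch uses Python's standard html.escape. The function name render_text and its raw_html flag are hypothetical; this is not makeinfo's own code.

```python
import html


def render_text(text, raw_html=False):
    """Return text roughly as it would appear in HTML output.

    raw_html=True stands in for source between @html and @end html,
    which is passed through unchanged; otherwise '<', '>' and '&'
    are escaped.
    """
    if raw_html:
        return text
    # quote=False leaves quotation marks alone; only <, > and & change.
    return html.escape(text, quote=False)


print(render_text("if a < b && c > d"))
print(render_text("<b>bold</b>", raw_html=True))
```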
The ‘--footnote-style’ option is currently ignored for HTML output; footnotes are always linked to the end of the output file.
By default, a navigation bar is inserted at the start of each node, analogous to Info output. The ‘--no-headers’ option suppresses this if used with ‘--no-split’. Header <link> elements in split output can support info-like navigation with browsers like Lynx and Emacs W3 which implement this HTML 1.0 feature.
The HTML generated is mostly standard (i.e., HTML 2.0, RFC-1866). One exception is that HTML 3.2 tables are generated from the @multitable command, but tagged to degrade as well as possible in browsers without table support. The HTML 4 ‘lang’ attribute on the ‘<html>’ element is also used. (Please report output from an error-free run of makeinfo which has browser portability problems as a bug.)
22.2 HTML Splitting
When splitting output (which is the default), makeinfo writes HTML output into (generally) one output file per Texinfo source @node.
The output file name is the node name with special characters replaced by ‘-’ characters, so it can work as a filename. In the unusual case of two different nodes having the same name after this treatment, they are written consecutively to the same file, with HTML anchors so each can be referred to separately. If makeinfo is run on a system which does not distinguish case in filenames, nodes which are the same except for case will also be folded into the same output file.
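The folding of several nodes into one file can be pictured with the sketch below. It follows only the simplified description in this section; the function names are hypothetical, and reading "special characters" as "anything other than ASCII letters and digits" is an assumption.

```python
import re


def split_file_name(node_name):
    # Assumption: every character other than an ASCII letter or digit
    # counts as "special" and becomes '-'.
    return re.sub(r"[^A-Za-z0-9]", "-", node_name) + ".html"


def group_nodes(node_names, case_insensitive_fs=False):
    """Map each output file name to the nodes written into it, in order."""
    files = {}
    for name in node_names:
        file_name = split_file_name(name)
        # On a case-folding file system, names differing only by case
        # end up in the same file.
        key = file_name.lower() if case_insensitive_fs else file_name
        files.setdefault(key, []).append(name)
    return files


nodes = ["Foo bar", "Foo-bar", "foo bar"]
print(group_nodes(nodes))
print(group_nodes(nodes, case_insensitive_fs=True))
```

Nodes sharing a file would still be distinguished by their HTML anchors, as described above.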
When splitting, the HTML output files are written into a subdirectory, with the name chosen as follows:
1. makeinfo first tries the subdirectory with the base name from @setfilename (that is, any extension is removed). For example, HTML output for @setfilename gcc.info would be written into a subdirectory named ‘gcc’.
2. If that directory cannot be created for any reason, then makeinfo tries appending ‘.html’ to the directory name. For example, output for @setfilename texinfo would be written to ‘texinfo.html’.
3. If the ‘name.html’ directory cannot be created either, makeinfo gives up.
In any case, the top-level output file within the directory is always named ‘index.html’.
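The directory choice above amounts to a try-then-fall-back loop. The sketch below is an illustration under assumptions, not makeinfo's implementation: the function name choose_output_dir and the parent parameter are hypothetical, and the second name is tried whenever creating the first one fails.

```python
import os
import tempfile


def choose_output_dir(setfilename, parent="."):
    """Create and return the directory for split HTML output."""
    # Base name from @setfilename, with any extension removed.
    base = os.path.splitext(os.path.basename(setfilename))[0]
    for candidate in (base, base + ".html"):
        path = os.path.join(parent, candidate)
        try:
            os.makedirs(path, exist_ok=True)
        except OSError:
            continue  # e.g. a regular file of that name is in the way
        return path
    raise RuntimeError("cannot create an output directory for " + setfilename)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = choose_output_dir("gcc.info", parent=tmp)
    # The top-level output file is always named index.html.
    print(os.path.join(out_dir, "index.html"))
```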
Monolithic output (--no-split) is named according to @setfilename (with any ‘.info’ extension replaced with ‘.html’) or --output (the argument is used literally).
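A minimal sketch of this naming rule follows. The function name is hypothetical, and the handling of a @setfilename value without a ‘.info’ extension (simply appending ‘.html’) is an assumption, since the text above does not cover that case.

```python
def monolithic_output_name(setfilename, output=None):
    """Name of the single HTML file produced with --no-split."""
    if output is not None:
        return output  # the --output argument is used literally
    if setfilename.endswith(".info"):
        return setfilename[: -len(".info")] + ".html"
    # Assumption: a name without '.info' simply gains '.html'.
    return setfilename + ".html"


print(monolithic_output_name("gcc.info"))
print(monolithic_output_name("gcc.info", output="out/manual"))
```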
22.3 HTML CSS
Cascading Style Sheets (CSS for short) is an Internet standard for influencing the display of HTML documents: see http://www.w3.org/Style/CSS/.
By default, makeinfo includes a few simple CSS commands to better implement the appearance of some of the environments. Here are two of them, as an example:
pre.display { font-family:inherit }
pre.smalldisplay { font-family:inherit; font-size:smaller }
A full explanation of CSS is (far) beyond this manual; please see the reference above. In brief, however, this specification tells the web browser to use a ‘smaller’ font size for @smalldisplay text, and to use the ‘inherited’ font (generally a regular roman typeface) for both @smalldisplay and @display. By default, the HTML ‘<pre>’ element uses a monospaced font.
You can influence the CSS in the HTML output with two makeinfo options: ‘--css-include=file’ and ‘--css-ref=url’.
The option ‘--css-ref=url’ adds to each output HTML file a ‘<link>’ tag referencing a CSS at the given url. This allows using external style sheets.
The option ‘--css-include=file’ includes the contents of file in the HTML output, as you might expect. However, the details are somewhat tricky, as described in the following, to provide maximum flexibility.
The CSS file may begin with so-called ‘@import’ directives, which link to external CSS specifications for browsers to use when interpreting the document. Again, a full description is beyond our scope here, but we’ll describe how they work syntactically, so we can explain how makeinfo handles them.
There can be more than one ‘@import’, but they have to come first in the file, with only whitespace and comments interspersed, no normal definitions. (Technical exception: an ‘@charset’ directive may precede the ‘@import’s. This does not alter makeinfo’s behavior; it just copies the ‘@charset’ if present.) Comments in CSS files are delimited by ‘/* ... */’, as in C. An ‘@import’ directive must be in one of these two forms:
@import url(http://example.org/foo.css);
@import "http://example.net/bar.css";
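As far as makeinfo is concerned, the crucial characters are the ‘@’ at the beginning and the semicolon terminating the directive. When reading the CSS file, it simply copies any such ‘@’-directive into the output, as follows:
• If file contains only normal CSS declarations, it is included after makeinfo’s default CSS, thus overriding it.
• If file begins with ‘@import’ specifications, then the ‘@import’s are included first (they have to come first, according to the standard), and then makeinfo’s default CSS is included. If you need to override makeinfo’s defaults from an ‘@import’, you can do so with the ‘! important’ CSS construct, as in:
pre.smallexample { font-size: inherit ! important }
• If file contains both ‘@import’ and inline CSS specifications, the ‘@import’s are included first, then makeinfo’s defaults, and lastly the inline CSS from file.
• Any @-directive other than ‘@import’ and ‘@charset’ is treated as a CSS declaration, meaning makeinfo includes its default CSS and then the rest of the file.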
If the CSS file is malformed or erroneous, makeinfo’s output is unspecified. makeinfo does not try to interpret the meaning of the CSS file in any way; it just looks for the special ‘@’ and ‘;’ characters and blindly copies the text into the output. Comments in the CSS file may or may not be included in the output.
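The ordering applied to a ‘--css-include’ file can be sketched as below. This is an illustration only: DEFAULT_CSS is a hypothetical stand-in for makeinfo's built-in rules, the function names are invented, and the sketch chooses to drop leading comments (the text above leaves that open). Like makeinfo, it does not interpret the CSS; it only looks for a leading ‘@’ and the terminating ‘;’.

```python
import re

# Hypothetical stand-in for makeinfo's built-in rules.
DEFAULT_CSS = "pre.display { font-family:inherit }\n"

# Things allowed before the first normal definition.
_LEADING = re.compile(
    r"\s+"                             # whitespace
    r"|/\*.*?\*/"                      # a /* ... */ comment
    r"|@(?:import|charset)\b[^;]*;",   # '@' through the terminating ';'
    re.DOTALL,
)


def split_css(text):
    """Return (leading @charset/@import directives, rest of the file)."""
    pos, directives = 0, []
    while True:
        match = _LEADING.match(text, pos)
        if not match:
            break
        if match.group(0).startswith("@"):
            directives.append(match.group(0))
        pos = match.end()
    return directives, text[pos:]


def assemble_css(text, default_css=DEFAULT_CSS):
    """Leading directives first, then the defaults, then the rest."""
    directives, rest = split_css(text)
    return "".join(d + "\n" for d in directives) + default_css + rest


print(assemble_css('@import "http://example.net/bar.css";\nbody { color: red }\n'))
```

Any other @-directive (for example ‘@media’) is not recognised by the pattern and so falls into the part emitted after the defaults.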
22.4 HTML Cross-references
Cross-references between Texinfo manuals in HTML format amount, in the end, to a standard HTML <a> link, but the details are unfortunately complex. This section describes the algorithm used in detail, so that Texinfo can cooperate with other programs, such as texi2html, by writing mutually compatible HTML files.
This algorithm may or may not be used for links within HTML output for a Texinfo file. Since no issues of compatibility arise in such cases, we do not need to specify this.
We try to support references to such “external” manuals in both monolithic and split forms. A monolithic (mono) manual is entirely contained in one file, and a split manual has a file for each node. (See section HTML Splitting.)
Acknowledgement: this algorithm was primarily devised by Patrice Dumas in 2003–04.
22.4.1 HTML Cross-reference Link Basics
For our purposes, an HTML link consists of four components: a host name, a directory part, a file part, and a target part. We always assume the http protocol. For example:
http://host/dir/file.html#target
The information to construct a link comes from the node name and manual name in the cross-reference command in the Texinfo source (see section Cross References), and from external information, which is currently simply hardwired. In the future, it may come from an external data file.
We now consider each part in turn.
The host is hardwired to be the local host. This could either be the literal string ‘localhost’, or, according to the rules for HTML links, the ‘http://localhost/’ could be omitted entirely.
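The dir and file parts are more complicated, and depend on the relative split/mono nature of both the manual being processed and the manual that the cross-reference refers to. The underlying idea is that there is one directory for Texinfo manuals in HTML, and a given manual is either available as a monolithic file ‘manual.html’, or a split subdirectory ‘manual/*.html’. Here are the cases:
• If the present manual is split, and the referent manual is also split, the directory is ‘../referent/’ and the file is the expanded node name (described later).
• If the present manual is split, and the referent manual is mono, the directory is ‘../’ and the file is ‘referent.html’.
• If the present manual is mono, and the referent manual is split, the directory is ‘referent/’ and the file is the expanded node name.
• If the present manual is mono, and the referent manual is also mono, the directory is ‘./’ (or just the empty string), and the file is ‘referent.html’.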
One exception: the algorithm for node name expansion prefixes the string ‘g_t’ when the node name begins with a non-letter. This kludge (due to XHTML rules) is not necessary for filenames, and is therefore omitted.
Any directory part in the filename argument of the source cross-reference command is ignored. Thus, @xref{,,,../foo} and @xref{,,,foo} both use ‘foo’ as the manual name. This is because any such attempted hardwiring of the directory is very unlikely to be useful for both Info and HTML output.
Finally, the target part is always the expanded node name.
Whether the present manual is split or mono is determined by user option; makeinfo defaults to split, with the ‘--no-split’ option overriding this.
Whether the referent manual is split or mono is another bit of the external information. For now, makeinfo simply assumes the referent manual is in the same form (split or mono) as the present manual.
There can be a mismatch between the format of the referent manual that the generating software assumes, and the format it’s actually present in. See section HTML Cross-reference Mismatch.
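Putting the parts together, a relative link could be assembled as in the sketch below. It is an illustration rather than makeinfo's code: the function name and parameters are hypothetical, the host is omitted entirely (as permitted above), and the four split/mono combinations are worked out from the one-directory layout described in this section, with ‘manual.html’ and ‘manual/’ as siblings.

```python
def xref_href(manual, expanded_node, present_split=True, referent_split=None):
    """Build a relative href for a cross-reference to another manual."""
    if referent_split is None:
        # The referent is assumed to be in the same form as the present manual.
        referent_split = present_split
    # Any directory part in the manual argument is ignored.
    manual = manual.rsplit("/", 1)[-1]
    # A split manual lives one level down, so it must climb out first.
    up = "../" if present_split else ""
    if referent_split:
        # The Top node always maps to index.html.
        stem = "index" if expanded_node == "Top" else expanded_node
        path = up + manual + "/" + stem + ".html"
    else:
        path = up + manual + ".html"
    # The target part is always the expanded node name.
    return path + "#" + expanded_node


print(xref_href("../foo", "Some-node"))
print(xref_href("foo", "Some-node", present_split=False))
```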
22.4.2 HTML Cross-reference Node Name Expansion
As mentioned in the previous section, the key part of the HTML cross-reference algorithm is the conversion of node names in the Texinfo source into strings suitable for XHTML identifiers and filenames. The restrictions are similar for each: plain ASCII letters, numbers, and the ‘-’ and ‘_’ characters are all that can be used. (Although HTML anchors can contain most characters, XHTML is more restrictive.)
Cross-references in Texinfo can actually refer either to nodes or anchors (see section @anchor: Defining Arbitrary Cross-reference Targets), but anchors are treated identically to nodes in this context, so we’ll continue to say “node” names for simplicity.
(@-commands and 8-bit characters are not presently handled by makeinfo for HTML cross-references. See the next section.)
A special exception: the Top node (see section The ‘Top’ Node and Master Menu) is always mapped to the file ‘index.html’, to match web server software. However, the HTML target is ‘Top’. Thus (in the split case):
@xref{Top, Introduction,, xemacs, XEmacs User's Manual}.
⇒ <a href="xemacs/index.html#Top">
For example:
@node A node --- with _'%
⇒ A-node-_002d_002d_002d-with-_005f_0027_0025
Notice in particular:
• ‘_’ ⇒ ‘_005f’
• ‘-’ ⇒ ‘_002d’
• ‘A node’ ⇒ ‘A-node’
On case-folding computer systems, nodes differing only by case will be mapped to the same file.
In particular, as mentioned above, Top always maps to the file ‘index.html’. Thus, on a case-folding system, Top and a node named ‘Index’ will both be written to ‘index.html’.
Fortunately, the targets serve to distinguish these cases, since HTML target names are always case-sensitive, independent of operating system.
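The expansion of 7-bit ASCII node names can be sketched as follows. This is a reading of the examples in this section, not makeinfo's code: the function names are hypothetical, and collapsing runs of whitespace and trimming leading and trailing spaces are assumptions. The ‘g_t’ prefix and its omission from file names follow the exception noted under link basics.

```python
import re


def expand_node_name(name):
    """Expand an ASCII node name into an XHTML-safe target string."""
    # Assumption: runs of whitespace collapse to one space, ends are trimmed.
    name = re.sub(r"[ \t\n]+", " ", name).strip(" ")
    out = []
    for ch in name:
        if ch.isascii() and ch.isalnum():
            out.append(ch)                  # letters and digits unchanged
        elif ch == " ":
            out.append("-")                 # space becomes '-'
        else:
            out.append("_%04x" % ord(ch))   # e.g. '_' becomes '_005f'
    result = "".join(out)
    # XHTML identifiers must begin with a letter.
    if not (result[:1].isascii() and result[:1].isalpha()):
        result = "g_t" + result
    return result


def file_and_target(name):
    """Return (file name, target) for a node in a split manual."""
    if name == "Top":
        return "index.html", "Top"          # special case for the Top node
    target = expand_node_name(name)
    # The 'g_t' prefix is not needed for file names.
    stem = target[3:] if target.startswith("g_t") else target
    return stem + ".html", target


print(expand_node_name("A node --- with _'%"))
print(file_and_target("2nd node"))
```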
22.4.3 HTML Cross-reference Command Expansion
In standard Texinfo, node names may not contain @-commands. makeinfo has an option ‘--commands-in-node-names’ which partially supports it (see section Running makeinfo from a Shell), but it is not robust and not recommended.
Thus, makeinfo does not fully implement this part of the HTML cross-reference algorithm, but it is documented here for the sake of completeness.
First, comments are removed.
Next, any @value commands (see section @set and @value) and macro invocations (see section Invoking Macros) are fully expanded.
Then, for the following commands, the command name and braces are removed, and the text of the argument is recursively transformed:
@asis @b @cite @code @command @dfn @dmn @dotless @emph @env @file @indicateurl @kbd @key @samp @sc @slanted @strong @t @var @w
For @sc, any letters are capitalized.
The following commands are replaced by constant text, as shown. If any of these commands have non-empty arguments, as in @TeX{bad}, it is an error, and the result is unspecified. ‘(space)’ means a space character, ‘(nothing)’ means the empty string, etc. The notation ‘U+xxxx’ means Unicode code point xxxx (in hex, as usual). There are further transformations of many of these expansions for the final file or target name, such as space characters to ‘-’, etc., according to the other rules.
@(newline) | (space) |
@(space) | (space) |
@(tab) | (space) |
@! | ‘!’ |
@* | (space) |
@- | (nothing) |
@. | ‘.’ |
@: | (nothing) |
@? | ‘?’ |
@@ | ‘@’ |
@{ | ‘{’ |
@} | ‘}’ |
@LaTeX | ‘LaTeX’ |
@TeX | ‘TeX’ |
@arrow | U+2192 |
@bullet | U+2022 |
@comma | ‘,’ |
@copyright | U+00A9 |
@dots | U+2026 |
@enddots | ‘...’ |
@equiv | U+2261 |
@error | ‘error-->’ |
@euro | U+20AC |
@exclamdown | U+00A1 |
@expansion | U+2192 |
@geq | U+2265 |
@leq | U+2264 |
@minus | U+2212 |
@ordf | U+00AA |
@ordm | U+00BA |
@point | U+2605 |
@pounds | U+00A3 |
@print | U+22A3 |
@questiondown | U+00BF |
@registeredsymbol | U+00AE |
@result | U+21D2 |
@textdegree | U+00B0 |
@tie | (space) |
Quotation mark commands are likewise replaced by their Unicode values (see section Inserting Quotation Marks).
An @acronym or @abbr command is replaced by the first argument, followed by the second argument in parentheses, if present. See section @acronym{acronym[, meaning]}.
An @email command is replaced by the text argument if present, else the address. See section @email{email-address[, displayed-text]}.
An @image command is replaced by the filename (first) argument. See section Inserting Images.
A @verb command is replaced by its transformed argument. See section @verb{<char>text<char>}.
Any other command is an error, and the result is unspecified.
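The recursive part of this expansion can be sketched with a small hand-written scanner. This is an illustration only, not makeinfo's code: the function names are hypothetical, CONSTANT holds just a few rows of the table above, and comment removal, @value and macro expansion, @acronym, @email, @image and @verb are all left out. Since the real result for an unsupported command is unspecified, raising ValueError is this sketch's own choice.

```python
import re

# Commands whose name and braces are dropped, keeping the argument.
TRANSPARENT = {
    "asis", "b", "cite", "code", "command", "dfn", "dmn", "dotless", "emph",
    "env", "file", "indicateurl", "kbd", "key", "samp", "sc", "slanted",
    "strong", "t", "var", "w",
}
# A few of the constant-text commands from the table above.
CONSTANT = {
    "LaTeX": "LaTeX", "TeX": "TeX", "bullet": "\u2022", "comma": ",",
    "copyright": "\u00a9", "dots": "\u2026", "enddots": "...",
    "point": "\u2605", "result": "\u21d2", "tie": " ",
}
# Single-character commands such as @@ and @*.
SYMBOL = {
    "\n": " ", " ": " ", "\t": " ", "!": "!", "*": " ", "-": "",
    ".": ".", ":": "", "?": "?", "@": "@", "{": "{", "}": "}",
}


def _matching_brace(text, start):
    """Index of the '}' closing the '{' at text[start]."""
    depth, i = 0, start
    while i < len(text):
        if text[i] == "@":
            i += 2          # skip '@' and the next character (covers @{ and @})
            continue
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced braces")


def expand_commands(text):
    out, i = [], 0
    while i < len(text):
        if text[i] != "@":
            out.append(text[i])
            i += 1
            continue
        nxt = text[i + 1:i + 2]
        if nxt in SYMBOL:
            out.append(SYMBOL[nxt])
            i += 2
            continue
        match = re.match(r"[A-Za-z]+", text[i + 1:])
        if not match:
            raise ValueError("bad @-command at offset %d" % i)
        name = match.group(0)
        i += 1 + len(name)
        arg = ""
        if text[i:i + 1] == "{":
            end = _matching_brace(text, i)
            arg, i = text[i + 1:end], end + 1
        if name in TRANSPARENT:
            inner = expand_commands(arg)
            out.append(inner.upper() if name == "sc" else inner)
        elif name in CONSTANT and not arg:
            out.append(CONSTANT[name])
        else:
            raise ValueError("unsupported use of @" + name)
    return "".join(out)


print(expand_commands("@b{A} @TeX{} @point{}@enddots{}"))
```

The resulting string would then go through the remaining node name rules (spaces to ‘-’, and so on).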
22.4.4 HTML Cross-reference 8-bit Character Expansion
Usually, characters other than plain 7-bit ASCII are transformed into the corresponding Unicode code point(s) in Normalization Form C, which uses precomposed characters where available. (This is the normalization form recommended by the W3C and other bodies.) This holds when that code point is 0xffff or less, as it almost always is.
These will then be further transformed by the rules above into the string ‘_xxxx’, where xxxx is the code point in hex.
For example, combining this rule and the previous section:
@node @b{A} @TeX{} @u{B} @point{}@enddots{}
⇒ A-TeX-B_0306-_2605_002e_002e_002e
Notice: 1) @enddots expands to three periods, which in turn expand to three ‘_002e’s; 2) @u{B} is a ‘B’ with a breve accent, which does not exist as a pre-accented Unicode character, and therefore expands to ‘B_0306’ (B with combining breve).
When the Unicode code point is above 0xffff, the transformation is ‘__xxxxxx’, that is, two leading underscores followed by six hex digits. Since Unicode has declared that their highest code point is 0x10ffff, this is sufficient. (We felt it was better to define this extra escape than to always use six hex digits, since the first two would nearly always be zeros.)
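The two escape widths can be sketched with Python's standard unicodedata module. The function names are hypothetical; only the non-ASCII characters are rewritten here, and the remaining node name rules are left aside.

```python
import unicodedata


def escape_code_point(cp):
    """'_xxxx' up to 0xffff, '__xxxxxx' above that."""
    if cp > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return "_%04x" % cp if cp <= 0xFFFF else "__%06x" % cp


def escape_non_ascii(text):
    """Normalize to NFC, then escape every non-ASCII character."""
    text = unicodedata.normalize("NFC", text)
    return "".join(
        ch if ord(ch) < 128 else escape_code_point(ord(ch)) for ch in text
    )


# B + combining breve has no precomposed form, so both code points remain.
print(escape_non_ascii("B\u0306"))
# e + combining acute composes to U+00E9 under NFC.
print(escape_non_ascii("e\u0301"))
```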
This method works fine if the node name consists mostly of ASCII characters and contains only a few 8-bit ones. If the document is written in a language whose script is not based on the Latin alphabet (such as Ukrainian), it will create file names consisting entirely of ‘_xxxx’ notations, which is inconvenient.
To handle such cases, makeinfo offers the ‘--transliterate-file-names’ command-line option. This option enables transliteration of node names into ASCII characters for the purposes of file name creation and referencing. The transliteration is based on phonetic principles, which makes the produced file names easily readable.
For the definition of Unicode Normalization Form C, see Unicode report UAX#15, http://www.unicode.org/reports/tr15/. Many related documents and implementations are available elsewhere on the web.
22.4.5 HTML Cross-reference Mismatch
As mentioned earlier (see section HTML Cross-reference Link Basics), the generating software has to guess whether a given manual being cross-referenced is available in split or monolithic form, and, inevitably, it might guess wrong. However, when the referent manual itself is generated, it is possible to handle at least some mismatches.
In the case where we assume the referent is split, but it is actually available in mono, the only recourse would be to generate a ‘manual/’ subdirectory full of HTML files which redirect back to the monolithic ‘manual.html’. Since this is essentially the same as a split manual in the first place, it’s not very appealing.
On the other hand, in the case where we assume the referent is mono, but it is actually available in split, it is possible to use JavaScript to redirect from the putatively monolithic ‘manual.html’ to the different ‘manual/node.html’ files. Here’s an example:
function redirect() {
  switch (location.hash) {
    case "#Node1":
      location.replace("manual/Node1.html#Node1"); break;
    case "#Node2" :
      location.replace("manual/Node2.html#Node2"); break;
    …
    default:;
  }
}
Then, in the <body> tag of ‘manual.html’:
<body onLoad="redirect();">
Once again, this is something the software which generated the referent manual has to do in advance; it’s not something the software generating the actual cross-reference in the present manual can control.
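Software that generates a split manual could emit such a redirect function mechanically. The sketch below is an illustration, not part of makeinfo or texi2html: the function name is hypothetical, and the node names are assumed to be already expanded, so they contain only characters that are safe inside the quoted strings.

```python
def redirect_script(manual, expanded_nodes):
    """Return JavaScript source redirecting manual.html#Node to manual/Node.html#Node."""
    lines = ["function redirect() {", "  switch (location.hash) {"]
    for node in expanded_nodes:
        lines.append(
            '    case "#%s": location.replace("%s/%s.html#%s"); break;'
            % (node, manual, node, node)
        )
    lines += ["    default:;", "  }", "}"]
    return "\n".join(lines)


print(redirect_script("manual", ["Node1", "Node2"]))
```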
Ultimately, we hope to allow for an external configuration file to control which manuals are available from where, and how.
This document was generated by Aidan Kehoe on December 27, 2016 using texi2html 1.82.