
18.2 @documentencoding enc: Set Input Encoding

The @documentencoding command declares the encoding of the input document. Write it on a line by itself, followed by a valid encoding name, near the beginning of the file but after @setfilename (see section @setfilename):

 
@documentencoding enc

At present, Texinfo supports only these encodings:

US-ASCII

This has no particular effect, but it's included for completeness.

UTF-8

The universal character encoding for Unicode, expressed as sequences of 8-bit bytes. The Texinfo processors have no deep knowledge of Unicode; for the most part, they just pass along the input they are given to the output.

ISO-8859-1
ISO-8859-15
ISO-8859-2

These specify the standard encodings for Western European languages (the first two) and Eastern European languages (the third). ISO 8859-15 replaces some little-used characters from 8859-1 (e.g., precomposed fractions) with more commonly needed ones, such as the Euro symbol (€).

A full description of the encodings is beyond our scope here; one useful reference is http://czyborra.com/charsets/iso8859.html.

koi8-r

This is the commonly used encoding for the Russian language.

koi8-u

This is the commonly used encoding for the Ukrainian language.
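The difference between ISO-8859-1 and ISO-8859-15 described above can be inspected directly. The following is a minimal sketch that uses Python's built-in codecs, not anything in Texinfo itself; the variable name `differing` is chosen only for this illustration.

```python
# Illustration only: compare Python's ISO-8859-1 and ISO-8859-15 codecs
# byte by byte and collect the positions where they disagree.
differing = [
    byte
    for byte in range(0x100)
    if bytes([byte]).decode("iso-8859-1") != bytes([byte]).decode("iso-8859-15")
]

for byte in differing:
    raw = bytes([byte])
    latin1 = raw.decode("iso-8859-1")
    latin9 = raw.decode("iso-8859-15")
    print(f"0x{byte:02X}: {latin1!r} in 8859-1, {latin9!r} in 8859-15")
```

The positions reported include 0xA4, which is the generic currency sign in 8859-1 and the Euro symbol in 8859-15, and 0xBC through 0xBE, which hold the precomposed fractions in 8859-1.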

Specifying an encoding enc has the following effects:

In Info output, unless the option ‘--disable-encoding’ is given to makeinfo, a so-called ‘Local Variables’ section (see section ‘File Variables’ in The GNU Emacs Manual) is output including enc. This allows Info readers to set the encoding appropriately.

 
Local Variables:
coding: enc
End:
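As a rough sketch of how a reader might pick up this information, the following Python function looks for a ‘Local Variables’ section shaped like the one above and returns the value of its coding entry. The function name `info_coding` and the sample text are hypothetical; actual Info readers may parse the section differently.

```python
import re


def info_coding(info_text):
    """Return the value of 'coding:' in a Local Variables section, or None."""
    # Assumption: the section starts with a 'Local Variables:' line and
    # ends with an 'End:' line, as in the example above.
    match = re.search(
        r"^Local Variables:\s*\n(.*?)^End:",
        info_text,
        re.MULTILINE | re.DOTALL,
    )
    if match is None:
        return None
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "coding":
            return value.strip()
    return None


# Hypothetical tail of an Info file, for illustration only.
sample = "Some node text.\n\nLocal Variables:\ncoding: utf-8\nEnd:\n"
print(info_coding(sample))
```

A file without such a section yields None, in which case a reader would have to fall back on some default encoding.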

Also, in Info and plain text output (barring ‘--disable-encoding’), accent constructs and special characters, such as @'e, are output as the actual 8-bit character in the given encoding.
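To make the effect concrete, the sketch below uses Python's codecs (not makeinfo) to show which bytes represent the character denoted by @'e in each of the encodings listed earlier. The helper name `encode_in_all` is an assumption of this example. What makeinfo itself emits when the declared encoding lacks the character is not covered here.

```python
# Illustration only: the bytes for "é" (the character @'e denotes)
# under each encoding name, according to Python's codecs.
ENCODINGS = [
    "us-ascii", "utf-8", "iso-8859-1", "iso-8859-15",
    "iso-8859-2", "koi8-r", "koi8-u",
]


def encode_in_all(char):
    """Map each encoding name to the encoded bytes, or None if unavailable."""
    result = {}
    for enc in ENCODINGS:
        try:
            result[enc] = char.encode(enc)
        except UnicodeEncodeError:
            # The encoding has no code point for this character.
            result[enc] = None
    return result


for enc, data in encode_in_all("\u00e9").items():
    print(f"{enc:12} {data!r}")
```

In the ISO-8859 encodings the character is a single byte, in UTF-8 it is a two-byte sequence, and US-ASCII and the KOI8 encodings have no representation for it.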

In HTML output, a ‘<meta>’ tag is output, in the ‘<head>’ section of the HTML, that specifies enc. Web servers and browsers cooperate to use this information so the correct encoding is used to display the page, if supported by the system.

 
<meta http-equiv="Content-Type" content="text/html;
     charset=enc">
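For checking generated HTML, a small script can read the charset back out of such a tag. This is a minimal sketch using Python's standard html.parser module; the class name `MetaCharset` and the sample page are hypothetical, and real browsers apply more elaborate detection rules.

```python
from html.parser import HTMLParser


class MetaCharset(HTMLParser):
    """Record the charset from the first Content-Type <meta> tag seen."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        # The content value looks like "text/html; charset=enc".
        for part in (attributes.get("content") or "").split(";"):
            name, sep, value = part.strip().partition("=")
            if sep and name.strip().lower() == "charset":
                self.charset = value.strip()


# Hypothetical page using the same line break as the example above.
page = (
    '<html><head><meta http-equiv="Content-Type" content="text/html;\n'
    '     charset=utf-8"></head><body></body></html>'
)
parser = MetaCharset()
parser.feed(page)
print(parser.charset)
```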

In split HTML output, if ‘--transliterate-file-names’ is given (see section HTML Cross-reference 8-bit Character Expansion), the names of HTML files are formed by transliteration of the corresponding node names, using the specified encoding.

In XML and Docbook output, the given document encoding is written in the output file as usual with those formats.

In TeX output, the characters that are supported in the standard Computer Modern fonts are output accordingly. (For example, this means using constructed accents rather than precomposed glyphs.) Using a missing character generates a warning message, as does specifying an unimplemented encoding.

