roff(7) Miscellaneous Information Manual roff(7)
roff - concepts and history of roff typesetting
The term roff denotes a family of document formatting systems
known by names like troff, nroff, and ditroff. A roff system
consists of an interpreter for an extensible text formatting
language and a set of programs for preparing output for various
devices and file formats. Unix-like operating systems often
distribute a roff system. The manual pages on Unix systems
(“man pages”) and bestselling books on software engineering,
including Brian Kernighan and Dennis Ritchie's The C Programming
Language and W. Richard Stevens's Advanced Programming in the
Unix Environment, have been written using roff systems. GNU roff
(groff) is arguably the most widespread roff implementation.
Below we present typographical concepts that form the background
of all roff implementations, narrate the development history of
some roff systems, detail the command pipeline managed by groff, survey
the formatting language, suggest tips for editing roff input, and
recommend further reading materials.
roff input files contain text interspersed with instructions to
control the formatter. Even in the absence of such instructions,
a roff formatter still processes its input in several ways, by
filling, hyphenating, breaking, and adjusting it, and
supplementing it with inter-sentence space. These processes are
basic to typesetting, and can be controlled at the input
document's discretion.
When a device-independent roff formatter starts up, it obtains
information about the device for which it is preparing output
from the latter's description file. An essential property is
the length of the output line, such as “6.5 inches”.
The formatter interprets plain text files employing the Unix
line-ending convention. It reads input a character at a time,
collecting words as it goes, and fits as many words together on
an output line as it can—this is known as filling. To a roff
system, a word is any sequence of one or more characters that
aren't spaces, tabs, or newlines. The exceptions separate words.
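Filling can be illustrated outside of roff. The following Python sketch is not how any roff formatter is implemented; it assumes a fixed-width device on which every character and every inter-word space occupies one cell. The names fill and line_length are invented for this example. Because the sketch does not hyphenate, a word longer than the line is placed alone on its own output line.

```python
import re


def fill(text, line_length):
    """Greedily pack words onto output lines of at most line_length cells."""
    # A word is any run of characters other than space, tab, or newline.
    words = [w for w in re.split(r"[ \t\n]+", text) if w]
    lines = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        if not current or len(candidate) <= line_length:
            # The word fits (or the line is empty, so it must go here).
            current = candidate
        else:
            # The line is full: break, and start the next line with this word.
            lines.append(current)
            current = word
    if current:
        lines.append(current)  # the end of input causes a final break
    return lines
```

Called as fill("the quick brown fox jumps", 10), the sketch returns three lines: "the quick", "brown fox", and "jumps".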
A roff formatter attempts to detect the boundaries between
sentences, and supplies additional inter-sentence space between
them. It does this by flagging certain characters (normally “!”,
“?”, and “.”) as potentially ending a sentence. When the
formatter encounters one of these end-of-sentence characters at
the end of a line, or one of them is followed by two spaces on
the same input line, it appends an inter-word space followed by
an inter-sentence space in the formatted output. The non-
printing input break escape sequence \& can be used after an end-
of-sentence character to defeat end-of-sentence detection on a
per-instance basis. Normally, the occurrence of a visible non-
end-of-sentence character (as opposed to a space or tab)
immediately after an end-of-sentence character cancels detection
of the end of a sentence. However, several characters are
treated transparently after the occurrence of an end-of-sentence
character. That is, a roff formatter does not cancel end-of-sentence
detection when it processes them. This is because such
characters are often used as footnote markers or to close
quotations and parentheticals. The default set is ", ', ), ], *,
\[dg], \[dd], \[rq], and \[cq]. The last four are examples of
special characters, escape sequences whose purpose is to obtain
glyphs that are not easily typed at the keyboard, or which have
special meaning to the formatter (like \).
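The end-of-line case of the rule above can be sketched in Python. This is an illustration only, not groff's implementation: it handles just the default character sets listed above, it ignores the case of two spaces within a line, and it assumes the line has no trailing spaces. The function name line_ends_sentence is invented for this example.

```python
END_OF_SENTENCE = ("!", "?", ".")

# Transparent characters from the default set. The multi-character special
# character escape sequences come first so that the "]" closing \[dg] is
# not mistaken for a lone "]".
TRANSPARENT = (r"\[dg]", r"\[dd]", r"\[rq]", r"\[cq]", '"', "'", ")", "]", "*")


def line_ends_sentence(line):
    """Report whether a text line would be treated as ending a sentence."""
    rest = line
    stripped = True
    while stripped:
        stripped = False
        for token in TRANSPARENT:
            if rest.endswith(token):
                # Transparent characters do not cancel detection; skip them.
                rest = rest[: -len(token)]
                stripped = True
                break
    # Whatever remains must end with an end-of-sentence character.
    return rest.endswith(END_OF_SENTENCE)
```

With this sketch, 'She said "go!"' and a line ending in .\[rq] both count as ending a sentence, while a line ending in etc.\& does not, because the characters of \& are not in the transparent set.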
When an output line is nearly full, it is uncommon for the next
word collected from the input to exactly fill it—typically, there
is room left over only for part of the next word. The process of
splitting a word so that it appears partially on one line (with a
hyphen to indicate to the reader that the word has been broken)
with its remainder on the next is hyphenation. Hyphenation
points can be manually specified; groff also uses a hyphenation
algorithm and language-specific pattern files to decide which
words can be hyphenated and where. Hyphenation does not always
occur even when the hyphenation rules for a word allow it; it can
be disabled, and when not disabled there are several parameters
that can prevent it in certain circumstances.
Once an output line has been filled, whether or not hyphenation
has occurred on that line, the next word read from the input will
be placed on a different output line; this is called a break. In
this document and in roff discussions generally, a “break” if not
further qualified always refers to the termination of an output
line. When the formatter is filling text, it introduces breaks
automatically to keep output lines from exceeding the configured
line length. After an automatic break, a roff formatter adjusts
the line if applicable (see below), and then resumes collecting
and filling text on the next output line.
Sometimes, a line cannot be broken automatically. This usually
does not happen with natural language text unless the output line
length has been manipulated to be extremely short, but it can
with specialized text like program source code. groff provides a
means of telling the formatter where the line may be broken
without hyphens. This is done with the non-printing break point
escape sequence \:.
There are several ways to cause a break at a predictable
location. A blank input line not only causes a break, but by
default it also outputs a one-line vertical space (effectively a
blank output line). Macro packages may discourage or disable
this “blank line method” of paragraphing in favor of their own
macros. A line that begins with one or more spaces causes a
break. The spaces are output at the beginning of the next line
without being adjusted (see below). Again, macro packages may
provide other methods of producing indented paragraphs. Trailing
spaces on text lines (see below) are discarded. The end of input
causes a break.
After the formatter performs an automatic break, it may then
adjust the line, widening inter-word spaces until the text
reaches the right margin. Extra spaces between words are
preserved. Leading and trailing spaces are handled as noted
above. Text can be aligned to the left or right margin only, or
centered, using requests.
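Adjustment to both margins can likewise be sketched in Python. The sketch assumes a fixed-width device, takes the words of one already-filled line, and hands any leftover spaces to the leftmost gaps; that distribution is an arbitrary choice made for this example, and a real formatter may spread the extra space differently. The name adjust is invented here.

```python
def adjust(words, line_length):
    """Widen inter-word spaces so the line reaches line_length cells."""
    if len(words) < 2:
        # A single word has no inter-word space to widen.
        return " ".join(words)
    gaps = len(words) - 1
    total_spaces = line_length - sum(len(w) for w in words)
    # Every gap gets `base` spaces; the first `extra` gaps get one more.
    base, extra = divmod(total_spaces, gaps)
    out = words[0]
    for i, word in enumerate(words[1:]):
        out += " " * (base + (1 if i < extra else 0)) + word
    return out
```

For example, adjust(["a", "b", "c"], 6) yields "a  b c": three spaces are shared between two gaps, and the odd one goes to the first gap.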
A roff formatter translates horizontal tab characters, also
called simply “tabs”, in the input into movements to the next tab
stop. These tab stops are by default located every half inch
measured from the current position on the input line. With them,
simple tables can be made. However, this method can be
deceptive, as the appearance (and width) of the text in an editor
and the results from the formatter can vary greatly, particularly
when proportional typefaces are used. A tab character does not
cause a break and therefore does not interrupt filling. The
formatter provides facilities for sophisticated table
composition; there are many details to track when using the “tab”
and “field” low-level features, so most users turn to the tbl
preprocessor for table construction.
Requests and macros
A request is an instruction to the formatter that occurs after a
control character, which is recognized at the beginning of an
input line. The regular control character is a dot “.”. Its
counterpart, the no-break control character, a neutral apostrophe
“'”, suppresses the break implied by some requests. These
characters were chosen because it is uncommon for lines of text
in natural languages to begin with them. If you require a
formatted period or apostrophe (closing single quotation mark)
where the formatter is expecting a control character, prefix the
dot or neutral apostrophe with the non-printing input break
escape sequence, “\&”.
An input line beginning with a control character is called a
control line. Every line of input that is not a control line is
a text line.
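The distinction between control lines and text lines depends only on the first character of the line. A minimal Python sketch follows, assuming the default control characters are in effect (requests can change both); the function name classify is invented for this example.

```python
def classify(line, control=".", no_break="'"):
    """Label one input line, assuming the default control characters."""
    if line.startswith(control):
        return "control line"
    if line.startswith(no_break):
        return "control line (no break)"
    # Anything else, including a line protected with \&, is text.
    return "text line"
```

A line such as \&.profile begins with a backslash rather than a dot, so the sketch labels it a text line, which is the purpose of prefixing \& to a leading period.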
Requests often take arguments, words (separated from the request
name and each other by spaces) that specify details of the action
the formatter is expected to perform. If a request is
meaningless without arguments, it is typically ignored. Of key
importance are the requests that define macros. Macros are
invoked like requests, enabling the request repertoire to be
extended or overridden.
A macro can be thought of as an abbreviation you can define for a
collection of control and text lines. When the macro is called
by giving its name after a control character, it is replaced with
what it stands for. The process of textual replacement is known
as interpolation. Interpolations are handled as soon as they are
recognized, and once performed, a roff formatter scans the
replacement for further requests, macro calls, and escape
sequences.
In roff systems, the “de” request defines a macro.
Page geometry
roff systems format text under certain assumptions about the size
of the output medium, or page. For the formatter to correctly
break a line it is filling, it must know the line length, which
it derives from the page width. For it to decide whether to
write an output line to the current page or wait until the next
one, it must know the page length. A device's resolution
converts practical units like inches or centimeters to basic
units, a convenient length measure for the output device or file
format. The formatter and output driver use basic units to
reckon page measurements. The device description file defines
its resolution and page dimensions.
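The conversion from practical units to basic units is a multiplication by the device resolution. The Python sketch below shows the arithmetic; the function name to_basic_units is invented, the keys i (inch) and c (centimeter) follow roff's scaling unit letters, and rounding to the nearest basic unit is an assumption of this example rather than a documented property of any formatter.

```python
from fractions import Fraction

# Inches per scaling unit: "i" is the inch, "c" the centimeter.
INCHES_PER_UNIT = {"i": Fraction(1), "c": Fraction(100, 254)}


def to_basic_units(value, unit, resolution):
    """Convert a measurement to basic units at `resolution` units per inch."""
    inches = Fraction(value) * INCHES_PER_UNIT[unit]
    # Rounding is an assumption made for this sketch.
    return round(inches * resolution)
```

At a resolution of 72,000 units per inch, a line length of 6.5 inches, written to_basic_units("6.5", "i", 72000), comes to 468,000 basic units; at 720 units per inch the same length is 4,680.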
A page is a two-dimensional structure upon which a roff system
imposes a rectangular coordinate system with its upper left
corner as the origin. Coordinate values are in basic units and
increase down and to the right. Useful ones are therefore always
positive and within numeric ranges corresponding to the page
boundaries.
While the formatter (and, later, output driver) is processing a
page, it keeps track of its drawing position, which is the
location at which the next glyph will be written, from which the
next motion will be measured, or where a geometric primitive will
commence rendering. Notionally, glyphs are drawn from the text
baseline upward and to the right. (groff does not yet support
right-to-left scripts.) The text baseline is a (usually
invisible) line upon which the glyphs of a typeface are aligned.
A glyph therefore “starts” at its bottom-left corner. If drawn
at the origin, a typical letter glyph would lie partially or
wholly off the page, depending on whether, like “g”, it features
a descender below the baseline.
Such a situation is nearly always undesirable. It is furthermore
conventional not to write or draw at the extreme edges of the
page. Therefore the initial drawing position of a roff formatter
is not at the origin, but below and to the right of it. This
rightward shift from the left edge is known as the page offset.
(groff's terminal output devices have page offsets of zero.) The
downward shift leaves room for a text output line.
Text is arranged on a one-dimensional lattice of text baselines
from the top to the bottom of the page. Vertical spacing is the
distance between adjacent text baselines. Typographic tradition
sets this quantity to 120% of the type size. The initial
vertical drawing position is one unit of vertical spacing below
the page top. Typographers term this unit a vee.
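The arithmetic behind the lattice of baselines is simple enough to sketch in Python. The function names below are invented for this example, measurements are in points (taken as 72 to the inch), and the sketch ignores headers, footers, and traps, so it counts every baseline from one vee below the page top down to the page bottom.

```python
from fractions import Fraction


def default_vee(type_size):
    """Vertical spacing set by tradition to 120% of the type size."""
    return Fraction(type_size) * Fraction(120, 100)


def baselines_on_page(page_length, vee):
    """Count the baselines at vee, 2*vee, ... that lie on the page."""
    return int(Fraction(page_length) // vee)
```

For 10-point type the vee is 12 points, so an 11-inch page (792 points) holds 66 text baselines.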
Vertical spacing has an impact on page-breaking decisions.
Generally, when a break occurs, the formatter moves the drawing
position to the next text baseline automatically. If the
formatter were already writing to the last line that would fit on
the page, advancing by one vee would place the next text baseline
off the page. Rather than let that happen, roff formatters
instruct the output driver to eject the page, start a new one,
and again set the drawing position to one vee below the page top;
this is a page break.
When the last line of input text is also the last output line
that can fit on the page, the break caused by the end of input
will also break the page, producing a useless blank one. Macro
packages keep users from having to confront this difficulty by
setting “traps”; moreover, all but the simplest page layouts tend
to have headers and footers, or at least bear vertical margins
larger than one vee.
Other language elements
Escape sequences start with the escape character, a backslash \,
and are followed by at least one additional character. They can
appear anywhere in the input.
With requests, the escape and control characters can be changed;
further, escape sequence recognition can be turned off and back
on.
Strings store character sequences. In groff, they can be
parameterized as macros can.
Registers store numerical values, including measurements. The
latter are generally in basic units; scaling units can be
appended to numeric expressions to clarify their meaning when
stored or interpolated. Some read-only predefined registers
interpolate text.
Fonts are identified either by a name or by a mounting position
(a non-negative number). Four font styles are available on all
devices. R is “roman”: normal, upright text. B is bold, an
upright typeface with a heavier weight. I is italic, a face that
is oblique on typesetter output devices and usually underlined
instead on terminal devices. BI is bold-italic, combining both
of the foregoing style variations. Typesetter devices typically
offer one or more special fonts as well; they provide glyphs that
are not available in the multiple styles of text fonts.
groff supports named colors for glyph rendering and drawing of
geometric primitives. Stroke and fill colors are distinct; the
stroke color is used for glyphs.
Glyphs are visual representation forms of characters. In groff,
the distinction between those two elements is not always obvious
(and a full discussion is beyond our scope). To roughly
characterize, “A” is a character when we consider it in the
abstract: to make it a glyph, we must select a typeface with
which to render it, and determine its type size and color. The
formatting process turns input characters into output glyphs. A
few characters commonly seen on keyboards are treated specially
by the roff language and may not look correct in output if used
unthinkingly; they are the (double) quotation mark ("), the
neutral apostrophe ('), the minus sign (-), the backslash (\),
the caret or circumflex accent (^), the grave accent (`), and the
tilde (~). All of these and more can be produced with special
character escape sequences.
groff offers streams, identifiers for writable files, but for
security reasons this feature is disabled by default.
A further few language elements arise as page layouts become more
sophisticated and demanding. Environments collect formatting
parameters like line length and typeface. A diversion stores
formatted output for later use. A trap is a condition on the
input or output, tested automatically by the formatter, that is
associated with a macro, causing it to be called when that
condition is fulfilled.
Footnote support often exercises all three of the foregoing
features. A simple implementation might work as follows. A pair
of macros is defined: one starts a footnote and the other ends
it. The author calls the first macro where a footnote marker is
desired. The macro establishes a diversion so that the footnote
text is collected at the place in the body text where its
corresponding marker appears. An environment is created for the
footnote so that it is set in a smaller type size. The footnote
text is formatted in the diversion using that environment, but it
does not yet appear in the output. The document author calls the
footnote end macro, which returns to the previous environment and
ends the diversion. Later, after much more body text in the
document, a trap, set a small distance above the page bottom, is
sprung. The macro called by the trap draws a line across the
page and emits the stored diversion. Thus, the footnote is
rendered.
Computer-driven document formatting dates back to the 1960s. The
roff system is intimately connected with Unix, but its origins
lie with the earlier operating systems CTSS, GECOS, and Multics.
The predecessor—RUNOFF
roff's ancestor RUNOFF was written in the MAD language by Jerry
Saltzer to prepare his Ph.D. thesis on the Compatible Time
Sharing System (CTSS), a project of the Massachusetts Institute
of Technology (MIT). This program is referred to in full
capitals, both to distinguish it from its many descendants, and
because bits were expensive in those days; five- and six-bit
character encodings were still in widespread usage, and mixed-
case alphabetics in file names seen as a luxury. RUNOFF
introduced a syntax of inlining formatting directives amid
document text, by beginning a line with a period (an unlikely
occurrence in human-readable material) followed by a “control
word”. Control words with obvious meaning like “.line length n”
were supported as well as an abbreviation system; the latter came
to overwhelm the former in popular usage and later derivatives of
the program. A sample of control words from a RUNOFF manual of
December 1966
⟨http://web.mit.edu/Saltzer/www/publications/ctss/AH.9.01.html⟩
was documented as follows (with the parameter notation slightly
altered). The abbreviations will be familiar to roff veterans.
Abbreviation    Control word
.ad             .adjust
.bp             .begin page
.br             .break
.ce             .center
.in             .indent n
.ll             .line length n
.nf             .nofill
.pl             .paper length n
.sp             .space [n]
In 1965, MIT's Project MAC teamed with Bell Telephone
Laboratories and General Electric (GE) to inaugurate the Multics
⟨http://www.multicians.org⟩ project. After a few years, Bell
Labs discontinued its participation in Multics, famously
prompting the development of Unix. Meanwhile, Saltzer's RUNOFF
proved influential, seeing many ports and derivations elsewhere.
In 1969, Doug McIlroy wrote one such reimplementation, adding
extensions, in the BCPL language for a GE 645 running GECOS at
the Bell Labs location in Murray Hill, New Jersey. In its
manual, the control commands were termed “requests”, their two-
letter names were canonical, and the control character was
configurable with a .cc request. Other familiar requests emerged
at this time: no-adjust (.na), need (.ne), page offset (.po), tab
configuration (.ta, though it worked differently), temporary
indent (.ti), character translation (.tr), and automatic
underlining (.ul; on RUNOFF you had to backspace and underscore
in the input yourself). .fi to enable filling of output lines
got the name it retains to this day. McIlroy's program also
featured a heuristic system for automatically placing hyphenation
points, designed and implemented by Molly Wagner. It furthermore
introduced numeric variables, termed registers. By 1971, this
program had been ported to Multics and was known as roff, a name
McIlroy attributes to Bob Morris, to distinguish it from CTSS
RUNOFF.
Unix and roff
McIlroy's roff was one of the first Unix programs. In Ritchie's
term, it was “transliterated” from BCPL to DEC PDP-7 assembly
language for the fledgling Unix operating system. Automatic
hyphenation was managed with .hc and .hy requests, line spacing
control was generalized with the .ls request, and what later
roffs would call diversions were available via “footnote”
requests. This roff indirectly funded operating systems research
at Murray Hill; AT&T prepared patent applications to the U.S.
government with it. This arrangement enabled the group to
acquire a PDP-11; roff promptly proved equal to the task of
formatting the manual for what would become known as “First
Edition Unix”, dated November 1971.
Output from all of the foregoing programs was limited to line
printers and paper terminals such as the IBM 2741 (based on the
Selectric line of typewriters) and the Teletype Corporation Model
37. Proportionally-spaced type was unavailable.
New roff and Typesetter roff
The first years of Unix were spent in rapid evolution. The
practicalities of preparing standardized documents like patent
applications (and Unix manual pages), combined with McIlroy's
enthusiasm for macro languages, perhaps created an irresistible
pressure to make roff extensible. Joe Ossanna's nroff, literally
a “new roff”, was the outlet for this pressure. By the time of
Unix Version 3 (February 1973)—and still in PDP-11 assembly
language—it sported a swath of features now considered essential
to roff systems: definition of macros (.de), diversion of text
thence (.di), and removal thereof (.rm); trap planting (.wh;
“when”) and relocation (.ch; “change”); conditional processing
(.if); and environments (.ev). Incremental improvements included
assignment of the next page number (.pn); no-space mode (.ns) and
restoration of vertical spacing (.rs); the saving (.sv) and
output (.os) of vertical space; specification of replacement
characters for tabs (.tc) and leaders (.lc); configuration of the
no-break control character (.c2); shorthand to disable automatic
hyphenation (.nh); a condensation of what were formerly six
different requests for configuration of page “titles” (headers
and footers) into one (.tl) with a length controlled separately
from the line length (.lt); automatic line numbering (.nm);
interactive input (.rd), which necessitated buffer-flushing
(.fl), and was made convenient with early program cessation
(.ex); source file inclusion in its modern form (.so; though
RUNOFF had an “.append” control word for a similar purpose) and
early advance to the next file argument (.nx); ignorable content
(.ig); and programmable abort (.ab).
Third Edition Unix also brought the pipe system call, the explosive
growth of a componentized system based around it, and a “filter
model” that remains perceptible today. Equally importantly, the
Bell Labs site in Murray Hill acquired a Graphic Systems C/A/T
phototypesetter, and with it came the necessity of expanding the
capabilities of a roff system to cope with a variety of
proportionally-spaced typefaces at multiple sizes. Ossanna wrote
a parallel implementation of nroff for the C/A/T, dubbing it
troff (for “typesetter roff”). Unfortunately, surviving
documentation does not illustrate what requests were implemented
at this time for C/A/T support; the troff man page in Fourth Edition
Unix (November 1973) does not feature a request list, unlike that of
nroff.
Apart from typesetter-driven features, Unix Version 4 roffs added
string definitions (.ds); made the escape character configurable
(.ec); and enabled the user to write diagnostics to the standard
error stream (.tm). Around 1974, empowered with multiple type
sizes, italics, and a symbol font specially commissioned by Bell
Labs from Graphic Systems, Kernighan and Lorinda Cherry
implemented eqn for typesetting mathematics. In the same year,
for Fifth Edition Unix, Ossanna combined and reimplemented the
two roffs in C, using that language's preprocessor to generate
both from a single source tree.
Ossanna documented the syntax of the input language to the nroff
and troff programs in the “Troff User's Manual”, first published
in 1976, with further revisions as late as 1992 by Kernighan.
(The original version was entitled “Nroff/Troff User's Manual”,
which may partially explain why roff practitioners have tended to
refer to it by its AT&T document identifier, “CSTR #54”.) Its
final revision serves as the de facto specification of AT&T
troff, and all subsequent implementors of roff systems have
worked in its shadow.
A small and simple set of roff macros was first used for the
manual pages of Unix Version 4 and persisted for two further
releases, but the first macro package to be formally described
and installed was ms by Michael Lesk in Version 6. He also wrote
a manual, “Typing Documents on the Unix System”, describing ms
and basic nroff/troff usage, updating it as the package accrued
features. Sixth Edition additionally saw the debut of the tbl
preprocessor for formatting tables, also by Lesk.
For Unix Version 7 (January 1979), McIlroy designed, implemented,
and documented the man macro package, introducing most of the
macros its documentation describes today, and edited volume 1 of the Version 7
manual using it. Documents composed using ms featured in volume
2, edited by Kernighan.
Meanwhile, troff proved popular even at Unix sites that lacked a
C/A/T device. Tom Ferrin of the University of California at San
Francisco combined it with Allen Hershey's popular vector fonts
to produce vtroff, which translated troff's output to the command
language used by Versatec and Benson-Varian plotters.
Ossanna had passed away unexpectedly in 1977, and after the
release of Version 7, with the C/A/T typesetter becoming
supplanted by alternative devices such as the Mergenthaler
Linotron 202, Kernighan undertook a revision and rewrite of troff
to generalize its design. To implement this revised
architecture, he developed the font and device description file
formats and the device-independent output format that remain in
use today. He described these novelties in the article “A
Typesetter-independent TROFF”, last revised in 1982, and like the
troff manual itself, it is widely known by a shorthand, “CSTR
#97”.
Kernighan's innovations prepared troff well for the introduction
of the Adobe PostScript language in 1984 and a vibrant market in
laser printers with built-in interpreters for it. An output
driver for PostScript, dpost, was swiftly developed. However,
AT&T's software licensing practices kept Ossanna's troff, with
its tight coupling to the C/A/T's capabilities, in parallel
distribution with device-independent troff throughout the 1980s.
Today, however, all actively maintained troffs follow Kernighan's
device-independent design.
groff—a free roff from GNU
The most important free roff project historically has been groff,
the GNU implementation of troff, developed from scratch by James
Clark starting in 1989 and distributed under copyleft
⟨http://www.gnu.org/copyleft⟩ licenses, ensuring to all the
availability of source code and the freedom to modify and
redistribute it, properties unprecedented in roff systems to that
point. groff rapidly attracted contributors, and has served as a
complete replacement for almost all applications of AT&T troff
(exceptions include mv, a macro package for preparation of
viewgraphs and slides, and the ideal preprocessor for producing
diagrams from a constraint-based language). Beyond that, it has
added numerous features. Since its inception and for at least
the following three decades, it has been used by practically all
GNU/Linux and BSD operating systems.
groff continues to be developed, is available for almost all
operating systems in common use (along with several obscure
ones), and it is free. These factors make groff the de facto
roff standard today.
Other free roffs
In 2007, Caldera/SCO and Sun Microsystems, having acquired rights
to AT&T Documenter's Workbench troff (a descendant of the Bell
Labs code), released it under a free but GPL-incompatible
license. This implementation
⟨https://github.com/n-t-roff/DWB3.3⟩ was made portable to modern
POSIX systems, and adopted and enhanced first by Gunnar Ritter
and then Carsten Kunze to produce Heirloom Doctools troff
⟨https://github.com/n-t-roff/heirloom-doctools⟩.
In 2012, Ali Gholami Rudi began working on neatroff
⟨https://github.com/aligrudi/neatroff⟩, a permissively licensed new
implementation.
Many people use roff frequently without knowing it. When you
read a system manual page (man page), it is often a roff working
in the background to render it. But using a roff explicitly
isn't difficult.
Some roff implementations provide wrapper programs that make it
easy to use the roff system from the shell's command line. These
can be specific to a macro package or more general. groff
provides command-line options sparing the user from constructing
the long, order-dependent pipelines familiar to AT&T troff users.
Further, a heuristic program is available to infer from a
document's contents which groff arguments should be used to
process it.
The roff pipeline
Each roff system consists of preprocessors, one or more roff
formatter programs, and a set of output drivers (or “device
postprocessors”). This arrangement is designed to take advantage
of a landmark Unix innovation in inter-process communication: the
pipe. That is, a series of programs termed a “pipeline” is
called together where the output of each program in the sequence
is taken as the input for the next program, without (necessarily)
passing through temporary files on a disk. (On non-Unix systems,
pipelines may have to be simulated.)
$ preproc1 < input-file | preproc2 | ... | troff [option ...] \
| output-driver
Once all preprocessors have run, they deliver a pure roff
document to the formatter, which in turn generates intermediate
output that is fed into an output driver for viewing, printing,
or further processing.
All of these parts use programming languages of their own; each
language is totally unrelated to the other parts. Moreover, roff
macro packages that are tailored for special purposes can be
included.
Most roff input files use the macros of a document formatting
package, intermixed with instructions for one or more
preprocessors, seasoned with escape sequences and requests
directly from the roff language. Some documents are simpler
still, since their formatting packages discourage direct use of
roff requests; man pages are a prominent example. The full power
of the roff formatting language is seldom needed by users; only
programmers of macro packages need a substantial command of it.
Preprocessors
A roff preprocessor is a program that, directly or ultimately,
generates output in the roff language. Typically, each
preprocessor defines a language of its own that transforms its
input into that for roff or another preprocessor. As an example
of the latter, chem produces pic input. Preprocessors must
consequently be run in an appropriate order; groff handles this
automatically for all preprocessors supplied by the GNU roff
system.
Portions of the document written in preprocessor languages are
usually bracketed by tokens that look like roff macro calls.
roff preprocessor programs transform only the regions of the
document intended for them. When a preprocessor language is used
by a document, its corresponding program must process it before
the input is seen by the formatter, or incorrect rendering is
almost guaranteed.
GNU roff provides several preprocessors, including eqn, grn, pic,
tbl, refer, and soelim; the groff documentation lists the rest. Other
preprocessors for roff systems are known.
dformat depicts data structures;
grap constructs statistical charts; and
ideal draws diagrams using a constraint-based language.
Formatter programs
A roff formatter interprets input in the roff language and
transforms it into intermediate output intended for processing by
a selected device. Intermediate output uses its own language. It
is specialized in its
parameters, but not its syntax, for the selected device; the
format is device-independent, but not device-agnostic. The
parameters the formatter uses to arrange the document are stored
in device and font description files.
AT&T Unix had two formatters—nroff for terminals, and troff for
typesetters. Often, the name troff is used loosely to refer to
both. When generalizing thus, groff documentation prefers the
term “roff”. In GNU roff, the formatter program is always troff.
Devices and output drivers
To a roff system, a device is a hardware interface like a
printer, a text or graphical terminal, or a standardized file
format that unrelated software can interpret. An output driver
is a program that parses the output of troff and produces
instructions specific to the device or file format it supports.
An output driver might support multiple devices, particularly if
they are similar.
The names of the devices and their driver programs are not
standardized. Technologies change; the devices used for document
preparation have greatly changed since CSTR #54 was first written
in the 1970s. Such hardware is no longer used in production
environments, and device capabilities (including resolution,
support for multiple colors, and font repertoire) have tended to
increase. Further, to reduce file size and processing time, AT&T
troff's device-independent output format placed low limits on the
magnitudes of some of the quantities it could represent. Its
PostScript output driver had a resolution of 720 units per inch;
groff's uses 72,000.
Using roff
Documents using roff are normal text files interleaved with roff
formatting elements. The roff language is powerful enough to
support arbitrary computation and supplies facilities that
encourage its extension. The primary such facility is macro
definition; with this feature, macro packages have been developed
that are tailored for particular applications.
Macro packages
Macro packages can have a much smaller vocabulary than roff
itself; this trait combined with their domain-specific nature can
make them easy to acquire and master. The macro definitions of a
package are typically kept in a file called name.tmac
(historically, tmac.name). All tmac files are stored in one or
more directories at standardized positions. Details on the
naming of macro packages and their placement are found in
A macro package anticipated for use in a document can be declared
to the formatter by the command-line option -m; see It can
alternatively be specified within a document using the file
inclusion requests of the roff language; see
Well-known macro packages include man for traditional man pages
and mdoc for BSD-style manual pages. Macro packages for
typesetting books, articles, and letters include ms (from
“manuscript macros”), me (named by a system administrator from
the first name of its creator, Eric Allman), mm (from “memorandum
macros”), and mom, a punningly-named package exercising many
groff extensions.
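To give a sense of how small a macro package's vocabulary can be, here
is a sketch of a document using a handful of ms macros. The file name
and the text are hypothetical; the package is selected on the command
line as described above.

```roff
.\" Assumed invocation: groff -ms -T utf8 memo.ms
.\" (the file name "memo.ms" is hypothetical)
.TL
Quarterly Notes
.AU
A. N. Author
.SH
Summary
.PP
Body text goes here;
the package supplies the title, heading, and paragraph layout.
```

The document itself contains no roff requests at all; title, author,
heading, and paragraph are each a single macro call.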
The roff formatting language
The canonical reference for the AT&T troff language is Ossanna's
“Troff User's Manual”, CSTR #54, in its 1992 revision by
Kernighan. The roff language provides requests, escape
sequences, macro definition facilities, string variables,
registers for storage of numbers or dimensions, and control of
execution flow. The theoretically-minded will observe that a
roff is not a mere markup language, but Turing-complete. It has
storage (registers); it can perform tests (as in conditional
expressions like “(\n[i] >= 1)”); it can jump or branch using the
.if request; and macro definition permits unbounded recursion.
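The following sketch combines these ingredients: a register for
storage, a conditional test, and a macro that calls itself. The macro
name Count and the register name next are hypothetical, and the long
register name and the .return request are GNU extensions, so treat this
as a groff illustration rather than portable AT&T troff.

```roff
.\" Hypothetical recursive macro: sets its argument, then calls
.\" itself with the argument reduced by one until it drops below 1.
.de Count
.  if (\\$1 < 1) .return
\\$1
.  br
.  nr next (\\$1 - 1)
.  Count \\n[next]
..
.Count 3
```

Called as shown, the macro is expected to set 3, 2, and 1 on separate
output lines. The doubled backslashes defer interpolation of the
argument and the register until the macro is called.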
Requests and escape sequences are instructions, predefined parts
of the language, that perform formatting operations or otherwise
change the state of the parser. The user can define their own
request-like elements by composing together text, requests, and
escape sequences ad libitum. A document writer will not
(usually) note any difference in usage for requests or macros;
both are written on a line on their own starting with a dot.
However, there is a distinction; requests take either a fixed
number of arguments (sometimes zero), silently ignoring any
excess, or consume the rest of the input line, whereas macros can
take a variable number of arguments. Since arguments are
separated by spaces, macros require a means of embedding a space
in an argument; in other words, of quoting it. This then demands
a mechanism of embedding the quoting character itself, in case it
is needed literally in a macro argument. AT&T troff had complex
rules involving the placement and repetition of the double quote
to achieve both aims. groff cuts this knot by supporting a
special character escape sequence for the neutral double quote,
“\[dq]”, which never performs quoting in the typesetting
language, but is simply a glyph, ‘"’.
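A brief sketch of argument quoting in groff follows. The macro name
Pair is hypothetical; it simply sets its first two arguments in
brackets so that the argument boundaries are visible.

```roff
.\" Hypothetical macro that shows where its arguments begin and end.
.de Pair
[\\$1] [\\$2]
.br
..
.\" Two unquoted arguments.
.Pair one two
.\" Double quotes embed a space in the first argument.
.Pair "one two" three
.\" \[dq] is only a glyph, so it is safe inside a quoted argument.
.Pair "say \[dq]hi\[dq]" four
```

In the last call the quoted argument contains two literal double quote
glyphs without any of the doubling rules AT&T troff required.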
Escape sequences start with a backslash, “\”. They can appear
almost anywhere, even in the midst of text on a line, and
implement various features, including the insertion of special
characters with “\(” or “\[]”, break suppression at input line
endings with “\c”, font changes with “\f”, type size changes with
“\s”, in-line comments with “\"”, and many others.
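The escape sequences just listed might be combined as in the sketch
below; the text itself is arbitrary, and the bracketed special
character form is a groff extension.

```roff
Plain, \fBbold\fP, and \fIitalic\fP text.
A dagger can be written \(dg or \[dg].
\s+2Larger\s0 type follows. \" in-line comment, ignored
un\c
\fBbroken\fP
```

Here \fP returns to the previous font and \s0 to the previous type
size; the \c at the end of the fourth line lets the next input line
continue the same word.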
Strings store text. They are populated with the .ds request and
interpolated using the \* escape sequence.
Registers store numbers and measurements. A register can be set
with the request .nr and its value can be retrieved by the escape
sequence \n.
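A minimal sketch of both kinds of storage, using the hypothetical names
pkg and pages (names longer than two characters, and the bracketed
interpolation syntax, are GNU extensions):

```roff
.\" Define a string and a register, then interpolate both in text.
.ds pkg groff
.nr pages 12
The \*[pkg] manual has \n[pages] pages.
```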
File name conventions
The structure or content of a file name, beyond its location in
the file system, is not significant to roff tools. roff
documents employing “full-service” macro packages (see tend to be
named with a suffix identifying the package; we thus see file
names ending in .man, .ms, .me, .mm, and .mom, for instance.
When installed, man pages tend to be named with the manual's
section number as the suffix. For example, the file name for
this document is roff.7. Practice for “raw” roff documents is
less consistent; they are sometimes seen with a .t suffix.
Input conventions
Since troff fills text automatically, it is common practice in
the roff language to avoid visual composition of text in input
files: the esthetic appeal of the formatted output is what
matters. Therefore, roff input should be arranged such that it
is easy for authors and maintainers to compose and develop the
document, understand the syntax of roff requests, macro calls,
and preprocessor languages used, and predict the behavior of the
formatter. Several traditions have accrued in service of these
goals.
• Follow sentence endings in the input with newlines to ease
their recognition. It is frequently convenient to end text
lines after colons and semicolons as well, as these typically
precede independent clauses. Consider doing so after commas;
they often occur in lists that become easy to scan when
itemized by line, or constitute supplements to the sentence
that are added, deleted, or updated to clarify it.
Parenthetical and quoted phrases are also good candidates for
placement on text lines by themselves.
• Set your text editor's line length to 72 characters or fewer;
see the subsections below. This limit, combined with the
previous item of advice, makes it less common that an input
line will wrap in your text editor, and thus will help you
perceive excessively long constructions in your text. Recall
that natural languages originate in speech, not writing, and
that punctuation is correlated with pauses for breathing and
changes in prosody.
• Use \& after “!”, “?”, and “.” if they are followed by space,
tab, or newline characters and don't end a sentence.
• In filled text lines, use \& before “.” and “'” if they are
preceded by space, so that reflowing the input doesn't turn
them into control lines.
• Do not use spaces to perform indentation or align columns of a
table. Leading spaces are reliable when text is not being
filled.
• Comment your document. It is never too soon to apply comments
to record information of use to future document maintainers
(including your future self). The \" escape sequence causes
troff to ignore the remainder of the input line.
• Use the empty request—a control character followed immediately
by a newline—to visually manage separation of material in input
files. Many of the groff project's own documents use an empty
request between sentences, after macro definitions, and where a
break is expected, and two empty requests between paragraphs or
other requests or macro calls that will introduce vertical
space into the document. You can combine the empty request
with the comment escape sequence to include whole-line comments
in your document, and even “comment out” sections of it.
An example sufficiently long to illustrate most of the above
suggestions in practice follows. An arrow → indicates a tab
character.
.\" nroff this_file.roff | less
.\" groff -T ps this_file.roff > this_file.ps
→The theory of relativity is intimately connected with
the theory of space and time.
.
I shall therefore begin with a brief investigation of
the origin of our ideas of space and time,
although in doing so I know that I introduce a
controversial subject. \" remainder of paragraph elided
.
.
→The experiences of an individual appear to us arranged
in a series of events;
in this series the single events which we remember
appear to be ordered according to the criterion of
\[lq]earlier\[rq] and \[lq]later\[rq], \" punct swapped
which cannot be analysed further.
.
There exists,
therefore,
for the individual,
an I-time,
or subjective time.
.
This itself is not measurable.
.
I can,
indeed,
associate numbers with the events,
in such a way that the greater number is associated with
the later event than with an earlier one;
but the nature of this association may be quite
arbitrary.
.
This association I can define by means of a clock by
comparing the order of events furnished by the clock
with the order of a given series of events.
.
We understand by a clock something which provides a
series of events which can be counted,
and which has other properties of which we shall speak
later.
.\" Albert Einstein, _The Meaning of Relativity_, 1922
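The passage above does not exercise the advice about \&. A short
sketch of those two conventions, with invented text, might look like
this:

```roff
The report was prepared by J.\& R.\& Hacker et al.\&
for internal use.
.
Per-user settings live in a file named
\&.profile in the home directory.
.
.\" This whole-line comment is skipped by the formatter.
```

The \& after each abbreviation keeps the formatter from treating the
period as the end of a sentence, and the \& before “.profile” keeps a
text line that happens to start with a dot from being read as a control
line.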
Editing with Emacs
Official GNU doctrine holds that the best program for editing a
roff document is Emacs; see It provides an nroff major mode that
is suitable for all kinds of roff dialects. This mode can be
activated by the following methods.
When editing a file within Emacs the mode can be changed by
typing “M-x nroff-mode”, where M-x means to hold down the meta
key (often labelled “Alt”) while pressing and releasing the “x”
key.
It is also possible to have the mode automatically selected when
a roff file is loaded into the editor.
• The most general method is to include file-local variables at
the end of the file; we can also configure the fill column this
way.
.\" Local Variables:
.\" fill-column: 72
.\" mode: nroff
.\" End:
• Certain file name extensions, such as those commonly used by
man pages, trigger the automatic activation of the nroff mode.
• Technically, having the sequence
.\" -*- nroff -*-
in the first line of a file will cause Emacs to enter the nroff
major mode when it is loaded into the buffer. Unfortunately,
some implementations of the program are confused by this
practice, so we discourage it.
Editing with Vim
Other editors provide support for roff-style files too, such as
Vim, an extension of the vi program. Vim's highlighting can be made to
recognize roff files by setting the filetype option in a Vim
modeline. For this feature to work, your copy of vim must be
built with support for, and configured to enable, several
features; consult the editor's online help topics “auto-setting”,
“filetype”, and “syntax”. Then put the following at the end of
your roff files, after any Emacs configuration:
.\" vim: set filetype=groff textwidth=72:
Replace “groff” in the above with “nroff” if you want highlighting
that does not recognize many of the GNU extensions to roff, such
as request, register, and string names longer than two
characters.
This document was written by Bernd Warken
⟨groff-bernd.warken-72@web.de⟩, with the sections “Concepts”,
“History”, “File name conventions”, and “Input conventions” mostly
written by G. Branden Robinson ⟨g.branden.robinson@gmail.com⟩.
There is a lot of documentation about roff. The original papers
describing AT&T troff are still available, and all aspects of
groff are documented in great detail.
Internet sites
Unix Text Processing
⟨https://github.com/larrykollar/Unix-Text-Processing⟩, by Dale
Dougherty and Tim O'Reilly, 1987, Hayden Books. This well-
regarded text brings the reader from a state of no knowledge of
Unix or text editing (if necessary) to sophisticated computer-
aided typesetting. It has been placed under a free software
license by its authors and updated by a team of groff
contributors and enthusiasts.
“History of Unix Manpages” ⟨http://manpages.bsd.lv/history.html⟩,
an online article maintained by the mdocml project, provides an
overview of roff development from Saltzer's RUNOFF to 2008, with
links to original documentation and recollections of the authors
and their contemporaries.
troff.org ⟨http://www.troff.org/⟩, Ralph Corderoy's troff site,
provides an overview and pointers to much historical roff
information.
Multicians ⟨http://www.multicians.org/⟩, a site by Multics
enthusiasts, contains a lot of information on the MIT projects
CTSS and Multics, including RUNOFF; it is especially useful for
its glossary and the many links to historical documents.
The Unix Archive ⟨http://www.tuhs.org/Archive/⟩, curated by the
Unix Heritage Society, provides the source code and some binaries
of historical Unices (including the source code of some versions
of troff and its documentation) contributed by their copyright
holders.
Jerry Saltzer's home page
⟨http://web.mit.edu/Saltzer/www/publications/pubs.html⟩ stores
some documents using the original RUNOFF formatting language.
groff ⟨http://www.gnu.org/software/groff⟩, GNU roff's web site,
provides convenient access to groff's source code repository, bug
tracker, and mailing lists (including archives and the
subscription interface).
Historical roff documentation
Many AT&T troff documents are available online, and can be found
at Ralph Corderoy's site (see above) or via Internet search.
Of foremost significance are two mentioned in section “History”
above, describing the language and its device-independent
implementation, respectively.
“Troff User's Manual” by Joseph F. Ossanna, 1976 (revised by
Brian W. Kernighan, 1992), AT&T Bell Laboratories Computing
Science Technical Report No. 54.
“A Typesetter-independent TROFF” by Brian W. Kernighan, 1982,
AT&T Bell Laboratories Computing Science Technical Report No. 97.
You can obtain many relevant Bell Labs papers in PDF from Bernd
Warken's “roff classical” GitHub repository
⟨https://github.com/bwarken/roff_classical.git⟩.
Manual pages
As a system of multiple components, a roff system potentially has
many man pages, each describing an aspect of it. Unfortunately,
there is no general naming scheme for the documentation among the
different roff implementations.
For GNU roff, the man page enumerates all man pages distributed
with the system, and individual pages frequently refer to
external resources as well as manuals distributed with groff on a
variety of topics.
With other roffs, you are on your own, but might be a good
starting point.
This page is part of the groff (GNU troff) project. Information
about the project can be found at
⟨http://www.gnu.org/software/groff/⟩. If you have a bug report
for this manual page, see ⟨http://www.gnu.org/software/groff/⟩.
This page was obtained from the project's upstream Git repository
⟨https://git.savannah.gnu.org/git/groff.git⟩ on 2022-12-17. (At
that time, the date of the most recent commit that was found in
the repository was 2022-12-14.) If you discover any rendering
problems in this HTML version of the page, or you believe there
is a better or more up-to-date source for the page, or you have
corrections or improvements to the information in this COLOPHON
(which is not part of the original manual page), send a mail to
man-pages@man7.org
groff 1.23.0.rc1.3569-94746-dirty 14 December 2022 roff(7)