stx2any — low-level and common markup facilities

This file is copyright © 2004,2005,2006,2011 by Panu Kalliokoski and released under the stx2any license.

This file contains macros that transform the low-level annotations made in the sed phase into high-level macro calls, which each output format defines for itself. The border is fuzzy, though: some output formats don't need these helpers (e.g. LaTeX doesn't need footnote facilities) and may redefine some markup on a lower level.

Because overriding definitions made here is easy, this file is also used for providing some sensible (output format independent) defaults for several macros.

First we change m4 comments into something that the user is unlikely to write by accident. The comment mechanism is used for passing preformatted blocks through as is.

changecom(`begin.-.passthru',`end.-.passthru')

Defaults

The default character set.

w_char_coding(utf8)

Defaults for some simple macros.

define(`w_horizbr', `||')
define(`w_apo', ')
define(`w_eline',)
define(`w_section',)
define(`w_url', `w_literal(`$1')')
define(`w_man_desc', `ifelse(`$2',,
  `w_begdiv(ingr)w_emph(`$1')`'w_linebr`'w_nl`'w_enddiv(ingr)dnl',
  `$1 w_emdash $2 w_linebr')')
define(`w_techemph', `w_emph(`$1')')
define(`w_quotation', ``'w_bq`'w_bq`'$1`'w_apo`'w_apo`'')
define(`w_emdash', `--')
define(`w_endash', `--')
define(`w_ellipsis', `...')
define(`w_copyrightsign', `(c)')
define(`w_sectbreak',
`w_paragraph`'w_emdash w_emdash w_emdash`''dnl
`ifelse(eval(`$1>4'),1,` w_emdash w_emdash`'')w_softopen')

Both itemised list types should have the same effect. In most situations, list items have the same effect, too.

w_derive_env(*, -, 0,,,,)
w_derive_env(*i, i, 0,,,,)
w_derive_env(-i, i, 0,,,,)
w_derive_env(#i, i, 0,,,,)

These environments have sensible defaults, but are overridden in some output formats when more appropriate markup is available.

w_define_env(`compactlist',
`pushdef(`w_eline', `w_linebr')',
`popdef(`w_eline')')
w_derive_env(`citation', q, 0,,,`w_linebr`'w_nl  w_emdash $*`'w_nl',)
w_derive_env(`abstract', q, 0,`w_begdiv(ingr)',,,`w_enddiv(ingr)')
w_derive_env(`admonition', q, 1,`$1:w_nl`'',,,)

define(`w_slideheader', `w_set_or_get(`@w_slideheader', `$*')')
define(`w_slidefooter', `w_set_or_get(`@w_slidefooter', `$*')')
w_define_env(`slide',
`w_paragraph`'w_slideheader`'w_nl`'w_sbreak(3)`'w_softopen`'w_nl`'',
`w_sbreak(3)
w_paragraph`'w_slidefooter`'w_nl`'w_sbreak(5)`'w_softopen`'w_nl`'')

Quoted characters. We suppose most of these characters don't have any special meaning and let output format specific definitions override those that do.

define(`w_lt', `<')
define(`w_gt', `>')
define(`w_amp', `&')
define(`w_bldot', `.')
define(`w_blap', ')
define(`w_bs', `\')
define(`w_obr', `{')
define(`w_bar', `|')
define(`w_cbr', `}')
define(`w_us', `_')
define(`w_ct', `^')
define(`w_td', `~')
define(`w_dol', `$')
define(`w_hs', `#')
define(`w_pct', `%')

Sectioning

These macros just do section numbering for output formats that do not support it natively. They transform the low-level w_headl to the high-level w_headline.

w_newcounter(subsubsection)
w_newcounter(subsection, subsubsection)
w_newcounter(section, subsection)
w_newcounter(chapter, section)
define(`w_headl',
`w_newindent(0)'dnl
`w_stepcounter(w_pickn(`$1', chapter, section, subsection, subsubsection))'dnl
`w_headline(`$1', `w_maybe_tocline(w_number_of(`$1'), `$2')')')
define(`w_maybe_tocline',
`ifelse(w_make_toc, true, `w_index(toc, `$1', `$2')',
   w_do_link_abbr, true, `$1`'w_autolabel(`$2')', `$1`'$2')')
define(`w_number_of',
`ifelse(w_do_numbering, true,
  `w_sectionmark(`$1', chapter, section, subsection, subsubsection) ')')
define(`w_sectionmark',
`w_counter_arabic(`$2')`'ifelse(`$1',1,,
   `.w_sectionmark(decr(`$1'),shift(shift($@)))')')
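
As a rough illustration of the numbering scheme, here is a small Python model (it is not part of stx2any). It assumes that the second argument of w_newcounter names a subordinate counter that is reset whenever the parent is stepped; that behaviour is defined elsewhere, so treat it as an assumption. The class and method names are hypothetical.

```python
LEVELS = ["chapter", "section", "subsection", "subsubsection"]


class SectionCounters:
    """Toy model of the four heading counters (names are hypothetical)."""

    def __init__(self):
        self.values = {name: 0 for name in LEVELS}

    def step(self, level):
        # Like w_stepcounter(w_pickn(level, ...)): bump the counter of this level.
        self.values[LEVELS[level - 1]] += 1
        # Assumption: subordinate counters are reset, as the chained
        # w_newcounter declarations suggest.
        for name in LEVELS[level:]:
            self.values[name] = 0

    def mark(self, level):
        # Like w_sectionmark: join the first `level` counters with dots.
        return ".".join(str(self.values[name]) for name in LEVELS[:level])


counters = SectionCounters()
for level in (1, 2, 2, 3):
    counters.step(level)
print(counters.mark(3))  # 1.2.1
```

In the macros, w_number_of additionally appends a space after the mark and produces nothing at all unless w_do_numbering is true.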

Diversions

Diversions common for every output format are declared here.

w_define_div(`frontmatter')
w_define_div(`ingr')
w_define_div(`body')
w_define_div(`backmatter')
ifelse(w_make_toc,true,`w_define_div(`toc')',`w_define_trashcan(`toc')')

defs is a genuine trashcan diversion. The others serve as defaults for diversions that are not used for most output formats.

w_define_trashcan(`defs')
w_define_trashcan(`metas')
w_define_trashcan(`preamble')

Footnotes

By default, we gather footnotes in a diversion that can be dumped upon request. w_footnote is meant for end users; the environment is the real thing, used directly by abbreviations.

w_define_div(`footnote')
define(`w_footnote', ``'w_beg(footnote)`'$1`'w_end(footnote)`'')
w_newcounter(footnote)
w_define_env(footnote,
`define(`@w_footnote_flag',t)'dnl
`w_stepcounter(footnote)w_footnotemark(w_counter_arabic(footnote))`''dnl
`w_begdiv(footnote)w_footnotemark(w_counter_arabic(footnote)) ',
`w_linebr`'w_nl`'w_enddiv(footnote)')
define(`w_dump_footnotes',
`ifelse(defn(`@w_footnote_flag'),,,`$1`'w_dumpdiv(footnote)$2')'dnl
`undefine(`@w_footnote_flag')')
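
The gather-and-dump idea can be sketched in Python as follows. This is only a model: the "[n]" rendering of the footnote mark is an assumption (w_footnotemark is defined per output format), the list stands in for the footnote diversion, and the class name is hypothetical.

```python
class FootnoteCollector:
    """Toy model of the footnote diversion (hypothetical names)."""

    def __init__(self):
        self.count = 0     # plays the role of the footnote counter
        self.pending = []  # plays the role of the footnote diversion

    def add(self, text):
        self.count += 1
        mark = f"[{self.count}]"  # assumption: how w_footnotemark renders
        # The mark is repeated in front of the diverted footnote text.
        self.pending.append(f"{mark} {text}")
        return mark  # this is what stays in the running text

    def dump(self, before="", after=""):
        # Like w_dump_footnotes: emit nothing if no footnote was gathered.
        if not self.pending:
            return ""
        body = "\n".join(self.pending)
        self.pending = []  # the counter itself keeps running
        return f"{before}{body}{after}"


notes = FootnoteCollector()
print("Some claim" + notes.add("Source for the claim."))
print(notes.dump(before="----\n"))
```

The before and after arguments correspond to $1 and $2 of w_dump_footnotes, which lets the caller wrap the dump in a separator only when there is something to show.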

Link abbreviation support

The link abbreviations use four macros of their own: w_generic_link, w_make_autolabel, w_autolabel and w_autorefer. The latter two are just like w_label and w_refer, but you don't have to come up with a label name: it is generated from the text.

You call w_make_autolabel when you want to make sure that the label of some specific text is unique. w_index does this, and link abbreviations also generate unique labels beforehand for all anchors and headings (to account for cases where the headings differ only in characters that w_tidystring trashes).

However, if we don't have a generated label, we just use whatever w_tidystring returns; that's what w_make_autolabel was likely to produce for the unique label anyway.

w_newcounter(autolabel)
define(`w_genlabel',
`ifdef(`@w_label_used_$1',
  `w_stepcounter(autolabel)w_genlabel(`$1'w_counter_arabic(autolabel))',
  `define(`@w_label_used_$1',t)`$1'')')
define(`w_tidystring', `patsubst(``$1'',`[^0-9A-Za-z`']',`.')')
define(`w_make_autolabel',
`define(`@w_label_of_$1',w_genlabel(w_tidystring(`$1')))')
define(`w_get_autolabel',
`ifdef(`@w_label_of_$1',`defn(`@w_label_of_$1')',`w_tidystring(`$1')')')
define(`w_autolabel', `w_label(w_get_autolabel(`$1'), `$1')')
define(`w_autorefer',
`w_refer(w_get_autolabel(`$1'), ifelse(`$2',,``$1'',``$2''))')
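
To make the label generation concrete, here is a Python model of the macros above. It is a sketch with hypothetical names; the regular expression mirrors the one in w_tidystring (the m4 quote characters are left alone, everything else that is not alphanumeric becomes a dot).

```python
import re


def tidy(text):
    # Like w_tidystring: replace each character outside the class with ".".
    return re.sub(r"[^0-9A-Za-z`']", ".", text)


class LabelRegistry:
    """Toy model of w_genlabel, w_make_autolabel and w_get_autolabel."""

    def __init__(self):
        self.used = set()   # the @w_label_used_... definitions
        self.label_of = {}  # the @w_label_of_... definitions
        self.counter = 0    # the autolabel counter

    def genlabel(self, candidate):
        # On a clash, append the stepped counter and try again; as in the
        # recursive macro, the digits accumulate on the candidate.
        while candidate in self.used:
            self.counter += 1
            candidate = f"{candidate}{self.counter}"
        self.used.add(candidate)
        return candidate

    def make_autolabel(self, text):
        self.label_of[text] = self.genlabel(tidy(text))

    def get_autolabel(self, text):
        # Fall back to the tidied text when no label was generated.
        return self.label_of.get(text, tidy(text))


registry = LabelRegistry()
registry.make_autolabel("Hello, world")
registry.make_autolabel("Hello; world")
print(registry.get_autolabel("Hello; world"))  # Hello..world1
```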

Now, generic links are quite a beast. They can become:

  1. cross references (if the label exists),
  2. cross links (if the document is known),
  3. inline images or ordinary links (if it seems like we have a URL),
  4. footnotes (if everything else fails)

in this order of preference. They can link directly or indirectly (via a link data block).

define(`w_generic_link',
`ifdef(`@w_linkdata_of_$1', `w_generic_link(defn(`@w_linkdata_of_$1'),`$2',fn_ok)',
 `ifdef(`@w_label_of_$1', `w_autorefer(`$1', `$2')',
  `ifdef(`@w_filename_of_$1', `w_crosslink(`$1', `$2')',
   `ifdef(`@w_file_exists_$1', `w_crosslink(`$1', `$2')',
    `ifelse(index(`$1',img:),0, `w_img(substr(`$1', 4), `$2')',
     `ifelse(w_is_url(`$1'),t,`ifelse(`$2',,`w_url(`$1')',`w_link(`$1',`$2')')',
      `ifelse(`$3',fn_ok,`$2`'w_footnote(`$1')',
       `$2[$1]w_warning(`Unknown link tag: "$1"')')')')')')')')')
define(`w_is_url',
`ifelse(index(`$1',http://),0,t,index(`$1',https://),0,t,index(`$1',ftp://),0,t,
   index(`$1',gopher://),0,t,index(`$1',file:/),0,t,index(`$1',nntp://),0,t,
   index(`$1',mailto:),0,t,index(`$1',news:),0,t,
   index(`$1',./),0,t,index(`$1',../),0,t)')
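
The order of preference can be modelled in Python like this. It is a sketch only: the function names and the three lookup containers are hypothetical stand-ins for the @w_linkdata_of_, @w_label_of_ and @w_filename_of_ / @w_file_exists_ definitions, and the function returns a description instead of producing markup.

```python
URL_PREFIXES = (
    "http://", "https://", "ftp://", "gopher://", "file:/",
    "nntp://", "mailto:", "news:", "./", "../",
)


def is_url(tag):
    # Like w_is_url: true if the tag starts with one of the known prefixes.
    return tag.startswith(URL_PREFIXES)


def classify_link(tag, linkdata, labels, documents, fn_ok=False):
    """Return (kind, target) following the order used by w_generic_link."""
    if tag in linkdata:
        # Indirect link: resolve the link data; footnotes become acceptable.
        return classify_link(linkdata[tag], linkdata, labels, documents, True)
    if tag in labels:
        return ("cross reference", tag)
    if tag in documents:
        return ("cross link", tag)
    if tag.startswith("img:"):
        return ("image", tag[4:])
    if is_url(tag):
        return ("link", tag)
    if fn_ok:
        return ("footnote", tag)
    return ("unknown", tag)  # the macro emits a warning here


linkdata = {"home": "http://example.org/", "note": "See the appendix."}
labels = {"Intro"}
documents = {"other.stx"}
print(classify_link("note", linkdata, labels, documents))
```

Note that a footnote is only produced for indirect links; a direct link with an unrecognised tag falls through to the warning.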

End-user markup

These definitions are unrelated to the rest of this file; they are included here because they are independent of the output format. They are expected to be invoked directly by the author of the document.

These didn't fit anywhere else.

define(`w_def_in_fmt',
`ifelse(defn(`w_outputfmt'), `$1', `define(`$2', `$3')')')
define(`w_invoke',
`ifdef(`$1',`$1',`w_warning(`Unknown macro "$1" called')w_void')')
define(`w_use',
`ifdef(`@w_included_$1',,
   `define(`@w_included_$1',t)include(`$1.m4')`'')')

Indexes and cross-links

Indexing something currently just puts the same text both in the index diversion and in the current text, cross-referencing them. Some indexes should probably be lexicographically ordered, but that needs more careful design. As it stands, this system is quite sufficient for lists of pictures and the like.

define(`w_index',
`ifelse(`$3',,`w_index(`$1',,`$2')',
`w_make_autolabel(`$3')`$2'w_autolabel(`$3')`''dnl
`w_begdiv(`$1')`$2'w_autorefer(`$3')`'w_linebr`'w_nl`'w_enddiv(`$1')')')
define(`w_indexword',
`define(`$2', `w_index(`$1', ``$2'')')')

Cross links are links between documents. To make a cross link properly to another document, a document needs to know something about the other document. gather_stx_titles and w_crosslink together provide a way to keep track of this information.

There are seven cases:

  1. document contains both title and id, referenced by id
  2. document contains both title and id, referenced by filename
  3. document contains only title, referenced by filename
  4. document contains only id, referenced by id
  5. document contains only id, referenced by filename
  6. document contains neither title nor id, referenced by filename
  7. document unknown or doesn't exist

Cases 1–3 produce a link to the file with the title as link text. Cases 4–6 produce a link with the filename as link text. Case 7 produces just whatever the document was referenced by, plus a warning. A second argument, if present, overrides the visible text produced by this macro.

define(`w_file',
`ifelse(w_is_url(`$1'),t,,ifdef(`w_base',defn(`w_base')/))`$1'')
define(`w_crosslink',
`ifdef(`@w_filename_of_$1',
 `w_crosslink(defn(`@w_filename_of_$1'),`$2')',
 `ifdef(`@w_file_exists_$1',
  `w_link(w_file(`$1'),
   ifelse(`$2',,`ifdef(`@w_title_of_$1',
     `defn(`@w_title_of_$1')',``$1'')',``$2''))',
  `w_warning(`Unknown cross link to "$1"')ifelse(
     `$2',,`$1',`$2')')')')
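
The case analysis can be sketched in Python as below. The three containers are hypothetical stand-ins for the @w_filename_of_, @w_file_exists_ and @w_title_of_ definitions, and the sketch leaves out the w_base prefix that w_file adds to the link target.

```python
def crosslink(ref, text=None, *, filename_of, known_files, title_of):
    """Return (target, visible_text, warning), modelled on w_crosslink."""
    if ref in filename_of:
        # Referenced by id: continue with the filename the id maps to.
        return crosslink(filename_of[ref], text, filename_of=filename_of,
                         known_files=known_files, title_of=title_of)
    if ref in known_files:
        # Known file: explicit text wins, then the title, then the filename.
        return (ref, text or title_of.get(ref, ref), None)
    # Unknown document: no link, just the reference text and a warning.
    return (None, text or ref, f'Unknown cross link to "{ref}"')


index = dict(
    filename_of={"intro": "intro.stx"},
    known_files={"intro.stx", "notes.stx"},
    title_of={"intro.stx": "Introduction"},
)
print(crosslink("intro", **index))
```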

Some environments

w_define_env(`text',,)
w_define_env(`ifeq',
  `ifelse(`$1', `$2',, `w_begdiv(defs)')',
  `ifelse(`$1', `$2',, `w_enddiv(defs)')')

Floats and their infrastructure.

w_define_env(`float',
  `w_beg(w_some_float_env(`$1'), shift($@))',
  `w_end(w_some_float_env(`$1'), shift($@))')
define(`w_some_float_env',
`ifelse(`$1',,`w_float_default',
 `w_ifdef_env(`w_float_'substr(`$1',0,1),
  ``w_float_'substr(`$1',0,1)',
  `w_some_float_env(substr(`$1',1))')')')

w_define_env(`w_float_h',
  `w_sbreak(5)`'w_nl',
  `w_paragraph`'ifelse(`$1',,,w_caption(`$1')`'w_nl`')w_sbreak(5)`'w_nl')
w_derive_env(`w_float_n', `w_float_h', 0,
  `define(`@w_footnote_flag',t)w_begdiv(footnote)',,,
  `w_enddiv(footnote)undefine(`@w_para_flag')')
w_derive_env(`w_float_default', `w_float_n', 0,,,,)
define(`w_caption', `$1')
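
The way w_some_float_env picks an environment from the placement string can be modelled in a few lines of Python. The function name is hypothetical, and the default set of defined environments reflects only the w_float_h and w_float_n definitions in this file; output formats may define more.

```python
def some_float_env(placement, defined=("h", "n")):
    """Pick the first placement letter that has a w_float_<letter> environment."""
    for letter in placement:
        if letter in defined:
            return "w_float_" + letter
    # No letter matched (or the string was empty): use the default.
    return "w_float_default"


print(some_float_env("th"))  # w_float_h
```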

Table infrastructure

Common helpers used by both table environments.

define(`w_begin_row',
`w_reinit_list(columns)w_stepcounter(row)'dnl
`w_beg(w_row, n, w_counter_arabic(row))`'')
define(`w_begin_cell',
`w_stepcounter(column)w_beg(w_cell, n, w_next_in_list(columns))')

Generic table environment. These definitions transform low-level w_horizbr and w_linebr into high-level environment calls. The environment is based on w_table, which each output format defines for itself.

w_derive_env(`table', `w_table', 0,
`w_setup_list(columns, $@)w_newcounter(column)w_newcounter(row,column)'dnl
`pushdef(`w_pending_block_hook', `w_begin_row`'w_begin_cell`'')'dnl
`pushdef(`w_linebr',
  `w_end(w_cell)`'w_end(w_row)`'define(`w_pending_block_hook',
    `w_begin_row`'w_begin_cell`'')')'dnl
`pushdef(`w_horizbr', `w_end(w_cell)`'w_begin_cell`'')'dnl
`pushdef(`w_sectbreak', `w_table_rule(w_length_list(columns))')',
`undefine(`@w_para_flag')',
,
`popdef(`w_linebr')popdef(`w_horizbr')popdef(`w_sectbreak')'dnl
`popdef(`w_pending_block_hook')w_delcounter(column)w_delcounter(row)'dnl
`w_unsetup_list(columns)')

List tables.

w_derive_env(`listtable', `w_table', 0,
`w_setup_list(columns, $@)w_newcounter(column)w_newcounter(row,column)'dnl
`pushdef(`w_sectbreak', `w_table_rule(w_length_list(columns))')'dnl
`w_push_env(*)w_derive_env(*,w_listtable_level,0,,,,)'dnl
`w_newcounter(w_listtable)w_push_env(*i)undefine(`@w_para_flag')',,,
`w_pop_env(*i)w_pop_env(*)w_delcounter(w_listtable)popdef(`w_sectbreak')'dnl
`w_unsetup_list(columns)w_delcounter(column)w_delcounter(row)')
w_define_env(`w_listtable_level',
`w_stepcounter(w_listtable)'dnl
`ifelse(w_counter_arabic(w_listtable),1,`w_derive_env(*i,w_row,0,,,,)',
  w_counter_arabic(w_listtable),2,`w_derive_env(*i,w_cell,0,,,,)',
  `w_error(`Hm, trying to make three-dimensional tables?')')',
`w_stepcounter(w_listtable,-1)'dnl
`ifelse(w_counter_arabic(w_listtable),1,`w_derive_env(*i,w_row,0,,,,)',
  w_counter_arabic(w_listtable),0,`w_define_env(*i,,)')')

Paragraphs

These transform low-level w_para calls into high-level w_paragraph calls.

define(`w_softopen', `define(`@w_para_flag',t)')
define(`w_para', `w_softopen`'w_softpara')
define(`w_beg_para',
`ifelse(defn(`@w_para_flag'),t,`w_paragraph`'')'dnl
`undefine(`@w_para_flag')')

Hooks for dealing with breaks (w_softpara is a hook for those who want to do something for raw w_para).

define(`w_softbr',)
define(`w_softpara',)

Pending blocks. This hook is meant to be invoked for opening a block when (or if) any text is forthcoming. Used by tables (sometimes) and definition lists.

define(`w_pending_block_hook',)
define(`w_pending_block',
`w_pending_block_hook`'define(`w_pending_block_hook',)')
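
The hook is one-shot: it runs once and is then cleared. A minimal Python model of that behaviour, with hypothetical names, might look like this:

```python
class PendingBlock:
    """Toy model of w_pending_block_hook and w_pending_block."""

    def __init__(self):
        self.hook = None

    def arm(self, opener):
        # Like redefining w_pending_block_hook with some opening markup.
        self.hook = opener

    def fire(self):
        # Like w_pending_block: run the hook, then reset it to nothing.
        output = self.hook() if self.hook else ""
        self.hook = None
        return output


pending = PendingBlock()
pending.arm(lambda: "<row><cell>")
print(pending.fire())  # <row><cell>
print(repr(pending.fire()))  # ''
```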

Block system

Block infrastructure

This is the real thing. These macros provide the infrastructure for transforming indents and dedents (produced by the indentation system in stx2any — common m4 facilities) into block structure. Blocks are environments which are opened upon indent and closed upon dedent. If the block type changes, the old block is closed and a new one opened.

There is a pseudo block, n, which kind of means no block at all. It is used because when we get a new indent, we don't know the forthcoming block type (because the indentation system is independent of block types).

We store previous block type because sometimes the type of a new block depends on the type of the enclosing block.

w_define_env(n,,)
define(`@w_block_type',n)
define(`w_indent',
`pushdef(`@w_prev_block_type', defn(`@w_block_type'))'dnl
`pushdef(`@w_block_type',n)w_beg(n)`'w_indent_hook`'')
define(`w_dedent',
`w_dedent_hook`'w_end(defn(`@w_block_type'))`''dnl
`popdef(`@w_prev_block_type')popdef(`@w_block_type')')

Hooks for direct users of the indent system.

define(`w_indent_hook',)
define(`w_dedent_hook',)

Change block type within indent level.

define(`w_setblocktype',
`ifelse(defn(`@w_block_type'),`$1',,
`w_end(defn(`@w_block_type'))`'define(`@w_block_type',`$1')'dnl
`w_beg(defn(`@w_block_type'))')')
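
The interplay of w_indent, w_dedent and w_setblocktype can be modelled with a stack. The following Python sketch uses hypothetical names and records w_beg/w_end calls as events instead of producing markup; the indent and dedent hooks are left out.

```python
class BlockStack:
    """Toy model of the block type stack."""

    def __init__(self):
        self.types = ["n"]  # the initial @w_block_type
        self.events = []

    def indent(self):
        # A new indent level always starts as the pseudo block n.
        self.types.append("n")
        self.events.append(("beg", "n"))

    def dedent(self):
        # Close whatever block the level ended up being, then drop the level.
        self.events.append(("end", self.types[-1]))
        self.types.pop()

    def set_block_type(self, new_type):
        # Only act if the type really changes within this level.
        if self.types[-1] == new_type:
            return
        self.events.append(("end", self.types[-1]))
        self.types[-1] = new_type
        self.events.append(("beg", new_type))


blocks = BlockStack()
blocks.indent()
blocks.set_block_type("q")
blocks.set_block_type("q")  # no effect, the type is unchanged
blocks.dedent()
print(blocks.events)
```

In this model the previous block type that the macros keep in @w_prev_block_type is simply the entry below the top of the stack.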

Block markup glue

These definitions transform low-level w_sbreak, w_bline, w_item and w_term into high-level w_sectbreak, environment invocations, w_paragraph, w_listitem and w_defnterm. They use the paragraph system and block system above as well as the indentation system to achieve this.

The block types have the following meaning:

n
    no block yet
text
    ordinary text
q
    block quote
-, *, #, :
    itemised, itemised, numbered, and definition lists, respectively. Note that these are not the blocks of the list items, but of the lists themselves.
-i, *i, #i, t
    list items. t is the type of a definition in definition lists (terms in definition lists are not considered blocks at all).

The pending block hook usually contains some opening element, if anything. We try to invoke it at an appropriate place: after everything has been closed (so the environments have time to cancel it), but before the elements that were possibly supposed to be within the pending block.

Okay, on with the definitions.

Section breaks.

define(`w_sbreak', `w_newindent(0)`'w_sectbreak(`$1')')

Ordinary text lines. These only induce one level of indentation, whose type is normal text at top level and inside list items, and block quote elsewhere. In other words, the text stays normal if it has a good reason to be indented that way; otherwise it becomes a block quote.

define(`w_bline',
`w_newindent(`$1')`'w_onlyindent`''dnl
`w_pending_block`'w_beg_para`'w_softbr')
define(`w_onlyindent',
`ifelse(index(`t-i*i#i',defn(`@w_prev_block_type')),-1,
   `w_setblocktype(q)',`w_setblocktype(text)')')
w_define_env(i,`w_listitem',)
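
The decision made by w_onlyindent boils down to one substring test, because m4's index searches for a substring. A Python model (hypothetical function name) could be:

```python
def only_indent_type(prev_block_type):
    """Choose the block type for an ordinary text line, like w_onlyindent."""
    # str.find behaves like m4's index here: -1 means "not a substring".
    if "t-i*i#i".find(prev_block_type) == -1:
        return "q"
    return "text"


for prev in ("-i", "t", "q", "n"):
    print(prev, "->", only_indent_type(prev))
```

An empty previous type also yields text in this model, since an empty string is found at position 0. That presumably corresponds to the top-level case, assuming the previous block type expands to nothing there.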

Different kinds of list items. These imply the presence of a list; if we were already in a list, the first indent level closes the pending list item. We also mark the indent level of the item text, so we can tell if the next line is a block quote (indented more).

define(`w_item',
`w_newindent(`$1',1)`'w_pending_block`'w_setblocktype(`$3')`''dnl
`w_newindent(`$1',2)`'w_setblocktype(`$3i')`''dnl
`w_newindent(`$2',0)`'w_setblocktype(text)`'undefine(`@w_para_flag')w_softbr')

Terms of definition list. These imply the presence of a definition list and a forthcoming definition. However, as the definition text does not begin on this line, we don't set up an indent level for it but just the t block.

define(`w_term',
`w_newindent(`$1',1)`'w_pending_block`'w_setblocktype(:)`''dnl
`w_defnterm(`$2')`''dnl
`w_newindent(`$1',2)`'w_setblocktype(t)`'undefine(`@w_para_flag')')