stx2any — common m4 facilities

This file is copyright © 2004,2005,2006,2011 by Panu Kalliokoski and released under the stx2any license.

This source file deals with definitions that are generally useful and not directly related to stx2any. You might well want to use these in any m4-based project.

First, ignore output. We only want to make definitions; output belongs to (format-specific) templates.

divert(-1)

Undefine format. I wonder what the GNU gurus were thinking about when they defined this extension to be recognised without parameters.

undefine(`format')

Infrastructure

These macros are simple computation or exception facilities.

Error reporting

There are two error levels, fatal and non-fatal (ta-da!). Both should report the error location in the standard Unix format (file:line: message), because some tools, like emacs, can read this output and jump directly to the error locations. There are two complications:

  1. block-related errors come in pairs, as mismatching begin and end conditions. The error, which is usually an omission, is at neither place but usually somewhere in between. So, we must report both places. Currently, only the end-place is reported in the standard format.
  2. because the actual input is streamed to m4 through sed, we don't know the actual file names. Fear not: we define the macro w_real_file upon entering each input file so we can report errors properly.

define(`w_stdin_p', `ifelse(__file__,stdin,t,__file__,-,t)')
define(`w_current_location',
`ifelse(w_stdin_p,t,w_real_file,__file__):'dnl
`ifelse(w_stdin_p,t,`eval(__line__ - w_line_base)',__line__)')
define(`w_warning',
`errprint(ifelse(`$2',,`w_current_location',``$2'')`: stx2any: $1'w_nl)')
define(`w_error', `w_warning($@)m4exit(1)')
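
As a rough illustration of the reporting format, the sketch below repeats the definitions above and emits two warnings on standard error. The values given to w_real_file and w_line_base are placeholders for this example only (in stx2any they are set elsewhere, on entering each input file), and the second call passes an explicit location instead of the current one.

```m4
divert(-1)
# Placeholder values, only consulted when m4 reads standard input.
define(`w_real_file', `chapter1.stx')
define(`w_line_base', 0)
define(`w_nl',`
')
define(`w_stdin_p', `ifelse(__file__,stdin,t,__file__,-,t)')
define(`w_current_location',
`ifelse(w_stdin_p,t,w_real_file,__file__):'dnl
`ifelse(w_stdin_p,t,`eval(__line__ - w_line_base)',__line__)')
define(`w_warning',
`errprint(ifelse(`$2',,`w_current_location',``$2'')`: stx2any: $1'w_nl)')
divert(0)dnl
dnl Location taken from the current input position:
w_warning(`suspicious markup')dnl
dnl Location supplied by the caller:
w_warning(`block begins here...', `intro.stx:12')dnl
```

The second call writes "intro.stx:12: stx2any: block begins here..." to standard error; the first is prefixed with whatever file name and line number m4 is currently reading.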
    

Output twiddling

w_nl is indispensable if you want to write neat code. The whitespace dependence in m4 is bad enough as it is.

For producing backquotes, we need to override quotes both upon definition and upon invocation. The code looks weird but works.

define(`w_nl',`
')
define(`w_void',)
define(`w_bq',
changequote([[,]])dnl
[[changequote(.,.)`changequote(`,')]]dnl
changequote(`,'))
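
A small sketch of how these two macros might be used, assuming GNU m4. The helper shell_subst is hypothetical and exists only for this example; it wraps its argument in literal backquotes.

```m4
divert(-1)
define(`w_nl',`
')
define(`w_bq',
changequote([[,]])dnl
[[changequote(.,.)`changequote(`,')]]dnl
changequote(`,'))
# Hypothetical helper: wrap a shell command in literal backquotes.
define(`shell_subst', `w_bq`'$1`'w_bq')
divert(0)dnl
`first line'w_nl`second line'
shell_subst(`date')
```

The first output line pair shows w_nl producing a line break inside an expansion; the last line should come out as the word date between two backquote characters.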

Quote facilities

These are needed for storing parameter lists.[1] Combining packs many parameters in a single string; dequoting removes one level of quotes for a string, effectively unpacking the parameter list (parameters are still quoted separately).

[1] Single parameters are conveniently handled by simple pushdef / popdef.

Actually, invoking foo is equivalent to w_dequote(defn(`foo')), but the plain invocation only works for strings that are valid macro names. For internal variables beginning with @, defn plus dequoting is the only way.

define(`w_combine', ``$@'')
define(`w_gather', ``$*'')
define(`w_dequote', `$1')
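
The packing and unpacking described above can be sketched as follows. The names saved_args and show3 are made up for the example; saved_args holds a packed parameter list and show3 simply displays its first three arguments.

```m4
divert(-1)
define(`w_combine', ``$@'')
define(`w_dequote', `$1')
# Hypothetical names: saved_args holds the packed list, show3 consumes it.
define(`saved_args', w_combine(`a', `b c', `d'))
define(`show3', `[$1] [$2] [$3]')
divert(0)dnl
show3(w_dequote(defn(`saved_args')))
show3(defn(`saved_args'))
```

With dequoting, show3 receives three separate arguments and prints "[a] [b c] [d]". Without it, the whole list arrives as a single argument and the result is "[a,b c,d] [] []".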

List facilities

define(`w_pickn',
`ifelse(`$1',1,`$2',`w_pickn(decr(`$1'),shift(shift($@)))')')
define(`w_listlen',
`ifelse(`$*',,0,`incr(w_listlen(shift($@)))')')
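
The list macros carry no commentary of their own. Judging from the definitions, w_pickn expands to the n-th of the arguments following its first one (counting from 1), and w_listlen counts its arguments. A minimal check:

```m4
divert(-1)
define(`w_pickn',
`ifelse(`$1',1,`$2',`w_pickn(decr(`$1'),shift(shift($@)))')')
define(`w_listlen',
`ifelse(`$*',,0,`incr(w_listlen(shift($@)))')')
divert(0)dnl
w_pickn(2, `x', `y', `z')
w_listlen(`x', `y', `z')
w_listlen()
```

This prints y, 3 and 0 on separate lines. Note that w_listlen stops counting as soon as the remaining arguments join to an empty string, so a list consisting of a single empty element counts as empty.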

Counter system (à la LaTeX)

Reentrant newcounter and delcounter.

define(`w_newcounter',
`pushdef(`@w_counter_$1',0)'dnl
`pushdef(`@w_refcounter_$1',`$2')')
define(`w_delcounter',
`popdef(`@w_counter_$1')popdef(`@w_refcounter_$1')')

All changes to counters go through setcounter, so that a counter's reference counter (the optional second argument of newcounter) is reset to zero whenever the counter itself changes.

define(`w_setcounter',
`define(`@w_counter_$1',`$2')'dnl
`ifelse(defn(`@w_refcounter_$1'),,,
   `w_setcounter(defn(`@w_refcounter_$1'),0)')')
define(`w_getcounter', `defn(`@w_counter_$1')')
define(`w_stepcounter',
`w_setcounter(`$1',ifelse(`$2',,`incr(',`eval($2+')w_getcounter(`$1')))')

Counter value in different formats.

define(`w_counter_arabic', `w_getcounter(`$1')')
define(`w_counter_alpha',
`substr(_abcdefghijklmnopqrstuvwxyz,w_getcounter(`$1'),1)')
define(`w_counter_Alpha',
`substr(_ABCDEFGHIJKLMNOPQRSTUVWXYZ,w_getcounter(`$1'),1)')
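
A short sketch of the counter macros in use. The counter names chapter and item are invented for the example; item is registered as the reference counter of chapter, so stepping chapter resets item.

```m4
divert(-1)
define(`w_newcounter',
`pushdef(`@w_counter_$1',0)'dnl
`pushdef(`@w_refcounter_$1',`$2')')
define(`w_setcounter',
`define(`@w_counter_$1',`$2')'dnl
`ifelse(defn(`@w_refcounter_$1'),,,
   `w_setcounter(defn(`@w_refcounter_$1'),0)')')
define(`w_getcounter', `defn(`@w_counter_$1')')
define(`w_stepcounter',
`w_setcounter(`$1',ifelse(`$2',,`incr(',`eval($2+')w_getcounter(`$1')))')
define(`w_counter_arabic', `w_getcounter(`$1')')
define(`w_counter_alpha',
`substr(_abcdefghijklmnopqrstuvwxyz,w_getcounter(`$1'),1)')
# Hypothetical counters: stepping chapter resets its reference counter item.
w_newcounter(`item')
w_newcounter(`chapter', `item')
divert(0)dnl
w_stepcounter(`chapter')w_stepcounter(`item')w_stepcounter(`item')dnl
chapter w_counter_arabic(`chapter'), item w_counter_alpha(`item')
w_stepcounter(`chapter')dnl
chapter w_counter_arabic(`chapter'), item w_counter_arabic(`item')
w_stepcounter(`item', 5)dnl
item w_counter_arabic(`item')
```

The output is "chapter 1, item b", then "chapter 2, item 0", then "item 5". The optional second argument of w_stepcounter is an increment, which may also be negative.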

Diversion system

Diversions are for rearranging input. These are just a thin wrapper around the native diversions of m4, providing nesting, error reporting, and names for diversions. I think naming would be reason enough to use these.

Diversions are somewhat hard to understand, because they do nothing to the way m4 processes macros; they only say where the output goes when there is some output. But in m4, expansions are reread until they don't expand any more, so it is not that simple to tell when there will be output. Stated differently: diversions are side effects, so make sure (by quoting) that they won't take effect before you want them to. Another important point to realise is that other side effects (e.g. definitions) are not affected by diversions.

define(`w_begdiv',
`ifdef(`@w_div_$1',,`w_error(`unknown diversion "$1"')')'dnl
`pushdef(`@w_divlocstack', w_current_location)'dnl
`pushdef(`@w_divstack',$1)divert(defn(`@w_div_$1'))')
define(`w_enddiv', 
`ifdef(`@w_divstack',,`w_error(`diversion stack empty')')'dnl
`ifelse(`$1',,,`$1',defn(`@w_divstack'),,
   `w_warning("defn(`@w_divstack')`" begins here...', defn(`@w_divlocstack'))'
   `w_error(`diversion "'defn(`@w_divstack')`" closed by "$1"')')'dnl
`popdef(`@w_divlocstack')popdef(`@w_divstack')'dnl
`ifdef(`@w_divstack',`divert(defn(`@w_div_'defn(`@w_divstack')))')')
define(`w_check_div', `ifdef(`@w_divstack',
   `w_error(`unclosed diversion "'defn(`@w_divstack')", defn(`@w_divlocstack'))')')
define(`w_dumpdiv', `undivert(defn(`@w_div_$1'))')

Diversions are actually numbers. Provide a way to map names to those numbers; a trashcan maps a name to diversion -1, whose output is discarded.

w_newcounter(`w_n_avail_div')
define(`w_define_div',
`w_stepcounter(`w_n_avail_div')'dnl
`define(`@w_div_$1', w_getcounter(`w_n_avail_div'))')
define(`w_define_trashcan',
`define(`@w_div_$1', -1)')
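
The sketch below puts the diversion wrappers together, assuming GNU m4. The diversion names contents and body and the values of w_real_file and w_line_base are placeholders for the example. Text is written in one order and dumped in another, with one diversion nested inside the other.

```m4
divert(-1)
# Placeholder values, only consulted when m4 reads standard input.
define(`w_real_file', `example.stx')
define(`w_line_base', 0)
define(`w_nl',`
')
define(`w_stdin_p', `ifelse(__file__,stdin,t,__file__,-,t)')
define(`w_current_location',
`ifelse(w_stdin_p,t,w_real_file,__file__):'dnl
`ifelse(w_stdin_p,t,`eval(__line__ - w_line_base)',__line__)')
define(`w_warning',
`errprint(ifelse(`$2',,`w_current_location',``$2'')`: stx2any: $1'w_nl)')
define(`w_error', `w_warning($@)m4exit(1)')
define(`w_newcounter',
`pushdef(`@w_counter_$1',0)'dnl
`pushdef(`@w_refcounter_$1',`$2')')
define(`w_setcounter',
`define(`@w_counter_$1',`$2')'dnl
`ifelse(defn(`@w_refcounter_$1'),,,
   `w_setcounter(defn(`@w_refcounter_$1'),0)')')
define(`w_getcounter', `defn(`@w_counter_$1')')
define(`w_stepcounter',
`w_setcounter(`$1',ifelse(`$2',,`incr(',`eval($2+')w_getcounter(`$1')))')
define(`w_begdiv',
`ifdef(`@w_div_$1',,`w_error(`unknown diversion "$1"')')'dnl
`pushdef(`@w_divlocstack', w_current_location)'dnl
`pushdef(`@w_divstack',$1)divert(defn(`@w_div_$1'))')
define(`w_enddiv',
`ifdef(`@w_divstack',,`w_error(`diversion stack empty')')'dnl
`ifelse(`$1',,,`$1',defn(`@w_divstack'),,
   `w_warning("defn(`@w_divstack')`" begins here...', defn(`@w_divlocstack'))'
   `w_error(`diversion "'defn(`@w_divstack')`" closed by "$1"')')'dnl
`popdef(`@w_divlocstack')popdef(`@w_divstack')'dnl
`ifdef(`@w_divstack',`divert(defn(`@w_div_'defn(`@w_divstack')))')')
define(`w_dumpdiv', `undivert(defn(`@w_div_$1'))')
w_newcounter(`w_n_avail_div')
define(`w_define_div',
`w_stepcounter(`w_n_avail_div')'dnl
`define(`@w_div_$1', w_getcounter(`w_n_avail_div'))')
# Hypothetical diversion names for the example.
w_define_div(`contents')
w_define_div(`body')
w_begdiv(`body')dnl
Chapter one.
w_begdiv(`contents')dnl
1. Chapter one
w_enddiv(`contents')dnl
Text of chapter one.
w_enddiv(`body')dnl
divert(0)dnl
w_dumpdiv(`contents')dnl
w_dumpdiv(`body')dnl
```

The contents line is emitted first, followed by the two body lines. As the definition of w_enddiv shows, closing the outermost diversion leaves the current m4 diversion unchanged, which is why the example switches to diversion 0 explicitly before dumping.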

Environment system (à la LaTeX)

Environments are meant for big things, where it would be ugly and/or unwieldy to use a single macro. For example, I wouldn't like it if I had to wrap a whole block quote in a macro call. Macros are more sensitive to syntax errors with parentheses and quotes and provide less information about what went wrong. On the other hand, environments can't read and process the included text,[2] so the effect of environments is limited to output upon opening and closing, and indirect effects like redefining hooks.

[2] unless you are perverse enough to have the environment expand to a big macro call, in which case the problems of macros apply.

That said, environments are a relatively thin wrapper around macro calls, as they are in LaTeX. They provide error reporting, saving of arguments until the end of the environment, and a separate namespace.

To allow environments to call other environments, we define many layers of environment variables. Whenever we execute an environment definition, we increase the layer. This ensures that the arguments of the calling environment won't interfere with the arguments of the called environment. But this imposes a restriction: environments must always be closed at the same layer where they were opened.

w_newcounter(`w_layer')
define(`w_layervar', ``w_layer_'w_getcounter(`w_layer')`_$1'')
define(`w_sublayer',
`w_stepcounter(`w_layer')$1`'w_stepcounter(`w_layer',-1)')

define(`w_define_env',
   `define(`@w_begin_$1', `$2')define(`@w_end_$1', `$3')')
define(`w_ifdef_env', `ifdef(`@w_begin_$1', `$2', `$3')')

define(`w_beg',
`w_ifdef_env(`$1',, `w_error(`unknown environment "$1"')')'dnl
`pushdef(w_layervar(env), `$1')'dnl
`pushdef(w_layervar(params), w_combine(shift($@)))'dnl
`pushdef(w_layervar(loc), w_current_location)'dnl
`w_sublayer(`indir(`@w_begin_$1',shift($@))')')

define(`w_end',
`ifdef(w_layervar(env),,`w_error(`environment stack empty')')'dnl
`ifelse(`$1',,,`$1',defn(w_layervar(env)),,
   `w_warning("defn(w_layervar(env))`" begins here...', defn(w_layervar(loc)))'
   `w_error(`environment "'defn(w_layervar(env))`" closed by "$1" in layer 'w_counter_arabic(`w_layer'))')'dnl
`w_sublayer(`indir(`@w_end_''defn(w_layervar(env))`,'
   defn(w_layervar(params))`)')'dnl
`popdef(w_layervar(loc))popdef(w_layervar(env))popdef(w_layervar(params))')

define(`w_check_env1', `ifdef(w_layervar(env),
   `w_error(`unclosed environment "'defn(w_layervar(env))`" in layer 'w_counter_arabic(`w_layer'), defn(w_layervar(loc)))')')
define(`w_check_env',
`w_sublayer(`w_sublayer(`w_check_env1')w_check_env1')w_check_env1')

define(`w_push_env', `pushdef(`@w_begin_$1',)pushdef(`@w_end_$1',)')
define(`w_pop_env', `popdef(`@w_begin_$1')popdef(`@w_end_$1')')

define(`w_make_param_shifter',
`ifelse(`$1',0,``$'@',``shift('w_make_param_shifter(decr(`$1'))`)'')')
define(`w_derive_env', `w_define_env(`$1',
`$4`'w_beg(`$2','w_make_param_shifter(`$3')`)`'$5',
`$6`'w_end(`$2','w_make_param_shifter(`$3')`)`'$7')')
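
A minimal sketch of defining and using an environment, assuming GNU m4. The environment name note, its begin and end texts, and the values of w_real_file and w_line_base are all placeholders for the example. It shows that the argument given to w_beg is saved and handed to the end definition as well.

```m4
divert(-1)
# Placeholder values, only consulted when m4 reads standard input.
define(`w_real_file', `example.stx')
define(`w_line_base', 0)
define(`w_nl',`
')
define(`w_stdin_p', `ifelse(__file__,stdin,t,__file__,-,t)')
define(`w_current_location',
`ifelse(w_stdin_p,t,w_real_file,__file__):'dnl
`ifelse(w_stdin_p,t,`eval(__line__ - w_line_base)',__line__)')
define(`w_warning',
`errprint(ifelse(`$2',,`w_current_location',``$2'')`: stx2any: $1'w_nl)')
define(`w_error', `w_warning($@)m4exit(1)')
define(`w_combine', ``$@'')
define(`w_newcounter',
`pushdef(`@w_counter_$1',0)'dnl
`pushdef(`@w_refcounter_$1',`$2')')
define(`w_setcounter',
`define(`@w_counter_$1',`$2')'dnl
`ifelse(defn(`@w_refcounter_$1'),,,
   `w_setcounter(defn(`@w_refcounter_$1'),0)')')
define(`w_getcounter', `defn(`@w_counter_$1')')
define(`w_stepcounter',
`w_setcounter(`$1',ifelse(`$2',,`incr(',`eval($2+')w_getcounter(`$1')))')
define(`w_counter_arabic', `w_getcounter(`$1')')
w_newcounter(`w_layer')
define(`w_layervar', ``w_layer_'w_getcounter(`w_layer')`_$1'')
define(`w_sublayer',
`w_stepcounter(`w_layer')$1`'w_stepcounter(`w_layer',-1)')
define(`w_define_env',
   `define(`@w_begin_$1', `$2')define(`@w_end_$1', `$3')')
define(`w_ifdef_env', `ifdef(`@w_begin_$1', `$2', `$3')')
define(`w_beg',
`w_ifdef_env(`$1',, `w_error(`unknown environment "$1"')')'dnl
`pushdef(w_layervar(env), `$1')'dnl
`pushdef(w_layervar(params), w_combine(shift($@)))'dnl
`pushdef(w_layervar(loc), w_current_location)'dnl
`w_sublayer(`indir(`@w_begin_$1',shift($@))')')
define(`w_end',
`ifdef(w_layervar(env),,`w_error(`environment stack empty')')'dnl
`ifelse(`$1',,,`$1',defn(w_layervar(env)),,
   `w_warning("defn(w_layervar(env))`" begins here...', defn(w_layervar(loc)))'
   `w_error(`environment "'defn(w_layervar(env))`" closed by "$1" in layer 'w_counter_arabic(`w_layer'))')'dnl
`w_sublayer(`indir(`@w_end_''defn(w_layervar(env))`,'
   defn(w_layervar(params))`)')'dnl
`popdef(w_layervar(loc))popdef(w_layervar(env))popdef(w_layervar(params))')
# Hypothetical environment: both texts refer to the first argument.
w_define_env(`note', `[note for $1]', `[end of note for $1]')
divert(0)dnl
w_beg(`note', `Alice')
Remember the milk.
w_end(`note')
```

The result is "[note for Alice]", the enclosed line unchanged, and "[end of note for Alice]". Closing with a different name, for example w_end(`quote'), would instead report where the open environment began and exit with an error.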

Indentation system (à la Python)

The indentation system forms the basis of the block system, because indentation determines the nesting of various elements. Actually, the indents are at least partially virtual. If an element takes a specific indentation, it means that that element wants anything with a greater indentation to be inside it, and with less or equal indentation, outside.

All the indentation system does is translate an indentation level into zero or more w_dedents, possibly followed by a w_indent. The work of translating these into element openings and closings is the job of the stx2any low-level and common markup facilities. We dedent until we find an enclosing or equal indentation level; then, if we have an enclosing level, we indent onto the requested level.

The indentation level consists of two parts: an indent column and a sub-character level. Sub-character levels are needed because some constructs may need to open many blocks but only have one sensible column to mark them at. Besides, some constructs (like body text) are outside some others (like lists) even if they begin in the same column.

define(`w_newindent',
`ifelse(`$2',,`w_new_indents(`$1',0)',
   `w_new_indents(`$1',`$2')')')
define(`w_new_indents',
`w_compare_indent(`$1', `$2', w_dequote(defn(`@w_indstack')),
   `pushdef(`@w_indstack',`$1,$2')w_indent`'',
   `popdef(`@w_indstack')w_dedent`'w_new_indents(`$1',`$2')',)')
define(`w_compare_indent',
`ifelse(eval(`$1>$3'),1,`$5',eval(`$1<$3'),1,`$6',
   eval(`$2>$4'),1,`$5',eval(`$2<$4'),1,`$6',`$7')')
define(`@w_indstack',`0,0')
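
To make the translation concrete, the sketch below feeds a few indentation levels through w_newindent. The definitions of w_indent and w_dedent are stand-ins that merely print markers; the real ones live outside this file.

```m4
divert(-1)
define(`w_dequote', `$1')
define(`w_newindent',
`ifelse(`$2',,`w_new_indents(`$1',0)',
   `w_new_indents(`$1',`$2')')')
define(`w_new_indents',
`w_compare_indent(`$1', `$2', w_dequote(defn(`@w_indstack')),
   `pushdef(`@w_indstack',`$1,$2')w_indent`'',
   `popdef(`@w_indstack')w_dedent`'w_new_indents(`$1',`$2')',)')
define(`w_compare_indent',
`ifelse(eval(`$1>$3'),1,`$5',eval(`$1<$3'),1,`$6',
   eval(`$2>$4'),1,`$5',eval(`$2<$4'),1,`$6',`$7')')
define(`@w_indstack',`0,0')
# Stand-in markers, assumed for this example only.
define(`w_indent', `<indent>')
define(`w_dedent', `<dedent>')
divert(0)dnl
w_newindent(4)
w_newindent(8)
w_newindent(4)
w_newindent(4, 1)
w_newindent(0)
```

The five calls produce, line by line: one indent, one indent, one dedent (back to the already open column 4), one indent (same column, higher sub-character level), and finally two dedents that return to the base level.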

List helpers

These are facilities to iterate through lists (possibly many times). They are used by some table environments to track column types.

define(`w_setup_list',
`pushdef(`@w_list_len_$1', w_listlen(shift($@)))'dnl
`pushdef(`@w_list_save_$1', w_combine(shift($@)))'dnl
`pushdef(`@w_list_$1', defn(`@w_list_save_$1'))')
define(`w_unsetup_list',
`popdef(`@w_list_$1')popdef(`@w_list_save_$1')popdef(`@w_list_len_$1')')
define(`w_reinit_list', `define(`@w_list_$1', defn(`@w_list_save_$1'))')
define(`w_next_in_list', 
`w_pickn(1,w_dequote(defn(`@w_list_$1')))`''dnl
`define(`@w_list_$1',w_combine(shift(w_dequote(defn(`@w_list_$1')))))')
define(`w_length_list', `defn(`@w_list_len_$1')')
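
A sketch of iterating over a list twice. The list name cols and its elements are invented for the example, in the spirit of the column types mentioned above.

```m4
divert(-1)
define(`w_combine', ``$@'')
define(`w_dequote', `$1')
define(`w_pickn',
`ifelse(`$1',1,`$2',`w_pickn(decr(`$1'),shift(shift($@)))')')
define(`w_listlen',
`ifelse(`$*',,0,`incr(w_listlen(shift($@)))')')
define(`w_setup_list',
`pushdef(`@w_list_len_$1', w_listlen(shift($@)))'dnl
`pushdef(`@w_list_save_$1', w_combine(shift($@)))'dnl
`pushdef(`@w_list_$1', defn(`@w_list_save_$1'))')
define(`w_reinit_list', `define(`@w_list_$1', defn(`@w_list_save_$1'))')
define(`w_next_in_list',
`w_pickn(1,w_dequote(defn(`@w_list_$1')))`''dnl
`define(`@w_list_$1',w_combine(shift(w_dequote(defn(`@w_list_$1')))))')
define(`w_length_list', `defn(`@w_list_len_$1')')
# Hypothetical list of column types.
w_setup_list(`cols', `left', `right', `center')
divert(0)dnl
w_next_in_list(`cols') w_next_in_list(`cols') w_next_in_list(`cols')
w_reinit_list(`cols')w_next_in_list(`cols')
w_length_list(`cols')
```

This prints "left right center", then "left" again after the list has been reinitialised, then the length 3. Once the list is exhausted, w_next_in_list expands to nothing until w_reinit_list is called.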

Title and other metadata

These are not related to any specific markup and are thus defined here.

define(`w_set_or_get',
`ifelse(`$2',,`defn(`$1')',`define(`$1', `$2')')')
define(`w_doc_id',)
define(`w_documentclass',)
define(`w_title', `w_set_or_get(`@w_title', `$1')')
define(`w_gettitle', `w_title')
define(`w_author', `w_set_or_get(`@w_author', `$1')')
define(`w_date', `w_set_or_get(`@w_date', `$1')')
define(`w_getdate', `w_date')
define(`w_language',
`define(`@w_language', `$1')'dnl
`define(`@w_iso_language',
   ifelse(`$2',,`substr(`$1',0,2)',`$2'))')

define(`w_char_coding',
`define(`@w_char_coding', `$1')'dnl
`define(`w_long_charset_name',
   ifelse(`$2',,`w_long_charset_name_for(`$1')',`$2'))')
define(`w_long_charset_name_for',
`ifelse(`$1',latin9,ISO-8859-15,
	`$1',ascii,US-ASCII,
	`$1',latin1,ISO-8859-1,
	utf-8)')
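
A brief sketch of setting and reading the metadata. The title, language and coding values are arbitrary examples.

```m4
divert(-1)
define(`w_set_or_get',
`ifelse(`$2',,`defn(`$1')',`define(`$1', `$2')')')
define(`w_title', `w_set_or_get(`@w_title', `$1')')
define(`w_gettitle', `w_title')
define(`w_language',
`define(`@w_language', `$1')'dnl
`define(`@w_iso_language',
   ifelse(`$2',,`substr(`$1',0,2)',`$2'))')
define(`w_char_coding',
`define(`@w_char_coding', `$1')'dnl
`define(`w_long_charset_name',
   ifelse(`$2',,`w_long_charset_name_for(`$1')',`$2'))')
define(`w_long_charset_name_for',
`ifelse(`$1',latin9,ISO-8859-15,
   `$1',ascii,US-ASCII,
   `$1',latin1,ISO-8859-1,
   utf-8)')
# Arbitrary example values.
w_title(`Field notes')
w_language(`english')
w_char_coding(`latin1')
divert(0)dnl
Title: w_gettitle
Language: defn(`@w_language') / defn(`@w_iso_language')
Charset: w_long_charset_name
```

The output is "Title: Field notes", "Language: english / en" and "Charset: ISO-8859-1". Two details follow from the definitions: the ISO language code defaults to the first two letters of the language name, and any coding not listed in w_long_charset_name_for gets the long name utf-8 unless a second argument is given to w_char_coding.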