gather data for linking constructs

This is a (scary) preprocessor that gathers link data from documents. Link abbreviation processing has three phases. This is the first one, run before the documents are actually processed: we gather just enough information for the third phase and ditch everything else.

The link data is supposed to be in blocks. Outside blocks, all we have to gather is labels. Here we limit the scope to material outside blocks:

/^\[[A-Za-z0-9][A-Za-z0-9]*\] /,/^$/!{
/./!d
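To see what the negated range address lets through, it can be tried on its own. This is only a sketch: outside_blocks is a made-up helper name, the sample text is invented, and -n with p stands in for the real commands inside the braces.

```shell
# Hypothetical helper: print only the lines that the negated range selects,
# i.e. everything that is NOT between a "[label] " line and the next empty line.
outside_blocks() {
  sed -n '/^\[[A-Za-z0-9][A-Za-z0-9]*\] /,/^$/!p'
}

# Invented sample: prose, one link data block, more prose.
printf '%s\n' 'Some prose.' '' '[home] http://example.org/' '  continued' '' 'More prose.' |
  outside_blocks
```

The block, including the empty line that ends it, is held back. The empty line before the block still gets through here; in the script itself it is removed by the /./!d command.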

I don't care to process more than one label per line. Anybody who puts two explicit labels on the same line can't have a good reason to do so. We read text in whole paragraphs so that blocklike-looking constructs in the middle of a paragraph are skipped.

: gulp
s#^.*\[+\([^]]*\)+\].*$#w_make_autolabel(`\1')dnl#p
s#^!!* *\(.*\)$#w_make_autolabel(`\1')dnl#p
$!{
N
/\n$/!b gulp
}
d
}
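The two label substitutions above can be checked in isolation. The sketch below leaves out the paragraph loop and the surrounding range, and runs sed with -n so that only the p flags print. gather_labels and the sample lines are made up for the illustration, and the script goes through a temporary file only to keep the m4 quotes away from the shell.

```shell
# Hypothetical helper: apply only the two label substitutions to stdin.
gather_labels() {
  script=$(mktemp)
  # Quoted heredoc delimiter: backticks and apostrophes reach sed untouched.
  cat >"$script" <<'EOF'
s#^.*\[+\([^]]*\)+\].*$#w_make_autolabel(`\1')dnl#p
s#^!!* *\(.*\)$#w_make_autolabel(`\1')dnl#p
EOF
  sed -n -f "$script"
  rm -f "$script"
}

# Invented sample: an explicit label, a heading line, and plain text.
printf '%s\n' 'See [+intro+] for details.' '!! Getting started' 'plain text' |
  gather_labels
```

The first line should yield a w_make_autolabel call for intro, the second one for Getting started, and the third nothing at all.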

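End-of-block processing: an empty line ends the block, so we fetch the pending line from hold space and close its definition.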

/^$/{
x
s#$#`'')dnl#p
d
}

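From here on, we are within a link data block. While inside one, we keep the previous line in hold space. Empty hold space means no line. Apostrophes are replaced first, since a bare one would close the m4 quotes that the data gets wrapped in.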

s#'#`'w_apo`'#g

The case that we have a new datum: turn its label into the start of a definition, then see whether there is a previous line to finish.

/^\[\([A-Za-z0-9][A-Za-z0-9]*\)\] /{
s##define(`@w_linkdata_of_\1',`#
x
/./!d
s#$#`'')dnl#p
d
}

The case that we don't: the line continues the current datum, so just store the new one and print the old one. (Leading whitespace is stripped to allow indenting line continuations on the same level as the link data marker.)

s#^[ 	]*##
x
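The block-handling commands above can be tried on a single block. This is a sketch under a few assumptions: gather_linkdata is a made-up name, the input is invented and consists of one block only, and [[:blank:]] stands in for the space-and-tab bracket expression. As in the script, sed runs without -n, so the final x prints the previous line.

```shell
# Hypothetical helper: run only the in-block commands on stdin.
gather_linkdata() {
  script=$(mktemp)
  cat >"$script" <<'EOF'
/^$/{
x
s#$#`'')dnl#p
d
}
s#'#`'w_apo`'#g
/^\[\([A-Za-z0-9][A-Za-z0-9]*\)\] /{
s##define(`@w_linkdata_of_\1',`#
x
/./!d
s#$#`'')dnl#p
d
}
s#^[[:blank:]]*##
x
EOF
  sed -f "$script"
  rm -f "$script"
}

# Invented sample: one datum, one indented continuation line, one empty line.
printf '%s\n' '[home] http://example.org/' '  Home page' '' | gather_linkdata
# Prints:
#   define(`@w_linkdata_of_home',`http://example.org/
#   Home page`'')dnl
```

The continuation line ends up inside the same definition, with the line break kept, and the definition is closed only when the empty line arrives.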