Lisp2k language reference

If you are impatient, you might want to go directly to the Short reference.

The command line

The Lisp2k command line looks like so:

windowsill [ -debug ] progfile

(The name of the Lisp2k interpreter is windowsill. There is currently no compiler available.) This executes the program in progfile. If you give the -debug flag to windowsill, the evaluator will dump out tracing information as it evaluates. You will probably find this valuable if you do anything non-trivial in Lisp2k.

Program structure

In Lisp2k, a program is a sequence (list) of instructions. The sequence might contain embedded sub-sequences. Effectively, this forms a tree-like structure which represents the program.

The tree structure is reflected in the source file by using white space. The toplevel sequence is written in the source file, line by line, from the beginning of the file downwards; you can make embedded sequences by inducing extra indentation. For example, the following source code:

x
y
 z
 b
w

corresponds to the following tree structure: x,y,(z,b),w. For a more complicated example:

a
    b
  c
  d
    e
 f
g
   h

corresponds to the following tree structure: a,(b),(c,d,(e)),(f),g,(h). As you can see, a sequence with one item is a distinct object from an item by itself. By the way, the amount of indentation does not matter; the structure is decided by the relations of the indentations.

Data that are not sequences (lists) are called atoms. The Lisp2k model for atoms is particularly simple, there being only one type of them, the symbol. A symbol is like a name: it has textual outward appearance and identity. You can (and must) use symbols in any way you like.

There is one more shorthand: you can write a flat list on a single line, items separated by whitespace. Writing multiple items on a single line always induces a list, so if you also use indentation, you will end with two levels of embedding. For example:

ghee
 amma omma
 doh

is equivalent to the tree structure ghee,((amma,omma),doh).

As in many Lisp-like languages, evaluation in Lisp2k proceeds by executing instructions in the top-level sequence. These instructions, in turn, might cause the evaluation of sub-sequences, etc. However, there are some big differences with Lisp:

In Lisp, a function application always corresponds to a list. Not so in Lisp2k: function arguments are simply written after the functions, and the functions take care of snatching the proper amount of them. This is like having an implicit (progn) without having to bracket procedure calls separately.
Consequently, there is no notation of function position / value position in Lisp2k. Even more consequently, you always have to activete functions by hand except the primitive special forms. This is like the requirement to use (funcall) on parameter-passed functions, just that there is only one namespace so all names are taken to be variables by default.
Lisp2k handles quoting rather differently from Lisp. In particular, in Lisp, a sublist is by default a subexpression, whereas in Lisp2k, it is a list literal. As a consequence, constructions that automatically evaluate are few.

Evaluation

The execution of the sequence of instructions is called evaluation. In Lisp2k, any operation might produce zero, one or multiple results. These results are gathered in a list as the evaluation proceeds. Often this result list is just thrown away; at other times, it is re-inspected, particularly by evaluating it again.

In addition to the result of evaluation, operations might have side effects. The only kinds of side effects in Lisp2k are I/O and changing symbol bindings. More about those in the following section.

Primitives

Every Lisp2k program consists of the Lisp2k primitives, all of which are special forms. Some primitives require some specific type / kind of parameters and will terminate the program if not given them. At a later stage, some polymorphism might be introduced to the basic operations, as it becomes clearer what kind of primitives would benefit the users most.

Lisp2k supports many programming styles, which are conveniently defined by the primitives used in each. The sets of operations are, by programming style:

minimal Lisp2k: for those who don't like permanent variables. It results in very functional-style code. Allowed primitives are x and apply.
standard Lisp2k: emphasises traditional and practical Lisp style. Allowed primitives are d, q, c and x.
abstract Lisp2k: augments standard Lisp2k with ipperopp (for special variables) and apply (for lexical variables).
interactive Lisp2k: for those who like to converse with the user (but don't be too verbose!). Augments any style of Lisp2k with pr, inbüt and c.

The primitives are as follows:

d

The most lisp-ish of them all, the d primitive corresponds to a (setq) / (defun). It takes two arguments (without evaluating either, thus more akin to (defun)), and matches the first one (the pattern) to the second one (the value) to bind values thus:

if the pattern is a symbol, then that symbol is bound to the value, be it list or symbol;
if the pattern is a list and the value is a list, then the symbols in the pattern list are bound to the corresponding items of the value list, and the last symbol in the pattern list is bound to the rest of the value list;
if the pattern list contains sublists, they are handled recursively (so the value must also contain sublists in the corresponding positions);
other combinations don't match and are errors.

By the way, the result of evaluating a symbol is its value. That is, if the Lisp2k evaluator encounters a symbol, it produces its value in the output. Symbol replacement always yields a single value (remember, operations in Lisp2k might yield any number of results.) Symbols that have no assigned value are replaced by a special symbol, nil.

q

This is the quote operation. The q primitive takes one argument and produces a singleton sequence[1] from it as a result. This is a kind of quoting, because sequences, in turn, evaluate to their contents.

[1] a sequence with only one element

That is, if the evaluator processes, for example, the list q,a, then the result is (a) which, if re-evaluated, expands to a. Because sequences evaluate to their contents and might be of any length, evaluating a sequence might yield any number of results. (The sequences formed by q are always singleton, however, so you can count on them returning a single result when evaluated.)

c

This is a concatenation operation which you can use for building values in Lisp2k. The c primitive takes two arguments, which are promoted to sequences if they are not already (a symbol becomes a singleton sequence), and produces a single value, a sequence that contains all the items in those two sequences.

For example, if you concatenate the three-item sequence a,(b,c),d and the four-item sequence g,h,i,j, you get the seven-item sequence a,(b,c),d,g,h,i,j. If you concatenate the symbol a and the empty sequence (), you get the one-item sequence containing a.

x

While the former three primitives are mostly for handling data, this one is for controlling the flow of the program. Its name comes from execute but is really the equivalent of Lisp's (eval). Unlike Lisp, its (single) argument must always be a sequence. (Symbol arguments to x might provide an interface to first-class stoppinguations in future versions.)

What x does is a little bit difficult to explain. Basically, x takes the result of the evaluation of its argument sequence (remember evaluation always yields a sequence of results), and re-evaluates it. This technique, used with different combinations of other primitives, can be used to form code at runtime — the famous subject of pride of Lisp aficionados.

For example, if you evaluate the structure x,((q,b),c,((d,e))), x double-evaluates the sequence (q,b),c,((d,e)), producing on the first run q,b,nil,(d,e),[2] and on the second run, (b),nil,d,e. Note how the q instruction got evaluated on the second go.

[2] actually c is the concatenation operation, so this would produce a runtime error because of the missing operand

apply

The apply primitive is the combined data handling / flow control tool that makes a perfect complement for the x primitive.[3] It takes three arguments, namely the value, the pattern and the template.

[3] It is interesting to note that (eval) and (apply) make a pretty complete language by themselves if we augment (apply) a little, as we are doing here.

First, the value is matched against the pattern using the same rules as for the d primitive. But the resulting bindings are not put in the global symbol table — instead, they are used only to fill in the template, where occurrences of symbols bound by the match are substituted with the symbol's value. The resulting template is returned (apply always returns a single value).

For example, if you apply the value xebboo with the pattern X to the template X,(X,X), you get the result xebboo,(xebboo,xebboo). If you apply the value 1,2,3 with the pattern X,Y to the template X,Y,Y, you get the result 1,(2,3),(2,3).

ipperopp

The ipperopp instruction is so called because it has no counterpart in any language I know. Lisp's (save-excursion) feature is probably one of the closest things we have. The ipperopp instruction takes no arguments and yields no results but stows the current symbol table (symbol assignments) away for later restoration. What makes the ipperopp instruction somewhat peculiar is the fact that it has dynamic scope: it is not quite clear when the old symbol table will be restored.[4] Basically, ipperopp restores the old symbols upon finishing the evaluation of a sequence, but because the evaluator sometimes combines sequences (especially for the x instruction), it's not so obvious when this happens. Well, anyway.

[4] Actually, I'm very happy if someone bothers to shed light on this.

ipperopp provides the means for structural programming in two ways: first, it allows programs to use local variables, which makes recursion safe and easy; second, it allows programs to build different kinds of execution environments which, in turn, can be used as a sophisticated means of parameter passing.

The input/output facilities are:

pr: For printing values. Printing a sequence results in a line with all the sequence's elements separated by spaces, and is well suited for messages. Printing a symbol causes no extra whitespace whatsoever, and is well suited for quasi-graphical display and such stuff. This is a good example of what might be achieved with nice overloading of primitives.
inbüt: The input primitive is called inbüt, so you'd better have a compose character key on your keyboard… it allows you to butt in in the peaceful pastimes of the poor user. Input is a relatively high-level facility akin to Lisp's (read), and always produces full-fledged values. Indentation is handled similarly to source files, but inbüt only reads one line of text per invocation. inbüt may yield any number of results.
debug: This command is intended (not surprisingly) for debugging purposes. It takes no arguments, produces no results, and dumps the definitions in the current symbol table as a side effect.

Lexical issues

Unlike most programming languages, Lisp2k treats empty lines as values, namely symbols. You can e.g. print an empty line, which causes a newline in the output, by writing pr followed by an empty line. Or you can define a value for empty lines.

Comments are supposed to work by putting a semicolon (;) somewhere. Comments extend to the end of line, and are treated as (non-significant) whitespace, except that a comment near the left edge of the screen affects Lisp2k's idea of how much indentation the line has.

Short reference

Program structure and whitespace:

; this is comment (but it might not work)
  ; this is too, but it might cause a(n empty) subsequence
x			; <- that instruction is on the top level
	d foo bar	; <- this is double-embedded seq: once for indent,
			; <- second time for having multiple atoms
Heppa			; <- this is a symbol
/Heppa			; <- this is a list (a slash is a separate symbol)
   Heppa		; <- this is also a list (because of indentation)
 Heppa			; <- this is another list on the same level
 Heppa			; <- this is the second item of that list
   Heppa		; <- this is an embedded list

; The previous line is a symbol, the empty line

Evaluation:
form expands to side effects

sym nil or value none

(list of a b c) list of a b c none

d foo bar nothing sets the value of foo to bar

d foo (quux) nothing sets the value of foo to (quux)

d (h t) (a b c) nothing sets h to a, t to (b c)

d (x (y z)) ((a) (b)) nothing sets x to (a), y to b and z to ()

q zoq (zoq) none

c (a b) (c e) (a b c e) none

c a (b t) (a b t) none

c u ö (u ö) none

c () foo (foo) none

x (foo) foo => (quux) => quux none unless induced by evaluation

apply (1 2 3) (h t) h 1 none

apply zuc z (z (z z) z) (zuc (zuc zuc) zuc) none

apply ((y)) ((X Y) Z) Z () none

ipperopp nothing creates a local variable enrironment

pr foo nothing prints foo

pr (foo bar) nothing prints foo bar and a newline

inbüt anything reads input from user

debug nothing dumps symbol table contents

form	expands to	side effects

`sym`	`nil` or `value`	none
`(list of a b c)`	`list of a b c`	none

`d foo bar`	nothing	sets the value of `foo` to `bar`
`d foo (quux)`	nothing	sets the value of `foo` to `(quux)`
`d (h t) (a b c)`	nothing	sets `h` to `a`, `t` to `(b c)`
`d (x (y z)) ((a) (b))`	nothing	sets `x` to `(a)`, `y` to `b` and `z` to `()`
`q zoq`	`(zoq)`	none
`c (a b) (c e)`	`(a b c e)`	none
`c a (b t)`	`(a b t)`	none
`c u ö`	`(u ö)`	none
`c () foo`	`(foo)`	none

`x (foo)`	`foo` => `(quux)` => `quux`	none unless induced by evaluation

`apply (1 2 3) (h t) h`	`1`	none
`apply zuc z (z (z z) z)`	`(zuc (zuc zuc) zuc)`	none
`apply ((y)) ((X Y) Z) Z`	`()`	none
`ipperopp`	nothing	creates a local variable enrironment

`pr foo`	nothing	prints foo
`pr (foo bar)`	nothing	prints foo bar and a newline
`inbüt`	anything	reads input from user

`debug`	nothing	dumps symbol table contents