Unspecial forms

case-lambda, aif, with-condition-restarts, load-time-value... Lispers like to talk about special forms.

We rarely talk about the others. Most code, even in Lisp, consists almost entirely of a few anonymous constructs that aren't special forms. They're anonymous because they're too abbreviated to have names, and they're abbreviated because they're so common. These are the forms that are not special: variable reference, function call, and self-evaluating forms.

Their abbreviations are in structure, not syntax, but abbreviations can be done at the syntactic level too. And if you design syntax, keep the unspecial forms in mind. More than anything else, they need to be terse.

It's possible, but painful, to write code without the abbreviations. (I know someone else has written about this before, but I can't find the article.) Here's what the ordinary factorial might look like, in a Lisp-1, with the abbreviations replaced by special forms:

(defun factorial (n)
  (if (call (var <=) (var n) (quote 1))
    (quote 1)
    (call (var *)
          (call (var factorial) (call (var -) (var n) (quote 1)))
          (var n))))

In these 16 forms, there are seven variable references, four calls, three literals, and two special forms (if and defun). There are also a few things that aren't forms: the binding occurrences of factorial and n, and the argument list. There are a few other nonform constructs not seen here: keywords for keyword arguments; docstrings; the structural lists that occur in forms like let and do.

I was curious about their frequencies, so I wrote a form counter to collect more data. This would be a perfect task for a code-walker, except that it has to understand the structure of every macro directly, rather than expanding it. I quickly got tired of adding cases to my form counter, so I only counted about a hundred lines. Bear in mind that this is a Lisp-1, so about half of the variable references are to functions (mostly from the standard library, of course).
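
To make the difficulty concrete, here is a rough Python sketch of such a counter. It is not the counter used for the table; the nested-list encoding (strings for symbols, ints for literals, lists for compound forms) and the names count, count_defun, count_if and SPECIAL are all assumptions of the sketch. The point is that each special form or macro needs its own handler that knows which of its parts are forms and which are bindings or structure.

```python
from collections import Counter


def count_defun(rest, counts):
    # (defun name (params...) body...)
    name, params, *body = rest
    counts["binding occurrence"] += 1 + len(params)  # the name plus each parameter
    counts["argument list"] += 1
    for form in body:
        count(form, counts)


def count_if(rest, counts):
    # Every part of an if is an ordinary form.
    for form in rest:
        count(form, counts)


# One handler per special form or macro the counter understands.
SPECIAL = {"defun": count_defun, "if": count_if}


def count(form, counts):
    """Tally the role of every node in an abbreviated form."""
    if isinstance(form, int):
        counts["integer literal"] += 1
    elif isinstance(form, str):
        counts["variable reference"] += 1
    elif isinstance(form[0], str) and form[0] in SPECIAL:
        counts["special form"] += 1
        SPECIAL[form[0]](form[1:], counts)
    else:
        # Unknown heads are assumed to be function calls. In a Lisp-1
        # the operator is counted as a variable reference too.
        counts["function call"] += 1
        for sub in form:
            count(sub, counts)
    return counts


factorial = ["defun", "factorial", ["n"],
             ["if", ["<=", "n", 1],
              1,
              ["*", ["factorial", ["-", "n", 1]], "n"]]]

for role, n in count(factorial, Counter()).most_common():
    print(f"{role:20} {n}")
```

On factorial this tallies seven variable references, four calls, three integer literals, two special forms, two binding occurrences and one argument list. Every additional macro needs another entry in SPECIAL, which is where the tedium comes from.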

Role                             Count  Frequency
Variable reference                 182        40%
Function call                       78        17%
Binding occurrence                  65        14%
Special form (including macros)     51        11%
Argument list                       24         5%
Structural list                     23         5%
Integer literal                     18         4%
Keyword                             10         2%
String or character literal          5         1%
Docstring                            2       0.4%
Total forms                        334        73%
Total                              458       100%

Variable bindings and references account for over half of all nodes (54%). About 30% of the bindings are top-level definitions; if the remaining local bindings are each referenced once on average, then roughly 20% of all nodes are spent naming local variables. This is a powerful argument for point-free style.
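
The arithmetic behind those two figures can be checked from the table. This is a small Python sketch; the counts are copied from the table, the 30% top-level share is the estimate given in the text, and the variable names are made up for illustration.

```python
# Node counts by role, copied from the table.
counts = {
    "variable reference": 182,
    "function call": 78,
    "binding occurrence": 65,
    "special form": 51,
    "argument list": 24,
    "structural list": 23,
    "integer literal": 18,
    "keyword": 10,
    "string or character literal": 5,
    "docstring": 2,
}
total = sum(counts.values())

# Share of nodes that are either a binding or a reference.
naming = counts["variable reference"] + counts["binding occurrence"]
naming_share = naming / total

# Assumption from the text: 30% of bindings are top-level, so 70% are local,
# and each local binding is referenced once on average (two nodes per variable).
local_bindings = 0.70 * counts["binding occurrence"]
local_share = 2 * local_bindings / total

print(f"bindings + references: {naming} of {total} nodes ({naming_share:.0%})")
print(f"naming local variables: about {local_share:.0%} of all nodes")
```

The first figure comes out at about 54% and the second at about 20%, matching the estimates.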

Both structural lists and special forms are less abundant than I expected. It's tempting to try to shrink programs by abbreviating some common special forms - lambda, define, let, and perhaps partial application - but it can't help much, unless programming style changes significantly in response.

Data can be so disappointing sometimes.
