On Lisp

Paul Graham



Preface

This book is intended for anyone who wants to become a better Lisp programmer. It assumes some familiarity with Lisp, but not necessarily extensive programming experience. The first few chapters contain a fair amount of review. I hope that these sections will be interesting to more experienced Lisp programmers as well, because they present familiar subjects in a new light.

It's difficult to convey the essence of a programming language in one sentence, but John Foderaro has come close:

Lisp is a programmable programming language.

There is more to Lisp than this, but the ability to bend Lisp to one's will is a large part of what distinguishes a Lisp expert from a novice. As well as writing their programs down toward the language, experienced Lisp programmers build the language up toward their programs. This book teaches how to program in the bottom-up style for which Lisp is inherently well-suited.

Bottom-up Design

Bottom-up design is becoming more important as software grows in complexity. Programs today may have to meet specifications which are extremely complex, or even open-ended. Under such circumstances, the traditional top-down method sometimes breaks down. In its place there has evolved a style of programming quite different from what is currently taught in most computer science courses: a bottom-up style in which a program is written as a series of layers, each one acting as a sort of programming language for the one above. X Windows and TeX are examples of programs written in this style.

The theme of this book is twofold: that Lisp is a natural language for programs written in the bottom-up style, and that the bottom-up style is a natural way to write Lisp programs. On Lisp will thus be of interest to two classes of readers. For people interested in writing extensible programs, this book will show what you can do if you have the right language. For Lisp programmers, this book offers a practical explanation of how to use Lisp to its best advantage.

The title is intended to stress the importance of bottom-up programming in Lisp. Instead of just writing your program in Lisp, you can write your own language on Lisp, and write your program in that.

It is possible to write programs bottom-up in any language, but Lisp is the most natural vehicle for this style of programming. In Lisp, bottom-up design is not a special technique reserved for unusually large or difficult programs. Any substantial program will be written partly in this style. Lisp was meant from the start to be an extensible language. The language itself is mostly a collection of Lisp functions, no different from the ones you define yourself. What's more, Lisp functions can be expressed as lists, which are Lisp data structures. This means you can write Lisp functions which generate Lisp code.
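As a minimal sketch of that last point, the fragment below builds a Lisp expression as an ordinary list and then evaluates it. The name make-sum-form is hypothetical, chosen only for this illustration.

```lisp
;; MAKE-SUM-FORM is a hypothetical helper, not something from the book.
;; It returns a list whose first element is the symbol + and whose
;; remaining elements are the given numbers, i.e. a piece of Lisp code.
(defun make-sum-form (numbers)
  (cons '+ numbers))

;; The result is plain list data and can be taken apart like any list.
(make-sum-form '(1 2 3))          ; => (+ 1 2 3)
(first (make-sum-form '(1 2 3)))  ; => +

;; Handing the same list to EVAL runs it as a program.
(eval (make-sum-form '(1 2 3)))   ; => 6
```

The generated code is never printed and re-parsed; the list that the function returns is itself the program.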

A good Lisp programmer must know how to take advantage of this possibility. The usual way to do so is by defining a kind of operator called a macro. Mastering macros is one of the most important steps in moving from writing correct Lisp programs to writing beautiful ones. Introductory Lisp books have room for no more than a quick overview of macros: an explanation of what macros are, together with a few examples which hint at the strange and wonderful things you can do with them. Those strange and wonderful things will receive special attention here. One of the aims of this book is to collect in one place all that people have till now had to learn from experience about macros.

Understandably, introductory Lisp books do not emphasize the differences between Lisp and other languages. They have to get their message across to students who have, for the most part, been schooled to think of programs in Pascal terms. It would only confuse matters to explain that, while defun looks like a procedure definition, it is actually a program-writing program that generates code which builds a functional object and indexes it under the symbol given as the first argument.
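To make that description a little more concrete, here is a rough sketch of the effect of a defun, written out by hand. It assumes nothing beyond standard Common Lisp; the function names double and triple are invented for the example, and the exact code that defun generates varies between implementations.

```lisp
;; Roughly what (defun double (x) (* x 2)) arranges: build a function
;; object and store it under the symbol DOUBLE.
(setf (symbol-function 'double)
      #'(lambda (x) (* x 2)))

(double 4)                        ; => 8

;; DEFUN is a macro, so its expansion can be inspected directly.
;; The printed result is implementation-dependent, so none is shown.
(macroexpand-1 '(defun triple (x) (* x 3)))
```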

One of the purposes of this book is to explain what makes Lisp different from other languages. When I began, I knew that, all other things being equal, I would much rather write programs in Lisp than in C or Pascal or Fortran. I knew also that this was not merely a question of taste. But I realized that if I was actually going to claim that Lisp was in some ways a better language, I had better be prepared to explain why.

When someone asked Louis Armstrong what jazz was, he replied "If you have to ask what jazz is, you'll never know." But he did answer the question in a way: he showed people what jazz was. That's one way to explain the power of Lisp--to demonstrate techniques that would be difficult or impossible in other languages. Most books on programming--even books on Lisp programming--deal with the kinds of programs you could write in any language. On Lisp deals mostly with the kinds of programs you could only write in Lisp. Extensibility, bottom-up programming, interactive development, source code transformation, embedded languages--this is where Lisp shows to advantage.

In principle, of course, any Turing-equivalent programming language can do the same things as any other. But that kind of power is not what programming languages are about. In principle, anything you can do with a programming language you can do with a Turing machine; in practice, programming a Turing machine is not worth the trouble.

So when I say that this book is about how to do things that are impossible in other languages, I don't mean "impossible" in the mathematical sense, but in the sense that matters for programming languages. That is, if you had to write some of the programs in this book in C, you might as well do it by writing a Lisp compiler in C first. Embedding Prolog in C, for example--can you imagine the amount of work that would take? Chapter 24 shows how to do it in 180 lines of Lisp.

I hoped to do more than simply demonstrate the power of Lisp, though. I also wanted to explain why Lisp is different. This turns out to be a subtle question--too subtle to be answered with phrases like "symbolic computation." What I have learned so far, I have tried to explain as clearly as I can.

Plan of the Book

Since functions are the foundation of Lisp programs, the book begins with several chapters on functions. Chapter 2 explains what Lisp functions are and the possibilities they offer. Chapter 3 then discusses the advantages of functional programming, the dominant style in Lisp programs. Chapter 4 shows how to use functions to extend Lisp. Then Chapter 5 suggests the new kinds of abstractions we can define with functions that return other functions. Finally, Chapter 6 shows how to use functions in place of traditional data structures.

The remainder of the book deals more with macros than functions. Macros receive more attention partly because there is more to say about them, and partly because they have not till now been adequately described in print. Chapters 7--10 form a complete tutorial on macro technique. By the end of it you will know most of what an experienced Lisp programmer knows about macros: how they work; how to define, test, and debug them; when to use macros and when not; the major types of macros; how to write programs which generate macro expansions; how macro style differs from Lisp style in general; and how to detect and cure each of the unique problems that afflict macros.

Following this tutorial, Chapters 11--18 show some of the powerful abstractions you can build with macros. Chapter 11 shows how to write the classic macros--those which create context, or implement loops or conditionals. Chapter 12 explains the role of macros in operations on generalized variables. Chapter 13 shows how macros can make programs run faster by shifting computation to compile-time. Chapter 14 introduces anaphoric macros, which allow you to use pronouns in your programs. Chapter 15 shows how macros provide a more convenient interface to the function-builders defined in Chapter 5. Chapter 16 shows how to use macro-defining macros to make Lisp write your programs for you. Chapter 17 discusses read-macros, and Chapter 18, macros for destructuring.

With Chapter 19 begins the fourth part of the book, devoted to embedded languages. Chapter 19 introduces the subject by showing the same program, a program to answer queries on a database, implemented first by an interpreter and then as a true embedded language. Chapter 20 shows how to introduce into Common Lisp programs the notion of a continuation, an object representing the remainder of a computation. Continuations are a very powerful tool, and can be used to implement both multiple processes and nondeterministic choice. Embedding these control structures in Lisp is discussed in Chapters 21 and 22, respectively. Nondeterminism, which allows you to write programs as if they had foresight, sounds like an abstraction of unusual power. Chapters 23 and 24 present two embedded languages which show that nondeterminism lives up to its promise: a complete ATN parser and an embedded Prolog which combined total about 200 lines of code.

The fact that these programs are short means nothing in itself. If you resorted to writing incomprehensible code, there's no telling what you could do in 200 lines. The point is, these programs are not short because they depend on programming tricks, but because they're written using Lisp the way it's meant to be used. The point of Chapters 23 and 24 is not how to implement ATNs in one page of code or Prolog in two, but to show that these programs, when given their most natural Lisp implementation, simply are that short. The embedded languages in the latter chapters provide a proof by example of the twin points with which I began: that Lisp is a natural language for bottom-up design, and that bottom-up design is a natural way to use Lisp.

The book concludes with a discussion of object-oriented programming, and particularly CLOS, the Common Lisp Object System. By saving this topic till last, we see more clearly the way in which object-oriented programming is an extension of ideas already present in Lisp. It is one of the many abstractions that can be built on Lisp.

A chapter's worth of notes begins on page 387. The notes contain references, additional or alternative code, or descriptions of aspects of Lisp not directly related to the point at hand. Notes are indicated by a small circle in the outside margin, like this. There is also an Appendix (page 381) on packages.

Just as a tour of New York could be a tour of most of the world's cultures, a study of Lisp as the programmable programming language draws in most of Lisp technique. Most of the techniques described here are generally known in the Lisp community, but many have not till now been written down anywhere. And some issues, such as the proper role of macros or the nature of variable capture, are only vaguely understood even by many experienced Lisp programmers.

Examples

Lisp is a family of languages. Since Common Lisp promises to remain a widely used dialect, most of the examples in this book are in Common Lisp. The language was originally defined in 1984 by the publication of Guy Steele's Common Lisp: the Language (CLTL1). This definition was superseded in 1990 by the publication of the second edition (CLTL2), which will in turn yield place to the forthcoming ANSI standard.

This book contains hundreds of examples, ranging from single expressions to a working Prolog implementation. The code in this book has, wherever possible, been written to work in any version of Common Lisp. Those few examples which need features not found in CLTL1 implementations are explicitly identified in the text. Later chapters contain some examples in Scheme. These too are clearly identified.

The code is available by anonymous FTP from endor.harvard.edu. Questions and comments can be sent to onlisp@das.harvard.edu.

Acknowledgements

While writing this book I have been particularly thankful for the help of Robert Morris. I went to him constantly for advice and was always glad I did. Several of the examples in this book are derived from code he originally wrote, including the version of for on page 127, the version of aand on page 191, match on page 239, the breadth-first true-choose on page 304, and the Prolog interpreter in Section 24.2. In fact, the whole book reflects (sometimes, indeed, transcribes) conversations I've had with Robert during the past seven years. (Thanks, rtm!)

I would also like to give special thanks to David Moon, who read large parts of the manuscript. Part of the book was completely rewritten at his suggestion, and the example of variable capture on page 119 is one that he provided.

I was fortunate to have David Touretzky and Skona Brittain as the technical reviewers for the book. Several sections were added or rewritten at their suggestion. The alternative true nondeterministic choice operator on page 397 is based on a suggestion by David Touretzky.

Several other people consented to read all or part of the manuscript, including Tom Cheatham, Richard Draves (who also rewrote alambda and propmacro back in 1985), John Foderaro, David Hendler, George Luger, Robert Muller, Mark Nitzberg, and Guy Steele.

I'm grateful to Professor Cheatham, and Harvard generally, for providing the facilities used to write this book. Thanks also to the staff at Aiken Lab, including Tony Hartman, Janusz Juda, Harry Bochner, and Joanne Klys.

The people at Prentice Hall did a great job. I feel fortunate to have worked with Alan Apt, a good editor and a good guy. Thanks also to Mona Pompili, Shirley Michaels, and Shirley McGuire for their organization and good humor.

The incomparable Gino Lee of the Bow and Arrow Press, Cambridge, did the cover. The tree on the cover alludes specifically to the point made on page 27.

This book was typeset using LaTeX, a language written by Leslie Lamport atop Donald Knuth's TeX, with additional macros by L. A. Carr, Van Jacobson, and Guy Steele. The diagrams were done with Idraw, by John Vlissides and Scott Stanton. The whole was previewed with Ghostview, by Tim Theisen, which is built on Ghostscript, by L. Peter Deutsch. Gary Bisbee of Chiron Inc. produced the camera-ready copy.

I owe thanks to many others, including Paul Becker, Phil Chapnick, Alice Hartley, Glenn Holloway, Meichun Hsu, Krzysztof Lenk, Arman Maghbouleh, Howard Mullings, Nancy Parmet, Robert Penny, Gary Sabot, Patrick Slaney, Steve Strassman, Dave Watkins, the Weickers, and Bill Woods.

Most of all, I'd like to thank my parents, for their example and encouragement; and Jackie, who taught me what I might have learned if I had listened to them.

I hope reading this book will be fun. Of all the languages I know, I like Lisp the best, simply because it's the most beautiful. This book is about Lisp at its lispiest. I had fun writing it, and I hope that comes through in the text.

Paul Graham


1. The Extensible Language

Not long ago, if you asked what Lisp was for, many people would have answered "for artificial intelligence." In fact, the association between Lisp and AI is just an accident of history. Lisp was invented by John McCarthy, who also invented the term "artificial intelligence." His students and colleagues wrote their programs in Lisp, and so it began to be spoken of as an AI language. This line was taken up and repeated so often during the brief AI boom in the 1980s that it became almost an institution.

Fortunately, word has begun to spread that AI is not what Lisp is all about. Recent advances in hardware and software have made Lisp commercially viable: it is now used in Gnu Emacs, the best Unix text-editor; Autocad, the industry standard desktop CAD program; and Interleaf, a leading high-end publishing program. The way Lisp is used in these programs has nothing whatever to do with AI.

If Lisp is not the language of AI, what is it? Instead of judging Lisp by the company it keeps, let's look at the language itself. What can you do in Lisp that you can't do in other languages? One of the most distinctive qualities of Lisp is the way it can be tailored to suit the program being written in it. Lisp itself is a Lisp program, and Lisp programs can be expressed as lists, which are Lisp data structures. Together, these two principles mean that any user can add operators to Lisp which are indistinguishable from the ones that come built-in.


1.1 Design by Evolution

Because Lisp gives you the freedom to define your own operators, you can mold it into just the language you need. If you're writing a text-editor, you can turn Lisp into a language for writing text-editors. If you're writing a CAD program, you can turn Lisp into a language for writing CAD programs. And if you're not sure yet what kind of program you're writing, it's a safe bet to write it in Lisp. Whatever kind of program yours turns out to be, Lisp will, during the writing of it, have evolved into a language for writing that kind of program.

If you're not sure yet what kind of program you're writing? To some ears that sentence has an odd ring to it. It is in jarring contrast with a certain model of doing things wherein you (1) carefully plan what you're going to do, and then (2) do it. According to this model, if Lisp encourages you to start writing your program before you've decided how it should work, it merely encourages sloppy thinking.

Well, it just ain't so. The plan-and-implement method may have been a good way of building dams or launching invasions, but experience has not shown it to be as good a way of writing programs. Why? Perhaps it's because computers are so exacting. Perhaps there is more variation between programs than there is between dams or invasions. Or perhaps the old methods don't work because old concepts of redundancy have no analogue in software development: if a dam contains 30% too much concrete, that's a margin for error, but if a program does 30% too much work, that is an error.

It may be difficult to say why the old method fails, but that it does fail, anyone can see. When is software delivered on time? Experienced programmers know that no matter how carefully you plan a program, when you write it the plans will turn out to be imperfect in some way. Sometimes the plans will be hopelessly wrong. Yet few of the victims of the plan-and-implement method question its basic soundness. Instead they blame human failings: if only the plans had been made with more foresight, all this trouble could have been avoided. Since even the very best programmers run into problems when they turn to implementation, perhaps it's too much to hope that people will ever have that much foresight. Perhaps the plan-and-implement method could be replaced with another approach which better suits our limitations.

We can approach programming in a different way, if we have the right tools. Why do we plan before implementing? The big danger in plunging right into a project is the possibility that we will paint ourselves into a corner. If we had a more flexible language, could this worry be lessened? We do, and it is. The flexibility of Lisp has spawned a whole new style of programming. In Lisp, you can do much of your planning as you write the program.

Why wait for hindsight? As Montaigne found, nothing clarifies your ideas like trying to write them down. Once you're freed from the worry that you'll paint yourself into a corner, you can take full advantage of this possibility. The ability to plan programs as you write them has two momentous consequences: programs take less time to write, because when you plan and write at the same time, you have a real program to focus your attention; and they turn out better, because the final design is always a product of evolution. So long as you maintain a certain discipline while searching for your program's destiny--so long as you always rewrite mistaken parts as soon as it becomes clear that they're mistaken--the final product will be a program more elegant than if you had spent weeks planning it beforehand.

Lisp's versatility makes this kind of programming a practical alternative. Indeed, the greatest danger of Lisp is that it may spoil you. Once you've used Lisp for a while, you may become so sensitive to the fit between language and application that you won't be able to go back to another language without always feeling that it doesn't give you quite the flexibility you need.


1.2 Programming Bottom-Up

It's a long-standing principle of programming style that the functional elements of a program should not be too large. If some component of a program grows beyond the stage where it's readily comprehensible, it becomes a mass of complexity which conceals errors as easily as a big city conceals fugitives. Such software will be hard to read, hard to test, and hard to debug.

In accordance with this principle, a large program must be divided into pieces, and the larger the program, the more it must be divided. How do you divide a program? The traditional approach is called top-down design: you say "the purpose of the program is to do these seven things, so I divide it into seven major subroutines. The first subroutine has to do these four things, so it in turn will have four of its own subroutines," and so on. This process continues until the whole program has the right level of granularity--each part large enough to do something substantial, but small enough to be understood as a single unit.

Experienced Lisp programmers divide up their programs differently. As well as top-down design, they follow a principle which could be called bottom-up design--changing the language to suit the problem. In Lisp, you don't just write your program down toward the language, you also build the language up toward your program. As you're writing a program you may think "I wish Lisp had such-and-such an operator." So you go and write it. Afterward you realize that using the new operator would simplify the design of another part of the program, and so on. Language and program evolve together. Like the border between two warring states, the boundary between language and program is drawn and redrawn, until eventually it comes to rest along the mountains and rivers, the natural frontiers of your problem. In the end your program will look as if the language had been designed for it. And when language and program fit one another well, you end up with code which is clear, small, and efficient.

It's worth emphasizing that bottom-up design doesn't mean just writing the same program in a different order. When you work bottom-up, you usually end up with a different program. Instead of a single, monolithic program, you will get a larger language with more abstract operators, and a smaller program written in it. Instead of a lintel, you'll get an arch.

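In typical code, once you abstract out the parts which are merely bookkeeping, what's left is much shorter; the higher you build up the language, the less distance you will have to travel from the top down to it. This brings several advantages: programs which are smaller and easier to change, more re-use of code, code which is easier to read, and a clearer view of the program's design.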

Bottom-up design is possible to a certain degree in languages other than Lisp. Whenever you see library functions, bottom-up design is happening. However, Lisp gives you much broader powers in this department, and augmenting the language plays a proportionately larger role in Lisp style--so much so that Lisp is not just a different language, but a whole different way of programming.

It's true that this style of development is better suited to programs which can be written by small groups. However, at the same time, it extends the limits of what can be done by a small group. In The Mythical Man-Month, Frederick Brooks proposed that the productivity of a group of programmers does not grow linearly with its size. As the size of the group increases, the productivity of individual programmers goes down. The experience of Lisp programming suggests a more cheerful way to phrase this law: as the size of the group decreases, the productivity of individual programmers goes up. A small group wins, relatively speaking, simply because it's smaller. When a small group also takes advantage of the techniques that Lisp makes possible, it can win outright.


1.3 Extensible Software

The Lisp style of programming is one that has grown in importance as software has grown in complexity. Sophisticated users now demand so much from software that we can't possibly anticipate all their needs. They themselves can't anticipate all their needs. But if we can't give them software which does everything they want right out of the box, we can give them software which is extensible. We transform our software from a mere program into a programming language, and advanced users can build upon it the extra features that they need.

Bottom-up design leads naturally to extensible programs. The simplest bottom-up programs consist of two layers: language and program. Complex programs may be written as a series of layers, each one acting as a programming language for the one above. If this philosophy is carried all the way up to the topmost layer, that layer becomes a programming language for the user. Such a program, where extensibility permeates every level, is likely to make a much better programming language than a system which was written as a traditional black box, and then made extensible as an afterthought.

X Windows and TeX are early examples of programs based on this principle. In the 1980s better hardware made possible a new generation of programs which had Lisp as their extension language. The first was Gnu Emacs, the popular Unix text-editor. Later came Autocad, the first large-scale commercial product to provide Lisp as an extension language. In 1991 Interleaf released a new version of its software that not only had Lisp as an extension language, but was largely implemented in Lisp.

Lisp is an especially good language for writing extensible programs because it is itself an extensible program. If you write your Lisp programs so as to pass this extensibility on to the user, you effectively get an extension language for free. And the difference between extending a Lisp program in Lisp, and doing the same thing in a traditional language, is like the difference between meeting someone in person and conversing by letters. In a program which is made extensible simply by providing access to outside programs, the best we can hope for is two black boxes communicating with one another through some predefined channel. In Lisp, extensions can have direct access to the entire underlying program. This is not to say that you have to give users access to every part of your program--just that you now have a choice about whether to give them access or not.

When this degree of access is combined with an interactive environment, you have extensibility at its best. Any program that you might use as a foundation for extensions of your own is likely to be fairly big--too big, probably, for you to have a complete mental picture of it. What happens when you're unsure of something? If the original program is written in Lisp, you can probe it interactively: you can inspect its data structures; you can call its functions; you may even be able to look at the original source code. This kind of feedback allows you to program with a high degree of confidence--to write more ambitious extensions, and to write them faster. An interactive environment always makes programming easier, but it is nowhere more valuable than when one is writing extensions.
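The kind of probing described above might look like the following sketch. The variable *keymap* and the function lookup-key are hypothetical stand-ins for pieces of a much larger host program; only the standard operators used to examine them are real.

```lisp
;; Hypothetical fragment of a host program: a key table and an accessor.
(defvar *keymap* (make-hash-table :test #'equal))
(setf (gethash "C-s" *keymap*) 'search-forward)

(defun lookup-key (key)
  ;; VALUES keeps only the first of GETHASH's two return values.
  (values (gethash key *keymap*)))

;; What someone writing an extension could type at the toplevel:
(hash-table-count *keymap*)   ; inspect a data structure => 1
(lookup-key "C-s")            ; call one of its functions => SEARCH-FORWARD
(fboundp 'lookup-key)         ; true if the function is defined
(describe 'lookup-key)        ; prints implementation-dependent details
```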

An extensible program is a double-edged sword, but recent experience has shown that users prefer a double-edged sword to a blunt one. Extensible programs seem to prevail, whatever their inherent dangers.


1.4 Extending Lisp

There are two ways to add new operators to Lisp: functions and macros. In Lisp, functions you define have the same status as the built-in ones. If you want a new variant of mapcar, you can define one yourself and use it just as you would use mapcar. For example, if you want a list of the values returned by some function when it is applied to all the integers from 1 to 10, you could create a new list and pass it to mapcar:

(mapcar fn
        (do* ((x 1 (1+ x))
              (result (list x) (push x result)))
             ((= x 10) (nreverse result))))

but this approach is both ugly and inefficient. Instead you could define a new mapping function map1-n (see page 54), and then call it as follows:

(map1-n fn 10)
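One possible definition of such a function is sketched here. It is an illustration under the assumption that map1-n takes a function and an upper bound and returns a list of results; the version given later in the book may differ in detail.

```lisp
;; A sketch of MAP1-N: call FN on each integer from 1 to N and
;; collect the results in order, without first building a list of integers.
(defun map1-n (fn n)
  (do ((i 1 (1+ i))
       (result nil))
      ((> i n) (nreverse result))
    (push (funcall fn i) result)))

(map1-n #'1+ 5)                      ; => (2 3 4 5 6)
(map1-n #'(lambda (x) (* x x)) 4)    ; => (1 4 9 16)
```

Once defined, map1-n is called exactly as a built-in mapping function would be.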

Defining functions is comparatively straightforward. Macros provide a more general, but less well-understood, means of defining new operators. Macros are programs that write programs. This statement has far-reaching implications, and exploring them is one of the main purposes of this book.

The thoughtful use of macros leads to programs which are marvels of clarity and elegance. These gems are not to be had for nothing. Eventually macros will seem the most natural thing in the world, but they can be hard to understand at first. Partly this is because they are more general than functions, so there is more to keep in mind when writing them. But the main reason macros are hard to understand is that they're foreign. No other language has anything like Lisp macros. Thus learning about macros may entail unlearning preconceptions inadvertently picked up from other languages. Foremost among these is the notion of a program as something afflicted by rigor mortis. Why should data structures be fluid and changeable, but programs not? In Lisp, programs are data, but the implications of this fact take a while to sink in.

If it takes some time to get used to macros, it is well worth the effort. Even in such mundane uses as iteration, macros can make programs significantly smaller and cleaner. Suppose a program must iterate over some body of code for x from a to b. The built-in Lisp do is meant for more general cases. For simple iteration it does not yield the most readable code:

(do ((x a (+ 1 x)))
    ((> x b))
  (print x))

Instead, suppose we could just say:

(for (x a b)
     (print x))

Macros make this possible. With six lines of code (see page 154) we can add for to the language, just as if it had been there from the start. And as later chapters will show, writing for is only the beginning of what you can do with macros.
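A macro of this kind could be written along the following lines. This is a sketch, not necessarily the definition the book arrives at; it assumes the bounds are inclusive and that the upper bound should be evaluated only once.

```lisp
;; A sketch of FOR as a macro that expands into the built-in DO.
(defmacro for ((var start stop) &body body)
  (let ((gstop (gensym)))            ; fresh symbol, so user variables are not captured
    `(do ((,var ,start (1+ ,var))
          (,gstop ,stop))            ; the upper bound is evaluated once
         ((> ,var ,gstop))
       ,@body)))

;; Collect the values of x from 1 to 5.
(let ((acc nil))
  (for (x 1 5)
    (push x acc))
  (nreverse acc))                    ; => (1 2 3 4 5)

;; The code the macro writes can be examined like any other list.
;; It contains a generated symbol, so the printed form varies.
(macroexpand-1 '(for (x 1 5) (print x)))
```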

You're not limited to extending Lisp one function or macro at a time. If you need to, you can build a whole language on top of Lisp, and write your programs in that. Lisp is an excellent language for writing compilers and interpreters, but it offers another way of defining a new language which is often more elegant and certainly much less work: to define the new language as a modification of Lisp. Then the parts of Lisp which can appear unchanged in the new language (e.g. arithmetic or I/O) can be used as is, and you only have to implement the parts which are different (e.g. control structure). A language implemented in this way is called an embedded language.

Embedded languages are a natural outgrowth of bottom-up programming. Common Lisp includes several already. The most famous of them, CLOS, is discussed in the last chapter. But you can define embedded languages of your own, too. You can have the language which suits your program, even if it ends up looking quite different from Lisp.


1.5 Why Lisp (or When)

These new possibilities do not stem from a single magic ingredient. In this respect, Lisp is like an arch. Which of the wedge-shaped stones (voussoirs) is the one that holds up the arch? The question itself is mistaken; they all do. Like an arch, Lisp is a collection of interlocking features. We can list some of these features--dynamic storage allocation and garbage collection, runtime typing, functions as objects, a built-in parser which generates lists, a compiler which accepts programs expressed as lists, an interactive environment, and so on--but the power of Lisp cannot be traced to any single one of them. It is the combination which makes Lisp programming what it is.

Over the past twenty years, the way people program has changed. Many of these changes--interactive environments, dynamic linking, even object-oriented programming--have been piecemeal attempts to give other languages some of the flexibility of Lisp. The metaphor of the arch suggests how well they have succeeded.

It is widely known that Lisp and Fortran are the two oldest languages still in use. What is perhaps more significant is that they represent opposite poles in the philosophy of language design. Fortran was invented as a step up from assembly language. Lisp was invented as a language for expressing algorithms. Such different intentions yielded vastly different languages. Fortran makes life easy for the compiler writer; Lisp makes life easy for the programmer. Most programming languages since have fallen somewhere between the two poles. Fortran and Lisp have themselves moved closer to the center. Fortran now looks more like Algol, and Lisp has given up some of the wasteful habits of its youth.

The original Fortran and Lisp defined a sort of battlefield. On one side the battle cry is "Efficiency! (And besides, it would be too hard to implement.)" On the other side, the battle cry is "Abstraction! (And anyway, this isn't production software.)" As the gods determined from afar the outcomes of battles among the ancient Greeks, the outcome of this battle is being determined by hardware. Every year, things look better for Lisp. The arguments against Lisp are now starting to sound very much like the arguments that assembly language programmers gave against high-level languages in the early 1970s. The question is now becoming not Why Lisp?, but When?


2. Functions

Functions are the building-blocks of Lisp programs. They are also the building-blocks of Lisp. In most languages the + operator is something quite different from user-defined functions. But Lisp has a single model, function application, to describe all the computation done by a program. The Lisp + operator is a function, just like the ones you can define yourself.

In fact, except for a small number of operators called special forms, the core of Lisp is a collection of Lisp functions. What's to stop you from adding to this collection? Nothing at all: if you think of something you wish Lisp could do, you can write it yourself, and your new function will be treated just like the built-in ones.

This fact has important consequences for the programmer. It means that any new function could be considered either as an addition to Lisp, or as part of a specific application. Typically, an experienced Lisp programmer will write some of each, adjusting the boundary between language and application until the two fit one another perfectly. This book is about how to achieve a good fit between language and application. Since everything we do toward this end ultimately depends on functions, functions are the natural place to begin.


2.1 Functions as Data

Two things make Lisp functions different. One, mentioned above, is that Lisp itself is a collection of functions. This means that we can add to Lisp new operators of our own. Another important thing to know about functions is that they are Lisp objects.

Lisp offers most of the data types one finds in other languages. We get integers and floating-point numbers, strings, arrays, structures, and so on. But Lisp supports one data type which may at first seem surprising: the function. Nearly all programming languages provide some form of function or procedure. What does it mean to say that Lisp provides them as a data type? It means that in Lisp we can do with functions all the things we expect to do with more familiar data types, like integers: create new ones at runtime, store them in variables and in structures, pass them as arguments to other functions, and return them as results.

The ability to create and return functions at runtime is particularly useful. This might sound at first like a dubious sort of advantage, like the self-modifying machine language programs one can run on some computers. But creating new functions at runtime turns out to be a routinely used Lisp programming technique.


2.2 Defining Functions

Most people first learn how to make functions with defun. The following expression defines a function called double which returns twice its argument.

> (defun double (x) (* x 2))
DOUBLE

Having fed this to Lisp, we can call double in other functions, or from the toplevel:

> (double 1)
2

A file of Lisp code usually consists mainly of such defuns, and so resembles a file of procedure definitions in a language like C or Pascal. But something quite different is going on. Those defuns are not just procedure definitions, they're Lisp calls. This distinction will become clearer when we see what's going on underneath defun.

Functions are objects in their own right. What defun really does is build one, and store it under the name given as the first argument. So as well as calling double, we can get hold of the function which implements it. The usual way to do so is by using the #' (sharp-quote) operator. This operator can be understood as mapping names to actual function objects. By affixing it to the name of double

> #'double
#<Interpreted-Function C66ACE>

we get the actual object created by the definition above. Though its printed representation will vary from implementation to implementation, a Common Lisp function is a first-class object, with all the same rights as more familiar objects like numbers and strings. So we can pass this function as an argument, return it, store it in a data structure, and so on:

> (eq #'double (car (list #'double)))
T

We don't even need defun to make functions. Like most Lisp objects, we can refer to them literally. When we want to refer to an integer, we just use the integer itself. To represent a string, we use a series of characters surrounded by double-quotes. To represent a function, we use what's called a lambda-expression. A lambda-expression is a list with three parts: the symbol lambda, a parameter list, and a body of zero or more expressions. This lambda-expression refers to a function equivalent to double:

(lambda (x) (* x 2))

It describes a function which takes one argument x, and returns 2x.

A lambda-expression can also be considered as the name of a function. If double is a proper name, like "Michelangelo," then (lambda (x) (* x 2)) is a definite description, like "the man who painted the ceiling of the Sistine Chapel." By putting a sharp-quote before a lambda-expression, we get the corresponding function:

> #'(lambda (x) (* x 2))
#<Interpreted-Function C674CE>

This function behaves exactly like double, but the two are distinct objects.

In a function call, the name of the function appears first, followed by the arguments:

> (double 3)
6

Since lambda-expressions are also names of functions, they can also appear first in function calls:

> ((lambda (x) (* x 2)) 3)
6

In Common Lisp, we can have a function named double and a variable named double at the same time.

> (setq double 2)
2
> (double double)
4

When a name occurs first in a function call, or is preceded by a sharp-quote, it is taken to refer to a function. Otherwise it is treated as a variable name.

It is therefore said that Common Lisp has distinct name-spaces for variables and functions. We can have a variable called foo and a function called foo, and they need not be identical. This situation can be confusing, and leads to a certain amount of ugliness in code, but it is something that Common Lisp programmers have to live with.

If necessary, Common Lisp provides two functions which map symbols to the values, or functions, that they represent. The function symbol-value takes a symbol and returns the value of the corresponding special variable:

> (symbol-value 'double)
2

while symbol-function does the same for a globally defined function:

> (symbol-function 'double)
#<Interpreted-Function C66ACE>

Note that, since functions are ordinary data objects, a variable could have a function as its value:

> (setq x #'append)
#<Compiled-Function 46B4BE>
> (eq (symbol-value 'x) (symbol-function 'append))
T

Beneath the surface, defun is setting the symbol-function of its first argument to a function constructed from the remaining arguments. The following two expressions do approximately the same thing:

(defun double (x) (* x 2))

(setf (symbol-function 'double)
      #'(lambda (x) (* x 2)))

So defun has the same effect as procedure definition in other languages--to associate a name with a piece of code. But the underlying mechanism is not the same. We don't need defun to make functions, and functions don't have to be stored away as the value of some symbol. Underlying defun, which resembles procedure definition in any other language, is a more general mechanism: building a function and associating it with a certain name are two separate operations. When we don't need the full generality of Lisp's notion of a function, defun makes function definition as simple as in more restrictive languages.
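The separation of the two operations can be seen in a small sketch. The names triple and triple-all are hypothetical, chosen only for this illustration.

```lisp
;; Step 1: build a function object. Step 2: give it a global name.
(let ((fn #'(lambda (x) (* x 3))))
  (setf (symbol-function 'triple) fn))

;; TRIPLE can now be used like any function defined with defun.
(defun triple-all (lst)
  (mapcar #'triple lst))
```

After this, (triple 4) returns 12 and (triple-all '(1 2 3)) returns (3 6 9).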


2.3 Functional Arguments

Having functions as data objects means, among other things, that we can pass them as arguments to other functions. This possibility is partly responsible for the importance of bottom-up programming in Lisp.

A language which allows functions as data objects must also provide some way of calling them. In Lisp, this function is apply. Generally, we call apply with two arguments: a function, and a list of arguments for it. The following four expressions all have the same effect:

(+ 1 2)
(apply #'+ '(1 2))
(apply (symbol-function '+) '(1 2))
(apply #'(lambda (x y) (+ x y)) '(1 2))

In Common Lisp, apply can take any number of arguments, and the function given first will be applied to the list made by consing the rest of the arguments onto the list given last. So the expression

(apply #'+ 1 '(2))

is equivalent to the preceding four. If it is inconvenient to give the arguments as a list, we can use funcall, which differs from apply only in this respect. This expression

(funcall #'+ 1 2)

has the same effect as those above.
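In practice, apply is most useful when the arguments arrive as a list whose length is not known in advance, and funcall when the function itself arrives as a variable. The following sketch illustrates both; the function names are hypothetical.

```lisp
;; APPLY spreads its final list argument; FUNCALL takes arguments directly.
(defun sum-all (numbers)
  (apply #'+ numbers))             ; numbers may have any length

(defun sum-with-offset (offset numbers)
  (apply #'+ offset numbers))      ; offset is consed onto numbers

(defun call-twice (fn x)
  (funcall fn (funcall fn x)))     ; fn is an ordinary variable here
```

For example, (sum-with-offset 10 '(1 2)) returns 13, and (call-twice #'1+ 5) returns 7.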

Many built-in Common Lisp functions take functional arguments. Among the most frequently used are the mapping functions. For example, mapcar takes two or more arguments, a function and one or more lists (one for each parameter of the function), and applies the function successively to elements of each list:

> (mapcar #'(lambda (x) (+ x 10))
          '(1 2 3))
(11 12 13)
> (mapcar #'+
          '(1 2 3)
          '(10 100 1000))
(11 102 1003)

Lisp programs frequently want to do something to each element of a list and get back a list of results. The first example above illustrates the conventional way to do this: make a function which does what you want done, and mapcar it over the list.

Already we see how convenient it is to be able to treat functions as data. In many languages, even if we could pass a function as an argument to something like mapcar, it would still have to be a function defined in some source file beforehand. If just one piece of code wanted to add 10 to each element of a list, we would have to define a function, called plus ten or some such, just for this one use. With lambda-expressions, we can refer to functions directly.

One of the big differences between Common Lisp and the dialects which preceded it is the large number of built-in functions that take functional arguments. Two of the most commonly used, after the ubiquitous mapcar, are sort and remove-if. The former is a general-purpose sorting function. It takes a list and a predicate, and returns a list sorted by passing each pair of elements to the predicate.

> (sort '(1 4 2 5 6 7 3) #'<)
(1 2 3 4 5 6 7)

To remember how sort works, it helps to remember that if you sort a list with no duplicates by <, and then apply < to the resulting list, it will return true.
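One practical point is worth a sketch. Common Lisp's sort is allowed to modify the sequence it is given, so when the original list is still needed, or is a quoted constant, it is safer to sort a copy. The function names below are hypothetical.

```lisp
;; SORT may destructively modify its argument, so sort a fresh copy
;; when the original list must be preserved.
(defun sorted-copy (lst pred)
  (sort (copy-list lst) pred))

;; The :key argument extracts the value to compare from each element.
(defun sort-by-length (strings)
  (sort (copy-list strings) #'< :key #'length))
```

Here (sorted-copy (list 3 1 2) #'>) returns (3 2 1), and (sort-by-length '("ccc" "a" "bb")) returns ("a" "bb" "ccc").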

If remove-if weren't included in Common Lisp, it might be the first utility you would write. It takes a function and a list, and returns all the elements of the list for which the function returns false.

> (remove-if #'evenp '(1 2 3 4 5 6 7))
(1 3 5 7)

As an example of a function which takes functional arguments, here is a definition of a limited version of remove-if:

(defun our-remove-if (fn lst)
  (if (null lst)
      nil
    (if (funcall fn (car lst))
	(our-remove-if fn (cdr lst))
      (cons (car lst) (our-remove-if fn (cdr lst))))))

Note that within this definition fn is not sharp-quoted. Since functions are data objects, a variable can have a function as its regular value. That's what's happening here. Sharp-quote is only for referring to the function named by a symbol--usually one globally defined as such with defun.
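Because fn is just a variable, we can also hand our-remove-if a function that was itself computed at runtime. As a sketch, here is a hypothetical companion, our-keep-if, built by passing along the result of the built-in complement. The definition of our-remove-if is repeated so that the example stands alone.

```lisp
(defun our-remove-if (fn lst)
  (if (null lst)
      nil
      (if (funcall fn (car lst))
          (our-remove-if fn (cdr lst))
          (cons (car lst) (our-remove-if fn (cdr lst))))))

;; Keep the elements satisfying fn by removing those which satisfy
;; its complement. No sharp-quote is needed: (complement fn) is a value.
(defun our-keep-if (fn lst)
  (our-remove-if (complement fn) lst))
```

So (our-keep-if #'evenp '(1 2 3 4 5 6 7)) returns (2 4 6).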

As Chapter 4 will show, writing new utilities which take functional arguments is an important element of bottom-up programming. Common Lisp has so many utilities built-in that the one you need may exist already. But whether you use built-ins like sort, or write your own utilities, the principle is the same. Instead of wiring in functionality, pass a functional argument.


2.4 Functions as Properties

The fact that functions are Lisp objects also allows us to write programs which can be extended to deal with new cases on the fly. Suppose we want to write a function which takes a type of animal and behaves appropriately. In most languages, the way to do this would be with a case statement, and we can do it this way in Lisp as well:

(defun behave (animal)
  (case animal
    (dog (wag-tail)
	 (bark))
    (rat (scurry)
	 (squeak))
    (cat (rub-legs)
	 (scratch-carpet))))

What if we want to add a new type of animal? If we were planning to add new animals, it would have been better to define behave as follows:

(defun behave (animal)
  (funcall (get animal 'behavior)))

and to define the behavior of an individual animal as a function stored, for example, on the property list of its name:

(setf (get 'dog 'behavior)
      #'(lambda ()
	  (wag-tail)
	  (bark)))

This way, all we need do in order to add a new animal is define a new property. No functions have to be rewritten.
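As a sketch of how this plays out, suppose the individual actions are simple placeholder functions which return a symbol (they are hypothetical, as is the duck). Adding the new animal touches nothing that already exists.

```lisp
;; Placeholder actions (hypothetical): each returns a symbol
;; so that the behavior can be inspected.
(defun wag-tail () 'wagging)
(defun bark () 'woof)
(defun swim () 'swimming)
(defun quack () 'quack)

(defun behave (animal)
  (funcall (get animal 'behavior)))

(setf (get 'dog 'behavior)
      #'(lambda ()
          (wag-tail)
          (bark)))

;; Adding a duck requires no change to BEHAVE.
(setf (get 'duck 'behavior)
      #'(lambda ()
          (swim)
          (quack)))
```

With these placeholders, (behave 'dog) returns WOOF and (behave 'duck) returns QUACK.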

The second approach, though more flexible, looks slower. It is. If speed were critical, we would use structures instead of property lists and, especially, compiled instead of interpreted functions. (Section 2.9 explains how to make these.) With structures and compiled functions, the more flexible type of code can approach or exceed the speed of versions using case statements.

This use of functions corresponds to the concept of a method in object-oriented programming. Generally speaking, a method is a function which is a property of an object, and that's just what we have. If we add inheritance to this model, we'll have all the elements of object-oriented programming. Chapter 25 will show that this can be done with surprisingly little code.

One of the big selling points of object-oriented programming is that it makes programs extensible. This prospect excites less wonder in the Lisp world, where extensibility has always been taken for granted. If the kind of extensibility we need does not depend too much on inheritance, then plain Lisp may already be sufficient.


2.5 Scope

Common Lisp is a lexically scoped Lisp. Scheme is the oldest dialect with lexical scope; before Scheme, dynamic scope was considered one of the defining features of Lisp.

The difference between lexical and dynamic scope comes down to how an implementation deals with free variables. A symbol is bound in an expression if it has been established as a variable, either by appearing as a parameter, or by variable-binding operators like let and do. Symbols which are not bound are said to be free. In this example, scope comes into play:

(let ((y 7))
  (defun scope-test (x)
    (list x y)))

Within the defun expression, x is bound and y is free. Free variables are interesting because it's not obvious what their values should be. There's no uncertainty about the value of a bound variable--when scope-test is called, the value of x should be whatever is passed as the argument. But what should be the value of y? This is the question answered by the dialect's scope rules.

In a dynamically scoped Lisp, to find the value of a free variable when executing scope-test, we look back through the chain of functions that called it. When we find an environment where y was bound, that binding of y will be the one used in scope-test. If we find none, we take the global value of y. Thus, in a dynamically scoped Lisp, y would have the value it had in the calling expression:

> (let ((y 5))
    (scope-test 3))
(3 5)

With dynamic scope, it means nothing that y was bound to 7 when scope-test was defined. All that matters is that y had a value of 5 when scope-test was called.

In a lexically scoped Lisp, instead of looking back through the chain of calling functions, we look back through the containing environments at the time the function was defined. In a lexically scoped Lisp, our example would catch the binding of y where scope-test was defined. So this is what would happen in Common Lisp:

> (let ((y 5))
    (scope-test 3))
(3 7)

Here the binding of y to 5 at the time of the call has no effect on the returned value.

Though you can still get dynamic scope by declaring a variable to be special, lexical scope is the default in Common Lisp. On the whole, the Lisp community seems to view the passing of dynamic scope with little regret. For one thing, it used to lead to horribly elusive bugs. But lexical scope is more than a way of avoiding bugs. As the next section will show, it also makes possible some new programming techniques.
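For comparison, here is a sketch of the dynamic behavior obtained with a special variable. The names *y*, special-test, and call-with-rebinding are hypothetical; defvar is one standard way of declaring a variable special.

```lisp
(defvar *y* 7)                     ; DEFVAR proclaims *y* special

(defun special-test (x)
  (list x *y*))                    ; *y* is looked up dynamically

(defun call-with-rebinding ()
  (let ((*y* 5))                   ; dynamic rebinding, seen by callees
    (special-test 3)))
```

Here (call-with-rebinding) returns (3 5), while a plain (special-test 3) returns (3 7): the value depends on the bindings in effect at the time of the call.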


2.6 Closures

Because Common Lisp is lexically scoped, when we define a function containing free variables, the system must save copies of the bindings of those variables at the time the function was defined. Such a combination of a function and a set of variable bindings is called a closure. Closures turn out to be useful in a wide variety of applications.

Closures are so pervasive in Common Lisp programs that it's possible to use them without even knowing it. Every time you give mapcar a sharp-quoted lambda-expression containing free variables, you're using closures. For example, suppose we want to write a function which takes a list of numbers and adds a certain amount to each one. The function list+

(defun list+ (lst n)
  (mapcar #'(lambda (x) (+ x n))
	  lst))

will do what we want:

> (list+ '(1 2 3) 10)
(11 12 13)

If we look closely at the function which is passed to mapcar within list+, it's actually a closure. The instance of n is free, and its binding comes from the surrounding environment. Under lexical scope, every such use of a mapping function causes the creation of a closure.(3)

Closures play a more conspicuous role in a style of programming promoted by Abelson and Sussman's classic Structure and Interpretation of Computer Programs. Closures are functions with local state. The simplest way to use this state is in a situation like the following:

(let ((counter 0))
  (defun new-id () (incf counter))
  (defun reset-id () (setq counter 0)))

These two functions share a variable which serves as a counter. The first one returns successive values of the counter, and the second resets the counter to 0. The same thing could be done by making the counter a global variable, but this way it is protected from unintended references.
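A short session makes the shared state visible. In this sketch the two definitions are repeated, and a hypothetical function id-session records the values returned by a sequence of calls.

```lisp
(let ((counter 0))
  (defun new-id () (incf counter))
  (defun reset-id () (setq counter 0)))

;; Reset, take two ids, reset again, then take one more.
(defun id-session ()
  (reset-id)
  (list (new-id) (new-id) (reset-id) (new-id)))
```

Calling (id-session) returns (1 2 0 1). Nothing outside the two closures can refer to counter directly.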

It's also useful to be able to return functions with local state. For example, the function make-adder

(defun make-adder (n)
  #'(lambda (x) (+ x n)))

takes a number, and returns a closure which, when called, adds that number to its argument. We can make as many instances of adders as we want:

> (setq add2 (make-adder 2)
        add10 (make-adder 10))
#<Interpreted-Function BF162E>
> (funcall add2 5)
7
> (funcall add10 3)
13

In the closures returned by make-adder, the internal state is fixed, but it's also possible to make closures which can be asked to change their state.

(defun make-adderb (n)
  #'(lambda (x &optional change)
      (if change
	  (setq n x)
	(+ x n))))

This new version of make-adder returns closures which, when called with one argument, behave just like the old ones.

> (setq addx (make-adderb 1))
#<Interpreted-Function BF1C66>
> (funcall addx 3)
4

However, when the new type of adder is called with a non-nil second argument, its internal copy of n will be reset to the value passed as the first argument:

> (funcall addx 100 t)
100
> (funcall addx 3)
103

It's even possible to return a group of closures which share the same data objects. Figure 2.1 contains a function which creates primitive databases. It takes an assoc-list (db), and returns a list of three closures which query, add, and delete entries, respectively.

Each call to make-dbms makes a new database--a new set of functions closed over their own shared copy of an assoc-list.

> (setq cities (make-dbms '((boston . us) (paris . france))))
(#<Interpreted-Function 8022E7>
#<Interpreted-Function 802317>
#<Interpreted-Function 802347>)

(defun make-dbms (db)
  (list
   #'(lambda (key)
       (cdr (assoc key db)))
   #'(lambda (key val)
       (push (cons key val) db)
       key)
   #'(lambda (key)
       (setf db (delete key db :key #'car))
       key)))

Figure 2.1: Three closures share a list.

The actual assoc-list within the database is invisible from the outside world--we can't even tell that it's an assoc-list--but it can be reached through the functions which are components of cities:

> (funcall (car cities) 'boston)
US
> (funcall (second cities) 'london 'england)
LONDON
> (funcall (car cities) 'london)
ENGLAND

Calling the car of a list is a bit ugly. In real programs, the access functions might instead be entries in a structure. Using them could also be cleaner--databases could be reached indirectly via functions like:

(defun lookup (key db)
  (funcall (car db) key))

However, the basic behavior of closures is independent of such refinements.
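As one possible refinement along these lines, the three closures could be kept in a structure rather than a list. The following is only a sketch: the structure db-handle and the functions open-db, db-lookup, db-insert, and db-remove are hypothetical names, and remove is used in place of delete so that the list passed in is never modified.

```lisp
;; A hypothetical structure holding the three closures in named slots.
(defstruct db-handle query add del)

(defun open-db (db)
  (make-db-handle
   :query #'(lambda (key)
              (cdr (assoc key db)))
   :add   #'(lambda (key val)
              (push (cons key val) db)
              key)
   :del   #'(lambda (key)
              (setf db (remove key db :key #'car))
              key)))

(defun db-lookup (key handle)
  (funcall (db-handle-query handle) key))

(defun db-insert (key val handle)
  (funcall (db-handle-add handle) key val))

(defun db-remove (key handle)
  (funcall (db-handle-del handle) key))
```

After (setq cities (open-db '((boston . us) (paris . france)))), the call (db-lookup 'boston cities) returns US. The assoc-list itself remains invisible from outside, just as before.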

In real programs, the closures and data structures would also be more elaborate than those we see in make-adder or make-dbms. The single shared variable could be any number of variables, each bound to any sort of data structure.

Closures are one of the distinct, tangible benefits of Lisp. Some Lisp programs could, with effort, be translated into less powerful languages. But just try to translate a program which uses closures as above, and it will become evident how much work this abstraction is saving us. Later chapters will deal with closures in more detail. Chapter 5 shows how to use them to build compound functions, and Chapter 6 looks at their use as a substitute for traditional data structures.


2.7 Local Functions

When we define functions with lambda-expressions, we face a restriction which doesn't arise with defun: a function defined in a lambda-expression doesn't have a name and therefore has no way of referring to itself. This means that in Common Lisp we can't use lambda to define a recursive function.

If we want to apply some function to all the elements of a list, we use the most familiar of Lisp idioms:

> (mapcar #'(lambda (x) (+ 2 x))
          '(2 5 7 3))
(4 7 9 5)

What about cases where we want to give a recursive function as the first argument to mapcar? If the function has been defined with defun, we can simply refer to it by name:

> (mapcar #'copy-tree '((a b) (c d e)))
((A B) (C D E))

But now suppose that the function has to be a closure, taking some bindings from the environment in which the mapcar occurs. In our example list+,

(defun list+ (lst n)
  (mapcar #'(lambda (x) (+ x n))
	  lst))

the first argument to mapcar, #'(lambda (x) (+ x n)), must be defined within list+ because it needs to catch the binding of n. So far so good, but what if we want to give mapcar a function which both needs local bindings and is recursive? We can't use a function defined elsewhere with defun, because we need bindings from the local environment. And we can't use lambda to define a recursive function, because the function will have no way of referring to itself.

Common Lisp gives us labels as a way out of this dilemma. With one important reservation, labels could be described as a sort of let for functions. Each of the binding specifications in a labels expression should have the form

(<name> <parameters> . <body>)

Within the labels expression, <name> will refer to a function equivalent to:

#'(lambda <parameters> . <body>)

So for example:

> (labels ((inc (x) (1+ x)))
    (inc 3))
4

However, there is an important difference between let and labels. In a let expression, the value of one variable can't depend on another variable made by the same let--that is, you can't say

(let ((x 10) (y x))
  y)

and expect the value of the new y to reflect that of the new x. In contrast, the body of a function f defined in a labels expression may refer to any other function defined there, including f itself, which makes recursive function definitions possible.
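Since each function in a labels expression can see all the others, mutual recursion is possible as well. The following sketch uses hypothetical names throughout; it decides whether a list has an even number of elements by bouncing between two local functions.

```lisp
(defun even-length-p (lst)
  (labels ((even-rest (l)          ; true if l has an even length
             (if (null l)
                 t
                 (odd-rest (cdr l))))
           (odd-rest (l)           ; true if l has an odd length
             (if (null l)
                 nil
                 (even-rest (cdr l)))))
    (even-rest lst)))
```

Here (even-length-p '(a b)) returns T and (even-length-p '(a b c)) returns NIL.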

Using labels we can write a function analogous to list+, but in which the first argument to mapcar is a recursive function:

(defun count-instances (obj lsts)
  (labels ((instances-in (lst)
             (if (consp lst)
                 (+ (if (eq (car lst) obj) 1 0)
                    (instances-in (cdr lst)))
               0)))
    (mapcar #'instances-in lsts)))

This function takes an object and a list, and returns a list of the number of occurrences of the object in each element:

> (count-instances 'a '((a b c) (d a r p a) (d a r) (a a)))
(1 2 1 2)


2.8 Tail-Recursion

A recursive function is one that calls itself. Such a call is tail-recursive if no work remains to be done in the calling function afterwards. This function is not tail-recursive

(defun our-length (lst)
  (if (null lst)
      0
    (1+ (our-length (cdr lst)))))

because on returning from the recursive call we have to pass the result to 1+. The following function is tail-recursive, though

(defun our-find-if (fn lst)
  (if (funcall fn (car lst))
      (car lst)
    (our-find-if fn (cdr lst))))

because the value of the recursive call is immediately returned.
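Note that our-find-if, as written, has no stopping case for a list in which no element satisfies fn: once the list is exhausted, fn is simply called on nil again. A sketch of a variant which returns nil in that situation, under the hypothetical name our-find-if-safe, shows that the extra test does not disturb tail-recursion.

```lisp
(defun our-find-if-safe (fn lst)
  (cond ((null lst) nil)                        ; nothing left: no match
        ((funcall fn (car lst)) (car lst))      ; found a match
        (t (our-find-if-safe fn (cdr lst)))))   ; still a tail call
```

So (our-find-if-safe #'evenp '(1 3 4 5)) returns 4, and (our-find-if-safe #'evenp '(1 3 5)) returns NIL.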

Tail-recursion is desirable because many Common Lisp compilers can transform tail-recursive functions into loops. With such a compiler, you can have the elegance of recursion in your source code without the overhead of function calls at runtime. The gain in speed is usually great enough that programmers go out of their way to make functions tail-recursive.

A function which isn't tail-recursive can often be transformed into one that is by embedding in it a local function which uses an accumulator. In this context, an accumulator is a parameter representing the value computed so far. For example, our-length could be transformed into

(defun our-length (lst)
  (labels ((rec (lst acc)
             (if (null lst)
                 acc
               (rec (cdr lst) (1+ acc)))))
    (rec lst 0)))

where the number of list elements seen so far is contained in a second parameter, acc. When the recursion reaches the end of the list, the value of acc will be the total length, which can just be returned. By accumulating the value as we go down the calling tree instead of constructing it on the way back up, we can make rec tail-recursive.

Many Common Lisp compilers can do tail-recursion optimization, but not all of them do it by default. So after writing your functions to be tail-recursive, you may also want to put

(proclaim '(optimize speed))

at the top of the file, to ensure that the compiler can take advantage of your efforts.(4)

Given tail-recursion and type declarations, existing Common Lisp compilers can generate code that runs as fast as, or faster than, C. Richard Gabriel gives as an example the following function, which returns the sum of the integers from 1 to n:

(defun triangle (n)
  (labels ((tri (c n)
             (declare (type fixnum n c))
             (if (zerop n)
                 c
               (tri (the fixnum (+ n c))
                    (the fixnum (- n 1))))))
    (tri 0 n)))

This is what fast Common Lisp code looks like. At first it may not seem natural to write functions this way. It's often a good idea to begin by writing a function in whatever way seems most natural, and then, if necessary, transforming it into a tail-recursive equivalent.
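To make the transformation concrete, here is a sketch of the same computation in both styles, without the type declarations. The names triangle-simple and triangle-acc are hypothetical.

```lisp
;; The natural definition: not tail-recursive, because the result
;; of the recursive call still has to be added to n.
(defun triangle-simple (n)
  (if (zerop n)
      0
      (+ n (triangle-simple (- n 1)))))

;; The same function with an accumulator c: the recursive call to
;; tri is now the last thing the function does.
(defun triangle-acc (n)
  (labels ((tri (c n)
             (if (zerop n)
                 c
                 (tri (+ n c) (- n 1)))))
    (tri 0 n)))
```

Both return 55 when given 10.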


2.9 Compilation

Lisp functions can be compiled either individually or by the file. If you just type a defun expression into the toplevel,

> (defun foo (x) (1+ x))
FOO

many implementations will create an interpreted function. You can check whether a given function is compiled by feeding it to compiled-function-p:

> (compiled-function-p #'foo)
NIL

We can have foo compiled by giving its name to compile

> (compile 'foo)
FOO

which will compile the definition of foo and replace the interpreted version with a compiled one.

> (compiled-function-p #'foo)
T

Compiled and interpreted functions are both Lisp objects, and behave the same, except with respect to compiled-function-p. Literal functions can also be compiled: compile expects its first argument to be a name, but if you give nil as the first argument, it will compile the lambda-expression given as the second argument.

> (compile nil '(lambda (x) (+ x 2)))
#<Compiled-Function BF55BE>

If you give both the name and function arguments, compile becomes a sort of compiling defun:

> (progn (compile 'bar '(lambda (x) (* x 3)))
         (compiled-function-p #'bar))
T

Having compile in the language means that a program could build and compile new functions on the fly. However, calling compile explicitly is a drastic measure, comparable to calling eval, and should be viewed with the same suspicion.(5) When Section 2.1 said that creating new functions at runtime was a routinely used programming technique, it referred to new closures like those made by make-adder, not functions made by calling compile on raw lists. Calling compile is not a routinely used programming technique--it's an extremely rare one. So beware of doing it unnecessarily. Unless you're implementing another language on top of Lisp (and much of the time, even then), what you need to do may be possible with macros.

There are two sorts of functions which you can't give as an argument to compile. According to CLTL2 (p. 677), you can't compile a function "defined interpretively in a non-null lexical environment." That is, if at the toplevel you define foo within a let

> (let ((y 2))
    (defun foo (x) (+ x y)))

then (compile 'foo) will not necessarily work.(6) You also can't call compile on a function which is already compiled. In this situation, CLTL2 hints darkly that "the consequences. . .are unspecified."

The usual way to compile Lisp code is not to compile functions individually with compile, but to compile whole files with compile-file. This function takes a filename and creates a compiled version of the source file--typically with the same base name but a different extension. When the compiled file is loaded, compiled-function-p should return true for all the functions defined in the file.

Later chapters will depend on another effect of compilation: when one function occurs within another function, and the containing function is compiled, the inner function will also get compiled. CLTL2 does not seem to say explicitly that this will happen, but in a decent implementation you can count on it.

The compiling of inner functions becomes evident in functions which return functions. When make-adder (page 18) is compiled, it will return compiled functions:

> (compile 'make-adder)
MAKE-ADDER
> (compiled-function-p (make-adder 2))
T

As later chapters will show, this fact is of great importance in the implementation of embedded languages. If a new language is implemented by transformation, and the transformation code is compiled, then it yields compiled output--and so becomes in effect a compiler for the new language. (A simple example is described on page 81.)

If we have a particularly small function, we may want to request that it be compiled inline. Otherwise, the machinery of calling it could entail more effort than the function itself. If we define a function:

(defun 50th (lst) (nth 49 lst))

and make the declaration:

(proclaim '(inline 50th))

then a reference to 50th within a compiled function should no longer require a real function call. If we define and compile a function which calls 50th,

(defun foo (lst)
  (+ (50th lst) 1))

then when foo is compiled, the code for 50th should be compiled right into it, just as if we had written

(defun foo (lst)
  (+ (nth 49 lst) 1))

in the first place. The drawback is that if we redefine 50th, we also have to recompile foo, or it will still reflect the old definition. The restrictions on inline functions are basically the same as those on macros (see Section 7.9).
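A minimal sketch of the same request, written with declaim, the macro counterpart of proclaim that also takes effect when a file is compiled. The names third-elt and third-plus-one are hypothetical, and whether the call is really expanded inline is up to the implementation.

```lisp
;; Ask the compiler to inline THIRD-ELT (hypothetical name).
;; The declaration comes before the definition so the compiler
;; can record the function body for later expansion.
(declaim (inline third-elt))

(defun third-elt (lst)
  (nth 2 lst))

;; A caller; when compiled, the call to THIRD-ELT may be
;; replaced by (nth 2 lst).
(defun third-plus-one (lst)
  (+ (third-elt lst) 1))

(third-plus-one '(10 20 30))   ; returns 31
```

The result is the same either way; inlining changes only how the call is carried out.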


2.10 Functions from Lists

In some earlier dialects of Lisp, functions were represented as lists. This gave Lisp programs the remarkable ability to write and execute their own Lisp programs. In Common Lisp, functions are no longer made of lists--good implementations compile them into native machine code. But you can still write programs that write programs, because lists are the input to the compiler.

It cannot be overemphasized how important it is that Lisp programs can write Lisp programs, especially since this fact is so often overlooked. Even experienced Lisp users rarely realize the advantages they derive from this feature of the language. This is why Lisp macros are so powerful, for example. Most of the techniques described in this book depend on the ability to write programs which manipulate Lisp expressions.
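As a small illustration, here is a sketch in which a program builds a function definition as an ordinary list and then hands it to the compiler. The name adder-source is hypothetical, and calling compile directly is done only to make the point visible; in practice a macro would usually be the better tool.

```lisp
;; Hypothetical helper: builds a lambda expression as an ordinary list.
(defun adder-source (n)
  (list 'lambda '(x) (list '+ 'x n)))

;; The result is plain data, which can be inspected like any other list.
(adder-source 3)                             ; returns (LAMBDA (X) (+ X 3))
(car (adder-source 3))                       ; returns LAMBDA

;; The same list is valid input to the compiler.
(funcall (compile nil (adder-source 3)) 4)   ; returns 7
```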


3. Functional Programming

The previous chapter explained how Lisp and Lisp programs are both built out of a single raw material: the function. Like any building material, its qualities influence both the kinds of things we build, and the way we build them.

This chapter describes the kind of construction methods which prevail in the Lisp world. The sophistication of these methods allows us to attempt more ambitious kinds of programs. The next chapter will describe one particularly important class of programs which become possible in Lisp: programs which evolve instead of being developed by the old plan-and-implement method.


3.1 Functional Design

The character of an object is influenced by the elements from which it is made. A wooden building looks different from a stone one, for example. Even when you are too far away to see wood or stone, you can tell from the overall shape of the building what it's made of. The character of Lisp functions has a similar influence on the structure of Lisp programs.

Functional programming means writing programs which work by returning values instead of by performing side-effects. Side-effects include destructive changes to objects (e.g. by rplaca) and assignments to variables (e.g. by setq). If side-effects are few and localized, programs become easier to read, test, and debug. Lisp programs have not always been written in this style, but over time Lisp and functional programming have gradually become inseparable.

An example will show how functional programming differs from what you might do in another language. Suppose for some reason we want the elements of a list in the reverse order. Instead of writing a function to reverse lists, we write a function which takes a list, and returns a list with the same elements in the reverse order.

(defun bad-reverse (lst)
  (let* ((len (length lst))
	 (ilimit (truncate (/ len 2))))
    (do ((i 0 (1+ i))
	 (j (1- len) (1- j)))
	((>= i ilimit))
      (rotatef (nth i lst) (nth j lst)))))

Figure 3.1: A function to reverse lists.

Figure 3.1 contains a function to reverse lists. It treats the list as an array, reversing it in place; its return value is irrelevant:

> (setq lst '(a b c))
(A B C)
> (bad-reverse lst)
NIL
> lst
(C B A)

As its name suggests, bad-reverse is far from good Lisp style. Moreover, its ugliness is contagious: because it works by side-effects, it will also draw its callers away from the functional ideal.

Though cast in the role of the villain, bad-reverse does have one merit: it shows the Common Lisp idiom for swapping two values. The rotatef macro rotates the values of any number of generalized variables--that is, expressions you could give as the first argument to setf. When applied to just two arguments, the effect is to swap them.
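A short sketch of rotatef on two and then three places; the function name rotate-demo and its local variables are chosen only for the example.

```lisp
(defun rotate-demo ()
  (let ((a 1) (b 2) (c 3))
    (rotatef a b)      ; two places: swap, so a = 2, b = 1
    (rotatef a b c)    ; three places: a gets b's value, b gets c's,
                       ; c gets a's old value
    (list a b c)))

(rotate-demo)          ; returns (1 3 2)
```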

In contrast, Figure 3.2 shows a function which returns reversed lists. With good-reverse, we get the reversed list as the return value; the original list is not touched.

> (setq lst '(a b c))
(A B C)
> (good-reverse lst)
(C B A)
> lst
(A B C)

(defun good-reverse (lst)
  (labels ((rev (lst acc)
		(if (null lst)
		    acc
		  (rev (cdr lst) (cons (car lst) acc)))))
    (rev lst nil)))

Figure 3.2: A function to return reversed lists.

It used to be thought that you could judge someone's character by looking at the shape of his head. Whether or not this is true of people, it is generally true of Lisp programs. Functional programs have a different shape from imperative ones. The structure in a functional program comes entirely from the composition of arguments within expressions, and since arguments are indented, functional code will show more variation in indentation. Functional code looks fluid(7) on the page; imperative code looks solid and blockish, like Basic.

Even from a distance, the shapes of bad- and good-reverse suggest which is the better program. And despite being shorter, good-reverse is also more efficient: O(n) instead of O(n^2).

We are spared the trouble of writing reverse because Common Lisp has it built-in. It is worth looking briefly at this function, because it is one that often brings to the surface misconceptions about functional programming. Like good-reverse, the built-in reverse works by returning a value--it doesn't touch its arguments. But people learning Lisp may assume that, like bad-reverse, it works by side-effects. If in some part of a program they want a list lst to be reversed, they may write

(reverse lst)

and wonder why the call seems to have no effect. In fact, if we want effects from such a function, we have to see to it ourselves in the calling code. That is, we need to write

(setq lst (reverse lst))

instead. Operators like reverse are intended to be called for return values, not side-effects. It is worth writing your own programs in this style too--not only for its inherent benefits, but because, if you don't, you will be working against the language.

One of the points we ignored in the comparison of bad- and good-reverse is that bad-reverse doesn't cons. Instead of building new list structure, it operates on the original list. This can be dangerous--the list could be needed elsewhere in the program--but for efficiency it is sometimes necessary. For such cases, Common Lisp provides an O(n) destructive reversing function called nreverse.

A destructive function is one that can alter the arguments passed to it. However, even destructive functions usually work by returning values: you have to assume that nreverse will recycle lists you give to it as arguments, but you still can't assume that it will reverse them. As before, the reversed list has to be found in the return value. You still can't write

(nreverse lst)

in the middle of a function and assume that afterwards lst will be reversed. This is what happens in most implementations:

> (setq lst '(a b c))
(A B C)
> (nreverse lst)
(C B A)
> lst
(A)

To reverse lst, you would have to set lst to the return value, as with plain reverse.

If a function is advertised as destructive, that doesn't mean that it's meant to be called for side-effects. The danger is, some destructive functions give the impression that they are. For example,

(nconc x y)

almost always has the same effect as

(setq x (nconc x y))

If you wrote code which relied on the former idiom, it might seem to work for some time. However, it wouldn't do what you expected when x was nil.
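The difference can be seen in a sketch with two hypothetical helpers, one relying on nconc's side-effect and one using its return value:

```lisp
;; Hypothetical helpers contrasting the two idioms.
(defun join-ignoring-value (x y)
  (nconc x y)            ; return value thrown away
  x)

(defun join-keeping-value (x y)
  (setq x (nconc x y))   ; return value kept
  x)

(join-ignoring-value (list 'a) (list 'b))   ; returns (A B)
(join-keeping-value  (list 'a) (list 'b))   ; returns (A B)
(join-ignoring-value nil (list 'b))         ; returns NIL
(join-keeping-value  nil (list 'b))         ; returns (B)
```

When x is nil there is no cons for nconc to alter, so the only trace of the operation is the value it returns.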

Only a few Lisp operators are intended to be called for side-effects. In general, the built-in operators are meant to be called for their return values. Don't be misled by names like sort, remove, or substitute. If you want side-effects, use setq on the return value.
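A brief sketch of the rule, using a variable *nums* invented for the example:

```lisp
(defvar *nums* (list 3 1 2))

(remove 1 *nums*)                  ; returns (3 2); *nums* is untouched
(setq *nums* (remove 1 *nums*))    ; now *nums* is (3 2)

;; SORT is allowed to recycle the list it is given, so the sorted
;; list is only guaranteed to be found in the return value.
(setq *nums* (sort *nums* #'<))    ; now *nums* is (2 3)
```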

This very rule suggests that some side-effects are inevitable. Having functional programming as an ideal doesn't imply that programs should never have side-effects. It just means that they should have no more than necessary.

It may take time to develop this habit. One way to start is to treat the following operators as if there were a tax on their use:

set setq setf psetf psetq incf decf push pop pushnew
rplaca rplacd rotatef shiftf remf remprop remhash

and also let*, in which imperative programs often lie concealed. Treating these operators as taxable is only proposed as a help toward, not a criterion for, good Lisp style. However, this alone can get you surprisingly far.

In other languages, one of the most common causes of side-effects is the need for a function to return multiple values. If functions can only return one value, they have to "return" the rest by altering their parameters. Fortunately, this isn't necessary in Common Lisp, because any function can return multiple values.

The built-in function truncate returns two values, for example--the truncated integer, and what was cut off in order to create it. A typical implementation will print both when truncate is called at the toplevel:

> (truncate 26.21875)
26
0.21875

When the calling code only wants one value, the first one is used:

> (= (truncate 26.21875) 26)
T

The calling code can catch both return values by using a multiple-value-bind. This operator takes a list of variables, a call, and a body of code. The body is evaluated with the variables bound to the respective return values from the call:

> (multiple-value-bind (int frac) (truncate 26.21875)
    (list int frac))
(26 0.21875)

Finally, to return multiple values, we use the values operator:

> (defun powers (x)
    (values x (sqrt x) (expt x 2)))
POWERS
> (multiple-value-bind (base root square) (powers 4)
    (list base root square))
(4 2.0 16)

Functional programming is a good idea in general. It is a particularly good idea in Lisp, because Lisp has evolved to support it. Built-in operators like reverse and nreverse are meant to be used in this way. Other operators, like values and multiple-value-bind, have been provided specifically to make functional programming easier.


3.2 Imperative Outside-In

The aims of functional programming may show more clearly when contrasted with those of the more common approach, imperative programming. A functional program tells you what it wants; an imperative program tells you what to do. A functional program says "Return a list of a and the square of the first element of x:"

(defun fun (x)
  (list 'a (expt (car x) 2)))

An imperative program says "Get the first element of x, then square it, then return a list of a and the square:"

(defun imp (x)
  (let (y sqr)
    (setq y (car x))
    (setq sqr (expt y 2))
    (list 'a sqr)))

Lisp users are fortunate in being able to write this program both ways. Some languages are only suited to imperative programming--notably Basic, along with most machine languages. In fact, the definition of imp is similar in form to the machine language code that most Lisp compilers would generate for fun.

Why write such code when the compiler could do it for you? For many programmers, this question does not even arise. A language stamps its pattern on our thoughts: someone used to programming in an imperative language may have begun to conceive of programs in imperative terms, and may actually find it easier to write imperative programs than functional ones. This habit of mind is worth overcoming if you have a language that will let you.

For alumni of other languages, beginning to use Lisp may be like stepping onto a skating rink for the first time. It's actually much easier to get around on ice than it is on dry land--if you use skates. Till then you will be left wondering what people see in this sport.

What skates are to ice, functional programming is to Lisp. Together the two allow you to travel more gracefully, with less effort. But if you are accustomed to another mode of travel, this may not be your experience at first. One of the obstacles to learning Lisp as a second language is learning to program in a functional style.

Fortunately there is a trick for transforming imperative programs into functional ones. You can begin by applying this trick to finished code. Soon you will begin to anticipate yourself, and transform your code as you write it. Soon after that, you will begin to conceive of programs in functional terms from the start.

The trick is to realize that an imperative program is a functional program turned inside-out. To find the functional program implicit in our imperative one, we just turn it outside-in. Let's try this technique on imp.

The first thing we notice is the creation of y and sqr in the initial let. This is a sign that bad things are to follow. Like eval at runtime, uninitialized variables are so rarely needed that they should generally be treated as a symptom of some illness in the program. Such variables are often used like pins which hold the program down and keep it from coiling into its natural shape.

However, we ignore them for the time being, and go straight to the end of the function. What occurs last in an imperative program occurs outermost in a functional one. So our first step is to grab the final call to list and begin stuffing the rest of the program inside it--just like turning a shirt inside-out. We continue by applying the same transformation repeatedly, just as we would with the sleeves of the shirt, and in turn with their cuffs.

Starting at the end, we replace sqr with (expt y 2), yielding:

(list 'a (expt y 2))

Then we replace y by (car x):

(list 'a (expt (car x) 2))

Now we can throw away the rest of the code, having stuffed it all into the last expression. In the process we removed the need for the variables y and sqr, so we can discard the let as well.

The final result is shorter than what we began with, and easier to understand. In the original code, we're faced with the final expression (list 'a sqr), and it's not immediately clear where the value of sqr comes from. Now the source of the return value is laid out for us like a road map.

The example in this section was a short one, but the technique scales up. Indeed, it becomes more valuable as it is applied to larger functions. Even functions which perform side-effects can be cleaned up in the portions which don't.
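As a further sketch, here is the same transformation applied to a different function. Both definitions are hypothetical and serve only to illustrate the technique.

```lisp
;; Imperative version: uninitialized variables filled in step by step.
(defun average-imp (lst)
  (let (sum n)
    (setq sum (reduce #'+ lst))
    (setq n (length lst))
    (/ sum n)))

;; Turned outside-in: start from the final (/ sum n), then replace
;; sum and n by the expressions that computed them.
(defun average-fun (lst)
  (/ (reduce #'+ lst) (length lst)))

(average-fun '(1 2 3 4))   ; returns 5/2
```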


3.3 Functional Interfaces

Some side-effects are worse than others. For example, though this function calls nconc

(defun qualify (expr)
  (nconc (copy-list expr) (list 'maybe)))

it preserves referential transparency.(8) If you call it with a given argument, it will always return the same (equal) value. From the caller's point of view, qualify might as well be purely functional code. We can't say the same for bad-reverse (page 29), which actually modifies its argument.

Instead of treating all side-effects as equally bad, it would be helpful if we had some way of distinguishing between such cases. Informally, we could say that it's harmless for a function to modify something that no one else owns. For example, the nconc in qualify is harmless because the list given as the first argument is freshly consed. No one else could own it.
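A quick check of this claim, repeating the definition of qualify so that the sketch stands alone; qualify-demo is a name invented for the example:

```lisp
(defun qualify (expr)
  (nconc (copy-list expr) (list 'maybe)))

;; Returns the qualified list together with the original argument.
(defun qualify-demo ()
  (let ((original (list 'it 'works)))
    (list (qualify original) original)))

(qualify-demo)   ; returns ((IT WORKS MAYBE) (IT WORKS))
```

The nconc alters only the copy, so the caller's list comes back unchanged.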

In the general case, we have to talk about ownership not by functions, but by invocations of functions. Though no one else owns the variable x here,

(let ((x 0))
  (defun total (y)
    (incf x y)))

the effects of one call will be visible in succeeding ones. So the rule should be: a given invocation can safely modify what it uniquely owns.
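Calling total twice with the same argument makes the point; the definition is repeated so that the sketch stands alone.

```lisp
(let ((x 0))
  (defun total (y)
    (incf x y)))

;; Same argument, different results: the second call sees the
;; change made to x by the first.
(list (total 5) (total 5))   ; returns (5 10) on a fresh definition
```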

Who owns arguments and return values? The convention in Lisp seems to be that an invocation owns objects it receives as return values, but not objects passed to it as arguments. Functions that modify their arguments are distinguished by the label "destructive," but there is no special name for functions that modify objects returned to them.

This function adheres to the convention, for example:

(defun ok (x)
  (nconc (list 'a x) (list 'c)))

It calls nconc, which doesn't, but since the list spliced by nconc will always be freshly made rather than, say, a list passed to ok as an argument, ok itself is ok.

If it were written slightly differently, however,

(defun not-ok (x)
  (nconc (list 'a) x (list 'c)))

then the call to nconc would be modifying an argument passed to not-ok.
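The difference shows up in what happens to the caller's list. In this sketch the two definitions are repeated from the text, and the -demo functions are hypothetical:

```lisp
(defun ok (x)
  (nconc (list 'a x) (list 'c)))

(defun not-ok (x)
  (nconc (list 'a) x (list 'c)))

;; Each demo returns the caller's list as it stands after the call.
(defun ok-demo ()
  (let ((arg (list 'b)))
    (ok arg)
    arg))

(defun not-ok-demo ()
  (let ((arg (list 'b)))
    (not-ok arg)
    arg))

(ok-demo)       ; returns (B)
(not-ok-demo)   ; returns (B C)
```

In not-ok the argument is itself one of the lists being spliced, so nconc attaches (c) to its last cons.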

Many Lisp programs violate this convention, at least locally. However, as we saw with ok, local violations need not disqualify the calling function. And functions which do meet the preceding conditions will retain many of the advantages of purely functional code.

To write programs that are really indistinguishable from functional code, we have to add one more condition. Functions can't share objects with other code that doesn't follow the rules. For example, though this function doesn't have side-effects,

(defun anything (x)
  (+ x *anything*))

its return value depends on the global variable *anything*. So if any other function can alter the value of this variable, anything could return anything.
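A sketch of the problem, with an initial value for *anything* chosen arbitrarily for the example:

```lisp
(defvar *anything* 10)

(defun anything (x)
  (+ x *anything*))

(anything 1)             ; returns 11 while *anything* is 10

;; Some other code changes the global...
(setq *anything* 100)

(anything 1)             ; ...and the same call now returns 101
```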

Code written so that each invocation only modifies what it owns is almost as good as purely functional code. A function that meets all the preceding conditions at least presents a functional interface to the world: if you call it twice with the same arguments, you should get the same results. And this, as the next section will show, is a crucial ingredient in bottom-up programming.

One problem with destructive operations is that, like global variables, they can destroy the locality of a program. When you're writing functional code, you can narrow your focus: you only need consider the functions that call, or are called by, the one you're writing. This benefit disappears when you want to modify something destructively. It could be used anywhere.

The conditions above do not guarantee the perfect locality you get with purely functional code, though they do improve things somewhat. For example, suppose that f calls g as below:

(defun f (x)
  (let ((val (g x)))
    ; safe to modify val here?
    ))

Is it safe for f to nconc something onto val? Not if g is identity: then we would be modifying something originally passed as an argument to f itself.
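The danger can be sketched by filling in the hypothetical case directly: g is defined as identity, and f goes ahead and modifies val. The name f-demo is invented for the example.

```lisp
;; Hypothetical worst case: g just hands back its argument.
(defun g (x) x)

(defun f (x)
  (let ((val (g x)))
    (nconc val (list 'oops))))   ; modifies what g returned

;; Returns the caller's own list after the call to f.
(defun f-demo ()
  (let ((mine (list 1 2)))
    (f mine)
    mine))

(f-demo)   ; returns (1 2 OOPS)
```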

So even in programs which do follow the convention, we may have to look beyond f if we want to modify something there. However, we don't have to look as far: instead of worrying about the whole program, we now only have to consider the subtree beginning with f.

A corollary of the convention above is that functions shouldn't return anything that isn't safe to modify. Thus one should avoid writing functions whose return values incorporate quoted objects. If we define exclaim so that its return value incorporates a quoted list,

(defun exclaim (expression)
  (append expression '(oh my)))

then any later destructive modification of the return value

> (exclaim '(lions and tigers and bears))
(LIONS AND TIGERS AND BEARS OH MY)
> (nconc * '(goodness))
(LIONS AND TIGERS AND BEARS OH MY GOODNESS)

could alter the list within the function:

> (exclaim '(fixnums and bignums and floats))
(FIXNUMS AND BIGNUMS AND FLOATS OH MY GOODNESS)

To make exclaim proof against such problems, it should be written:

(defun exclaim (expression)
  (append expression (list 'oh 'my)))

There is one major exception to the rule that functions shouldn't return quoted lists: the functions which generate macro expansions. Macro expanders can safely incorporate quoted lists in the expansions they generate, if the expansions are going straight to the compiler.

Otherwise, one might as well be suspicious of quoted lists generally. Many other uses of them are likely to be something which ought to be done with a macro like in (page 152).


3.4 Interactive Programming

The previous sections presented the functional style as a good way of organizing programs. But it is more than this. Lisp programmers did not adopt the functional style purely for aesthetic reasons. They use it because it makes their work easier. In Lisp's dynamic environment, functional programs can be written with unusual speed, and at the same time, can be unusually reliable.

In Lisp it is comparatively easy to debug programs. A lot of information is available at runtime, which helps in tracing the causes of errors. But even more important is the ease with which you can test programs. You don't have to compile a program and test the whole thing at once. You can test functions individually by calling them from the toplevel loop.

Incremental testing is so valuable that Lisp style has evolved to take advantage of it. Programs written in the functional style can be understood one function at a time, and from the point of view of the reader this is its main advantage. However, the functional style is also perfectly adapted to incremental testing: programs written in this style can also be tested one function at a time. When a function neither examines nor alters external state, any bugs will appear immediately. Such a function can affect the outside world only through its return values. Insofar as these are what you expected, you can trust the code which produced them.

Experienced Lisp programmers actually design their programs to be easy to test:

  1. They try to segregate side-effects in a few functions, allowing the greater part of the program to be written in a purely functional style.

  2. If a function must perform side-effects, they try at least to give it a functional interface.

  3. They give each function a single, well-defined purpose.

When a function is written, they can test it on a selection of representative cases, then move on to the next one. If each brick does what it's supposed to do, the wall will stand.
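As a sketch of what such a session might look like, here is good-reverse from Figure 3.2 checked against a few representative cases. The choice of cases is only illustrative; at the toplevel one would more often just call the function and look at what comes back.

```lisp
(defun good-reverse (lst)
  (labels ((rev (lst acc)
             (if (null lst)
                 acc
                 (rev (cdr lst) (cons (car lst) acc)))))
    (rev lst nil)))

;; Representative cases: empty list, one element, several elements.
(assert (equal (good-reverse '()) '()))
(assert (equal (good-reverse '(a)) '(a)))
(assert (equal (good-reverse '(a b c)) '(c b a)))
```

Because good-reverse neither examines nor alters external state, passing these cases says something about every later call with the same arguments.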

In Lisp, the wall can be better-designed as well. Imagine the kind of conversation you would have with someone so far away that there was a transmission delay of one minute. Now imagine speaking to someone in the next room. You wouldn't just have the same conversation faster, you would have a different kind of conversation. In Lisp, developing software is like speaking face-to-face. You can test code as you're writing it. And instant turnaround has just as dramatic an effect on development as it does on conversation. You don't just write the same program faster; you write a different kind of program.

How so? When testing is quicker you can do it more often. In Lisp, as in any language, development is a cycle of writing and testing. But in Lisp the cycle is very short: single functions, or even parts of functions. And if you test everything as you write it, you will know where to look when errors occur: in what you wrote last. Simple as it sounds, this principle is to a large extent what makes bottom-up programming feasible. It brings an extra degree of confidence which enables Lisp programmers to break free, at least part of the time, from the old plan-and-implement style of software development.

Section 1.1 stressed that bottom-up design is an evolutionary process. You build up a language as you write a program in it. This approach can work only if you trust the lower levels of code. If you really want to use this layer as a language, you have to be able to assume, as you would with any language, that any bugs you encounter are bugs in your application and not in the language itself.

So your new abstractions are supposed to bear this heavy burden of responsibility, and yet you're supposed to just spin them off as the need arises? Just so; in Lisp you can have both. When you write programs in a functional style and test them incrementally, you can have the flexibility of doing things on the spur of the moment, plus the kind of reliability one usually associates with careful planning.


4. Utility Functions

Common Lisp operators come in three types: functions and macros, which you can write yourself, and special forms, which you can't. This chapter describes techniques for extending Lisp with new functions. But "techniques" here means something different from what it usually does. The important thing to know about such functions is not how they're written, but where they come from. An extension to Lisp will be written using mostly the same techniques you would use to write any other Lisp function. The hard part of writing these extensions is not deciding how to write them, but deciding which ones to write.


4.1 Birth of a Utility

In its simplest form, bottom-up programming means second-guessing whoever designed your Lisp. At the same time as you write your program, you also add to Lisp new operators which make your program easy to write. These new operators are called utilities.

The term "utility" has no precise definition. A piece of code can be called a utility if it seems too small to be considered as a separate application, and too general-purpose to be considered as part of a particular program. A database program would not be a utility, for example, but a function which performed a single operation on a list could be. Most utilities resemble the functions and macros that Lisp has already. In fact, many of Common Lisp's built-in operators began life as utilities. The function remove-if-not, which collects all the elements of a list satisfying some predicate, was defined by individual programmers for years before it became a part of Common Lisp.
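For illustration, here is the built-in operator next to a sketch of how a programmer might have written such a utility by hand. The name keep-if is hypothetical, and the definition is one plausible version, not a historical one.

```lisp
;; The built-in operator.
(remove-if-not #'evenp '(1 2 3 4 5 6))   ; returns (2 4 6)

;; A hand-written equivalent for lists (hypothetical name).
(defun keep-if (pred lst)
  (let ((acc nil))
    (dolist (x lst (nreverse acc))        ; acc is freshly consed
      (if (funcall pred x)
          (push x acc)))))

(keep-if #'evenp '(1 2 3 4 5 6))         ; returns (2 4 6)
```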

Learning to write utilities would be better described as learning the habit of writing them, rather than the technique of writing them. Bottom-up programming means simultaneously writing a program and a programming language. To do this well, you have to develop a fine sense of which operators a program is lacking. You have to be able to look at a program and say, "Ah, what you really mean to say is this."

For example, suppose that nicknames is a function which takes a name and builds a list of all the nicknames which could be derived from it. Given this function, how do we collect all the nicknames yielded by a list of names? Someone learning Lisp might write a function like:

<