Revision 1.8
Robert Brown, François-René Rideau. In memoriam Dan Weinreb.
Common Lisp is a powerful multiparadigm programming language. With great power comes great responsibility. This guide recommends formatting and stylistic choices designed to make your code easier for other people to understand.
This guide is not a Common Lisp tutorial. For basic information about the language, please consult Practical Common Lisp. For a language reference, please consult the Common Lisp HyperSpec. For more detailed style guidance, take a look at Peter Norvig and Kent Pitman's style guide.
| Keyword | Meaning |
|---|---|
| MUST | This word, or the terms "REQUIRED" or "SHALL", means that the guideline is an absolute requirement. You must ask permission to violate a MUST. |
| MUST NOT | This phrase, or the phrase "SHALL NOT", means that the guideline is an absolute prohibition. You must ask permission to violate a MUST NOT. |
| SHOULD | This word, or the adjective "RECOMMENDED", means that there may exist valid reasons in particular circumstances to ignore the demands of the guideline, but the full implications must be understood and carefully weighed before choosing a different course. You must ask forgiveness for violating a SHOULD. |
| SHOULD NOT | This phrase, or the phrase "NOT RECOMMENDED", means that there may exist valid reasons in particular circumstances to ignore the prohibitions of this guideline, but the full implications should be understood and carefully weighed before choosing a different course. You must ask forgiveness for violating a SHOULD NOT. |
| MAY | This word, or the adjective "OPTIONAL", means that an item is truly optional. |
Unlike RFCs, we don't capitalize every instance of one of the above keywords when it is used.
Permission comes from the OWNERS of your project.
Forgiveness is requested in a comment near the point of guideline violation, and is granted by your code reviewer. The original comment should be signed by you, and the reviewer should add a signed approval to the comment at review time.
A lot of our code was written before these guidelines existed. You should fix violations as you encounter them in the course of your normal coding. You must not fix violations en masse without warning other developers and coordinating with them, so as not to make the merging of large branches more difficult than it already is.
i<
and friends
(premature optimization, bug introduction)
When making decisions about how to write a given piece of code, aim for the following -ilities in this priority order:
Most of these are obvious.
Usability by the customer means that the system has to do what the customer requires; it has to handle the customer's transaction volumes, uptime requirements, etc.
For the Lisp efficiency point, given two options of equivalent complexity, pick the one that performs better. (This is often the same as the one that conses less, i.e. allocates less storage from the heap.)
Given two options where one is more complex than the other, pick the simpler option and revisit the decision only if profiling shows it to be a performance bottleneck.
However, avoid premature optimization. Don't add complexity to speed up something that runs rarely, since in the long run, it matters less whether such code is fast.
If your work affects other groups, might be reusable across groups, adds new components, has an impact on other groups (including QA or Ops), or otherwise isn't purely local, you must write it up using at least a couple of paragraphs, and get a design approval from the other parties involved before starting to write code — or be ready to scratch what you have when they object.
If you don't know or don't care about these issues, ask someone who does.
Note that if you have a "clever" implementation trick, and your trick really is clever, then you must definitely not include it in business specific code; but it may have its place in an open-source library used by the code. If your idea is not general purpose enough to have any users beyond your regular business users, then it is definitely either not clever enough or way too clever, and in either case does not belong in the code.
If you write a general-purpose library, or modify an existing open-source library, you are encouraged to publish the result separate from your main project and then have your project import it like any other open-source library.
Use your judgement to distinguish general-purpose versus business-specific code, and open-source the general-purpose parts, while keeping the business-specific parts a trade secret.
Open-Sourcing code has many advantages, including being able to leverage third parties for development, letting the development of features be user-directed, and keeping you honest with respect to code quality. Whatever code you write, you will have to maintain anyway, and make sure its quality is high enough to sustain use in production. There should therefore be no additional burden to Open-Sourcing, even of code that (at least initially) is not directly usable by third parties.
xcvb-driver:with-controlled-compiler-conditions and xcvb-driver:*uninteresting-conditions* framework (also available as asdf-condition-control), either around the entire project, or around individual files (using asdf's :around-compile hooks).
You must use correct spelling in your comments, and most importantly in your identifiers.
When several correct spellings exist (including American vs English), and there isn't a consensus amongst developers as to which to use, you should choose the shorter spelling.
You must avoid using abbreviations for words, unless it's a word that is used very frequently, in which case you must use the same abbreviation consistently.
If you're not sure, consult a dictionary, Google for alternative spellings, or ask a local grammar nazi.
Here are examples of choosing the correct spelling:
Here are examples of choosing the shorter spelling:
Make appropriate exceptions for industry standard nomenclature/jargon, including plain misspellings. For instance:
Some line length restriction is better than none at all. Google Java developers have adopted a 100-column limitation on source code lines and C++ developers limit themselves to 80 columns. Common Lispers at ITA have long adopted the 100-column limit. Allowing 100 columns seems better, since good style encourages the use of descriptive variables and function names.
Indent your code the way GNU Emacs does.
Indent carefully to make the code easier to understand.
By default, GNU Emacs does an excellent job indenting Common Lisp code. It can be taught how to indent new defining forms and special rules for domain specific languages. Each project may have some file to customize indentation; use it.
Use indentation to make complex function applications easier to read. When an application does not fit on one line or the function takes many arguments, consider inserting newlines between the arguments so that each one is on a separate line. However, do not insert newlines in a way that makes it hard to tell how many arguments the function takes or where an argument form starts and ends.
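As a minimal sketch of the advice above, compare a call that fits on one line with one that does not. The function make-booking and its keyword arguments are hypothetical, defined here only so that the calls can be evaluated.

```lisp
;; Hypothetical function, defined only so the calls below can run.
(defun make-booking (&key passenger origin destination fare-class)
  "Return a list describing a booking."
  (list passenger origin destination fare-class))

;; Fits comfortably on one line: leave it alone.
(make-booking :passenger "Ann" :origin "BOS")

;; Many arguments: one per line, each aligned with the first,
;; so it is easy to see where every argument starts and ends.
(make-booking :passenger "Ann"
              :origin "BOS"
              :destination "SFO"
              :fare-class :economy)
```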
You must include maintainership and other important information at the top of each source file.
You should not include a copyright statement in source files.
Every source file may begin with a one-line description of the file.
After that optional description, every source file may prominently include a statement about who originally wrote the code, made major changes, and/or is the current owner/maintainer. This makes it easier for hackers to locate whom to ask questions about the code, or to identify that no one is left to reply to such inquiries. However, consider that the information is not very helpful if it is not maintained; unless it brings information that cannot be easily extracted from source control, it is better skipped.
After that optional statement, every file should follow with a brief explanation of what the file contains.
After that explanation, every file should start the code itself with an (in-package :package-name) form.
After that in-package form, every file should follow with any file-specific (declaim (optimize ...)) declaration that is not covered by an asdf :around-compile hook.
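The top of a source file laid out in this order might look like the sketch below. The description, the package name fare-rules and the optimization settings are all hypothetical; the defpackage form would normally live in its own file and appears here only so that the sketch can be loaded on its own.

```lisp
;;;; Parsing of fare rules.  (Hypothetical one-line description.)

;;; This file contains the reader for the textual fare rule format
;;; and the functions that convert rules into internal objects.

;; Normally defined elsewhere; included so this sketch is loadable.
(defpackage :fare-rules
  (:use :common-lisp))

(in-package :fare-rules)

;; File-specific settings, needed only when no asdf :around-compile
;; hook already covers them.
(declaim (optimize (speed 1) (safety 3) (debug 2)))
```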
You should not include copyright information in individual source code files. An exception is made for files meant to be disseminated as standalone.
Each project or library has a single file specifying its license. Absence of a LICENSE or COPYING file means the project is proprietary code.
You should include one blank line between top-level forms, such as function definitions. Exceptionally, blank lines can be omitted between simple, closely related defining forms of the same kind, such as a group of related type declarations or constant definitions.
Blank lines can be used to separate parts of a complicated function. Generally, however, you should break a large function into smaller ones instead of trying to make it more readable by adding vertical space. If you can't, you should document with a ;; comment what each of the separated parts of the function does.
Every top-level form should be fewer than 61 lines long, including comments but excluding the documentation string. This applies to each of the forms in an eval-when, rather than to the eval-when itself. Additionally, defpackage forms may be longer, since they may include long lists of symbols.
You must not include extra horizontal whitespace before or after parentheses or around symbols.
You must not place right parentheses by themselves on a line. A set of consecutive trailing parentheses must appear on the same line.
You should use only one space between forms.
You should not use spaces to vertically align forms in the middle of consecutive lines. An exception is made when the code possesses an important yet otherwise not visible symmetry that you want to emphasize.
You must align nested forms if they occur across more than one line.
The convention is that the body of a binding form is indented two spaces after the form. Any binding data before the body is usually indented four spaces. Arguments to a function call are aligned with the first argument; if the first argument is on its own line, it is aligned with the function name.
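The following sketch shows these indentation conventions in a single form, which is how GNU Emacs indents it by default. The function format-cents is a hypothetical example.

```lisp
(defun format-cents (total-cents)
  "Return TOTAL-CENTS, a non-negative integer, as a string such as \"12.05\"."
  (multiple-value-bind (dollars cents)
      (floor total-cents 100)           ; binding data: four spaces
    (format nil "~D.~2,'0D"             ; body: two spaces
            dollars                     ; arguments align with the first one
            cents)))
```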
An exception to the rule against lonely parentheses is made for an eval-when form around several definitions; in this case, include a comment ; eval-when after the closing parenthesis.
You must set your editor to avoid inserting tab characters in the files you edit. Tabs cause confusion when editors disagree on how many spaces they represent. In Emacs, do (setq-default indent-tabs-mode nil).
You must comment anything complicated so that the next developer can understand what's going on. (Again, the "hit by a truck" principle.)
Unless some bit of code is painfully self-explanatory, document it.
Prefer documentation strings to comments because the former can be displayed by programming tools, such as IDEs, or by REPL queries such as (describe 'foo); they can also be extracted to create web-based documentation or other reference works.
Supply a documentation string (also known as docstring) when defining top-level functions, types, classes, and macros. Generally, add a documentation string wherever the language allows.
For functions, the docstring should describe the function's contract: what the function does, what the arguments mean, what values are returned, what conditions the function can signal. It should be expressed at the appropriate level of abstraction, explaining the intended meaning rather than, say, just the syntax. In documentation strings, capitalize the names of Lisp symbols, such as function arguments. For example, "The value of LENGTH should be an integer."
A long docstring may usefully begin with a short, single-sentence summary, followed by the larger body of the docstring.
When the name of a type is used, the symbol may be quoted by surrounding it with a back quote at the beginning and a single quote at the end. Emacs will highlight the type, and the highlighting serves as a cue to the reader that M-. will lead to the symbol's definition.
Every method of a generic function should be independently documented when the specialization affects what the method does, beyond what is described in its generic function's docstring.
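A docstring following these recommendations might look like the sketch below: a one-sentence summary first, argument names capitalized, the type name quoted with a back quote and a single quote, and the signaled condition mentioned. The function clamp-length is a hypothetical example.

```lisp
(defun clamp-length (length limit)
  "Return LENGTH, or LIMIT if LENGTH exceeds it.

The values of LENGTH and LIMIT should be of type `integer'.
Signals a TYPE-ERROR if either of them is not an integer."
  (check-type length integer)
  (check-type limit integer)
  (min length limit))
```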
When you fix a bug, consider whether what the fixed code does is obviously correct or not; if not, you must add a comment explaining the reason for the code in terms of fixing the bug. Adding the bug number, if any, is also recommended.
You should include a space between the semicolon and the text of the comment.
When a comment is a full sentence, you should capitalize the initial letter of the first word and end the comment with a period. In general, you should use correct punctuation.
Use --- for comments requiring special attention, including unobvious tricks, TODO items, questions, breakage, danger.
;--- prefixes a cautionary comment, e.g. explaining why the code in question is particularly tricky, delicate, or non-obvious.
;---??? prefixes a serious question which needs to be resolved soon, by fixing either the code or its documentation.
;---!!! identifies code which is broken, but which for some reason you cannot fix at this time. You should not use this often for new code.
;---*** identifies active DANGER, for instance where important functionality is stubbed out, or a large design issue remains unresolved. Anything so marked must be fixed before code is rolled into production.
You must sign and date any of the above "requiring further attention" comments (but not mere cautionary explanations).
This strategy ensures that grepping for ;--- will always yield all the comments that require caution, as well as whom to talk to about each one.
Only use ;--- on the first line of such a comment. Other lines should use spaces to align vertically. This way, grepping will also yield a count of the number of issues.
You should insert a space after this comment prefix.
You may use these with multiple-semicolon comments as well.
Some people like to use words like FIXME or TODO. You may use these, but they must be preceded with ---.
Use TODO comments when the code is known to be incomplete and you want to indicate what work remains to be done.
The comments begin with TODO in all capital letters, followed by your email address or other identifier in parentheses, followed by a colon, a space, and an explanation of what additional work is desirable or required. The user name included in the comment is that of a person who understands the deficiency. A TODO comment is not a commitment to fix the problem.
When signing comments, you should use your username (for code within the company) or full email address (for code visible outside the company), not just initials.
Be specific when indicating times or software releases in a TODO comment:
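As an illustration only, a TODO comment that names a specific release and date could look like the sketch below. The email address, release number, date and function are all hypothetical. Note the --- prefix on the first line only, with the continuation line aligned using spaces.

```lisp
;;--- TODO(alice@example.com): Remove this fallback after release 2.4
;;    or before 2031-06-30, whichever comes first.
(defun legacy-fare-p (fare)
  "Return true if FARE uses the old fare representation."
  (eq fare :legacy))
```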
You should design your Domain Specific Language to be easy to read and understand by people familiar with the domain.
You must properly document all your Domain Specific Language.
Sometimes, your DSL is designed for terseness. In that case, it is important to document what each program does, if it's not painfully obvious from the context.
Notably, when you use regular expressions (e.g. with the CL-PPCRE package), you MUST ALWAYS put in a comment (usually a two-semicolon comment on the previous line) explaining, at least basically, what the regular expression does, or what the purpose of using it is. The comment need not spell out every bit of the syntax, but it should be possible for someone to follow the logic of the code without actually parsing the regular expression.
Use lower case for all symbols. Consistently using lower case makes searching for symbol names easier and is more readable.
Note that Common Lisp is case-converting, and that the symbol-name of your symbols will be upper case. Because of this case-converting, attempts to distinguish symbols by case are defeated, and only result in confusion. While it is possible to escape characters in symbols to force lower case, you should not use this capability unless this is somehow necessary to interoperate with third-party software.
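The case conversion described above can be observed directly. The sketch below assumes the standard readtable, whose case mode is :upcase; the symbol names are hypothetical, and the mixed-case spelling appears only to show that it makes no difference.

```lisp
;; Assumes the standard readtable, whose case mode is :upcase.
(defun symbol-case-examples ()
  "Return a list showing how the reader treats case in symbol names."
  (list (symbol-name 'fare-class)        ; upcased by the reader
        (eq 'fare-class 'Fare-Class)     ; the same symbol either way
        (symbol-name '|fare-class|)))    ; escaped: case is preserved
```

Under that assumption, (symbol-case-examples) returns ("FARE-CLASS" T "fare-class").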
Place hyphens between all the words in a symbol. If you can't easily say an identifier out loud, it is probably badly named.
You must not use "/" or "." instead of "-" unless you have a well-documented overarching reason to, and permission from other hackers who review your proposal.
Generally, you should not abbreviate words. You must avoid using abbreviations for words, unless it's a word that is used very frequently, in which case you must use the same abbreviation consistently. Abbreviations may also be used sparingly to avoid overly long symbol names; it's easy to run into the 100-column limit when there are very long names! You must especially avoid inconsistent abbreviations in exported names. For lexical variables of limited scope, abbreviations are fine.
There are conventions in Common Lisp for the use of punctuation in symbols. You should not use punctuation in symbols outside these conventions.
Unless the scope of a variable is very small, do not use overly short names like i and zq.
You should name a variable according to the high-level concept that it represents, not according to the low-level implementation details of how the concept is represented.
Thus, you should avoid embedding data structure or aggregate type names, such as list, array, or hash-table inside variable names, unless you're writing a generic algorithm that applies to arbitrary lists, arrays, hash-tables, etc. In that case it's perfectly OK to name a variable list or array.
Indeed, you should be introducing new abstract data types with DEFCLASS or DEFTYPE, whenever a new kind of intent appears for objects in your protocols. Functions that manipulate such objects generically may then use variables whose names reflect that abstract type.
For example, if a variable's value is always a row (or is either a row or NIL), it's good to call it row or first-row or something like that. It is all right if row has been DEFTYPE'd to STRING, precisely because you have abstracted the detail away, and the remaining salient point is that it is a row. You should not name the variable STRING in this context, except possibly in low-level functions that specifically manipulate the innards of rows to provide the suitable abstraction.
Be consistent. If a variable is named row in one function, and its value is being passed to a second function, then call it row rather than, say, value (this was a real case).
When naming public symbols of a package, you should not include the package name as a prefix. Naming functions this way makes them awkward to use from client packages with package-qualified symbols.
An exception to the above rule would be to include a prefix for the names of variables that would otherwise be expected to clash with variables in packages that use the current one. For instance, ASDF exports a variable *asdf-verbose* that controls the verbosity of ASDF only, rather than of the entire Lisp program.
The names of global constants should start and end with plus characters.
Global variable names should start and end with asterisks (also known in this context as earmuffs).
In some projects, parameters that are not meant to be usually modified or bound under normal circumstances (but may be during experimentation or exceptional situations) should start (but not end) with a dollar sign. If such a convention exists within your project, you should follow it consistently. Otherwise, you should avoid naming variables like this.
Common Lisp does not have global lexical variables, so a naming convention is used to ensure that globals, which are dynamically bound, never have names that overlap with local variables.
It is possible to fake global lexical variables with a differently named global variable and a DEFINE-SYMBOL-MACRO. You should not use this trick.
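The plus-sign and earmuff conventions for global constants and variables can be sketched as follows. All names here are hypothetical.

```lisp
(defconstant +seconds-per-hour+ 3600
  "Number of seconds in one hour.")

(defvar *default-timeout* (* 2 +seconds-per-hour+)
  "Timeout, in seconds, used when the caller does not supply one.")

(defun remaining-time (elapsed &optional (timeout *default-timeout*))
  "Return the seconds left before TIMEOUT, given ELAPSED seconds so far."
  (- timeout elapsed))
```

Because of the earmuffs, a local variable named timeout can never accidentally rebind the dynamic variable *default-timeout*.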
"P"
.
Name boolean-valued functions with a trailing "P" or "-P", to indicate they are predicates. Generally, you should use "P" when the rest of the function name is one word and "-P" when it is more than one word.
For uniformity, you should follow the convention above, and not one of the alternatives below.
Alternative rules used in some existing packages are to always use "-P", or to always use "?". When you develop such a package, you must be consistent with the rest of the package. When you start a new package, you should not use such an alternative rule without a very good reason.
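For instance, under the recommended convention a one-word predicate takes a bare "P" and a multi-word predicate takes "-P". Both functions below are hypothetical examples.

```lisp
(defun blankp (string)
  "Return true if STRING is empty or contains only spaces."
  (every (lambda (char) (char= char #\Space)) string))

(defun leap-year-p (year)
  "Return true if YEAR is a leap year in the Gregorian calendar."
  (and (zerop (mod year 4))
       (or (plusp (mod year 100))
           (zerop (mod year 400)))))
```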
Lisp is best used as a "mostly functional" language.
Avoid modifying local variables; try rebinding instead.
Avoid creating objects and then SETFing their slots. It's better to set the slots during initialization.
Make classes as immutable as possible, that is, avoid giving slots setter functions if at all possible.
Using a mostly functional style makes it much easier to write concurrent code that is thread-safe. It also makes it easier to test the code.
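A small sketch of the difference, using hypothetical names: the first function assigns to a local variable on every iteration, the second computes the same value without any assignment, and the class sets its slots at initialization time and offers readers only.

```lisp
;; Mutating style: SUM is modified in place on every iteration.
(defun total-mutating (prices)
  (let ((sum 0))
    (dolist (price prices)
      (setf sum (+ sum price)))
    sum))

;; Mostly functional style: no variable is ever assigned.
(defun total-functional (prices)
  (reduce #'+ prices))

;; Slots are set through initargs; there are readers but no writers.
(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :reader point-y)))
```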
Common Lisp systems are not required to implement constant space optimizations for recursive function calls from tail positions; however, most serious implementations (including SBCL and CCL) do implement proper tail calls. Still, even compilers that implement proper tail calls do so only under restricted conditions:
The (DECLARE (OPTIMIZE ...)) settings must favor SPEED enough and not favor DEBUG too much, for some compiler-dependent meanings of "enough" and "too much". (For instance, in SBCL, you should avoid (SPEED 0) and (DEBUG 3) to achieve tail call elimination.)
For compatibility with all compilers and to avoid stack overflow when debugging, you should use iteration or the built-in mapping functions rather than relying on proper tail calls.
If you do rely on proper tail calls, you must prominently document the fact, and take appropriate measures to ensure an appropriate compiler is used with appropriate optimization settings. For fully portable code, you may have to use trampolines instead.
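To make the trade-off concrete, here is a hypothetical pair of functions that compute the same sum. The first relies on the compiler eliminating the tail call; the second uses iteration and needs constant stack space on any implementation.

```lisp
;; Relies on proper tail calls: may exhaust the stack on long lists
;; when the compiler does not eliminate the tail call.
(defun sum-recursive (numbers &optional (total 0))
  (if (endp numbers)
      total
      (sum-recursive (rest numbers) (+ total (first numbers)))))

;; Iterative equivalent: constant stack space everywhere.
(defun sum-iterative (numbers)
  (loop for number in numbers
        sum number))
```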
Lisp "special" (dynamically bound) variables should be used as implicit arguments to functions sparingly, and only in cases where it won't surprise the person reading the code, and where it offers significant benefits.
Good candidates for such special variables are items for which "the current" can be naturally used as prefix, such as "the current database connection" or "the current business data source". They are singletons as far as the rest of the code is concerned, and often passing them as an explicit argument does not add anything to the readability or maintainability of the source code in question.
They can make it easier to write code that can be refactored. If you have a request processing chain, with a number of layers that all operate upon a "current" request, passing the request object explicitly to every function requires that every function in the chain have a request argument. Factoring out code into new functions often requires that these functions also have this argument, which clutters the code with boilerplate.
Note that a Lisp special variable is not a global variable in the sense of a global variable in, say, BASIC or C. As special variables can be dynamically bound, they are much more powerful than global value cells that can be changed from everywhere.
You should treat special variables as though they are per-thread variables. That is, leave the special variable with no top-level binding at all, and each thread of control that needs the variable should bind it explicitly. This will mean that any incorrect use of the variable will result in an "unbound variable" error, and each thread will see its own value for the variable.
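A minimal sketch of this discipline follows. The variable *current-connection* and both functions are hypothetical, and a string stands in for a real connection object. The variable has no top-level binding, so using it outside a binding signals an "unbound variable" error.

```lisp
;; Deliberately left without a top-level binding.
(defvar *current-connection*)

(defun connection-name ()
  "Return the name of the current connection."
  ;; Documents the implicit argument and helps detect bugs.
  (check-type *current-connection* string)
  *current-connection*)

(defun call-with-connection (name function)
  "Call FUNCTION with *CURRENT-CONNECTION* bound to NAME."
  (let ((*current-connection* name))
    (funcall function)))
```

Each thread of control that needs the variable would bind it in this way, for example with (call-with-connection "primary" #'connection-name).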
There are several styles for dealing with assignment and side-effects; whichever a given package is using, keep using the same consistently when hacking said package. Pick a style that makes sense when starting a new package.
Regarding multiple assignment in the same form, there are two schools: the first style groups as many assignments as possible into a single SETF or PSETF form, thus minimizing the number of forms with side-effects; the second style splits assignments into as many individual SETF (or SETQ, see below) forms as possible, to maximize the chances of locating forms that modify a kind of place by grepping for "(setf (foo ...". A grep pattern must actually contain as many place-modifying forms as you may use in your programs, which may make this rationale either convincing or moot depending on the rest of the style of your code. You should follow the convention used in the package you are hacking. We recommend the first convention for new packages.
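The two styles of multiple assignment look like this on a hypothetical cursor represented as a cons cell:

```lisp
(defun reset-cursor-grouped (cursor)
  ;; First style: a single SETF form holds all the assignments.
  (setf (car cursor) 0
        (cdr cursor) 0)
  cursor)

(defun reset-cursor-split (cursor)
  ;; Second style: one SETF per place, so that grepping for
  ;; "(setf (car" finds every form that modifies that kind of place.
  (setf (car cursor) 0)
  (setf (cdr cursor) 0)
  cursor)
```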
Regarding SETF and SETQ, there are two schools: the first regards SETQ as an archaic implementation detail, and avoids it entirely in favor of SETF; the second regards SETF as an additional layer of complexity, and avoids it in favor of SETQ whenever possible (i.e. whenever the assigned place is a variable or symbol-macro). You should follow the convention used in the package you are hacking. We recommend the first convention for new packages.
In the spirit of a mostly pure functional style, which makes testing and maintenance easier, we invite you to consider how to do things with the fewest assignments required.
Lisp packages are used to demarcate namespaces. Usually, each system has its own namespace. A package has a set of external symbols, which are intended to be used from outside the package, in order to allow other modules to use this module's facilities.
The internal symbols of a package should never be referred to from other packages. That is, you should never have to use the double-colon :: construct (e.g. QUAKE::HIDDEN-FUNCTION). If you need to use double-colons to write real production code, something is wrong and needs to be fixed.
As an exception, unit tests may use the internals of the package being tested. So when you refactor, look at the package's unit tests.
The :: construct is also useful for very temporary hacks, and at the REPL. But if the symbol really is part of the externally-visible definition of the package, export it.
Each package is one of two types:
Intended to be included in the :use specification of other packages. If package A "uses" package B, then the external symbols of package B can be referenced from within package A without a package prefix. We mainly use this for low-level modules that provide widely-used facilities.
Not intended to be "used". To reference a facility provided by package B, code in package A must use an explicit package prefix, e.g. B:DO-THIS.
If you add a new package, it should always be of the second type, unless you have a special reason and get permission. Usually a package is designed to be one or the other, by virtue of the names of the functions. For example, if you have an abstraction called FIFO, and it were in a package of the first type you'd have functions named things like FIFO-ADD-TO and FIFO-CLEAR-ALL. If you used a package of the second type, you'd have names like ADD-TO and CLEAR-ALL, because the callers would be saying FIFO:ADD-TO and FIFO:CLEAR-ALL. (FIFO:FIFO-CLEAR-ALL is redundant and ugly.)
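A package of the second type could be sketched as below. The FIFO package follows the example in the text, but its list-based representation and the names make-queue, items and queue-demo are assumptions made for illustration.

```lisp
(defpackage :fifo
  (:use :common-lisp)
  (:export #:make-queue #:add-to #:clear-all #:items))

(in-package :fifo)

(defun make-queue ()
  "Return a new, empty queue."
  (list nil))

(defun add-to (queue item)
  "Add ITEM at the end of QUEUE and return QUEUE."
  (setf (first queue) (append (first queue) (list item)))
  queue)

(defun clear-all (queue)
  "Remove every item from QUEUE and return QUEUE."
  (setf (first queue) nil)
  queue)

(defun items (queue)
  "Return the items of QUEUE, oldest first."
  (first queue))

(in-package :common-lisp-user)

;; Client code names the package explicitly; it does not :use FIFO.
(defun queue-demo ()
  (let ((queue (fifo:make-queue)))
    (fifo:add-to queue :a)
    (fifo:add-to queue :b)
    (fifo:items queue)))
```

The short exported names read naturally at the call site precisely because the package prefix is always present.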
Another good thing about packages is that your symbol names won't "collide" with the names of other packages, except the ones your package "uses". So you have to stay away from symbols that are part of the Lisp implementation (since you always "use" that) and that are part of any other packages you "use", but otherwise you are free to make up your own names, even short ones, and not worry about someone else having used the same name. You're isolated from each other.
Your package must not shadow (and thus effectively redefine) symbols that are part of the Common Lisp language. There are certain exceptions, but they should be very well-justified and extremely rare:
log:error and log:warn and so on.
ASSERT should be used ONLY to detect internal bugs. Code should ASSERT invariants whose failure indicates that the software is itself broken. Incorrect input should be handled properly at runtime, and must not cause an assertion violation. The audience for an ASSERT failure is a developer. Do not use the datum-form and argument-forms in ASSERT to specify a condition to signal. It's fine to use them to print out a message for debugging purposes (and since it's only for debugging, there's no issue of internationalization).
CHECK-TYPE and ETYPECASE are also forms of assertion. When one of these fails, that's a detected bug. You should prefer to use CHECK-TYPE over (DECLARE (TYPE ...)) for the inputs of functions.
ERROR should be used to detect problems with user data, requests, permissions, etc., or to report "unusual outcomes" to the caller. ERROR should always be called with an explicit condition type; it should never simply be called with a string. This enables internationalization.
ERROR instead of ASSERT.
WARN. Instead, you should use the appropriate logging framework.
SIGNAL. Instead, use ERROR or ASSERT.
THROW and CATCH; instead use the restart facility.
T, or use IGNORE-ERRORS. Instead, let unknown conditions propagate to the standard ultimate handler for processing.
ERROR, not T and not SERIOUS-CONDITION. (This is notably because CCL's process shutdown depends on being able to signal process-reset and have it handled by CCL's handler, so we must not interpose our own handler.)
(error (make-condition 'foo-error ...)) is equivalent to (error 'foo-error ...); code must use the shorter form.
UNWIND-PROTECT (unless they are always handled inside the cleanup form), or otherwise do non-local exits from cleanup handlers outside of the handler e.g. INVOKE-RESTART.
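As a sketch of calling ERROR with an explicit condition type, in the shorter form, consider the following. The condition insufficient-funds and the function withdraw are hypothetical.

```lisp
(define-condition insufficient-funds (error)
  ((balance :initarg :balance :reader insufficient-funds-balance)
   (amount :initarg :amount :reader insufficient-funds-amount))
  (:report (lambda (condition stream)
             (format stream "Cannot withdraw ~D from a balance of ~D."
                     (insufficient-funds-amount condition)
                     (insufficient-funds-balance condition)))))

(defun withdraw (balance amount)
  "Return the balance left after withdrawing AMOUNT from BALANCE.
Signals INSUFFICIENT-FUNDS if AMOUNT exceeds BALANCE."
  (when (> amount balance)
    ;; Shorter form: no explicit MAKE-CONDITION, and no bare string.
    (error 'insufficient-funds :balance balance :amount amount))
  (- balance amount))
```

A caller can then handle exactly this condition type instead of matching on message text.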
If your function is using a special variable as an implicit argument, it's good to put in a CHECK-TYPE for the special variable, for two reasons: to clue in the person reading the code that this variable is being used implicitly as an argument, and also to help detect bugs.
Using (declare (type ...)) is the least-desirable mechanism to use because, as Scott McKay puts it:
The fact is, (declare (type ...)) does different things depending on the compiler settings of speed, safety, etc. In some compilers, when speed is greater than safety, (declare (type ...)) will tell the compiler "please assume that these variables have these types" without generating any type-checks. That is, if some variable has the value 1432 in it, and you declare it to be of type string, the compiler might just go ahead and use it as though it's a string.
Moral: don't use (declare (type ...)) to declare the contract of any API functions, it's not the right thing. Sure, use it for "helper" functions, but not API functions.
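In practice, the contract of an API function can be checked with CHECK-TYPE, which signals a TYPE-ERROR regardless of the optimization settings. The function repeat-string below is a hypothetical example.

```lisp
(defun repeat-string (string count)
  "Return STRING concatenated with itself COUNT times."
  (check-type string string)
  (check-type count (integer 0))
  (with-output-to-string (out)
    (dotimes (i count)
      (write-string string out))))
```

Passing 1432 as the first argument is then reported as a type error rather than being silently treated as a string.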
You must never use a macro where a function will do. That is, if the semantics of what you are writing conforms to the semantics of a function, then write it as a function rather than a macro.
You must not use a macro for performance reasons. If profiling shows that you have a performance problem with a specific function, document the need and profiling-results appropriately, and DECLAIM that function INLINE.
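A minimal sketch, with a hypothetical function and a placeholder for the profiling note:

```lisp
;; Declared INLINE because profiling showed the call overhead to be
;; significant in the pricing loop.  (Hypothetical justification.)
(declaim (inline square))
(defun square (number)
  "Return NUMBER multiplied by itself."
  (* number number))
```

The DECLAIM must come before the DEFUN so that the compiler can record the definition for inlining at later call sites.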
You can also use "compiler-macros" as a way to speed up function execution by specifying a source-to-source transformation. Beware that it interferes with tracing the optimized functions.
When you write a macro-defining macro (a macro that generates macros), comment it particularly clearly, since these are hard for the uninitiated to understand.
Using Lisp macros properly requires taste. Avoid writing complicated macros unless the benefit clearly outweighs the cost. It takes more effort for your fellow developers to learn your macro, so you should only use a macro if the gain in expressiveness is big enough to justify that cost. As usual, feel free to consult your colleagues if you're not sure, since without a lot of Lisp experience, it can be hard to make this judgment.
You must not define new reader macros.
If your macro has a parameter that is a Lisp form
that will be evaluated when the expanded code is run,
you should name the parameter with the suffix -form
.
This convention helps make it clearer to the macro's user
which parameters are Lisp forms to be evaluated, and which are not.
One way to write a macro is the so-called "call-with" style, explained at length in http://random-state.net/log/3390120648.html. The idea is to keep the macro very simple, generating a call to an auxiliary function, which often takes a functional argument consisting of code in the original macro call. Advantages: during development, you can modify the function instead of recompiling all macro call sites; during debugging, you can see the function in the stack trace; there is less generated code so smaller memory usage. You should use this style unless the macro body is simple, rarely subject to change, and the macro is used in tight loops where performance matters. Think about whether the extra stack frames are helpful or just clutter.
Any functions (closures) created by the macro should be named:
either use FLET
or NAMED-LAMBDA
.
Using FLET
is also good
because you can declare the function to be of dynamic
extent (if it is — and usually it is).
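A minimal sketch of the "call-with" style, assuming a hypothetical *LIGHTS-ON* variable and WITH-LIGHTS-ON macro. The macro only wraps its body in a named local function and passes it to an ordinary function that does the real work:

```lisp
(defvar *lights-on* nil
  "Hypothetical state toggled by this sketch.")

(defun call-with-lights-on (thunk)
  "Call THUNK with the lights on. All the logic lives here, in a function."
  (let ((*lights-on* t))
    (funcall thunk)))

(defmacro with-lights-on (() &body body)
  ;; The empty parameter list leaves room for future options.
  (let ((thunk (gensym "BODY")))
    `(flet ((,thunk () ,@body))
       ;; The closure is only called during CALL-WITH-LIGHTS-ON
       ;; and never stored, so dynamic extent is a true assertion here.
       (declare (dynamic-extent #',thunk))
       (call-with-lights-on #',thunk))))
```

Because the macro expands to a single function call, CALL-WITH-LIGHTS-ON can be redefined during development without recompiling the call sites.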
If a macro call contains a form,
and the macro expansion includes more than one copy of that form,
the form can be evaluated more than once.
If someone uses the macro and calls it
with a form that has side effects or that takes a long time to compute,
the behavior will be undesirable
(unless you're intentionally writing
a control structure such as a loop).
A convenient way to avoid this problem
is to evaluate the form only once,
and bind a (generated) variable to the result.
There is a very useful macro called ALEXANDRIA:ONCE-ONLY
that generates code to do this.
See also ALEXANDRIA:WITH-GENSYMS
,
to make some temporary variables in the generated code.
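A small sketch of the problem and of the manual fix, using hypothetical macro names. It uses only GENSYM, so it does not depend on Alexandria; ONCE-ONLY generates the same kind of binding for you:

```lisp
;; Hypothetical macros, for illustration only.
(defmacro square-twice-evaluated (form)
  ;; Bad: FORM appears twice in the expansion, so it runs twice.
  `(* ,form ,form))

(defmacro square (form)
  ;; Good: evaluate FORM once and bind the result to a fresh variable.
  (let ((value (gensym "VALUE")))
    `(let ((,value ,form))
       (* ,value ,value))))
```

With a side-effecting argument such as (incf n), the first macro increments N twice, while the second increments it once.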
When you write a macro with a body,
such as a WITH-xxx
macro,
even if there aren't any parameters,
you should leave space for them anyway.
For example, if you invent WITH-LIGHTS-ON
,
do not define it as
(defmacro with-lights-on (&body b) ...)
.
Instead, do (defmacro with-lights-on (() &body b) ...)
.
That way, if parameters are needed in the future,
you can add them without necessarily having to change
all the uses of the macro.
You should use #. sparingly,
and you must avoid read-time side-effects.
The #.
standard read-macro
will read one object, evaluate the object, and
have the reader return the resulting value.
It is mainly used as a quick way
to get something evaluated at compile time
(actually "read time" but it amounts to the same thing).
If you use this, the evaluation MUST NOT have any side effects
and MUST NOT depend on any variable global state.
The #.
should be treated as a way
to force "constant-folding"
that a sufficiently-clever compiler
could have figured out all by itself,
when the compiler isn't sufficiently-clever
and the difference matters.
Consider using a DEFCONSTANT
and its variants,
which would give the value a name explaining what it means.
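As a sketch of the difference, assuming a hypothetical +SECONDS-PER-DAY+ constant: both functions compute the same result, but only the first gives the number a name.

```lisp
(defconstant +seconds-per-day+ (* 24 60 60)
  "Number of seconds in a day (a hypothetical constant for this sketch).")

(defun days-to-seconds (days)
  ;; Preferred: the constant explains what the value means.
  (* days +seconds-per-day+))

(defun days-to-seconds/read-time (days)
  ;; #. folds the product at read time; the reader returns a bare 86400.
  ;; The evaluated form has no side effects and reads no global state.
  (* days #.(* 24 60 60)))
```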
EVAL-WHEN
is tricky. Be aware.
Lisp evaluation happens at several "times". Be aware of them when writing macros. See the article "EVAL-WHEN considered harmful to your mental health".
In summary of the article linked above,
unless you're doing truly advanced macrology,
the only valid combination in an EVAL-WHEN
is to include all of
(eval-when (:compile-toplevel :load-toplevel :execute) ...)
It is usually an error to omit the :execute,
for it prevents LOADing the source rather than the fasl.
It is usually an error to omit the :load-toplevel
(except to modify e.g. readtables and compile-time settings),
for it prevents LOADing future files
or interactively compiling code
that depend on the effects that happen at compile-time
unless the current file was COMPILE-FILEd
within the same Lisp session.
In some odd cases, you may want to evaluate things from within
the expansion of a DEFTYPE
or of a non-top-level DEFMACRO
.
In these cases, you should use ASDF-FINALIZERS
and its ASDF-FINALIZERS:EVAL-AT-TOPLEVEL
form.
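A minimal sketch of the common case, with hypothetical names: a helper function that a macro calls while expanding must exist at compile time, at load time, and when the source is loaded directly, hence all three situations.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  ;; GETTER-NAME is a hypothetical helper used during macroexpansion.
  (defun getter-name (slot)
    (intern (concatenate 'string "GET-" (symbol-name slot)))))

(defmacro define-getter (slot)
  "Define GET-<SLOT>, which looks up the keyword :<SLOT> in a plist."
  `(defun ,(getter-name slot) (plist)
     (getf plist ,(intern (symbol-name slot) :keyword))))

;; Expanding this form calls GETTER-NAME at compile time.
(define-getter color)
```

Without the EVAL-WHEN, compiling this file in a fresh image would fail at the DEFINE-GETTER form because GETTER-NAME would not be defined yet.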
When a generic function is intended to be called from other
modules (other parts of the code), there should be an
explicit DEFGENERIC
form,
with a :DOCUMENTATION
string
explaining the generic contract of the function
(as opposed to its behavior for some specific class).
It's generally good to do explicit DEFGENERIC
forms,
but for module entry points it is mandatory.
When the argument list of a generic function includes
&KEY
,
the DEFGENERIC
should always explicitly list
all of the keyword arguments that are acceptable,
and explain what they mean.
(Common Lisp does not require this, but it is good form,
and it may avoid spurious warnings on SBCL.)
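A sketch of such a DEFGENERIC, assuming a hypothetical DESCRIBE-SHAPE protocol and CIRCLE class. The generic function documents the contract and names its keyword argument; the method only implements it:

```lisp
(defgeneric describe-shape (shape &key units)
  (:documentation
   "Return a string describing SHAPE.
UNITS is a string naming the unit of length; it defaults to \"m\"."))

(defclass circle ()
  ((radius :initarg :radius :reader circle-radius)))

(defmethod describe-shape ((shape circle) &key (units "m"))
  (format nil "circle of radius ~A ~A" (circle-radius shape) units))
```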
You should avoid SLOT-VALUE
and WITH-SLOTS
,
unless you absolutely intend to circumvent
any sort of method combination that might be in effect for the slot.
Rare exceptions include INITIALIZE-INSTANCE
and PRINT-OBJECT
methods and
the initialization of Quake volatile slots
in INITIALIZE-RECORD
methods.
Otherwise, you should use accessors or WITH-ACCESSORS.
Accessor names generally follow a convention of
<protocol-name>-<slot-name>
,
where a "protocol" in this case loosely indicates
a set of functions with well-defined behavior;
a class can implement all or part of an interface
by defining some methods for (generic) functions in the protocol,
including readers and writers.
No implication of a formal "protocol" concept is intended.
For example, if there were a "notional" protocol called
pnr
with accessors pnr-segments
and pnr-passengers
, then
the classes air-pnr
, hotel-pnr
and
car-pnr
could each reasonably implement
methods for pnr-segments
and pnr-passengers
as accessors.
By default, an abstract base class name is used
as the notional protocol name, so accessor names default
to <class-name>-<slot-name>
;
while such names are thus quite prevalent,
this form is neither required nor even preferred.
In general, it contributes to "symbol bloat",
and in many cases has led to a proliferation of "trampoline" methods.
Accessors named <slot-name>-of
should not be used.
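The pnr example can be sketched as follows. The class and accessor names come from the text above; the slot contents and the PNR-SIZE helper are assumptions made for illustration:

```lisp
(defgeneric pnr-segments (pnr)
  (:documentation "Return the list of segments recorded in PNR."))

(defgeneric pnr-passengers (pnr)
  (:documentation "Return the list of passengers recorded in PNR."))

(defclass air-pnr ()
  ((segments :initarg :segments :accessor pnr-segments)
   (passengers :initarg :passengers :accessor pnr-passengers)))

(defclass hotel-pnr ()
  ((segments :initarg :segments :accessor pnr-segments)
   (passengers :initarg :passengers :accessor pnr-passengers)))

(defun pnr-size (pnr)
  "Hypothetical helper: works on any class implementing the pnr protocol."
  (list (length (pnr-segments pnr))
        (length (pnr-passengers pnr))))
```

PNR-SIZE never mentions a concrete class or SLOT-VALUE; it relies only on the accessors of the protocol.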
Explicit DEFGENERIC
forms should be used when there are
(or it is anticipated that there will be)
more than one DEFMETHOD
for that generic function.
The reason is that the documentation for the generic function
explains the abstract contract for the function,
as opposed to explaining what an individual method does for
some specific class(es).
You must not use generic functions where there is no "notional" protocol. To put it more concretely, if you have more than one generic function that specializes its Nth argument, the specializing classes should all be descendants of a single class. Generic functions must not be used for "overloading", i.e. simply to use the same name for two entirely unrelated types.
More precisely, it's not really
whether they descend from a common superclass,
but whether they obey the same "protocol".
That is, the two classes should handle the same set of generic functions,
as if there were an explicit DEFGENERIC
for each method.
Here's another way to put it. Suppose you have two classes, A and B, and a generic function F. There are two methods for F, which dispatch on an argument being of types A and B. Is it plausible that there might be a function call somewhere in the program that calls F, in which the argument might sometimes, at runtime, be of class A and other times be of class B? If not, you probably are overloading and should not be using a single generic function.
We allow one exception to this rule: it's OK to do overloading if the corresponding argument "means" the same thing. Typically one overloading allows an X object, and the other allows the name of an X object, which might be a symbol or something.
You must not use MOP "intercessory" operations.
If a class definition creates a method
as a :READER
, :WRITER
,
or :ACCESSOR
,
do not redefine that method.
It's OK to add :BEFORE
, :AFTER
,
and :AROUND
methods,
but don't override the primary method.
In methods with keyword arguments,
you must always use &KEY
,
even if the method does not care about the values of any keys,
and you should never use &ALLOW-OTHER-KEYS
.
As long as a keyword is accepted by any method of a generic function,
it's OK to use it in the generic function,
even if the other methods of the same generic function
don't mention it explicitly.
This is particularly important
for INITIALIZE-INSTANCE
methods,
since if you did use &ALLOW-OTHER-KEYS
,
it would disable error checking for misspelled or wrong keywords
in MAKE-INSTANCE
calls!
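A sketch of the INITIALIZE-INSTANCE case, assuming a hypothetical POINT class and counter. The method uses a bare &KEY, so the set of valid initargs stays exactly what the class declares:

```lisp
(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :reader point-y)))

(defvar *points-created* 0
  "Hypothetical counter, here only to give the method something to do.")

(defmethod initialize-instance :after ((point point) &key)
  ;; Bare &KEY, no &ALLOW-OTHER-KEYS: :X and :Y remain the only
  ;; valid initargs, so misspelled keywords are still rejected.
  (incf *points-created*))
```

With this definition, a call that passes an undeclared initarg such as :Z is expected to signal an error in safe code; adding &ALLOW-OTHER-KEYS to the method would silence it.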
A typical PRINT-OBJECT
method might look like this:
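A minimal sketch of such a method, assuming a hypothetical PERSON class with two readers. PRINT-UNREADABLE-OBJECT supplies the #<...> wrapper, the type name, and the identity:

```lisp
(defclass person ()
  ((first-name :initarg :first-name :reader person-first-name)
   (last-name :initarg :last-name :reader person-last-name)))

(defmethod print-object ((person person) stream)
  ;; Output is meant for debugging, not for end users.
  (print-unreadable-object (person stream :type t :identity t)
    (format stream "~A ~A"
            (person-first-name person)
            (person-last-name person))))
```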
Use NIL appropriately, or avoid it.
NIL can have several different interpretations:
"False". In this case, use NIL.
You should test for false NIL
using the operator NOT
or
using the predicate function NULL
.
"Empty list". In this case, use '().
(Be careful about quoting the empty-list when calling macros.)
You should use ENDP
to test for the empty list
when the argument is known to be a proper list,
or NULL
otherwise.
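A statement that some value is unspecified. In this case, you may use NIL
if there is no risk of ambiguity anywhere in your code;
otherwise you should use an explicit, descriptive symbol.
A statement that some value is known not to exist. In this case, you should use an explicit, descriptive symbol instead of NIL.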
You must not introduce ambiguity in your data representations
that will cause headaches for whoever has to debug code.
If there is any risk of ambiguity,
you should use an explicit, descriptive symbol or keyword
for each case,
instead of using NIL
for either.
If you do use NIL
,
you must make sure that the distinction is well documented.
When working with database classes, keep in mind that
NIL
need not always map to 'NULL'
(and vice-versa)!
The needs of the database may differ from the needs of the Lisp.
You must not misuse the LIST
data structure.
Common Lisp makes it especially easy to use
its builtin (single-linked) LIST
data structure.
However, you should only use this data structure
where it is appropriate.
You must not use lists when they are an inappropriate abstraction for the data being manipulated.
You must only use lists when their performance characteristics are appropriate for the algorithm at hand (i.e. sequential iteration over the entire contents).
An exception is when it is known in advance that the size of the list will remain very short (say, less than 16 elements), especially so when manipulating source code at compile-time.
Another exception is for introducing literal constants that will be transformed into more appropriate data structures at compile-time or load-time.
You should avoid using a list as anything
besides a container of elements of like type.
You must not use a list as a method of passing
multiple separate values of different types
in and out of function calls.
Sometimes it is convenient to use a list
as a little ad hoc structure,
i.e. "the first element of the list is a FOO, and the second is a BAR",
but this should be used minimally
since it gets harder to remember the little convention.
You must only use a list that way
when destructuring the list of arguments from a function,
or creating a list of arguments
to which to APPLY
a function.
The proper way to pass around an object
comprising several values of heterogeneous types
is to use a structure as defined by DEFSTRUCT
or DEFCLASS
.
You should use multiple values only when a function returns a small number of values that are meant to be destructured immediately by the caller, rather than passed together as arguments to further functions.
You should not return a condition object as one of a set of multiple values. Instead, you should signal the condition to denote an unusual outcome.
You should signal a condition to denote an unusual outcome, rather than relying on a special return type.
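A sketch of both points, with hypothetical names throughout: SPLIT-MINUTES returns two values that a caller destructures immediately, and ACCOUNT-NAME signals a condition for the unusual outcome instead of returning a special value.

```lisp
(defun split-minutes (minutes)
  "Return two values: whole hours and remaining minutes."
  (floor minutes 60))

(define-condition account-not-found (error)
  ((id :initarg :id :reader account-not-found-id))
  (:report (lambda (condition stream)
             (format stream "No account with id ~S."
                     (account-not-found-id condition)))))

(defvar *accounts* '((1 . "alice") (2 . "bob"))
  "Hypothetical account table for this sketch.")

(defun account-name (id)
  "Return the name of account ID, or signal ACCOUNT-NOT-FOUND."
  (let ((entry (assoc id *accounts*)))
    (unless entry
      (error 'account-not-found :id id))
    (cdr entry)))
```

Callers that can recover use HANDLER-CASE around ACCOUNT-NAME; callers that cannot simply let the error propagate.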
Use FIRST
to access the first element of a list,
SECOND
to access the second element, etc.
Use REST
to access the tail of a list.
Use ENDP
to test for the end of the list.
Use CAR
and CDR
when the cons cell is not being used to implement a proper list
and is instead being treated as a pair of more general objects.
Use NULL
to test for NIL
in this context.
The latter case should be rare outside of alists, since you should be using structures and classes where they apply, and data structure libraries when you want trees.
Exceptionally, you may use CDADR
and other variants
on lists when manually destructuring them,
instead of using a combination of several list accessor functions.
In this context, using CAR
and CDR
instead of FIRST
and REST
also makes sense.
However, mind in such cases that it might be more appropriate
to use higher-level constructs such as
DESTRUCTURING-BIND
or FARE-MATCHER:MATCH
.
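The vocabulary above can be sketched with three hypothetical helpers: list operators for a proper list, CAR and CDR for a cons used as a pair, and DESTRUCTURING-BIND for a list with a fixed shape.

```lisp
(defun sum-list (numbers)
  "Sum a proper list, using FIRST, REST and ENDP."
  (if (endp numbers)
      0
      (+ (first numbers) (sum-list (rest numbers)))))

(defun swap-pair (pair)
  "PAIR is a cons used as a two-slot record, so CAR and CDR fit."
  (cons (cdr pair) (car pair)))

(defun binding-name-and-value (binding)
  "BINDING has the shape (NAME VALUE), as in a LET binding."
  (destructuring-bind (name value) binding
    (values name value)))
```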
ELT
has O(n) behavior when used on lists.
If you are to use random element access on an object,
use arrays and AREF
instead.
The exception is for code outside the critical path where the list is known to be small anyway.
Using lists as representations of sets is a bad idea
unless you know the lists will be small,
for accessors are O(n) instead of O(log n).
For arbitrary big sets, use balanced binary trees,
for instance using lisp-interface-library
.
If you still use lists as sets,
you should not UNION
lists just to search them.
Indeed, UNION
not only conses unnecessarily,
but it can be O(n^2) on some implementations,
and is rather slow even when it's O(n).
You should usually refer to a function as #'FUN
rather than 'FUN
.
The former refers to the function object, and is properly scoped.
The latter refers to the symbol, which when called
uses the global FDEFINITION
of the symbol.
When using functions that take a functional argument
(e.g., MAPCAR
, APPLY
,
:TEST
and :KEY
arguments),
you should use the #'
to quote the function,
not just single quote.
An exception is when you explicitly want dynamic linking, because you anticipate that the global function binding will be updated.
Another exception is when you explicitly want to access a global function binding, and avoid a possible shadowing lexical binding. This shouldn't happen often, as it is usually a bad idea to shadow a function when you will want to use the shadowed function; just use a different name for the lexical function.
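The scoping difference can be seen in a small sketch with hypothetical names. The local function deliberately shadows the global one only to make the difference visible; the text above advises against doing this in real code.

```lisp
(defun scale (x)
  "Global definition: double X."
  (* 2 x))

(defun scale-demo ()
  (flet ((scale (x) (* 10 x)))       ; lexical definition shadows the global one
    (list (mapcar #'scale '(1 2))    ; lexically scoped: uses the FLET function
          (mapcar 'scale '(1 2)))))  ; symbol: uses the global FDEFINITION
```

Here (scale-demo) returns ((10 20) (2 4)).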
You must consistently use either #'(lambda ...)
or (lambda ...)
without #'
everywhere.
You should only use the former style if your code is intended as a library
with maximal compatibility to all Common Lisp implementations.
Unlike the case of #'symbol
vs 'symbol
,
it is only a syntactic difference with no semantic impact,
except that the former works on Genera and the latter doesn't.
Note that if you start writing a new system
in a heavily functional style,
you may consider using LAMBDA-READER
,
a system that lets you use the unicode character λ
instead of LAMBDA
.
But you must not start using such a syntactic extension
in an existing system without getting permission from other developers.
It is surprisingly hard to properly deal with pathnames in Common Lisp.
First, be aware of the discrepancies between
the syntax of Common Lisp pathnames,
which depends on which implementation and operating system
you are using,
and the native syntax of pathnames on your operating system.
The Lisp syntax may involve quoting of special characters
such as #\.
and #\*
, etc.,
in addition to the quoting of
#\\
and #\"
within strings.
By contrast, your operating system's other
system programming languages
(shell, C, scripting languages)
may only have one layer of quoting, into strings.
Second, when using MERGE-PATHNAMES
,
be wary of the treatment of the HOST
component,
which matters a lot on non-Unix platforms.
You probably should instead be using
ASDF-UTILS:MERGE-PATHNAMES*
.
Third, be aware that DIRECTORY
is not portable
in how it handles wildcards, sub-directories, symlinks, etc.
There again, ASDF-UTILS
provides several
common abstractions to deal with pathnames.
Finally, be aware that paths may change between
the time you build the Lisp image for your application,
and the time you run the application from its image.
You should be careful to reset your image
to forget irrelevant build-time paths and
reinitialize any search path from current environment variables.
ASDF
for instance requires you to reset its paths
with ASDF:CLEAR-CONFIGURATION
.
You must follow the proper usage regarding well-known functions, macros and special forms.
The Lisp system we primarily use, SBCL, is very picky and
signals a condition whenever a constant is redefined to a value not
EQL
to its previous setting.
You must not use DEFCONSTANT
when defining variables that are not
numbers, characters, or symbols (including booleans and keywords).
Instead, consistently use whichever alternative
is recommended for your project.
Open-Source libraries may use
ALEXANDRIA:DEFINE-CONSTANT
for constants other than numbers, characters and symbols
(including booleans and keywords).
You may use the :TEST
keyword argument
to specify an equality predicate.
You should use &OPTIONAL, &KEY, and &AUX
arguments appropriately.
You should avoid using &ALLOW-OTHER-KEYS
,
since it blurs the contract of a function.
Almost any real function (generic or not) allows a certain
fixed set of keywords, as far as its caller is concerned,
and those are part of its contract.
If you are implementing a method of a generic function,
and it does not need to know
the values of some of the keyword arguments,
it is acceptable to use &ALLOW-OTHER-KEYS
rather than list all the keyword arguments explicitly
and use (declare (ignore ...))
on them.
(Of course in such a case there should not be a &REST.)
Note that the contract of a generic function belongs in
the DEFGENERIC
, not in the DEFMETHOD
which is basically an "implementation detail" of the generic function
as far as the caller of the generic is concerned.
You should avoid using &AUX
arguments,
except in very short helper functions
where they allow you to eschew a LET
.
You should avoid having both &OPTIONAL
and &KEY
arguments,
unless it never makes sense to specify keyword arguments
when the optional arguments are not all specified.
You must not have non-NIL
defaults
to your &OPTIONAL
arguments
when your function has both &OPTIONAL
and &KEY
arguments.
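The standard function READ-FROM-STRING takes both &OPTIONAL and &KEY arguments and illustrates the trap. In this sketch (the wrapper names are hypothetical), the first call looks plausible but silently binds :START and 2 to the optional parameters:

```lisp
(defun read-second-number-wrong ()
  ;; :START becomes EOF-ERROR-P and 2 becomes EOF-VALUE,
  ;; so reading starts at index 0.
  (read-from-string "1 2" :start 2))

(defun read-second-number ()
  ;; All optional arguments are supplied before the keywords.
  (read-from-string "1 2" t nil :start 2))
```

The first function returns 1 and the second returns 2 as their primary values.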
You should avoid excessive nesting of binding forms inside a function.
If your function ends up with massive nesting,
you should probably break it up into several functions or macros.
If it is really a single conceptual unit,
consider using a macro such as FARE-UTILS:NEST
to at least reduce the amount of indentation required.
It is bad form to use NEST
in typical short functions
with 4 or fewer levels of nesting,
but also bad form not to use it in the exceptional long functions
with 10 or more levels of nesting.
Use your judgment and consult your reviewers.
Use WHEN
and UNLESS
when there is only one alternative.
Use IF
when there are two alternatives
and COND
when there are several.
However, don't use PROGN
for an IF
clause
— use COND
, WHEN
, or UNLESS
.
Note that in Common Lisp,
WHEN
and UNLESS
return NIL
when the condition evaluates to NIL
.
Nevertheless, you may use an IF
to explicitly return NIL
if you have a specific reason to insist on the return value.
You should only use CASE
and ECASE
to compare integers, characters or symbols
(including booleans and keywords).
Indeed, CASE
uses EQL
for comparisons,
so strings and other numbers may not compare the way you expect.
You must not use gratuitous single quotes in CASE
forms.
This is a common error:
'BAR
there is (QUOTE BAR)
,
meaning this leg of the case will be executed
if X
is QUOTE
...
and ditto for the second leg
(though QUOTE
will be caught by the first clause).
This is unlikely to be what you really want.
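The mistake described above can be sketched with two hypothetical functions. Some compilers emit a warning about the duplicate QUOTE key when compiling the first one.

```lisp
(defun classify-wrong (x)
  (case x
    ('bar :bar)     ; read as ((quote bar) :bar): matches QUOTE or BAR
    ('baz :baz)))   ; read as ((quote baz) :baz)

(defun classify (x)
  (ecase x          ; ECASE signals an error on an unexpected key
    (bar :bar)
    (baz :baz)))
```

(classify-wrong 'quote) returns :BAR, which is almost certainly unintended, while (classify 'quote) signals a TYPE-ERROR.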
In CASE
forms,
you must use otherwise
instead of t
when you mean "execute this clause if the others fail".
And you must use ((t) ...)
when you mean "match the symbol T".
You should use ECASE
and ETYPECASE
in preference to CASE
and TYPECASE
.
You should not use CCASE
or CTYPECASE
at all.
Lisp provides four general equality predicates:
EQ
, EQL
, EQUAL
,
and EQUALP
,
which subtly vary in semantics.
Additionally, Lisp provides the type-specific predicates
=
, CHAR=
, CHAR-EQUAL
,
STRING=
, and STRING-EQUAL
.
Know the distinction!
You should use EQL
to compare objects and symbols
for identity.
You must not use EQ
to compare numbers or characters.
Two numbers or characters that are EQL
are not required by Common Lisp to be EQ
.
When choosing between EQ
and EQL
,
you should use EQL
unless you are writing
performance-critical low-level code.
EQL
reduces the opportunity
for a class of embarrassing errors
(i.e. if characters are ever compared).
There may be a tiny performance cost relative to EQ
,
although under SBCL, it often compiles away entirely.
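A short sketch using a hypothetical MEMBER-OF-P helper. EQL gives a defined answer for symbols, characters, and numbers of the same type, including bignums, for which EQ is not required to return true:

```lisp
(defun member-of-p (item items)
  "True if ITEM is EQL to some element of the list ITEMS."
  (and (member item items :test #'eql) t))

(defun same-bignum-p ()
  ;; Two separately computed bignums are EQL; EQ might return NIL here.
  (eql (expt 10 30) (expt 10 30)))
```

Note that EQL also distinguishes types: 3 and 3.0 are not EQL, although they are =.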
You should use CHAR=
for case-dependent character comparisons,
and CHAR-EQUAL
for case-ignoring character comparisons.
You should use STRING=
for case-dependent string comparisons,
and STRING-EQUAL
for case-ignoring string comparisons.
A common mistake when using SEARCH
on strings
is to provide STRING=
or STRING-EQUAL
as the :TEST
function.
The :TEST
function
is given two sequence elements to compare.
If the sequences are strings,
the :TEST
function is called on two characters,
so the correct tests are CHAR=
or CHAR-EQUAL
.
If you use STRING=
or STRING-EQUAL
,
the result is what you expect,
but in some Lisp implementations it's much slower.
CCL (at least as of 8/2008)
creates a one-character string upon each comparison, for example,
which is very expensive.
Also, you should use :START
and :END
arguments
to STRING=
or STRING-EQUAL
instead of using SUBSEQ
;
e.g. (string-equal (subseq s1 2 6) s2)
should instead be
(string-equal s1 s2 :start1 2 :end1 6)
This is preferable because it does not cons.
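Both recommendations can be sketched with two hypothetical helpers: a character-level :TEST for SEARCH, and bounding arguments in place of SUBSEQ.

```lisp
(defun mentions-lisp-p (text)
  "Return the position of \"lisp\" in TEXT, ignoring case, or NIL."
  ;; The :TEST function receives two characters, so CHAR-EQUAL fits.
  (search "lisp" text :test #'char-equal))

(defun field-matches-p (line field)
  "Compare characters 2 to 5 of LINE with FIELD, ignoring case."
  ;; No SUBSEQ, so no intermediate string is allocated.
  (string-equal line field :start1 2 :end1 6))
```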
You should use ZEROP
,
PLUSP
, or MINUSP
,
instead of comparing a value to 0
or 0.0
.
You must not use exact comparison on floating point numbers, since the vague nature of floating point arithmetic can produce little "errors" in numeric value. You should compare absolute values to a threshold.
You must use =
to compare numbers,
unless it's really okay for 0
,
0.0
and -0.0
to compare unequal!
But then again, you must not usually use exact comparison
on floating point numbers.
Monetary amounts should be using decimal (rational) numbers to avoid the complexities and rounding errors of floating-point arithmetic.
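A sketch of a threshold comparison, assuming a hypothetical APPROX= helper and an arbitrary default tolerance that a real program would choose for its own domain:

```lisp
(defun approx= (a b &optional (tolerance 1d-9))
  "True if A and B differ by at most TOLERANCE."
  (<= (abs (- a b)) tolerance))

(defun exact-sum-in-dollars ()
  ;; Rationals are exact: one tenth plus two tenths is three tenths.
  (+ 1/10 2/10))
```

With double floats, (= (+ 0.1d0 0.2d0) 0.3d0) is false, while APPROX= accepts the two values and the rational sum is exactly 3/10.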
You should use simpler forms such as
DOLIST
or DOTIMES
instead of LOOP
in simple cases when you're not going to use any
of the LOOP
facilities such as
bindings, collection or block return.
Use the WITH
clause of LOOP
when it will avoid a level of nesting with LET
.
You may use LET
if it makes it clearer
to return one of the bound variables after the LOOP
,
rather than use a clumsy FINALLY (RETURN ...)
form.
In the body of a DOTIMES
,
do not set the iteration variable.
(CCL will issue a compiler warning if you do.)
Most systems use unadorned symbols in the current package
as LOOP
keywords.
Other systems use actual :keywords
from the KEYWORD
package
as LOOP
keywords.
You must be consistent with the convention used in your system.
When writing a server,
code must not send output to the standard streams such as
*STANDARD-OUTPUT*
or *ERROR-OUTPUT*
.
Instead, code must use the proper logging framework
to output messages for debugging.
We are running as a server, so there is no console!
Code must not use PRINT-OBJECT
to communicate with a user —
PRINT-OBJECT
is for debugging purposes only.
Modifying any PRINT-OBJECT
method
must not break any public interfaces.
You should not use a sequence of WRITE-XXX
where a single FORMAT
string could be used.
Using format allows you
to parameterize the format control string in the future
if the need arises.
You should use WRITE-CHAR
to emit a character
rather than WRITE-STRING
to emit a single-character string.
You should not use (format nil "~A" value)
;
you should use PRINC-TO-STRING
instead.
You should use ~<Newline>
or ~@<Newline>
in format strings
to keep them from wrapping in 100-column editor windows,
or to indent sections or clauses to make them more readable.
You should not use STRING-UPCASE
or STRING-DOWNCASE
on format control parameters;
instead, you should use "~:@(~A~)"
or "~(~A~)"
.
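A sketch of these directives with hypothetical helper names. Case conversion happens inside the control string, and PRINC-TO-STRING replaces the "~A"-only FORMAT call:

```lisp
(defun shout (name)
  ;; ~:@( ... ~) upcases everything printed inside it.
  (format nil "HELLO, ~:@(~A~)!" name))

(defun whisper (name)
  ;; ~( ... ~) downcases everything printed inside it.
  (format nil "hello, ~(~A~)." name))

(defun label (value)
  ;; Instead of (format nil "~A" value).
  (princ-to-string value))
```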
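Be careful when using the FORMAT
conditional directive.
The parameters are easy to forget.
No parameters, e.g. "~[Siamese~;Manx~;Persian~] Cat": take one format argument, which should be an integer, and use it to choose a clause. Clause numbers are zero-based. If the number is out of range, print nothing. You can provide a default clause by putting a ":" in front of the last ";". E.g. in "~[Siamese~;Manx~;Persian~:;Alley~] Cat", an out-of-range arg prints "Alley".
: parameter, e.g. "~:[Siamese~;Manx~]": take one format argument. If it is NIL, use the first clause, otherwise use the second clause.
@ parameter, e.g. "~@[Siamese ~a~]": if the next format argument is true, use the clause but do not consume the argument, so that the clause itself can print it. If it is false, consume the argument and print nothing.
# parameter, e.g. "~#[ none~; ~s~; ~s and ~s~]": use the number of remaining format arguments to choose a clause; otherwise this behaves like the form with no parameters. A fuller example: "Items:~#[ none~; ~S~; ~S and ~S~:;~@{~#[~; and~] ~S~^ ,~}~]."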
You must not use INTERN
or UNINTERN
at runtime.
You must not use INTERN
at runtime.
Not only does it cons,
it either creates a permanent symbol that won't be collected
or gives access to internal symbols.
This creates opportunities for memory leaks, denial of service attacks,
unauthorized access to internals, clashes with other symbols.
You must not INTERN
a string
just to compare it to a keyword;
use STRING=
or STRING-EQUAL
.
You must not use UNINTERN
at runtime.
It can break code that relies on dynamic binding.
It makes things harder to debug.
You must not dynamically intern any new symbol,
and therefore you need not dynamically unintern anything.
You may of course use INTERN
at compile-time,
in the implementation of some macros.
Even so, it is usually more appropriate
to use abstractions on top of it, such as
ALEXANDRIA:SYMBOLICATE
or
ALEXANDRIA:FORMAT-SYMBOL
to create the symbols you need.
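A sketch of comparing run-time input against keywords without interning anything, assuming a hypothetical list of log levels. STRING-EQUAL accepts symbols as string designators, so no new symbol is ever created from the input:

```lisp
(defparameter *log-levels* '(:debug :info :error)
  "Hypothetical set of accepted keywords.")

(defun parse-log-level (string)
  "Return the keyword in *LOG-LEVELS* named by STRING, ignoring case, or NIL."
  (find string *log-levels* :test #'string-equal))
```

Unknown input simply yields NIL instead of leaving a permanent symbol behind.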
You must get permission before using EVAL.
Places where it is actually appropriate to use EVAL
are so few and far between that you must get permission;
it's easily misused.
If your code manipulates symbols at runtime
and needs to get the value of a symbol,
use SYMBOL-VALUE
, not EVAL
.
Often, what you really need is to write a macro,
not to use EVAL
.
Places where it is OK to use EVAL
are:
testing frameworks and code that is ONLY used for testing;
the build infrastructure; and
inside macros when there isn't any reasonable way
to avoid using EVAL
(there almost always is).
Other uses need to be checked.
We do have a few special cases where EVAL
is allowed.
In a language with automatic storage management (such as Lisp or Java), the colloquial phrase "memory leak" refers to a situation where storage that is not actually needed nevertheless does not get deallocated, because it is still reachable.
You should be careful that when you create objects, you don't leave them reachable after they are no longer needed!
Here's a particular trap-for-the-unwary in Common Lisp. If you make an array with a fill pointer, and put objects in it, and then set the fill pointer back to zero, those objects are still reachable as far as Lisp goes (the Common Lisp spec says that it's still OK to refer to the array entries past the end of the fill pointer).
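The fill-pointer trap can be sketched as follows, with a hypothetical RESET-BUFFER helper that clears the active elements before resetting the fill pointer:

```lisp
(defun make-buffer (size)
  "Create an empty vector with a fill pointer."
  (make-array size :fill-pointer 0 :initial-element nil))

(defun reset-buffer (buffer)
  "Empty BUFFER and drop the references held by its active elements."
  ;; FILL respects the fill pointer, so it must run before the reset.
  (fill buffer nil)
  (setf (fill-pointer buffer) 0)
  buffer)
```

Setting the fill pointer to zero without the FILL leaves the old objects reachable through AREF, which ignores fill pointers.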
Don't cons (i.e., allocate) unnecessarily. Garbage collection is not magic. Excessive allocation is usually a performance problem.
You should only use DYNAMIC-EXTENT
where it matters for performance,
and you can document why it is correct.
The purpose of the DYNAMIC-EXTENT
declaration
is to improve performance by reducing garbage collection
in cases where it appears to be obvious that an object's lifetime
is within the "dynamic extent" of a function.
That means the object is created at some point
after the function is called, and
the object is always inaccessible after the function exits by any means.
By declaring a variable or a local function DYNAMIC-EXTENT
,
the programmer asserts to Lisp
that any object that is ever a value of that variable
or the closure that is the definition of the function
has a lifetime within the dynamic extent of the (innermost) function
that declares the variable.
The Lisp implementation is then free to use that information to make the program faster. Typically, Lisp implementations can take advantage of this knowledge to stack-allocate objects such as the lists that hold &REST
parameters.
If the assertion is wrong, i.e. if the programmer's claim is not true, the results can be catastrophic: Lisp can terminate any time after the function returns, or it can hang forever, or, worst of all, produce incorrect results without any runtime error!
Even if the assertion is correct, future changes to the function might introduce a violation of the assertion. This increases the danger.
In most cases, such objects are ephemeral. Modern Lisp implementations use generational garbage collectors, which are quite efficient under these circumstances.
Therefore, DYNAMIC-EXTENT
declarations
should be used sparingly. You must only use them if:
Point (1) is a special case of the principle of avoiding premature optimization. An optimization like this only matters if such objects are allocated at a very high rate, e.g. "inside an inner loop".
It's sometimes hard to know what the rate will be. When writing a function or macro that's part of a library of reusable code, there's no a priori way to know how often the code will run. Ideally, tools would be available to discover the availability and suitability of using such an optimization based on running simulations and test cases, but in practice this isn't as easy as it ought to be. It's a tradeoff. If you're very, very sure that the assertion is true (that the object is only used within the dynamic scope), and it's not obvious how much time will be saved and it's not easy to measure, then it may be better to put in the declaration than to leave it out. (Ideally it would be easier to make such measurements than it actually is.)
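A minimal sketch of a declaration whose assertion is easy to verify, using a hypothetical SUM-OF-SQUARES function. Whether it pays off still depends on profiling, which this sketch simply assumes:

```lisp
(defun sum-of-squares (&rest numbers)
  ;; Safe: the &REST list is only traversed here. It is never returned,
  ;; stored, or captured, so it cannot outlive this call.
  (declare (dynamic-extent numbers))
  (reduce #'+ numbers :key (lambda (n) (* n n))))
```

If a later change returned NUMBERS or pushed it onto a global list, the declaration would become false and would have to be removed.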
Some systems define unsafe numerical comparators, that are designed to be used with fixnums only, and are faster in that case, but incorrect in case of overflow, and have undefined behavior when called with anything but a fixnum. You must not use these functions without both profiling results indicating the need for this optimization, and careful documentation explaining why it is safe to use them.
You should use REDUCE
instead of APPLY
where appropriate.
You should use REDUCE
instead of APPLY
and a consed-up list,
where the semantics of the first operator argument
otherwise guarantees the same semantics.
Of course, you must use APPLY
if it does what you want and REDUCE
doesn't.
For instance, (apply #'+ (mapcar #'acc frobs))
should instead be (reduce #'+ frobs :key #'acc)
This is preferable because it does not do extra consing,
and does not risk going beyond CALL-ARGUMENTS-LIMIT
on implementations where that limit is small,
which could blow away the stack on long lists
(we want our code to not be gratuitously unportable).
However, you must be careful not to use REDUCE
in ways that needlessly increase
the complexity class of the computation.
For instance, (REDUCE 'STRCAT ...)
is O(n^2)
when an appropriate implementation is only O(n).
Moreover, (REDUCE 'APPEND ...)
is also O(n^2) unless you specify :FROM-END T
.
In such cases, you must use proper abstractions
that cover those cases instead of calling REDUCE
,
first defining them in a suitable library if needs be.
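A sketch of both cases, with a hypothetical FROB structure. The first function uses REDUCE with :KEY; the second avoids a quadratic REDUCE over string concatenation by writing to a string stream instead:

```lisp
(defstruct frob
  (weight 0))

(defun total-weight (frobs)
  "Sum the weights without building an intermediate list."
  (reduce #'+ frobs :key #'frob-weight))

(defun join-strings (strings)
  "Concatenate STRINGS in linear time."
  (with-output-to-string (out)
    (dolist (string strings)
      (write-string string out))))
```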
You should not use NCONC
;
you should use APPEND
instead,
or better data structures.
You should almost never use NCONC
.
You should use APPEND
when you don't depend on any side-effect.
You should use ALEXANDRIA:APPENDF
when you need to update a variable.
You should probably not depend on games
being played with the CDR
of the current cons cell;
and if you do, you must include a prominent
comment explaining the use of NCONC
;
and you should probably reconsider your data representation.
By extension, you should avoid MAPCAN
or the NCONC
feature of LOOP
.
You should instead use
ALEXANDRIA:MAPPEND
and the APPEND
feature of LOOP
respectively.
NCONC
is very seldom a good idea,
since its time complexity class is no better than APPEND
,
its space complexity class also is no better than APPEND
in the common case where no one else is sharing the side-effected list,
and its bug complexity class is way higher than APPEND
.
If the small performance hit due
to APPEND
vs. NCONC
is a limiting factor in your program,
you have a big problem and are probably using the wrong data structure:
you should be using sequences with constant-time append
(see Okasaki's book, and add them to lisp-interface-library),
or more simply you should be accumulating data in a tree
that will get flattened once in linear time
after the accumulation phase is complete (see how ASDF does it).
You may only use NCONC
, MAPCAN
or the NCONC
feature of LOOP
in low-level functions where performance matters,
where the use of lists as a data structure has been vetted
because these lists are known to be short,
and when the function or expression whose results are accumulated
explicitly promises in its contract that it only returns fresh lists.
Even then, the use of such primitives must be rare,
and accompanied by justifying documentation.
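The difference in side effects can be sketched with three hypothetical functions. APPEND leaves its first argument alone, NCONC rewrites it, and the APPEND feature of LOOP replaces MAPCAN:

```lisp
(defun append-demo ()
  (let* ((head (list 1 2))
         (tail (list 3))
         (joined (append head tail)))
    (list head joined)))          ; HEAD is still (1 2)

(defun nconc-demo ()
  (let* ((head (list 1 2))
         (tail (list 3))
         (joined (nconc head tail)))
    (list head joined)))          ; HEAD has been changed to (1 2 3)

(defun duplicate-each (items)
  "Return a list with each element of ITEMS repeated twice."
  (loop for item in items
        append (list item item)))
```

Any other code still holding HEAD after NCONC-DEMO-style code runs sees the modified list, which is the source of the bugs mentioned above.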