When (not!) to use macros in Erlang

Throughout the few years of writing Erlang I’ve noticed a number of uses of macros in source code, some of which I definitely agree with and some which in my opinion should have been expressed differently.

Before elaborating on whys and why nots, here’s a short recap for those who might be new to Erlang. This is how macros look like and how they can be used:

%% BAD EXAMPLE! Explained later.
-define(PI, 3.14).
-define(debug(Format, Args),
io:format("~s:~b: " Format, [?FILE, ?LINE] ++ Args)).
area(R) ->
?debug("radius: ~p~n", [R]),
?PI * R * R.

So, macros in Erlang are simple text level substitutions — think C, not Lisp, if you’re looking for comparisons to other languages. They’re completely expanded by the preprocessor and the compiler is unaware of their existence. It’s necessary to clarify at this point that Erlang supports Lisp-like macros which operate on the language abstract syntax tree by parse transformations, as well as core transformations which operate on Core Erlang used later in the compiler pipeline, but it’s a completely different topic.

A macro is introduced by a -define module attribute. It requires a name (PI) or a name along with parameter names (debug(Format, Args)). Conventionally, macro names tend to be ALL CAPS, but it’s not a hard and ubiquitous rule. We use macros by prefixing the name with a question mark (?PI). There exist some predefined macros with pretty self-explanatory names like ?MODULE or ?LINE. Arguments in the macro body are expanded as text — the preprocessor is oblivious to their meaning, so be careful with operator precedence when passing expressions to macros. The usual trick is to place arguments in parentheses, so that we don’t end up with erroneous code like this:

-define(lousy(Arg), Arg * 2).test() ->
8 = ?lousy(2 + 2). %% error: expands to 2 + 2 * 2

Instead, we would write:

-define(quite_useless(Arg), (Arg) * 2).test() ->
8 = ?quite_useless(2 + 2).

Fortunately, in the above trivial case the compiler will warn us about an invalid pattern matching (Warning: no clause will ever match), but if the result is further we might not be this lucky.

OK, so what can we do with them? When should we use them? This answer might seem slightly controversial, but… almost never! Or at least as seldom as possible.

Why is that so? The macro language is primitive and error prone. Given the choice of using it or the higher level and usually more expressive language the macros are embedded in, we should almost always pick the latter. Apart from this a bit philosophical answer, there are also some very pragmatic reasons:

  • They aren’t first class objects — we can’t pass macros around as we can do with functions.
  • Macros can’t be accessed from modules outside of which they are defined — unless we use header files, what is more cumbersome than just calling an exported function. In general, reusing them requires us to do more work. Not a lot more, I admit, but still more.
  • Connected with the above: Erlang has pretty limited namespacing support (one flat namespace of module names), but using it can save us some headaches with name clashes. Macros shared through header files do not make use of even this simple namespacing mechanism.
  • Macros are expanded to plain code, therefore macro “calls” can’t be traced using the machinery provided by the Erlang VM (erlang:trace/3, erlang:trace_pattern/2,3, dbg).
  • Macros don’t introduce their own scope (i.e. Erlang macros are non-hygienic). It’s possible to mistakenly shadow or reuse variables from the code written outside the macro expansion site, though thanks to single-assignment this is not as much of a problem in Erlang as it’s in C. This can be worked around with begin … end blocks or immediately invoked function expressions, but these techniques make code more complex and this makes us look bad in the eyes of Albert.
Everything should be made as simple as possible, but no simpler. — Albert Einstein
  • As already mentioned, extra parentheses are needed to preserve evaluation order of some expressions, if it’s expected that macro parameters might also be expressions.
  • If the macro parameter is a function call, it might be called more than once if the macro parameter is used multiple times in the macro body. What happens then, if the function sends a message?

To sum up, if we can choose between writing a macro to do something and writing a function, we should always choose writing a function. However, there are some situations in which we actually can’t express something in plain Erlang. Then, macros to the rescue!

As previously mentioned, there are some predefined macros, whose values simply aren’t accessible in the program in any other way. For example: ?LINE, ?MODULE, ?FILE.

The ?LINE (and ?FILE, though arguably less so) macro is indispensable when writing assertions or error messages. However, it’s not sufficient to write the assertion as a function — then the reported error line would always be in the function body. We have to write the assertion as a macro to get proper line numbers of the macro “call” sites.

Moreover, even when we reuse existing assertions, but try to add some functionality, we also have to rely on defining a macro to maintain the correct line reporting, as in this macro for asserting equality of proplists (assertEqual comes from EUnit):

-define(proplists_eq(Expected, Actual),
(fun (__Expected, __Actual) ->
end)(Expected, Actual)).

Due to the fact that Erlang guard expressions can’t in general call functions (apart from some BIFs), composite predicates for use in guards have to be written as macros.

The record syntax doesn’t allow to access record fields stored in variable names. In case of composite record structures, it might be convenient to use a short helper macro for accesses instead of a multiple-level deep expression:

-define(get_inner_data(Data, Key),

Thanks go to Paweł Chrząszcz for pointing out this particular use case.

Macros might be useful for code generation aka a limited form of templating.

One more time, we might use a macro to sidestep a limitation of the language. In order to have a single definition for both -type and -spec attributes it’s necessary to use a macro (the example comes from Escalus, an XMPP testing library):

features()) -> step_state()).
-type step() :: fun(?CONNECTION_STEP).

%% ...
-spec start_stream/3 :: ?CONNECTION_STEP.
start_stream(Conn, Props, [] = _Features) ->
%% ...

To distill the problem, we can’t write code like this, because the -spec/-type syntax isn’t flexible enough:

-type my_f() :: fun ((A) -> [A]).
-spec f/1 :: my_f().
f(A) -> [A].

Instead, we have to rely on text substitution:

-define(MY_F, (A) -> [A]).
-type my_f() :: fun (?MY_F).
-spec f/1 :: ?MY_F.
f(A) -> [A].

Going back to the example from the beginning of this note:

-define(PI, 3.14).
-define(debug(Format, Args),
io:format("~s:~b: " Format, [?FILE, ?LINE] ++ Args)).

The definition of debug(Format, Args) can’t be expressed without using a macro, though it would be cleaner (remember tracing!) to rewrite it as:

-define(debug(Format, Args), debug(?FILE, ?LINE, Format, Args)).debug(File, Line, Format, Args) ->
io:format("~s:~b: " Format, [File, Line] ++ Args)).

PI on the other hand doesn’t fall into any of the categories listed above where a macro is necessary. It seems that the Erlang standard library authors also followed this chain of thought, as the constant is available by default as math:pi/0 — a nullary function. Use macros when you really have to. Use plain functions in every other case.

Originally published at github.com.

Erlang/Elixir engineer from Kraków, Poland

Erlang/Elixir engineer from Kraków, Poland