Macros and more
Metaprogramming is a technique where programs can be treated as data, allow you to write code which can read, manipulate and generate code. This is an incredibly powerful feature as it allows you to avoid writing out similar code multiple times, instead generating it at compile time.
Metaprogramming in Urn is achieved through macros. These are function-like objects which are executed at compile time, allowing them to create and modify code. Before we learn about macros though, we need to talk about quoting.
Quoting
Quoting allows you to “escape” code: converting it from executable code into a series of lists which can be modified. To
do that, you can either call the quote
construct, or place the quote symbol at the front of an expression.
For instance, compare a normal expression:
> (+ 2 3)
5
with the quoted form:
> '(+ 2 3)
(+ 2 3)
> (quote (+ 2 3))
As you can see, the quoted form looks exactly the same, but can then be modified as a normal object:
> (car '(+ 2 3))
+
What happens if we want to include our own values in this quoted expression though? Maybe we want to allow people to specify their own symbols. Of course, we could always create our own lists manually:
> (define x 3)
> (list '+ 2 x)
(+ 2 3)
However, this ends up looking very ugly for nested expressions. Consider:
> '(+ 2 (* 3 5))
(+ 2 (* 3 5))
> (list '+ 2 (list '* x 5))
(+ 2 (* 3 5))
This won’t do! Thankfully there is an alternative version of quote
: using syntax-quote
or a backtick (`
) will
act just like a normal quote until it hits a comma (or unquote
). This will execute the value following the comma and
substitute it’s value back into the expression.
> `(+ 2 (* ,x 5))
(+ 2 (* 3 5))
A word on
syntax-quote
: Strictly speaking,syntax-quote
doesn’t act exactly the same asquote
: it requires all quoted symbols to be resolvable at compile time. For instance,`(+ 2 3)
requires+
to be defined.
It is also possible to insert whole lists inside a backtick expression using ,@
(unquote-splice
).
> (define x '("foo" "bar" "baz"))
> `(.. ,@x "qux")
(.. "foo" "bar" baz" "qux")
Defining macros
As we’ve mentioned, macros allow us to change code at compile time. First off, let’s see an example:
(defmacro show (x)
(print! "Compile-time" (pretty x))
`(progn
(print! "Run-time" ,(pretty x))
(print! "Value" ,x)))
If we then use this macro, we should get three lines:
> (show (+ 2 3))
Compile-time (+ 2 3)
Run-time (+ 2 3)
Value 5
So let’s break this down a little. As you may have guessed, defmacro
is sort of like defun
but for macros. Instead
of getting a series of values as arguments, it receives the quoted form of its arguments: in this case you will get the
equivalent of '(+ 2 3)
.
The body of the function is executed at compile time, meaning our first print statement executes when the code is compiled. It then returns a quoted list, which is composed of two expressions: one which prints out the quoted form of the value, one which prints out the value itself.
One thing to try here is replace defmacro
with defun
and call show
with the quoted value, then you can see your
exact outputs. In this case, it will be something like:
(progn
(print! "Run-time" "(+ 2 3)")
(print! (+ 2 3)))
Of course, this isn’t all you can do, anything that can be done in a function can be done in a macro. This means you can define complex macros which traverse their arguments, perform IO, or anything else you fancy.
There are a couple of things you should be careful of when implementing macros:
- Macro hygiene
- Correct evaluation
For instance, let’s consider a macro which will evaluate its argument, printing its tree form and value if it returns a falsey value. Let’s look at attempt number 1:
(defmacro debug-true (x)
`(unless ,x
(print! (.. ,(pretty x) " returned " (pretty ,x)))))
Let’s try this out on a couple of values:
(debug-true false)
(debug-true (nil? '(1 2 3)))
seems to be working pretty well right? How about this:
(debug-true (print! "foo"))
Uh, oh. Due to how we’ve written the macro, the expression gets evaluated twice if it evaluates to false. Maybe we should store the result in a temporary variable.
(defmacro debug-true (x)
`(with (tmp ,x)
(unless tmp
(print! (.. ,(pretty x) " returned " (pretty tmp))))))
Wait, now we’re getting errors about “Cannot find variable tmp”? That doesn’t even make sense! Remember what I said
earlier about syntax-quote
requiring a variable to be defined: that’s coming into play here: tmp
doesn’t actually
exist when we quote this expression.
First off, let’s talk about why this restriction exists in the first place. The answer is simple: macro hygiene. Imagine we didn’t have such a requirement and I wrote some code like this:
(with (pretty "this text is so pretty :)")
(debug-true "foo"))
We’re now going to get lots of confusing errors about calling strings and what not: our debug-true
macro is now
calling this pretty
variable, rather than the symbol we wanted it to! By resolving all symbols, syntax-quote
ensures
we always get the correct one.
Now on to the problem at hand: tmp
doesn’t exist and so we can’t use it. More over, if we used tmp
we might suffer
from a similar hygiene problem: what if code inside our macro’s argument relied on tmp
: then it might be using our
value rather than the value they wanted! The solution here is to make our own symbol which won’t be used by anyone else:
enter gensym
.
gensym
will create a new symbol, which we can store to a variable and unquote
into our expression:
(defmacro debug-true (x)
(with (tmp (gensym))
`(with (,tmp ,x)
(unless ,tmp
(print! (.. ,(pretty x) " returned " (pretty ,tmp)))))))
Whilst this macro is a tad more complicated than our original implementation, it is much more sturdy.
A word on macro hygiene: The astute of you will have noticed that
quote
allows you to create your own symbols without them having to exist. This means you can unquote a quoted symbol to use a custom symbol. The above example could be written as:(defmacro debug-true (x) `(with (,'tmp ,x) (unless ,'tmp (print! (.. ,(pretty x) " returned " (pretty ,'tmp))))))
However, be careful: you loose some of the macro safety here so only do it for blocks you know no user code will be executed within.
Everything’s a macro
The largest testament to the power of macros is the fact that the vast majority of Urn’s language is defined as them. In fact, there are only a handful of “builtin” constructs, from which everything else can be created:
define
,define-macro
anddefine-native
: Our terms which allow you to create new top level definitions.quote
,syntax-quote
,unquote
andunquote-splice
: These allow you to switch between code and data.lambda
: For all function definitions.cond
: For all conditionals.set!
: For assigning variables.import
: For interfacing with external modules.
So, how can we use these to build up the whole language? Let’s start with a couple of simple macros. First off: defun
.
(define-macro defun (lambda (name args &body))
`(define ,name (lambda ,args ,@body)))
As you can see, defun
is just a simple wrapper for define-macro
: wrapping the body and arguments in a lambda and
then declaring a variable. defmacro
is not dissimilar.
A word on
defun
: In reality,defun
is a tad more complicated as it moves documentation strings from inside the function body to outside.
Next off, let’s consider another pretty common construct: if
. We’ve only got one way of doing conditionals, cond
,
so we’ll have to use that. This ends up being pretty simple.
(defmacro if (c t f)
`(cond
[,c ,t]
[true ,f]))
This should be fairly self-explanatory: if c
evaluates to true then we’ll execute t
, otherwise we’ll execute
f
. Note that now that we’ve defined deffun
, we’re free to use it wherever. By defining more complex terms from
simpler ones, the entire language is “bootstrapped”.
Now let’s cover variables. Looking at the above, we’ve only two ways to introduce new variables into scope: with
define
or lambda
’s arguments. define
is obviously far from ideal so we’ll have to use lambda
. Let’s consider how
this would work: we want to declare a variable x
and assign it the value 2
. To create this variable, we just need to
create a new lambda taking one argument: x
.
(lambda (x) (print! x))
Now we need to find a way to set x
to 2
and then execute it. The answer is simple: call it with 2
!
((lambda (x) (print! x)) 2)
No one wants to actually write code like this though, so we create a macro, with
, which handles all of this.
(defmacro with (var &body)
`((lambda (,(car var 1)) ,@body) ,(cadr var)))
We won’t go into any more detail here about how other language features are implemented, but you can always peruse the source if you are interested.
Debugging macros
Debugging macros is a pesky business as you have to deal with code failing at compile time, producing invalid outputs
and more! There isn’t any simple way to debug compile time failures: often smattering your code everywhere with debug
s
is your best solution.
With generated tools, there is one tool which is slightly more useful. If you run the compiler with the --emit-lisp
flag, it will generate a .lisp
file with the same name as your .lua
output file. This will contain the expanded
version of all macros, including the one you just defined!
Top-level unquotes
As shown earlier, syntax-quote
allows you to escape your code as data, and unquote
allows you to step back to
executing code again. However, what happens if you use unquote
outside a syntax-quote
? Logically it makes sense for
it to step up from the data (your program) and execute code. This allows for arbitrary compile-time execution without
the use of macros.
Like macros, or unquote
inside a syntax-quote
, the resulting value will be spliced back in. Using this, we can
compute the first 10 square numbers at compile time:
(define squares ,(cons `list (map (cut ^ <> 2) (range 1 10))))
Note that we have to add a call to list
, otherwise this would emit the following invalid code:
(define squares (1 4 9 16 25 36 49 64 81 100))