The Urn Logo

Version 0.4.2 released

Well, it’s Urn update time, and there’s a lot to be excited about!

Self-hosting!

As mentioned in the last post, the Urn compiler was initially written in Lua and ported across to Urn as the compiler matured. This release marks the last step of this process, with the compiler being written in 100% Urn. This shouldn’t have any major effect on the end user (you guys), but is still an important step in the maturing of the language.

Because it is mildly interesting, I thought I’d post some statistics about the Urn codebase:

  • The compiler is made up of 5614 lines of Urn code.
  • The standard library has 3498 lines of Urn, and 399 lines of Lua.
  • There are an additional 789 lines of tests.

Improved name-mangling

If you’ve ever looked at the Lua code the Urn compiler emits, you’ll notice it isn’t especially nice. One of the worst bits is that every variable would be uniquely numbered: for instance arg would become arg1. This was required in order to avoid name collisions in the result of function inlining, but wasn’t especially pleasing to the eye either. Thankfully we’ve now improved this, meaning variables will only be numbered if there would otherwise be a name collision.

A further improvement here is that variables emitted from gensym will now be emitted as temp (with a number in the event of collisions). This makes the code slightly easier to read, and means using macros won’t result in large potions of the compiled output changing - only some expressions in the local scope.

Improved recursion

As I’m sure you’re aware (I’ve gone on about it enough times), Urn attempts to have a minimal “core language”, with everything else being added via macros. Consequently, there is no built-in support for loops, this being implemented using tail recursion.

Whilst this leads to a much simpler and more flexible compiler, it does mean the generated Lua is rather inefficient. For instance, consider the following for loop:

(for i 1 5 1
  (print! i))

This would compile into something like:

local temp = nil
temp = (function(temp1)
  if (temp1 <= 5) then
    print1(temp1)
    return temp((temp1 + 1))
  else
    return nil
  end
end)
temp(1)

Hardly very nice to look at, nor very efficient. However, if you look at what the macros expand do, the reasons why you get such ugly code become obvious:

((lambda (temp)
  (set! temp (lambda (temp1)
               (cond
                 [(<= temp1 5)
                  (base/print temp1)
                  (temp (+ temp1 1))]
                  [true])))
  (r_255 1)))

Ideally, we’d be able to detect this code is a loop and convert it into something more sane. However, it turns out that it’s rather hard to create a use-define chain and determine what is a loop and what isn’t. Thankfully we don’t need to - pretty much every loop (while, for, etc…) will get compiled into something resembling the above - we just need to detect that basic form instead!

We can also check for when such a loop is only used once, and so inline it at the call site, meaning you don’t have a temporary lambda at all. This means the initial for loop now gets compiled to the following:

local temp = nil
local temp1 = 1
while (temp1 <= 5) do
  print1(temp1)
  temp1 = (temp1 + 1)
end

Note that there are also some improvements to general loop compilation too. Under previous versions of Urn, such a loop would have been compiled as a while true do with a condition and breaks. We can now detect certain constructs and convert them into the appropriate loop. Obviously there are more potential improvements - this could be converted into a for loop. However, this is definitely a stab in the right direction.

It is also worth stating that these changes aren’t purely cosmetic - there is a significant performance impact too. Thanks to the improved codegen, compiling the compiler took 7.1 seconds, instead of the initial 9.8 - an improvements of 2.7 seconds.

Other changes

Many thanks to CrazedProgrammer and Lignumm for their additions to the standard library.

You can read the full changelog on the GitLab repo.