Zen of Nim

This is a transcript of Araq's presentation at NimConf2021 delivered on June 26th (see the video on youtube, check the slides on github). It has been adapted to blog post format by Pietro Peterlongo and further reviewed by Araq.

Zen of Nim

  1. Copying bad design is not good design.
  2. If the compiler cannot reason about the code, neither can the programmer.
  3. Don’t get in the programmer’s way.
  4. Move work to compile-time: Programs are run more often than they are compiled.
  5. Customizable memory management.
  6. Concise code is not in conflict with readability, it enables readability.
  7. (Leverage meta programming to keep the language small.)
  8. Optimization is specialization: When you need more speed, write custom code.
  9. There should be one and only one programming language for everything. That language is Nim.

Editor’s note:

In the original presentation the Zen of Nim was given at the end (and without numbering). Here we provide the Zen of Nim rules at the very beginning, numbered for ease of referencing. The discussion of the above rules is done in the context of a general discussion of the language and does not try to follow the order above. The content is here presented following the original presentation, starting from slide material and transcript of the video with minimal editing (this results in an informal tone).

Table of contents:

  • Introduction
  • Syntax (introduces Nim and motivates rule 6: concise code enables readability)
  • A smart compiler (rule 2: compilers must be able to reason about code)
  • Meta programming features (introduced through rule 1: copying bad design …)
  • A practical language (rule 3: don’t get in programmer’s way)
  • Customizable memory management (rule 5)
  • Zen of Nim (recap and discussion of all the rules; rules 4, 7, 8, 9 are only discussed here)

Introduction

In this blog post I will explain the philosophy of Nim language and why Nim can be useful for a wide range of application domains, such as:

  • scientific computing
  • games
  • compilers
  • operating systems development
  • scripting
  • everything else

“Zen” means that we will arrive at a set of rules (shown above) that guide the language design and evolution, but I will go through these rules via examples.

Syntax

Let me introduce Nim via its syntax. I am aware that most of you probably know the language, but to give you a gentle introduction even if you have not seen it before, I will explain basic syntax and hopefully come to interesting conclusions.

Nim uses an indentation based syntax as inspired by Haskell or Python that fits Nim’s macro system.

Function application

Nim distinguishes between statements and expressions and most of the time an expression is a function application (also called a “procedure call”). Function application uses the traditional mathy syntax with the parentheses: f(), f(a), f(a, b).

And here is the sugar:

  Sugar Meaning Example
1 f a f(a) spawn log("some message")
2 f a, b f(a, b) echo "hello ", "world"
3 a.f() f(a) db.fetchRow()
4 a.f f(a) mystring.len
5 a.f(b) f(a, b) myarray.map(f)
6 a.f b f(a, b) db.fetchRow 1
7 f"\n" f(r"\n") re"\b[a-z*]\b"
8 f a: b f(a, b) lock x: echo "hi"
  • In rules 1 and 2 you can leave out the parentheses and there are examples so that you can see why that might be useful: spawn looks like a keyword, which is good, since it does something special; echo is also famous for leaving out the parentheses because usually you write these statements for debugging, thus you are in a hurry to get things done.
  • You have a dot notation available and you can leave out parentheses (rules 3-6).
  • Rule 7 is about string literals: f followed by a string without whitespace is still a call but the string is turned into a raw string literal, which is very handy for regular expressions because regular expressions have their own idea of what a backslash is supposed to mean.
  • Finally, in the last rule we can see that you can pass a block of code to the f with a : syntax. The code block is usually the last argument you pass to the function. This can be used to create a custom lock statement.

There is one exception to leaving out the parentheses, in the case you want to refer to f directly: f does not mean f(). In the case of myarray.map(f) you do not want to invoke f, instead you want to pass the f itself to map.

Operators

Nim has binary and unary operators:

  • Most of the time binary operators are simply invoked as x @ y and unary operators as @x.
  • There is no explicit distinction between operators and functions, and between binary and unary operators.
func `++`(x: var int; y: int = 1; z: int = 0) =
  x = x + y + z

var g = 70
++g
g ++ 7
# operator in backticks is treated like an 'f':
g.`++`(10, 20)
echo g  # writes 108
  • Operators are simply sugar for functions.
  • The operator token goes inside backticks (e.g. ++) when the function is defined and it can be called as a function using backticks notation.

Recall that the var keyword indicates mutability:

  • parameters are readonly unless declared as var
  • var means “pass by reference” (it is implemented as a hidden pointer).

Statements vs expressions

Statements require indentation:

# no indentation is needed for a single assignment statement:
if x: x = false

# indentation is needed for nested if statements:
if x:
  if y:
    y = false
else:
  y = true

# indentation is needed, because two statements
# follow the condition:
if x:
  x = false
  y = false

You can also use semicolons instead of new lines but this is very uncommon in Nim.

Expressions are not really based on indentation so you are free to use additional white space within expressions:

if thisIsaLongCondition() and
    thisIsAnotherLongCondition(1,
        2, 3, 4):
  x = true

This can be very handy for breaking up long lines. As a rule of thumb you can have optional indentation after operators, parentheses and commas.

Finally the if, case, etc statements are also available as expressions, so they can produce a value.

As a simple example to conclude this section, here is a full Nim program to show a little bit more of syntax. If you are familiar with Python, this should be straightforward to read:

func indexOf(s: string; x: set[char]): int =
  for i in 0..<s.len:
    if s[i] in x: return i
  return -1

let whitespacePos = indexOf("abc def", {' ', '\t'})
echo whitespacePos
  • Nim uses static typing, so the parameters have types attached: the input parameter named s has type string; x has the type “set of characters”; the function called indexOf produces an integer value as final result.
  • You can iterate over the string index via the for loop, the goal is to find the position of the first character inside the string that matches the given set of values.
  • When calling the function, we construct a set of characters covering the “whitespace” property using curly parentheses ({}).

Having talked mostly about syntax so far, the take-away here is our first Zen rule:

Concise code is not in conflict with readability, it enables readability.

As you can see in the above tiny example, it is very easy to follow and to read because we basically leave out the symbols that carry little meaning, such as curly braces for blocks or semicolons to terminate the statements. This scales up, so in longer programs it is really helpful when you have less code to look at, because then you can more easily figure out how the code is supposed to work or what it can do (without getting too much details).

Usually the argument is like: “the syntax is terse, so it is unreadable and all you want to do is to save typing work”; in my opinion this totally misses the point, it is not about saving keystrokes or saving typing effort, it is saving the effort when you look at the resulting code. Programs are read way more often than they are written and when you read them it really helps if they are shorter.

A smart compiler

The second rule for our Zen of Nim is:

The compiler must be able to reason about the code.

This means we want:

  • Structured programming.
  • Static typing!
  • Static binding!
  • Side effects tracking.
  • Exception tracking.
  • Mutability restrictions (the enemy is shared mutable state, but if the state is not shared it is fine to mutate it: we want to be able to do it precisely).
  • Value based datatypes (aliasing is very hard to reason about!).

We will see now what these points mean in detail.

Structured programming

In the following example the task is to count all the words in the file (given by filename as a string), and produce a count table of strings, so in the end there will be entries for every word and how often it occurs in the text.

import tables, strutils

proc countWords(filename: string): CountTable[string] =
  ## Counts all the words in the file.
  result = initCountTable[string]()
  for word in readFile(filename).split:
    result.inc word
  # 'result' instead of 'return', no unstructed control flow

Thankfully, the Nim standard library already offers us a CountTable so the first line of the proc is the new count table. result is built into Nim and it represents the return value so you do not have to write return result which is unstructured programming, because return immediately leaves every scope and returns back the result. Nim does offer the return statement but we advise you to avoid it for this reason, since that is unstructured programming.

In the rest of the proc, we read the file into a single buffer, we split it to the get the single word and we count the word via result.inc.

Structured programming means you have a single entry point into a block of code and a single exit point.

In the next example, I leave the for loop body in a more convoluted manner, with a continue statement:

for item in collection:
  if item.isBad: continue
  # what do we know here at this point?
  use item
  • For every item in this collection if the item is bad we continue and otherwise we use the item.
  • What do I know after the continue statement? well, I know that the item is not bad.

Why then not write it in this way, using structured programming:

for item in collection:
  if not item.isBad:
    # what do we know here at this point?
    # that the item is not bad.
    use item
  • The indentation here gives us clues about the invariances in our code, so that we can see much more easily that when I use item the invariant holds that the item is not bad.

If you prefer the continue and return statements, that is fine, it is not a crime to use them, I use them myself if nothing else works, but you should try to avoid it and more importantly it means that we will probably never add a more general go-to statement to the Nim programming language because go-to is even more against the structured programming paradigm. We want to be in this position where we can prove more and more properties about your code and structured programming makes it much easier for a proof engine to help with this.

Static typing

Another argument for static typing is that we really want you to use custom types dedicated to the problem domain.

Here we have a little example showing you the distinct string feature (with enum and set):

type
  SandboxFlag = enum         ## what the interpreter should allow
    allowCast,               ## allow unsafe language feature: 'cast'
    allowFFI,                ## allow the FFI
    allowInfiniteLoops       ## allow endless loops

  NimCode = distinct string

proc runNimCode(code: NimCode; flags: set[SandboxFlag] = {allowCast, allowFFI}) =
  ...
  • NimCode can be stored as a string but it is a distinct string so it is a special type with special rules.
  • The proc runNimCode can run arbitrary Nim code that you passed to it, but it is a virtual machine that can run the code and it can restrict what is possible.
  • There is a sandbox environment in this example and there are custom properties that you might want to use. For instance you can say: allow the nim cast operation (allowCast) or allow the function foreign interface (allowFFI); the last option is about allowing Nim code to run into infinite loops (allowInfiniteLoops).
  • We put the options in an ordinary enum and then we can produce a set of enums, indicating that every option is independent of the others.

If you compare the above to C for instance, where it is common to use the same mechanism, you lose the type safety:

#define allowCast (1 << 0)
#define allowFFI (1 << 1)
#define allowInfiniteLoops (1 << 2)

void runNimCode(char* code, unsigned int flags = allowCast|allowFFI);

runNimCode("4+5", 700); // nobody stops us from passing 700
  • When calling runNimCode, flags is only an unsigned integer and nobody stops you from passing the value 700 even though it does not make any sense.
  • You need to use bit twiddling operations to define allowCast, … allowInfiniteLoops.

You lose information here: even though it is very much in the programmer’s head what is really a valid value for this flags argument, yet it is not written down in the program, so the compiler cannot really help you.

Static binding

We want Nim to use static binding. Here is a modified “hello world” example:

echo "hello ", "world", 99

what happens is that the compiler rewrites the statment to:

echo([$"hello ", $"world", $99])
  • echo is declared as: proc echo(a: varargs[string, `$`]);
  • $ (Nim’s toString operator) is applied to every argument.
  • We use overloading (of the $ operator in this case) instead of dynamic binding (as it would be done for example in C#).

This mechanism is extensible:

proc `$`(x: MyObject): string = x.s
var obj = MyObject(s: "xyz")
echo obj  # works
  • Here I have my custom type MyObject and I define the $ operator to actually return just the s field.
  • Then, I construct a MyObject with value "xyz".
  • echo understands how to echo these objects of type MyObject because they have a dollar operator defined.

Value based datatypes

We want value based data types so that the program becomes easier to reason about. I have already said we want to restrict the shared mutable state. One solution that is usually overlooked by functional programming languages is that in order to do that you want to restrict aliasing and not the mutation. Mutation is very direct, convenient and efficient.

type
  Rect = object
    x, y, w, h: int

# construction:
let r = Rect(x: 12, y: 22, w: 40, h: 80)

# field access:
echo r.x, " ", r.y

# assignment does copy:
var other = r
other.x = 10
assert r.x == 12

The fact that the assignment other = r performed a copy, means that there is no spooky action at a distance involved here, there is only one access path to r.x and other.x is not an access path to the same memory location.

Side effects tracking

We want to be able to track side effects. Here is an example where the goal is to count the number of substrings inside a string.

import strutils

proc count(s: string, sub: string): int {.noSideEffect.} =
  result = 0
  var i = 0
  while true:
    i = s.find(sub, i)
    if i < 0: break
    echo "i is: ", i  # error: 'echo' can have side effects
    i += sub.len
    inc result

Let us assume that this is not correct code and there is a debug echo statement. The compiler would complain: you say proc has no side effect but echo produces a side effect, so you are wrong, go fix your code!

The other aspect of Nim is that while the compiler is very smart and can help you, sometimes you need to get work done and you must be able to override these very good defaults.

So if I say: “okay, I know this does produce a side effect, but I don’t care because this is only code I added for debugging” you can say: “hey, cast this body of code to a noSideEffect effect” and then the compiler is happy and says “ok, go ahead”:

import strutils

proc count(s: string, sub: string): int {.noSideEffect.} =
  result = 0
  var i = 0
  while true:
    i = s.find(sub, i)
    if i < 0: break
    {.cast(noSideEffect).}:
      echo "i is: ", i  # 'cast', so go ahead
    i += sub.len
    inc result

cast means: “I know what I am doing, leave me alone”.

Exception tracking

We want exception tracking!

Here I have my main proc and I want to say it raises nothing, I want to be able to ensure that I handled every exception that can happen:

import os

proc main() {.raises: [].} =
  copyDir("from", "to")
  # Error: copyDir("from", "to") can raise an
  # unlisted exception: ref OSError

The compiler would complain and say “look, this is wrong, copyDir can raise an unlisted exception, namely OSError”. So you say, “fine, in fact I did not handle it”, so now I can claim that main raises OSError and the compilers says: “you are right!”:

import os

proc main() {.raises: [OSError].} =
  copyDir("from", "to")
  # compiles :-)

We want to be able to parametrize over this a little bit:

proc x[E]() {.raises: [E].} =
  raise newException(E, "text here")

try:
  x[ValueError]()
except ValueError:
  echo "good"
  • I have a generic proc x[E] (E is the generic type), and I say: “whatever E you pass to this x, that is what I am going to raise.
  • Then I instantiate this x with this ValueError exception and the compiler is happy!

I was really surprised that this works out of the box. When I came up with this example I was quite sure the compiler would produce a bug, but it is already handling this situation very well and I think the reason for that is that somebody else helped out and fixed this bug.

Mutability restrictions

Here I am going to show and explain what the experimental strictFuncs switch does:

{.experimental: "strictFuncs".}

type
  Node = ref object
    next, prev: Node
    data: string

func len(n: Node): int =
  var it = n
  result = 0
  while it != nil:
    inc result
    it = it.next
  • I have a Node type which is a ref object and next and prev are pointers to these kind of objects (it is a doubly linked list). There is also a data field of type string.
  • I have a function len and it counts the number of nodes that are in my linked list.
  • The implementation is pretty straightforward: unless it is nil you count the node and then follow to next node.

The crucial point is that via strictFuncs we tell the compiler that parameters are now deeply immutable, so the compiler is fine with this code and it is also fine with this example:

{.experimental: "strictFuncs".}

func insert(x: var seq[Node]; y: Node) =
  let L = x.len
  x.setLen L + 1
  x[L] = y
  • I want to insert something but it is a func so it is very strict about my mutations.
  • I want to append to x, which is a sequence of nodes, so x is explicitly mutable via the var keyword (and y is not mutable).
  • I can set x’s length as the old length plus one and then overwrite what is in there, and this is fine.

Finally, I can still mutate local state:

func doesCompile(n: Node) =
  var m = Node()
  m.data = "abc"

I have a variable m of type Node, but it is freshly created and then I mutate it and set the data field and since it is not connected to n the compiler is happy.

The semantics are: “you cannot mutate what is reachable via parameters, unless these parameters are explicitly marked as var”.

Here is an example where the compiler says: “yeah, look, no, you are trying to mutate n, but you are in strictFunc mode so you are not allowed to do that”

{.experimental: "strictFuncs".}

func doesNotCompile(n: Node) =
  n.data = "abc"

We can now play these games and see how smart the compiler is.

Here I try to trick the compiler into accepting the code but I was not able to:

{.experimental: "strictFuncs".}

func select(a, b: Node): Node = b

func mutate(n: Node) =
  var it = n
  let x = it
  let y = x
  let z = y # <-- is the statement that connected
            # the mutation to the parameter

  select(x, z).data = "tricky" # <-- the mutation is here
  # Error: an object reachable from 'n'
  # is potentially mutated
  • select is a helper function that takes two nodes and simply returns the second one.
  • Then I want to mutate n but I assign it to it, and then it to x, x to y, and y to z.
  • Then I select either x or z and then mutate the data field and overwrite the string to value "tricky".

The compiler will tell you “Error, an object reachable from n is potentially mutated” and it will point out the statement that connects the graph to this parameter. What it does internally is: it has a notion of an abstract graph and it starts with “every graph that is constructed is disjoint”, but depending on the body of your function, these disjoint graphs can be connected. When you mutate something, the graph is mutated and if it is connected to an input parameter, then the compiler will complain.

So the second rule is:

If the compiler cannot reason about the code, neither can the programmer.

We really want a smart compiler helping you out, because programming is quite hard.

Meta programming features

Another rule that is kind of famous by now is:

Copying bad design is not good design.

If you say “hey, language X has feature F, let’s have that too!”, you copy this design but you do not know if it is good or bad, because you did not start from first principles.

So, “C++ has compile-time function evaluation, let’s have that too!”. This is not a reason for adding compile-time function evaluation, the reason why we have it (and we do it very differently from C++), is the following: “We have many use cases for feature F”.

In this case F is the macro system: “We need to be able to do locking, logging, lazy evaluation, a typesafe Writeln/Printf, a declarative UI description language, async and parallel programming! So instead of building these features into the language, let’s have a macro system.”

Let’s have a look at these meta programming features. Nim offers templates and macros for this purpose.

Templates for lazy evaluation

A template is a simple substitution mechanism. Here I define a template named log:

template log(msg: string) =
  if debug:
    echo msg

log("x: " & $x & ", y: " & $y)

You can read it as some kind of function, but the crucial difference is that it expands the code directly in line (where you invoke log).

You can compare the above to the following C code where log is a #define:

#define log(msg) \
  if (debug) { \
    print(msg); \
  }

log("x: " + x.toString() + ", y: " + y.toString());

It is quite similar! The reason why this is a template (or a #define) is that we want this message parameter to be evaluated lazily, because in this example I do perform expensive operations like string concatenations and turning variables into strings and if debug is disabled this code should not be run. The usual argument passing semantics are: “evaluate this expression and then call the function”, but then inside the function you would notice that debug is disabled and that you do not need all this information, so it does not have to be computed at all. This is what this template achieves here for us, because it is expanded directly when invoked: if debug is false then this complex expression of concats is not performed at all.

Templates for control flow abstraction:

We can use templates for control flow abstractions. If we want a withLock statement, C# offers it is a language primitive, in Nim you do not have to build this into the language at all, you just write a withLock template that acquires the lock:

template withLock(lock, body) =
  var lock: Lock
  try:
    acquire lock
    body
  finally:
    release lock

withLock myLock:
  accessProtectedResource()
  • withLock acquires the lock and finally releases the lock.
  • In between the locking section the full body is run, which can be passed to withLock statement via colon indentation syntax.

Macros to implement DSLs

You can use macros to implement DSLs (Domain Specific Languages).

Here is a DSL for describing html code:

html mainPage:
  head:
    title "Zen of Nim"
  body:
    ul:
      li "A bunch of rules that make no sense."

echo mainPage()

It produces:

<html>
  <head><title>Zen of Nim</title></head>
  <body>
    <ul>
      <li>A bunch of rules that make no sense.</li>
    </ul>
  </body>
</html>

Lifting

You can use meta programming for “lifting” operations that come up again and again in programming.

For example, we have square root in math for floating point numbers and now I want to have a square root operation that works for a list of floating point numbers. I could use a map call, but I can also create a dedicated sqrt function:

import math

template liftFromScalar(fname) =
  proc fname[T](x: openArray[T]): seq[T] =
    result = newSeq[typeof(x[0])](x.len)
    for i in 0..<x.len:
      result[i] = fname(x[i])

# make sqrt() work for sequences:
liftFromScalar(sqrt)
echo sqrt(@[4.0, 16.0, 25.0, 36.0])
# => @[2.0, 4.0, 5.0, 6.0]
  • We pass fname to the template and fname is applied to every element of the sequence.
  • The final name of the proc is also fname (in this case sqrt).

Declarative programming

You can use templates to turn imperative code into declarative code.

Here I have an example extracted from our test suite:

proc threadTests(r: var Results, cat: Category,
                  options: string) =
  template test(filename: untyped) =
    testSpec r, makeTest("tests/threads" / filename,
      options, cat, actionRun)
    testSpec r, makeTest("tests/threads" / filename,
      options & " -d:release", cat, actionRun)
    testSpec r, makeTest("tests/threads" / filename,
      options & " --tlsEmulation:on", cat, actionRun)

  test "tactors"
  test "tactors2"
  test "threadex"

There are threading tests called tactors, tactors2 and threadex and every single of these tests runs in three different configurations: with the default options, default options plus release switch, default options plus thread local storage emulation. This threadTests call takes many parameters (like category and options and filename), which is just distracting when you copy and paste it over and over again, so here we want to say “I have a test that is called tactors, I have a test that is called tactors2 and I have a test that is called threadex” and by shortening this you are now working at the level of abstraction that you actually want to work on:

test "tactors"
test "tactors2"
test "threadex"

You can shorten this further, since all these test invocations are kind of annoying. What I really want to say is:

test "tactors", "tactors2", "threadex"

Here is a simple macro that does that:

import macros

macro apply(caller: untyped;
            args: varargs[untyped]): untyped =
  result = newStmtList()
  for a in args:
    result.add(newCall(caller, a))

apply test, "tactors", "tactors2", "threadex"

Since it is so simple, it is not able to accomplish the full thing and you need to say apply test. This macro produces a list of statements, and every statement inside this list is actually a call expression calling this test with a (a is the current argument and we iterate over every argument).

The details are not really that important, the crucial insight here is that Nim gives you the capabilities of doing these things and once you get used to it, it is remarkably easy.

Typesafe Writeln/Printf

The next example is a macro system that gives us a type safe printf:

proc write(f: File; a: int) = echo a
proc write(f: File; a: bool) = echo a
proc write(f: File; a: float) = echo a

proc writeNewline(f: File) =
  echo "\n"

macro writeln*(f: File; args: varargs[typed]) =
  result = newStmtList()
  for a in args:
    result.add newCall(bindSym"write", f, a)
  result.add newCall(bindSym"writeNewline", f)
  • Same thing as before, we create a statement list in the first line of the macro and then we iterate over every argument and we produce a function call called write.
  • The bindSym"write" binds write but this is not a single write, it is an overloaded operation because I have three write operations at the start of the example (for int, bool and float), and overloading resolution kicks in and picks the right write operation.
  • Finally, the last line of the macro, there is a call to a writeNewline function that was declared earlier (which produces a new line).

A practical language

The compiler is smart but:

Don’t get in the programmer’s way

We have a tremendous amount of code written in C++, C and Javascript that programmers really need to reuse. We accomplishing this interoperability with C++, C and JavaScript, by compiling Nim to these languages. Note that this is for interoperability, the philosophy is not: “let’s use C++ plus Nim, because Nim does not offer some features that are required to get the job done”. Nim does indeed offer low level features such as:

  • bit twiddling,
  • unsafe type conversions (“cast”),
  • raw pointers.

Interfacing with C++ is the last resort, usually we want you to write Nim code and not leave Nim code, but then the real world kicks in and says: “hey, there’s a bunch of code already written in these languages, how about you make the interoperability with these systems very good”.

We do not want Nim to be just one language out of many and then you use different programming languages to accomplish your system. Ideally, you only use the Nim language because that is much cheaper to do. Then you can hire programmers that only know a single programming language rather than four (or whatever you need).

The interoperability story goes so far that we actually offer an emit statement where you can directly put foreign code into your Nim code and the compiler merges these two things together in the final file.

Here is an example:

{.emit: """
static int cvariable = 420;
""".}

proc embedsC() =
  var nimVar = 89
  {.emit: ["""fprintf(stdout, "%d\n", cvariable + (int)""",
    nimVar, ");"].}

embedsC()

You can emit a static int cvariable and the communication is two way, so you can emit a fprintf statement where the variable nimVar is actually coming from Nim (using the bracket notation you can have both strings and name expressions in the same environment). The C code can use Nim code and viceversa. However, this is really not a good way to do this interfacing, it is just to show that we want you to be able to get things done.

A much better way for interoperability is where you can actually tell Nim: “hey, there is an fprintf function, it is coming from C and these are its types, I want to be able to call it”. Still, the emit pragma gets the point across very well that we want this language to be practical.

Customizable memory management

Now a different topic, so far we did not talk about memory management. In the newer Nim versions it is based on destructors and it is called the gc:arc or gc:orc mode. Destructors and ownership are hopefully familiar notions from C++ and Rust.

The sink parameter here means that the function gets ownership of the string (and then it does not do anything with x):

func f(x: sink string) =
  discard "do nothing"

f "abc"

The question is: “did I produce a memory leak? what happens?”. You can ask the Nim compiler: “hey, expand this function f for me; show me where the destructors are, where moves are performed, where deep copies are performed” (compile with nim c --gc:orc --expandArc:f $file).

The compiler would tell you “look, function f is actually your discard statement and I added this call to the destructor at the end”:

func f(x: sink string) =
  discard "do nothing"
  `=destroy`(x)

The nice thing is that Nim’s intermediate language is Nim itself, so Nim is this one language that can express everything very well.

Here I have a different example:

var g: string

proc f(x: sink string) =
  g = x

f "abc"

This time I take ownership of x and I really do something with the ownership, namely I put x into this global variable g. Again, we can ask the compiler what does it do and the compiler says: “this is a move operation, it is called =sink”. So we move the x into the g and the move takes care of freeing what is inside the g (if there is something) and then it takes x’s value over:

var g: string

proc f(x: sink string) =
  `=sink`(g, x)

f "abc"

What it really did, and unfortunately that is not really visible here, is that it says: “okay, x is moved into g and then we say x was moved and call the destructor”, but these wasMoved and =destroy calls cancel out so that the compiler optimized this for us:

var g: string

proc f(x: sink string) =
  `=sink`(g, x)
  # optimized out:
  wasMoved(x)
  `=destroy`(x)

f "abc"

A custom container

You can use these moves, destructors and copy assignments to create custom data structures.

Here I have a short example, but I will not go into much details.

Destructor:

type
  myseq*[T] = object
    len, cap: int
    data: ptr UncheckedArray[T]

proc `=destroy`*[T](x: var myseq[T]) =
  if x.data != nil:
    for i in 0..<x.len: `=destroy`(x[i])
    dealloc(x.data)

Move operator:

proc `=sink`*[T](a: var myseq[T]; b: myseq[T]) =
  # move assignment, optional.
  # Compiler is using `=destroy` and
  # `copyMem` when not provided
  `=destroy`(a)
  a.len = b.len
  a.cap = b.cap
  a.data = b.data

Assignment operator:

proc `=copy`*[T](a: var myseq[T]; b: myseq[T]) =
  # do nothing for self-assignments:
  if a.data == b.data: return
  `=destroy`(a)
  a.len = b.len
  a.cap = b.cap
  if b.data != nil:
    a.data = cast[typeof(a.data)](alloc(a.cap * sizeof(T)))
    for i in 0..<a.len:
      a.data[i] = b.data[i]

Accessors

proc add*[T](x: var myseq[T]; y: sink T) =
  if x.len >= x.cap: resize(x)
  x.data[x.len] = y
  inc x.len

proc `[]`*[T](x: myseq[T]; i: Natural): lent T =
  assert i < x.len
  x.data[i]

proc `[]=`*[T](x: var myseq[T]; i: Natural; y: sink T) =
  assert i < x.len
  x.data[i] = y

The point here is that destructors, move operators, … can be written by you for your custom containers and then they work well with Nim’s built-in containers, but it also gives you very precise control over the memory allocations and how they are done.

So another Zen rule is:

Customizable memory management

Zen of Nim

Here are the rules once again as a summary:

  • Copying bad design is not good design: we want to create good design by reasoning from first principles about the problem.
  • If the compiler cannot reason about the code, neither can the programmer.
  • However, don’t get in the programmer’s way. The compiler is a smart dog: you can teach it new tricks and it really helps you out, it can perform tasks for you like carrying a newspaper, but in the end the programmer is still smarter than the compiler.
  • We want to move work to compile-time because programs are run more often than they are compiled.
  • We want customizable memory management.
  • Concise code is not in conflict with readability, it enables readability.
  • There was this rule of Zen that was like leverage meta programming to keep the language small, however it is hard to say that and keep a straight face when Nim really offers quite a lot of features. There is a friction between “we want the language to be complete” and “we want the language to be minimal”. The older Nim gets the more Nim is about completeness (all minimal languages grow to serve certain needs).
  • Optimization is specialization. I have not talked about this yet, but when you need more speed, you should really consider to write custom code. The Nim standard library cannot offer everything to everybody and for us it is also much harder to give you the best library for everything, because the best library must be general purpose, it must be the fastest library, it must have the least amount of overhead for your compile times, and that is very hard to accomplish. It is much easier to say “ok, Nim offers this as a standard library, but here I wrote this myself in 10 lines and I can benchmark it and usually my custom code is faster, because it is hand tailored to the application that I am writing”. So the really is: specialize your code and then it will run fast.
  • Finally, there should be one and only one programming language for everything. That language is Nim.

Thank you for reading!