@lobre
Last active March 15, 2024 20:43
Zig type system illustrated using ascii diagrams

Zig Type System

Zig aims to be a simple language. What "simple" means exactly is hard to define, but Zig is also a low-level programming language that aims for C compatibility. To reach that goal, its type system needs precise semantics so that developers have a complete toolbox for manipulating data.

Types in Zig are therefore composable, but the combinations can quickly become overwhelming. Look at the examples below. Can you understand each of them at a glance?

*const ?u8
?*const u8
*const [2]u8
[]?u8
?[]u8
[*]u8
[*:0]u8
*[]const u8
*[]*const ?u8

They seemed complex to me, even though the following video by Loris Cro helped a lot.

https://www.youtube.com/watch?v=VgjRyaRTH6E

When I don’t understand something, I like to draw diagrams to illustrate the concepts. I do the same, for example, to understand how commits relate to each other in git.

So here is my take on the zig type system. Feel free to comment if anything is wrong or unclear.

Types

It all starts with boxes.

┌───┐
│   │
└───┘

Boxes come in different sizes, and each size is called a type. u8, u16 and u2 are examples.

┌───┐     ┌─────┐      ┌─┐
│   │ u8  │     │ u16  │ │ u2
└───┘     └─────┘      └─┘

There are many other types in zig, but we will use u8 in this document to illustrate the concepts. In the end, types are just representations of different things of different sizes.

Variables

A variable is a named box having a type and a value.

 ┌───┐
a┤ 1 │   var a: u8 = 1;
 └───┘

Here, the box named a of type u8 holds the value 1.

Constants

A constant is a box whose value cannot change after initialization. Picture a variable box with jail bars: once the value is inside, the box can no longer be opened or changed.

 ┌┰─┰┐
a┤┃1┃│   const a: u8 = 1;
 └┸─┸┘

Optionals

An optional variable holds either a value or null. The box accommodates both and is drawn here split into a top and a bottom half. The top half shows the value when the box is filled, and the bottom half shows ∅ when it is null.

 ┌───┐                        ┌───┐ 
 │ 1 │                        │   │
a┼─?─┤   var a: ?u8 = 1;     a┼─?─┤   var a: ?u8 = null;
 │   │                        │ ∅ │
 └───┘                        └───┘
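
To get the value back out of an optional box, it has to be unwrapped. Here is a minimal sketch of two common ways to do that (`orelse` and `if` with a capture). The helper name `valueOr` is hypothetical and only there for illustration.

```zig
const std = @import("std");

// Hypothetical helper: returns the payload, or `fallback` when the optional is null.
fn valueOr(opt: ?u8, fallback: u8) u8 {
    return opt orelse fallback;
}

pub fn main() void {
    var a: ?u8 = 1;

    // `if` with a capture only runs its body when the box is filled.
    if (a) |value| {
        std.debug.print("a holds {d}\n", .{value});
    }

    a = null; // empty the box
    std.debug.print("a or 42: {d}\n", .{valueOr(a, 42)});
}
```

The point is that a ?u8 cannot be used directly where a u8 is expected: the null case always has to be handled first.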

Arrays

An array holds multiple values of the same type. The boxes are visually represented attached to each other.

 ┌───┐┌───┐┌───┐
a┤ 1 ├┤ 2 ├┤ 3 │   var a: [3]u8 = [_]u8{ 1, 2, 3 };
 └───┘└───┘└───┘

Values in the array can be accessed using indexes starting at 0. For example, a[0] has the value 1 and can be changed with a[0] = 2.

Here is an array of optional u8 values.

 ┌───┐┌───┐┌───┐
 │ 1 ││   ││ 3 │
a┼─?─┼┼─?─┼┼─?─┤   var a: [3]?u8 = [_]?u8{ 1, null, 3 };
 │   ││ ∅ ││   │
 └───┘└───┘└───┘

An array can also be constant. Its elements then cannot be changed.

 ┌┰─┰┐┌┰─┰┐┌┰─┰┐
a┤┃1┃├┤┃2┃├┤┃3┃│   const a: [3]u8 = [_]u8{ 1, 2, 3 };
 └┸─┸┘└┸─┸┘└┸─┸┘

An array can be zero-terminated. This means there is an additional 0 value stored at the index equal to its length, and the compiler lets you access that element instead of reporting an out-of-bounds error.

 ┌───┐┌───┐┌───┐┌───┐
a┤ 1 ├┤ 2 ├┤ 3 ├┤ 0 │   var a: [3:0]u8 = [_:0]u8{ 1, 2, 3 };
 └───┘└───┘└───┘└───┘
                        std.debug.print("{}\n", .{a[3]}); // correct and prints 0

The sentinel does not have to be zero. Any other value works ([3:2]u8 or even [3:'f']u8 for example).
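
As a small sketch of the sentinel behavior described above, the following program reads the element at index `len` for both a zero sentinel and a custom one. The helper `sentinelOf` is a hypothetical name introduced for this example.

```zig
const std = @import("std");

// Hypothetical helper: returns the element stored at index `len`, i.e. the sentinel.
fn sentinelOf(a: [3:0]u8) u8 {
    return a[a.len];
}

pub fn main() void {
    var a: [3:0]u8 = [_:0]u8{ 1, 2, 3 };
    a[0] = 9; // regular elements of a `var` array can be changed

    // The length does not count the sentinel, but index a.len is still readable.
    std.debug.print("len={d} first={d} sentinel={d}\n", .{ a.len, a[0], sentinelOf(a) });

    // Same idea with 'f' as the sentinel value.
    const f: [3:'f']u8 = [_:'f']u8{ 1, 2, 3 };
    std.debug.print("custom sentinel={c}\n", .{f[3]});
}
```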

Pointers

A pointer holds the address of another variable. In the diagrams, & stands for that address for simplicity, but in a real program it is an actual memory address. The arrow shows which variable the stored address refers to.

       ┌───┐
      a┤ 1 │   var a: u8 = 1;
       └─▲─┘
 ┌───┐   │
p┤ & ├───┘     var p: *u8 = &a; // &a is the address of a
 └───┘
               p.* = 2; // the value of a can be changed through p by dereferencing it with .*

If a pointer is constant, it cannot be reassigned, but the variable it points to can still be changed.

       ┌───┐
      a┤ 1 │   var a: u8 = 1;
       └─▲─┘
 ┌┰─┰┐   │
p┤┃&┃├───┘     const p: *u8 = &a;
 └┸─┸┘
               p.* = 2; // changing a through p is correct
               p = &c; // changing p directly is incorrect

A pointer can also point to a constant instead of a variable.

       ┌┰─┰┐
      a┤┃1┃│   const a: u8 = 1;
       └┸▲┸┘
 ┌───┐   │
p┤ & ├───┘     var p: *const u8 = &a;
 └───┘
               p.* = 2; // incorrect
               p = &c; // correct

A pointer to a variable can be coerced to a pointer to a constant, but not the opposite.

       ┌───┐                                           ┌┰─┰┐
      a┤ 1 │   var a: u8 = 1;                         a┤┃1┃│   const a: u8 = 1;
       └─▲─┘                                           └┸▲┸┘
 ┌───┐   │                                       ┌───┐   │
p┤ & ├───┘     var p: *u8 = &a;                 p┤ & ├───┘     var p: *const u8 = &a;
 └───┘                                           └───┘
               var p2: *const u8 = p; // correct               var p2: *u8 = p; // incorrect
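
The coercion rule is easiest to see with function parameters. Below is a minimal sketch, assuming two hypothetical helpers `read` and `increment`: one only needs read access and takes a *const u8, the other needs write access and takes a *u8.

```zig
const std = @import("std");

// Hypothetical helper: read-only access is enough, so it takes *const u8.
fn read(p: *const u8) u8 {
    return p.*;
}

// Hypothetical helper: it writes through the pointer, so it needs *u8.
fn increment(p: *u8) void {
    p.* += 1;
}

pub fn main() void {
    var a: u8 = 1;
    const p: *u8 = &a; // p itself is constant, but a can change through it

    increment(p);
    // *u8 coerces to *const u8 implicitly when calling read.
    std.debug.print("a = {d}\n", .{read(p)});

    const c: u8 = 5;
    // increment(&c) would not compile: &c is a *const u8.
    std.debug.print("c = {d}\n", .{read(&c)});
}
```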

A pointer can point to an optional variable.

       ┌───┐
       │   │
      a┼─?─┤   var a: ?u8 = null;
       │ ∅ │
       └─▲─┘
 ┌───┐   │
p┤ & ├───┘     var p: *?u8 = &a;
 └───┘

Or a pointer can itself be optional.

       ┌───┐
      a┤ 1 │   var a: u8 = 1;
       └─▲─┘
 ┌───┐   │
 │ & ├───┘     var p: ?*u8 = &a;
p┼─?─┤
 │   │         p = null; // correct
 └───┘

A pointer can point to a constant value that is optional.

       ┌┰─┰┐
       │┃2┃│
      a┼╂?╂┤   const a: ?u8 = 2;
       │┃ ┃│
       └┸▲┸┘
 ┌───┐   │
p┤ & ├───┘     var p: *const ?u8 = &a;
 └───┘

An optional pointer can also point to a constant.

       ┌┰─┰┐
      a┤┃1┃│   const a: u8 = 1;
       └┸▲┸┘
 ┌───┐   │
 │ & ├───┘     var p: ?*const u8 = &a;
p┼─?─┤
 │   │
 └───┘
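
The difference between ?*u8 (the pointer may be missing) and *?u8 (the pointed value may be missing) shows up when using them. Here is a rough sketch of both, where `setIfPresent` is a hypothetical helper name.

```zig
const std = @import("std");

// Hypothetical helper: writes through the pointer only when it is not null.
fn setIfPresent(p: ?*u8, value: u8) bool {
    if (p) |ptr| {
        ptr.* = value;
        return true;
    }
    return false;
}

pub fn main() void {
    var a: u8 = 1;
    var p: ?*u8 = &a;
    _ = setIfPresent(p, 2); // p is filled, so a becomes 2
    p = null;
    _ = setIfPresent(p, 3); // p is null, so nothing is written
    std.debug.print("a = {d}\n", .{a});

    // The opposite layering: the pointer is always valid, the pointee may be null.
    var b: ?u8 = null;
    const q: *?u8 = &b;
    q.* = 7;
    std.debug.print("b = {d}\n", .{b orelse 0});
}
```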

A pointer can point to an array. This one points to an array of u8.

       ┌───┐┌───┐┌───┐
      a┤ 1 ├┤ 2 ├┤ 3 │   var a: [3]u8 = [_]u8{ 1, 2, 3 };
       └─▲─┘└───┘└───┘
 ┌───┐   │
p┤ & ├───┘               var p: *[3]u8 = &a;
 └───┘

This one points to a constant array of u8.

       ┌┰─┰┐┌┰─┰┐┌┰─┰┐
      a┤┃1┃├┤┃2┃├┤┃3┃│   const a: [3]u8 = [_]u8{ 1, 2, 3 };
       └┸▲┸┘└┸─┸┘└┸─┸┘
 ┌───┐   │
p┤ & ├───┘               var p: *const [3]u8 = &a;
 └───┘

A pointer can point to an unknown number of u8 values. This is called a many-item pointer.

       ┌───┐┌───┐┌───┐┌───┐
      a┤ 1 ├┤ 2 ├┤ … ├┤ 5 │   var a: [5]u8 = [_]u8{ 1, 2, 3, 4, 5 };
       └─▲─┘└───┘└───┘└───┘
 ┌───┐   │
p┤ & ├───┘                    var p: [*]u8 = &a;
 └───┘

The advantage over a regular pointer to a single u8 (*u8) is that the type says there can be many u8 values at this address. The type system just does not know how many.
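
Because the length is not part of a [*]u8, whoever uses the pointer has to carry it separately. The sketch below illustrates this with a hypothetical `sum` helper that takes the length as a second parameter.

```zig
const std = @import("std");

// Hypothetical helper: a [*]const u8 carries no length, so the caller must supply one.
fn sum(ptr: [*]const u8, len: usize) u32 {
    var total: u32 = 0;
    var i: usize = 0;
    while (i < len) : (i += 1) {
        total += ptr[i]; // indexing is allowed, but there is no bounds check
    }
    return total;
}

pub fn main() void {
    var a = [_]u8{ 1, 2, 3, 4, 5 };
    const p: [*]u8 = &a; // *[5]u8 coerces to [*]u8
    p[0] = 10;
    std.debug.print("sum = {d}\n", .{sum(p, a.len)});
}
```

Passing a wrong length here would read past the array, which is one reason slices are usually preferred when the length is known.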

A pointer can also point to an unknown number of u8 values terminated by a zero.

       ┌───┐┌───┐┌───┐┌───┐┌───┐
      a┤ 1 ├┤ 2 ├┤ … ├┤ 5 ├┤ 0 │   var a: [5:0]u8 = [_:0]u8{ 1, 2, 3, 4, 5 };
       └─▲─┘└───┘└───┘└───┘└───┘
 ┌───┐   │
p┤ & ├───┘                         var p: [*:0]u8 = &a;
 └───┘
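
With a [*:0]u8, the length can be recovered by walking until the sentinel. Here is a minimal sketch doing it by hand with a hypothetical `countUntilZero` helper, next to `std.mem.len` from the standard library, which serves the same purpose.

```zig
const std = @import("std");

// Hypothetical helper: walks to the 0 sentinel to recover the length.
fn countUntilZero(ptr: [*:0]const u8) usize {
    var n: usize = 0;
    while (ptr[n] != 0) : (n += 1) {}
    return n;
}

pub fn main() void {
    var a: [5:0]u8 = [_:0]u8{ 1, 2, 3, 4, 5 };
    const p: [*:0]u8 = &a; // *[5:0]u8 coerces to [*:0]u8
    std.debug.print("manual={d} std.mem.len={d}\n", .{ countUntilZero(p), std.mem.len(p) });
}
```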

Conversely, here is an array of pointers to u8 values.

 ┌───┐  ┌───┐  ┌───┐   var a: u8 = 1;
a┤ 1 │ b┤ 2 │ c┤ 3 │   var b: u8 = 2;
 └─▲─┘  └─▲─┘  └─▲─┘   var c: u8 = 3;
   │    ┌─┘      │
   │    │    ┌───┘
 ┌─┴─┐┌─┴─┐┌─┴─┐
p┤ & ├┤ & ├┤ & │       var p: [3]*u8 = [_]*u8{ &a, &b, &c };
 └───┘└───┘└───┘

And to finish with pointers, they can also point to other pointers.

              ┌───┐
             a┤ 1 │   var a: u8 = 1;
              └─▲─┘
        ┌───┐   │
      p1┤ & ├───┘     var p1: *u8 = &a;
        └─▲─┘
  ┌───┐   │
p2┤ & ├───┘           var p2: **u8 = &p1;
  └───┘

Slices

A slice is a pointer to part or all of an array, paired with a length known at runtime. In the slice box, & is the address of the first element of the slice, and the number after the colon is its length.

A slice can be initialized from a pointer to the backing array (the compiler knows how to coerce one into the other), and its length is then set to the length of the array. This means a pointer to an array can always be coerced into a slice, but not the other way around, because the compiler cannot know the length of the array from the slice at compile time.

         ┌───┐┌───┐┌───┐
        a┤ a ├┤ b ├┤ c │   var a: [3]u8 = [_]u8{ 'a', 'b', 'c' };
         └─[─┘└───┘└─]─┘
 ┌───┐     │
p┤ & ├─────┤               var p: *[3]u8 = &a;
 └───┘     │
 ┌─────┐   │
s┤ &:3 ├───┘               var s: []u8 = &a;      // directly pointing to a
 └─────┘                   var s: []u8 = p;       // or assigned from p
                           var s: []u8 = a[0..3]; // or from a range of the array
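
Since a slice carries its own length, functions can take one without a separate length parameter. The following sketch uses a hypothetical `fill` helper to show that, and also that a sub-slice still shares the backing array.

```zig
const std = @import("std");

// Hypothetical helper: a slice knows its own length, so no extra parameter is needed.
fn fill(s: []u8, value: u8) void {
    for (s) |*item| {
        item.* = value;
    }
}

pub fn main() void {
    var a = [_]u8{ 'a', 'b', 'c' };
    const p: *[3]u8 = &a;
    const s: []u8 = p; // a pointer to an array coerces to a slice

    fill(s[1..], 'z'); // the sub-slice writes into the same backing array
    std.debug.print("{s} (len {d})\n", .{ s, s.len });
}
```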

A zero-terminated slice guarantees that a zero value exists at the index equal to its length.

         ┌───┐┌───┐┌───┐┌───┐
        a┤ a ├┤ b ├┤ c ├┤ 0 │   var a: [3:0]u8 = [_:0]u8{ 'a', 'b', 'c' };
         └─[─┘└───┘└─]─┘└───┘
 ┌─────┐   │
s┤ &:3 ├───┘                    var s: [:0]u8 = &a;
 └─────┘

A slice can be optional.

         ┌───┐┌───┐┌───┐┌───┐
        a┤ a ├┤ b ├┤ c ├┤ d │   var a: [4]u8 = [_]u8{ 'a', 'b', 'c', 'd' };
         └───┘└─[─┘└───┘└─]─┘
 ┌─────┐        │
 │ &:3 ├────────┘               var s: ?[]u8 = a[1..4];
s┼──?──┤
 │     │                        s = null; // correct
 └─────┘

Now, here is a slice of constant u8 values.

         ┌┰─┰┐┌┰─┰┐┌┰─┰┐
        a┤┃1┃├┤┃2┃├┤┃3┃│   const a: [3]u8 = [_]u8{ 1, 2, 3 };
         └┸[┸┘└┸─┸┘└┸]┸┘
 ┌─────┐   │
s┤ &:3 ├───┘               var s: []const u8 = &a;
 └─────┘

A string literal is a pointer to a constant, zero-terminated array of u8 that is known at comptime and stored in the binary.

         ┌┰─┰┐┌┰─┰┐┌┰─┰┐┌┰─┰┐┌┰─┰┐┌┰─┰┐
         │┃h┃├┤┃e┃├┤┃l┃├┤┃l┃├┤┃o┃├┤┃0┃│
         └┸[┸┘└┸─┸┘└┸─┸┘└┸─┸┘└┸]┸┘└┸─┸┘
 ┌───┐     │
p┤ & ├─────┤        var p: *const [5:0]u8 = "hello";
 └───┘     │
 ┌─────┐   │
s┤ &:5 ├───┘        var s: []const u8 = p; // correct
 └─────┘
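
In practice this is why most functions working on text take a []const u8: a string literal coerces to it directly. Here is a small sketch, with `countChar` as a hypothetical helper, that also shows the sentinel-terminated slice variant.

```zig
const std = @import("std");

// Hypothetical helper that accepts any read-only byte slice.
fn countChar(s: []const u8, c: u8) usize {
    var n: usize = 0;
    for (s) |byte| {
        if (byte == c) n += 1;
    }
    return n;
}

pub fn main() void {
    const p: *const [5:0]u8 = "hello";
    const s: []const u8 = p; // the sentinel is dropped from the type
    const z: [:0]const u8 = p; // or kept, with a zero-terminated slice

    std.debug.print("{s}: len={d} l-count={d}\n", .{ s, s.len, countChar(s, 'l') });
    std.debug.print("z[z.len]={d}\n", .{z[z.len]});
}
```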

And to add a level of indirection, here is a slice of pointers pointing to constant u8 values.

         ┌┰─┰┐  ┌┰─┰┐  ┌┰─┰┐   const a: u8 = 1;
        a┤┃1┃│ b┤┃2┃│ c┤┃3┃│   const b: u8 = 2;
         └┸▲┸┘  └┸▲┸┘  └┸▲┸┘   const c: u8 = 3;
           │    ┌─┘      │
           │    │    ┌───┘
         ┌─┴─┐┌─┴─┐┌─┴─┐
        p┤ & ├┤ & ├┤ & │       var p: [3]*const u8 = [_]*const u8{ &a, &b, &c };
         └─[─┘└───┘└─]─┘
 ┌─────┐   │
s┤ &:3 ├───┘                   var s: []*const u8 = p[0..p.len]; // s.len is 3
 └─────┘

Composability

In the end, the Zig type system is just a composition of the previous concepts, and drawing them can help in understanding what is going on behind the scenes. I cannot draw every possible combination as there are many of them, but here is a last, more complex one for fun.

               ┌┰─┰┐  ┌┰─┰┐  ┌┰─┰┐
               │┃1┃│  │┃ ┃│  │┃3┃│   const a: ?u8 = 1;
              a┼╂?╂┤ b┼╂?╂┤ c┼╂?╂┤   const b: ?u8 = null;
               │┃ ┃│  │┃∅┃│  │┃ ┃│   const c: ?u8 = 3;
               └┸▲┸┘  └┸▲┸┘  └┸▲┸┘
                 │    ┌─┘  ┌───┘
               ┌─┴─┐┌─┴─┐┌─┴─┐
              d┤ & ├┤ & ├┤ & │       var d: [3]*const ?u8 = [_]*const ?u8{ &a, &b, &c };
               └─[─┘└───┘└─]─┘
       ┌─────┐   │
       s┤ &:3 ├───┘                   var s: []*const ?u8 = d[0..3];
       └──▲──┘
 ┌───┐    │
p┤ & ├────┘                          var p: *[]*const ?u8 = &s;
 └───┘

In case you did not guess, this is a pointer to a slice of pointers to constant optional u8 values.
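
To check that reading, here is a rough sketch that builds the same structure and walks it layer by layer. The helper `countFilled` is a hypothetical name, and the slice is taken from `&d` so that its length is 3.

```zig
const std = @import("std");

// Hypothetical helper: counts the non-null values reachable through the slice.
fn countFilled(s: []const *const ?u8) usize {
    var n: usize = 0;
    for (s) |ptr| {
        if (ptr.* != null) n += 1;
    }
    return n;
}

pub fn main() void {
    const a: ?u8 = 1;
    const b: ?u8 = null;
    const c: ?u8 = 3;

    var d = [_]*const ?u8{ &a, &b, &c };
    var s: []*const ?u8 = &d;
    const p: *[]*const ?u8 = &s;

    // p.* is the slice, p.*[i] is a pointer, p.*[i].* is the optional value.
    std.debug.print("len={d} filled={d}\n", .{ p.*.len, countFilled(p.*) });
    std.debug.print("first={d}\n", .{p.*[0].* orelse 0});
}
```

Each `.*` or `[i]` peels off exactly one layer of the type, read from left to right.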

Feel free to find other combinations and try to draw them to improve your knowledge about the zig type system!

@ClarkThan

the s box in the most bottom diagram should be: &:3

@lobre
Author

lobre commented Dec 29, 2022

Thanks for reporting that @ClarkThan, I have modified it.

@odalet

odalet commented Dec 29, 2022

In the first diagram in the Slices section, shouldn't

var s: []u8 = p;       // or assigned to p

be from p?

@lobre
Author

lobre commented Dec 29, 2022

be from p?

After re-reading that, from p might be more appropriate effectively! Thanks a lot @odalet

@nahuakang

The diagram in this section for the array of constant pointers to optional u8s has a while it should be d :)

@lobre
Author

lobre commented Jan 3, 2023

It seems I was a bit tired at the end :-). Happy that it is getting reviewed. Thanks @nahuakang for spotting the mistake, I have updated.
