@feldspath
Last active August 1, 2021 16:00

Stop using malloc! What a good developer needs to know about memory allocation

If you are new to programming, you've probably already heard of the malloc function. If you're more experienced, you have likely used it a couple of times already. A running program needs space to store and read variables, so your computer provides two different kinds of location to put them: the Stack and the Heap. The Stack is used for general purposes and is handled automatically by the program, whereas the Heap is managed by the programmer. Let's look at a quick example: https://gist.github.com/bdeb28620844fbc2b6c95ab15de8cc17
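The linked gist is not reproduced in this text, so here is a minimal sketch of what such a function might look like. Only the names print_example and a come from the article; the body and the int return value are assumptions.

```cpp
#include <iostream>

// Assumed reconstruction of the print_example function discussed in the text.
int print_example() {
    int a = 3;               // `a` lives on the Stack for the duration of this call
    std::cout << a << '\n';  // prints 3
    return a;                // a copy of the value is returned, not its address
}
```

Returning a copy of the value is safe, because nothing outside the function keeps pointing at the Stack slot once the call is over.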

In this basic function, the variable a is defined on the Stack. The value 3 is simply pushed on top of it, and the C++ compiler keeps track of its address for whenever it needs to be accessed. Once our print_example function returns, the variable goes out of scope: everything that was declared inside the function is discarded and cannot be accessed anymore. This is a basic concept you should already know.

Let us play with that concept of scope and try to challenge the compiler. Let's say we need a function that initializes a variable and returns a pointer to it, which is often the case in C, where there is no such thing as classes and constructors. https://gist.github.com/26aaedee1ef44e56fd8e23270add1ef3

Can you guess what will be printed on the console? The compiler doesn't seem to like it: https://gist.github.com/ac9e79cefc5a966ca1f2c0aaa734ec03

The program compiles but crashes with a segmentation fault, meaning we tried to access a location we shouldn't have. This doesn't work because, as we said, the variable a is discarded after the function returns, so the pointer to a that we got in the main function is dangling. To do this properly, we need to allocate the variable on the Heap using malloc. Here is the corrected function: https://gist.github.com/af7e0b213664f375b1c2d60e2aaa248c
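Since the gist is not reproduced in this text, here is a minimal sketch of what a heap-allocating version might look like. The name create_int is an assumption, not the author's, and the null check is an extra precaution for the case where the allocation fails.

```cpp
#include <cstdlib>

// Hypothetical helper: allocates an int on the Heap and initializes it to 3.
// The caller becomes responsible for passing the result to std::free.
int* create_int() {
    // Request enough bytes for one int; malloc returns a null pointer on failure.
    int* a = static_cast<int*>(std::malloc(sizeof(int)));
    if (a != nullptr) {
        *a = 3;  // the memory outlives this function, so writing here is safe
    }
    return a;    // not dangling: the allocation is not tied to this scope
}
```

A caller would check the returned pointer, read or write through it, and eventually hand it to std::free.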

malloc returns the address of a free location of the requested size (in bytes), here the size of an int, or a null pointer if the request cannot be satisfied. The variable doesn't get discarded because it does not belong to the context of the function, but rather to the program itself. Thus, the variable is accessible from anywhere in the program that holds its address, even though it is not a global variable! We now get the expected output: https://gist.github.com/1433a0b2966386e905583fda308b04e2

When a variable is allocated on the Heap, it is important to free that memory once we no longer need it, or we might end up with a memory leak. This very annoying problem is hard to eliminate entirely in large applications, as it requires being extremely careful and meticulous.
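As an illustration, here is a small sketch that pairs one malloc with exactly one free. The function name sum_of_squares and its behavior are made up for this example.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical example: stores the first n squares in a temporary Heap buffer,
// adds them up, then releases the buffer. Returns -1 if the allocation fails.
int sum_of_squares(int n) {
    if (n <= 0) {
        return 0;  // nothing to allocate
    }
    int* squares = static_cast<int*>(
        std::malloc(static_cast<std::size_t>(n) * sizeof(int)));
    if (squares == nullptr) {
        return -1;
    }
    int total = 0;
    for (int i = 0; i < n; ++i) {
        squares[i] = i * i;
        total += squares[i];
    }
    std::free(squares);  // without this line, every call would leak n ints
    return total;
}
```

Here the buffer is only needed inside the function, so the free sits right before the return.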

If you forget about it and run your code, don't worry: the memory won't be gone forever, because the memory allocated for a program's Heap is reclaimed once the program terminates. That is why memory leaks are even more annoying in programs that run for extended periods of time, such as video games.

For those who want to know more about the differences between the Stack and the Heap, I recommend taking a look at this article. You will also learn when to use the Heap instead of the Stack. You should remember two things:

  • avoid creating huge objects on the Stack (its size is limited, and exceeding it leads to a stack overflow)
  • be careful about the scope of your variables, and ask yourself whether the pointer you're returning might be dangling
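To illustrate the first point, here is a sketch of a large buffer placed on the Heap rather than on the Stack. The function name and the buffer size are assumptions. The default Stack size depends on the platform, but it is often only a few megabytes.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical size: 4 million ints is roughly 16 MB with 4-byte ints.
const std::size_t kCount = 4000000;

// Fills a large Heap buffer with ones; returns false if the allocation fails.
bool fill_large_buffer() {
    // int big[kCount];  // as a local array, this could overflow the Stack
    int* big = static_cast<int*>(std::malloc(kCount * sizeof(int)));
    if (big == nullptr) {
        return false;  // the allocation failed, so there is nothing to free
    }
    for (std::size_t i = 0; i < kCount; ++i) {
        big[i] = 1;
    }
    bool ok = (big[0] == 1 && big[kCount - 1] == 1);
    std::free(big);  // release the buffer once we are done with it
    return ok;
}
```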

So this is basically how to use heap memory allocation. It's actually not that complicated to understand: we request memory of a given size, write to it, and free the space when we don't need it anymore. The difficult part is watching out for memory leaks, which can be a real pain to deal with because of the unexpected paths a program can take. For example, if your function throws an exception and exits early, the call to free might never be reached, and the memory it was supposed to release becomes unusable. Because raw malloc and free are so difficult to handle correctly, they should be avoided unless we really know what we're doing or our application is simple enough (and even then, we can always be surprised).
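The exception scenario can be sketched as follows. The function process and the counter live_allocations are hypothetical and exist only to make the leak visible. A real program would not count its allocations this way.

```cpp
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical counter of blocks that were allocated but not yet freed.
int live_allocations = 0;

// Allocates an int, then either fails halfway through or cleans up normally.
void process(bool fail) {
    int* value = static_cast<int*>(std::malloc(sizeof(int)));
    if (value == nullptr) {
        throw std::bad_alloc();
    }
    ++live_allocations;
    *value = 3;
    if (fail) {
        // Early exit: the free below is skipped, so the block is leaked.
        throw std::runtime_error("something went wrong");
    }
    std::free(value);
    --live_allocations;
}
```

After a call to process(true) whose exception is caught by the caller, live_allocations stays at 1: the block is still allocated, but no pointer to it remains.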

So what are the solutions to this problem? One possibility is to use the RAII programming technique, which stands for Resource Acquisition Is Initialization. It means the memory is acquired when the object is initialized. That way, the allocated memory is bound to the lifetime of the object: if the object goes out of scope and is discarded, its destructor cleans up the heap-allocated memory for us. Here's what a memory allocator class could look like: https://gist.github.com/2c34892ffdc555aa7aa11e0d2b8528d2

It is pretty straightforward to use: we instantiate the object and ask it for the address of the memory it allocated. https://gist.github.com/179c452b8a60a2404bf600d9de3e3847
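The two gists are not reproduced in this text, so here is a minimal sketch of what such a class and its usage might look like. Only the name MemoryAllocator comes from the article. The get method, the deleted copy operations, and the use_allocator function are assumptions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Assumed shape of the class: it owns one block of Heap memory.
class MemoryAllocator {
public:
    // Acquire the memory when the object is initialized.
    explicit MemoryAllocator(std::size_t size) : memory_(std::malloc(size)) {
        if (memory_ == nullptr) {
            throw std::bad_alloc();
        }
    }

    // Release the memory when the object is destroyed.
    ~MemoryAllocator() { std::free(memory_); }

    // Copying would make two objects free the same block, so it is disabled.
    MemoryAllocator(const MemoryAllocator&) = delete;
    MemoryAllocator& operator=(const MemoryAllocator&) = delete;

    // Address of the block owned by this object.
    void* get() const { return memory_; }

private:
    void* memory_;
};

// Hypothetical usage: no explicit call to free appears anywhere.
int use_allocator() {
    MemoryAllocator allocator(sizeof(int));
    int* a = static_cast<int*>(allocator.get());
    *a = 3;
    return *a;  // the value is copied out before `allocator` is destroyed
}               // the destructor of `allocator` frees the block here
```

Disabling copies is a design choice made for this sketch. Without it, a copied object would try to free the same block a second time.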

Now, when the MemoryAllocator instance goes out of scope and is discarded, its destructor is called and the allocated memory is automatically freed. If every single heap allocation is made through RAII, we don't have to worry about memory leaks anymore. Even when a function throws an exception, its local variables are discarded, and so is the MemoryAllocator instance.
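The exception case can be checked with a small sketch. The class below is an assumed, instrumented variant of MemoryAllocator. The counter live_allocations and the function process are hypothetical and only there to observe what the destructor does.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical counter of blocks currently owned by MemoryAllocator objects.
int live_allocations = 0;

class MemoryAllocator {
public:
    explicit MemoryAllocator(std::size_t size) : memory_(std::malloc(size)) {
        if (memory_ == nullptr) {
            throw std::bad_alloc();
        }
        ++live_allocations;
    }
    ~MemoryAllocator() {
        std::free(memory_);
        --live_allocations;
    }
    MemoryAllocator(const MemoryAllocator&) = delete;
    MemoryAllocator& operator=(const MemoryAllocator&) = delete;

    void* get() const { return memory_; }

private:
    void* memory_;
};

// Allocates through RAII, then optionally fails halfway through.
void process(bool fail) {
    MemoryAllocator allocator(sizeof(int));
    *static_cast<int*>(allocator.get()) = 3;
    if (fail) {
        // Stack unwinding destroys `allocator`, which frees the block.
        throw std::runtime_error("something went wrong");
    }
}
```

When a caller catches the exception thrown by process(true), live_allocations is back to 0, whereas the same function written with a raw malloc and free would have leaked the block.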

RAII has really become the norm now, and modern programming languages such as Rust force the programmer to use it by tying the existence of an object to its scope. This is called the lifetime of an object. If you want to learn more about safe programming techniques, I recommend giving Rust a try. You might discover programming from another point of view.
