Last active
April 6, 2018 17:28
Memory allocation, NULL, dereferencing.
// After you think you understand this program, try writing it yourself until
// you can get it to work, the first time, without error ^_^
//
// ALSO: remember:
//   $ gcc program.c  # compile
//   $ ./a.out        # run

// We'll get `printf` from standard input/output's header file
#include <stdio.h>

// We'll get `malloc` and `free` from the standard library's header file.
#include <stdlib.h>

// We're telling C that we're going to use memory that has a certain structure,
// which we'll call "struct s". The "struct s" has an integer named "var1",
// and after that, it has a pointer to another "struct s", named "var2". Remember
// that a "pointer" is simply a number whose value is an address in memory.
struct s {
  int       var1;
  struct s* var2;
};

// Because it's annoying to say "struct s" everywhere, we're going to tell the
// C compiler to define a new type, which is an alias for "struct s", and is
// named "S". Whitespace added to make it easier for humans to read.
typedef struct s   S;

int main(int argc, char** argv) {
  // Here, we will use `malloc`, short for "memory allocate". This will set aside
  // some heap memory for us to use. In this case, we could have used memory on
  // the stack, and it would have been just as good, but once we have more than
  // one function, we'll often need memory that outlives the function call
  // (the stack frame) that created it.
  //
  // The C compiler knows how big S is, because it knows how big an int and a
  // pointer are (and how much padding it puts between them). As such, we can
  // say `sizeof(S)`, which the compiler will see, and replace with the actual
  // size of S. Thus, we will allocate enough memory to hold three S's, whose
  // addresses we'll store in `first`, `second`, and `third`.
  //
  // `malloc` returns the address of the first byte of memory that it set aside
  // for us. But what kind of memory is that? `malloc` only receives a number of
  // bytes, so it has no idea that our `sizeof(S)` means the memory will hold an
  // `S`. So, it returns a `void*`, a pointer to "anything". But *we* know that
  // it's the first address of a chunk of memory that will hold an S. So, after
  // allocating our memory, we "typecast" it with `(S*)`. This tells C that the
  // address returned from `malloc` should be treated as a pointer to an S.
  // (C will also convert a `void*` to an `S*` implicitly on assignment, so the
  // cast is optional here. C++ requires it.)
  //
  // There are two benefits to this bookkeeping:
  //   1. The compiler needs to know the memory layout of everything we do. This
  //      allows it to do things like refer to specific offsets of memory by
  //      name, rather than requiring us to calculate the offset ourselves.
  //   2. It will catch certain errors that we might make ("type errors", we have
  //      these in Ruby, too)
  S* first  = (S*) malloc(sizeof(S));
  S* second = (S*) malloc(sizeof(S));
  S* third  = (S*) malloc(sizeof(S));

  // `malloc` returns NULL when it can't set aside the memory, so check before
  // using the pointers. Passing NULL to `free` is allowed and does nothing.
  if (first == NULL || second == NULL || third == NULL) {
    free(first);
    free(second);
    free(third);
    return 1;
  }

  // Now, let's set our integers. We have a pointer to an S, which means that
  // we need to go to the location that was allocated, and set the memory there
  // to equal our number. To go to the location that a pointer is storing, we
  // use the asterisk, the same symbol we used to declare the pointers. This is
  // called "dereferencing" (take a moment to think about why). To refer to one
  // of the things in S (these are called "members", or informally "fields"),
  // we can use a dot (period), and then refer to it by name. This works because
  // C knows that `first` is an `S*`, so if we dereference an `S*`, then we get
  // an `S`, and an `S` is a `struct s`, which has an `int` named `var1` at
  // offset 0 (the same location the pointer is pointing at), and a `struct s`
  // pointer named `var2` further in: after the int, plus any padding the
  // compiler adds so that the pointer is properly aligned.
  (*first).var1  = 111;
  (*second).var1 = 222;
  (*third).var1  = 333;

  // Now, let's link our structs together. Because `var2` is a `struct s*`, we
  // can assign it an `S*`. However, we only have three, so for the third one,
  // we have nothing to link it to. For that situation, we will set its pointer
  // to `NULL`. You can think about `NULL` like you think about `nil` in Ruby.
  // However, in C, `NULL` is really just zero in pointer form: a "null pointer
  // constant", often defined as `((void*)0)`. C wouldn't like it if we assigned
  // an arbitrary int to a `struct s*`. A literal `0` is the one exception (it
  // is also accepted as a null pointer), but writing `NULL` makes it obvious
  // to humans that we mean "this points at nothing".
  (*first).var2  = second;
  (*second).var2 = third;
  (*third).var2  = NULL;

  // Now, let's iterate over our three S's and print out their integers!
  // We'll make a new `S*` to hold the one we're iterating over. We can pass
  // it as the condition to a while loop, because we will finish when cursor
  // takes on third's var2, which is NULL. Remember that `NULL` compares equal
  // to zero (on my 64 bit machine, it is stored as 64 bits worth of zeros).
  // C's conditionals don't need a proper boolean. C99 added `_Bool` (and `bool`
  // through <stdbool.h>), but a condition still just considers zero to be
  // false, and anything which isn't zero to be true.
  // As a consequence, `NULL` will behave like `false` in a conditional.
  S* cursor = first;
  while(cursor) {
    printf("number: %d\n", (*cursor).var1);
    cursor = (*cursor).var2;
  }

  // Our program is about to end, so there's technically no harm in omitting this,
  // but let's not be sloppy! We set aside some memory for our three `S` structs,
  // and now we want to say we're done with it. This will make that memory
  // available again, for future calls to `malloc`. If we don't do this, then
  // that memory is still set aside for us, even though we're done using it. If
  // we weren't exiting our program immediately after, this would result in a
  // situation known as a "memory leak". Basically, the program will get bigger
  // and bigger, because unused memory is not being made available again, so
  // `malloc` will have to keep asking the operating system to set aside more
  // memory for it to use.
  // In languages like Ruby, you don't have to do this, because there is a bit
  // of code called a "garbage collector", which keeps track of all the pointers,
  // and periodically inspects them to see which memory is no longer being pointed
  // at. Then it frees that memory on its own, without you needing to worry about it.
  free(first);
  free(second);
  free(third);
  return 0;
}