@psmolak
Last active September 10, 2018 23:36
Arrays and Pointers. What's the difference?

This article is meant to explain the differences between arrays and pointers in C.

If you have started learning C, you might have stumbled upon people saying that "an
array is just a pointer to its first element". While somewhat true, this is a huge
simplification. To understand why, we have to understand what arrays and pointers
really are.

Arrays

Every array type is a pair (type, size), where type is the type of its elements,
and size is the number of these elements. Thus int foo[100] and int bar[200]
declare objects of two different types: foo is of type int[100] whereas bar is of
type int[200].

A variable of array type can't be assigned to as a whole, because arrays in C are not
first-class values (its individual elements can still be modified). Consider the
following code:

int foo[100];  
int bar[100];  

/* since foo is not a pointer but an array (block of memory),  */  
/* we can't just "point" foo to the bar's block of memory. The */  
/* only sane meaning for this assignment would be to copy all  */  
/* elements from bar into foo, however in C it's not the case  */  
foo = bar;  

/* again, what should that mean? foo is not a pointer, so we    */  
/* can't just simply "point" to the next element. The only sane */  
/* meaning would be to increment all elements of the array,     */  
/* however C does not implement it at all                       */  
foo++;  

In the code above, both statements result in a compilation error.

It's worth noting that &foo and &foo[0] have two different types. &foo is of type
"pointer to array of 100 ints", which in C is expressed as int (*)[100], whereas
&foo[0] is of type int*.

Pointers

A variable of pointer type stores the address of a memory cell. It has no storage of
its own, except the few bytes required to hold an address. That explains why, on most
common platforms, every object pointer has the same size as reported by the sizeof
operator, no matter what type it points to (the C standard itself does not guarantee
this).

In the previous example, we were not able to reassign one array to the other. With
pointers we can point to any array we want. How can we do that?

int foo[100];  
int *ptr;  

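ptr = &foo;     /* type mismatch: &foo is int (*)[100], not int*  */
ptr = &foo[0];  /* address of the first element, type int*        */
ptr = foo;      /* foo is converted to a pointer to its first int */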

You might ask how I was able to write ptr = &foo despite previously saying that &foo
and &foo[0] have two different types. Here the types clearly don't match, since
we assign &foo of type int (*)[100] to ptr of type int*. And in fact, the compiler
will complain about it (older GCC releases issue a warning, while GCC 14 and later
treat it as an error by default), saying:

warning: assignment to ‘int *’ from incompatible pointer type ‘int (*)[100]’  

The second way ptr = &foo[0] is correct in terms of types and works without any
warnings.

The third way is somewhat strange. Clearly, foo is of type int[100], and yet we
were able to do the assignment and the compiler didn't protest. That's because in
most contexts an expression of array type is implicitly converted to a pointer to
its first element. The main exceptions are the operand of sizeof, the operand of
unary &, and a string literal used to initialize an array.

The implicit conversion from array type to a pointer to its first element also happens
when we pass an array as an argument in a function call. The same happens in
arithmetic expressions.

Additionally, the index operator [] is defined in terms of pointer arithmetic. When
we write foo[i], it's the same as writing *(foo + i), and foo in this context
is converted to a pointer to its first element. Without that conversion, foo + i
would be arithmetic on a value of type int[100] rather than int*, which C does not
define at all. With it, foo + i is ordinary pointer arithmetic on an int* and
advances i * sizeof(int) bytes from the start of the array.

However, there is a catch. The conversion does not create a pointer object that is
stored anywhere. The address of an array is something the compiler can compute
directly (a link-time constant for a static array, or an offset from the stack frame
for a local one), so foo[i] compiles to "take the known address of foo and add the
offset". For a pointer variable ptr, ptr[i] compiles to "load the address stored in
ptr from memory, then add the offset". This is an important distinction.

The semantics of [] also explain why expressions such as 10[a] work: 10[a] means
*(10 + a), and addition is commutative.

Exercise

Here are some common errors you should be able to explain now:

/* let's say this definition is placed in file1.c */
int array[256];

/* and this declaration in file2.c */
extern int *array;

/* why will you most likely get a segmentation fault when you try
 * to treat array in file2.c as an array of ints?
 */

void fill_square(int **array, int pattern, int size) {
  ...
  array[i][j] = pattern;
  ...
}

int array[10][10];
fill_square((int**)array, 0, 10);

/* why does this result in a segmentation fault? */

/* why does this definition, placed inside a function, make no sense
 * and result in a compilation error?
 */
int array[];