psmolak/arrays-pointers.md

## arrays-pointers.md

      
This article is meant to explain the differences between arrays and pointers in C. If you have started learning C, you may have stumbled upon people saying that "an array is just a pointer to its first element". While somewhat true, that is a huge simplification. To understand why, we have to understand what arrays and pointers really are.

### Arrays

Every array type is a pair (type, size), where type is the type of its elements and size is the number of those elements. Thus `int foo[100]` and `int bar[200]` declare variables of two different types: `foo` is of type `int[100]`, whereas `bar` is of type `int[200]`.

A variable of array type cannot be assigned to or modified as a whole, because arrays in C are not first-class values; only its individual elements can be changed. Consider this code:

int foo[100];  
int bar[100];  

/* since foo is not a pointer but an array (block of memory),  */  
/* we can't just "point" foo to the bar's block of memory. The */  
/* only sane meaning for this assignment would be to copy all  */  
/* elements from bar into foo, however in C it's not the case  */  
foo = bar;  

/* again, what should that mean? foo is not a pointer, so we    */  
/* can't just simply "point" to the next element. The only sane */  
/* meaning would be to increment all elements of the array,     */  
/* however C does not implement it at all                       */  
foo++;  

In the code above, both statements result in a compilation error.

It's worth noting that `&foo` and `&foo[0]` have two different types, even though both evaluate to the same address. `&foo` is of type "pointer to array of 100 ints", which in C is written `int (*)[100]`, whereas `&foo[0]` is of type `int *`.

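
The type difference between `&foo` and `&foo[0]` becomes visible as soon as pointer arithmetic is involved. The following is a minimal sketch; the helper names `element_step`, `array_step` and `same_address` are made up for illustration and are not part of the original text.

```c
#include <stddef.h>

static int foo[100];

/* Hypothetical helper: bytes skipped by "+ 1" on &foo[0] (type int *). */
size_t element_step(void)
{
    int *p = &foo[0];
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Hypothetical helper: bytes skipped by "+ 1" on &foo (type int (*)[100]). */
size_t array_step(void)
{
    int (*p)[100] = &foo;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Both expressions designate the same address; only the types differ. */
int same_address(void)
{
    return (void *)&foo == (void *)&foo[0];
}
```

Called from a test program, `element_step()` should equal `sizeof(int)`, while `array_step()` should equal `100 * sizeof(int)`, because adding one to a pointer to the whole array skips the whole array.
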
### Pointers

A variable of pointer type stores the address of a memory cell. It has no storage of its own beyond the few bytes needed to hold an address. That explains why, on most common platforms, every object pointer has the same size according to the `sizeof` operator, no matter what type it points to (the C standard itself does not guarantee this).

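
To make the storage difference concrete, here is a small sketch comparing what `sizeof` reports for an array and for a pointer to its first element. The function names are illustrative assumptions, not anything defined by the article.

```c
#include <stddef.h>

static int foo[100];

/* Size of the array object itself: room for all 100 elements. */
size_t array_bytes(void)
{
    return sizeof foo;
}

/* Size of a pointer to the first element: room for one address only. */
size_t pointer_bytes(void)
{
    int *ptr = foo;
    return sizeof ptr;
}

/* The element count is recoverable only while we still hold the array. */
size_t array_length(void)
{
    return sizeof foo / sizeof foo[0];
}
```

On a typical 64-bit system one would expect `array_bytes()` to be 400 and `pointer_bytes()` to be 8, but the exact numbers depend on the platform.
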
In the previous example we were not able to assign one array to another. With pointers, we can point to any array we want. How can we do that?

int foo[100];  
int *ptr;  

ptr = &foo;  
ptr = &foo[0];  
ptr = foo;  

You might ask how I was able to write `ptr = &foo` despite previously saying that `&foo` and `&foo[0]` have two different types. Here the types clearly don't match, since we assign `&foo` of type `int (*)[100]` to `ptr` of type `int *`. And in fact, the compiler will produce a warning (some newer compilers, such as GCC 14, reject it as an error by default):

warning: assignment to ‘int *’ from incompatible pointer type ‘int (*)[100]’  

The second way, `ptr = &foo[0]`, is correct in terms of types and works without any warnings.

The third way is somewhat strange. Clearly, `foo` is of type `int[100]`, and yet we were able to do the assignment and the compiler didn't protest. That's because in most contexts (the main exceptions being the operands of `sizeof` and unary `&`) an expression of array type is implicitly converted to a pointer to the array's element type, and its value is the address of the first element.

The implicit conversion from an array to a pointer to its first element also happens when we pass an array as an argument in a function call. The same happens in arithmetic expressions.

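
The function-call case can be sketched as follows. The names `sum`, `sum_ptr` and `sum_of_example` are hypothetical and chosen only for this illustration. A parameter written with array syntax is adjusted to a pointer, which is why the length has to be passed separately.

```c
#include <stddef.h>

/* `values[]` in a parameter list is adjusted to `const int *values`. */
int sum(const int values[], size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Accepted without a cast: sum's type really is int (const int *, size_t). */
int (*sum_ptr)(const int *, size_t) = sum;

int sum_of_example(void)
{
    int data[4] = {1, 2, 3, 4};
    /* `data` is converted to &data[0]; the callee never sees its length. */
    return sum(data, sizeof data / sizeof data[0]);
}
```

Here `sizeof data / sizeof data[0]` works only in the caller, where `data` is still an array; inside `sum` the same expression would measure a pointer.
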
Additionally, the index operator `[]` is defined in terms of pointer arithmetic. Writing `foo[i]` is the same as writing `*(foo + i)`, and `foo` in this context is converted to a pointer to its first element. Suppose the opposite, that there were no implicit conversion from an array to a pointer to its first element. Then `foo + i` would have to step in units of the whole array, `sizeof(int[100])`, instead of units of one element, `sizeof(int)`, and the result would be a different address.

However, there is a catch. The conversion is purely a matter of types: no pointer object is created or stored anywhere. Because `foo` is an array and not a pointer variable, its address is fixed, so for an array with static storage and a constant `i` the compiler and linker can work out the address of `*(foo + i)` ahead of time. With a pointer, the program first has to load the pointer's value at run time. This is an important distinction.

The semantics of `[]` also explain why expressions such as `10[a]` work: `10[a]` means `*(10 + a)`, which is the same as `*(a + 10)`.

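
As a quick check of the equivalence between indexing and pointer arithmetic, the sketch below spells the same element access in three ways. The array contents and function names are assumptions made for the example.

```c
static int foo[5] = {10, 20, 30, 40, 50};

/* Ordinary indexing. */
int by_index(int i)
{
    return foo[i];
}

/* What indexing is defined to mean. */
int by_arithmetic(int i)
{
    return *(foo + i);
}

/* Addition is commutative, so the operands of [] can be swapped. */
int by_swapped(int i)
{
    return i[foo];
}
```

For any valid index all three functions return the same value; the swapped form is legal C but best kept as a curiosity.
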
### Exercise

Here are some common errors you should be able to explain now:
/* let's say this definition is placed in file1.c */
int array[256];

/* and this declaration in file2.c */
extern int *array;

/* why will you most likely get a segmentation fault when you try
 * to treat array in file2.c as an array of ints?
 */

void fill_square(int **array, int pattern, int size) {
  ...
  array[i][j] = pattern;
  ...
}

int array[10][10];
fill_square((int**)array, 0, 10);

/* why does this result in a segmentation fault? */

/* why does this definition make no sense and result in a compilation
 * error when it appears inside a function? */
int array[];