imalsogreg/pointers.md

## pointers.md

      
          
    Pointers

Some background things to keep in mind:


A variable is the combination of a name and a location in memory.
Declaring a variable tells C what the variable's type is, and sets aside some space for the variable's value.
Then, using the variable causes C to look into that variable's memory address.
Assigning a value to a variable means looking up the address and copying bytes there.
Reading a variable means looking up the memory address, reading the bytes, and using the variable's type to interpret the bytes (without the type, there'd be no way to know how to interpret the bytes).
Types give values meaning. The meaning of a char is "a single character". This meaning isn't necessarily part of the language. Instead it's for you to have a mental model of what the program does.

Example: variable assignment
#include <iostream>
using namespace std;

int main(int argc, char *argv[]){

  char a;  //  Variable 'a' is a named location in memory
  a = '!'; //  Assigning a value to 'a' causes C to look up 
           //  the memory address and write bytes there

  cout << "The value in a: " << a << endl; 

  // Using 'a' causes C to look up the address for 'a' and read the bytes out.
  // The type of 'a' tells C how to interpret the bytes. 

}
The char* type and the int* type


char* is a type
You can declare a variable with this type normally
The meaning of a char* is: "an integer that corresponds to a memory address used for holding a char".
The meaning of an int* is: "an integer that corresponds to a memory address used for holding an int".
You can add a * to any type, even your own made-up types like film. The meaning of a film* is: "an integer that corresponds to a memory address used for holding a film instance".

Example: assigning to variables with pointer type
#include <iostream>
using namespace std;

class film{};

int main(int argc, char *argv[]){

  int*  p;   // declare a pointer to an int
  film* fp;  // declare a pointer to a film

  p  = 0;                           // Assign 0 to p (the "null" address)
  fp = reinterpret_cast<film*>(10); // Assign 10 to fp (a made-up memory address);
                                    // any integer other than a literal 0 needs a cast

  cout << "The value of fp: " << fp << endl;

}
An int* variable is like any other variable: you can assign to it, and you can read from it. The values you assign are memory addresses, which are just integers under the hood.
But you shouldn't use int* variables like this in real code. The point of this section is just to show that pointer types are types, just like any other type. And they give meaning to variables, just like any other type does.

Why do pointers exist?


If a variable is already a name corresponding to a memory location, why do we need pointers, too? If we want to work with a variable int myInt, can't we just refer to it by name? Why would we want to also have a type for an int's memory location?
More often than not, yes, a plain variable is fine. If you want to work with an 'int myInt', just declare it, assign to it, read from it, use it in a for loop, etc.
Sometimes, though, working explicitly with the memory address corresponding to an int is useful.

Working with pointers


Before we see some useful things we can do with pointers, we need to see the basic operations over pointers.
Warning: there aren't many pointer operations, but their names are counter-intuitive.
When you see * or & not next to a type, but next to a value, these things are different from the * that you saw next to a type. These are the pointer operators.
The * operator is a function that takes a pointer value (a memory address) and returns the concrete value stored at that memory address.
The & operator is a function that takes a variable and returns the memory address where that variable's value is stored.
This is kind of the opposite of what * means when used in a type. That's the confusing part.

Example: playing with pointers
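#include <iostream>
using namespace std;

// Print two addresses and two chars. The casts to void* make cout print the
// addresses themselves; a bare char* would be printed as a C string.
void printAll(char* w, char* x, char y, char z){
  cout << static_cast<void*>(w) << " " << static_cast<void*>(x) << " " << y << " " << z << endl << endl;
}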

int main(int argc, char *argv[]){

  // This can be fun
  char*  ip;
  char*  ip2;
  char   a;
  char   b;

  // Playing with the `&` operator. What is the address of 'a'?
  
  cout << "Playing with the '&' operator\n";
  a = 'A';
  ip = &a;

  // Playing with the '*' operator
  cout << "Value of *ip should be the thing stored at address ip. It's: " << *ip << endl;
  
  ip2 =  &b;
  *ip2 = 'B';
  cout << "Set ip2 to &b, and then write 'B' to ip2 memory address.\n";
  cout << "What is b now? It's: " << b << endl;

}

Important: We never tried to use *ip before doing something like ip = &a. We have to make sure that the integer memory address in ip is a number that makes sense with the way the memory is laid out by C for our program. If you try to * a char* whose value is 0 or some random integer, the behavior is undefined: in practice the program will usually either crash or return nonsense data, because there probably isn't a real char stored at that random memory address. You have to be sure that the value in a char* is a memory address that actually contains bytes for a char. This is hard and it's the source of lots of real-world software bugs.

Useful things you can do with pointers


Passing data to functions for mutation

Example: Mutating data through pointers
#include <iostream>
using namespace std;

void incrementPtr(int* c){
  *c =  *c + 1;
}

void thisWontWork(int c){
  c = c + 1;
}

int main(int argc, char *argv[]){

  int a;
  int* pa;

  a = 10;
  pa = &a;

  cout << "a before incrementPtr: " << a << endl;
  
  incrementPtr(pa);
  cout << "a after incrementPtr: " << a << endl;

  thisWontWork(a);
  cout << "a after thisWontWork: " << a << endl;

}

Explicitly controlling when values are created and deleted

Example: Memory allocation
#include <cstddef>
#include <iostream>
using namespace std;

char* allocateChar(){
  char* p = new char;
  return(p);
}

void printIfAllocated(char *p){
  cout << endl << "Checking pointer\n";
  if (p == NULL){
    cout << "No value here." << endl;
  }
  else {
    cout << "The value pointed at is: " << *p << endl;
  }
}

int main(int argc, char *argv[]){

  char *myCharPtr;

  cout << "Allocating\n";
  myCharPtr = allocateChar();

  (*myCharPtr) = 'Y';
  
  printIfAllocated(myCharPtr);

  cout << endl << "Deleting" << endl;
  delete myCharPtr;
  myCharPtr = NULL; // delete frees the memory but leaves the old address in
                    // the variable, so reset it or the NULL check can't work
  printIfAllocated(myCharPtr);
  
}                                                                                                                           
                                                                                                                                             
                  
Arrays. Pointers are the default way of handling arrays in C and C++.
The first element of an array is pointed at by the pointer. Element n is located at (ptr + n)!
There's no way to tell the length of the array from the pointer alone. The programmer has to keep track of it herself. Crazy!

Example: Arrays
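#include <cstring>
#include <iostream>
using namespace std;

void printArray(const char *cs, int arrayLength){

  for (int n = 0; n < arrayLength; n++){
     cout << n << ": " << *(cs + n) << endl;
  }

}

int main(int argc, char *argv[]){

  char *myArray;
  const char *exampleString = "Hello, world!";
  myArray = new char[20]; // Allocate 20 chars' worth of space
  strcpy( myArray, exampleString ); // 13 chars plus the ending '\0' fit in 20
  printArray(myArray, strlen(exampleString));
  delete[] myArray;       // Arrays made with new[] are released with delete[]

}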
Pointers in C++


Pointers are confusing and unsafe. They often lead to program crashes, because once a program grows past 100 lines or so it gets hard to mentally keep track of whether every pointer is pointing to valid data.
Most of the uses of pointers in C have safer replacements in C++.
But it's still very common to see pointers used in C++, because they're valid code and they are well understood by C programmers (most C++ programmers are C programmers too). Also, pointers are extremely fast for a lot of applications, and they are the usual way of linking your C++ code with C code written by other people.
Because people use pointers in C++ sometimes and the newer features that subsume pointers other times, you'll sometimes see code that accommodates both. For example, in your homework this week you have the overloaded store_title function in two flavors: one that takes a char* (a character pointer, C's string), and another that takes a string& (a reference to a C++ string). (But FYI, building in two copies of each value-setter in a class isn't something you'd see in the real world. This is a weird quirk of whoever wrote your homework. Most programmers would choose one form of string and stick with it.)
C++ also offers & at the type level. This is a 'reference'. A string& in a type signature is a reference to a string instance. References serve a very similar purpose to pointers in functions. The difference is that a reference does not mean a point in memory (you can't treat it like an integer address). Instead, it's something more like a 'name' for the object you're passing to the function, so that the function you're calling has access to the object and can potentially modify it. But there's no sense of the function 'allocating' or 'deallocating' that reference. This restriction makes programs that use reference-passing easier to reason about than programs that use pointer-passing. With a pointer, there's no telling what a function is going to do to it unless you have the source code of that function and it's short enough to read and completely understand (this isn't often the case).