Skip to content

Instantly share code, notes, and snippets.

@guhou
Forked from anonymous/csharp-lists.md
Last active September 3, 2015 06:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save guhou/bcaf4e5e303698fa05ae to your computer and use it in GitHub Desktop.
Save guhou/bcaf4e5e303698fa05ae to your computer and use it in GitHub Desktop.
A brain-dump/introduction to Lists in C#

An introduction to generic Lists in C#

In this article I present an introduction to generic List types in the C# language. I assume an audience familiar with the principles of structured programming, including parameters, arrays and procedures, but with little experience with object-oriented programming. I will motivate an example for an encapsulated collection class, and then introduce generics as a means to improve our collection.

Suppose a programmer would like to remember some numbers for later use. In C#, as with many languages, an array can be used.

int[] numbers = new int[10];
for (int i = 0; i < 10; i++)
{
    numbers[i] = ReadNumber();
}

In just a few lines of code, we've created and filled up our data structure. But what happens if, after making an array of 10 numbers, we decide we actually need to store 11 numbers? Or 12? Or 1000? Fortunately, in C# we can resize arrays to accommodate more numbers than originally anticipated:

Array.Resize(ref numbers, 11); // Set the length of numbers to 11.
numbers[10] = ReadNumber();

Problem solved, right? Sure, but it's a bit awkward to use Array.Resize every single time we need to add another number. Instead, we can define a procedure that handles the resizing for us.

void Add(ref int[] numbers, int newNumber)
{
    Array.Resize(ref numbers, numbers.Length + 1);
    numbers[numbers.Length - 1] = newNumber;
}

In a procedural language, this is the end of the line. Our code works okay, but since we're passing our array around "naked", some other code could play with our array; doing bad things to it and breaking our assumptions about what our array should look like. For example:

void Naughty(ref int[] numbers)
{
    Array.Resize(ref numbers, numbers.Length + 1000);
}

Here, we have code that makes more room for more numbers, without actually adding any more numbers! Our array now contains many 'empty' spaces where our numbers should be, but our Add procedure doesn't know to use the empty spaces first, so we end up with holes in our array!

Here's where we take off our procedural programming hat and put on our object-oriented programming hat. One of the big buzzwords in object-oriented programming is encapsulation. Instead of letting anybody play with our array, it would be good to hide it from the rest of the world, and only let others play with it in ways that we allow. It would be good if programmers could only add numbers to our collection the 'right' way. Then we could be sure that all the numbers are indeed numbers that we want, and not just some garbage that a naughty procedure has put there without asking.

In short, we would like to make an abstract data type called a List,j which supports (among other things) adding items to the list, removing items from the list, and accessing each item in order.

In C#, we make a class to do this:

class IntList
{
    private int[] items;

    public void Add(int item) { /* as above */ }
    // remove, access, etc
}

We've solved our problem. Because our array is a private member of our class, nobody in the outside world can mess with it directly, so as long as our add, remove methods, etc. all work as we want them to, our list is now in much safer hands.

But let's now consider a new problem: our fancy new IntList can only store numbers. What if we would like to store a list of strings instead? We could make a new class StringList, designed just for strings. And if we want it to store Doubles, we'd need to make a new class for that, and if we want to store Colors, we'd need another class for that. We could copy-paste our list class each time, just changing little bits each time. Any programmer worth their salt should turn their nose up at this idea, since it's lots of hard work every time we need to store a new type of thing, and programmers are lazy! What we need is some way to write a List class once, and have it work just the same way for integers, strings, doubles, or even any other type we might want to store! Fortunately, C# has a feature that will help us out here.

Before we use this magical feature that will solve all our problems, it is useful to take a step back and think about functions. In maths, we can write a function such as f(x) = x₂. In this definition, x is called the parameter of the f, and our function works for any value x we give it. Clearly, the value f(x) depends on what value of x we use! In this way, we have values determined by values. We can write this function in C# and many other programming languages.

int Square(int x) { return x * x; }

Functions using values don't solve our problem. We need to go deeper. What we need is a type determined by another type! C# has a feature called generics that allows a type to use other types as parameters. The fancy computer science name for this idea is "parametric polymorphism". If you've ever watched Mighty Morphin Power Rangers, you might know that "morphing" means changing the shape of something, so polymorphism means that something can change into many different shapes! We want to change the 'shape' of our List based on the type of thing we want to store (int, double, string, ...) so we use a type as a parameter to List. Types determined by other types, oh my!

Of course, in the real world we don't need to 8-syllable phrases to use the idea, so in C# we usually call this concept by the shorter, 3-syllable term "generic". To declare a generic List in C#, we'll need to rewrite our original List class to use our special type parameter.

class List<T>
{
    private T[] items;
    void Add(T item)
    { 
        Array.Resize(ref items, items.Length + 1);
        items[items.Length - 1] = item;
    }
    // and so on...
}

Now, our List<T> is not exactly a type anymore, but rather a type function that, when given a particular type T, will make a real type that will store precisely the type T that we tell it. We only need to write one generic List class, and we can make List<int>, or a List<string>, or List<double>, or anything else we could imagine, for free! Here is our special new object-oriented program for storing numbers:

List<int> numbers = new List<int>();
List<string> names = new List<string>();
for (int i = 0; i < 10; i++)
{
    string name = ReadString("Enter your name: ");
    int number = ReadNumber("Enter your favourite number: ");
    names.Add(name);
    numbers.Add(number);
}

We now have a List that can handle our data safely, without any naughty procedures breaking our program. What's more, we only had to write the List class once, and we can now store any possible type of thing! Generic types are a boon to encapsulation in C#, since we can write new algorithms and data structures more safely, and we can maximise their reuse safely! Of course, Lists are so useful that they're already provided to us in C#, along with some fancier data structures such as Dictionaries, HashSets, Comparers, and so much more!

@guhou
Copy link
Author

guhou commented Sep 3, 2015

Issues:

  • Introduction is a bit scary
  • Justify why Naughty might exist
  • Superfluous j in line "... abstract data type..."
  • f(x) = x^2 is subscript instead of superscript
  • Better explanation for "type determined by type"
  • How many syllables does "parametric polymorphism" have?
  • Maybe better example of how List can be considered a type function
  • Parameter names in final listing

@tamasys
Copy link

tamasys commented Sep 3, 2015

idk if this is a better intro:

This article is an introduction to generic List types in C#. If that doesn't make sense to you yet, it hopefully will by the time you finish reading. I'm going to assume that you don't have much experience with object-oriented programming, but that you're familiar with the basics of structured programming, including parameters, arrays and procedures. If not, then this article is probably not for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment