@remzmike
Last active March 28, 2023 19:58
Hello C : 03 : Reading Code

Part 3: Reading code

The C programming language grammar is mostly made of statements and expressions.

Statements

Statements are the syntax for defining the main surface of code: functions, data and control flow.

Data

Data are just variables. You declare them, and they end up in memory during execution.

Variables have a type which tells you what kind of data is stored in that memory.

Some variables might represent numbers, and others might represent text.

Types are provided by the programming language as a convenient layer of abstraction between you and the binary data stored in memory. We don't need to know about all the different types yet, but here are some fundamentals we will use immediately.

int - A general purpose integer value.

char - Another general purpose integer, but with a smaller range, typically 256 possible values.

The char is notable because it is often used to represent a single character of text, even though the underlying value is still a number from the perspective of the language. Multiple characters of text, a string, is represented by multiple char values in an array. An array is a list of values with the same type.

However, for now, we just want to be able to read the data statements.

Variable declaration:

int frequency;

Variable assignment:

frequency = 300;

Variable declaration with assignment:

int frequency = 300;

A basic grammatical definition would be:

<variable-type> <variable-name> = <expression>;

So, the variable type is int, an integer, the variable name is frequency, and the value that is assigned comes from the evaluation of the expression on the right side, 300.

Functions

A function definition can also be considered a statement.

void main() {
    Beep(300, 127);
}

Simplified description of a function definition:

<return-value-type> <function-name>(<param-type> <param-name>, ...) <block>

In this case the return-value-type is void which means the function returns nothing, but soon we will return int instead.

Also, note the <block> at the end of the grammar. That is the code that runs when the function is called.

A block is a collection of statements wrapped in curly braces:

{ <block-body> }

A function call, on the other hand, is an expression. Writing it followed by a semicolon makes a generic statement that does nothing except evaluate that expression.

So when you write:

Beep(300, 127);

The language evaluates the expression, which happens to be a function call.

To reinforce this point, consider this code where an expression is written as a statement. It will compile and run, but it doesn't do anything other than evaluate the expression.

For example:

void main() {
    (123 & 1 + 3 / 2 & 0xFF);
}

This is complete nonsense, but it should cement the idea that function calls are expressions, not statements. The benefit is that a function call can appear inside a larger expression, where its return value is used as an operand.

Flow control

The flow of the code is how our imagined 'cursor' moves through the code when it is executing.

These are the core flow control statements.

For Loops:

for (int i = 0; i < 10; i++) {
    ...
}

If/Else:

if (x == 0) {
    ...
} else if (x == 1) {
    ...
} else {
    ...
}

Single keyword statements:

break;

continue;

return;

return x;

These will be explained in part 4, because we should talk about expressions and scopes first.

Expressions

Expressions are made of operators and operands. They are like mathematical expressions.

When the program runs, each expression is evaluated so that it breaks down to a single value.

The best way to learn expressions is probably by example, but familiarity with math formulas helps.

Operators

Operators are used to combine operands, or even modify a single operand.

e.g. + - / * % ++ -- == != ! || && >> << | &

Different operators have different requirements on how they can be combined with operands in an expression.

Operands

Operands are anything that evaluates to a value, which includes literal values, variable references, and function calls.

Literals

42 - decimal integer literal

's' - character literal

"Hello, World!" - string literal

0x2a - hexadecimal integer literal

Variables

frequency, i, x

Function calls

Beep(300, 127)

GetValue()

Example Expressions

34 - An expression with no operators.

30 + 4 - An expression with a single operator, +, and two operands, the left, 30, and the right, 4.

Expressions are designed to support arbitrary nesting, so operands can be expressions themselves, or sub-expressions.

The following expression evaluates to 34, but uses a sub-expression as the left operand of +.

(45 - 5 * 3) + 4

The sub-expression on the left evaluates to 30, first by evaluating 5 * 3, then by subtracting that result from 45.

The multiplication happens first due to operator precedence. Parentheses allow you to override the default evaluation order. So, the next expression evaluates to 124, because the innermost parentheses are evaluated first, overriding the default C operator precedence.

((45 - 5) * 3) + 4

Basically, expressions are similar to mathematical expressions, but there is a default order which saves us from always having to spell out the evaluation order with parentheses.

40 / 2 * 10

Evaluates to 200, because / and * have the same operator precedence, so they are evaluated from left to right.

40 / ( 2 * 10 )

Evaluates to 2, because the expression inside the parentheses is evaluated first.

( 40 / 2 ) * 10

Evaluates to 200, which is the same as if the parentheses were not specified.

40 + 20 / 20

Evaluates to 41, because / has higher 'operator precedence' than +, meaning it gets evaluated first regardless of its position in the expression.

(40 + 20) / 20

Evaluates to 3, because we bypassed the default operator precedence using parentheses, making the + expression evaluate first.

So far I have just shown literal integer operands, or values, but operands are anything that evaluates to a value.

Further, expressions can be arbitrarily complex and made up of different parts.

((GetValue() - 5) * multiplier)

When this expression is evaluated, GetValue() is called first, then 5 is subtracted from its result, then that result is multiplied by multiplier.

Variable scope

Scopes are the last big thing to understand when reading code.

On load, your program has one initial scope, the global scope. Variables declared outside of functions are in the global scope. All other variables are defined in a scope which is tracked by the language, and you, when you read the code.

As your code executes, additional scopes are added and removed from a stack as needed, and as defined by the language. In C, blocks get their own scopes. So, function bodies, for loops, if blocks, else blocks, and anonymous blocks, all get their own scope.

Each scope added to the stack can see all the variables currently in the other scopes on the stack at that point. Another way of saying this is that all scopes can see the variables in all of their parent scopes.

The stack of scopes represents a single path through a tree defined by the structure of your code.

Consider this code:

int a = 1;
// position a

void main() {
    int b = 2;
    // position b

    for (int c = 3; c < 3; ) {
        // position c
        int d = 4;
        // position d
    }

    int e = 5;
    // position e
}

Scopes and values available at the commented positions are:

a: {a = 1}
b: {a = 1}, {b = 2}
c: {a = 1}, {b = 2}, {c = 3}
d: {a = 1}, {b = 2}, {c = 3, d = 4}
e: {a = 1}, {b = 2, e = 5}

In that code, only the for loop can see c and d.

Further, the code in the main function does not see its own b or e until they are declared.

The same is true of sub-scopes defined from there.

So, the for loop can see b, but will not see e.

Here is a more thorough example, with comments inline showing the scope variables available at each commented point.

int a = 1;
// scopes: {a = 1}

void main() {

    int b = 2;
    // scopes: {a = 1}, {b = 2}

    for (int c = 3; ; ) {
        // scopes: {a = 1}, {b = 2}, {c = 3}
        int d = 4;
        // scopes: {a = 1}, {b = 2}, {c = 3, d = 4}

        {
            int e = 5;
            // scopes: {a = 1}, {b = 2}, {c = 3, d = 4}, {e = 5}

            if (1) {
                int f = 6;
                // scopes: {a = 1}, {b = 2}, {c = 3, d = 4}, {e = 5}, {f = 6}
            } else {
                int g = 7;
                // scopes: {a = 1}, {b = 2}, {c = 3, d = 4}, {e = 5}, {g = 7}
            }
        }

        break;
    }

    int h = 8;
    // scopes: {a = 1}, {b = 2}, {h = 8}
}

void foo() {
    // scopes: {a = 1}
}

I think that should make it clear.

Here's some code with prints showing the variables currently 'in scope', if you want to explore more.

#include <stdio.h>

int a = 1;
// position a

void main() {
    int b = 2;
    // position b
    printf("%d\n", a);
    printf("%d\n", b);

    for (int c = 3; ; ) {
        // position c
        printf("%d\n", a);
        printf("%d\n", b);
        printf("%d\n", c);

        int d = 4;
        // position d
        printf("%d\n", d);
        break;
    }

    int e = 5;
    // position e
    printf("%d\n", e);
}

Comments

Comments let you write notes in your code.

In C, comments are stripped out very early in translation, before preprocessor directives are processed. Each comment is replaced by a single space, so the compiler proper never sees them.

There are two styles of comment with different purposes:

    // Comment style one, single-line

    /* Comment style two, multi-line, on one line */

    /* 
        Comment style two, multi-line,
        over four lines
    */

Multi-line comments cannot be nested in each other. That can sometimes be annoying. Some editors let you comment multiple selected lines with a shortcut. In Sublime Text & Visual Studio Code the shortcut is Ctrl-/.
