@KaanGaming
Last active May 7, 2024 16:25
Useful IL opcodes

Useful IL Opcodes

IL opcode documentation written with dynamically built assemblies in C# in mind. It assumes you know the basics of C# and programming fundamentals, but not how a computer operates at a deeper level (e.g. stacks, the heap, instructions, opcodes, etc.).

What's IL?

If you don't know what IL is, it stands for Intermediate Language, also referred to as Common Intermediate Language (CIL). CIL is what you emit when generating code for dynamic assemblies in C#.

It's an object-oriented, stack-based bytecode language. That means its instructions work on a stack, and the instructions themselves are encoded as bytecode. Its object-oriented nature means you can define classes in it. Compilers for .NET languages compile source code into CIL, which later gets compiled into native code.

More information on CIL can be found on Wikipedia.


Explanations for words that you may not know are included at the bottom of the documentation, in the Key Terms section.

Not all opcodes are included here; the full list of opcodes is found in Microsoft's documentation.

Opcodes that take an argument have a usage line that shows how to emit them with an ILGenerator. Example: Emit(Ldarg, Int16), exactly like the one shown in Microsoft's documentation.

It is also suggested that you read the in-depth explanation for an opcode before assuming what it does; that should prevent confusion. If there's no section explaining the opcode you're looking at, visit Microsoft's documentation, or leave a comment below to tell me about the missing explanation.

Especially important are the sections on how the store opcodes work.

Load and store

Arguments

  • Ldarg - Load argument at specified index onto eval stack
    Usage: Emit(Ldarg, Int16)
    • Ldarg_0 .. Ldarg_3 - Load argument at index 0, 1, 2, 3
  • Starg - Store value from eval stack into argument at specified index (see more)
    Usage: Emit(Starg, Int16)
    • Starg_S - Short form of Starg for indexes 0 to 255. Unlike Ldarg, there are no Starg_0 .. Starg_3 shortcuts.
      Usage: Emit(Starg_S, Byte)

Variables

NOTE: You MUST declare local variables with ILGenerator.DeclareLocal (which returns a LocalBuilder) before using them in the IL code for the dynamic method.

  • Ldloc - Load local variable at index onto eval stack
    Usage: Emit(Ldloc, LocalBuilder) / Emit(Ldloc, Int16) (Int16 being the index of the local variable; cast a literal to short, because a plain int selects the Emit(OpCode, Int32) overload and writes an operand of the wrong size)
    • Ldloc_0 .. Ldloc_3 - Load local variable at index 0, 1, 2, 3
  • Stloc - Store value from eval stack into local variable at index (see more)
    Usage: Emit(Stloc, LocalBuilder) / Emit(Stloc, Int16) (Int16 being the index of the local variable)
    • Stloc_0 .. Stloc_3 - Store value from eval stack into local variable at index 0, 1, 2, 3

Objects

  • Ldfld - Load value from a field onto eval stack (see more)
    Usages:
    • Instance: Push object containing the field onto stack, then Emit(Ldfld, FieldInfo)
    • Static: Emit(Ldnull), then Emit(Ldfld, FieldInfo). The dedicated Ldsfld opcode does the same without the null: Emit(Ldsfld, FieldInfo)
  • Stfld - Stores a value from the eval stack into a field (see more)
    Usages:
    • Instance: Push object containing the field onto stack, then push any value matching the type of the field, then Emit(Stfld, FieldInfo)
    • Static: Emit(Ldnull), then push any value matching the type of the field, then Emit(Stfld, FieldInfo). The dedicated Stsfld opcode needs only the value: Emit(Stsfld, FieldInfo)
  • Box - Converts a value type into an object reference. Necessary when passing value types to methods that take the object type.
    Usage: Emit(Box, Type)
  • Unbox - The inverse of the Box opcode, except that it pushes a pointer to the value inside the boxed object instead of the value itself.
    Usage: Emit(Unbox, Type)
    • Unbox_Any - The exact inverse of the Box opcode: pushes the unboxed value itself.
      Usage: Emit(Unbox_Any, Type)
  • Newobj - Takes values as arguments, then calls the constructor in order to make an object reference which has the type of that constructor's defining type.
    (see more about how to use Newobj)
    Usage: Emit(Newobj, ConstructorInfo)
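
As a rough sketch of how Box and Unbox_Any pair up, the following builds two tiny dynamic methods. The class name BoxDemo and its method names are made up for illustration, and a DynamicMethod is assumed as the host for the IL.

```csharp
using System;
using System.Reflection.Emit;

public static class BoxDemo
{
    // Roughly: object BoxInt(int value) => (object)value;
    public static Func<int, object> BuildBox()
    {
        var dm = new DynamicMethod("BoxInt", typeof(object), new[] { typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);           // push the int argument
        ilg.Emit(OpCodes.Box, typeof(int));  // replace it with an object reference
        ilg.Emit(OpCodes.Ret);
        return (Func<int, object>)dm.CreateDelegate(typeof(Func<int, object>));
    }

    // Roughly: int UnboxInt(object boxed) => (int)boxed;
    public static Func<object, int> BuildUnbox()
    {
        var dm = new DynamicMethod("UnboxInt", typeof(int), new[] { typeof(object) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);                 // push the object reference
        ilg.Emit(OpCodes.Unbox_Any, typeof(int));  // replace it with the int value
        ilg.Emit(OpCodes.Ret);
        return (Func<object, int>)dm.CreateDelegate(typeof(Func<object, int>));
    }
}
```

Boxing 42 with the first delegate and passing the result to the second one should give 42 back.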

Values

  • Ldnull - Push null reference onto eval stack

  • Ldstr - Push string onto eval stack
    Usage: Emit(Ldstr, String)

  • Ldc_I4 - Push number in int/int32 onto eval stack
    Usage: Emit(Ldc_I4, Int32)

    • Ldc_I4_0 .. Ldc_I4_8 - Push the number 0 to 8 onto eval stack
    • Ldc_I4_M1 - Push -1 onto eval stack
  • Ldc_I8 - Push number in long/int64 onto eval stack
    Usage: Emit(Ldc_I8, Int64)

    Caution: This pushes an int64 onto the stack, NOT an int32! Pass the operand as a long (e.g. 5L); an int literal selects the Emit(OpCode, Int32) overload, which writes an operand of the wrong size.

  • Ldc_R4 - Push number in float/float32 onto eval stack
    Usage: Emit(Ldc_R4, Single) (Single is float in C#)

  • Ldc_R8 - Push number in double/float64 onto eval stack
    Usage: Emit(Ldc_R8, Double)

Arithmetic

Except for Neg, these operations pop two values from the top of eval stack and push the result onto the eval stack.

  • Add - Adds two values at the top of eval stack and pushes the result to eval stack
    • Add_Ovf - Adds normally, except it throws an OverflowException if the result has overflowed
  • Sub - Subtracts one value from another at the top of eval stack, and pushes the result to eval stack (see more)
    • Sub_Ovf - Subtracts normally, except it throws an OverflowException if the result does not fit in the result type (this includes going below the minimum value)
  • Mul - Multiplies two values at the top of eval stack and pushes the result to eval stack
    • Mul_Ovf - Multiplies normally, except it throws an OverflowException if the result has overflowed
  • Div - Divides two values at the top of eval stack and pushes the result to eval stack (see more)
    • There is no Div_Ovf.
  • Rem - Divides two values at the top of eval stack and pushes the remainder to eval stack (e.g. 10 % 3 = 1)
  • Neg - Pops the value at the top, negates it, then pushes the result onto the eval stack.

Stack Management

  • Dup - Duplicates the value at the top of eval stack, so the stack ends up with two copies of it on top
  • Pop - Removes the value at the top of eval stack
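
A small sketch of Dup in action: squaring a number by duplicating it and multiplying the two copies. DupDemo and BuildSquare are hypothetical names, and a DynamicMethod is assumed as the host.

```csharp
using System;
using System.Reflection.Emit;

public static class DupDemo
{
    // Roughly: int Square(int x) => x * x;
    public static Func<int, int> BuildSquare()
    {
        var dm = new DynamicMethod("Square", typeof(int), new[] { typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0); // stack: x
        ilg.Emit(OpCodes.Dup);     // stack: x, x
        ilg.Emit(OpCodes.Mul);     // stack: x * x
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));
    }
}
```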

Condition testing / Control transfer / Branching

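Usages for these will not be provided individually, since they all have basically the same usage. These opcodes jump to the specified label if the condition is fulfilled. Where a description says "first" and "second" value, the first value is the one that was pushed first, and the second value is the one on top of the stack.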

Usage: Emit(OpCode, Label)

  • Beq - Jumps to the target label if the two values at the top of eval stack are equal (see more)
  • Bge - Jumps to the target label if the first value is greater than or equal to the second value
  • Bgt - Jumps to the target label if the first value is greater than the second value
  • Ble - Jumps to the target label if the first value is less than or equal to the second value
  • Blt - Jumps to the target label if the first value is less than the second value
  • Bne_Un - Jumps to the target label if the first value is NOT equal to the second value
    The _Un suffix means the values are compared as unsigned integers (or as unordered floats). For integers this makes no difference to an inequality test, so it shouldn't be a problem.
  • Br - Unconditionally jumps to the target label.
  • Brfalse - Pops one value and jumps to the target label if it is false, a null reference, or zero
  • Brtrue - The exact opposite of Brfalse. Pops one value and jumps to the target label if it is true, not null, or non-zero

Other

These do not transfer control to other target labels, but they instead return a value based on compared values and the type of comparison being made.

  • Ceq - Compares two values at the top of eval stack; if they're equal, pushes 1 (int32) to the top of eval stack, otherwise 0.
  • Cgt - Compares two values at the top of eval stack; if the first value is greater than the second, pushes 1, otherwise 0.
  • Clt - Compares two values at the top of eval stack; if the first value is less than the second, pushes 1, otherwise 0.
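
To illustrate how a comparison opcode leaves 1 or 0 on the stack, here is a minimal sketch that returns that value as a bool. CompareDemo and BuildGreaterThan are made-up names, and a DynamicMethod is assumed as the host.

```csharp
using System;
using System.Reflection.Emit;

public static class CompareDemo
{
    // Roughly: bool GreaterThan(int a, int b) => a > b;
    public static Func<int, int, bool> BuildGreaterThan()
    {
        var dm = new DynamicMethod("GreaterThan", typeof(bool),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0); // first value (a)
        ilg.Emit(OpCodes.Ldarg_1); // second value (b)
        ilg.Emit(OpCodes.Cgt);     // pushes 1 if a > b, otherwise 0
        ilg.Emit(OpCodes.Ret);     // the int32 on the stack is returned as the bool
        return (Func<int, int, bool>)dm.CreateDelegate(typeof(Func<int, int, bool>));
    }
}
```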

Method invocation and return

  • Call - Calls the method described by a MethodInfo (see more)
    Usage: Emit(Call, MethodInfo)
  • Callvirt - Calls a method on the runtime type of an object (see more)
    Usage: Emit(Callvirt, MethodInfo)
  • Ret - Returns from the current method. If the method returns a value, that value must be on the eval stack.

Bitwise operations and bit shift operations

  • And - Pushes the bitwise AND of two values to eval stack
  • Or - Pushes the bitwise OR of two values to eval stack
  • Not - Pushes the bitwise complement of the value at the top of eval stack
  • Xor - Pushes the bitwise XOR (exclusive OR) of two values to eval stack (see more)
  • Shl - Shifts an integer value to the left by a specified number of bits (filling with zeroes), pushing the result onto the eval stack
  • Shr - Shifts an integer value to the right by a specified number of bits (preserving the sign), pushing the result onto the eval stack

Exceptions

  • Throw - Throws an exception. The exception is the object reference at the top of the eval stack. Throws a NullReferenceException if that object reference is a null reference. (see example)
    • One can also use ILGenerator.ThrowException(Type).
  • Rethrow - Rethrows the exception caught in a catch block. Only valid inside a catch block.
  • Ckfinite - Throws an ArithmeticException if the value at the top of the stack is not finite (NaN or infinity).

Arrays

  • Newarr - Makes a new array (see more)
    Usage: Emit(Newarr, Type) (Type being the type of elements in the array)
  • Ldelem - Pushes an element from an array onto eval stack. (see more)
    Usage: Emit(Ldelem, Type)
  • Stelem - Stores a value into an element from an array. (see more)
    Usage: Emit(Stelem, Type)
  • Ldlen - Pushes the length of an array.
    Usage: Push array obj ref then Emit(Ldlen)

Other opcodes

  • Switch - Acts like a switch block from C#, but functions a bit differently. (see more)
    Usage: Emit(Switch, Label[])
  • Nop - Does nothing. Literally. Compilers emit it so that places in the source code that produce no instructions, such as curly brackets, still map to an instruction, which makes it possible to put breakpoints on them.

Others

Labels

Labels are points in IL code that you can jump to.

Many opcodes take advantage of labels. These opcodes are in the Condition testing section.

In textual IL, labels are simple to define and mark: put LABEL: at the start of a line, where LABEL can be any name, like CASE1: . In C#, they're implemented in a different way.

You define a label in C# by calling ILGenerator.DefineLabel(), which returns a Label. Later on, you mark where that label points by calling ILGenerator.MarkLabel(Label) at that position in the emitted code.
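
A minimal sketch of defining, branching to, and marking a label, assuming a DynamicMethod as the host. LabelDemo and BuildMax are hypothetical names.

```csharp
using System;
using System.Reflection.Emit;

public static class LabelDemo
{
    // Roughly: int Max(int a, int b) { if (a >= b) return a; return b; }
    public static Func<int, int, int> BuildMax()
    {
        var dm = new DynamicMethod("Max", typeof(int),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();

        Label returnA = ilg.DefineLabel(); // defined now, marked further down

        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ldarg_1);
        ilg.Emit(OpCodes.Bge, returnA);    // jump if a >= b

        ilg.Emit(OpCodes.Ldarg_1);         // not taken: return b
        ilg.Emit(OpCodes.Ret);

        ilg.MarkLabel(returnA);            // the jump lands here
        ilg.Emit(OpCodes.Ldarg_0);         // return a
        ilg.Emit(OpCodes.Ret);

        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```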

Scopes

Scopes are lexical regions of code, like the curly-bracket blocks inside a C# method. Local variables declared inside a scope belong to that scope and are not meant to be used outside of it.

You can begin a scope using ILGenerator.BeginScope() and end one using ILGenerator.EndScope()

Locals

Locals are local variables, like the ones you would find in a C# method.

You can define them by using the ILGenerator.DeclareLocal method.
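
As a sketch of declaring and using locals, the following sums the numbers from 1 to n with two locals and a loop. LocalsDemo and BuildSumTo are made-up names, and a DynamicMethod is assumed as the host.

```csharp
using System;
using System.Reflection.Emit;

public static class LocalsDemo
{
    // Roughly: int SumTo(int n) { int sum = 0; for (int i = 1; i <= n; i++) sum += i; return sum; }
    public static Func<int, int> BuildSumTo()
    {
        var dm = new DynamicMethod("SumTo", typeof(int), new[] { typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();

        LocalBuilder sum = ilg.DeclareLocal(typeof(int)); // local index 0
        LocalBuilder i = ilg.DeclareLocal(typeof(int));   // local index 1
        Label check = ilg.DefineLabel();
        Label body = ilg.DefineLabel();

        ilg.Emit(OpCodes.Ldc_I4_0);
        ilg.Emit(OpCodes.Stloc, sum);   // sum = 0
        ilg.Emit(OpCodes.Ldc_I4_1);
        ilg.Emit(OpCodes.Stloc, i);     // i = 1
        ilg.Emit(OpCodes.Br, check);

        ilg.MarkLabel(body);
        ilg.Emit(OpCodes.Ldloc, sum);
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Add);
        ilg.Emit(OpCodes.Stloc, sum);   // sum = sum + i
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Ldc_I4_1);
        ilg.Emit(OpCodes.Add);
        ilg.Emit(OpCodes.Stloc, i);     // i = i + 1

        ilg.MarkLabel(check);
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ble, body);    // loop while i <= n

        ilg.Emit(OpCodes.Ldloc, sum);
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));
    }
}
```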

Try, and catch, and finally

Try, catch, and finally statements exist in IL, although in a different form.

There are 3 blocks in IL functionally similar to those above: exception, catch and finally.

Illustrated in this way:

exception
{

    catch (Exception e)
    {
    
    }
    finally
    {

    }
}

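Exception blocks are begun with ILGenerator.BeginExceptionBlock() and ended with ILGenerator.EndExceptionBlock(). The code emitted right after BeginExceptionBlock() is the protected (try) part.
Catch blocks are begun with ILGenerator.BeginCatchBlock(Type exceptionType). When a catch block is entered, the caught exception is on the eval stack.
And finally, Finally blocks are begun with ILGenerator.BeginFinallyBlock().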
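
The following sketch shows one way an exception block with a single catch block could be emitted, assuming a DynamicMethod as the host. ExceptionDemo and BuildSafeDivide are hypothetical names, and the finally block is left out to keep the example short.

```csharp
using System;
using System.Reflection.Emit;

public static class ExceptionDemo
{
    // Roughly:
    // int SafeDivide(int a, int b)
    // {
    //     int result;
    //     try { result = a / b; }
    //     catch (DivideByZeroException) { result = -1; }
    //     return result;
    // }
    public static Func<int, int, int> BuildSafeDivide()
    {
        var dm = new DynamicMethod("SafeDivide", typeof(int),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        LocalBuilder result = ilg.DeclareLocal(typeof(int));

        ilg.BeginExceptionBlock();                          // try {
        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ldarg_1);
        ilg.Emit(OpCodes.Div);                              // may throw
        ilg.Emit(OpCodes.Stloc, result);

        ilg.BeginCatchBlock(typeof(DivideByZeroException)); // } catch {
        ilg.Emit(OpCodes.Pop);                              // discard the exception object
        ilg.Emit(OpCodes.Ldc_I4_M1);
        ilg.Emit(OpCodes.Stloc, result);                    // result = -1

        ilg.EndExceptionBlock();                            // }

        // Returning happens outside the exception block, via the local.
        ilg.Emit(OpCodes.Ldloc, result);
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

The result is stored in a local because a Ret is not allowed inside the try or catch part; the value has to be carried out of the block first.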

In-depth and Easy-to-Understand Explanations

Ldfld

Ldfld pops an object reference from the stack. For an instance field, this must be the object that contains the field; for a static field, it is null (or use Ldsfld, which needs no reference at all). The value of the specified field is then read and pushed onto the stack.

Stfld

Pretty much the same as Ldfld, except it takes one more value from the eval stack. Stfld expects an object reference on the stack, followed by a value matching the type of that field. The reference should be the object containing the field if it is an instance field, or null if it is a static field.

Both the object reference and the value are then popped from the stack, and the value is assigned to the field.
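
A sketch of Ldfld and Stfld working together on an instance field, assuming a DynamicMethod as the host. The Counter class, FieldDemo, and BuildIncrement are made up for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Counter
{
    public int Value;
}

public static class FieldDemo
{
    // Roughly: void Increment(Counter c) { c.Value = c.Value + 1; }
    public static Action<Counter> BuildIncrement()
    {
        FieldInfo valueField = typeof(Counter).GetField("Value");
        var dm = new DynamicMethod("Increment", typeof(void), new[] { typeof(Counter) });
        ILGenerator ilg = dm.GetILGenerator();

        ilg.Emit(OpCodes.Ldarg_0);           // object reference for Stfld
        ilg.Emit(OpCodes.Ldarg_0);           // object reference for Ldfld
        ilg.Emit(OpCodes.Ldfld, valueField); // pops the reference, pushes c.Value
        ilg.Emit(OpCodes.Ldc_I4_1);
        ilg.Emit(OpCodes.Add);               // c.Value + 1
        ilg.Emit(OpCodes.Stfld, valueField); // pops reference and value, stores the value
        ilg.Emit(OpCodes.Ret);

        return (Action<Counter>)dm.CreateDelegate(typeof(Action<Counter>));
    }
}
```

The object reference is pushed twice because Ldfld consumes one copy and Stfld needs another one underneath the new value.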

Beq, and other opcodes that transfer control

These operations act like if statements from high-level languages, except they are harder to read.

You'll want to use labels for this. Below are two pieces of code that are functionally the same.

IL Code

ldloc.0 // load our local variables which are declared outside this code snippet
ldloc.1 // local var 0 is "a" and local var 1 is "b"

bne.un NOTEQ // jump to NOTEQ label if (a != b) is true
ldloc.1
ldc.i4.2
mul // pop b and 2, then multiply them, then push the result onto stack
stloc.1 // store result to b
NOTEQ:
ret

C# Code

if (a == b)
{
    b = b * 2;
}
return;

Store opcodes (Starg, etc.)

Opcodes that perform a store operation pop the value at the top of the eval stack, then save that value into the target.

E.g.

IL Code

.locals init (int32 V_0) // irrelevant code, it's just to define local variables in IL

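ldc.i4.5 // push 5; Emit(Ldc_I4_5), same effect as Emit(Ldc_I4, 5)
stloc.0 // pop 5 into local 0; Emit(Stloc_0), same effect as Emit(Stloc, v_0)
ret // no value is returned

C# Code

// Generate assembly, etc etc...
LocalBuilder v_0 = ilg.DeclareLocal(typeof(int));

ilg.Emit(OpCodes.Ldc_I4, 5);
ilg.Emit(OpCodes.Stloc, v_0); // pass the LocalBuilder; a plain int literal would select the Int32 overload and write an operand of the wrong size
ilg.Emit(OpCodes.Ret);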

Sub

The Sub opcode pops two values from the top of eval stack. Call the value that was pushed first v1, and the value that was pushed last (the one on top) v2. v2 is subtracted from v1 (v1 minus v2), then the result is pushed onto the eval stack.
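
A short sketch that shows the operand order, assuming a DynamicMethod as the host. SubDemo and BuildSubtract are hypothetical names.

```csharp
using System;
using System.Reflection.Emit;

public static class SubDemo
{
    // Roughly: int Subtract(int v1, int v2) => v1 - v2;
    public static Func<int, int, int> BuildSubtract()
    {
        var dm = new DynamicMethod("Subtract", typeof(int),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0); // v1, pushed first
        ilg.Emit(OpCodes.Ldarg_1); // v2, pushed last (on top)
        ilg.Emit(OpCodes.Sub);     // v1 - v2
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

Calling the delegate with (10, 3) should give 7, not -7.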

Div

Integer division

This one is simple. The Div opcode pops two integer values from the top of eval stack: v1 (pushed first) and v2 (pushed last). v1 is divided by v2 (v1 / v2), then the result is pushed onto the eval stack.

Dividing a number by zero (v2 is zero) throws a DivideByZeroException.

The operation can also throw an ArithmeticException if the result cannot be represented in the result type. You can get this error by dividing the minimum value (the most negative value) by -1. (This will be an OverflowException on Intel-based platforms.)

Float division

This one is pretty much the exact same as integer division, except with floating point numbers, and it never throws.

Dividing a positive number by zero produces positive infinity. Dividing a negative number by zero produces negative infinity instead.

Dividing zero by zero or infinity by infinity produces NaN (Not-A-Number).

Finite numbers divided by infinity will produce 0.
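
The float division rules above can be tried out with a sketch like this one, assuming a DynamicMethod as the host. DivDemo and BuildDivide are made-up names.

```csharp
using System;
using System.Reflection.Emit;

public static class DivDemo
{
    // Roughly: double Divide(double v1, double v2) => v1 / v2;
    public static Func<double, double, double> BuildDivide()
    {
        var dm = new DynamicMethod("Divide", typeof(double),
                                   new[] { typeof(double), typeof(double) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0); // v1
        ilg.Emit(OpCodes.Ldarg_1); // v2
        ilg.Emit(OpCodes.Div);     // float division, because the operands are floats
        ilg.Emit(OpCodes.Ret);
        return (Func<double, double, double>)dm.CreateDelegate(
            typeof(Func<double, double, double>));
    }
}
```

The same Div opcode is used for integers and floats; which kind of division happens depends on the types of the values on the stack.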

Call

In order to use Call, you need to take some things into consideration:

  • Is the target method static or not static?
  • What arguments do you need to call the method?

If you have answers to all of these, read the text below this one to understand how to use the call opcode.

The method is static, what to do to call that method?

You don't need to do anything special; you just need to push the arguments in order, from the first one to the last one. Say you have a static method that takes 2 integer arguments. You can do it like this:

ilg.Emit(OpCodes.Ldc_I4, 2); // load int32 as the first argument
ilg.Emit(OpCodes.Ldc_I4, 2); // load int32 as the second argument

ilg.Emit(OpCodes.Call, addMthd); // call the method containing two integer arguments
ilg.Emit(OpCodes.Ret); // return and end the method

The method is NOT static, what to do to call that method?

This one is trickier, as you need to push the object reference that the method is called on before pushing any of the arguments.

Unfortunately I can't explain how to get an object reference in this gist, maybe in a future revision I will - but assume that one of your local variables is that object reference.

ilg.Emit(OpCodes.Ldloc, myobj); // myobj is a LocalBuilder containing the instance of a class that contains the method addMthd
ilg.Emit(OpCodes.Ldc_I4, 2); // first arg
ilg.Emit(OpCodes.Ldc_I4, 2); // second arg

ilg.Emit(OpCodes.Call, addMthd); // call the method
ilg.Emit(OpCodes.Ret);

There's also the Callvirt opcode, which calls the method on the runtime type of the object, rather than on the compile-time type.

Callvirt can be used to call both virtual and non-virtual instance methods.

ilg.Emit(OpCodes.Ldloc, myobj); // myobj is a LocalBuilder containing the instance of a class that contains the method addMthd
ilg.Emit(OpCodes.Ldc_I4, 2); // first arg
ilg.Emit(OpCodes.Ldc_I4, 2); // second arg

ilg.Emit(OpCodes.Callvirt, addMthd); // call the method
ilg.Emit(OpCodes.Ret);
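
As a runnable sketch of the runtime-type behaviour, the following calls Object.ToString() through Callvirt, assuming a DynamicMethod as the host. CallvirtDemo and BuildToString are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CallvirtDemo
{
    // Roughly: string CallToString(object obj) => obj.ToString();
    public static Func<object, string> BuildToString()
    {
        // The MethodInfo is the one declared on object...
        MethodInfo toString = typeof(object).GetMethod("ToString", Type.EmptyTypes);

        var dm = new DynamicMethod("CallToString", typeof(string), new[] { typeof(object) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);            // the object reference comes first
        ilg.Emit(OpCodes.Callvirt, toString); // ...but the override of the runtime type runs
        ilg.Emit(OpCodes.Ret);
        return (Func<object, string>)dm.CreateDelegate(typeof(Func<object, string>));
    }
}
```

Passing a boxed 42 should return "42", because the int override of ToString is picked at run time.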

Figuring out the arguments you need for the method

Well... this is easy: you just need the MethodInfo of the method you're calling, then check its parameters with MethodInfo.GetParameters(). You can build your code to handle any number of parameters too.
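
One possible way to do that is sketched below: the parameter list of the target method decides how many arguments get pushed. Math.Max is used only as a convenient static target; CallDemo and BuildCallMax are made-up names, and the sketch assumes the dynamic method has the same parameters as the target.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CallDemo
{
    // Roughly: int CallMax(int a, int b) => Math.Max(a, b);
    public static Func<int, int, int> BuildCallMax()
    {
        MethodInfo max = typeof(Math).GetMethod("Max", new[] { typeof(int), typeof(int) });
        ParameterInfo[] parameters = max.GetParameters();

        var dm = new DynamicMethod("CallMax", typeof(int),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();

        // Push one argument per parameter, from the first one to the last one.
        for (short i = 0; i < parameters.Length; i++)
        {
            ilg.Emit(OpCodes.Ldarg, i);
        }

        ilg.Emit(OpCodes.Call, max); // static method: no object reference needed
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```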

The EmitCall method

There's also an EmitCall method, in case you want to use it instead. Sometimes you might even need to. Its last argument lists the types of the optional arguments of a varargs method. For any other method, leave the last argument as null; more often than not there won't be any optional arguments.

// push arguments onto stack
// also push object reference if method is not static

ilg.EmitCall(OpCodes.Call, addMthd, null); // use the Call opcode, to call addMthd, with no optional parameters
ilg.Emit(OpCodes.Ret);

Xor

The bitwise XOR (exclusive OR) operator works on each binary digit of two numbers.

It computes the result like a bitwise OR operator, except that a digit of the result is 0 when that digit is 1 in both numbers.

Example:

A B Result
0 0 0
1 0 1
0 1 1
1 1 0
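
The table can be checked against whole numbers with a sketch like this, assuming a DynamicMethod as the host. XorDemo and BuildXor are hypothetical names.

```csharp
using System;
using System.Reflection.Emit;

public static class XorDemo
{
    // Roughly: int Xor(int a, int b) => a ^ b;
    public static Func<int, int, int> BuildXor()
    {
        var dm = new DynamicMethod("Xor", typeof(int),
                                   new[] { typeof(int), typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ldarg_1);
        ilg.Emit(OpCodes.Xor); // applies the table above to every pair of bits
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

For example, 12 is 1100 in binary and 10 is 1010, so the result is 0110, which is 6.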

Newobj

In order to use this opcode:

  1. Get the ConstructorInfo of the class you want to make an instance of.
  2. Push all of the arguments required by the constructor, from first to last.
  3. Emit(Newobj, ConstructorInfo)
  4. A reference to the new object will be pushed onto the stack.

Example:

public class Point
{
    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public int x;
    public int y;
}

// ...
// get constructorinfo, which will be named "ctorOfPoint"
ilg.Emit(OpCodes.Ldc_I4, 5);
ilg.Emit(OpCodes.Ldc_I4, 3);
ilg.Emit(OpCodes.Newobj, ctorOfPoint);

Newarr

In order to use this opcode:

  1. Push the number of elements this array can hold.
  2. Emit(Newarr, Type), which pops that number.
  3. Here you go, your new array has arrived, with some spice and ketchup on it. As an object reference on the stack.

The elements start out as default values (zero or null), so you have to fill them in yourself one by one.

Ldelem & Stelem

In order to use Ldelem:

  1. Push the array object reference to the stack.
  2. Push the index value to the stack.
  3. Emit(Ldelem, Type); those two values are popped from the stack.
  4. The element at that index of the array is pushed onto the stack.

In order to use Stelem:

  1. Push the array object reference to the stack.
  2. Push the index value to the stack.
  3. Push a value matching the element type of the array.
  4. Emit(Stelem, Type); all three values are popped and the value is stored at that index in the array.
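
The steps for Newarr, Stelem, and Ldelem can be put together roughly like this, assuming a DynamicMethod as the host. ArrayDemo and BuildPickElement are made-up names.

```csharp
using System;
using System.Reflection.Emit;

public static class ArrayDemo
{
    // Roughly: int PickElement(int index) { int[] arr = { 10, 20, 30 }; return arr[index]; }
    public static Func<int, int> BuildPickElement()
    {
        var dm = new DynamicMethod("PickElement", typeof(int), new[] { typeof(int) });
        ILGenerator ilg = dm.GetILGenerator();
        LocalBuilder arr = ilg.DeclareLocal(typeof(int[]));

        ilg.Emit(OpCodes.Ldc_I4_3);            // number of elements
        ilg.Emit(OpCodes.Newarr, typeof(int)); // pops the count, pushes the array reference
        ilg.Emit(OpCodes.Stloc, arr);

        // Fill the elements one by one: arr[k] = (k + 1) * 10
        for (int k = 0; k < 3; k++)
        {
            ilg.Emit(OpCodes.Ldloc, arr);           // array reference
            ilg.Emit(OpCodes.Ldc_I4, k);            // index
            ilg.Emit(OpCodes.Ldc_I4, (k + 1) * 10); // value
            ilg.Emit(OpCodes.Stelem, typeof(int));  // pops all three
        }

        ilg.Emit(OpCodes.Ldloc, arr);           // array reference
        ilg.Emit(OpCodes.Ldarg_0);              // index
        ilg.Emit(OpCodes.Ldelem, typeof(int));  // pops both, pushes the element
        ilg.Emit(OpCodes.Ret);
        return (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));
    }
}
```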

Switch

The Switch opcode functions a little differently from a C# switch. Here's exactly how it works:

  1. The Switch opcode pops an integer value from the evaluation stack.

  2. Then, it looks at the Label array we gave to it when emitting.

  3. Each Label in the array functions like a case.

    The first index would be case 0:, the second index would be case 1:, the third index would be case 2: and so on.

  4. If the integer value matches one of these cases, control transfers to the label corresponding to that case. Otherwise no jump happens, and execution continues with the instruction right after the Switch.

Ideally you would want to define 2 other labels: one placed past the entire switch section, and another one acting like the default: case.

An example can be found in the Examples section, under "Mini-example: Switch case".

Examples

There aren't any full-fledged examples as of now, but they may get added in the future.

Mini-example: Throwing an exception

using System;
using System.Reflection;
using System.Reflection.Emit;

DynamicMethod myMethod = new DynamicMethod("MyMethod", null, null);
ILGenerator ilg = myMethod.GetILGenerator();

ConstructorInfo exCtor = typeof(Exception).GetConstructor(new Type[] { typeof(string) });
ilg.Emit(OpCodes.Ldstr, "Throwing an exception in IL test!");
ilg.Emit(OpCodes.Newobj, exCtor);
ilg.Emit(OpCodes.Throw);

Mini-example: Switch case

A full-fledged example can be found in Microsoft's documentation, which is where this example draws inspiration from.

using System;
using System.Reflection;
using System.Reflection.Emit;

DynamicMethod myMethod = new DynamicMethod("MyMethod", null, new Type[] { typeof(int) });
ILGenerator ilg = myMethod.GetILGenerator();

Label defaultCase = ilg.DefineLabel();
Label endOfSwitch = ilg.DefineLabel();

Label[] caseTable = new Label[] { ilg.DefineLabel(), ilg.DefineLabel(), ilg.DefineLabel() };

// switch (arg0)
ilg.Emit(OpCodes.Ldarg_0);
ilg.Emit(OpCodes.Switch, caseTable);

// go to default case if none of the cases match
ilg.Emit(OpCodes.Br, defaultCase);

// case 0:
ilg.MarkLabel(caseTable[0]);
ilg.EmitWriteLine("Given argument equals to 0");
ilg.Emit(OpCodes.Br, endOfSwitch);

// case 1:
ilg.MarkLabel(caseTable[1]);
ilg.EmitWriteLine("Given argument equals to 1");
ilg.Emit(OpCodes.Br, endOfSwitch);

// case 2:
ilg.MarkLabel(caseTable[2]);
ilg.EmitWriteLine("Given argument equals to 2");
ilg.Emit(OpCodes.Br, endOfSwitch);

// default:
ilg.MarkLabel(defaultCase);
ilg.EmitWriteLine("Given argument equals to an unknown amount");
ilg.Emit(OpCodes.Br, endOfSwitch);

ilg.MarkLabel(endOfSwitch);
ilg.Emit(OpCodes.Ret);

Key Terms

Eval stack (Evaluation stack)

The stack that opcodes take their operands from and push their results onto. It is used by operations like Add, Sub, Mul, Ldarg, Call, Callvirt, etc.

Stack

A stack is a last-in-first-out collection: the value that was added last is the first one to be removed.

Pop

Whenever a value is popped from the stack, that means it's removed from the stack.

Push

Whenever a value is pushed or loaded onto the stack, that means it's been placed on the top of the stack. This is especially useful to know, since a selection of opcodes operate based on the order of values on the stack.
