@tiancilliers
Last active December 26, 2019 13:28

Challenge 15: Self-Replicating Toy

This challenge requires writing a program in the given language that prints its own source. Such a program is commonly known as a quine.

Assemblium

Assemblium uses two stacks, the data stack and the code stack. When an instruction is popped off the top of the code stack, it is executed as follows:

Instruction   Result
0x00-0x7f     Pushes instruction to data stack
0x80          XOR top value on data stack with 0x80
0x81          If top value on data stack is 0x00, replace with 0xff, else with 0x00
0x82          Replace top 2 values on data stack with bitwise AND
0x83          Replace top 2 values on data stack with bitwise OR
0x84          Replace top 2 values on data stack with bitwise XOR
0x90          Swaps top 2 values on data stack
0x91          Duplicates top value on data stack
0xa0          Pops top value on data stack as index, pops from data stack until 0xa1 and assigns to indexed function
0xb0          Pops top value on data stack and outputs
0xc0-0xdf     Pushes specified function to code stack
0xe0-0xff     Pops top value on data stack, pushes specified function to code stack if not 0x00
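
To make the table concrete, here is a minimal Python sketch of an interpreter. It is not the challenge's reference implementation. It assumes that the program is loaded with its first byte on top of the code stack, that a function body is stored in the order it is popped off the data stack, and that calling a function pushes its body so that the first popped byte runs first. The name `run` is made up for this sketch.

```python
def run(program: bytes) -> bytes:
    """Run an Assemblium program (assumed semantics) and return its output."""
    code = list(reversed(program))  # code stack, top of stack is the end of the list
    data = []                       # data stack, top of stack is the end of the list
    funcs = {}                      # function index -> body bytes in execution order
    out = bytearray()
    while code:
        op = code.pop()
        if op < 0x80:
            data.append(op)                      # plain byte: push onto data stack
        elif op == 0x80:
            data[-1] ^= 0x80
        elif op == 0x81:
            data[-1] = 0xFF if data[-1] == 0x00 else 0x00
        elif op in (0x82, 0x83, 0x84):
            b, a = data.pop(), data.pop()
            data.append({0x82: a & b, 0x83: a | b, 0x84: a ^ b}[op])
        elif op == 0x90:
            data[-1], data[-2] = data[-2], data[-1]
        elif op == 0x91:
            data.append(data[-1])
        elif op == 0xA0:
            index, body = data.pop(), []
            while True:                          # pop until the 0xa1 marker
                value = data.pop()
                if value == 0xA1:
                    break
                body.append(value)
            funcs[index] = body
        elif op == 0xB0:
            out.append(data.pop())
        elif 0xC0 <= op <= 0xDF:
            code.extend(reversed(funcs[op - 0xC0]))
        elif op >= 0xE0:
            if data.pop() != 0x00:
                code.extend(reversed(funcs[op - 0xE0]))
        else:
            raise ValueError(f"undefined instruction {op:#04x}")
    return bytes(out)


# Push 0x21 and output it, then push 0x00, XOR it to 0x80 and output it.
print(run(bytes.fromhex("21 b0 00 80 b0")).hex(" "))  # 21 80
```

Under these assumptions the sketch can be used to check the hex listings in the rest of this write-up.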

Writing a quine

A common approach to writing a quine is to have the program declare a string containing the rest of the program. The rest of the program then prints everything before the string, prints the string itself as data, and finally prints the string again as code (since the code is exactly what the string contains).

An example in Python would be as follows:

s = "print('s = \"' + s + '\"')\nprint(s)"
print('s = \"' + s + '\"')
print(s)

This has a problem though, in that it creates a situation where quotation marks need to be escaped, but any escape characters also need to be escaped, et cetera.

To solve this, we can encode the string (here with Base64) and only decode it when we print it as code:

from base64 import b64decode
s = b"cHJpbnQoImZyb20gYmFzZTY0IGltcG9ydCBiNjRkZWNvZGUiKQpwcmludCgicyA9IGJcIiIgKyBzLmRlY29kZSgidXRmLTgiKSArICJcIiIpCnByaW50KGI2NGRlY29kZShzKS5kZWNvZGUoInV0Zi04Iikp"
print("from base64 import b64decode")
print("s = b\"" + s.decode("utf-8") + "\"")
print(b64decode(s).decode("utf-8"))

While not the shortest quine ever, this approach does work.

Writing a quine in Assemblium

From here on, code listings consist of hex instruction bytes; anything on a line starting with a hash (#) is a comment.

Functions

First of all, we need to be able to define functions. The problem is that all code written to a function must first be on the data stack, but only bytes smaller than 0x80 can be pushed onto the data stack directly. To get around this, we XOR any such instruction with 0x80 beforehand and, as soon as it is on the data stack, XOR it back.

A function definition might look like this:

# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 61 b0 (prints a)
30 80 61
# define as function 0
00 a0

It can then be called with:

c0
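
The encoding above is mechanical, so it can be generated. The following Python sketch builds the bytes for a function definition from the code the function should run; `encode_function_definition` is a hypothetical helper name, not part of the challenge.

```python
def encode_function_definition(index: int, body: bytes) -> bytes:
    """Return program bytes that define function `index` with the given body."""
    out = bytearray([0x21, 0x80])              # push 0x21, XOR with 0x80 -> 0xa1 marker
    for byte in reversed(body):                # body goes onto the data stack in reverse
        if byte >= 0x80:
            out += bytes([byte ^ 0x80, 0x80])  # push the masked byte, then unmask it
        else:
            out.append(byte)                   # small bytes push themselves
    out += bytes([index, 0xA0])                # push the index and define the function
    return bytes(out)


# Function 0 from the example above: code to run is 61 b0.
print(encode_function_definition(0, bytes.fromhex("61 b0")).hex(" "))
# 21 80 30 80 61 00 a0
```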

Strings

Our code will use the following structure. A function 00 will be defined that copies string data (less than 0x80) onto the code stack, which will then be reversed onto the data stack, where it will be printed by some functions. Firstly, we just need to define a function that prints the string data normally.

The function we use will be recursive, calling itself until all the data is printed. To know where to stop, we need an end marker. Since there may be null bytes in the string, we use 0x7e as the end marker instead of 0x00.

The function, 01, is as follows:

# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: b0 91 70 0e 83 84 e1 (print string data until 0x7e)
61 80 04 80 03 80 0e 70 11 80 30 80
# define as function 1
01 a0

In pseudocode:

define function 01:
  print top byte on data stack
  duplicate next byte
  put 0x7e on data stack by OR'ing 70 and 0e (instruction 83) to avoid an unintentional special character
  XOR next duplicated byte with 0x7e
  if not equal, run function 01
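
As a rough model of what function 01 does, the recursion can be written as a loop in Python. This is only a sketch: the data stack is represented as a list with its top at the end, and `print_string` is a made-up name.

```python
TERMINATOR = 0x7E


def print_string(data: list) -> bytes:
    """Model of function 01; `data` is the data stack with its top at the end."""
    out = bytearray()
    while True:
        out.append(data.pop())      # b0: output the top byte
        if data[-1] == TERMINATOR:  # 91 70 0e 83 84: XOR a copy of the next byte with 0x7e
            break                   # e1 does not call function 01 again when the result is 0x00
    return bytes(out)


stack = [TERMINATOR, 0x03, 0x02, 0x01]
print(print_string(stack).hex(" "))  # 01 02 03
```

Note that the terminator itself is never printed and stays on the data stack.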

Executable data

Now, since our program will contain executable data (bytes of 0x80 or larger), we need some way to transform those bytes into string data that can be stored in a function, and to transform them back to the original bytes on the data stack when printing. We do this by XORing each such byte with 0x80. Since we can't store the 0x80 instruction itself in the string to be executed, we store 0x7f instead, signifying that the next byte needs to be XORed with 0x80 before it is printed as code.

This means that we need a second string printing function that prints the same string but XORs the marked bytes, so that it prints the actual Assemblium code.

This function, 02, is as follows:

# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 91 70 0f 83 84 81 e3 b0 91 70 0e 83 84 e2 (print unencoded data using 0x7f until 0x7e)
62 80 04 80 03 80 0e 70 11 80 30 80 63 80 01 80 04 80 03 80 70 0f 11 80
# define as function 2
02 a0

This also depends on another function 03:

# put a1 on data stack
21 80
# put code on data stack in reverse order
# code to run: 91 84 e3 80 (pop top byte, then XOR next byte on data stack with 0x80)
00 80 63 80 04 80 11 80
# define as function 3
03 a0

In pseudocode:

define function 02:
  duplicate next byte
  put 0x7f on data stack by OR'ing 70 and 0f (instruction 83) to avoid an unintentional special character
  XOR next duplicated byte with 0x7f
  if equal, run function 03 (pop second duplicate character and XOR next byte on data stack with 0x80)
  print top byte on data stack
  duplicate next byte
  put 0x7e on data stack by OR'ing 70 and 0e (instruction 83) to avoid an unintentional special character
  XOR next duplicated byte with 0x7e
  if not equal, run function 02
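
The same kind of Python model works for function 02, with function 03 folded into the escape branch. Again this is a sketch with made-up names, using a list with its top at the end as the data stack.

```python
ESCAPE, TERMINATOR = 0x7F, 0x7E


def print_decoded(data: list) -> bytes:
    """Model of functions 02 and 03; `data` is the data stack with its top at the end."""
    out = bytearray()
    while True:
        if data[-1] == ESCAPE:      # top byte is the 0x7f marker
            data.pop()              # function 03: drop the marker ...
            data[-1] ^= 0x80        # ... and restore the code byte below it
        out.append(data.pop())      # b0: output the (possibly restored) byte
        if data[-1] == TERMINATOR:  # stop once the next byte is 0x7e
            break
    return bytes(out)


# String data 00 7f 20, with its first byte on top of the stack.
stack = [TERMINATOR, 0x20, ESCAPE, 0x00]
print(print_decoded(stack).hex(" "))  # 00 a0
```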

Putting it all together

Finally, we will be running this code:

# print header (putting a1 on data stack for string function)
21 b0 00 80 b0 

# print string data by putting 0x7e on data stack, putting string on data stack and running function 01
70 0e 83
c0 c1

# print unencoded real data by putting 0x7e on data stack, putting string on data stack and running function 02
70 0e 83
c0 c2

Thus, our final code is as follows:

#00: copy string data to code stack
#    double reversed string data, replace 0x8_ with 0x7f 0x0_
#    include all code after string line
21 80 
# !!! INSERT PROCESSED CODE STRING FOLLOWING THIS LINE HERE !!!

00 a0 

#01: print normal string data until 0x7e
#    b0 91 70 0e 83 84 e1
21 80
61 80 04 80 03 80 0e 70 11 80 30 80
01 a0

#02: print real data using 0x7f until 0x7e
#    91 70 0f 83 84 81 e3 b0 91 70 0e 83 84 e2
21 80
62 80 04 80 03 80 0e 70 11 80 30 80 63 80 01 80 04 80 03 80 70 0f 11 80
02 a0

#03: pop and 0x80 next byte from data stack
#    91 84 e3 80
21 80
00 80 63 80 04 80 11 80
03 a0

# print header
21 b0 00 80 b0 

# print string data
70 0e 83
c0 c1

# print real code
70 0e 83
c0 c2

Now we simply take all the bytes following the string data line, replace every code byte (0x80 or larger) with 0x7f followed by that byte XORed with 0x80, and put the result on the string data line.

String data:

00 7f 20 21 7f 00 61 7f 00 04 7f 00 03 7f 00 0e 70 11 7f 00 30 7f 00 01 7f 20 21 7f 00 62 7f 00 04 7f 00 03 7f 00 0e 70 11 7f 00 30 7f 00 63 7f 00 01 7f 00 04 7f 00 03 7f 00 70 0f 11 7f 00 02 7f 20 21 7f 00 00 7f 00 63 7f 00 04 7f 00 11 7f 00 03 7f 20 21 7f 30 00 7f 00 7f 30 70 0e 7f 03 7f 40 7f 41 70 0e 7f 03 7f 40 7f 42
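
The string data can also be produced by a short script instead of by hand. The Python sketch below applies the replacement rule to a sequence of code bytes; `escape` is a hypothetical helper name, and the sketch assumes the code contains no literal 0x7e or 0x7f bytes, which holds for the program above.

```python
def escape(code: bytes) -> bytes:
    """Replace each byte >= 0x80 with 0x7f followed by that byte XOR 0x80."""
    out = bytearray()
    for byte in code:
        if byte >= 0x80:
            out += bytes([0x7F, byte ^ 0x80])  # marker, then the masked code byte
        else:
            out.append(byte)                   # string-safe bytes are kept as-is
    return bytes(out)


# First four bytes after the string data line: 00 a0 21 80.
print(escape(bytes.fromhex("00 a0 21 80")).hex(" "))  # 00 7f 20 21 7f 00
```

Feeding it every byte after the string data line should give the string data shown above.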

And that is it! A successful quine in Assemblium.
