kmill/Anders_fractran_collatz.txt

## Anders_fractran_collatz.txt
See https://codegolf.stackexchange.com/a/203682/78112

This is a description of Anders Kaseorg's solution to the code golf problem of making a program in FRACTRAN
that detects whether or not the Collatz sequence starting from the given number eventually reaches 1.

FRACTRAN programs can be thought of as being a list of chemical reactions with no catalysts (chemical
species that appear as both reagents and products).  Given a fraction c/d, the prime factors present
in d (with multiplicity) are the reagent species, and the prime factors present in c (with
multiplicity) are the product species.  Since fractions are reduced, a species is only ever either a
reagent or a product in a given reaction.

"Decompiling" Anders's solution, we get the following reactions:

INPUT: n of species A

1.       2 REM  ->  DIV
2.  MUL1 + REM  ->  MUL2 + 2 A
3.  MUL1 + DIV  ->  MUL2 + REM
4.  MUL2 + DIV  ->  MUL1 + REM + A
5.        MUL2  ->  0
6.           A  ->  REM
7.         DIV  ->  MUL1 + 2 REM

His assignments to primes are as follows:
  2   <->  REM    -- the remainder of A/2
  3   <->  A      -- the input
  5   <->  DIV    -- the quotient A/2 rounded down
  7   <->  MUL2   -- flag for stage 2 of multiplication
  11  <->  MUL1   -- flag for stage 1 of multiplication

The way this works is as follows:
a. At the beginning, we have n A.
b. Eq.6 converts this all to REM, preparing for division by 2
c. Simultaneously, Eq.1 reduces REM mod 2, leaving DIV with the quotient and REM with the remainder.
d. Once all processed, Eq.7 asserts the MUL1 flag if DIV >= 1 (otherwise n was 1, and the program halts).
   Since FRACTRAN does not allow catalysts, there is a trick here: one DIV molecule is consumed and replaced
   by 2 REM, which are immediately converted back into a DIV by Eq.1.

Now it gets into some real cleverness.  If REM=0, we want to do n DIV -> n A, and if REM=1, we want to
do n DIV -> (3*n + 2) A.  (Once done, Eq.5 will finalize the result, and the program restarts with the new
value for A.)  Anders has, essentially, turned this into calculating 2r + (1+2r)*d, where d is the floor of n/2
and r is the remainder of this quotient.

e. Depending on whether n was even or odd (whether there are zero or one REMs), either Eq.2 or Eq.3 is executed.
   (1) Consider REM=0.  Then Eq.3 converts DIV to REM and switches to MUL2.  MUL2 consumes another DIV and
       replaces it with REM, while also producing an A.  Now that there are two REMs, Eq.1 converts them into a
       DIV, which is a trick that avoids catalysts.  So, overall, there is one fewer DIV and one more A.
       Eventually, all but one of the DIVs will be converted into As.  For the final DIV, then Eq.3 applies but
       then Eq.4 does not, leading to Eq.5 being applied.  This has the effect of converting this last DIV into
       a REM, which is as if Eq.6 had been applied once, giving (n-1) A + REM.
   (2) Consider REM=1.  Eq.2 consumes the REM and adds 2 A while switching to MUL2.  Eq.4 consumes a DIV and adds
       the REM back in while producing an A, switching back to MUL1 and Eq.2, which adds in another 2 A and returns
       to Eq.4.  All in all, these latter two steps convert each DIV into 3 A.  With that 2 A from the beginning,
       we have n DIV being converted into (2 + 3*n) A.  The end of this sequence is Eq.2, so all the REMs are consumed.
f. Now there are A's, possibly one REM, and a single MUL2.  Eq.5 consumes the MUL2, and the program enters (nearly)
   its initial state, so the division part of the program begins again.  The nearly is because in e(1) there is a REM
   at the end, as if Eq.6 had been applied once.

I tried hard to find a way to simplify this program farther.  It seems Anders might have found the
optimal solution!

---

Allowing catalysts (which can only ever decrease program length), I found a 7-equation solution that
I wasn't able to simplify further.  Perhaps a clever person might find a way to bring this down to 6
equations, but it seems unlikely.

Input: n of species A

1. EVEN + DIV  ->  EVEN + A
2.       EVEN  ->  0
3.  ODD + DIV  ->  ODD + 3 A
4.        ODD  ->  0
5.        2 A  ->  DIV
6.    A + DIV  ->  ODD + DIV + 2 A
7.        DIV  ->  EVEN + DIV
	See https://codegolf.stackexchange.com/a/203682/78112

	This is a description of Anders Kaseorg's solution to the code golf problem of making a program in FRACTRAN
	that detects whether or not the Collatz sequence starting from the given number eventually reaches 1.

	FRACTRAN programs can be thought of as being a list of chemical reactions with no catalysts (chemical
	species that appear as both reagents and products). Given a fraction c/d, the prime factors present
	in d (with multiplicity) are the reagent species, and the prime factors present in c (with
	multiplicity) are the product species. Since fractions are reduced, a species is only ever either a
	reagent or a product in a given reaction.

	"Decompiling" Anders's solution, we get the following reactions:

	INPUT: n of species A

	1. 2 REM -> DIV
	2. MUL1 + REM -> MUL2 + 2 A
	3. MUL1 + DIV -> MUL2 + REM
	4. MUL2 + DIV -> MUL1 + REM + A
	5. MUL2 -> 0
	6. A -> REM
	7. DIV -> MUL1 + 2 REM

	His assignments to primes are as follows:
	2 <-> REM -- the remainder of A/2
	3 <-> A -- the input
	5 <-> DIV -- the quotient A/2 rounded down
	7 <-> MUL2 -- flag for stage 2 of multiplication
	11 <-> MUL1 -- flag for stage 1 of multiplication

	The way this works is as follows:
	a. At the beginning, we have n A.
	b. Eq.6 converts this all to REM, preparing for division by 2
	c. Simultaneously, Eq.1 reduces REM mod 2, leaving DIV with the quotient and REM with the remainder.
	d. Once all processed, Eq.7 asserts the MUL1 flag if DIV >= 1 (otherwise n was 1, and the program halts).
	Since FRACTRAN does not allow catalysts, there is a trick here: one DIV molecule is consumed and replaced
	by 2 REM, which are immediately converted back into a DIV by Eq.1.

	Now it gets into some real cleverness. If REM=0, we want to do n DIV -> n A, and if REM=1, we want to
	do n DIV -> (3*n + 2) A. (Once done, Eq.5 will finalize the result, and the program restarts with the new
	value for A.) Anders has, essentially, turned this into calculating 2r + (1+2r)*d, where d is the floor of n/2
	and r is the remainder of this quotient.

	e. Depending on whether n was even or odd (whether there are zero or one REMs), either Eq.2 or Eq.3 is executed.
	(1) Consider REM=0. Then Eq.3 converts DIV to REM and switches to MUL2. MUL2 consumes another DIV and
	replaces it with REM, while also producing an A. Now that there are two REMs, Eq.1 converts them into a
	DIV, which is a trick that avoids catalysts. So, overall, there is one fewer DIV and one more A.
	Eventually, all but one of the DIVs will be converted into As. For the final DIV, then Eq.3 applies but
	then Eq.4 does not, leading to Eq.5 being applied. This has the effect of converting this last DIV into
	a REM, which is as if Eq.6 had been applied once, giving (n-1) A + REM.
	(2) Consider REM=1. Eq.2 consumes the REM and adds 2 A while switching to MUL2. Eq.4 consumes a DIV and adds
	the REM back in while producing an A, switching back to MUL1 and Eq.2, which adds in another 2 A and returns
	to Eq.4. All in all, these latter two steps convert each DIV into 3 A. With that 2 A from the beginning,
	we have n DIV being converted into (2 + 3*n) A. The end of this sequence is Eq.2, so all the REMs are consumed.
	f. Now there are A's, possibly one REM, and a single MUL2. Eq.5 consumes the MUL2, and the program enters (nearly)
	its initial state, so the division part of the program begins again. The nearly is because in e(1) there is a REM
	at the end, as if Eq.6 had been applied once.

	I tried hard to find a way to simplify this program farther. It seems Anders might have found the
	optimal solution!

	---

	Allowing catalysts (which can only ever decrease program length), I found a 7-equation solution that
	I wasn't able to simplify further. Perhaps a clever person might find a way to bring this down to 6
	equations, but it seems unlikely.

	Input: n of species A

	1. EVEN + DIV -> EVEN + A
	2. EVEN -> 0
	3. ODD + DIV -> ODD + 3 A
	4. ODD -> 0
	5. 2 A -> DIV
	6. A + DIV -> ODD + DIV + 2 A
	7. DIV -> EVEN + DIV