Skip to content

Instantly share code, notes, and snippets.

@Radvylf
Created May 20, 2022 22:47
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Radvylf/73d14cbd5812205bdfbe4c2fe1b89234 to your computer and use it in GitHub Desktop.
Save Radvylf/73d14cbd5812205bdfbe4c2fe1b89234 to your computer and use it in GitHub Desktop.

catstruct

A proof-of-concept golfing language, where the structure of a program and its operators are distinct.

Structure

A structure comes before the operator array.

Op count to number of structs:

 0: 1
 1: 3
 2: 9
 3: 36
 4: 162
 5: 783
 6: 3969
 7: 20817
 8: 112023
 9: 615033
10: 3431403
11: 19398690
12: 110880900
13: 639730305
14: 3720657807
15: 21790419444
16: 128398625658
17: 760668489729
18: 4528069760691
19: 27070491820644
20: 162464919528222
21: 978463778897637
22: 5911727071716891
23: 35821932198013812
24: 217642657066130500

Op count to required bits for structure and total bits:

 0: 0   0
 1: 2   8
 2: 4   16
 3: 6   24
 4: 8   32
 5: 10  40
 6: 12  48
 7: 15  57
 8: 17  65
 9: 20  74
10: 22  82
11: 25  91
12: 27  99
13: 30  108
14: 32  116
15: 35  125
16: 37  133
17: 40  142
18: 43  151
19: 45  159
20: 48  168
21: 50  176
22: 53  185
23: 55  193
24: 58  202

Bits to ops:

 0: 0 (1)
 1:   0 (3)
 2:   0 (7)
 3:   0 (15)
 4:   0 (31)
 5:   0 (63)
 6: 1 (1)
 7: 1 (3)
 8:   1 (7)
 9:   1 (15)
10:   1 (31)
11:   1 (63)
12: 2 (1)
13: 2 (3)
14: 2 (7)
15: 2 (15)
16:   2 (31)
17:   2 (63)
18: 3 (1)
19: 3 (3)
20: 3 (7)
21: 3 (15)
22: 3 (31)
23: 3 (63)
24: 4 (1)
25: 4 (3)
26: 4 (7)
27: 4 (15)
28: 4 (31)
29: 4 (63)
30: 4 (127)
31: 4 (255)
32: 5 (4)
33: 5 (12)
34: 5 (28)
35: 5 (60)
36: 5 (124)
37: 5 (252)
38: 5 (508)
39: 5 (1020)
40: 6 (16)
41: 6 (48)
42: 6 (112)
43: 6 (240)
44: 6 (496)
45: 6 (1008)
46: 6 (2032)
47: 6 (4080)
48: 7 (64)

Old structure

A structure comes before the operator reference. It starts with a number of 1 bits, then a 0 bit. Then, there are three data bits, and an additional three bits for each 1 at the start. Then, f(n) - 1 where f(n) = 8 ** n + f(n - 1) and f(0) = 1 is added to the number represented by the data bits, where n is the number of 1 bits at the start, giving the index in the sequence of program structures. The structure takes approximately (n - 2) / 4 bytes, where n is the number of operators.

The sequence of program structures represents all possible structures as ternary numbers, where each digit is the child count of a node, in prefix notation. E.g., 2010 represents a program like 1 N 2 + in postfix notation. These ternary numbers are sorted first by length, shortest first, then by the ratios of 0s, 1s, and 2s. If x, y, and z represented the amounts of 0s, 1s, and 2s respectively, the smallest (y - z) ** 2 would come first, or in the event of a tie, the smallest ((x * 2) - (y + z)) ** 2. In the event that there is still a tie, the larger number comes first (this ensures 0s are farther to the right). The first 200 terms in this sequence are:

  0: 0
  1: 10
  2: 200
  3: 110
  4: 2100
  5: 2010
  6: 1200
  7: 1110
  8: 21100
  9: 21010
 10: 20110
 11: 12100
 12: 12010
 13: 11200
 14: 22000
 15: 20200
 16: 11110
 17: 221000
 18: 220100
 19: 220010
 20: 212000
 21: 210200
 22: 202100
 23: 202010
 24: 201200
 25: 122000
 26: 120200
 27: 211100
 28: 211010
 29: 210110
 30: 201110
 31: 121100
 32: 121010
 33: 120110
 34: 112100
 35: 112010
 36: 111200
 37: 111110
 38: 2211000
 39: 2210100
 40: 2210010
 41: 2201100
 42: 2201010
 43: 2200110
 44: 2121000
 45: 2120100
 46: 2120010
 47: 2112000
 48: 2110200
 49: 2102100
 50: 2102010
 51: 2101200
 52: 2021100
 53: 2021010
 54: 2020110
 55: 2012100
 56: 2012010
 57: 2011200
 58: 1221000
 59: 1220100
 60: 1220010
 61: 1212000
 62: 1210200
 63: 1202100
 64: 1202010
 65: 1201200
 66: 1122000
 67: 1120200
 68: 2111100
 69: 2111010
 70: 2110110
 71: 2101110
 72: 2011110
 73: 1211100
 74: 1211010
 75: 1210110
 76: 1201110
 77: 1121100
 78: 1121010
 79: 1120110
 80: 1112100
 81: 1112010
 82: 1111200
 83: 2220000
 84: 2202000
 85: 2200200
 86: 2022000
 87: 2020200
 88: 1111110
 89: 22111000
 90: 22110100
 91: 22110010
 92: 22101100
 93: 22101010
 94: 22100110
 95: 22011100
 96: 22011010
 97: 22010110
 98: 22001110
 99: 21211000
100: 21210100
101: 21210010
102: 21201100
103: 21201010
104: 21200110
105: 21121000
106: 21120100
107: 21120010
108: 21112000
109: 21110200
110: 21102100
111: 21102010
112: 21101200
113: 21021100
114: 21021010
115: 21020110
116: 21012100
117: 21012010
118: 21011200
119: 20211100
120: 20211010
121: 20210110
122: 20201110
123: 20121100
124: 20121010
125: 20120110
126: 20112100
127: 20112010
128: 20111200
129: 12211000
130: 12210100
131: 12210010
132: 12201100
133: 12201010
134: 12200110
135: 12121000
136: 12120100
137: 12120010
138: 12112000
139: 12110200
140: 12102100
141: 12102010
142: 12101200
143: 12021100
144: 12021010
145: 12020110
146: 12012100
147: 12012010
148: 12011200
149: 11221000
150: 11220100
151: 11220010
152: 11212000
153: 11210200
154: 11202100
155: 11202010
156: 11201200
157: 11122000
158: 11120200
159: 22210000
160: 22201000
161: 22200100
162: 22200010
163: 22120000
164: 22102000
165: 22100200
166: 22021000
167: 22020100
168: 22020010
169: 22012000
170: 22010200
171: 22002100
172: 22002010
173: 22001200
174: 21220000
175: 21202000
176: 21200200
177: 21022000
178: 21020200
179: 20221000
180: 20220100
181: 20220010
182: 20212000
183: 20210200
184: 20202100
185: 20202010
186: 20201200
187: 20122000
188: 20120200
189: 12220000
190: 12202000
191: 12200200
192: 12022000
193: 12020200
194: 21111100
195: 21111010
196: 21110110
197: 21101110
198: 21011110
199: 20111110

Strings

Strings are written using a special monad (or a dyad for format strings if supported). These take a tree and, rather than treating it as code, use the operator list and formatting to generate binary for the string's contents. For the format part, this comes from the same op-count-to-number-of-structs table as the structure. E.g., there are 783 trees which use 5 operators. These can simply be multiplied together, giving 783 * (64 ** 5) possible strings with 5 operators, which is 840739848192 or about 40 bits. A number will be added to the numeric representation of the string, that being the number of strings possible with fewer operators, removing any possibility of duplicates.

Catstruct's strings are bit-based, so a way to turn the string indices into bits is required. This is done using bijective base 2, with the digits 0 and 1 rather than 1 and 2, as the string specified by 0001 is not the same as 000000001. The first few numbers in this system would be:

 0: 
 1: 0
 2: 1
 3: 00
 4: 01
 5: 10
 6: 11
 7: 000
 8: 001
 9: 010
10: 011
11: 100
12: 101
13: 110
14: 111
15: 0000
16: 0001
17: 0010
18: 0011
19: 0100
20: 0101
21: 0110
22: 0111
23: 1000
24: 1001

Catstruct's strings consist of groups of bits with headers indicating their function:

  • 00: A huffman-coded corpus character
  • 010: A 14-bit Unicode character
  • 011: A 21-bit Unicode character
  • 1: A corpus word

Corpus words can have the following codes:

  • 0: Title case if preceded by a ., ?, or !, preceding space if not at start of string
  • 1000: No spacing
  • 1001: Opposite of normal casing
  • 1010: No spacing, opposite of normal casing
  • 1011: No spacing, all capitals
  • 1100: Add ,
  • 1101: Add .
  • 1110: Add ?
  • 1111: Add !c

Operator Brainstorming for catstruct

Ops

Nilads

Null:

  • null

Int:

  • -2
  • -1
  • 0
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 16
  • 60
  • 100
  • 256
  • googol
  • Infinity

Rational, Cmplx:

  • 1 / 2
  • i
  • -i
  • 1 + i
  • 1 - i
  • -1 + i
  • -1 - i
  • pi
  • sqrt2

String:

  • ""
  • " "
  • "\n"
  • "abcdefghijklmnopqrstuvwxyz"
  • "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  • "bcdfghjklmnpqrstvwxyz"
  • "bcdfghjklmnpqrstvwxz"
  • "aeiou"
  • "aeiouy"
  • "0123456789"
  • "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

Array:

  • [-1, 1]
  • [0, 1]
  • [1, 2]
  • [1, i, -1, -i]
  • ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

Inf. array:

  • 0s
  • 1s
  • Pos. ints
  • Nonn. ints
  • Ints ([0, 1, -1, 2, -2])
  • Rationals (calkin-wilf with 0)
  • Cmplx(ints)
  • Cmplx(rationals)
  • Cmplx(rationals) with 0
  • Binary fractions
  • Primes
  • Fibonacci sequence
  • Powers of 2
  • Powers of 10
  • Alternating bits

Number

Monads:

  • Negate
  • Reciprocal
  • Floor
  • Trunc
  • Round
  • Ceil
  • Bitwise NOT
  • Complement
  • Absolute value
  • Factorial
  • Square root
  • To binary
  • To hexadecimal
  • To string
  • Double
  • Halve
  • Triple
  • Log 2
  • Log 10
  • Floor of Log 2 + 1
  • Floor of Log 10 + 1
  • Range 0..z
  • Range 0..(z + 1)
  • Range 1..(z + 1)
  • Range z..0
  • Range 1..(z + 1)
  • Range -z..0
  • Range -z..z
  • Range -z..(z + 1)
  • To char
  • To alpha (0-based)
  • To alpha (1-based)
  • Digits
  • Sig figs
  • Insignificant (abs <= 1)
  • Counts of digits
  • Mod 2
  • Mod 10
  • Mod 256
  • z * 10
  • Square
  • Increment
  • Decrement
  • Half and floor
  • Is prime
  • zth prime
  • zth fibonacci
  • Pow of 2
  • Pow of 10
  • Sin
  • Cos
  • Tan
  • Asin
  • Acos
  • Atan
  • Sinh
  • Cosh
  • Tanh
  • Asinh
  • Acosh
  • Atanh
  • Ln
  • Exp
  • Sign
  • Is pos
  • Is int
  • Is rational
  • Is real
  • Is finite
  • Fractional part
  • Real
  • Imag
  • Imag coeff.
  • Conjugate
  • Random to
  • Count 1 bits
  • Divisibility by 2
  • Divisibility by 10
  • Divisors
  • Divisor pairs
  • Factors

Dyads:

  • Add
  • Subtract
  • Multiply
  • Divide
  • Integer division
  • Modulo
  • Pow
  • Root
  • Log
  • Bitwise NOT with bits
  • Bitwise AND
  • Bitwise OR
  • Bitwise XOR
  • Distance
  • Hypot
  • Atan2
  • Base convert
  • Convert to digits
  • Convert to bijective digits
  • Less than
  • Greater than
  • Less than or equal to
  • Greater than or equal to
  • All right-justified base-y numbers with a max of x digits as a list (e.g., 2?8 is all byte values, 3?2 is [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2]])
  • Range x..y
  • Range x..(y + 1)
  • Take sign
  • Mul by pow of 2
  • Mul by pow of 10
  • Digit
  • Add away from 0
  • GCD
  • LCM
  • Divisibility
  • Imag. from coeffs
  • Run polynomial
  • Divmod
  • Similar (diff. <= 1)

String and array

Monad:

  • Size
  • Double in-place
  • Double whole
  • Half ([[a, b, c], [d, f]])
  • Half ([[a, b], [c, d]])
  • Half ([[a, b], [c], [d, f]])
  • Positions
  • Distinct
  • Nondistinct
  • Is fill
  • Squish_2 nondistinct
  • Zip with position
  • First
  • Final
  • First and final ([-1, 0, -2, 0, 4] is [-1, 4])
  • Without first
  • Without final
  • Without first and final
  • Reverse
  • Is not null
  • Is size 1
  • Random
  • Combinations
  • Sorts
  • Sorts of combinations
  • Slices
  • Partitionings
  • Lossy partitionings
  • Modification distance
  • Sort (common)
  • Run length encode
  • Counts
  • Modular slice by 2
  • Cartesian square
  • Cartesian square and squish
  • Is symm
  • Partition symm
  • Multiply by inf

Dyad:

  • Subscript(s)
  • Trim
  • Trim start
  • Trim right
  • Squish_2 ([1, 2, 2, 1, 2, 2, 2, 2], 2 is [1, 2, 1, 2])
  • Concat
  • Position of
  • Positions of
  • Position of (monotonic)
  • Positions of (monotonic)
  • Fill
  • Slice to
  • Slice from
  • Slice final
  • Insert at start
  • Insert at finish
  • Splat at start
  • Splat at finish
  • Join
  • Split
  • Without position
  • Without positions
  • Without first
  • Without
  • Map with dict
  • Is start
  • Is finish
  • Shift back
  • Shift right
  • Group
  • Group with bonus ([[0, 1, 2], [3, 4, 5], [6]])
  • Sliding group
  • Swap with dict ([1, 2, 4], {1: 2, 2: 10, 4: 40} is [2, 10, 40])
  • Swap first with dict
  • Swap strings with dict, long first
  • GCD
  • LCM
  • Distinct GCD
  • Distinct LCM
  • Diff
  • Symm diff
  • Distinct diff
  • Distinct symm diff
  • Base convert (string)
  • Count
  • Count not
  • Multiply
  • Multiply in-place
  • Partition
  • Split at positions
  • Interleave
  • Modular slice ([0, 1, 2, 3, 4, 56, 7, 8], 3 is [0, 3, 6])
  • Cartesian product
  • Cartesian product and squish
  • Cartesian power
  • Cartesian power and squish
  • Venn diagram ([first, both, second])
  • Xor diagram ([one, both])
  • Same size
  • Combinations of z
  • Sorts of z

Triad:

  • Write subscript(s)
  • Slice
  • Cut
  • Insert
  • Splat
  • Pad start
  • Pad finish
  • Pad both (bias left)
  • Merge ([1, 2, 2, 1, 1, 2, 2, 2, 2] and 2 -> [1, 2, 1, 1, 2])
  • Write on tape

String

Monad:

  • Matrix of chars
  • Split with ""
  • Split with " "
  • Split with "\n"
  • First char code
  • Array of char codes
  • Uppercase
  • Lowercase
  • Is whitespace
  • Is alpha
  • Is lowercase
  • Is digit
  • From binary
  • From hexadecimal
  • To number

Dyad:

  • Draw with mask ( a b and cc is ca b)
  • Multiply with \ns

Array

Monad:

  • Squish
  • Crush
  • Position of string
  • Positions of string
  • From matrix of chars (nested join)
  • Join with ""
  • Join with " "
  • Join with "\n"
  • Dict positions
  • Dict data
  • Cumulative sum
  • Dlist
  • Cumulative concat
  • Sum
  • Product
  • Flip
  • From array of char codes
  • Max
  • Min
  • Int from sig figs
  • Sig figs with pow of 10
  • Convert from digits
  • Convert from bijective digits
  • Points by ([1, 2] is [[0, 2], [2, 2], [1, 1], [1, 3]])
  • Points by, with corners
  • Filter coords by bounds
  • Mod coords by bounds
  • Init with dims
  • Dims
  • Rank
  • Zip with coords, crush
  • Diagonals
  • Antidiagonals
  • Cut start to flip
  • Cut finish to flip
  • Cut finish, flip

Dyad:

  • Addwith
  • Subtractwith
  • Multiplywith
  • Dividewith
  • Crush ranks (2 turns 4d to 2d)
  • Crush to rank (2 turns anything to 2d)
  • Trim string
  • Trim string start
  • Trim string right
  • Squish_2 string
  • Split string
  • Join string
  • String is start
  • String is finish
  • Swap strings with dict
  • Dict subscript
  • Dict subscripts
  • Is in dict
  • Init with dims and fill
  • Fill MD
  • Subscript with coords
  • Subscripts with coords
  • Shift ranks ([1, 0] for flip, [2, 0, 1, 3] for 4d array)
  • Build from crush with dims
  • Reverse rank(s)
  • Dictionary distinct
  • Flip dictionary
  • Pad start to flip
  • Pad finish to flip
  • Pad finish, flip
  • Pad start with string to flip
  • Pad finish with string to flip
  • Pad finish with string, flip
  • Multiply polynomials
  • Roots of polynomial

Triad:

  • Write dict subscript
  • Write dict subscripts
  • Write subscript with coords
  • Write subscripts with coords
  • Pad start with string
  • Pad finish with string

Function

Monad:

  • Inf. array with f(f(f(...f(f(null)))))
  • Inf. array with f(f(f(...f(f([...this])))))

Dyad:

  • Compose dyadic function with monad on both args (f(x, y), g(z) is f(g(x), g(y)))
  • Inf. array with f(f(f(...f(f(input)))))
  • Inf. array with f(f(f(...f(f([...this, input])))))

Dynamics

  • For
  • For with position
  • For with position from right
  • For with array so far
  • Filter
  • Filter with position
  • Filter with array so far
  • Zipwith
  • Zipwith with position
  • Zipwith, without bonus
  • Zip rows
  • Zip rows with position
  • Find
  • Find position
  • Find positions
  • Find (monotonic)
  • Find position (monotonic)
  • Find positions (monotonic)
  • Apply at position(s)
  • Fold
  • Fold with initial
  • Fold right
  • Fold right with initial
  • Scan
  • Scan with initial
  • Scan right
  • Scan right with initial
  • For with starts
  • For with pairs
  • For with pairs with bonus
  • For with sliding pairs
  • For with groups
  • For with groups with bonus
  • For with sliding groups
  • For with rows as inputs (dict to array)
  • For dict
  • For dict without position
  • All
  • Count
  • Count not
  • Min
  • Max
  • Sort (min) by
  • Sort (max) by
  • Sort (comp)
  • Zip for ([x, f(x)])
  • Zip for with position
  • Partition
  • Partition with part so far
  • Partition with part so far and parts so far
  • Partition by ([1, 2, -1, 1, -1, -2, 2, -2, -2, 1] with abs(z) is [[1], [2], [-1, 1, -1], [-2, 2, -2, -2], [1]])
  • Split
  • For MD
  • For MD with coords
  • Find first coords MD
  • Find coords MD
  • Find first coods MD (monotonic)
  • Find coods MD (monotonic)
  • Zipwith MD
  • Zipwith MD with coords
  • Apply at coords
  • Slice to first falsy
  • Final prior to first falsy
  • Distinct by
  • Squish_2 by
  • Trim by
  • Squish_2 start by
  • Squish_2 finish by
  • Trim start by
  • Trim finish by

Any

Monad:

  • Is null
  • To bool
  • Not
  • Wrap
  • Print
  • Inf. list of

Dyad:

  • Triad modifier
  • Compare
  • Is
  • Is not
  • Min
  • Max
  • Pair
  • Dict
  • And
  • Or
  • Null or

Triad:

  • Conditional
  • Not conditional
  • Null conditional
  • Not null conditional

Custom

  • Niladic function
  • Monadic function
  • Dyadic function
  • For with function (dyadic)
  • Int
  • Rational
  • String
  • Dyadic string
() => { var known, known_non, known_count, known_bits, structs, non_i_structs_2, count_structs; };
var known = new Map();
var known_non = new Map();
var known_count = new Map();
var known_bits = new Map();
var structs = (n) => {
if (n == 1)
return [["0"], ["i"]];
if (known.has(n))
return known.get(n);
var sts = [...structs(n - 1).map(s => ["1", ...s])];
for (var i = 2; i < n; i++)
sts = sts.concat(structs(i - 1).flatMap(s => structs(n - i).map(t => ["2", ...s, ...t])));
known.set(n, sts);
return sts;
};
var non_i_structs = (n) => {
var sts = [];
for (var i = n; i <= n * 2 + 1; i++)
sts = sts.concat(structs(i).filter(x => x.filter(d => d != "i").length == n));
return sts;
};
var non_i_structs_2 = (n) => {
if (n == 0)
return [["i"]];
if (n == 1)
return [["0"], ["1", "i"], ["2", "i", "i"]];
if (known_non.has(n))
return known_non.get(n);
var sts = [...non_i_structs_2(n - 1).map(s => ["1", ...s])];
for (var i = 1; i <= n; i++)
sts = sts.concat(non_i_structs_2(i - 1).flatMap(s => non_i_structs_2(n - i).map(t => ["2", ...s, ...t])));
known_non.set(n, sts);
return sts;
};
var count_structs = (n) => {
if (n == 0)
return 1;
if (n == 1)
return 3;
if (known_count.has(n))
return known_count.get(n);
var sts = count_structs(n - 1);
for (var i = 1; i <= n; i++)
sts += count_structs(i - 1) * count_structs(n - i);
known_count.set(n, sts);
return sts;
};
var bits_to_ops = (b, op_s = 6) => {
var bits = known_bits.has(op_s) ? known_bits.get(op_s) : [[0, 1]];
if (b in bits)
return bits[b][0];
for (var i = bits.length; i <= b; i++) {
if (bits[i - 1][1] >= count_structs(bits[i - 1][0]) && i >= op_s * (bits[i - 1][0] + 1))
bits.push([bits[i - 1][0] + 1, 2 ** (i - op_s * (bits[i - 1][0] + 1))]);
else
bits.push([bits[i - 1][0], bits[i - 1][1] + 2 ** (i - op_s * bits[i - 1][0])]);
}
known_bits.set(op_s, bits);
return bits[b][0];
};
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment