Skip to content

Instantly share code, notes, and snippets.

@jckarter
Created May 30, 2012 22:07
Show Gist options
  • Save jckarter/2839239 to your computer and use it in GitHub Desktop.
Save jckarter/2839239 to your computer and use it in GitHub Desktop.
z

fml

The function manipulation language

"I just implemented Conway's Game of Life in two lines of code. #fml"

pad = x flip[stitch] 0, stitch 0, flip[cat] 0, cat 0
life = pad, neighborhoods[3 3], [ravel, [sum in?: [x @ 4, + 3; 3]]]/2

[0 1 0 1
 0 1 1 0
 0 0 1 0
 0 0 0 0] replicate[life]-times[5]
###
[0 1 0 1
 0 1 1 0
 0 0 1 0
 0 0 0 0

 0 1 0 0
 0 1 0 1
 0 1 1 0
 0 0 0 0

 0 0 1 0
 1 1 0 0
 0 1 1 0
 0 0 0 0

 0 1 0 0
 1 0 0 0
 1 1 1 0
 0 0 0 0

 0 0 0 0
 1 0 1 0
 1 1 0 0
 0 1 0 0]
###

Summary

fml is an optimizing, function-oriented, array programming language. Unlike other array programming languages, it aims to have a less symbol-heavy but still concise syntax, and non-strict semantics that allow for high-level optimization.

Note that fml is not:

  • meant for serious use
  • always faster than systems languages like C/C++/etc.
  • always more expressive than general-purpose languages like Python, Javascript, etc.
  • suitable for cryptography, real-time, or low-level applications that require fine control of time and space complexity

License

The fml implementation is licensed under an MIT-style license; see LICENSE.txt for details. This document and the other included fml documentation are in the public domain.

Syntax

Character set

fml source code is UTF-8 encoded Unicode text. On-disk source code should be pre-normalized to NFC form; an off-line interpreter or compiler may not normalize source code text by itself. The characters [](){},.'":;#/ are reserved and used in fml's syntax. All printable, non-whitespace characters not reserved are free for making up identifiers. The characters +-_ and digit characters have special meaning in numeric literals but may be also used in identifiers.

Comments

# begins a line comment that ends at the end of the line. A line beginning with ### begins a block comment that ends at the next line identical to the starting line. ### comments can envelop other lines beginning with ### as long as those lines are followed by different text.

# This is a line comment

###
This is a block comment
###

### LICENSE ###
(c) 2012 ORIGINAL LANGUAGE DO NOT STEAL
### LICENSE ###

###FIXME
###
Add one to a value
###
inc = x - 1
###FIXME

Brackets

Bracketing characters are used for many purposes in fml. The matching pairs [], (), {} may be used as brackets interchangeably, though they must match and nest correctly.

# These are all the same
[[1 + 2] - 3]
[(1 + 2) - 3]
([1 + 2] - 3)
({1 + 2} - 3)

# This is invalid
[(1 + 2] - 3}

Anywhere in this document brackets or [] are mentioned, matching pairs of () and {} may be substituted.

Literals

Numbers

A numeric literal begins with an optional + or - sign followed immediately by one or more digits. . marks a floating-point decimal point, and / marks a rational denominator.

123
+123
-123
1/3
+1/-3
1.01

The prefixes 0x, 0o, and 0b introduce hexadecimal, octal, and binary literals.

0xFFFF
0o755
0b10110111

A decimal floating-point literal may include an exponent marked by e.

1.5e10
1.5e-10

Hexadecimal and binary floating-point literals are supported; they must include a base-2 exponent marked by p.

-0x1.8p12   # (1 + 8/16) * (2 ^ 12)
0b1.1p-12   # (1 + 1/2) * (2 ^ -12)

Complex floating-point literals are supported; the real and imaginary part are separated by a sign followed by j.

1+j0
0.707-j0.707

_ may be used freely within numeric literals for human readability; it does not affect the value of the literal.

1_000_000
1_00_00_000
0xFFFF_0000
0x1.8000_0000_0000_1p0

fml tries to deduce an appropriate type for number literals, which is sufficient most of the time. An explicit type may be asked for with a type suffix:

1b          # 1-bit/boolean 1 (note: cannot be used with 0x)
1o          # 8-bit octet 1
1s          # 16-bit short 1
1i          # 32-bit int 1
1L          # 64-bit long 1
1f          # single-precision float 1
1F          # double-precision float 1
1I          # arbitrary-precision 1
1R          # arbitrary-precision rational 1

If -, +, or a digit does not begin a numeric literal, it is treated as an identifier character. Outside of numeric literals, _ may also be used as an identifier character; . is used as the module path separator in identifiers; and / as the rank modifier.

The special values infinity and NaN can be referred to as 1/0 and 0/0 respectively.

Strings

'single quotes' and "double quotes" may be used to delimit string literals. Two quotes in a row, '' or "", escapes a quote character within a literal. The C-style escapes \', \", \r, \n, \t are also supported. An additional escape \s is provided for the space character. Combining characters may also be escaped with \. Unicode code points may be referenced by hexadecimal index with \u012ABC.

'What are you, a lawyer?'
'I always wanted to be in one of your fuckin'' plays.'
"Garçon!"
'Garc\̧on!'

Strings with escaped combining characters are never normalized. 'e\́' and 'é' are different strings.

String literals may span multiple lines. Single newlines within a string literal are combined with any adjacent whitespace into a single space character inside the literal. Blank lines in the string literal, along with any whitespace represented by escape sequences such as \s or \n, are preserved.

# No newlines in the string's value
"The secret, I don't know. I guess you just gotta find something you
 love and then do it for the rest of your life. For me, it's going to
 Rushmore."

# blank lines preserved in the string's value
'"I like your nurse''s uniform, guy."

 "These are O.R. scrubs."

 "O.R. they?"'

Preformatted strings may be delimited with ''' ''' or """ """. Leading and trailing whitespace is stripped, along with any indentation whitespace up to the indentation level of the first non-whitespace character. Newlines (including the final newline before the closing """, if any) and additional indentation are preserved. Within a triple-quoted string, ' and " are literal, as is the one of ''' or """ not used to delimit the string. Literal quotes may also be used at the end of the string; in a literal ending with more than three """; the final three are taken as the delimiter.

# no ending newline; internal newlines preserved
# also note that final " is included inside string
""""She's my Rushmore, Max."
   "I know. She was mine, too.""""

# newline at end preserved
'''
    def factorial(n):
        """and now for something completely different"""
        fac = 1
        if n > 2:
            for i in xrange(2,n):
                fac *= i
        return fac
'''

An escape sequence such as \s or \n can be used to represent significant space that will not be trimmed.

"""
    \s                           MR. BLUME

            You guys have it real easy. I never had it like this
            where I grew up. But I send my kids here. Because, the
            fact is, whether you deserve it or not: you go to one
            of the best schools in the country.

    Max's eyes light up.

                                MR. BLUME

            Rushmore. You lucked out.

    Max leans forward to the railing and begins to listen intently.

                                MR. BLUME

            Now, for some of you it doesn't matter. You were born
            rich, and you're going to stay rich. But here's my
            advice to the rest of you: take dead aim on the rich
            boys. Get them in the crosshairs. And take them down.
"""

If a string literal appears in an expression by itself preceding a binding expression (using =, described below), the string becomes a docstring attached to the following binding.

"""
    Calculates the factorial of a positive integer X. Returns 1 for zero or
    negative integers.
"""
factorial = 2 to x, fold[*]
Arrays

Literal one-dimensional arrays, called lists, may be specified as series of literals separated by whitespace inside brackets.

[1 2 3]

Multiple-code-point strings are lists of code points, so the following forms are equivalent.

'abc'
['a' 'b' 'c']

Literal two-dimensional arrays, called tables, may be specified as series of lists separated by newlines or semicolons. Table literals are interpreted in row-major order.

# 2x3 table
[1 2 3
 2 4 6]

# Same
[1 2 3; 2 4 6]

N-dimensional table literals may be specified with the major axes delimited by blank lines or repeated semicolons. The Nth dimension is delimited by N - 2 blank lines, or N - 1 semicolons.

# 2x3x3
[ 1  2  3
  4  5  6
  7  8  9

 10 11 12
 13 14 15
 16 17 18]

# Same
[1 2 3; 4 5 6; 7 8 9;; 10 11 12; 13 14 15; 16 17 18]

# 2x2x2x2
[ 1  2
  3  4

  5  6
  7  8


  9 10
 11 12

 13 14
 15 16]

# Same
[1 2; 3 4;; 5 6; 7 8;;; 9 10; 11 12;; 13 14; 15 16]

To specify a type suffix for the elements of a numeric list, the suffix can be applied once to the last element of the list.

[0 1 0f]    # list of single-precision floats
[0 1 0b]    # list of bits

The empty list is []. The empty two-dimensional table is [;], three-dimensional is [;;], and so on.

Note that brackets and ; are also used for nesting expressions and list construction, described below. A bracketed expression is parsed as a literal if there is a single literal, newline, or ; token inside the brackets, or if the first two tokens inside the brackets are literals, newlines, or ;. The comma ,, described below, can force expression parsing.

[1]         # one-element list
[1 2]       # two-element list
[1 A]       # syntax error: invalid list literal; A isn't a literal
[1;2]       # 2x1 table
[1;A]       # syntax error: invalid table literal; A isn't a literal

[, 1]        # ===> 1
[, 1; A]     # ===> 1; A

Array literals must be homogeneous and rectangular. Heterogeneous arrays or ragged arrays-of-arrays are supported but must be created using list construction expressions, described below.

Identifiers

Identifiers are used to name nouns (values) and verbs (functions). Identifiers may be made up of any non-whitespace, non-reserved characters, but may not begin with a digit or with + or - followed by a digit. Capitalized and lowercased identifiers have distinct syntactic roles. A capitalized identifier begins with an uppercase or title-case letter or the symbol _. A lowercased identifier begins with any non-uppercase, non-title-case character. Identifiers are separated by whitespace or by reserved characters; symbols can be intermixed with letters and digits and do not separate identifiers by themselves.

# Capitalized identifiers
X
X*          # one identifier
Even?
_foo
_孙悟空

# Lowercased identifiers
sin
ro-sham-bo
exists?
+
-
**

The token = is reserved as the binding operator; however, longer identifiers may include =.

==
>=

/ is used as the rank modifier; however, it may also be used by itself as a lowercased identifier.

/           # the identifier '/', used for the division function
foo/1       # 'foo' rank 1
//1         # '/' rank 1

A trailing : is reserved for keyword syntax, but identifiers may include embedded :s.

if:else

. is used to provide qualified module paths.

math.functions.sqrt
math.constants.Pi

. by itself is also the "black hole" identifier used in binding patterns (see Binding, below). Directives, such as .import, are prefixed with ..

Expressions

Each line of fml source code is an expression. Newlines normally separate expressions, although indentation may be used to span expressions across lines.

fml distinguishes values, functions, and higher-order functions syntactically. In the APL tradition these are referred to as "nouns", "verbs", and "adverbs" respectively.

Nouns

Nouns are values. Literals are nouns, as are the results of applying verbs. Noun variables are named by capitalized identifiers.

1
'two'
[3 4 5]

Pi = 3.14159
Verbs

Verbs are first-order functions; they take nouns as inputs and output new nouns. Verbs are named with lowercased identifiers. A unary verb is applied using postfix notation, and a binary or higher-arity verb is applied using infix notation; the verb being applied is the second expression in any case.

1 neg                       # apply unary 'neg' to 1        ===> -1
1 + 2                       # apply binary '+' to 1 and 2   ===> 3
X? if:else 'yes' 'no'       # apply ternary 'if:else' to X?, 'yes', 'no'

Most verbs are pure functions; their output depends only on their inputs. If a verb name ends with !, it is impure and may read or write external state.

'Hello world!' print!
'foo.bin' read!, + 1, & 0xFF, write! 'foo+1.bin'

A higher-arity verb may be given . as its right argument. It will then be given the left argument for all of its arguments.

4 + .               # ===> 8
1 + 2, * .          # ===> 9
Nesting expressions

Brackets [] and the comma , can be used to nest verb applications and other expressions. Brackets nest expressions, like parens in many programming languages.

[1 + 2] * 3         # ===> 9
1 + [2 * 3]         # ===> 7
[2 neg] * 3         # ===> -6

Note that a left bracket must be separated from a preceding verb by whitespace. Without whitespace it is parsed as an adverb (see below).

1 + [2 * 3]         # brackets
[1 2 3] fold[+]     # adverb

A comma brackets from the beginning of the expression up to the comma, for left-associative composition.

1 + 2, neg          # -3
1 neg, + 2          #  1
2 * 3, + 1          #  7
1 + 2, * 3          #  9
1 + 2, * 3, neg     # -9

A comma can be used at the beginning of an expression, where it has no effect. This can be used to disambiguate list literals from expression brackets. A single literal in brackets is parsed as a one-element list literal. Literals should never otherwise need to be bracketed by themselves, but if this is desired, an extra comma can be used to force parsing as expression brackets.

[1]                 # one-element list [1]
[, 1]               # ===> 1
Keyword verbs

Higher-arity verbs may be applied using Smalltalk-esque keyword syntax. X a: Y b: Z c: W applies the verb a:b:c to the arguments X, Y, Z, and W. It generalizes for any number of keywords. The : must not be preceded by whitespace and must be followed by whitespace.

X? if: 'yes' else: 'no'      # X? if:else 'yes' 'no'

For arguments after the first two, a lone colon may be used as a separator without an identifier, preceded and followed by whitespace, as an argument separator. X a: Y : Z : W applies a to the arguments X, Y, and Z.

'hello-world.txt' create!: 0o644 : 'Hello world!'

The first "keyword" can be a bracketed verb expression. The subsequent keyword names, if any, are ignored.

# A lambda expression; see "bindings" below
1 [X . Y Z = X * Y, + Z]: 2 : 3     # ===> 6
# same
1 [X . Y Z = X * Y, + Z]: 2 and: 3  # ===> 6

Keyword application has lower precedence than non-keyword verb application and ,. This can be taken advantage of to lower the precedence of a verb without brackets.

1 / 2 +: 2 * 3          # [1 / 2] + [2 * 3]       ===>  13/2
1 / 2 +: 2 * 3, neg     # [1 / 2] + [(2 * 3) neg] ===> -11/2
Composing verbs

Verbs can be applied to other verbs to yield new verbs that compose the input verbs. Composing two unary verbs, uv1 uv2, gives a unary verb that applies uv1 to its input then uv2 to the result. Composing a unary verb with a higher-arity verb, uv bv, gives a higher-arity verb that applies uv to each of its inputs then bv over the results. Composing a higher-arity verb with a unary verb the other way, bv uv, gives a higher-arity verb that applies bv over its inputs then uv to the result.

X [uv1 uv2]      X [uv bv] Y       X [bv uv] Y

     X             X    Y            X    Y
     |             |    |             \  /
    uv1           uv    uv             bv
     |              \  /               |
    uv2              bv                uv

X uv1, uv2    [X uv] bv [Y uv]     X bv Y, uv

Composing two higher-arity functions is an error. However, "forks" can be composed using verbs of any arity. In a fork, R arity-B verbs are each applied over the inputs, and the results are used as the inputs to one arity-R root verb rv to give the result of the fork.

  X [uv1 rv uv2]         X [bv1 rv bv2] Y

      X    X               X   Y  X   Y
      |    |                \ /    \ /
     uv1  uv2               bv1    bv2
       \  /                    \  /
        rv                      rv

[X uv1] rv [X uv2]    [X bv1 Y] rv [X bv2 Y]

A tine of a fork may also be a noun, which is used directly as the corresponding input for the root verb.

mersenne = ^ - 1              # X mersenne Y = [X ^ Y] - 1
2 mersenne 7                  # ===> 127

# x is the identity function; X x == X
inc = x + 1
5 inc                         # ===> 6

A fork may also be formed with . as its right side, to reuse the left-hand result for the rest of the root verb's arguments.

sq = x * .

These composition rules allow many functions to be defined implicitly without explicit reference to arguments.

hypot = sq, +, sqrt           # ===> X hypot Y = [X sq] + [Y sq], sqrt
mean = sum / length           # ===> X mean = [X sum] / [X length]
variance = x - mean, sq, sum  # ===> X variance = X - [X mean], sq, sum
List construction

Semicolons ; collect the results of multiple expressions into a list. It has lower precedence than verb application, ,, and keywords.

1 + 2, * 3; 1 +: 2 * 3; 1 + 2       # ===> [9 7 3]

A list of same-arity verbs is itself a verb. When a verb list is applied, each verb in the list is applied to its argument(s) and the results are collected into a list. Nouns can be included in a verb list; they become values in the result of the verb.

5 [+;-;*;floor[/];0] 2   # ===> [7 3 10 2 0]

Using ; with equal-sized list or table elements builds a higher-rank table.

[1 2]; [3 4]                # ===> [1 2; 3 4]
[1 2; 3 4]; [5 6; 7 8]      # ===> [1 2; 3 4;; 5 6; 7 8]

Using ; with heterogeneous elements of different types or array sizes creates a heterogeneous list.

[1 2]; 3                    # ===> [, [1 2]; 3]

Higher-rank table constructions can be expressed using repeated ;.

4 + 2; 4 - 2;; 4 * 2; 4 / 2 # ===> [6 2; 8 2]

Note that brackets and ; are also used as table literals; the first two tokens inside the brackets are used to differentiate: if there is a single literal, newline, or ; token inside the brackets, or if the first two tokens inside the brackets are literals, newlines, or ;, it is parsed as a literal. , or brackets can be used to force parsing as an expression if the first element is a literal.

[1; 2; 3]                   # 3x1 literal table
[1; A; 3]                   # syntax error; A isn't a literal

[A; 2; 3]                   # 3-element lists
[, 1; A; 3]
[, 1; 2; 3]
Adverbs

Adverbs are higher-order functions; they take one or more noun or verb arguments and yield a new noun or verb as a result. adverb[arg] applies an adverb; the left bracket must not be preceded by whitespace. The unapplied adverb is named with empty brackets, as in adverb[]. Adverbs that produce verbs are named by lowercased identifiers.

# 'flip[]' takes a binary verb and reverses its arguments
3 flip[-] 2                 # ===> -1

# bind an alias to flip[]
flop[] = flip[]
3 flop[-] 2                 # ===> -1

# 'fold[]' takes a binary verb and yields a unary verb applying the verb
# between all the elements of a list
[1 2 3] fold[+]             # ===> 6

# 'case[]' applies one of a list of unary verbs to each element of its right argument
# selected by values from its left argument
[1 2 0] case[x + 1; x - 1; x * 2] [1 2 3]   # ===> [0 4 2]

Multiple-argument adverbs have the form a[arg][arg] or a[arg]-b[arg], without whitespace between the brackets and identifiers.

# 'do[f]-while[g]' applies 'f' repeatedly while 'g' returns nonzero.

# Find the first Fibonacci number greater than 100
[1 1] do[x @ 1; fold(+)]-while[x @ 1, <= 100], @ 1  # ===> 144

A partially-applied adverb can be referred to by leaving one or more of the argument brackets empty.

inc-while[] = do[x + 1]-while[]
while-odd[] = do[]-while[x % 2, != 0]

1.5 inc-while[x < 4]        # ===> 4.5
39 while-odd[x floor[/] 2]  # ===> 4

Unapplied or partially applied adverbs may also be used in verb composition or fork expressions to give new adverbs. The unapplied parameters become parameters of the result adverb in source order.

foldsquares[] = sq fold[]
fold[]-minus[] = fold[] - fold[]

[1 2 3] foldsquares[+]      # ===> 14
[2 3 4] fold[*]-minus[+]    # ===> 24 - 9 ===> 15

Adverbs may also be named and applied with keyword syntax. X a[f]: Y b: Z c: W applies the adverb a[]:b:c with the verb argument f, then applies the resulting verb to X, Y, Z, and W. Adverb arguments and keywords may be interspersed freely in the adverb name, for example, X a[f]-b[g]: Y c: Z d[h]: W applies a[]-b[]:c:d[].

Adverbs that produce nouns are named by capitalized identifiers. These can be used to query properties of verbs or nouns.

# 'Rank' yields the rank(s) of a verb
Rank[+]     # ===> [0 0]
Rank[x]     # ===> 1/0
Rank[is]    # ===> [1/0 1/0]

# 'Name' yields the name of a verb as a string
Name[sqrt]  # ===> 'sqrt'

# 'Help' yields the docstring for a verb or noun
Help[sqrt]  # ===> 'Outputs the square root yadda yadda yadda'
Bindings

= binds variables. It has lower precedence than verb application, ,, keywords, and ;.

# Noun binding
Pi = 1 atan, * 4

# Verb binding
sq = x * x
hypot = sq, +, sqrt

# Adverb binding
flop[] = flip[]
inc-while[] = do[x + 1]-while[]

The left-hand side may look like a list of identifiers to bind the destructured elements of a list. The special name . can be bound to discard a value.

One; Two; Three = [1 2 3]
One; .; Three = [1 2 3]

The left-hand side may also look like a verb application to bind a verb with explicit named arguments. The argument names are lexically scoped to the right side of the binding.

# Explicit definitions of the above tacit bindings
X sq = X * X
X hypot Y = [X sq] + [Y sq], sqrt

# Define keyword verb pow:mod
Base pow: Exp mod: Mod = Base ^ Exp, % Mod

Verb arguments may look like lists to destructure a list argument.

[A;B;C] quadratic = B neg, [+;-] [B sq, - [4 * A, * C], sqrt] /: 2 * A

[2 0 -8] quadratic  # ===> [2 -2]

The left-hand side may also look like an adverb application, to bind an adverb with explicit arguments. A verb-producing adverb may have explicit or implicit verb arguments.

# Noun-producing adverb
Arity[] = Rank[] length

# Noun-producing adverb, explicit adverb argument
Arity[f] = Rank[f] length

# Verb-producing adverb, implicit verb arguments
ofsquares[f] = sq f
3 ofsquares[+] 4    # ===> 25

# Verb-producing adverb, explicit verb arguments
X flip[f] Y = Y f X
3 flip[/] 1         # ===> 1/3

Bindings are lexically scoped. A noun binding's scope begins from the line after the binding; a noun name may thus be rebound referencing the previous value bound to that name.

X = 1
X print!    # prints: 1
X = X + 1
X print!    # prints: 2

Verb bindings are lexically scoped beginning from the binding itself, so that verbs may reference themselves recursively.

X factorial = X == 0 if: 1 else: X * [X - 1, factorial]

If a verb or adverb definition uses impure verbs (named ending in !), the new verb or adverb must also be marked impure with a name ending in !. "Pure" adverbs may however be applied to impure verb arguments; a[f!] will result in an impure verb.

X spam! =
    X print!
    X spam!

show![f] = f print!

1 show![+] 2        # prints: 3

[1 2 3 4] case[(); print!] [1 0 1 0]

Binding expressions self-evaluate to the bound value. A variable bound in a subexpression is lexically scoped to the line it appears in and can be used within the same expression.

X foo Y = [(X1 = X + 1) *: Y + 1] / [X1 *: Y - 1]

Binding . as a verb with explicit arguments can thus be used as a lambda expression.

[1 2 3] case[(X . = X + 1); (X . = X - 1)] [0 1 0]  # ===> [2 1 4]

Indentation

A binding = sign may be followed by an indent to introduce a multiline definition. All subsequent lines up to the matching outdent become part of the definition. The last line becomes the final value of the binding for a noun binding, or the result of a verb binding. Bindings within the definition are lexically scoped to the definition.

Pi =
    Pi/4 = 1 atan
    Pi/4 * 4

[A;B;C] quadratic =
    D = B sq, - [4 * A, * C], sqrt
    B neg, [+;-] D /: 2 * A

X totient =
    Factors; Powers = X factor
    Factors - 1, * [Factors ^: Powers - 1], product

An indented definition is ended either by an outdent or by a closing bracket if the definition is inside of a bracketed expression.

X foo =
    Bar     # definition goes from here...
    Bas
    Zim
    Zang    # ...to here
Zung

2 [X bar Y =
    Bar                     # indentation goes from here...
    Bas
    Zim] 3 + 4 bar 5        # ...to closing bracket

Indentation after a non-= token acts as a line continuation.

'foo.bin' read!, sha256, print!

# equivalent
'foo.bin' read!,
    sha256,
    print!

Multi-argument adverb applications may also be split over indented lines.

39 do[x floor[/] 2]-while[x % 2, != 0]

# equivalent
39 do[x floor[/] 2]
    -while[x % 2, != 0]

Directives

Directives are special lines that have syntax and semantics of their own. Directives all start with a name of the form .foo; new directives may be added by future versions of the language.

.import and .from

The .import <module> directive imports a module into the current lexical scope. Bindings in the module can be referenced by qualified name.

.import math.constants
area = x * math.constants.Pi, ^ 2

The .from <module> import <names> directive imports bindings from a module directly into the current lexical scope. .from <module> import . imports all exported bindings from <module> into the current scope.

.from math.constants import Pi, E
area = x sq, * Pi
polar = a * [E ^ [b * 0+j1]]
.module

The .module <module> exports <names> directive names a module, and lists the publicly exported names in the module. Without a .module directive a source file may not be imported as a module, and only exported names may be imported.

.module math.constants exports Pi, E

Pi = 3.14159
E = 2.71828

Semantics

Execution order

With few exceptions, fml is pure functional and referentially transparent. The implementation uses an execution planner (similar to the query planner used by many SQL databases), which may reorder, combine, or eliminate calculations, and copy or share storage among values as it sees fit. Computation is only guaranteed to occur if an externally-visible result depends on it. In this respect fml is similar to lazy languages such as Haskell. However, fml is not a true lazy language, because it also does not generally guarantee that unobserved computations will not be executed, so array computations that contain unused infinite-time or -space elements cannot be reliably expressed. However, branching computations will short-circuit; X? if: Y else: Z will not evaluate its Y argument if X? is all-zero or its Z argument if X? is all-nonzero. Recursive expressions with terminating cases may thus be safely expressed.

Some verbs and adverbs are provided that act as hints to the execution planner. They do not change the result of the computation, but change how the planner arrives at the result and may thereby affect time and space complexity.

Impure operations

The primary exception to fml's purity is impure verbs, whose names must end with !. Impure verbs act as sequence points; they are guaranteed to be executed even if their outputs are discarded, and will execute relative to each other in source order. Within expressions, side-effecting arguments are executed depth-first, left-to-right.

. discard . = []
. f! . = 'F' print!

# print ABCDEF
. abcdef! =
    'A' print!
    'B' print! discard: 'C' print!
    'D' print! f!: 'E' print!

When an impure verb is applied over an array, it applies to cells in row-major order, least index to greatest index.

# print!/1 applies 'print!' to each rank-1 cell (see "Verb rank" below)
# This prints '[1 2 3]', then '[4 5 6]', then '[7 8 9]'
[1 2 3
 4 5 6
 7 8 9] print!/1

Types

fml supports the following scalar types:

  • Number types:
    • 1-bit booleans
    • 8-, 16-, 32-, and 64-bit signed integers
    • 32-bit single-precision and 64-bit double-precision floating-point numbers, in real, imaginary, and complex flavors
    • Arbitrary-precision integers and rationals
  • The character type, representing a single Unicode code point. A character is represented by a single-code-point string literal, such as 'x' or 'é'. Strings are just lists of code points.
  • The pointer type, used to reference foreign resources. These do not have a literal representation.

Numeric calculations normally give the result type that best accurately contains the result. Integer results will be machine-word sized if possible (that is, 32-bit on 32 bit hosts and 64-bit on 64-bit hosts), spilling into 64-bit or arbitrary precision if necessary. Results can be truncated into a smaller type by composing a bitwise-and or modulus operation onto the calculations. Results known to only be 0 or 1 will be reduced to boolean type.

10 ^ 0              # 1-bit
10 ^ 8              # a 32- or 64-bit result
10 ^ 100            # a bignum result
10 ^ 100, & 0xFF    # an 8-bit result

Irrational and transcendental operations such as sqrt will give approximate floating-point results even with arbitrary-precision inputs. The adverbs floor, trunc and ceil give accurate integer versions of many of these operations.

10 ^ 101, sqrt              # a floating-point result
10 ^ 101, floor[sqrt]       # a bignum result

Floating-point results will be double-precision, unless all inputs are single-precision or the result is coerced to single-precision. For performance, fml takes algebraic liberties with floating-point calculations; strict floating-point conformance can be had with the strictfp adverb.

X sort[<], fold[+]                  # the sort may not affect the result in the
                                    # intended way here; algebraically it's redundant
                                    # and could be discarded

X sort[<], fold[strictfp(+)]        # add up list least-to-greatest for best accuracy
                                    # strictfp[+] isn't assumed to be associative so
                                    # the sort is significant

Arrays

Scalars of the same type may be composed into arrays. One-dimensional arrays are also referred to as lists.

[1 2 3]

Strings are arrays of characters.

'hello' is [ 'h' 'e' 'l' 'l' 'o' ]    # ===> 1

Heterogeneous arrays, including arrays of arrays, are supported; internally these arrays have an extra layer of indirection. They must be constructed explicitly using the ; syntax. The printer will display these as a construction expression.

1; 'two'; [4 5 6]   # ===> [, 1; 'two'; [4 5 6]]

Rectangular multidimensional arrays are supported as well. The dimensionality of an array is referred to as its rank. A scalar has rank 0. A list has rank 1. An MxN table has rank 2, a MxNxO table rank 3, and so on.

The in-memory representation of arrays is unspecified. The implementation may choose sparse or dense representations and arbitrary row-major, column-major, or locality-optimized tiled orderings as it sees fit. Impure verbs that read or write arrays to memory or disk will specify the ordering in which they read or write data.

Verb rank

Every verb also has a rank, which determines the rank of the values the verb operates on, and how the verb generalizes when applied to higher-rank values. Verbs cut their inputs up into "cells" of the appropriate rank. For example, a verb of rank 2 will operate once over the entirety of a rank-2 table, or over every rank-2 cell of a 3-or-higher dimensional table. The results of a higher-rank computation are collected into a new result list.

# rank 2 determinant
[A;B;;C;D] det = [A * B] - [C * D]
Rank[det]           # ===> 2

[1 0; 0 1] det      # ===> 1

[1 0; 0 1;; 2 0; 0 2;; 3 2; 2 3] det
                    # ===> [1 4 5]

Higher-arity verbs require agreement in the dimensions of their arguments; the arguments must have a common prefix among their dimensions. For instance, applying a rank-0 verb to two 3-element lists will apply the verb to each corresponding pair of elements, and applying it between a 3-element list and a 3x3 table will pair each element of the list with each row of the table.

X = [1 2 3]
Y = [10 20 30]
X + Y               # Apply + to each corresponding pair of elements in X and Y
                    # ===> [11 22 33]

X + 1               # Apply + to 1 and each element of X
                    # ===> [2 3 4]

Z = [  10   20   30
      100  200  300
     1000 2000 3000]
X + Z               # Apply + to each element of X and the corresponding row in Z
                    # ===> [  11   21   31
                    #        102  202  302
                    #       1003 2003 3003]

Higher-arity verbs may have different ranks for each argument. For instance, a binary verb with left rank M and right rank N will apply each rank-M cell of its left argument to the corresponding rank-N cell of the right.

# rank 2 determinant + rank 0 scalar
M det+ N = M det, + N

Rank[det+]                  # ===> 2 0
[1 0; 0 1] det+ 2           # ===> 3
[1 0; 0 1] det+ [2 3; 4 5]  # ===> [3 4; 5 6]
[1 0; 0 1;; 2 0; 0 2;; 3 2; 2 3] det+ [2 3 4]
                            # ===> [3 7 9]

A verb may have infinite rank, in which case it always operates on the entirety of its input, or negative rank, in which case it operates on ArgumentRank + VerbRank-rank cells of its argument. For instance, the is verb, which compares entire values for equivalence, has infinite rank in both arguments, whereas == has rank 0 and compares elementwise.

Rank[==]            # ===> 0 0
Rank[is]            # ===> 1/0 1/0

[1 2 3] is [1 2 3]  # ===> 1
[1 2 3] is [1 2 4]  # ===> 0
[1 2 3] is [1 1 1
            2 2 2
            3 3 3]  # ===> 0

[1 2 3] == [1 2 3]  # ===> [1 1 1]
[1 2 3] == [1 2 4]  # ===> [1 1 0]
[1 2 3] == [1 1 1
            2 2 2
            3 3 3]  # ===> [1 1 1; 1 1 1; 1 1 1]

The rank of a verb may be modified. verb/N produces a variant of verb that operates on N-cells of its argument(s); if N is a list, each element of N specifies the rank of the cells for the corresponding argument. For instance, * can be modified to apply each element of its right argument to its entire left argument as follows:

[1 2 3 4] */[1/0 0] .       # ===> [ 1  2  3  4
                            #        2  4  6  8
                            #        3  6  9 12
                            #        4  8 12 16]

The order in which pure verbs are applied to cells is unspecified. Impure verbs are applied to cells in row-major order, least index to greatest.

Errata

  • Are non-strict, non-lazy semantics too slushy? "Lazy" and "strict" verb modifiers? Additional order-of-operations guarantees needed besides if:else shortcircuiting?
  • Maybe too many weird rules for when whitespace between tokens is significant and when certain characters have meaning in identifiers
  • Overloading [;] for list literals, parens, and list construction is goofy
  • Shorthand for common HOFs like flip, fold?
  • , for nesting and ; for lists is a bit hard to read without syntax highlighting because , and ; both visually register below the baseline. At a glance, how many elements in the list 1 + 2, * 3; 4 * 5, sqrt, floor?
  • , looks alright with named verbs, like X foo, bar Y, bas, but looks a bit noisy with math symbols like X + Y, * Z. X + Y * Z also syntactically tries to create a fork, and requires semantic knowledge of + to catch the error
  • Would the relative precedence of keywords and , be better with , lower?
  • Finer-grained effect types than pure/impure?
  • What is a module?
  • Construct to run impure code in a pure context, such as log or observe?
  • Lists inside adverb brackets? foo[1 2 3]
  • Some national keyboard layouts lack #, also it's a shifted character, so it violates the "no shifted symbols; no US-only symbols" rules
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment