Skip to content

Instantly share code, notes, and snippets.

@amcgregor
Created December 5, 2014 10:48
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save amcgregor/016098f96a687a6738a8 to your computer and use it in GitHub Desktop.
Save amcgregor/016098f96a687a6738a8 to your computer and use it in GitHub Desktop.
Clueless, the meta-programming programming language. WIP documentation.

Clueless

A meta-programming language that uses itself to define itself.

Designed to explore the minimum required functionality for a viable general-purpose language, Clueless is peculiar in that the Core of the language defines very little. By itself it’s not very useful, similar to attempting to use Brainfuck to solve real-world problems. Where Clueless excels, however, is in modifying its own grammar and behaviour during runtime to emulate the structures of other languages.

It’s designed for extreme consistency of use, and modelled after Lua and Python.

  • One statement per line. Period. Semicolons in examples are to illustrate statement separation in inline code and should be mentally replaced with newlines.
  • All source files are UTF-8.
    • Strings are UTF-8 text; and slicing works on logical characters, so may return multiple “bytes” even if one character is requested.
    • Normalized to compound accent modifiers into their characters.
    • Byte strings are available, of course, and conversion between them using various encodings available.
  • It’s “true” and “false”, damn it. They aren’t classes. Boolean is.
  • Indentation denotes scope, changes in which define code blocks.
    • The One True Indentation™ is tabs. Spaces won’t work.
    • Unless prefixed with a function call or macro expecting a block argument, blocks are executed immediately upon definition.
      • Such “bare blocks” create temporary scopes which are thrown away after execution; changes to local variables within one are not propagated up.
    • Certain macros immediately execute their block arguments, others defer execution, some might not ever execute their block. The power of macros. (Which all “reserved words” are.)
    • If no assignments are detected in a block, no scope is actually created.
  • Statements are dependency-graphed (i.e. a call to “io.stdout.puts(name)” is known to call “.asString” on its arguments and access the “.write()” method of the underlying file object) and can be re-ordered based on this information.
    • Some method calls are explicitly linear; i.e. “.write()” on a file; this ensures the dependency grapher orders serial operations correctly.
    • Enable verbose debugging to see the grapher at work during the compilation phase.
    • Declared requirements can happen anywhere in a module, and while all requirements are evaluated when the module is evaluated, regardless of position in the code, only those nested scopes are updated to include the reference. (No surprise compilation during runtime.)
      • This means macro definitions will always effect the whole module. (The parser is “re-set” for each module.)
    • Because the graphing creates sets of non-conflicting statements (i.e. “x=1;y=2;puts(x, y)” creates two sub-blocks: [x=1, y=2] and [puts(x, y)]) it naturally produces groups of statements that can be run in parallel without ever encountering locking issues.
    • Re-ordering (other than requirements) only happens if optimizations are enabled, and each statement remembers its original line to produce readable tracebacks.
    • This will confuse your interactive debuggers, though!
  • Blocks passed as arguments to functions/macros begin their life “paused”. Generator blocks pause when yielding values, and blocks become paused when raising exceptions. Paused blocks can have exceptions injected into them, their scope directly manipulated, and may be resumed.
    • Raising an exception isn’t a death-knell.
    • In fact, exceptions are basically how the whole thing works.
  • Clueless makes an automatic distinction between procedures (no argument calls) and functions (calls with arguments).
    • Procedures are executed when referenced through normal means. (I.e. via obj.example, not via obj..get(‘example’) or obj[‘example’].)

Glossary

[Needs sorting. Meh; later. - Ed.]

  • function — a deferred block that expects arguments.
  • procedure — a deferred block that expects no arguments, or a function whose arguments are optional, when executed, that had no arguments passed to it. Similar to automatic @procedure decoration, to borrow Python terms.
  • callable — a reference to any function or procedure, or to any object with a __call__ method.
  • block — an object encapsulating a series of ordered, potentially branching statements.
  • scope — an object representing the execution context of a block.
  • exception — an object that pauses normal execution.
  • track — a subset of statements from a block that can be executed without conflict to other tracks.
  • generator — a reentrant callable, generally using the “yield” macro or other exception-bouncing techniques. Generators participate in the iterator protocol, but can also be used to implement coroutines and stack-free, asynchronous cooperative multitasking.
  • iterator — any object conforming to the .asIterator and .next protocol. (A call to .asIterator must return an object with a .next procedure, and optionally a .length attribute or procedure. May be self.)
  • public — an attribute that may be accessed by any other code, and is included when iterating an object.
  • private — an attribute that may only be accessed (and is only visible) within the same object scope.
  • protected — an attribute that may only be accessed within the same object scope, and is inherited by children.
  • internal — an attribute hidden from object iteration and moved into the “enclosing object” (..).

Command-Line Usage

clue[comp,less] [-dEhiuvVx] [-m module | -c command | script | -] [--] [args]

| -d | Enable diagnostic output and conditionals. (__debug__) | | -E | Ignore environment variables. | | -h | Display invocation information, then exit. | | -i | Continue running interactively after executing an entry point. | | -u | Unbuffered binary stdin, stdout, stderr. | | -v | Enable verbose output. (__verbose__) | | -V | Display version and build information, then exit. | | -x | Skip the first line of source to allow non-UNIX #! marking. | | -m module | Search for this module or entry point and execute as a script. | | -c command | Terminate the interpreter argument list and specify the command to execute. | | script | The filename of a script to execute. | | - | The default file to execute: stdin. | | -- | Terminate the argument list. The interpreter will no longer attempt to parse arguments beyond this point, leaving them to the script being run. | | args | Additional arguments for the script being run to utilize. |

Environment

| CLU_PREFIX | Execution prefix. | | CLU_PATH | Module search path. | | CLU_INIT | Script to be executed on startup, prior to execution of the entry point. | | CLU_ARGS | Default arguments to the interpreter. |

Keywords

(These are actually all macros.)

  • expr and expr — boolean mutual requirement.
  • except expr [as name] — exception capture.
  • finally — execute a block regardless of exception state.
  • forever — an infinite loop structure taking a scope as an argument to execute… forever.
  • global name [name…] — mark the given local-scope variable as write-back. Changes will propagate to parent scopes. (Technically just deletes a shadow value.)
  • expr [not] in expr — boolean membership. The right-hand expression must be iterable or participate in the __contains__ protocol.
  • expr is [a] expr — literal identity or instance membership comparison.
  • not expr — boolean inversion.
  • expr or expr — boolean “any” requirement.
  • pass — do nothing. Used only to populate a block that has no other statements.
  • try — begin an exception capture scope.
  • require spec — execute a module and make it accessible in the current scope, or pull objects from a module into the current scope. Behaviour determined by the format of the specification. I.e. “foo.bar” vs. “foo.bar:baz” respectively.
  • with expr [as name] — scope population and context manager macro. (If the object the expression evaluates to has .__enter__ and .__exit__ methods, they will be called as appropriate. When not using “as”, the object evaluated becomes the scope, when using “as” the object is assigned to that local-scope variable.)

style.functional

  • break — raise a StopIteration exception.
  • continue — raise a ContinueIteration exception.
  • else [if expr] — conditional code branching. Does not actually need to immediately follow an if; execution will be determined based on the previous conditional block within the scope, whatever that may be. (If the previous conditional block ran, an “else if” block will not, regardless of intermediary statements. If debugging is enabled, this will generate a warning, though.)
  • for name in expr — loop over an iterator/generator and populate the nested block’s scope for each iteration.
  • define name[(argspec)] [returns expr] — a named deferred block, optionally accepting arguments.
  • if expr — conditional execution of a block.
  • return expr — raise a ReturnValue exception.
  • while expr — conditional loop.
  • [name =] yield [expr] — raise a re-entrant YieldValue exception and handle continuation, optionally with reverse-injected values and exceptions.
require io

if io.stderr
	io.stderr.puts(“Well, this file is open.”)

io.stdout.puts(“This should always work.”)

else  # Perfectly valid.  Mind blown?
	io.stdout.puts(“Wat.”)

style.oo

  • class — define a class using the provided scope whose block is immediately executed.
  • [internal] [public|private|protected] [instance|class|static] method — a named deferred block, optionally accepting arguments that also optionally participates in name mangling rules. Default if omitted: public, instance. (This macro handles instance binding and other goodness.)

Notable Non-Keywords

For any of these you could easily write a two-line macro.

  • assert — use .asBoolean.assert instead.
  • del — use this..delete() instead.
  • exec — use this..eval() instead.
  • raise — use Exception.raise instead; see also the Block.raise() method to inject exceptions into paused blocks.

Identifiers

(letter | “_”) (letter | digit | “_” | “:”)*
  • Case-sensitive.
  • Double-underscore (underscore) surrounded identifiers typically refer to “semi-private” or “system-defined” identifiers.

String Literals

  • Forms:
    • “double quoted” — used for natural language text, typically displayed to a user at some point. Produces text strings.
    • ’single quoted’ — used for machine text, typically not displayed to end users. Produces UTF-8 encoded byte strings.
    • “””triple quoted””” — used for multi-line text strings.
    • ‘’’triple quoted’’’ — as per the multi-line text strings, but byte strings instead.
  • String literals may have prefixes:
    • r — raw; do not evaluate escape sequences.
  • Multiple consecutive string literals are treated as one.
  • Text strings and byte strings can not be combined in coercive ways. Conversions between these two types must be explicit.

Escape Sequences

  • \ — Backslash ()
  • \e — Escape (ESC)
  • \v — Vertical Tab (VT)
  • ' — Single quote (')
  • \f — Formfeed (FF)
  • \ooo — char with octal value ooo
  • \” — Double quote (")
  • \n — Linefeed (LF)
  • \a — Bell (BEL)
  • \r — Carriage Return (CR)
  • \xhh — char with hex value hh
  • \b — Backspace (BS)
  • \t — Horizontal Tab (TAB)
  • \ux — Character with hexadecimal unicode code point x.
  • \Nn — Character named n in the Unicode database. E.g. u'\N{Greek Small Letter Pi}' <=> u'\u03c0'.
  • \any — left as-is, including the backslash, e.g. '\z' == '\z'

Logical Constants

  • undefined (Null singleton) — useful as a placeholder where none is a valid value, and when returned from certain system-defined magic methods will result in reflection of those methods.
  • none (Null singleton)
  • true (Boolean singleton)
  • false (Boolean singleton)

Numbers

  • Decimal Integer: 1234, 12345678901234567890 (unlimited precision)
  • Binary Integer: 0b101101
  • Octal Integer: 0o177
  • Duodecimal Integer: 0d40B
  • Hex Integer: 0xFFEE
  • Double-Precision Float: 3.14e-10, .001, 10., 1E3
  • Complex: 1j, 4+5j

Floating-point is viral, and integer division performs standardized rounding to remain integer.

Both pi and tau have constants in the “math” module. (Use of tau is encouraged to remove extraneous multiplication and division from most formulae.)

Sequences

Some types are mutable, most are not. Some offer both versions.

  • Strings. Immutable. String constants from code are interned, generated strings are not by default. (Ref: .intern procedure.) Byte strings iterate the bytes, text strings iterate characters. Being immutable, substrings of interned strings duplicate no underlying data.
    • String constants are bulk loaded during module compilation.
    • This interning means these strings actually directly reference ranges in the loaded source file.
  • Tuples. Effectively an immutable list. These are always interned an must contain hashable types. Use parenthesis to denote these, with comma-separated values, optionally with names. Tuples of pure constant values (strings, integers, other tuples) are bulk-loaded with the string constants.
    • Interesting fact: the first 1025 integers are pre-interned on startup to optimize short range loops.
  • Lists. Mutable and order-preserving. Use square brackets, comma-separated.
  • Objects. Mutable and not order-preserving. Use curly braces with comma-separated identifier=expr pairs. Keys must be hashable. (Also called “mappings” and “dictionaries” in other languages.) Iterating an object will produce its public keys in hash-order. (Also why protected is useful.)
  • Sets. Lockable. Use curly braces with comma-separated expressions. (.freeze and .thaw procedures manage mutability.) Immutable sets are interned.
  • Ranges. Immutable, transparently acts like a set. Use curly braces with two numeric values separated by .. or the unicode characters, i.e. {1…27}. Inclusive. (Both 1 and 27 are included in that pseudo-set.) Efficient iteration due to calculation of values, not pre-allocation. Thawing will generate the complete set at once.
  • Files. Iterating a file will return, depending on opening mode, the lines of text from the file or the bytes of the file; the latter only optionally using block-size buffering as the former requires it. (Explicitly enabling buffering will iterate buffer-sized chunks.)

Type Hierarchy

  • Object
    • Operator
    • Scope
    • Block
      • Procedure
      • Function
      • Method
    • Code
    • Frame
    • File
    • Module
    • Iterator
    • Type
      • Null (factory for none and undefined)
      • Boolean
      • String
      • Text
      • Number
        • Complex (exposes .real and .imaginary)
          • Real (abstract specialization, matches fixed and floats)
            • Rational (abstract specialization)
              • Integral (abstract specialization)
      • Container
        • Tuple
        • List
        • Set
        • Range (.minimum, .maximum, .step)

Magic Methods

These are internal. (Remember: accessed via ..)

  • class method new(…)
  • method init(…)
  • method destroy
  • method repr
  • method hash

Casting

These are not internal.

  • method asText
  • method asString
  • method asNumber
  • method asBoolean
  • method asIterator

Comparison

All of these take other as the sole argument, and are internal. Only a minor subset need to be defined—lt and eq, specifically—the rest model their behaviour on these by default.

  • method instanceOf
  • method lt
  • method le
  • method eq
  • method ne
  • method gt
  • method ge

Attribute Access

These are internal.

  • method get(name)
  • method set(name, value)
  • method unset(name)
  • method default(name) — used if key is missing.

Descriptors

These are internal.

  • method load(instance, parent)
  • method store(instance, value)
  • method delete(instance)

Containers

These are internal.

  • method length
  • method retrieve(name)
  • method update(name, value) — update with undefined to delete a value.
  • method contains(name)
@amcgregor
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment