Skip to content

Instantly share code, notes, and snippets.

State and Side-Effects in Mechanical

I don't like:

  • control flow
    • it confuses perceptible time ("the user clicked on something and then the thing finished downloading") and control flow "time" ("this function is called and then this function is called"). This is confusing for end-user programmers because control flow "time" is perceived as instantaneous to end-users (until you hit a blocking or asynchronous function call)
  • RxObservables/cycle.js-style stream-based functional-reactive programming (aka denotative continuous-time programming)
    • I don't think it's natural to think of like, a textbox, as its initial value and then every keypress or toolbar button click that will change its contents. I think it's much more natural to think, okay when this toolbar button gets clicked, what changes
  • the very example that Steve Krouse uses to illustrate DCTP/FRP is described as: "When either in

Paranoid JS Sandbox in the browser

A design for a paranoid, secure-by-default, defense-in-depth JS sandbox.

3 different, complementary layers:

  • Web Worker with CSP blocking network requests, in a cross-domain iframe
  • everything but an allowlist is deleted from global scope in the worker and the iframe
  • transpilation & static analysis that intercepts access to globals (and intercepting calls to the function constructor)

These are very complementary: the first layer is simple and hard for us to get wrong, but reliant on the browser not get wrong; the second layer is hard for the browser to get wrong, but reliant on us not to get it wrong, at runtime; the third and final layer is really hard for the browser to get wrong, but really reliant on us not to miss anything.

<style>
body {
font-family: monospace;
font-size: 16px;
color: #202122;
white-space: pre-wrap;
}
.nibble-sep {
margin: 0 -.15em;
font-size: 0.8em;

TL;DR: You can simulate 8-bit safe "bytestrings" in JavaScript by restricting each character to code points U+0000 to U+00FF, and there are even tricks to easily convert between ordinary UTF-16 JS strings and UTF-8 encoded as such a JavaScript "bytestring".

  • one weird trick to convert between ordinary UTF-16 JS strings and "bytestrings", using the deprecated function unescape():
    • binary_str = unescape(encodeURI(utf16_str))
    • utf16_str = decodeURIComponent(escape(binary_str))
    • the way this works is, encodeURI()/encodeURIComponent() first convert to UTF-8, then percent-encode each byte into 3 ASCII characters
    • whereas escape() directly percent-encodes each UTF-16 code unit into 3 or 6 characters, depending on whether it's >U+00FF or not
    • so unescape(encodeURI(utf16_str)) is parsing the 3-character percent-encoded byte values as code points, producing "bytestrings" of code points all U+0000 to U+00FF
  • and decodeURIComponent(escape(binary_str)) is parsing the 3-character per

Borrow Checking in Mechanical

Borrow checking serves two purposes in Mechanical. For compiling to JS, the only purpose is to detect when data structures like records or arrays can be mutated in-place rather than copied. In the future when we compile to WebAssembly or native, the second purpose will be for "compile-time reference counting", like Lobster's memory management model.

Sketch:

  • part of the inferred signature of a function, in addition to the expected type of each argument, is the "ownership level" the function needs over each argument
  • there are 5 levels of ownership:
    • unused means the argument is never used by the function never, or it's only used in pure expressions whose values are never used
  • borrow means the function just needs temporary read access. It's deduced when all operations on the argument are borrow operations, such as reading a scalar field on a record argument,

Assert vs Check vs Enforce vs Constrain in Mechanical

Check, Assert, and Enforce are all runtime error checks, but with very different semantics:

  • Check statements let you check for runtime error conditions and early-return, usually #err _ or #none.
  • Assert statements let you check for runtime error conditions and crash, to impose function preconditions or declare code invariants.
  • Enforce statements also check for runtime error conditions and crash, to enforce operational requirements of your program.

Check

Check statements let you early-return #err values:

@laughinghan
laughinghan / SP500.csv
Last active July 29, 2021 18:06
S&P 500 index daily price data, 1927-12-30 thru 2021-02-05, from Yahoo Finance (^GSPC)
We can't make this file beautiful and searchable because it's too large.
Date,Open,High,Low,Close,Adj Close,Volume
1927-12-30,17.660000,17.660000,17.660000,17.660000,17.660000,0
1928-01-03,17.760000,17.760000,17.760000,17.760000,17.760000,0
1928-01-04,17.719999,17.719999,17.719999,17.719999,17.719999,0
1928-01-05,17.549999,17.549999,17.549999,17.549999,17.549999,0
1928-01-06,17.660000,17.660000,17.660000,17.660000,17.660000,0
1928-01-09,17.500000,17.500000,17.500000,17.500000,17.500000,0
1928-01-10,17.370001,17.370001,17.370001,17.370001,17.370001,0
1928-01-11,17.350000,17.350000,17.350000,17.350000,17.350000,0
1928-01-12,17.469999,17.469999,17.469999,17.469999,17.469999,0

Zen of Mechanical

Ergonomics is everything.

Typechecking is ergonomics.

Compiler error messages is ergonomics.

Runtime performance is ergonomics.

(WIP) Error Handling in Mechanical

Appendix: Survey of Error Handling in Other Languages

  • checked exceptions: Java
    • the most infamous is Java's widely hated checked exceptions. No one else does this afaik
    • good:
      • you can code the "happy path" in a "direct style", and handle abnormal situations separately
      • function signatures tell you exactly what exceptions can happen (except for the carve-out, see below)
      • checked exceptions make it a little harder to forget to use try/finally
  • try-with-resources statement copies Python's lauded with-statement

Two Ways to do Interactive Double-Blind Trust Token Redemption

By "double-blind" I mean that not only does the issuer not know who the client is, but the server doesn't know exactly which issuers the client redeemed trust tokens from.

Summary: Two ways for an untrusted client to prove to an untrusted server that the client is trusted by X out of Y trust token issuers, without revealing exactly which issuers:

  • One way uses hash functions as its only crypto primitive, but requires revealing some of the issuers to the server, which is a leak of fingerprintable info. It's also probabilistic, and the chances of a malicious client's lie being caught depends on the degree of the lie as well as how many issuers are revealed to the server, which is an undesirable tradeoff.
  • The other way, if it works, reveals none of the issuers to the server, and the client cannot lie at all. But this assumes I understand elliptic curve cryptography, which I learned entirely