Disclaimer: I have no idea what I'm doing. This gist is a scratchpad for me to hopefully one day make sense of things. Expect nonsensical notes jotted on this page.

  • Purely functional
  • Dependent types like Idris? But with the ease and readability of refinement types with external solver (e.g. SMT). More on readability: see structured proofs "Intelligible semi-automated reasoning" (Isar) https://courses.grainger.illinois.edu/cs576/sp2015/doc/isar-overview.pdf https://isabelle.in.tum.de/Isar/
  • Should be gradually typed (https://arxiv.org/pdf/1906.06469.pdf), but we can annotate a function to forbid using any gradual typing.
  • Runtime checks for imprecise gradually typed programs should also offer a mechanism for the programmer to choose between fatal program termination (like a runtime assert) vs graceful, program-aware handling of the runtime error (i.e. with Either/Result/Maybe types). I guess in the latter case it just becomes an explicit fallible runtime cast. Can gradual typing just be syntax sugar over strong typing? It can work for passing fully untyped values into typed functions via explicit runtime-checked casting, but I'm not sure how it would deal with partially untyped values (e.g. Vec Nat ?) and functions that take a type as their argument.
  • High level of reusability. Gradual typing allows List a = Vec a ?. How does the following paper address reusability? https://dl.acm.org/doi/pdf/10.1145/3236799
xs : vec Nat 10 -- Vector xs with length value known at compile time

i <- cast $ getstr
xs : vec Nat i -- Vector xs with runtime length symbolically known at compile time

exists i such that xs : vec Nat i -- Vector xs with runtime length symbolically known at compile time. Existential type (dependent sum).

xs : vec Nat ? -- Vector xs with length dynamically checked at runtime (just like your standard List in haskell)

xs : ? -- Fully untyped variable xs.
  • Syntax should allow interleaving arguments inside the function name like Objective-C (so it is much easier to decipher what each positional argument means, making the program read more like English). Note that it is syntax sugar, so the non-interleaved naming still works for passing functions as arguments into higher order functions, and possibly for currying/partial application. Perhaps this makes infix operators redundant (at the cost of parens) and also makes Uniform Function Call Syntax redundant?
  • Support for linear/uniqueness types with rust-like control
  • Parametricity and theorems for free, but allow this to be specified, i.e. parametric vs ad-hoc polymorphism, coloured type systems like https://jyp.github.io/pdf/Doc-Revised.pdf https://jyp.github.io/pdf/CCCC.pdf. TODO: read more about parametricity in https://bartoszmilewski.com/2014/09/22/parametricity-money-for-nothing-and-theorems-for-free/ and https://vimeo.com/16541785
  • Cater to metaprogramming and embedding DSLs. In particular, see deeper shallow embeddings: https://drops.dagstuhl.de/opus/volltexte/2022/16737/pdf/LIPIcs-ITP-2022-28.pdf
  • Specify memory allocation/management strategy (stack, heap, manual, garbage collector, compile-time reference counting, runtime reference counting, arena, custom allocator)
  • Low/zero-cost abstractions, where performant code is idiomatic code
  • Encourages and allows easy to reason about register and cache friendly data structures
  • wasm backend, arm backend, FPGA/ASIC backend, webgpu/metal/vulkan backend
  • ABI?
  • optimizations can be explicit and checked so that they're predictable: e.g. erasure, in-place mutation, tail-call optimization, small-string optimization, null-pointer optimization etc.
  • support for powerful embedded DSLs, and potentially type-safe syntactic transformations or elaborators or macros that are still IDE-friendly (i.e. can be incrementalized and can give good error messaging)
  • Optimizations (e.g. the above two points) are expressed within the language itself, are extensible/customizable, are verified, etc. They can either be a refinement of the canonical/oracle implementation, or a syntactic transformation of the canonical/oracle implementation.
  • Provide some way to reason about time complexity: see implicit computational complexity https://lipn.univ-paris13.fr/~baillot/GEOCAL06/ICCworkshop.html and http://cs.unibo.it/~dallago/FICQRA/esslli.pdf
  • Linting can maybe be expressed in the language itself too, using similar technologies as optimization
  • interpretable as well as compilable, with interactive repl
  • support for language server protocol, language server index format, equational-reasoning based semantic code search, semantic git diffing
  • nix-based reproducible and highly incremental build systems
  • semver checks and semver-respecting language features (i.e. consider typeclass coherence, orphan rule, overlap check, etc)
  • great error messages with hints, diagnostics, and suggestions.
  • good support for debugging - source mapping, stack unwinding, watchdog hang detection, crash reporting, sampling, spin dumping, instrumentation, respecting application provided thread names, and during stack unwinding, recover method name, file name, line number, module (if dynamically loaded), etc.
  • inline documentation with annotated parameters, markdown syntax, and rust-like testable code blocks
  • global documentation database that is searchable by both content and type signature - search using a query language like sourcegraph, jira, or the one based on equational reasoning.
  • Floating point, fixed point, big integers, Fin (i.e. bounded integers), wrapped integers, signed and unsigned, pointer, etc.
  • Full control over strings and proper Unicode string encoding management. The programmer is made aware of all the different ways to do things, e.g. compare strings by appearance, by code point, or by canonical form; sort by code point, with locale-aware ordering, or with locale-specific orderings that have very weird cultural edge cases.
  • Some polymorphic code can be monomorphised at compile time, aka static dispatch, but this can be controlled (e.g. opt out in rust's dyn trait objects using a vtable)
  • Control over datastructure alignment, padding, etc
  • Stack allocated function closures - in Rust, each closure has its own type and so it's easier to box them. Roc seems to be able to infer the types and express the concrete type of the closure, and therefore can store it on the stack. Therefore, an array of closures is automatically treated as an array of a tagged union of the different realized closures. See https://media.handmade-seattle.com/roc-lang/ 44 minutes in. Lambda captures are desugared into extra arguments - an idea from the morphic solver authors - called defunctionalization, or is it lambda lifting? https://en.wikipedia.org/wiki/Lambda_lifting. (A tiny defunctionalization sketch follows after this list.)
  • Fast compile times
  • Auto/implicit parameters similar to Idris, but extended to support contexts and guarantees
    • Contexts include: panic context, alloc context, io context, async context, ffi context etc. This also allows functions to be implicitly parameterized by, e.g., the specific allocator implementation, without caring about it in their high-level type signature.
    • By default, functions are context agnostic, meaning that they are not allowed to directly use these contexts, but they can invoke a function argument that uses these contexts
    • Functions that use these contexts will therefore need to explicitly opt in via their type signature
    • Functions can explicitly opt out of such contexts. E.g., calling this function will never panic, or will never do any system calls that could pre-empt the current thread.
    • Need to consider how that is enforced due to currying/closures, as it may not be as simple as syntax sugar for extra context arguments. E.g., context arguments could be captured in a closure, in which case these closures need a way to indicate that they contain these contexts in their type signature. Perhaps any function that captures a context is promoted into such context itself. In that case, it is more like a partitioning of types, or like auto traits in rust. Contexts are poisonous.
    • How do these relate to the IO monads? Perhaps IO monads are already doing this more elegantly. Do we really need an opt-out mechanism? Probably it is sufficient for a function to be context agnostic, and let the caller enforce whether it really should opt-in or not. In that case, do we still need auto/implicit parameters for these kinds of contexts? Probably not. They can explicitly take in IO monads.
  • Do we want first-class parameterized modules? ML Modules / Scala Modules. Does this help with the idea of contexts above?
  • Secure by default - permissions, capabilities, etc - see Deno -- but that's more for runtime languages, i.e. interpreted ones. It'd be nice to know all the permissions needed at compile time. https://www.youtube.com/watch?v=cpQwtwVKAfU Roc goes about it by having sandboxed platforms - but users can't trust that the program is using such a platform. Users will need some OS or external sandboxing (e.g. VMs, or a wasm runtime) to run the programs. Programmers can manage security by having a language that restricts what third-party dependencies can do. Third-party libraries will need to opt in to things requiring permissions and publicly advertise this in either their package manifest or their type signatures. Can we carry permissions via type signatures?
  • Private vs public dependencies so multiple versions can be used at the same time without conflict. Would this work? Is this a good idea?
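Here is a tiny defunctionalization sketch (my own, in Haskell, not taken from the Roc/morphic material) for the closure bullet above: each lambda becomes a constructor carrying its captured variables, and a single apply function dispatches on the tag, so an array of such closures really is an array of a plain tagged union.

-- Hypothetical example: two "closures" capturing an Int each.
data Closure
  = AddConst Int   -- represents \x -> x + n, capturing n
  | MulBy Int      -- represents \x -> x * n, capturing n

-- The captured variables reappear as ordinary fields of the dispatcher's argument.
apply :: Closure -> Int -> Int
apply (AddConst n) x = x + n
apply (MulBy n)    x = x * n

-- An "array of closures" is now just a list of a first-order data type.
example :: [Int]
example = map (`apply` 10) [AddConst 1, MulBy 3]   -- [11, 30]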

Useful links

There are three reasons for introducing a syntactic abstraction (macro) rather than a function:

  1. new binding forms
  2. implicit quoting or, more generally, a data sub-language
  3. an order of evaluation that is incompatible with eager evaluation

A lazy language makes macros for 3 unnecessary. A language with first-class functions still needs macros for 1; otherwise you keep writing foo (fn x => ...) all over the place.
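To make point 3 concrete, here's a small Haskell illustration (mine, not part of the quoted text): because Haskell is lazy, a control construct like if can be an ordinary function, and no macro is needed to delay evaluation of the branches.

myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

-- Only the chosen branch is ever evaluated, even though both are passed as arguments.
safeDiv :: Int -> Int -> Int
safeDiv x y = myIf (y == 0) 0 (x `div` y)

Point 1 is the one laziness doesn't help with: without a new binding form, you still end up writing foo (fn x => ...) everywhere.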

Ideas I want to challenge or have been successfully challenged

  • Can't have both safety and speed - Look at rust.
  • HDLs can't be synthesizable and high level at the same time - Haskell Clash and other functional HDLs
  • Macros can't have good diagnostics and error messages
  • Functional programming can't have good control over hardware or codegen - todo...
  • Effects are hard in functional programming - IO monads and do notation seem to alleviate this
  • GUI is better suited in OOP than FP - look at Qt bugs, and look at react and elm - are we gui yet in FP?
  • Optimizations (automatic or handwritten) makes code hard to read, maintain and debug
  • Business logic is better suited in OOP than FP - look at how algebraic data types can help better model the problem domain. Potentially the pseudo declarative nature of FP allows business logic to be expressed concisely, making it easier to conform to spec. One argument for OOP (which I haven't really got a chance to validate yet) is that business logic changes and churn through quickly, whereas the domain objects remain relatively stable, and therefore modelling business logic with OOP is easier to adapt to change. I hypothesize a counterargument: It may seem easier to change in OOP, but it is harder to reason about the correctness of the change compared with FP, and while it may require more refactoring in FP, it is much easier to refactor correctly in FP. Resisting refactoring in OOP causes the code to organically grow with non-ideal APIs and lots of technical debt (which is what I have witnessed in the industry so far).
  • Adding dependent types forces programmers to write difficult proofs rather than actual programming - look at Idris, where I think the style of programming is very similar to normal Haskell. TODO: Actually try programming in Idris to learn whether it's the case.
  • TODO...

Questions

  • What is the difference between Peng Fu's self types, Aaron Stump's Cedille (constructor-constrained recursive types), Aaron Stump's {Implicit Products + Dependent Intersections + Heterogeneous equality type}, and insanely dependent types? They all seem to be talking about something that can talk about its own type. https://github.com/gelisam/circular-sig https://github.com/UlfNorell/insane

First class everything?

  • First class functions
  • Higher kinded types
  • Generic associated types / associated type constructors
  • First class types - kinds
  • First class constraints - constraint kinds - parametrize a data type or function with any constraint/type-class, e.g. Dict (Eq Int). (https://www.youtube.com/watch?v=hIZxTQP1ifo describes it as being like implicits without the problems of implicits.) A small Dict sketch follows after this list.
  • First class hot reloading - https://github.com/mun-lang/mun
  • First class syntax sugaring? Creeping into homoiconicity like lisp.
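As a rough reference for the Dict (Eq Int) idea above, here is what reified constraints look like in today's GHC Haskell (a sketch along the lines of the well-known Dict type from the constraints package; my rendering, not from the talk):

{-# LANGUAGE ConstraintKinds, GADTs, KindSignatures, RankNTypes #-}
import Data.Kind (Constraint)

-- A first-class value witnessing that the constraint c holds.
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- Pattern matching on Dict brings the constraint back into scope.
withDict :: Dict c -> (c => r) -> r
withDict Dict r = r

sameInt :: Dict (Eq Int) -> Int -> Int -> Bool
sameInt d x y = withDict d (x == y)

The caller decides which evidence value to pass in, which is roughly the "implicits without the problems of implicits" point.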

Functions take values to values, where values can themselves be functions and types.

Can type be lambda encoded? Closest might be self types.

Can syntax tree be lambda encoded? Yes with scott encoding.
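As a concrete reminder (my own Haskell sketch of the standard Scott encoding): a term is represented by how it selects between one continuation per constructor, which is enough to pattern match and fold.

{-# LANGUAGE RankNTypes #-}

-- A tiny expression tree, Scott-encoded: no data declaration for the tree's shape,
-- just a choice between a "literal" continuation and an "addition" continuation.
newtype Expr = Expr (forall r. (Int -> r) -> (Expr -> Expr -> r) -> r)

lit :: Int -> Expr
lit n = Expr (\onLit _onAdd -> onLit n)

add :: Expr -> Expr -> Expr
add x y = Expr (\_onLit onAdd -> onAdd x y)

-- Evaluating is just supplying the two continuations.
eval :: Expr -> Int
eval (Expr match) = match id (\x y -> eval x + eval y)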

Unified language for runtime, compiletime, build scripting, interpreter.

Other stuff


  • Affine types vs linear types vs uniqueness types vs reference counting?

Smalltalk - everything is an object, minimal powerful language, late binding, add code while debugging, method explorer.

Github search

https://github.com/search?q=systems+functional+programming+language

https://github.com/topics/dependent-types

Syntax for function application is commonly divided into the following:

Firstly: the lack of parens:

fn arg
add 1 2

Secondly: parens around the application (i.e. lisp like s-expressions)

(fn arg)
(add 1 2)

Thirdly: parens around each argument (i.e. common in imperative languages)

fn(arg)
add(1, 2)

There are other syntaxes, such as uniform function call syntax:

arg.fn()
1.add(2)

and infix operators:

1 `add` 2

and also interleaved arguments (Objective-C and Smalltalk selectors/messages; the last example is copied from Wikipedia):

object message
object messageWithArg: 2 andAnotherArg: 3
'hello world' indexOf: $o startingAt: 6

Can we go a step further? Allow function names to have spaces in them, and allow arguments to be wherever, even at the start:

(a) is equal to (b)


(x) is an element of (xs)
(x) is in (xs)

((x) is in (xs)) and ((y) is in (xs))


\ (x : vector of (nat) size (10)) is an element of (xs : set of (vector of (nat) size (10))) 

This might open up to new domain-specific syntax extensions, and caters to infix operators, uniform function call syntax, etc...

Perhaps if we allow whitespace, then we can also allow other symbols that aren't parens:

do (
  (x) <- (readline)
  (y) <- (readline)
  printline ( "the sum of (x) and (y) is (x + y)" )
)

although that raises the question of how to deal with variable arguments, as in the syntax for lists and do-blocks and let blocks. Perhaps those exist as an implicit continuation argument that doesn't require parens? The problem with just an implicit continuation argument is that we won't know how to stop parsing a function's identifier since whitespace is allowed.

-- is it this?
do (
  ((x) <- (readline))
  ((y) <- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

-- or is it this?
do (
  ((x) <- (readline) (y))
  (<- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

-- etc...

But we still want to avoid a continuation/callback pyramid of doom:

do (
  (x) <- (readline) (
    (y) <- (readline) (
      printline ( "the sum of (x) and (y) is (x + y)" )
    )
  )
)

Even if we get rid of the indent, we still get the lisp parens hell:

do (
  (x) <- (readline) (
  (y) <- (readline) (
  (z) <- (readline) (
  (a) <- (readline) (
  (b) <- (readline) (
  (c) <- (readline) (
      printline ( "the sum of (x) and (y) is (x + y)" )
  ))))))
)

Perhaps since we're emulating an imperative "block", we should use semicolons? Similar to Haskell's $ function application operator.

do (
  (x) <- (readline);
  (y) <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
)

Cool, I think I can live with that. It'll be even cooler if ; is something we can define as well, but maybe that's asking too much since there would still be too many parens, and I'm not sure how to deal with variable arguments.

do (
  ((x) <- (readline)) ;
  ((y) <- (readline)) ;
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

Actually, if ; is just a function application operator, then we'd already have curried functions and partial application, so we can just do this, which is somewhat more elegant without needing to introduce ;:

do (
  ((x) <- (readline))
  ((y) <- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

Although since this makes it a pretty parenthesis-heavy language, having a dedicated ; or $ would be nice. Unlike lisp (I think - I don't actually know lisp that well), the identifier for the function isn't just the first thing in the s-expression.
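For comparison, here is the role Haskell's $ plays (plain Haskell, just as a reference point): it is an ordinary right-associative, low-precedence application operator, so each use removes one layer of trailing closing parens.

-- ($) is just function application: f $ x = f x, with low, right-associative precedence.
nested :: IO ()
nested = print (sum (map (+ 1) [1, 2, 3]))

flat :: IO ()
flat = print $ sum $ map (+ 1) [1, 2, 3]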

The second question is how we should deal with value arguments vs symbol arguments and binding. The function (var) <- (monad) takes var as a symbol/binding, and monad as a value. Moreover, do will need to add the <- function to the context when parsing its argument, and each (var) <- (monad) (continuation) call adds whatever symbol var names as a new binding into the context of the continuation. This means:

  • function parameters can have type Symbol, whatever that is, for creating new bindings.
  • function parameters can be annotated to have additional context before evaluating that parameter argument.

This sounds like a very beefy self-Pi type. Let's give it a name. Since it feels very nutty, let's call it the Nut type.

data Term
  = Variable Symbol
  | Type (level : Nat)
  | Abstraction (argument : Symbol) (body : Term)
  | Nut (self : Maybe Symbol) (argument : Symbol) (argumentType : Type) (argumentAdditionalContext : Maybe Term) (returnType : Type)
  
-- Symbol is an alternating sequence of strings and holes for parameters, and should probably be a dependent type indexed by the number of parameters it has.
data Symbol = NameFirst NameElement | ParameterFirst NameElement
data NameElement
  = FollowedByName String ParameterElement
  | EndingWithName String
data ParameterElement
  = FollowedByParameter NameElement
  | EndingWithParameter

This raises the following questions/problems:

  1. Having the core language depend on a specific encoding for the additional context doesn't seem elegant
  2. What if the additional context term is ill-typed?
  3. What if the additional context has conflicting identifiers with the existing context?

In an attempt to answer question 1, we ask another question: what is an existing "natural" way of specifying context in the lambda calculus? Abstractions! So perhaps when specifying the type of a function using the Nut type, we describe the layers of new abstractions that get sandwiched between the function and argument evaluation? Hmm, not sure where that leads.

How much power do we want to provide for the additional context? Either

  1. Computable at runtime
  2. Computable at compile time
  3. Static - hardcoded

We can't have (3) static/hardcoded additional contexts given that we want to create a DSL that provides its own binding mechanism, like the do-notation:

do (
  (x) <- (readline);
  (y) <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
)

-- where, in the above code, the `<-` function might be defined like the following:

define ((var : Symbol) <- (monad : m a) (continuation : (c) with context (var : a))) returning (c)

When specifying the additional context, we obviously need to specify which user-provided symbol to use and probably the type it has, but would we need to specify anything further? Like how it relates to other things in the context? For that question, I need to learn more about how, e.g., Idris handles unification of dependent types that talk about other things in scope, e.g. Elem x xs is the proof that x is an element of xs, but how does it work when this proof is passed around to other functions that don't know about x nor xs? Does it just check for definitional equality every time?
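For reference, the membership proof I have in mind is the usual inductive one; here is a rough rendering in Haskell's type-level syntax (my sketch, assuming the standard Idris/Agda-style definition), where the shape of the proof records exactly where x sits inside xs:

{-# LANGUAGE DataKinds, PolyKinds, KindSignatures, GADTs, TypeOperators #-}

-- Elem x xs: evidence that x occurs somewhere in the type-level list xs.
data Elem (x :: k) (xs :: [k]) where
  Here  :: Elem x (x ': xs)
  There :: Elem x xs -> Elem x (y ': xs)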

Also, is all this extensible binding mechanism worth it, and is it sufficient to model other language features and macros that the user would want to write? E.g. what about SMT-provided proofs or special pattern matching syntax?

An alternative way to represent these extra bindings is just to have a new lambda abstraction, perhaps with implicit arguments, although that might not necessarily work for the do-notation bindings where the new binding symbol is on the wrong side. For the syntax, since round brackets are already used for normal (possibly eagerly evaluated) value arguments, we could perhaps use curly braces for quoted/unevaluated blocks, which fits well with both (1) syntax extensions like do-notations and datatype definition syntax, and (2) lambdas with implicit arguments. However, how would we denote Idris-style implicit parameters then?

do {
  (x) <- (readline) {
    (y) <- (readline) {
      printline ( "the sum of (x) and (y) is (x + y)" )
    }
  }
}


do {
  ((readline) ->)
  (\x .
    (readline) ->)
  (\y .
    printline ( "the sum of (x) and (y) is (x + y)" )) -- error, x is not defined
}

Hmm, the un-nested lambda abstractions don't work because x is no longer visible outside of the lexical nesting structure.

Square brackets, round brackets, curly brackets

What if we assign them these consistent meanings?

  • Round brackets for values, pass by value
  • Curly braces for quoted/deferred blocks, lambda bodies
  • Square brackets for introducing bindings
do {
  [x] <- (readline);
  [y] <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
}
map (my array) ([item]{item + 1})

https://www.buzzsprout.com/728558/6284173-explaining-my-encoding-of-a-hoas-datatype-part-2?client_source=small_player&iframe=true&referrer=https://www.buzzsprout.com/728558.js?player=small&tags=chapter%20a&order=date https://arxiv.org/pdf/1910.10851.pdf

data Term forall y. (x->y) -> y -> TermX(y)????????


On minimal syntax / metacircular interpreters / extensible syntax:


Oh nice, Agda already has something similar: mixfix operators! https://agda.readthedocs.io/en/v2.5.2/language/mixfix-operators.html

Taken from that page:

_and_ : Bool → Bool → Bool
true and x = x
false and _ = false

if_then_else_ : {A : Set} → Bool → A → A → A
if true then x else y = x
if false then x else y = y

_⇒_   : Bool → Bool → Bool
true  ⇒ b = b
false ⇒ _ = true

Syntax and language extensibility is a point made in https://www.youtube.com/watch?v=kQCFGENufSY&ab_channel=FMFFoundationsseminar (William J. Bowman - Cur: Designing a less devious proof assistant) to approach the goal of bringing proof assistants closer to how the programmer wants to express the program. https://github.com/wilbowma/cur

Since I'm bringing up the idea of presenting ideas as close to how the programmer wants as possible, I might as well also refer to Isar (https://isabelle.in.tum.de/Isar/).

Back to less devious proof assistants, it's interesting that Bowman brings up a point I've been thinking about: how important is the foundation, and can we easily hot-swap out the foundation? An extensible foundation with a low barrier of entry.

Another highlight of Cur: being able to define Nat literals (taken from video)

; Import the existing definition for number literals so we can rename it.
; The existing definition does nothing but print an error.
(require
 (only-in  cur [#%datum super.datum]))

; Rename that existing definition.
; (Defines at compile-time)
(define-syntax #%datum
  (lambda (stx)
    (syntax-parse stx
      [(_ . n:nat)
        (nat->unary (syntax->datum #'n))])))
        
(begin-for-syntax
  (define (nat->unary n)
    (if (zero? n)
      #'z
      #`(s #,(nat->unary (sub1 n))))))

Another highlight: reflect on cur's implementation. Metaprogramming can also be done in Cur rather than Racket. Also: everything done in the same language/same file - no outside toolchain/processing needed. No context switching.

Weakness: possibly too flexible/powerful metaprogramming? Not known how to restrict it to enforce soundness/consistency.

Racket provides the ability to create transformers and provides binding utilities?

More reading:

  • Type Systems as Macros POPL 2017
  • Dependent Type Systems as Macros. PACMPL 2020

Also: Coq's notations - https://coq.inria.fr/refman/user-extensions/syntax-extensions.html (Sounds similar to Agda mixfix operators? http://www.cse.chalmers.se/~nad/publications/danielsson-norell-mixfix.pdf)


More experimentation to test out how verbose the parenthesis are:

pure [DecidableEquality] [x: Int, y: Int] {
  if (x == y) then {
    Some(Refl (x))
  } else {
    None
  }
}

pure [map option (x: Option of (_))] {
  if let [Some y] = (x) then {
    f(y)
  } else {
    None
  }
}

let [Some x] = (y) else { throw some effectful error };

proc [ReadFile] [path: FilePath, length: Nat] -> (Vector of (String) and size (length)) {

}

type [FList] [n: Nat] = [list: List of (String), (length of (list)) == (n)]

type [Finite list of size ()] [n: Nat] = [list: List of (String), (length of (list)) == (n)]

type [Finite list of size (n: Nat)] = [list: List of (String), (length of (list)) == (n)]

What if, back to our [] () {} idea:

  • () is what you'd expect: value grouping
  • {} gives a thunk Thunk<Set<FreeVariable>>
  • [], aka "bindings", behaves like a function that takes a thunk of the right free-variables and returns a function? This isn't enough as we'd want to manipulate these bindings, i.e. inject more bindings for custom syntax, or read the binding names for import statements.

What if bindings are just strings?

What about inline type annotations - how would they fit in? What about other annotations?

More experimentation around pattern matching:

let [my option] = (Some (3));
let [known value] = (42);

let [result] = (match (my option) {
    (Some (known value)) => ("First");
    (Some [value]) => (
        let [twice] = ((value) * (2));
        "Second: {twice}"
    );
    (None) => "Third";
});

(Point { [x], [y] }){
    Point { y: (x), x: (y) }
}

let [character position] = (Point { x: (4), y: (2) });
let (Point { [x], ... }) = (character position);
x

The programming language 3R by Andrew Black had a similar idea around names that are made up of multiple words and can have embedded arguments:

https://youtu.be/JqYCt9rTG8g?si=icpAMTeHofPB0rEe&t=1197


Implementing lambda calculi with very small cores.

Not including parsing.

Step 1. Untyped Lambda Calculus

  1. Syntax
  2. Reduce
  3. Normalize
  4. Evaluate
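A minimal sketch of what Step 1 could look like (my own Haskell; names like step and normalize are made up for illustration), covering the syntax, capture-avoiding substitution, a single normal-order reduction step, and normalization:

data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving (Show, Eq)

-- Free variables of a term.
free :: Term -> [String]
free (Var x)   = [x]
free (Lam x b) = filter (/= x) (free b)
free (App f a) = free f ++ free a

-- Pick a name not in the given list by appending a numeric suffix.
freshName :: String -> [String] -> String
freshName y used = head [n | n <- map ((y ++) . show) [0 :: Int ..], n `notElem` used]

-- Capture-avoiding substitution: subst x s t computes t[x := s].
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | x == y    = s
  | otherwise = Var y
subst x s (App f a) = App (subst x s f) (subst x s a)
subst x s (Lam y b)
  | x == y          = Lam y b                                   -- x is shadowed
  | y `elem` free s = subst x s (Lam y' (subst y (Var y') b))   -- rename y to avoid capture
  | otherwise       = Lam y (subst x s b)
  where y' = freshName y (x : free s ++ free b)

-- One leftmost-outermost (normal-order) reduction step, if any redex exists.
step :: Term -> Maybe Term
step (App (Lam x b) a) = Just (subst x a b)
step (App f a)         = case step f of
                           Just f' -> Just (App f' a)
                           Nothing -> App f <$> step a
step (Lam x b)         = Lam x <$> step b
step (Var _)           = Nothing

-- Normalize by iterating step; diverges on terms with no normal form.
-- e.g. normalize (App (Lam "x" (Var "x")) (Var "z")) == Var "z"
normalize :: Term -> Term
normalize t = maybe t normalize (step t)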

Step 2. Simply Typed Lambda Calculus

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 3. Hindley Milner Type System for Parametric Types

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 4. Calculus of Constructions vs PiSigma?

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 5. Add self types

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 6. Add Quantitative types

TODO: needed? Can linear/affine types be desugared to monads or something else simpler in the core?


https://www.youtube.com/watch?v=ov0roDPDBFs&ab_channel=MicrosoftResearch

  • Untyped vs dynamically typed vs static dumb types vs static sexy types
  • Ornaments - adding and removing constraints to/from types
  • Ending up with "megatypes" that contain all the properties we could ever possibly need in the future
  • Being able to express what we want, what we know, and what we don't know, so we could check statically and defer any remaining uncertainties to runtime.
  • The ability to intuitively understand failures - having runtime state examples is nicer than a type checker failure.

https://personal.cis.strath.ac.uk/conor.mcbride/pub/OAAO/LitOrn.pdf Ornaments!

The idea that List is just Nat with more information, and Vec is just List with more information, reminds me of another paper about coloured type theory, where we can systematically erase coloured information to form another types to get theorems for free.

https://www.researchgate.net/profile/Jean-Philippe-Bernardy/publication/266654430_Type-Theory_in_Color/links/59427c180f7e9b3a5fa62c28/Type-Theory-in-Color.pdf Type-Theory In Color (Jean-Philippe Bernardy and Guilhem Moulin)

Comparison from Bernardy et al.:

Ornaments Relating variants of dependently-typed programs have been a concern for a long time. The idea of ornamenting inductive structures have been proposed to remedy this problem (Dagand and McBride 2012). Here, we instead focus on erasure (let the user specify an ornamented type and recover the relation with its erasure) instead of specifying ornamentation of already existing types. This is not much of a difference in practice, because ornaments typically can only be applied to a single type. An advantage of colors over ornaments is that colors integrate natively with existing type-theories, while ornamentation relies on encodings; additionally we get free equalities when working with erasures, and colored pairs reveal parametricity properties. The chief advantage of ornaments is that any algebra can be used to ornament datatypes, while we are limited to structural relations.

Ornaments seem really cool. I haven't read the paper fully so I might be misunderstanding, but it's nice to have these new fancier types at a level of descriptors that can be reflected upon. That said, that particular idea itself seems to be called "Universe construction" as McBride mentions, perhaps similar to Idris' idea of the "universe pattern" of encoding different types in a datatype, moving what is otherwise a primitive language construct into a value that we can work with.

A highlight:

Libraries should deliver encoded datatypes, to be taken as themes for local variation, ensuring that programmers can work with the precision appropriate to the task at hand without compromising the broader exchange of data. We should not lose the general functionality of a list library just because it is sometimes convenient to work with the precision of vectors, or ordered lists, or ordered vectors, or any of the other multifarious possibilities which we are now free to farm. At the same time, we might develop libraries of ornaments. Nobody should have to think about how to index a particular data structure by its number of nodes, or a bound on its depth, or its flattening to a sequence: such recipes belong in the repertoire. Declaring—rather than defining—datatypes should now be seen as an impediment to the properly compositional data design process, punished by payment for what is rightfully free. As the Epigram team has shown, it is not even necessary to declare the datatype which is used for the descriptions of datatypes: with care, and a handful of predefined components, a sufficiently expressive universe can be persuaded to appear as if it is self-encoded (Chapman et al., 2010).

Oh, and another highlight, about something I've always been wondering:

At the moment, our usual means to refer to particular components of index information tends to be rather positional and is thus unlikely to scale: can we adapt record types to fit this purpose?

More potentially useful reading: http://www.cs.ox.ac.uk/jeremy.gibbons/publications/progorn.pf

How does this relate to containers?


The gradual dependent types track:

These allow defining e.g. List a = Vec a ?, with full spectrum from untyped to richly typed, with inserted runtime checking based on the accumulated information/constraints that a ?-typed value has at runtime.

My naive opinion: seems very ergonomic, although with less reflective power than ornaments, and given a definition of a rich type, we can only weaken it. Given a List a, it seems to make a lot of sense that we can derive a Vec a n since we have all the information from it. Perhaps we already have a way to "strengthen" a dumb type, i.e. via dependent pairs to attach more properties to an existing dumb type, giving a lot more flexibility than what indexing might be able to give us. E.g., instead of Vec a n, we have (as exemplified by Eremondi et al's GEq paper) FList a n = Sigma(x: List a, length x = n). The downside is that we probably won't get as many proofs for free, as the extra predicates are no longer baked into the structure of the inductive type.
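To make the "List is Vec with the index hidden" direction concrete, here is a small Haskell sketch (mine; the names are illustrative): Vec bakes the length into the type, and an existential wrapper plays the role of the dependent sum.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- Indexed version: the length is part of the structure of the inductive type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Morally Sigma(n : Nat, Vec n a): a Vec whose length is hidden, i.e. a plain list.
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

toSomeVec :: [a] -> SomeVec a
toSomeVec []       = SomeVec VNil
toSomeVec (x : xs) = case toSomeVec xs of
                       SomeVec v -> SomeVec (VCons x v)

fromSomeVec :: SomeVec a -> [a]
fromSomeVec (SomeVec v) = go v
  where
    go :: Vec n b -> [b]
    go VNil         = []
    go (VCons y ys) = y : go ys

The FList-style encoding from the paper goes the other way: keep the plain List a and attach the length fact as a separate proof, rather than baking it into the constructors.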


Blame lambda calculus track: TODO

This is my attempt to understand the differences between various ways of doing dependent type systems with lambda calculus:

Prerequisites

Fixed point theory?

Starting with a positive position, every time we go to the left of an arrow, we flip the polarity. Then, we forbid recursion in the negative positions, so that the type is strongly normalizing and we can't encode nontermination.
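A quick Haskell illustration (mine) of why the negative positions are the dangerous ones: when the recursive occurrence sits to the left of an arrow, we can build a diverging term without writing any explicit recursion.

-- Bad occurs in negative position (to the left of the arrow in its own field).
newtype Bad = Bad (Bad -> ())

selfApply :: Bad -> ()
selfApply (Bad f) = f (Bad f)

-- The classic omega term, reconstructed with no recursive definitions in sight.
loop :: ()
loop = selfApply (Bad selfApply)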

Peng Fu's Self Types

Aaron Stump's Constructor Constrained Recursive Types (CDLE, Cedille)

The present paper proposes two new type constructs for this, called constructor-constrained recursive types and lifting. The former deepens earlier work by Fu and Stump on System S, which solves the problem of induction using a typing construct called self types, to allow the type to refer to the subject of the typing via bound variable x in ιx.T (Fu & Stump, 2014). To prove consistency, they rely on a dependency-eliminating translation to System Fω plus positive-recursive types. This method is not applicable to analyze a system with large eliminations, where dependence of types on terms is fundamental.

Aaron Stump's use of Kopylov's Dependent Intersections, Miquel's Implicit Products, along with a primitive equality type

The closest related work is my own paper on the Calculus of Dependent Lambda Eliminations (CDLE) [39]. The goal with CDLE is similar to that of the present paper: extend a PTS with new type forms for induction and dependently typed programming. CDLE adds a type construct called constructor-constrained recursive types. This is a form of recursive type based on the idea of preserving the typings of lambda-encoded constructors at each approximation (as familiar from fixed-point theory) to the recursive type. While adequate for deriving induction and while not requiring any change to the term language of the system (i.e., pure lambda calculus), constructor-constrained recursive types are still a rather complex feature, with fairly involved kinding rules. Their semantics requires a nontrivial extension to the already rather technical machinery needed for recursive types. The approach of this paper greatly improves on CDLE, by identifying a combination of reasonably simple typing constructs known already from the literature that suffice for deriving induction. No complex new typing construct is required. Indeed, except for the equality types, ιλP2 is a subsystem of CDLE. So just one simple addition is enough to obviate the entire complex machinery of constructor-constrained recursive types, which took several years to formulate and analyze. Needless to say, this was quite unexpected.

Perhaps these screencasts would help motivate the need for these extra constructs on top of Calculus of Constructions (CC) https://www.youtube.com/watch?v=A5c-KVMRSds

Ulf Norell's insane pi-types?

Actually I haven't read too far in to know if this belongs in this category.

Note: it's about a special kind of relation called logical relations.

The preprint paper also talks about something like a conjugation operation.

Hmmm, the podcast mentions the study of relations, such as relational algebra. Interesting; looking at Wikipedia, I see familiar things from my uni courses on databases: https://en.wikipedia.org/wiki/Relational_algebra

Would this take us a step closer towards my dream of "programming as information management"?


Links to better explain a related paper - Observational Type Theory: http://lambda-the-ultimate.org/node/3547#comment-75314 - apparently the type system is set up to avoid ever normalizing the axiom of extensionality, and therefore avoid the computational inadequacy of extensionality?

Goals:

  • namespace management
    • namespace,
    • qualification
    • alias/renaming
    • collision management
  • abstraction/info-hiding e.g. private vs exposed interface
  • reuse/convenience: e.g. module parameters
  • Should it allow cyclic dependencies? Some things may require mutual recursion. It doesn't seem like a good design decision, but we might be enforcing an unnecessary restriction in some domains where this is useful. Haskell seems to support them (https://wiki.haskell.org/Mutually_recursive_modules) and I think I read/heard somewhere that it was nice to be able to split an existing mutually recursive definition from one file into multiple files. Ah, I think Rust also supports cyclic modules but not cyclic crates.
    • We should make the default acyclic, and have a separate opt-in syntax (or linter warning suppression syntax) for cyclic modules.
    • How does this impact, say, incremental compilation?
  • interaction with package management and dependency management?
  • nestable
  • modules as first-class values
    • should the dependency graph be reflectable?
    • With module signatures and module-to-module functions, it's starting to feel like Rust's trait system.
  • Compilation/build-process and tooling considerations... todo

Modular data type: Data types a la carte by Wouter Swierstra
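As I understand it, the core trick of that paper is the coproduct-of-functors presentation (a rough Haskell sketch; the smart-constructor machinery with the :<: subsumption class is omitted): datatypes are fixpoints of signature functors, and new constructors are added by summing signatures.

{-# LANGUAGE TypeOperators, DeriveFunctor #-}

newtype Fix f = In (f (Fix f))

-- Coproduct of two signature functors.
data (f :+: g) e = Inl (f e) | Inr (g e) deriving Functor

-- Two independent signature pieces...
data Val e = Val Int  deriving Functor
data Add e = Add e e  deriving Functor

-- ...combined a la carte into one expression type.
type Expr = Fix (Val :+: Add)

eval :: Expr -> Int
eval (In (Inl (Val n)))   = n
eval (In (Inr (Add x y))) = eval x + eval y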

Differences I want from Haskell:

  • Have syntax that encourages listing out each imported symbol, and have explicit wildcards.
  • Constructors for data types are automatically namespaced
    • Do we follow Java by automatically opening the data type's namespace when pattern matching against that type?