Disclaimer: I have no idea what I'm doing. This gist is a scratchpad for me to hopefully one day make sense of things. Expect nonsensical notes jotted on this page.

  • Purely functional
  • Dependent types like Idris? But with the ease and readability of refinement types with external solver (e.g. SMT). More on readability: see structured proofs "Intelligible semi-automated reasoning" (Isar) https://courses.grainger.illinois.edu/cs576/sp2015/doc/isar-overview.pdf https://isabelle.in.tum.de/Isar/
  • Should be gradually typed (https://arxiv.org/pdf/1906.06469.pdf), but we can annotate a function to forbid using any gradual typing.
  • Runtime checks for imprecise gradually typed programs should also offer a mechanism for the programmer to choose between fatal program termination (like a runtime assert) vs graceful, program-aware handling of the runtime error (i.e. with Either/Result/Maybe types). I guess in the latter case it just becomes an explicit fallible runtime cast. Can gradual typing just be syntax sugar over strong typing? It can work for passing fully untyped values into typed functions via explicit runtime-checked casting, but I'm not sure how it would deal with partially untyped values (e.g. Vec Nat ?) and functions that take a type as their argument.
  • High level of reusability. Gradual typing allows List a = Vec a ?. How does the following paper address reusability? https://dl.acm.org/doi/pdf/10.1145/3236799
xs : vec Nat 10 -- Vector xs with length value known at compile time

i <- cast $ getstr
xs : vec Nat i -- Vector xs with runtime length symbolically known at compile time

exists i such that xs : vec Nat i -- Vector xs with runtime length symbolically known at compile time. Existential type (dependent sum).

xs : vec Nat ? -- Vector xs with length dynamically checked at runtime (just like your standard List in haskell)

xs : ? -- Fully untyped variable xs.
  • Syntax should allow interleaving arguments inside the function name like Objective-C (so it is much easier to decipher what each positional argument means, making the program read more like English). Note that it is syntax sugar, so the non-interleaved naming still works for passing functions as arguments into higher order functions, and possibly for currying/partial application. Perhaps this makes infix operators redundant (at the cost of parens) and also makes Uniform Function Call Syntax redundant?
  • Support for linear/uniqueness types with rust-like control
  • Parametricity and theorems for free, but allow this to be specified, i.e. parametric vs ad-hoc polymorphism, coloured type systems like https://jyp.github.io/pdf/Doc-Revised.pdf https://jyp.github.io/pdf/CCCC.pdf. TODO: read more about parametricity in https://bartoszmilewski.com/2014/09/22/parametricity-money-for-nothing-and-theorems-for-free/ and https://vimeo.com/16541785
  • Cater to metaprogramming and embedding DSLs. In particular, see deeper shallow embeddings: https://drops.dagstuhl.de/opus/volltexte/2022/16737/pdf/LIPIcs-ITP-2022-28.pdf
  • Specify memory allocation/management strategy (stack, heap, manual, garbage collector, compile-time reference counting, runtime reference counting, arena, custom allocator)
  • Low/zero-cost abstractions, where performant code is idiomatic code
  • Encourages and allows easy to reason about register and cache friendly data structures
  • wasm backend, arm backend, FPGA/ASIC backend, webgpu/metal/vulkan backend
  • ABI?
  • optimizations can be explicit and checked so that they're predictable: e.g. erasure, in-place mutation, tail-call optimization, small-string optimization, null-pointer optimization etc.
  • support for powerful embedded DSLs, and potentially type-safe syntactic transformations or elaborators or macros that are still IDE-friendly (i.e. can be incrementalized and can give good error messaging)
  • Optimizations (e.g. the above two points) are expressed within the language itself, are extensible/customizable, are verified, etc. They can either be a refinement of the canonical/oracle implementation, or a syntactic transformation of the canonical/oracle implementation.
  • Provide some way to reason about time complexity: see implicit computational complexity https://lipn.univ-paris13.fr/~baillot/GEOCAL06/ICCworkshop.html and http://cs.unibo.it/~dallago/FICQRA/esslli.pdf
  • Linting can maybe be expressed in the language itself too, using similar technologies as optimization
  • interpretable as well as compilable, with interactive repl
  • support for language server protocol, language server index format, equational-reasoning based semantic code search, semantic git diffing
  • nix-based reproducible and highly incremental build systems
  • semver checks and semver-respecting language features (i.e. consider typeclass coherence, orphan rule, overlap check, etc)
  • great error messages with hints, diagnostics, and suggestions.
  • good support for debugging - source mapping, stack unwinding, watchdog hang detection, crash reporting, sampling, spin dumping, instrumentation, respecting application provided thread names, and during stack unwinding, recover method name, file name, line number, module (if dynamically loaded), etc.
  • inline documentation with annotated parameters, markdown syntax, and rust-like testable code blocks
  • global documentation database that is searchable by both content and type signature - search using a query language like sourcegraph, jira, or the one based on equational reasoning.
  • Floating point, fixed point, big integers, Fin (i.e. bounded integers), wrapped integers, signed and unsigned, pointer, etc.
  • Full control over strings and proper Unicode string encoding management. The programmer is made aware of all the different ways to do things, e.g. compare strings by appearance, by code point, or by canonical form; sort by code point, with locale-aware ordering, or with locale-specific orderings that have very weird cultural edge cases.
  • Some polymorphic code can be monomorphised at compile time, aka static dispatch, but this can be controlled (e.g. opt out in rust's dyn trait objects using a vtable)
  • Control over datastructure alignment, padding, etc
  • Stack allocated function closures - in Rust, each closure has its own type and so it's easier to box them. Roc seems to be able to infer the types and express the concrete type of the closure, and therefore can store it on the stack. Therefore, an array of closures is automatically treated as an array of a tagged union of the different realized closures. See https://media.handmade-seattle.com/roc-lang/ 44 minutes in. Lambda captures are desugared into extra arguments - an idea from the morphic solver authors - called defunctionalization, or is it lambda lifting? https://en.wikipedia.org/wiki/Lambda_lifting. (A tiny defunctionalization sketch follows after this list.)
  • Fast compile times
  • Auto/implicit parameters similar to Idris, but extended to support contexts and guarantees
    • Contexts include: panic context, alloc context, io context, async context, ffi context etc. This also allows functions to be implicitly parameterized by, e.g., the specific allocator implementation, without caring about it in their high-level type signature.
    • By default, functions are context agnostic, meaning that they are not allowed to directly use these contexts, but they can invoke a function argument that uses these contexts
    • Functions that use these contexts will therefore need to explicitly opt in via their type signature
    • Functions can explicitly opt out of such contexts. E.g., calling this function will never panic, or will never do any system calls that could pre-empt the current thread.
    • Need to consider how that is enforced due to currying/closures, as it may not be as simple as syntax sugar for extra context arguments. E.g., context arguments could be captured in a closure, in which case these closures need a way to indicate that they contain these contexts in their type signature. Perhaps any function that captures a context is promoted into such context itself. In that case, it is more like a partitioning of types, or like auto traits in rust. Contexts are poisonous.
    • How do these relate to the IO monads? Perhaps IO monads are already doing this more elegantly. Do we really need an opt-out mechanism? Probably it is sufficient for a function to be context agnostic, and let the caller enforce whether it really should opt-in or not. In that case, do we still need auto/implicit parameters for these kinds of contexts? Probably not. They can explicitly take in IO monads.
  • Do we want first-class parameterized modules? ML Modules / Scala Modules. Does this help with the idea of contexts above?
  • Secure by default - permissions, capabilities, etc - see Deno -- but that's more for runtime languages, i.e. interpreted ones. It'd be nice to know all the permissions needed at compile time. https://www.youtube.com/watch?v=cpQwtwVKAfU Roc goes about it by having sandboxed platforms - but users can't trust that the program is using such a platform. Users will need some OS or external sandboxing (e.g. VMs, or a wasm runtime) to run the programs. Programmers can manage security by having a language that restricts what third-party dependencies can do. Third-party libraries will need to opt in to things requiring permissions and publicly advertise this in either their package manifest or their type signatures. Can we carry permissions via type signatures?
  • Private vs public dependencies so multiple versions can be used at the same time without conflict. Would this work? Is this a good idea?
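Here is a tiny defunctionalization sketch (my own, in Haskell, not taken from the Roc/morphic material) for the closure bullet above: each lambda becomes a constructor carrying its captured variables, and a single apply function dispatches on the tag, so an array of such closures really is an array of a plain tagged union.

-- Hypothetical example: two "closures" capturing an Int each.
data Closure
  = AddConst Int   -- represents \x -> x + n, capturing n
  | MulBy Int      -- represents \x -> x * n, capturing n

-- The captured variables reappear as ordinary fields of the dispatcher's argument.
apply :: Closure -> Int -> Int
apply (AddConst n) x = x + n
apply (MulBy n)    x = x * n

-- An "array of closures" is now just a list of a first-order data type.
example :: [Int]
example = map (`apply` 10) [AddConst 1, MulBy 3]   -- [11, 30]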

Useful links

There are three reasons for introducing a syntactic abstraction (macro) rather than a function:

  1. new binding forms
  2. implicit quoting or, more generally, a data sub-language
  3. an order of evaluation that is incompatible with eager evaluation

A lazy language makes macros for 3 unnecessary. A language with first-class functions still needs macros for 1; otherwise you keep writing foo (fn x => ...) all over the place.
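To make point 3 concrete, here's a small Haskell illustration (mine, not part of the quoted text): because Haskell is lazy, a control construct like if can be an ordinary function, and no macro is needed to delay evaluation of the branches.

myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

-- Only the chosen branch is ever evaluated, even though both are passed as arguments.
safeDiv :: Int -> Int -> Int
safeDiv x y = myIf (y == 0) 0 (x `div` y)

Point 1 is the one laziness doesn't help with: without a new binding form, you still end up writing foo (fn x => ...) everywhere.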

Ideas I want to challenge or have been successfully challenged

  • Can't have both safety and speed - Look at rust.
  • HDLs can't be synthesizable and high level at the same time - Haskell Clash and other functional HDLs
  • Macros can't have good diagnostics and error messages
  • Functional programming can't have good control over hardware or codegen - todo...
  • Effects are hard in functional programming - IO monads and do notation seem to alleviate this
  • GUI is better suited in OOP than FP - look at Qt bugs, and look at react and elm - are we gui yet in FP?
  • Optimizations (automatic or handwritten) makes code hard to read, maintain and debug
  • Business logic is better suited in OOP than FP - look at how algebraic data types can help better model the problem domain. Potentially the pseudo declarative nature of FP allows business logic to be expressed concisely, making it easier to conform to spec. One argument for OOP (which I haven't really got a chance to validate yet) is that business logic changes and churn through quickly, whereas the domain objects remain relatively stable, and therefore modelling business logic with OOP is easier to adapt to change. I hypothesize a counterargument: It may seem easier to change in OOP, but it is harder to reason about the correctness of the change compared with FP, and while it may require more refactoring in FP, it is much easier to refactor correctly in FP. Resisting refactoring in OOP causes the code to organically grow with non-ideal APIs and lots of technical debt (which is what I have witnessed in the industry so far).
  • Adding dependent types forces programmers to write difficult proofs rather than actual programming - look at Idris, where I think the style of programming is very similar to normal Haskell. TODO: Actually try programming in Idris to learn whether it's the case.
  • TODO...

Questions

  • What is the difference between Peng Fu's self types, Aaron Stump's Cedille (constructor-constrained recursive types), Aaron Stump's {Implicit Products + Dependent Intersections + Heterogeneous equality type}, and insanely dependent types? They all seem to be talking about something that can talk about its own type. https://github.com/gelisam/circular-sig https://github.com/UlfNorell/insane

First class everything?

  • First class functions
  • Higher kinded types
  • Generic associated types / associated type constructors
  • First class types - kinds
  • First class constraints - constraint kinds - parametrize a data type or function with any constraint/type-class, e.g. Dict (Eq Int). (https://www.youtube.com/watch?v=hIZxTQP1ifo describes it as being like implicits without the problems of implicits.) A small Dict sketch follows after this list.
  • First class hot reloading - https://github.com/mun-lang/mun
  • First class syntax sugaring? Creeping into homoiconicity like lisp.
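As a rough reference for the Dict (Eq Int) idea above, here is what reified constraints look like in today's GHC Haskell (a sketch along the lines of the well-known Dict type from the constraints package; my rendering, not from the talk):

{-# LANGUAGE ConstraintKinds, GADTs, KindSignatures, RankNTypes #-}
import Data.Kind (Constraint)

-- A first-class value witnessing that the constraint c holds.
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- Pattern matching on Dict brings the constraint back into scope.
withDict :: Dict c -> (c => r) -> r
withDict Dict r = r

sameInt :: Dict (Eq Int) -> Int -> Int -> Bool
sameInt d x y = withDict d (x == y)

The caller decides which evidence value to pass in, which is roughly the "implicits without the problems of implicits" point.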

Functions take values to values, where values can themselves be functions and types.

Can type be lambda encoded? Closest might be self types.

Can syntax tree be lambda encoded? Yes with scott encoding.
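As a concrete reminder (my own Haskell sketch of the standard Scott encoding): a term is represented by how it selects between one continuation per constructor, which is enough to pattern match and fold.

{-# LANGUAGE RankNTypes #-}

-- A tiny expression tree, Scott-encoded: no data declaration for the tree's shape,
-- just a choice between a "literal" continuation and an "addition" continuation.
newtype Expr = Expr (forall r. (Int -> r) -> (Expr -> Expr -> r) -> r)

lit :: Int -> Expr
lit n = Expr (\onLit _onAdd -> onLit n)

add :: Expr -> Expr -> Expr
add x y = Expr (\_onLit onAdd -> onAdd x y)

-- Evaluating is just supplying the two continuations.
eval :: Expr -> Int
eval (Expr match) = match id (\x y -> eval x + eval y)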

Unified language for runtime, compiletime, build scripting, interpreter.

Other stuff


  • Affine types vs linear types vs uniqueness types vs reference counting?

Smalltalk - everything is an object, minimal powerful language, late binding, add code while debugging, method explorer.

Github search

https://github.com/search?q=systems+functional+programming+language

https://github.com/topics/dependent-types

Syntax for function application is commonly divided into the following:

Firstly: the lack of parens:

fn arg
add 1 2

Secondly: parens around the application (i.e. lisp like s-expressions)

(fn arg)
(add 1 2)

Thirdly: parens around each argument (i.e. common in imperative languages)

fn(arg)
add(1, 2)

There are other syntaxes, such as uniform function call syntax:

arg.fn()
1.add(2)

and infix operators:

1 `add` 2

and also interleaved arguments (Objective-C and Smalltalk selectors/messages; the last example is copied from Wikipedia):

object message
object messageWithArg: 2 andAnotherArg: 3
'hello world' indexOf: $o startingAt: 6

Can we go a step further? Allow function names to have spaces in them, and allow arguments to be wherever, even at the start:

(a) is equal to (b)


(x) is an element of (xs)
(x) is in (xs)

((x) is in (xs)) and ((y) is in (xs))


\ (x : vector of (nat) size (10)) is an element of (xs : set of (vector of (nat) size (10))) 

This might open up to new domain-specific syntax extensions, and caters to infix operators, uniform function call syntax, etc...

Perhaps if we allow whitespace, then we can also allow other symbols that aren't parens:

do (
  (x) <- (readline)
  (y) <- (readline)
  printline ( "the sum of (x) and (y) is (x + y)" )
)

although that raises the question of how to deal with variable arguments, as in the syntax for lists and do-blocks and let blocks. Perhaps those exist as an implicit continuation argument that doesn't require parens? The problem with just an implicit continuation argument is that we won't know how to stop parsing a function's identifier since whitespace is allowed.

-- is it this?
do (
  ((x) <- (readline))
  ((y) <- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

-- or is it this?
do (
  ((x) <- (readline) (y))
  (<- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

-- etc...

But we still want to avoid a continuation/callback pyramid of doom:

do (
  (x) <- (readline) (
    (y) <- (readline) (
      printline ( "the sum of (x) and (y) is (x + y)" )
    )
  )
)

Even if we get rid of the indent, we still get the lisp parens hell:

do (
  (x) <- (readline) (
  (y) <- (readline) (
  (z) <- (readline) (
  (a) <- (readline) (
  (b) <- (readline) (
  (c) <- (readline) (
      printline ( "the sum of (x) and (y) is (x + y)" )
  ))))))
)

Perhaps since we're emulating an imperative "block", we should use semicolons? Similar to Haskell's $ function application operator.

do (
  (x) <- (readline);
  (y) <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
)

Cool, I think I can live with that. It'll be even cooler if ; is something we can define as well, but maybe that's asking too much since there would still be too many parens, and I'm not sure how to deal with variable arguments.

do (
  ((x) <- (readline)) ;
  ((y) <- (readline)) ;
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

Actually, if ; is just a function application operator, then we'd already have curried functions and partial application, so we can just do this, which is somewhat more elegant without needing to introduce ;:

do (
  ((x) <- (readline))
  ((y) <- (readline))
  (printline ( "the sum of (x) and (y) is (x + y)" ))
)

Although since this makes it a pretty parenthesis-heavy language, having a dedicated ; or $ would be nice. Unlike lisp (I think - I don't actually know lisp that well), the identifier for the function isn't just the first thing in the s-expression.
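For comparison, here is the role Haskell's $ plays (plain Haskell, just as a reference point): it is an ordinary right-associative, low-precedence application operator, so each use removes one layer of trailing closing parens.

-- ($) is just function application: f $ x = f x, with low, right-associative precedence.
nested :: IO ()
nested = print (sum (map (+ 1) [1, 2, 3]))

flat :: IO ()
flat = print $ sum $ map (+ 1) [1, 2, 3]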

The second question is how we should deal with value arguments vs symbol arguments and binding. The function (var) <- (monad) takes var as a symbol/binding, and monad as a value. Moreover, do will need to add the <- function to the context when parsing its argument, and each (var) <- (monad) (continuation) call adds whatever symbol var names as a new binding into the context of the continuation. This means:

  • function parameters can have type Symbol, whatever that is, for creating new bindings.
  • function parameters can be annotated to have additional context before evaluating that parameter argument.

This sounds like a very beefy self-Pi type. Let's give it a name. Since it feels very nutty, let's call it the Nut type.

data Term
  = Variable Symbol
  | Type (level : Nat)
  | Abstraction (argument : Symbol) (body : Term)
  | Nut (self : Maybe Symbol) (argument : Symbol) (argumentType : Type) (argumentAdditionalContext : Maybe Term) (returnType : Type)
  
-- Symbol is an alternating sequence of strings and holes for parameters, and should probably be a dependent type indexed by the number of parameters it has.
data Symbol = NameFirst NameElement | ParameterFirst NameElement
data NameElement
  = FollowedByName String ParameterElement
  | EndingWithName String
data ParameterElement
  = FollowedByParameter NameElement
  | EndingWithParameter

This raises the following questions/problems:

  1. Having the core language depend on a specific encoding for the additional context doesn't seem elegant
  2. What if the additional context term is ill-typed?
  3. What if the additional context has conflicting identifiers with the existing context?

In an attempt to answer question 1, we ask another question: what is an existing "natural" way of specifying context in the lambda calculus? Abstractions! So perhaps when specifying the type of a function using the Nut type, we describe the layers of new abstractions that get sandwiched between the function and argument evaluation? Hmm, not sure where that leads.

How much power do we want to provide for the additional context? Either

  1. Computable at runtime
  2. Computable at compile time
  3. Static - hardcoded

We can't have (3) static/hardcoded additional contexts given that we want to create a DSL that provides its own binding mechanism, like the do-notation:

do (
  (x) <- (readline);
  (y) <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
)

-- where, in the above code, the `<-` function might be defined like the following:

define ((var : Symbol) <- (monad : m a) (continuation : (c) with context (var : a))) returning (c)

When specifying the additional context, we obviously need to specify which user-provided symbol to use and probably the type it has, but would we need to specify anything further? Like how it relates to other things in the context? For that question, I need to learn more about how, e.g., Idris handles unification of dependent types that talk about other things in scope, e.g. Elem x xs is the proof that x is an element of xs, but how does it work when this proof is passed around to other functions that don't know about x nor xs? Does it just check for definitional equality every time?
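For reference, the membership proof I have in mind is the usual inductive one; here is a rough rendering in Haskell's type-level syntax (my sketch, assuming the standard Idris/Agda-style definition), where the shape of the proof records exactly where x sits inside xs:

{-# LANGUAGE DataKinds, PolyKinds, KindSignatures, GADTs, TypeOperators #-}

-- Elem x xs: evidence that x occurs somewhere in the type-level list xs.
data Elem (x :: k) (xs :: [k]) where
  Here  :: Elem x (x ': xs)
  There :: Elem x xs -> Elem x (y ': xs)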

Also, is all this extensible binding mechanism worth it, and is it sufficient to model other language features and macros that the user would want to write? E.g. what about SMT-provided proofs or special pattern matching syntax?

An alternative way to represent these extra bindings is just to have a new lambda abstraction, perhaps with implicit arguments, although that might not necessarily work for the do-notation bindings where the new binding symbol is on the wrong side. For the syntax, since round brackets are already used for normal (possibly eagerly evaluated) value arguments, we could perhaps use curly braces for quoted/unevaluated blocks, which fits well with both (1) syntax extensions like do-notations and datatype definition syntax, and (2) lambdas with implicit arguments. However, how would we denote Idris-style implicit parameters then?

do {
  (x) <- (readline) {
    (y) <- (readline) {
      printline ( "the sum of (x) and (y) is (x + y)" )
    }
  }
}


do {
  ((readline) ->)
  (\x .
    (readline) ->)
  (\y .
    printline ( "the sum of (x) and (y) is (x + y)" )) -- error, x is not defined
}

Hmm, the un-nested lambda abstractions don't work because x is no longer visible outside of the lexical nesting structure.

Square brackets, round brackets, curly brackets

What if we assign them these consistent meanings?

  • Round brackets for values, pass by value
  • Curly braces for quoted/deferred blocks, lambda bodies
  • Square brackets for introducing bindings
do {
  [x] <- (readline);
  [y] <- (readline);
  printline ( "the sum of (x) and (y) is (x + y)" )
}
map (my array) ([item]{item + 1})

https://www.buzzsprout.com/728558/6284173-explaining-my-encoding-of-a-hoas-datatype-part-2?client_source=small_player&iframe=true&referrer=https://www.buzzsprout.com/728558.js?player=small&tags=chapter%20a&order=date https://arxiv.org/pdf/1910.10851.pdf

data Term forall y. (x->y) -> y -> TermX(y)????????


On minimal syntax / metacircular interpreters / extensible syntax:


Oh nice, Agda already has something similar: mixfix operators! https://agda.readthedocs.io/en/v2.5.2/language/mixfix-operators.html

Taken from that page:

_and_ : Bool → Bool → Bool
true and x = x
false and _ = false

if_then_else_ : {A : Set} → Bool → A → A → A
if true then x else y = x
if false then x else y = y

_⇒_   : Bool → Bool → Bool
true  ⇒ b = b
false ⇒ _ = true

Syntax and language extensibility is a point made in https://www.youtube.com/watch?v=kQCFGENufSY&ab_channel=FMFFoundationsseminar (William J. Bowman - Cur: Designing a less devious proof assistant) to approach the goal of bringing proof assistants closer to how the programmer wants to express the program. https://github.com/wilbowma/cur

Since I'm bringing up the idea of presenting ideas as close to how the programmer wants as possible, I might as well also refer to Isar (https://isabelle.in.tum.de/Isar/).

Back to less devious proof assistants, it's interesting that Bowman brings up a point I've been thinking about: how important is the foundation, and can we easily hot-swap out the foundation? An extensible foundation with a low barrier of entry.

Another highlight of Cur: being able to define Nat literals (taken from video)

; Import the existing definition for number literals so we can rename it.
; The existing definition does nothing but print an error.
(require
 (only-in  cur [#%datum super.datum]))

; Rename that existing definition.
; (Defines at compile-time)
(define-syntax #%datum
  (lambda (stx)
    (syntax-parse stx
      [(_ . n:nat)
        (nat->unary (syntax->datum #'n))])))
        
(begin-for-syntax
  (define (nat->unary n)
    (if (zero? n)
      #'z
      #`(s #,(nat->unary (sub1 n))))))

Another highlight: reflect on cur's implementation. Metaprogramming can also be done in Cur rather than Racket. Also: everything done in the same language/same file - no outside toolchain/processing needed. No context switching.

Weakness: possibly too flexible/powerful metaprogramming? Not known how to restrict it to enforce soundness/consistency.

Racket provides the ability to create transformers and provides binding utilities?

More reading:

  • Type Systems as Macros POPL 2017
  • Dependent Type Systems as Macros. PACMPL 2020

Also: Coq's notations - https://coq.inria.fr/refman/user-extensions/syntax-extensions.html (Sounds similar to Agda mixfix operators? http://www.cse.chalmers.se/~nad/publications/danielsson-norell-mixfix.pdf)


More experimentation to test out how verbose the parenthesis are:

pure [DecidableEquality] [x: Int, y: Int] {
  if (x == y) then {
    Some(Refl (x))
  } else {
    None
  }
}

pure [map option (x: Option of (_))] {
  if let [Some y] = (x) then {
    f(y)
  } else {
    None
  }
}

let [Some x] = (y) else { throw some effectful error };

proc [ReadFile] [path: FilePath, length: Nat] -> (Vector of (String) and size (length)) {

}

type [FList] [n: Nat] = [list: List of (String), (length of (list)) == (n)]

type [Finite list of size ()] [n: Nat] = [list: List of (String), (length of (list)) == (n)]

type [Finite list of size (n: Nat)] = [list: List of (String), (length of (list)) == (n)]

What if, back to our [] () {} idea:

  • () is what you'd expect: value grouping
  • {} gives a thunk Thunk<Set<FreeVariable>>
  • [], aka "bindings", behaves like a function that takes a thunk of the right free-variables and returns a function? This isn't enough as we'd want to manipulate these bindings, i.e. inject more bindings for custom syntax, or read the binding names for import statements.

What if bindings are just strings?

What about inline type annotations - how would they fit in? What about other annotations?

More experimentation around pattern matching:

let [my option] = (Some (3));
let [known value] = (42);

let [result] = (match (my option) {
    (Some (known value)) => ("First");
    (Some [value]) => (
        let [twice] = ((value) * (2));
        "Second: {twice}"
    );
    (None) => "Third";
});

(Point { [x], [y] }){
    Point { y: (x), x: (y) }
}

let [character position] = (Point { x: (4), y: (2) });
let (Point { [x], ... }) = (character position);
x

The programming language 3R by Andrew Black had a similar idea around names that are made up of multiple words and can have embedded arguments:

https://youtu.be/JqYCt9rTG8g?si=icpAMTeHofPB0rEe&t=1197


Implementing lambda calculi with very small cores.

Not including parsing.

Step 1. Untyped Lambda Calculus

  1. Syntax
  2. Reduce
  3. Normalize
  4. Evaluate
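A minimal sketch of what Step 1 could look like (my own Haskell; names like step and normalize are made up for illustration), covering the syntax, capture-avoiding substitution, a single normal-order reduction step, and normalization:

data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving (Show, Eq)

-- Free variables of a term.
free :: Term -> [String]
free (Var x)   = [x]
free (Lam x b) = filter (/= x) (free b)
free (App f a) = free f ++ free a

-- Pick a name not in the given list by appending a numeric suffix.
freshName :: String -> [String] -> String
freshName y used = head [n | n <- map ((y ++) . show) [0 :: Int ..], n `notElem` used]

-- Capture-avoiding substitution: subst x s t computes t[x := s].
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | x == y    = s
  | otherwise = Var y
subst x s (App f a) = App (subst x s f) (subst x s a)
subst x s (Lam y b)
  | x == y          = Lam y b                                   -- x is shadowed
  | y `elem` free s = subst x s (Lam y' (subst y (Var y') b))   -- rename y to avoid capture
  | otherwise       = Lam y (subst x s b)
  where y' = freshName y (x : free s ++ free b)

-- One leftmost-outermost (normal-order) reduction step, if any redex exists.
step :: Term -> Maybe Term
step (App (Lam x b) a) = Just (subst x a b)
step (App f a)         = case step f of
                           Just f' -> Just (App f' a)
                           Nothing -> App f <$> step a
step (Lam x b)         = Lam x <$> step b
step (Var _)           = Nothing

-- Normalize by iterating step; diverges on terms with no normal form.
-- e.g. normalize (App (Lam "x" (Var "x")) (Var "z")) == Var "z"
normalize :: Term -> Term
normalize t = maybe t normalize (step t)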

Step 2. Simply Typed Lambda Calculus

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 3. Hindley Milner Type System for Parametric Types

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 4. Calculus of Constructions vs PiSigma?

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 5. Add self types

  1. Syntax
  2. Type Inference
  3. Type Check
  4. Reduce
  5. Normalize
  6. Evaluate

Step 6. Add Quantitative types

TODO: needed? Can linear/affine types be desugared to monads or something else simpler in the core?


https://www.youtube.com/watch?v=ov0roDPDBFs&ab_channel=MicrosoftResearch

  • Untyped vs dynamically typed vs static dumb types vs static sexy types
  • Ornaments - adding and removing constraints to/from types
  • Ending up with "megatypes" that contain all the properties we could ever possibly need in the future
  • Being able to express what we want, what we know, and what we don't know, so we could check statically and defer any remaining uncertainties to runtime.
  • The ability to intuitively understand failures - having runtime state examples is nicer than a type checker failure.

https://personal.cis.strath.ac.uk/conor.mcbride/pub/OAAO/LitOrn.pdf Ornaments!

The idea that List is just Nat with more information, and Vec is just List with more information, reminds me of another paper about coloured type theory, where we can systematically erase coloured information to form another types to get theorems for free.

https://www.researchgate.net/profile/Jean-Philippe-Bernardy/publication/266654430_Type-Theory_in_Color/links/59427c180f7e9b3a5fa62c28/Type-Theory-in-Color.pdf Type-Theory In Color (Jean-Philippe Bernardy and Guilhem Moulin)

Comparison from Bernardy et al.:

Ornaments Relating variants of dependently-typed programs have been a concern for a long time. The idea of ornamenting inductive structures have been proposed to remedy this problem (Dagand and McBride 2012). Here, we instead focus on erasure (let the user specify an ornamented type and recover the relation with its erasure) instead of specifying ornamentation of already existing types. This is not much of a difference in practice, because ornaments typically can only be applied to a single type. An advantage of colors over ornaments is that colors integrate natively with existing type-theories, while ornamentation relies on encodings; additionally we get free equalities when working with erasures, and colored pairs reveal parametricity properties. The chief advantage of ornaments is that any algebra can be used to ornament datatypes, while we are limited to structural relations.

Ornaments seem really cool. I haven't read the paper fully so I might be misunderstanding, but it's nice to have these new fancier types at a level of descriptors that can be reflected upon. That said, that particular idea itself seems to be called "Universe construction" as McBride mentions, perhaps similar to Idris' idea of the "universe pattern" of encoding different types in a datatype, moving what is otherwise a primitive language construct into a value that we can work with.

A highlight:

Libraries should deliver encoded datatypes, to be taken as themes for local variation, ensuring that programmers can work with the precision appropriate to the task at hand without compromising the broader exchange of data. We should not lose the general functionality of a list library just because it is sometimes convenient to work with the precision of vectors, or ordered lists, or ordered vectors, or any of the other multifarious possibilities which we are now free to farm. At the same time, we might develop libraries of ornaments. Nobody should have to think about how to index a particular data structure by its number of nodes, or a bound on its depth, or its flattening to a sequence: such recipes belong in the repertoire. Declaring—rather than defining—datatypes should now be seen as an impediment to the properly compositional data design process, punished by payment for what is rightfully free. As the Epigram team has shown, it is not even necessary to declare the datatype which is used for the descriptions of datatypes: with care, and a handful of predefined components, a sufficiently expressive universe can be persuaded to appear as if it is self-encoded (Chapman et al., 2010).

Oh, and another highlight, about something I've always been wondering:

At the moment, our usual means to refer to particular components of index information tends to be rather positional and is thus unlikely to scale: can we adapt record types to fit this purpose?

More potentially useful reading: http://www.cs.ox.ac.uk/jeremy.gibbons/publications/progorn.pf

How does this relate to containers?


The gradual dependent types track:

These allow defining e.g. List a = Vec a ?, with full spectrum from untyped to richly typed, with inserted runtime checking based on the accumulated information/constraints that a ?-typed value has at runtime.

My naive opinion: seems very ergonomic, although with less reflective power than ornaments, and given a definition of a rich type, we can only weaken it. Given a List a, it seems to make a lot of sense that we can derive a Vec a n since we have all the information from it. Perhaps we already have a way to "strengthen" a dumb type, i.e. via dependent pairs to attach more properties to an existing dumb type, giving a lot more flexibility than what indexing might be able to give us. E.g., instead of Vec a n, we have (as exemplified by Eremondi et al's GEq paper) FList a n = Sigma(x: List a, length x = n). The downside is that we probably won't get as many proofs for free, as the extra predicates are no longer baked into the structure of the inductive type.
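To make the "List is Vec with the index hidden" direction concrete, here is a small Haskell sketch (mine; the names are illustrative): Vec bakes the length into the type, and an existential wrapper plays the role of the dependent sum.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- Indexed version: the length is part of the structure of the inductive type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Morally Sigma(n : Nat, Vec n a): a Vec whose length is hidden, i.e. a plain list.
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

toSomeVec :: [a] -> SomeVec a
toSomeVec []       = SomeVec VNil
toSomeVec (x : xs) = case toSomeVec xs of
                       SomeVec v -> SomeVec (VCons x v)

fromSomeVec :: SomeVec a -> [a]
fromSomeVec (SomeVec v) = go v
  where
    go :: Vec n b -> [b]
    go VNil         = []
    go (VCons y ys) = y : go ys

The FList-style encoding from the paper goes the other way: keep the plain List a and attach the length fact as a separate proof, rather than baking it into the constructors.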


Blame lambda calculus track: TODO

This is my attempt to understand the differences between various ways of doing dependent type systems with lambda calculus:

Prerequisites

Fixed point theory?

Starting with a positive position, every time we go to the left of an arrow, we flip the polarity. Then, we forbid recursion in the negative positions, so that the type is strongly normalizing and we can't encode nontermination.
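A quick Haskell illustration (mine) of why the negative positions are the dangerous ones: when the recursive occurrence sits to the left of an arrow, we can build a diverging term without writing any explicit recursion.

-- Bad occurs in negative position (to the left of the arrow in its own field).
newtype Bad = Bad (Bad -> ())

selfApply :: Bad -> ()
selfApply (Bad f) = f (Bad f)

-- The classic omega term, reconstructed with no recursive definitions in sight.
loop :: ()
loop = selfApply (Bad selfApply)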

Peng Fu's Self Types

Aaron Stump's Constructor Constrained Recursive Types (CDLE, Cedille)

The present paper proposes two new type constructs for this, called constructor-constrained recursive types and lifting. The former deepens earlier work by Fu and Stump on System S, which solves the problem of induction using a typing construct called self types, to allow the type to refer to the subject of the typing via bound variable x in ιx.T (Fu & Stump, 2014). To prove consistency, they rely on a dependency-eliminating translation to System Fω plus positive-recursive types. This method is not applicable to analyze a system with large eliminations, where dependence of types on terms is fundamental.

Aaron Stump's use of Kopylov's Dependent Intersections, Miquel's Implicit Products, along with a primitive equality type

The closest related work is my own paper on the Calculus of Dependent Lambda Eliminations (CDLE) [39]. The goal with CDLE is similar to that of the present paper: extend a PTS with new type forms for induction and dependently typed programming. CDLE adds a type construct called constructor-constrained recursive types. This is a form of recursive type based on the idea of preserving the typings of lambda-encoded constructors at each approximation (as familiar from fixed-point theory) to the recursive type. While adequate for deriving induction and while not requiring any change to the term language of the system (i.e., pure lambda calculus), constructor-constrained recursive types are still a rather complex feature, with fairly involved kinding rules. Their semantics requires a nontrivial extension to the already rather technical machinery needed for recursive types. The approach of this paper greatly improves on CDLE, by identifying a combination of reasonably simple typing constructs known already from the literature that suffice for deriving induction. No complex new typing construct is required. Indeed, except for the equality types, ιλP2 is a subsystem of CDLE. So just one simple addition is enough to obviate the entire complex machinery of constructor-constrained recursive types, which took several years to formulate and analyze. Needless to say, this was quite unexpected.

Perhaps these screencasts would help motivate the need for these extra constructs on top of Calculus of Constructions (CC) https://www.youtube.com/watch?v=A5c-KVMRSds

Ulf Norell's insane pi-types?

Actually I haven't read too far in to know if this belongs in this category.

Note: it's about a special kind of relation called logical relations.

The preprint paper also talks about something like a conjugation operation.

Hmmm, the podcast mentions the study of relations, such as relational algebra. Interesting; looking at Wikipedia, I see familiar things from my uni courses on databases: https://en.wikipedia.org/wiki/Relational_algebra

Would this take us a step closer towards my dream of "programming as information management"?


Links to better explain a related paper - Observational Type Theory: http://lambda-the-ultimate.org/node/3547#comment-75314 - apparently the type system is set up to avoid ever normalizing the axiom of extensionality, and therefore avoid the computational inadequacy of extensionality?

Goals:

  • namespace management
    • namespace,
    • qualification
    • alias/renaming
    • collision management
  • abstraction/info-hiding e.g. private vs exposed interface
  • reuse/convenience: e.g. module parameters
  • Should it allow cyclic dependencies? Some things may require mutual recursion. It doesn't seem like a good design decision, but we might be enforcing an unnecessary restriction in some domains where this is useful. Haskell seems to support them (https://wiki.haskell.org/Mutually_recursive_modules) and I think I read/heard somewhere that it was nice to be able to split an existing mutually recursive definition from one file into multiple files. Ah, I think Rust also supports cyclic modules but not cyclic crates.
    • We should make the default acyclic, and have a separate opt-in syntax (or linter warning suppression syntax) for cyclic modules.
    • How does this impact, say, incremental compilation?
  • interaction with package management and dependency management?
  • nestable
  • modules as first-class values
    • should the dependency graph be reflectable?
    • With module signatures and module-to-module functions, it's starting to feel like Rust's trait system.
  • Compilation/build-process and tooling considerations... todo

Modular data type: Data types a la carte by Wouter Swierstra
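As I understand it, the core trick of that paper is the coproduct-of-functors presentation (a rough Haskell sketch; the smart-constructor machinery with the :<: subsumption class is omitted): datatypes are fixpoints of signature functors, and new constructors are added by summing signatures.

{-# LANGUAGE TypeOperators, DeriveFunctor #-}

newtype Fix f = In (f (Fix f))

-- Coproduct of two signature functors.
data (f :+: g) e = Inl (f e) | Inr (g e) deriving Functor

-- Two independent signature pieces...
data Val e = Val Int  deriving Functor
data Add e = Add e e  deriving Functor

-- ...combined a la carte into one expression type.
type Expr = Fix (Val :+: Add)

eval :: Expr -> Int
eval (In (Inl (Val n)))   = n
eval (In (Inr (Add x y))) = eval x + eval y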

Differences I want from Haskell:

  • Have syntax that encourages listing out each imported symbol, and have explicit wildcards.
  • Constructors for data types are automatically namespaced
    • Do we follow Java by automatically opening the data type's namespace when pattern matching against that type?