@marianoguerra
Last active May 27, 2018 10:59
Draft of BEAM interoperability topics

Language Server

Problem

Editor support for BEAM languages is pretty limited, and the effort required to add it for each language is substantial.

Solution

A unified Language Server implementation that supports registering language implementations through "plugins".

We all maintain the core together, and each language implements its own plugin.

There are already two projects:

We should pick one of them as the standard one and define the plugin interface.

Module Aliasing

Problem

Elixir, Fez and Alpaca module names have a fixed prefix followed by an optional nested path of dot-separated module names.

Calling those modules from languages like Erlang, Efene, LFE, Clojerl and others implies writing the full name each time and quoting the atom to support uppercase letters and dots in the name.

Note: Erlang supported nested module names as an experimental feature that was later removed; this could be revisited here.

Proposed Solution

A standard way to alias modules to give them shorter, language-friendly names.

Solution 1: Module level aliasing with module attributes

-alias('Elixir.String', exString).

Pros

  • Simple to implement with a parse transform
  • Per module declaration of aliases

Cons

  • Frequently used aliases must be declared in each module where they are used

    • Not too different from imports in other languages
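As a rough illustration of Solution 1, the following is a minimal sketch of such a parse transform. The module name alias_transform is hypothetical. The sketch also assumes the attribute is written with a single tuple argument, -alias({'Elixir.String', exString})., because as far as I know the stock Erlang parser only accepts one term in a custom attribute. Only remote calls inside function bodies are rewritten.

```erlang
-module(alias_transform).
-export([parse_transform/2]).

%% Hypothetical sketch: collect -alias({Module, Alias}) attributes and
%% rewrite every remote call Alias:F(...) into Module:F(...).
parse_transform(Forms, _Options) ->
    Aliases = maps:from_list(
                [{Alias, Mod} || {attribute, _, alias, {Mod, Alias}} <- Forms]),
    [rewrite_form(Form, Aliases) || Form <- Forms].

%% Only function forms are rewritten; attributes are kept untouched.
rewrite_form({function, _, _, _, _} = Form, Aliases) ->
    rewrite(Form, Aliases);
rewrite_form(Form, _Aliases) ->
    Form.

%% A remote call target with a literal module atom: swap the alias, if any.
rewrite({remote, Anno, {atom, AtomAnno, Name}, Fun}, Aliases) ->
    Mod = maps:get(Name, Aliases, Name),
    {remote, Anno, {atom, AtomAnno, Mod}, rewrite(Fun, Aliases)};
%% Generic walk over the rest of the abstract format.
rewrite(Node, Aliases) when is_tuple(Node) ->
    list_to_tuple([rewrite(E, Aliases) || E <- tuple_to_list(Node)]);
rewrite(Node, Aliases) when is_list(Node) ->
    [rewrite(E, Aliases) || E <- Node];
rewrite(Node, _Aliases) ->
    Node.
```

A module would opt in with -compile({parse_transform, alias_transform}). and could then call exString:upcase(S) where it would otherwise write 'Elixir.String':upcase(S).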

Solution 2: rebar3 plugin

A rebar3 plugin that allows specifying the aliases at project level, possibly with a convention-based conversion (for example, replacing the language prefix with a shorter one, Elixir -> ex, and the dots with underscores).

It should provide a way to dump the aliases for all existing modules into a file so that manual changes can be made later.
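The convention-based conversion could look roughly like the sketch below. The module name alias_convention and the prefix table are assumptions made for illustration. Only the Elixir -> ex mapping comes from the text above, and names without a known prefix are returned unchanged.

```erlang
-module(alias_convention).
-export([short_name/1]).

%% Hypothetical prefix table: language prefix -> short prefix.
prefixes() ->
    #{"Elixir" => "ex"}.

%% 'Elixir.Foo.Bar' -> ex_Foo_Bar: shorten the prefix, join with underscores.
short_name(Module) when is_atom(Module) ->
    [Prefix | Rest] = string:split(atom_to_list(Module), ".", all),
    Short = maps:get(Prefix, prefixes(), Prefix),
    list_to_atom(lists:flatten(lists:join("_", [Short | Rest]))).
```

The resulting atoms start with a lowercase letter and contain no dots, so they need no quoting in Erlang. The case of the remaining segments is kept as is, which is a design choice of this sketch and not part of the proposal.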

Pros

  • Less work for the user

Cons

...

Open Problems

  • how to handle nested module references from aliases

@micmus on slack:

It still leaves open the question of how elixir aliases allow working with nested modules, e.g. alias Foo.Bar, as: Baz and later Baz.X actually refers to 'Elixir.Foo.Bar.X' module
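One possible way to handle this, sketched under the assumption that aliases are kept in a map from alias atom to full module atom: split the referenced name on dots, expand the first segment if it is a known alias, and append the remaining segments. The module name alias_nested and the function resolve/2 are hypothetical.

```erlang
-module(alias_nested).
-export([resolve/2]).

%% Aliases maps an alias atom to the full module atom, for example
%% #{'Baz' => 'Elixir.Foo.Bar'}. Unknown names are returned unchanged.
resolve(Name, Aliases) when is_atom(Name), is_map(Aliases) ->
    [Head | Rest] = string:split(atom_to_list(Name), ".", all),
    case maps:find(list_to_atom(Head), Aliases) of
        {ok, Full} ->
            Parts = [atom_to_list(Full) | Rest],
            list_to_atom(lists:flatten(lists:join(".", Parts)));
        error ->
            Name
    end.
```

With the alias from the quote, resolve('Baz.X', #{'Baz' => 'Elixir.Foo.Bar'}) yields 'Elixir.Foo.Bar.X'.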

User Defined Types Interop

Problem

Elixir, Fez and Alpaca allow users to define "User Defined Types" (UDTs) like Structs/Records, Types/Classes, Discriminated Unions etc.

Some of these allow dispatching to common functions based on the type.

Since the BEAM doesn't provide any feature to "tag" these UDTs or to do dynamic dispatching based on a tag, each language implements them on top of existing types, like tagged tuples or maps with special attributes. The dispatching is done by consolidating the implementations into a module at compile time, using pattern matching.
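To make the two encodings concrete, here is a small sketch that recovers the "tag" from either of them. It relies on Elixir structs being maps with a '__struct__' key and on Erlang records being tuples whose first element is the record name. The module name udt_tag is hypothetical.

```erlang
-module(udt_tag).
-export([tag_of/1]).

%% Returns {ok, Tag} for the two common encodings, or error otherwise.
tag_of(#{'__struct__' := Tag}) when is_atom(Tag) ->
    %% Elixir-style struct: a map with a marker key.
    {ok, Tag};
tag_of(Value) when is_tuple(Value), tuple_size(Value) > 0,
                   is_atom(element(1, Value)) ->
    %% Record-style tagged tuple: the first element is the tag.
    {ok, element(1, Value)};
tag_of(_Other) ->
    error.
```

Note that tag_of({ok, 1}) also returns {ok, ok}: a plain tuple that happens to start with an atom cannot be told apart from a UDT, which is part of the motivation for first-class tagging.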

These different implementations make interoperability hard: if one language wants to use a library or type from another and gets a UDT, operating on it is language specific.

There's also an opportunity for the Erlang standard library to fix its data type API inconsistencies by implementing UDTs for existing types.

Note: See EEP 42: Frames for related discussion

Proposed Solution

Support for tagged values on the BEAM

A way to create tagged values and retrieve the tag.

A way to dispatch based on the tag (pattern match, guards)

Syntactic sugar in Erlang to work with UDTs.

Standard data format for UDTs with a proposal to add support on the BEAM

There are two existing ways languages implement UDTs at the moment:

Considerations

Subtypes

Fez (and maybe Alpaca) have "types with subtypes", like Discriminated Unions, where there's a base type and a set of subtypes (Option: Some | None)

This will need a way to also attach the extra information somewhere.

Alternatively the tag could be a tuple {option, some}
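A sketch of the tuple-as-tag alternative, using plain tuples since the BEAM has no tagged values today. The {Tag, Payload} layout, the module name udt_option and all function names are assumptions made for illustration.

```erlang
-module(udt_option).
-export([some/1, none/0, type_of/1, get_or/2]).

%% Hypothetical encoding: {Tag, Payload}, with Tag = {BaseType, SubType}.
some(Value) -> {{option, some}, Value}.
none() -> {{option, none}, undefined}.

%% The base type alone, ignoring the subtype.
type_of({{Base, _Sub}, _Payload}) -> Base.

%% Dispatch on the subtype by pattern matching on the composite tag.
get_or({{option, some}, Value}, _Default) -> Value;
get_or({{option, none}, _Payload}, Default) -> Default.
```

Code that only cares about the base type matches on the first element of the tag, while code that needs the subtype matches on the whole tuple.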

Class Hierarchies

Fez (and maybe Alpaca) supports classes and keeps the base classes as a list at runtime to dispatch to them if needed.

This will need a way to also attach the extra information somewhere.

Note: since class hierarchies are known at compile time and don't change at runtime, they could be "erased" by building static dispatch for all known methods.

Clojure Metadata

Clojure allows attaching arbitrary metadata to values (see Clojure Metadata). The language itself also uses it to attach some information, for example on functions and modules.

The same mechanism as the one needed for Fez/Alpaca Subtypes and Class Hierarchies could be used here.

Dispatch Consolidation/Dynamic Dispatch

Implementations that dispatch to functions based on types consolidate the dispatch into the protocol/interface to improve performance. This is a compile-time phase that requires information from all the modules in the project (its own and those in dependencies) to gather all the implementations and build the static dispatch functions (usually with dynamic fallbacks at the end).

This requires a place to store the information as it is gathered and a final step after module compilation to generate the dispatch modules.
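The shape of a generated dispatch module might look like the sketch below. It is written by hand here, assumes UDTs are encoded as tagged tuples, and uses hypothetical names throughout (count_dispatch, the stack and pair tags, and the convention that a tag may name a module exporting count/1).

```erlang
-module(count_dispatch).
-export([count/1]).

%% Static clauses: one per implementation known at build time.
count({stack, Items}) -> length(Items);
count({pair, _, _}) -> 2;
%% Dynamic fallback: assume the tag names a module exporting count/1.
count(Value) when is_tuple(Value), tuple_size(Value) > 0 ->
    Tag = element(1, Value),
    Found = is_atom(Tag)
        andalso Tag =/= ?MODULE
        andalso code:ensure_loaded(Tag) =:= {module, Tag}
        andalso erlang:function_exported(Tag, count, 1),
    case Found of
        true -> Tag:count(Value);
        false -> erlang:error({no_implementation, count, Value})
    end;
count(Value) ->
    erlang:error({no_implementation, count, Value}).
```

The static clauses are what consolidation buys: a single pattern match instead of a module lookup on every call. The fallback keeps types that were unknown at build time working, at a higher cost per call.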

This could get some standard support, initially from tools like mix and rebar3. It could also be considered as a candidate for better support in the Erlang compiler itself.

To improve performance, a per-module cache of this information can be (and in some cases already is) generated and stored, to avoid gathering the information from scratch on each incremental compilation when just a few modules change. Support for a place to store this cached data, to invalidate it and to clean it on standard clean tasks would be helpful.

I'm not aware of any runtime registration of new dispatch implementations. This should be taken into consideration in case there is a need, which could be solved with basic runtime support and hot code reloading. It could be useful when a particular type is dynamically dispatched in the catch-all clause of the consolidated dispatch function and that justifies adding a static clause at runtime.

Kitchensink

Note: Don't know if this fits this discussion but I put it here in case it does.

Languages like Clojure and Fez have support for lazy data types, not sure if any of this would help with that.

Languages like Clojure and Fez have a way to wrap an immutable value in a "mutable reference"

Bikeshed

The record syntax could be extended to support UDTs, for example if prefixed like #t.option/some 42 or #t/set {1, 2, 2, 3}

Alternatively it could be a parse transform or new syntax.
