jhugman/Code generation pattern proposal.md

## Code generation pattern proposal.md

      
## Tool-specific Intermediate Representation

The Intermediate Representation (IR) resolves to a tree of Descriptors, e.g.:

- EnumDescriptor, which has EnumVariantDescriptors, which may have FieldDescriptors.
- RecordDescriptor, which has FieldDescriptors.
- ObjectDescriptor, which has MethodDescriptors, which may have ArgDescriptors.

These represent the concrete types and syntactic structures within those types.
struct EnumDescriptor {
	name: String,
	variants: Vec<EnumVariantDescriptor>,
}

Some of these descriptors will point to other concrete types, e.g.:
struct FieldDescriptor {
	field_name: String,
	type_: TypeIdentifier,
	default: Option<Value>,
}

These descriptors are shared between all backends and serialize to the IR.
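
As a rough illustration of the tree shape, here is a minimal sketch that builds the descriptors for a made-up `Shape` enum. `TypeIdentifier` and `Value` are not defined in this proposal, so the two enums below are hypothetical stand-ins, as is the `fields` member on `EnumVariantDescriptor`. Serialization is left out.

```rust
// Hypothetical stand-ins: the proposal does not define these two types.
#[derive(Debug, Clone, PartialEq)]
enum TypeIdentifier {
    Int32,
    Str,
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

#[derive(Debug, Clone, PartialEq)]
struct FieldDescriptor {
    field_name: String,
    type_: TypeIdentifier,
    default: Option<Value>,
}

#[derive(Debug, Clone, PartialEq)]
struct EnumVariantDescriptor {
    name: String,
    // Assumed layout: empty for a variant that carries no data.
    fields: Vec<FieldDescriptor>,
}

#[derive(Debug, Clone, PartialEq)]
struct EnumDescriptor {
    name: String,
    variants: Vec<EnumVariantDescriptor>,
}

/// Builds the descriptor tree for a made-up enum with two variants:
/// `Empty` (no fields) and `Circle` (one Int32 field, `radius`, defaulting to 1).
fn shape_descriptor() -> EnumDescriptor {
    EnumDescriptor {
        name: "Shape".to_string(),
        variants: vec![
            EnumVariantDescriptor {
                name: "Empty".to_string(),
                fields: vec![],
            },
            EnumVariantDescriptor {
                name: "Circle".to_string(),
                fields: vec![FieldDescriptor {
                    field_name: "radius".to_string(),
                    type_: TypeIdentifier::Int32,
                    default: Some(Value::Int(1)),
                }],
            },
        ],
    }
}
```

Note that the field points at its type through a `TypeIdentifier` rather than holding another descriptor, which is what keeps the tree free of backend-specific detail.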
## Backend-specific types wrap type descriptors

For the descriptors that represent types (e.g. Object, Enum etc.), there exists a struct that wraps the descriptor and gives access to its sub-descriptors.
struct KotlinEnum {
	inner: EnumDescriptor
}

impl KotlinEnum {
	fn variants(&self) -> Vec<EnumVariantDescriptor> {
		self.inner.variants()
	}
}

These structs implement the trait CodeType.
CodeType is a trait that emits foreign language code for specific tasks, mostly identifiers and expressions (e.g. function calls into its own machinery).
Precisely what is needed depends upon the tool this generator is part of, e.g. uniffi needs lift and lower machinery.
It also knows how to generate all the code for its inner Descriptor, via fn render_declaration(&self).
impl CodeType for KotlinEnum {
	fn name(&self) -> String {
		self.inner.name().to_camel_case()
	}

	fn internals(&self) -> String {
		format!("Uniffi{}Internals", self.name())
	}

	fn literal(&self, v: Value) -> String {
		…
	}
	fn lower_into(&self, value: String, buffer: String) -> String {
		format!("{}.lowerInto({}, {})", self.internals(), value, buffer)
	}
	
	fn render_declaration(&self) -> Result<String> {
		EnumDecl(&self).render()
	}
}


A type_oracle knows how to map TypeIdentifiers to CodeTypes.
That is, if a render_declaration() or another CodeType has a TypeIdentifier and the type_oracle, it can look up the CodeType and then reference and manipulate it in the foreign language.
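
One way the lookup could work is sketched below, assuming the oracle is a simple map from identifier to boxed trait object. The `TypeOracle` struct, its `register` and `find` methods, the `render_lower` helper and the shape of `TypeIdentifier` are all hypothetical. `CodeType` is cut down to the two methods the sketch needs.

```rust
use std::collections::HashMap;

// Hypothetical identifier type; the proposal leaves TypeIdentifier undefined.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum TypeIdentifier {
    Int32,
    Enum(String),
}

/// A cut-down CodeType with only the two methods this sketch needs.
trait CodeType {
    fn name(&self) -> String;
    fn lower_into(&self, value: &str, buffer: &str) -> String;
}

struct KotlinEnum {
    name: String,
}

impl CodeType for KotlinEnum {
    fn name(&self) -> String {
        self.name.clone()
    }

    fn lower_into(&self, value: &str, buffer: &str) -> String {
        format!("Uniffi{}Internals.lowerInto({}, {})", self.name, value, buffer)
    }
}

/// Assumed design: a plain map from TypeIdentifier to a boxed CodeType.
struct TypeOracle {
    code_types: HashMap<TypeIdentifier, Box<dyn CodeType>>,
}

impl TypeOracle {
    fn new() -> Self {
        TypeOracle { code_types: HashMap::new() }
    }

    fn register(&mut self, id: TypeIdentifier, code_type: Box<dyn CodeType>) {
        self.code_types.insert(id, code_type);
    }

    /// Returns None for identifiers that no backend type was registered for.
    fn find(&self, id: &TypeIdentifier) -> Option<&dyn CodeType> {
        self.code_types.get(id).map(|ct| ct.as_ref())
    }
}

/// Renders the Kotlin expression that lowers `value`, given only a type id.
fn render_lower(oracle: &TypeOracle, id: &TypeIdentifier, value: &str) -> Option<String> {
    let code_type = oracle.find(id)?;
    Some(code_type.lower_into(value, "buf"))
}
```

The caller of `render_lower` never sees `KotlinEnum`, only the identifier and the oracle, which is the property the paragraph above relies on.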
Aside: how far can we take these CodeTypes? Can a CodeType contain TypeIdentifiers?
Since we can generate a declaration, and ways to call into it, I suspect we can:

- support compound code types (for Option<T>, Array<T> and Map<String, T>)
- support the TransformTowers proposal
- support the external types proposal
- support code types for primitives (though a Rust macro may be needed for this).
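
To make the compound case concrete, here is a minimal sketch in which a code type holds the TypeIdentifier of its inner type and asks the oracle to resolve it when rendering a name. All names are hypothetical, including the `Optional` and `Sequence` identifier variants, and the oracle is reduced to a `match` so the example stays short.

```rust
// Hypothetical identifier type with two compound variants.
#[derive(Debug, Clone, PartialEq)]
enum TypeIdentifier {
    Int32,
    Str,
    Optional(Box<TypeIdentifier>),
    Sequence(Box<TypeIdentifier>),
}

trait CodeType {
    /// The Kotlin type name; compound types consult the oracle for their inner type.
    fn name(&self, oracle: &TypeOracle) -> String;
}

struct PrimitiveCodeType(&'static str);

impl CodeType for PrimitiveCodeType {
    fn name(&self, _oracle: &TypeOracle) -> String {
        self.0.to_string()
    }
}

/// Holds a TypeIdentifier, not a CodeType: the inner type is resolved lazily.
struct OptionalCodeType {
    inner: TypeIdentifier,
}

impl CodeType for OptionalCodeType {
    fn name(&self, oracle: &TypeOracle) -> String {
        format!("{}?", oracle.find(&self.inner).name(oracle))
    }
}

struct SequenceCodeType {
    inner: TypeIdentifier,
}

impl CodeType for SequenceCodeType {
    fn name(&self, oracle: &TypeOracle) -> String {
        format!("List<{}>", oracle.find(&self.inner).name(oracle))
    }
}

/// Simplified oracle: a match instead of a registry.
struct TypeOracle;

impl TypeOracle {
    fn find(&self, id: &TypeIdentifier) -> Box<dyn CodeType> {
        match id {
            TypeIdentifier::Int32 => Box::new(PrimitiveCodeType("Int")),
            TypeIdentifier::Str => Box::new(PrimitiveCodeType("String")),
            TypeIdentifier::Optional(inner) => Box::new(OptionalCodeType {
                inner: (**inner).clone(),
            }),
            TypeIdentifier::Sequence(inner) => Box::new(SequenceCodeType {
                inner: (**inner).clone(),
            }),
        }
    }
}
```

Because each compound type only stores an identifier, nesting falls out of the recursion through the oracle: a sequence of optional Int32 values resolves its name one layer at a time.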

## Generating the declaration with the main templates

The render_declaration method is almost certainly calling into a template, which contains all the code needed to define the type, plus whatever internal/private machinery is required.
The template has access to the type_oracle.
We can stop here, and do all the above with askama. In askama land, we have one template per struct, so in this proposal this would be one template file per struct that implements CodeType.
However, rfk asked me for dreamcode.
I've used an rsx! macro and #[component] syntax, taken directly from the render crate, which itself implements something like JSX. I added the backticks.
Components (in JSX speak) are templates that take a set of arguments and render a representation of those arguments, using strings or other components to do so.
#[component]
fn EnumDecl(type_: &KotlinEnum) -> Result<String> {
	let type_name = type_.name();
	rsx! ```
		public sealed class {{ type_name }} {
			<EnumVariants type_={{type_}} />
		}
		
		internal class {{ type_.internals() }} {
			static fun downOne(v: {{ type_name }}): Int = …
			static fun upOne(v: Int): {{ type_name }} = …
			static fun lowerInto(v: {{ type_name }}, buffer: RustBuffer) {
				…
			}
		}
	```
}

#[component]
fn EnumVariants(type_: &KotlinEnum) -> Result<String> {
	let type_name = type_.name();
	type_.variants().iter().map(|v| {
		if v.fields().is_empty() {
			rsx! ```
				public object {{ v.name() }} : {{ type_name }}
			```
		} else {
			rsx! ```
				public class {{ v.name() }}(
					<FieldsDecl fields={{ v.fields() }} />
				) : {{ type_name }}
			```
		}
	}).join("\n")
}

#[component]
fn FieldsDecl(fields: &Vec<FieldDescriptor>) -> Result<String> {
	fields.iter().map(|f| {
		let name = f.name();
		let type_ = type_oracle.find(f.type_id())?;
		if let Some(default) = f.default_value() {
			rsx! ```
				val {{ name }}: {{ type_.name() }} = {{ type_.literal(default) }}
			```
		} else {
			rsx! ```
				val {{ name }}: {{ type_.name() }}
			```
		}

	}).join(",\n")
}

The interesting bits here are:

- Templates are composed of text and other templates.
- Intra-template logic is Rust, instead of additional templating logic.
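
Both points can be approximated today without any macro, using plain functions and `format!`. The sketch below is only an approximation of the dreamcode. The `Field` and `Variant` structs are hypothetical, simplified stand-ins for the descriptors, and the emitted Kotlin adds `()` after the supertype so that the output is valid for a sealed class.

```rust
// Hypothetical, simplified stand-ins for the descriptors.
struct Field {
    name: String,
    type_name: String,
}

struct Variant {
    name: String,
    fields: Vec<Field>,
}

/// "Component" rendering a constructor parameter list.
fn fields_decl(fields: &[Field]) -> String {
    fields
        .iter()
        .map(|f| format!("val {}: {}", f.name, f.type_name))
        .collect::<Vec<_>>()
        .join(", ")
}

/// "Component" rendering each variant; the branching is ordinary Rust.
fn enum_variants(type_name: &str, variants: &[Variant]) -> String {
    variants
        .iter()
        .map(|v| {
            if v.fields.is_empty() {
                format!("    object {} : {}()", v.name, type_name)
            } else {
                format!(
                    "    class {}({}) : {}()",
                    v.name,
                    fields_decl(&v.fields),
                    type_name
                )
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Top-level "component", composed of text plus another component.
fn enum_decl(type_name: &str, variants: &[Variant]) -> String {
    format!(
        "sealed class {} {{\n{}\n}}",
        type_name,
        enum_variants(type_name, variants)
    )
}
```

What the macro would add over this is readability: the foreign-language text stays in one visually contiguous block instead of being scattered across format strings with escaped braces.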

We did use macros in askama, but these components are a somewhat more ergonomic unit of template re-use.
I don't know if render can be persuaded to do this, or if we have to write something ourselves, perhaps based on syn-rsx.
Wishlist, as an aside, if we were to build our own rsx:

- Works on Rust stable.
- Markdown triple backticks FTW.
- Trimming the indent so it matches the indent of the rsx! token, or behaves like Kotlin's trimIndent.
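
The indent trimming is easy to prototype on its own. The function below is a minimal sketch that behaves roughly like Kotlin's trimIndent: it drops a blank first and last line and removes the smallest common indent from the rest. `trim_indent` is a hypothetical helper, not part of render or askama, and it counts indent in characters, so it assumes tabs and spaces are not mixed.

```rust
/// Removes the common leading whitespace from every non-blank line and
/// drops a blank first and last line, roughly like Kotlin's `trimIndent`.
fn trim_indent(text: &str) -> String {
    let mut lines: Vec<&str> = text.lines().collect();
    if lines.first().map_or(false, |l| l.trim().is_empty()) {
        lines.remove(0);
    }
    if lines.last().map_or(false, |l| l.trim().is_empty()) {
        lines.pop();
    }

    // Smallest indent (counted in chars) among lines that contain text.
    let indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.chars().take_while(|c| c.is_whitespace()).count())
        .min()
        .unwrap_or(0);

    lines
        .iter()
        .map(|l| {
            if l.trim().is_empty() {
                // Blank lines in the middle are kept, but emptied.
                String::new()
            } else {
                l.chars().skip(indent).collect::<String>()
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Applied to the body of an rsx! block, this would let the template text be indented to match the surrounding Rust while the generated Kotlin starts at column zero.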