Dioxus Labs + “High-level Rust”

Recently an article very critical of Rust swept r/rust, r/programming, and HackerNews. This isn’t the first time someone has been critical of Rust, but in my experience, it’s one of the few times I didn’t see the typical host of responses piling on the author about “doing Rust wrong.” The post was so thorough and so poignant that it shut up even the loudest of zealots.

With that article, I felt a vibe shift. Someone finally called out Rust on its bullshit, so to speak. It’s true - “Rust gamedev ecosystem lives on hype” - and it’s about time we addressed it. There are plenty of Rust folks whose answer would be “just get gud.” But the article rightly points out that for startups, small teams, and basically anyone who just wants to ship, Rust is not the answer.

Why does my opinion matter at all?


A year ago, I went full time as maintainer of Dioxus and founder of Dioxus Labs. Dioxus Labs, a startup backed by YCombinator and Khosla Ventures, was founded on the thesis that we could wrangle Rust into the “do-it-all” language of the future for app development.

Before Rust, I spent time across the entire stack, writing Python, C, CUDA, and JavaScript for plasma physics simulations (HPC). Not only was I solving partial differential equations on expensive supercomputing clusters, I was wrangling the shittiness of Python installations, banging my head against broken CUDA drivers, stumbling through terrible C build chains, and trying to wrap my head around the ridiculousness of the JavaScript ecosystem. There’s a reason tools in research usually suck: researchers don’t have time to wade through a zillion different configs, toolchains, and poorly maintained documentation outside their core discipline.

Discovering Rust six years ago was like finding a candle in the darkness. I ported over my electron plasma simulation code in less than a week and had it wrapped in a Yew-based frontend in just a few days. What originally took me two years to build I had replicated in just a week with a single language. It felt like there was finally a tool “sharp” enough for the next generation of impossible problems.

A common response to the LogLog article was “Rust is not designed for that use case.” And sure, maybe if you used the right combination of scripting languages on top, you might ship a game faster. But why can’t we fight to make Rust better and be that “good enough” tool for the next generation of plasma physicists, biotech researchers, and AI engineers? I don’t care if TypeScript is 20% more productive at UI development if I could use my existing language knowledge to sequence the human genome or protect the world’s internet from DDoS attacks.

I’m of the mindset that Rust can and should be improved until it’s good enough at most use cases people throw at it - I’m not interested in giving up. I’d rather the “Kwisatz Haderach of programming” come this generation, not the next. Rustfmt, Rustdoc, and Cargo are amazing projects - why throw them out? There are real problems in this world to solve, and life is too short to keep waiting for yet another language to build up an equivalently large developer community, funding, and momentum.

Rust’s success is not a technical feat, but rather a social one


I argue that Rust’s success is social in nature. My hot take: Rust’s popularity does not come from its technical merits. For another language like Nim, Odin, or Crystal to rise to the limelight, it would need to move heaven and earth. Rust neatly fills the gap that the developer community wants from a modern programming language: speed, type safety, and portability. Rust is fast - unlike Python; type-safe - unlike JavaScript; and portable - you can use it basically anywhere LLVM targets. The same cannot be said about languages with large runtimes or weird compiler infrastructure.

Rust’s social success has encouraged a developer community to channel its energy into bringing modern affordances to a language at a large scale. Projects like Rust-Analyzer, rustfmt, cargo, miri, rustdoc, mdbook, etc. are all social byproducts of Rust’s success. These projects emerged because a driven group of developers in the modern era wanted a better programming language. There’s nothing stopping other new languages from having the same affordances, either, but it’s a lot of work. Really, the only language I’ve seen emerge with anything remotely similar to Rust’s developer tools is Gleam. It’s possible, but it requires the social buy-in of a lot of talented people to pull off.

There’s obviously an appetite in the developer community for something like Rust. I started Dioxus with the mindset that Rust is basically the best shot we have to achieve the “holy grail of application development” and that we’d be determined to either 1) work around the language’s warts or 2) improve the language where possible. We’ve implemented a ton of work-arounds… but LogLog’s post makes it clear we need to start pushing the language forward.

Don’t misunderstand me: I think Nim, Odin, and Crystal are incredible projects in their own right, but Rust definitely stands out from the pack in terms of its success. Google, Facebook, Microsoft, Amazon, and Cloudflare all ship Rust in their stacks. Rust is on track to land in the Linux kernel… an amazing feat. Instead of throwing out the baby with the bathwater, I’m proposing that we get serious and prioritize the important issues holding Rust back.

Dioxus and Rust


Much like LogLog, Dioxus is all-in on Rust. Not only do we need to quickly write and ship Rust code, we also need to make sure users of Dioxus can ship their apps quickly. We need to somehow convince startups and enterprises that yes, you should start your new greenfield project in Rust.

And yet, we struggle with many of the same issues that LogLog brought up in their article. In fact, our recent 0.5 release was focused entirely on working around Rust’s papercuts to give developers a sane way of organizing their state without clones. The success of Dioxus is directly linked to the Rust programming language, and frankly, I feel like Rust is in many ways holding back what Dioxus could be.

So, I want to throw my thoughts into the arena and rally the support of the community, core team, and corporate sponsors to make Rust a suitable language for rapid development. I completely believe it’s possible, I believe we don’t need to overhaul everything, and I’m personally willing to dedicate Dioxus Labs’ resources to fixing the problems. Dioxus Labs doesn’t have a tremendous amount of capital to spare, but I’m determined that we can fund work on the right changes necessary to “fix Rust.”

I’d rather the Rust foundation and Amazon/Google/Cloudflare/Microsoft sponsor work in the relevant areas rather than Dioxus Labs throw its limited runway at the problems. But, at a certain point, I have to acknowledge the fact that Dioxus won’t take off if Rust is holding it back, so we should do our best to fix what we have. With this post I want to put out a concrete plan, inspired by six years of experience, on how to make Rust suitable for rapid development.

A quick table of contents


  • A Capture trait for automatic cheap clones
  • Automatic partial borrows for private methods
  • Named and optional function parameters
  • Faster unwrap syntax
  • Global dependency cache
  • Formalizing prebuilt crates
  • Making macros run in release by default
  • Caching macro expansion
  • Parallel frontend
  • Incremental linking
  • Rust JIT and hotreloading

Making iterating on Rust code faster

A capture trait for automatic cheap clones - stealing ARC from Swift


Like it or not, Rust has found itself in a set of use cases it was not originally designed for. The low-level kernel engineers are dragging it into the kernel, leaving a trail of changes to rustc in their wake. Over the past several years, I’ve watched the high-level app developers and aspiring indie game programmers pull Rust higher and higher up the stack. One could argue that Rust is not suitable for either of these environments: if you want high-level, use Go or C#; if you want low-level, use C or Zig. This is a fair critique, but as I mentioned before, the thesis of this article is that we can have our cake and eat it too, but only by being honest about where the language falls short.

Rust’s sister language, Swift, borrows many of Rust’s affordances without resorting to a tracing garbage collector - reference types are managed with automatic reference counting (ARC). Basically everything in Swift behaves like an Arc<Mutex<T>>, with no explicit requirement to call clone() on values:

// A minimal sketch of the equivalent Swift: `Shared` is a class (a
// reference type managed by ARC), so both closures retain it
// automatically - no clone() in sight.
import Dispatch

class Shared {
    var value = 0
}

let shared = Shared()

DispatchQueue.global().async { print(shared.value) }
DispatchQueue.global().async { shared.value += 1 }

In Rust, if we want to share an Arc between threads, we need to explicitly call clone on values:

let some_value = Arc::new(something);

// task 1
let _some_value = some_value.clone();
tokio::task::spawn(async move {
    do_something_with(_some_value);
});

// task 2
let _some_value = some_value.clone();
tokio::task::spawn(async move {
    do_something_else_with(_some_value);
});

If this looks ugly and tedious, it’s because it is. This gets tiresome, quickly. While at Cloudflare, I worked with a struct holding nearly 30 fields of Arced data. Spawning tokio tasks looked like:

// listen for dns connections
let _some_a = self.some_a.clone();
let _some_b = self.some_b.clone();
let _some_c = self.some_c.clone();
let _some_d = self.some_d.clone();
let _some_e = self.some_e.clone();
let _some_f = self.some_f.clone();
let _some_g = self.some_g.clone();
let _some_h = self.some_h.clone();
let _some_i = self.some_i.clone();
let _some_j = self.some_j.clone();
tokio::task::spawn(async move {
    // do something with all the values
});

Working on this codebase was demoralizing. We could think of no better way to architect things - we needed listeners for basically everything that filtered their updates based on the state of the app. You could say “lol get gud,” but the engineers on this team were the sharpest people I’ve ever worked with. Cloudflare is all-in on Rust. They’re willing to throw money at codebases like this. Nuclear fusion won’t be solved with Rust if this is how sharing state works.
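The best the ecosystem offers today is macro sugar over the same clones. Several crates publish a variant of this; here’s a minimal hand-rolled sketch (the macro name and do_something are hypothetical, and tokio is assumed):

use std::sync::Arc;

// Rebind each name to a clone, so the `async move` block captures the
// clones and the originals stay usable outside.
macro_rules! clone_all {
    ($($name:ident),* $(,)?) => {
        $(let $name = $name.clone();)*
    };
}

async fn do_something(a: Arc<u32>, b: Arc<u32>) {
    println!("{a} {b}");
}

#[tokio::main]
async fn main() {
    let a = Arc::new(1);
    let b = Arc::new(2);

    {
        // shadows `a` and `b` with clones inside this block only
        clone_all!(a, b);
        tokio::task::spawn(async move { do_something(a, b).await })
            .await
            .unwrap();
    }

    // the originals are still alive out here
    println!("still usable: {a} {b}");
}

It compresses the boilerplate, but the clones - and the noise - are still there.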

Rust needs a new opt-in type that saves us from polluting our codebases with clone. Who actually cares about the exact strong count of an Rc/Arc? 99 times out of 100, I’m reaching for Arc/Rc to solve a really challenging shared-state problem, and the strong count is not at all important to me.

I propose a Capture trait as part of the Clone and Copy family. Just like Copy types, the clone logic is cleanly inserted behind the scenes. When Capture types move between scopes, Rust would simply coerce them into their Owned type. Capture would only be implemented for types that are “cheap” to clone, like Arc/Rc and other ecosystem defined types like Channel/Signal. This would occur in a number of places:

fn some_outer_fn(state: &Shared<State>) {
    // 1. Calling functions and going from &T to T
    // state is automatically coerced from &T to T via a cheap clone
    some_inner_fn(state);

    // 2. Working with closures
    // state is `captured` via an implicit call to ToOwned
    let cb = move || some_inner_fn(state);

    // 3. Working with async
    // state is coerced to an owned type when moving into async move scopes
    task::spawn(async move { some_inner_async_fn(state) });
}

// this inner fn takes an owned version of state
fn some_inner_fn(state: Shared<State>) {}

Amazingly, callbacks - the bane of Rust UI development - immediately become effortless:

// Due to Rust supporting disjoint borrows in closures,
// Capture would propagate through structs.
//
// Not a clone in sight - could you imagine?
fn create_callback(obj: &SomethingBig) -> Callback<'static> {
    move || obj.tx.send(obj.config.reduce("some.raycat.data"))
}

struct SomethingBig {
    config: Shared<Config>,
    tx: Channel<State>,
}

Capture would give us the ergonomics of Copy types without having to resort to hacky, yet innovative crates like Dioxus’ recently released Generational-Box crate. Generational Box provides the same semantics of Capture via its CopyType by stuffing your data into a global runtime and hiding access behind generational pointers. We spent 3 months adding this to Dioxus - eating 3 months of our runway - and I’d be more than willing to dedicate an equivalent amount of resources to get Capture into Rust itself.
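For a sense of how this trick works, here’s a minimal sketch of the generational-pointer idea - not the real Generational-Box API or internals, just the concept (and unlike the real crate, this sketch never recycles slots):

use std::sync::{Mutex, OnceLock};

// One global slab of slots. Each slot tracks which "generation" currently
// occupies it, so stale handles are detected instead of reading freed data.
static SLAB: OnceLock<Mutex<Vec<(u64, Option<String>)>>> = OnceLock::new();

// The handle is Copy because it's just an (index, generation) pair.
#[derive(Clone, Copy, Debug)]
struct GenBox {
    index: usize,
    generation: u64,
}

fn store(value: String) -> GenBox {
    let mut slab = SLAB.get_or_init(Default::default).lock().unwrap();
    slab.push((0, Some(value)));
    GenBox { index: slab.len() - 1, generation: 0 }
}

fn read(handle: GenBox) -> Option<String> {
    let slab = SLAB.get_or_init(Default::default).lock().unwrap();
    let (generation, value) = slab.get(handle.index)?;
    // A mismatched generation means the slot was recycled after this handle
    // was created, so the read fails instead of aliasing someone else's data.
    (*generation == handle.generation).then(|| value.clone()).flatten()
}

fn main() {
    let handle = store("hello".to_string());
    let copy = handle; // Copy semantics - no clone() anywhere
    assert_eq!(read(copy).as_deref(), Some("hello"));
}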

Automatic partial borrows for private methods


I mentioned above the nastiness of the codebase I worked on at Cloudflare. Thousands of lines dedicated to clone. But the problem that really slowed us down was the lack of partial borrows. Our codebase was gigantic - nearly 100,000 lines of cross-platform code, written five years ago. Talk about technical debt. We had structs nested a dozen levels deep, because how else can you represent the complexity of DNS, WireGuard, WebSockets, ICMP, health checks, stats, etc.? Large Rust projects struggle under their own weight and quickly become almost impossible to work on.

We agonized over the lack of partial borrows, which prevents code like this from compiling:

// Imagine some struct with some items in it
struct SomethingBig {
    name: String,
    children: Vec<SomethingBig>,
}

// Also imagine this struct has some methods on it
impl SomethingBig {
    // this method modifies the `name` field
    fn modify(&mut self) {
        self.name = "modified".to_string();
    }
    
    // this method reads and returns a reference to a child field
    fn read(&self) -> &str {
        &self.children.last().unwrap().name
    }
}

// bummer....
// This code doesn't compile: `s` is still immutably borrowed through `o2`
// when `.modify()` (which needs `&mut self`) is called
fn partial_borrow(s: &mut SomethingBig) {
    let o2 = s.read();
    let _ = s.modify();
    println!("o: {:?}", o2);
}

When your codebase grows, this gets tiresome, quickly. Folks will tell you to “get gud” and refactor your app. Sorry, but Cloudflare is one of the biggest in-production users of Rust, and I can tell you there was no way in hell anyone was going to refactor that codebase and still ship the features and bug fixes for the sprint. Companies die over this stuff. I can’t imagine telling Matthew Prince that WARP couldn’t meet its feature deadlines because we couldn’t borrow children and modify name in the same scope.

Folks have bikeshedded a proposed syntax for partial borrows for six years. When I started writing Rust, the momentum was there. I thought this would be fixed in 2018. It’s 2024.

What if I told you that we could have partial borrows in Rust with zero code changes? It’s literally just a switch we could (hypothetically) flip on in a minor release.

Hold onto your socks… the code from above does compile… if you use closures:

fn partial_borrow(s: &mut SomethingBig) {
    let mut modify_something = || s.name = "modified".to_string();
    let read_something = || &s.children.last().unwrap().name;

    // This works!!
    let o2 = read_something();
    let o1 = modify_something();
    println!("o: {:?}", o2);
}

As of the Rust 2021 edition, closures capture individual fields of structs through a technique called “disjoint capture” (RFC 2229). The machinery for partial borrows already exists! We have it, in the Rust compiler, already!

But why isn’t it enabled for methods, you ask? Bikeshedding. Folks, understandably, want a dedicated syntax to describe the borrows happening here. With closures, lifetimes are generally implicit, so nobody really cared about the syntax. A closure can’t be a public API, after all.

My concrete proposal is: enable disjoint captures for private methods only. Let Rust-Analyzer give me hints on what partial borrows are occurring for my private methods. Feel free to continue bikeshedding the pub fn syntax for another six years, but for Cloudflare and LogLog and Dioxus to be successful today, we need this switch flipped on for private methods.

Turning on disjoint captures for private methods is a non-breaking change that could be rolled out in just a few releases. Again, if there’s hunger for this feature and an RFC would be accepted in a timely manner, I’d be more than happy to put some of Dioxus’ resources towards getting this into Rust itself.

Named and optional function parameters


Another source of cruft that pollutes Cloudflare’s codebases, Dioxus’ codebase, and countless other codebases: the builder pattern. Sometimes I feel like the Rust ecosystem has Stockholm Syndrome… how could anyone believe that builders are a sane default for large bags of fields? Why is this the best we have?

struct PlotCfg {
    title: Option<String>,
    height: Option<u32>,
    width: Option<u32>,
    dpi: Option<u32>,
    style: Option<Style>,
}

impl PlotCfg {
    pub fn title(&mut self, title: Option<String>) -> &mut Self {
        self.title = title;
        self
    }
    pub fn height(&mut self, height: Option<u32>) -> &mut Self {
        self.height = height;
        self
    }
    pub fn width(&mut self, width: Option<u32>) -> &mut Self {
        self.width = width;
        self
    }
    pub fn dpi(&mut self, dpi: Option<u32>) -> &mut Self {
        self.dpi = dpi;
        self
    }
    pub fn style(&mut self, style: Option<Style>) -> &mut Self {
        self.style = style;
        self
    }
    pub fn build(&self) -> Plot {
        todo!()
    }
}

Do you know what would be nicer than hundreds of lines of builder pattern? Named, optional function arguments. Not implemented next year or the year after, but today:

// just use a function, like every other language
pub fn plot(
    x: Vec<usize>,
    y: Vec<usize>,
    #[default] title: Option<String>,
    #[default] height: Option<u32>,
    #[default] width: Option<u32>,
    #[default] dpi: Option<u32>,
    #[default] style: Option<Style>,
) -> Plot {
    todo!()
}

Again, I know people have been bikeshedding this for years, but we need something soon. Frankly, it doesn’t need to be any more complex than the most obvious solution - other languages have had kwargs forever. I can point towards a hundred things I hate about JavaScript and Python, but kwargs is not one of them, especially when the alternative is thousands of lines of ridiculous builder-pattern nonsense. If you want to write a Cfg blob with builder patterns, you still can (the closest thing today is sketched below), but there’s no reason we need to live like this anymore.
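For completeness, the closest approximation Rust offers today without any language changes is a Default config struct plus struct-update syntax (all names here are hypothetical):

#[derive(Default)]
pub struct PlotOpts {
    pub title: Option<String>,
    pub height: Option<u32>,
    pub width: Option<u32>,
    pub dpi: Option<u32>,
}

pub fn plot(x: Vec<usize>, y: Vec<usize>, opts: PlotOpts) {
    // read opts.title, opts.height, ... with sensible fallbacks
    let _ = (x, y, opts);
}

fn main() {
    // every unspecified field falls back to Default - kwargs, at arm's length
    plot(
        vec![1, 2, 3],
        vec![4, 5, 6],
        PlotOpts { title: Some("sales".to_string()), ..Default::default() },
    );
}

It works, but it’s still a second type to define and document for every such function - exactly the ceremony that named arguments would erase.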

Faster unwrap syntax


Maybe you see this as an issue, maybe you don’t. When writing apps with Dioxus, I have historically waded through a sea of clone and unwrap. Unwrap isn’t bad; honestly, I love that Rust has the notion of error handling like this.

But come on, for a demo of fetching from a server, this is just stupid:

let res = Client::new()
	.unwrap()
	.get("https://dog.ceo/api/breeds/list/all")
	.header("content/text".parse().unwrap())
	.send()
	.unwrap()
	.await
	.unwrap()
	.json::<DogApi>()
	.await
	.unwrap();

Why can’t we have a cleaner syntax for unwraps? We already have question-mark syntax for error propagation - why not combine it with something like a ! for unwrap?

let res = Client::new()!
	.get("https://dog.ceo/api/breeds/list/all")
	.header("content/text".parse()!)
	.send()!
	.await!
	.json::<DogApi>()
	.await!;

This section is not necessarily me trying to get ! into the language as the exact unwrap shorthand - I’m not a language designer - but it should be obvious that we could prototype faster if the language treated unwrapping with the same consistency that it treats error propagation.

Honorable mentions:


There are a few other items that deserve an honorable mention:

  • Try trait
  • Specialization
  • Stabilizing async read/write traits to standardize on an executor API
  • Allowing compilation of builds that fail typechecking

Things we can do at the compiler level

Global Dependency Cache


Imagine a world where a fresh Rust project cloned from GitHub builds as fast as an incremental build. What if every fresh build on your system was as fast as an incremental build? Would we still say Rust’s compile times sucked?

A global dependency cache would make every new Rust project compile as if it were already in an incrementally compiled state. Essentially, a global dependency cache is an sccache for your entire computer - or perhaps the entire Rust ecosystem. It’s not dissimilar to symlinking every target directory on your computer together into a single target dir, guaranteeing every project on your machine shares the same incremental compilation artifacts.

Of course, the cache wouldn’t cover every crate in the graph you just pulled - new crate versions and compiler upgrades would limit what could be cached - but the perceived speed of the Rust compiler would be much better. There’s a whole host of problems that need to be solved, like handling crate cfg flags, but the gains are palpable. Check out how fast Bun is with its global module cache.
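A taste of this already exists: sccache wraps rustc and shares compiled artifacts across the projects on your machine (or a remote cache), though notably it can’t cache rustc’s incremental artifacts. A minimal local setup:

# sccache as a shared, machine-wide compilation cache
cargo install sccache
export RUSTC_WRAPPER=sccache
cargo build
sccache --show-stats   # inspect cache hits/misses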

Formalizing prebuilt crates


Rust itself could host an sccache of crates, delivering both precompiled binaries and sources for crates as you download them. These would be validated by crates.io, making the whole issue with dtolnay shipping a precompiled serde_derive a moot point. We could choose whether we trust crates.io with our prebuilt binaries, and if so, we’d get a free sccache implementation for everyone.

Better yet, we could serve prebuilt dependencies in release mode - not debug mode - making everyone’s debug projects not only compile faster but run faster, at no extra cost. I’m not the first to think of this: there’s an issue on GitHub that has been open for nine years.

rust-lang/cargo#1139

Making macros run in release by default


Would you believe me if I told you that macros in Rust always run in debug mode? Every serde #[derive(Serialize, Deserialize)], every rsx! call, every #[component] annotation, every thiserror derive - they all run in debug mode. This is much, much slower, especially for crates doing a lot of code generation. When building with Dioxus, we actually recommend you crank up your macros to run in release mode.
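Concretely, the knob is a standard Cargo profile override - build scripts and proc macros (and their dependencies, like syn and quote) get optimized even in dev builds:

# in Cargo.toml
[profile.dev.build-override]
opt-level = 3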

Running rsx! macros in release mode yields significant compile-time improvements, shaving several seconds off incremental builds. For Dioxus users, this is tremendously important, since we do a lot of macro work at compile time in exchange for really fast runtime performance.

You might think - well, what if I don’t do a lot of macro stuff? Your dependencies probably do and your derive impls do too. Dioxus uses a fork of the typed-builder crate - and that generates a ton of code!

I argue that macros should run in release by default. When paired with prebuilt binaries, you’d always be guaranteed that syn/quote/etc. download instantly and run at max speed.

Caching Macro Expansion


What if I told you that every incremental compile re-evaluated every macro in your crate? That would be strange right?

Well, here’s a demo:

(video demo: macro_recompile.mov)

Note that the tokens passed into the macro never change, yet the macro continuously pumps out new timestamp files into the dumps directory. The code itself never changes, either - we’re only adding comments to the file. The little bit of caching Rust does apply breaks down when any spans change, and spans are affected by whitespace changes! Ever wonder why rust-analyzer seemingly chugs on innocuous changes? You’re probably re-running every macro in your project, several times.
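If you want to reproduce the demo yourself, a hypothetical stand-in for the macro in the video is a function-like proc macro that logs every expansion:

// In a proc-macro crate: every expansion drops a timestamped file into
// dumps/, so you can watch rustc (or rust-analyzer) re-run the macro even
// when the input tokens haven't changed.
use proc_macro::TokenStream;
use std::time::{SystemTime, UNIX_EPOCH};

#[proc_macro]
pub fn trace_expansion(input: TokenStream) -> TokenStream {
    let stamp = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    let _ = std::fs::write(format!("dumps/{stamp}.txt"), input.to_string());
    input
}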

If Rust cached macro inputs and outputs, it would only need to re-run macros whose inputs have actually changed, saving valuable time, especially during incremental compilation. Just like prebuilt dependencies and release-mode macros, this is low-hanging fruit that would have a massive positive impact across the entire Rust community.

Parallel Frontend


Recently, Nicholas Nethercote published an article on *Faster compilation with the parallel front-end in nightly.* I recommend reading it to get a deeper sense of what this change entails. In short, the Rust compiler is split into two phases: a frontend and a backend. The backend is basically an LLVM worker pool, while the frontend is the type checking and MIR lowering that feeds that pool.

This change enables the type checking and MIR lowering within a crate to run across multiple threads - in the demo case, cutting the frontend phase roughly in half, for about a 25% overall improvement. Stabilizing this work would be a huge performance win across the ecosystem. You can enable it today on nightly Rust, but I personally would vote for it to be a high-priority item for stabilization.
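Trying it today only takes a nightly toolchain (the flag is the one from Nethercote’s post):

# enable the parallel frontend with 8 threads on nightly
RUSTFLAGS="-Z threads=8" cargo +nightly build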

Incremental Linking


A project recently surfaced on r/rust about David Lattimore’s work on Wild: an incremental linker for Rust. Currently Wild doesn’t actually do incremental linking - writing a linker in Rust is already a lot of work - but the projected plan is to support incremental linking.

If you’re not familiar: every ahead-of-time compiled project, whether C, C++, Zig, or Rust, is built in two phases: code generation and linking. During the code generation phase, the compiler walks your project and emits the assembly for every function, struct, enum, etc. it sees. When you pull in crates like windows-rs, this can get expensive, since there’s a lot of code to generate, but you generally pay the code generation penalty only once for dependencies - and then again on every recompile of your main project.

Once your code is walked by the compiler, it’s handed to LLVM and then linked together by your system linker. Linking involves walking the binary from main to every referenced symbol, then pulling the generated code into a final output binary.

The important aspect to note here: while compilation might be incremental, linking is not. All of your dependencies - while their codegen might be cached - get relinked on every single build. In the context of a fresh build, codegen dominates. However, for large projects, linking can take up to 50% of the total cargo build time (in this case 5s of a 10s incremental build).

An incremental linker would sidestep this issue. Instead of completely relinking your entire binary from scratch, it would cache its existing linkage and only perform linking on items that have changed between builds. For large projects, like a videogame in LogLog’s case, this would eliminate practically 99% of link time for small changes.
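Until an incremental linker lands, the common mitigation is simply a faster linker. You can see where your build time goes with cargo build --timings; the usual recipe (this sketch assumes Linux with clang and lld installed) swaps in lld via .cargo/config.toml:

# .cargo/config.toml - link with lld instead of the default system linker
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "linker=clang", "-C", "link-arg=-fuse-ld=lld"]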

Rust JIT and Hotreloading


The final item I want to cover here is a JIT for Rust. Right now the Rust compiler operates as an ahead-of-time compiler, shelling out to LLVM to produce a final output binary. This is great for high-performance, portable, runtime-free builds that need maximum optimization and correctness. But for iterative game and app dev - LLVM is overkill. Sure, it gives us a lot, but it also limits what we can do.

Dioxus’ closest analog is Flutter, Google’s cross-platform app-dev framework, which is powered by Dart. Dart sports both an AOT engine, like Rust, and a JIT engine for development. The JIT engine is extremely powerful, allowing things like functions to be hotreloaded on the fly. Developers building native apps for iOS and Android usually complain about compile times, while Flutter developers happily chug along with instantly-reloading apps.

To date, the Dioxus team has put a LOT of effort into getting hotreloading for our rsx! macro contents, but what we really need for Dioxus is hotreloading of components directly. Having proper, native hotreloading support in Rust would let us drop a lot of support code we need for our less-powerful template reloading, and is the area I’m personally most interested in funding.

At its core, a JIT defers codegen until a codepath is actually hit, basically eliminating up-front compile time. The speedups from a parallel frontend, release-mode macros, and macro caching are still important - we can’t generate the JIT’s intermediate representation without running the rustc frontend - but we would spend essentially no time in codegen and linking. I’m personally of the mindset that a proper, hotreload-capable JIT backend for Rust would be the single most important upgrade for iterative Rust compile times, making Rust properly suitable for app and game dev.

And, lo-and-behold, there’s work in this space! The brave bjorn3 has been chugging along, seemingly solo, pushing rustc_codegen_cranelift forward. In my opinion, this work should receive a shitload of funding with a dedicated crew of compiler engineers, tightly integrating the JIT engine with the incremental linker. According to cranelift.dev:

It has been measured to perform the code-generation process about an order of magnitude faster than an equivalent LLVM-based system.

Rust compile times could be negligible.
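You can already try the Cranelift backend for debug builds on nightly. The component and config names below are as currently documented for rustc_codegen_cranelift and may change:

# install the prebuilt backend:
#   rustup component add rustc-codegen-cranelift-preview --toolchain nightly
#
# then, in .cargo/config.toml:
[unstable]
codegen-backend = true

[profile.dev]
codegen-backend = "cranelift"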

Putting it all together - Compile times


Between incremental linking, a parallel frontend, macro expansion caching, release-mode macros, and precompiled binaries, I’m positive we could cut Rust compile times by 90% or more. None of these steps are impossible, but they need dedicated engineers backed by a substantial amount of funding.
