PoignardAzur/notes_xilem_plans.md

## notes_xilem_plans.md

      
    Raw
  

              notes_xilem_plans.md
            
          
    Roadmap for Xilem backend

As you may have heard by now, Google Fonts is funding me this year to work on Xilem.
I'm not alone in that: Aaron Muir Hamilton, Daniel McNab and Matt Campbell were funded as well to work on various parts of the ecosystem. I believe this is Matt's third year getting funding from Google Fonts.
Now, what I was hired to do might be unclear to you. The scope we agreed on was fairly broad and was stated as "contributing to Xilem's view tree and developer experience", but that could mean a lot of things.
This is something I want to fix ASAP. I have a a lot of plans for Xilem, and they involve major changes from the current architectures, changes that might be worrying to the community at large or even to Raph Levien. I certainly don't want to give the impression that I'm wresting the project away from the Linebender community for Google's interest, and that means being transparent early and often about the things I want to break.
What I want for Xilem

Xilem is a research-and-development project, the latest in a series (Xi-Core, Druid, Crochet, Idiopath), that aims to explore how patterns of UI development can be made compatible with Rust.
This might feel a little banal now, because Rust GUI frameworks are popping up everywhere and starting to settle on similar patterns, but those patterns weren't obvious in 2019! Xilem is very different from Druid, and getting to its current architecture has been a process.
Xilem has been perma-experimental since May 2022. Once it got started, the main Druid maintainers stopped contributing almost entirely with the expectation that Xilem would replace Druid once it was ready. Unfortunately, it still hasn't made enough progress to be on parity with Druid, yet Druid still lies abandonned, which is a problem because people had started relying on it.
It's debatable how much this could have been avoided. As I've pointed out before, the Rust GUI ecosystem is subject to massive yak-shaving: many of us came here because we wanted to build a text editor, and now we're all learning about text rendering, text editing, compositing, accessibility trees, using monoids to implement stuff on the GPU, ECS, and some concepts that I'm absolutely certain Raph made up like Bézier paths and C++.
(BTW - I love this style of writing where we start humanizing some part of a software process and complaining about how much it wants to kill us. The articles I cited are very high-quality too.)
And to a certain extent, yak-shaving is good! "I'm going to do better than X without learning how X works" is a child's attitude. It's the attitude we all start with, and it fills us with courage, but that courage is naive. True improvement comes from keeping that courage after spending years learning how immense the problem space is, and keeping faith that it's possible to do better. The Rust's community is one that values doing things from scratch and learning from the past.
This is good, but the "learning from the past" part takes a lot of time, and in the meantime, I'd argue that we've neglected end users a bit. We've abandonned Druid before Xilem was ready as an alternative, and then we worked on foundational and experimental projects that would make Xilem stronger and sounder in the long term, but left the project in an awkward state in the meantime.
In the ECS-backed GUI framework article I linked, Bevy maintainers have this to say:

Not to be too harsh, but a lot of the existing Rust GUI solutions... just aren't very good. There's a lot of passable options, but they all have non-trivial drawbacks. No one has really risen to the top as a clear winner. [...] Deep down, we all know that we can do better, and we should.

My ambition is to makes this paragraph obsolete before the end of the year. I want Xilem to get back to being suggested to newbies in the same breath as Iced and SlintUI. In the next few years, I want to entire ecosystem to get to a point where people talk about Rust GUI like they talk about ripgrep or rustls.
To give one specific example: my personal goal is to implement seamless hot reloading for Xilem before the end of 2024. I believe it's feasible, and I believe doing it will radically change the way people think about Rust GUI, along with other improvements I hope to unlock over time.
In the shorter term, my plan is to focus on basic features, widgets and styling options so people can experiment with Xilem, building on Raph's work on Vello over the past year. John Skottis's Xilem UX Notes gist gives a good picture of the kinds of things I'll be working on.
In the medium term, I also plan to work on the performance aspect of Xilem. People from Google Fonts have expressed interest in seeing how much performance we can get from Rust UI. While my mission isn't limited to performance work, it will be an ongoing focus, especially since its an area where Rust has unique opportunities. I'll go into more details on my perspective on Rust performance in a later article.
Major changes I want to implement soon

These are the big changes I want to work on soon. Ideally, some items of this list should be completed by the end of the month (but, you know, time estimates are hard).
Switching to Masonry as a backend

Okay, this one makes me blush a little.
I'm the only maintainer of Masonry, a GUI crate with barely over 300 stars on Github and little outside interest. Can I really justify taking the money Google pays me to improve Xilem and spending my time on Masonry instead?
That choice isn't random, mind you: Masonry was forked from Druid, Xilem's ancestor, and in fact Xilem and Masonry share a lot of code right now.
In fact, I'd argue that Masonry's codebase is of better quality than Xilem's current native backend. This isn't meant to be a put-down of the work of Xilem maintainers; rather, it's the outcome of a difference in priorities. Nobody was very interested in Xilem's native backend (though Raph was aware of its importance, hence me getting paid to write this), and lots of people were very interested in the frontend and the web backends, therefore the frontend and web backends saw the most work. But on an objective level, I think Masonry's backend codebase is a healthier starting point.
Masonry also comes with some built-in perks, like powerful unit tests and a structured widget graph.
It's not clear whether we want to import the Masonry codebase, the crate, or the entire repository, and its not clear what the ultimate branding should be. I'm writing a full RFC to lay out the different possibilities to the community, so we can reach a consensus before doing more work (including possibly not using Masonry at all).
If we do move on in that direction, then Masonry will need some porting work:

Using Vello and Parley instead of Piet.
Using Glazier instead of druid-shell.
Integrating AccessKit.
Removing the dependency on GTK.

AccessKit support in particular is table stakes, by now.
There are other things Xilem added over Druid (notably virtual lists and async support), but I believe Masonry doesn't need to implement them right away for parity. Both will be very important for eventual performance work.
Not using custom Widgets

This is another big one, and will likely also require an RFC.
We should write Xilem's backend under the assumption that end users of the library (including the Xilem frontend) won't be able to create their own widgets. Instead, they will have to compose the primitives given to them the same way they compose DOM elements in the browser. The primary escape hatch will be a fixed-size surface where end-users can use arbitrary paint primitives, like the web canvas.
This is a bold move, but one I'm confident in. Developers of hobby GUI frameworks tend to provide a severely anemic set of primitives, under the assumption that if anyone wants other features in their widgets, they can implement them themselves. Both Druid and Xilem suffer from this.
On the other end of the spectrum, the web has shown that you can implement complex custom UIs given a rigid but rich set of primitives.
One major customization point I want to remove from Xilem is layout.
Today all Rust GUI frameworks ship their own layout algorithm, but none of them are competing on their layout algorithm. Often the layout algorithm is just a means to an end; developers that do want to work on a state-of-the-art layout engine will provide it in a modular crate like Morphorm and Taffy.
And here's the thing: realistically, most layout engines aren't that creative. Every Rust framework and every major GUI framework I'm aware of uses layout that boils down to "FlexBox with small variations". The number of meaningfully different ways to do GUI layout can be counted on, generously, two hands.
I'd argue we only need CSS Flow, FlexBox, CSS Grid, and maybe Morphorm to cover most people's needs. Conveniently, this is what Taffy aims to provide. There's already an experimental Taffy integration in Xilem, and I want to make it deeper. More on this in a latter blog post.
Also, having a fixed set of widgets mean you can avoid using Box<dyn Widget>, which has some interesting performance implications. It means you can add a lot of methods to the Widget trait without worrying about bloating the vtable. And you can write methods under the assumption that they'll be inlined, which opens up interesting DX improvements.
Using ECS, maybe, sort of

In current Xilem code, if you want your FlexWidget to have children, it must have a Vec<Pod<ChildWidget>>. Each Pod directly owns a Widget.
I think this is an interesting quirk of Rust GUI, tying directly to Rust's ownership semantics. Non-Rust frameworks often use pointers to child widgets, true, but they're often shared with the framework, with an understanding that the framework is charged with keeping track of how many widgets there are and which is a child of which. In Qt, for instance, it's preferred (but not mandatory) for parents to call delete_later() on their children instead of delete, because it plays better with event handling.
Xilem's approach of "you own your children" is a little bespoke. It means the framework has a hard time iterating over the entire widget tree (say, as part of a DOM inspector) unless each Widget specifically implements the iterating method. And it means if you want to target an event at a specific widget (say, the one with keyboard focus), you need to go through the entire ownership chain, using some heuristics to avoid visiting too many widgets.
I believe Widgets should be owned by the library. If your container has children, then the only thing the container will actually own is keys into a structure (probably a slotmap) where the widget is stored. This makes a lot of things easier, like serialization and debugging, but it has an impact on the entire backend. It's an infrastructure investment.
Lately, I've seen more and more dicussion of implementing GUI through an ECS. A lot of that discussion comes from Bevy, which is natural, since the bevy community is filled with ruthless cultists striving to feed ever more sacrifices to the ECS god
(Note: will need to run this joke by the Bevy community)
TODO
is intimately familiar with the ECS pattern and putting a lot more importance on UI
But I've seen discussions about it in the Linebender community too.
Whether we actually want to use ECS is something we still need to research.
The way I understand it, the big draw of ECS is the "S" part, systems: it's about handling lots of similar data at once in arbitrary order. Which is why eg archetypes in bevy are useful: you're grouping together entities with the same components, so that you can say "execute this code for all entities with component X and component Y" and have this code be executed on arrays of values with minimal indirection or branching. And the "arbitrary order" part makes them work really well with slotmaps: since you can just read from the beginning of the slotmap, you don't pay the double-indirection penalty.
In other words, the ideal use case for ECS is code of the type for (entity in entities) entity.position += entity.speed.
GUI is pretty far from that ideal use-case: updates are sparse and should only run on a small number of changed widgets. Order often matters. And I'm not sure archetypes would work in a framework with a large variety of widgets and different attributes.
I think there are two things you really want from a Rust ECS library for GUI: slotmaps, and efficient ways to add and remove components from an entity.
Implementing those is going to be a major undertaking, but one I expect to pay many times over.
Community involvment

Something I want to keep emphasizing: none of the above is set in stone.
Part of what Raph wants to achieve this year is to make Xilem more of a community project, and that means running more decisions by the community.
I've created an RFC repository for this, and each of the sections above will get its own RFC, where I'll expand on my rationale in more detail.
This isn't a rubber-stamping process either: I want to implement those ideas, but if the community decides that they are bad, then we'll go another direction.
[note: "The community" is a nebulous concept here. For obvious reason, we'll given more weight to the opinions of people who have already put work into the Linebender ecosystem, but the general hope is to get a consensus from everyone involved. And you're welcome to weigh in even if you haven't contributed before.]
There's other ideas I haven't discussed yet that I want to come back to:

Declarative styling: Giving Widgets attributes similar to CSS properties, with modifiers for animations, pseudo-classes, media queries and container queries. While inspired by CSS, it wouldn't involve any CSS parsing, selector resolution, specificity resolution, etc.
Compositing: Xilem is heading towards a "do everything in the GPU" approach. This is good for performance, but Firefox engineers have warned us of the many dangers they faced going down this road, especially regarding battery life. In general, after layout and painting, you really want a composite() phase to deal with scrolling and some simple animations without touching the GPU at all if possible. This is a complex subject, and it would be nice if we could hide most of that complexity from app developers while still giving them good performance.
Resource loading: This is another area where we might want to crib from game engines. At the very least, we want to be able to ask "Load me this texture/shader/font/etc" without having to write disk I/O in user code. Then we can start thinking about progressive loading, fallbacks, sharing, etc.

These ideas are likely to be the subject of further articles and/or RFCs in the near future.
Hey, this is really cool!

Something that came up in discussions I've had with Raph is that it's easy to miss the forest for the trees. In the middle of all the paperwork, blog-writing, RFC-writing, we can end up focusing on the process so much that we forget how excited we are.
So, I'm taking the moment to remind myself: this is really freaking exciting!
For all my criticisms and caveats and realistic assessments of the problems we need to overcome, I'm still feeling absolutely great about this!
I'm super thankful to Raph for getting us this far, to Google Fonts for sponsoring us, and to everyone else who contributed to the Linebender ecosystem. I think we can do something amazing and I'm honestly a little startled to be in this position at all.
To reiterate what I said: in the next few years, I want to entire ecosystem to get to a point where people talk about Rust GUI like they talk about ripgrep or rustls.
I legitimately think Rust has the potential to get best-in-class GUI frameworks, and now the time has come to prove it.