@burdiyan
Last active May 20, 2022 18:10
My (certainly very) opinionated feedback about GitHub Actions

Backstory

GitHub ran a survey asking for feedback about GitHub Actions. I do have some opinions about that, so I started writing. It turned out to be quite an essay, without me noticing it. It's a subject that I'm quite nerdy about, and I had a lot to say. When I clicked submit, the page failed with an HTTP 500 error, and I can't know whether the response went through or not 🤦‍♂️. The notification doesn't appear for me anymore, and it would be frustrating if all of it went for nothing.

So, I'm publishing it as a Gist (I don't have another medium to write long thoughts), verbatim (besides fixing typos), just in case someone else shares my opinions or finds it interesting. Here it goes.


This is a very personal and biased opinion, but I spent quite some time thinking about it. I never knew where to share it, so I'm glad you asked :)

I'm sorry for being so verbose and all over the place, but this is my honest opinion and feedback.

Reliance on JavaScript

The most annoying thing about GitHub Actions for me is the reliance on JavaScript.

Why TypeScript for writing custom actions? Having to compile it and store the compiled assets in a separate repository is just so weird and limiting. I think about an Action like a function call. It does some work, accepts some inputs, and produces some outputs. But instead of being a function call, it's a weird, bespoke pile of YAML, TypeScript, compiled and bundled JavaScript, and a whole lot of imperative activities disguised as being declarative. I don't mind imperativeness at all, and I'd much rather have a true scripting environment to define my workflows.

In my ideal world, I should be able to create a script and point GitHub Actions to run it on certain events. That's it. Instead of imitating variables, conditionals, and function calls in YAML, I'd love to just call functions in a scripting language. If it's TypeScript, just let me write TypeScript, don't make me compile it and store the result along with the source files. In my opinion, Python is a much more suitable language for this kind of work than TypeScript, because it doesn't need to be bundled, and is pretty easy to work with. I'm not a Python developer at all, but I think it's a perfect language for "glue" stuff like CI and automation. It's easy enough to pick up for anyone who can currently deal with the complexities of GitHub Actions. The standard library is powerful enough for many situations, so you won't have to pull in half of the Internet to be useful like you have to do with Node.js. It's also pretty easy to sandbox compared to JavaScript.

So imagine if there was a GitHub Actions SDK, and all those well-known published actions were just Python packages that you could pull in and call like functions, with all the benefits your IDE provides for programming languages. I think this would be beautiful. And you could still make it feel declarative by building abstractions on top of lower-level imperative primitives.
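To make the idea concrete, here is a rough sketch of what such a workflow script might look like. Nothing in it is a real GitHub API: `on`, `dispatch`, `checkout`, and `run_tests` are hypothetical names, and the two "actions" are local stand-ins so the sketch runs on its own.

```python
# Hypothetical sketch: none of these names exist in any real GitHub SDK.
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]
_HANDLERS: Dict[str, List[Handler]] = {}


def on(event_name: str) -> Callable[[Handler], Handler]:
    """Register a plain function to run when the named event fires."""
    def register(fn: Handler) -> Handler:
        _HANDLERS.setdefault(event_name, []).append(fn)
        return fn
    return register


def dispatch(event_name: str, payload: dict) -> List[dict]:
    """Run every handler registered for the event and collect the results."""
    return [fn(payload) for fn in _HANDLERS.get(event_name, [])]


# Stand-ins for published actions, imagined as ordinary importable functions.
def checkout(ref: str) -> str:
    return f"/workspace@{ref}"


def run_tests(workdir: str, verbose: bool = False) -> dict:
    return {"workdir": workdir, "passed": True, "verbose": verbose}


@on("push")
def build_and_test(event: dict) -> dict:
    workdir = checkout(event["ref"])
    # An ordinary expression takes the place of a YAML `if:` condition.
    return run_tests(workdir, verbose=event["ref"] == "refs/heads/main")
```

Calling `dispatch("push", {"ref": "refs/heads/main"})` runs the workflow like any other function, which also means it can be unit-tested and stepped through in a debugger.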

What's interesting is that this could be done right now, if I'm not mistaken. AFAIK the actions-toolkit SDK for TypeScript in the end mostly just calls some remote APIs, so it could be ported to any programming language, but I haven't seen a good example of that yet. And oftentimes such a port still needs to be packaged with a TypeScript wrapper to be useful.
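At least for the core pieces (inputs, outputs, log annotations), the documented runner contract is plain environment variables, files, and stdout lines, which any language can produce. Here is a minimal Python sketch under that assumption. The function names mirror the toolkit's `getInput`/`setOutput` but are made up here, and only single-line output values are handled.

```python
import os


def get_input(name: str, default: str = "") -> str:
    """Read an action input; the runner exposes `with:` values as INPUT_<NAME>."""
    env_name = "INPUT_" + name.replace(" ", "_").upper()
    return os.environ.get(env_name, default).strip()


def set_output(name: str, value: str) -> None:
    """Append `name=value` to the file the runner names in GITHUB_OUTPUT."""
    if "\n" in value:
        # Multi-line values need the runner's delimiter syntax; not covered here.
        raise ValueError("this sketch only supports single-line values")
    path = os.environ.get("GITHUB_OUTPUT")
    if path is None:
        # Assumption: outside a runner, just print so local runs stay usable.
        print(f"{name}={value}")
        return
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def error(message: str) -> None:
    """Emit an error annotation through a workflow command on stdout."""
    print(f"::error::{message}")
```

An action written this way still needs some entry point the runner can start, for example a container or composite action that invokes the script.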

External Dependencies And Speed

My other pain is managing dependencies, and the general slowness of the process. It surprises me that people so often complain about Bitcoin being environmentally unfriendly, or talk about climate change and tell people to take planes less frequently, while we burn so much energy doing stupid redundant things on our CIs every single day. How did it become normal to npm install (or similar) on every CI run? Yes, I know you can cache the results, but in some situations it's not as easy as with npm and friends. Sometimes you need system packages. So you'd sudo apt install something, but caching the whole local apt directory can often be slower than just pulling what you need from scratch. And if you have a pipeline, you have to do that at every step of the pipeline, which is wasteful.

Speaking of caching. I think the current approach of archiving the directory and uploading it to GitHub is far from optimal. If I only change one package in a branch I'll have to zip the entire directory, upload it, and it would probably nuke my main cache because of size limits, so next time it'll have to pull everything from scratch in my main branch. I know there're ways to reduce these problems by specifying wildcard cache keys and so on, but it's still very complicated and error-prone.

In my ideal world, GitHub would run a transparent caching proxy in front of my CI. In effect, this would become a mirror for my packages, and each package would be stored separately. Making it content-addressable could help to share packages between different customers, so it could probably even make you store less data. Co-locating these caches with runners would improve speed a lot compared to pulling the package from the origin.
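A tiny sketch of why content addressing helps with sharing: if blobs are keyed by a hash of their bytes, the same package uploaded by two customers is stored once. The class and function names are made up for illustration, and an in-memory dict stands in for real storage.

```python
import hashlib
from typing import Dict


class ContentAddressedStore:
    """Stores each distinct blob once, keyed by the SHA-256 of its content."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so duplicates cost nothing.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


def cache_packages(store: ContentAddressedStore,
                   packages: Dict[str, bytes]) -> Dict[str, str]:
    """Store each package separately and return a name -> digest manifest."""
    return {name: store.put(data) for name, data in packages.items()}
```

Each customer keeps only a small manifest, and changing one package in a branch adds one new blob instead of a new archive of the whole directory.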

BTW, I understand that slow CI makes more money for GitHub :) So maybe you're not so interested in speeding things up very much, but at the moment working with Actions is just very slow. The feedback loop is incredibly frustrating.

Reproducibility And Debugging

It's just incredibly hard to debug a failed workflow. You wait for 10 minutes to know that you made some mistake in your cache key, or some variable passed to a different action, and without being able to SSH to the actual runner and debug on the go, the feedback loop is very slow.

As an example, in our current project, we're building a system that includes a Go daemon, a JavaScript frontend, and a Rust wrapper which is an alternative to Electron. We build for Windows, macOS, and Linux, and it's all in the same repository. We've spent more than 6 weeks setting it all up, and more than 200 bucks on mostly failed Action runs, just because we couldn't debug workflows more seamlessly.

I use the Bazel build system quite a lot, and I really wish GitHub Actions considered it and provided some tools around it. Take a look at services like BuildBuddy for inspiration. So much complexity could be removed if more people used Bazel :) Instead of defining complex workflows in a bunch of YAML and passing artifacts around through remote object stores, people could just configure the system to run the same Bazel commands they run on their local machines. It could still reuse the remote cache and do everything in a single workflow, instead of having to spin up multiple VMs over and over. It would help tremendously with debugging too. If it runs locally, it would run in CI, with exactly the same versions of all the tools, without you having to write a whole bunch of actions/setup-<something> things, trying to keep all the versions in sync between the local and CI environments.

As a starting point, it would be just fantastic if GitHub provided a caching service compatible with Bazel. It's an open protocol, and it could sit in front of the current Actions cache. Cirrus CI provides one for their CI offering. They also provide a hack for doing it with GitHub Actions by spinning up a local proxy that's backed by the Actions cache. It's suboptimal compared to an out-of-the-box solution. Google Cloud Storage (Google's object store offering) supports the build cache protocol out of the box, and it has been tremendously useful in my experience.
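For a sense of how small the protocol is: Bazel's simple HTTP remote cache is, per its documentation, just GET and PUT on `/ac/<hash>` (action results) and `/cas/<hash>` (output blobs). Below is a minimal in-memory sketch of such an endpoint. `CacheHandler` and `make_server` are names invented here, the dict stands in for whatever backing store a real proxy would use, and requests are assumed to carry a `Content-Length` header.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CacheHandler(BaseHTTPRequestHandler):
    store = {}  # request path -> blob bytes, shared by all requests

    def _valid(self) -> bool:
        # Accept only /ac/<key> and /cas/<key>.
        parts = self.path.strip("/").split("/")
        return len(parts) == 2 and parts[0] in ("ac", "cas")

    def do_GET(self):
        if not self._valid():
            self.send_error(400)
            return
        blob = self.store.get(self.path)
        if blob is None:
            self.send_error(404)  # cache miss
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(blob)))
        self.end_headers()
        self.wfile.write(blob)

    def do_PUT(self):
        if not self._valid():
            self.send_error(400)
            return
        # Assumption: the client sends Content-Length (no chunked uploads).
        length = int(self.headers.get("Content-Length", "0"))
        self.store[self.path] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep CI logs quiet


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build a server bound to localhost; the caller starts it."""
    return ThreadingHTTPServer(("127.0.0.1", port), CacheHandler)
```

After `make_server().serve_forever()` is running, pointing a build at it would look like `bazel build //... --remote_cache=http://localhost:8080`. A real service would also need persistence, eviction, and authentication.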

Summary

I think it's unfortunate that so many bespoke, weird, and wasteful things have become the norm because GitHub is such a huge driver for Open Source. People just follow the rules and don't go against the grain very often, so they will do anything GitHub tells them. And GitHub could be doing a much better job at providing more efficient solutions, and educating people to use them.

If you read this until the end, drop me a line :) I'd be delighted to know what you think about all that.

And don't get me wrong, I love GitHub :) This month will be my 10th anniversary on GitHub. I figured that out recently, by accident, while playing around with the API.
