CodyKochmann/cicd-life-advice.md

## cicd-life-advice.md

      
I need scripts to run on github webhooks just as badly as you do. But this one-size-fits-all "let's make scripts but in yaml" shit has got to go.

Yaml acts as the "purely ci details". Script logic goes in... SCRIPTS!

Every CI/CD engine has a million different 'input' layers - environment variables, repo variables, workflows from various branches, maybe the git commit object. Oftentimes your scripts will have the ability to jam more variables into the 'input' layer for later scripts - such as in the case of the github environment variables. UNFORTUNATELY, it's never clear WHICH input layers are available to WHICH parts of the script. "Oh, no, you can't use outputs from this script as arguments for something in our yaml, because the yaml is calculated first." Well thanks, Mr Nadella, I'll just torch my entire pipeline and start over.

My form of dynamic ci materializes all variables into logged and examinable ci code. Hidden ci assembly magic was one of the first things I got rid of, because life triggered by magic side effects is terrible. All of my ci code going into their engine is "static". When my pipeline is dynamic, I have a job that generates a static ci file on the fly, which is logged and can be examined at any time to see what the running code actually was.
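One way to picture that generator job, as a minimal sketch: the function name, the `${NAME}` placeholder syntax, and the example pipeline keys below are all assumptions for illustration, not any particular engine's format.

```python
import string


def materialize_pipeline(template_text: str, variables: dict[str, str]) -> str:
    """Resolve every ${NAME} placeholder so the emitted CI file is fully static.

    string.Template.substitute raises KeyError when a placeholder has no value,
    so a missing variable fails the generator job instead of slipping into the
    pipeline as hidden runtime magic.
    """
    return string.Template(template_text).substitute(variables)


# Hypothetical template; the keys are not tied to any real CI engine.
TEMPLATE = """\
job: build
image: ${IMAGE}
script: make build VERSION=${VERSION}
"""
```

Printing the return value of `materialize_pipeline(TEMPLATE, {"IMAGE": "builder:1", "VERSION": "1.4.0"})` in the generator job puts the exact code that will run into the job log, where anyone can read it later.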

Every CI/CD engine is trying to have some sort of rudimentary 'secrets management', but that's not its core competency. There are always rules - "only 100 secrets", or "Secrets can't be passed from job to job," or "We allow secrets but you can only put them in via our GUI" or "We use hashicorp vault as a secrets backend but we didn't study the docs so we're just using their 'approle' backend as a username/password." Looking at you, Mr Tabib.

I avoid secrets management as much as possible. My personal philosophy is runners should be split by network locality and permission role so there is no need to pass around secrets between jobs. Locality should be the access credentials you need, not the passwords you smuggled in to call something outside of what the runner was designed to use.
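A rough sketch of what "split by network locality and permission role" could mean when routing jobs. The `Runner` and `Job` types and the `pick_runner` function are hypothetical names made up for this example, not something a specific CI engine provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    name: str
    locality: str  # network zone the runner lives in, e.g. "prod-vpc"
    role: str      # what the runner's own identity is allowed to do


@dataclass(frozen=True)
class Job:
    name: str
    locality: str
    role: str


def pick_runner(job: Job, runners: list[Runner]) -> Runner:
    """Return the first runner whose locality and role both match the job."""
    for runner in runners:
        if (runner.locality, runner.role) == (job.locality, job.role):
            return runner
    # No fallback on purpose: a job never borrows another runner's access.
    raise LookupError(f"no runner for {job.locality}/{job.role}")
```

The point of the missing fallback is that a job which cannot find a matching runner fails loudly, rather than getting a password smuggled in so it can reach something from the wrong place.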

Every CI/CD has obscure, insane 'defaults' left and right. "Oh, you ran a job that was called by another job called by another job, and everything succeeded, but one of the if statements didn't run? Our 100% proprietary fancy-pants no-source observer calls that a failure!"

Pro cicd tips:

- Define for your org what you deem sane defaults. If you don't, you leave yourself open to the whims of the adhd devs who thought it was a good idea to lolcat their way into putting all of this shit in a browser.
- Normalize ONE shell for everything. If you're running the show and can't get over the alpine koolaid, make sure you are ALWAYS using ash. If you at one point were a sysadmin, force those fucking devs to run in bash and to get over it if their favorite idiot blog decided to rant for an hour about how secure and minimal alpine was.
- Always start every script with set -euxo pipefail so you can assert your shell is gonna spell out success/fail to the idiot cicd system using exit codes correctly. Do not expect devs programming shit for browsers to understand anything beyond non-zero exit.
- NEVER use if, rules, or conditional workflow in your cicd pipeline. If you do, you turn that pipeline into a little adaptor pattern monstrosity that multiplies its final states every time you add an if. The time it takes to debug conditional shit will ALWAYS cost more brain power than the extra time it takes to just run every step, the same way, in the same order, every time!
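The last two tips can be sketched in a few lines. With n independent conditions a pipeline has up to 2**n distinct paths; with none it has exactly one. The runner below is a minimal Python analogue of the `set -e` and `set -x` behaviour, not a replacement for the shell flags, and `run_steps` and the placeholder steps are names assumed for the example.

```python
import subprocess
import sys


def run_steps(steps: list[list[str]]) -> int:
    """Run each command in order and return the first non-zero exit code.

    There are no conditions: every run takes the same path until a step fails.
    """
    for command in steps:
        print("+", " ".join(command), flush=True)  # echo the command, like `set -x`
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode  # stop at the first failure, like `set -e`
    return 0


# Placeholder steps; real ones would be calls such as ["make", "test"].
STEPS = [
    [sys.executable, "-c", "print('lint')"],
    [sys.executable, "-c", "print('test')"],
]
```

Ending the script with `sys.exit(run_steps(STEPS))` hands the exit code straight to the cicd system, which is the only signal it needs.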


FOR GOD'S SAKE, YOU ARE A GLORIFIED SCRIPT RUNNER.

I prefer the term "glorified make caller".

I'm dying over here, you guys. What happened to the unix principle? What happened to each thing doing one thing really, really well and stringing it all together? Why does each CI/CD runner have a GUI and a CLI and two APIs and secrets management and slack integration and built in graphs and dashboards and runs shell scripts against my cappuccino maker?!

Because sysadminning is a lost art. Here's a hint though. Install it as a dumb linux or k8s service and admin them all the same way you would any other service. Everyone and their mother would love to be the supervisor that runs the show. If you want your sysadmin fu, you need to dumb shit down as much as you can and then supervise everything with the same one or two supervisors you manage everything else with. If it's not being managed by the same system you would manage an sshd server with, you're not going dumb enough and are gonna hate runner management.

Like, SO MUCH of our CI/CD in every place I've ever been could be done with docker containers and python. Github pull request? Send a webhook to the container, run the scripts, fuck off. Am I missing a GUI? I admit it might be a little irritating to build my own GUI, but everything else? Emailing people from a script? Slacking people from a script? Correctly gathering secrets from my vault instance from a script? CORRECTLY PASSING RETURN VALUES BETWEEN SCRIPTS?!

Correct. However, what your homebrewed gui probably isn't gonna provide, and what you're probably only gonna realize is priceless later down the line, are... job history and a shared execution environment. While yes, I could waste my morning and repeat every action some dev said "works on their box", I can't recreate their god damn home wifi and unobserved home network that allows connections anywhere in the world. That gui acts as a window to every failed attempt the dev had, so I can look through things to see where they went wrong. If you're running solo, use make and get over it. If you're running on a large distributed team and want an efficient way to share your struggles with your team so they can quickly help you, that gui will save everyone else the DAYS it takes to recreate what a dev won't stop saying doesn't work.
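For a sense of scale, here is a minimal sketch of the homebrew version with the one piece that usually gets forgotten: a history record per run. GitHub signs webhook bodies with HMAC-SHA256 and sends the result as `sha256=<hex>` in the `X-Hub-Signature-256` header; everything else here (the function names, the JSON-lines history file, the record fields) is an assumption for illustration.

```python
import hashlib
import hmac
import json
import subprocess
import time


def signature_ok(secret: bytes, body: bytes, header: str) -> bool:
    """Check a GitHub-style 'sha256=<hex>' HMAC signature over the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), header.encode())


def run_job(command: list[str], history_path: str) -> int:
    """Run one command and append what happened to a JSON-lines history file."""
    started = time.time()
    result = subprocess.run(command, capture_output=True, text=True)
    record = {
        "command": command,
        "started": started,
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    with open(history_path, "a", encoding="utf-8") as history:
        history.write(json.dumps(record) + "\n")
    return result.returncode
```

This covers "receive webhook, run script, keep a record". It does nothing for the shared execution environment, and a file of JSON lines is a long way from a window the whole team can browse, which is the gap the gui fills.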

But instead, I spend my days becoming a goddamn little cackling wizard on someone's particular dumbass piece of software with features we'll never use, hidden 'sane defaults' that get in the way, security that protects someone else, translating yaml to basic-bitch bash.

If you cook your scripts into the CI yaml itself, that's on you. Jam it all into a Makefile or script files and the logic becomes portable and friendly for your graybeards to quickly dig deep and take an interactive look.
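A Makefile is the version recommended here. The sketch below is only a Python stand-in for the same shape: one entrypoint that behaves identically on a laptop and on a runner, so the yaml shrinks to a single call per job. The task names and the `main` function are hypothetical.

```python
import argparse
import subprocess
import sys

# Hypothetical tasks; in a real repo each would call a script or a make target.
TASKS = {
    "lint": [sys.executable, "-c", "print('linting')"],
    "test": [sys.executable, "-c", "print('testing')"],
}


def main(argv: list[str]) -> int:
    """Run the named task and hand its exit code back to the caller."""
    parser = argparse.ArgumentParser(description="Single entrypoint for CI and laptops")
    parser.add_argument("task", choices=sorted(TASKS))
    args = parser.parse_args(argv)
    return subprocess.run(TASKS[args.task]).returncode
```

Saved as a file that ends with `sys.exit(main(sys.argv[1:]))`, the whole CI step becomes one line that anyone can also run interactively to dig into a failure.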

ugh. Thanks guys. rant over.

Thanks for sharing your thoughts.
If I may add my own 2 cents, I feel your pain. The journey of making CICD systems is a long one.
Somewhere along the way, we lost the art of letting a sysadmin design and manage a shared multi-user linux environment for all devs to collaborate on together. With how build systems have progressed though, we don't deserve an environment that nice.
So now we have this ephemeral, reproducible, distributed crap because someone somewhere was arrogant enough to claim their product could replace a good sysadmin. Good job. They then realized their mistake, had to layer on endless logic to clean up after devs who don't know how to keep a system clean, and had to put it all in a format that even idiot devs are able to understand: yaml.
If you wanna learn how the big kids do it, I highly suggest you go look at Linux, CPython, and Kubernetes, which all recognize make is still king for shared development. Quit letting idiot devs who refuse to read a manual reinvent the tool that is carrying all of our greatest pieces of software!