I'm sorry

This is gonna seem like a fuckload of irrelevant CS shit, but I swear it will all come together in the end.

Linux

What is Linux? Technically speaking, just a kernel. A kernel CAN be built to only include a scheduler and IPC (inter-process communication) layer: the bits that dictate which process runs in what order, how processes talk to each other, and occasionally [depending on the design] how memory is allocated for a particular process.

Side note: A process is a single piece of running code. You can think of it like a thought. A single action (like picking something up from the ground or deciding where to go to eat) CAN be one thought, but is usually many processes working together, interlinked. But that's getting into userspace, which I'll cover in a second.

But more often than not, a kernel includes a lot of other bits to understand what your hardware does: basic fallback graphics/x86 drivers (x86 is a type of CPU that runs a specific set of instructions called x86), as well as stuff like file system drivers (so on Linux one of the options is ext3, but on Windows you get NTFS; this is HOW the 1s and 0s are arranged/organized on your drive to make up the data that's read from the disk).

Linux is called a monolithic kernel for this reason (mono meaning one, lithic meaning pertaining to stone: literally "formed of a single block"): it includes EVERY GPU/CPU/etc driver directly in the kernel. When you start your computer, it just immediately knows where to look for the specific hardware you have in order to know how to display to the screen in the most optimized way/etc. (I'll loop back to kernel design because it's pretty important to understanding kernels; for now just assume that's how all kernels work.)

From there, it starts "userspace", AKA everything that's not the kernel. Before userspace starts up, there's no way to process anything. There's no way to pass input to a process and then to output. In order to do this, we have to take the raw IO data and allow user interaction in some way. With Linux, this comes in the form of GNU packages. If you take a SUPER minimal install of just Linux + sh, you're left with a terminal, no GUI, but you can do some basic scripting (echo "Hi" to output to the screen).

Side note: This is what Stallman means when he says that "Linux" is really GNU/Linux, as most of what the user interacts with is GNU. If GNU were programmed to be more kernel neutral, it would run/interact with the user exactly the same on a BSD kernel, but that's getting into some fucking deep-ass-cut CS history. I'd love to go into more about this if you'd like, but I'm getting off track.

But most distros don't stop here: they add on XServer (or nowadays: Wayland + XServer) in order to display graphics on screen, they include GTK/Qt to allow windows and window controls to be drawn on screen, a window manager like Mutter to handle the windows' positions on the screen, etc etc etc. (Just like programming requires a lot of thoughts going on at once, your computer runs a FUCKLOAD of processes on top of each other/interlaced with one another in order to run in the way we've come to expect.)
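Before we move on: to make the kernel/userspace split concrete, here's a tiny sketch of poking at both sides from that bare-bones shell (assuming a typical Linux box where coreutils and procps are around):

```sh
# Ask the kernel to identify itself; uname gets this via a syscall into kernelspace.
uname -r        # e.g. 6.8.0-45-generic

# Everything else here is userspace: the shell parsing this line, echo writing
# to the terminal, ps reading the process list out of /proc.
echo "Hi"
ps -e | head    # a peek at a handful of the many processes running right now
```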

So now that I've very basically explained kernelspace and userspace, let's take a step back and explain some hardware-level stuff (so that I can loop back to virtualization).

When you press the power button on your computer, you're closing a physical switch that completes a physical circuit. Completing this circuit allows the machine to send power from your PSU (power supply) to the motherboard.

[Everything I just said is not even close to true on phones or laptops, since virtual power buttons that aren't real physical switches are a thing, but pretending this is how it works on everything makes your brain not explode.]

When the power reaches your motherboard, it passes power on to your CPU, RAM, and storage. Your CPU will start up, start looking for a specific sector on your storage (the boot sector), process that information, then load it into RAM (this is where your kernel comes into play; the scheduler will from there figure out what else to run, where to store it in RAM, etc).
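If you want to see that specific sector with your own eyes, here's a sketch. I'm assuming an old-school BIOS/MBR setup with the disk at /dev/sda (UEFI machines boot differently), xxd installed (it usually ships with vim), and root access:

```sh
# The first 512 bytes of the disk are the classic boot sector (the MBR).
# The firmware loads it into RAM and jumps into it; it ends with the
# magic bytes 55 aa.
sudo dd if=/dev/sda bs=512 count=1 2>/dev/null | xxd | tail -n 2
```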

Virtualization is simulating all of this. But instead of physical hardware, it's software simulating the hardware in order to run software built for other hardware.

A good example of this is the Android device emulator. Your phone runs on the ARM architecture; your laptop/desktop runs on x86. These instruction sets are not compatible, and in order to run an Android app on your laptop, you must simulate a whole phone running (including kernel, userspace, window drawing, etc) in order to install your application onto this "virtual device" (AKA virtual machine), as the app isn't compatible with x86. (I know there are x86 Android devices and emulators; just pretend they don't exist for now.)

Now obviously you can also emulate (AKA simulate) an x86 machine from an x86 machine. Totally can: you can recreate a CPU virtually, with your real CPU figuring out what THAT CPU should run, just like any other program on your device. And that's something like VirtualBox or VMWare.

But if you're running x86 and it's compatible with x86... why not just use your kernel, and then just recreate all the userland (XServer, etc) instead of emulating another x86 kernel/processor/etc?? Wouldn't emulation be slower than that? Congrats, you made containers.

Containers are exactly that concept. They reuse the host's kernel (usually) in order to run many processes in entirely separated environments that're supppper sandboxed off from one another (cuz outside kernelspace, they're entirely different operating systems).

Okay, so now we understand containers (roughly; feel free to smack my punk ass, because I probably just skipped 200340298309274 things, either cuz I thought I was going too in-depth or am just dumb). But how does Kubernetes come into play?

Well, first let's start with something like Docker. Docker is 'just' [don't kill me, Docker community] a way of running these containers. git clone repo.git && cd repo && docker compose up and you can have a VASTLY complex native application installed and set up immediately, because you're literally saying "get the entire operating system and set it up exactly the way we want it to be".

Because of how easy it makes running complex operations, a LOT of people/companies have gone "wait, why not use this to deploy our server code? We can just ship the whole OS to users and not have to worry about them mucking up upgrades or config". A nice little side effect is that, just like debugging application errors can be "did you try reinstalling?", now you can say the same to users of your server: "did you try reinstalling the container?"

Wait, I'm not done tho. So now we're all using Docker to run our servers: hooray!
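Side note: you can actually watch the "reuses the host's kernel" part happen with Docker itself. A quick sketch, assuming Docker is installed and you're fine pulling the stock ubuntu image:

```sh
# Boot an entire Ubuntu userland as a container: no emulated CPU involved.
docker run -it --rm ubuntu bash

# Then, inside the container:
cat /etc/os-release   # says Ubuntu, even if your host is Fedora/Arch/whatever
uname -r              # prints your HOST's kernel version; the kernel is shared
exit                  # --rm throws the container away on exit
```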

But wait, what about applications that require many servers for various reasons (EG my company has a PDF server in one language and other parts of our app in another, for load balancing reasons/etc)? Do we just ship them all in many different Docker instances? Many containers? What happens if they all go down? How do we know what to upgrade or restart if ONE goes wrong? How do we configure networks between them to keep their networks open internally but not exposed publicly?

This is where Kubernetes comes into play. Kubernetes is an "orchestration" layer for your containers. You build around this layer, and Kubernetes makes deploying/working with your stuff much easier and more scalable.

One of the main points of Kubernetes is to keep your containers stateless (so no databases [again, this isn't strictly true, it's a little more complex than this], as databases have state; no config based on modifiable data; etc). Why? So that:

  1. Let's say things go wrong. "Reinstall the container" isn't a good answer if your container is many, many gigs when you scale to huge-sized companies. Just restart that shit and you're golden. No state means you can safely prove that your code will run EXACTLY the same after a restart. This means that Kubernetes allows your servers to "self-heal" after critical errors.
  2. What happens when you have a million users trying to access a single server for everything? Answer? Bad things. No single machine can handle that kind of load. You solve this by doing something called "load balancing" (this isn't unique to Kubernetes or containers; this is general server practice). This usually means deploying many servers with your server code/config and then setting up your network in complex BS ways. Well, what if you didn't have anything in your server code that needed to be configured by hand? Or rather, it could be configured automatically, because you already have your server install script thanks to containers, and you know for a fact it will run EXACTLY the same for any instance of the server because you aren't allowed to have state? (One of the hardest parts of load balancing usually is maintaining state between servers.)

Wellllllll, Kubernetes realizes this and just spins up another server, puts it where it needs to go, and moves users to that server as needed. Don't have as many users using that one feature for a long time? No big deal: just remove that extra server container and move users back to the other one.

Kubernetes is basically perfectly infinitely scalable********* for that reason. You can have as many users as you want move between however many resources you have by dynamically scaling your containers up/down, all of which is handled ("orchestrated") by Kubernetes.
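To give you a flavor of what "orchestrated" means in practice, here's a sketch of that dance at the command line. It assumes you have a working cluster, and my-server is a hypothetical image/deployment name:

```sh
# "I want 3 identical, stateless copies of this server running at all times."
kubectl create deployment my-server --image=my-server:latest --replicas=3

# Put a load balancer in front of them so users hit whichever copy is free.
kubectl expose deployment my-server --port=80

# Traffic spike? Ask for more copies; Kubernetes spins them up and spreads users out.
kubectl scale deployment my-server --replicas=10

# Delete the pods out from under it; Kubernetes notices the count dropped
# and spins up replacements. That's the "self-healing".
kubectl delete pods -l app=my-server
```

That's the whole pitch: because the containers are stateless and identical, "add another one" and "throw that one away" are both safe, boring operations.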
