Historically, in order to host a website, a company would contact server/hardware vendors to set up a server farm. But this approach had some serious downsides:
- Managing hardware was a pain.
- Security risk: there was no isolation between services running on the same server, so every service could do anything to any other service on that server.
- Say the server farm was sized to handle even Black Friday sales; all those resources sit idle on a normal day, so it is not cost-effective.
Now we have Virtual Machines, which are basically new instances of an operating system running inside a host operating system. By using VMs:
- We can easily control how much of the resources each service gets, and scale those resources up and down. The security risk is also largely gone, because one VM does not know about the other VMs running on the same server.
But:
- We still have the hardware-management problem.
- Running multiple OSes inside a host OS is also not optimal; it has performance overhead.
Cloud platforms like AWS and GCP have made our lives easy: we no longer have to contact vendors to build our own server farm.
- We can easily rent a server in a specific region of the world from these cloud providers. It is also relatively cheap.
- We can easily scale resources up and down according to our needs, which is cost-effective.
- We don't have to hire sysadmins or hardware maintainers to look after our servers.
But:
- To run separate services on one cloud server we still need multiple VMs, i.e. multiple OSes running inside the host OS the cloud provider gives us. So the performance problem of running multiple OSes inside a host OS is still not resolved.
Oh boy, this technology is a game changer. Containers give us all the features that VMs provide, like resource management and security, without having to run a whole other operating system. They use three features of the Linux kernel, i.e. chroot (change root), namespaces, and cgroups (control groups), to create secure isolation inside the host OS.
chroot, also called change-root, gives a process a separate filesystem: it is like creating a new root for that process. The process then has no idea about the outside world and cannot access the filesystem above it, because from its point of view the new root is (/), so it cannot go beyond the new root path.
$ mkdir my-new-root
$ chroot my-new-root /bin/bash -> Run bash with my-new-root folder as root dir
But the above command will fail with chroot: failed to run command '/bin/bash': No such file or directory
That is because inside the new root (my-new-root) chroot is looking for /bin/bash, but our my-new-root folder is empty, so let's fill it up with the required dependencies.
$ mkdir -p my-new-root/bin
$ cp /bin/bash my-new-root/bin/
$ ldd /bin/bash
linux-vdso.so.1 (0x00007fffa89d8000)
libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007f6fb8a07000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f6fb8803000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6fb8412000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6fb8f4b000)
$ mkdir my-new-root/lib my-new-root/lib64 // or: mkdir my-new-root/lib{,64}
$ cp /lib/x86_64-linux-gnu/libtinfo.so.5 /lib/x86_64-linux-gnu/libdl.so.2 /lib/x86_64-linux-gnu/libc.so.6 my-new-root/lib
$ cp /lib64/ld-linux-x86-64.so.2 my-new-root/lib64
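The copy steps above can be automated with a small script. This is a rough sketch (the script name is made up, and GNU `cp --parents` is assumed):

```shell
#!/bin/sh
# copy-into-root.sh -- hypothetical helper: copy a binary plus every
# shared library it links against into a chroot directory
set -e
ROOT=my-new-root
BIN=/bin/bash

mkdir -p "$ROOT/bin"
cp "$BIN" "$ROOT/bin/"

# ldd prints one line per shared library; keep only the absolute paths
for lib in $(ldd "$BIN" | awk '$3 ~ /^\// {print $3} $1 ~ /^\// {print $1}'); do
    cp --parents "$lib" "$ROOT/"   # --parents recreates /lib/... inside $ROOT
done
```

After it runs, `chroot my-new-root /bin/bash` should start, because the dynamic linker finds the libraries at the same paths it expects on the host.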
After doing this, we can run bash inside the new chrooted environment:
$ chroot my-new-root/ /bin/bash
bash4.5#
BUT, this is not complete isolation: a bash process running in a different chrooted environment (with a different root directory), or on the host itself, can still see the processes running inside our chrooted environment.
So, in order to hide processes from other environments, we can use namespaces.
Instead of copying every command and its libraries by hand, we can use a package called debootstrap to bootstrap our new root; it provides the minimal base filesystem a chrooted env needs to run properly.
$ apt-get update
$ apt install debootstrap
$ debootstrap --variant=minbase bionic /better-root // fetches the bare-minimum filesystem to run Ubuntu Bionic (Debian bootstrap)
$ cd /better-root
$ chroot . bash
root@56asd788fgd:/#
Namespaces allow you to hide processes from other processes. If we give each chroot'd environment a different set of namespaces, users on the system can't see each other's processes (they even get different PIDs, or process IDs, so they can't guess what the others have), and you can't steal or hijack what you can't see! We are not stopping any processes, just no longer sharing them between environments.
$ unshare --mount --uts --ipc --net --pid --fork --user --map-root-user chroot /better-root bash # this also chroot's for us
// unshare the listed namespaces for this chrooted bash process
Here we didn't just unshare the PID namespace; we also unshared the network (net), mount, IPC and other namespaces.
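You can inspect a process's namespaces without root: each one appears as a symlink under /proc/&lt;pid&gt;/ns, and two processes share a namespace exactly when the inode numbers in those links match. A quick check:

```shell
# List the namespaces the current shell belongs to
ls -l /proc/self/ns

# Each link resolves to "<type>:[<inode>]"; compare inodes between two
# processes to tell whether they share that namespace
readlink /proc/self/ns/pid
```

Run the same `readlink` inside and outside the unshared bash and you should see different PID-namespace inodes.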
As the demo above shows, the host OS can still kill any process running in the child environment, or terminate the child environment itself.
So cgroups were invented at Google. One of the problems Google faced when running separate teams' products on Google's servers was that one team's high-traffic product could crash another team's products. Each product ran in its own container, with its own processes and filesystem, but resources (CPU, memory) weren't isolated per product.
$ apt-get install -y cgroup-tools
$ apt-get install -y htop // to see processes and resources consumed in real time
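As a quick, unprivileged check: every Linux process already belongs to some cgroup, which you can read from /proc. Creating new groups needs root; the commented cgroup-tools commands below are an illustrative sketch only (v1-style controllers, and the group name "demo" is made up):

```shell
# Show which cgroup(s) the current shell is in (works without root)
cat /proc/self/cgroup

# With root and cgroup-tools (cgroups v1), limiting a process would look
# roughly like this -- illustrative only:
#   sudo cgcreate -g memory,cpu:demo                  # create the group
#   sudo cgset -r memory.limit_in_bytes=256M demo     # cap memory at 256 MB
#   sudo cgexec -g memory,cpu:demo /bin/bash          # run bash inside it
```

With a limit like that in place, a runaway process in "demo" gets killed by the kernel instead of taking down its neighbours.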
All of the above hassle is handled by a command-line tool called docker.
https://hub.docker.com/search?q=&type=image
Docker Hub is a public registry of pre-made containers. It is like npm for containers.
$ docker images // list all images stored locally
$ docker image rm <image-id> // delete a docker image
$ docker ps -a // list all containers, including stopped ones
$ docker ps // list running containers
$ docker pull mongo // pulls mongo from docker hub
$ docker run -it --detach ubuntu:bionic // runs the container in the background
$ docker ps // list running containers
$ docker attach <container-name> // name from the docker ps command
$ docker kill <container-id/container-name> // kill a running container; its metadata is kept until removed
$ docker rm <container-name> // remove a stopped container completely, including its metadata
$ docker run -it --name my-custom-name alpine:3.10 // we can give custom name to reference later
$ docker run -it --name my-custom-alpine --rm alpine:3.10 // on exit, the container and its data are removed automatically
$ docker rmi <image-name> // removes an image
$ docker container prune // remove all stopped containers
$ docker restart <container-name> // restarts the container
$ docker search python // searches Docker Hub for python images
$ docker run -it node:14-stretch // runs the node REPL by default, on Debian Stretch
$ docker run -dit node:14-stretch // runs the node:14-stretch image in the background
$ docker ps // list running containers
$ docker pause <container-id> // to pause container
$ docker unpause <container-id> // to unpause container
$ docker kill <container-id> // to kill container process
$ docker kill $(docker ps -q) // kill all running containers; $(...) is command substitution, which runs another shell
docker run
is going to start a new container.
$ docker run -it ubuntu:bionic // it is running a new container
docker exec
is going to run a command in an existing container
$ docker exec <container-id/container-name> pwd // execute pwd inside the container with the given id or name
Building docker images with a Dockerfile. Docker reads this file and uses it to build images, from which you run containers.
// Dockerfile
FROM node:14-stretch
CMD ["node", "-e", "console.log(\"hi there\")"]
$ docker build --tag my-node-app <path-to-folder-containing-Dockerfile> // docker will find the Dockerfile there and build the image
$ docker run <image-id/tag-name>
$ docker run --init --rm --publish 3000:3000 my-node-server
--init -> runs tini as PID 1 so signals like SIGTERM (e.g. from "^C") are forwarded to our application
--rm -> remove the container and its data on exit
--publish 3000:3000 -> punches a hole in Docker's network isolation, mapping host port 3000 to container port 3000
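For reference, the my-node-server image used above could be built from a Dockerfile along these lines. This is only a sketch, assuming a hypothetical index.js that listens on port 3000:

```dockerfile
FROM node:14-stretch

# all subsequent commands run relative to /app
WORKDIR /app

# index.js is a hypothetical server file that listens on port 3000
COPY index.js .

# documents the port we --publish at run time
EXPOSE 3000

CMD ["node", "index.js"]
```

Build it with docker build --tag my-node-server . and then run it with the docker run command shown above.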