Reto Lehmann ReToCode

## kube_graceful_shutdown.txt
# Problem
We operate more than 3500 containers on a large OpenShift cluster. Many applications have the same problem with the current termination process. To achieve zero downtime during rolling updates, pod restarts, and node evacuations for maintenance, the following has to happen:

- The pod has to be killed due to one of the events mentioned above
- Kubernetes/OpenShift sends a SIGTERM signal
- The application has to catch the SIGTERM signal
- The application has to set its readiness probe to false to stop getting new traffic
- The application has to wait until no new connections are sent via the service
- In OpenShift, the application also has to wait until HA-Proxy is reloaded so that it no longer gets traffic from there. HA-Proxy also has its own health check, which is not in sync with the readiness state in Kubernetes. To signal HA-Proxy to stop sending traffic, the HTTP listener port has to be closed.
- The application has to finish its active requests (within the terminationGracePeriodSeconds)
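The application-side steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the setup described here: the `/ready` path, port 8080, the fixed `drain_seconds` delay, and the names `DrainingServer`, `graceful_shutdown` and `serve` are all assumptions made for the example. A fixed sleep stands in for "wait until the service and HA-Proxy have stopped sending traffic", which the application cannot observe directly.

```python
import signal
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once SIGTERM has been received; the readiness endpoint fails from then on.
shutting_down = threading.Event()


class DrainingServer(ThreadingHTTPServer):
    # Non-daemon request threads, so server_close() waits for in-flight requests.
    daemon_threads = False


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical readiness path: 503 once shutdown has started.
        # All other paths keep being served while the pod drains.
        failing = self.path == "/ready" and shutting_down.is_set()
        self.send_response(503 if failing else 200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def graceful_shutdown(server, drain_seconds):
    shutting_down.set()        # readiness probe starts failing
    time.sleep(drain_seconds)  # assumed delay for service endpoints and router reload
    server.shutdown()          # stop the accept loop
    server.server_close()      # close the listener port, then join active requests


def serve(port=8080, drain_seconds=10.0):
    server = DrainingServer(("0.0.0.0", port), Handler)

    def on_sigterm(signum, frame):
        # shutdown() must not run on the thread that runs serve_forever().
        threading.Thread(
            target=graceful_shutdown, args=(server, drain_seconds)
        ).start()

    signal.signal(signal.SIGTERM, on_sigterm)
    server.serve_forever()
```

Calling `serve()` as the container entrypoint would run the server. The drain delay plus the longest expected request has to fit within terminationGracePeriodSeconds, otherwise the process is killed before it finishes.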

## pruning.md

      
OpenShift image pruning. Last active November 23, 2017 07:20.

Generally:

- The pruner should never delete an image that is in use by any OpenShift object, including the beta and alpha types.
- The pruner should never delete layers that are used (also via cache) in such an image.

I don't know which issues you already know about, or which are already fixed in 3.5+. Here are a few things to consider if the basic solution stays the way it is today:

- Builds happen on a node based on Docker's caching logic and on the local Docker pool, not based on the upstream registry.
- This way, cache reuse happens based on the Docker logic and against the local Docker pool.
  -> The pruner must know about Docker's caching logic. This could be tricky if Docker changes that behaviour in a future version and the pruner is not aligned.


## Remove all node_modules folders recursively
# -prune stops find from descending into directories it is about to delete
find . -name "node_modules" -type d -prune -exec rm -rf '{}' +