@holyjak
@holyjak
Last active May 2, 2019 09:46
Kubernetes resources on AWS EKS having the "no available volume zone" problem
$ kubectl describe pv
Name:            pvc-c95f0952-f160-11e8-9107-02fe87a39e2e
Labels:          failure-domain.beta.kubernetes.io/region=eu-west-1
                 failure-domain.beta.kubernetes.io/zone=eu-west-1a
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    gp2
Status:          Bound
Claim:           common/demo-db-storage-demo-db-deployment-0
Reclaim Policy:  Retain
Access Modes:    RWO
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://eu-west-1a/vol-00f2068b070550965
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:          <none>
# The demo-db pod is constantly in the Pending state; `kubectl describe pod ...` shows:
#   Warning  FailedScheduling  49s (x5 over 3m14s)  default-scheduler  0/2 nodes are available: 2 node(s) had no available volume zone.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: demo-db-read
  namespace: common
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: demo-db-read
  namespace: common
subjects:
  - kind: ServiceAccount
    name: default
    namespace: common
roleRef:
  kind: Role
  name: demo-db-read
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: demo-db-deployment
  namespace: common
spec:
  serviceName: demo-db
  replicas: 1
  template:
    metadata:
      labels:
        role: demo-db
        environment: stage
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo:4.0
          command:
            - mongod
            - "--bind_ip"
            - "0.0.0.0"
            - "--smallfiles"
            - "--noprealloc"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: demo-db-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=demo-db,environment=stage"
            - name: KUBE_NAMESPACE
              value: "common"
            - name: KUBERNETES_MONGO_SERVICE_NAME
              value: "demo-db-service"
  volumeClaimTemplates:
    - metadata:
        name: demo-db-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "gp2"
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: demo-db-service
  namespace: common
  labels:
    name: demo-db-service
spec:
  ports:
    - port: 27017
      targetPort: 27017
  selector:
    role: demo-db
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: demo-app-deployment
  namespace: common
spec:
  selector:
    matchLabels:
      app: demo-app
  replicas: 1
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: app
          image: 472188425588.dkr.ecr.eu-west-1.amazonaws.com/demo:v0.0.141
          ports:
            - containerPort: 3010
---
kind: Service
apiVersion: v1
metadata:
  name: demo-app-service
  namespace: common
spec:
  selector:
    app: demo-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3010
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: demo-app-ingress
  namespace: common
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: demo-app-service
              servicePort: 80

Resources for https://stackoverflow.com/questions/54439356/node-has-no-available-volume-zone-in-aws-eks

The eks-pod-no-avail-volume-zone.yml file is nearly the same as the one we use in prod; the only differences are the hostname and "prod" vs. "stage", so this file is most likely not the source of the problem.

The storage is defined, and the AWS console shows that I do have a gp2 volume attached to each worker node.
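A useful way to cross-check the zone placement behind a "no available volume zone" error (a sketch I would run, not something from the original question) is to compare the zone labels of the nodes against those of the PVs:

```shell
# List worker nodes with their availability-zone label
# (failure-domain.beta.kubernetes.io/zone on Kubernetes 1.10/1.11)
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone

# List persistent volumes with the same label; a PV pinned to a zone
# with no schedulable node cannot be attached to any pod
kubectl get pv -L failure-domain.beta.kubernetes.io/zone
```

If a PV's zone does not appear in the node list, the scheduler has no node it can place the pod on.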

One difference between stage (broken) and prod (working): the former runs Kubernetes 1.11, the latter 1.10.

Cause & Solution

I see from kubectl describe pv that in prod I have two persistent volumes while in stage I only have one. Moreover, the one in stage shows up in the AWS EC2 console as "available", not as "in-use".

I tried and failed to delete the PV: kubectl delete pv <id> says persistentvolume "" deleted but hangs after that, and if I kill it and run kubectl get pv, the volume is still listed (notice that I have also manually deleted the volume in AWS EC2):

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM                                         STORAGECLASS   REASON   AGE
pvc-c95f0952-f160-11e8-9107-02fe87a39e2e   1Gi        RWO            Retain           Terminating   common/demo-db-storage-demo-db-deployment-0   gp2                     156d
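A PV stuck in Terminating like this is typically being held back by its kubernetes.io/pv-protection finalizer (visible in the describe pv output above). A common workaround — a hedged sketch of a general technique, not something I did here — is to clear the finalizers so the pending deletion can complete:

```shell
# Remove the finalizers from the stuck PV so the pending delete can proceed.
# Use with care: this bypasses PV protection, which normally waits for the
# bound PVC to be deleted first.
kubectl patch pv pvc-c95f0952-f160-11e8-9107-02fe87a39e2e \
  -p '{"metadata":{"finalizers":null}}'
```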

Deleting the StatefulSet that created it did not help either.

Finally I succeeded by first deleting the persistent volume claim (PVC) for it. I had mistakenly expected that deleting the StatefulSet that used the PVC would delete it too, but StatefulSets leave their PVCs behind.
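Put together, the teardown has to happen in this order (a sketch assuming the resource names from the manifests above):

```shell
# 1. Delete the StatefulSet — this does NOT delete the PVC it created
kubectl delete statefulset demo-db-deployment -n common

# 2. Delete the PVC; only then is the bound PV released
kubectl delete pvc demo-db-storage-demo-db-deployment-0 -n common

# 3. With a Retain reclaim policy, the released PV must be deleted explicitly
kubectl delete pv pvc-c95f0952-f160-11e8-9107-02fe87a39e2e
```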

# This is in both prod and stage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
reclaimPolicy: Retain
mountOptions:
  - debug
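If zone mismatches keep recurring, one option (my assumption, not part of the original setup) is to restrict the StorageClass to the zones where worker nodes actually run, using the aws-ebs provisioner's zones parameter, so newly provisioned volumes always land in a schedulable zone:

```yaml
# Hypothetical variant of the gp2 StorageClass pinned to specific zones;
# adjust the zone list to match where your EKS worker nodes run.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-zoned
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
  zones: eu-west-1a, eu-west-1b
reclaimPolicy: Retain
```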