Test of Velero Restic-based backup of PVs, PVCs, CSI Snapshots, StatefulSets, etc. from one cluster to another

Install minio

See the minio.txt

Port-forward in green--1

[root@green--1 ~]# kubectl port-forward minio-64b7c649f9-tt7km --address 0.0.0.0 7000:9000 --namespace minio

http://10.131.232.223:7000/minio/velero/backups/ from https://velero.io/docs/v1.4/contributions/minio/

With S3 port forwarding

http://10.131.232.223:7000/

Create ./credentials-velero

[default]
aws_access_key_id = minio
aws_secret_access_key = minio123

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.0.0,velero/velero-plugin-for-csi:v0.1.1 \
  --bucket velero2 \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://192.168.0.30:7000 \
  --image velero/velero:v1.4.0 \
  --snapshot-location-config region="default" \
  --use-restic


If you are using an Ingress

http://minio.10.131.232.223.nip.io

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.0.0,velero/velero-plugin-for-csi:v0.1.1 \
  --bucket velero2 \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://minio.10.131.232.223.nip.io \
  --image velero/velero:v1.4.0 \
  --snapshot-location-config region="default" \
  --use-restic


Without CSI

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.0.0 \
  --bucket velero2 \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://minio.10.131.232.223.nip.io \
  --image velero/velero:v1.4.0 \
  --snapshot-location-config region="default" \
  --use-restic

Rook-Ceph CSI Snapshot was not supported

https://github.com/vmware-tanzu/velero-plugin-for-csi/issues/53 rook/rook#4624 (comment)

-- So we need to use restic. From the Velero docs: "We integrated restic with Velero so that users have an out-of-the-box solution for backing up and restoring almost any type of Kubernetes volume. This is a new capability for Velero, not a replacement for existing functionality. If you're running on AWS, and taking EBS snapshots as part of your regular Velero backups, there's no need to switch to using restic. However, if you've been waiting for a snapshot plugin for your storage platform, or if you're using EFS, AzureFile, NFS, emptyDir, local, or any other volume type that doesn't have a native snapshot concept, restic might be for you."

time="2020-08-05T06:41:53Z" level=fatal msg="The 'EnableCSI' feature flag was specified, but CSI API group [snapshot.storage.k8s.io/v1beta1] was not found." logSource="pkg/cmd/server/server.go:569"

[root@green--1 velero]# k get VolumeSnapshotClass -n rook-ceph
NAME                      AGE
csi-rbdplugin-snapclass   13d


[root@green--1 velero]# k get VolumeSnapshotClass -n rook-ceph -o yaml
apiVersion: v1
items:
- apiVersion: snapshot.storage.k8s.io/v1alpha1   # --> expected is snapshot.storage.k8s.io/v1beta1
  kind: VolumeSnapshotClass
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"snapshot.storage.k8s.io/v1alpha1","kind":"VolumeSnapshotClass","metadata":{"annotations":{},"name":"csi-rbdplugin-snapclass"},"parameters":{"clusterID":"rook-ceph","csi.storage.k8s.io/snapshotter-secret-name":"rook-ceph-csi","csi.storage.k8s.io/snapshotter-secret-namespace":"rook-ceph"},"snapshotter":"rook-ceph.rbd.csi.ceph.com"}
    creationTimestamp: "2020-07-23T04:50:59Z"
    generation: 1
    name: csi-rbdplugin-snapclass
    resourceVersion: "45121668"
    selfLink: /apis/snapshot.st

Without CSI support

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.0.0,velero/velero-plugin-for-csi:v0.1.1 \
  --bucket velero \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://192.168.0.30:7000 \
  --image velero/velero:v1.4.0 \
  --snapshot-location-config region="default" \
  --use-restic

--

[root@green--1 velero]# k get pods -n velero
NAME                     READY   STATUS    RESTARTS   AGE
restic-26885             1/1     Running   0          4m5s
restic-bvr4r             1/1     Running   0          4m5s
restic-gncgn             1/1     Running   0          4m5s
restic-vdhlq             1/1     Running   0          4m5s
velero-7cbbdc689-dwx6d   1/1     Running   0          4m5s

--

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-ext
  labels:
    app: nginx
  namespace: test-nginx
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi 
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  namespace: test-nginx
spec:
  volumes:
    - name: mystorage
      persistentVolumeClaim:
        claimName: ceph-ext
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: mystorage
EOF
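
Before writing data it is worth confirming that the claim actually bound and the pod came up; these are standard kubectl checks added here for completeness, not part of the original run:

kubectl -n test-nginx get pvc ceph-ext
kubectl -n test-nginx get pod nginx-test -o wide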

Let's write some data in there

[root@green--1 velero]# k -n test-nginx exec -it nginx-test -- /bin/bash
root@nginx-test:/usr/share/nginx/html# while true; do echo -n "my test " >> /usr/share/nginx/html/index.html ; done

root@nginx-test:/# ls -l /usr/share/nginx/html/index.html
-rw-r--r-- 1 root root 2525800 Aug 6 09:27 /usr/share/nginx/html/index.html
root@nginx-test:/#

kubectl -n test-nginx annotate pod/nginx-test backup.velero.io/backup-volumes=mystorage
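
A quick way to confirm the opt-in annotation is in place (added verification step, not in the original notes):

kubectl -n test-nginx get pod nginx-test -o jsonpath='{.metadata.annotations}'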

[root@green--1 velero]# velero backup create test-nginx-b4 --include-namespaces test-nginx --wait
Backup request "test-nginx-b2" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
......
Backup completed with status: PartiallyFailed. You may check for more information using the commands velero backup describe test-nginx-b2 and velero backup logs test-nginx-b2.

[root@green--1 velero]# velero backup logs test-nginx-b2 | grep error
time="2020-08-05T12:39:04Z" level=error msg="Error getting volume snapshotter for volume snapshot location" backup=velero/test-nginx-b2 error="rpc error: code = Unknown desc = missing region in aws configuration" error.file="/go/src/github.com/vmware-tanzu/velero-plugin-for-aws/velero-plugin-for-aws/volume_snapshotter.go:78" error.function="main.(*VolumeSnapshotter).Init" logSource="pkg/backup/item_backupper.go:437" name=pvc-a7a87cee-abb2-4db8-a445-fc95b4f8a237 namespace= persistentVolume=pvc-a7a87cee-abb2-4db8-a445-fc95b4f8a237 resource=persistentvolumes volumeSnapshotLocation=default
[root@green--1 velero]#


Solution: edit the VolumeSnapshotLocation and add a region

k edit VolumeSnapshotLocation -n velero

apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  creationTimestamp: "2020-08-05T09:30:21Z"
  generation: 2
  labels:
    component: velero
  name: default
  namespace: velero
  resourceVersion: "52019245"
  selfLink: /apis/velero.io/v1/namespaces/velero/volumesnapshotlocations/default
  uid: bd3f40e3-5a24-4b13-ab7e-51100ae9fc0a
spec:
  config:
    profile: default
    region: us-east-1   # --> add this
  provider: aws
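
Instead of editing interactively, the same region fix could probably be applied non-interactively; a sketch using kubectl patch (assumed equivalent, not what was run here):

kubectl -n velero patch volumesnapshotlocation default --type merge -p '{"spec":{"config":{"region":"us-east-1"}}}'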

[root@green--1 ~]# velero backup create test-nginx-b4 --include-namespaces test-nginx --wait
Backup request "test-nginx-b4" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
..
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test-nginx-b4 and velero backup logs test-nginx-b4.


[root@green--1 velero]# velero get backups
NAME            STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-test-2    PartiallyFailed   4        0          2020-08-05 15:23:19 +0530 IST   29d       default
test-nginx-b4   Completed         0        1          2020-08-06 11:27:30 +0530 IST   29d       default

This is the state of the namespace

--

[root@green--1 velero]# k get pods,pvc,pv -n test-nginx
NAME             READY   STATUS    RESTARTS   AGE
pod/nginx-test   1/1     Running   0          18h

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/ceph-ext   Bound    pvc-a7a87cee-abb2-4db8-a445-fc95b4f8a237   1Gi        RWO            rook-ceph-block   18h

NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS      REASON   AGE
                                                           test-nginx/ceph-ext   rook-ceph-block            18h

velero restore create --from-backup test-nginx-b4

http://10.131.232.223:7000/minio/velero/backups/nginx-test/

-- Now delete the namespace

k delete ns test-nginx

and restore

velero restore create --from-backup test-nginx-b4
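
The restore can be tracked with the standard Velero commands (added for completeness; substitute the generated restore name):

velero restore get
velero restore describe <restore-name>
velero restore logs <restore-name>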


[root@green--1 velero]# k get pods,pvc,pv -n test-nginx
NAME             READY   STATUS              RESTARTS   AGE
pod/nginx-test   0/1     ContainerCreating   0          27m

NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/ceph-ext   Lost     pvc-a7a87cee-abb2-4db8-a445-fc95b4f8a237   0                         rook-ceph-block   27m


Type     Reason        Age                     From                Message
----     ------        ----                    ----                -------
Normal   Scheduled     7m22s                   default-scheduler   Successfully assigned test-nginx/nginx-test to green--4
Warning  FailedMount   2m32s (x8 over 7m9s)    kubelet, green--4   Unable to attach or mount volumes: unmounted volumes=[mystorage], unattached volumes=[default-token-rf2ll mystorage]: error processing PVC test-nginx/ceph-ext: PVC is not bound
Warning  FailedMount   113s (x18 over 7m21s)   kubelet, green--4   Unable to attach or mount volumes: unmounted volumes=[mystorage], unattached volumes=[mystorage default-token-rf2ll]: error processing PVC test-nginx/ceph-ext: PVC is not bound


[root@green--1 velero]# velero backup logs test-nginx-b4 | grep restic
time="2020-08-06T05:57:31Z" level=info msg="Getting items for resource" backup=velero/test-nginx-b4 group=velero.io/v1 logSource="pkg/backup/item_collector.go:106" resource=resticrepositories
time="2020-08-06T05:57:31Z" level=info msg="Listing items" backup=velero/test-nginx-b4 group=velero.io/v1 logSource="pkg/backup/item_collector.go:231" namespace=test-nginx resource=resticrepositories
time="2020-08-06T05:57:31Z" level=info msg="Retrieved 0 items" backup=velero/test-nginx-b4 group=velero.io/v1 logSource="pkg/backup/item_collector.go:237" namespace=test-nginx resource=resticrepositories
time="2020-08-06T05:57:32Z" level=warning msg="No volume named ceph-ext found in pod test-nginx/nginx-test, skipping" backup=velero/test-nginx-b4 logSource="pkg/restic/backupper.go:135" name=nginx-test namespace=test-nginx resource=pods

--

after restore


[root@green--1 ~]# k get pods -n test-nginx
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   1/1     Running   0          6m24s

[root@green--1 ~]# k -n test-nginx exec -it nginx-test -- /bin/bash
root@nginx-test:/# ls -l /usr/share/nginx/html/index.html
-rw-r--r-- 1 root root 2525800 Aug 6 09:27 /usr/share/nginx/html/index.html


pvc-15ff4984-4f14-4e27-b085-56be567b0577 1Gi RWO Retain Bound test-nginx/ceph-ext
pvc-5f175897-ceb4-4bb5-b04f-377a1d9d20b4 1Gi RWO Retain Released test-nginx/ceph-ext

-- testing performance

root@nginx-test:/# dd if=/dev/urandom of=/usr/share/nginx/html/test-file2.txt count=512000 bs=1024
512000+0 records in
512000+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 12.1939 s, 43.0 MB/s
root@nginx-test:/# ls /usr/share/nginx/html/test-file.txt
/usr/share/nginx/html/test-file.txt
root@nginx-test:/# ls -l /usr/share/nginx/html/test-file.txt
-rw-r--r-- 1 root root 524288000 Aug 7 05:11 /usr/share/nginx/html/test-file.txt

root@nginx-test:/# ls -l /usr/share/nginx/html/
total 1005100
-rw-r--r-- 1 root root   2525800 Aug 6 09:27 index.html
-rw-r--r-- 1 root root 524288000 Aug 7 05:11 test-file.txt
-rw-r--r-- 1 root root 501358592 Aug 7 06:54 test-file2.txt
-rw-r--r-- 1 root root   1048576 Aug 7 04:31 testfile

/dev/rbd2 1014M 1014M 20K 100% /usr/share/nginx/html

--

[root@green--1 ~]# velero backup create test6 --include-namespaces test-nginx --wait
Backup request "test6" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
..........................................
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test6 and velero backup logs test6.

root@nginx-test:/# ls -l /usr/share/nginx/html/
total 1005100
-rw-r--r-- 1 root root   2525800 Aug 6 09:27 index.html
-rw-r--r-- 1 root root 524288000 Aug 7 07:06 test-file.txt
-rw-r--r-- 1 root root 501358592 Aug 7 06:54 test-file2.txt
-rw-r--r-- 1 root root   1048576 Aug 7 04:31 testfile
root@nginx-test:/#

1 GB in around 43 seconds

exit

[root@green--1 ~]# velero backup create test7 --include-namespaces test-nginx --wait
Backup request "test7" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.........................................
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test7 and velero backup logs test7.

Since we used random strings, compression was minimal, and hence exactly 1 GB was stored in S3.

--- Test 3: Expanded PV to 2 GB

root@nginx-test:/# dd if=/dev/urandom of=/usr/share/nginx/html/test-file3.txt count=1024000 bs=1024
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 32.752 s, 32.0 MB/s
root@nginx-test:/# ls -l /usr/share/nginx/html/
total 2029100
-rw-r--r-- 1 root root    2525800 Aug 6 09:27 index.html
-rw-r--r-- 1 root root  524288000 Aug 7 07:06 test-file.txt
-rw-r--r-- 1 root root  501358592 Aug 7 06:54 test-file2.txt
-rw-r--r-- 1 root root 1048576000 Aug 7 07:20 test-file3.txt
-rw-r--r-- 1 root root    1048576 Aug 7 04:31 testfile

Transferring 1 GB

[root@green--1 ~]# velero backup create test9 --include-namespaces test-nginx --wait
Backup request "test9" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.

Unsuccessful; a previous backup got aborted and Velero was stuck.

Logs for backup "test9" are not available until it's finished processing. Please wait until the backup has a phase of Completed or Failed and try again.

[root@green--1 ~]# velero get backups
NAME            STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-test-2    PartiallyFailed   4        0          2020-08-05 15:23:19 +0530 IST   28d       default
test-nginx-b4   Completed         0        0          2020-08-06 14:58:33 +0530 IST   29d       default
test2           Completed         0        0          2020-08-06 17:19:06 +0530 IST   29d       default
test3           Completed         0        0          2020-08-07 10:31:32 +0530 IST   29d       default
test4           Completed         0        0          2020-08-07 10:35:21 +0530 IST   29d       default
test5           Completed         0        0          2020-08-07 12:11:51 +0530 IST   29d       default
test6           Completed         0        0          2020-08-07 12:31:45 +0530 IST   29d       default
test7           Completed         0        0          2020-08-07 12:37:27 +0530 IST   29d       default
test9           New               0        0                                          29d       default
[root@green--1 ~]#

--

time="2020-08-07T09:16:44Z" level=info msg="Found 1 backups in the backup location that do not exist in the cluster and need to be synced" backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:196"
time="2020-08-07T09:16:44Z" level=info msg="Attempting to sync backup into cluster" backup=test8 backupLocation=default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:204"
time="2020-08-07T09:16:44Z" level=error msg="Error getting backup metadata from backup store" backup=test8 backupLocation=default controller=backup-sync error="rpc error: code = Unknown desc = error getting object backups/test8/velero-backup.json: NoSuchKey: The specified key does not exist.\n\tstatus code: 404, request id: 1628F1AFB2146226, host id: " error.file="/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:261" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:208"
^C

2 GB in 5 minutes

Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.....................................................................................................................................................................................................................................................................................................................................
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test10 and velero backup logs test10.

1.98 GB in 326 seconds

Incremental Backup

[root@green--1 velero]# velero backup create test11 --include-namespaces test-nginx --wait
Backup request "test11" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
..................
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test11 and velero backup logs test11.


25 MB in 7 seconds

[root@green--1 velero]# velero backup create test12 --include-namespaces test-nginx --wait
Backup request "test12" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
......
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test12 and velero backup logs test12.
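
The s-test-nginx-* backups that appear later in the bucket look like they came from a Velero schedule; a sketch of how such a 5-minute schedule could be created (an assumption, the actual schedule definition is not in these notes):

velero schedule create s-test-nginx --schedule="*/5 * * * *" --include-namespaces test-nginx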

Velero restore to another namespace ?

Backup and Restore from One Cluster to Another

Green Cluster

NGINX details

[root@green--1 ~]# kubectl -n test-nginx get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-ext   Bound    pvc-bced999f-32e3-4275-8f40-77699ce123b7   2Gi        RWO            rook-ceph-block   24d

PVC details

[root@green--1 ~]# kubectl -n test-nginx describe pvc ceph-ext
Name:          ceph-ext
Namespace:     test-nginx
StorageClass:  rook-ceph-block
Status:        Bound
Volume:        pvc-bced999f-32e3-4275-8f40-77699ce123b7
Labels:        app=nginx
               velero.io/backup-name=test4
               velero.io/restore-name=test4-20200807103720

Pod file details

root@nginx-test:/usr/share/nginx/html# du -h .
0       ./.velero
2.0G    .
root@nginx-test:/usr/share/nginx/html# ls -laSh
total 2.0G
-rw-r--r-- 1 root root 1000M Aug  7 07:20 test-file3.txt
-rw-r--r-- 1 root root  500M Aug  7 07:06 test-file.txt
-rw-r--r-- 1 root root  479M Aug  7 06:54 test-file2.txt
-rw-r--r-- 1 root root   24M Aug  7 11:29 test-file4.txt
-rw-r--r-- 1 root root  2.5M Aug  6 09:27 index.html
-rw-r--r-- 1 root root  1.0M Aug  7 04:31 testfile
drwxr-xr-x 3 root root   142 Aug  7 11:29 .
drwxr-xr-x 2 root root    50 Aug  7 05:07 .velero
drwxr-xr-x 3 root root    18 Aug  5 00:27 ..

root@nginx-test:/# ls -laS /usr/share/nginx/html
total 2053292
-rw-r--r-- 1 root root 1048576000 Aug  7 07:20 test-file3.txt
-rw-r--r-- 1 root root  524288000 Aug  7 07:06 test-file.txt
-rw-r--r-- 1 root root  501358592 Aug  7 06:54 test-file2.txt
-rw-r--r-- 1 root root   24772608 Aug  7 11:29 test-file4.txt
-rw-r--r-- 1 root root    2525800 Aug  6 09:27 index.html
-rw-r--r-- 1 root root    1048576 Aug  7 04:31 testfile
drwxr-xr-x 3 root root        142 Aug 31 12:08 .
drwxr-xr-x 2 root root         50 Aug 31 12:08 .velero
drwxr-xr-x 3 root root         18 Aug 14 00:36 ..

Prepare for BackUp

kubectl -n test-nginx  annotate pod/nginx-test backup.velero.io/backup-volumes=mystorage

BackUp

velero backup create test-4 --include-namespaces test-nginx --wait
Backup request "test-4" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
...
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe test-4` and `velero backup logs test-4`.
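
Before deleting the namespace it helps to confirm what actually went into the backup (standard Velero commands, added here for completeness):

velero backup describe test-4 --details
velero backup logs test-4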

Test in Cluster 1


[root@green--1 ~]# kubectl delete ns test-nginx

velero restore create --from-backup test-4

[root@green--1 ~]# velero get restore
NAME                    BACKUP   STATUS      ERRORS   WARNINGS   CREATED                         SELECTOR
test-4-20200831173655   test-4   Completed   0        0          2020-08-31 17:36:55 +0530 IST   <none>

[root@green--1 ~]# kubectl -n test-nginx get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-ext   Bound    pvc-0ce3ed03-e323-41d0-b4b0-a2fc865060d1   2Gi        RWO            rook-ceph-block   3m29s

All Seems well

In Cluster 2:

Install and Connect Velero to same S3 (velero2)

https://velero.io/docs/v1.4/basic-install/

velero install \
 --provider aws \
 --plugins velero/velero-plugin-for-aws:v1.0.0,velero/velero-plugin-for-csi:v0.1.1 \
 --bucket velero2  \
 --secret-file ./credentials-velero  \
 --use-volume-snapshots=true \
 --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://192.168.0.30:7000  \
 --image velero/velero:v1.4.2  \
 --snapshot-location-config region="default" \
 --use-restic

Check backup locations

root@k8s-storage-1:~# velero backup-location get 
NAME      PROVIDER   BUCKET/PREFIX   ACCESS MODE
default   aws        velero2         ReadWrite
root@k8s-storage-1:~# velero get backup
NAME                 STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
blue                 Completed         0        0          2020-08-25 11:19:09 +0000 UTC   23d       default            <none>
green                PartiallyFailed   1        0          2020-08-25 11:18:05 +0000 UTC   23d       default            <none>
ingress-controller   Completed         0        0          2020-08-25 11:14:36 +0000 UTC   23d       default            <none>
test                 PartiallyFailed   1        0          2020-08-25 11:14:02 +0000 UTC   23d       default            <none>
test-4               Completed         0        0          2020-08-31 10:23:58 +0000 UTC   29d       default            <none>
test-nginx-b4        Completed         0        0          2020-08-25 09:11:34 +0000 UTC   23d       default            <none>
test10               Completed         0        0          2020-08-07 11:05:39 +0000 UTC   5d        default            <none>
test11               Completed         0        0          2020-08-07 11:13:10 +0000 UTC   5d        default            <none>
test12               Completed         0        0          2020-08-07 11:30:48 +0000 UTC   6d        default            <none>
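
When two clusters point at the same bucket like this, the Velero migration docs suggest making the backup location read-only on the restore cluster so it cannot overwrite backups; a sketch of that optional step (not done in this test):

kubectl -n velero patch backupstoragelocation default --type merge -p '{"spec":{"accessMode":"ReadOnly"}}'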

Restore

velero restore create --from-backup test-4


time="2020-08-31T11:32:47Z" level=info msg="Done waiting for all restic restores to complete" logSource="pkg/restore/restore.go:486" restore=velero/test-4-20200831113247
time="2020-08-31T11:32:47Z" level=info msg="restore completed" logSource="pkg/controller/restore_controller.go:468" restore=velero/test-4-20200831113247

Problem

root@k8s-storage-1:~# velero get restore
NAME                           BACKUP          STATUS            ERRORS   WARNINGS   CREATED                         SELECTOR
test-4-20200831113247          test-4          PartiallyFailed   4        0          2020-08-31 11:32:47 +0000 UTC   <none>
test-nginx-b4-20200831114212   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:42:12 +0000 UTC   <none>
test-nginx-b4-20200831114855   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:48:55 +0000 UTC   <none>
root@k8s-storage-1:~# 

  Namespaces:
    test-nginx:  error preparing persistentvolumeclaims/test-nginx/ceph-ext: rpc error: code = Unknown desc = Volumesnapshot test-nginx/ceph-ext does not have a velero.io/csi-volumesnapshot-handle annotation
                 error preparing secrets/test-nginx/default-token-cqlmk: rpc error: code = Unknown desc = Volumesnapshot test-nginx/default-token-cqlmk does not have a velero.io/csi-volumesnapshot-handle annotation
                 error preparing serviceaccounts/test-nginx/default: rpc error: code = Unknown desc = Volumesnapshot test-nginx/default does not have a velero.io/csi-volumesnapshot-handle annotation
                 error preparing pods/test-nginx/nginx-test: rpc error: code = Unknown desc = Volumesnapshot test-nginx/nginx-test does not have a velero.io/csi-volumesnapshot-handle annotation
				 


In the Green cluster (K8s v1.17):

[root@green--1 ~]# kubectl get VolumeSnapshotClass --all-namespaces
NAME                      AGE
csi-rbdplugin-snapclass   39d

Solution in K8s v1.18

# Install the Snapshot CRDs

https://github.com/kubernetes-csi/external-snapshotter/issues/245

https://github.com/kubernetes-csi/external-snapshotter/blob/master/README.md#usage

--
root@k8s-storage-1:~# git clone https://github.com/kubernetes-csi/external-snapshotter.git
root@k8s-storage-1:~# cd external-snapshotter/
--
root@k8s-storage-1:~/external-snapshotter# kubectl create -f client/config/crd
customresourcedefinition.apiextensions.k8s.io/volumesnapshotclasses.snapshot.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/volumesnapshotcontents.snapshot.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/volumesnapshots.snapshot.storage.k8s.io created
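
A quick check that the snapshot CRDs are now registered (added step, not in the original notes):

kubectl get crd | grep snapshot.storage.k8s.io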
--

root@k8s-storage-1:~/external-snapshotter# kubectl create -f deploy/kubernetes/csi-snapshotter
serviceaccount/csi-snapshotter created
clusterrole.rbac.authorization.k8s.io/external-snapshotter-runner created
clusterrolebinding.rbac.authorization.k8s.io/csi-snapshotter-role created
role.rbac.authorization.k8s.io/external-snapshotter-leaderelection created
rolebinding.rbac.authorization.k8s.io/external-snapshotter-leaderelection created
serviceaccount/csi-provisioner created
clusterrole.rbac.authorization.k8s.io/external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/csi-provisioner-role created
role.rbac.authorization.k8s.io/external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/csi-provisioner-role-cfg created
clusterrolebinding.rbac.authorization.k8s.io/csi-snapshotter-provisioner-role created
rolebinding.rbac.authorization.k8s.io/csi-snapshotter-provisioner-role-cfg created
service/csi-snapshotter created
statefulset.apps/csi-snapshotter created
--
root@k8s-storage-1:~/external-snapshotter# kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/csi/rbd/snapshotclass.yaml
volumesnapshotclass.snapshot.storage.k8s.io/csi-rbdplugin-snapclass created
--
root@k8s-storage-1:~/external-snapshotter# kubectl get VolumeSnapshotClass --all-namespaces
NAME                      DRIVER                       DELETIONPOLICY   AGE
csi-rbdplugin-snapclass   rook-ceph.rbd.csi.ceph.com   Delete           67s

root@k8s-storage-1: kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/csi/rbd/snapshot.yaml

Restore again

root@k8s-storage-1:~/external-snapshotter# velero restore create --from-backup test-4

root@k8s-storage-1:~/external-snapshotter# kubectl -n test-nginx get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   0/1     Pending   0          10m
root@k8s-storage-1:~/external-snapshotter# kubectl -n test-nginx get pvc
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-ext   Pending                                      rook-ceph-block   10m


Problem

Type     Reason              Age                 From                         Message
  ----     ------              ----                ----                         -------
  Warning  ProvisioningFailed  80s (x42 over 11m)  persistentvolume-controller  storageclass.storage.k8s.io "rook-ceph-block" not found

Provisioned the storage class in cluster 2

cat << EOF | kubectl apply -f -
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-03-12T09:21:03Z"
  name: rook-ceph-block
  resourceVersion: "408457"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/rook-ceph-block
  uid: 0253aa9e-27b7-433b-aa9a-a150a54bb4a3
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs
provisioner: ceph.rook.io/block
reclaimPolicy: Retain
volumeBindingMode: Immediate
EOF

Added a restore again. The old one needs to be deleted.

root@k8s-storage-1:~/external-snapshotter# velero get restore
NAME                           BACKUP          STATUS            ERRORS   WARNINGS   CREATED                         SELECTOR
test-4-20200831113247          test-4          PartiallyFailed   4        0          2020-08-31 11:32:47 +0000 UTC
test-4-20200901050019          test-4          InProgress        0        0          2020-09-01 05:00:19 +0000 UTC
test-4-20200901061839          test-4          New               0        0          2020-09-01 06:18:39 +0000 UTC
test-nginx-b4-20200831114212   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:42:12 +0000 UTC
test-nginx-b4-20200831114855   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:48:55 +0000 UTC

Bug: even when the old restore is deleted, the new one stays in the New state.

root@k8s-storage-1:~/external-snapshotter# velero get restore
NAME                           BACKUP          STATUS            ERRORS   WARNINGS   CREATED                         SELECTOR
test-4-20200831113247          test-4          PartiallyFailed   4        0          2020-08-31 11:32:47 +0000 UTC
test-4-20200901061839          test-4          New               0        0          2020-09-01 06:18:39 +0000 UTC
test-nginx-b4-20200831114212   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:42:12 +0000 UTC
test-nginx-b4-20200831114855   test-nginx-b4   PartiallyFailed   4        0          2020-08-31 11:48:55 +0000 UTC

Bug workaround - restarted the Velero pod to trigger the restore

-- Stuck

Normal   ExternalProvisioning   7s (x20 over 4m39s)   persistentvolume-controller   waiting for a volume to be created, either by external provisioner "ceph.rook.io/block" or manually created by system administrator

root@k8s-storage-1:~/external-snapshotter#

Provisioned/installed a Rook-compatible storage class

cat << EOF | kubectl apply -f -
# From rook 1.4.1 cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  # If a failure domain of host is selected, then CRUSH will ensure that each replica
  # of the data is stored on a different host. https://docs.ceph.com/docs/master/rados/operations/crush-map/
  failureDomain: host
  replicated:
    size: 2
    # Disallow setting pool with replica 1, this could lead to data loss without recovery.
    # Make sure you're ABSOLUTELY CERTAIN that is what you want
    requireSafeReplicaSize: true
    # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
    # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
    #targetSizeRatio: .5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  # If you change this namespace, also change the namespace below where the secret namespaces are defined
  clusterID: rook-ceph

  # If you want to use erasure coded pool with RBD, you need to create
  # two pools. one erasure coded and one replicated.
  # You need to specify the replicated pool here in the `pool` parameter, it is
  # used for the metadata of the images.
  # The erasure coded pool must be set as the `dataPool` parameter below.
  #dataPool: ec-data-pool
  pool: replicapool

  # RBD image format. Defaults to "2".
  imageFormat: "2"

  # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
  imageFeatures: layering

  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
  # in hyperconverged settings where the volume is mounted on the same node as the osds.
  csi.storage.k8s.io/fstype: ext4

  # uncomment the following to use rbd-nbd as mounter on supported nodes
  # IMPORTANT: If you are using rbd-nbd as the mounter, during upgrade you will be hit a ceph-csi
  # issue that causes the mount to be disconnected. You will need to follow special upgrade steps
  # to restart your application pods. Therefore, this option is not recommended.
  #mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
EOF


Everything is working now

Type     Reason                  Age                    From                                                                                                        Message
----     ------                  ----                   ----                                                                                                        -------
Normal   ExternalProvisioning    4m11s (x82 over 24m)   persistentvolume-controller                                                                                 waiting for a volume to be created, either by external provisioner "ceph.rook.io/block" or manually created by system administrator
Normal   Provisioning            11s                    rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-dbc67ffdc-p6rhl_1cfffe93-3dfe-4f55-8ef4-66943c74eb8e   External provisioner is provisioning volume for claim "test-nginx/ceph-ext"
Normal   ProvisioningSucceeded   11s                    rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-dbc67ffdc-p6rhl_1cfffe93-3dfe-4f55-8ef4-66943c74eb8e   Successfully provisioned volume pvc-30de54cb-e276-4c1f-a4bc-11ae169e5ffe

root@k8s-storage-1:~/external-snapshotter#

--

root@k8s-storage-1:/external-snapshotter# kubectl -n test-nginx get pods
NAME         READY   STATUS     RESTARTS   AGE
nginx-test   0/1     Init:0/1   0          25m
root@k8s-storage-1:/external-snapshotter#

-- The init container is Velero's, and it will copy the data to the PVC

--

root@k8s-storage-1:~/external-snapshotter# kubectl -n test-nginx get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   1/1     Running   0          28m

root@k8s-storage-1:~/external-snapshotter# kubectl -n test-nginx exec -it nginx-test -- /bin/bash
root@nginx-test:/# ls -laSh /usr/share/nginx/html

Cluster 2

root@nginx-test:/# ls -laS /usr/share/nginx/html
total 1563720
-rw-r--r-- 1 root root 1048576000 Aug  7 07:20 test-file3.txt
-rw-r--r-- 1 root root  524288000 Aug  7 07:06 test-file.txt
-rw-r--r-- 1 root root   24772608 Aug  7 11:29 test-file4.txt
-rw-r--r-- 1 root root    2525800 Aug  6 09:27 index.html
-rw-r--r-- 1 root root    1048576 Aug  7 04:31 testfile
drwx------ 2 root root      16384 Sep  1 08:13 lost+found
drwxrwxrwx 4 root root       4096 Sep  1 08:14 .
drwxr-xr-x 3 root root       4096 Aug 14 00:36 ..
drwxr-xr-x 2 root root       4096 Sep  1 08:14 .velero

-- Green Cluster (1)

root@nginx-test:/usr/share/nginx/html# rm test-file2.txt
root@nginx-test:/usr/share/nginx/html# ls -laS /usr/share/nginx/html
total 1563684
-rw-r--r-- 1 root root 1048576000 Aug  7 07:20 test-file3.txt
-rw-r--r-- 1 root root  524288000 Aug  7 07:06 test-file.txt
-rw-r--r-- 1 root root   24772608 Aug  7 11:29 test-file4.txt
-rw-r--r-- 1 root root    2525800 Aug  6 09:27 index.html
-rw-r--r-- 1 root root    1048576 Aug  7 04:31 testfile
drwxr-xr-x 3 root root        120 Sep  1 08:09 .
drwxr-xr-x 2 root root         50 Aug 31 12:08 .velero
drwxr-xr-x 3 root root         18 Aug 14 00:36 ..


An error occurred: unknown flag: --include-resournces

[root@green--1 ~]# velero backup create test-pv-8 --include-resources ceph-ext --wait
Backup request "test-pv-8" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
....
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test-pv-8 and velero backup logs test-pv-8.

[root@green--1 ~]# velero backup create test-pv-9 --include-resources ceph-ext --exclude-resources pods --wait
Backup request "test-pv-9" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
..
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test-pv-9 and velero backup logs test-pv-9.


Backing up only the PV/PVC and restoring it is possible


velero backup create test-pv-10  --include-namespaces test-nginx  --include-resources  persistentvolumeclaims,persistentvolumes --wait

velero restore create --from-backup  test-pv-10  --namespace-mappings test-nginx:restore-test-nginx

root@k8s-storage-1:~# kubectl -n restore-test-nginx get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-ext   Bound    pvc-0ce3ed03-e323-41d0-b4b0-a2fc865060d1   2Gi        RWO            rook-ceph-block   30s
root@k8s-storage-1:~# kubectl -n restore-test-nginx get pods
No resources found in restore-test-nginx namespace.

Test 2 with CSI Snapshot

root@k8s-storage-1:~# velero install \
 --provider aws \
 --plugins velero/velero-plugin-for-aws:v1.0.0,velero/velero-plugin-for-csi:v0.1.1 \
 --bucket velero2 \
 --secret-file ./credentials-velero \
 --use-volume-snapshots=true \
 --backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://192.168.0.30:7000 \
 --image velero/velero:v1.4.2 \
 --snapshot-location-config region="default" \
 --features=EnableCSI
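
Note that the EnableCSI feature flag also needs to be turned on for the velero CLI itself so that client-side commands handle the CSI objects; this step is assumed here, it is not shown in the original notes:

velero client config set features=EnableCSI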

--create nginx

velero backup create k8s2-test1 --include-namespaces test3-nginx --include-resources pods,persistentvolumeclaims,persistentvolumes --wait

k8s2-test1 Completed 0 1 2020-09-01 10:21:28 +0000 UTC 29d default

Checking in Cluster 1 (K8s v1.17, snapshot API is in alpha)

velero restore create --from-backup k8s2-test1

-- Failed; the PVC is not created

Warning  FailedScheduling  28s (x3 over 107s)  default-scheduler  persistentvolumeclaim "ceph-ext" not found

-- This is because the other cluster does not have the CSI snapshot


Restoring in same cluster

Namespaces: test3-nginx: error preparing persistentvolumeclaims/test3-nginx/ceph-ext: rpc error: code = Unknown desc = Failed to get Volumesnapshot test3-nginx/velero-ceph-ext-6nfb7 to restore PVC test3-nginx/ceph-ext: volumesnapshots.snapshot.storage.k8s.io "velero-ceph-ext-6nfb7" not found

--

velero backup create k8s2-test3 --include-namespaces test3-nginx

Does not work: bug

time="2020-09-01T11:06:53Z" level=info msg="Waiting for CSI driver to reconcile volumesnapshot test3-nginx/velero-ceph-ext-lfplp. Retrying in 5s" backup=velero/k8s2-test3 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:172" pluginName=velero-plugin-for-csi
time="2020-09-01T11:06:53Z" level=error msg="Timed out awaiting reconciliation of volumesnapshot test3-nginx/velero-ceph-ext-lfplp" backup=velero/k8s2-test3 cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/util/util.go:194" pluginName=velero-plugin-for-csi
time="2020-09-01T11:06:53Z" level=info msg="1 errors encountered backup up item" backup=velero/k8s2-test3 logSource="pkg/backup/backup.go:444" name=nginx-test
time="2020-09-01T11:06:53Z" level=error msg="Error backing up item" backup=velero/k8s2-test3 error="error executing custom action (groupResource=volumesnapshots.snapshot.storage.k8s.io, namespace=test3-nginx, name=velero-ceph-ext-lfplp): rpc error: code = Unknown desc = timed out waiting for the condition" logSource="pkg/backup/backup.go:448" name=nginx-test
time="2020-09-01T11:06:53Z" level=info msg="Backed up 4 items out of an estimated total of 19 (estimate will change throughout the backup)" backup=velero/k8s2-test3 logSource="pkg/backup/backup.go:411" name=nginx-test namespace=test3-nginx progress= resource=pods
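
To see why the CSI plugin timed out waiting for reconciliation, the snapshot objects can be inspected directly (hypothetical debugging step, not from the original run):

kubectl -n test3-nginx get volumesnapshot
kubectl get volumesnapshotcontent
kubectl -n test3-nginx describe volumesnapshot velero-ceph-ext-lfplp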

--

root@nginx-test:/usr/share/nginx/html# dd if=/dev/urandom of=/usr/share/nginx/html/test-file3.txt count=512000 bs=1024
512000+0 records in
512000+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 8.58373 s, 61.1 MB/s
root@nginx-test:/usr/share/nginx/html# ls -laSh
total 5.4G
-rw-r--r-- 1 root root 4.9G Sep  3 07:07 test-file2.txt
-rw-r--r-- 1 root root 500M Sep  3 07:08 test-file3.txt
drwx------ 2 root root  16K Sep  3 06:55 lost+found
drwxr-xr-x 3 root root 4.0K Sep  3 07:08 .
drwxr-xr-x 3 root root   18 Aug 14 00:36 ..

5 GB in 5 minutes

[root@green--1 ~]# date && velero backup create test-5g-2 --include-namespaces test-nginx --wait && date
Thu Sep 3 12:44:24 IST 2020
Backup request "test-5g-2" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
............................................................................................................................................................................................................................................................................................................
Backup completed with status: Completed. You may check for more information using the commands velero backup describe test-5g-2 and velero backup logs test-5g-2.
Thu Sep 3 12:49:24 IST 2020

10 GB test

root@nginx-test:/# cd /usr/share/nginx/html
root@nginx-test:/usr/share/nginx/html# dd if=/dev/urandom of=/usr/share/nginx/html/test-file4.txt count=5120000 bs=1024
dd: error writing '/usr/share/nginx/html/test-file4.txt': No space left on device
4504821+0 records in
4504820+0 records out
4612935680 bytes (4.6 GB, 4.3 GiB) copied, 109.811 s, 42.0 MB/s
root@nginx-test:/usr/share/nginx/html# ls -laSh
total 9.7G
-rw-r--r-- 1 root root 4.9G Sep  3 07:07 test-file2.txt
-rw-r--r-- 1 root root 4.3G Sep  3 07:28 test-file4.txt
-rw-r--r-- 1 root root 500M Sep  3 07:08 test-file3.txt
drwx------ 2 root root  16K Sep  3 06:55 lost+found
drwxr-xr-x 3 root root 4.0K Sep  3 07:27 .
drwxr-xr-x 3 root root   18 Aug 14 00:36 ..


[root@rook-ceph-tools-7d764c8647-slvfx /]# ceph df
--- RAW STORAGE ---
CLASS   SIZE      AVAIL     USED     RAW USED   %RAW USED
hdd     245 GiB   198 GiB   42 GiB   47 GiB     19.21
TOTAL   245 GiB   198 GiB   42 GiB   47 GiB     19.21

--- POOLS ---
POOL                          ID   STORED    OBJECTS   USED      %USED   MAX AVAIL
replicapool                   2    14 GiB    3.77k     42 GiB    18.96   59 GiB
replicapool-ext               3    50 MiB    27        102 MiB   0.06    89 GiB
my-store.rgw.control          4    0 B       8         0 B       0       59 GiB
my-store.rgw.meta             5    0 B       0         0 B       0       59 GiB
my-store.rgw.log              6    50 B      210       192 KiB   0       59 GiB
my-store.rgw.buckets.index    7    0 B       0         0 B       0       59 GiB
my-store.rgw.buckets.non-ec   8    0 B       0         0 B       0       59 GiB
.rgw.root                     9    3.7 KiB   16        2.8 MiB   0       59 GiB
my-store.rgw.buckets.data     10   0 B       0         0 B       0       119 GiB
device_health_metrics         11   0 B       4         0 B       0       59 GiB
[root@rook-ceph-tools-7d764c8647-slvfx /]#


time="2020-09-03T11:01:29Z" level=info msg="Backup completed" controller=backup logSource="pkg/controller/backup_controller.go:619"
time="2020-09-03T11:01:29Z" level=error msg="backup failed" controller=backup error="rpc error: code = Unknown desc = error putting object backups/test-10g-3/velero-backup.json: XMinioStorageFull: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed.\n\tstatus code: 507, request id: 1631411248E36551, host id: " key=velero/test-10g-3 logSource="pkg/controller/backup_controller.go:273"


]# ssh -i alex_ee.pem ceph-2 ceph osd pool set-quota replicated_10g max_bytes 10000000000


updated the pool

ssh -i alex_ee.pem ceph-2 ceph osd pool set-quota replicated_10g max_bytes 20000000000
set-quota max_bytes = 20000000000 for pool replicated_10g
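
The new quota can be verified afterwards (assumed follow-up, not recorded in the notes):

ssh -i alex_ee.pem ceph-2 ceph osd pool get-quota replicated_10g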

-- velero restore create --from-backup green-allpvc2 --namespace-mappings blue:r-blue green:r-green test-nginx:r-test-nginx test:r-test

time="2020-09-04T07:58:56Z" level=info msg="Skipping PVC test/cassandra-volume-claim, associated PV pvc-dbf8f052-a7d7-401d-a956-2fdeccc14fae is not a CSI volume" backup=velero/green-allpvc cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/backup/pvc_action.go:82" pluginName=velero-plugin-for-csi

time="2020-09-04T08:13:40Z" level=info msg="Starting PVCBackupItemAction" backup=velero/green-allpvc cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/backup/pvc_action.go:57" pluginName=velero-plugin-for-csi
time="2020-09-04T08:13:40Z" level=info msg="Skipping PVC test/cassandra-volume-claim, associated PV pvc-dbf8f052-a7d7-401d-a956-2fdeccc14fae is not a CSI volume" backup=velero/green-allpvc cmd=/plugins/velero-plugin-for-csi logSource="/go/src/velero-plugin-for-csi/internal/backup/pvc_action.go:82" pluginName=velero-plugin-for-csi

An error occurred: backups.velero.io "green-allpvc2" not found

root@k8s-storage-1:# velero get backups
NAME                          STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
blue                          Completed         0        0          2020-08-25 11:19:09 +0000 UTC   20d       default
green                         PartiallyFailed   1        0          2020-08-25 11:18:05 +0000 UTC   20d       default
ingress-controller            Completed         0        0          2020-08-25 11:14:36 +0000 UTC   20d       default
k8s2-test1                    Completed         0        1          2020-09-01 10:21:28 +0000 UTC   27d       default
k8s2-test3                    PartiallyFailed   2        1          2020-09-01 10:56:51 +0000 UTC   27d       default
s-test-nginx-20200903072502   Completed         0        0          2020-09-03 07:25:02 +0000 UTC   28d       default
s-test-nginx-20200903072002   Completed         0        0          2020-09-03 07:20:02 +0000 UTC   28d       default
s-test-nginx-20200903071502   Completed         0        0          2020-09-03 07:19:24 +0000 UTC   28d       default
s-test-nginx-20200903071002   Completed         0        0          2020-09-03 07:10:02 +0000 UTC   28d       default
s-test-nginx-20200902072502   Completed         0        0          2020-09-02 07:25:02 +0000 UTC   27d       default
s-test-nginx-20200902072002   Completed         0        0          2020-09-02 07:20:02 +0000 UTC   27d       default
s-test-nginx-20200902071502   Completed         0        0          2020-09-02 07:15:02 +0000 UTC   27d       default
s-test-nginx-20200902071002   Completed         0        0          2020-09-02 07:10:02 +0000 UTC   27d       default
s-test-nginx-20200902070502   Completed         0        0          2020-09-02 07:05:02 +0000 UTC   27d       default
s-test-nginx-20200902070002   Completed         0        0          2020-09-02 07:00:02 +0000 UTC   27d       default
s-test-nginx-20200902065502   Completed         0        0          2020-09-02 06:55:02 +0000 UTC   27d       default
s-test-nginx-20200902065002   Completed         0        0          2020-09-02 06:50:02 +0000 UTC   27d       default
s-test-nginx-20200902064502   Completed         0        0          2020-09-02 06:45:02 +0000 UTC   27d       default
s-test-nginx-20200902064002   Completed         0        0          2020-09-02 06:40:02 +0000 UTC   27d       default
s-test-nginx-20200902063502   Completed         0        0          2020-09-02 06:35:02 +0000 UTC   27d       default
s-test-nginx-20200902063002   Completed         0        0          2020-09-02 06:30:02 +0000 UTC   27d       default
s-test-nginx-20200902062502   Completed         0        0          2020-09-02 06:25:02 +0000 UTC   27d       default
stroagek8s-test1              InProgress        0        0          2020-09-01 10:09:15 +0000 UTC   27d       default
test                          PartiallyFailed   1        0          2020-08-25 11:14:02 +0000 UTC   20d       default
test-4                        Completed         0        0          2020-08-31 10:23:58 +0000 UTC   26d       default
test-5g                       Completed         0        0          2020-09-03 07:11:20 +0000 UTC   28d       default
test-5g-2                     Completed         0        0          2020-09-03 07:14:24 +0000 UTC   28d       default
test-nginx-b4                 Completed         0        0          2020-08-25 09:11:34 +0000 UTC   19d       default
test-pv-10                    Completed         0        0          2020-09-01 08:39:09 +0000 UTC   26d       default
test12                        Completed         0        0          2020-08-07 11:30:48 +0000 UTC   2d        default
velero-k8s-storage            Completed         0        0          2020-09-01 09:02:53 +0000 UTC   26d       default
root@k8s-storage-1:#

velero restore create --from-backup green-allpvc2 --namespace-mappings blue:r-blue,green:r-green,test-nginx:r-test-nginx,test:r-test

root@k8s-storage-1:~# kubectl get pvc --all-namespaces | grep r-*
del            test                        Bound   pvc-01507522-0e29-447c-98f3-a90f5802e7f0
minio          minio-pv-claim              Lost    pvc-98ef6aa8-97d7-4efa-8422-0f2907dfd44e   0   rook-ceph-block-ext2   60s
r-blue         cassandradata-cassandra-0   Lost    pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   0   rook-ceph-block        60s
r-blue         cassandradata-cassandra-1   Lost    pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   0   rook-ceph-block        60s
r-blue         cassandradata-cassandra-2   Lost    pvc-4150b280-157b-4444-b378-7d6909fc8dd3   0   rook-ceph-block        60s
r-green        cassandradata               Lost    pvc-d29a8969-5017-4728-830b-96c3f60187b2   0   rook-ceph-block        60s
r-green        cassandradata-cassandra-0   Lost    pvc-e2bae7c1-e24a-4486-a62a-89a72ae97a72   0   rook-ceph-block        60s
r-green        cassandradata-cassandra-1   Lost    pvc-897e95b4-11f7-4eff-a967-ef21a4b90185   0   rook-ceph-block        60s
r-green        cassandradata-cassandra-2   Lost    pvc-67faeb98-be4a-4673-88fb-b724ea3b85bc   0   rook-ceph-block        60s
r-test-nginx   ceph-ext                    Lost    pvc-fa807e5b-f43f-482d-bb76-e4221c923a7a   0   rook-block             60s
r-test         cassandra-volume-claim      Lost    pvc-dbf8f052-a7d7-401d-a956-2fdeccc14fae   0   rook-ceph-block        60s

-- All PVCs are Lost

velero backup create green-all-pv-pvc --include-namespaces blue --include-resources persistentvolumeclaims,persistentvolume --wait

velero restore create --from-backup green-all-pv-pvc --namespace-mappings blue:r-blue

root@k8s-storage-1:~# kubectl -n r-blue get pvc
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
cassandradata-cassandra-0   Bound    pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   5Gi        RWO            rook-ceph-block   47s
cassandradata-cassandra-1   Bound    pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   5Gi        RWO            rook-ceph-block   47s
cassandradata-cassandra-2   Bound    pvc-4150b280-157b-4444-b378-7d6909fc8dd3   5Gi        RWO            rook-ceph-block   47s

-- In Cluster 1

[root@green--1 velero]# kubectl get pods -n blue
NAME          READY   STATUS    RESTARTS   AGE
cassandra-0   1/1     Running   0          165d
cassandra-1   1/1     Running   0          165d
cassandra-2   1/1     Running   1          165d

[root@green--1 velero]# kubectl get pvc -n blue
NAME                        STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
cassandradata-cassandra-0   Terminating   pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   5Gi        RWO            rook-ceph-block   165d
cassandradata-cassandra-1   Terminating   pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   5Gi        RWO            rook-ceph-block   165d
cassandradata-cassandra-2   Terminating   pvc-4150b280-157b-4444-b378-7d6909fc8dd3   5Gi        RWO            rook-ceph-block   165d

[root@green--1 velero]# kubectl get pv -n blue
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                            STORAGECLASS      REASON   AGE
pvc-0ce3ed03-e323-41d0-b4b0-a2fc865060d1   2Gi        RWO            Retain           Released   test-nginx/ceph-ext              rook-ceph-block            3d21h
pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   5Gi        RWO            Retain           Bound      blue/cassandradata-cassandra-1   rook-ceph-block            165d
pvc-4150b280-157b-4444-b378-7d6909fc8dd3   5Gi        RWO            Retain           Bound      blue/cassandradata-cassandra-2   rook-ceph-block            165d
pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   5Gi        RWO            Retain           Bound      blue/cassandradata-cassandra-0   rook-ceph-block

-- Removed the finalizer kubernetes.io/pvc-protection from the stuck PVCs (finalizers: - kubernetes.io/pvc-protection)
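
One way to do that finalizer removal non-interactively, assuming the PVC names from above (sketch; the notes do not say exactly how the finalizer was removed):

kubectl -n blue patch pvc cassandradata-cassandra-0 --type merge -p '{"metadata":{"finalizers":null}}'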

The PV is not deleted, as its reclaim policy is Retain:

pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   5Gi   RWO   Retain   Released   blue/cassandradata-cassandra-0   rook-ceph-block   165d

[root@green--1 velero]# kubectl get pv -n blue
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM   STORAGECLASS
No resources found in blue namespace.

[root@green--1 velero]# velero restore create --from-backup green-allpvc2
Restore request "green-allpvc2-20200904152429" submitted successfully.
Run velero restore describe green-allpvc2-20200904152429 or velero restore logs green-allpvc2-20200904152429 for more details.

[root@green--1 velero]# kubectl get pvc -n blue
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
cassandradata-cassandra-0   Lost     pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   0                         rook-ceph-block   6s
cassandradata-cassandra-1   Lost     pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   0                         rook-ceph-block   6s
cassandradata-cassandra-2   Lost     pvc-4150b280-157b-4444-b378-7d6909fc8dd3   0                         rook-ceph-block   6s

Though the PV is retained, the PVC is not able to bind to it; also, the PV is still using the Flex driver.
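
A Released PV normally has to have its claimRef cleared before a new PVC can bind to it again; a sketch of that manual step (this was not attempted in the notes, and it would not fix the Flex-vs-CSI driver mismatch below):

kubectl patch pv pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4 --type merge -p '{"spec":{"claimRef":null}}'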

Cluster 1:

   flexVolume:
     driver: ceph.rook.io/rook-ceph

---

Cluster 2:

   flexVolume:
     driver: ceph.rook.io/rook-ceph
     fsType: xfs

-- Just PV/PVC backup is not working: after deleting the PVC, even though the PV is still there, the restored PVC is not able to bind to it.

--

velero restore create --from-backup green-all-pv-pvc

[root@green--1 velero]# kubectl -n green exec -it cassandra-0 -- /bin/bash
root@cassandra-0:/# nodetool status
Datacenter: datacenter1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.244.4.146  138.96 KB  256     100.0%            1f252a64-c518-4170-a310-ddc050a39a7f  rack1
UN  10.244.3.73   164.14 KB  256     100.0%            48bbbd6d-c9df-4a36-9e71-a16297af2914  rack1



root@green--1:/# cqlsh -u cassandra -p cassandra 10.244.4.146
Connected to green at 10.244.4.146:9042.
[cqlsh 5.0.1 | Cassandra 3.0.20 | CQL spec 3.4.0 | Native protocol v4]

https://docs.datastax.com/en/dse/5.1/cql/examples/cycling/doc/cyclist_alt_stats.html

cassandra@cqlsh> describe keyspaces;

cycling system_schema system_auth system system_distributed system_traces

cassandra@cqlsh> describe keyspace cycling

CREATE KEYSPACE cycling WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'} AND durable_writes = true;

CREATE TABLE cycling.cyclist_alt_stats (
    id uuid PRIMARY KEY,
    birthday timestamp,
    height text,
    lastname text,
    nationality text,
    weight text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

cassandra@cqlsh> select * from cycling.cyclist_alt_stats;

 id                                   | birthday                        | height | lastname | nationality | weight
--------------------------------------+---------------------------------+--------+----------+-------------+--------
 7b6962dd-3f90-4c93-8f61-eabfa4a803e2 | 2007-11-03 00:00:00.000000+0000 |   null |    Robin |        null |   null
 5b6962dd-3f90-4c93-8f61-eabfa4a803e2 | 2007-11-03 00:00:00.000000+0000 |   null |    Robin |        null |   null

cassandra@cqlsh> INSERT INTO cycling.cyclist_alt_stats (id, lastname, birthday, nationality, height) VALUES (ed584e99-80f7-4b13-9a90-9dc5571e6821,'TSATEVICH', '1989-07-05', 'Russia', '64');

cassandra@cqlsh> select * from cycling.cyclist_alt_stats;

 id                                   | birthday                        | height | lastname  | nationality | weight
--------------------------------------+---------------------------------+--------+-----------+-------------+--------
 7b6962dd-3f90-4c93-8f61-eabfa4a803e2 | 2007-11-03 00:00:00.000000+0000 |   null |     Robin |        null |   null
 5b6962dd-3f90-4c93-8f61-eabfa4a803e2 | 2007-11-03 00:00:00.000000+0000 |   null |     Robin |        null |   null
 ed584e99-80f7-4b13-9a90-9dc5571e6821 | 1989-07-05 00:00:00.000000+0000 |     64 | TSATEVICH |      Russia |   null


In Blue Cassandra

[root@green--1 velero]# kubectl -n blue exec -it cassandra-0 -- /bin/bash
root@cassandra-0:/# nodetool status
Datacenter: datacenter1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.244.1.49  223.4 KB   256     66.1%             f76a0603-ccae-4e17-af9e-0cf7d8d3830b  rack1
UN  10.244.2.56  280.41 KB  256     64.1%             8f2eaf61-608f-4a23-9060-31ce0818ef46  rack1
UN  10.244.3.59  270.04 KB  256     69.7%             555d985b-4062-48a0-a12c-3ee804b47845  rack1

cqlsh -u cassandra -p cassandra 10.244.1.49^C

cassandra@cqlsh> select * from cycling.cyclist_id;

 lastname  | firstname | age | id
-----------+-----------+-----+--------------------------------------
    WELTEN |      Bram |  18 | 18f471bf-f631-4bc4-a9a2-d6f6cf5ea503
 EENKHOORN |    Pascal |  18 | ffdfa2a7-5fc6-49a7-bfdc-3fcdcfdd7156


root@k8s-storage-1:# velero restore create --from-backup blue-cass1 --namespace-mappings blue:r-blue
Restore request "blue-cass1-20200904113517" submitted successfully.
Run velero restore describe blue-cass1-20200904113517 or velero restore logs blue-cass1-20200904113517 for more details.

root@k8s-storage-1:# kubectl -n r-blue get pods
NAME          READY   STATUS     RESTARTS   AGE
cassandra-0   0/1     Init:0/1   0          34s
cassandra-1   0/1     Init:0/1   0          34s
cassandra-2   0/1     Pending    0          34s

root@k8s-storage-1:~# kubectl -n r-blue get pvc
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
cassandradata-cassandra-0   Bound    pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4   5Gi        RWO            rook-ceph-block   41s
cassandradata-cassandra-1   Bound    pvc-1e1fb541-6ee1-45f6-b813-f007132355a6   5Gi        RWO            rook-ceph-block   41s
cassandradata-cassandra-2   Bound    pvc-4150b280-157b-4444-b378-7d6909fc8dd3   5Gi        RWO            rook-ceph-block   41s

Configspec everything is mapped

Reason for the error:

Type     Reason        Age                   From                    Message
----     ------        ----                  ----                    -------
Normal   Scheduled     4m22s                 default-scheduler       Successfully assigned r-blue/cassandra-0 to k8s-storage-3
Warning  FailedMount   4m21s                 kubelet, k8s-storage-3  Unable to attach or mount volumes: unmounted volumes=[cassandradata], unattached volumes=[tmp-config cassandraconfig default-token-zbz5z cassandradata]: error processing PVC r-blue/cassandradata-cassandra-0: PVC is not bound
Warning  FailedMount   71s (x6 over 3m55s)   kubelet, k8s-storage-3  Unable to attach or mount volumes: unmounted volumes=[cassandradata], unattached volumes=[cassandradata tmp-config cassandraconfig default-token-zbz5z]: failed to get Plugin from volumeSpec for volume "pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4" err=no volume plugin matched
Warning  FailedMount   44s (x4 over 3m15s)   kubelet, k8s-storage-3  Unable to attach or mount volumes: unmounted volumes=[cassandradata], unattached volumes=[default-token-zbz5z cassandradata tmp-config cassandraconfig]: failed to get Plugin from volumeSpec for volume "pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4" err=no volume plugin matched
Warning  FailedMount   32s (x3 over 3m39s)   kubelet, k8s-storage-3  Unable to attach or mount volumes: unmounted volumes=[cassandradata], unattached volumes=[cassandraconfig default-token-zbz5z cassandradata tmp-config]: failed to get Plugin from volumeSpec for volume "pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4" err=no volume plugin matched
Warning  FailedMount   8s (x6 over 4m6s)     kubelet, k8s-storage-3  Unable to attach or mount volumes: unmounted volumes=[cassandradata], unattached volumes=[tmp-config cassandraconfig default-token-zbz5z cassandradata]: failed to get Plugin from volumeSpec for volume "pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4" err=no volume plugin matched

root@k8s-storage-1:~#

You can't move from Flex to CSI - rook/rook#4438

selfLink: /api/v1/persistentvolumes/pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4
uid: 8fe6af02-643a-442d-8908-f15abd8d2d5c
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: cassandradata-cassandra-0
    namespace: r-blue
    resourceVersion: "2862158"
    uid: 9292173d-f58d-4522-8149-0b591e705680
  flexVolume:
    driver: ceph.rook.io/rook-ceph
    fsType: xfs
    options:
      clusterNamespace: rook-ceph
      dataBlockPool: ""
      image: pvc-4a36b48e-96f5-4101-b20d-4b3b120a17c4
      pool: replicapool
      storageClass: rook-ceph-block