Etcd is the key-value store for OpenShift Container Platform, which persists the state of all resource objects.
Back up your cluster’s etcd data regularly and store it in a secure location, ideally outside the OpenShift Container Platform environment. Do not take an etcd backup before the first certificate rotation completes, which occurs 24 hours after installation; otherwise the backup will contain expired certificates. It is also recommended to take etcd backups during non-peak usage hours, because taking a backup is a blocking action.
I was working in an OCP 4.3.0 restricted environment, where the OCP nodes have no Internet connection, not even through a proxy, and noticed that the etcd-snapshot-backup.sh script failed because it tried to download etcdctl from the Internet.
[root@bastion ~]# ssh -i .ssh/id_rsa [email protected]
[core@etcd-1 ~]$ sudo /usr/local/bin/etcd-snapshot-backup.sh ./assets/backup
Creating asset directory ./assets
Downloading etcdctl binary..
At a high level, to make the etcd backup succeed, I had to find an existing etcdctl binary on the node, copy it to a fixed location (/root/etcdctl), and modify the etcd-snapshot-backup.sh script to use it instead of downloading one:
[root@etcd-1 ~]# find / -iname 'etcdctl*'
[root@etcd-1 ~]# diff /usr/local/bin/etcd-snapshot-backup.sh /usr/local/bin/etcd-snapshot-original.sh
40c40
< ETCDCTL="/root/etcdctl"
---
> ETCDCTL="${ASSET_DIR}/bin/etcdctl"
49c49
< # dl_etcdctl
---
> dl_etcdctl
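The two changes shown in the diff can also be applied with sed rather than by hand. The following is a minimal sketch, not the method used above: the function name patch_backup_script and its three arguments are hypothetical, and it assumes GNU sed and that the ETCDCTL assignment starts at the beginning of a line, as the diff suggests.

```shell
# Hypothetical helper, not part of OpenShift: applies the two edits shown in the diff.
# Usage: patch_backup_script <script> <copy-of-original> <path-to-local-etcdctl>
patch_backup_script() {
    script="$1"
    original_copy="$2"
    etcdctl_path="$3"

    # Keep an untouched copy (same mode and timestamps) so the change can be reverted.
    cp -p "$script" "$original_copy" || return 1

    # 1) Point ETCDCTL at the local binary.
    # 2) Comment out the bare dl_etcdctl call; the function definition is left alone.
    # Assumes GNU sed (-i) and that the path contains no '|' or '&' characters.
    sed -i \
        -e "s|^ETCDCTL=.*|ETCDCTL=\"${etcdctl_path}\"|" \
        -e 's|^\([[:space:]]*\)dl_etcdctl[[:space:]]*$|\1# dl_etcdctl|' \
        "$script"
}
```

With the paths from this post, the call would be patch_backup_script /usr/local/bin/etcd-snapshot-backup.sh /usr/local/bin/etcd-snapshot-original.sh /root/etcdctl. Keeping the original copy makes it easy to restore the script once the backup is done.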
Then performed the backup:
[root@etcd-1 ~]# /usr/local/bin/etcd-snapshot-backup.sh assets/backup/
Trying to backup etcd client certs..
etcd client certs found in /etc/kubernetes/static-pod-resources/kube-apiserver-pod-14 backing up to ./assets/backup/
Backing up /etc/kubernetes/manifests/etcd-member.yaml to ./assets/backup/
Trying to backup latest static pod resources..
Snapshot saved at ./assets/tmp/snapshot.db
snapshot db and kube resources are successfully saved to assets/backup//snapshot_db_kuberesources_2020-02-25_030239.tar.gz!
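Before relying on the archive, it may be worth checking that it is readable. A small sketch, assuming the tarball contains an entry whose name ends in snapshot.db (the layout of the archive is an assumption here, inferred from the output above); verify_backup_archive is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the archive is readable and lists a snapshot.db entry.
verify_backup_archive() {
    archive="$1"
    # tar -tzf lists the contents without extracting; an unreadable archive makes it fail.
    listing=$(tar -tzf "$archive") || return 1
    # Assumption: the etcd snapshot is stored under a name ending in snapshot.db.
    printf '%s\n' "$listing" | grep -q 'snapshot\.db$'
}
```

For example, verify_backup_archive assets/backup/snapshot_db_kuberesources_2020-02-25_030239.tar.gz && echo "archive looks usable". This only checks the archive listing; it does not validate the etcd snapshot itself.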
PS:
We need to revert the changes made to the etcd-snapshot-backup.sh script afterwards; otherwise the machine-config operator goes into a DEGRADED state because of the file mismatch. To verify, run oc describe pods -n machine-config-operator machine-config-daemon-XXX for the pod on each node where the script was modified.
To fix the DEGRADED state, we need to delete the problematic machine-config-daemon pods.
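Deleting a pod can be sketched as below. restart_mcd_pod is a hypothetical wrapper around oc delete pod; the pod name must be taken from your own cluster, and the sketch assumes you are logged in with oc as a user allowed to delete pods in the machine-config-operator namespace.

```shell
# Hypothetical helper: delete one machine-config-daemon pod so that it is recreated.
# Usage: restart_mcd_pod <pod-name>
restart_mcd_pod() {
    pod="$1"
    oc delete pod "$pod" -n machine-config-operator
}
```

For example, restart_mcd_pod machine-config-daemon-XXX, with XXX replaced by the real pod suffix. Afterwards, oc get machineconfigpool should show whether the pools have left the degraded state.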
Note:
– Do not forget to store the snapshot backup file somewhere outside the OCP nodes.
– For OCP nodes connected through a proxy, we might need to add the HTTP(S)_PROXY environment variables to the script or export them before running the script.
– For OCP 4.3.5 and later, you might not need to modify the backup script.
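For the proxy case in the notes above, one option is to set the variables only for the backup command instead of editing the script. A minimal sketch: run_with_proxy is a hypothetical wrapper and the proxy URL in the usage line is a placeholder, not a value from this environment.

```shell
# Hypothetical wrapper: runs a command with the proxy variables set only for that command.
# Usage: run_with_proxy <proxy-url> <command> [args...]
run_with_proxy() {
    proxy_url="$1"
    shift
    # Both upper- and lower-case spellings are set because tools differ in which one they read.
    env HTTP_PROXY="$proxy_url" HTTPS_PROXY="$proxy_url" \
        http_proxy="$proxy_url" https_proxy="$proxy_url" "$@"
}
```

For example, as root: run_with_proxy http://proxy.example.com:3128 /usr/local/bin/etcd-snapshot-backup.sh ./assets/backup. If the script is started through sudo instead, the variables may be dropped, because sudo resets the environment by default.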
Abip Sjarbini
Platform Consultant at Red Hat, Oracle Engineered Systems Specialist