There might be situations where you have already deleted pods (or already removed the dc, aka deployment configuration) but the pods are stuck in the Terminating state. There are a few suggestions if you google around (Red Hat thread: How to delete pods hanging in Terminating state); here I am just listing the steps that worked best for me.
I am faking some output below to explain the situation and the solution.
$ oc get pods
NAME                READY   STATUS        RESTARTS   AGE
jenkins-1-deploy    0/1     Terminating   0          7d
mongo-db-dev-0      2/2     Running       0          20h
mongo-db-build      0/1     Completed     0          18h
mynew-app-1-build   0/1     Terminating   0          7d
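On a busy project the listing can be long, so a small filter may help to pick out only the stuck pods. The sketch below is just one way to do it: terminating_pods is a made-up helper name, and it assumes the default oc get pods column layout, where STATUS is the third column.

```shell
# Hypothetical helper (not an oc feature): reads `oc get pods` output on
# stdin and prints the names of pods whose third column is "Terminating".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
terminating_pods() {
  awk '$3 == "Terminating" { print $1 }'
}

# Usage, assuming you are logged in and inside the right project:
#   oc get pods | terminating_pods
```

Because the filter only reads text, it is safe to run as often as you like before deciding what to delete.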
We can see that the pods jenkins-1-deploy and mynew-app-1-build have already been told to delete but are still hanging in the Terminating state. So, let’s try the first method: deleting the pod forcefully.
Step 1: Delete pod forcefully
$ oc delete pod jenkins-1-deploy -n myproject --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "jenkins-1-deploy" deleted
I thought it was deleted, as the message says pod “jenkins-1-deploy” deleted. But see what happened when I checked: it’s still there!
$ oc get pods
NAME                READY   STATUS        RESTARTS   AGE
jenkins-1-deploy    0/1     Terminating   0          7d
mongo-db-dev-0      2/2     Running       0          20h
mongo-db-build      0/1     Completed     0          18h
mynew-app-1-build   0/1     Terminating   0          7d
Okay, so now we know the pod deletion is stuck somewhere. Reading the documentation, we realized that when you delete an object in a Kubernetes cluster, you can specify whether the object’s dependents are also deleted automatically. With foreground cascading deletion, the object gets a deletionTimestamp and a foregroundDeletion finalizer, and it stays visible until that finalizer is cleared. Read more about cascading deletion as well as background vs foreground cascading deletion.
So we have to check the items below in the pod’s details.
- Is there a value in the object’s metadata.deletionTimestamp?
- Is “foregroundDeletion” listed under the object’s metadata.finalizers?
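Both checks can be answered with one read-only query instead of scrolling through the full pod definition. This is a minimal sketch: pod_deletion_state is a hypothetical helper name, and the pod and namespace in the usage line are simply the example values from this post.

```shell
# Hypothetical helper: prints the two metadata fields that matter here.
#   $1 = pod name, $2 = namespace
# Read-only; assumes `oc` is already logged in to the cluster.
pod_deletion_state() {
  oc get pod "$1" -n "$2" \
    -o jsonpath='deletionTimestamp: {.metadata.deletionTimestamp}{"\n"}finalizers: {.metadata.finalizers}{"\n"}'
}

# Usage:
#   pod_deletion_state jenkins-1-deploy myproject
```

An empty value after a label means the field is not set on that pod.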
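There was! So I edited the pod (oc edit) and removed/replaced those values. Note that, per the Kubernetes API reference, deletionTimestamp is set by the server and is not directly settable by a client, so the finalizers change in Step 3 is the one that actually lets the deletion finish.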
$ oc edit pod jenkins-1-deploy
Step 2: Remove deletionTimestamp
Before:
deletionTimestamp: 2019-01-23T11:40:28Z
After:
deletionTimestamp: null
Step 3: Remove items under metadata.finalizers
Before:
...
metadata:
  finalizers:
  - foregroundDeletion
...
After:
...
metadata:
  finalizers: null
...
And save it. (If it asks you to save multiple times as temp files, just save again with :wq.)
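If you would rather not go through the editor, the same finalizers change can usually be made non-interactively with oc patch and a JSON merge patch. Treat this as a sketch under assumptions: clear_pod_finalizers is a hypothetical helper name, and the pod and namespace in the usage line are the example values from this post.

```shell
# Hypothetical helper: sets metadata.finalizers to null on one pod.
#   $1 = pod name, $2 = namespace
# Assumes `oc` is logged in with permission to patch pods.
# Only use this on pods you really intend to get rid of.
clear_pod_finalizers() {
  oc patch pod "$1" -n "$2" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}

# Usage:
#   clear_pod_finalizers jenkins-1-deploy myproject
```

A merge patch with a null value removes the field, which matches the "After" state shown in Step 3.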
List the pods again.
$ oc get pods
NAME                READY   STATUS        RESTARTS   AGE
mongo-db-dev-0      2/2     Running       0          20h
mongo-db-build      0/1     Completed     0          18h
mynew-app-1-build   0/1     Terminating   0          7d
It’s gone! Do the same for mynew-app-1-build as well.
Still not working? Then you may try Step 4.
Step 4: Invoke OpenShift API
This method is a bit risky, as incorrect use of the API may leave your OpenShift cluster in an unpredictable state; so be careful.
This method is explained in the Red Hat Knowledgebase with a sample instruction set. Let me give an example below.
We can invoke the OpenShift API directly to manipulate objects, including deleting them as in our case. To delete a pod stuck in the ‘Terminating’ or ‘Unknown’ state, you may try sending the following curl request to the API:
$ echo '{ "propagationPolicy": "Background" }' | curl -k -X DELETE -d @- -H "Authorization: Bearer XYZ" -H 'Accept: application/json' -H 'Content-Type: application/json' https://master.example.com:443/api/v1/namespaces/myproject/pods/my-app-123
Where:
- master_URL = master.example.com
- port = 443
- token = XYZ (get this with the oc whoami -t command)
- pod_name = my-app-123
- namespace = myproject
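To avoid pasting the token and URL parts by hand, the same request can be wrapped in a small function. This is only a sketch of the curl call above: delete_pod_via_api is a hypothetical helper name, the argument order is my own choice, and it keeps the -k flag from the original command, which skips TLS certificate verification.

```shell
# Hypothetical wrapper around the curl call above.
#   $1 = master host, $2 = port, $3 = namespace, $4 = pod name
# Assumes `oc` is logged in so that `oc whoami -t` returns a token.
delete_pod_via_api() {
  # Fetch the bearer token; stop if we are not logged in.
  token="$(oc whoami -t)" || return 1
  # Send the DELETE request with a Background propagation policy.
  printf '%s' '{ "propagationPolicy": "Background" }' |
    curl -k -X DELETE -d @- \
      -H "Authorization: Bearer ${token}" \
      -H 'Accept: application/json' \
      -H 'Content-Type: application/json' \
      "https://$1:$2/api/v1/namespaces/$3/pods/$4"
}

# Usage:
#   delete_pod_via_api master.example.com 443 myproject my-app-123
```

Fetching the token inside the function keeps it out of your shell history, which is a small bonus over typing it into the command line.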
Then check whether your zombie pods are still there or not.
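One quick way to check is to ask for the pod by name and look at the exit status. A minimal sketch, assuming a hypothetical helper called pod_exists; note that it treats any oc failure (expired login, wrong namespace) the same as "pod not found", so a "gone" answer is only trustworthy while you are logged in to the right project.

```shell
# Hypothetical helper: succeeds if `oc get pod` can find the pod.
#   $1 = pod name, $2 = namespace
# Caveat: any oc error also counts as "not found" here.
pod_exists() {
  oc get pod "$1" -n "$2" >/dev/null 2>&1
}

# Usage:
#   if pod_exists my-app-123 myproject; then
#     echo "still there"
#   else
#     echo "gone"
#   fi
```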
That’s all!