Ways to Fix Stuck Kubernetes Pods in Terminating Status

1. Common Causes of Pods Stuck in Terminating

Kubernetes is designed to remove pods cleanly, but certain conditions can prevent deletion. A finalizer in a pod’s metadata blocks deletion until an external task (like unmounting a volume or finishing a backup) is done. If that task never completes, the pod stays in Terminating. Likewise, a long-running preStop hook inside the container can exceed the termination grace period. For example, a hook script that sleeps longer than the default 30 seconds is cut off when the grace period expires and the kubelet kills the container, yet the pod keeps showing Terminating until the kubelet confirms that cleanup has finished. In essence, any operation that takes too long or fails, whether it is a cleanup script, a volume unmount, or a finalizer, can leave the pod hanging.
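To tell these causes apart, it can help to look at the deletion-related fields of the pod first. The following is a minimal sketch; the helper name pod_deletion_state and the fallback to the default namespace are assumptions made for illustration.

```shell
# Sketch: show when deletion was requested and how much grace time the pod gets.
# pod_deletion_state is a hypothetical helper name.
# Usage: pod_deletion_state <pod-name> [namespace]
pod_deletion_state() {
  # deletionTimestamp is only set after a delete request has been accepted.
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='deletionTimestamp={.metadata.deletionTimestamp}{"\n"}gracePeriodSeconds={.spec.terminationGracePeriodSeconds}{"\n"}'
}
```

A deletionTimestamp that lies far in the past relative to the grace period suggests the pod is genuinely stuck rather than still shutting down.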

1.1 Unresponsive Containers or Hooks

If the containerized application ignores the SIGTERM signal or takes too long to handle shutdown, Kubernetes can’t remove the pod until the grace period expires. This is often due to an application not trapping the signal, or a preStop lifecycle hook that never completes. In such cases, the kubelet eventually force-kills the container with SIGKILL, but the pod object can linger if cleanup on the node does not finish. Checking the pod’s logs (kubectl logs <POD_NAME>) can reveal whether the process is stuck. Ensuring your app closes connections, flushes data, and stops promptly on SIGTERM can prevent this issue. You may also increase the pod’s terminationGracePeriodSeconds to give shutdown enough time, or simplify the preStop command so that it finishes sooner.
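As an illustration of handling SIGTERM promptly, here is a minimal sketch of a shell entrypoint loop that traps the signal. The function name graceful_worker and the one-second sleep standing in for real work are assumptions; a real application would run its own cleanup in place of the echo.

```shell
# Sketch of a container entrypoint loop that reacts to SIGTERM.
# graceful_worker is a hypothetical name; replace the echo with real cleanup.
graceful_worker() {
  # The kubelet sends SIGTERM first; exit cleanly before SIGKILL follows.
  trap 'echo "SIGTERM received, shutting down"; exit 0' TERM
  while true; do
    sleep 1 &          # placeholder work, run in the background ...
    wait $! || true    # ... so the shell can handle the signal without delay
  done
}
```

Running the work in the background and waiting on it matters because a shell only runs its trap handler after the current foreground command returns.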

1.2 Finalizers and Resource Dependencies

Pods can have finalizers (entries in the metadata.finalizers list) that tell Kubernetes to wait for an external controller or resource cleanup before deletion. For example, if a pod’s volume driver finalizer is waiting for a disk to detach, the pod will not be deleted until that happens. Similarly, if a custom resource or webhook must finish validation or cleanup, it can block the pod’s removal. The Kubernetes docs note that webhooks can prevent finalizers from being cleared. In practice, any unsatisfied dependency (storage, networking, backups, etc.) encoded as a finalizer will cause the pod to remain in Terminating.
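To see which pods carry finalizers at all, a sketch along these lines may be useful. The helper name list_pod_finalizers is an assumption, and the exact rendering of the finalizer list depends on the kubectl version.

```shell
# Sketch: print each pod in a namespace together with its finalizers.
# list_pod_finalizers is a hypothetical helper name.
# Usage: list_pod_finalizers [namespace]
list_pod_finalizers() {
  # Pods without finalizers show an empty second column.
  kubectl get pods -n "${1:-default}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'
}
```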

1.3 Storage or Node Issues

Underlying infrastructure problems can also leave pods hanging. For instance, if a node is down or its kubelet is unresponsive, the control plane never receives confirmation that the pod’s containers were terminated, so the pod object stays. A common case is a stuck volume: if a PersistentVolume remains mounted (showing “device or resource busy” errors), the pod can’t finish cleaning up. In such cases, you might see error messages on the node about unmount failures. Administrators often need to fix the node (restarting the kubelet or manually detaching volumes) to clear these pods.
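A quick way to check whether the node is the problem is to look up the node the pod was scheduled on and read its Ready condition. This is a sketch only; the helper name pod_node_ready is an assumption.

```shell
# Sketch: report the node a pod runs on and that node's Ready condition.
# pod_node_ready is a hypothetical helper name.
# Usage: pod_node_ready <pod-name> [namespace]
pod_node_ready() {
  node=$(kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}')
  # Ready is True, False, or Unknown (Unknown usually means the kubelet stopped reporting).
  ready=$(kubectl get node "$node" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  echo "node=$node ready=$ready"
}
```

If the result is anything other than ready=True, the fix probably belongs on the node rather than on the pod.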

1.4 StatefulSets with Ordered Policy

When pods belong to a StatefulSet using the default podManagementPolicy: OrderedReady, Kubernetes creates pods in ordinal order and terminates them in reverse order, one at a time. This means that if one pod is stuck, it blocks the deletion (or recreation) of the next. The KodeKloud guide explains that with OrderedReady, pods are deleted in strict order. While this isn’t a cause per se, it illustrates how a single stuck pod in a StatefulSet can have a ripple effect, making multiple pods appear stuck.
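To confirm which policy a StatefulSet uses, the field can be read directly. A minimal sketch, with sts_pod_policy as an assumed helper name:

```shell
# Sketch: print the pod management policy of a StatefulSet.
# sts_pod_policy is a hypothetical helper name.
# Usage: sts_pod_policy <statefulset-name> [namespace]
sts_pod_policy() {
  # Prints OrderedReady (the default) or Parallel.
  kubectl get statefulset "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.podManagementPolicy}{"\n"}'
}
```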

2. Techniques to Resolve Stuck Pods

Once you identify why a pod is not terminating, you can apply appropriate fixes. Begin with a graceful approach: use kubectl describe pod and kubectl logs to diagnose issues. Address any application errors or cleanup failures (e.g., fix preStop scripts, ensure finalizers can complete). If normal deletion still hangs, the following methods can force the pod to be removed.
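The diagnostic pass described above might be bundled as follows. This is a sketch rather than a fixed procedure; the helper name diagnose_pod and the 50-line log tail are arbitrary choices.

```shell
# Sketch: collect the usual diagnostics for a pod that will not terminate.
# diagnose_pod is a hypothetical helper name.
# Usage: diagnose_pod <pod-name> [namespace]
diagnose_pod() {
  ns="${2:-default}"
  # Status, conditions, and recent events as seen by the API server.
  kubectl describe pod "$1" -n "$ns"
  # Last log lines from every container in the pod.
  kubectl logs "$1" -n "$ns" --all-containers --tail=50
  # Events that reference this pod, oldest first.
  kubectl get events -n "$ns" --field-selector "involvedObject.name=$1" --sort-by=.lastTimestamp
}
```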

2.1 Graceful Deletion

Try deleting the pod normally first:

kubectl delete pod <POD_NAME>

Watch the events or logs for clues. If the container shuts down cleanly, Kubernetes will eventually remove the pod. Adjust the terminationGracePeriodSeconds in the pod spec if the shutdown needs more time. Also consider using readiness and liveness probes to catch problems early. For example, adding a proper preStop script to gracefully shut down your app (closing sockets, saving state) can prevent it from hanging.
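If the pod is managed by a Deployment (an assumption; the same idea applies to other controllers), the grace period can be raised on the pod template. The sketch below uses a hypothetical helper name, set_grace_period. Changing the template triggers a rollout, so the new value applies to replacement pods, not to a pod that is already terminating.

```shell
# Sketch: raise terminationGracePeriodSeconds on a Deployment's pod template.
# set_grace_period is a hypothetical helper name.
# Usage: set_grace_period <deployment-name> <seconds> [namespace]
set_grace_period() {
  # Build the merge patch with the requested number of seconds.
  patch=$(printf '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":%d}}}}' "$2")
  kubectl patch deployment "$1" -n "${3:-default}" --type=merge -p "$patch"
}
```

For example, set_grace_period web 60 would give each new pod of the web Deployment up to 60 seconds to shut down.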

2.2 Remove Finalizers Manually

If a finalizer is blocking deletion, you can delete it. The KodeKloud guide shows using kubectl edit on the pod to remove the finalizer field. For example:

kubectl edit pod <POD_NAME>

Find the finalizers: section in the YAML and delete its entries, then save the file. Kubernetes will then bypass the external cleanup and delete the pod. Alternatively, use a kubectl patch:

kubectl patch pod <POD_NAME> -p '{"metadata":{"finalizers":null}}'

Either approach forces Kubernetes to remove the finalizer, allowing the pod to terminate. Use this carefully – ensure it’s safe to skip the cleanup the finalizer was waiting for.
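One cautious way to apply the patch is to print the finalizers first, so there is a record of which cleanup was skipped. A sketch under that assumption, with remove_pod_finalizers as a hypothetical helper name:

```shell
# Sketch: show a pod's finalizers, then clear them with a merge patch.
# remove_pod_finalizers is a hypothetical helper name.
# Usage: remove_pod_finalizers <pod-name> [namespace]
remove_pod_finalizers() {
  ns="${2:-default}"
  current=$(kubectl get pod "$1" -n "$ns" -o jsonpath='{.metadata.finalizers}')
  if [ -z "$current" ]; then
    echo "pod $1 has no finalizers; nothing to remove"
    return 0
  fi
  # Log what is being skipped before removing it.
  echo "removing finalizers from $1: $current"
  kubectl patch pod "$1" -n "$ns" --type=merge -p '{"metadata":{"finalizers":null}}'
}
```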

2.3 Force Deletion

As a last resort, you can force delete the pod with zero grace period. The KodeKloud example demonstrates this command:

kubectl delete pod <POD_NAME> --grace-period=0 --force

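This removes the pod object from the API server immediately, without waiting for the kubelet to confirm that the containers have stopped. (kubectl prints a warning that the resource may continue to run on the cluster; this is expected, but it means the containers can outlive the pod object if the node is unreachable.) You can also delete all stuck pods at once, for example: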

# "Terminating" is a display status, not a pod phase, so filter the STATUS column
for p in $(kubectl get pods --no-headers | awk '$3 == "Terminating" {print $1}'); do
  kubectl delete pod "$p" --grace-period=0 --force
done

This loop force-deletes every pod in the current namespace that is still Terminating. Afterwards, the pods should no longer appear in the output of kubectl get pods.
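Before force-deleting in bulk, it may be worth listing the candidates and reviewing them. The sketch below assumes the default kubectl get pods column layout, where STATUS is the third column; list_terminating_pods is a hypothetical helper name.

```shell
# Sketch: list pods whose displayed status is Terminating, without deleting anything.
# list_terminating_pods is a hypothetical helper name.
# Usage: list_terminating_pods [namespace]
list_terminating_pods() {
  # Assumes the default output columns: NAME READY STATUS RESTARTS AGE.
  kubectl get pods -n "${1:-default}" --no-headers | awk '$3 == "Terminating" {print $1}'
}
```

Running the same function again after the deletion should print nothing if every stuck pod is gone.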

By following these steps – diagnosing the root cause, then using the right deletion method – you can clear pods stuck in Terminating. If the problem was a dependency or storage issue, make sure to correct that underlying issue to prevent recurrence. If you have any questions or encounter new scenarios, feel free to comment below!
