Node Maintenance
  • 11 Feb 2022



If you need to reboot a node (for example, for a kernel upgrade, a libc upgrade, or a hardware repair) and the downtime is brief, then when the Kubelet restarts, it will attempt to restart the pods scheduled to it. If the reboot takes longer (the default time is 5 minutes, controlled by the --pod-eviction-timeout flag on the controller-manager), the node controller will terminate the pods bound to the unavailable node. If there is a corresponding replica set (or replication controller), then a new copy of the pod will be started on a different node. So, when all pods are replicated, upgrades can be done without special coordination, assuming that not all nodes go down simultaneously.

If you want more control over the upgrading process, you may use the following workflow:

Use kubectl drain to gracefully terminate all pods on the node while marking the node as unschedulable:

kubectl drain $NODENAME

Marking the node as unschedulable keeps new pods from landing on it while the existing pods are being evicted.

For pods with a replica set, the pod will be replaced by a new pod scheduled to a new node. Additionally, if the pod is part of a service, clients will automatically be redirected to the new pod.

For pods with no replica set, you need to bring up a new copy of the pod and, assuming it is not part of a service, redirect clients to it.

Perform maintenance work on the node.

Make the node schedulable again:

kubectl uncordon $NODENAME
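
The workflow above can be sketched as a small shell function. This is only an illustration, not an official tool: maintain_node is a hypothetical helper name, the maintenance command is whatever you pass after the node name, and the --ignore-daemonsets flag is an assumption that your cluster runs DaemonSet-managed pods, which kubectl drain otherwise refuses to skip.

```shell
#!/usr/bin/env bash
# Sketch of the drain -> maintain -> uncordon workflow.
# maintain_node is a hypothetical helper, not part of kubectl.
maintain_node() {
  local node="$1"
  shift

  # Evict pods and mark the node as unschedulable.
  # --ignore-daemonsets is an assumption; drop it if you have no DaemonSets.
  kubectl drain "$node" --ignore-daemonsets || return 1

  # Run the maintenance command given as the remaining arguments.
  # If it fails, stop here and leave the node cordoned for inspection.
  "$@" || return 1

  # Make the node schedulable again.
  kubectl uncordon "$node"
}
```

For example, maintain_node "$NODENAME" ./upgrade-kernel.sh would drain the node, run a (hypothetical) upgrade script, and uncordon the node only if the script succeeded. Leaving the node cordoned on failure is a design choice in this sketch; you may prefer to uncordon unconditionally.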

Advanced Topics

Upgrading to a different API version

When a new API version is released, you may need to upgrade a cluster to support the new API version (e.g., switching from ‘v1’ to ‘v2’ when ‘v2’ is launched).

This is an infrequent event, but it requires careful management. There is a sequence of steps to upgrade to a new API version.

  1. Turn on the new API version.
  2. Upgrade the cluster’s storage to use the new version.
  3. Upgrade all config files. Identify users of the old API version endpoints.
  4. Update existing objects in the storage to the new version by running cluster/update-storage-objects.sh.
  5. Turn off the old API version.

Turn on or off an API version for your cluster

Specific API versions can be turned on or off by passing the --runtime-config=api/<version> flag when bringing up the API server. For example, to turn off the v1 API, pass --runtime-config=api/v1=false. runtime-config also supports two special keys, api/all and api/legacy, which control all APIs and legacy APIs, respectively. For example, to turn off all API versions except v1, pass --runtime-config=api/all=false,api/v1=true. For these flags, legacy APIs are those APIs that have been explicitly deprecated.
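
Because the flag takes a comma-separated list of key=value pairs, it can help to assemble it programmatically when the set of versions varies between environments. The following is a minimal sketch; runtime_config_flag is a hypothetical helper name, and it only builds the flag string without starting the API server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join key=value pairs into one --runtime-config flag.
runtime_config_flag() {
  # With IFS set to a comma, "$*" joins all arguments with commas.
  local IFS=,
  echo "--runtime-config=$*"
}

# Turn off every API version except v1 (the example from the text above).
runtime_config_flag api/all=false api/v1=true
```

The resulting string would then be added to the API server's command line, alongside whatever other flags your deployment already passes.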

Switching your cluster’s storage API version

The objects stored to disk for a cluster’s internal representation of the Kubernetes resources active in the cluster are written using a particular version of the API. These objects may need to be rewritten in the newer API version when the supported API changes. Failure to do this will eventually leave resources that the Kubernetes API server can no longer decode or use.

Switching your config files to a new API version

You can use the kubectl convert command to convert config files between different API versions.

kubectl convert -f pod.yaml --output-version v1
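
When many config files need converting, the same command can be applied in a loop. The sketch below assumes the manifests are .yaml files in a single directory; convert_manifests and its three arguments are hypothetical names chosen for illustration. Depending on your kubectl version, convert may only be available through the separate kubectl-convert plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert every .yaml file in a directory to one
# API version, writing the results to a separate output directory.
convert_manifests() {
  local src_dir="$1" version="$2" out_dir="$3"
  local f

  mkdir -p "$out_dir"
  for f in "$src_dir"/*.yaml; do
    # Skip the unexpanded pattern when the directory has no .yaml files.
    [ -e "$f" ] || continue
    # Same invocation as the single-file example, one file at a time.
    kubectl convert -f "$f" --output-version "$version" \
      > "$out_dir/$(basename "$f")" || return 1
  done
}
```

Writing to a separate directory keeps the originals intact, so the converted files can be reviewed with diff before they replace the old ones.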


