Cluster scale-down is the process of reducing the number of Worker nodes to release idle resources (cloud servers, physical machines, etc.). The core of the operation is safely migrating the Pods on the outgoing node to other nodes, so that business continuity is preserved.
Prerequisites for scale-down
The remaining nodes must have enough spare capacity to absorb the migrated Pods (verify with kubectl top nodes).
Scale-down workflow
Step 1: Pre-checks (critical! these prevent the scale-down from failing): 1. check node status and resources; 2. check Pod distribution and dependencies; 3. confirm there are no local-storage PVs
Step 2: Mark the node as unschedulable
Step 3: Evict the Pods on the node (the core step)
Step 4: Verify the Pod migration and reset the decommissioned node's environment
Step 5: Remove the node from the cluster
Step 6: Reclaim the node's physical resources (cloud server / physical machine)
Goal: safely remove k8s-worker-02 (192.168.1.233), migrating the Pods running on it to k8s-worker-01 and releasing the node's resources.
# 1. Check node status and resources
# List all nodes (confirm k8s-worker-02 is Ready)
[root@k8s-master-01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-01 Ready control-plane 22d v1.34.3
k8s-worker-01 Ready <none> 22d v1.34.3
k8s-worker-02 Ready <none> 22d v1.34.3
# Check node resource utilization (confirm the remaining node k8s-worker-01 has enough headroom)
[root@k8s-master-01 ~]# kubectl top nodes
NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%)
k8s-master-01 285m 3% 1986Mi 12%
k8s-worker-01 107m 2% 1350Mi 18%
k8s-worker-02 88m 2% 1252Mi 17%
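Live usage from kubectl top is not the whole picture: the scheduler places Pods based on their resource requests, not current consumption. As a quick sketch, the requests already allocated on the remaining node can be inspected via the "Allocated resources" section of kubectl describe:
# Compare requested vs. allocatable resources on the remaining node
[root@k8s-master-01 ~]# kubectl describe node k8s-worker-01 | grep -A 8 "Allocated resources"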
# 2. Check Pod distribution and dependencies
# List all Pods on k8s-worker-02 (confirm there are no critical stateful applications)
[root@k8s-master-01 ~]# kubectl get pods -o wide --all-namespaces | grep k8s-worker-02
default java-app-75ccc87bb4-hnfgn 1/1 Running 0 45m 10.100.2.212 k8s-worker-02 <none> <none>
default scheduler-nodename-5fdc8446dc-7r6vb 1/1 Running 0 44s 10.100.2.217 k8s-worker-02 <none> <none>
default scheduler-nodename-5fdc8446dc-92d7v 1/1 Running 0 44s 10.100.2.214 k8s-worker-02 <none> <none>
default scheduler-nodename-5fdc8446dc-bb7cm 1/1 Running 0 44s 10.100.2.216 k8s-worker-02 <none> <none>
default scheduler-nodename-5fdc8446dc-twcbs 1/1 Running 0 44s 10.100.2.215 k8s-worker-02 <none> <none>
default scheduler-nodename-5fdc8446dc-wqblr 1/1 Running 0 44s 10.100.2.213 k8s-worker-02 <none> <none>
kube-flannel kube-flannel-ds-82xp2 1/1 Running 0 3d16h 192.168.1.233 k8s-worker-02 <none> <none>
kube-system kube-proxy-l9bnz 1/1 Running 0 5d18h 192.168.1.233 k8s-worker-02 <none> <none>
metallb-system speaker-k9lvj 1/1 Running 0 3d16h 192.168.1.233 k8s-worker-02 <none> <none>
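One dependency worth surfacing explicitly: Pods pinned to this node via a nodeSelector (or a hard-coded spec.nodeName, which bypasses the scheduler and cordon entirely) cannot be rescheduled elsewhere after eviction. A minimal sketch to list any such pinning:
# Print namespace, name, and nodeSelector for every Pod; a non-empty third column means pinning
[root@k8s-master-01 ~]# kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.nodeSelector}{"\n"}{end}' | awk -F'\t' '$3 != ""'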
# 3. Confirm there are no local-storage PVs
# Check whether any PV is pinned to this node via nodeAffinity (Local PV)
[root@k8s-master-01 ~]# kubectl get pv -o jsonpath='{range .items[*]}{.spec.nodeAffinity}{"\n"}{end}' | grep k8s-worker-02
# No output means no local-storage dependency (if there is output, migrate those PVs first)
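A PodDisruptionBudget with no remaining eviction budget will also block the drain in step 3, so it is worth checking up front; a quick sketch:
# List PodDisruptionBudgets; an ALLOWED DISRUPTIONS of 0 will block eviction
[root@k8s-master-01 ~]# kubectl get pdb --all-namespaces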
# Block new Pods from being scheduled onto k8s-worker-02
[root@k8s-master-01 ~]# kubectl cordon k8s-worker-02
# Verify: the node STATUS changes to SchedulingDisabled
[root@k8s-master-01 ~]# kubectl get nodes k8s-worker-02
# k8s-worker-02   Ready,SchedulingDisabled   <none>   22d   v1.34.3
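Cordoning is reversible: if the scale-down has to be aborted at this point, the node can be returned to the schedulable pool:
# Undo the cordon (only if aborting the scale-down)
[root@k8s-master-01 ~]# kubectl uncordon k8s-worker-02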
Use the kubectl drain command to migrate the Pods to the other node (k8s-worker-01), ignoring DaemonSet Pods (these run one copy per node and are cleaned up automatically when the node is deleted):
[root@k8s-master-01 ~]# kubectl drain k8s-worker-02 \
    --ignore-daemonsets \
    --delete-emptydir-data \
    --force \
    --grace-period=30
# Flag notes:
# --ignore-daemonsets      skip DaemonSet-managed Pods (e.g. Node Exporter)
# --delete-emptydir-data   allow deleting emptyDir data (ephemeral data)
# --force                  evict Pods not managed by a controller
# --grace-period=30        graceful termination period (30 seconds)
Example output (the Pod eviction in progress):
node/k8s-worker-02 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-82xp2, kube-system/kube-proxy-l9bnz, metallb-system/speaker-k9lvj
evicting pod default/scheduler-nodename-5fdc8446dc-wqblr
evicting pod default/scheduler-nodename-5fdc8446dc-92d7v
evicting pod default/scheduler-nodename-5fdc8446dc-twcbs
evicting pod default/java-app-75ccc87bb4-hnfgn
evicting pod default/scheduler-nodename-5fdc8446dc-bb7cm
evicting pod default/scheduler-nodename-5fdc8446dc-7r6vb
pod/scheduler-nodename-5fdc8446dc-92d7v evicted
pod/java-app-75ccc87bb4-hnfgn evicted
pod/scheduler-nodename-5fdc8446dc-wqblr evicted
pod/scheduler-nodename-5fdc8446dc-bb7cm evicted
pod/scheduler-nodename-5fdc8446dc-twcbs evicted
pod/scheduler-nodename-5fdc8446dc-7r6vb evicted
node/k8s-worker-02 drained
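Tip: to preview the evictions without actually performing them, drain supports a client-side dry run; a sketch:
# Preview which Pods would be evicted, without evicting anything
[root@k8s-master-01 ~]# kubectl drain k8s-worker-02 --ignore-daemonsets --delete-emptydir-data --force --dry-run=client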
Confirm that all Pods have been rescheduled onto k8s-worker-01:
# List Pods still on the old k8s-worker-02 (there should be no business Pods left)
[root@k8s-master-01 ~]# kubectl get pods -o wide --all-namespaces | grep k8s-worker-02
kube-flannel     kube-flannel-ds-82xp2   1/1   Running   0   3d16h   192.168.1.233   k8s-worker-02   <none>   <none>
kube-system      kube-proxy-l9bnz        1/1   Running   0   5d18h   192.168.1.233   k8s-worker-02   <none>   <none>
metallb-system   speaker-k9lvj           1/1   Running   0   3d16h   192.168.1.233   k8s-worker-02   <none>   <none>
# Output: only DaemonSet Pods remain (flannel, kube-proxy, MetalLB speaker)
# List the migrated Pods (all now on k8s-worker-01)
[root@k8s-master-01 ~]# kubectl get pods -o wide | grep k8s-worker-01
java-app-75ccc87bb4-jv96r             1/1   Running   0   53m     10.100.1.230   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-2q7rv   1/1   Running   0   8m35s   10.100.1.248   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-4wdsv   1/1   Running   0   8m35s   10.100.1.249   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-66p54   1/1   Running   0   8m35s   10.100.1.246   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-dd5b6   1/1   Running   0   100s    10.100.1.252   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-jpnh8   1/1   Running   0   8m35s   10.100.1.247   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-k7scx   1/1   Running   0   100s    10.100.1.253   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-q45fg   1/1   Running   0   100s    10.100.1.2     k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-q4mdw   1/1   Running   0   100s    10.100.1.251   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-tvjgp   1/1   Running   0   8m35s   10.100.1.250   k8s-worker-01   <none>   <none>
scheduler-nodename-5fdc8446dc-wwdlb   1/1   Running   0   100s    10.100.1.254   k8s-worker-01   <none>   <none>
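Beyond Pod placement, it is worth confirming the workloads themselves are fully available again. A sketch, assuming the Deployments are named java-app and scheduler-nodename (inferred from the Pod names above):
# Confirm all replicas are available after the migration (Deployment names inferred from the Pod names)
[root@k8s-master-01 ~]# kubectl get deployment java-app scheduler-nodename
[root@k8s-master-01 ~]# kubectl rollout status deployment/java-app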
Reset the environment on the node being decommissioned, k8s-worker-02:
# On k8s-worker-02, stop and disable the kubelet service
[root@k8s-worker-02 ~]# systemctl disable --now kubelet
Removed '/etc/systemd/system/multi-user.target.wants/kubelet.service'.
[root@k8s-worker-02 ~]# ls /var/lib/kubelet
actuated_pods_state    checkpoints  cpu_manager_state  dra_manager_state     kubeadm-flags.env     pki      plugins_registry  pods
allocated_pods_state   config.yaml  device-plugins     instance-config.yaml  memory_manager_state  plugins  pod-resources
# On k8s-worker-02, run the reset command, specifying the CRI (Container Runtime Interface) socket
[root@k8s-worker-02 ~]# kubeadm reset -f --cri-socket=unix:///var/run/cri-dockerd.sock
[root@k8s-worker-02 ~]# ls /var/lib/kubelet
# Once done, wipe the data, power the machine off, and reinstall the OS
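Note that kubeadm reset does not clean up CNI configuration or iptables rules on the node (kubeadm itself prints a reminder to this effect). A follow-up sketch:
# Remove leftover CNI configuration and flush iptables rules (kubeadm reset leaves these behind)
[root@k8s-worker-02 ~]# rm -rf /etc/cni/net.d
[root@k8s-worker-02 ~]# iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X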
# Delete the node (this only removes the record from the API Server; the physical resources must be handled separately)
[root@k8s-master-01 ~]# kubectl delete node k8s-worker-02
# Verify: the node is gone from the cluster list
[root@k8s-master-01 ~]# kubectl get nodes
NAME            STATUS   ROLES           AGE   VERSION
k8s-master-01   Ready    control-plane   22d   v1.34.3
k8s-worker-01   Ready    <none>          22d   v1.34.3
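As a final sanity check on the cluster side before reclaiming the hardware, a quick sweep for any Pod that is not Running can catch workloads that failed to reschedule; a sketch:
# Any Pod not in a Running/Completed state deserves a look (expect only the header line)
[root@k8s-master-01 ~]# kubectl get pods --all-namespaces | grep -vE "Running|Completed"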
Cloud environment: stop or destroy the k8s-worker-02 instance in the cloud console (to stop billing).
Physical machine: power it down, pull it from the rack, and wipe its disks (to ensure the data is erased).
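In a cloud environment the reclamation can also be scripted. A hypothetical sketch for AWS EC2 (the instance ID below is a placeholder, and other providers' CLIs will differ):
# Hypothetical example: terminate the instance backing k8s-worker-02
# (placeholder instance ID; look it up via the console or `aws ec2 describe-instances`)
[root@k8s-master-01 ~]# aws ec2 terminate-instances --instance-ids i-0123456789abcdef0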