How to resolve a node's NotReady status (2)
박*원 | 2024-07-29 18:45:38 | 90 views
I have attached the error message and the log. Thank you.
kubectl node-shell contest73-node-w-5b14
spawning "nsenter-dzyitj" on "contest73-node-w-5b14"
error: timed out waiting for the condition
pod "nsenter-dzyitj" deleted
----------------------
Name: contest73-node-w-5b14
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=SVR.VSVR.HICPU.C002.M004.G003
beta.kubernetes.io/os=linux
failure-domain.beta.kubernetes.io/region=1
failure-domain.beta.kubernetes.io/zone=2
kubernetes.io/arch=amd64
kubernetes.io/hostname=contest73-node-w-5b14
kubernetes.io/os=linux
ncloud.com/nks-nodepool=contest73-node
node.kubernetes.io/instance-type=SVR.VSVR.HICPU.C002.M004.G003
nodeId=25530113
regionNo=1
topology.kubernetes.io/region=1
topology.kubernetes.io/zone=2
zoneNo=2
Annotations: alpha.kubernetes.io/provided-node-ip: 192.168.6.7
csi.volume.kubernetes.io/nodeid: {"blk.csi.ncloud.com":"25530113","nas.csi.ncloud.com":"contest73-node-w-5b14"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 17 Jul 2024 10:19:12 +0900
Taints: node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: contest73-node-w-5b14
AcquireTime: <unset>
RenewTime: Mon, 29 Jul 2024 06:42:59 +0900
Conditions:
Type                 Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
----                 ------    -----------------                 ------------------                ------              -------
NetworkUnavailable   False     Wed, 17 Jul 2024 10:20:11 +0900   Wed, 17 Jul 2024 10:20:11 +0900   CiliumIsUp          Cilium is running on this node
MemoryPressure       Unknown   Mon, 29 Jul 2024 06:42:55 +0900   Mon, 29 Jul 2024 06:43:40 +0900   NodeStatusUnknown   Kubelet stopped posting node status.
DiskPressure         Unknown   Mon, 29 Jul 2024 06:42:55 +0900   Mon, 29 Jul 2024 06:43:40 +0900   NodeStatusUnknown   Kubelet stopped posting node status.
PIDPressure          Unknown   Mon, 29 Jul 2024 06:42:55 +0900   Mon, 29 Jul 2024 06:43:40 +0900   NodeStatusUnknown   Kubelet stopped posting node status.
Ready                Unknown   Mon, 29 Jul 2024 06:42:55 +0900   Mon, 29 Jul 2024 06:43:40 +0900   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.6.7
Hostname: contest73-node-w-5b14
ExternalIP: 223.130.143.233
Capacity:
cpu: 2
ephemeral-storage: 103083576Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 4005900Ki
pods: 110
Allocatable:
cpu: 1930m
ephemeral-storage: 95001823485
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 2902028Ki
pods: 110
System Info:
Machine ID: 83c00806e7504df088d78c7418b87290
System UUID: 8fb90707-22c2-447c-9b5e-f21836acfada
Boot ID: a9edc7f0-51a3-4d4d-bbce-800b7f0f1e65
Kernel Version: 5.15.0-94-generic
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.7.13
Kubelet Version: v1.27.9
Kube-Proxy Version: v1.27.9
PodCIDR: 198.18.0.0/24
PodCIDRs: 198.18.0.0/24
ProviderID: navercloudplatform://25530113
Non-terminated Pods: (29 in total)
Namespace     Name                                             CPU Requests   CPU Limits   Memory Requests   Memory Limits   Age
---------     ----                                             ------------   ----------   ---------------   -------------   ---
default       nsenter-5041hp                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-84bxhk                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-dzyitj                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          2m58s
default       nsenter-jhaixw                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-kbk1f9                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-l851jv                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-lzlb5e                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-n4ftg9                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          12h
default       nsenter-njf3bb                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-rbrqg7                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
default       nsenter-s9cgp2                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          2d1h
default       nsenter-x77xqe                                   0 (0%)         0 (0%)       0 (0%)            0 (0%)          11h
kube-system   cilium-brw9p                                     100m (5%)      0 (0%)       10Mi (0%)         0 (0%)          12d
kube-system   cilium-monitor-ncfnx                             0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   cilium-operator-7f5bf5d886-7g8pc                 0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   coredns-847f76456c-ptq4k                         100m (5%)      0 (0%)       70Mi (2%)         170Mi (5%)      12d
kube-system   csi-nks-controller-dfdb58f9c-ghqnz               0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   csi-nks-node-zmtb8                               0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   dns-autoscaler-59dcb5c7f4-96sfb                  20m (1%)       0 (0%)       10Mi (0%)         0 (0%)          12d
kube-system   konnectivity-agent-6c5d8cfdc9-jpjxq              0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   konnectivity-agent-autoscaler-85d4dd5cdf-qkssm   20m (1%)       0 (0%)       10Mi (0%)         0 (0%)          12d
kube-system   metrics-server-646b65fcf-zcb4z                   100m (5%)      0 (0%)       200Mi (7%)        0 (0%)          12d
kube-system   nks-metric-sender-7bffd5c6d6-h6khj               0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   nks-metric-sender-7bffd5c6d6-s9tck               0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   nks-nas-csi-controller-69879ddc7f-724xr          40m (2%)       500m (25%)   80Mi (2%)         800Mi (28%)     12d
kube-system   nks-nas-csi-node-xzzvr                           30m (1%)       400m (20%)   60Mi (2%)         500Mi (17%)     12d
kube-system   nks-nodelocalproxy-qn2kv                         0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
kube-system   nodelocaldns-bz9qq                               100m (5%)      0 (0%)       70Mi (2%)         170Mi (5%)      12d
kube-system   snapshot-controller-0                            0 (0%)         0 (0%)       0 (0%)            0 (0%)          12d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource            Requests      Limits
--------            --------      ------
cpu                 510m (26%)    900m (46%)
memory              510Mi (17%)   1640Mi (57%)
ephemeral-storage   0 (0%)        0 (0%)
hugepages-1Gi       0 (0%)        0 (0%)
hugepages-2Mi       0 (0%)        0 (0%)
Events: <none>
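Hello. This is the Open Cloud Platform Center.
Looking at the log you posted, the node is reporting "Kubelet stopped posting node status", and a NoSchedule taint has been set on it. (Taints: node.kubernetes.io/unreachable:NoSchedule)
While a NoSchedule taint is set, the node is excluded from pod scheduling: new pods that do not tolerate the taint are not placed on it.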
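To re-check the taint and the Ready condition later without reading the whole dump, the two relevant lines can be pulled out of the `kubectl describe node` output. The following is a minimal sketch, assuming the describe output is piped in on stdin; the function name `summarize_node` is hypothetical and not part of kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a kubectl feature): reads `kubectl describe node`
# output on stdin and prints the first taint and the Ready condition status.
# Note: only the taint on the "Taints:" line itself is printed; additional
# taints listed on following lines are ignored by this sketch.
summarize_node() {
  awk '
    $1 == "Taints:" { print "taints=" $2 }
    $1 == "Ready" && ($2 == "True" || $2 == "False" || $2 == "Unknown") {
      print "ready=" $2
    }
  '
}

# Usage (node name taken from the log above):
#   kubectl describe node contest73-node-w-5b14 | summarize_node
```

For the node in this post, the Ready status would be expected to stay `Unknown` until the kubelet starts reporting again.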
Please connect to the node, check the status of the kubelet service as root, and start the service if it is inactive.
# Switch to root
sudo -i
# Check the kubelet status
systemctl status kubelet
(example: inactive)
○ kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Tue 2024-07-30 09:52:01 KST; 5s ago
# Enable and start kubelet
systemctl enable --now kubelet
# Confirm that kubelet is active
systemctl status kubelet
(example: active)
● kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-07-30 09:52:23 KST; 1s ago
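The check above can also be scripted. This is only a sketch: the helper name `kubelet_action` and the mapping from service state to action are assumptions made for illustration, not part of the original answer. It takes the one-word state printed by `systemctl is-active kubelet` and suggests the next step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps the state printed by `systemctl is-active kubelet`
# to a suggested action. The mapping itself is an assumption of this sketch.
kubelet_action() {
  case "$1" in
    active)          echo "ok" ;;       # kubelet is already running
    inactive|failed) echo "start" ;;    # try to enable and start the service
    *)               echo "inspect" ;;  # e.g. activating: look at the logs
  esac
}

# Usage on the worker node, as root:
#   state="$(systemctl is-active kubelet)"
#   case "$(kubelet_action "$state")" in
#     start)   systemctl enable --now kubelet ;;
#     inspect) journalctl -u kubelet --no-pager -n 50 ;;
#   esac
```

If the service does not stay active after being started, the last lines of `journalctl -u kubelet` are usually the first place to look for the reason.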
After completing the steps above, list the nodes from the master node and check whether the node's status has changed to "Ready".
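One way to do that check is to filter the node list for anything that is not yet Ready. The sketch below assumes the output of `kubectl get nodes --no-headers` on stdin (node name in the first column, STATUS in the second); the function name `not_ready_nodes` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the names of nodes whose STATUS does not begin with "Ready".
# STATUS may carry suffixes such as "Ready,SchedulingDisabled", so only the
# part before the first comma is compared.
not_ready_nodes() {
  awk '{ split($2, s, ","); if (s[1] != "Ready") print $1 }'
}

# Usage:
#   kubectl get nodes --no-headers | not_ready_nodes
# Or block until the node from this post reports Ready (up to two minutes):
#   kubectl wait --for=condition=Ready node/contest73-node-w-5b14 --timeout=120s
```

An empty result from the filter means every listed node is reporting Ready.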
Thank you.