In most cases the problem lies with the pod itself. The following steps can be used to analyze and locate it.
1 Check the node status
[root@k8s-m1 src]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-c1 Ready <none> 16h v1.14.2
k8s-m1 Ready master 17h v1.14.2
2 Check whether the pods are healthy
[root@k8s-m1 docker]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-5g2cx 1/1 Running 0 2d14h
coredns-fb8b8dccf-c5skq 1/1 Running 0 2d14h
etcd-k8s-master 1/1 Running 0 2d14h
kube-apiserver-k8s-master 1/1 Running 0 2d14h
kube-controller-manager-k8s-master 1/1 Running 0 2d14h
kube-flannel-ds-arm64-7cr2b 0/1 CrashLoopBackOff 629 2d12h
kube-flannel-ds-arm64-hnsrv 0/1 CrashLoopBackOff 4 2d12h
kube-proxy-ldw8m 1/1 Running 0 2d14h
kube-proxy-xkfdw 1/1 Running 0 2d14h
kube-scheduler-k8s-master 1/1 Running 0 2d14h
The network plugin kube-flannel keeps trying to restart: sometimes it runs normally, sometimes it reports CrashLoopBackOff, and sometimes OOMKilled.
3 Check the kubelet logs
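On a cluster with many pods, the failing ones are easy to miss in the full listing. The following is a minimal sketch, not part of the original procedure, that filters `kubectl get pod` output down to pods that are not fully ready. The function name `unhealthy_pods` is hypothetical, and it assumes the default column layout shown above (NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print pods that are not Running or not fully ready.
# It reads `kubectl get pod` output on stdin, so it also works on saved output.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Completed" {
         split($2, ready, "/")            # READY column looks like "0/1"
         if ($3 != "Running" || ready[1] != ready[2])
           print $1, $3, "restarts=" $4
       }'
}

# Usage (requires a working kubectl context):
#   kubectl get pod -n kube-system | unhealthy_pods
```

Run against the listing above, this should leave only the two kube-flannel pods.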
[root@k8s-m1 src]# journalctl -u kubelet -f
12月 09 09:12:45 k8s-m1 kubelet[35667]: E1209 09:12:45.895575 35667 pod_workers.go:190] Error syncing pod 2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f ("kube-flannel-ds-arm64-7cr2b_kube-system(2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f)"), skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-arm64-7cr2b_kube-system(2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f)"
4 Check the logs of the network plugin kube-flannel
[root@k8s-m1 src]# kubectl logs kube-flannel-ds-arm64-88rjz -n kube-system
E1209 01:20:42.527856 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t nat -C POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully --wait]: exit status -1:
E1209 01:20:46.928502 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.244.0.0/16 -j ACCEPT --wait]: exit status -1:
E1209 01:20:52.128049 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.244.0.0/16 -j ACCEPT --wait]: exit status -1:
E1209 01:20:52.932263 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /sbin/iptables: cannot allocate memory
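At first I suspected an iptables problem, but when I copied the commands executed by iptables.go to the command line, they ran normally. I was at a loss until I noticed that the pod sometimes reported: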
kube-flannel-ds-arm64-hnsrv 0/1 OOMKilled 4 2d12h
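The STATUS column only shows OOMKilled intermittently, so it can help to query the recorded termination reason directly. The sketch below is an addition, not something the original steps ran: the function names and the `KUBECTL` override variable are hypothetical, and the namespace defaults to kube-system as in the listings above. An OOMKilled reason means the container was killed for exceeding its memory limit, so printing the configured limit alongside it may be a useful check.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. KUBECTL may be overridden; it defaults to kubectl.

# Print "<container>: <reason>" for the last termination of each container.
last_termination_reason() {
  local pod=$1 ns=${2:-kube-system}
  "${KUBECTL:-kubectl}" get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.lastState.terminated.reason}{"\n"}{end}'
}

# Print "<container>: <memory limit>" for each container in the pod spec.
memory_limits() {
  local pod=$1 ns=${2:-kube-system}
  "${KUBECTL:-kubectl}" get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources.limits.memory}{"\n"}{end}'
}

# Usage (requires a working kubectl context):
#   last_termination_reason kube-flannel-ds-arm64-hnsrv
#   memory_limits kube-flannel-ds-arm64-hnsrv
```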
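In most cases this is caused by a problem downloading the flannel network plugin image. The default image address is quay.io/coreos/flannel, which cannot be reached directly from networks in mainland China. In that case, pull the image from quay-mirror.qiniu.com/coreos/flannel, retag it under the quay.io name, and then run
kubectl create -f kube-flannel.yml
Everything installs successfully on the master node and the join command for the worker nodes is printed, but when that command is entered on a worker node, the node cannot join, or the command hangs at the shell and never completes.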
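First: check the firewall with systemctl status firewalld.service. The cluster nodes need to communicate with each other over the network, so if the firewall is enabled, either disable it or add iptables rules that allow traffic between the nodes by default.
Second: check whether the hosts entries are configured
1. Run cat /etc/hosts to view the hosts file, then edit it.
2. Add the IP and hostname of every node in the cluster.
3. Run hostnamectl --static set-hostname centos-1 on each node in turn, substituting that node's own hostname.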
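The hosts check above can be sketched as a small script that reports which node names are still missing from a hosts file. This is an illustrative helper rather than part of the original procedure: the function name `missing_hosts` is hypothetical, and the node names in the usage line are the ones from the `kubectl get node` output earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each hostname that has no entry in a hosts file.
# Usage: missing_hosts <hosts-file> <hostname>...
missing_hosts() {
  local hosts_file=$1 name
  shift
  for name in "$@"; do
    # Look for the name as a whole field after the IP, ignoring comments.
    awk -v n="$name" '
      /^[[:space:]]*#/ { next }
      { for (i = 2; i <= NF; i++) {
          if ($i ~ /^#/) break
          if ($i == n) found = 1
        } }
      END { exit found ? 0 : 1 }' "$hosts_file" || echo "$name"
  done
}

# Usage, to be run on every node:
#   missing_hosts /etc/hosts k8s-m1 k8s-c1
```

No output means every listed node name already has an entry.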
Check the daemon.json file
Because the systemd cgroup driver is specified in this file, docker fails to run images
cat /etc/docker/daemon.json
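{
"registry-mirrors": ["https://registry.docker-cn.co"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
Remove the line
"exec-opts": ["native.cgroupdriver=systemd"]
(along with the trailing comma on the line before it, so the JSON stays valid) and restart the docker service.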
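Before and after editing the file, it may help to confirm what is actually configured. The sketch below is an addition under stated assumptions: the function name `uses_systemd_cgroup` is hypothetical and only does a plain text match on the file, while the commands in the usage comments need a running docker daemon.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given daemon.json requests the
# systemd cgroup driver. Defaults to /etc/docker/daemon.json.
uses_systemd_cgroup() {
  grep -q 'native\.cgroupdriver=systemd' "${1:-/etc/docker/daemon.json}"
}

# Usage:
#   uses_systemd_cgroup && echo "daemon.json requests the systemd cgroup driver"
#   systemctl restart docker                   # apply the edited file
#   docker info --format '{{.CgroupDriver}}'   # driver docker is actually using
```

The value reported by docker info should match the cgroup driver the kubelet is configured with.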