Context
I installed Docker following this instruction on my Ubuntu 18.04 LTS (Server) and later on installed Kubernetes via kubeadm. After initializing (kubeadm init --pod-network-cidr=10.10.10.10/24) and joining a second node (I had a two-node cluster for a start) I could not get my coredns, nor the later applied Web UI (Dashboard), to actually reach status Running.
As the pod network I tried both Flannel (kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml) and Weave Net - nothing changed. The pods still showed status ContainerCreating, even after hours of waiting:
Question
Why doesn't the container creation work as expected and what might be the root cause for this? And most importantly: How do I solve this?
Edit
Summing up my answer below, here are the reasons why:
- Docker used the cgroupfs cgroup driver instead of systemd
- I did not configure iptables correctly
- I used a wrong kubeadm init call, since Flannel's standard .yaml requires --pod-network-cidr to be 10.244.0.0/16
Since answering this question took me a lot of time, I wanted to share what got me out of this. There might be some more code than necessary, but I also want this to be in one place in case I or someone else has to redo all the steps.
First it all started with Docker...
I figured out that it presumably all started with the way I installed Docker. Following the linked online instructions I used sudo apt-get install docker.io to install Docker (which leaves it on the cgroupfs cgroup driver) and enabled non-root use by doing sudo usermod -aG docker $USER.
Well, taking a look at the official instructions from Kubernetes, this was a mistake: systemd is the recommended cgroup driver!
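For reference, the mismatch is between Docker's cgroup driver and the one the kubelet expects. With a kubeadm-based setup the kubelet's driver is set in its configuration file; a fragment might look roughly like this (file path and exact layout shown as an assumption for illustration, not taken from my cluster):

```yaml
# /var/lib/kubelet/config.yaml (kubeadm-managed KubeletConfiguration)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd   # must match Docker's native.cgroupdriver setting
```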
So I completely purged everything I had ever done with Docker by following these great instructions from Mayur Bhandare:
```shell
sudo apt-get purge -y docker-engine docker docker.io docker-ce
sudo apt-get autoremove -y --purge docker-engine docker docker.io docker-ce
sudo rm -rf /var/lib/docker /etc/docker
sudo rm /etc/apparmor.d/docker
sudo groupdel docker
sudo rm -rf /var/run/docker.sock
# Reboot to be sure
```
Afterwards I reinstalled the official way (keep in mind that this might change in the future):
```shell
# Install Docker CE
## Set up the repository:
### Install packages to allow apt to use a repository over HTTPS
apt-get update && apt-get install -y \
  apt-transport-https ca-certificates curl software-properties-common gnupg2
### Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
### Add Docker apt repository.
add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
## Install Docker CE.
apt-get update && apt-get install -y \
  containerd.io=1.2.10-3 \
  docker-ce=5:19.03.4~3-0~ubuntu-$(lsb_release -cs) \
  docker-ce-cli=5:19.03.4~3-0~ubuntu-$(lsb_release -cs)

# Setup daemon.
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

mkdir -p /etc/systemd/system/docker.service.d

# Restart docker.
systemctl daemon-reload
systemctl restart docker
```
Note that this explicitly uses systemd!
... and then it went on with Flannel...
Above I wrote that my sudo kubeadm init was done with --pod-network-cidr=10.10.10.10/24, since the latter was the IP of my master. Well, as pointed out here, not using the officially recommended --pod-network-cidr=10.244.0.0/16 results in errors, for example when using kubectl proxy or during container creation with the provided kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml. This is due to the fact that 10.244.0.0/16 is hard-coded in the .yaml and, hence, mandatory - or you just change it in the .yaml.
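Alternatively, instead of re-initializing with 10.244.0.0/16, the hard-coded CIDR in kube-flannel.yml could be patched to match a custom --pod-network-cidr. A minimal sketch of that edit (using a stand-in fragment of the manifest, since in practice you would download kube-flannel.yml first and edit it in place):

```shell
# Create a stand-in for flannel's net-conf.json section (illustration only;
# the real value lives inside the kube-flannel.yml ConfigMap).
cat > net-conf.json <<'EOF'
{ "Network": "10.244.0.0/16", "Backend": { "Type": "vxlan" } }
EOF

# Replace the hard-coded CIDR with the one passed to kubeadm init,
# e.g. a hypothetical 10.10.0.0/16:
sed -i 's#10\.244\.0\.0/16#10.10.0.0/16#' net-conf.json
cat net-conf.json
```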
In order to get rid of the false configuration I did a full reset. This can be achieved using sudo kubeadm reset and by deleting the config with sudo rm -r ~/.kube/config. Anyhow, since I had messed it up so much, I did a full reset by uninstalling and reinstalling kubeadm, making sure it used iptables this time (which I had also forgotten to configure before...).
Here is a nice link on how to fully uninstall all kubeadm parts.
```shell
kubeadm reset
sudo apt-get purge kubeadm kubectl kubelet kubernetes-cni kube*
sudo apt-get autoremove
sudo rm -rf ~/.kube
```
For the sake of completeness, here is the reinstall as well:
```shell
# ensure legacy binaries are installed
sudo apt-get install -y iptables arptables ebtables

# switch to legacy versions
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo update-alternatives --set arptables /usr/sbin/arptables-legacy
sudo update-alternatives --set ebtables /usr/sbin/ebtables-legacy

# Install Kubernetes with kubeadm
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# reboot
```
... and finally it worked!
After the clean reinstallation I did the following:
```shell
# Initialize with correct cidr
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
```
And then be astounded by the result:
```shell
kubectl get pods --all-namespaces
```
On a side note: This also resolved the /run/flannel/subnet.env: no such file or directory error I encountered prior to these steps when describing the uncreated coredns pods.
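For reference, that error refers to the file flannel writes once the pod network is up. On a node in this setup its contents would look roughly like the following - the exact subnet and MTU depend on the node's allocation, so these values are an illustrative assumption:

```shell
# /run/flannel/subnet.env - written by flannel after it starts successfully
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
```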