ImagePull failure - can't unmount tmpmount #10239

Open
willthames opened this issue May 17, 2024 · 1 comment

willthames commented May 17, 2024

Description

We occasionally get image pull errors (ErrImagePull), most often at node startup but at other times too.

Steps to reproduce the issue

Describe the results you received and expected

Expected: an image that exists and pulls successfully on other nodes should pull successfully on every node.

Actual (I've added extra line breaks so that the output is readable):

May 17 02:30:09 ip-172-18-13-191.us-west-2.compute.internal kubelet[4461]: E0517 02:30:09.771490
    4461 pod_workers.go:1300] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kafka\" 
with ErrImagePull: \"failed to pull and unpack image \\\"docker.io/confluentinc/cp-kafka:7.3.3\\\": failed to extract layer 
sha256:969795712c492f0c43031ce89dfb3d6ca2c08221fc28fb4479c7e0a370af7342: failed to unmount /var/lib/
containerd/tmpmounts/containerd-mount2055904353: failed to unmount target /var/lib/containerd/tmpmounts/
containerd-mount2055904353: device or resource busy: unknown\"" pod="management/kafka-1" podUID="31a80586-
e277-4270-93ae-eb14a2615692"

bash-5.1# grep /var/lib/containerd/tmpmounts/containerd-mount2055904353 /proc/1/mountinfo 
19294 128 259:16 /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1389/fs//deleted 
/var/lib/containerd/tmpmounts/containerd-mount2055904353 rw,nosuid,nodev,noatime
 shared:68 - xfs /dev/nvme1n1p1 rw,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota

I've tried running strace on this, but all I see is EBUSY returned from the umount2 system call. I can't find anything else that might be using the mount. I've looked for candidate processes similar to fluent from #9530, but we don't seem to hostPath mount /var/lib or /var/lib/containerd anywhere.
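
The grep above only inspects the mount table of PID 1. Since the entry is tagged shared:68, one thing that may be worth ruling out is the mount having propagated into another mount namespace. The following is a minimal sketch, not a containerd tool: it parses the /proc/<pid>/mountinfo format and lists every process whose mount table still contains the tmpmount path. The names MountEntry, parse_mountinfo_line and pids_with_mount are hypothetical helpers introduced for illustration, and the script assumes it runs as root on the affected node with Python available.

```python
import glob
from typing import Dict, List, NamedTuple, Optional, Tuple


class MountEntry(NamedTuple):
    """Subset of the fields in one /proc/<pid>/mountinfo line."""
    mount_id: int
    parent_id: int
    root: str
    mount_point: str
    optional_fields: Tuple[str, ...]
    fs_type: str
    source: str


def parse_mountinfo_line(line: str) -> Optional[MountEntry]:
    """Parse one mountinfo line; return None if it is malformed."""
    fields = line.split()
    try:
        # Six fixed fields come first, then zero or more optional fields
        # (for example "shared:68"), then a lone "-" separator.
        sep = fields.index("-", 6)
        if len(fields) < sep + 3:
            return None
        return MountEntry(
            mount_id=int(fields[0]),
            parent_id=int(fields[1]),
            root=fields[3],
            mount_point=fields[4],
            optional_fields=tuple(fields[6:sep]),
            fs_type=fields[sep + 1],
            source=fields[sep + 2],
        )
    except ValueError:
        return None


def pids_with_mount(target: str, proc_root: str = "/proc") -> Dict[int, List[MountEntry]]:
    """Map each PID to the entries in its mount table mounted at `target`."""
    hits: Dict[int, List[MountEntry]] = {}
    for path in glob.glob(f"{proc_root}/[0-9]*/mountinfo"):
        pid = int(path.split("/")[-2])
        try:
            with open(path) as fh:
                parsed = (parse_mountinfo_line(line) for line in fh)
                entries = [e for e in parsed if e and e.mount_point == target]
        except OSError:
            continue  # process exited, or we lack permission to read it
        if entries:
            hits[pid] = entries
    return hits


if __name__ == "__main__":
    # Path taken from the kubelet error above.
    TARGET = "/var/lib/containerd/tmpmounts/containerd-mount2055904353"
    for pid, entries in sorted(pids_with_mount(TARGET).items()):
        for e in entries:
            print(pid, e.mount_id, e.root, ",".join(e.optional_fields))
```

If any PID other than those sharing containerd's mount namespace shows up, that process's namespace is a candidate for keeping the mount busy. An empty result would not rule out other causes of EBUSY, such as an open file descriptor under the mount.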

What version of containerd are you using?

containerd github.com/containerd/containerd 1.6.31+bottlerocket e377cd5

Any other relevant information

bash-5.1# runc --version
runc version 1.1.12+bottlerocket
commit: 51d5e94601ceffbbd85688df1c928ecccbfa4685
spec: 1.0.2-dev
go: go1.21.9
libseccomp: 2.5.5
bash-5.1# crictl info
bash: crictl: command not found
bash-5.1# uname -a
Linux ip-172-18-13-191.us-west-2.compute.internal 6.1.87 #1 SMP PREEMPT_DYNAMIC Wed May  8 18:52:52 UTC 2024 x86_64 GNU/Linux

Show configuration if it is related to CRI plugin.

version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
disabled_plugins = [
    "io.containerd.internal.v1.opt",
    "io.containerd.snapshotter.v1.aufs",
    "io.containerd.snapshotter.v1.devmapper",
    "io.containerd.snapshotter.v1.native",
    "io.containerd.snapshotter.v1.zfs",
]

[grpc]
address = "/run/containerd/containerd.sock"

[plugins."io.containerd.grpc.v1.cri"]
enable_selinux = true
# Pause container image is specified here, shares the same image as kubelet's pod-infra-container-image
sandbox_image = "602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.1-eksbuild.1"

[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "shimpei"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.shimpei]
runtime_type = "io.containerd.runc.v2"
base_runtime_spec = "/etc/containerd/cri-base.json"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.shimpei.options]
SystemdCgroup = true
BinaryName = "shimpei"

[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"

Fricounet commented

Hey @willthames, are you running any monitoring or security software on your nodes, by any chance? If so, this might be similar to #5538.
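
One hedged way to check for such an agent is to look for processes that hold an open file descriptor, working directory, or root directory under the temporary mount directory at the moment the pull fails. The sketch below walks /proc for that purpose. The helpers path_is_under and processes_holding are hypothetical names introduced for illustration, and the script assumes root access on the node. A process that reached the same files through a different path, or from another mount namespace, may not match this prefix.

```python
import os
from typing import Dict, List, Tuple


def path_is_under(path: str, prefix: str) -> bool:
    """True if `path` equals `prefix` or lies below it (whole components only)."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def processes_holding(prefix: str, proc_root: str = "/proc") -> Dict[int, Tuple[str, List[str]]]:
    """Map PID -> (command name, paths held under `prefix`)."""
    hits: Dict[int, Tuple[str, List[str]]] = {}
    try:
        names = os.listdir(proc_root)
    except OSError:
        return hits
    for name in names:
        if not name.isdigit():
            continue
        base = os.path.join(proc_root, name)
        # cwd and root are symlinks, as is every entry in fd/.
        links = [os.path.join(base, "cwd"), os.path.join(base, "root")]
        try:
            links += [os.path.join(base, "fd", fd) for fd in os.listdir(os.path.join(base, "fd"))]
        except OSError:
            pass  # process exited, or fd/ is not readable
        held = []
        for link in links:
            try:
                dest = os.readlink(link)
            except OSError:
                continue
            if path_is_under(dest, prefix):
                held.append(dest)
        if held:
            try:
                with open(os.path.join(base, "comm")) as fh:
                    comm = fh.read().strip()
            except OSError:
                comm = "?"
            hits[int(name)] = (comm, held)
    return hits


if __name__ == "__main__":
    # Directory taken from the kubelet error in the issue description.
    for pid, (comm, paths) in sorted(processes_holding("/var/lib/containerd/tmpmounts").items()):
        print(pid, comm, paths)
```

Because the busy window is probably short, running this in a loop while a pull is in progress is more likely to catch the holder than a single run after the failure.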
