Differences

This shows you the differences between two versions of the page.

vm:proxmox:lxc:issues [2021/01/27 19:45]
niziak created
vm:proxmox:lxc:issues [2023/07/28 12:13] (current)
niziak
Line 1: Line 1:
 ====== LXC Issues ======
 +
 +===== lxc_init: Failed to run lxc.hook.pre-start for container =====
 +
 +Occurs after upgrading the guest system from Debian 12.0 to 12.1.
 +
 +<code bash>
 +lxc-start -lDEBUG -o error.log -F -n <ContainerID>
 +</code>
 +
 +<code>unsupported debian version '12.1'</code>
 +
 +The ''pve-container'' package on the PVE host does not recognize this Debian version yet and needs to be upgraded.
 +
 +
 +===== apply caps: operation not permitted: unknown. =====
 +
 +BalenaOS build inside a privileged LXC container:
 +<code>
 +docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown.
 +</code>
 +
 +Workaround (not secure: it disables AppArmor confinement, allows access to all devices and drops no capabilities):
 +<file container.conf>
 +lxc.apparmor.profile: unconfined
 +lxc.cgroup.devices.allow: a
 +lxc.cap.drop:
 +</file>
 +
 +Source: [[https://​danthesalmon.com/​running-docker-on-proxmox/​]]
 +
 +
 +===== Slow login into container =====
 +
 +See the next section: the slow login is caused by ''systemd-logind'' failing to start.
 +
 +===== Failed at step NAMESPACE spawning /​lib/​systemd/​systemd-logind:​ Permission denied =====
 +
 +Debian Bullseye in an unprivileged container:
 +
 +<code>
 +systemd[579]: systemd-logind.service: Failed to set up mount namespacing: /run/systemd/unit-root/proc: Permission denied
 +systemd[579]: systemd-logind.service: Failed at step NAMESPACE spawning /lib/systemd/systemd-logind: Permission denied
 +</code>
 +**SOLUTION:** enable nesting for the container (the ''nesting'' feature in the PVE container options).
 +
 +
 +
 +===== cannot stop container =====
 +The container is running and responds to pings, but it is not possible to SSH into it or attach to it.
 +
 +The normal stop and reboot commands do not help (not even ''lxc-stop -k'').
 +
 +**CAUSE:** The container was frozen for a snapshot. All of its processes are in the 'D' state and cannot be killed.
 +
 +**SOLUTION:** thaw the container's freezer cgroup (container 200 in this example):
 +<code bash>
 +echo THAWED > /sys/fs/cgroup/freezer/lxc/200/freezer.state
 +</code>
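The thaw step above can be wrapped in a small helper that first checks the current freezer state. This is only a sketch: ''thaw_container'' and its optional second argument are hypothetical names, and the default path assumes the same cgroup-v1 layout as the command above.

```shell
#!/bin/sh
# Sketch: thaw a container's cgroup-v1 freezer if it is currently frozen.
# thaw_container and its arguments are hypothetical names for this example.
thaw_container() {
    ctid="$1"
    # Default matches the path used above; pass another root for other layouts.
    freezer_root="${2:-/sys/fs/cgroup/freezer/lxc}"
    state_file="$freezer_root/$ctid/freezer.state"

    if [ ! -f "$state_file" ]; then
        echo "no freezer state file: $state_file" >&2
        return 1
    fi

    state=$(cat "$state_file")
    echo "container $ctid freezer state: $state"

    # FROZEN and FREEZING are the non-running states of the cgroup-v1 freezer.
    case "$state" in
        FROZEN|FREEZING)
            echo THAWED > "$state_file"
            echo "container $ctid thawed"
            ;;
    esac
}
```

Usage on the host would be ''thaw_container 200''. A container that is already ''THAWED'' is left untouched.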
 +
 +**Info**: ​
 +  * [[https://​www.kernel.org/​doc/​Documentation/​cgroup-v1/​freezer-subsystem.txt|Freezer subsystem]]
 +  * [[https://​www.kernel.org/​doc/​html/​latest/​admin-guide/​cgroup-v1/​freezer-subsystem.html]]
 +  * [[https://​www.kernel.org/​doc/​html/​latest//​power/​freezing-of-tasks.html|Freezing of tasks]]
 +
 +
 +==== Investigation ====
 +
 +The first attempt was to kill the container's init process (''systemd''):
 +<code bash>
 +pstree -p
 +
 +           ​├─lxc-start(3747487)───systemd(3747514)─┬─agetty(3748048)
 +           ​│ ​                                      ​├─agetty(3748049)
 +
 +kill -9 3747514
 +</​code>​
 +
 +
 +After that, the LXC container cannot be started again. Debugging:
 +<code bash>
 +lxc-start -o lxc-start.log -lDEBUG -F -n 200
 +cat lxc-start.log
 +
 +lxc-start 200 20210325085035.665 INFO     conf - conf.c:​run_script_argv:​331 - Executing script "/​usr/​share/​lxc/​hooks/​lxc-pve-prestart-hook"​ for container "​200",​ config section "​lxc"​
 +lxc-start 200 20210325085036.126 DEBUG    conf - conf.c:​run_buffer:​303 - Script exec /​usr/​share/​lxc/​hooks/​lxc-pve-prestart-hook 200 lxc pre-start produced output: failed to remove directory '/​sys/​fs/​cgroup/​net_cls/​lxc/​200/​ns':​ Device or resource busy
 +</​code>​
 +
 +**Reason:** killing the container's systemd left orphaned cgroups behind, and they cannot be removed:
 +<code bash>
 +find /sys/fs/cgroup/*/lxc/200 -depth -type d -print -delete
 +
 +# Lots of errors:
 +find: cannot delete ‘/sys/fs/cgroup/freezer/lxc/200’: Device or resource busy
 +</code>
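A cgroup directory cannot be removed while tasks are still attached to it, so it can help to see which PIDs remain in the leftover cgroups. The following is a minimal sketch: ''list_cgroup_leftovers'' and its optional root argument are hypothetical names, and it assumes the cgroup-v1 layout used by the ''find'' command above.

```shell
#!/bin/sh
# Sketch: show which PIDs still sit in the leftover cgroups of a container.
# list_cgroup_leftovers and its arguments are hypothetical names.
list_cgroup_leftovers() {
    ctid="$1"
    cgroup_root="${2:-/sys/fs/cgroup}"

    # Same glob as the find command above: one directory per cgroup-v1 controller.
    for dir in "$cgroup_root"/*/lxc/"$ctid"; do
        [ -d "$dir" ] || continue
        # Each cgroup.procs file lists the PIDs attached to that (sub-)cgroup.
        find "$dir" -type f -name cgroup.procs | while read -r procs; do
            pids=$(paste -s -d ' ' "$procs")
            if [ -n "$pids" ]; then
                echo "$procs: $pids"
            fi
        done
    done
    return 0
}
```

Calling ''list_cgroup_leftovers 200'' prints one line per cgroup that still has processes. Empty cgroups are skipped.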
 +
 +All processes from container 200 are in the 'D' state (''Uninterruptible Sleep''):
 +<code bash>
 +ps axl | awk '$10 ~ /D/'
 +</​code>​
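The same check can be written with an explicit column list, so it does not depend on STAT being the tenth field. This is a sketch assuming the procps ''ps'' found on Debian/PVE; ''dstate_filter'' and ''list_dstate'' are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: list processes in uninterruptible sleep using explicit ps columns.
# dstate_filter and list_dstate are hypothetical helper names.
dstate_filter() {
    # Input columns: PID STAT WCHAN COMMAND.
    # Keep the header line and every row whose STAT starts with D.
    awk 'NR == 1 || $2 ~ /^D/'
}

list_dstate() {
    # wchan:32 widens the WCHAN column (procps syntax).
    ps -eo pid,stat,wchan:32,comm | dstate_filter
}
```

Running ''list_dstate'' on the host prints the header plus the blocked processes. Depending on the kernel, the WCHAN column may show where each task is sleeping.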
 +
 +Dump the blocked (uninterruptible) tasks to the kernel log via SysRq ''w'':
 +<code bash>
 +echo w > /proc/sysrq-trigger
 +
 +[587314.999001] smbd            D    0 1181293 ​ 42630 0x00004184
 +[587314.999002] Call Trace:
 +[587314.999004] ​ __schedule+0x2e6/​0x6f0
 +[587314.999005] ​ schedule+0x33/​0xa0
 +[587314.999007] ​ __refrigerator+0x44/​0x160
 +[587314.999009] ​ get_signal+0x814/​0x850
 +[587314.999011] ​ do_signal+0x34/​0x6e0
 +[587314.999013] ​ ? wait_woken+0x80/​0x80
 +[587314.999014] ​ ? __audit_syscall_exit+0x236/​0x290
 +[587314.999016] ​ exit_to_usermode_loop+0x90/​0x130
 +[587314.999018] ​ do_syscall_64+0x160/​0x190
 +[587314.999020] ​ entry_SYSCALL_64_after_hwframe+0x44/​0xa9
 +
 +</​code>​
 +
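 +So it looks like the whole container cgroup was frozen for the snapshot and apparently never thawed: the ''__refrigerator'' frame in the call trace shows the task parked by the freezer.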
 +
 +
 +===== nested docker in cpulimit =====
 +
 +GitLab Runner fails to start the Docker executor:
 +<​code>​
 +ERROR: Job failed (system failure): prepare environment:​ Error response from daemon: OCI runtime create failed: container_linux.go:​367:​ starting container process caused: process_linux.go:​495:​ container init caused: process_linux.go:​458:​ setting cgroup config for procHooks process caused: failed to write "​2400000":​ write /​sys/​fs/​cgroup/​cpu,​cpuacct/​docker/​af4fd93c304a3edc9edb85da6f7a7f9ec85a15262db37393a22141686647d060/​cpu.cfs_quota_us:​ invalid argument: unknown (exec.go:​57:​0s). Check https://​docs.gitlab.com/​runner/​shells/​index.html#​shell-profile-loading for more information
 +</​code>​
 +
 +**Reason:** ''cpulimit'' was set on the container in PVE.
 +
 +**Reproduction:​**
 +<code bash>
 +# works:
 +docker run -it busybox
 +
 +# problem:
 +docker run --cpuset-cpus='0' --cpus=1 --cpu-shares=256 -it busybox
 +</​code>​
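To confirm that a container has ''cpulimit'' set, its config can be inspected on the PVE host. The following is a sketch under assumptions: ''show_cpulimit'' is a hypothetical helper, the default path assumes the usual ''/etc/pve/lxc/<CTID>.conf'' location, and snapshot sections are assumed to start with a ''[name]'' line.

```shell
#!/bin/sh
# Sketch: print the cpulimit value from a PVE container config file.
# show_cpulimit is a hypothetical helper; the default path is an assumption.
show_cpulimit() {
    ctid="$1"
    conf="${2:-/etc/pve/lxc/$ctid.conf}"
    # Stop at the first "[section]" line so that only the current config is read.
    # Exit status is 0 when a cpulimit line was found, 1 otherwise.
    awk -F': *' '
        /^\[/ { exit }
        $1 == "cpulimit" { print $2; found = 1 }
        END { exit !found }
    ' "$conf"
}
```

''show_cpulimit 200'' prints the limit if one is set and returns a non-zero status otherwise. ''pct config 200'' shows the same setting.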
 +
 +
 +
  
 ===== Failed to set up mount namespacing: Permission denied =====
Line 12: Line 151:
 gru 28 08:19:10 hostname sshd[783]: pam_systemd(sshd:session): Failed to create session: Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
 gru 28 08:19:14 hostname systemd[877]: systemd-logind.service: Failed to set up mount namespacing: /run/systemd/unit-root/proc: Permission denied
 -<code>
 +</code>