Boot issues with 7.x: systemd-udevd "Failed to update device symlinks: Too many levels of symbolic links"

xed

Active Member
Just narrowed down my search for the culprit of my boot issues to this:


Code:
░░ Subject: A start job for unit systemd-rfkill.socket has finished successfully
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit systemd-rfkill.socket has finished successfully.
░░
░░ The job identifier is 169.
Oct 04 04:35:54 XXX kernel: ipmi_si IPI0001:00: IPMI kcs interface initialized
Oct 04 04:35:54 XXX kernel: ipmi_ssif: IPMI SSIF Interface driver
Oct 04 04:35:55 XXX systemd-udevd[10006]: sdh1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10017]: sdl1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10032]: sdm1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10043]: sda1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10037]: sdc1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10050]: sdb1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10011]: sde1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10027]: sdf1: Failed to update device symlinks: Too many levels of symbolic links
Oct 04 04:35:55 XXX systemd-udevd[10025]: sdd1: Failed to update device symlinks: Too many levels of symbolic links
...
Oct 04 04:37:54 XXX systemd[1]: ifupdown2-pre.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ An ExecStart= process belonging to unit ifupdown2-pre.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Oct 04 04:37:54 XXX systemd[1]: ifupdown2-pre.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit ifupdown2-pre.service has entered the 'failed' state with result 'exit-code'.
Oct 04 04:37:54 XXX systemd[1]: Failed to start Helper to synchronize boot up for ifupdown.
░░ Subject: A start job for unit ifupdown2-pre.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit ifupdown2-pre.service has finished with a failure.
░░
░░ The job identifier is 42 and the job result is failed.
Oct 04 04:37:54 XXX systemd[1]: Dependency failed for Network initialization.
░░ Subject: A start job for unit networking.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit networking.service has finished with a failure.
░░
░░ The job identifier is 41 and the job result is dependency.
Oct 04 04:37:54 XXX systemd[1]: networking.service: Job networking.service/start failed with result 'dependency'.
Oct 04 04:37:54 XXX systemd[1]: ifupdown2-pre.service: Consumed 2.208s CPU time.
░░ Subject: Resources consumed by unit runtime
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support

16 SATA drives, 3 SAS3 drives, and 2 NVMe drives in this system.


Code:
# pveversion -v
proxmox-ve: 7.0-2 (running kernel: 5.11.22-4-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-helper: 7.1-2
pve-kernel-5.11: 7.0-7
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-9
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-11
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.10-1
proxmox-backup-file-restore: 2.0.10-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-10
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-3
pve-firmware: 3.3-1
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-14
smartmontools: 7.2-1
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

The rpool uses native ZFS encryption; everything else is a fresh install :(
 
Quick update: this might not be the culprit. The symptoms are that all requests for the Storage tabs time out (I cannot access anything at all) or report a 'communication failure'. No NFS is in use either.
 
These could be the same sort of symptoms that I reported in [1], but so far nobody else has reported the same issue.
I'm lucky that my host had no boot issues, just the messages, but after reading the bug report, which included reports of unbootable systems, I was cautious.
If this is the same issue, it seems to depend on processor speed, the number of cores, the number of block devices sharing the same label and UUID (ZFS), and perhaps more variables; there is a quick way to check for that below.
While searching I also read that systemd-udevd could be timing out and bringing down ifupdown2-pre.service with it.
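
If you want to see how many block devices on your host share a label/UUID, something like this should show it (just a generic sketch, not taken from the bug report):

Code:
# ZFS pool members typically all carry the pool name as LABEL and the
# pool GUID as UUID, so several partitions sharing both is expected.
lsblk -o NAME,FSTYPE,LABEL,UUID

# blkid prints the same information per partition:
blkid | grep zfs_member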

You could look at your boot times with systemd-analyze blame.
Or try the same workaround I used in /etc/udev/udev.conf, but you should then test which value works for you and watch whether other issues appear (a sketch follows).
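
Roughly what I mean, as a sketch; the thread in [1] has the exact change I made, so treat the udev.conf settings below as the usual candidates rather than a confirmed fix:

Code:
# See which units took the longest during boot; settle/udev-related units
# near the top point at slow device initialization.
systemd-analyze blame | head -n 20

# Possible /etc/udev/udev.conf tweaks (assumption: pick one, test a value
# that suits your hardware, reboot, and watch for side effects):
#event_timeout=300    # seconds before udev gives up on a hanging event (default 180)
#children_max=8       # limit the number of parallel udev workers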

[1] https://forum.proxmox.com/threads/a...ives-too-many-levels-of-symbolic-links.96565/
 
Hi! I'm on it and already reviewing the 'blame' output. I can confirm that udevadm settle is the real culprit, and it mostly occurs when someone leaves an iODD "multi image" USB3 disk plugged in. I think any drive that is dying/faulty will also cause this issue, as will any controller that responds in some non-standard way.
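
If you want to check the same thing on your host, this is roughly how I'd verify it (a generic sketch, not my exact session):

Code:
# Time how long the udev event queue takes to drain; a long wait or a
# hang here matches the ifupdown2-pre timeout seen during boot.
time udevadm settle

# Review udev and ifupdown2-pre messages from the current boot.
journalctl -b -u systemd-udevd -u ifupdown2-pre.service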

There is a workaround that I do not recommend without further investigation: masking the ifupdown2-pre service. That removes the requirement that 'udevadm settle' finish before the NICs are brought up, and boot continues as normal. Most folks will be fine doing this, but I don't recommend it purely out of caution, because 'udevadm settle' should finish just fine unless you have issues with your controllers. If you are using hardware RAID I also don't know the potential side effects; usually the RAID controller is already initialized at that point (its BIOS-launched internal firmware takes care of that), but I can't extrapolate that to all cases and scenarios.
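
For completeness, masking and unmasking look like this; treat it as a sketch and keep the caveats above in mind:

Code:
# Mask the unit so boot no longer waits for 'udevadm settle' before the
# network is brought up (the cautious-use workaround described above).
systemctl mask ifupdown2-pre.service

# Revert once the underlying storage problem is fixed.
systemctl unmask ifupdown2-pre.service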

tl;dr: this is often related to problems with drives or other storage components.
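
If you suspect a dying drive, a basic SMART check with smartmontools (already part of the PVE install) is a reasonable first step; /dev/sda below is just a placeholder for whichever disk udev complained about:

Code:
# Print SMART health, attributes and the error log for one drive.
smartctl -a /dev/sda

# Some SAS drives/HBAs need an explicit device type, e.g.:
smartctl -a -d scsi /dev/sda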

I'll keep testing.
 
