Proxmox 6.1.8 LXC zfs replication issues

Paspao

Hello,

I have two clusters with the same hardware, the same configuration, and the same LXC containers.

One cluster is still on 6.0; the other I upgraded to 6.1.8.

Both use local ZFS.

After upgrading the cluster to 6.1 I had issues with LXC with one or more processes in D (Uninterruptible sleep) and was forced to reboot nodes.

I noticed the containers were hanging during replication hours, and I found replication jobs stuck in SYNCING.

No errors in syslog in most cases.

In one case I found:
Code:
Apr 22 01:36:40 proxmox kernel: [837119.871166]  filp_close+0x36/0x70
Apr 22 01:38:41 proxmox kernel: [837240.702308]  fuse_flush+0x14d/0x190
Apr 22 01:40:42 proxmox kernel: [837361.533436] zabbix_agentd   D    0 39050  39048 0x00004324
Apr 22 01:40:42 proxmox kernel: [837361.533526] RDX: 0000558d0267d2b0 RSI: 0000000000000001 RDI: 0000000000000006
Apr 22 01:42:43 proxmox kernel: [837482.364556] Call Trace:
Apr 22 01:44:43 proxmox kernel: [837603.195770] RIP: 0033:0x7fdc0a5c32b0
Apr 22 01:46:44 proxmox kernel: [837724.026875]  __x64_sys_close+0x22/0x50
Apr 22 01:48:45 proxmox kernel: [837844.857981]  fuse_request_send+0x29/0x30
Apr 22 01:50:46 proxmox kernel: [837965.689128]  __x64_sys_close+0x22/0x50
Apr 22 01:52:47 proxmox kernel: [838086.520278]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Is there a known bug?

Any troubleshooting hint?

Thank you.
P.
 
"After upgrading the cluster to 6.1 I had issues with LXC with one or more processes in D (Uninterruptible sleep) and was forced to reboot nodes."
which processes were in uninterruptible sleep?

could you please provide the complete journal for that timeframe?
Code:
journalctl --since '2020-04-22'

the part of the stacktrace you pasted does not look like a known problem (zabbix being blocked on a fuse operation...)
do you have zabbix agents running inside your lxc-containers?
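
One way to answer the question of which processes are in uninterruptible sleep is to read the state field straight from procfs. The following is a minimal Python sketch, not a Proxmox tool: the function names `parse_stat` and `uninterruptible_processes` are made up for illustration, and it assumes procfs is mounted at `/proc` and that the script runs on the node itself.

```python
import os

def parse_stat(stat_text):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    left, _, rest = stat_text.partition("(")
    comm, _, tail = rest.rpartition(")")
    return int(left), comm, tail.split()[0]

def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process whose state is D right now."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling `uninterruptible_processes()` returns a sorted list of `(pid, name)` pairs; an empty list only means that nothing was in state D at the moment of the scan.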
 
Hi Stoiko,

I could provide the log, but it really contains nothing interesting, only some entries like:

zed: eid=7528 class=history_event pool_guid=0xA49767BB4C1CD042

"which processes were in uninterruptible sleep?"

In one case all processes under lxc init were in D (Ds/Dt)

"the part of the stacktrace you pasted does not look like a known problem (zabbix being blocked on a fuse operation...)
do you have zabbix agents running inside your lxc-containers?"

In some cases only zabbix-agent hangs. Yes, we have it in all our LXC containers.


Thank you !
P.
 
"zed: eid=7528 class=history_event pool_guid=0xA49767BB4C1CD042"
That seems like a regular message that can happen when creating a volume or a snapshot - this is unlikely to be the culprit

is the part of the stack trace from your original post listed in the journal?
* if yes, please post the surrounding lines
* if no, I guess you don't have persistent journaling enabled and rebooted the host; in that case please check the /var/log/syslog* files for the relevant lines

without a complete log it is not really possible to narrow down the issue
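
Searching the rotated syslog files for the trace together with its surrounding lines can be sketched as follows. This is only an illustration under assumptions: the helper names (`lines_with_context`, `read_log`, `scan_logs`) are hypothetical, rotated files are assumed to be either plain text or gzip-compressed, and the search pattern is something the reader has to choose (for example `fuse_`, taken from the frames in the first post).

```python
import glob
import gzip
import re

def lines_with_context(lines, pattern, context=3):
    """Return the lines matching pattern plus `context` lines on each side."""
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]

def read_log(path):
    """Read a plain or gzip-compressed log file as a list of lines."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        return fh.read().splitlines()

def scan_logs(file_glob, pattern, context=3):
    """Yield (path, line) for each matching region in every matching file."""
    for path in sorted(glob.glob(file_glob)):
        for line in lines_with_context(read_log(path), pattern, context):
            yield path, line
```

For example, `list(scan_logs("/var/log/syslog*", r"fuse_", context=20))` would collect every `fuse_` frame with 20 lines before and after it, which is the kind of excerpt that is useful to post.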
 
Hello,

Today it happened again, even though I had stopped storage replication.

17558 ? D 0:15 \_ /usr/sbin/zabbix_agentd: collector [processing data]

It seems it happened very close to log rotation time at 3:19 AM.

No errors in syslog (attached).

This is the output of cat /proc/17558/stack:
Code:
[<0>] request_wait_answer+0x133/0x210
[<0>] __fuse_request_send+0x69/0x90
[<0>] fuse_request_send+0x29/0x30
[<0>] fuse_direct_io+0x404/0x690
[<0>] fuse_file_read_iter+0x9a/0x130
[<0>] new_sync_read+0x122/0x1b0
[<0>] __vfs_read+0x29/0x40
[<0>] vfs_read+0x99/0x160
[<0>] ksys_read+0x61/0xe0
[<0>] __x64_sys_read+0x1a/0x20
[<0>] do_syscall_64+0x5a/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
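
A trace like the one above can be summarized mechanically when several hung processes need to be compared. The sketch below is a rough illustration only: `stack_symbols` and `waits_on_fuse` are hypothetical names, and the rule "a `request_wait_answer` frame plus any `fuse_` frame means the task is waiting on a FUSE request" is an assumption drawn from the frames visible in this one trace.

```python
import re

# Matches lines such as "[<0>] fuse_request_send+0x29/0x30".
FRAME = re.compile(r"^\[<[0-9a-fx]+>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")

def stack_symbols(stack_text):
    """Extract function names, innermost frame first, from /proc/<pid>/stack."""
    symbols = []
    for line in stack_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            symbols.append(match.group(1))
    return symbols

def waits_on_fuse(symbols):
    """Guess whether the task is blocked waiting for a FUSE reply."""
    has_fuse_frame = any(s.lstrip("_").startswith("fuse_") for s in symbols)
    return "request_wait_answer" in symbols and has_fuse_frame
```

Applied to the trace above, `stack_symbols` yields `request_wait_answer` as the innermost frame and `waits_on_fuse` returns `True`, which matches the reading that the agent is stuck in a read on a FUSE-backed file.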

Code:
lsof -p 17558

COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
zabbix_ag 17558 _apt  cwd    DIR   0,62       22       34 /
zabbix_ag 17558 _apt  rtd    DIR   0,62       22       34 /
zabbix_ag 17558 _apt  txt    REG   0,62   287448   100655 /usr/sbin/zabbix_agentd
zabbix_ag 17558 _apt  DEL    REG    0,1                 1 /SYSV6c3e02ce
zabbix_ag 17558 _apt  mem    REG   0,62             40804 /lib/x86_64-linux-gnu/libnss_files-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40806 /lib/x86_64-linux-gnu/libnss_nis-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40647 /lib/x86_64-linux-gnu/libnsl-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40801 /lib/x86_64-linux-gnu/libnss_compat-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             98313 /lib/x86_64-linux-gnu/libkeyutils.so.1.4 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             54283 /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1 (path dev=0,25, inode=5667)
zabbix_ag 17558 _apt  mem    REG   0,62             54095 /lib/x86_64-linux-gnu/libcom_err.so.2.1 (path dev=0,25, inode=225731)
zabbix_ag 17558 _apt  mem    REG   0,62             54220 /usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1 (path dev=0,25, inode=5659)
zabbix_ag 17558 _apt  mem    REG   0,62             54263 /usr/lib/x86_64-linux-gnu/libkrb5.so.3.3 (path dev=0,25, inode=5665)
zabbix_ag 17558 _apt  mem    REG   0,62               182 /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             34866 /usr/lib/x86_64-linux-gnu/libp11-kit.so.0.0.0 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62            180292 /usr/lib/x86_64-linux-gnu/libtasn1.so.3.1.16 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             34561 /usr/lib/x86_64-linux-gnu/librtmp.so.0 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62               149 /lib/x86_64-linux-gnu/libz.so.1.2.7 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             54240 /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2 (path dev=0,25, inode=1672)
zabbix_ag 17558 _apt  mem    REG   0,62             68992 /usr/lib/x86_64-linux-gnu/libssh2.so.1.0.1 (path dev=0,25, inode=30160)
zabbix_ag 17558 _apt  mem    REG   0,62             34537 /usr/lib/x86_64-linux-gnu/libidn.so.11.6.8 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40570 /lib/x86_64-linux-gnu/libpthread-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             54161 /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             54197 /usr/lib/x86_64-linux-gnu/libgnutls.so.26.22.4 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             34435 /usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25 (path dev=0,25, inode=69761)
zabbix_ag 17558 _apt  mem    REG   0,62             40574 /lib/x86_64-linux-gnu/libc-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40809 /lib/x86_64-linux-gnu/libresolv-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40815 /lib/x86_64-linux-gnu/librt-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40599 /lib/x86_64-linux-gnu/libdl-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             95339 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.2.0 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             98330 /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2.8.3 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             98329 /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.8.3 (stat: No such file or directory)
zabbix_ag 17558 _apt  mem    REG   0,62             40571 /lib/x86_64-linux-gnu/ld-2.13.so (stat: No such file or directory)
zabbix_ag 17558 _apt    0r   CHR    1,3      0t0        5 /dev/null
zabbix_ag 17558 _apt    1w   REG   0,62     1937    70384 /var/log/zabbix-agent/zabbix_agentd.log.1 (deleted)
zabbix_ag 17558 _apt    2w   REG   0,62     1937    70384 /var/log/zabbix-agent/zabbix_agentd.log.1 (deleted)
zabbix_ag 17558 _apt    3w   REG  0,171        4       33 /run/zabbix/zabbix_agentd.pid (deleted)
zabbix_ag 17558 _apt    4u  sock    0,9      0t0 35768137 protocol: TCP
zabbix_ag 17558 _apt    5u  sock    0,9      0t0 35768138 protocol: TCPv6
zabbix_ag 17558 _apt    6r   REG  0,116        0        7 /proc/stat
zabbix_ag 17558 _apt   11r   REG    0,5        0    29336 /proc/30433/status

Is there any other data from /proc that could help with troubleshooting?

Having to reboot the server for this is a real problem: the reboot command then hangs while killing processes, and I don't like power cycling.

Could you please help in troubleshooting?

Thank you
P.
 

Attachment: syslog.txt (145 KB)
"This is the output of cat /proc/17558/stack:"
could you try to access:
* `/proc/stat`
* `/proc/30433/status`
(i.e. the 2 files that zabbix tries to access when it hangs) - via cat?

(of course if this happens again you'll need to run the lsof again)
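
Since a plain `cat` on one of these files may itself end up stuck, it can be safer to run the read in a child process with a deadline. This is a minimal sketch under assumptions: `cat_with_timeout` is a hypothetical helper, the 5 second default is arbitrary, and killing the child is not guaranteed to work if it is already in state D.

```python
import subprocess
import tempfile
import time

def cat_with_timeout(path, timeout=5.0):
    """Run `cat path`; return its output, or None if it did not finish in time.

    A file that cannot be read at all yields an empty string, because
    cat's error message on stderr is discarded.
    """
    # Output goes to a temporary file so a large result cannot fill a pipe.
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(["cat", path], stdout=out,
                                stderr=subprocess.DEVNULL)
        deadline = time.monotonic() + timeout
        while proc.poll() is None:
            if time.monotonic() > deadline:
                proc.kill()  # may have no effect on a task stuck in state D
                return None  # deliberately no wait(): it could block forever
            time.sleep(0.1)
        out.seek(0)
        return out.read().decode(errors="replace")
```

Run inside the affected container, `cat_with_timeout("/proc/stat")` and `cat_with_timeout("/proc/30433/status")` would return `None` for whichever file hangs, without tying up the shell that is being used for debugging.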

Thanks!
 
additionally could I ask you for:
* `pveversion -v`
* `dmesg`
outputs?
 
