Hi all,
I am trying to troubleshoot an issue I have been having for some time with my homelab.
Server HW:
- CPU: AMD Ryzen 2400G
- RAM: 32GB
- System/VM disk: 2x 256GB NVMe in a ZFS mirror (one in the onboard slot, the other in a PCIe enclosure)
- Data disk: 10TB WD Gold
I have troubleshot a lot of things. I saw the RAM was almost full, so I upgraded from 16 GB to 32 GB. Later I noticed "deadman" events from the system ZFS pool in the logs (syslog at the end of the post), which according to the documentation point to "flaky hardware or drivers". The errors are the same for each incident. However, the errors happen randomly on both disks in the array, and both are relatively new (half a year old); see the SMART data in the screenshot below. It is basically the same for both drives, as they have only ever been used in this server, in the mirror.
I would think it was a faulty disk if the errors came from only one. But from both? Any ideas about what I might have done wrong, or what I might try?
Here is the syslog, starting from the time when something starts to happen until I successfully log in to the admin console:
Code:
Mar 16 18:27:24 HomeServer-hypervisor pvescheduler[564969]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Mar 16 18:27:24 HomeServer-hypervisor pvestatd[2723]: status update time (15.999 seconds)
Mar 16 18:27:24 HomeServer-hypervisor pve-firewall[2724]: firewall update time (15.818 seconds)
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/101: -1
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/102: -1
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/103: -1
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/100: -1
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local-zfs: -1
Mar 16 18:27:24 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local: -1
Mar 16 18:53:38 HomeServer-hypervisor pve-firewall[2724]: firewall update time (74.129 seconds)
Mar 16 18:53:52 HomeServer-hypervisor pve-ha-lrm[2779]: loop take too long (88 seconds)
Mar 16 18:53:52 HomeServer-hypervisor pve-ha-crm[2764]: loop take too long (76 seconds)
Mar 16 18:53:52 HomeServer-hypervisor pvescheduler[703528]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Mar 16 18:53:52 HomeServer-hypervisor pvestatd[2723]: status update time (88.347 seconds)
Mar 16 18:53:52 HomeServer-hypervisor pve-firewall[2724]: firewall update time (13.790 seconds)
Mar 16 18:53:52 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local: -1
Mar 16 18:53:52 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local-zfs: -1
Mar 16 18:55:03 HomeServer-hypervisor pvedaemon[2754]: <root@pam> successful auth for user 'root@pam'
Mar 16 18:55:39 HomeServer-hypervisor pvedaemon[2753]: <root@pam> successful auth for user 'root@pam'
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Missed 26 events
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Bumping queue length to 67108864
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Missed 89 events
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Bumping queue length to 134217728
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Missed 46 events
Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Bumping queue length to 268435456
Mar 16 18:59:22 HomeServer-hypervisor zed[722309]: eid=172 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710919168 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557663
Mar 16 18:59:22 HomeServer-hypervisor zed[722306]: eid=167 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710960128 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557668
Mar 16 18:59:22 HomeServer-hypervisor zed[722315]: eid=166 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710968320 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557669
Mar 16 18:59:22 HomeServer-hypervisor zed[722327]: eid=175 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710894592 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557660
Mar 16 18:59:22 HomeServer-hypervisor zed[722326]: eid=169 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710943744 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557666
Mar 16 18:59:22 HomeServer-hypervisor zed[722322]: eid=168 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710951936 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557667
Mar 16 18:59:22 HomeServer-hypervisor zed[722340]: eid=181 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710845440 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557654
Mar 16 18:59:22 HomeServer-hypervisor zed[722338]: eid=174 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710902784 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557661
Mar 16 18:59:22 HomeServer-hypervisor zed[722341]: eid=170 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710935552 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557665
Mar 16 18:59:22 HomeServer-hypervisor zed[722342]: eid=173 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710910976 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557662
Mar 16 18:59:22 HomeServer-hypervisor zed[722354]: eid=180 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710853632 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557655
Mar 16 18:59:22 HomeServer-hypervisor zed[722339]: eid=176 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710886400 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557659
Mar 16 18:59:22 HomeServer-hypervisor zed[722355]: eid=171 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710927360 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557664
Mar 16 18:59:22 HomeServer-hypervisor zed[722344]: eid=177 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710878208 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557658
Mar 16 18:59:22 HomeServer-hypervisor zed[722349]: eid=179 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710861824 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557656
Mar 16 18:59:22 HomeServer-hypervisor zed[722353]: eid=183 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710829056 priority=3 err=0 flags=0x380880 bookmark=66527:1:0:14557652
Mar 16 18:59:22 HomeServer-hypervisor zed[722358]: eid=182 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710837248 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557653
Mar 16 18:59:22 HomeServer-hypervisor zed[722376]: eid=178 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710870016 priority=3 err=0 flags=0x180880 bookmark=66527:1:0:14557657
Mar 16 18:59:22 HomeServer-hypervisor zed[724326]: eid=185 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 offset=166710820864 priority=3 err=0 flags=0x380880 bookmark=66527:1:0:14557651
Mar 16 18:59:22 HomeServer-hypervisor zed[724324]: eid=184 class=deadman pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=131072 offset=166710706176 priority=3 err=0 flags=0x40080c80 delay=8574092ms
Mar 16 19:04:18 HomeServer-hypervisor pve-firewall[2724]: firewall update time (145.874 seconds)
Mar 16 19:04:18 HomeServer-hypervisor pvestatd[2723]: status update time (146.324 seconds)
Mar 16 19:04:18 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local: -1
Mar 16 19:04:18 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRD update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local: /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local: illegal attempt to update using time 1647453858 when last update time is 1647453858 (minimum one second step)
Mar 16 19:04:18 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local-zfs: -1
Mar 16 19:04:18 HomeServer-hypervisor pmxcfs[2484]: [status] notice: RRD update error /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local-zfs: /var/lib/rrdcached/db/pve2-storage/HomeServer-hypervisor/local-zfs: illegal attempt to update using time 1647453858 when last update time is 1647453858 (minimum one second step)
Mar 16 19:04:23 HomeServer-hypervisor pve-ha-lrm[2779]: loop take too long (156 seconds)
Mar 16 19:04:23 HomeServer-hypervisor pve-ha-crm[2764]: loop take too long (151 seconds)
Mar 16 19:08:54 HomeServer-hypervisor pvedaemon[2753]: <root@pam> successful auth for user 'root@pam'
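To check whether the deadman events really are spread across both mirror members over time, the zed lines can be tallied per vdev. The following is only a minimal sketch: the function name `count_deadman_events` and the regular expression are my own assumptions, written against the line format visible in the log above, and it has not been checked against other ZFS versions.

```python
import re
from collections import Counter

# Assumed pattern: matches zed lines shaped like the ones in the syslog above,
# e.g. "... class=deadman pool='rpool' vdev=nvme-...-part3 size=8192 ..."
DEADMAN_RE = re.compile(r"class=deadman pool='(?P<pool>[^']+)' vdev=(?P<vdev>\S+)")


def count_deadman_events(lines):
    """Return a Counter keyed by (pool, vdev) with one count per deadman event."""
    counts = Counter()
    for line in lines:
        match = DEADMAN_RE.search(line)
        if match:
            counts[(match["pool"], match["vdev"])] += 1
    return counts


# Two lines copied from the syslog above; only the second is a deadman event.
sample = [
    "Mar 16 18:59:22 HomeServer-hypervisor zed[2153]: Missed 26 events",
    "Mar 16 18:59:22 HomeServer-hypervisor zed[722309]: eid=172 class=deadman "
    "pool='rpool' vdev=nvme-Apacer_AS2280P4_256GB_L09938R002700-part3 size=8192 "
    "offset=166710919168 priority=3 err=0 flags=0x180880 "
    "bookmark=66527:1:0:14557663",
]

for (pool, vdev), n in count_deadman_events(sample).most_common():
    print(f"{n:6d}  {pool}  {vdev}")
```

For a whole log, an open file object can be passed instead of the sample list, for example `count_deadman_events(open("/var/log/syslog", errors="replace"))`. That path is an assumption and depends on where the system keeps its syslog.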
Thank you all for your suggestions!