Hello everyone!
I'm using one of these CWWK/Topton/... Nxxx quad-NIC router devices as my PVE host. Two days ago I migrated from a unit with the N100 to the exact same unit, only with the N305 CPU. This means I just moved the two NVMe drives, the SATA OS drive and the RAM over to the new unit. My daily backup (PBS on the same system, snapshot to an external USB HDD) runs at night, so I found the system frozen in the morning. The host no longer responds to any input, and the only solution is a hard reset via the power button. I can reproduce this behavior just by starting a manual backup of one of my VMs.
Oct 12 15:14:02 pve proxmox-backup-proxy[190647]: error during snapshot file listing: 'unable to load blob '"/mnt/pve/backup/vm/101/2023-10-12T12:18:27Z/index.json.blob"' - No such file or directory (os error 2)'
Oct 12 15:14:08 pve pvedaemon[190730]: <root@pam> starting task UPID:pve:0004799A:00040CC4:6527F120:imgdel:101@pbs:root@pam:
Oct 12 15:14:08 pve proxmox-backup-[190647]: pve proxmox-backup-proxy[190647]: removing backup snapshot "/mnt/pve/backup/vm/101/2023-10-12T12:18:27Z"
Oct 12 15:14:08 pve pvedaemon[190730]: <root@pam> end task UPID:pve:0004799A:00040CC4:6527F120:imgdel:101@pbs:root@pam: OK
Oct 12 15:14:11 pve pvedaemon[190729]: <root@pam> starting task UPID:pve:000479CE:00040DCE:6527F123:vzdump:101:root@pam:
Oct 12 15:14:11 pve pvedaemon[293326]: INFO: starting new backup job: vzdump 101 --notes-template '{{guestname}}' --remove 0 --mode snapshot --node pve --storage pbs
Oct 12 15:14:11 pve pvedaemon[293326]: INFO: Starting Backup of VM 101 (qemu)
Oct 12 15:14:11 pve proxmox-backup-proxy[190647]: starting new backup on datastore 'backup' from ::ffff:192.168.178.8: "vm/101/2023-10-12T13:14:11Z"
Oct 12 15:14:11 pve proxmox-backup-proxy[190647]: download 'index.json.blob' from previous backup.
Oct 12 15:14:11 pve proxmox-backup-proxy[190647]: register chunks in 'drive-scsi0.img.fidx' from previous backup.
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: download 'drive-scsi0.img.fidx' from previous backup.
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: created new fixed index 1 ("vm/101/2023-10-12T13:14:11Z/drive-scsi0.img.fidx")
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: register chunks in 'drive-scsi1.img.fidx' from previous backup.
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: download 'drive-scsi1.img.fidx' from previous backup.
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: created new fixed index 2 ("vm/101/2023-10-12T13:14:11Z/drive-scsi1.img.fidx")
Oct 12 15:14:12 pve proxmox-backup-proxy[190647]: add blob "/mnt/pve/backup/vm/101/2023-10-12T13:14:11Z/qemu-server.conf.blob" (420 bytes, comp: 420)
Oct 12 15:14:15 pve proxmox-backup-proxy[190647]: error during snapshot file listing: 'unable to load blob '"/mnt/pve/backup/vm/101/2023-10-12T13:14:11Z/index.json.blob"' - No such file or directory (os error 2)'
Oct 12 15:14:15 pve pveproxy[190871]: worker exit
Oct 12 15:14:15 pve pveproxy[2330]: worker 190871 finished
Oct 12 15:14:15 pve pveproxy[2330]: starting 1 worker(s)
Oct 12 15:14:15 pve pveproxy[2330]: worker 300986 started
Oct 12 15:17:01 pve CRON[600396]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 12 15:17:01 pve CRON[600397]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Oct 12 15:17:01 pve CRON[600396]: pam_unix(cron:session): session closed for user root
Oct 12 15:20:33 pve kernel: perf: interrupt took too long (2505 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
Oct 12 15:20:34 pve zed[1264433]: eid=8 class=checksum pool='datastore' vdev=nvme-Samsung_SSD_970_EVO_Plus_2TB_S4J4NM0W615790R-part1 algorithm=fletcher4 size=8192 offset=211223277568 priority=0 err=52 flags=0x380880 bookmark=671:1:0:33183492
Oct 12 15:21:06 pve zed[1346716]: eid=9 class=checksum pool='datastore' vdev=nvme-Samsung_SSD_970_EVO_Plus_2TB_S4J4NM0W615820H-part1 algorithm=fletcher4 size=8192 offset=332496965632 priority=0 err=52 flags=0x380880 bookmark=671:1:0:37461792
Oct 12 15:21:19 pve kernel: BUG: unable to handle page fault for address: 00000e66129fa240
Oct 12 15:21:19 pve kernel: #PF: supervisor read access in kernel mode
-- Rebooted --
I think the interesting line is this one: pve kernel: BUG: unable to handle page fault for address: 00000e66129fa240
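In case it helps anyone skim a longer excerpt, here is a minimal sketch for pulling only the kernel and zed lines out of a journal dump like the one above. It assumes the journal has been saved as plain text in the default syslog-style format; the function name `interesting_lines` and the shortened sample are illustrative only and not part of any Proxmox tooling.

```python
import re

# Matches syslog-style journal lines such as
# "Oct 12 15:21:19 pve kernel: message" or "Oct 12 15:20:34 pve zed[1264433]: message".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<unit>[^\s\[:]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)


def interesting_lines(journal_text, units=("kernel", "zed")):
    """Return (timestamp, unit, message) tuples for lines logged by the given units."""
    hits = []
    for line in journal_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("unit") in units:
            hits.append((m.group("ts"), m.group("unit"), m.group("msg")))
    return hits


# Shortened sample taken from the log above (the zed line is truncated here).
sample = """\
Oct 12 15:17:01 pve CRON[600396]: pam_unix(cron:session): session closed for user root
Oct 12 15:20:34 pve zed[1264433]: eid=8 class=checksum pool='datastore'
Oct 12 15:21:19 pve kernel: BUG: unable to handle page fault for address: 00000e66129fa240
"""

for ts, unit, msg in interesting_lines(sample):
    print(ts, unit, msg)
```

The filter only narrows down what to read; it does not interpret the messages.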
My system is completely up to date with the packages from the no-subscription repo. The first time it occurred the system had an older kernel version; I have since updated it, hoping that this would fix the problem, but the freeze still occurs.
Does anyone have an idea what the problem could be?