[SOLVED] pvesr segfault in perl caused weird hang

Sep 30, 2019
First, here is a portion of syslog from the event. There appears to be a segfault at 01:18, but the host keeps running until 01:59 with no further errors, and then logs nothing until I manually rebooted it the following day:
Sep 28 01:17:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:17:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:17:00 pve01 systemd[1]: Started Proxmox VE replication runner.
Sep 28 01:17:01 pve01 CRON[20299]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Sep 28 01:18:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:18:00 pve01 systemd[1]: pvesr.service: Main process exited, code=killed, status=11/SEGV
Sep 28 01:18:00 pve01 systemd[1]: pvesr.service: Failed with result 'signal'.
Sep 28 01:18:00 pve01 kernel: [393797.362147] show_signal: 6 callbacks suppressed
Sep 28 01:18:00 pve01 kernel: [393797.362149] traps: pvesr[18087] general protection fault ip:55721d22f3d3 sp:7fffac89bf60 error:0 in perl[55721d226000+15d000]
Sep 28 01:18:00 pve01 systemd[1]: Failed to start Proxmox VE replication runner.

Sep 28 01:19:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:19:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:19:00 pve01 systemd[1]: Started Proxmox VE replication runner.
{above 3 repeats every minute until 01:25:00}
Sep 28 01:25:13 pve01 smartd[1531]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 72 to 73
Sep 28 01:25:13 pve01 smartd[1531]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 75 to 76
Sep 28 01:25:13 pve01 smartd[1531]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 76 to 77
Sep 28 01:25:13 pve01 smartd[1531]: Device: /dev/sdd [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 74 to 75
Sep 28 01:26:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:26:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:26:00 pve01 systemd[1]: Started Proxmox VE replication runner.
{above 3 repeats every minute until 01:55:00}
Sep 28 01:55:13 pve01 smartd[1531]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 73 to 74
Sep 28 01:55:13 pve01 smartd[1531]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 77 to 78
Sep 28 01:55:13 pve01 smartd[1531]: Device: /dev/sdd [SAT], SMART Usage Attribute: 190 Drive_Temperature changed from 75 to 76
Sep 28 01:56:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:56:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:56:00 pve01 systemd[1]: Started Proxmox VE replication runner.
Sep 28 01:57:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:57:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:57:00 pve01 systemd[1]: Started Proxmox VE replication runner.
Sep 28 01:58:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:58:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:58:00 pve01 systemd[1]: Started Proxmox VE replication runner.
Sep 28 01:59:00 pve01 systemd[1]: Starting Proxmox VE replication runner...
Sep 28 01:59:00 pve01 systemd[1]: pvesr.service: Succeeded.
Sep 28 01:59:00 pve01 systemd[1]: Started Proxmox VE replication runner.
Sep 29 16:19:24 pve01 systemd-modules-load[1745]: Inserted module 'iscsi_tcp'
Sep 29 16:19:24 pve01 kernel: [ 0.000000] microcode: microcode updated early to revision 0xcc, date = 2019-04-01
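
To pin down exactly when logging stopped, a small script can look for the longest silence between consecutive syslog lines. This is only a rough sketch: the helper names `parse_ts` and `largest_gap` are made up for illustration, and the year is an assumption because classic syslog timestamps do not include one.

```python
from datetime import datetime, timedelta

def parse_ts(line, year=2019):
    # Classic syslog timestamps omit the year, so it is passed in (assumed 2019 here).
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def largest_gap(lines, year=2019):
    """Return (gap, last_line_before, first_line_after) for the longest silence."""
    stamped = []
    for line in lines:
        try:
            stamped.append((parse_ts(line, year), line))
        except ValueError:
            continue  # skip blank lines and hand-written annotations
    best = (timedelta(0), None, None)
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, l0, l1)
    return best

# Three lines copied from the syslog excerpt above.
sample = [
    "Sep 28 01:58:00 pve01 systemd[1]: Started Proxmox VE replication runner.",
    "Sep 28 01:59:00 pve01 systemd[1]: Started Proxmox VE replication runner.",
    "Sep 29 16:19:24 pve01 systemd-modules-load[1745]: Inserted module 'iscsi_tcp'",
]
gap, before, after = largest_gap(sample)
print(gap)
print("last line before the gap:", before)
```

On the excerpt above, the longest gap sits between the 01:59:00 replication runner line and the first boot message on Sep 29, which matches the time the VMs recorded their unexpected shutdown.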

According to the logs of my VMs, an unexpected shutdown occurred at 01:59 as well. The weird thing is that the host and all the VMs were still responding to pings after the hang. The host also accepted SSH and web connections, but they would eventually just time out.

Here's kern.log:
Code:
Sep 25 00:29:06 pve01 kernel: [131647.067364]  nvme1n1: p1 p2
Sep 28 01:18:00 pve01 kernel: [393797.362147] show_signal: 6 callbacks suppressed
Sep 28 01:18:00 pve01 kernel: [393797.362149] traps: pvesr[18087] general protection fault ip:55721d22f3d3 sp:7fffac89bf60 error:0 in perl[55721d226000+15d000]
Sep 29 16:19:24 pve01 kernel: [    0.000000] microcode: microcode updated early to revision 0xcc, date = 2019-04-01
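
For reference, the trap line carries the faulting instruction pointer (`ip:`) and the start and size of the perl mapping it fell in (`perl[base+size]`), so the offset inside that mapping is simply `ip - base`. A minimal sketch of that arithmetic follows. The regex and the `fault_offset` helper are my own illustration, not part of any Proxmox or kernel tooling.

```python
import re

# Matches the tail of a kernel "traps:" line as it appears in the log above.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)"
    r" in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def fault_offset(line):
    """Return (object_name, offset_in_mapping), or None if the line does not fit."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    if not base <= ip < base + size:
        return None  # the ip lies outside the mapping named in the line
    return m["obj"], ip - base

line = (
    "Sep 28 01:18:00 pve01 kernel: [393797.362149] traps: pvesr[18087] "
    "general protection fault ip:55721d22f3d3 sp:7fffac89bf60 error:0 "
    "in perl[55721d226000+15d000]"
)
obj, off = fault_offset(line)
print(obj, hex(off))
```

For the line above this gives offset 0x93d3 within the perl mapping. Turning that into a function name would need debug symbols for the exact perl build, which I have not attempted here.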

messages:
Code:
Sep 28 00:00:00 pve01 rsyslogd:  [origin software="rsyslogd" swVersion="8.1901.0" x-pid="1534" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Sep 28 01:18:00 pve01 kernel: [393797.362147] show_signal: 6 callbacks suppressed
Sep 28 01:18:00 pve01 kernel: [393797.362149] traps: pvesr[18087] general protection fault ip:55721d22f3d3 sp:7fffac89bf60 error:0 in perl[55721d226000+15d000]
Sep 29 16:19:24 pve01 kernel: [    0.000000] microcode: microcode updated early to revision 0xcc, date = 2019-04-01

pveversion -v
proxmox-ve: 6.0-2 (running kernel: 5.0.21-2-pve)
pve-manager: 6.0-7 (running version: 6.0-7/28984024)
pve-kernel-5.0: 6.0-8
pve-kernel-helper: 6.0-8
pve-kernel-5.0.21-2-pve: 5.0.21-3
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.12-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-8
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve2
Are there any other logs I should be looking at? Any ideas?
 
Hi,
What do you get if you run this command?

Code:
/usr/bin/pvesr run --mail 1
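
If the crash is intermittent, it may help to run the command from a small wrapper that reports whether the process was killed by a signal, since Python's `subprocess` reports a signal death as a negative return code. This is a hedged sketch only; `describe_exit` and `run_and_describe` are hypothetical helper names, and nothing here is specific to pvesr beyond the command line quoted above.

```python
import signal
import subprocess

def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode < 0:
        # A negative value means the child was terminated by that signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited cleanly"
    return f"exited with status {returncode}"

def run_and_describe(cmd):
    """Run cmd (a list of arguments) and return (description, captured stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return describe_exit(proc.returncode), proc.stderr
```

It could be called as `run_and_describe(["/usr/bin/pvesr", "run", "--mail", "1"])`. A result of "killed by SIGSEGV" would correspond to the `status=11/SEGV` that systemd logged at 01:18.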
 
Glad that you found the problem.
Please mark the thread as solved.
 
