Random VM crashes with a SPICE vm: QEMU free(): corrupted unsorted chunks

Code:
root@ukdah:~# pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.4 (running version: 8.1.4/ec5affc9e41f1d79)
proxmox-kernel-helper: 8.1.0
pve-kernel-5.15: 7.4-9
proxmox-kernel-6.5: 6.5.11-7
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
pve-kernel-5.0: 6.0-11
pve-kernel-5.15.131-2-pve: 5.15.131-3
pve-kernel-5.4.166-1-pve: 5.4.166-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 18.2.1-pve2
ceph-fuse: 18.2.1-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.0.5
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.4
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-3
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.2.0
pve-qemu-kvm: 8.1.2-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1


Code:
root@ukdah:~# qm config 146
agent: 1
audio0: device=ich9-intel-hda,driver=spice
balloon: 8096
boot: order=scsi0;net0
cores: 6
cpu: x86-64-v2-AES
description: * Bullseye template for debian workstation%0A* needs%3A spice-vdagent rxvt-unicode%0A* Workstation (polaris)
memory: 24448
meta: creation-qemu=6.1.1,ctime=1651942626
name: bullseye-04
net0: virtio=7E:8C:66:91:6D:66,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: vm_rbd:vm-146-disk-0,discard=on,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=8df17a61-829f-401c-8cb9-ce991922d8eb
sockets: 1
tags: ukdah
usb0: spice
usb1: spice
usb2: spice
usb3: spice
vga: qxl2,memory=128
vmgenid: 388b854f-8226-4b9c-9109-4526090ab0ca

The last few times it crashed (2-3 times since my last post), the message was the same segfault. I've installed the coredump tool, so we'll see if that produces anything. Thanks!

Also possibly worth noting: it usually happens when I click something in Firefox inside the VM (not some random link on the internet, but things I do all the time, like clicking around the Proxmox GUI or logging into TrueNAS). That could be a red herring, though, since half my time is spent clicking, I suppose.
Also on 8.1.4.
 
In 8.0-2 there was a libspice-server1_0.15.1-1_amd64.deb package, and it's the same version now. In theory, the problem isn't in that package itself, but in the adjacent code.
 
Talking with a friend, we discussed different possibilities for debugging a heap corruption. Glibc has some facilities to enable extra checking; a short page I found explaining them is https://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html.
To sum up, you can add environment variables like
Code:
MALLOC_CHECK_=3 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libc_malloc_debug.so.0
to the qemu execution (you need to find your debug library; it should have the same name somewhere under /usr). How to add environment variables to QEMU while it runs under Proxmox, I don't know. In libvirt you can change the domain XML by editing it with the "virsh edit" command.

Another option we discussed was AddressSanitizer, but it needs the software to be recompiled, so it's not easy.
 
You can run the following on the CLI to start the VM with the additional environment variables set:
Code:
MALLOC_CHECK_=3 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libc_malloc_debug.so.0 qm start XYZ
You can verify that the debug library is loaded with:
Code:
grep libc_malloc_debug /proc/$(cat /var/run/qemu-server/XYZ.pid)/maps
In both cases, replace XYZ with the actual ID of the VM of course.
 
There is also QEMU 9.0 available on the pvetest repository. You could give it a try to see whether the issue has been resolved on the QEMU side since QEMU 8.1. After installing, you need to shutdown+start the VM, reboot it via the web UI (a reboot within the guest is not enough!), or live-migrate it to an upgraded node to have the VM actually use the new binary.
 
One host is now running with the debug option, and there was a new crash. Where can I find more debug info from libc_malloc_debug.so.0?
Code:
Jun 03 10:24:05 vdi3 systemd-coredump[2467508]: [] Process 3636863 (kvm) of user 0 dumped core.

                                                Module libsystemd.so.0 from deb systemd-252.22-1~deb12u1.amd64
                                                Module libudev.so.1 from deb systemd-252.22-1~deb12u1.amd64
                                                Stack trace of thread 3636896:
                                                #0  0x00007213623aa401 unlink_chunk (libc_malloc_debug.so.0 + 0x3401)
                                                #1  0x00007213623abfca _int_malloc (libc_malloc_debug.so.0 + 0x4fca)
                                                #2  0x00007213623ac1b9 malloc_check (libc_malloc_debug.so.0 + 0x51b9)
                                                #3  0x00007213623acdf5 __debug_malloc (libc_malloc_debug.so.0 + 0x5df5)
                                                #4  0x00007213619ed679 g_malloc (libglib-2.0.so.0 + 0x5a679)
                                                #5  0x0000721361a07d00 g_memdup2 (libglib-2.0.so.0 + 0x74d00)
                                                #6  0x00007213620d802b n/a (libspice-server.so.1 + 0x4102b)
                                                #7  0x00007213620e90f5 n/a (libspice-server.so.1 + 0x520f5)
                                                #8  0x00007213620e928b n/a (libspice-server.so.1 + 0x5228b)
                                                #9  0x00007213620e9cac n/a (libspice-server.so.1 + 0x52cac)
                                                #10 0x00007213619e77a9 g_main_context_dispatch (libglib-2.0.so.0 + 0x547a9)
                                                #11 0x00007213619e7a38 n/a (libglib-2.0.so.0 + 0x54a38)
                                                #12 0x00007213619e7cef g_main_loop_run (libglib-2.0.so.0 + 0x54cef)
                                                #13 0x00007213620e8fa9 n/a (libspice-server.so.1 + 0x51fa9)
                                                #14 0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #15 0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636890:
                                                #0  0x0000721360157c5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x00005d1b71b086cf kvm_vcpu_ioctl (qemu-system-x86_64 + 0x72c6cf)
                                                #2  0x00005d1b71b08ba5 kvm_cpu_exec (qemu-system-x86_64 + 0x72cba5)
                                                #3  0x00005d1b71b0a08d kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x72e08d)
                                                #4  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #5  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636864:
                                                #0  0x000072136015b719 syscall (libc.so.6 + 0x101719)
                                                #1  0x00005d1b71ca345a qemu_futex_wait (qemu-system-x86_64 + 0x8c745a)
                                                #2  0x00005d1b71cacd62 call_rcu_thread (qemu-system-x86_64 + 0x8d0d62)
                                                #3  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #4  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #5  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636893:
                                                #0  0x0000721360157c5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x00005d1b71b086cf kvm_vcpu_ioctl (qemu-system-x86_64 + 0x72c6cf)
                                                #2  0x00005d1b71b08ba5 kvm_cpu_exec (qemu-system-x86_64 + 0x72cba5)
                                                #3  0x00005d1b71b0a08d kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x72e08d)
                                                #4  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #5  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636892:
                                                #0  0x0000721360157c5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x00005d1b71b086cf kvm_vcpu_ioctl (qemu-system-x86_64 + 0x72c6cf)
                                                #2  0x00005d1b71b08ba5 kvm_cpu_exec (qemu-system-x86_64 + 0x72cba5)
                                                #3  0x00005d1b71b0a08d kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x72e08d)
                                                #4  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #5  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636895:
                                                #0  0x000072136015615f __GI___poll (libc.so.6 + 0xfc15f)
                                                #1  0x00007213619e79ae n/a (libglib-2.0.so.0 + 0x549ae)
                                                #2  0x00007213619e7cef g_main_loop_run (libglib-2.0.so.0 + 0x54cef)
                                                #3  0x00007213620e8fa9 n/a (libspice-server.so.1 + 0x51fa9)
                                                #4  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #5  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 2425743:
                                                #0  0x00007213600dfe96 __futex_abstimed_wait_common64 (libc.so.6 + 0x85e96)
                                                #1  0x00007213600e283c __pthread_cond_wait_common (libc.so.6 + 0x8883c)
                                                #2  0x00005d1b71ca2461 qemu_cond_timedwait_ts (qemu-system-x86_64 + 0x8c6461)
                                                #3  0x00005d1b71ca3000 qemu_cond_timedwait_impl (qemu-system-x86_64 + 0x8c7000)
                                                #4  0x00005d1b71cb7834 worker_thread (qemu-system-x86_64 + 0x8db834)
                                                #5  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #6  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #7  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636898:
                                                #0  0x00007213600dfe96 __futex_abstimed_wait_common64 (libc.so.6 + 0x85e96)
                                                #1  0x00007213600e2558 __pthread_cond_wait_common (libc.so.6 + 0x88558)
                                                #2  0x00005d1b71ca2deb qemu_cond_wait_impl (qemu-system-x86_64 + 0x8c6deb)
                                                #3  0x00005d1b7172ef2b vnc_worker_thread_loop (qemu-system-x86_64 + 0x352f2b)
                                                #4  0x00005d1b7172fbc8 vnc_worker_thread (qemu-system-x86_64 + 0x353bc8)
                                                #5  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #6  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #7  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 3636863:
                                                #0  0x0000721360156256 __ppoll (libc.so.6 + 0xfc256)
                                                #1  0x00005d1b71cb855e ppoll (qemu-system-x86_64 + 0x8dc55e)
                                                #2  0x00005d1b71cb5e4e os_host_main_loop_wait (qemu-system-x86_64 + 0x8d9e4e)
                                                #3  0x00005d1b71912aa7 qemu_main_loop (qemu-system-x86_64 + 0x536aa7)
                                                #4  0x00005d1b71b12f46 qemu_default_main (qemu-system-x86_64 + 0x736f46)
                                                #5  0x000072136008124a __libc_start_call_main (libc.so.6 + 0x2724a)
                                                #6  0x0000721360081305 __libc_start_main_impl (libc.so.6 + 0x27305)
                                                #7  0x00005d1b717050a1 _start (qemu-system-x86_64 + 0x3290a1)

                                                Stack trace of thread 3636891:
                                                #0  0x0000721360157c5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x00005d1b71b086cf kvm_vcpu_ioctl (qemu-system-x86_64 + 0x72c6cf)
                                                #2  0x00005d1b71b08ba5 kvm_cpu_exec (qemu-system-x86_64 + 0x72cba5)
                                                #3  0x00005d1b71b0a08d kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x72e08d)
                                                #4  0x00005d1b71ca22d8 qemu_thread_start (qemu-system-x86_64 + 0x8c62d8)
                                                #5  0x00007213600e3134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007213601637dc __clone3 (libc.so.6 + 0x1097dc)
                                                ELF object binary architecture: AMD x86-64

One host was updated to qemu 9.
 
One host is now running with the debug option, and there was a new crash. Where can I find more debug info from libc_malloc_debug.so.0?
Is there anything interesting in the system log/journal before the crash trace?

One host was updated to qemu 9.
So a different one than the one that just had the crash?
 
Is there anything interesting in the system log/journal before the crash trace?
Just this
Code:
Jun 03 11:52:09 vdi3 kernel: SPICE Worker[3635472]: segfault at 5eaf43bf9820 ip 000074232cf9a401 sp 00007421987fadd0 error 4 in libc_malloc_debug.so.0[74232cf99000+6000] likely on CPU 75 (core 3, socket 1)
Jun 03 11:52:09 vdi3 kernel: Code: 08 48 89 c8 48 83 e0 f8 48 3b 04 07 0f 85 a9 00 00 00 f3 0f 6f 47 10 48 8b 77 10 48 8b 57 18 66 48 0f 7e c0 48 3b 78 18 75 77 <48> 3b 7a 10 75 71 48 89 50 18 66 0f d6 42 10 48 81 f9 ff 03 00 00
So a different one than the one that just had the crash?
The crash was on QEMU 8.1 with libc_malloc_debug.so; it has not happened on QEMU 9.0 yet.
 
QEMU 9 did not help:
Code:
Jun 04 11:16:15 vdi3 kernel: show_signal_msg: 16 callbacks suppressed
Jun 04 11:16:15 vdi3 kernel: SPICE Worker[6633]: segfault at 7cb4540000a0 ip 00007cb707d94fcd sp 00007cb575dfaeb0 error 4 in libc.so.6[7cb707d26000+155000] likely on CPU 73 (core 1, socket 1)

Code:
Jun 04 11:17:16 vdi3 systemd-coredump[1215989]: [] Process 6595 (kvm) of user 0 dumped core.

                                                Module libsystemd.so.0 from deb systemd-252.22-1~deb12u1.amd64
                                                Module libudev.so.1 from deb systemd-252.22-1~deb12u1.amd64
                                                Stack trace of thread 6633:
                                                #0  0x00007cb707d94fcd unlink_chunk (libc.so.6 + 0x94fcd)
                                                #1  0x00007cb707d9626f _int_free (libc.so.6 + 0x9626f)
                                                #2  0x00007cb707d98e8f __GI___libc_free (libc.so.6 + 0x98e8f)
                                                #3  0x00007cb709d7c089 n/a (libspice-server.so.1 + 0x41089)
                                                #4  0x00007cb709d53338 n/a (libspice-server.so.1 + 0x18338)
                                                #5  0x00007cb709d775e3 n/a (libspice-server.so.1 + 0x3c5e3)
                                                #6  0x00007cb709d77700 n/a (libspice-server.so.1 + 0x3c700)
                                                #7  0x00007cb709d6179d n/a (libspice-server.so.1 + 0x2679d)
                                                #8  0x00007cb70968b67f g_main_context_dispatch (libglib-2.0.so.0 + 0x5467f)
                                                #9  0x00007cb70968ba38 n/a (libglib-2.0.so.0 + 0x54a38)
                                                #10 0x00007cb70968bcef g_main_loop_run (libglib-2.0.so.0 + 0x54cef)
                                                #11 0x00007cb709d8cfa9 n/a (libspice-server.so.1 + 0x51fa9)
                                                #12 0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #13 0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6628:
                                                #0  0x00007cb707dfdc5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x000064dd853a19c9 kvm_vcpu_ioctl (qemu-system-x86_64 + 0x7ad9c9)
                                                #2  0x000064dd853a1f11 kvm_cpu_exec (qemu-system-x86_64 + 0x7adf11)
                                                #3  0x000064dd853a3755 kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x7af755)
                                                #4  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #5  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6629:
                                                #0  0x00007cb707dfdc5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x000064dd853a19c9 kvm_vcpu_ioctl (qemu-system-x86_64 + 0x7ad9c9)
                                                #2  0x000064dd853a1f11 kvm_cpu_exec (qemu-system-x86_64 + 0x7adf11)
                                                #3  0x000064dd853a3755 kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x7af755)
                                                #4  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #5  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6596:
                                                #0  0x00007cb707e01719 syscall (libc.so.6 + 0x101719)
                                                #1  0x000064dd8555b18a qemu_futex_wait (qemu-system-x86_64 + 0x96718a)
                                                #2  0x000064dd855660e2 call_rcu_thread (qemu-system-x86_64 + 0x9720e2)
                                                #3  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #4  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #5  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6631:
                                                #0  0x00007cb707dfdc5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x000064dd853a19c9 kvm_vcpu_ioctl (qemu-system-x86_64 + 0x7ad9c9)
                                                #2  0x000064dd853a1f11 kvm_cpu_exec (qemu-system-x86_64 + 0x7adf11)
                                                #3  0x000064dd853a3755 kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x7af755)
                                                #4  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #5  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6634:
                                                #0  0x00007cb707dfc15f __GI___poll (libc.so.6 + 0xfc15f)
                                                #1  0x00007cb70968b9ae n/a (libglib-2.0.so.0 + 0x549ae)
                                                #2  0x00007cb70968bcef g_main_loop_run (libglib-2.0.so.0 + 0x54cef)
                                                #3  0x00007cb709d8cfa9 n/a (libspice-server.so.1 + 0x51fa9)
                                                #4  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #5  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 1213556:
                                                #0  0x00007cb707d85e96 __futex_abstimed_wait_common64 (libc.so.6 + 0x85e96)
                                                #1  0x00007cb707d8883c __pthread_cond_wait_common (libc.so.6 + 0x8883c)
                                                #2  0x000064dd8555a011 qemu_cond_timedwait_ts (qemu-system-x86_64 + 0x966011)
                                                #3  0x000064dd8555acb8 qemu_cond_timedwait_impl (qemu-system-x86_64 + 0x966cb8)
                                                #4  0x000064dd85571a6c worker_thread (qemu-system-x86_64 + 0x97da6c)
                                                #5  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #6  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #7  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6630:
                                                #0  0x00007cb707dfdc5b __GI___ioctl (libc.so.6 + 0xfdc5b)
                                                #1  0x000064dd853a19c9 kvm_vcpu_ioctl (qemu-system-x86_64 + 0x7ad9c9)
                                                #2  0x000064dd853a1f11 kvm_cpu_exec (qemu-system-x86_64 + 0x7adf11)
                                                #3  0x000064dd853a3755 kvm_vcpu_thread_fn (qemu-system-x86_64 + 0x7af755)
                                                #4  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #5  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #6  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)

                                                Stack trace of thread 6595:
                                                #0  0x00007cb707dfc256 __ppoll (libc.so.6 + 0xfc256)
                                                #1  0x000064dd8557283e ppoll (qemu-system-x86_64 + 0x97e83e)
                                                #2  0x000064dd8556fc6e os_host_main_loop_wait (qemu-system-x86_64 + 0x97bc6e)
                                                #3  0x000064dd8518a1a9 qemu_main_loop (qemu-system-x86_64 + 0x5961a9)
                                                #4  0x000064dd853ad1f6 qemu_default_main (qemu-system-x86_64 + 0x7b91f6)
                                                #5  0x00007cb707d2724a __libc_start_call_main (libc.so.6 + 0x2724a)
                                                #6  0x00007cb707d27305 __libc_start_main_impl (libc.so.6 + 0x27305)
                                                #7  0x000064dd84f2a621 _start (qemu-system-x86_64 + 0x336621)

                                                Stack trace of thread 6636:
                                                #0  0x00007cb707d85e96 __futex_abstimed_wait_common64 (libc.so.6 + 0x85e96)
                                                #1  0x00007cb707d88558 __pthread_cond_wait_common (libc.so.6 + 0x88558)
                                                #2  0x000064dd8555aa7b qemu_cond_wait_impl (qemu-system-x86_64 + 0x966a7b)
                                                #3  0x000064dd84f7516b vnc_worker_thread_loop (qemu-system-x86_64 + 0x38116b)
                                                #4  0x000064dd84f75e48 vnc_worker_thread (qemu-system-x86_64 + 0x381e48)
                                                #5  0x000064dd85559e88 qemu_thread_start (qemu-system-x86_64 + 0x965e88)
                                                #6  0x00007cb707d89134 start_thread (libc.so.6 + 0x89134)
                                                #7  0x00007cb707e097dc __clone3 (libc.so.6 + 0x1097dc)
                                                ELF object binary architecture: AMD x86-64
 
Code:
[ 2444.210413] perf: interrupt took too long (2504 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 3804.152823] perf: interrupt took too long (3131 > 3130), lowering kernel.perf_event_max_sample_rate to 63000
[ 9506.519266] perf: interrupt took too long (3916 > 3913), lowering kernel.perf_event_max_sample_rate to 51000
[31218.896676] hrtimer: interrupt took 5418 ns
[54683.760199] show_signal_msg: 16 callbacks suppressed
[54683.760206] SPICE Worker[6633]: segfault at 7cb4540000a0 ip 00007cb707d94fcd sp 00007cb575dfaeb0 error 4 in libc.so.6[7cb707d26000+155000] likely on CPU 73 (core 1, socket 1)
[54683.760222] Code: 08 48 8b 4f 08 48 89 c8 48 83 e0 f8 48 3b 04 07 0f 85 a9 00 00 00 f3 0f 6f 47 10 48 8b 57 18 66 48 0f 7e c0 48 3b 78 18 75 7b <48> 3b 7a 10 75 75 48 8b 77 10 48 89 50 18 66 0f d6 42 10 48 81 f9
 
Hello,
we are also suffering from SPICE Worker segfaults with one of our Win 10 VMs. This is not directly related to the upgrade from Proxmox 7 to 8 (we did that in the summer, and the VM didn't suffer from any segfaults then).
The segfaults started to appear after switching the VM BIOS to UEFI and updating the (quite old) libvirt and SPICE drivers on the Win 10 guest system. The driver update was necessary because the signature on the old drivers was outdated, and Win 10 wouldn't boot in the new UEFI Secure Boot environment with them.
So my conclusion is that the problem (also) has to do with changes in the SPICE drivers on the guest system. For us, the segfaults occur quite reproducibly when we use a particular application that performs frequent GUI and cursor updates (switching from the normal pointer to the waiting cursor) while loading database rows into a table widget for display. Without accessing SPICE (and using this particular application), the VM can run for days or weeks without crashing.

Here is the backtrace from the most recent coredump, pointing to a free() issue in RedCursorCmd (relevant thread only, because of this forum's post character limit):

Code:
Thread 1 (Thread 0x707b5e2006c0 (LWP 3998736)):
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x0000707f86f19f1f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x0000707f86ecafb2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000707f86eb5472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x0000707f86f0e430 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x707f87028459 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x0000707f86f2383a in malloc_printerr (str=str@entry=0x707f8702b1e0 "free(): corrupted unsorted chunks") at ./malloc/malloc.c:5660
#6  0x0000707f86f2592c in _int_free (av=0x707afc000030, p=0x707afc896730, have_lock=<optimized out>, have_lock@entry=0) at ./malloc/malloc.c:4626
#7  0x0000707f86f27f1f in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3385
#8  0x0000707f88f5d089 in red_put_cursor (red=0x707afea28940) at ../server/red-parse-qxl.cpp:1493
#9  RedCursorCmd::~RedCursorCmd (this=0x707afea28910, __in_chrg=<optimized out>) at ../server/red-parse-qxl.cpp:1541
#10 0x0000707f88f34338 in red::shared_ptr_unref<RedCursorCmd> (p=0x707afea28910) at ../server/utils.hpp:487
#11 red::shared_ptr_unref<RedCursorCmd> (p=0x707afea28910) at ../server/utils.hpp:487
#12 red::shared_ptr<RedCursorCmd const>::~shared_ptr (this=0x707afc8bf360, __in_chrg=<optimized out>) at ../server/utils.hpp:189
#13 RedCursorPipeItem::~RedCursorPipeItem (this=0x707afc8bf350, __in_chrg=<optimized out>) at ../server/cursor-channel.cpp:28
#14 RedCursorPipeItem::~RedCursorPipeItem (this=0x707afc8bf350, __in_chrg=<optimized out>) at ../server/cursor-channel.cpp:28
#15 0x0000707f88f585e3 in red::shared_ptr_unref (p=0x707afc8bf350) at ../server/utils.hpp:283
#16 red::shared_ptr_unref (p=0x707afc8bf350) at ../server/utils.hpp:283
#17 red::shared_ptr<RedPipeItem>::~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at ../server/utils.hpp:189
#18 RedChannelClient::push (this=0x707afc00cb60) at ../server/red-channel-client.cpp:1165
#19 0x0000707f88f587c2 in RedChannelClient::handle_message (this=0x707afc00cb60, type=<optimized out>, size=<optimized out>, message=<optimized out>) at ../server/red-channel-client.cpp:1292
#20 0x0000707f88f57359 in RedChannelClient::handle_incoming (this=this@entry=0x707afc00cb60) at ../server/red-channel-client.cpp:1105
#21 0x0000707f88f586dd in RedChannelClient::receive (this=0x707afc00cb60) at ../server/red-channel-client.cpp:1124
#22 red_channel_client_event (fd=<optimized out>, event=<optimized out>, rcc=0x707afc00cb60) at ../server/red-channel-client.cpp:739
#23 0x0000707f88f4279d in spice_watch_dispatch (source=0x707afc8b7910, callback=0x707f88f58680 <red_channel_client_event(int, int, RedChannelClient*)>, user_data=0x707afc00cb60) at ../server/event-loop.c:166
#24 0x0000707f8886367f in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#25 0x0000707f88863a38 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#26 0x0000707f88863cef in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#27 0x0000707f88f6dfa9 in red_worker_main (arg=0x5bd6c9ca52e0) at ../server/red-worker.cpp:1021
#28 0x0000707f86f181c4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#29 0x0000707f86f9885c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Best regards
Julian Hartig
 