pverados segfault

st6f9n

Active Member
In /var/log/syslog I see a lot of these messages (Proxmox 8, upgraded from 6 to 7 to 8):

2023-07-14T17:59:04.235286+02:00 xxx kernel: [171745.397894] Code: Unable to access opcode bytes at 0x55615db1eab6.
2023-07-14T18:05:44.739292+02:00 xxx kernel: [172145.909735] pverados[841677]: segfault at 55615db1eae0 ip 000055615db1eae0 sp 00007ffefaf5c038 error 14 in perl[55615daf9000+195000] likely on CPU 13 (core 3, socket 1)

The message appears on all nodes; the core and socket numbers vary. I have no actual problems with this Proxmox cluster.

Thanks for any hints.
 
I have the same upgrade history 6->7->8, and since 8 the system is not stable anymore.

Jul 14 16:57:59 xxx kernel: Code: Unable to access opcode bytes at 0x55cb0430f006.
Jul 14 16:57:59 xxx kernel: pverados[1042313]: segfault at 55cb0430f030 ip 000055cb0430f030 sp 00007ffec8d44af8 error 14 likely on CPU 2 (core 2, socket 0)

On my other Proxmox hosts I cannot see any problems; I don't know why it happens here.
Maybe I will reinstall Proxmox on this system...
 
Same here, sometimes the opcode bytes are listed though:

Code:
[Mon Jul 17 05:04:28 2023] pverados[740828]: segfault at 55a4a3d5d030 ip 000055a4a3d5d030 sp 00007ffecd408178 error 14 in perl[55a4a3d31000+195000] likely on CPU 1 (core 2, socket 0)
[Mon Jul 17 05:04:28 2023] Code: Unable to access opcode bytes at 0x55a4a3d5d006.
[Mon Jul 17 05:43:29 2023] pverados[797640]: segfault at 55a4a4a33378 ip 000055a4a3e19cef sp 00007ffecd407fc0 error 7 in perl[55a4a3d31000+195000] likely on CPU 13 (core 10, socket 1)
[Mon Jul 17 05:43:29 2023] Code: 48 89 45 70 48 3b 45 78 0f 84 b6 02 00 00 4c 89 ea 48 2b 55 18 49 83 c5 08 4c 89 c6 48 c1 fa 03 48 89 ef 89 10 ba 2d 00 00 00 <4d> 89 75 00 4c 89 6d 00 e8 94 e0 f3 ff 48 8b 85 e0 00 00 00 48 8b
[Mon Jul 17 07:23:58 2023] pverados[944060]: segfault at 55a4a3d5d030 ip 000055a4a3d5d030 sp 00007ffecd408178 error 14 in perl[55a4a3d31000+195000] likely on CPU 10 (core 3, socket 1)
[Mon Jul 17 07:23:58 2023] Code: Unable to access opcode bytes at 0x55a4a3d5d006.
 
Also seeing this here, fortunately not in our production environment, which is still on 7:

[409308.173312] pverados[1509845]: segfault at 55ad0984aae0 ip 000055ad0984aae0 sp 00007ffe919882f8 error 14 in perl[55ad09825000+195000] likely on CPU 16 (core 2, socket 1)
[409308.173332] Code: Unable to access opcode bytes at 0x55ad0984aab6.

Lots of these messages on all nodes.
 
So it's definitely a Proxmox 8 problem. Where are the Proxmox gurus, and where is the redeeming update? ;)
 
Do you have a Ceph cluster running on the affected clusters?
 
Yes, I do (it's the same on the other cluster):

Code:
ceph -s
  cluster:
    id:     aceb952d-7214-4fbe-bd8d-71b863f2d0eb
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum xxx,yyy,zzz (age 4d)
    mgr: xxx(active, since 4d), standbys: yyy, zzz
    osd: 6 osds: 6 up (since 4d), 6 in (since 6d)
 
  data:
    pools:   2 pools, 129 pgs
    objects: 549.33k objects, 2.1 TiB
    usage:   6.3 TiB used, 6.4 TiB / 13 TiB avail
    pgs:     129 active+clean
 
Hmm, I cannot see that in my test environments. Could you please post pveversion -v?
Maybe also which CPUs? lscpu

The message comes from the code path where Proxmox VE talks to the Ceph MON via its C API.
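For anyone curious what that roughly looks like: the worker ends up issuing a MON command through the librados C API (via the Perl binding). Here is a minimal standalone sketch of such a query; it is not the actual pverados/Perl code, and the "df" command is just a stand-in example:
Code:
/*
 * Minimal illustration only (not the PVE/pverados code): query the Ceph MON
 * via the librados C API. Build with: gcc -o monquery monquery.c -lrados
 */
#include <rados/librados.h>
#include <stdio.h>

int main(void)
{
    rados_t cluster;
    char *outbuf = NULL, *outs = NULL;
    size_t outbuf_len = 0, outs_len = 0;
    /* example MON command, roughly "ceph df --format json" */
    const char *cmd[] = { "{\"prefix\": \"df\", \"format\": \"json\"}" };

    if (rados_create(&cluster, NULL) < 0)      /* NULL = client.admin */
        return 1;
    rados_conf_read_file(cluster, NULL);       /* default ceph.conf */
    if (rados_connect(cluster) < 0) {
        rados_shutdown(cluster);
        return 1;
    }

    if (rados_mon_command(cluster, cmd, 1, NULL, 0,
                          &outbuf, &outbuf_len, &outs, &outs_len) == 0)
        printf("%.*s\n", (int)outbuf_len, outbuf);

    rados_buffer_free(outbuf);
    rados_buffer_free(outs);
    rados_shutdown(cluster);
    return 0;
}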
Do you see any problems, or is it just the logs that show up?
 
Could you please post pveversion -v?
Maybe also which CPUs? lscpu
No problems at the moment, but only a few VMs are running.

Code:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   46 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          40
On-line CPU(s) list:             0-39
Vendor ID:                       GenuineIntel
BIOS Vendor ID:                  Intel(R) Corporation
Model name:                      Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
BIOS Model name:                 Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz  CPU @ 2.2GHz
BIOS CPU family:                 179
CPU family:                      6
Model:                           85
Thread(s) per core:              2
Core(s) per socket:              10
Socket(s):                       2
Stepping:                        4
CPU(s) scaling MHz:              90%
CPU max MHz:                     3000,0000
CPU min MHz:                     800,0000
BogoMIPS:                        4400,00
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d
Virtualization:                  VT-x
L1d cache:                       640 KiB (20 instances)
L1i cache:                       640 KiB (20 instances)
L2 cache:                        20 MiB (20 instances)
L3 cache:                        27,5 MiB (2 instances)
NUMA node(s):                    2
NUMA node0 CPU(s):               0-9,20-29
NUMA node1 CPU(s):               10-19,30-39
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Retbleed:          Mitigation; IBRS
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Mitigation; Clear CPU buffers; SMT vulnerable


Code:
proxmox-ve: 8.0.1 (running kernel: 6.2.16-4-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.3
pve-kernel-5.15: 7.4-4
pve-kernel-6.2.16-4-pve: 6.2.16-4
pve-kernel-5.15.108-1-pve: 5.15.108-1
pve-kernel-5.4.203-1-pve: 5.4.203-1
ceph: 17.2.6-pve1+3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: 0.8.41
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.3
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.6
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.4
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.1-1
proxmox-backup-file-restore: 3.0.1-1
proxmox-kernel-helper: 8.0.2
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.2
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.2
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-3
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
 
We also have the segfault issue with PVE8 on a clean install:

[104807.394218] pverados[1515350]: segfault at 555d4830b030 ip 0000555d4830b030 sp 00007ffd4a727e38 error 14 in perl[555d482df000+195000] likely on CPU 18 (core 0, socket 1)

PVE:
pve-manager/8.0.3/bbf3993334bfa916 (running kernel: 6.2.16-4-pve)

CPU:
Model name: Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz
 
Hi, I can confirm that we also have the same messages in the logs:
Code:
kernel: pverados[4133454]: segfault at 55c6095fa408 ip 000055c607463cef sp 00007ffe68d860d0 error 7 in perl[55c60737b000+195000] likely on CPU 40 (core 21, socket 0)
kernel: Code: 48 89 45 70 48 3b 45 78 0f 84 b6 02 00 00 4c 89 ea 48 2b 55 18 49 83 c5 08 4c 89 c6 48 c1 fa 03 48 89 ef 89 10 ba 2d 00 00 00 <4d> 89 75 00 4c 89 6d 00 e8 94 e0 f3 ff 48 8b 85 e0 00 00 00 48 8b

Here is my setup:
  • OVH template PVE 7 migrated to 8 => pve-manager/8.0.3/bbf3993334bfa916 (running kernel: 6.2.16-4-pve)
  • CPU:
Code:
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         43 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  48
  On-line CPU(s) list:   0-47
Vendor ID:               AuthenticAMD
  BIOS Vendor ID:        Advanced Micro Devices, Inc.
  Model name:            AMD EPYC 7402 24-Core Processor
    BIOS Model name:     AMD EPYC 7402 24-Core Processor                 Unknown CPU @ 2.8GHz
    BIOS CPU family:     107
    CPU family:          23
    Model:               49
    Thread(s) per core:  2
    Core(s) per socket:  24
    Socket(s):           1
    Stepping:            0
    Frequency boost:     enabled
    CPU(s) scaling MHz:  96%
    CPU max MHz:         2800.0000
    CPU min MHz:         1500.0000
    BogoMIPS:            5599.59
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fx
                         sr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
                          nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 ss
                         e4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_
                         legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_cor
                         e perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb s
                         tibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sh
                         a_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero
                         irperf xsaveerptr rdpru wbnoinvd amd_ppin arat npt lbrv svm_lock nrip_save tsc_scale v
                         mcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif
                         v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):     
  L1d:                   768 KiB (24 instances)
  L1i:                   768 KiB (24 instances)
  L2:                    12 MiB (24 instances)
  L3:                    128 MiB (8 instances)
NUMA:                   
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-47
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Mitigation; untrained return thunk; SMT enabled with STIBP protection
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS No
                         t affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
 
Hi,
we can reproduce the issue and are currently investigating. It does seem to be kernel-related and occurs with 6.2.16-4-pve. Fortunately, it seems to only occur in short-lived client workers when querying certain things via RADOS.
 
Hi,
Hi @fiona, thanks for your feedback. Do you need a test on pve-kernel-6.2.16-3-pve?
We did test that kernel with our reproducer (just a loop spawning the pverados worker and querying the storage) and never triggered the issue there. The reports also only showed up after pve-kernel-6.2.16-4-pve was out, and we do think it might be triggered by this commit. But we still need to find out why exactly, and whether there is an issue with that commit itself or whether it is just exposing an existing issue.
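If anyone wants to experiment along these lines, the relevant pattern is really just short-lived, forked workers that connect to the cluster, run one query and exit right away. A rough C sketch of that shape (purely illustrative; the actual pverados worker is Perl and queries the storage status, and none of the names below come from the real code):
Code:
/* Sketch of the short-lived worker pattern only, not the actual pverados code. */
#include <rados/librados.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void worker(void)
{
    rados_t cluster;

    if (rados_create(&cluster, NULL) == 0) {
        rados_conf_read_file(cluster, NULL);
        if (rados_connect(cluster) == 0) {
            /* the real worker would run its MON/storage query here */
            rados_shutdown(cluster);
        }
    }
    _exit(0);                  /* worker is intentionally short-lived */
}

int main(void)
{
    for (;;) {
        pid_t pid = fork();
        if (pid == 0)
            worker();          /* child: connect, query, exit */
        else if (pid > 0)
            waitpid(pid, NULL, 0);
    }
}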
 
Adding my 2 cents. Also seeing this on both an Intel physical host and a virtual Proxmox server on an AMD host.
 
Hello!

This may or may not be helpful, but just in case it is to someone:

We're just getting up to speed with Proxmox VE and have also seen Ceph-related problems with a three-node cluster running pve-kernel-6.2.16-4-pve, including segfaults of the form:
Code:
[81172.458890] pverados[705029]: segfault at 55e08c609030 ip 000055e08c609030 sp 00007ffe7880a4f8 error 14 in perl[55e08c5dd000+195000] likely on CPU 0 (core 0, socket 0)
[81172.458905] Code: Unable to access opcode bytes at 0x55e08c609006.

… we're also having problems spinning up a new VM using our separate pre-existing Ceph cluster (which has been working nicely!); the rbd -p POOLNAME -m mon0,mon1,mon2 … map vm-102-disk-2 incantation spawned by Proxmox is hanging for several tens of seconds before failing, while the kernel is reporting a serious issue with the retrieval/processing of the Ceph osdmap:
Code:
Jul 20 16:23:47 FQDN kernel: libceph: mon2 (1)10.64.97.3:6789 session established
Jul 20 16:23:47 FQDN kernel: libceph: no match of type 1 in addrvec
Jul 20 16:23:47 FQDN kernel: libceph: corrupt full osdmap (-2) epoch 56000 off 5854 (0000000074d29902 of 0000000012b77a71-00000000d6ab7fb1)
(lengthy memory dump)

However, this may not be related, as rolling back to 6.2.16-3-pve doesn't appear to have helped!

Best wishes,
David
 
… we're also having problems spinning up a new VM using our separate pre-existing Ceph cluster (which has been working nicely!); the rbd -p POOLNAME -m mon0,mon1,mon2 … map vm-102-disk-2 incantation spawned by Proxmox is hanging for several tens of seconds before failing, while the kernel is reporting a serious issue with the retrieval/processing of the Ceph osdmap:
Code:
Jul 20 16:23:47 FQDN kernel: libceph: mon2 (1)10.64.97.3:6789 session established
Jul 20 16:23:47 FQDN kernel: libceph: no match of type 1 in addrvec
Jul 20 16:23:47 FQDN kernel: libceph: corrupt full osdmap (-2) epoch 56000 off 5854 (0000000074d29902 of 0000000012b77a71-00000000d6ab7fb1)
(lengthy memory dump)
While I haven't identified the change that caused this to stop working, for the benefit of anyone searching for this error message and getting stuck, this error appears to be happening because rbd is using the Ceph msgr1 protocol while the cluster insists on using msgr2 in secure mode only.

A manual execution of the same rbd command with the option --options ms_mode=secure succeeds immediately.

… now to figure out how to cause Proxmox to always set this option or the moral equivalent.

Apologies for the detour on this thread!

Best wishes,
David
 
Just out of curiosity: does this only affect the no-subscription repo or also the pve-enterprise repo?
The weird thing is, we have two clusters of three nodes each; on one of them the problem already existed in the logs with the older (.3) kernel, while on the other cluster it only appeared after booting into the new .4 kernel.
 
Adding a "same here".

Code:
# pveversion
pve-manager/8.0.3/bbf3993334bfa916 (running kernel: 6.2.16-4-pve)

Using the "no-subscription" repositories, on a single-node "cluster" with 5 OSDs:

Code:
ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)
Code:
  cluster:
    id:     6bf44ce3-11e6-43d2-8eb0-4bd6a089690e
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum testhost01 (age 3d)
    mgr: testhost01(active, since 3d)
    mds: 1/1 daemons up
    osd: 5 osds: 5 up (since 3d), 5 in (since 13d)

  data:
    volumes: 1/1 healthy
    pools:   5 pools, 129 pgs
    objects: 155.65k objects, 594 GiB
    usage:   1.7 TiB used, 7.0 TiB / 8.7 TiB avail
    pgs:     129 active+clean

  io:
    client:   3.5 MiB/s rd, 33 MiB/s wr, 107 op/s rd, 132 op/s wr

dmesg is full of messages like these:
Code:
[309200.506079] pverados[2189824]: segfault at 55c45a27c030 ip 000055c45a27c030 sp 00007ffde15a27a8 error 14 likely on CPU 4 (core 4, socket 0)
[309200.506090] Code: Unable to access opcode bytes at 0x55c45a27c006.
[309620.497365] pverados[2195845]: segfault at 55c45b014388 ip 000055c45a338cef sp 00007ffde15a25f0 error 7 in perl[55c45a250000+195000] likely on CPU 0 (core 0, socket 0)
[309620.497379] Code: 48 89 45 70 48 3b 45 78 0f 84 b6 02 00 00 4c 89 ea 48 2b 55 18 49 83 c5 08 4c 89 c6 48 c1 fa 03 48 89 ef 89 10 ba 2d 00 00 00 <4d> 89 75 00 4c 89 6d 00 e8 94 e0 f3 ff 48 8b 85 e0 00 00 00 48 8b
[311720.802916] pverados[2235117]: segfault at 55c45a27c030 ip 000055c45a27c030 sp 00007ffde15a27a8 error 14 in perl[55c45a250000+195000] likely on CPU 11 (core 3, socket 0)
[311720.802931] Code: Unable to access opcode bytes at 0x55c45a27c006.
[311860.569958] pverados[2237733]: segfault at 55c45a27c030 ip 000055c45a27c030 sp 00007ffde15a27a8 error 14 in perl[55c45a250000+195000] likely on CPU 11 (core 3, socket 0)
 
We have analyzed the issue in more depth now, and fortunately it is only cosmetic. I have reported it upstream too. The segfault can only happen while the worker process is already being terminated.

For those interested in the details, do_user_addr_fault uses:
Code:
    vma = lock_mm_and_find_vma(mm, address, regs);
    if (unlikely(!vma)) {
        bad_area_nosemaphore(regs, error_code, address);
        return;
    }
You can think of bad_area_nosemaphore as "being" the segfault. Before the commit mentioned earlier, this path was only taken when the vma couldn't be found; with the change, it is also taken when acquiring the lock is aborted because a fatal signal is pending (i.e. when the process is already being killed). In that case it's arguably not even a real segfault, but that's where the log message comes from.
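To make that a bit more concrete, here is a simplified paraphrase (not the exact upstream source) of what lock_mm_and_find_vma does before it even looks up the vma:
Code:
/* Simplified paraphrase of the 6.2.16-4 code path, not the exact source. */
static bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
{
    if (mmap_read_trylock(mm))
        return true;
    /* (kernel-mode exception-table check elided) */

    /* mmap_read_lock_killable() returns an error, i.e. this returns false,
     * if a fatal signal arrives while waiting for the lock. */
    return !mmap_read_lock_killable(mm);
}

struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
                                            unsigned long addr,
                                            struct pt_regs *regs)
{
    if (!get_mmap_lock_carefully(mm, regs))
        return NULL;    /* do_user_addr_fault() treats this like "no vma" */

    return find_vma(mm, addr);    /* stack-expansion handling elided */
}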
 
