Linux Kernel 5.4 for Proxmox VE

MATPOC

After the last upgrade to kernel 5.4 we found a strange bug affecting the last core of Xeon 5xxx family processors. The kernel seems to mark the last core of the last processor as dead:

smpboot: CPU 6 Converting physical 0 to logical die 1

We are using a Supermicro X8DTU motherboard; the BIOS and Intel microcode were upgraded to the latest versions from the Supermicro site.

We tried swapping processors, changing BIOS settings, and booting other Linux distros and kernels. The last Linux kernel version that worked without this bug is 4.9.

We have another server with a Xeon 7xxx family processor; kernel 5.4 works on it without the last-core bug.

# pveversion
pve-manager/6.2-14/2001b502 (running kernel: 5.4.65-1-pve)

Here are dmesg excerpts about `CPU`:

Code:
# dmesg | grep -i cpu
[    0.000000] KERNEL supported cpus:
[    0.000000] smpboot: Allowing 24 CPUs, 12 hotplug CPUs
[    0.000000] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:24 nr_cpu_ids:24 nr_node_ids:2
[    0.000000] percpu: Embedded 55 pages/cpu s188416 r8192 d28672 u262144
[    0.000000] pcpu-alloc: s188416 r8192 d28672 u262144 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 00 01 02 03 04 05 12 14 [0] 16 18 20 22 -- -- -- --
[    0.000000] pcpu-alloc: [1] 06 07 08 09 10 11 13 15 [1] 17 19 21 23 -- -- -- --
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=24, Nodes=2
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=24.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=24
[    0.021636] mce: CPU0: Thermal monitoring enabled (TM1)
[    0.137350] smpboot: CPU0: Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz (family: 0x6, model: 0x2c, stepping: 0x2)
[    0.137517] core: CPUID marked event: 'bus cycles' unavailable
[    0.139871] smp: Bringing up secondary CPUs ...
[    0.139997] .... node  #0, CPUs:        #1  #2  #3  #4  #5
[    0.152203] .... node  #1, CPUs:    #6
[    0.000000] smpboot: CPU 6 Converting physical 0 to logical die 1
[    0.250235] smp: Brought up 2 nodes, 12 CPUs
[    0.256705] cpuidle: using governor ladder
[    0.256705] cpuidle: using governor menu
[    1.015713] intel_pstate: CPU model not supported
[    1.016348] ledtrig-cpu: registered to indicate activity on CPUs
 

fabian

That's not what that message means. This is "die" as in "CPU die", not as in "death" ;) Can you post the contents of /proc/cpuinfo?
 

MATPOC


Code:
# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.435
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.609
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 1
cpu cores       : 6
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.583
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 2
cpu cores       : 6
apicid          : 4
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.580
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 8
cpu cores       : 6
apicid          : 16
initial apicid  : 16
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 4
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.540
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 9
cpu cores       : 6
apicid          : 18
initial apicid  : 18
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 5
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2933.543
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 10
cpu cores       : 6
apicid          : 20
initial apicid  : 20
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5866.87
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 6
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 2023.676
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 0
cpu cores       : 6
apicid          : 32
initial apicid  : 32
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1844.074
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 1
cpu cores       : 6
apicid          : 34
initial apicid  : 34
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 8
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1982.629
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 2
cpu cores       : 6
apicid          : 36
initial apicid  : 36
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 9
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1837.433
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 8
cpu cores       : 6
apicid          : 48
initial apicid  : 48
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 10
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1910.050
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 9
cpu cores       : 6
apicid          : 50
initial apicid  : 50
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:


processor       : 11
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1777.030
cache size      : 12288 KB
physical id     : 1
siblings        : 6
core id         : 10
cpu cores       : 6
apicid          : 52
initial apicid  : 52
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 5855.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 

fabian

So cpuinfo shows 2x6 cores, which looks okay to me?
 

fabian

Why does that make you nervous? Oo You have 2 sockets with 6 cores each, so starting with the 7th core the kernel tells you that they are not on the same die.
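For anyone who wants to double-check the "2 sockets with 6 cores each" reading, here is a minimal Python sketch that groups the `core id` values in cpuinfo-style text by `physical id`. The function name `cores_per_package` and the trimmed `SAMPLE` text are made up for illustration; the sample only carries the values of processors 0, 5, 6 and 11 from the dump above.

```python
import re
from collections import defaultdict


def cores_per_package(cpuinfo_text):
    """Map each 'physical id' (socket) to the sorted 'core id' values listed under it."""
    packages = defaultdict(set)
    # Entries in cpuinfo-style text are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", cpuinfo_text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "physical id" in fields and "core id" in fields:
            packages[int(fields["physical id"])].add(int(fields["core id"]))
    return {pkg: sorted(cores) for pkg, cores in packages.items()}


# Trimmed, hypothetical sample: only processors 0, 5, 6 and 11 from the dump above.
SAMPLE = """\
processor       : 0
physical id     : 0
core id         : 0

processor       : 5
physical id     : 0
core id         : 10

processor       : 6
physical id     : 1
core id         : 0

processor       : 11
physical id     : 1
core id         : 10
"""

print(cores_per_package(SAMPLE))
```

On a live system the same function could be fed `open("/proc/cpuinfo").read()`. With the full dump posted above it should list core ids 0, 1, 2, 8, 9 and 10 under each of the two physical ids, i.e. six cores per socket and none missing.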
 

masterevil

Hello! After upgrading to the latest pve-kernel version (5.4.101-1-pve) I got duplicated disk drives.

Code:
sdao    LVM2_member       3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2
sdj     LVM2_member       3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2
sdag    LVM2_member       74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF
sdb     LVM2_member       74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF
sdau    LVM2_member       9PJB7p-2kjn-AWax-hYJI-fUMW-lZil-8BdeOt
sdp     LVM2_member       9PJB7p-2kjn-AWax-hYJI-fUMW-lZil-8BdeOt
sdai    LVM2_member       DswoeH-LFyJ-Kdbo-BXXX-MFFC-YxqC-4F3cQe
sdd     LVM2_member       DswoeH-LFyJ-Kdbo-BXXX-MFFC-YxqC-4F3cQe

They are not cloned drives; each one is the same drive showing up under two device names in /dev. If I reboot into the previously installed kernel version (5.4.44-1-pve), everything works as expected.

How can I fix it?

pve-manager/6.3-4/0a38c56f (running kernel: 5.4.44-1-pve)
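To spot such duplicates without eyeballing the list, a small Python sketch like the one below may help. It assumes a three-column listing (device name, type, UUID) like the one above; the command that produced that listing is not shown in the post, and the function name `duplicate_uuids` is made up for illustration.

```python
from collections import defaultdict


def duplicate_uuids(listing_text):
    """Return {uuid: [device names]} for every UUID that appears under more than one device."""
    by_uuid = defaultdict(list)
    for line in listing_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and anything not shaped like "name type uuid"
        name, _fstype, uuid = parts
        by_uuid[uuid].append(name)
    return {uuid: sorted(names) for uuid, names in by_uuid.items() if len(names) > 1}


# First four rows of the listing above.
SAMPLE = """\
sdao    LVM2_member       3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2
sdj     LVM2_member       3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2
sdag    LVM2_member       74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF
sdb     LVM2_member       74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF
"""

for uuid, names in duplicate_uuids(SAMPLE).items():
    print(uuid, "->", ", ".join(names))
```

An empty result would mean every UUID is visible under exactly one device name, which is what the older kernel reportedly shows.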
 

t.lamprecht

Hi!

pve-manager/6.3-4/0a38c56f (running kernel: 5.4.44-1-pve)
You're currently still running the old 5.4.44-1-pve kernel, not the new one - did you forget to reboot or did you reboot into the old one again on purpose?

Also, can you post the output of:
Bash:
ls -la /dev/sd*
 

masterevil

You're currently still running the old 5.4.44-1-pve kernel, not the new one - did you forget to reboot or did you reboot into the old one again on purpose?
I did mention rebooting into the old kernel version:
If I reboot into previous installed kernel version(5.4.44-1-pve) all works as expected.
And in reply to your question:
Also, can you post the output of:
Bash:
ls -la /dev/sd*
Code:
# ls -la /dev/sd*
brw-rw---- 1 root disk  8,   0 Mar  4 02:39 /dev/sda
brw-rw---- 1 root disk 65, 160 Mar  4 02:39 /dev/sdaa
brw-rw---- 1 root disk 65, 176 Mar  4 02:39 /dev/sdab
brw-rw---- 1 root disk 65, 192 Mar  4 02:39 /dev/sdac
brw-rw---- 1 root disk 65, 208 Mar  4 02:39 /dev/sdad
brw-rw---- 1 root disk 65, 224 Mar  4 02:39 /dev/sdae
brw-rw---- 1 root disk 65, 240 Mar  4 02:39 /dev/sdaf
brw-rw---- 1 root disk 66,   0 Mar  4 02:39 /dev/sdag
brw-rw---- 1 root disk 66,  16 Mar  4 02:39 /dev/sdah
brw-rw---- 1 root disk 66,  32 Mar  4 02:39 /dev/sdai
brw-rw---- 1 root disk 66,  48 Mar  4 02:39 /dev/sdaj
brw-rw---- 1 root disk 66,  64 Mar  4 02:39 /dev/sdak
brw-rw---- 1 root disk 66,  80 Mar  4 02:39 /dev/sdal
brw-rw---- 1 root disk 66,  96 Mar  4 02:39 /dev/sdam
brw-rw---- 1 root disk 66, 112 Mar  4 02:39 /dev/sdan
brw-rw---- 1 root disk 66, 128 Mar  4 02:39 /dev/sdao
brw-rw---- 1 root disk 66, 144 Mar  4 02:39 /dev/sdap
brw-rw---- 1 root disk 66, 160 Mar  4 02:39 /dev/sdaq
brw-rw---- 1 root disk 66, 176 Mar  4 02:39 /dev/sdar
brw-rw---- 1 root disk 66, 192 Mar  4 02:39 /dev/sdas
brw-rw---- 1 root disk 66, 208 Mar  4 02:39 /dev/sdat
brw-rw---- 1 root disk 66, 224 Mar  4 02:39 /dev/sdau
brw-rw---- 1 root disk 66, 240 Mar  4 02:39 /dev/sdav
brw-rw---- 1 root disk 67,   0 Mar  4 02:39 /dev/sdaw
brw-rw---- 1 root disk 67,   1 Mar  4 02:39 /dev/sdaw1
brw-rw---- 1 root disk 67,   2 Mar  4 02:39 /dev/sdaw2
brw-rw---- 1 root disk 67,   3 Mar  4 02:39 /dev/sdaw3
brw-rw---- 1 root disk 67,  16 Mar  4 02:39 /dev/sdax
brw-rw---- 1 root disk 67,  17 Mar  4 02:39 /dev/sdax1
brw-rw---- 1 root disk 67,  25 Mar  4 02:39 /dev/sdax9
brw-rw---- 1 root disk  8,  16 Mar  4 02:39 /dev/sdb
brw-rw---- 1 root disk  8,  32 Mar  4 02:39 /dev/sdc
brw-rw---- 1 root disk  8,  48 Mar  4 02:39 /dev/sdd
brw-rw---- 1 root disk  8,  64 Mar  4 02:39 /dev/sde
brw-rw---- 1 root disk  8,  80 Mar  4 02:39 /dev/sdf
brw-rw---- 1 root disk  8,  96 Mar  4 02:39 /dev/sdg
brw-rw---- 1 root disk  8, 112 Mar  4 02:39 /dev/sdh
brw-rw---- 1 root disk  8, 128 Mar  4 02:39 /dev/sdi
brw-rw---- 1 root disk  8, 144 Mar  4 02:39 /dev/sdj
brw-rw---- 1 root disk  8, 160 Mar  4 02:39 /dev/sdk
brw-rw---- 1 root disk  8, 176 Mar  4 02:39 /dev/sdl
brw-rw---- 1 root disk  8, 192 Mar  4 02:39 /dev/sdm
brw-rw---- 1 root disk  8, 208 Mar  4 02:39 /dev/sdn
brw-rw---- 1 root disk  8, 224 Mar  4 02:39 /dev/sdo
brw-rw---- 1 root disk  8, 240 Mar  4 02:39 /dev/sdp
brw-rw---- 1 root disk 65,   0 Mar  4 02:39 /dev/sdq
brw-rw---- 1 root disk 65,  16 Mar  4 02:39 /dev/sdr
brw-rw---- 1 root disk 65,  32 Mar  4 02:39 /dev/sds
brw-rw---- 1 root disk 65,  48 Mar  4 02:39 /dev/sdt
brw-rw---- 1 root disk 65,  64 Mar  4 02:39 /dev/sdu
brw-rw---- 1 root disk 65,  80 Mar  4 02:39 /dev/sdv
brw-rw---- 1 root disk 65,  96 Mar  4 02:39 /dev/sdw
brw-rw---- 1 root disk 65, 112 Mar  4 02:39 /dev/sdx
brw-rw---- 1 root disk 65, 128 Mar  4 02:39 /dev/sdy
brw-rw---- 1 root disk 65, 144 Mar  4 02:39 /dev/sdz
And here is the LVM output:
Code:
# pvscan
  WARNING: Not using device /dev/sdb for PV 74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF.
  WARNING: Not using device /dev/sdc for PV R4lVwM-Py0a-gUdd-N7VH-YZqI-uggl-a3GaMU.
  WARNING: Not using device /dev/sdd for PV DswoeH-LFyJ-Kdbo-BXXX-MFFC-YxqC-4F3cQe.
  WARNING: Not using device /dev/sde for PV qfXtiR-FbHY-4KCq-DmAA-Ddy0-wfra-nOyP2U.
  WARNING: Not using device /dev/sdf for PV GD2pex-3Fam-sjZM-BFIr-O0Xw-59FB-kUHeFY.
  WARNING: Not using device /dev/sdg for PV zhceIq-p8Qh-L0hP-al0H-rJ3c-X63A-VUzpEl.
  WARNING: Not using device /dev/sdh for PV aH9tII-SVVK-LYeS-xkc3-pOT1-3K2F-vyPSyr.
  WARNING: Not using device /dev/sdi for PV wwndYy-AZxn-4xK2-QYer-h03q-SdVt-zoY0uE.
  WARNING: Not using device /dev/sdy for PV Px47wf-lv9T-aD0w-6eTE-5Oqr-hI2H-fcJCXt.
  WARNING: Not using device /dev/sdj for PV 3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2.
  WARNING: Not using device /dev/sdz for PV dVw5bJ-SyJP-dmiu-EkdI-OJ3z-9346-4JQmZh.
  WARNING: Not using device /dev/sdk for PV Od8mTF-f6N0-dvNi-BTIu-KhwD-IZ7Y-WOhvdR.
  WARNING: Not using device /dev/sdaa for PV EKZQ6p-qr8s-xMd8-xjrx-W33J-zDv1-UCC6UG.
  WARNING: Not using device /dev/sdl for PV bfXcxr-TY52-Vd1A-nhup-AEq9-0asT-61qV4f.
  WARNING: Not using device /dev/sdab for PV WdDF6d-ZQf5-6WYw-Swb2-Pxxt-NB73-9TeCs5.
  WARNING: Not using device /dev/sdm for PV JLVdTL-YnMG-uBsB-R1JR-GbW9-wExg-7T7VtC.
  WARNING: Not using device /dev/sdac for PV GRJDSK-XAml-JnWV-jtwp-QzcR-SUIt-VD7R18.
  WARNING: Not using device /dev/sdn for PV fI7n6X-LT2k-NAux-H9vq-FA3I-a1AJ-XE6SkE.
  WARNING: Not using device /dev/sdad for PV dJhl7B-2EYs-2xk4-ClzU-6qQe-FLmR-6n0Rjs.
  WARNING: Not using device /dev/sdo for PV ejvITu-Haru-KPU7-lZqU-kUGC-rVJo-mWcJDR.
  WARNING: Not using device /dev/sdae for PV JrCKzM-ofeO-DN6m-VlnW-kUlW-YMc4-XGk6AO.
  WARNING: Not using device /dev/sdp for PV 9PJB7p-2kjn-AWax-hYJI-fUMW-lZil-8BdeOt.
  WARNING: Not using device /dev/sdaf for PV KE5c5d-kkMt-2Ky0-3lkZ-E9IO-jPyo-h7XHCY.
  WARNING: Not using device /dev/sdav for PV qf5kq7-Lba7-Xfbc-LVFQ-S9pS-dVYT-Kd4keV.
  WARNING: PV 74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF prefers device /dev/sdag because device was seen first.
  WARNING: PV R4lVwM-Py0a-gUdd-N7VH-YZqI-uggl-a3GaMU prefers device /dev/sdah because device was seen first.
  WARNING: PV DswoeH-LFyJ-Kdbo-BXXX-MFFC-YxqC-4F3cQe prefers device /dev/sdai because device was seen first.
  WARNING: PV qfXtiR-FbHY-4KCq-DmAA-Ddy0-wfra-nOyP2U prefers device /dev/sdaj because device was seen first.
  WARNING: PV GD2pex-3Fam-sjZM-BFIr-O0Xw-59FB-kUHeFY prefers device /dev/sdak because device was seen first.
  WARNING: PV zhceIq-p8Qh-L0hP-al0H-rJ3c-X63A-VUzpEl prefers device /dev/sdal because device was seen first.
  WARNING: PV aH9tII-SVVK-LYeS-xkc3-pOT1-3K2F-vyPSyr prefers device /dev/sdam because device was seen first.
  WARNING: PV wwndYy-AZxn-4xK2-QYer-h03q-SdVt-zoY0uE prefers device /dev/sdan because device was seen first.
  WARNING: PV Px47wf-lv9T-aD0w-6eTE-5Oqr-hI2H-fcJCXt prefers device /dev/sdr because device was seen first.
  WARNING: PV 3E9Xch-Citw-KEaP-lMVl-CSoD-mPQY-BzXGb2 prefers device /dev/sdao because device was seen first.
  WARNING: PV dVw5bJ-SyJP-dmiu-EkdI-OJ3z-9346-4JQmZh prefers device /dev/sds because device was seen first.
  WARNING: PV Od8mTF-f6N0-dvNi-BTIu-KhwD-IZ7Y-WOhvdR prefers device /dev/sdap because device was seen first.
  WARNING: PV EKZQ6p-qr8s-xMd8-xjrx-W33J-zDv1-UCC6UG prefers device /dev/sdt because device was seen first.
  WARNING: PV bfXcxr-TY52-Vd1A-nhup-AEq9-0asT-61qV4f prefers device /dev/sdaq because device was seen first.
  WARNING: PV WdDF6d-ZQf5-6WYw-Swb2-Pxxt-NB73-9TeCs5 prefers device /dev/sdu because device was seen first.
  WARNING: PV JLVdTL-YnMG-uBsB-R1JR-GbW9-wExg-7T7VtC prefers device /dev/sdar because device was seen first.
  WARNING: PV GRJDSK-XAml-JnWV-jtwp-QzcR-SUIt-VD7R18 prefers device /dev/sdv because device was seen first.
  WARNING: PV fI7n6X-LT2k-NAux-H9vq-FA3I-a1AJ-XE6SkE prefers device /dev/sdas because device was seen first.
  WARNING: PV dJhl7B-2EYs-2xk4-ClzU-6qQe-FLmR-6n0Rjs prefers device /dev/sdw because device was seen first.
  WARNING: PV ejvITu-Haru-KPU7-lZqU-kUGC-rVJo-mWcJDR prefers device /dev/sdat because device was seen first.
  WARNING: PV JrCKzM-ofeO-DN6m-VlnW-kUlW-YMc4-XGk6AO prefers device /dev/sdx because device was seen first.
  WARNING: PV 9PJB7p-2kjn-AWax-hYJI-fUMW-lZil-8BdeOt prefers device /dev/sdau because device was seen first.
  WARNING: PV KE5c5d-kkMt-2Ky0-3lkZ-E9IO-jPyo-h7XHCY prefers device /dev/sda because device was seen first.
  WARNING: PV qf5kq7-Lba7-Xfbc-LVFQ-S9pS-dVYT-Kd4keV prefers device /dev/sdq because device was seen first.
  PV /dev/sdau   VG ceph-3ebf060c-cba0-4bb9-9a5c-3e4dd77c8dac   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdat   VG ceph-7d2437ed-cde7-45fc-852f-47e98d20e313   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdas   VG ceph-d2be24d1-9531-47d3-9eab-2aa2278c026d   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdar   VG ceph-d8869478-7709-4668-9a6e-76ff85b50f45   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdaq   VG ceph-9920fb9d-68de-4a95-ae08-10e520741f33   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdap   VG ceph-fbdfbf81-65ad-4d02-a233-5f313fef740e   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdao   VG ceph-7c31e499-b1e2-44bd-92b7-e9f5c3f4bdb1   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdan   VG ceph-2a1dc75d-bdd7-4edd-b979-d7a34c234244   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdx    VG ceph-2f0b258e-a163-41d8-bac0-6b2fb7e17ca8   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdam   VG ceph-ee964530-2a1a-402b-8da7-f86395bbec93   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdw    VG ceph-e814dbbd-b551-43dc-8634-6ac2d0ad0682   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdal   VG ceph-f117500d-9e72-4f18-8b2a-2362366eaf05   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdv    VG ceph-05775cd8-481d-43f3-ba5c-39ac1cffdf5d   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdak   VG ceph-3b387a53-f0ac-42eb-8da9-5514c8cecebb   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdu    VG ceph-25b7e1e1-41d8-4155-a405-4f8306ddf590   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdaj   VG ceph-06d00174-c191-4ca1-a69b-2c2c5999e122   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdt    VG ceph-1e8daa81-4dc8-4013-a4b7-797eb2f02e42   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdai   VG ceph-18ef4961-3cd1-41c3-8d9d-6ef7d44cd64b   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sds    VG ceph-516977f5-e5f9-4985-b668-d9c37d77b1a7   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdah   VG ceph-7023d13a-3cb8-491b-bceb-593b9af15acc   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdr    VG ceph-e4c9dc20-6e09-4b06-8e62-661ea5cad176   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdag   VG ceph-4c3ff6e0-cc44-4c70-b83f-5717d8f71032   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sdq    VG ceph-8fba5402-a68d-42e5-b2fa-4a7ddca8d5e9   lvm2 [<7.28 TiB / 0    free]
  PV /dev/sda    VG ceph-917affd0-e62f-4a11-8d90-848ad7ac01fb   lvm2 [<7.28 TiB / 0    free]
  Total: 24 [<174.66 TiB] / in use: 24 [<174.66 TiB] / in no VG: 0 [0   ]
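The warnings above come in pairs: one "Not using device" line and one "prefers device" line per PV UUID. As a rough sketch, the Python below matches them up so each PV shows the path LVM picked next to the path it skipped. The regular expressions are written against the exact wording of the warnings in this output and may need adjusting for other LVM versions; the function name `pair_duplicate_paths` is made up for illustration.

```python
import re

# Patterns follow the warning wording in the pvscan output above (an assumption for other LVM versions).
NOT_USING = re.compile(r"WARNING: Not using device (\S+) for PV (\S+?)\.$")
PREFERS = re.compile(r"WARNING: PV (\S+) prefers device (\S+) because")


def pair_duplicate_paths(pvscan_text):
    """Return {pv_uuid: (preferred_device, skipped_device)} built from pvscan warnings."""
    skipped, preferred = {}, {}
    for line in pvscan_text.splitlines():
        line = line.strip()
        match = NOT_USING.search(line)
        if match:
            # Assumes at most one skipped path per PV, as in the output above.
            skipped[match.group(2)] = match.group(1)
            continue
        match = PREFERS.search(line)
        if match:
            preferred[match.group(1)] = match.group(2)
    return {uuid: (preferred[uuid], skipped[uuid]) for uuid in skipped if uuid in preferred}


# Four warning lines copied from the output above.
SAMPLE = """\
  WARNING: Not using device /dev/sdb for PV 74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF.
  WARNING: Not using device /dev/sdy for PV Px47wf-lv9T-aD0w-6eTE-5Oqr-hI2H-fcJCXt.
  WARNING: PV 74tj43-uEdO-oG4C-EaAi-Urbc-Zf1Q-TJKdPF prefers device /dev/sdag because device was seen first.
  WARNING: PV Px47wf-lv9T-aD0w-6eTE-5Oqr-hI2H-fcJCXt prefers device /dev/sdr because device was seen first.
"""

for uuid, (used, ignored) in pair_duplicate_paths(SAMPLE).items():
    print(f"{uuid}: using {used}, ignoring {ignored}")
```

Feeding it the complete output should give 24 pairs, one per PV, matching the "Total: 24" line.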
Hardware used:
Code:
# lspci | grep SAS
3b:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3616 Fusion-MPT Tri-Mode I/O Controller Chip (IOC) (rev 02)
# dmesg | grep SAS3616
[    3.500325] mpt3sas_cm0: SAS3616: FWVersion(07.00.01.00), ChipRevision(0x02), BiosVersion(09.13.00.00)
Code:
# pveversion
pve-manager/6.3-4/0a38c56f (running kernel: 5.4.101-1-pve)
I checked previous kernel versions and found that the latest working version for me is pve-kernel-5.4.78-2-pve.
 

t.lamprecht

I wrote about reboot into old kernel version:
Ah sorry, seems I missed that part.
# dmesg | grep SAS3616
[    3.500325] mpt3sas_cm0: SAS3616: FWVersion(07.00.01.00), ChipRevision(0x02), BiosVersion(09.13.00.00)
And what does
Bash:
dmesg | grep mpt3sas
output?

I checked previous kernel versions and found that the latest working version for me is pve-kernel-5.4.78-2-pve.
Thanks, that helps to pin it down to a bunch of patches on the module in use, but it's not exactly clear which specific one causes the trouble, nor is there a clear upstream follow-up fix that is missing in our kernel tree... Besides two harmless-looking timeout series, these 14 patches are suspicious:
https://git.proxmox.com/?p=mirror_u...og;h=a2f1c11be6c5c6fa0fc67b409973af13435bf564

Did you upgrade the firmware of that controller to the latest release already?
 

masterevil

And what does
Bash:
dmesg | grep mpt3sas
output?
Code:
# dmesg | grep mpt3sas
[    3.205618] mpt3sas version 35.101.00.00 loaded
[    3.221767] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (131669024 kB)
[    3.288614] mpt3sas_cm0: MSI-X vectors supported: 128
[    3.288616] mpt3sas_cm0:  0 32
[    3.289216] mpt3sas_cm0: High IOPs queues : disabled
[    3.289217] mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 61
[    3.289217] mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 62
[    3.289218] mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 63
[    3.289219] mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 64
[    3.289219] mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 65
[    3.289220] mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 66
[    3.289220] mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 67
[    3.289221] mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 68
[    3.289221] mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 69
[    3.289222] mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 70
[    3.289223] mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 71
[    3.289223] mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 72
[    3.289224] mpt3sas0-msix12: PCI-MSI-X enabled: IRQ 73
[    3.289225] mpt3sas0-msix13: PCI-MSI-X enabled: IRQ 74
[    3.289225] mpt3sas0-msix14: PCI-MSI-X enabled: IRQ 75
[    3.289226] mpt3sas0-msix15: PCI-MSI-X enabled: IRQ 76
[    3.289227] mpt3sas0-msix16: PCI-MSI-X enabled: IRQ 77
[    3.289227] mpt3sas0-msix17: PCI-MSI-X enabled: IRQ 78
[    3.289228] mpt3sas0-msix18: PCI-MSI-X enabled: IRQ 79
[    3.289228] mpt3sas0-msix19: PCI-MSI-X enabled: IRQ 80
[    3.289229] mpt3sas0-msix20: PCI-MSI-X enabled: IRQ 81
[    3.289230] mpt3sas0-msix21: PCI-MSI-X enabled: IRQ 82
[    3.289230] mpt3sas0-msix22: PCI-MSI-X enabled: IRQ 83
[    3.289231] mpt3sas0-msix23: PCI-MSI-X enabled: IRQ 84
[    3.289231] mpt3sas0-msix24: PCI-MSI-X enabled: IRQ 85
[    3.289232] mpt3sas0-msix25: PCI-MSI-X enabled: IRQ 86
[    3.289232] mpt3sas0-msix26: PCI-MSI-X enabled: IRQ 87
[    3.289233] mpt3sas0-msix27: PCI-MSI-X enabled: IRQ 88
[    3.289233] mpt3sas0-msix28: PCI-MSI-X enabled: IRQ 89
[    3.289234] mpt3sas0-msix29: PCI-MSI-X enabled: IRQ 90
[    3.289235] mpt3sas0-msix30: PCI-MSI-X enabled: IRQ 91
[    3.289235] mpt3sas0-msix31: PCI-MSI-X enabled: IRQ 92
[    3.289237] mpt3sas_cm0: iomem(0x000038bffff00000), mapped(0x00000000161459f2), size(1048576)
[    3.289238] mpt3sas_cm0: ioport(0x0000000000007000), size(256)
[    3.357060] mpt3sas_cm0: sending message unit reset !!
[    3.358638] mpt3sas_cm0: message unit reset: SUCCESS
[    3.384746] mpt3sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(7), sge_per_io(128), chains_per_io(19)
[    3.386286] mpt3sas_cm0: request pool(0x00000000a36c07f6) - dma(0x1030100000): depth(7788), frame_size(128), pool_size(973 kB)
[    3.588451] mpt3sas_cm0: sense pool(0x00000000f0eef94d)- dma(0x102dc00000): depth(7567),element_size(96), pool_size(709 kB)
[    3.588783] mpt3sas_cm0: config page(0x00000000d5d1a8a5) - dma(0x102e355000): size(512)
[    3.588783] mpt3sas_cm0: Allocated physical memory: size(37821 kB)
[    3.588784] mpt3sas_cm0: Current Controller Queue Depth(7564),Max Controller Queue Depth(7680)
[    3.588785] mpt3sas_cm0: Scatter Gather Elements per IO(128)
[    3.634456] mpt3sas_cm0: _base_display_fwpkg_version: complete
[    3.634755] mpt3sas_cm0: SAS3616: FWVersion(14.00.00.00), ChipRevision(0x02), BiosVersion(09.13.00.00)
[    3.634756] mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ)
[    3.637219] mpt3sas_cm0: sending port enable !!
[    3.637586] mpt3sas_cm0: hba_port entry: 0000000066a54336, port: 0 is added to hba_port list
[    3.638504] mpt3sas_cm0: hba_port entry: 000000006dd42b92, port: 8 is added to hba_port list
[    3.639865] mpt3sas_cm0: host_add: handle(0x0001), sas_addr(0x5003048002fea300), phys(21)
[    3.641962] mpt3sas_cm0: expander_add: handle(0x0017), parent(0x0001), sas_addr(0x5003048017d1b07f), phys(43)
[    3.650512] mpt3sas_cm0: expander_add: handle(0x0031), parent(0x0017), sas_addr(0x5003048017d1b0ff), phys(43)
[    3.658059] mpt3sas_cm0: expander_add: handle(0x0018), parent(0x0009), sas_addr(0x5003048017d1b0ff), phys(43)
[    3.665726] mpt3sas_cm0: expander_add: handle(0x0032), parent(0x0018), sas_addr(0x5003048017d1b07f), phys(43)
[    3.675401] mpt3sas_cm0: port enable: SUCCESS
I upgraded the SAS controller firmware, but nothing changed:
Code:
# dmesg | grep SAS3616                                                                                                                                       
[    3.634755] mpt3sas_cm0: SAS3616: FWVersion(14.00.00.00), ChipRevision(0x02), BiosVersion(09.13.00.00)
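One thing that stands out in the `expander_add` lines of the dmesg output above is that each expander SAS address shows up under two different handles. Whether that is related to the duplicated disks is only an assumption, but it is easy to check on other kernels. A minimal sketch follows; `dup_expander_addrs` is a hypothetical helper name, and it only relies on the `expander_add: handle(...), parent(...), sas_addr(...)` line format shown above:

```shell
#!/usr/bin/env bash
# Sketch: list SAS addresses that mpt3sas registered under more than one
# expander handle. Reads dmesg-style text on stdin.
# dup_expander_addrs is a hypothetical helper, not part of any existing tool.
dup_expander_addrs() {
    # Reduce every expander_add line to "<sas_addr> <handle>".
    sed -n -E 's/.*expander_add: handle\((0x[0-9a-fA-F]+)\), parent\(0x[0-9a-fA-F]+\), sas_addr\((0x[0-9a-fA-F]+)\).*/\2 \1/p' |
    awk '
        # Count how often each address appears and remember its handles.
        { count[$1]++; handles[$1] = handles[$1] " " $2 }
        END {
            for (addr in count)
                if (count[addr] > 1)
                    print addr ":" handles[addr]
        }
    ' | sort
}
```

Usage would be `dmesg | dup_expander_addrs`. For the four `expander_add` lines above it reports `0x5003048017d1b07f` with handles `0x0017 0x0032` and `0x5003048017d1b0ff` with handles `0x0031 0x0018`. Running the same check on a kernel where the disks show up correctly would indicate whether the double registration is specific to the affected kernels.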
 

masterevil

New Member
Mar 3, 2021
4
0
1
Today I checked the newest live Xubuntu 21.10 with kernel 5.13.0-19 and found the same problem with duplicated disks. After that I booted Debian live with kernel 5.10.0-9, and all disks work as expected.
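To compare kernels without relying on screenshots, the duplicated disks can be listed by grouping block devices on their WWN. This is only a sketch under the assumption that the duplicates are the same physical disk exposed under two device names; `dup_wwns` is a hypothetical helper name, and the input is expected to be `NAME WWN` pairs as printed by `lsblk -dn -o NAME,WWN`:

```shell
#!/usr/bin/env bash
# Sketch: print every WWN that belongs to more than one block device.
# Reads "NAME WWN" pairs on stdin; lines without a WWN are skipped.
# dup_wwns is a hypothetical helper, not part of any existing tool.
dup_wwns() {
    awk '
        # Only consider lines that actually carry a WWN in column 2.
        NF >= 2 { count[$2]++; names[$2] = names[$2] " " $1 }
        END {
            for (wwn in count)
                if (count[wwn] > 1)
                    print wwn ":" names[wwn]
        }
    ' | sort
}
```

Usage would be `lsblk -dn -o NAME,WWN | dup_wwns`, run once on each kernel. Empty output means every WWN maps to a single device name; any line printed names the devices that share one WWN.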
 

Attachments

  • Screenshot_20211018_152625.png (298.6 KB)
  • Screenshot_20211018_152755.png (310.3 KB)
  • Screenshot_20211018_182036.png (570 KB)
  • Screenshot_20211018_182115.png (750.1 KB)
  • Screenshot_20211018_182205.png (828.6 KB)
