Hi all,
I upgraded several servers with different configurations from Proxmox VE 7 to the latest Proxmox VE 8.1.3 and now see an abnormal load on them with nothing running:
Server1 before and after upgrade:
The load is constant and never drops, which looks like a bug.
kernel: Linux pve01 6.5.11-7-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) x86_64 GNU/Linux
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.5.11-7-pve)
I tried to debug it with sysstat:
CPU utilization
sar -u 1 5
Code:
root@pve01:~# sar -u 1 5
Linux 6.5.11-7-pve (pve01) 01/03/2024 _x86_64_ (48 CPU)
04:38:09 PM CPU %user %nice %system %iowait %steal %idle
04:38:10 PM all 0.00 0.00 0.00 0.00 0.00 100.00
04:38:11 PM all 0.00 0.00 0.00 0.00 0.00 100.00
04:38:12 PM all 0.00 0.00 0.00 0.00 0.00 100.00
04:38:13 PM all 0.02 0.00 0.10 0.00 0.00 99.87
04:38:14 PM all 0.00 0.00 0.00 0.00 0.00 100.00
Average: all 0.00 0.00 0.02 0.00 0.00 99.97
Check queue lengths and CPU load averages
sar -q 1 10
Code:
04:39:49 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
04:39:50 PM 0 1085 5.00 4.99 3.94 0
04:39:51 PM 0 1085 5.00 5.00 3.95 0
04:39:52 PM 0 1085 5.00 5.00 3.95 0
04:39:53 PM 0 1085 5.00 5.00 3.95 0
04:39:54 PM 0 1085 5.00 5.00 3.95 0
04:39:55 PM 0 1085 5.00 5.00 3.95 0
04:39:56 PM 0 1085 5.08 5.01 3.96 0
04:39:57 PM 0 1085 5.08 5.01 3.96 0
04:39:58 PM 0 1085 5.08 5.01 3.96 0
04:39:59 PM 0 1079 5.08 5.01 3.96 0
Average: 0 1084 5.03 5.00 3.95 0
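One thing the sar -q output above cannot show directly: on Linux the load average counts threads in uninterruptible sleep (state D) as well as runnable ones (state R), while the blocked column only covers tasks waiting for I/O to complete. Here is a rough sketch for listing the threads that are in R or D state right now. It assumes the usual /proc/<pid>/task/<tid>/status layout; the function name list_load_tasks and its optional proc-root argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "TID STATE NAME" for every thread in state R or D.
# $1 is the proc root to scan (defaults to /proc).
list_load_tasks() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/task/[0-9]*/status; do
        # Threads can exit between the glob expansion and the read.
        [ -r "$status" ] || continue
        tid_dir="${status%/status}"
        awk -v tid="${tid_dir##*/}" '
            /^Name:/  { sub(/^Name:[ \t]+/, ""); name = $0 }
            /^State:/ { state = $2 }   # one-letter state code, e.g. R, S, D
            END { if (state == "R" || state == "D") print tid, state, name }
        ' "$status" 2>/dev/null || true
    done
}

list_load_tasks "$@"
```

If this keeps printing the same handful of D-state threads, that would be consistent with a flat load of about 5 with no CPU or disk activity, but it is only a snapshot and not proof.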
Check disk usage
sar -d 1 3
Code:
Average: DEV tps rkB/s wkB/s dkB/s areq-sz aqu-sz await %util
Average: nvme2n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: nvme1n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: nvme0n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: nvme3n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdp 25.67 0.00 442.67 0.00 17.25 0.00 0.16 0.40
Average: sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdn 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: sdq 26.00 0.00 442.67 0.00 17.03 0.00 0.10 0.40
Average: zd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd32 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd48 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd64 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd96 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: fioa 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd112 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd128 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd144 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd160 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd176 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd192 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd208 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd224 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd240 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd256 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd272 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: zd288 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Check the memory load
sar -r 1 3
Code:
04:42:36 PM kbmemfree kbavail kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
04:42:37 PM 259369616 258317960 3438980 1.30 1052 187988 3868164 1.46 1080416 54092 12
04:42:38 PM 259365584 258313928 3443020 1.30 1052 187996 3868164 1.46 1080792 54096 12
04:42:39 PM 259364148 258312508 3444416 1.30 1052 187996 3868164 1.46 1079952 54092 4
Average: 259366449 258314799 3442139 1.30 1052 187993 3868164 1.46 1080387 54093 9
Check the I/O load
sar -b 1 10
Code:
Linux 6.5.11-7-pve (pve01) 01/03/2024 _x86_64_ (48 CPU)
04:43:56 PM tps rtps wtps dtps bread/s bwrtn/s bdscd/s
04:43:57 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:43:58 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:43:59 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:44:00 PM 12.00 0.00 12.00 0.00 0.00 320.00 0.00
04:44:01 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:44:02 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:44:03 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:44:04 PM 259.00 0.00 259.00 0.00 0.00 10400.00 0.00
04:44:05 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
04:44:06 PM 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: 27.10 0.00 27.10 0.00 0.00 1072.00 0.00
Check swapping activity
sar -W 1 3
Code:
04:45:20 PM pswpin/s pswpout/s
04:45:21 PM 0.00 0.00
04:45:22 PM 0.00 0.00
04:45:23 PM 0.00 0.00
Average: 0.00 0.00
The next step was collecting trace information from the kernel via eBPF:
bpftrace -e 'profile:hz:99 { @[kstack] = count(); }' > trace.data
I made a flame graph from the trace and am attaching it.
It looks like the kernel is doing nothing either.
So I suspect a bug in the load calculation function.
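A caveat on the trace: profile:hz:99 only samples whatever is on a CPU at each tick, so threads sleeping in uninterruptible state (D) never show up in that flame graph, even though they count toward the load average. A minimal sketch for checking where such threads are sleeping, assuming /proc/<pid>/task/<tid>/wchan is readable on this kernel (run it as root); d_state_wchan and its proc-root argument are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print "TID NAME WCHAN" for every thread in state D.
# $1 is the proc root to scan (defaults to /proc).
d_state_wchan() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*/task/[0-9]*; do
        [ -r "$dir/status" ] || continue
        # Second field of the "State:" line is the one-letter state code.
        state=$(awk '/^State:/ { print $2; exit }' "$dir/status" 2>/dev/null) || continue
        [ "$state" = "D" ] || continue
        name=$(awk '/^Name:/ { sub(/^Name:[ \t]+/, ""); print; exit }' "$dir/status" 2>/dev/null) || continue
        # wchan holds the kernel symbol the thread is sleeping in, if available.
        wchan=$(cat "$dir/wchan" 2>/dev/null) || wchan=""
        echo "${dir##*/} $name ${wchan:-?}"
    done
}

d_state_wchan "$@"
```

If the same kernel threads come back on every run with the same wait symbol, that would point at a driver or subsystem parking threads in D state instead of a miscalculated load.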