Proxmox VE 8.1.3 abnormal load

Odmin

New Member
Jan 3, 2024
Hi all,
I upgraded a few servers with different configurations from VE 7 to the latest Proxmox VE 8.1.3 and now see an abnormal load on them with nothing running:
Server1 before and after upgrade:
1704292247250.png
1704292290760.png
The load is constant and never drops, which looks like a bug.
kernel: Linux pve01 6.5.11-7-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) x86_64 GNU/Linux
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.5.11-7-pve)

I tried to debug it with sysstat:
CPU utilization
sar -u 1 5
Code:
root@pve01:~# sar -u 1 5
Linux 6.5.11-7-pve (pve01)      01/03/2024      _x86_64_        (48 CPU)

04:38:09 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:38:10 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
04:38:11 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
04:38:12 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
04:38:13 PM     all      0.02      0.00      0.10      0.00      0.00     99.87
04:38:14 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
Average:        all      0.00      0.00      0.02      0.00      0.00     99.97

Check queue lengths and CPU load averages
sar -q 1 10
Code:
04:39:49 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
04:39:50 PM         0      1085      5.00      4.99      3.94         0
04:39:51 PM         0      1085      5.00      5.00      3.95         0
04:39:52 PM         0      1085      5.00      5.00      3.95         0
04:39:53 PM         0      1085      5.00      5.00      3.95         0
04:39:54 PM         0      1085      5.00      5.00      3.95         0
04:39:55 PM         0      1085      5.00      5.00      3.95         0
04:39:56 PM         0      1085      5.08      5.01      3.96         0
04:39:57 PM         0      1085      5.08      5.01      3.96         0
04:39:58 PM         0      1085      5.08      5.01      3.96         0
04:39:59 PM         0      1079      5.08      5.01      3.96         0
Average:            0      1084      5.03      5.00      3.95         0
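
For reference, the same quantities can also be read from /proc/loadavg, whose fourth field is the number of currently runnable scheduling entities over the total. This is only a rough sketch: the helper name loadavg_summary is made up for illustration, and the sample line is assembled from the sar figures above with an arbitrary trailing PID, not captured output.

```shell
#!/bin/sh
# Summarise one line in /proc/loadavg format:
#   ldavg-1 ldavg-5 ldavg-15 runnable/total last-pid
# loadavg_summary is a hypothetical helper name, not part of sysstat.
loadavg_summary() {
    awk '{ split($4, q, "/"); printf "ldavg-1=%s runnable=%s total=%s\n", $1, q[1], q[2] }'
}

# Illustrative input assembled from the sar -q figures above (PID is arbitrary).
echo "5.00 4.99 3.94 0/1085 12345" | loadavg_summary
# prints: ldavg-1=5.00 runnable=0 total=1085

# On a live host: loadavg_summary < /proc/loadavg
```

A load of 5 next to an empty run queue is the mismatch that the rest of this post tries to explain.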

Check disk usage
sar -d 1 3
Code:
Average:          DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
Average:      nvme2n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:      nvme1n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:      nvme0n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:      nvme3n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdp     25.67      0.00    442.67      0.00     17.25      0.00      0.16      0.40
Average:          sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdg      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sde      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdi      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdh      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdj      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdk      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdl      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdn      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdm      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdd      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdq     26.00      0.00    442.67      0.00     17.03      0.00      0.10      0.40
Average:          zd0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd32      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd80      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         zd96      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         fioa      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd112      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd128      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd144      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd160      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd176      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd192      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd208      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd224      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd240      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd256      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd272      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        zd288      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Check the memory load
sar -r 1 3
Code:
04:42:36 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
04:42:37 PM 259369616 258317960   3438980      1.30      1052    187988   3868164      1.46   1080416     54092        12
04:42:38 PM 259365584 258313928   3443020      1.30      1052    187996   3868164      1.46   1080792     54096        12
04:42:39 PM 259364148 258312508   3444416      1.30      1052    187996   3868164      1.46   1079952     54092         4
Average:    259366449 258314799   3442139      1.30      1052    187993   3868164      1.46   1080387     54093         9

Check the I/O load
sar -b 1 10
Code:
Linux 6.5.11-7-pve (pve01)      01/03/2024      _x86_64_        (48 CPU)

04:43:56 PM       tps      rtps      wtps      dtps   bread/s   bwrtn/s   bdscd/s
04:43:57 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:43:58 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:43:59 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:44:00 PM     12.00      0.00     12.00      0.00      0.00    320.00      0.00
04:44:01 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:44:02 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:44:03 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:44:04 PM    259.00      0.00    259.00      0.00      0.00  10400.00      0.00
04:44:05 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
04:44:06 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        27.10      0.00     27.10      0.00      0.00   1072.00      0.00

Check swapping activity
sar -W 1 3
Code:
04:45:20 PM  pswpin/s pswpout/s
04:45:21 PM      0.00      0.00
04:45:22 PM      0.00      0.00
04:45:23 PM      0.00      0.00
Average:         0.00      0.00

The next step was collecting trace information from the kernel via eBPF:
bpftrace -e 'profile:hz:99 { @[kstack] = count(); }' > trace.data
Then I made a flame graph from it, attached here:
1704369896285.png
It looks like the kernel is doing nothing either.
So, I suspect a bug in the load calculation function.
 

Attachments

  • trace.zip
Here is another example from Server2 with a different configuration:
1704367634874.png
kernel: Linux pve02 6.5.11-7-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) x86_64 GNU/Linux
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.5.11-7-pve)

Debugging it with sysstat:
CPU utilization
sar -u 1 5
Code:
Linux 6.5.11-7-pve (pve02)      01/04/2024      _x86_64_        (16 CPU)


01:29:06 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:29:07 PM     all      0.00      0.00      0.00      0.00      0.00    100.00
01:29:08 PM     all      0.31      0.00      0.25      0.06      0.00     99.38
01:29:09 PM     all      0.13      0.00      0.25      0.00      0.00     99.62
01:29:10 PM     all      0.06      0.00      0.06      0.00      0.00     99.88
01:29:11 PM     all      0.00      0.00      0.06      0.00      0.00     99.94
Average:        all      0.10      0.00      0.12      0.01      0.00     99.76

Check queue lengths and CPU load averages
sar -q 1 10

Code:
Linux 6.5.11-7-pve (pve02)      01/04/2024      _x86_64_        (16 CPU)


01:30:43 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
01:30:44 PM         0       470      2.11      2.08      2.05         0
01:30:45 PM         0       470      2.11      2.08      2.05         0
01:30:46 PM         0       470      2.11      2.08      2.05         0
01:30:47 PM         0       470      2.11      2.08      2.05         0
01:30:48 PM         0       470      2.10      2.08      2.05         0
01:30:49 PM         0       470      2.10      2.08      2.05         0
01:30:50 PM         0       470      2.10      2.08      2.05         0
01:30:51 PM         0       470      2.10      2.08      2.05         0
01:30:52 PM         0       470      2.10      2.08      2.05         0
01:30:53 PM         0       470      2.09      2.08      2.05         0
Average:            0       470      2.10      2.08      2.05         0


Check disk usage
sar -d 1 3
Code:
Average:          DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
Average:      nvme1n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:      nvme0n1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdd      3.33    256.00     21.33      0.00     83.20      0.01      1.30      0.93
Average:          sde      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-1      4.67      0.00     21.33      0.00      4.57      0.00      0.86      0.40
Average:         dm-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-4      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-6      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-7      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         dm-8      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Check the memory load
sar -r 1 3

Code:
Linux 6.5.11-7-pve (pve02)      01/04/2024      _x86_64_        (16 CPU)


01:34:30 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
01:34:31 PM   9644348  10204864  14622696     44.55     98480    626708   3973248      9.80   1027012    645720       248
01:34:32 PM   9644348  10204864  14622696     44.55     98480    626708   3973248      9.80   1027012    645720       272
01:34:33 PM   9644348  10204864  14622696     44.55     98480    626708   3973248      9.80   1027012    645720       272
Average:      9644348  10204864  14622696     44.55     98480    626708   3973248      9.80   1027012    645720       264

Check the I/O load
sar -b 1 10
Code:
Linux 6.5.11-7-pve (pve02)      01/04/2024      _x86_64_        (16 CPU)


01:35:27 PM       tps      rtps      wtps      dtps   bread/s   bwrtn/s   bdscd/s
01:35:28 PM      3.00      3.00      0.00      0.00    768.00      0.00      0.00
01:35:29 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:30 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:31 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:32 PM      4.00      0.00      4.00      0.00      0.00    120.00      0.00
01:35:33 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:34 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:35 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:36 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:35:37 PM      4.00      0.00      4.00      0.00      0.00     96.00      0.00
Average:         1.10      0.30      0.80      0.00     76.80     21.60      0.00

Check swapping activity
sar -W 1 3
Code:
Linux 6.5.11-7-pve (pve02)      01/04/2024      _x86_64_        (16 CPU)


01:36:27 PM  pswpin/s pswpout/s
01:36:28 PM      0.00      0.00
01:36:29 PM      0.00      0.00
01:36:30 PM      0.00      0.00
Average:         0.00      0.00

Collecting trace information from the kernel via eBPF:
bpftrace -e 'profile:hz:99 { @[kstack] = count(); }' > trace.data
1704369973534.png
 

Attachments

  • trace2.zip
Hi Fiona,
Thanks a lot for your reply!
Yes, you are right, I checked on both servers:
Server1 with constant load 5:
Code:
ps aux | grep " [RD]"
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        1254  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3317  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3318  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3319  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3320  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root       12447  0.0  0.0  11628  4608 pts/0    R+   11:45   0:00 ps aux

Server2 with constant load 2:
Code:
ps aux | grep " [RD]"
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        1584  0.0  0.0      0     0 ?        D<    2023   0:01 [vdev_autotrim]
root        1585  0.0  0.0      0     0 ?        D<    2023   0:03 [vdev_autotrim]
root     1292548  0.0  0.0  12636  5504 pts/0    R+   10:58   0:00 ps aux

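It looks like each vdev_autotrim thread stuck in uninterruptible sleep (state D) adds 1 point of load: five threads on Server1 (load 5) and two on Server2 (load 2).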
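
To make the match between the load and the blocked threads explicit, the D-state entries can be counted per command. On Linux, tasks in uninterruptible sleep (state D) count toward the load average along with runnable ones. A rough sketch follows; the helper name count_dstate is made up for illustration, and the input is a copy of the Server1 listing above.

```shell
#!/bin/sh
# Count tasks in uninterruptible sleep (STAT starting with "D") per command,
# given `ps aux` style input: column 8 is STAT, column 11 starts COMMAND.
# count_dstate is a hypothetical helper name.
count_dstate() {
    awk '$8 ~ /^D/ { n[$11]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# Copy of the Server1 listing above.
count_dstate <<'EOF'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        1254  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3317  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3318  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3319  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root        3320  0.0  0.0      0     0 ?        D<   11:34   0:00 [vdev_autotrim]
root       12447  0.0  0.0  11628  4608 pts/0    R+   11:45   0:00 ps aux
EOF
# prints: 5 [vdev_autotrim]

# On a live host: ps aux | count_dstate
```

Only the first word of COMMAND is used as the key, so commands with arguments are grouped by their executable name.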

I turned off autotrim on all pools:
zpool set autotrim=off pool_name
That alone did not stop the existing vdev_autotrim threads, but after I rebooted the server the issue was gone!
So the root cause of this issue is the ZFS autotrim process.
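
To cover every pool rather than naming them one by one, something like the following could be used. It is only a sketch: pools_with_autotrim_on is a made-up helper, the pool names in the sample are placeholders, and it assumes the tab-separated output of `zpool get -H -o name,value autotrim`.

```shell
#!/bin/sh
# Print the names of pools whose autotrim property is "on".
# Expects tab-separated "name<TAB>value" lines, as produced (assumption) by:
#   zpool get -H -o name,value autotrim
# pools_with_autotrim_on is a hypothetical helper name.
pools_with_autotrim_on() {
    awk -F '\t' '$2 == "on" { print $1 }'
}

# Placeholder input; rpool and tank are example names, not taken from the thread.
printf 'rpool\ton\ntank\toff\n' | pools_with_autotrim_on
# prints: rpool

# On a live host, switching autotrim off everywhere might look like:
#   zpool get -H -o name,value autotrim | pools_with_autotrim_on |
#       while read -r pool; do zpool set autotrim=off "$pool"; done
```

The vdev_autotrim threads that are already stuck may still need a reboot to disappear, so the load should be checked again afterwards.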