Proxmox lags when I upload large files / copy a VM

.KRPK.

Hi,

my problem is that when I upload large files to Nextcloud (AIO) in a VM, or make a copy of a VM, my I/O delay jumps to 50% and some VMs become unresponsive: websites on the Nextcloud VM stop working, Windows Server stops responding, and the Proxmox interface times out. Something like copying a VM is understandable (too much I/O on rpool, which Proxmox itself runs on), but uploading large files isn't: high I/O on slowpool shouldn't affect VMs on rpool or on the nvme00 pool.
Twice it got so laggy that I needed to reboot Proxmox, and once it even couldn't find the Proxmox boot drive, although after many reboots and retries it sorted itself out. This lag is still concerning. My question is: what did I do wrong, and what should I change to make it go away?
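
For reference, here is how I can watch per-pool latency while it happens, to see which pool is actually stalling (a sketch using standard zpool tooling; the 5-second interval is arbitrary):

Code:
# Per-vdev I/O statistics with latency columns, refreshed every 5 seconds.
# -v  break the numbers down per vdev
# -l  add latency columns (total, disk, and queue wait times)
# -y  skip the since-boot summary and show only live intervals
zpool iostat -vly 5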

My setup:

Code:
CPU(s)
 32 x AMD EPYC 7282 16-Core Processor (1 Socket)

Kernel Version
Linux 6.8.12-5-pve (2024-12-03T10:26Z)

Boot Mode
EFI

Manager Version
pve-manager/8.3.1/fb48e850ef9dde27

Repository Status
Proxmox VE updates
Non production-ready repository enabled!

Code:
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
nvme00    3.48T   519G  2.98T        -         -     8%    14%  1.00x    ONLINE  -
rpool     11.8T  1.67T  10.1T        -         -    70%    14%  1.76x    ONLINE  -
slowpool  21.8T  9.32T  12.5T        -         -    46%    42%  1.38x    ONLINE  -

Proxmox is on rpool:

Code:
root@alfredo:~# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 02:17:09 with 0 errors on Sun Jan 12 02:41:11 2025
config:

        NAME                                                   STATE     READ WRITE CKSUM
        rpool                                                  ONLINE       0     0     0
          raidz1-0                                             ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_4TB_S6BCNX0T306226Y-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_4TB_S6BCNX0T304731Z-part3  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_4TB_S6BCNX0T400242Z-part3  ONLINE       0     0     0
        special
          mirror-1                                             ONLINE       0     0     0
            nvme-Samsung_SSD_970_EVO_Plus_1TB_S6P7NS0T314087Z  ONLINE       0     0     0
            nvme-Samsung_SSD_970_EVO_Plus_1TB_S6P7NS0T314095M  ONLINE       0     0     0

errors: No known data errors

Code:
root@alfredo:~# zfs get all rpool
NAME   PROPERTY              VALUE                  SOURCE
rpool  type                  filesystem             -
rpool  creation              Fri Aug 26 16:14 2022  -
rpool  used                  1.88T                  -
rpool  available             6.00T                  -
rpool  referenced            120K                   -
rpool  compressratio         1.26x                  -
rpool  mounted               yes                    -
rpool  quota                 none                   default
rpool  reservation           none                   default
rpool  recordsize            128K                   default
rpool  mountpoint            /rpool                 default
rpool  sharenfs              off                    default
rpool  checksum              on                     default
rpool  compression           on                     local
rpool  atime                 on                     local
rpool  devices               on                     default
rpool  exec                  on                     default
rpool  setuid                on                     default
rpool  readonly              off                    default
rpool  zoned                 off                    default
rpool  snapdir               hidden                 default
rpool  aclmode               discard                default
rpool  aclinherit            restricted             default
rpool  createtxg             1                      -
rpool  canmount              on                     default
rpool  xattr                 on                     default
rpool  copies                1                      default
rpool  version               5                      -
rpool  utf8only              off                    -
rpool  normalization         none                   -
rpool  casesensitivity       sensitive              -
rpool  vscan                 off                    default
rpool  nbmand                off                    default
rpool  sharesmb              off                    default
rpool  refquota              none                   default
rpool  refreservation        none                   default
rpool  guid                  5222442941902153338    -
rpool  primarycache          all                    default
rpool  secondarycache        all                    default
rpool  usedbysnapshots       0B                     -
rpool  usedbydataset         120K                   -
rpool  usedbychildren        1.88T                  -
rpool  usedbyrefreservation  0B                     -
rpool  logbias               latency                default
rpool  objsetid              54                     -
rpool  dedup                 on                     local
rpool  mlslabel              none                   default
rpool  sync                  standard               local
rpool  dnodesize             legacy                 default
rpool  refcompressratio      1.00x                  -
rpool  written               120K                   -
rpool  logicalused           1.85T                  -
rpool  logicalreferenced     46K                    -
rpool  volmode               default                default
rpool  filesystem_limit      none                   default
rpool  snapshot_limit        none                   default
rpool  filesystem_count      none                   default
rpool  snapshot_count        none                   default
rpool  snapdev               hidden                 default
rpool  acltype               off                    default
rpool  context               none                   default
rpool  fscontext             none                   default
rpool  defcontext            none                   default
rpool  rootcontext           none                   default
rpool  relatime              on                     local
rpool  redundant_metadata    all                    default
rpool  overlay               on                     default
rpool  encryption            off                    default
rpool  keylocation           none                   default
rpool  keyformat             none                   default
rpool  pbkdf2iters           0                      default
rpool  special_small_blocks  128K                   local
rpool  prefetch              all                    default

The data lives on HDDs in slowpool:

Code:
root@alfredo:~# zpool status slowpool
  pool: slowpool
 state: ONLINE
  scan: scrub repaired 0B in 15:09:45 with 0 errors on Sun Jan 12 15:33:49 2025
config:

        NAME                                 STATE     READ WRITE CKSUM
        slowpool                             ONLINE       0     0     0
          raidz2-0                           ONLINE       0     0     0
            ata-ST6000NE000-2KR101_WSD809PN  ONLINE       0     0     0
            ata-ST6000NE000-2KR101_WSD7V2YP  ONLINE       0     0     0
            ata-ST6000NE000-2KR101_WSD7ZMFM  ONLINE       0     0     0
            ata-ST6000NE000-2KR101_WSD82NLF  ONLINE       0     0     0

errors: No known data errors

Code:
root@alfredo:~# zfs get all slowpool
NAME      PROPERTY              VALUE                  SOURCE
slowpool  type                  filesystem             -
slowpool  creation              Fri Aug 19 11:33 2022  -
slowpool  used                  5.99T                  -
slowpool  available             5.93T                  -
slowpool  referenced            4.45T                  -
slowpool  compressratio         1.05x                  -
slowpool  mounted               yes                    -
slowpool  quota                 none                   default
slowpool  reservation           none                   default
slowpool  recordsize            128K                   default
slowpool  mountpoint            /slowpool              default
slowpool  sharenfs              off                    default
slowpool  checksum              on                     default
slowpool  compression           on                     local
slowpool  atime                 on                     default
slowpool  devices               on                     default
slowpool  exec                  on                     default
slowpool  setuid                on                     default
slowpool  readonly              off                    default
slowpool  zoned                 off                    default
slowpool  snapdir               hidden                 default
slowpool  aclmode               discard                default
slowpool  aclinherit            restricted             default
slowpool  createtxg             1                      -
slowpool  canmount              on                     default
slowpool  xattr                 on                     default
slowpool  copies                1                      default
slowpool  version               5                      -
slowpool  utf8only              off                    -
slowpool  normalization         none                   -
slowpool  casesensitivity       sensitive              -
slowpool  vscan                 off                    default
slowpool  nbmand                off                    default
slowpool  sharesmb              off                    default
slowpool  refquota              none                   default
slowpool  refreservation        none                   default
slowpool  guid                  6841581580145990709    -
slowpool  primarycache          all                    default
slowpool  secondarycache        all                    default
slowpool  usedbysnapshots       0B                     -
slowpool  usedbydataset         4.45T                  -
slowpool  usedbychildren        1.55T                  -
slowpool  usedbyrefreservation  0B                     -
slowpool  logbias               latency                default
slowpool  objsetid              54                     -
slowpool  dedup                 on                     local
slowpool  mlslabel              none                   default
slowpool  sync                  standard               default
slowpool  dnodesize             legacy                 default
slowpool  refcompressratio      1.03x                  -
slowpool  written               4.45T                  -
slowpool  logicalused           6.12T                  -
slowpool  logicalreferenced     4.59T                  -
slowpool  volmode               default                default
slowpool  filesystem_limit      none                   default
slowpool  snapshot_limit        none                   default
slowpool  filesystem_count      none                   default
slowpool  snapshot_count        none                   default
slowpool  snapdev               hidden                 default
slowpool  acltype               off                    default
slowpool  context               none                   default
slowpool  fscontext             none                   default
slowpool  defcontext            none                   default
slowpool  rootcontext           none                   default
slowpool  relatime              on                     default
slowpool  redundant_metadata    all                    default
slowpool  overlay               on                     default
slowpool  encryption            off                    default
slowpool  keylocation           none                   default
slowpool  keyformat             none                   default
slowpool  pbkdf2iters           0                      default
slowpool  special_small_blocks  0                      default
slowpool  prefetch              all                    default

I recently added more NVMe drives and moved the heaviest VMs onto them to free up some I/O on rpool, but it didn't help.

Code:
root@alfredo:~# zpool status nvme00
  pool: nvme00
 state: ONLINE
  scan: scrub repaired 0B in 00:19:11 with 0 errors on Sun Jan 12 00:43:12 2025
config:

        NAME                                                STATE     READ WRITE CKSUM
        nvme00                                              ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            nvme-SAMSUNG_MZ1L23T8HBLA-00A07_S667NN0X601658  ONLINE       0     0     0
            nvme-SAMSUNG_MZ1L23T8HBLA-00A07_S667NN0X601711  ONLINE       0     0     0

errors: No known data errors
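
For reference, the disks were moved with qm disk move (a sketch; the VM ID and disk name here are examples, not my actual ones):

Code:
# Move one VM disk image onto the nvme00 storage.
# 101 and scsi0 are example values; --delete removes the old copy
# after a successful move instead of leaving it as an unused disk.
qm disk move 101 scsi0 nvme00 --delete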

Some screenshots:
[attached screenshots]

VM 100, which runs Nextcloud with its data store on slowpool (disk 0):
[attached screenshots]

Full journal log from one event:
Should I suspect the drives detaching and re-attaching while it was running?

https://chmura.b24.best-it.pl/s/HZxHH2RBtPsdi5j
 

You also have your slowpool configured as a 4-wide raidz2 of IronWolf Pros, and this is killing performance. Raidz2 starts to make more sense around the 6-drive mark, but even then it is only good for bulk data storage, not for interactive VM performance.

Convert both pools to mirrors (this necessitates adding a 4th disk to rpool, or using larger disks in a mirror for the same free space, and a backup/restore for slowpool).
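
For slowpool, the backup/restore could look roughly like this (a sketch only; backuppool and the device names are placeholders, and step 2 destroys the pool, so verify the backup first):

Code:
# 1. Snapshot everything recursively and replicate it to temporary space.
zfs snapshot -r slowpool@migrate
zfs send -R slowpool@migrate | zfs receive -F backuppool/slowpool

# 2. Destroy the raidz2 pool and recreate it as two striped mirrors.
zpool destroy slowpool
zpool create slowpool \
  mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
  mirror /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4

# 3. Send the data back into the new layout.
zfs send -R backuppool/slowpool@migrate | zfs receive -F slowpool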
 
Yeah, I know my slowpool is really sloooow because it's RaidZ2, not striped mirrors. Fixing it is on the to-do list, but later (TM).

But shouldn't that only slow down read/write I/O to and from slowpool?

Why does it lag rpool and all of Proxmox, even for VMs that are on rpool or only on the nvme00 pool? It shouldn't do that.
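
Next time it happens I will also check whether the whole host is stalling on I/O rather than just slowpool, using the kernel's pressure-stall counters (a sketch; this assumes PSI is enabled, as it is on stock PVE kernels):

Code:
# Host-wide share of time tasks spent stalled waiting on I/O.
# "some" = at least one task stalled; "full" = all non-idle tasks stalled.
watch -n 1 cat /proc/pressure/io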
 
