High IO delay on restore

andy77

Hello @ all,

I am wondering why a restore from an NFS storage (1 Gbit/s) causes an IO delay of over 60%, even though the disks on the node are NVMe SSDs that should be at least 10 times faster than the maximum transfer rate of the 1 Gbit/s NFS network.

It also caused a load of 6 out of 8, which is also very high. OK, that may be due to the compression, but I cannot explain the IO delay.

Regards
Andy
 
Hi,

IO wait means a process is waiting for IO to complete; in your case, the NVMe SSDs are waiting for the slow NFS data.
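
If you want to see where that wait actually accrues, something like the following helps (a minimal sketch, assuming the sysstat package is installed):

Code:
# overall CPU iowait is the "wa" column
vmstat 1

# per-device utilisation, queue size and wait times, refreshed every second
iostat -x 1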
 
Can you be more specific about where it gets slow?
What filesystem do you use?
 
I also see these high IO delays while restoring to a ZFS pool. I tried with and without dedup and/or compression. Regardless of which settings I choose, I get IO delays of about 50%. The test system uses consumer-grade SSDs with SATA connections, but I see similar results with NVMe SSDs. Maybe something in the restore process causes high IO load on the filesystem, independent of the underlying filesystem.

Code:
zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
r1_00  1.80T   904G   936G         -    90%    49%  1.42x  ONLINE  -
r1_01  1.86T   737G  1.14T         -    77%    38%  1.23x  ONLINE  -

zpool status
  pool: r1_00
 state: ONLINE
  scan: scrub repaired 0 in 4h42m with 0 errors on Sun Apr  9 05:06:49 2017
config:

        NAME                                                   STATE     READ WRITE CKSUM
        r1_00                                                  ONLINE       0     0     0
          mirror-0                                             ONLINE       0     0     0
            ata-SanDisk_SDSSDXPS960G_xxx-part1        ONLINE       0     0     0
            ata-SanDisk_Ultra_II_960GB_xxx-part1      ONLINE       0     0     0
          mirror-1                                             ONLINE       0     0     0
            ata-Samsung_SSD_850_PRO_1TB_xxx-part1  ONLINE       0     0     0
            ata-Samsung_SSD_850_PRO_1TB_xxx-part1  ONLINE       0     0     0
        logs
          ata-INTEL_SSDSC2BB160G4_xxx-part1     ONLINE       0     0     0

errors: No known data errors

  pool: r1_01
 state: ONLINE
  scan: scrub repaired 0 in 1h56m with 0 errors on Sun Apr  9 02:20:56 2017
config:

        NAME                                                   STATE     READ WRITE CKSUM
        r1_01                                                  ONLINE       0     0     0
          mirror-0                                             ONLINE       0     0     0
            ata-Samsung_SSD_850_PRO_2TB_xxx-part2  ONLINE       0     0     0
            ata-Samsung_SSD_850_PRO_2TB_xxx-part2  ONLINE       0     0     0
        logs
          ata-INTEL_SSDSC2BB160G4_xxx-part2     ONLINE       0     0     0

errors: No known data errors

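To see which part of a pool is actually busy while a restore runs, zpool can report per-vdev activity (a minimal sketch; r1_00 is just the pool from above, and the latency columns need a reasonably recent ZFS version):

Code:
# per-vdev read/write operations and bandwidth, refreshed every second
zpool iostat -v r1_00 1

# on newer ZFS releases, -l adds per-vdev wait/latency columns
zpool iostat -vl r1_00 1
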
PS: What is needed to get a response to a bug report like this: https://bugzilla.proxmox.com/show_bug.cgi?id=1344
 
I/O delay during unthrottled, I/O-intensive operations is normal - I/O delay just means that there are processes waiting for I/O to complete. If your system has high I/O delay constantly, then you have a problem, because your system is bottlenecked by I/O performance. If you see high I/O delay while you are actually trying to do more I/O than your system can handle (like when you are restoring a backup, which is almost always limited by either the source's read or the target's write performance, or when you are benchmarking ;)), then it is just a result of what you are doing and not an indicator of a problem.

What can be a problem is a restore swamping some disk / storage / ... with so many requests that it cannot handle other requests in parallel in a timely manner.
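
If a restore does starve other guests like that, one workaround is to throttle it. Depending on the PVE version, qmrestore accepts a bandwidth limit (a sketch; the archive path and VMID below are placeholders):

Code:
# restore with the I/O rate capped at roughly 50 MB/s (value is in KiB/s)
qmrestore /mnt/pve/backup-nfs/dump/vzdump-qemu-100.vma.lzo 100 --bwlimit 51200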