[SOLVED] Very poor IO Delay / FSYNCS/SECOND on RAID1 Sata disk

You stated "SoftRaid 2x4To" (2x4 TB),
Also I have counted 3 disks with 2TB size on the old server. What's the third drive doing?

Also please provide output of
mdadm --detail /dev/md/md2
mdadm --detail /dev/md/md4

That was the info I was curious about.
 
Sorry for the slow response; it was a long weekend with current events...

I was able to shut down all VMs yesterday, and with everything stopped the performance matches the old server, so it seems possible to get better performance.

Code:
# pveperf /var/lib/vz
CPU BOGOMIPS:      60798.40
REGEX/SECOND:      4416748
HD SIZE:           3666.44 GB (/dev/md2)
BUFFERED READS:    212.10 MB/sec
AVERAGE SEEK TIME: 11.46 ms
FSYNCS/SECOND:     48.30
DNS EXT:           16.89 ms

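For context, the FSYNCS/SECOND line counts completed fsync() calls per second. A rough stand-in (a sketch, not pveperf's exact method) is dd with oflag=dsync, which forces a sync after every block:

```shell
# Write 4 KiB blocks, syncing each one to disk before writing the next.
# blocks / elapsed seconds roughly approximates fsyncs/second.
dd if=/dev/zero of=/var/lib/vz/fsync-test bs=4k count=100 oflag=dsync
rm -f /var/lib/vz/fsync-test
```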
Could some LXC container be doing much more I/O than the others?
I don't know how to check whether one guest does more I/O than another; could you help me see this?
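One way to spot a noisy guest (a sketch: iotop may need installing first, and the cgroup path below is an assumption that differs between cgroup v1 and v2 hosts):

```shell
# Show only processes currently doing I/O, with accumulated totals,
# in batch mode for three samples; heavy writers bubble to the top.
iotop -o -a -b -n 3

# Per-container block I/O from the blkio cgroup (cgroup v1 path shown;
# on a cgroup v2 host look under /sys/fs/cgroup/lxc/<id>/io.stat instead).
for id in $(pct list | awk 'NR>1 {print $1}'); do
  echo "=== CT $id ==="
  cat "/sys/fs/cgroup/blkio/lxc/$id/blkio.throttle.io_service_bytes" 2>/dev/null
done
```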


New server hdparm:
Code:
# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   34630 MB in  1.99 seconds = 17386.24 MB/sec
 Timing buffered disk reads: 658 MB in  3.00 seconds = 219.24 MB/sec


Old server hdparm:
Code:
# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   16392 MB in  1.99 seconds = 8218.26 MB/sec
 Timing buffered disk reads: 554 MB in  3.01 seconds = 184.15 MB/sec

I checked the mdadm configuration and all disks are in use:

New server:
Code:
# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda2[0] sdb2[1]
      3905966016 blocks [2/2] [UU]
      bitmap: 11/30 pages [44KB], 65536KB chunk

unused devices: <none>

Old server:
Code:
root@kili:~# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] [linear] [multipath] [raid10]
md4 : active raid5 sdc4[2] sda4[0] sdb4[1]
      3863960576 blocks level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 3/15 pages [12KB], 65536KB chunk

md2 : active raid1 sdb2[1] sda2[0] sdc2[2]
      20478912 blocks [3/3] [UUU]

unused devices: <none>

On the old server we have two partitions, one for the OS and one for Proxmox; in my opinion this has no impact on performance.
 
Also I have counted 3 disks with 2TB size on the old server. What's the third drive doing?

Also please provide output of
mdadm --detail /dev/md/md2
mdadm --detail /dev/md/md4

That was the info I was curious about.
Sorry, I missed your comment.

Code:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Tue Nov 27 15:01:41 2018
     Raid Level : raid1
     Array Size : 20478912 (19.53 GiB 20.97 GB)
  Used Dev Size : 20478912 (19.53 GiB 20.97 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Mon Jan 11 16:38:50 2021
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           UUID : b35014e9:92c5dc37:a4d2adc2:26fd5302
         Events : 0.581

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2
# mdadm --detail /dev/md4
/dev/md4:
        Version : 0.90
  Creation Time : Tue Nov 27 15:01:41 2018
     Raid Level : raid5
     Array Size : 3863960576 (3684.96 GiB 3956.70 GB)
  Used Dev Size : 1931980288 (1842.48 GiB 1978.35 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Jan 11 16:38:59 2021
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           UUID : 655cac0e:41e7cb4e:a4d2adc2:26fd5302
         Events : 0.126736

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       8       20        1      active sync   /dev/sdb4
       2       8       36        2      active sync   /dev/sdc4

For comparison, on the new server:
Code:
# mdadm --detail /dev/md2
/dev/md2:
           Version : 0.90
     Creation Time : Thu Sep 26 15:02:00 2019
        Raid Level : raid1
        Array Size : 3905966016 (3725.02 GiB 3999.71 GB)
     Used Dev Size : 18446744073709551615
      Raid Devices : 2
     Total Devices : 2
   Preferred Minor : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Jan 11 16:40:25 2021
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              UUID : d090d79b:04b40376:a4d2adc2:26fd5302
            Events : 0.78591

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
 
I do struggle to understand the situation.
The old server has two (2) RAID arrays:
1x RAID 1 using 3 disks (the small OS array) and
1x RAID 5 using 3 disks (the data array)

If you run the same workload on your new node, only 2 spindles are working compared to 3 on the old system, and the data sits on RAID 1 instead of RAID 5.
To me it would be logical that your experience suffers. Am I missing something?
 
Thanks for your help. I reinstalled the old server with the latest Proxmox version on ZFS instead of the OVH software RAID.

Clearly it's better: with no VM started, I hit 100+ FSYNCS/SECOND (not great, but at least double the old value).

Next step: reinstall the new server on ZFS and check whether performance improves, or else change servers.
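To compare the runs quickly after each reinstall, the FSYNCS/SECOND number can be pulled out of pveperf's output (a sketch, based on the output format shown earlier in the thread):

```shell
# Extract just the FSYNCS/SECOND value from a pveperf run.
pveperf /var/lib/vz | awk -F': *' '/FSYNCS\/SECOND/ {print $2}'
```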

Thanks ;)