VM clone is really slow

Darato

New Member
Aug 10, 2022
Hello,

When I try to clone a VM template, it takes more than 3 minutes per VM. That's a bit long, but it works without throwing any error message.
If I add another VM via the Terraform Telmate provider (the same instance with count = 2), it also takes a long time, but again with no error message.
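To measure the clone itself without Terraform in the loop, a single full clone can also be timed from the PVE CLI (just a sketch; 200 is an assumed free VMID):
Bash:
# Time one full clone of template 9000 outside of Terraform (200 = assumed free VMID)
time qm clone 9000 200 --full --name clone-timing-test --storage local-lvm
# Remove the test clone afterwards
qm destroy 200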


Now I want 2 or 4 instances at the same time, still with Terraform (count = 4), and it doesn't work:

Code:
proxmox_vm_qemu.old_k3s_worker_instance[3]: Still creating... [5m50s elapsed]
╷
│ Error: vm locked, could not obtain config
│
│   with proxmox_vm_qemu.old_k3s_worker_instance[3],
│   on instance.tf line 1, in resource "proxmox_vm_qemu" "old_k3s_worker_instance":
│    1: resource "proxmox_vm_qemu" "old_k3s_worker_instance" {
│
╵
╷
│ Error: vm locked, could not obtain config
│
│   with proxmox_vm_qemu.old_k3s_worker_instance[0],
│   on instance.tf line 1, in resource "proxmox_vm_qemu" "old_k3s_worker_instance":
│    1: resource "proxmox_vm_qemu" "old_k3s_worker_instance" {
│
╵
╷
│ Error: vm locked, could not obtain config
│
│   with proxmox_vm_qemu.old_k3s_worker_instance[2],
│   on instance.tf line 1, in resource "proxmox_vm_qemu" "old_k3s_worker_instance":
│    1: resource "proxmox_vm_qemu" "old_k3s_worker_instance" {
│
╵
╷
│ Error: vm locked, could not obtain config
│
│   with proxmox_vm_qemu.old_k3s_worker_instance[1],
│   on instance.tf line 1, in resource "proxmox_vm_qemu" "old_k3s_worker_instance":
│    1: resource "proxmox_vm_qemu" "old_k3s_worker_instance" {
│
╵

I looked at the logs and found in dmesg that qemu-img is stuck:

Code:
[  +0.000005] INFO: task qemu-img:20060 blocked for more than 120 seconds.
[  +0.000054]       Tainted: P           O      5.15.39-1-pve #1
[  +0.000046] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000059] task:qemu-img        state:D stack:    0 pid:20060 ppid: 19869 flags:0x00004002
[  +0.000006] Call Trace:
[  +0.000002]  <TASK>
[  +0.000002]  __schedule+0x33d/0x1750
[  +0.000006]  ? __pagevec_release+0x30/0x70
[  +0.000007]  ? __cond_resched+0x1a/0x50
[  +0.000004]  ? write_cache_pages+0x24d/0x460
[  +0.000005]  ? __set_page_dirty_no_writeback+0x40/0x40
[  +0.000006]  schedule+0x4e/0xb0
[  +0.000003]  io_schedule+0x46/0x70
[  +0.000005]  wait_on_page_bit_common+0x114/0x3e0
[  +0.000007]  ? filemap_invalidate_unlock_two+0x40/0x40
[  +0.000007]  wait_on_page_bit+0x3f/0x50
[  +0.000006]  wait_on_page_writeback+0x26/0x80
[  +0.000006]  __filemap_fdatawait_range+0x97/0x110
[  +0.000008]  filemap_write_and_wait_range+0x88/0xd0
[  +0.000008]  blkdev_put+0x1d0/0x210
[  +0.000006]  blkdev_close+0x27/0x30
[  +0.000006]  __fput+0x9f/0x260
[  +0.000007]  ____fput+0xe/0x10
[  +0.000006]  task_work_run+0x6d/0xa0
[  +0.000006]  exit_to_user_mode_prepare+0x1a4/0x1b0
[  +0.000005]  syscall_exit_to_user_mode+0x27/0x50
[  +0.000005]  ? __x64_sys_close+0x12/0x40
[  +0.000004]  do_syscall_64+0x69/0xc0
[  +0.000003]  ? syscall_exit_to_user_mode+0x27/0x50
[  +0.000005]  ? do_syscall_64+0x69/0xc0
[  +0.000004]  ? asm_exc_page_fault+0x8/0x30
[  +0.000006]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.000006] RIP: 0033:0x7f8049b9e11b
[  +0.000003] RSP: 002b:00007fff1f91c5c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  +0.000004] RAX: 0000000000000000 RBX: 000055fff7551090 RCX: 00007f8049b9e11b
[  +0.000002] RDX: 00007f803fc81ca8 RSI: 0000000000000000 RDI: 0000000000000008
[  +0.000003] RBP: 000055fff754ad30 R08: 0000000000000000 R09: 000055fff7542130
[  +0.000002] R10: 0000000000000008 R11: 0000000000000293 R12: 000055fff74fd3b0
[  +0.000002] R13: 0000000000000000 R14: 000055fff5a8ef00 R15: 0000000000000000
[  +0.000004]  </TASK>
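While I try to reproduce this, I can keep watching the kernel log for further hung-task warnings (a quick sketch):
Bash:
# Follow kernel messages and flag new hung-task reports as they appear
dmesg -wT | grep -i "blocked for more than"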

However, the drive does not seem to be the bottleneck here:

Code:
root@pve1:~# iostat --human
Linux 5.15.39-1-pve (pve1)     08/10/2022     _x86_64_    (48 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.1%    0.0%    0.1%    0.5%    0.0%   99.4%

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-0              0.02         0.5k         0.0k         0.0k       4.2M       0.0k       0.0k
dm-1              3.94        29.8k        17.8k         0.0k     229.6M     136.9M       0.0k
dm-10             0.00         0.0k         0.0k         0.0k     116.0k       0.0k       0.0k
dm-11             0.00         0.0k         0.0k         0.0k     116.0k       0.0k       0.0k
dm-12             0.00         0.0k         0.0k         0.0k     116.0k       0.0k       0.0k
dm-13            46.70         0.3k       186.7k         0.0k       2.2M       1.4G       0.0k
dm-14            46.70         0.3k       186.7k         0.0k       2.2M       1.4G       0.0k
dm-15             0.02         0.4k         0.0k         0.0k       3.3M       0.0k       0.0k
dm-16            46.70         0.3k       186.7k         0.0k       2.2M       1.4G       0.0k
dm-17            46.70         0.3k       186.7k         0.0k       2.2M       1.4G       0.0k
dm-2              1.09         1.5k         2.8k         0.0k      11.7M      21.8M       0.0k
dm-3            212.24       644.4k         2.2M         0.0k       4.8G      17.2G       0.0k
dm-4            212.24       644.4k         2.2M         0.0k       4.8G      17.2G       0.0k
dm-6              0.02         0.5k         0.0k         0.0k       3.9M       0.0k       0.0k
dm-7              0.01         0.2k         0.0k         0.0k       1.2M       0.0k       0.0k
dm-8             14.15       877.4k         0.0k         0.0k       6.6G       0.0k       0.0k
dm-9              0.00         0.0k         0.0k         0.0k     116.0k       0.0k       0.0k
sda               0.02         0.6k         0.0k         0.0k       4.4M       0.0k       0.0k
sdb             167.42       742.8k         2.3M         0.0k       5.6G      17.4G       0.0k
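Note that iostat without an interval only shows averages since boot; watching extended per-interval stats while a clone runs would be more telling (a sketch, assuming the sysstat package):
Bash:
# Extended per-device stats (await, %util) for the SSD, every 2 seconds, while a clone is running
iostat -xm 2 sdb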

Here is the lsblk output (sorry, this was taken when only 2 instances were being created):
Code:
NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                               8:0    0 298.1G  0 disk
sdb                               8:16   0 465.8G  0 disk
├─sdb1                            8:17   0  1007K  0 part
├─sdb2                            8:18   0   512M  0 part /boot/efi
└─sdb3                            8:19   0 465.3G  0 part
  ├─pve-swap                    253:0    0     8G  0 lvm  [SWAP]
  ├─pve-root                    253:1    0    96G  0 lvm  /
  ├─pve-data_tmeta              253:2    0   3.5G  0 lvm 
  │ └─pve-data-tpool            253:4    0 338.4G  0 lvm 
  │   ├─pve-data                253:5    0 338.4G  1 lvm 
  │   ├─pve-vm--100--disk--0    253:6    0    32G  0 lvm 
  │   ├─pve-vm--9000--cloudinit 253:7    0     4M  0 lvm 
  │   ├─pve-base--9000--disk--0 253:8    0   2.2G  1 lvm 
  │   ├─pve-vm--102--cloudinit  253:9    0     4M  0 lvm 
  │   ├─pve-vm--101--cloudinit  253:10   0     4M  0 lvm 
  │   ├─pve-vm--102--disk--0    253:11   0   2.2G  0 lvm 
  │   └─pve-vm--101--disk--0    253:12   0   2.2G  0 lvm 
  └─pve-data_tdata              253:3    0 338.4G  0 lvm 
    └─pve-data-tpool            253:4    0 338.4G  0 lvm 
      ├─pve-data                253:5    0 338.4G  1 lvm 
      ├─pve-vm--100--disk--0    253:6    0    32G  0 lvm 
      ├─pve-vm--9000--cloudinit 253:7    0     4M  0 lvm 
      ├─pve-base--9000--disk--0 253:8    0   2.2G  1 lvm 
      ├─pve-vm--102--cloudinit  253:9    0     4M  0 lvm 
      ├─pve-vm--101--cloudinit  253:10   0     4M  0 lvm 
      ├─pve-vm--102--disk--0    253:11   0   2.2G  0 lvm 
      └─pve-vm--101--disk--0    253:12   0   2.2G  0 lvm

But if you look at the processes, they seem to be stuck in uninterruptible sleep, waiting on the drive.


Code:
root        1855  0.0  0.1 345608 124424 ?       Ss   13:02   0:00 pvedaemon
root        1856  0.0  0.1 353796 129376 ?       S    13:02   0:02  \_ pvedaemon worker
root       19866  0.0  0.1 361028 127048 ?       Ss   15:07   0:00  |   \_ task UPID:pve1:00004D9A:000B8497:62F3ADAB:qmclone:9000:terraform-prov@pve!token_terraform_pve_deploy:
root       20018  1.3  0.0 832156 25080 ?        Dl   15:08   0:06  |       \_ /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 zeroinit:/dev/pve/vm-102-disk-0
root        1857  0.0  0.1 354000 132160 ?       S    13:02   0:01  \_ pvedaemon worker
root       19865  0.0  0.1 361232 127104 ?       Ss   15:07   0:00  |   \_ task UPID:pve1:00004D99:000B8496:62F3ADAB:qmclone:9000:terraform-prov@pve!token_terraform_pve_deploy:
root       19972  1.5  0.0 832164 25404 ?        Dl   15:07   0:07  |   |   \_ /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 zeroinit:/dev/pve/vm-101-disk-0
root       19870  0.0  0.1 361232 127104 ?       Ss   15:07   0:00  |   \_ task UPID:pve1:00004D9E:000B849C:62F3ADAB:qmclone:9000:terraform-prov@pve!token_terraform_pve_deploy:
root       20044  1.0  0.0 832156 25224 ?        Dl   15:08   0:04  |       \_ /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 zeroinit:/dev/pve/vm-104-disk-1
root        1858  0.0  0.1 354132 130180 ?       S    13:02   0:02  \_ pvedaemon worker
root       19869  0.0  0.1 361364 127236 ?       Ss   15:07   0:00      \_ task UPID:pve1:00004D9D:000B849B:62F3ADAB:qmclone:9000:terraform-prov@pve!token_terraform_pve_deploy:
root       20060  0.9  0.0 832156 25108 ?        Dl   15:08   0:04          \_ /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 zeroinit:/dev/pve/vm-103-disk-0
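To see what the blocked tasks are actually waiting on, I can list D-state processes and dump their kernel stacks (a sketch, using pid 20060 from the trace above):
Bash:
# List tasks stuck in uninterruptible sleep (D state) and their wait channel
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
# Kernel stack of one of the stuck qemu-img processes
cat /proc/20060/stack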



The configuration is:
* HP DL380 Gen9
* 96 GB DDR4 ECC
* 2x Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
* 1x Samsung 850 EVO SSD - via the HPE embedded software RAID


There is only 1 drive for now, installed with ext4/LVM from the wizard.
I know that's not much in terms of storage, but I feel the drive itself is not the bottleneck here.
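To rule out the SSD itself, I could also check its SMART data (a sketch, assuming smartmontools is installed; extra -d options may be needed behind the HPE controller):
Bash:
# SMART health and wear indicators of the SSD backing the LVM pool
smartctl -a /dev/sdb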


Here is the Terraform instance configuration:
Terraform:
resource "proxmox_vm_qemu" "old_k3s_worker_instance" {
  count = 4
  name  = "old-k3s-worker-instance-${count.index + 1}"

  target_node = "pve1"

  clone = var.pve_template_name

  # VM settings
  agent   = 0 #guest agent
  os_type = "cloud-init"

  cores    = 4
  sockets  = "1"
  cpu      = "host"
  memory   = 8 * 1024
  scsihw   = "virtio-scsi-pci"
  bootdisk = "scsi0"

  sshkeys = <<EOF
ssh-rsa AAAAB3N...PxdQ== ubuntu
EOF

  ssh_user = "ubuntu"

  disk {
    slot     = 0
    size     = "15G"
    type     = "scsi"
    storage  = "local-lvm"
    iothread = 1
  }

  network {
    model  = "virtio"
    bridge = "vmbr11"
  }

  # lifecycle {
  #   ignore_changes = [
  #     network
  #   ]
  # }

  ipconfig0 = "ip=10.0.11.${count.index + 11}/24,gw=10.0.11.1"
}


For now, I'll try increasing the timeout, but the Telmate Terraform provider is buggy and that doesn't work every time.
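Alternatively, I could limit how many clones Terraform launches at once, so only one qemu-img copy runs at a time (a sketch using the standard -parallelism flag):
Bash:
# Create the resources one at a time instead of running four concurrent clones
terraform apply -parallelism=1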

Do you have an idea of what could be causing this?
 
PVE runs on a fairly vanilla Linux kernel. You have found and posted kernel-level messages showing that your storage is not responding for over 120 seconds. The clone you are doing appears to be a full clone, which is essentially a block-for-block copy of the disk from one volume to another on the same filesystem/disk. Everything points to a backing storage issue. Whether it's your HP RAID, your disk, or something in between - anybody's guess without more troubleshooting.
You should be able to reproduce the same hung-task kernel message by putting enough pressure on your storage, either with fio or by running qemu-img manually.

Increasing the timeout in your workflow manager just masks the issue; more than one thing is likely affected during the I/O stall.
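For example, a few parallel sequential writers should approximate what four concurrent full clones do to the disk (a rough sketch; adjust the directory and sizes to your setup):
Bash:
# Four parallel 1M sequential writers, roughly the load of four simultaneous full clones
fio --name=clone-pressure --directory=/var/lib/vz --rw=write --bs=1M --size=4G \
    --numjobs=4 --ioengine=libaio --iodepth=16 --direct=1 --group_reporting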


Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
You're absolutely right @bbgeek17, but I can't find what is causing this.

I ran some fio benchmarks to test disk performance; I've pasted them below.

For the qemu-img command, I presume this can do the job?
Bash:
for i in {0..3}; do
  /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 zeroinit:/dev/pve/vm-10${i}-disk-0 &
done
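To avoid pointing the copies at real VM disks, I could create throwaway thin volumes as targets instead (a sketch):
Bash:
# Create disposable thin LVs, run the copies in parallel, then clean up
for i in {0..3}; do
  lvcreate -V 2200M --thinpool pve/data -n test-copy-${i}
  /usr/bin/qemu-img convert -p -n -f raw -O raw /dev/pve/base-9000-disk-0 /dev/pve/test-copy-${i} &
done
wait
for i in {0..3}; do lvremove -y pve/test-copy-${i}; done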


The fio benchmark commands come from this page: AskUbuntu

fio read:
Code:
root@pve1:/tmp# fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=read --size=500m --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [R(1)][35.0%][r=501MiB/s][r=500 IOPS][eta 00m:13s]
Jobs: 1 (f=1): [R(1)][65.0%][r=489MiB/s][r=489 IOPS][eta 00m:07s] 
Jobs: 1 (f=1): [R(1)][95.0%][r=486MiB/s][r=486 IOPS][eta 00m:01s] 
Jobs: 1 (f=1): [R(1)][100.0%][r=495MiB/s][r=495 IOPS][eta 00m:00s]
TEST: (groupid=0, jobs=1): err= 0: pid=36874: Wed Aug 10 16:53:50 2022
  read: IOPS=496, BW=497MiB/s (521MB/s)(10.0GiB/20620msec)
    slat (usec): min=48, max=870, avg=97.26, stdev=40.98
    clat (msec): min=2, max=145, avg=64.10, stdev=13.43
     lat (msec): min=2, max=145, avg=64.20, stdev=13.43
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   48], 10.00th=[   61], 20.00th=[   61],
     | 30.00th=[   65], 40.00th=[   65], 50.00th=[   65], 60.00th=[   65],
     | 70.00th=[   65], 80.00th=[   65], 90.00th=[   65], 95.00th=[   78],
     | 99.00th=[  117], 99.50th=[  124], 99.90th=[  136], 99.95th=[  142],
     | 99.99th=[  146]
   bw (  KiB/s): min=430080, max=559104, per=100.00%, avg=509052.88, stdev=25108.05, samples=41
   iops        : min=  420, max=  546, avg=497.12, stdev=24.52, samples=41
  lat (msec)   : 4=0.19%, 10=0.63%, 20=1.19%, 50=3.23%, 100=92.05%
  lat (msec)   : 250=2.71%
  cpu          : usr=0.64%, sys=6.77%, ctx=10229, majf=0, minf=8206
  IO depths    : 1=0.2%, 2=0.4%, 4=0.8%, 8=1.6%, 16=3.3%, 32=93.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.2%, 64=0.0%, >=64=0.0%
     issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=497MiB/s (521MB/s), 497MiB/s-497MiB/s (521MB/s-521MB/s), io=10.0GiB (10.7GB), run=20620-20620msec

Disk stats (read/write):
    dm-1: ios=10162/36, merge=0/0, ticks=629004/2880, in_queue=631884, util=99.62%, aggrios=10252/22, aggrmerge=0/14, aggrticks=636934/1907, aggrin_queue=638842, aggrutil=99.37%
  sdb: ios=10252/22, merge=0/14, ticks=636934/1907, in_queue=638842, util=99.37%

fio write:

Code:
root@pve1:/tmp# fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=500m --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
TEST: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][11.7%][w=104MiB/s][w=104 IOPS][eta 00m:53s]
Jobs: 1 (f=1): [W(1)][21.7%][w=90.0MiB/s][w=90 IOPS][eta 00m:47s]
Jobs: 1 (f=1): [W(1)][31.7%][w=100MiB/s][w=100 IOPS][eta 00m:41s]
Jobs: 1 (f=1): [W(1)][41.7%][w=75.0MiB/s][w=75 IOPS][eta 00m:35s]
Jobs: 1 (f=1): [W(1)][51.7%][w=98.0MiB/s][w=98 IOPS][eta 00m:29s]
Jobs: 1 (f=1): [W(1)][61.7%][w=93.1MiB/s][w=93 IOPS][eta 00m:23s]
Jobs: 1 (f=1): [W(1)][71.7%][w=95.0MiB/s][w=95 IOPS][eta 00m:17s]
Jobs: 1 (f=1): [W(1)][81.7%][w=92.0MiB/s][w=92 IOPS][eta 00m:11s]
Jobs: 1 (f=1): [W(1)][91.7%][w=100MiB/s][w=100 IOPS][eta 00m:05s]
Jobs: 1 (f=1): [W(1)][100.0%][w=88.0MiB/s][w=88 IOPS][eta 00m:00s]
TEST: (groupid=0, jobs=1): err= 0: pid=34850: Wed Aug 10 16:39:43 2022
  write: IOPS=96, BW=96.0MiB/s (101MB/s)(5783MiB/60210msec); 0 zone resets
    slat (usec): min=62, max=295327, avg=387.31, stdev=7305.32
    clat (msec): min=12, max=761, avg=332.20, stdev=107.85
     lat (msec): min=12, max=761, avg=332.59, stdev=107.62
    clat percentiles (msec):
     |  1.00th=[  124],  5.00th=[  218], 10.00th=[  230], 20.00th=[  241],
     | 30.00th=[  251], 40.00th=[  271], 50.00th=[  309], 60.00th=[  334],
     | 70.00th=[  372], 80.00th=[  460], 90.00th=[  485], 95.00th=[  506],
     | 99.00th=[  625], 99.50th=[  642], 99.90th=[  709], 99.95th=[  709],
     | 99.99th=[  760]
   bw (  KiB/s): min=53248, max=145408, per=99.79%, avg=98150.40, stdev=19827.96, samples=120
   iops        : min=   52, max=  142, avg=95.85, stdev=19.36, samples=120
  lat (msec)   : 20=0.02%, 50=0.12%, 100=0.55%, 250=29.03%, 500=64.41%
  lat (msec)   : 750=5.83%, 1000=0.03%
  cpu          : usr=0.74%, sys=0.58%, ctx=379, majf=0, minf=13
  IO depths    : 1=0.2%, 2=0.4%, 4=0.8%, 8=1.7%, 16=3.3%, 32=93.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.2%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5783,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=96.0MiB/s (101MB/s), 96.0MiB/s-96.0MiB/s (101MB/s-101MB/s), io=5783MiB (6064MB), run=60210-60210msec

Disk stats (read/write):
    dm-1: ios=0/5946, merge=0/0, ticks=0/1891796, in_queue=1891796, util=96.09%, aggrios=147/7696, aggrmerge=0/78, aggrticks=32548/2467653, aggrin_queue=2500202, aggrutil=93.67%
  sdb: ios=147/7696, merge=0/78, ticks=32548/2467653, in_queue=2500202, util=93.67%

Random 4K read QD1:
Code:
root@pve1:/tmp# fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randread --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting
TEST: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][11.7%][r=30.9MiB/s][r=7905 IOPS][eta 00m:53s]
Jobs: 1 (f=1): [r(1)][21.7%][r=21.5MiB/s][r=5493 IOPS][eta 00m:47s] 
Jobs: 1 (f=1): [r(1)][31.7%][r=29.0MiB/s][r=7671 IOPS][eta 00m:41s] 
Jobs: 1 (f=1): [r(1)][41.7%][r=21.3MiB/s][r=5441 IOPS][eta 00m:35s] 
Jobs: 1 (f=1): [r(1)][51.7%][r=21.6MiB/s][r=5529 IOPS][eta 00m:29s] 
Jobs: 1 (f=1): [r(1)][61.7%][r=30.6MiB/s][r=7829 IOPS][eta 00m:23s] 
Jobs: 1 (f=1): [r(1)][71.7%][r=21.2MiB/s][r=5425 IOPS][eta 00m:17s] 
Jobs: 1 (f=1): [r(1)][81.7%][r=21.1MiB/s][r=5390 IOPS][eta 00m:11s] 
Jobs: 1 (f=1): [r(1)][91.7%][r=30.7MiB/s][r=7856 IOPS][eta 00m:05s] 
Jobs: 1 (f=1): [r(1)][100.0%][r=21.0MiB/s][r=5386 IOPS][eta 00m:00s]
TEST: (groupid=0, jobs=1): err= 0: pid=37345: Wed Aug 10 16:57:33 2022
  read: IOPS=6421, BW=25.1MiB/s (26.3MB/s)(1505MiB/60001msec)
    slat (usec): min=7, max=451, avg=16.44, stdev= 9.53
    clat (usec): min=3, max=10429, avg=136.20, stdev=93.57
     lat (usec): min=107, max=10468, avg=152.93, stdev=95.60
    clat percentiles (usec):
     |  1.00th=[   98],  5.00th=[  100], 10.00th=[  100], 20.00th=[  102],
     | 30.00th=[  105], 40.00th=[  121], 50.00th=[  141], 60.00th=[  145],
     | 70.00th=[  147], 80.00th=[  163], 90.00th=[  178], 95.00th=[  186],
     | 99.00th=[  210], 99.50th=[  212], 99.90th=[  465], 99.95th=[  523],
     | 99.99th=[  635]
   bw (  KiB/s): min=20384, max=31720, per=100.00%, avg=25738.96, stdev=4667.43, samples=119
   iops        : min= 5096, max= 7930, avg=6434.74, stdev=1166.86, samples=119
  lat (usec)   : 4=0.01%, 20=0.01%, 50=0.01%, 100=8.35%, 250=91.52%
  lat (usec)   : 500=0.06%, 750=0.06%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=6.05%, sys=16.36%, ctx=385282, majf=0, minf=19
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=385283,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=25.1MiB/s (26.3MB/s), 25.1MiB/s-25.1MiB/s (26.3MB/s-26.3MB/s), io=1505MiB (1578MB), run=60001-60001msec

Disk stats (read/write):
    dm-1: ios=384695/107, merge=0/0, ticks=52244/1004, in_queue=53248, util=99.93%, aggrios=385319/55, aggrmerge=0/52, aggrticks=51799/602, aggrin_queue=52401, aggrutil=99.85%
  sdb: ios=385319/55, merge=0/52, ticks=51799/602, in_queue=52401, util=99.85%

Mixed random read & write:
Code:
root@pve1:/tmp# fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randrw --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting
TEST: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [m(1)][11.7%][r=72KiB/s,w=104KiB/s][r=18,w=26 IOPS][eta 00m:53s]
Jobs: 1 (f=1): [m(1)][21.7%][r=72KiB/s,w=100KiB/s][r=18,w=25 IOPS][eta 00m:47s] 
Jobs: 1 (f=1): [m(1)][31.7%][r=80KiB/s,w=104KiB/s][r=20,w=26 IOPS][eta 00m:41s] 
Jobs: 1 (f=1): [m(1)][41.7%][r=128KiB/s,w=96KiB/s][r=32,w=24 IOPS][eta 00m:35s] 
Jobs: 1 (f=1): [m(1)][51.7%][r=136KiB/s,w=104KiB/s][r=34,w=26 IOPS][eta 00m:29s]
Jobs: 1 (f=1): [m(1)][61.7%][r=116KiB/s,w=100KiB/s][r=29,w=25 IOPS][eta 00m:23s]
Jobs: 1 (f=1): [m(1)][71.7%][r=96KiB/s,w=104KiB/s][r=24,w=26 IOPS][eta 00m:17s]  
Jobs: 1 (f=1): [m(1)][81.7%][r=140KiB/s,w=96KiB/s][r=35,w=24 IOPS][eta 00m:11s]  
Jobs: 1 (f=1): [m(1)][91.7%][r=84KiB/s,w=104KiB/s][r=21,w=26 IOPS][eta 00m:05s]  
Jobs: 1 (f=1): [m(1)][100.0%][r=72KiB/s,w=108KiB/s][r=18,w=27 IOPS][eta 00m:00s]
TEST: (groupid=0, jobs=1): err= 0: pid=37594: Wed Aug 10 16:59:05 2022
  read: IOPS=24, BW=96.3KiB/s (98.6kB/s)(5780KiB/60004msec)
    slat (nsec): min=14631, max=62076, avg=40681.12, stdev=6151.86
    clat (usec): min=95, max=1666, avg=184.57, stdev=45.16
     lat (usec): min=141, max=1730, avg=225.92, stdev=45.46
    clat percentiles (usec):
     |  1.00th=[  163],  5.00th=[  167], 10.00th=[  167], 20.00th=[  169],
     | 30.00th=[  172], 40.00th=[  178], 50.00th=[  180], 60.00th=[  182],
     | 70.00th=[  190], 80.00th=[  194], 90.00th=[  206], 95.00th=[  210],
     | 99.00th=[  223], 99.50th=[  367], 99.90th=[  474], 99.95th=[ 1663],
     | 99.99th=[ 1663]
   bw (  KiB/s): min=   32, max=  208, per=99.66%, avg=96.81, stdev=35.05, samples=119
   iops        : min=    8, max=   52, avg=24.20, stdev= 8.76, samples=119
  write: IOPS=25, BW=101KiB/s (104kB/s)(6088KiB/60004msec); 0 zone resets
    slat (nsec): min=28721, max=65760, avg=46756.60, stdev=2628.93
    clat (usec): min=2294, max=20597, avg=9817.41, stdev=583.96
     lat (usec): min=2342, max=20645, avg=9864.84, stdev=583.95
    clat percentiles (usec):
     |  1.00th=[ 9634],  5.00th=[ 9765], 10.00th=[ 9765], 20.00th=[ 9765],
     | 30.00th=[ 9765], 40.00th=[ 9765], 50.00th=[ 9765], 60.00th=[ 9765],
     | 70.00th=[ 9765], 80.00th=[ 9765], 90.00th=[ 9896], 95.00th=[ 9896],
     | 99.00th=[12518], 99.50th=[12518], 99.90th=[19006], 99.95th=[20579],
     | 99.99th=[20579]
   bw (  KiB/s): min=   88, max=  120, per=99.55%, avg=101.45, stdev= 7.06, samples=119
   iops        : min=   22, max=   30, avg=25.36, stdev= 1.76, samples=119
  lat (usec)   : 100=0.03%, 250=48.23%, 500=0.40%
  lat (msec)   : 2=0.03%, 4=0.03%, 10=49.88%, 20=1.35%, 50=0.03%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=93, max=19403, avg=537.02, stdev=363.42
    sync percentiles (nsec):
     |  1.00th=[  334],  5.00th=[  338], 10.00th=[  342], 20.00th=[  350],
     | 30.00th=[  572], 40.00th=[  580], 50.00th=[  588], 60.00th=[  588],
     | 70.00th=[  588], 80.00th=[  604], 90.00th=[  612], 95.00th=[  612],
     | 99.00th=[  620], 99.50th=[  620], 99.90th=[  652], 99.95th=[  668],
     | 99.99th=[19328]
  cpu          : usr=0.12%, sys=0.53%, ctx=5220, majf=0, minf=14
  IO depths    : 1=199.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1445,1522,0,2963 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=96.3KiB/s (98.6kB/s), 96.3KiB/s-96.3KiB/s (98.6kB/s-98.6kB/s), io=5780KiB (5919kB), run=60004-60004msec
  WRITE: bw=101KiB/s (104kB/s), 101KiB/s-101KiB/s (104kB/s-104kB/s), io=6088KiB (6234kB), run=60004-60004msec

Disk stats (read/write):
    dm-1: ios=1444/8419, merge=0/0, ticks=216/82536, in_queue=82752, util=99.88%, aggrios=1481/6080, aggrmerge=0/2355, aggrticks=539/59647, aggrin_queue=60185, aggrutil=99.75%
  sdb: ios=1481/6080, merge=0/2355, ticks=539/59647, in_queue=60185, util=99.75%


The last test is quite telling, but I'm not sure whether it's relevant.
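To isolate the sync-write latency that the mixed test hints at, a pure fsync-per-write test could be added as well (a sketch):
Bash:
# 4k writes with an fsync after each one, QD1: exposes per-sync latency of the SSD
fio --name=syncwrite --filename=/tmp/fio-sync.dat --rw=write --bs=4k --size=500m \
    --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --runtime=60 --group_reporting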