Hello,
We have a Proxmox 7.1 cluster, with 3 nodes.
VMs are on LVM over a shared iSCSI storage.
Storage access from the nodes reaches between 100 and 400 MiB/s for a sequential read.
Cloning a VM is very slow, with 2h 40min spent for cloning a 50 GiB VM (that's a bit more than 5 MiB cloned / second).
We have a few other vms running and making disc access, but I don't understand why the cloning operation is so slow.
The cloning task provides no information :
Nothing special in the system logs either.
Here is the result of
sdb is the shared iSCSI device, it's writing slowly (max 5 MiB/s) and it has iowait.
Here is the output for
What is
Why is this
Anything else I should check ?
I also noticed that the cloning task was stalled at 100% for a long time (maybe 1 hour). Does anything special happen at the end of the cloning operation, like calculating some checksum ?
We have a Proxmox 7.1 cluster, with 3 nodes.
VMs are on LVM over a shared iSCSI storage.
Storage access from the nodes reaches between 100 and 400 MiB/s for a sequential read.
Cloning a VM is very slow, with 2h 40min spent for cloning a 50 GiB VM (that's a bit more than 5 MiB cloned / second).
We have a few other vms running and making disc access, but I don't understand why the cloning operation is so slow.
The cloning task provides no information :
Code:
create full clone of drive scsi0 (LVM:vm-135-disk-0)
Wiping dos signature on /dev/datavg/vm-203-disk-0.
Logical volume "vm-203-disk-0" created.
transferred 0.0 B of 50.0 GiB (0.00%)
transferred 512.0 MiB of 50.0 GiB (1.00%)
transferred 1.0 GiB of 50.0 GiB (2.01%)
---snip---
transferred 49.7 GiB of 50.0 GiB (99.39%)
transferred 50.0 GiB of 50.0 GiB (100.00%)
transferred 50.0 GiB of 50.0 GiB (100.00%)
TASK OK
Nothing special in the system logs either.
Here is the result of
iostat -x 2
while cloning :
Code:
avg-cpu: %user %nice %system %iowait %steal %idle
12.24 0.00 0.52 11.39 0.00 75.85
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 108.00 676.00 0.00 0.00 0.00 6.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.40
dm-1 3.50 3712.00 0.00 0.00 209.71 1060.57 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.73 28.20
dm-10 2.00 8.00 0.00 0.00 274.00 4.00 320.00 1280.00 0.00 0.00 1924.33 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 616.33 54.00
dm-11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-9 0.00 0.00 0.00 0.00 0.00 0.00 1.50 10.00 0.00 0.00 490.67 6.67 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.74 36.60
sda 0.00 0.00 0.00 0.00 0.00 0.00 100.50 676.00 7.50 6.94 0.21 6.73 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.40
sdb 5.50 3720.00 0.00 0.00 232.91 676.36 5.50 4506.00 319.00 98.31 484.00 819.27 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.94 36.00
avg-cpu: %user %nice %system %iowait %steal %idle
11.85 0.00 0.28 11.30 0.00 76.57
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 3.00 12.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.20
dm-1 1.00 1024.00 0.00 0.00 2892.00 1024.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.89 144.80
dm-10 0.00 0.00 0.00 0.00 0.00 0.00 160.00 640.00 0.00 0.00 2052.35 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 328.38 87.40
dm-11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.50 10.00 0.00 0.00 0.00 20.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.20
dm-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-9 0.00 0.00 0.00 0.00 0.00 0.00 0.50 4.00 0.00 0.00 2044.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.02 102.20
sda 0.00 0.00 0.00 0.00 0.00 0.00 3.00 12.00 0.00 0.00 0.17 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.20
sdb 1.00 1024.00 0.00 0.00 2892.00 1024.00 1.00 642.00 159.50 99.38 2047.50 642.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.94 137.20
sdb is the shared iSCSI device, it's writing slowly (max 5 MiB/s) and it has iowait.
Here is the output for
iotop -oabtn60
during the cloning :
Code:
19:18:24 Total DISK READ: 724.33 K/s | Total DISK WRITE: 33.95 K/s
19:18:24 Current DISK READ: 724.33 K/s | Current DISK WRITE: 271.62 K/s
TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
19:18:24 1329951 be/4 root 0.00 B 0.00 B 0.00 % 94.39 % [kworker/u66:3+flush-253:10]
19:18:24 1332397 be/4 root 32.05 M 28.00 M 0.00 % 52.35 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1335132 be/4 root 34.06 M 28.00 M 0.00 % 46.83 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1336841 be/4 root 18.10 M 48.00 M 0.00 % 36.95 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1336844 be/4 root 32.05 M 28.00 M 0.00 % 29.30 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1337128 be/4 root 31.31 M 32.00 M 0.00 % 28.76 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1335131 be/4 root 38.79 M 22.00 M 0.00 % 22.71 % qemu-img convert -p -n -f raw -O raw /dev/datavg/vm-135-disk-0 /dev/datavg/vm-101-disk-0
19:18:24 1337610 be/4 root 0.00 B 0.00 B 0.00 % 99.99 % iscsiadm --mode session --sid 1 --rescan
What is
[kworker/u66:3+flush-253:10]
? Is it an IO thread for a vm ?Why is this
iscsiadm --mode session --sid 1 --rescan
command running ? I think Proxmox rescans all storage very often, can this be changed ? Do you have any idea why it shows 99.99 % IO, but 0 bytes read or written ? Should I be worried about this ?Anything else I should check ?
I also noticed that the cloning task was stalled at 100% for a long time (maybe 1 hour). Does anything special happen at the end of the cloning operation, like calculating some checksum ?