Slow cloning

vertigomike

New Member
Apr 3, 2024
I have a three-node cluster. I have a Synology NAS attached to the nodes via iSCSI and it has worked fine. I keep a VM template on it and use an Ansible script to clone from it when building out a lab demo. It works, but the cloning is slow (roughly 30 minutes). Today's clones have been going to local storage. The Synology has two 1G interfaces bonded, so it's not super fast, and I assumed that was part of my problem.

I recently added a storage server (running StarWind VSAN) as an additional iSCSI target for the cluster hosts. It's connected to my network switch via two 40G interfaces on separate networks. Two of my Proxmox hosts have dual 10G interfaces to the same switch. I added two more 10G interfaces to one of the PVE servers and configured them individually on those dedicated SAN subnets. Looking at the server info, it does appear to be using the dedicated link to talk to the storage. However, cloning is no better: the 300G template still takes around 30 minutes, and moving the template onto the new storage made no difference.

I have not re-enabled jumbo frames yet. I tried that a week or so ago and ran into a bunch of issues (I was traveling and doing it remotely), so I haven't tried again, but I thought I'd get some input to see what else I might be missing.
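
For reference, when I get back to jumbo frames I expect it to be roughly the following on each host; the interface names and the target IP below are placeholders, and the switch ports and the storage side would need a matching MTU as well:

ip link set enp65s0f0 mtu 9000    # temporary, lost on reboot; persist it in /etc/network/interfaces
ip link set enp65s0f1 mtu 9000
ping -M do -s 8972 10.10.10.10    # 9000 minus 28 bytes of IP/ICMP headers; checks the path passes jumbo frames unfragmented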
 
Hi @vertigomike, welcome to the forum.

There are a lot of moving pieces in your description, I'd recommend concentrating on one particular combination.

You mentioned that you are using iSCSI as the storage protocol. That implies that your clones are full clones, i.e. data is copied from one volume to another. As you can imagine, the data travels a few times: read/transfer to the host, write/transfer to the storage, confirmations, etc.
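
If you want to see that on the command line, a full clone to a specific target storage looks roughly like this (the VM IDs and storage name are placeholders, not your actual configuration); on LVM-over-iSCSI the clone ends up full regardless, as far as I know, since linked clones need snapshot support:

qm clone 9000 201 --full 1 --name demo-clone --storage your-iscsi-lvm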

It is unlikely that your 40G, or even 10G, interfaces are the bottleneck. It is more likely that the culprit is storage-related, i.e. the disks. You haven't mentioned what type of disks are used in any of your storage devices. If HDDs are involved, even a 1G link may not be the limiting factor.

Once the disk obstacle is overcome, you still have to contend with the iSCSI implementation of each device and with CPU sharing on both the storage and the host/client.

I would recommend the following:
a) Establish a network transfer baseline using iperf or a similar tool. This takes the storage out of the equation.
b) Using fio, establish a read/write baseline for each of the storage devices.
Example invocations for both are sketched below.
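
Something along these lines as a starting point (the IP, file path, and sizes are placeholders; point the fio test at a scratch file or test volume, never at a LUN that holds data):

iperf3 -s                                   # on the storage side
iperf3 -c 10.10.10.10 -P 4 -t 30            # on the PVE host, over the dedicated SAN subnet

fio --name=seqread --rw=read --bs=1M --size=10G --direct=1 --ioengine=libaio --iodepth=16 --filename=/mnt/test/fio.tmp
fio --name=seqwrite --rw=write --bs=1M --size=10G --direct=1 --ioengine=libaio --iodepth=16 --filename=/mnt/test/fio.tmp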



Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I'll see what I can come up with this weekend. I get similar performance from two different iSCSI hosts: one is an older Synology with cheap consumer-grade NAS SATA drives, and the other is a Dell server with 24 SAS drives in a hardware RAID 0+1 setup.