Import from ESXi Extremely Slow

Ahh, never mind, I didn't read the documentation. I added bs=16M and got a much better result: 973 MB/s.

By default it uses 512-byte chunks, which is kind of wild. I'll see if I can make progress on the stream import. :)

Edit:


Default:
Command:
qemu-img dd -f raw -O raw osize=32212254720 if=/root/test-netcat-plaindd.raw of=/root/test-local-qemu-dd.raw

Timing
- Total Elapsed Time: 12 minutes 22.28 seconds
- User CPU Time: 283.83 seconds
- System CPU Time: 467.52 seconds
- Total CPU Time: 751.35 seconds
- CPU Usage: 101% (slightly over one CPU core)

Performance
- Data Transferred: ~32.2 GB
- Throughput: ~43.4 MB/s
- Maximum Memory Used: ~32 MB

I/O Statistics
- File System Reads: 18,259,176 operations
- File System Writes: 62,945,008 operations
- Context Switches: 251,830,712 voluntary / 14,882 involuntary
- Major Page Faults: 0 (no page faults requiring I/O; all data stayed in memory)
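For reference, counters like these are what GNU time prints in verbose mode; a wrapper along the following lines reproduces them (the exact invocation used for these runs is an assumption on my part):

/usr/bin/time -v \
    qemu-img dd -f raw -O raw osize=32212254720 \
    if=/root/test-netcat-plaindd.raw of=/root/test-local-qemu-dd.raw
# "Maximum resident set size", "Voluntary/Involuntary context switches"
# and "Major page faults" in the -v output map to the figures listed above.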



With bs=16M:
Command:
qemu-img dd -f raw -O raw bs=16M osize=32212254720 if=/root/test-netcat-plaindd.raw of=/root/test-local-qemu-dd.raw

Timing
- Total Elapsed Time: 33.09 seconds
- User CPU Time: 0.06 seconds
- System CPU Time: 23.81 seconds
- CPU Usage: 72%

Performance
- Data Transferred: ~32.2 GB
- Throughput: ~973 MB/s
- Maximum Memory Used: ~47 MB

I/O Statistics
- File System Reads: 16,802,016 operations
- File System Writes: 63,029,640 operations
- Context Switches: 12,759 voluntary / 632 involuntary
- Major Page Faults: 19 (vs 0 before)

Comparison: Default vs bs=16M
- Total Elapsed Time: 12 min 22.28 s -> 33.09 s
- Throughput: ~43.4 MB/s -> ~973 MB/s
- Maximum Memory Used: ~32 MB -> ~47 MB
- Voluntary Context Switches: 251,830,712 -> 12,759

So you know, only about 19,700x fewer context switches (251,830,712 / 12,759). Lol.
 
So as we can see, switching hypervisor technology (ESXi, vSphere, PVE, KVM, ...) or even hypervisor or storage hardware is a trivial amount of work when the whole set of VM/LXC images lives on (HA) NFS storage: you normally get full bandwidth with nothing more than cp commands, or simply another mount. Every other setup more or less ends in a migration nightmare. Hardware gets exchanged roughly every 5 years, while a hypervisor switch happens maybe on a >10-year rhythm, and usually, when something breaks, hardware has to be swapped immediately and unprepared. Whether that turns out stress-free or stressful depends on the "optimal" design you chose.
 
Appreciate it!

We have a few 10TB+ VMs to move and they wouldn't be done within a week of starting them lol

I learned on the last large cutover that the current tool slows down over time after the first few TB: it went from roughly 30 minutes per 60 GB to about 60 minutes per 60 GB. I was praying it would be done before Monday morning, and luckily it finished at 10 PM on Sunday. :oops:

I'll keep cracking away at it and see if I can come up with something realistic. I really need to just set up another 25GbE host so I don't have to bum others to test lol.
Why don't you use the shared-storage method, such as NFS? You will need to tune NFS a bit (a sketch follows below), but we managed to migrate VMs of 8-9 TB overnight with this method.
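A minimal sketch of the kind of NFS tuning meant here; the export path, subnet and options below are generic starting points, not the exact configuration we used (I no longer have it):

# /etc/exports on the NFS server (path and subnet are placeholders)
/export/migration 10.0.0.0/24(rw,async,no_subtree_check,no_root_squash)

# mount on the client with large transfer sizes
mount -t nfs -o rw,hard,tcp,rsize=1048576,wsize=1048576 \
    nfs-server:/export/migration /mnt/migration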

Let me share some figures from our migrations.

Native ESXi importer: about 110-130 MB/s. That is fine for our standard VM size (about 30 minutes per VM), but anything above 500 GB becomes too slow.

NFS import: We have established two NFS servers, one physical box with a RAID controller and SATA SSD disks, and one NFS server as a VM on Ceph (NVMe disks). The first step is to storage-migrate the VMs to NFS, which you can do live.

By tuning NFS (I do not recall which options we used), ESXi was copying data at about 1.1 GB/s to the NFS server on Ceph and at about 700 MB/s to the physical NFS server. At that stage you need to power off the VM and start the import on PVE, as sketched below. qm disk import was running at about 330 MB/s per disk, so if a VM has multiple disks you can achieve quite a good transfer rate. At this import stage the NFS server on Ceph also did much better, especially when importing several disks at the same time, but I do not recall the actual numbers.
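For concreteness, the PVE-side import step looks roughly like this; the VMID, storage name and paths are placeholders:

# mount the NFS export that holds the powered-off VM's disks
mount -t nfs nfs-server:/export/migration /mnt/migration

# import a VMDK into an existing VM's storage
# (on older PVE releases the command is 'qm importdisk')
qm disk import 100 /mnt/migration/vm-100-disk-0.vmdk local-zfs --format raw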

We have not tried leaving the VMDK in place and importing it while the VM is running.

I do not want to discourage this discussion about improving the performance; I just want to share a workaround for anyone struggling with import speed.

I wish that the native import speed improves, as the NFS import is more complicated and we still need to migrate about 110 VMs with 50 TB of data.