Import from ESXi Extremely Slow

spencerh

Member
May 18, 2021
Canada
I've recently been experimenting with the ESXi import process in preparation for a migration from VMware to Proxmox. I'm having an issue where the disk import runs extremely slowly. I've got a 10 Gb link between the ESXi host and Proxmox (verified with `iperf3` tests between the boxes on the same interface used for the import), but the import runs at around 350 Mbps. I've read elsewhere online that setting `Config.HostAgent.vmacore.soap.maxSessionCount` to `0` on the ESXi host should help with API throttling, but I see the same result after changing that setting.

Does anyone have any ideas about why the transfers from ESXi are so slow? What should we be expecting in terms of speed when importing VMs over a 10 Gb link?

I'm running Proxmox 9.0.10 and ESXi 7.0u3.
 
Did you take a look at the CPU usage of the import process in `top`? Maybe you're hitting a limit there.
 
From what I'm able to tell it doesn't seem that I'm hitting CPU limits. On the Proxmox side the import tool is using 11% of a CPU core and on the ESXi side I don't see anything using a full core.
 
We have observed the same thing here. I believe it's a limit of the ESXi API, though, as I'm pretty sure the import process uses the ESXi API to pull down the chunks of data. We have dual 25GbE between the machines, and an import of a 60GB VM takes about 30 minutes. The larger the VM, the longer it seems to take, and it doesn't scale linearly: instead of 120GB/hr, it's more like 30GB/hr. It took about 3 days for a 7TB VM to import.

I'm doing some tests right now to verify whether that is indeed the issue. I'm going to attach the storage that ESXi uses to the PVE host, use `qm importdisk`, and see what the speed is versus importing with the VMware import tool.
 
I've seen similar results in my limited testing. I'm able to pull the machines across the wire much faster using `scp` but then you lose out on the ESXi importer magic. You can go create the VMs and import the disks by hand but I was hoping to avoid doing that as it adds a lot more opportunity to screw things up, especially when migrating hundreds of VMs.
 
Did a quick test on a 1GbE connection:
qm import: 8m 50s
esxi import: 12m 34s

Once I get the environment set up, I'll test it on a 2x25GbE -> 2x25GbE connection.

But the import is definitely slower through the import tool.

No load on the PVE when this is importing either.
 
For reference, we're doing a 2x10Gb -> 2x10Gb transfer and seeing roughly 50 GiB imported in 20 minutes.

EDIT: When we do the same import via Veeam restore we're able to import the disk in ~6 minutes.
 
Okay, I'll preface this by saying that I don't know Rust, but if I'm reading this correctly, it looks like the ESXi import can only make 4 calls at once when pulling data.

This is what I think is happening:

It kicks off the `make_request` function:
Rust:
    async fn make_request<F>(&self, mut make_req: F) -> Result<Response<Body>, Error>
    where
        F: FnMut() -> Result<http::request::Builder, Error>,
    {
        let mut permit = self.connection_limit.acquire().await;

      ....

In there, where it says `self.connection_limit.acquire`, I believe that's pulling a permit from the `ConnectionLimit` defined here:

Rust:
impl ConnectionLimit {
    fn new() -> Self {
        Self {
            requests: tokio::sync::Semaphore::new(4),
            retry: tokio::sync::Mutex::new(()),
        }
    }

    /// This is supposed to cover an entire request including its retries.
    async fn acquire(&self) -> SemaphorePermit<'_> {
        // acquire can only fail when the semaphore is closed...
        self.requests
            .acquire()
            .await
            .expect("failed to acquire semaphore")
    }

Which, if I'm reading it correctly, is hard-coded to 4.

This is from the pve-esxi-import-tools repo:
https://git.proxmox.com/?p=pve-esxi-import-tools.git;a=summary

And I'm guessing the `download_do` function picks the next chunk of data to be pulled and requests it:

Rust:
async fn download_do(
        &self,
        query: &str,
        range: Option<Range<u64>>,
    ) -> Result<(Bytes, Option<u64>), Error> {
        let (parts, body) = self
            .make_request(|| {
                let mut req = Request::get(query);

                if let Some(range) = &range {
                    req = req.header(
                        "range",
                        &format!("bytes={}-{}", range.start, range.end.saturating_sub(1)),
                    )
                }

                Ok(req)
            })
            .await?
            .into_parts();

They may have picked a really low number for the import just to be on the safe side, similar to how backup jobs to PBS were originally handled a few months ago.

Again, if I'm reading this right, a test would be to change the 4 to a 6 or an 8 and see whether the speed scales with the increased number of calls.
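
To make that concrete, here's a minimal, hypothetical sketch (standalone, not the importer's actual code) of how a `tokio::sync::Semaphore` with a fixed number of permits caps the number of in-flight requests. The names and the simulated download are made up; bumping the permit count is the same kind of change being suggested above.

Rust:
// Hypothetical sketch: a tokio Semaphore with N permits limits how many
// simulated "chunk downloads" can run at the same time. Raising
// MAX_CONCURRENT is analogous to changing Semaphore::new(4) in the tool.
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Semaphore;

const MAX_CONCURRENT: usize = 4; // try 6 or 8 and compare wall-clock time

async fn fetch_chunk(idx: u64) {
    // stand-in for one ranged HTTP request to the ESXi datastore
    tokio::time::sleep(Duration::from_millis(200)).await;
    println!("chunk {idx} done");
}

#[tokio::main]
async fn main() {
    let limit = Arc::new(Semaphore::new(MAX_CONCURRENT));
    let mut tasks = Vec::new();

    for idx in 0..32u64 {
        let limit = Arc::clone(&limit);
        tasks.push(tokio::spawn(async move {
            // like ConnectionLimit::acquire(): wait until a permit is free
            let _permit = limit.acquire().await.expect("semaphore closed");
            fetch_chunk(idx).await;
            // permit is released when _permit goes out of scope
        }));
    }

    for t in tasks {
        t.await.unwrap();
    }
}

With the permit count at 4, only four chunks are ever in flight at once; raising it allows more overlap, up to whatever the ESXi side will actually serve.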
 
I remember some users had similiar issues if they had any snapshots in ESXi. Do you happen to have any snapshots? Could you try to remove them?

You could also try one of the other migration methods, e.g. this one for minimal downtime:
https://pve.proxmox.com/wiki/Migrate_to_Proxmox_VE#Attach_Disk_&_Move_Disk_(minimal_downtime)
Sorry forgot to mention this in my original post. No, I have no snapshots on the machine. We do have some other options, but I'll admit that at this point my curiosity has gotten the better of me and, options aside, I'd like to understand what's going on here and why it's so slow.
 
Regarding the hard-coded limit above: I'm going to see if I can figure out how to get the tool to compile (I'm also not a Rust developer) with that value changed to 8 and see if it has any impact on throughput. This is a great find, thanks for doing the legwork on this.
 
Tested with an NFS-accessed Linux VM vmdk on a T430 (32 threads). Creating a VM with 2 cores, 2GB RAM and a 0.001G disk: 2s. `qemu-img convert -t writeback -f vmdk -O raw linux-vm.vmdk vm-117-disk-0.raw`: created the 30G image in 4min 24s on NFS (~130% core usage). `mv` to nfs-path/images/117/.: 2s (or give the path directly to the `qemu-img` command).
This could easily be done in parallel from a script until you hit the I/O limit on ESXi, the network limit to/from ESXi, or the I/O (or even CPU) limit of the local or remote PVE NFS storage. Once an image is converted (while the others are still being converted in parallel), you could move the VM config around on the PVE hosts in /etc/pve/nodes/<node>/qemu-server/ and just start it, or fine-tune CPU/memory before starting. I wouldn't do that without a script, though
:)
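
To illustrate the "do it in parallel from a script" idea, here's a rough, hypothetical sketch that drives `qemu-img convert` with a small pool of worker threads. The paths, VMIDs and worker count are made-up examples, not a tested migration script.

Rust:
// Hypothetical sketch: convert a batch of VMDKs in parallel with a fixed
// number of worker threads. Paths and the worker count are illustrative.
use std::process::Command;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // (source vmdk on the ESXi/NFS datastore, destination raw image on PVE storage)
    let jobs = vec![
        ("/mnt/esxi-nfs/vm-a/vm-a.vmdk", "/mnt/pve-nfs/images/117/vm-117-disk-0.raw"),
        ("/mnt/esxi-nfs/vm-b/vm-b.vmdk", "/mnt/pve-nfs/images/118/vm-118-disk-0.raw"),
        ("/mnt/esxi-nfs/vm-c/vm-c.vmdk", "/mnt/pve-nfs/images/119/vm-119-disk-0.raw"),
    ];
    let queue = Arc::new(Mutex::new(jobs));
    let workers = 2; // raise until ESXi I/O, the network, or the target storage is the bottleneck

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || loop {
                // take the next pending conversion, stop when the queue is empty
                let job = queue.lock().unwrap().pop();
                let Some((src, dst)) = job else { break };
                let status = Command::new("qemu-img")
                    .args(["convert", "-t", "writeback", "-f", "vmdk", "-O", "raw", src, dst])
                    .status()
                    .expect("failed to start qemu-img");
                println!("{src} -> {dst}: {status}");
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
}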
 
I forked the ESXi import tool and increased the limits in a couple of places, if anyone cares to try it.

https://github.com/PwrBank/pve-esxi-import-tools

On a 1GbE connection I didn't notice a difference. However, I'm trying to get my 25GbE gear working at the moment, and it's acting all sorts of goofy: copying 1.5GB takes 30 minutes with the unpatched tool, so until it behaves at least somewhat like the 1GbE test environment I won't be able to reliably test it.

Let me know if anyone sees any increases.
 
I actually got my 25GbE network working.

Before and after the patch, it took roughly 11 minutes to import a 60GB VM. :(
I'll see if there's anything else that can be done with the tool to speed it up. I doubt it, but I'll see what might be feasible.


Edit 1:
Never mind, I'm an idiot. I accidentally set up the 1GbE link this time. It seems that when I have the management interfaces added to the 25GbE port group on ESXi, it times out half the time trying to load the VM list, so I need to figure that out, I guess.

Edit 2:
Also created a branch that uses a pool of connections to establish more than one connection at once. Right now the tool uses HTTP/2, which multiplexes all of the concurrent requests (4 by default, 16 in my fork) over a single TCP connection.

This branch tries to create 8 TCP connections instead:
https://github.com/PwrBank/pve-esxi-import-tools/tree/multitcp

I'll see if I can get the builds uploaded, but the source is there
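
For anyone curious what the round-robin pool idea looks like, here's a simplified, hypothetical sketch (not the branch's actual code): each request takes the next client from a fixed pool, so requests are spread across several TCP connections instead of being multiplexed over a single HTTP/2 connection.

Rust:
// Hypothetical sketch of a round-robin client pool. In the real tool each
// client would own its own TLS session and TCP connection; here it's just
// an id so the selection logic is visible on its own.
use std::sync::atomic::{AtomicUsize, Ordering};

struct Client {
    id: usize, // stand-in for an HTTPS connection to the ESXi host
}

struct ClientPool {
    clients: Vec<Client>,
    next: AtomicUsize,
}

impl ClientPool {
    fn new(size: usize) -> Self {
        Self {
            clients: (0..size).map(|id| Client { id }).collect(),
            next: AtomicUsize::new(0),
        }
    }

    /// Round-robin: request N is handed to client N % pool size.
    fn next_client(&self) -> &Client {
        let idx = self.next.fetch_add(1, Ordering::Relaxed) % self.clients.len();
        &self.clients[idx]
    }
}

fn main() {
    // 8 separate connections instead of one multiplexed HTTP/2 link
    let pool = ClientPool::new(8);
    for request in 0..12 {
        println!("request {request} -> client {}", pool.next_client().id);
    }
}

Since each client in the pool would hold its own connection, the transfer can use several TCP streams at once, which is the whole point of the multitcp branch.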
 
Compiled new versions here:
https://github.com/PwrBank/pve-esxi-import-tools/releases/tag/1.1.0

Here are the notable changes:
  • 8 concurrent TCP connections to ESXi (up from 1 with HTTP/2 multiplexing)
  • Round-robin distribution of requests across clients
  • Logging to track pool creation and usage patterns
  • Each client maintains its own SSL connector and connection state
My 25GbE networking is still being a goof, but it did show speed increases on a 1GbE connection.
 
So, weirdly, I re-copied my test VM and it got exactly the same throughput. It has two disks: a 1 GiB disk and a 50 GiB disk. With the original version the VM transfers in ~20 minutes, and with the modified importer I see the exact same performance, within maybe 40 seconds. I'm running over a 10Gbps connection and seeing peaks of around 700 Mbps. I ran 3 imports and saw the same performance each time. I also rebooted after the first test migration just to be on the safe side, and confirmed that I have the MTU set to 1500 on the interface I'm using to connect to ESXi.

That being said, I'm running version 1.1.0 and I'm still only seeing two connections, so maybe I'm doing something wrong. I had `while true; do echo "$(date)": "$(ss -tn | grep ESX-IP:443 | wc -l)"; sleep 1; done` running in a loop while I was watching a transfer and it never got above 3. Even when there's no transfer running it still reports 1 connection, so I think I'm still only getting 2 connections for the transfer.

Update before I posted this: I disabled and re-enabled the storage which appears to have forced a remount and I'm now seeing the expected number of connections (8). This did not affect performance from what I could tell, though.

Here's the CLI I'm seeing run during the migration:
/usr/libexec/pve-esxi-import-tools/esxi-folder-fuse --skip-cert-verification --change-user nobody --change-group nogroup -o allow_other --ready-fd 11 --user root --password-file /etc/pve/priv/storage/MY-ESX-HOST.pw my-esx-host.example.com.ca /run/pve/import/esxi/MY-ESX-HOST/manifest.json /run/pve/import/esxi/MY-ESX-HOST/mnt

Let me know if there's anything else I can do to provide more useful information. I'm going to keep fiddling around and doing some testing next week (starting Tuesday, it's Canadian Thanksgiving) to see if I can make it do... something different.

Thanks again for all the effort you're putting into this.