VMware Migration - Scenario & Seeking Advice

Hey everyone!

I've been using Proxmox on my home server for a while and have been quite happy, but I have a potentially bigger scenario here and I'd like to seek the advice of you fine folks.

I work for a local IT company that has a large construction group as a client. I am filling in for their IT Infrastructure Head (also a good friend of mine) while he's off for a couple of months on medical leave. Most of their IT infrastructure is in Azure, but they do have a fairly chonky couple of VMware 8.0 hosts on-prem that host a few smaller functions but, critically, are also the on-prem storage point for all their Veeam backups. Because Broadcom is Broadcom, the VMware support contract renewal comes with a massive rate hike, and the client is saying screw that; they've been out of support for a few months now. I suggested Proxmox (with its much more reasonably priced support, of course) as an alternative and they're interested. Several of us at my employer have experience with Proxmox, but none of us have done VMware migrations, and this one might have some unique challenges.

I have consulted the wiki page on migrations and, while it was helpful, I'd like to lay out the exact setup they have here and get feedback not only on the best way to approach the process, but also on any potential gotchas we might run into. They're looking to do several things in the course of this migration, and I'm concerned about both the risks and, most importantly, the downtime required to pull it off.

Currently, they have two hosts:

Host 1
HPE ProLiant DL385 Gen11
128 CPU(s) x AMD EPYC 9374F 32-Core Processor
512GB RAM (180GB in use on average)
2TB SSD (1.25TB in use)
10TB 10K SAS HDD (3.4TB in use)

Host 2
HPE ProLiant DL385 Gen10 Plus
32 CPU(s) x AMD EPYC 7262 8-Core Processor
96GB RAM (75GB in use on average)
240GB SSD (30GB used)
6TB SAS HDD (4TB in use)

At the end of this, they want everything consolidated to just Host 1, though Host 2 could be used as a staging host if needed. There are also 2-3 VMs currently running on Host 2 that we hope to have retired before this migration, which will free up capacity. Everything should be able to comfortably fit on Host 1 when that's done.

In addition, there is a 75TB Synology JetStor unit that stores their Veeam backups. The VM that hosts Veeam Backup & Replication on Host 1 directly connects to this device via Raw Device Mapping. According to my friend, this was required due to VMware's virtual disk size limits. From a bit of reading I've done, it sounds like Proxmox might not have this limitation, but I'm curious if people think this would be best maintained as-is or changed to something else. I'm trying to see exactly how the connection works, but the credentials on file for the Synology aren't working so I'm hoping I can figure it out once that's sorted.

I'm not sure how many physical NICs the host has (I'm guessing at least 4), but there are 10 virtual networks set up on the cluster. We might not need a couple of them, but I'm not sure yet.

The Challenges

Here's where things get fun. We are planning this well ahead of time and the client knows downtime will be required, but some of these systems host things that can't be down for long during work hours. I'm not sure if achieving this over a weekend is viable given the amount of conversion work that will have to be done. If not, we might be able to take up-to-the-minute backups of some of the VMs and temporarily spin them up in Azure to keep things going, though that will pose its own challenges as the data deltas grow. We might also need to buy or rent additional storage to act as a "bridge" when converting the disks, since I assume we'll need roughly twice our current space to hold the converted copies.

I'm really hoping that the Veeam backup drive attached via Raw Device Mapping isn't in a VMware-only format and that we can just detach it from the VMware VM and attach it to the Proxmox VM. If not, that's going to be...fun. Veeam also officially supports Proxmox now, which is good, though I know we'll have to take fresh full backups of the new VMs since, well, they're new. I believe the backup store has plenty of free space, so that shouldn't be an issue.

Aside from general feedback on this scenario, here are the specific questions I can think of at the moment:

  • Since we have two hosts but one is much less powerful than the other, we can't simply move all the running VMs over to it and then do the automatic ESXi import on each. We don't have to have them all running necessarily and can plan for downtime, but the OS drives are stored on the hosts themselves (with backups from Veeam on that Synology NAS appliance) and there isn't enough space on Host 2 to accommodate all the disks from Host 1 at once. This will be a problem as we obviously can't run two hypervisors on Host 1 at once. Is the solution for this just to acquire or rent additional storage we can attach as a temporary datastore to Host 2 to store all of Host 1's disks until we can import them?
  • Any concerns about bringing over that many virtual networks or do we just want to make sure we re-create them like-for-like in Proxmox?
  • Any concerns with drivers for hardware on a server like this with Proxmox? Should we get an HPE Debian driver pack for it if one's available or should Proxmox be able to figure it out?
  • I've never seen a NAS attached via Raw Device Mapping like this. If anyone has that experience, is it actually a VMware-formatted volume, or does the guest just see it as a big-ass hard drive formatted directly with the OS's filesystem? I'm hoping it's the latter, because it's NTFS and we could hopefully just attach it to the Proxmox VM and it would just work (something like the sketch after this list), but I'm not sure as I can't check the appliance right now.
  • Veeam does support restoring VMware backups to a new Proxmox host, similar to the automatic import, except Veeam handles the whole process. Anyone have experience with this?
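If it does turn out to be just a plain NTFS volume presented as a raw block device, my understanding is that Proxmox can pass a whole physical disk straight through to a VM. A minimal sketch of what I'm imagining (the VM ID and device path below are made-up placeholders):

Code:
    # Find the stable by-id path of the disk/LUN on the Proxmox host
    ls -l /dev/disk/by-id/
    # Attach the entire device to the Veeam VM (VM ID 105 and the device ID are placeholders)
    qm set 105 -scsi2 /dev/disk/by-id/scsi-360014xxxxxxxxxxxx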

I'll probably think of more questions and will update as I get them. Sorry for the long post, but we want to make 100% sure of this before we propose it to the client since it's kind of a weird setup. I think we can pull it off, I just want to have all our ducks in a row, and since people here are very knowledgeable and helpful, I'm hoping some might be willing to offer advice.

Thanks very much! :)
 
I run Proxmox on 10th through 14th-generation Dells at work. Yup, this is with latest Proxmox 9.1.x with latest 7.0.x kernel. No issues.

I just made sure the machines have the latest firmware and BIOS/UEFI.

I do NOT use any RAID controllers (PERC). I swapped them out for IT-mode controllers (i.e., HBA330). If the machine doesn't have an IT-mode controller for it, I use the SATA ports.

I use two small drives to mirror Proxmox using ZFS RAID-1, and the rest of the drives are for ZFS/Ceph.

I find Proxmox KVM/QEMU "faster" than ESXi on the same hardware.

I use the following optimizations learned through trial-and-error. YMMV.

Code:
    Set SAS HDD Write Cache Enable (WCE) (sdparm -s WCE=1 -S /dev/sd[x])
    Set VM Disk Cache to None if clustered, Writeback if standalone
    Set VM Disk controller to VirtIO-Single SCSI controller and enable IO Thread & Discard option
    Set VM CPU Type for Linux to 'Host'
    Set VM CPU Type for Windows to 'x86-64-v2-AES' on older CPUs/'x86-64-v3' on newer CPUs/'nested-virt' on Proxmox 9.1
    Set VM CPU NUMA
    Set VM Networking VirtIO Multiqueue to 1
    Set VM Qemu-Guest-Agent software installed and VirtIO drivers on Windows
    Set VM IO Scheduler to none/noop on Linux
    Set Ceph RBD pools to use 'krbd' option
    Set Ceph 'bluestore_prefer_deferred_size_hdd = 0' in osd stanza in /etc/pve/ceph.conf for SAS HDD
    Set Ceph 'bluestore_min_alloc_size_hdd = 65536' in osd stanza in /etc/pve/ceph.conf for SAS HDD
    Set Ceph Erasure Coding profiles to 'plugin=ISA' & 'technique=reed_sol_van'
    Set Ceph Erasure Coding profiles to 'stripe_unit=65536' for SAS HDD
 
I'll probably think of more questions and will update as I get them. Sorry for the long post, but we want to make 100% sure of this before we propose it to the client since it's kind of a weird setup. I think we can pull it off, I just want to have all our ducks in a row, and since people here are very knowledgeable and helpful, I'm hoping some might be willing to offer advice.

Given the scope of this project, it might be worth considering a proper subscription and opening a support ticket before finalizing the proposal.
Since this is a client-facing production migration with downtime constraints, backup implications, and a somewhat unusual setup, having the Proxmox support team review the migration design would probably be the safest route.

https://www.proxmox.com/en/products/proxmox-virtual-environment/pricing


That said, here are a few thoughts on your questions:
  • Since we have two hosts but one is much less powerful than the other, we can't simply move all the running VMs over to it and then do the automatic ESXi import on each. We don't have to have them all running necessarily and can plan for downtime, but the OS drives are stored on the hosts themselves (with backups from Veeam on that Synology NAS appliance) and there isn't enough space on Host 2 to accommodate all the disks from Host 1 at once. This will be a problem as we obviously can't run two hypervisors on Host 1 at once. Is the solution for this just to acquire or rent additional storage we can attach as a temporary datastore to Host 2 to store all of Host 1's disks until we can import them?

There are several migration approaches documented by Proxmox, including integrated VM import, live-import, and the “Attach Disk & Move Disk” method for minimal downtime. If minimizing downtime is an important requirement, one practical option may be to provision temporary shared storage, such as NFS, make the source VMDKs available there, and then use the Attach Disk & Move Disk workflow.

https://pve.proxmox.com/wiki/Migrate_to_Proxmox_VE
https://pve.proxmox.com/wiki/Advanced_Migration_Techniques_to_Proxmox_VE
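As a rough CLI sketch of that staging approach (the storage names, IP, paths, and VM ID below are placeholders, not values from your environment):

Code:
    # Add a temporary NFS export as staging storage holding the source VMDKs
    pvesm add nfs vmware-staging --server 192.0.2.10 --export /export/staging --content images
    # Import a VMDK from the staging share into an existing VM definition
    qm disk import 101 /mnt/pve/vmware-staging/oldvm/oldvm.vmdk local-lvm --format raw
    # Or, with the disk already attached, move it to its final storage while the VM runs
    qm disk move 101 scsi0 local-lvm --delete 1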

  • Any concerns about bringing over that many virtual networks or do we just want to make sure we re-create them like-for-like in Proxmox?

For networking, I would not expect every VMware object to map over one-to-one, and not every feature will necessarily have an exact equivalent. In some cases, such as private VLANs, you may be able to achieve a similar outcome, but not with exactly the same implementation. That said, I would expect Proxmox’s networking features to be flexible enough to reproduce an equivalent overall design. As for the number of networks, around ten should very likely not be an issue.
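As an illustration, many VMware port-group layouts can be collapsed onto a single VLAN-aware bridge with per-VM VLAN tags rather than ten separate bridges. A minimal /etc/network/interfaces sketch (the uplink name and VLAN range are just examples):

Code:
    auto vmbr0
    iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

Each VM NIC then simply gets the appropriate VLAN tag in its network device settings.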

  • Any concerns with drivers for hardware on a server like this with Proxmox? Should we get an HPE Debian driver pack for it if one's available or should Proxmox be able to figure it out?

On the HPE side, Proxmox VE is Debian/Linux-based and designed for standard x86-64 server hardware, so I would generally expect many standard components to work with the normal Proxmox/Linux driver stack. At least personally, I have not run into any HPE-specific issues so far.
 
That's a big help d.oishi, thanks! Not every vendor provides migration assistance like this with a standard support contract, but it's good to know that Proxmox will at least review the plan with us. The client will have no problem subscribing to a plan before the migration as they were planning to pay for enterprise support regardless.

My aim this morning will be looking at this Raw Device Mapped appliance that hosts the Veeam repositories and seeing how it's configured. Based on what I can tell so far, I think it will work with Proxmox; it's just a question of whether any filesystem conversion will be needed.
 
To summarize: you want to replace a single-host vSphere setup with a single-host PVE setup. Easy peasy, but you need to understand that you will need sufficient staging space. Another aspect is the backup: reading between the lines, Veeam is deployed as a virtual machine with storage provided as a VMDK by an iSCSI datastore.

SINCE your storage is already an iSCSI store, here is my proposed sequence of events:
1. Create a new LUN on the JetStor. Presumably there is enough room for approx. 5TB.
2. Decommission the second node. Whether you keep the VMs or not isn't particularly material, since its payload can and should fit on the first. If you don't have sufficient storage, just map an additional LUN from the JetStor. If the JetStor does not have enough room, it's time to do some backup cleanup and pruning; offload any historical/long-term backups to a different device altogether, since keeping all your eggs in that one basket is bad practice anyway.
3. Install PVE on node 2. Map the new 5TB LUN you created in step 1, and don't forget to create an IQN group containing its new initiator ID (see the sketch after this list).
4. Migrate your workload via whatever means you wish to use (PVE integrated import, Veeam, by hand).
5. Decommission node 1. You will NOT be touching the internal storage yet, just the boot partition. If you want a safety net, keep the original boot device handy or take a bare-metal backup; you will be able to restore it and resume original function if it comes to that.
6. Install PVE on node 1 and cluster it with node 2. You will need a quorum device, but you can use just about anything for the purpose so long as it has connectivity.
7. With the cluster formed, you can migrate your workloads to node 1 and turn them on. Give this step a nice long period (how long depends on how many VMs and their guest OSes) to make sure they launch properly, exorcise their VMware Tools, install qemu-guest-agent, etc.
8. Optional but suggested: remove node 2 from the cluster and redeploy it as a DEDICATED backup node (Veeam if you are in love with it, PBS if you want the full PVE experience).
9. Depending on your guest IOPS requirements, redeploy the internal disk store previously used for VMware as a PVE datastore and migrate your workloads from the JetStor to it.
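For step 3, a rough sketch of what mapping the new LUN could look like on the PVE side (the portal address and target IQN are made-up placeholders; the IQN group/LUN masking itself is configured on the JetStor):

Code:
    # Add the JetStor iSCSI target as a PVE storage
    pvesm add iscsi jetstor-iscsi --portal 192.0.2.20 --target iqn.2005-03.com.jetstor:target0 --content none
    # List the LUNs PVE now sees behind that target
    pvesm list jetstor-iscsi
    # Typically you would then layer LVM on top of the LUN (Datacenter -> Storage -> Add -> LVM)
    # so PVE can allocate VM disks from it, rather than handing guests the raw LUN.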

If/when you want to graduate to a cluster, you'd have new choices to make.
 
Great advice alexskysilk, thank you as well! :) Your approach is quite similar to the one I have in my head. I'm still trying to get into the JetStor right now, but should have access soon. I don't know if it's iSCSI or not, but it's not a VMware datastore; the drive is raw-mapped directly to the Veeam VM. I don't know enough about JetStor (never used one before) to know how the volume is actually formatted. As I understand it, there's supposed to be plenty of free space on it, so I'm hoping using it for staging will be doable. I do like Veeam, but keeping or changing that product isn't my call since I'm just filling in for the guy who makes that choice, plus they use other components of Veeam for Azure and 365 that use this destination as a secondary target, so I think they'll be staying with it for now.

Once I can get into the JetStor, I can hopefully get a better idea of how this thing is setup. Thanks again!