Proxmox Merging resources

faisa7847

Jul 25, 2024
Hi everyone,


I’m currently running a Proxmox VE cluster with two physical servers, each equipped with its own GPU. I’m looking to set up a virtual machine that can leverage both GPUs simultaneously, even though they’re on separate nodes.


I know GPU passthrough is possible on a per-node basis using VFIO, but is there any way (officially supported or workaround) to make a single VM access and utilize GPUs from two different physical servers?


Some context:


  • The GPUs are different models but both NVIDIA.
  • I want to use this for AI training workloads (e.g., PyTorch/TensorFlow).
  • Ideally, I’d like to avoid splitting the workload manually unless there’s no alternative.

Is there a cluster-aware GPU pooling solution or passthrough trick that could help? Or should I just create two separate VMs and use a distributed compute framework?