error occured during live-restore: MAX 8 vcpus allowed per VM on this node

logics

Thanks for the PVE 6.4 release! The Live-Restore feature is especially interesting to me, because I've always looked for ways to make restores faster in order to keep disaster recovery times to a minimum.

Situation:
  • Main Node has 16 cores / 32 threads
  • VM 101 has 32 vCPUs, because the database benefits from multithreading
  • Main Node has a cron job to back up the VM every 15 minutes to a local PBS datastore:
    2,17,32,47 * * * * root vzdump 101 --mailnotification failure --mode snapshot --mailto xx@yy.de --storage local101 --quiet 1
  • Backup Node has 4 cores / 8 threads
  • Backup Node has a PBS datastore sync job every 15 minutes (starting a few minutes later) that pulls the remote datastore to the local SSDs; restores on this node use a new VM ID (103).
  • I've tried to Live-Restore the VM on the backup Node and encountered the following error:

Code:
new volume ID is 'localdata2tbssd:vm-103-disk-0'
rescan volumes...
VM is locked (create)
Starting VM for live-restore
An error occured during live-restore: MAX 8 vcpus allowed per VM on this node

error before or during data restore, some or all disks were not completely restored. VM 103 state is NOT cleaned up.
TASK ERROR: live-restore failed

When using the former restore mode (a full, offline restore), I used to change the number of vCPUs before starting the VM:
sed -i 's/cores: 32/cores: '"$(nproc --all)"'/' /etc/pve/qemu-server/103.conf
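
For reference, this is a rough sketch of that full (offline) restore workflow on the backup node; <pbs-storage> and <snapshot> are placeholders for the actual backup volume ID in my setup:

Code:
# offline restore of the synced backup as VM 103
qmrestore <pbs-storage>:backup/vm/101/<snapshot> 103 --storage localdata2tbssd
# cap the vCPU count at this node's thread count before the first start
sed -i 's/^cores: .*/cores: '"$(nproc --all)"'/' /etc/pve/qemu-server/103.conf
qm start 103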

Can I somehow use Live-Restore on my backup Node without decreasing the VM's number of vCPUs on the main Node? Is there perhaps a way to tell the live-restore process to change the number of vCPUs before it actually starts the VM?
 
This is currently not supported. You can open a feature request on our bugtracker to discuss this further if you want, but it certainly seems like a niche feature, so it's not something we'd prioritize implementing at the moment.

If you're feeling super adventurous, you can try applying the following diff to /usr/share/perl5/PVE/QemuServer.pm, but at your own risk and without support:

Code:
--- /usr/share/perl5/PVE/QemuServer.pm    2021-05-04 16:54:16.839427197 +0200
+++ /usr/share/perl5/PVE/QemuServer.pm.test    2021-05-04 16:54:13.011387343 +0200
@@ -6436,6 +6436,8 @@ sub pbs_live_restore {
        run_command(['ha-manager', 'set',  "vm:$vmid", '--state', 'started']);
     }
 
+        $conf->{cores} = 8;
+
     # start VM with backing chain pointing to PBS backup, environment vars for PBS driver
     # in QEMU (PBS_PASSWORD and PBS_FINGERPRINT) are already set by our caller
     vm_start_nolock($storecfg, $vmid, $conf, {paused => 1, 'pbs-backing' => $pbs_backing}, {});
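
The hardcoded 8 above matches the thread count of the backup node in this thread; if you adapt the diff for a different node, substitute that node's thread count, which you can check with, for example:

Code:
# number of CPU threads on the node (the value to hardcode in the patch above)
nproc --all
# or, equivalently, count the entries in /proc/cpuinfo
grep -c ^processor /proc/cpuinfo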
 
Hello,

Indeed, it might be interesting to be able to modify the configuration before the VM is started.

Maybe the simplest way at first would be to edit the qemu-server config blob from the UI when "live restore" is checked. I'm thinking of the network, CPU/RAM and disk settings.

DJP
 
Hello there!
I receive a similar error when trying to back up my datacenter:
ERROR: Backup of VM 101 failed - MAX 2 vcpus allowed per VM on this node

The host itself has 2 x AMD Turion(tm) II Neo N54L Dual-Core Processor (1 Socket)

I get this error even though the VM is currently not running. I assume the same answer applies to my case?
 
No, that appears to be a different error. This thread is about live-restore; you're talking about backup. For backups to be made, we need to temporarily start the VM (in a paused mode, so the guest doesn't boot), but this fails because only 2 CPU cores are detected on your system (which is fewer than the configured 4). I'm not sure why that is; the host setup you posted should result in 4... Have you disabled SMT by any chance? Could you post the output of cat /proc/cpuinfo?

Edit: Sorry, I misread "2 x AMD Turion ... (1 Socket)". Do you mean a single dual-core CPU or two dual-core CPUs? If the former, your VM can only have 2 vCPUs. Reduce the number and the backup should work.
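
For example, something along these lines should do it (assuming VM 101 from your log):

Code:
# check how many threads the node actually exposes
nproc --all
# lower the VM's vCPU count so the temporary paused start for the backup succeeds
qm set 101 --sockets 1 --cores 2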
 
Thank you for your reply.

I reduced the CPU configuration to 1 core, 1 socket. The backup still failed at first, and I also had to add a missing Linux bridge before I could start the VM. Afterwards,
INFO: Backup job finished successfully

Thank you very much - absolutely new to backing up Proxmox ;-)

cat /proc/cpuinfo:
Code:
root@proxmox02:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Turion(tm) II Neo N54L Dual-Core Processor
stepping        : 3
microcode       : 0x10000c8
cpu MHz         : 2200.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs            : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips        : 4392.75
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Turion(tm) II Neo N54L Dual-Core Processor
stepping        : 3
microcode       : 0x10000c8
cpu MHz         : 2200.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs            : tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips        : 4392.75
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
 
