[SOLVED] Certain VMs from a cluster cannot be backed up and managed

Maybe it is worth noting that after we moved our cluster traffic to a dedicated network we had no more such issues.
 
Is there anything else I can do to assist in debugging this issue? We would really like to run backups again. Restarting the VMs every other day is quite annoying.

there'll be another test package available shortly on pvetest (pve-qemu-kvm 4.0.0-7)
 
I tested with pve-qemu-kvm 4.0.0-7, it works very well so far :) Thanks for getting this fixed!
Please test the new version for a while. We didn't have any problems for 1.5 weeks, but then 5 VMs died. That was with the old version though, so we are hoping the new version fixes it :) Thanks a lot! :)
 
yes, changes in the pve-qemu package always only take effect for VMs started after the upgrade.
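
To work out which VMs still need a restart after the upgrade, the rule above boils down to comparing the start time of each VM's QEMU process with the time the package was upgraded. A minimal sketch in Python, assuming you have already collected both values yourself; the function name `vms_needing_restart` and its inputs are hypothetical and not part of any Proxmox tool:

```python
from datetime import datetime


def vms_needing_restart(process_started, upgraded_at):
    """Return the sorted VMIDs whose QEMU process predates the upgrade.

    process_started: dict mapping VMID -> datetime at which the current
                     QEMU process was started (hypothetical input that
                     you collect yourself).
    upgraded_at:     datetime at which pve-qemu-kvm was upgraded.
    """
    return sorted(
        vmid
        for vmid, started in process_started.items()
        if started < upgraded_at
    )


# Example values are made up for illustration only.
upgraded_at = datetime(2019, 10, 16, 9, 0)
process_started = {
    1001: datetime(2019, 10, 1, 12, 0),    # started before the upgrade
    1002: datetime(2019, 10, 16, 10, 30),  # started after the upgrade
}
print(vms_needing_restart(process_started, upgraded_at))
```

Any VMID this returns is still running the old QEMU binary and would not yet have the fix.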
 
Just for clarification, I have a question in the context of starting/restarting a machine:

If I migrate a VM online, a new QEMU process is started on the destination host. Does this count as a restart of the QEMU process, the same way as if I had shut down the machine and then started it again?
 
yes, for this issue. for some changes, we have to force the old behaviour on migration (to stay compatible with the running VM).
 
EDIT:
sorry, i've now seen that i have to restart the VMs.
i will try it (but it's quite a lot of work)

#############


hey,
after suffering from this for a while now, i'm a little bit confused about the solution.
- fabian said: please try pve-qemu-kvm 4.0.0-6 from pvetest.
- vmctec said: Upgraded to pve-manager/6.0-7/28984024. but in my case that would be a downgrade. 6.0-7 is from Sept 3.
which package causes the problem?

in my case there are multiple symptoms. i can't back up the affected VMs:

Code:
INFO: starting new backup job: vzdump 1001 --remove 0 --node p2 --mode snapshot --storage nfs-n3-pvebackup --mailto sd@schnied.net --compress lzo
INFO: Starting Backup of VM 1001 (qemu)
INFO: Backup started at 2019-10-16 07:57:35
INFO: status = running
INFO: update VM 1001: -lock backup
INFO: VM Name: f2.in.of.sd.vc
INFO: include disk 'scsi0' 'ssd_vm:vm-1001-disk-0' 8G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/nfs-n3-pvebackup/dump/vzdump-qemu-1001-2019_10_16-07_57_35.vma.lzo'
ERROR: got timeout
INFO: aborting backup job
ERROR: VM 1001 qmp command 'backup-cancel' failed - got timeout
ERROR: Backup of VM 1001 failed - got timeout
INFO: Failed at 2019-10-16 08:07:40
INFO: Backup job finished with errors
TASK ERROR: job errors

i can't use the console:

Code:
VM 1001 qmp command 'change' failed - got timeout
TASK ERROR: Failed to run vncproxy.

i can't migrate to another host:

Code:
2019-10-16 08:18:07 ERROR: migration aborted (duration 00:00:03): VM 1001 qmp command 'query-machines' failed - got timeout
TASK ERROR: migration aborted

as sebmel wrote : Is there anything else i can do to assist in debugging this issue?

regards
stefan
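
All three failures above share one pattern: a qmp command sent to the VM fails with "got timeout". A small sketch in Python for pulling those entries out of task logs; the helper name `qmp_timeouts` is hypothetical, and the regular expression assumes only the message format shown in the logs above:

```python
import re

# Matches messages such as:
#   VM 1001 qmp command 'change' failed - got timeout
QMP_TIMEOUT = re.compile(r"VM (\d+) qmp command '([^']+)' failed - got timeout")


def qmp_timeouts(log_text):
    """Return (vmid, qmp_command) pairs for every qmp timeout in log_text."""
    return [(int(vmid), cmd) for vmid, cmd in QMP_TIMEOUT.findall(log_text)]


# Sample lines copied from the task logs in this thread.
sample = """\
ERROR: VM 1001 qmp command 'backup-cancel' failed - got timeout
VM 1001 qmp command 'change' failed - got timeout
2019-10-16 08:18:07 ERROR: migration aborted (duration 00:00:03): VM 1001 qmp command 'query-machines' failed - got timeout
"""
for vmid, cmd in qmp_timeouts(sample):
    print(vmid, cmd)
```

Grouping the result by VMID gives a quick list of the VMs whose QEMU process has stopped answering.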
 
the problem is not related to pve-manager at all. please test the newest packages (including pve-qemu-kvm 4.0.0-7), and don't downgrade anything ;)
 
We have been testing pve-qemu-kvm 4.0.0-7 for 2 days. Online migration and nightly backups of different VMs work very well so far!
Thanks for getting this fixed.

(reply to post #59)
 
see post #59

The VMs didn't react; we got timeouts randomly after 1 or 2 days, during our nightly backups or when we tried to use the console.
I know what the issue was, I started this thread...

I tried to ask the Proxmox staff for details about what caused this issue since they said they fixed it.
 