[SOLVED] Windows 10 VM frozen

slapshot

Hello, this is not the first time I have found a VM frozen on Proxmox 7.3-6. The nightly backup cannot complete because of this error:
Code:
ERROR: Backup of VM 110 failed - VM 110 qmp command 'cont' failed - unable to connect to VM 110 qmp socket - timeout after 451 retries

The VM looks fine in the Proxmox GUI, with the usual green triangle, but it does not respond, so my only option is to stop it with:
Code:
qm stop 110
VM quit/powerdown failed - terminating now with SIGTERM
VM still running - terminating now with SIGKILL

The same problem also happened on another Windows VM on the same Proxmox server.

Is there anything I can check to understand why these random freezes happen?

Thank you
 
Hi,
please share the output of pveversion -v and qm config 110 and the full backup task log where the issue occurred.

If you want to further debug the issue, you could use apt install pve-qemu-kvm-dbg gdb to install the relevant debug symbols and debugger. Then, when the VM is hanging, you can attach the debugger with
Code:
gdb --ex 'set pagination off' --ex 'handle SIGUSR1 noprint nostop' --ex 'handle SIGPIPE noprint nostop' -p $(cat /var/run/qemu-server/<ID>.pid)
replacing <ID> with the ID of the VM.

After that, please enter t a a bt in the debugger's prompt and post the output here.
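
As a rough sketch, the same steps could be run non-interactively so the output can be redirected to a file and posted. This assumes gdb and the pve-qemu-kvm-dbg package are already installed and that it is run as root; `vm_backtrace` and the `PIDDIR` override are hypothetical names used only for illustration, and `thread apply all bt` is the long form of `t a a bt`.

```shell
#!/bin/sh
# Sketch: dump backtraces of all QEMU threads of one VM without an
# interactive gdb prompt. vm_backtrace and PIDDIR are illustrative names,
# not part of Proxmox VE.
vm_backtrace() {
    vmid="$1"
    # Default pid file location is the one used in the gdb command above.
    pidfile="${PIDDIR:-/var/run/qemu-server}/${vmid}.pid"
    if [ ! -r "$pidfile" ]; then
        echo "no readable pid file for VM ${vmid}: ${pidfile}" >&2
        return 1
    fi
    # --batch makes gdb run the given commands and then exit (detaching).
    gdb --batch \
        --ex 'set pagination off' \
        --ex 'handle SIGUSR1 noprint nostop' \
        --ex 'handle SIGPIPE noprint nostop' \
        --ex 'thread apply all bt' \
        -p "$(cat "$pidfile")"
}
```

For example, while VM 110 is hanging: `vm_backtrace 110 > vm110-backtrace.txt`.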
 
Hi, thank you for your reply. I will install the gdb debugger and, at the next freeze, investigate with your command.

Thank you

Here is pveversion -v:

Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-4
pve-kernel-5.15: 7.3-2
pve-kernel-5.4: 6.4-20
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.4.203-1-pve: 5.4.203-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-2
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-1
lxcfs: 5.0.3-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-2
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

Here is qm config 110:

Code:
agent: 1
boot: order=ide0;ide2;net0
cores: 2
ide0: vmraid0:vm-110-disk-0,cache=writeback,discard=on,size=400G
ide2: local:iso/virtio-win-0.1.229.iso,media=cdrom,size=522284K
memory: 12288
name: WincodyW10Pro
net0: virtio=96:BB:8F:73:72:15,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=a03ca981-5209-4ab2-97e8-20b942228611
sockets: 2
vmgenid: a984e4ec-32b5-4d53-a249-aeb9516e3689

Here is backup output:
Code:
INFO: starting new backup job: vzdump 110 130 --storage pbs2-StorageMS01 --quiet 1 --mode snapshot --mailnotification always --mailto info@dynamicservice.it
INFO: Starting Backup of VM 110 (qemu)
INFO: Backup started at 2023-02-25 22:30:04
INFO: status = running
INFO: VM Name: WincodyW10Pro
VM 110 qmp command 'query-status' failed - unable to connect to VM 110 qmp socket - timeout after 51 retries

INFO: include disk 'ide0' 'vmraid0:vm-110-disk-0' 400G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/110/2023-02-25T21:30:04Z'
ERROR: QMP command query-proxmox-support failed - VM 110 qmp command 'query-proxmox-support' failed - unable to connect to VM 110 qmp socket - timeout after 51 retries
INFO: aborting backup job
ERROR: VM 110 qmp command 'backup-cancel' failed - unable to connect to VM 110 qmp socket - timeout after 5991 retries
INFO: resuming VM again
ERROR: Backup of VM 110 failed - VM 110 qmp command 'cont' failed - unable to connect to VM 110 qmp socket - timeout after 451 retries
INFO: Failed at 2023-02-25 22:40:59
 
After making sure you have a working backup, and assuming you already installed the VirtIO drivers, I'd suggest trying to switch the disk to SCSI and turn on the iothread setting. This makes sure that the IO is handled in a dedicated thread and can help the QEMU main thread, which handles qmp commands, run more smoothly.

Hmm, there seem to be issues even during preparation of the backup. What kind of hardware are you running on? What does the load on your system look like? Do you have enough free memory/CPU?
 
Ok I will try to do this asap and let you know.

I think there is a problem during the backup preparation because the Windows event log only has entries until 22:26, a few minutes before the backup started at 22:30. Events only appear again after my hard restart. I have a Dell R330 server with 64 GB of RAM; this VM has 12 GB, with 2 sockets and 2 cores. The load is not heavy, and free memory seems fine, with LVM-thin.

Thank you
Antonio
 
OK, I was able to convert the disk to SCSI using a workaround I found here on this forum. With the controller left as VirtIO SCSI, I got this warning at VM start:

Code:
WARN: iothread is only valid with virtio disk or virtio-scsi-single controller, ignoring
TASK WARNINGS: 1

Changing it to VirtIO SCSI single seems to work, and the warning no longer appears. Is it safe to use this option? Also, is there somewhere I can read about the differences among these options? I assume there is a detailed wiki page. Thank you.

Antonio
 
Yes, it is safe. With the VirtIO SCSI single setting, QEMU will create a dedicated controller for each disk. This is required for iothread. See here for more information: https://pve.proxmox.com/pve-docs/chapter-qm.html#qm_hard_disk or for the latest version (the information was a bit outdated and has been updated just last week ;)): https://git.proxmox.com/?p=pve-docs...e3d91783eb0ba66ca3fc73175f6bb9ad47fc2dc3#l149
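
For readers who want to reproduce the change, here is a minimal dry-run sketch that only prints the qm commands for moving the disk from IDE to SCSI with iothread on a VirtIO SCSI single controller. It assumes the VM ID, volume name and disk options from the qm config output earlier in this thread, that a working backup exists, and that Windows can already boot from the VirtIO SCSI driver. The workaround the poster used for that is not described in the thread, so this is not necessarily what was done. `print_scsi_iothread_plan` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print (do not run) the qm commands for switching a VM disk from
# ide0 to scsi0 with iothread=1. Review each line before running it by hand.
print_scsi_iothread_plan() {
    vmid="$1"      # VM ID, e.g. 110
    volume="$2"    # existing volume, e.g. vmraid0:vm-110-disk-0
    cat <<EOF
qm shutdown ${vmid}
qm set ${vmid} --scsihw virtio-scsi-single
qm set ${vmid} --delete ide0
qm set ${vmid} --scsi0 ${volume},cache=writeback,discard=on,iothread=1
qm set ${vmid} --boot 'order=scsi0;ide2;net0'
qm start ${vmid}
EOF
}

# Values assumed from the "qm config 110" output in this thread.
print_scsi_iothread_plan 110 vmraid0:vm-110-disk-0
```

Without the force option, deleting ide0 only detaches the volume (it shows up as an unused disk), so the scsi0 line re-attaches the same volume rather than creating a new one.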
 
