Debug a freezing VM

Valutin

Sep 28, 2020
Hello,
I recently migrated a physical Windows Server 2008 R2 machine onto a Proxmox server. It seems to run fine, except that it freezes from time to time. Network connections time out and so on, but the VM indicates that it did not encounter any issue. After a few minutes everything goes back to normal as if nothing happened, while on the client side we need to restart all transactions. Are there any tools I can use to understand why this is happening?
When it happens, the shell is not available. I can see the CPU and RAM usage percentages staying the same. On the network activity graph I can sometimes see a flat line for the period it froze.
If you know a way to find out why it's doing that, I am all ears.
Thanks.
 
but the VM indicates that it did not encounter any issue
As in, the logs from within the VM say nothing?
When it happens, the shell is not available.
Shell as in SSH to guest?
we need to restart all transactions
Transactions that go outside the VM via network?

Could you please post
Code:
qm config <vmid>
as a start?
 
Thanks Dominic for the feedback, and really sorry for my late reply; I had to move the server room.

Here is the qm config:
Code:
root@pvhost:~# qm config 206
boot: c
bootdisk: ide0
cores: 4
ide0: VMs:206/NSERVER3_C_Drive.qcow2,cache=writethrough,size=466G
ide1: Migrate:206/vm-206-disk-0.qcow2,cache=writethrough,size=8000G
ide2: Migrate:206/NSERVER3_L_SVN.qcow2,cache=writethrough,size=932G
ide3: VMs:206/NSERVER3_D_DATABASE.qcow2,cache=writethrough,size=466G
memory: 8096
name: Server3-168
net0: e1000=AA:FD:01:D2:00:2D,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win7
scsihw: virtio-scsi-pci
smbios1: uuid=ae83effd-9bee-4cf1-a0d2-a71702b91c19
sockets: 2
vga: qxl
vmgenid: 1da7b02d-70f3-4ede-98a2-ff8f4f27080a

I tried to stick as closely as possible to the original machine this was running on.
In Windows, I log in as admin and the system does not report that it was down. Even if I had a session open, I can remote desktop back in and there is no error.
By shell I meant the console, sorry for the confusion. No console access.
By transactions I meant file copies, database transactions, etc. Everything stops. But if we wait a minute or two, everything resumes. Things that timed out have to be restarted, and when I remote into the VM, I resume my session if one was open, or open a new one, without the unexpected shutdown error.

I looked at the syslog but did not see anything special around the times somebody complained about an issue.
 
No console access.
The noVNC console in the PVE web GUI?

Is this the only VM where something like that happens? Are there other VMs with similar configurations (RAM, disks...)? Your VM 206 consists of multiple processes on the PVE host. They are all
Code:
/usr/bin/kvm -id 206 -name Server3-168 ...
What you could try to take a look at is if the whole process sort of freezes or if this is a problem within the VM. This could happen if your storage is not sufficiently fast, for example.
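
One way to tell the two cases apart, as a minimal sketch: while the guest appears frozen, look at the scheduler state of the VM's threads on the PVE host. Threads that stay in state D (uninterruptible sleep) are typically waiting on I/O, which would point at the storage rather than at the guest. The `thread_states` helper is a hypothetical name, and the PID file location is an assumption about where Proxmox VE keeps the PID of a running VM; verify it on your host.

```shell
#!/bin/sh
# Print the scheduler state letter (R, S, D, ...) of every thread of a
# process, one per line, by reading /proc/<pid>/task/*/status (Linux only).
thread_states() {
    pid="$1"
    for status in /proc/"$pid"/task/*/status; do
        # Skip threads that vanished or an unmatched glob.
        [ -r "$status" ] || continue
        awk '/^State:/ {print $2}' "$status"
    done
}

# Assumption: the PID of VM <vmid> is stored in this file on the PVE host.
VMID="${VMID:-206}"
PIDFILE="/var/run/qemu-server/${VMID}.pid"

if [ -r "$PIDFILE" ]; then
    # Count threads per state, e.g. how many are in S and how many in D.
    thread_states "$(cat "$PIDFILE")" | sort | uniq -c
fi
```

Running this a few times during a freeze and comparing it with a run during normal operation shows whether the threads pile up in D only while the guest is unresponsive.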

Additionally, check out our Windows 7 best practices. Maybe using virtio instead of IDE helps.
 
The noVNC console in the PVE web GUI?
Yes, no console access.
Is this the only VM where something like that happens? Are there other VMs with similar configurations (RAM, disks...)? Your VM 206 consists of multiple processes on the PVE host. They are all
This was my original config:
>>A first physical Win 2k8 R2 server with Hyper-V, in use for roughly 10 years:
-pfsense
-win 2k8 r2 VM with FTP server
-win 2k8 r2 VM with domain controller 1
-win 2k8 r2 VM with domain controller 2 (yeah, I know, both DCs on the same box; it was not my doing)
-win 2k8 r2 VM with ERP (front end)

>>A second physical Win 2k8 R2 server without VMs, doing all the file sharing (network shares + SVN of CAD/production files) as well as hosting the database for the ERP.

I moved part of these onto a newly built server:
R9 3900X
64G ECC
2xSSD in a rpool 1TB
6xHDD in RAIDZ2
I also added a 512GB NVMe drive that was planned as cache, but I am not sure I actually need it.

The SSDs are mirrored and host Proxmox and the C drives of all the Windows VMs.
In Proxmox, I have from the first physical server:
-pfsense
-The FTP server VM that was promoted to DC
-the ERP VM
>>I also migrated the second physical server.

As it was a physical-to-virtual migration, I left it as IDE. When I started migrating my VMs over, I tried VirtIO, but I could not get past some step or it gave me a black screen, so I left it as IDE. All the Windows VMs that were migrated, including this one, are using IDE. VMs I created in Proxmox use VirtIO.

On this particular VM, the 8000G drive is not from an image, because I could not get the image creation from the original physical drive to complete. Instead I defined the drive in Proxmox and copied the content over the network. I could not add it as VirtIO; the VM would boot to a black screen. In the end I reverted to IDE and it worked.

Code:
/usr/bin/kvm -id 206 -name Server3-168 ...
What you could try to take a look at is if the whole process sort of freezes or if this is a problem within the VM. This could happen if your storage is not sufficiently fast, for example.

Additionally, check out our Windows 7 best practices. Maybe using virtio instead of IDE helps.
Maybe the easiest way to start is to run a simple ping over 24h, log it, and see whether it just times out, stops, or skips time. That will tell me whether it's a freeze or something else. On the activity graph, when it does freeze, I can see a flat line, and then the network activity spikes.
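
A minimal sketch of such a ping log, assuming a Linux machine with iputils `ping` doing the probing. `TARGET` and `LOGFILE` are hypothetical placeholders (the address below is a documentation address, not the VM's), and `probe` and `summarise` are names made up for this example. Each probe writes one timestamped line, and the summary separates plain timeouts from jumps in the timestamps, which would mean the logging machine itself paused.

```shell
#!/bin/sh
# Placeholders: set TARGET to the VM's address and LOGFILE to taste.
TARGET="${TARGET:-192.0.2.10}"
LOGFILE="${LOGFILE:-/var/tmp/ping-vm.log}"

# Send one echo request and print "<epoch> ok" or "<epoch> timeout".
# -c 1: single request; -W 1: wait at most one second for the reply.
probe() {
    if ping -c 1 -W 1 "$1" >/dev/null 2>&1; then
        echo "$(date +%s) ok"
    else
        echo "$(date +%s) timeout"
    fi
}

# Read log lines on stdin. Report every jump of more than $1 seconds
# between consecutive entries, then the number of timed-out probes.
summarise() {
    awk -v max="$1" '
        $2 == "timeout" { lost++ }
        NR > 1 && $1 - prev > max { printf "gap %d s before %d\n", $1 - prev, $1 }
        { prev = $1 }
        END { printf "timeouts %d of %d\n", lost, NR }
    '
}

# "run" probes about once per second until 24 hours have passed.
if [ "${1:-}" = "run" ]; then
    end=$(( $(date +%s) + 86400 ))
    while [ "$(date +%s)" -lt "$end" ]; do
        probe "$TARGET" >> "$LOGFILE"
        sleep 1
    done
fi
```

Saved under a name of your choice and started with the argument `run`, it logs for a day. Afterwards, feeding the log to `summarise 5` on stdin lists the gaps longer than five seconds and the total number of timeouts, which can then be compared with the times users reported a freeze.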


Lastly, migrating IDE -> VirtIO.
I may retry that: copy the C drive image to the hard drive storage, start it from there, attempt the VirtIO migration and, if it goes well, replace the production C drive with the newly made one. One of the main issues I had was that the first boot took a very long time. I did it in steps: add the C drive image, boot, then the rest. For the C drive, I spent countless hours booting into a hanging Windows startup screen, but on the last day I decided to leave the VM on and went to sleep. To my surprise, the next morning the screen displayed a Windows login prompt. Adding the smaller disk image was then a breeze but, as I said earlier, adding the bigger one was more troublesome: I added the 8TB image in VirtIO mode and booted into a black screen; I added it as IDE and it worked. I might retry adding it as VirtIO and see what happens.

Thanks for the feedback
 
