VM stopped responding

jerim

Guest
I have been running a Proxmox cloud for a few months with no issues. Today, the VM on one of my nodes stopped responding. It was still running, but I couldn't RDP into it. Checking /var/log/syslog, I found this:

Code:
Nov 12 09:37:21 KNTCLCN004 rgmanager[301718]: [pvevm] VM 108 is running
Nov 12 09:37:41 KNTCLCN004 rgmanager[301760]: [pvevm] VM 108 is running
Nov 12 09:37:51 KNTCLCN004 rgmanager[301786]: [pvevm] VM 108 is running
Nov 12 09:38:21 KNTCLCN004 rgmanager[301830]: [pvevm] VM 108 is running
Nov 12 09:38:41 KNTCLCN004 rgmanager[301872]: [pvevm] VM 108 is running
Nov 12 09:38:51 KNTCLCN004 rgmanager[301898]: [pvevm] VM 108 is running
Nov 12 09:38:53 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
Nov 12 09:39:19 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
Nov 12 09:39:19 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
Nov 12 09:39:21 KNTCLCN004 rgmanager[301944]: [pvevm] VM 108 is running
Nov 12 09:39:29 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
Nov 12 09:39:41 KNTCLCN004 pvedaemon[4794]: <root@pam> starting task UPID:KNTCLCN004:00049B9E:00FEE7BA:50A1183D:hastop:108:root@pam:
Nov 12 09:39:41 KNTCLCN004 rgmanager[1793]: Stopping service pvevm:108
Nov 12 09:39:41 KNTCLCN004 pvevm: <root@pam> starting task UPID:KNTCLCN004:00049BA3:00FEE7E7:50A1183D:qmshutdown:108:root@pam:
Nov 12 09:39:41 KNTCLCN004 task UPID:KNTCLCN004:00049BA3:00FEE7E7:50A1183D:qmshutdown:108:root@pam:: shutdown VM 108: UPID:KNTCLCN004:00049BA3:00FEE7E7:50A1183D:qmshutdown:108:root@pam:
Nov 12 09:39:42 KNTCLCN004 rgmanager[301988]: [pvevm] Task still active, waiting
Nov 12 09:39:43 KNTCLCN004 rgmanager[302008]: [pvevm] Task still active, waiting
Nov 12 09:39:44 KNTCLCN004 rgmanager[302031]: [pvevm] Task still active, waiting
Nov 12 09:39:46 KNTCLCN004 rgmanager[302051]: [pvevm] Task still active, waiting
Nov 12 09:39:47 KNTCLCN004 rgmanager[302071]: [pvevm] Task still active, waiting
Nov 12 09:39:48 KNTCLCN004 rgmanager[302092]: [pvevm] Task still active, waiting
Nov 12 09:39:49 KNTCLCN004 rgmanager[302112]: [pvevm] Task still active, waiting
Nov 12 09:39:50 KNTCLCN004 rgmanager[302132]: [pvevm] Task still active, waiting
Nov 12 09:39:51 KNTCLCN004 rgmanager[302153]: [pvevm] Task still active, waiting
Nov 12 09:39:52 KNTCLCN004 rgmanager[302173]: [pvevm] Task still active, waiting
Nov 12 09:39:53 KNTCLCN004 rgmanager[302193]: [pvevm] Task still active, waiting
Nov 12 09:39:54 KNTCLCN004 rgmanager[302216]: [pvevm] Task still active, waiting
Nov 12 09:39:55 KNTCLCN004 rgmanager[302236]: [pvevm] Task still active, waiting
Nov 12 09:39:55 KNTCLCN004 kernel: vmbr0: port 2(tap108i0) entering disabled state
Nov 12 09:39:55 KNTCLCN004 kernel: vmbr0: port 2(tap108i0) entering disabled state
Nov 12 09:39:56 KNTCLCN004 pvevm: <root@pam> end task UPID:KNTCLCN004:00049BA3:00FEE7E7:50A1183D:qmshutdown:108:root@pam: OK
Nov 12 09:39:56 KNTCLCN004 rgmanager[302263]: [pvevm] Task still active, waiting
Nov 12 09:39:57 KNTCLCN004 rgmanager[1793]: Service pvevm:108 is disabled
Nov 12 09:39:57 KNTCLCN004 pvedaemon[4794]: <root@pam> end task UPID:KNTCLCN004:00049B9E:00FEE7BA:50A1183D:hastop:108:root@pam: OK
Nov 12 09:40:01 KNTCLCN004 pvedaemon[5084]: <root@pam> starting task UPID:KNTCLCN004:00049CCE:00FEEFDD:50A11851:hastart:108:root@pam:
Nov 12 09:40:03 KNTCLCN004 rgmanager[1793]: Starting disabled service pvevm:108
Nov 12 09:40:03 KNTCLCN004 pvevm: <root@pam> starting task UPID:KNTCLCN004:00049CD2:00FEF086:50A11853:qmstart:108:root@pam:
Nov 12 09:40:03 KNTCLCN004 task UPID:KNTCLCN004:00049CD2:00FEF086:50A11853:qmstart:108:root@pam:: start VM 108: UPID:KNTCLCN004:00049CD2:00FEF086:50A11853:qmstart:108:root@pam:
Nov 12 09:40:03 KNTCLCN004 kernel: device tap108i0 entered promiscuous mode
Nov 12 09:40:03 KNTCLCN004 kernel: vmbr0: port 2(tap108i0) entering forwarding state
Nov 12 09:40:04 KNTCLCN004 pvevm: <root@pam> end task UPID:KNTCLCN004:00049CD2:00FEF086:50A11853:qmstart:108:root@pam: OK
Nov 12 09:40:04 KNTCLCN004 rgmanager[1793]: Service pvevm:108 started
Nov 12 09:40:04 KNTCLCN004 pvedaemon[5084]: <root@pam> end task UPID:KNTCLCN004:00049CCE:00FEEFDD:50A11851:hastart:108:root@pam: OK
Nov 12 09:40:14 KNTCLCN004 kernel: tap108i0: no IPv6 routers present
Nov 12 09:40:41 KNTCLCN004 ntpd[1350]: Listen normally on 9 tap108i0 fe80::c487:57ff:fea9:2852 UDP 123
Nov 12 09:40:41 KNTCLCN004 ntpd[1350]: Deleting interface #8 tap108i0, fe80::1450:36ff:fe37:6061#123, interface stats: received=0, sent=0, dropped=0, active_time=165600 secs
Nov 12 09:40:42 KNTCLCN004 rgmanager[302377]: [pvevm] VM 108 is running
Nov 12 09:41:01 KNTCLCN004 pvedaemon[4794]: <root@pam> starting task UPID:KNTCLCN004:00049D46:00FF06FC:50A1188D:hamigrate:108:root@pam:
Nov 12 09:41:01 KNTCLCN004 rgmanager[1793]: Migrating pvevm:108 to KNTCLCN001
Nov 12 09:41:01 KNTCLCN004 pvevm: <root@pam> starting task UPID:KNTCLCN004:00049D4B:00FF072F:50A1188D:qmigrate:108:root@pam:
Nov 12 09:41:02 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
Nov 12 09:41:02 KNTCLCN004 rgmanager[302415]: [pvevm] Task still active, waiting
Nov 12 09:41:03 KNTCLCN004 pmxcfs[1457]: [status] notice: received log
It seems the problem started when a mystery task started. Any ideas what happened?
 
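You stopped the VM, then started it again? The log shows root@pam issuing hastop, hastart, and then hamigrate tasks for VM 108. You are using a cluster setup, and the VM is managed by HA. The VM now runs on node KNTCLCN001.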
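
If it helps to separate the user-initiated tasks from the rgmanager status noise, here is a minimal Python sketch that pulls the "starting task" / "end task" lines out of a syslog excerpt like the one above. The regular expression is based only on the UPID strings visible in this log; the function name task_events is made up for illustration, and reading the fifth UPID field as a hexadecimal Unix start time is an assumption, not something the log itself confirms.

```python
import re

# Pattern derived from the UPID strings in the posted log, e.g.
# UPID:KNTCLCN004:00049B9E:00FEE7BA:50A1183D:hastop:108:root@pam:
UPID_RE = re.compile(
    r"(?P<action>starting|end) task "
    r"UPID:(?P<node>[^:]+):(?P<pid>[0-9A-Fa-f]+):(?P<pstart>[0-9A-Fa-f]+):"
    r"(?P<start>[0-9A-Fa-f]+):(?P<type>[^:]+):(?P<vmid>[^:]*):(?P<user>[^:]+):"
)


def task_events(lines):
    """Return one dict per 'starting task' / 'end task' syslog line."""
    events = []
    for line in lines:
        match = UPID_RE.search(line)
        if match is None:
            continue  # status messages, kernel lines, etc.
        events.append({
            "logged": line[:15],              # syslog timestamp, e.g. "Nov 12 09:39:41"
            "action": match.group("action"),  # "starting" or "end"
            "type": match.group("type"),      # e.g. hastop, qmshutdown, hamigrate
            "vmid": match.group("vmid"),
            "user": match.group("user"),
            # Assumption: this field is the task start time as hex Unix epoch.
            "start_epoch": int(match.group("start"), 16),
        })
    return events


if __name__ == "__main__":
    with open("/var/log/syslog") as handle:
        for event in task_events(handle):
            print(event["logged"], event["action"], event["type"],
                  event["vmid"], event["user"])
```

Run against the excerpt above, this should list only the hastop, qmshutdown, hastart, qmstart, hamigrate and qmigrate entries, each attributed to root@pam, which makes it easier to see that the "mystery task" was started by a logged-in user rather than by the cluster itself.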

We did migrate the VM after it became non-responsive. We are still trying to track down why it became non-responsive. We have found some spyware installed on it and are investigating from there.
 
