[SOLVED] Some VMs on a node inaccessible for 20 minutes. "pvestatd: status update time" in logs

Kaya

I have a Proxmox cluster of 3 nodes: 2 are computers inside an Intel Modular Server and 1 is an HP DL 390 server.

Suddenly some VMs on node 3 (the HP) became inaccessible.
After 20 minutes everything worked again.
From the web GUI on node 1 I could see node 3, but its detailed status was unavailable (for example, the VM RAM size shown was wrong).
I could log in to node 3 and everything seemed OK, but I still couldn't reach the VMs (even though ping worked fine).
So I dug into the log and found:
Code:
Jan 05 11:17:01 proxmox4 CRON[32622]: pam_unix(cron:session): session closed for user root
Jan 05 11:20:27 proxmox4 pvestatd[1635]: status update time (5.769 seconds)
Jan 05 11:20:46 proxmox4 pvestatd[1635]: status update time (5.557 seconds)
Jan 05 11:20:56 proxmox4 pvestatd[1635]: status update time (5.236 seconds)
Jan 05 11:21:07 proxmox4 pvestatd[1635]: status update time (5.439 seconds)
Jan 05 11:21:16 proxmox4 pvestatd[1635]: status update time (5.723 seconds)
Jan 05 11:21:37 proxmox4 pvestatd[1635]: status update time (5.248 seconds)
Jan 05 11:23:16 proxmox4 pvestatd[1635]: status update time (5.101 seconds)
Jan 05 11:23:36 proxmox4 pvestatd[1635]: status update time (5.073 seconds)
Jan 05 11:23:46 proxmox4 pvestatd[1635]: status update time (5.061 seconds)
Jan 05 11:24:36 proxmox4 pvestatd[1635]: status update time (5.654 seconds)
Jan 05 11:24:56 proxmox4 pvestatd[1635]: status update time (5.150 seconds)
Jan 05 11:25:06 proxmox4 pvestatd[1635]: status update time (5.058 seconds)
Jan 05 11:25:17 proxmox4 pvestatd[1635]: status update time (5.067 seconds)
Jan 05 11:25:26 proxmox4 pvestatd[1635]: status update time (5.833 seconds)
Jan 05 11:25:37 proxmox4 pvestatd[1635]: status update time (5.754 seconds)
Jan 05 11:25:56 proxmox4 pvestatd[1635]: status update time (5.046 seconds)
Jan 05 11:27:19 proxmox4 pmxcfs[1600]: [status] notice: received log
Jan 05 11:27:47 proxmox4 pvestatd[1635]: status update time (5.263 seconds)
Jan 05 11:28:26 proxmox4 pvestatd[1635]: status update time (5.224 seconds)
Jan 05 11:28:52 proxmox4 sshd[1434]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.10.99 user=root
Jan 05 11:28:55 proxmox4 sshd[1434]: Failed password for root from 192.168.10.99 port 61156 ssh2
Jan 05 11:28:57 proxmox4 sshd[1434]: Accepted password for root from 192.168.10.99 port 61156 ssh2
Jan 05 11:28:57 proxmox4 sshd[1434]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jan 05 11:28:57 proxmox4 pvestatd[1635]: status update time (6.105 seconds)
Jan 05 11:29:16 proxmox4 pvestatd[1635]: status update time (5.026 seconds)
Jan 05 11:29:26 proxmox4 pvestatd[1635]: status update time (5.255 seconds)
Jan 05 11:30:36 proxmox4 pveproxy[1873]: proxy detected vanished client connection
Jan 05 11:33:16 proxmox4 pvestatd[1635]: status update time (5.204 seconds)
Jan 05 11:33:37 proxmox4 pvestatd[1635]: status update time (5.608 seconds)
Jan 05 11:34:16 proxmox4 pvestatd[1635]: status update time (5.056 seconds)
Jan 05 11:34:37 proxmox4 pvestatd[1635]: status update time (5.540 seconds)
Jan 05 11:36:47 proxmox4 pvestatd[1635]: status update time (5.356 seconds)
Jan 05 11:37:06 proxmox4 pvestatd[1635]: status update time (5.037 seconds)
Jan 05 11:37:16 proxmox4 pvestatd[1635]: status update time (5.336 seconds)
Jan 05 11:37:54 proxmox4 pvedaemon[30383]: <root@pam> successful auth for user 'root@pam'
Jan 05 11:37:56 proxmox4 pvestatd[1635]: status update time (5.026 seconds)
Jan 05 11:38:26 proxmox4 pvestatd[1635]: status update time (5.246 seconds)
Jan 05 11:38:46 proxmox4 pvestatd[1635]: status update time (5.082 seconds)
Jan 05 11:39:56 proxmox4 pvestatd[1635]: status update time (5.120 seconds)
Jan 05 11:40:16 proxmox4 pvestatd[1635]: status update time (5.039 seconds)
Jan 05 11:40:27 proxmox4 pvestatd[1635]: status update time (6.256 seconds)
Jan 05 11:41:06 proxmox4 pvestatd[1635]: status update time (5.447 seconds)
Jan 05 11:41:17 proxmox4 pvestatd[1635]: status update time (5.580 seconds)
Jan 05 11:42:16 proxmox4 pmxcfs[1600]: [status] notice: received log
Jan 05 11:45:37 proxmox4 rrdcached[1472]: flushing old values
Jan 05 11:45:37 proxmox4 rrdcached[1472]: rotating journals
Jan 05 11:45:37 proxmox4 rrdcached[1472]: started new journal /var/lib/rrdcached/journal/rrd.journal.1515149137.113193
Jan 05 11:45:37 proxmox4 rrdcached[1472]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1515141937.113250

After 11:41 (the last "status update time" entry) everything worked fine again.
I suspect a disk error, but I can't find what or where.
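To pin down the exact window of the incident, the "status update time" entries can be pulled out of the log and summarized. The following is only a minimal sketch in Python: the function names `slow_updates` and `summarize` are made up for illustration, and the year is an assumption because syslog-style lines do not carry one (2018 is inferred from the epoch in the rrdcached journal file name above).

```python
import re
from datetime import datetime

# Matches lines such as:
# Jan 05 11:20:27 proxmox4 pvestatd[1635]: status update time (5.769 seconds)
PATTERN = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ pvestatd\[\d+\]: "
    r"status update time \((?P<secs>[\d.]+) seconds\)"
)


def slow_updates(lines, year=2018):
    """Return a list of (timestamp, seconds) for each pvestatd slow-update line.

    The year is an assumption: the log lines themselves do not include it.
    """
    found = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            stamp = datetime.strptime(
                f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
            )
            found.append((stamp, float(match["secs"])))
    return found


def summarize(updates):
    """Summarize the incident window, or return None if nothing matched."""
    if not updates:
        return None
    times = [stamp for stamp, _ in updates]
    seconds = [secs for _, secs in updates]
    return {
        "first": min(times),
        "last": max(times),
        "count": len(updates),
        "max_seconds": max(seconds),
    }
```

Calling `summarize(slow_updates(open("/var/log/syslog")))` on the node (the path is an assumption, adjust it to wherever the log lives) gives the first and last slow update, which can then be compared with the time of the event ID 129 entries inside the guests and with backup or other I/O-heavy jobs on the storage.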

For example, the log of one of the VMs (Windows 2012) says:
VIOSTOR: Reset to device, \Device\RaidPort0, was issued.
Event ID 129

Where can I look to find out what went wrong?
dmesg shows nothing about it.

The PVE version is: pve-manager/4.3-1/e7cdc165 (running kernel: 4.4.19-1-pve)

Any hints?
 
If you are using VirtIO drivers for the disks, try changing to SCSI; just remember to install the vioscsi driver in Windows first.
 
Why change to SCSI? On the other VMs everything works without problems.
And the problem affected 2 Linux machines and 1 Windows machine. The other VMs on the same node had no problems.
 
