I have a Proxmox cluster of 3 nodes: 2 are computers inside an Intel Modular Server and 1 is an HP DL 390 server.
Suddenly some VMs on node 3 (HP) became inaccessible.
After 20 minutes everything worked again.
From the web GUI on node 1 I could see node 3, but its detailed status was unavailable (for example, the VM RAM size was wrong).
I could log in to node 3 successfully and everything seemed OK, but I still couldn't reach the VMs (even though ping worked fine).
So I dug into the log and found:
Code:
Jan 05 11:17:01 proxmox4 CRON[32622]: pam_unix(cron:session): session closed for user root
Jan 05 11:20:27 proxmox4 pvestatd[1635]: status update time (5.769 seconds)
Jan 05 11:20:46 proxmox4 pvestatd[1635]: status update time (5.557 seconds)
Jan 05 11:20:56 proxmox4 pvestatd[1635]: status update time (5.236 seconds)
Jan 05 11:21:07 proxmox4 pvestatd[1635]: status update time (5.439 seconds)
Jan 05 11:21:16 proxmox4 pvestatd[1635]: status update time (5.723 seconds)
Jan 05 11:21:37 proxmox4 pvestatd[1635]: status update time (5.248 seconds)
Jan 05 11:23:16 proxmox4 pvestatd[1635]: status update time (5.101 seconds)
Jan 05 11:23:36 proxmox4 pvestatd[1635]: status update time (5.073 seconds)
Jan 05 11:23:46 proxmox4 pvestatd[1635]: status update time (5.061 seconds)
Jan 05 11:24:36 proxmox4 pvestatd[1635]: status update time (5.654 seconds)
Jan 05 11:24:56 proxmox4 pvestatd[1635]: status update time (5.150 seconds)
Jan 05 11:25:06 proxmox4 pvestatd[1635]: status update time (5.058 seconds)
Jan 05 11:25:17 proxmox4 pvestatd[1635]: status update time (5.067 seconds)
Jan 05 11:25:26 proxmox4 pvestatd[1635]: status update time (5.833 seconds)
Jan 05 11:25:37 proxmox4 pvestatd[1635]: status update time (5.754 seconds)
Jan 05 11:25:56 proxmox4 pvestatd[1635]: status update time (5.046 seconds)
Jan 05 11:27:19 proxmox4 pmxcfs[1600]: [status] notice: received log
Jan 05 11:27:47 proxmox4 pvestatd[1635]: status update time (5.263 seconds)
Jan 05 11:28:26 proxmox4 pvestatd[1635]: status update time (5.224 seconds)
Jan 05 11:28:52 proxmox4 sshd[1434]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.10.99 user=root
Jan 05 11:28:55 proxmox4 sshd[1434]: Failed password for root from 192.168.10.99 port 61156 ssh2
Jan 05 11:28:57 proxmox4 sshd[1434]: Accepted password for root from 192.168.10.99 port 61156 ssh2
Jan 05 11:28:57 proxmox4 sshd[1434]: pam_unix(sshd:session): session opened for user root by (uid=0)
Jan 05 11:28:57 proxmox4 pvestatd[1635]: status update time (6.105 seconds)
Jan 05 11:29:16 proxmox4 pvestatd[1635]: status update time (5.026 seconds)
Jan 05 11:29:26 proxmox4 pvestatd[1635]: status update time (5.255 seconds)
Jan 05 11:30:36 proxmox4 pveproxy[1873]: proxy detected vanished client connection
Jan 05 11:33:16 proxmox4 pvestatd[1635]: status update time (5.204 seconds)
Jan 05 11:33:37 proxmox4 pvestatd[1635]: status update time (5.608 seconds)
Jan 05 11:34:16 proxmox4 pvestatd[1635]: status update time (5.056 seconds)
Jan 05 11:34:37 proxmox4 pvestatd[1635]: status update time (5.540 seconds)
Jan 05 11:36:47 proxmox4 pvestatd[1635]: status update time (5.356 seconds)
Jan 05 11:37:06 proxmox4 pvestatd[1635]: status update time (5.037 seconds)
Jan 05 11:37:16 proxmox4 pvestatd[1635]: status update time (5.336 seconds)
Jan 05 11:37:54 proxmox4 pvedaemon[30383]: <root@pam> successful auth for user 'root@pam'
Jan 05 11:37:56 proxmox4 pvestatd[1635]: status update time (5.026 seconds)
Jan 05 11:38:26 proxmox4 pvestatd[1635]: status update time (5.246 seconds)
Jan 05 11:38:46 proxmox4 pvestatd[1635]: status update time (5.082 seconds)
Jan 05 11:39:56 proxmox4 pvestatd[1635]: status update time (5.120 seconds)
Jan 05 11:40:16 proxmox4 pvestatd[1635]: status update time (5.039 seconds)
Jan 05 11:40:27 proxmox4 pvestatd[1635]: status update time (6.256 seconds)
Jan 05 11:41:06 proxmox4 pvestatd[1635]: status update time (5.447 seconds)
Jan 05 11:41:17 proxmox4 pvestatd[1635]: status update time (5.580 seconds)
Jan 05 11:42:16 proxmox4 pmxcfs[1600]: [status] notice: received log
Jan 05 11:45:37 proxmox4 rrdcached[1472]: flushing old values
Jan 05 11:45:37 proxmox4 rrdcached[1472]: rotating journals
Jan 05 11:45:37 proxmox4 rrdcached[1472]: started new journal /var/lib/rrdcached/journal/rrd.journal.1515149137.113193
Jan 05 11:45:37 proxmox4 rrdcached[1472]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1515141937.113250
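To see the affected window at a glance, the pvestatd entries above can be summarized with a small script. This is only a minimal sketch, assuming the syslog line format shown in the log; the function name `slow_updates`, the 5-second threshold, and the shortened `sample` list are my own choices for illustration, not anything provided by Proxmox.

```python
import re

# Matches lines such as:
# Jan 05 11:20:27 proxmox4 pvestatd[1635]: status update time (5.769 seconds)
PATTERN = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ pvestatd\[\d+\]: "
    r"status update time \(([\d.]+) seconds\)"
)


def slow_updates(lines, threshold=5.0):
    """Return (timestamp, seconds) for each pvestatd update at or above threshold.

    Hypothetical helper: timestamps are kept as the raw strings from the log.
    """
    hits = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match is None:
            continue  # not a pvestatd "status update time" line
        seconds = float(match.group(2))
        if seconds >= threshold:
            hits.append((match.group(1), seconds))
    return hits


# A few lines copied from the log above; the last one is deliberately not a pvestatd entry.
sample = [
    "Jan 05 11:20:27 proxmox4 pvestatd[1635]: status update time (5.769 seconds)",
    "Jan 05 11:28:57 proxmox4 pvestatd[1635]: status update time (6.105 seconds)",
    "Jan 05 11:41:17 proxmox4 pvestatd[1635]: status update time (5.580 seconds)",
    "Jan 05 11:42:16 proxmox4 pmxcfs[1600]: [status] notice: received log",
]

hits = slow_updates(sample)
worst = max(hits, key=lambda hit: hit[1])
print(f"{len(hits)} slow updates, first at {hits[0][0]}, last at {hits[-1][0]}")
print(f"slowest: {worst[1]} seconds at {worst[0]}")
```

Feeding it the full syslog instead of `sample` would give the first and last slow update, which can then be compared against the time the VMs stopped responding.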
After 11:41 (the time of the last "status update time" log entry) everything worked fine again.
I suspect a disk error, but I can't find what or where.
For example, the event log of one of the VMs (Windows 2012) says:
VIOSTOR: Reset to device, \Device\RaidPort0, was issued.
Event ID 129
Where can I look to find out what went wrong?
dmesg shows nothing about it.
PVE version: pve-manager/4.3-1/e7cdc165 (running kernel: 4.4.19-1-pve)
Any hints?