VM stopped

snpz

Hi!

I have a 6-server Proxmox cluster. Each server is installed on a ZFS mirror, which is also used for VM storage.
The problem is that today one VM shut down for no evident reason. After starting the VM back up, I checked the syslog on both the VM and the host machine.

The VM is a mail server. Here are the last message before it stopped and the first message after it started:
Jan 31 17:20:12 mail postfix/smtpd[10505]: disconnect from mxl064v67.mxlogic.net[208.81.64.67]
Jan 31 18:29:25 mail kernel: imklog 5.8.6, log source = /proc/kmsg started.

And this is from the host syslog:
Jan 31 17:07:23 ve5 pvestatd[2421]: storage 'VMBackup' is not online
Jan 31 17:12:10 ve5 pmxcfs[2384]: [status] notice: received log
Jan 31 17:14:26 ve5 systemd-timesyncd[1781]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.001s/0.002s/+7ppm
Jan 31 17:15:24 ve5 pmxcfs[2384]: [status] notice: received log
Jan 31 17:17:01 ve5 CRON[18231]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jan 31 17:20:00 ve5 pmxcfs[2384]: [status] notice: received log
Jan 31 17:20:46 ve5 systemd[1]: Stopping LVM2 PV scan on device 230:21...
Jan 31 17:20:46 ve5 systemd[1]: Requested transaction contradicts existing jobs: File exists
Jan 31 17:20:46 ve5 systemd[1]: Stopped LVM2 PV scan on device 230:21.
Jan 31 17:33:33 ve5 pmxcfs[2384]: [dcdb] notice: data verification successful


Any ideas where to look for more information?
If there were any memory issues (out of memory or something similar), syslog would have some lines about that, right?
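
One way to check for that is to scan the host log for the messages the kernel usually writes when the OOM killer fires. This is only a minimal sketch: the helper names `find_oom_lines` and `scan_log` are made up for illustration, and the search strings are the typical kernel wording ("invoked oom-killer", "Out of memory", "Killed process"), which I am assuming applies to this kernel as well.

```python
import re

# Phrases the Linux kernel typically logs around an OOM kill (assumed wording).
OOM_PATTERN = re.compile(
    r"invoked oom-killer|out of memory|killed process", re.IGNORECASE
)


def find_oom_lines(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/syslog"):
    """Scan one log file; the default path is the usual Debian location."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling `scan_log()` on the host (and again with `/var/log/kern.log`) and printing the result should show whether a kvm process was killed around 17:20. An empty result only means nothing matching was written to that file, not that no kill happened.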

pveversion -v
proxmox-ve: 4.4-78 (running kernel: 4.4.35-2-pve)
pve-manager: 4.4-5 (running version: 4.4-5/c43015a5)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-78
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-102
pve-firmware: 1.1-10
libpve-common-perl: 4.0-85
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-71
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-1
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-90
pve-firewall: 2.0-33
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.6-5
lxcfs: 2.0.5-pve2
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
 
It happened again today!
This time I found out that an UrBackup process was running on the mail server. But still, why did the VM just stop? Did the host or the VM lack the resources to handle an incremental backup, so the VM was simply killed!?
The syslog on the host machine and on the VM is as empty as last time.
Any ideas?
 
I was also hit by the OOM killer yesterday on the 4.4.35-2-pve kernel. I have yet to debug it further, but I wanted to mention something about syslog: check the ownership and permissions of /var/log/syslog. I've noticed that the ownership of the file has been wrong on occasion (it should likely be syslog:syslog or syslog:adm), so you could try removing the empty log file and restarting rsyslogd to have it recreate the file with the correct permissions.
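
To see what the file currently looks like, something along these lines should work. It is a rough sketch, and `describe_ownership` is just a name I picked for illustration. It only reports the owner, group and mode, so you still have to compare the result with what your distribution expects.

```python
import grp
import os
import pwd
import stat


def describe_ownership(path):
    """Return (owner, group, mode) for path, with mode as a string like '0o640'."""
    info = os.stat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        owner = str(info.st_uid)
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        # No group entry for this gid; fall back to the numeric id.
        group = str(info.st_gid)
    mode = oct(stat.S_IMODE(info.st_mode))
    return owner, group, mode
```

Running `describe_ownership("/var/log/syslog")` on the host and on the VM gives a quick comparison. If the owner or group is not what rsyslogd runs as, that could explain why the log stays empty.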
 
