Hi guys,
Sorry I have no text logs yet, only screenshots.
My problem is that roughly once every 2-3 days, VMs crash with the messages shown in the attached screenshots:
First: we got e1000 error messages, so I changed the NICs to RTL ones and the errors disappeared.
Second: from what I can see, at least the swap drive was lost. I also see the CPU loaded with I/O wait (wa) before the VM panics, which also points to a disk issue. Changing the I/O scheduler from cfq didn't help.
And yesterday a FreeBSD VM crashed:
We are running on ZFS and using pve-zsync for backups. My thought was that pve-zsync was locking up the HW RAID, since we make backups over bonded interfaces at the maximum backup speed, so I lowered the limit to 20 MB/sec, but we still got a crash.
I could suspect a CloudLinux 6.8/CentOS 6 and kernel 2.6 issue, as I saw reports of kernel 2.6 working badly under KVM, but they are several years old and marked as solved.
Please help.
One more thought: could changing the storage type from ZFS to qcow2 files help?
proxmox-ve: 4.4-76 (running kernel: 4.4.21-1-pve)
pve-manager: 4.4-2 (running version: 4.4-2/80259e05)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.13-1-pve: 4.4.13-56
pve-kernel-4.4.35-1-pve: 4.4.35-76
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.4.13-2-pve: 4.4.13-58
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.2.8-1-pve: 4.2.8-41
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.10-1-pve: 4.4.10-54
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-102
pve-firmware: 1.1-10
libpve-common-perl: 4.0-84
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-70
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-1
pve-qemu-kvm: 2.7.0-9
pve-container: 1.0-89
pve-firewall: 2.0-33
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-2
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80