restarted a node, some kvm's on other nodes panic

RobFantini

We have a ceph cluster.

A minute after restarting one node, at least 3 key KVMs panicked.

Screenshot attached.

The KVMs are on 2 different nodes.
 

Attachments

  • services-kvm-freeze-s24_Proxmox_Virtual_Environment.png (207.3 KB)
2 of the 3 KVMs had high memory usage.

One did not have swap.

All 3 were busy with disk I/O.

One of the nodes uses onboard SATA, the other is a recent high-end Supermicro with an IT-mode HBA.

Code:
 # pveversion -v
proxmox-ve: 4.4-79 (running kernel: 4.4.35-2-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-79
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-108
pve-firmware: 1.1-10
libpve-common-perl: 4.0-91
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-93
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-1
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve14~bpo80
ceph: 10.2.5-1~bpo80+1
 
We have a ceph cluster.

A minute after restarting one node, at least 3 key KVMs panicked.

Screenshot attached.

The KVMs are on 2 different nodes.

Post your VM config:

> qm config VMID

What OS do you run inside, in detail?
 
All three use Jessie.
Code:
boot: cn
bootdisk: scsi0
cores: 2
memory: 1024
name: fbcadmin
net0: virtio=DE:60:C3:F6:55:23,bridge=vmbr1
numa: 0
onboot: 1
ostype: l26
protection: 1
scsi0: ceph-kvm3:vm-100-disk-1,discard=on,size=8G
smbios1: uuid=195cf837-ebaa-49c2-95e9-5ba7a0869cb0
sockets: 1
 
Also, none of the systems logged out-of-memory errors, so probably not a memory issue. Note that memory on the above config was 512 MB yesterday.

Kernel running, per uname -a:
Code:
Linux fbcadmin 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 GNU/Linux
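For reference, one way to double-check the guests for OOM-killer activity; just a sketch, the log locations depend on the guest's syslog setup:

Code:
# inside each guest
dmesg | grep -i -E 'out of memory|oom'
grep -i 'killed process' /var/log/syslog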
 
Code:
...
scsi0: ceph-kvm3:vm-100-disk-1,discard=on,size=8G
...

Make sure that you use the virtio-scsi controller (not LSI), see the VM options. I remember some panics when using LSI recently, but I did not debug it further, as modern OSes should use virtio-scsi anyway.
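For reference, switching the controller can also be done from the CLI; a minimal sketch, assuming VMID 100 from the config above (the guest needs a restart for the new controller to take effect):

Code:
# change the SCSI controller type of VM 100 to virtio-scsi
qm set 100 --scsihw virtio-scsi-pci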
 
After some research, since I have 8 nodes, I'll try using 5 for OSDs and 3 for VMs. I am not sure yet where to place the 3 mons.
Hi Rob,
I'm not sure if this helps with this issue, but I had a separate ceph cluster (8 nodes) where the mons ran on the pve-nodes.
So I would run the mons on the VM nodes.

Was the restarted node an osd+mon node? Because there is an issue where the OSD stop is not recognised early enough, because the mon also dies too fast. If you restart a node and shut down the ceph-osd first, the VMs have approx. 20 sec less IO stall.

Normally the VMs should handle short IO stalling without trouble, but perhaps not?! (I don't know if discard is also a problem in this case.)

Udo
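For reference, a rough sketch of that pre-reboot sequence, assuming the OSDs on the node are managed via systemd (as in a standard PVE 4.4 / Ceph Jewel setup; unit/target names may differ in your deployment):

Code:
# on the node that is about to be restarted:
ceph osd set noout              # optional: avoid rebalancing while the node is down
systemctl stop ceph-osd.target  # stop the local OSDs cleanly before the mon goes away
reboot
# after the node is back up and its OSDs have rejoined:
ceph osd unset noout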
 
Because there is an issue where the OSD stop is not recognised early enough, because the mon also dies too fast. If you restart a node and shut down the ceph-osd first, the VMs have approx. 20 sec less IO stall.
Answering myself:
I got an email that this bug (#18516) is solved now, but I don't know how long it will take for these changes to get into ceph (I guess 10.2.6).

Udo
 
Make sure that you use the virtio-scsi controller (not LSI), see the VM options. I remember some panics when using LSI recently, but I did not debug it further, as modern OSes should use virtio-scsi anyway.

They are set to LSI; I'll do the switch. Thank you.
 
Hi Rob,
I'm not sure if this helps with this issue, but I had a separate ceph cluster (8 nodes) where the mons ran on the pve-nodes.
So I would run the mons on the VM nodes.

Was the restarted node an osd+mon node? Because there is an issue where the OSD stop is not recognised early enough, because the mon also dies too fast. If you restart a node and shut down the ceph-osd first, the VMs have approx. 20 sec less IO stall.

Normally the VMs should handle short IO stalling without trouble, but perhaps not?! (I don't know if discard is also a problem in this case.)

Udo

Udo: yes, the restarted node ran mon+osd.