Host Node network config:
vmbr0 is used for all VMs and CTs
vmbr0:0 is used for corosync.
eno2 is used for Ceph traffic (currently just for testing; most nodes use local ext4 directory storage)
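For reference, a minimal /etc/network/interfaces sketch of such a layout; the addresses, subnets, and the bridge port (eno1) are assumptions here, not taken from our actual config:

```
auto vmbr0
iface vmbr0 inet static
    address 192.0.2.10/24
    gateway 192.0.2.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

# Alias interface carrying the corosync cluster traffic
auto vmbr0:0
iface vmbr0:0 inet static
    address 10.10.10.10/24

# Dedicated NIC for Ceph traffic
auto eno2
iface eno2 inet static
    address 10.20.20.10/24
```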
I have already tried many different configurations:
- VirtIO SCSI single
- VirtIO SCSI
- with/without iothread...
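For completeness, the controller and iothread variants above can be switched per VM with `qm`; VMID 100 and the disk/storage names are just examples:

```
# Single-queue VirtIO SCSI controller (allows one iothread per disk)
qm set 100 --scsihw virtio-scsi-single
# ...or the plain VirtIO SCSI controller
qm set 100 --scsihw virtio-scsi-pci
# Enable the iothread flag on the disk itself
qm set 100 --scsi0 local:vm-100-disk-0,iothread=1
```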
I am still working on this issue and have found the following:
Once I remove the network adapter from the Proxmox VM, it no longer freezes.
If I add it again, the VM freezes regardless of the model (VirtIO, Intel, or Realtek).
I also enabled the firewall to drop all incoming packets; the VM still freezes.
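In case anyone wants to reproduce the test above, removing and re-adding the NIC can be done with `qm`; VMID 100 and bridge vmbr0 are examples:

```
# Remove the VM's network adapter
qm set 100 --delete net0
# Re-add it with a given model (virtio / e1000 / rtl8139)
qm set 100 --net0 virtio,bridge=vmbr0,firewall=1
```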
There are no D-state processes, except that a Ceph process is sometimes in D state for a few seconds.
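This is how I check for D-state processes; an empty result means no process is currently stuck in uninterruptible sleep:

```shell
# List processes in uninterruptible sleep (STAT starting with D)
ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /^D/'
```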
Our cluster has 32 nodes.
Most of them are HP BL460c G8 blades with the newest BIOS from 05/2018 (I31) and the newest RAID controller firmware (also tested in HBA mode to bypass the RAID controller, no luck). 2x E5-2650v2...
The moment a VM freezes (http://prntscr.com/nykjl6, see 21:18 - 21:42), all other VMs show high disk usage:
What does the Disk I/O graph on a VM mean? Does it refer to that VM only or to the overall disk?
These are VMs used by customers as well as test VMs without any custom services running. All KVM VMs are affected, Linux and Windows. We are facing this on all nodes in our cluster, using Ceph RBD storage or directory storage. LXC containers on the same hosts are not affected.
We are now using a Ceph cluster (RBD) with raw VM disks. Unfortunately the issue still persists.
Windows VMs keep freezing after a while; CPU usage looks like this:
http://prntscr.com/ny3jkp (this is where they freeze). Stopping the VM and starting it again leads to:
There is still a qemu process of...
I played around with this for a few hours and have now successfully moved 6 nodes to a USB drive with these steps:
- Check how much disk space is used by the root partition. My USB stick has 32 GB, so I made sure the root partition used less than 20 GB.
- Start a rescue system (I use sysrcd).
- vgrename pve pve-old ## rename...
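The first two steps above look roughly like this on the shell (the 20 GB limit is specific to my 32 GB stick):

```shell
# 1) On the running node: check that the root filesystem fits on the stick
#    (usage should stay well under 20 GB in my case)
df -h /

# 2) From the rescue system (NOT on the live node): rename the existing
#    volume group so the install on the USB stick can reuse the name "pve"
# vgrename pve pve-old
```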
I am currently playing around with Ceph. I configured it via the Proxmox GUI, added some OSDs, added a metadata server, and finally created "CephFS" in the "CephFS" tab:
Under "Storages" (Proxmox cluster) I accidentally deleted the CephFS storage. When I try to create it...
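If only the storage definition was deleted (not CephFS itself), it should be possible to re-create the entry on the CLI with `pvesm`; the storage ID "cephfs" and the content types are examples, and on a hyperconverged setup the monitors should be taken from the local Ceph config:

```
# Re-add the CephFS storage definition to the cluster
pvesm add cephfs cephfs --content backup,iso,vztmpl
```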
We have a cluster with some nodes. All nodes have a 2 TB local disk where Proxmox is installed (with its LVs). The remaining space is allocated as an ext4 directory storage. I would like to move the OS to a new 60 GB SSD without losing any VPS data, i.e. the VPS stay on their ext4 directory...
We have been facing issues on all KVM VMs (qcow2, VirtIO SCSI, directory storage (ext4)) for months now.
These VMs freeze suddenly, meaning we are unable to access them via RDP or even via the console (VNC).
Stopping a frozen VM and trying to start it again leads to:
There is still a...