Urgent/important issue regarding Proxmox/Ceph storage and KVM virtualization!

Daniel S.

Member
May 15, 2019
Hello,

We are running multiple VMs in the following environment: a Proxmox cluster with Ceph block storage; all OSDs are enterprise SSDs, and the RBD pool is replicated three times.

Ceph version: 15.2.11

All nodes in the cluster run exactly these package versions: https://pastebin.com/ugjzptQ9

We installed a Red Hat based OS on one VM and started migrating data to it from another machine via rsync (the source machine is outside this cluster).
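For reference, the migration was a plain rsync over SSH from the external machine into the VM, roughly along these lines (host name and paths here are only illustrative, not the real ones):
Code:
rsync -aHAX --numeric-ids --info=progress2 root@source.example.com:/srv/data/ /mnt/data/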

The VM had three virtio-scsi disks attached; see below for the full disk configuration.
The VM also had an EFI disk (all disks, including the EFI disk, were located on the same Ceph RBD storage) and used OVMF (UEFI) as BIOS.

This is from the VM config file:
Code:
efidisk0: rbd:vm-108-disk-1,size=1M
scsi0: rbd:vm-108-disk-0,backup=0,cache=writeback,discard=on,iothread=1,queues=8,size=250G    - /dev/sda1 partition ext4
scsi1: rbd:vm-108-disk-2,backup=0,cache=writeback,discard=on,iothread=1,queues=8,size=500G    - /dev/sdb1 partition ext4
scsi2: rbd:vm-108-disk-3,backup=0,cache=writeback,discard=on,iothread=1,queues=8,size=2T         - /dev/sdc1 partition ext4

The VM was suddenly killed by the OOM killer. That by itself is not the issue: we assigned too much memory to the VM (the node has 256 GB of RAM and runs a few more VMs, and we gave 192 GB to this specific VM), so we probably just need more RAM. But see below what happened next.

Check the logs from the node which hosted the VM when it killed it: https://pastebin.com/EUPZa9m7

The very big problem is that the /dev/sda1 and /dev/sdb1 partitions no longer exist after booting the VM; it appears that something wiped/removed them, which is unacceptable. We booted from live CDs: the disks are all there, but there are no partitions left on the /dev/sda and /dev/sdb drives. The only one that still exists and can be mounted is /dev/sdc1.
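For completeness, these are the kind of checks we ran from the live CD to confirm this (device names as seen inside the VM):
Code:
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT   # sda and sdb are present, but show no partitions
fdisk -l /dev/sda /dev/sdb /dev/sdc         # no partition table entries left on sda/sdb
blkid                                       # only /dev/sdc1 still reports an ext4 filesystem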

Do you have any idea about this? What could cause this behaviour?
Nothing happened on Ceph: nothing suspicious in the logs, no failed OSDs or PGs, and health was and still is OK.
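(We verified that with the usual status commands, for example:)
Code:
ceph -s               # overall cluster status: HEALTH_OK
ceph health detail    # would list any warnings or errors; there are none
ceph osd tree         # all OSDs up and in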

This is a very weird situation. We have been running environments like this for years, with many different guest OSes in the VMs, and have never encountered such an issue; this has to be investigated.

If someone has a hint/clue/idea please let me know.
Thank you.
 
The very big problem is that the /dev/sda1 and /dev/sdb1 partitions no longer exist after booting the VM; it appears that something wiped/removed them, which is unacceptable. We booted from live CDs: the disks are all there, but there are no partitions left on the /dev/sda and /dev/sdb drives. The only one that still exists and can be mounted is /dev/sdc1.

Do you have any idea about this? What could cause this behaviour?
The simplest explanation would be that the changes to the partition table and data simply were not committed to disk yet, i.e., they existed only in the in-memory cache and thus were lost when the VM was OOM-killed. That theory could be supported by the fact that the VM is using the writeback cache mode, which allows some writes to return immediately, i.e., before they are on the actual storage.

For Ceph this would also require that either less than 24 MiB had been written so far, or that the rbd_cache_max_dirty config option was tuned to a higher value, as otherwise writes start to block until flushed, to avoid too many outstanding writes in the cache.
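If you want to check whether that could apply here, you could look at the cache mode of the VM's disks and at any librbd cache tuning; something along these lines (VMID 108 taken from your post):
Code:
qm config 108 | grep scsi                    # shows cache=writeback on all three disks
grep -i rbd_cache /etc/pve/ceph.conf         # any explicit rbd cache tuning in the cluster config?
ceph config get client rbd_cache_max_dirty   # if unset, this should just report the 24 MiB (25165824 bytes) default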

In addition to that, the guest's mount options and kernel also have something to say in that whole behaviour.
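For example, inside the guest you could check what the filesystems were actually mounted with (write barriers, journaling mode and so on) and which kernel it runs:
Code:
findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS      # effective mount options of all filesystems
dmesg | grep -i ext4                         # ext4 mount and journal messages from the guest kernel
uname -r                                     # guest kernel version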

As said above, I'm going for the simplest explanation in general; a closer investigation would require a lot more information, possibly some experimenting with the setup, and also a lot of time. That is rather out of scope for most to do in the community forum.
 
Hello,

Thank you for your explanation.

Does what you mention apply if this specific VM was created about a month ago, and was shut down, started and rebooted several times before we started the migration?

What I mean is that the changes were surely committed to the disks, as the partitions were created a long time before.
 
Does what you mention apply if this specific VM was created about a month ago, and was shut down, started and rebooted several times before we started the migration?
No, then it's rather impossible that the above theory applies in your case.
 
That is what I also thought.

Now, the problem is that we are afraid to try again because we have no idea what else could happen, as there are more VMs running inside this cluster. If you have any other idea of what we should check, please let us know.
 
Hi, can you check whether it is just the partition table that was wiped, and whether you are able to restore it with testdisk/gpart?
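Roughly like this, from a live CD (testdisk is interactive, so this is just a sketch; gpart is an alternative that tries to guess the old layout):
Code:
fdisk -l /dev/sda        # confirm there is really no partition table left
testdisk /log /dev/sda   # Analyse -> Quick/Deeper Search, then write the recovered table
gpart /dev/sda           # alternatively, guess the former partition layout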

Was a backup with vzdump or PBS involved?

There is a long-standing bug about losing the partition table ( https://bugzilla.proxmox.com/show_bug.cgi?id=2874 ), and since at least a handful of people have been affected, I think it should really get more prioritization, IMHO, because it is a bug that can lead to losing trust in this great product. Anyhow, there is no repro case yet, so it is quite difficult to identify what is causing it.
 
It can't be restored; the disks exist, but the partition table was wiped.

No backup was running, only what I described in the first post.

It should get more priority because this is very alarming and can happen in a production environment.
 
There is no way to recover anything. The bug seems similar, because when backups run, I/O and resource usage are probably higher than normal; we did not run any backups, but we did start a remote rsync restore to the VM, which caused high I/O and other resource usage and led to the disaster. As stated above, this should be taken very seriously: no matter how many replicas are in your Ceph pools/storage, and no matter whether your Ceph health is OK, data is lost. Not to mention the downtime until you restore, even if other backups exist. Any way you look at it, data can be and was lost.
 
I have backups running on PBS; however, restoring them was of no use either. We are running over 150 VMs in our Proxmox cloud and use PBS for backups, so this has become a real concern, as we do not have any other backups. I wonder what Proxmox is doing about this; the least they can do is provide support and guidance.
 
