Ceph not detecting disk failure

ermanishchawla

I am testing Ceph by running Proxmox VMs inside a Proxmox server using nested virtualization.

My Ceph setup consists of 3 VMs, each with 2 disks of 100 GB configured for Ceph. I detached a disk from one of the VMs to see how Ceph behaves on disk failure, but to my surprise the Ceph daemon did not detect the disk failure at all.
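For completeness, these are the cluster-level checks I am running alongside the ceph osd tree output below. This is just the set of standard status commands, nothing cluster-specific:

Code:
ceph -s              # overall cluster state; I expected HEALTH_WARN with an OSD reported down
ceph health detail   # detailed warnings (down OSDs, degraded/undersized PGs)
ceph osd stat        # quick count of how many OSDs the monitors consider up/in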



Code:
root@pve1:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME      STATUS  REWEIGHT  PRI-AFF
-1         0.51962  root default
-3         0.19537      host pve1
 0    ssd  0.09769          osd.0      up   1.00000  1.00000
 3    ssd  0.09769          osd.3      up   1.00000  1.00000
-5         0.12888      host pve2
 1    ssd  0.09769          osd.1      up   1.00000  1.00000
 4    ssd  0.03119          osd.4      up   1.00000  1.00000
-7         0.19537      host pve3
 2    ssd  0.09769          osd.2      up   1.00000  1.00000
 5    ssd  0.09769          osd.5      up   1.00000  1.00000
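As far as I understand, an OSD only gets marked down once its peers stop receiving heartbeats or the daemon itself exits, so the tree above may just mean that neither has happened yet. The relevant timing settings can be checked like this (the option names are the standard Ceph ones; I have not tuned them):

Code:
ceph config get osd osd_heartbeat_grace        # seconds without heartbeats before peers report the OSD down
ceph config get mon mon_osd_down_out_interval  # seconds a down OSD stays "in" before being marked out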



lsblk output before disk removal

Code:
root@pve1:~# lsblk -f
NAME             FSTYPE       LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINT
sda
|-sda1
|-sda2           vfat               B4F4-2A12
`-sda3           LVM2_member        bJvBeQ-LIvX-vhPw-5kch-Zgq1-BsTd-YsdqDl
  |-pve-swap     swap               b637cb37-b0dc-4d48-a34b-e1e1ad834a95                  [SWAP]
  |-pve-root     ext4               a30876c8-1400-41f9-aa1f-deaf8a4813c6     19.7G    14% /
  |-pve-data_tmeta
  | `-pve-data
  `-pve-data_tdata
    `-pve-data
sdc              LVM2_member        YqWY9f-2gur-fwKg-g2Lz-UXLP-fuHy-sMA1cD
`-ceph--1b243735--df87--43fd--9c67--d874ccfba9fb-osd--block--393a90ce--a915--4be5--b953--5e9406304a84
sdd              LVM2_member        vLltfD-DNcK-94rB-xPxI-aW3c-4duq-1M5a5I
`-ceph--90f206b7--44d2--4357--a61d--f37bfefddbde-osd--block--209d75a4--ffc8--4e04--854d--f3bef76d1685

lsblk output after disk removal

Code:
root@pve1:~# lsblk -f
NAME             FSTYPE       LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINT
sda
|-sda1
|-sda2           vfat               B4F4-2A12
`-sda3           LVM2_member        bJvBeQ-LIvX-vhPw-5kch-Zgq1-BsTd-YsdqDl
  |-pve-swap     swap               b637cb37-b0dc-4d48-a34b-e1e1ad834a95                  [SWAP]
  |-pve-root     ext4               a30876c8-1400-41f9-aa1f-deaf8a4813c6     19.7G    14% /
  |-pve-data_tmeta
  | `-pve-data
  `-pve-data_tdata
    `-pve-data
sdc              LVM2_member        YqWY9f-2gur-fwKg-g2Lz-UXLP-fuHy-sMA1cD
`-ceph--1b243735--df87--43fd--9c67--d874ccfba9fb-osd--block--393a90ce--a915--4be5--b953--5e9406304a84
sdd              LVM2_member        vLltfD-DNcK-94rB-xPxI-aW3c-4duq-1M5a5I
`-ceph--90f206b7--44d2--4357--a61d--f37bfefddbde-osd--block--209d75a4--ffc8--4e04--854d--f3bef76d1685
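The lsblk output looks unchanged after detaching the disk, so either the detach did not reach this node or device-mapper is still holding the old mapping; that part is just my assumption. To double-check which device actually backs each OSD, I am planning to use something like this (osd.0 used as an example id):

Code:
ceph-volume lvm list                   # shows each OSD together with the LV/PV it was created on
ls -l /var/lib/ceph/osd/ceph-0/block   # osd.0's data dir symlinks to its BlueStore block LV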



Code:
root@pve1:~# systemctl status ceph-osd@0.service
ceph-osd@0.service - Ceph object storage daemon osd.0
     Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
             └─ceph-after-pve-cluster.conf
     Active: active (running) since Sun 2021-06-20 00:33:01 IST; 12min ago
    Process: 317497 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=0/SUCCESS)
   Main PID: 317501 (ceph-osd)
      Tasks: 58
     Memory: 79.9M
     CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
             └─317501 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph
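The service staying active matches what I would expect if the OSD simply has not touched the missing device yet. My plan to confirm this is to watch the OSD log while generating some write traffic, roughly like this (the pool name testpool is just a placeholder on my side):

Code:
journalctl -u ceph-osd@0 -n 100 --no-pager     # recent osd.0 log entries; I/O errors should show up here
rados bench -p testpool 30 write --no-cleanup  # 30 seconds of writes to push I/O through the OSDs
ceph osd tree                                  # re-check whether the affected OSD now reports down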
 
pveversion -v output:

Code:
proxmox-ve: 6.4-1 (running kernel: 5.4.119-1-pve)
pve-manager: 6.4-8 (running version: 6.4-8/185e14db)
pve-kernel-5.4: 6.4-3
pve-kernel-helper: 6.4-3
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph: 15.2.13-pve1~bpo10
ceph-fuse: 15.2.13-pve1~bpo10
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.5-6
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
 
