OSD destroy takes an extremely long time and seems to block pvestatd

Hi,
I've found a curious behavior when deleting an OSD. Down and Out work fine, and the cluster rebalances. Then I try to destroy the OSD.

---
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.583926 s, 18.0 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-833dec7c-af78-46b5-8231-bb33a9509b8d
Running command: /sbin/vgremove -v -f ceph-833dec7c-af78-46b5-8231-bb33a9509b8d
stderr: Removing ceph--833dec7c--af78--46b5--8231--bb33a9509b8d-osd--block--ada48ffe--ae03--4a59--b7a8--14a0186b51f9 (253:6)
stderr: Archiving volume group "ceph-833dec7c-af78-46b5-8231-bb33a9509b8d" metadata (seqno 17).
stderr: Releasing logical volume "osd-block-ada48ffe-ae03-4a59-b7a8-14a0186b51f9"
---

At this point it hangs. About 10 minutes later pvestatd turns that host gray, because the command it spawns (/sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count) hangs as well. Restarting pvestatd helps, and the task then finishes:

---
stderr: Creating volume group backup "/etc/lvm/backup/ceph-833dec7c-af78-46b5-8231-bb33a9509b8d" (seqno 18).
stdout: Logical volume "osd-block-ada48ffe-ae03-4a59-b7a8-14a0186b51f9" successfully removed
stderr: Removing physical volume "/dev/sdi" from volume group "ceph-833dec7c-af78-46b5-8231-bb33a9509b8d"
stdout: Volume group "ceph-833dec7c-af78-46b5-8231-bb33a9509b8d" successfully removed
--> Zapping successful for OSD: 1
command '/sbin/pvremove /dev/sdi' failed: Insecure dependency in exec while running with -T switch at /usr/share/perl/5.28/IPC/Open3.pm line 178.
TASK OK
---

I once waited without restarting pvestatd... for hours ;-)
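
For anyone who wants to check whether the vgs call is really the part that blocks, here is a minimal sketch that runs the same command with a time limit. The helper name `finishes_in_time` and the 30 second limit are my own assumptions, not anything pvestatd does internally, and a process stuck in uninterruptible I/O may survive the kill.

```python
import subprocess

# The vgs invocation quoted above, split into an argument list.
VGS_CMD = [
    "/sbin/vgs", "--separator", ":", "--noheadings", "--units", "b",
    "--unbuffered", "--nosuffix",
    "--options", "vg_name,vg_size,vg_free,lv_count",
]


def finishes_in_time(cmd, timeout_s=30):
    """Return True if cmd exits within timeout_s seconds, else False.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Best effort: we do not wait for the process afterwards, because a
        # process blocked in the kernel might never exit and we would hang too.
        proc.kill()
        return False


if __name__ == "__main__":
    try:
        if finishes_in_time(VGS_CMD):
            print("vgs returned within the time limit")
        else:
            print("vgs did not return in time; it looks hung")
    except FileNotFoundError:
        print("/sbin/vgs not found on this machine")
```

Run it as root on the affected node while the destroy task is stuck. If it reports a hang, the vgs call is blocking even outside pvestatd.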

Not a big issue, but I didn't see this behavior in earlier versions. I'm using the latest pve-manager/6.2-12/b287dd27.

Best Regards,
Oliver
 
Hi,

I guess it is fixed, because I can't reproduce this on my current version, 6.2-13.