Hi there,
I am running a two-node Proxmox cluster and have mapped and mounted RBD images from a remote Ceph cluster (latest Mimic release). We currently use the RBD image mount as backup storage for our VMs (mounted at /var/lib/backup).
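For reference, the image is mapped with the kernel RBD client and mounted on each node roughly like this (the image name is a placeholder; the pool name 'pve' is the one from the osdmap output below):
# map the image via the kernel client (krbd) and mount it as backup storage
rbd map pve/backup-image        # "backup-image" is a placeholder name
mount /dev/rbd/pve/backup-image /var/lib/backup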
Everything works fine until an OSD or an OSD host (we have 3 hosts, each with 16 OSDs) fails or crashes. In that case the mount/filesystem locks up and hangs. Other systems (e.g. Ubuntu 14.04) with an RBD image mapped from the same cluster have no problems at all.
It looks as if the connections to the failed OSDs are never closed and no new connections to the still available OSDs are opened. It fails every single time, without exception, as soon as one or more OSDs go down.
For example, we observed the following sequence:
1) OSD host crashes
2) the mount at /var/lib/backup is still there but hangs (see the quick check after this list)
3) OSD host is rebooted
4) mount works fine again
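For completeness, this is a minimal way to probe the mount without blocking the shell while the hang is happening (the 10-second timeout is arbitrary):
# returns immediately if the filesystem responds, otherwise reports the hang after 10s
timeout 10 ls /var/lib/backup >/dev/null || echo "/var/lib/backup appears hung"
# the kernel client's in-flight requests can then be inspected via debugfs
cat /sys/kernel/debug/ceph/*/osdc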
Output of "cat /sys/kernel/debug/ceph/*/osdmap" with 2 failing OSDs:
epoch 10158 barrier 0 flags 0x188000
pool 13 'pve' type 1 size 3 min_size 2 pg_num 256 pg_num_mask 255 flags 0x1 lfor 0 read_tier -1 write_tier -1
osd0 10.250.100.11:6819 100% (exists, up) 100%
osd1 10.250.100.11:6812 100% (exists, up) 100%
osd2 10.250.100.11:6809 100% (exists, up) 100%
osd3 10.250.100.11:6816 100% (exists, up) 100%
osd4 10.250.100.11:6826 100% (exists, up) 100%
osd5 10.250.100.11:6805 100% (exists, up) 100%
osd6 10.250.100.11:6803 100% (exists, up) 100%
osd7 10.250.100.11:6811 100% (exists, up) 100%
osd8 10.250.100.11:6807 100% (exists, up) 100%
osd9 10.250.100.11:6824 100% (exists, up) 100%
osd10 10.250.100.11:6831 100% (exists, up) 100%
osd11 10.250.100.11:6801 100% (exists, up) 100%
osd12 10.250.100.11:6815 100% (exists, up) 100%
osd13 10.250.100.11:6829 100% (exists, up) 100%
osd14 10.250.100.11:6823 100% (exists, up) 100%
osd15 10.250.100.11:6821 100% (exists, up) 100%
osd16 10.250.100.12:6825 100% (exists, up) 100%
osd17 10.250.100.12:6805 100% (exists, up) 100%
osd18 10.250.100.12:6829 100% (exists, up) 100%
osd19 10.250.100.12:6826 100% (exists, up) 100%
osd20 10.250.100.12:6819 100% (exists, up) 100%
osd21 10.250.100.12:6806 100% (exists, up) 100%
osd22 10.250.100.12:6808 100% (exists, up) 100%
osd23 10.250.100.12:6817 100% (exists, up) 100%
osd24 10.250.100.12:6807 100% (exists, up) 100%
osd25 10.250.100.12:6804 100% (exists, up) 100%
osd26 10.250.100.12:6801 100% (exists, up) 100%
osd27 10.250.100.12:6815 100% (exists, up) 100%
osd28 10.250.100.12:6816 100% (exists, up) 100%
osd29 (unknown sockaddr family 0) 0% (doesn't exist) 100%
osd30 10.250.100.12:6818 100% (exists, up) 100%
osd31 10.250.100.12:6803 100% (exists, up) 100%
osd32 10.250.100.13:6813 100% (exists, up) 100%
osd33 10.250.100.13:6819 100% (exists, up) 100%
osd34 10.250.100.13:6811 100% (exists, up) 100%
osd35 10.250.100.13:6815 100% (exists, up) 100%
osd36 10.250.100.13:6823 100% (exists, up) 100%
osd37 10.250.100.13:6821 100% (exists, up) 100%
osd38 10.250.100.13:6817 100% (exists, up) 100%
osd39 10.250.100.13:6825 100% (exists, up) 100%
osd40 10.250.100.13:6805 100% (exists, up) 100%
osd41 10.250.100.13:6827 100% (exists, up) 100%
osd42 10.250.100.13:6807 100% (exists, up) 100%
osd43 10.250.100.13:6803 100% (exists, up) 100%
osd44 (unknown sockaddr family 0) 0% (doesn't exist) 100%
osd45 10.250.100.13:6801 100% (exists, up) 100%
Output of "cat /sys/kernel/debug/ceph/*/osdc":
REQUESTS 4 homeless 0
2553328 osd17 13.d0f0e1bb 13.bb [17,5,14]/17 [17,5,14]/17 e10158 rbd_data.3235e6b8b4567.000000000000fb16 0x400024 1 set-alloc-hint,write
2553933 osd20 13.8b5b5080 13.80 [20,33,39]/20 [20,33,39]/20 e10158 rbd_data.3235e6b8b4567.0000000000000001 0x400014 1 read
2552044 osd25 13.7e756148 13.48 [25,4,27]/25 [25,4,27]/25 e10158 rbd_header.3235e6b8b4567 0x400024 1 watch-reconnect
2555730 osd25 13.7e756148 13.48 [25,4,27]/25 [25,4,27]/25 e10158 rbd_header.3235e6b8b4567 0x400024 1 watch-ping
LINGER REQUESTS
18446462598732840964 osd25 13.7e756148 13.48 [25,4,27]/25 [25,4,27]/25 e10158 rbd_header.3235e6b8b4567 0x20 25 WC/0
BACKOFFS
Output of dmesg:
[10346280.518308] libceph: mon0 10.250.100.11:6789 socket error on write
[10346287.558513] libceph: mon0 10.250.100.11:6789 socket error on write
[10346295.558764] libceph: mon0 10.250.100.11:6789 socket error on write
[10346307.078983] libceph: mon1 10.250.100.12:6789 socket error on write
[10346310.535123] libceph: mon1 10.250.100.12:6789 socket error on write
[10346313.607196] libceph: mon1 10.250.100.12:6789 socket error on write
[10346316.683264] libceph: mon1 10.250.100.12:6789 socket error on write
[10346323.719446] libceph: mon1 10.250.100.12:6789 socket error on read
[10346337.799760] libceph: mon0 10.250.100.11:6789 socket error on write
[10346341.543934] libceph: mon0 10.250.100.11:6789 socket closed (con state CONNECTING)
[10346343.687910] libceph: osd20 10.250.100.12:6819 socket error on write
[10346344.615951] libceph: mon0 10.250.100.11:6789 socket error on read
[10346352.584165] libceph: mon0 10.250.100.11:6789 socket error on write
[10346361.352433] libceph: mon2 10.250.100.13:6789 socket error on read
[10346364.424494] libceph: mon2 10.250.100.13:6789 socket error on read
[10346367.496594] libceph: mon2 10.250.100.13:6789 socket error on write
[10346370.568700] libceph: mon2 10.250.100.13:6789 socket error on write
[10346392.073164] libceph: mon0 10.250.100.11:6789 socket error on write
[10346394.121205] libceph: osd25 10.250.100.12:6804 socket error on write
[10346395.145267] libceph: mon0 10.250.100.11:6789 socket error on read
[10346398.217349] libceph: mon0 10.250.100.11:6789 socket error on read
[10346401.289387] libceph: mon0 10.250.100.11:6789 socket error on read
[10346408.713570] libceph: mon0 10.250.100.11:6789 socket error on write
Any ideas? Any help would be greatly appreciated. Thank you very much in advance!