Ceph snapshot LXC backup hanging

Rob Loan

Renowned Member
Mar 25, 2017
I've had a number of stuck backup jobs for LXC containers that use a Ceph snapshot as the source. Has anyone else noticed this? It's been happening for a while:

Code:
2017-12-10 05:00:15.946415 osd.0 osd.0 10.1.1.10:6801/30151 14 : cluster [ERR] 1.3d missing primary copy of 1:bc3c5d99:::rbd_data.a6d574b0dc51.00000000000004f7:head, will try copies on 3,5
2017-12-11 05:04:40.047933 osd.0 osd.0 10.1.1.10:6804/2946 121 : cluster [ERR] 1.6d missing primary copy of 1:b665eacf:::rbd_data.8c342ae8944a.000000000000081a:head, will try copies on 3,5
2017-12-12 05:03:26.166045 osd.0 osd.0 10.1.1.10:6804/2946 238 : cluster [ERR] 1.f1 missing primary copy of 1:8f0ff2c2:::rbd_data.8c342ae8944a.0000000000000628:head, will try copies on 2,5
2017-12-23 05:10:58.722254 osd.2 osd.2 10.1.1.9:6804/2182 383 : cluster [ERR] 3.3c missing primary copy of 3:3d470d6b:::rbd_data.a6239643c9869.00000000000001ab:head, will try copies on 0,4
2018-01-03 05:08:28.078099 osd.3 osd.3 10.1.1.9:6804/2177 105 : cluster [ERR] 3.14 missing primary copy of 3:2812d53a:::rbd_data.a6239643c9869.00000000000006ba:head, will try copies on 0,4
2018-01-04 05:10:02.519346 osd.2 osd.2 10.1.1.9:6804/2245 45 : cluster [ERR] 3.5 missing primary copy of 3:a17478e7:::rbd_data.a6239643c9869.000000000000026a:head, will try copies on 0,5
2018-01-06 05:02:28.087476 osd.1 osd.1 10.1.1.10:6804/3394 37 : cluster [ERR] 3.b missing primary copy of 3:d1c318cd:::rbd_data.a41d774b0dc51.0000000000000519:head, will try copies on 2,4
2018-01-06 05:08:36.092738 osd.0 osd.0 10.1.1.10:6800/3249 65 : cluster [ERR] 3.45 missing primary copy of 3:a36530a0:::rbd_data.a41d774b0dc51.0000000000000cbe:head, will try copies on 3,5
2018-01-07 05:01:35.730387 osd.1 osd.1 10.1.1.10:6804/3394 66 : cluster [ERR] 3.4 missing primary copy of 3:20c0df81:::rbd_data.a6239643c9869.0000000000000834:head, will try copies on 2,5
2018-01-07 15:16:56.002832 osd.1 osd.1 10.1.1.10:6804/3356 5 : cluster [ERR] 3.65 missing primary copy of 3:a657a9ec:::rbd_data.a6239643c9869.0000000000000167:head, will try copies on 2,4
2018-01-08 05:02:30.962180 osd.0 osd.0 10.1.1.10:6800/3033 31 : cluster [ERR] 3.17 missing primary copy of 3:e9e1edfd:::rbd_data.a6239643c9869.00000000000006a4:head, will try copies on 3,5
2018-01-18 05:10:33.630255 osd.3 osd.3 10.1.1.9:6804/2172 47 : cluster [ERR] 3.39 missing primary copy of 3:9ccaaf7f:::rbd_data.a6239643c9869.000000000000062c:head, will try copies on 0,4
2018-01-18 18:59:53.061005 osd.3 osd.3 10.1.1.9:6804/2202 23 : cluster [ERR] 3.71 missing primary copy of 3:8ff57dec:::rbd_data.a6239643c9869.00000000000003e4:head, will try copies on 1,5
2018-01-18 19:22:27.504886 osd.2 osd.2 10.1.1.9:6800/2030 1 : cluster [ERR] 3.5d missing primary copy of 3:bb371d0a:::rbd_data.a6239643c9869.0000000000000275:head, will try copies on 0,4
2018-01-18 19:48:34.393222 osd.0 osd.0 10.1.1.10:6800/5528 15 : cluster [ERR] 3.74 missing primary copy of 3:2e9b9ab2:::rbd_data.a6239643c9869.00000000000009c3:head, will try copies on 3,4
2018-01-18 20:09:55.249787 osd.0 osd.0 10.1.1.10:6800/3006 1 : cluster [ERR] 3.5a missing primary copy of 3:5ad483d1:::rbd_data.a6239643c9869.000000000000005b:head, will try copies on 3,5
2018-01-18 21:08:21.173157 osd.2 osd.2 10.1.1.9:6800/2020 17 : cluster [ERR] 3.4b missing primary copy of 3:d239927e:::rbd_data.a6239643c9869.00000000000004ad:head, will try copies on 1,5
2018-01-18 22:43:41.818465 osd.0 osd.0 10.1.1.10:6800/2984 17 : cluster [ERR] 3.64 missing primary copy of 3:2789172e:::rbd_data.a6239643c9869.00000000000007a5:head, will try copies on 2,4
2018-01-18 22:56:42.256424 osd.4 osd.4 10.1.1.2:6804/15194 83 : cluster [ERR] 3.60 missing primary copy of 3:064cef74:::rbd_data.a6239643c9869.0000000000000294:head, will try copies on 1,3
2018-01-18 23:24:47.344713 osd.0 osd.0 10.1.1.10:6800/2980 1 : cluster [ERR] 3.23 missing primary copy of 3:c484296e:::rbd_data.a6239643c9869.0000000000000515:head, will try copies on 2,5

While it looks like disk errors, and this home cluster is built from a bunch of old junk computers, the failures follow the OSDs where the backup is running, and `ceph pg deep-scrub <pg>` comes back clean. Also, the Ceph error is recoverable, whereas the hang comes from the mounted snapshot /dev/rbd1 device.
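
For reference, the scrub check looks roughly like this (a sketch, assuming the cluster log is at /var/log/ceph/ceph.log and GNU grep is available; the PG id is the 1.3d / 3.4b style field in the [ERR] lines above):

Code:
# grab the PG from the most recent [ERR] line and kick off a deep scrub
PG=$(grep 'missing primary copy' /var/log/ceph/ceph.log | tail -1 | grep -oP '\[ERR\] \K[0-9]+\.[0-9a-f]+')
ceph pg deep-scrub "$PG"
# a clean result leaves the PG active+clean and the cluster healthy
ceph pg "$PG" query | grep -m1 '"state"'
ceph health detail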

Code:
[  633.220900] libceph: get_reply osd0 tid 118337 data 2256896 > preallocated 131072, skipping
[ 5254.083541] libceph: get_reply osd2 tid 95813 data 1851392 > preallocated 131072, skipping
[ 8153.398333] libceph: get_reply osd0 tid 59025 data 946176 > preallocated 131072, skipping
[186162.844452] libceph: get_reply osd4 tid 111482 data 2768896 > preallocated 131072, skipping
[ 1903.885819] libceph: get_reply osd0 tid 75276 data 4079616 > preallocated 131072, skipping

After those errors the mapped /dev/rbd1 is broken enough that a reboot is required:

Code:
% pct delsnapshot 100 vzdump
rbd: sysfs write failed
can't unmap rbd volume vm-100-disk-1: rbd: sysfs write failed

# but removing it by hand works:
rbd snap rm ceph/vm-100-disk-1@vzdump
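
Before the manual snap rm, it can help to see what is still holding the device (a sketch; /mnt/vzsnap0 is vzdump's temporary mountpoint and /dev/rbd1 the mapping from above):

Code:
rbd showmapped                 # which /dev/rbdX still maps vm-100-disk-1@vzdump
fuser -vm /mnt/vzsnap0         # what still holds the snapshot mount open
rbd unmap /dev/rbd1            # keeps failing with "sysfs write failed" while the mount is stuck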

I've changed the backup from snapshot to stop mode and it's working great! The full OSD logs with the scrubs are below, but I think it's more of a snapshot /dev/rbd1 mounting bug. If this isn't just me, it's not really a Proxmox bug, but at this point I'm not sure where I should file it.
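
For anyone wanting to do the same, the switch is just the vzdump mode (a sketch; "backup" is a placeholder storage ID, and the defaults file is /etc/vzdump.conf):

Code:
# one-off backup of the container in stop mode instead of snapshot mode
vzdump 100 --mode stop --storage backup --compress lzo

# or set it as the default for scheduled jobs in /etc/vzdump.conf
# mode: stop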

Code:
[  633.220900] libceph: get_reply osd0 tid 118337 data 2256896 > preallocated 131072, skipping
2018-01-18 20:09:55.249787 osd.0 osd.0 10.1.1.10:6800/3006 1 : cluster [ERR] 3.5a missing primary copy of 3:5ad483d1:::rbd_data.a6239643c9869.000000000000005b:head, will try copies on 3,5
[ 5254.083541] libceph: get_reply osd2 tid 95813 data 1851392 > preallocated 131072, skipping
2018-01-18 21:08:21.169251 7f07e404f700 -1 bluestore(/var/lib/ceph/osd/ceph-2) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x3d000, got 0x6706be76, expected 0x1e054be5, device location [0x6fdead000~1000], logical extent 0x23d000~1000, object #3:d239927e:::rbd_data.a6239643c9869.00000000000004ad:head#
2018-01-18 21:08:21.173148 7f07e404f700 -1 log_channel(cluster) log [ERR] : 3.4b missing primary copy of 3:d239927e:::rbd_data.a6239643c9869.00000000000004ad:head, will try copies on 1,5
2018-01-18 21:31:12.940846 7f5bf4b68700  0 log_channel(cluster) log [DBG] : 3.4b deep-scrub starts
2018-01-18 21:31:14.284975 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1162'47442 (1082'45914,1162'47442] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1162'47442 lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] _scan_snaps no head for 3:d269f5bc:::rbd_data.a6239643c9869.0000000000002060:237 (have MIN)
2018-01-18 21:31:14.285000 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1162'47442 (1082'45914,1162'47442] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1162'47442 lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] _scan_snaps no head for 3:d269f5bc:::rbd_data.a6239643c9869.0000000000002060:226 (have MIN)
2018-01-18 21:31:14.285008 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1162'47442 (1082'45914,1162'47442] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1162'47442 lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] _scan_snaps no head for 3:d269f5bc:::rbd_data.a6239643c9869.0000000000002060:210 (have MIN)
2018-01-18 21:31:15.878975 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:28b (have MIN)
2018-01-18 21:31:15.879002 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:27f (have MIN)
2018-01-18 21:31:15.879010 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:279 (have MIN)
2018-01-18 21:31:15.879017 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:267 (have MIN)
2018-01-18 21:31:15.879024 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:251 (have MIN)
2018-01-18 21:31:15.879030 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:23b (have MIN)
2018-01-18 21:31:15.879036 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:223 (have MIN)
2018-01-18 21:31:15.879043 7f5bf4b68700 -1 osd.2 pg_epoch: 1175 pg[3.4b( v 1175'47444 (1082'45914,1175'47444] local-lis/les=1174/1175 n=90 ec=296/296 lis/c 1174/1174 les/c/f 1175/1175/0 1174/1174/1174) [2,1,5] r=0 lpr=1174 crt=1175'47444 lcod 1175'47443 mlcod 1175'47443 active+clean+scrubbing+deep] _scan_snaps no head for 3:d2ce786c:::rbd_data.a6239643c9869.000000000000207f:20d (have MIN)
2018-01-18 21:31:19.885360 7f5bf4b68700  0 log_channel(cluster) log [DBG] : 3.4b deep-scrub ok

[ 8153.398333] libceph: get_reply osd0 tid 59025 data 946176 > preallocated 131072, skipping
2018-01-18 22:43:41.789123 7f05e97ba700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x35000, got 0x6706be76, expected 0x3247a483, device location [0x7bca35000~1000], logical extent 0x335000~1000, object #3:2789172e:::rbd_data.a6239643c9869.00000000000007a5:head#
2018-01-18 22:43:41.818465 osd.0 osd.0 10.1.1.10:6800/2984 17 : cluster [ERR] 3.64 missing primary copy of 3:2789172e:::rbd_data.a6239643c9869.00000000000007a5:head, will try copies on 2,4
2018-01-18 22:46:15.017339 7f05e97ba700  0 log_channel(cluster) log [DBG] : 3.64 deep-scrub starts
2018-01-18 22:46:19.295938 7f05e97ba700  0 log_channel(cluster) log [DBG] : 3.64 deep-scrub ok

[186162.844452] libceph: get_reply osd4 tid 111482 data 2768896 > preallocated 131072, skipping
2018-01-18 22:56:42.245043 7f08153b9700 -1 bluestore(/var/lib/ceph/osd/ceph-4) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x7b000, got 0x6706be76, expected 0xb619ce69, device location [0x70e07b000~1000], logical extent 0x17b000~1000, object #3:064cef74:::rbd_data.a6239643c9869.0000000000000294:head#
2018-01-18 22:56:42.256424 osd.4 osd.4 10.1.1.2:6804/15194 83 : cluster [ERR] 3.60 missing primary copy of 3:064cef74:::rbd_data.a6239643c9869.0000000000000294:head, will try copies on 1,3
2018-01-18 22:59:13.061571 7f08153b9700  0 log_channel(cluster) log [DBG] : 3.60 deep-scrub starts
2018-01-18 22:59:18.362073 7f08153b9700  0 log_channel(cluster) log [DBG] : 3.60 deep-scrub ok


[ 1903.885819] libceph: get_reply osd0 tid 75276 data 4079616 > preallocated 131072, skipping
2018-01-18 23:24:47.082488 7fb6c8524700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x1d000, got 0x6706be76, expected 0x210a620a, device location [0x761fed000~1000], logical extent 0x1d000~1000, object #3:c484296e:::rbd_data.a6239643c9869.0000000000000515:head#
2018-01-18 23:24:47.344713 osd.0 osd.0 10.1.1.10:6800/2980 1 : cluster [ERR] 3.23 missing primary copy of 3:c484296e:::rbd_data.a6239643c9869.0000000000000515:head, will try copies on 2,5
2018-01-18 23:27:53.695821 7fb6c8524700  0 log_channel(cluster) log [DBG] : 3.23 deep-scrub starts
2018-01-18 23:27:58.363954 7fb6c8524700  0 log_channel(cluster) log [DBG] : 3.23 deep-scrub ok
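
A quick way to double-check that a flagged object really is readable from the surviving copies (a sketch; I'm assuming the object belongs to an image in the same `ceph` pool as vm-100-disk-1, and the object and PG names are taken from the log above):

Code:
# the primary copy is bad, but the read should still be served by the replicas
rados -p ceph get rbd_data.a6239643c9869.00000000000007a5 /dev/null && echo "object readable"
# and if a scrub ever does report the PG inconsistent, repair it explicitly
ceph pg repair 3.64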


Code:
root@MacPro:~# ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE VAR  PGS TYPE NAME       
-1       5.45755        - 5588G   105G 5483G 1.88 1.00   - root default    
-5       1.81918        - 1862G 35964M 1827G 1.89 1.00   -     host MacPro 
 2   hdd 0.90959  1.00000  931G 21076M  910G 2.21 1.17  75         osd.2   
 3   hdd 0.90959  1.00000  931G 14888M  916G 1.56 0.83  53         osd.3   
-3       1.81918        - 1862G 35884M 1827G 1.88 1.00   -     host Super  
 0   hdd 0.90959  1.00000  931G 21026M  910G 2.20 1.17  75         osd.0   
 1   hdd 0.90959  1.00000  931G 14857M  916G 1.56 0.83  53         osd.1   
-7       1.81918        - 1862G 35864M 1827G 1.88 1.00   -     host nas    
 4   hdd 0.90959  1.00000  931G 16653M  915G 1.75 0.93  60         osd.4   
 5   hdd 0.90959  1.00000  931G 19210M  912G 2.01 1.07  68         osd.5   
                    TOTAL 5588G   105G 5483G 1.88                          
MIN/MAX VAR: 0.83/1.17  STDDEV: 0.28

root@MacPro:/etc/pve/rob# pveversion -v
proxmox-ve: 5.1-36 (running kernel: 4.13.13-5-pve)
pve-manager: 5.1-42 (running version: 5.1-42/724a6cb3)
pve-kernel-4.13.13-5-pve: 4.13.13-36
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-19
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-16
pve-qemu-kvm: 2.9.1-6
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
ceph: 12.2.2-pve1
 
Hmm, I had a hang with a "stop" mode backup too, so it's not a /dev/rbd1 mounting issue. I'm going to stop rotating snapshots, remove them all, and move back to snapshot mode for backups.

[ 5621.020944] libceph: get_reply osd2 tid 333933 data 933888 > preallocated 131072, skipping
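
Clearing out the rotated snapshots is straightforward (a sketch; the snapshot names are whatever your rotation created):

Code:
pct listsnapshot 100             # see what the container still carries
pct delsnapshot 100 <snapname>   # drop them one by one through Proxmox
# or, if pct can't because the rbd layer is wedged, purge at the rbd level
rbd snap purge ceph/vm-100-disk-1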
 
Hmm, with only the vzdump snapshot left, it still hangs. dmesg is similar:
[147928.992344] libceph: get_reply osd3 tid 638001 data 933888 > preallocated 131072, skipping

Code:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     11148  0.0  0.0  21984  1240 ?        D    20:47   0:00 umount -l -d /mnt/vzsnap0/

That is the backup job stuck in iowait. `strace -p 11148` shows no activity. Stopping the backup via the GUI does not kill the umount, and `kill -9 11148` doesn't either, but a reboot does :-)
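
For completeness, this is roughly how the stuck process looks from the kernel side (a sketch; the debugfs path varies per client instance and needs debugfs mounted):

Code:
cat /proc/11148/stack               # kernel stack of the D-state umount
dmesg | grep libceph                # the "get_reply ... skipping" lines above
cat /sys/kernel/debug/ceph/*/osdc   # in-flight requests of the kernel rbd client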