wheezy, fstrim resulting in blocked requests on ceph

mouk

Active Member
May 3, 2016
37
0
26
51
Hi,

Today we changed the storage for a (Debian wheezy) VM to SCSI with VirtIO SCSI and rebooted. It came up fine. The storage is on Ceph (three-node cluster, 10G network, with a total of 12 OSDs).

Then we issued "fstrim -v /" in the VM, and some trouble appeared:

In the wheezy guest we received:
end_request: I/O error, dev sda, sector 185410448
end_request: I/O error, dev sda, sector 187507600
and some more lines like this. On the Ceph cluster, the status changed to HEALTH_WARN with blocked requests > 32 sec.

Some example lines from the ceph log:
2016-09-12 12:20:47.417408 osd.8 10.10.89.3:6808/2979 2080 : cluster [WRN] slow request 30.417220 seconds old, received at 2016-09-12 12:20:17.000120: osd_op(client.1905008.0:16775 rbd_data.5712f238e1f29.000000000000f8ca [delete] 2.3a5aa8c4 snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently commit_sent
and
2016-09-12 12:20:49.615040 osd.6 10.10.89.2:6808/2960 2277 : cluster [WRN] slow request 32.608357 seconds old, received at 2016-09-12 12:20:17.006174: osd_op(client.1905008.0:17130 rbd_data.5712f238e1f29.000000000000fa2c [delete] 2.d3536127 snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently waiting for subops from 2,10
and
2016-09-12 12:20:48.283468 osd.1 10.10.89.1:6808/3039 1063 : cluster [WRN] slow request 31.274059 seconds old, received at 2016-09-12 12:20:17.009346: osd_op(client.1905008.0:17174 rbd_data.5712f238e1f29.000000000000fa58 [delete] 2.e5a9c48d snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently no flag points reached
Basically, various OSDs are affected; it is not tied to a specific Proxmox node.
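
For completeness, a rough sketch of how such slow requests can be inspected while they are in flight (osd.8 is just the OSD from the first log line above; the ceph daemon commands have to run on the node that hosts that OSD):

# list the currently blocked/slow requests as seen by the monitors
ceph health detail

# ask the OSD itself which ops are in flight and how long they have been waiting
ceph daemon osd.8 dump_ops_in_flight

# recently completed slow ops, with per-step timestamps
ceph daemon osd.8 dump_historic_ops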

Otherwise the cluster has been, and still is, running perfectly, with nothing but HEALTH_OK over the last weeks.
The HEALTH_WARN above also disappeared again automatically, and we're back to the usual HEALTH_OK now.

My question: how can a client-side "fstrim" push the Ceph cluster into HEALTH_WARN? The status came back to HEALTH_OK automatically, but fstrim did not free any space (the Ceph used and available figures did not change).
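
One way we could verify whether the discards actually reached the image would be to compare the pool usage and the allocated extents of the RBD image before and after the fstrim run; a rough sketch (pool and image name are placeholders, not our real ones):

# pool-level used/available, before and after fstrim
ceph df

# sum the allocated extents of the image (placeholder name)
rbd diff rbd/vm-100-disk-1 | awk '{ sum += $2 } END { print sum/1024/1024 " MB" }'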

Proxmox 4.2-17/e1400248, three hosts, 4 OSDs per host.

Any ideas?
 
I'm not aware of this bug.

Ceph version? Wheezy kernel version?
ceph.conf?

(I'm using fstrim without problems with Ceph Jewel + kernel 3.16 from wheezy-backports.)
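
To gather those details, something like this should do (sda is assumed to be the VirtIO SCSI disk inside the guest; if discard_max_bytes reads 0, the device does not advertise discard support at all):

# on a ceph node
ceph -v

# inside the wheezy guest
uname -r
cat /sys/block/sda/queue/discard_granularity
cat /sys/block/sda/queue/discard_max_bytes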


Note that running trim in the guest will write zeros to the Ceph cluster for all the available space, which could explain the slow I/Os. (Maybe check with iostat on the Ceph nodes while fstrim is running; see the sketch below.)
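
Something like this, roughly (iostat comes from the sysstat package; run it on each ceph node while the fstrim is running in the guest):

# per-disk utilisation on the OSD disks, refreshed every 5 seconds
iostat -x 5

# in another shell: watch the cluster log for the slow request warnings in real time
ceph -w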
 
Hi,
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432), wheezy kernel 3.2.0-4-amd64

ceph.conf:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.89.0/24
filestore xattr use omap = true
fsid = 1397f1dc-7d94-43ea-ab12-8f8792eee9c1
keyring = /etc/pve/priv/$cluster.$name.keyring
osd journal size = 5120
osd pool default min size = 1
# public network = 192.23.93.0/24
public network = 10.10.89.0/24

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.2]
host = pm3
mon addr = 10.10.89.3:6789

[mon.0]
host = pm1
mon addr = 10.10.89.1:6789

[mon.1]
host = pm2
mon addr = 10.10.89.2:6789

Do you see anything special in my ceph.conf?

To explain: 10.10.89.0/24 is our 10G Ethernet network (meshed: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server).

Ceph is only used by our three Proxmox nodes.
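
In case it matters, a quick way to double-check that the OSDs are really bound to the 10G addresses (the grep is just a filter, nothing special):

# each osd line shows its public and cluster addresses
ceph osd dump | grep '^osd\.'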
 
