Hi,
Today we changed storage for a (Debian Wheezy) VM to SCSI with VirtIO SCSI and rebooted. It came up fine. Storage is on Ceph (three-node cluster, 10G network, with a total of 12 OSDs).
Then we issued "fstrim -v /" on the VM, and some trouble appeared. In the Wheezy guest we received:
end_request: I/O error, dev sda, sector 185410448
end_request: I/O error, dev sda, sector 187507600
and some more lines like this. On the Ceph cluster, the status changed to HEALTH_WARN with blocked requests > 32 sec.
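For reference, whether discard actually reaches the virtual disk can be checked from inside the guest via sysfs (a minimal check; sda is the device from the errors above, and non-zero values mean the virtio-scsi disk accepts UNMAP):

# Discard (UNMAP) support as the guest kernel sees it
cat /sys/block/sda/queue/discard_granularity
cat /sys/block/sda/queue/discard_max_bytes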
Some example lines from the Ceph log:
2016-09-12 12:20:47.417408 osd.8 10.10.89.3:6808/2979 2080 : cluster [WRN] slow request 30.417220 seconds old, received at 2016-09-12 12:20:17.000120: osd_op(client.1905008.0:16775 rbd_data.5712f238e1f29.000000000000f8ca [delete] 2.3a5aa8c4 snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently commit_sent
2016-09-12 12:20:49.615040 osd.6 10.10.89.2:6808/2960 2277 : cluster [WRN] slow request 32.608357 seconds old, received at 2016-09-12 12:20:17.006174: osd_op(client.1905008.0:17130 rbd_data.5712f238e1f29.000000000000fa2c [delete] 2.d3536127 snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently waiting for subops from 2,10
2016-09-12 12:20:48.283468 osd.1 10.10.89.1:6808/3039 1063 : cluster [WRN] slow request 31.274059 seconds old, received at 2016-09-12 12:20:17.009346: osd_op(client.1905008.0:17174 rbd_data.5712f238e1f29.000000000000fa58 [delete] 2.e5a9c48d snapc 1b=[1b,6] ack+ondisk+write+known_if_redirected e1895) currently no flag points reached
These slow requests affected various OSDs and were not tied to a specific Proxmox node.
Otherwise the cluster has been and is running perfectly, with nothing but HEALTH_OK over the last weeks.
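While the warning was active, the blocked requests could be watched like this (osd.8 below is just one of the affected OSDs from the log; the second command has to run on the host carrying that OSD):

# Which requests are blocked, and on which OSDs
ceph health detail

# Dump the ops currently in flight on an affected OSD, via its admin socket
ceph daemon osd.8 dump_ops_in_flight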
The above HEALTH_WARN also disappeared again automatically, and we're back to the usual HEALTH_OK now.
My question: how is it possible for a client-side "fstrim" to push the Ceph cluster into HEALTH_WARN? The status came back to HEALTH_OK automatically, but fstrim did not free any space (the Ceph used/available figures did not change).
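For what it's worth, here is how usage can be compared before and after the trim ("rbd" below stands in for our pool name, and the rbd_data prefix is taken from the log lines above):

# Cluster- and pool-level usage; space freed by the deletes should show up here
ceph df
rados df

# Count the objects backing this image (can be slow on large pools);
# a successful trim should shrink this number
rados -p rbd ls | grep rbd_data.5712f238e1f29 | wc -l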
Proxmox 4.2-17/e1400248, three hosts, 4 OSDs per host.
Any ideas?
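PS: for completeness, the relevant lines from the VM config (VMID 100 and the storage name are placeholders; without discard=on on the disk, the guest's fstrim would not reach Ceph at all):

# /etc/pve/qemu-server/100.conf
scsihw: virtio-scsi-pci
scsi0: ceph-storage:vm-100-disk-1,discard=on,size=100G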