[SOLVED] Laggy 'ceph status' and 'got timeout' in proxmox gui

ftrojahn

Active Member
Dec 21, 2018
Good morning,

First I have to say that we have been using Proxmox and Ceph for some time now and have always appreciated
the stability, the great support and the community. Thank you all for this!

I'm in the process of setting up a new cluster with Proxmox 5.3/Ceph Luminous and migrating our 'old' production
nodes vm1, vm2, vm3 to the new cluster. vm4 is a new server; vm5/vm6 are interim nodes for the migration and will
be removed later, after the VMs are migrated and vm1-vm3 are reinstalled.

Problem: on the new cluster, 'pveceph status' sometimes shows "got timeout" on vm5/vm6, and sometimes on vm4 too. When this happens seems random: one time 'pveceph status' works, a second later it does not.
'ceph -s' also sometimes 'thinks' for 2-4 s before it shows its output, but it never times out. The cluster has quorum
and is healthy.

In the GUI: the Ceph -> OSD tab is only shown on vm4, never on vm5/vm6 ("got timeout 500").
It makes no difference which node I use for the web GUI.

According to the Chromium dev tools, XHR calls like this:
vm5.lan.domain.tld:8006/api2/extjs/nodes/vm5/ceph/osd?_dc=1545896996438
time out and return:
{"data":null,"message":"got timeout\n","success":0,"status":500}

I have no idea how to debug this. Two days ago I reinstalled and rejoined vm6, which made it a bit better.
Today I am reinstalling vm5. If that doesn't help, I may have to reinstall the whole cluster, but that would be much more time-consuming. And I'm curious why I cannot find the reason for this behaviour, or how to debug what the pveceph and ceph tools are sometimes waiting for.
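
One way to put numbers on "sometimes slow, sometimes not" is to run the same command repeatedly and log how long each run takes. The following is only a minimal sketch: `time_runs` is a hypothetical helper name, not part of Proxmox or Ceph, and it assumes GNU `date` (for `%N`), as found on a Debian-based PVE node.

```shell
# Run a command N times and print the exit code and duration of each run.
# Usage: time_runs <count> <command> [args...]
# Assumption: GNU date, which supports %N (nanoseconds).
time_runs() {
    count=$1; shift
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s%N)
        # Capture the exit code without aborting under "set -e".
        "$@" >/dev/null 2>&1 && rc=0 || rc=$?
        end=$(date +%s%N)
        echo "run=$i rc=$rc ms=$(( (end - start) / 1000000 ))"
        i=$((i + 1))
    done
}

# Example on a cluster node:
#   time_runs 20 ceph -s
#   time_runs 20 pveceph status
```

Comparing the output of both commands on vm4, vm5 and vm6 would at least show whether the slow runs happen on every node or only on some of them.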

The logs show nothing that I can correlate with this. Some more information is attached:
journalctl -u "ceph*" -u "coro*" -u "pve*" --since "-1d" -> 20181227_log.txt
pveceph status -> 20181227_pveceph-status.txt
crushmap.txt -> the rbd and cephfs pools are on replicated_ssd; another hdd pool is still to be created.

Thank you very much in advance for any idea!
Falko


:~# ceph -s
cluster:
id: 97ec297a-63e2-4d6a-89af-2e5e9ee2458c
health: HEALTH_OK

services:
mon: 3 daemons, quorum vm4,vm5,vm6
mgr: vm4(active), standbys: vm5, vm6
mds: cephfs-1/1/1 up {0=vm4=up:active}, 1 up:standby
osd: 12 osds: 12 up, 12 in

data:
pools: 3 pools, 448 pgs
objects: 1.34M objects, 1.27TiB
usage: 2.55TiB used, 21.2TiB / 23.8TiB avail
pgs: 448 active+clean

io:
client: 3.96KiB/s wr, 0op/s rd, 0op/s wr

:~# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | vm4 | 442G | 510G | 0 | 0 | 0 | 0 | exists,up |
| 1 | vm4 | 1044M | 3724G | 0 | 0 | 0 | 0 | exists,up |
| 2 | vm4 | 359G | 594G | 0 | 1638 | 0 | 0 | exists,up |
| 3 | vm4 | 1044M | 3724G | 0 | 0 | 0 | 0 | exists,up |
| 4 | vm4 | 500G | 452G | 0 | 2457 | 0 | 0 | exists,up |
| 5 | vm5 | 475G | 477G | 1 | 5734 | 0 | 0 | exists,up |
| 6 | vm5 | 1044M | 3724G | 0 | 0 | 0 | 0 | exists,up |
| 7 | vm5 | 466G | 487G | 0 | 0 | 0 | 0 | exists,up |
| 8 | vm5 | 1044M | 1861G | 0 | 0 | 0 | 0 | exists,up |
| 9 | vm5 | 363G | 590G | 0 | 0 | 0 | 0 | exists,up |
| 10 | vm6 | 1044M | 1861G | 0 | 0 | 0 | 0 | exists,up |
| 11 | vm6 | 1044M | 3724G | 0 | 0 | 0 | 0 | exists,up |
+----+------+-------+-------+--------+---------+--------+---------+-----------+


:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 23.78134 root default
-3 10.07138 host vm4
1 hdd 3.63860 osd.1 up 1.00000 1.00000
3 hdd 3.63860 osd.3 up 1.00000 1.00000
0 ssd 0.93140 osd.0 up 1.00000 1.00000
2 ssd 0.93140 osd.2 up 1.00000 1.00000
4 ssd 0.93140 osd.4 up 1.00000 1.00000
-7 8.25208 host vm5
6 hdd 3.63860 osd.6 up 1.00000 1.00000
8 hdd 1.81929 osd.8 up 1.00000 1.00000
5 ssd 0.93140 osd.5 up 1.00000 1.00000
7 ssd 0.93140 osd.7 up 1.00000 1.00000
9 ssd 0.93140 osd.9 up 1.00000 1.00000
-10 5.45789 host vm6
10 hdd 1.81929 osd.10 up 1.00000 1.00000
11 hdd 3.63860 osd.11 up 1.00000 1.00000

:~# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
23.8TiB 21.2TiB 2.55TiB 10.73
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
rbd 1 301GiB 19.87 1.19TiB 77363
cephfs_data 3 1000GiB 45.12 1.19TiB 1151998
cephfs_metadata 4 223MiB 0.02 1.19TiB 110891


:~# ceph osd pool stats
pool rbd id 1
client io 4.29KiB/s wr, 0op/s rd, 0op/s wr

pool cephfs_data id 3
nothing is going on

pool cephfs_metadata id 4
nothing is going on

[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
bluestore block db size = 5368709120
bluestore block wal size = 5368709120
cluster network = 192.168.200.0/24
fsid = 97ec297a-63e2-4d6a-89af-2e5e9ee2458c
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 192.168.40.0/24

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.vm5]
host = vm5
mds standby for name = pve

[mds.vm4]
host = vm4
mds standby for name = pve

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.vm5]
host = vm5
mon addr = 192.168.40.15:6789

[mon.vm4]
host = vm4
mon addr = 192.168.40.14:6789

[mon.vm6]
host = vm6
mon addr = 192.168.40.16:6789

:~# pveversion -v
proxmox-ve: 5.3-1 (running kernel: 4.15.18-9-pve)
pve-manager: 5.3-6 (running version: 5.3-6/37b3c8df)
pve-kernel-4.15: 5.2-12
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph: 12.2.10-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-43
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-34
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-5
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-31
pve-container: 2.0-31
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-16
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-43
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1

:~# apt-show-versions |grep ceph
ceph:amd64/stretch 12.2.10-pve1 uptodate
ceph-base:amd64/stretch 12.2.10-pve1 uptodate
ceph-common:amd64/stretch 12.2.10-pve1 uptodate
ceph-fuse:amd64/stretch 12.2.10-pve1 uptodate
ceph-mds:amd64/stretch 12.2.10-pve1 uptodate
ceph-mgr:amd64/stretch 12.2.10-pve1 uptodate
ceph-mon:amd64/stretch 12.2.10-pve1 uptodate
ceph-osd:amd64/stretch 12.2.10-pve1 uptodate
libcephfs1:amd64/stretch 10.2.11-2 uptodate
libcephfs2:amd64/stretch 12.2.10-pve1 uptodate
python-cephfs:amd64/stretch 12.2.10-pve1 uptodate
 

Attachments

  • 20181227_log.txt
    9.9 KB · Views: 3
  • 20181227_pveceph-status.txt
    4 KB · Views: 2
  • crushmap.txt
    2.2 KB · Views: 4
Hi,

for my understanding:
- you run all servers in one cluster?
- you have currently two CEPH storages in same PVE cluster on different nodes?
- do you run all server and ceph in the same VLAN?
- you tried to check the GUI on all nodes directly, so you opened e.g. https://vm4:8006 and check status, then vm5 etc.?
 
Hi,

thanks for your quick answer.

Hi,

for my understanding:
- you run all servers in one cluster?

No.
- I have the 'old' production cluster on Proxmox 4.4 (vm1, vm2, vm3).
- I have a new, separate cluster on Proxmox 5.3.

- you have currently two CEPH storages in same PVE cluster on different nodes?

No, the old cluster has Ceph Hammer with RBD storage on 3 nodes;
the new cluster has Luminous with RBD and CephFS on 3 nodes.

- do you run all server and ceph in the same VLAN?

Well, partly. The old cluster runs on a 10G CX4 HP switch (corosync, Ceph public and LAN networks all on 192.168.10.x)
and a separate 10G RJ45 Netgear switch for the Ceph cluster network (192.168.200.x).

For the new cluster I wanted better isolation, having gained some more experience by now. So:
LAN 10.x network for clients, corosync on VLAN42 (42.x), Ceph public on VLAN40 (40.x), Ceph cluster on the plain 10G Netgear switch (200.x).

So IMHO it is isolated ATM.
Edit: mostly isolated, the Ceph cluster net 200.x is shared.
But: as I have only 5 CX4 ports on the HP switch, only 4 servers plus the uplink are possible.
That's why the 1 new server and the 2 interim servers are all connected to the 10G Netgear for now, with
VLAN2 (client LAN), VLAN42 (corosync), VLAN40 (Ceph public) and VLAN1 (Ceph cluster).

The target after migration is to have vm1/2/3/4 using VLAN2, VLAN42 and VLAN40 on the CX4 switch, and only the Ceph cluster network 200.x on the Netgear.
As the Netgear has 8 ports, I could then either use 4x2 LACP or, maybe better, 4x Ceph public and
4x Ceph cluster on different VLANs; the servers have 2x10G ports each.

Sure, something may be wrong with that, but since ping and omping work flawlessly between the new
servers, and Ceph is healthy and giving good transfer rates AFAICT, I have no idea what could be wrong with
it.

- you tried to check the GUI on all nodes directly, so you opened e.g. https://vm4:8006 and check status, then vm5 etc.?

Yes.

So some information on how to debug this timeout, how to increase it, or how to raise the log level to get
better logs would perhaps help.

BTW: reinstalling vm5 did not change anything, but I wanted to try it anyway.
Getting all OSDs back after the reinstall succeeded.

Best regards,
Falko
 
Fixed - the problem seemed to be jumbo frames with MTU 9000. Now that Ceph public is on another switch/VLAN using MTU 1500, everything works as expected. The Ceph cluster network still works on the same switch with jumbo frames and MTU 9000, so I have no idea where the problem came from. Anyway, it is working now.
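
A default-sized ping succeeds even when jumbo frames are dropped somewhere on the path, so it cannot reveal this kind of problem. A common check is to send pings that exactly fill the MTU with fragmentation forbidden. The sketch below assumes Linux iputils `ping` (its `-M do` option sets the Don't Fragment bit); `icmp_payload` and `check_mtu_path` are hypothetical helper names chosen for illustration.

```shell
# ICMP payload size that exactly fills one IPv4 packet of the given MTU:
# MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Send three unfragmentable pings sized for the given MTU (default 9000).
# If any hop on the path has a smaller MTU, the pings fail instead of
# being silently fragmented.
# Usage: check_mtu_path <host> [mtu]
check_mtu_path() {
    host=$1
    mtu=${2:-9000}
    ping -c 3 -M do -s "$(icmp_payload "$mtu")" "$host"
}

# Example with the monitor addresses from the ceph.conf above:
#   for ip in 192.168.40.14 192.168.40.15 192.168.40.16; do
#       check_mtu_path "$ip" 9000
#   done
```

If the 9000-byte check fails between two nodes while `check_mtu_path <host> 1500` succeeds, some interface or switch port on that path is probably not passing jumbo frames.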
 
I just want to say THANK YOU! I had issues with Ceph timing out after we installed the UniFi Dream Machine Pro, and we spent hours trying to figure out what was wrong. I am not lying when I say I had been searching Google for over 2 hours when I saw your comment about the MTU mismatch. The UDM Pro has an "Enable Jumbo Frames" checkbox, but what I found out is that it only sets the MTU to around 8125 or so. To get it to work, I had to SSH into the machine and set the MTU manually using this little script (there are around 100 interfaces due to VLANs, bridges, etc.):
ls -1 /sys/class/net | while read -r line ; do ip link set mtu 9000 dev "$line" ; done
 
I had the same problem, getting "error with 'df': got timeout" when trying either to install a VM on Ceph storage or to move an existing disk to Ceph storage; otherwise it looked "good". I had the MTU set to 9000 on one interface and left all the rest at the default 1500. Once I changed it to 1500, it just started to work.
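
A mismatch like this is easy to miss on a host with many interfaces. As a rough sketch, the MTU of every interface can be read from sysfs on Linux; `list_mtus` is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a test tree.

```shell
# Print "<interface> <mtu>" for every network interface.
# The optional argument defaults to the real sysfs tree /sys/class/net.
list_mtus() {
    base=${1:-/sys/class/net}
    for dev in "$base"/*; do
        # Skip entries that are not interfaces (no readable mtu file).
        [ -r "$dev/mtu" ] || continue
        echo "${dev##*/} $(cat "$dev/mtu")"
    done
}

# Example: show which distinct MTU values are in use on this host
#   list_mtus | awk '{print $2}' | sort -u
```

Running it on every node and comparing the values for the interfaces that carry the Ceph public and cluster networks should show whether one of them differs from the rest.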
 
No matter what Ceph command I enter, the system times out with either 300 or 500. I did have another node that I deleted and replaced with a different node (different name, different IP address). It appears that some remnants of the old node are still in some of my config files. Can anyone please take this newbie by the hand and guide me, step by step, through the process of fixing Ceph? I can then see whether that fixes my timeout issues with uploading, downloading or transferring ISO images to my PVE. Any help would be greatly appreciated.
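
A possible first step for locating such remnants is to search the relevant config files for the old node's name or IP address. This is only a sketch: `find_node_refs` is a hypothetical helper, `oldnode` is a placeholder for the removed node's name, and the file list is an assumption about where a PVE node usually keeps this information. It only reports matches and changes nothing.

```shell
# Print every line (with file name and line number) that mentions the given
# name as a whole word. Missing or unreadable files are silently skipped.
# Usage: find_node_refs <name-or-ip> <file>...
find_node_refs() {
    name=$1; shift
    grep -n -H -F -w -- "$name" "$@" 2>/dev/null
}

# Example ("oldnode" is a placeholder, the paths are assumptions):
#   find_node_refs oldnode /etc/pve/ceph.conf /etc/pve/corosync.conf /etc/hosts
```

Any `[mon.<name>]` section or `mon addr` line that still points at the removed node would be a candidate explanation for clients waiting on a monitor that no longer exists.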
 
