get nodes/$node/storage showing 0 byte for ceph pool

alexskysilk

Active Member
Oct 16, 2015
582
61
28
Chatsworth, CA
www.skysilk.com
I have an intermittent problem with storage returning 0 values for a specific rbd pool. It's only happening on one cluster, and there doesn't seem to be any correlation with which node context is being called:

Code:
{"CODE":"OK","ERRORS":"","proxmoxRes":{"active":0,"avail":0,"content":"rootdir,images","enabled":1,"shared":1,"total":0,"type":"rbd","used":0,"data":{"active":0,"content":"rootdir,images","avail":0,"shared":1,"used":0,"total":0,"enabled":1,"type":"rbd"},"errors":null,"status":null,"success":1,"message":null},"request":null}
If I run the query in pvesh, I get a timeout before the 0 response:

Code:
pvesh get nodes/sky11/storage/vdisk-3pg/status
got timeout
200 OK
{
   "active" : 0,
   "avail" : 0,
   "content" : "rootdir,images",
   "enabled" : 1,
   "shared" : 1,
   "total" : 0,
   "type" : "rbd",
   "used" : 0
}
Why is it timing out? None of the nodes are overloaded, and pveproxy isn't showing any issues.
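To put a number on "intermittent", the same status call can be repeated and the zeroed answers counted. This is only a rough sketch: `count_inactive` is a made-up helper name, and the grep pattern assumes the pretty-printed `"active" : 0` format shown in the output above.

```shell
# Call the storage status endpoint several times and count how often
# it comes back with "active" : 0.
# Usage: count_inactive <node> <storage> <tries>
count_inactive() {
    node=$1
    storage=$2
    tries=$3
    inactive=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        # stderr is dropped so the "got timeout" line does not clutter the result
        if pvesh get "nodes/$node/storage/$storage/status" 2>/dev/null \
            | grep -Eq '"active" *: *0'; then
            inactive=$((inactive + 1))
        fi
        i=$((i + 1))
    done
    echo "$inactive/$tries calls reported the storage as inactive"
}

# Example: count_inactive sky11 vdisk-3pg 20
```

Running it from several nodes would show whether the failure rate really is independent of the node being queried.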

Code:
# pveversion -v
proxmox-ve: 5.2-2 (running kernel: 4.15.17-3-pve)
pve-manager: 5.2-3 (running version: 5.2-3/785ba980)
pve-kernel-4.15: 5.2-3
pve-kernel-4.15.17-3-pve: 4.15.17-13
pve-kernel-4.15.17-1-pve: 4.15.17-9
pve-kernel-4.15.3-1-pve: 4.15.3-1
ceph: 12.2.5-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-34
libpve-guest-common-perl: 2.0-17
libpve-http-server-perl: 2.0-9
libpve-storage-perl: 5.0-23
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-1
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-27
pve-container: 2.0-23
pve-docs: 5.2-4
pve-firewall: 3.0-12
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-5
qemu-server: 5.0-29
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9
 

Alwin

Proxmox Staff Member
Staff member
Aug 1, 2017
2,672
234
63
Is the storage accessible through the rbd command line? If it is an external Ceph cluster, is the keyring file at /etc/pve/priv/ceph/?
 

Alwin

Are all MONs accessible from the PVE node? The timeout could come from one MON not being reachable while the rest are.
 

Alwin

Is the port of every MON accessible (telnet/netcat)? Maybe a firewall/routing issue?
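A quick way to run that port check against every monitor could look like the sketch below. `check_mon_ports` is a hypothetical helper and the addresses in the example are placeholders; 6789 is the default MON port on Ceph 12.x, so adjust it if your monitors listen elsewhere.

```shell
# Try a plain TCP connect to each monitor and report the result.
# Usage: check_mon_ports <port> <host>...
check_mon_ports() {
    port=$1
    shift
    for host in "$@"; do
        # -z: connect only, send no data; -w 3: give up after 3 seconds
        if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
        fi
    done
}

# Example (placeholder addresses):
# check_mon_ports 6789 10.0.0.1 10.0.0.2 10.0.0.3
```

An open port only proves the TCP path works; a monitor can still accept connections and fail to answer queries.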
 

Alwin

Does a 'ceph -m monhost mon_status' to each of the MONs work?

For the moment, I believe not all MONs are (equally?) reachable; I have seen such "sometimes empty" results caused by that in the past.

Not sure how/why it would be firewall related; there is no firewall (software or hardware) enabled on that subnet, it's dedicated to Ceph traffic.
Just going through the usual questions; with remote diagnosis you never know what is and what isn't. ;)
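The per-monitor `mon_status` query can be wrapped in a small loop so that a hanging monitor shows up as a failure instead of blocking the shell. Treat this as a sketch: `probe_mons` is a made-up name, the addresses are placeholders, and the 5 second `--connect-timeout` is an arbitrary choice.

```shell
# Ask each monitor for its status individually.
# Usage: probe_mons <monhost>...
probe_mons() {
    for host in "$@"; do
        # --connect-timeout keeps an unresponsive monitor from hanging the loop
        if ceph -m "$host" --connect-timeout 5 mon_status >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}

# Example (placeholder addresses):
# probe_mons 10.0.0.1 10.0.0.2 10.0.0.3
```

Since the problem is intermittent, it may take several passes before the misbehaving monitor shows up.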
 

alexskysilk

For the moment, I believe not all MONs are (equally?) reachable; I have seen such "sometimes empty" results caused by that in the past.
That seems logical. I tried calling the monitors at random, and in at least one instance the call just hung without replying. I will move the defective monitor, but how do I troubleshoot why it's not responding?
 

Alwin

The logs on the MON may give some clues. If it is a network issue, you may see dropped packets.
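For the dropped packets, the kernel keeps per-interface counters under /sys/class/net. A minimal sketch for reading them is below; `show_drops` is a hypothetical helper, the interface name in the example is a placeholder, and the `NET_SYSFS` variable exists only so the function can be pointed at a different directory for testing.

```shell
# Print drop and error counters for one network interface.
# Usage: show_drops <interface>
show_drops() {
    iface=$1
    stats="${NET_SYSFS:-/sys/class/net}/$iface/statistics"
    for counter in rx_dropped tx_dropped rx_errors tx_errors; do
        printf '%s %s=%s\n' "$iface" "$counter" "$(cat "$stats/$counter")"
    done
}

# Example (placeholder interface name):
# show_drops eth1
```

The counters are cumulative since boot, so compare two readings taken a few minutes apart on the Ceph-facing interface rather than looking at a single value.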
 
