Hello forum,
we have been running a subscribed 5-node Proxmox and Ceph cluster for a while; in the last few hours our mon log has been showing entries like the following:
Code:
2014-09-30 16:24:20.599818 mon.0 10.10.0.75:6789/0 2198 : [INF] pgmap v5025678: 640 pgs: 640 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail; 2154 B/s wr, 0 op/s
2014-09-30 16:24:35.083070 osd.9 10.10.0.79:6803/8950 459 : [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.131652 secs
2014-09-30 16:24:35.083077 osd.9 10.10.0.79:6803/8950 460 : [WRN] slow request 30.131652 seconds old, received at 2014-09-30 16:24:04.951384: osd_op(client.3337472.0:47152 rbd_data.4fbcc2ae8944a.0000000000000324 [set-alloc-hint object_size 4194304 write_size 4194304,write 3255808~16384] 2.6f4005e7 ack+ondisk+write e8761) v4 currently waiting for subops from 5,14
2014-09-30 16:24:39.674463 mon.0 10.10.0.75:6789/0 2343 : [INF] osd.12 10.10.0.77:6800/4418 failed (3 reports from 2 peers after 34.001207 >= grace 20.878662)
2014-09-30 16:24:39.674922 mon.0 10.10.0.75:6789/0 2345 : [INF] osd.13 10.10.0.77:6803/4625 failed (3 reports from 2 peers after 34.001424 >= grace 20.878662)
2014-09-30 16:24:39.675364 mon.0 10.10.0.75:6789/0 2347 : [INF] osd.14 10.10.0.77:6806/4869 failed (3 reports from 2 peers after 34.001730 >= grace 20.878662)
2014-09-30 16:24:39.675750 mon.0 10.10.0.75:6789/0 2349 : [INF] osd.15 10.10.0.77:6809/5207 failed (3 reports from 2 peers after 34.001986 >= grace 20.878662)
2014-09-30 16:24:40.426452 mon.0 10.10.0.75:6789/0 2402 : [INF] osdmap e8762: 15 osds: 11 up, 15 in
2014-09-30 16:24:40.432713 mon.0 10.10.0.75:6789/0 2403 : [INF] pgmap v5025679: 640 pgs: 640 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail; 3962 B/s wr, 0 op/s
2014-09-30 16:24:40.451743 mon.0 10.10.0.75:6789/0 2404 : [INF] pgmap v5025680: 640 pgs: 151 stale+active+clean, 489 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail; 5781 B/s wr, 0 op/s
2014-09-30 16:24:40.805781 osd.1 10.10.0.75:6800/4528 3 : [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.698548 secs
2014-09-30 16:24:40.805798 osd.1 10.10.0.75:6800/4528 4 : [WRN] slow request 30.698548 seconds old, received at 2014-09-30 16:24:10.107177: osd_op(client.3336927.0:4636 rbd_data.1975eb2ae8944a.0000000000000a14 [set-alloc-hint object_size 4194304 write_size 4194304,write 2609152~8192] 2.d61f0b59 ack+ondisk+write e8761) v4 currently reached pg
2014-09-30 16:24:41.440616 mon.0 10.10.0.75:6789/0 2409 : [INF] osd.12 10.10.0.77:6800/4418 boot
2014-09-30 16:24:41.440780 mon.0 10.10.0.75:6789/0 2410 : [INF] osd.14 10.10.0.77:6806/4869 boot
2014-09-30 16:24:41.440953 mon.0 10.10.0.75:6789/0 2411 : [INF] osdmap e8763: 15 osds: 13 up, 15 in
2014-09-30 16:24:39.085681 osd.0 10.10.0.75:6803/4729 5 : [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.105718 secs
2014-09-30 16:24:39.085693 osd.0 10.10.0.75:6803/4729 6 : [WRN] slow request 30.105718 seconds old, received at 2014-09-30 16:24:08.979886: osd_op(client.3147237.0:1968458 rbd_data.1985d22ae8944a.00000000000002f5 [set-alloc-hint object_size 4194304 write_size 4194304,write 2178560~12288] 2.220c9e08 ack+ondisk+write e8761) v4 currently waiting for subops from 7,8
2014-09-30 16:24:40.777186 osd.12 10.10.0.77:6800/4418 33 : [WRN] map e8762 wrongly marked me down
2014-09-30 16:24:41.446618 mon.0 10.10.0.75:6789/0 2412 : [INF] pgmap v5025681: 640 pgs: 151 stale+active+clean, 489 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail
2014-09-30 16:24:42.448706 mon.0 10.10.0.75:6789/0 2413 : [INF] osd.15 10.10.0.77:6809/5207 boot
2014-09-30 16:24:42.448999 mon.0 10.10.0.75:6789/0 2414 : [INF] osd.13 10.10.0.77:6803/4625 boot
2014-09-30 16:24:42.449243 mon.0 10.10.0.75:6789/0 2415 : [INF] osdmap e8764: 15 osds: 15 up, 15 in
2014-09-30 16:24:42.453775 mon.0 10.10.0.75:6789/0 2416 : [INF] pgmap v5025682: 640 pgs: 151 stale+active+clean, 489 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail
2014-09-30 16:24:43.458588 mon.0 10.10.0.75:6789/0 2417 : [INF] osdmap e8765: 15 osds: 15 up, 15 in
2014-09-30 16:24:43.463012 mon.0 10.10.0.75:6789/0 2418 : [INF] pgmap v5025683: 640 pgs: 151 stale+active+clean, 489 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail
2014-09-30 16:24:40.573054 osd.15 10.10.0.77:6809/5207 44 : [WRN] map e8762 wrongly marked me down
2014-09-30 16:24:41.377720 osd.13 10.10.0.77:6803/4625 38 : [WRN] map e8762 wrongly marked me down
2014-09-30 16:24:40.993859 osd.14 10.10.0.77:6806/4869 78 : [WRN] map e8762 wrongly marked me down
2014-09-30 16:24:47.483629 mon.0 10.10.0.75:6789/0 2429 : [INF] pgmap v5025684: 640 pgs: 75 stale+active+clean, 565 active+clean; 1401 GB data, 4209 GB used, 11596 GB / 15806 GB avail; 15716 B/s wr, 2 op/s
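In case more detail is needed, these are the commands we can run against the OSD admin socket while a request is blocked (a minimal sketch, assuming the default socket path under /var/run/ceph and using osd.9 from the log above as the example):
Code:
# cluster-wide summary, lists currently blocked/slow requests
ceph health detail

# ops currently in flight on the OSD reporting slow requests
ceph --admin-daemon /var/run/ceph/ceph-osd.9.asok dump_ops_in_flight

# recent slowest ops seen by that OSD, with per-step timestamps
ceph --admin-daemon /var/run/ceph/ceph-osd.9.asok dump_historic_ops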
Meanwhile, everything seems OK in both the Proxmox and Ceph clusters, as shown by
Code:
root@c00:~# ceph status
cluster 59127979-f5d9-4eae-ace2-e29dda08525f
health HEALTH_OK
monmap e3: 3 mons at {0=10.10.0.77:6789/0,1=10.10.0.75:6789/0,2=10.10.0.76:6789/0}, election epoch 120, quorum 0,1,2 1,2,0
osdmap e8761: 15 osds: 15 up, 15 in
pgmap v5025655: 640 pgs, 3 pools, 1401 GB data, 350 kobjects
4209 GB used, 11596 GB / 15806 GB avail
639 active+clean
1 active+clean+scrubbing
client io 3050 B/s rd, 82351 B/s wr, 28 op/s
root@c00:~# pvecm nodes
Node Sts Inc Joined Name
1 M 888 2014-09-30 15:53:59 c00
2 M 892 2014-09-30 15:54:02 c01
3 M 892 2014-09-30 15:54:02 c02
4 M 892 2014-09-30 15:54:02 c03
5 M 908 2014-09-30 16:23:02 c04
root@c00:~# pvecm status
Version: 6.2.0
Config Version: 9
Cluster Name: cctest
Cluster Id: 6430
Cluster Member: Yes
Cluster Generation: 908
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 5
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: c00
Node ID: 1
Multicast addresses: 239.192.25.55
Node addresses: 10.10.0.75
and
Code:
root@c00:~# pveversion -v
proxmox-ve-2.6.32: 3.2-132 (running kernel: 2.6.32-31-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-30-pve: 2.6.32-130
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-31-pve: 2.6.32-132
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-15
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-23
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
root@c00:~#
We are also experiencing a laggy GUI.
Any idea what is causing those [WRN] entries and how to make them disappear?
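The only related setting we can point at so far is the heartbeat grace period (the ">= grace 20.878662" in the failure reports above); a minimal ceph.conf sketch of raising it, purely on the assumption that the flapping comes from heartbeats timing out rather than a real OSD fault:
Code:
[global]
    # default is 20 seconds; raising it (the value here is only a guess)
    # gives OSDs more time to answer heartbeats before being marked down
    osd heartbeat grace = 45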
Thanks for your attention
Pasquale