Hi,
on the morning of April 17 I upgraded my 5-node Proxmox cluster (with Ceph 16.2.7) from 7.1-7 to 7.1-12 following these steps:
Code:
1. Set the noout, noscrub and nodeep-scrub flags before starting the update process;
2. Updated all 5 nodes without problems;
3. Unset the noout, noscrub and nodeep-scrub flags.
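For completeness, steps 1 and 3 were just the standard flag commands, sketched here in case I got something wrong (run from one node with an admin keyring):
Code:
# before the upgrade: prevent rebalancing and (deep-)scrubbing
ceph osd set noout
ceph osd set noscrub
ceph osd set nodeep-scrub

# after all 5 nodes were updated: re-enable them
ceph osd unset noout
ceph osd unset noscrub
ceph osd unset nodeep-scrub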
I have 2 pools, one for NVMe disks and one for SSD disks. Ceph health was OK up to 1 hour ago, and now the output of
ceph status
is:
Code:
  cluster:
    id:     6cddec54-f21f-4261-b8bd-b475e64bd3e3
    health: HEALTH_WARN
            48 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum prx-a1-1,prx-a1-2,prx-a1-3 (age 5d)
    mgr: prx-a1-2(active, since 5d), standbys: prx-a1-1, prx-a1-3
    osd: 26 osds: 26 up (since 5d), 26 in (since 9M)

  data:
    pools:   3 pools, 2049 pgs
    objects: 1.04M objects, 3.9 TiB
    usage:   12 TiB used, 70 TiB / 82 TiB avail
    pgs:     2048 active+clean
             1    active+clean+scrubbing+deep

  io:
    client:   258 KiB/s rd, 9.9 MiB/s wr, 11 op/s rd, 351 op/s wr
and the number of PGs that are not deep-scrubbed in time is increasing.
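In case it helps, this is how I list the PGs that are behind (standard command, nothing specific to my setup):
Code:
# the PG_NOT_DEEP_SCRUBBED section lists each late PG and its last deep-scrub time
ceph health detail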
The output of
pveversion -v
is:
Code:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_ADDRESS = "it_IT.UTF-8",
LC_NAME = "it_IT.UTF-8",
LC_MONETARY = "it_IT.UTF-8",
LC_PAPER = "it_IT.UTF-8",
LC_IDENTIFICATION = "it_IT.UTF-8",
LC_TELEPHONE = "it_IT.UTF-8",
LC_MEASUREMENT = "it_IT.UTF-8",
LC_TIME = "it_IT.UTF-8",
LC_NUMERIC = "it_IT.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-12 (running version: 7.1-12/b3c09de3)
pve-kernel-helper: 7.1-14
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-4
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.11.22-2-pve: 5.11.22-4
pve-kernel-5.11.22-1-pve: 5.11.22-2
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-7
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
The Ceph log file
/var/log/ceph/ceph.log
is full of:
Code:
2022-04-22T16:40:42.194315+0200 mgr.prx-a1-2 (mgr.131256291) 227328 : cluster [DBG] pgmap v227503: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 262 KiB/s rd, 13 MiB/s wr, 404 op/s
2022-04-22T16:40:44.195772+0200 mgr.prx-a1-2 (mgr.131256291) 227329 : cluster [DBG] pgmap v227504: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 229 KiB/s rd, 12 MiB/s wr, 353 op/s
2022-04-22T16:40:46.201910+0200 mgr.prx-a1-2 (mgr.131256291) 227330 : cluster [DBG] pgmap v227505: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 405 KiB/s rd, 15 MiB/s wr, 482 op/s
2022-04-22T16:40:48.203354+0200 mgr.prx-a1-2 (mgr.131256291) 227331 : cluster [DBG] pgmap v227506: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 315 KiB/s rd, 9.3 MiB/s wr, 345 op/s
2022-04-22T16:40:50.206610+0200 mgr.prx-a1-2 (mgr.131256291) 227332 : cluster [DBG] pgmap v227507: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 477 KiB/s rd, 12 MiB/s wr, 424 op/s
2022-04-22T16:40:52.210357+0200 mgr.prx-a1-2 (mgr.131256291) 227333 : cluster [DBG] pgmap v227508: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 616 KiB/s rd, 11 MiB/s wr, 429 op/s
2022-04-22T16:40:54.211701+0200 mgr.prx-a1-2 (mgr.131256291) 227334 : cluster [DBG] pgmap v227509: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 544 KiB/s rd, 9.0 MiB/s wr, 353 op/s
2022-04-22T16:40:56.217373+0200 mgr.prx-a1-2 (mgr.131256291) 227335 : cluster [DBG] pgmap v227510: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 813 KiB/s rd, 13 MiB/s wr, 475 op/s
2022-04-22T16:40:58.218659+0200 mgr.prx-a1-2 (mgr.131256291) 227336 : cluster [DBG] pgmap v227511: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 678 KiB/s rd, 9.0 MiB/s wr, 332 op/s
2022-04-22T16:41:00.221626+0200 mgr.prx-a1-2 (mgr.131256291) 227337 : cluster [DBG] pgmap v227512: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 912 KiB/s rd, 12 MiB/s wr, 412 op/s
2022-04-22T16:41:02.224992+0200 mgr.prx-a1-2 (mgr.131256291) 227338 : cluster [DBG] pgmap v227513: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 957 KiB/s rd, 12 MiB/s wr, 422 op/s
2022-04-22T16:41:04.226375+0200 mgr.prx-a1-2 (mgr.131256291) 227339 : cluster [DBG] pgmap v227514: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 803 KiB/s rd, 12 MiB/s wr, 388 op/s
2022-04-22T16:41:06.232293+0200 mgr.prx-a1-2 (mgr.131256291) 227340 : cluster [DBG] pgmap v227515: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 794 KiB/s rd, 19 MiB/s wr, 509 op/s
2022-04-22T16:41:08.233645+0200 mgr.prx-a1-2 (mgr.131256291) 227341 : cluster [DBG] pgmap v227516: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 512 KiB/s rd, 15 MiB/s wr, 382 op/s
2022-04-22T16:41:10.236933+0200 mgr.prx-a1-2 (mgr.131256291) 227342 : cluster [DBG] pgmap v227517: 2049 pgs: 2 active+clean+scrubbing+deep, 2047 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 458 KiB/s rd, 18 MiB/s wr, 430 op/s
2022-04-22T16:41:12.240397+0200 mgr.prx-a1-2 (mgr.131256291) 227343 : cluster [DBG] pgmap v227518: 2049 pgs: 2 active+clean+scrubbing+deep, 2047 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 235 KiB/s rd, 16 MiB/s wr, 413 op/s
2022-04-22T16:41:13.469565+0200 osd.20 (osd.20) 278 : cluster [DBG] 3.4e deep-scrub ok
2022-04-22T16:41:14.241935+0200 mgr.prx-a1-2 (mgr.131256291) 227344 : cluster [DBG] pgmap v227519: 2049 pgs: 2 active+clean+scrubbing+deep, 2047 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 29 KiB/s rd, 14 MiB/s wr, 334 op/s
2022-04-22T16:41:16.247857+0200 mgr.prx-a1-2 (mgr.131256291) 227345 : cluster [DBG] pgmap v227520: 2049 pgs: 1 active+clean+scrubbing+deep, 2048 active+clean; 3.9 TiB data, 12 TiB used, 70 TiB / 82 TiB avail; 89 KiB/s rd, 20 MiB/s wr, 441 op/s
And every 1 or 2 seconds a new line is logged, but only one or two PGs are ever deep-scrubbing at a time. Now I'm starting to get very worried...
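From what I've read, deep scrubs are throttled per OSD by osd_max_scrubs (default 1 on Pacific, if I'm not mistaken), which might explain why only one or two PGs scrub at a time. These are the commands I know of to check the limit and to kick a single PG manually (e.g. pg 3.4e from the log above):
Code:
# current per-OSD scrub concurrency limit (I believe the Pacific default is 1)
ceph config get osd osd_max_scrubs

# manually start a deep scrub on one of the late PGs
ceph pg deep-scrub 3.4e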
Could you help me solve the problem?
Thank you