[SOLVED] Ceph upgrade to 12.2.10 hang

udo

Hi,
just tried an upgrade on the first node and the process hung without any activity.
Code:
root@pve01:~# apt dist-upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  ceph ceph-base ceph-common ceph-fuse ceph-mds ceph-mgr ceph-mon ceph-osd libcephfs2 librados2 libradosstriper1 librbd1 librgw2 python-ceph python-cephfs python-rados
  python-rbd python-rgw
18 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 51.6 MB of archives.
After this operation, 1,476 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-mds amd64 12.2.10-pve1 [3,611 kB]
Get:2 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-osd amd64 12.2.10-pve1 [14.2 MB]
Get:3 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-mon amd64 12.2.10-pve1 [4,512 kB]
Get:4 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-base amd64 12.2.10-pve1 [3,363 kB]
Get:5 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-common amd64 12.2.10-pve1 [13.0 MB]
Get:6 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph amd64 12.2.10-pve1 [7,474 B]
Get:7 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-mgr amd64 12.2.10-pve1 [3,535 kB]
Get:8 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 librgw2 amd64 12.2.10-pve1 [1,820 kB]
Get:9 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 libradosstriper1 amd64 12.2.10-pve1 [322 kB]
Get:10 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 librbd1 amd64 12.2.10-pve1 [999 kB]
Get:11 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 python-rgw amd64 12.2.10-pve1 [98.4 kB]
Get:12 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 python-rados amd64 12.2.10-pve1 [291 kB]
Get:13 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 python-rbd amd64 12.2.10-pve1 [155 kB]
Get:14 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 python-ceph amd64 12.2.10-pve1 [7,406 B]
Get:15 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 python-cephfs amd64 12.2.10-pve1 [95.3 kB]
Get:16 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 libcephfs2 amd64 12.2.10-pve1 [411 kB]
Get:17 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 librados2 amd64 12.2.10-pve1 [2,716 kB]
Get:18 http://download.proxmox.com/debian/ceph-luminous stretch/main amd64 ceph-fuse amd64 12.2.10-pve1 [2,463 kB]
Fetched 51.6 MB in 4s (12.2 MB/s)     
Reading changelogs... Done
(Reading database ... 117388 files and directories currently installed.)
Preparing to unpack .../00-ceph-mds_12.2.10-pve1_amd64.deb ...
Unpacking ceph-mds (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../01-ceph-osd_12.2.10-pve1_amd64.deb ...
Unpacking ceph-osd (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../02-ceph-mon_12.2.10-pve1_amd64.deb ...
Unpacking ceph-mon (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../03-ceph-base_12.2.10-pve1_amd64.deb ...
Unpacking ceph-base (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../04-ceph-common_12.2.10-pve1_amd64.deb ...
Unpacking ceph-common (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../05-ceph_12.2.10-pve1_amd64.deb ...
Unpacking ceph (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../06-ceph-mgr_12.2.10-pve1_amd64.deb ...
Unpacking ceph-mgr (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../07-librgw2_12.2.10-pve1_amd64.deb ...
Unpacking librgw2 (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../08-libradosstriper1_12.2.10-pve1_amd64.deb ...
Unpacking libradosstriper1 (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../09-librbd1_12.2.10-pve1_amd64.deb ...
Unpacking librbd1 (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../10-python-rgw_12.2.10-pve1_amd64.deb ...
Unpacking python-rgw (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../11-python-rados_12.2.10-pve1_amd64.deb ...
Unpacking python-rados (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../12-python-rbd_12.2.10-pve1_amd64.deb ...
Unpacking python-rbd (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../13-python-ceph_12.2.10-pve1_amd64.deb ...
Unpacking python-ceph (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../14-python-cephfs_12.2.10-pve1_amd64.deb ...
Unpacking python-cephfs (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../15-libcephfs2_12.2.10-pve1_amd64.deb ...
Unpacking libcephfs2 (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../16-librados2_12.2.10-pve1_amd64.deb ...
Unpacking librados2 (12.2.10-pve1) over (12.2.8-pve1) ...
Preparing to unpack .../17-ceph-fuse_12.2.10-pve1_amd64.deb ...
Unpacking ceph-fuse (12.2.10-pve1) over (12.2.8-pve1) ...
Setting up ceph-fuse (12.2.10-pve1) ...
Setting up librados2 (12.2.10-pve1) ...
Setting up libcephfs2 (12.2.10-pve1) ...
Processing triggers for libc-bin (2.24-11+deb9u3) ...
Processing triggers for systemd (232-25+deb9u6) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up python-rados (12.2.10-pve1) ...
Setting up python-cephfs (12.2.10-pve1) ...
Setting up libradosstriper1 (12.2.10-pve1) ...
Setting up librgw2 (12.2.10-pve1) ...
Setting up python-rgw (12.2.10-pve1) ...
Setting up librbd1 (12.2.10-pve1) ...
Setting up python-rbd (12.2.10-pve1) ...
Setting up ceph-common (12.2.10-pve1) ...
Setting system user ceph properties..usermod: no changes
..done
Fixing /var/run/ceph ownership....done


Progress: [ 82%] [############################################################################################################################...........................]
related processes:
Code:
root@pve01:~# ps aux | grep ceph | grep -v kvm
ceph        5201  0.0  0.1 441036 50368 ?        Ssl  Dec07  11:01 /usr/bin/ceph-mgr -f --cluster ceph --id pve01 --setuser ceph --setgroup ceph
ceph        5210  2.6  0.9 829184 298236 ?       Ssl  Dec07 406:38 /usr/bin/ceph-mon -f --cluster ceph --id pve01 --setuser ceph --setgroup ceph
ceph        5252  0.0  0.0 360376 27056 ?        Ssl  Dec07   7:44 /usr/bin/ceph-mds -f --cluster ceph --id pve01 --setuser ceph --setgroup ceph
ceph        5627  0.0  0.0 349104 22264 ?        Ssl  Dec07   7:14 /usr/bin/ceph-mds -i pve01 --pid-file /var/run/ceph/mds.pve01.pid -c /etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph
ceph        5969  1.1  6.4 3055796 2126252 ?     Ssl  Dec07 184:20 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
root        6691  0.0  0.0      0     0 ?        I<   Dec07   0:00 [ceph-msgr]
root        6701  0.0  0.0      0     0 ?        I<   Dec07   0:00 [ceph-watch-noti]
ceph       10249  1.4  6.4 3008676 2117496 ?     Ssl  Dec07 231:04 /usr/bin/ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph
root     2939812  0.0  0.0   4292  1504 pts/2    S+   10:51   0:00 /bin/sh /var/lib/dpkg/info/ceph-common.postinst configure 12.2.8-pve1
root     2939857  0.0  0.0  17772  4024 pts/2    S+   10:51   0:00 perl /usr/bin/deb-systemd-invoke start ceph.target rbdmap.service
root     2939861  0.0  0.0  39600  4184 pts/2    S+   10:51   0:00 /bin/systemctl start ceph.target
root     2950146  0.0  0.0  12788   936 pts/3    S+   11:00   0:00 grep ceph
ceph versions still show the old 12.2.8

Udo
 
ceph versions still show the old 12.2.8
I just ran an upgrade on my test cluster from Ceph 12.2.8 -> 12.2.10 and didn't see this.

What does your 'pveversion -v' say? And if you kill and redo the upgrade, does it appear again?
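Also note that 'ceph versions' only reports what the running daemons were started from, so it will only change once the daemons have been restarted. To compare the running and the installed version, something along these lines should do (standard packaging assumed):
Code:
# version reported by the running daemons (stays at 12.2.8 until restart)
ceph versions
# version of the binaries/packages on disk after the upgrade
ceph --version
dpkg -l ceph-common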
 
Hi Alwin,
after killing the "/bin/systemctl start ceph.target" process, the dist-upgrade finished.
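In case someone else hits the same hang, the blocking call can be found and killed roughly like this (the PID is the one from the 'ps' output above, adapt it to your own system):
Code:
# find the systemctl call that the ceph-common postinst spawned
ps aux | grep 'systemctl start ceph.target' | grep -v grep
# killing it lets dpkg continue configuring the remaining packages
kill 2939861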
My versions (now):
Code:
pveversion -v
proxmox-ve: 5.3-1 (running kernel: 4.15.18-9-pve)
pve-manager: 5.3-5 (running version: 5.3-5/97ae681d)
pve-kernel-4.15: 5.2-12
pve-kernel-4.15.18-9-pve: 4.15.18-30
pve-kernel-4.15.18-8-pve: 4.15.18-28
pve-kernel-4.15.18-7-pve: 4.15.18-27
ceph: 12.2.10-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-43
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-33
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-5
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-31
pve-container: 2.0-31
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-16
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-43
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1
On the other nodes the upgrade ran without any interaction.
The only thing was that a ceph restart didn't work - I had to kill the mon, but after that systemctl restart ceph worked:
Code:
systemctl restart ceph
Job for ceph.service failed because the control process exited with error code.
See "systemctl status ceph.service" and "journalctl -xe" for details.
ps aux | grep ceph | grep -v kvm
ceph        4888  2.9  0.5 886368 322680 ?       Ssl  Dec05 543:38 /usr/bin/ceph-mon -f --cluster ceph --id pve03 --setuser ceph --setgroup ceph
ceph        5780  1.4  4.0 3148272 2329424 ?     Ssl  Dec05 277:32 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
ceph        6120  0.7  3.0 2521440 1750152 ?     Ssl  Dec05 132:58 /usr/bin/ceph-osd -f --cluster ceph --id 9 --setuser ceph --setgroup ceph
ceph        6406  2.4  4.1 3192832 2417776 ?     Ssl  Dec05 456:04 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph
root        6752  0.0  0.0      0     0 ?        I<   Dec05   0:00 [ceph-msgr]
root        6772  0.0  0.0      0     0 ?        I<   Dec05   0:00 [ceph-watch-noti]
ceph     3501844  0.1  0.0 340020 12836 ?        Ssl  12:05   0:00 /usr/bin/ceph-mds -i pve03 --pid-file /var/run/ceph/mds.pve03.pid -c /etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph
ceph     3502313  1.0  0.0 348216 14320 ?        Ssl  12:05   0:00 /usr/bin/ceph-mds -f --cluster ceph --id pve03 --setuser ceph --setgroup ceph
root     3502474  0.0  0.0  12788   960 pts/5    S+   12:05   0:00 grep ceph

kill 4888
systemctl restart ceph
After restarting the OSDs too, the cluster is up to date:
Code:
ceph versions
{
    "mon": {
        "ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)": 3
    },
    "osd": {
        "ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)": 10
    },
    "mds": {
        "ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)": 3
    },
    "overall": {
        "ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)": 19
    }
}
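For anyone doing the same: a rough sketch of how the daemons can be restarted per node with the systemd instance units, checking cluster health in between (pve01 with OSDs 2 and 3 as an example, adapt the IDs):
Code:
# one node at a time: mon/mgr first, then the OSDs on that node
systemctl restart ceph-mon@pve01 ceph-mgr@pve01
ceph -s                     # wait for quorum / HEALTH_OK before continuing
systemctl restart ceph-osd@2 ceph-osd@3
systemctl restart ceph-mds@pve01
ceph versions               # all daemons should now report 12.2.10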
I will mark this as solved.

Udo
 
after killing the "/bin/systemctl start ceph.target" process, the dist-upgrade finished.
Was ceph.target started by the package upgrade, or was that a manual task afterwards?
 
Was ceph.target started by the package upgrade, or was that a manual task afterwards?
Hi Alwin,
it was started by the update process;
all of them had the same start time (10:51):
Code:
root     2939812  0.0  0.0   4292  1504 pts/2    S+   10:51   0:00 /bin/sh /var/lib/dpkg/info/ceph-common.postinst configure 12.2.8-pve1
root     2939857  0.0  0.0  17772  4024 pts/2    S+   10:51   0:00 perl /usr/bin/deb-systemd-invoke start ceph.target rbdmap.service
root     2939861  0.0  0.0  39600  4184 pts/2    S+   10:51   0:00 /bin/systemctl start ceph.target
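If it happens again, the parent/child relation can also be checked directly, e.g. with pstree (from the psmisc package):
Code:
# show the tree under the dpkg postinst; it should end in
# deb-systemd-invoke -> systemctl start ceph.target
pstree -ap 2939812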
Udo
 
Was there a Ceph service that wasn't running? Since 'ceph.target' hung, there might be something in the logs.
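E.g. something along these lines (units/paths from the default setup, adapt host and OSD IDs):
Code:
# systemd side: did anything around ceph.target fail during the upgrade?
journalctl -u ceph.target
systemctl status ceph-mon@pve01 ceph-osd@2 ceph-osd@3
# ceph side: the daemon logs on the affected node
tail -n 100 /var/log/ceph/ceph-mon.pve01.log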
 
