Hello Group,
I just upgraded my 4-node cluster last night from 4.4 to 5.2, and I am having issues with Ceph after the migration. The service is not coming up and is throwing the errors below. I am fairly new to Ceph, so any pointers or guides for troubleshooting this are greatly appreciated.
Thank you.
root@pve1:/var/log# pveversion -v
proxmox-ve: 5.2-2 (running kernel: 4.15.18-1-pve)
pve-manager: 5.2-6 (running version: 5.2-6/bcd5f008)
pve-kernel-4.15: 5.2-4
pve-kernel-4.15.18-1-pve: 4.15.18-17
pve-kernel-4.4.134-1-pve: 4.4.134-112
pve-kernel-4.4.19-1-pve: 4.4.19-66
ceph: 12.2.5-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-37
libpve-guest-common-perl: 2.0-17
libpve-http-server-perl: 2.0-9
libpve-storage-perl: 5.0-24
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-29
pve-container: 2.0-24
pve-docs: 5.2-5
pve-firewall: 3.0-13
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-30
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9
root@pve1:/var/log# systemctl status ceph
● ceph.service - LSB: Start Ceph distributed file system daemons at boot time
Loaded: loaded (/etc/init.d/ceph; generated; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2018-08-06 23:57:38 EDT; 10h ago
Docs: man:systemd-sysv-generator(8)
Process: 3715 ExecStart=/etc/init.d/ceph start (code=exited, status=1/FAILURE)
CPU: 2.066s
Aug 06 23:57:28 pve1 ceph[3715]: === mon.0 ===
Aug 06 23:57:29 pve1 ceph[3715]: Starting Ceph mon.0 on pve1...
Aug 06 23:57:29 pve1 ceph[3715]: 2018-08-06 23:57:29.433954 7f6d90cf6f80 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-0': (13) Per
Aug 06 23:57:29 pve1 ceph[3715]: failed: 'ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 /usr/bin/ceph-mon -i 0 --pid-file /var/r
Aug 06 23:57:30 pve1 ceph[3715]: Removed /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service.
Aug 06 23:57:30 pve1 ceph[3715]: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service → /lib/systemd/system/ceph-osd@.servic
Aug 06 23:57:38 pve1 systemd[1]: ceph.service: Control process exited, code=exited status=1
Aug 06 23:57:38 pve1 systemd[1]: Failed to start LSB: Start Ceph distributed file system daemons at boot time.
Aug 06 23:57:38 pve1 systemd[1]: ceph.service: Unit entered failed state.
Aug 06 23:57:38 pve1 systemd[1]: ceph.service: Failed with result 'exit-code'.
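From the permission error on '/var/lib/ceph/mon/ceph-0', my current guess is that the mon data directory is still owned by root rather than the ceph user that the Luminous daemons run as. That is only an assumption on my part as a Ceph novice (including ceph:ceph as the expected owner), but I was planning to verify and fix it along these lines:

ls -ld /var/lib/ceph/mon/ceph-0      # check who currently owns the mon data directory
chown -R ceph:ceph /var/lib/ceph     # if it is root-owned, hand the tree back to the ceph user
systemctl restart ceph.service       # retry the service start

Does that look like the right direction, or am I off base?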