HA Ceph Breaks on Migration to 3rd Node...

AZDNice

Member
Jul 1, 2020
While testing my HA/Ceph installation I ran into an error when trying to migrate a VM to my 3rd node. It migrated, but then stopped. I was able to migrate it back to one of the other two nodes and restart it. Be gentle, I am very new to HA/Ceph. lol
Any insight you can provide would be great.

Also, no problems migrating from the 1st to the 2nd node and back... just when going to the 3rd. No error appeared until I tried migrating from node 1 to 3 or from 2 to 3.

Thanks in advance for any time given in a response.
I understand that for the time being I could create an HA group and restrict migration to nodes 1 and 2, but that defeats the purpose of a full HA cluster.
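For reference, the interim workaround mentioned above (restricting HA placement to the first two nodes) can be sketched with ha-manager. The group name and VMID below are placeholders, not values from this cluster:

```shell
# Create an HA group limited to nodes bass and daygo; "restricted" means
# HA resources in this group may only ever run on the listed nodes.
# Group name "stable-pair" and VMID 100 are assumptions for illustration.
ha-manager groupadd stable-pair --nodes "bass,daygo" --restricted 1

# Put an HA-managed VM into that group:
ha-manager add vm:100 --group stable-pair
```

Without `--restricted`, the listed nodes are only preferred, and the resource may still fail over to york as a last resort.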

root@bass:~# ceph osd tree
Code:
ID   CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
 -1         4.06088  root default
 -3         2.00130      host bass
  0   nvme  2.00130          osd.0       up   1.00000  1.00000
 -7         1.81940      host daygo
  1    ssd  1.81940          osd.1       up   1.00000  1.00000
-10         0.24019      host york
  2    ssd  0.24019          osd.2       up   1.00000  1.00000

root@bass:~# ceph -s
Code:
  cluster:
    id:     241c8887-31d8-44f4-b252-c6e4eb5a14ed
    health: HEALTH_WARN
            Degraded data redundancy: 39/3354 objects degraded (1.163%), 4 pgs degraded, 4 pgs undersized

  services:
    mon: 3 daemons, quorum bass,daygo,york (age 19m)
    mgr: bass(active, since 3h), standbys: daygo, york
    osd: 3 osds: 3 up (since 2h), 3 in (since 2h); 1 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 1.12k objects, 4.2 GiB
    usage:   220 GiB used, 3.8 TiB / 4.1 TiB avail
    pgs:     39/3354 objects degraded (1.163%)
             7/3354 objects misplaced (0.209%)
             124 active+clean
             4 active+undersized+degraded
             1 active+clean+remapped

  io:
    client: 0 B/s rd, 85 B/s wr, 0 op/s rd, 0 op/s wr
 
Hi,
please share the full migration task log of a problematic migration, as well as the output of pveversion -v from the source and target node of the migration. Is there anything interesting in the system logs/journal around the time the issue happens?
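The requested information can be collected roughly as follows. The time window below is a placeholder; adjust it to when the failed migration actually ran:

```shell
# Package versions, to be run on both the source and the target node:
pveversion -v > /root/pveversion-$(hostname).txt

# System journal around the time of the failed migration
# (the time window here is an assumption; use your own):
journalctl --since "2024-07-01 09:00" --until "2024-07-01 09:30" > /root/journal-migration.txt

# Proxmox VE keeps task logs under /var/log/pve/tasks/; VM migration tasks
# carry the type "qmigrate" in their UPID. The full task log can also be
# viewed in the GUI by double-clicking the migration task.
grep -r "qmigrate" /var/log/pve/tasks/index 2>/dev/null
```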

root@bass:~# ceph osd tree
Code:
ID   CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
 -1         4.06088  root default                            
 -3         2.00130      host bass                           
  0   nvme  2.00130          osd.0       up   1.00000  1.00000
 -7         1.81940      host daygo                          
  1    ssd  1.81940          osd.1       up   1.00000  1.00000
-10         0.24019      host york                           
  2    ssd  0.24019          osd.2       up   1.00000  1.00000
Tip: you can use [CODE]output here[/CODE] tags to keep your information more readable.
It seems the weight for york/osd.2 is much lower than for the other two; maybe that is causing the issue?
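To check whether the small osd.2 is actually the bottleneck (with a size-3 replicated pool, every host must hold a full copy, so the ~240 GiB OSD on york caps usable capacity), the per-OSD utilization and the stuck PGs can be inspected like this:

```shell
# Per-OSD disk usage, weights, and PG counts in CRUSH-tree layout:
ceph osd df tree

# List PGs stuck in the undersized state and show which OSDs they map to:
ceph pg dump_stuck undersized

# Spell out exactly which PGs are behind the HEALTH_WARN:
ceph health detail
```

Note that the CRUSH weight normally reflects the disk's size in TiB (0.24019 for a ~240 GiB device), so simply raising it would not add capacity; it would only push more data toward the small OSD.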
 
root@york:~# pveversion -v
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.8-3-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-3
proxmox-kernel-6.8.8-3-pve-signed: 6.8.8-3
ceph: 18.2.2-pve1
ceph-fuse: 18.2.2-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.2
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1
 
