Ceph "has serial but no model" error

-- Logs begin at Thu 2019-09-19 16:56:44 EDT, end at Thu 2019-09-19 17:06:52 EDT. --
Sep 19 16:56:55 hypervisor01 systemd[1]: Started Ceph cluster monitor daemon.
Sep 19 16:56:57 hypervisor01 ceph-mon[1556]: 2019-09-19 16:56:57.376 7f87104a63c0 -1 mon.hypervisor01@0(electing) e1 failed to get devid for : fallback method has serial ''but no model
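That devid message generally means Ceph's device-health code could not read a model/serial pair for the disk from udev. A quick way to see what the system actually reports for one of the OSD disks (sdX is just a placeholder here):

# What udev knows about the disk
udevadm info --query=property --name=/dev/sdX | grep -E 'ID_MODEL|ID_SERIAL'

# What SMART reports (Device Model / Serial Number lines)
smartctl -i /dev/sdX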

This has degraded all the disks and filled my logs with the following:
2019-09-19 17:10:09.610290 mgr.hypervisor01 (mgr.14121) 388 : cluster [DBG] pgmap v387: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:11.611104 mgr.hypervisor01 (mgr.14121) 389 : cluster [DBG] pgmap v388: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:13.611660 mgr.hypervisor01 (mgr.14121) 390 : cluster [DBG] pgmap v389: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:15.612369 mgr.hypervisor01 (mgr.14121) 391 : cluster [DBG] pgmap v390: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:17.613062 mgr.hypervisor01 (mgr.14121) 392 : cluster [DBG] pgmap v391: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:19.613715 mgr.hypervisor01 (mgr.14121) 393 : cluster [DBG] pgmap v392: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:21.614576 mgr.hypervisor01 (mgr.14121) 394 : cluster [DBG] pgmap v393: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:23.615249 mgr.hypervisor01 (mgr.14121) 395 : cluster [DBG] pgmap v394: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:25.615981 mgr.hypervisor01 (mgr.14121) 396 : cluster [DBG] pgmap v395: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:27.616684 mgr.hypervisor01 (mgr.14121) 397 : cluster [DBG] pgmap v396: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:29.617287 mgr.hypervisor01 (mgr.14121) 398 : cluster [DBG] pgmap v397: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:31.617990 mgr.hypervisor01 (mgr.14121) 399 : cluster [DBG] pgmap v398: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:33.618626 mgr.hypervisor01 (mgr.14121) 400 : cluster [DBG] pgmap v399: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:35.619344 mgr.hypervisor01 (mgr.14121) 401 : cluster [DBG] pgmap v400: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
2019-09-19 17:10:37.619997 mgr.hypervisor01 (mgr.14121) 402 : cluster [DBG] pgmap v401: 128 pgs: 128 undersized+peered; 0 B data, 8.8 MiB used, 8.2 TiB / 8.2 TiB avail
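A side note on the log flood itself: the undersized PGs are the real issue, but if the per-two-second pgmap DBG lines need quieting, the knob usually pointed at for Nautilus is the cluster log file level (option name from memory, so please verify against your Ceph version's docs):

# Stop DBG-level pgmap entries from being written to the cluster log file
ceph config set mon mon_cluster_log_file_level info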

Can someone shed some light on this issue, please?
Thanks!
 
What do ceph -s and ceph osd df tree show? Can you please explain your cluster setup in more detail?
 
I found that the bond type was not set to active-failover; I fixed that and rebooted,
but I still get the WARN state.
I haven't added any other nodes to the cluster yet, as I need to get these disks clean first so I can move VMs over by hand,
reformat, and then join my other nodes.
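On the bond point, for reference only: Linux bonding calls the failover mode active-backup, and a minimal /etc/network/interfaces stanza for it looks roughly like this (interface names eno1/eno2 are placeholders; the bridge carrying the IP would use bond0 as its port):

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-mode active-backup
        bond-miimon 100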
root@hypervisor01:~# ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP META  AVAIL   %USE VAR  PGS STATUS TYPE NAME
-1       8.18546        - 8.2 TiB 3.0 GiB 9.8 MiB  0 B 3 GiB 8.2 TiB 0.04 1.00   -        root default
-3       8.18546        - 8.2 TiB 3.0 GiB 9.8 MiB  0 B 3 GiB 8.2 TiB 0.04 1.00   -        host hypervisor01
 0   hdd 2.72849  1.00000 2.7 TiB 1.0 GiB 3.2 MiB  0 B 1 GiB 2.7 TiB 0.04 1.00  46     up         osd.0
 1   hdd 2.72849  1.00000 2.7 TiB 1.0 GiB 3.2 MiB  0 B 1 GiB 2.7 TiB 0.04 1.00  46     up         osd.1
 2   hdd 2.72849  1.00000 2.7 TiB 1.0 GiB 3.2 MiB  0 B 1 GiB 2.7 TiB 0.04 1.00  36     up         osd.2
                    TOTAL 8.2 TiB 3.0 GiB 9.8 MiB  0 B 3 GiB 8.2 TiB 0.04
MIN/MAX VAR: 1.00/1.00  STDDEV: 0
root@hypervisor01:~# ceph -s
  cluster:
    id:     327c0bb6-ef1d-47ed-8a3d-a7ca913b13d9
    health: HEALTH_WARN
            Reduced data availability: 128 pgs inactive
            Degraded data redundancy: 128 pgs undersized
            128 pgs not deep-scrubbed in time
            128 pgs not scrubbed in time

  services:
    mon: 1 daemons, quorum hypervisor01 (age 10m)
    mgr: hypervisor01(active, since 9m)
    osd: 3 osds: 3 up (since 10m), 3 in (since 12d)

  data:
    pools:   1 pools, 128 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 8.2 TiB / 8.2 TiB avail
    pgs:     100.000% pgs not active
             128 undersized+peered
If I can get back to an OK state with all the space available, I can proceed quickly with getting things up and running and joining the other nodes.
 
Ceph needs three nodes to start with, not only for quorum but also for data redundancy. Add two more nodes with a MON and OSDs to get Ceph running properly.
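For background on why the PGs sit in undersized+peered: a replicated pool defaults to size 3 with host as the CRUSH failure domain, so a single host can never place all the replicas. That can be confirmed with (pool name is a placeholder):

# Replication settings of the pool
ceph osd pool get <poolname> size
ceph osd pool get <poolname> min_size

# CRUSH rule in use; check the failure-domain type in the chooseleaf step
ceph osd crush rule dump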
 
Is Ceph the only way to share local storage? Ideally this cluster will have two nodes to start with, and I need that to work first before adding a third.
I also can't add nodes yet, as the existing data on them is local, and those drives can't be converted to BlueStore OSDs while the data is still on them.
Any suggestions would be ideal.
 
So I can do away with Ceph entirely in this case?
My existing nodes are on 5.3.8.
Can I join them to a 6.0 cluster to get the data synced, move the VMs, and then upgrade the older nodes?

I cannot/do not want to do an in-place upgrade, for a number of reasons (downtime/risk being a major one).

My 5.3.8 nodes do not store on ZFS, but I guess that is OK; I can move the data to the 6.0 node, which has ZFS, and reconfigure as needed.
Should I RAID the ZFS storage or run it on plain HBA disks? The system has 4 x 4 TB disks currently set up as HBA because of Ceph, but they can be RAIDed if that is better.
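Not an authoritative answer, but as a sketch of the two usual ZFS layouts for 4 x 4 TB disks left on the HBA (ZFS generally prefers raw disks over a hardware RAID volume; pool name and device paths are placeholders):

# Option A: two mirrored pairs (RAID10-like), roughly 8 TB usable, better IOPS for VMs
zpool create -o ashift=12 tank \
    mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    mirror /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4

# Option B: one raidz1 vdev, roughly 12 TB usable, lower IOPS
zpool create -o ashift=12 tank raidz1 \
    /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4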
 
So now I just need to know if a 5.3.8 cluster can join a 6.0 cluster to move my data, and I'm all set!
If this was your last question, you may already know: it isn't possible to join a PVE 5.3 node with a PVE 6.0 node. You will need to upgrade to PVE 5.4 first and install Corosync 3; then you can add the PVE 6.0 node to the PVE 5.4 cluster.
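As a rough sanity check of that path (pve5to6 ships with PVE 5.4; corosync -v simply prints the installed version):

# Built-in checklist that flags anything blocking the 5-to-6 path, including corosync
pve5to6

# Confirm the running corosync major version after installing Corosync 3
corosync -v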
 
To be clear:
I am now using ZFS, not Ceph/BlueStore.
Can the clusters still be joined? I cannot have the downtime of upgrading the first node to 5.4.
I just need to find a way to get the data off the first node with little to zero downtime or reboots; then I can upgrade it to 6.
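If joining the clusters turns out not to be workable, one fallback sketch (the VMID, storage names, and archive path below are placeholders) is a backup/restore between the two stand-alone nodes; snapshot-mode backups keep the guest running, so the only interruption is the final switchover when the restored copy is started on the new node:

# On the 5.3 node: back up the guest while it stays online
vzdump 101 --mode snapshot --storage backupstore

# Copy the archive to the 6.0 node, then restore it onto the ZFS storage there
qmrestore /mnt/backups/vzdump-qemu-101.vma.lzo 101 --storage local-zfs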
 
