I've been running PVE 5.0 with Ceph and BlueStore for a few weeks now on a quad-node Dell 6100, with 1x 160GB drive for the boot OS and 2x 2TB SATA drives per node for Ceph.
The cluster (Proxmox & Ceph) had been working fine, but yesterday I ran apt-get update && apt-get dist-upgrade on the first node, and after rebooting, the two OSDs on that server failed to come up.
After a lot of debugging, I decided to just recreate the OSDs. I wiped the partitions and then tried creating the OSDs via the GUI. That failed without giving any reason, so I fell back to the command-line tools and got the following:
=============================================================================
/dev/disk/by-uuid# pveceph createosd /dev/sdb -bluestore
create OSD on /dev/sdb (bluestore)
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
------------------------------------------------------------------
At this point it hung...
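In case it helps, here is roughly how I'd check what the hung command is waiting on (just my guess at useful diagnostics; the exact process name is whatever pveceph spawned):

```shell
# Processes stuck in uninterruptible sleep (state D) usually mean the
# kernel is blocked on the device, not that the userspace tool is buggy
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'

# Recent kernel messages often show I/O or controller errors at the hang
dmesg | tail -n 50
```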
While trying to narrow down the issue, I also tried to recreate the partition table via parted, and it hung too:
=============================================================================
root@pmx1:/dev/disk/by-uuid# parted /dev/sdb mklabel
New disk label type? gpt
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want
to continue?
Yes/No? yes
=============================================================================
After typing 'yes' and pressing Enter, it hung again! So I'm getting closer to the real issue.
I tried the same on the second OSD disk, /dev/sdc, with the same result.
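One thing I haven't ruled out is whether something still holds the old OSD devices open; as far as I know, a stale mount or a leftover kernel holder (dm-crypt, LVM, md) can make the kernel block when a tool tries to re-read the partition table. This is how I'd check on this node (assuming the disks are still /dev/sdb and /dev/sdc):

```shell
# Any lingering mounts of the old OSD partitions?
grep -E '/dev/sd[bc]' /proc/mounts || echo "no sdb/sdc mounts"

# Any kernel-level holders still attached to the disks?
ls /sys/block/sd[bc]/holders/ 2>/dev/null || echo "no holders found"
```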
Another possible clue: in /dev/disk/by-uuid/ I don't see links for either of the two OSD disks, though I do see the initial boot disk...
=============================================================================
root@pmx1:/dev/disk/by-uuid# ls -l
total 0
0 lrwxrwxrwx 1 root root 10 Aug 20 08:40 d29c5db0-eab1-4bbe-803e-382d73bc2a14 -> ../../dm-0
0 lrwxrwxrwx 1 root root 10 Aug 20 08:40 8345-80A0 -> ../../sda2
0 lrwxrwxrwx 1 root root 10 Aug 20 08:42 015ef2d3-0fc3-4caa-bf2b-6d9ca5e9e963 -> ../../dm-1
=============================================================================
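If I understand udev correctly, /dev/disk/by-uuid/ only gets a symlink when a device carries a filesystem UUID, so a freshly wiped disk with a bare GPT would never show up there; its PTUUID appears in blkid (as it does for /dev/sdb below) but never as a by-uuid link. These are the views I'd compare (just my reading of how the by-* directories work):

```shell
# Filesystem UUIDs vs partition UUIDs side by side; a blank GPT disk
# shows a PTUUID in blkid but has no FSTYPE/UUID, hence no by-uuid link
lsblk -o NAME,FSTYPE,UUID,PARTUUID 2>/dev/null || true

# Partition symlinks (if any partitions existed) live under by-partuuid
ls -l /dev/disk/by-partuuid/ 2>/dev/null || echo "no by-partuuid entries"
```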
Here is some more info:
=============================================================================
root@pmx1:/dev/disk/by-uuid# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 149.1G 0 disk
├─sda1 8:1 1 1M 0 part
├─sda2 8:2 1 256M 0 part
└─sda3 8:3 1 148.8G 0 part
├─pve-root 253:0 0 37G 0 lvm /
├─pve-swap 253:1 0 8G 0 lvm [SWAP]
├─pve-data_tmeta 253:2 0 88M 0 lvm
│ └─pve-data 253:4 0 87.8G 0 lvm
└─pve-data_tdata 253:3 0 87.8G 0 lvm
└─pve-data 253:4 0 87.8G 0 lvm
sdb 8:16 1 1.8T 0 disk
sdc 8:32 1 1.8T 0 disk
root@pmx1:/dev/disk/by-uuid# blkid
/dev/sda2: UUID="8345-80A0" TYPE="vfat" PARTUUID="e08c8686-4648-4b08-a1b6-b2d48df92260"
/dev/sda3: UUID="WaN31l-qUAc-KIa6-nN2u-PWe2-Gouv-U2hExs" TYPE="LVM2_member" PARTUUID="fe524c16-08bf-4059-a5cf-25e065e87561"
/dev/mapper/pve-root: UUID="d29c5db0-eab1-4bbe-803e-382d73bc2a14" TYPE="ext4"
/dev/mapper/pve-swap: UUID="015ef2d3-0fc3-4caa-bf2b-6d9ca5e9e963" TYPE="swap"
/dev/sda1: PARTUUID="6ed5b088-0f4c-4b5c-9b1f-b6e223fbf2d6"
/dev/sdb: PTUUID="a4d2cf40-aa4e-4e2c-a3bb-a92750b2eb5f" PTTYPE="gpt"
=============================================================================
Any help would be appreciated!
-Glen