Ceph OSD with file journal

gcakici

Renowned Member
Sep 26, 2009
I have an Intel DC P3700 NVMe drive that I want to use as the journal for 3 SATA OSDs. I partitioned the device, but I can't get the partitions used as block-device journals, so I want to use them as file journals instead.

I couldn't find a way to do this in the Proxmox interface, and while ceph-disk prepares the OSD, I can't see it in the Proxmox interface either. This is a freshly installed platform with the enterprise repo; Jewel is the latest version.

How can I create file-journaled OSDs that can be seen and operated in the Proxmox interface?

Thanks
Gokalp

ceph-disk prepare --fs-type xfs --cluster ceph --journal-file /dev/sda /tmp/journal

prepare_file: OSD will not be hot-swappable if journal is not the same device as the osd data
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sda1 isize=2048 agcount=4, agsize=244188597 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=976754385, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=476930, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.


pveversion -v

proxmox-ve: 4.4-86 (running kernel: 4.4.49-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.49-1-pve: 4.4.49-86
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-97
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
ceph: 10.2.7-1~bpo80+1
 
Hi,

here is the command for using a journal as you describe,
assuming sda is the disk and nvme0n1p1 is the NVMe partition:

pveceph createosd /dev/sda -journal_dev /dev/nvme0n1p1
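
To double-check the result, ceph-disk can list the data/journal mapping afterwards (a quick sanity check; the device names are the ones assumed above):

Code:
ceph-disk list
# /dev/sda1 should show up as "ceph data ... journal /dev/nvme0n1p1"
# and /dev/nvme0n1p1 as "ceph journal, for /dev/sda1"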
 
Thank you for the reply. This is exactly what I did for my installation. Before executing the command:

#ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0 root default
-2 0 host prox-ceph-5-1-51

#ceph-disk zap /dev/sda
The operation has completed successfully.

Then I executed the same command as yours.

#pveceph createosd /dev/sda -journal_dev /dev/nvme0n1p1
create OSD on /dev/sda (xfs)
using device '/dev/nvme0n1p1' for journal
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
prepare_device: OSD will not be hot-swappable if journal is not the same device as the osd data
prepare_device: Journal /dev/nvme0n1p1 was not prepared with ceph-disk. Symlinking directly.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sda1 isize=2048 agcount=4, agsize=244188597 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=976754385, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=476930, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.

I can see that it has been prepared and is available to Ceph.

#ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0 root default
-2 0 host prox-ceph-5-1-51
0 0 osd.0 down 1.00000 1.00000

But I can not see it in the Proxmox interface.

The OSD log says:

2017-04-13 10:11:09.713989 7f55ec4e2800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:09.714089 7f55ec4e2800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.y11kCT: (13) Permission denied
2017-04-13 10:11:10.287998 7fbd65627800 0 set uid:gid to 64045:64045 (ceph:ceph)
2017-04-13 10:11:10.288009 7fbd65627800 0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process ceph-osd, pid 24422
2017-04-13 10:11:10.290130 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkfs in /var/lib/ceph/tmp/mnt.V_F6HH
2017-04-13 10:11:10.290148 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkfs fsid is already set to 995fa999-f78a-4f9a-834e-994f4d49b430
2017-04-13 10:11:10.290152 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) write_version_stamp 4
2017-04-13 10:11:10.292551 7fbd65627800 0 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) backend xfs (magic 0x58465342)
2017-04-13 10:11:10.422327 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) leveldb db exists/created
2017-04-13 10:11:10.422432 7fbd65627800 -1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.V_F6HH/journal: (13) Permission denied
2017-04-13 10:11:10.422468 7fbd65627800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:10.422521 7fbd65627800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.V_F6HH: (13) Permission denied

# ls -hal /var/lib/ceph/
total 20K
drwxr-x--- 8 ceph ceph 8 Apr 11 23:15 .
drwxr-xr-x 45 root root 46 Apr 11 23:15 ..
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-mds
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-osd
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-rgw
drwxr-xr-x 3 ceph ceph 3 Apr 12 04:42 mon
drwxr-xr-x 3 ceph ceph 3 Apr 12 11:10 osd
drwxr-xr-x 2 ceph ceph 4 Apr 13 10:11 tmp

# ls -hal /var/lib/ceph/tmp/
total 10K
drwxr-xr-x 2 ceph ceph 4 Apr 13 10:11 .
drwxr-x--- 8 ceph ceph 8 Apr 11 23:15 ..
-rwxr-xr-x 1 root root 0 Apr 12 12:13 ceph-disk.activate.lock
-rwxr-xr-x 1 root root 0 Apr 12 12:12 ceph-disk.prepare.lock

#ls /dev/nvm* -hal

drwxr-xr-x 21 root root 4.6K Apr 13 10:11 .
drwxr-xr-x 22 root root 22 Apr 9 18:28 ..
crw------- 1 root root 248, 0 Apr 12 14:55 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Apr 12 14:55 /dev/nvme0n1
brw-rw---- 1 root disk 259, 1 Apr 12 14:55 /dev/nvme0n1p1
 
I have tested this setup and it works,
so I would wipe everything and start over.

pveceph stop
pveceph purge
umount /var/lib/ceph/..
rm -r /var/lib/ceph
rm -r /etc/ceph

use parted to re-create the partition tables on the disks
parted /dev/.. mklabel gpt
parted /dev/nvme.. mkpart

now you can start from the beginning
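
Spelled out with concrete names (an illustration only; /dev/sda as the OSD disk, /dev/nvme0n1 as the journal device, and the ceph-0 mount point are assumptions, so adjust to your layout):

Code:
pveceph stop
pveceph purge
umount /var/lib/ceph/osd/ceph-0        # whatever is still mounted
rm -r /var/lib/ceph
rm -r /etc/ceph
parted /dev/sda mklabel gpt            # new GPT on the OSD disk
parted /dev/nvme0n1 mklabel gpt        # new GPT on the journal device
parted -a optimal /dev/nvme0n1 mkpart primary 0% 100%   # one journal partition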
 
...
2017-04-13 10:11:10.422432 7fbd65627800 -1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.V_F6HH/journal: (13) Permission denied
2017-04-13 10:11:10.422468 7fbd65627800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:10.422521 7fbd65627800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.V_F6HH: (13) Permission denied

#ls /dev/nvm* -hal
brw-rw---- 1 root disk 259, 1 Apr 12 14:55 /dev/nvme0n1p1
Hi,
perhaps:
Code:
chown ceph /dev/nvme0n1p1
Udo
 
Hi,
perhaps:
Code:
chown ceph /dev/nvme0n1p1
Udo
Yes, it worked that way. But after a reboot the permissions revert, the OSD does not start, and Ceph does not come up. I think that's the udev issue I've seen in your older posts, but I can't fix that either.
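
A common way around this kind of udev permission problem (a sketch only; the partition number and the rule file name are assumptions): either tag the journal partition with Ceph's journal partition type GUID, so the stock 95-ceph-osd.rules udev rule chowns it to ceph:ceph at boot, or add a small local rule:

Code:
# tag partition 1 with Ceph's journal partition type GUID
sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/nvme0n1
# or: a local udev rule that fixes the ownership directly
echo 'KERNEL=="nvme0n1p1", OWNER="ceph", GROUP="ceph", MODE="0660"' \
  > /etc/udev/rules.d/90-ceph-journal.rules
udevadm control --reload-rules && udevadm trigger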
 
I have tested this setup and it works,
so I would wipe everything and start over.

pveceph stop
pveceph purge
umount /var/lib/ceph/..
rm -r /var/lib/ceph
rm -r /etc/ceph

use parted to re-create the partition tables on the disks
parted /dev/.. mklabel gpt
parted /dev/nvme.. mkpart

now you can start from the beginning
Didn't help. I've tried the next recommendation from Udo, and it went as in my reply below.
 
Yes, it worked that way. But after a reboot the permissions revert, the OSD does not start, and Ceph does not come up. I think that's the udev issue I've seen in your older posts, but I can't fix that either.
Hi,
you can try the following (which worked for me on a Hammer cluster):

1. use sgdisk to name your journal partition journal-0 (for osd.0), so that /dev/disk/by-partlabel/journal-0 is a link to your partition (/dev/nvme0n1p1); see the sketch below
2. put following in ceph.conf:
Code:
[osd]
osd_journal = /dev/disk/by-partlabel/journal-$id
Perhaps autostart works then? But I'm not sure that the ceph permissions are set...
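
For step 1, something like this (a sketch; partition 1 on /dev/nvme0n1 is assumed, and sgdisk -c changes the GPT partition name):

Code:
# name partition 1 "journal-0" for osd.0
sgdisk -c 1:journal-0 /dev/nvme0n1
# re-read the partition table so the by-partlabel symlink appears
partprobe /dev/nvme0n1
ls -l /dev/disk/by-partlabel/journal-0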

Udo
 
