Ceph OSD with file journal

gcakici

Renowned Member
Sep 26, 2009
I have an Intel DC P3700 NVMe drive that I want to use as the journal for 3 SATA OSDs. I partitioned the device, but I can't get the partitions used as block-device journals, so I want to use them as file journals instead.

I couldn't find a way to do this in the Proxmox interface, and while ceph-disk prepares the OSD, I can't see it in the Proxmox interface either. This is a freshly installed platform with the enterprise repo; Jewel is the latest version.

How can I create file-journaled OSDs that can be seen and operated in the Proxmox interface?

Thanks
Gokalp

ceph-disk prepare --fs-type xfs --cluster ceph --journal-file /dev/sda /tmp/journal

prepare_file: OSD will not be hot-swappable if journal is not the same device as the osd data
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sda1 isize=2048 agcount=4, agsize=244188597 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=976754385, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=476930, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.


pveversion -v

proxmox-ve: 4.4-86 (running kernel: 4.4.49-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.49-1-pve: 4.4.49-86
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-97
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
ceph: 10.2.7-1~bpo80+1
 
Hi,

here is the command for using a journal as you describe,
assuming sda is the disk and nvme0n1p1 is the NVMe partition:

pveceph createosd /dev/sda -journal_dev /dev/nvme0n1p1
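
To double-check the result, ceph-disk can list the data/journal mapping afterwards (a quick sanity check; the device names are the ones assumed above):

Code:
ceph-disk list
# /dev/sda1 should show up as "ceph data ... journal /dev/nvme0n1p1"
# and /dev/nvme0n1p1 as "ceph journal, for /dev/sda1"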
 
Thank you for the reply. This is exactly what I did for my installation. Before executing the command:

#ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0 root default
-2 0 host prox-ceph-5-1-51

#ceph-disk zap /dev/sda
The operation has completed successfully.

Then I executed the same command as yours.

#pveceph createosd /dev/sda -journal_dev /dev/nvme0n1p1
create OSD on /dev/sda (xfs)
using device '/dev/nvme0n1p1' for journal
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
prepare_device: OSD will not be hot-swappable if journal is not the same device as the osd data
prepare_device: Journal /dev/nvme0n1p1 was not prepared with ceph-disk. Symlinking directly.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sda1 isize=2048 agcount=4, agsize=244188597 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=976754385, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=476930, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.

I can see that it has been prepared and is available to Ceph.

#ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0 root default
-2 0 host prox-ceph-5-1-51
0 0 osd.0 down 1.00000 1.00000

But I can not see it in the Proxmox interface.

The OSD log says:

2017-04-13 10:11:09.713989 7f55ec4e2800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:09.714089 7f55ec4e2800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.y11kCT: (13) Permission denied
2017-04-13 10:11:10.287998 7fbd65627800 0 set uid:gid to 64045:64045 (ceph:ceph)
2017-04-13 10:11:10.288009 7fbd65627800 0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process ceph-osd, pid 24422
2017-04-13 10:11:10.290130 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkfs in /var/lib/ceph/tmp/mnt.V_F6HH
2017-04-13 10:11:10.290148 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkfs fsid is already set to 995fa999-f78a-4f9a-834e-994f4d49b430
2017-04-13 10:11:10.290152 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) write_version_stamp 4
2017-04-13 10:11:10.292551 7fbd65627800 0 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) backend xfs (magic 0x58465342)
2017-04-13 10:11:10.422327 7fbd65627800 1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) leveldb db exists/created
2017-04-13 10:11:10.422432 7fbd65627800 -1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.V_F6HH/journal: (13) Permission denied
2017-04-13 10:11:10.422468 7fbd65627800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:10.422521 7fbd65627800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.V_F6HH: (13) Permission denied

# ls -hal /var/lib/ceph/
total 20K
drwxr-x--- 8 ceph ceph 8 Apr 11 23:15 .
drwxr-xr-x 45 root root 46 Apr 11 23:15 ..
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-mds
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-osd
drwxr-xr-x 2 ceph ceph 3 Apr 12 04:42 bootstrap-rgw
drwxr-xr-x 3 ceph ceph 3 Apr 12 04:42 mon
drwxr-xr-x 3 ceph ceph 3 Apr 12 11:10 osd
drwxr-xr-x 2 ceph ceph 4 Apr 13 10:11 tmp

# ls -hal /var/lib/ceph/tmp/
total 10K
drwxr-xr-x 2 ceph ceph 4 Apr 13 10:11 .
drwxr-x--- 8 ceph ceph 8 Apr 11 23:15 ..
-rwxr-xr-x 1 root root 0 Apr 12 12:13 ceph-disk.activate.lock
-rwxr-xr-x 1 root root 0 Apr 12 12:12 ceph-disk.prepare.lock

#ls /dev/nvm* -hal

drwxr-xr-x 21 root root 4.6K Apr 13 10:11 .
drwxr-xr-x 22 root root 22 Apr 9 18:28 ..
crw------- 1 root root 248, 0 Apr 12 14:55 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Apr 12 14:55 /dev/nvme0n1
brw-rw---- 1 root disk 259, 1 Apr 12 14:55 /dev/nvme0n1p1
 
I have tested this setup and it works,
so I would wipe everything and start over.

pveceph stop
pveceph purge
umount /var/lib/ceph/..
rm -r /var/lib/ceph
rm -r /etc/ceph

use parted to re-create the partition tables on the disks
parted /dev/.. mklabel gpt
parted /dev/nvme.. mkpart

now you can start from the beginning
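
Spelled out with concrete names (an illustration only; /dev/sda as the OSD disk, /dev/nvme0n1 as the journal device, and the ceph-0 mount point are assumptions, so adjust to your layout):

Code:
pveceph stop
pveceph purge
umount /var/lib/ceph/osd/ceph-0        # whatever is still mounted
rm -r /var/lib/ceph
rm -r /etc/ceph
parted /dev/sda mklabel gpt            # new GPT on the OSD disk
parted /dev/nvme0n1 mklabel gpt        # new GPT on the journal device
parted -a optimal /dev/nvme0n1 mkpart primary 0% 100%   # one journal partition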
 
...
2017-04-13 10:11:10.422432 7fbd65627800 -1 filestore(/var/lib/ceph/tmp/mnt.V_F6HH) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.V_F6HH/journal: (13) Permission denied
2017-04-13 10:11:10.422468 7fbd65627800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2017-04-13 10:11:10.422521 7fbd65627800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.V_F6HH: (13) Permission denied

#ls /dev/nvm* -hal
brw-rw---- 1 root disk 259, 1 Apr 12 14:55 /dev/nvme0n1p1
Hi,
perhaps:
Code:
chown ceph /dev/nvme0n1p1
Udo
 
Hi,
perhaps:
Code:
chown ceph /dev/nvme0n1p1
Udo
Yes, it worked that way. But after a reboot the permissions revert, the OSD does not start, and Ceph does not come up. I think that's the udev issue I've seen in your older posts, but I can't fix that either.
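
A common way around this kind of udev permission problem (a sketch only; the partition number and the rule file name are assumptions): either tag the journal partition with Ceph's journal partition type GUID, so the stock 95-ceph-osd.rules udev rule chowns it to ceph:ceph at boot, or add a small local rule:

Code:
# tag partition 1 with Ceph's journal partition type GUID
sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/nvme0n1
# or: a local udev rule that fixes the ownership directly
echo 'KERNEL=="nvme0n1p1", OWNER="ceph", GROUP="ceph", MODE="0660"' \
  > /etc/udev/rules.d/90-ceph-journal.rules
udevadm control --reload-rules && udevadm trigger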
 
I have tested this setup and it works,
so I would wipe everything and start over.

pveceph stop
pveceph purge
umount /var/lib/ceph/..
rm -r /var/lib/ceph
rm -r /etc/ceph

use parted to re-create the partition tables on the disks
parted /dev/.. mklabel gpt
parted /dev/nvme.. mkpart

now you can start from the beginning
Didn't help. I've tried the next recommendation from Udo, and it went as in my reply below.
 
Yes, it worked that way. But after a reboot the permissions revert, the OSD does not start, and Ceph does not come up. I think that's the udev issue I've seen in your older posts, but I can't fix that either.
Hi,
you can try the following (which worked for me on a Hammer cluster):

1. use sgdisk to name your journal partition journal-0 (for osd.0), so that /dev/disk/by-partlabel/journal-0 is a link to your partition (/dev/nvme0n1p1); see the sketch below
2. put following in ceph.conf:
Code:
[osd]
osd_journal = /dev/disk/by-partlabel/journal-$id
Perhaps autostart works then? But I'm not sure that the ceph permissions are set...
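
For step 1, something like this (a sketch; partition 1 on /dev/nvme0n1 is assumed, and sgdisk -c changes the GPT partition name):

Code:
# name partition 1 "journal-0" for osd.0
sgdisk -c 1:journal-0 /dev/nvme0n1
# re-read the partition table so the by-partlabel symlink appears
partprobe /dev/nvme0n1
ls -l /dev/disk/by-partlabel/journal-0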

Udo
 
