fresh PVE 5.0/ceph 12.2.0 @home

luphi

Hey all,

for a few days now I have been planning a new single PVE server for home use.
Here is my strategy:
I don't care much about availability or performance, so a single host is OK for me.
I care about flexibility, which is why I want Ceph for storage rather than ZFS or something else.
I care about my data, so I want my pools to be 2/1 (size/min_size); the really important data is additionally backed up outside of Ceph (a pool sketch follows below).

4 HDD OSDs
2 SSDs for journal (two OSDs per SSD)
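
A pool with those settings would later be created roughly like this (the pool name and PG count are just examples):
Code:
ceph osd pool create rbd 128 128 replicated
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1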

After the fresh PVE 5.0 installation, I installed ceph with the following commands:
Code:
pveceph install
pveceph init --network 172.23.1.0/24
pveceph createmon
ceph osd require-osd-release luminous
ceph osd set-require-min-compat-client jewel
ceph osd crush tunables optimal

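As a quick sanity check, the monitor and overall cluster state can be verified with the standard status commands:
Code:
ceph -s
ceph versions
ceph mon stat
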
remark:
My first idea was to use 127.0.0.0/8 for the Ceph network, since this is a single-node cluster. Unfortunately, that didn't work.

After a reboot, ceph-mgr seems to start, but it looks like it crashes immediately.
I have to start it manually after each reboot.

cat /var/log/ceph/ceph-mgr.pve.log
Code:
2017-09-19 14:32:55.927732 7ff9b4937540  0 set uid:gid to 64045:64045 (ceph:ceph)
2017-09-19 14:32:55.927759 7ff9b4937540  0 ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc), process (unknown), pid 1301
2017-09-19 14:32:55.929630 7ff9b4937540  0 pidfile_write: ignore empty --pid-file
2017-09-19 14:32:56.655752 7ff9b4937540  1 mgr send_beacon standby
2017-09-19 14:32:57.671507 7ff9aaf0c700  1 mgr handle_mgr_map Activating!
2017-09-19 14:32:57.671674 7ff9aaf0c700  1 mgr handle_mgr_map I am now activating
2017-09-19 14:32:57.764892 7ff9a6f04700  1 mgr init Loading python module 'restful'
2017-09-19 14:32:58.232206 7ff9a6f04700  1 mgr load Constructed class from module: restful
2017-09-19 14:32:58.232235 7ff9a6f04700  1 mgr init Loading python module 'status'
2017-09-19 14:32:58.256827 7ff9a6f04700  1 mgr load Constructed class from module: status
2017-09-19 14:32:58.256852 7ff9a6f04700  1 mgr start Creating threads for 2 modules
2017-09-19 14:32:58.256931 7ff9a6f04700  1 mgr send_beacon active
2017-09-19 14:32:58.257519 7ff9a2ef5700  0 mgr[restful] Traceback (most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
    self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
    raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured

2017-09-19 14:32:58.661688 7ff9a7f06700  1 mgr send_beacon active
2017-09-19 14:32:58.661968 7ff9a7f06700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:00.662056 7ff9a7f06700  1 mgr send_beacon active
2017-09-19 14:33:00.662275 7ff9a7f06700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:01.512451 7ff9aaf0c700 -1 mgr handle_mgr_map I was active but no longer am
2017-09-19 14:33:01.512458 7ff9aaf0c700  1 mgr respawn  e: '/usr/bin/ceph-mgr'
2017-09-19 14:33:01.512459 7ff9aaf0c700  1 mgr respawn  0: '/usr/bin/ceph-mgr'
2017-09-19 14:33:01.512460 7ff9aaf0c700  1 mgr respawn  1: '-f'
2017-09-19 14:33:01.512461 7ff9aaf0c700  1 mgr respawn  2: '--cluster'
2017-09-19 14:33:01.512461 7ff9aaf0c700  1 mgr respawn  3: 'ceph'
2017-09-19 14:33:01.512462 7ff9aaf0c700  1 mgr respawn  4: '--id'
2017-09-19 14:33:01.512463 7ff9aaf0c700  1 mgr respawn  5: 'pve'
2017-09-19 14:33:01.512463 7ff9aaf0c700  1 mgr respawn  6: '--setuser'
2017-09-19 14:33:01.512464 7ff9aaf0c700  1 mgr respawn  7: 'ceph'
2017-09-19 14:33:01.512464 7ff9aaf0c700  1 mgr respawn  8: '--setgroup'
2017-09-19 14:33:01.512465 7ff9aaf0c700  1 mgr respawn  9: 'ceph'
2017-09-19 14:33:01.512498 7ff9aaf0c700  1 mgr respawn respawning with exe /usr/bin/ceph-mgr
2017-09-19 14:33:01.512499 7ff9aaf0c700  1 mgr respawn  exe_path /proc/self/exe
2017-09-19 14:33:01.888611 7f34975c5540  0 set uid:gid to 64045:64045 (ceph:ceph)
2017-09-19 14:33:01.888622 7f34975c5540  0 ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc), process (unknown), pid 1801
2017-09-19 14:33:01.889988 7f34975c5540  0 pidfile_write: ignore empty --pid-file
2017-09-19 14:33:01.893769 7f34975c5540  1 mgr send_beacon standby
2017-09-19 14:33:02.584162 7f348db9a700  1 mgr handle_mgr_map Activating!
2017-09-19 14:33:02.584325 7f348db9a700  1 mgr handle_mgr_map I am now activating
2017-09-19 14:33:02.594165 7f3489b92700  1 mgr init Loading python module 'restful'
2017-09-19 14:33:02.831363 7f3489b92700  1 mgr load Constructed class from module: restful
2017-09-19 14:33:02.831417 7f3489b92700  1 mgr init Loading python module 'status'
2017-09-19 14:33:02.851314 7f3489b92700  1 mgr load Constructed class from module: status
2017-09-19 14:33:02.851361 7f3489b92700  1 mgr start Creating threads for 2 modules
2017-09-19 14:33:02.851455 7f3489b92700  1 mgr send_beacon active
2017-09-19 14:33:02.852214 7f3485b83700  0 mgr[restful] Traceback (most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
    self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
    raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured

2017-09-19 14:33:03.894030 7f348ab94700  1 mgr send_beacon active
2017-09-19 14:33:03.894227 7f348ab94700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:05.894292 7f348ab94700  1 mgr send_beacon active
2017-09-19 14:33:05.894456 7f348ab94700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:06.882945 7f48cf530540  0 set uid:gid to 64045:64045 (ceph:ceph)
2017-09-19 14:33:06.882958 7f48cf530540  0 ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc), process (unknown), pid 1861
2017-09-19 14:33:06.884360 7f48cf530540  0 pidfile_write: ignore empty --pid-file
2017-09-19 14:33:06.888024 7f48cf530540  1 mgr send_beacon standby
2017-09-19 14:33:07.595953 7f48c5b05700  1 mgr handle_mgr_map Activating!
2017-09-19 14:33:07.596108 7f48c5b05700  1 mgr handle_mgr_map I am now activating
2017-09-19 14:33:07.605926 7f48c1afd700  1 mgr init Loading python module 'restful'
2017-09-19 14:33:07.842043 7f48c1afd700  1 mgr load Constructed class from module: restful
2017-09-19 14:33:07.842069 7f48c1afd700  1 mgr init Loading python module 'status'
2017-09-19 14:33:07.861939 7f48c1afd700  1 mgr load Constructed class from module: status
2017-09-19 14:33:07.861960 7f48c1afd700  1 mgr start Creating threads for 2 modules
2017-09-19 14:33:07.862030 7f48c1afd700  1 mgr send_beacon active
2017-09-19 14:33:07.862660 7f48bdaee700  0 mgr[restful] Traceback (most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
    self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
    raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured

2017-09-19 14:33:08.888286 7f48c2aff700  1 mgr send_beacon active
2017-09-19 14:33:08.888486 7f48c2aff700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:10.888564 7f48c2aff700  1 mgr send_beacon active
2017-09-19 14:33:10.888740 7f48c2aff700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:11.516123 7f48c5b05700 -1 mgr handle_mgr_map I was active but no longer am
2017-09-19 14:33:11.891422 7f5dc299d540  0 set uid:gid to 64045:64045 (ceph:ceph)
2017-09-19 14:33:11.891434 7f5dc299d540  0 ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc), process (unknown), pid 1912
2017-09-19 14:33:11.892819 7f5dc299d540  0 pidfile_write: ignore empty --pid-file
2017-09-19 14:33:11.896540 7f5dc299d540  1 mgr send_beacon standby
2017-09-19 14:33:12.606687 7f5db8f72700  1 mgr handle_mgr_map Activating!
2017-09-19 14:33:12.606831 7f5db8f72700  1 mgr handle_mgr_map I am now activating
2017-09-19 14:33:12.616848 7f5db4f6a700  1 mgr init Loading python module 'restful'
2017-09-19 14:33:12.852142 7f5db4f6a700  1 mgr load Constructed class from module: restful
2017-09-19 14:33:12.852168 7f5db4f6a700  1 mgr init Loading python module 'status'
2017-09-19 14:33:12.871998 7f5db4f6a700  1 mgr load Constructed class from module: status
2017-09-19 14:33:12.872026 7f5db4f6a700  1 mgr start Creating threads for 2 modules
2017-09-19 14:33:12.872107 7f5db4f6a700  1 mgr send_beacon active
2017-09-19 14:33:12.872727 7f5db0f5b700  0 mgr[restful] Traceback (most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
    self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
    raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured

2017-09-19 14:33:13.896816 7f5db5f6c700  1 mgr send_beacon active
2017-09-19 14:33:13.897012 7f5db5f6c700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2017-09-19 14:33:15.897110 7f5db5f6c700  1 mgr send_beacon active
2017-09-19 14:33:15.897312 7f5db5f6c700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs

After a manual restart, the manager runs stably.
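
The manual restart itself is just the standard systemd unit for the manager (the mgr ID here is pve, as in the log):
Code:
systemctl restart ceph-mgr@pve
systemctl status ceph-mgr@pve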

The next step was to create the OSDs. I did that from the GUI, which added all four OSDs to the host bucket. To make sure that replicas never share a journal SSD, I made some changes to the crush map (the crushtool workflow is sketched after the map):
Code:
# types
type 0 osd
type 1 journalssd
type 2 host
type 3 root

# buckets
journalssd even {
    id -2        # do not change unnecessarily
    id -5 class hdd        # do not change unnecessarily
    # weight 0.292
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 0.146
    item osd.2 weight 0.146
}
journalssd odd {
    id -4        # do not change unnecessarily
    id -6 class hdd        # do not change unnecessarily
    # weight 0.292
    alg straw2
    hash 0    # rjenkins1
    item osd.1 weight 0.146
    item osd.3 weight 0.146
}
host pve {
    id -3        # do not change unnecessarily
    id -7 class hdd        # do not change unnecessarily
    # weight 0.584
    alg straw2
    hash 0    # rjenkins1
    item even weight 0.292
    item odd weight 0.292
}
root default {
    id -1        # do not change unnecessarily
    id -8 class hdd        # do not change unnecessarily
    # weight 0.584
    alg straw2
    hash 0    # rjenkins1
    item pve weight 0.584
}

# rules
rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type journalssd
        step emit
}

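For reference, the edit follows the usual getcrushmap/crushtool round trip (the file names are just examples):
Code:
# dump and decompile the current crush map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# edit crushmap.txt (add the journalssd buckets and rule shown above),
# then recompile and inject it
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin
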
Everything looked fine so far:
ceph osd tree:
Code:
ID CLASS WEIGHT  TYPE NAME               STATUS REWEIGHT PRI-AFF
-1       0.58400 root default
-3       0.58400     host pve
-2       0.29199         journalssd even
 0   hdd 0.14600             osd.0           up  1.00000 1.00000
 2   hdd 0.14600             osd.2           up  1.00000 1.00000
-4       0.29199         journalssd odd
 1   hdd 0.14600             osd.1           up  1.00000 1.00000
 3   hdd 0.14600             osd.3           up  1.00000 1.00000

After an additional reboot my new buckets survived, but the OSDs had been moved back into the host bucket:
Code:
ID CLASS WEIGHT  TYPE NAME               STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-3       0.58398     host pve
-2             0         journalssd even
-4             0         journalssd odd
 0   hdd 0.14600         osd.0               up  1.00000 1.00000
 1   hdd 0.14600         osd.1               up  1.00000 1.00000
 2   hdd 0.14600         osd.2               up  1.00000 1.00000
 3   hdd 0.14600         osd.3               up  1.00000 1.00000
My workaround is a script that starts the manager and re-injects my crush map after each boot (sketched below), but that is a bit ...
Let's say it's not the right way, right?
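
The script boils down to something like this (a rough sketch; /root/crushmap-new.bin is just an example path):
Code:
#!/bin/bash
# start the manager that did not survive the reboot
systemctl start ceph-mgr@pve

# re-inject the custom crush map with the journalssd buckets
ceph osd setcrushmap -i /root/crushmap-new.bin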

Cheers,
luphi
 
I did some further research this morning by monitoring "ceph -s" and "ceph osd tree" during startup.
(I removed the host bucket since this is an unnecessary layer in my hierarchy)

At the beginning everything seems to be OK: the mgr is active, the OSD tree is correct, and the OSDs are just coming up:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            3 osds down
            1 journalssd (2 osds) down

  services:
    mon: 1 daemons, quorum pve
    mgr: pve(active)
    osd: 4 osds: 1 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2           up  1.00000 1.00000
-4       0.29199     journalssd odd
 1   hdd 0.14600         osd.1         down  1.00000 1.00000
 3   hdd 0.14600         osd.3         down  1.00000 1.00000
Two seconds later, the mgr is gone:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            3 osds down
            1 journalssd (2 osds) down
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 1 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2           up  1.00000 1.00000
-4       0.29199     journalssd odd
 1   hdd 0.14600         osd.1         down  1.00000 1.00000
 3   hdd 0.14600         osd.3         down  1.00000 1.00000
Next step: the mgr stays down and osd.3 is moved to the root bucket:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            3 osds down
            1 journalssd (1 osds) down
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 1 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2           up  1.00000 1.00000
-4       0.14600     journalssd odd
 1   hdd 0.14600         osd.1         down  1.00000 1.00000
 3   hdd 0.14600     osd.3             down  1.00000 1.00000
Then osd.2 goes down and osd.3 comes up:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            3 osds down
            2 journalssds (3 osds) down
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 1 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2         down  1.00000 1.00000
-4       0.14600     journalssd odd
 1   hdd 0.14600         osd.1         down  1.00000 1.00000
 3   hdd 0.14600     osd.3               up  1.00000 1.00000
Then osd.1 is moved to the root bucket:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            3 osds down
            2 journalssds (3 osds) down
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 1 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2         down  1.00000 1.00000
-4             0     journalssd odd
 1   hdd 0.14600     osd.1             down  1.00000 1.00000
 3   hdd 0.14600     osd.3               up  1.00000 1.00000
and then osd.1 comes up:
Code:
 cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            2 osds down
            2 journalssds (2 osds) down
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 2 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2       0.29199     journalssd even
 0   hdd 0.14600         osd.0         down  1.00000 1.00000
 2   hdd 0.14600         osd.2         down  1.00000 1.00000
-4             0     journalssd odd
 1   hdd 0.14600     osd.1               up  1.00000 1.00000
 3   hdd 0.14600     osd.3               up  1.00000 1.00000
At the end it looks like this:
Code:
  cluster:
    id:     18161f23-edda-46dc-8821-8e9295f9358d
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum pve
    mgr: no daemons active
    osd: 4 osds: 4 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   8388 MB used, 591 GB / 599 GB avail
    pgs:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.58398 root default
-2             0     journalssd even
-4             0     journalssd odd
 0   hdd 0.14600     osd.0               up  1.00000 1.00000
 1   hdd 0.14600     osd.1               up  1.00000 1.00000
 2   hdd 0.14600     osd.2               up  1.00000 1.00000
 3   hdd 0.14600     osd.3               up  1.00000 1.00000

Any idea is appreciated.

Cheers,
luphi
 
Hi,

I would suggest starting over from scratch, since that only takes about 15 minutes, with the following changes:
- After pveceph install, modify ceph.conf with the chooseleaf setting to enable single-node operation (a minimal sketch follows below).
- With pveceph init, use the same network as the host network.
Then continue as described on the wiki.

This works fine on my single-node Ceph cluster, with no need to manually change the crush map.
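
The chooseleaf modification is the usual single-node tweak in /etc/pve/ceph.conf (a minimal sketch; it makes the default rule choose leaves of type osd instead of host):
Code:
[global]
    # allow all replicas on one host by choosing leaves of type osd (0)
    osd crush chooseleaf type = 0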
 
Thanks for your reply.
But how can I make sure that the primary and the replicated PG do not end up on OSDs whose journals are on the same SSD?
If that SSD fails, I would lose my data.
Is my setup not the right way?
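
For what it's worth, the rule can at least be tested against the compiled crush map with crushtool (crushmap.bin is just an example file name), to check that the two replicas always land in different journalssd buckets:
Code:
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 0 --num-rep 2 --show-mappings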

Cheers,
Martin
 
My understanding is that journal disks are pretty much useless when you use bluestore.
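
As an aside: BlueStore replaces the filestore journal with a RocksDB DB/WAL, which can still be placed on a faster device. With the Luminous ceph-disk tool, a rough sketch (device names are only examples) would be:
Code:
# data on the HDD (/dev/sdb), RocksDB block.db on the SSD (/dev/sdc)
ceph-disk prepare --bluestore --block.db /dev/sdc /dev/sdb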
 
I did some tests:
rados bench -p test 30 write --no-cleanup
Code:
journal on:             SSD             OSD (on the same HDD)

Total time run:         30.824673       30.506182
Total writes made:      485             405
Write size:             4194304         4194304
Object size:            4194304         4194304
Bandwidth (MB/sec):     62.9366         53.104
Stddev Bandwidth:       8.5651          17.1372
Max bandwidth (MB/sec): 80              84
Min bandwidth (MB/sec): 44              0
Average IOPS:           15              13
Stddev IOPS:            2               4
Max IOPS:               20              21
Min IOPS:               11              0
Average Latency(s):     1.01006         1.20487
Stddev Latency(s):     0.611315        0.456212
Max latency(s):         3.38241         2.63138
Min latency(s):         0.112241        0.160709
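For anyone who wants to reproduce the numbers, a full test sequence would look roughly like this (the pool name and PG count are just examples):
Code:
ceph osd pool create test 64 64
rados bench -p test 30 write --no-cleanup
rados bench -p test 30 seq
rados -p test cleanup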
 
