Ceph emulating RAID 1

Oct 21, 2020
How can I make Ceph emulate some sort of RAID 1?

I am aware that Ceph is a distributed storage system and that with fewer than three nodes it does not work as it should.
I'm still fighting for the third node, and to put the second one into production I first have to convince my bosses to buy a 25Gb switch.
In the meantime I would like to put the disks I managed to get purchased into production, but with only one node I cannot have redundancy.


Is there a way to do that, or does Ceph need at least two nodes to have redundancy?
 
Yes, configure RF=2 (a replica size of 2), though it is not recommended for a production setup, as a node failure will leave the setup unusable.
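For an existing pool that would look roughly like this (the pool name is a placeholder):

Code:
# keep only 2 copies of each object (not recommended for production)
ceph osd pool set <pool name> size 2
# allow I/O to continue with a single surviving copy
ceph osd pool set <pool name> min_size 1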
 
ceph osd tree:
Code:
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       14.88217 root default                           
-3       14.88217     host pveZZ                         
 0 NVMe1  0.45479         osd.0      up  1.00000 1.00000
 1 NVMe1  0.45479         osd.1      up  1.00000 1.00000
 2 NVMe2  6.98630         osd.2      up  1.00000 1.00000
 3 NVMe2  6.98630         osd.3      up  1.00000 1.00000

ceph -s
Code:
  cluster:
    id:     dc0e8f62-4ab8-441b-9891-0cf905b52e87
    health: HEALTH_WARN
            1 pool(s) have no replicas configured
            Reduced data availability: 128 pgs inactive
            Degraded data redundancy: 128 pgs undersized
            mon is allowing insecure global_id reclaim
 
  services:
    mon: 1 daemons, quorum pveZZ (age 16h)
    mgr: pveZZ(active, since 16h)
    osd: 4 osds: 4 up (since 16h), 4 in (since 5d)
 
  data:
    pools:   2 pools, 256 pgs
    objects: 97.36k objects, 376 GiB
    usage:   115 GiB used, 15 TiB / 15 TiB avail
    pgs:     50.000% pgs not active
             128 active+clean
             128 undersized+peered
 
According to the ceph osd tree output, you only have one node, and it holds all the NVMe disks. Where is the second node, or did you clip the output?

Otherwise, to make it work, the CRUSH rule needs to be modified, but you will not get any real advantage: it will just work, it won't survive a node failure.
 
It'll not work with 2 nodes either, as you need quorum for the monitors, so you need a minimum of 3 monitors.
If you run only 2 monitors, the cluster will go read-only when 1 host is down.
The OSDs/storage themselves can be replicated on 2 nodes, though:

Code:
ceph osd pool set <pool name> size 2
ceph osd pool set <pool name> min_size 1
 
To clarify, I mean redundancy at the disk level, like RAID 1.
 
Well, one way to start with a single-node setup is to modify the CRUSH rule so that replication happens across OSDs rather than across nodes. You can do that by manually editing the CRUSH map. Once you keep adding nodes and have a sufficient number of them, change the CRUSH map again so data is distributed across nodes rather than OSDs.

For this to work, make sure you have a minimum of 3 OSDs per server, and keep in mind that this is not a recommended setup at all.
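As an alternative to hand-editing the CRUSH map (the full walkthrough is further down), recent Ceph releases can also create a rule with an OSD-level failure domain directly; a rough sketch, with the rule name and pool name as placeholders:

Code:
# create a replicated rule that picks OSDs instead of hosts
ceph osd crush rule create-replicated replicated_osd default osd
# point the pool at the new rule
ceph osd pool set <pool name> crush_rule replicated_osd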


If you don't need to go with Ceph right away, you can configure LVM-thin on top of a RAID 1 of the existing disks (set up through the BIOS/controller), or use ZFS RAID 1, and migrate the data later once you have enough nodes for Ceph.
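A rough sketch of the ZFS route, assuming two spare disks /dev/sdb and /dev/sdc and a storage name of your choosing (both are placeholders):

Code:
# build a two-disk ZFS mirror (the RAID 1 equivalent)
zpool create -o ashift=12 tank mirror /dev/sdb /dev/sdc
# register it in Proxmox as a VM/CT storage
pvesm add zfspool local-zfs-mirror --pool tank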
 
Follow these commands to start with single-node Ceph.

Assumptions: you have a single node, and you have already created a cluster with that single node as its only member for now.

pveceph init --network=172.19.X.Y/24

Specify your 10G or faster interface network here.

pveceph mon create

Find the list of drives using:
lsblk -f
I have added 3 disks for testing: /dev/sdb, /dev/sdc, /dev/sdd


pveceph osd create /dev/sdb --crush-device-class hdd
pveceph osd create /dev/sdc --crush-device-class hdd
pveceph osd create /dev/sdd --crush-device-class hdd

Choose the device class properly; if you have SSDs, use "ssd" after --crush-device-class.
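If you want to double-check how the OSDs were classified, something like this should show it:

Code:
# list the device classes CRUSH knows about
ceph osd crush class ls
# the CLASS column should match what you passed to --crush-device-class
ceph osd tree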


Now it's time to edit the CRUSH map:

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt

Edit the crush.txt file and you will find a rule like this:


Code:
rule replicated_rule {
	id 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}


Change the line "step chooseleaf firstn 0 type host" to "step chooseleaf firstn 0 type osd".
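After the edit the rule should read (everything else stays unchanged):

Code:
rule replicated_rule {
	id 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type osd
	step emit
}

Then recompile and load the map: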


crushtool -c crush.txt -o crushnew.bin



ceph osd setcrushmap -i crushnew.bin
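To confirm the new map is active, you can dump the rule again:

Code:
# the chooseleaf step in the output should now use type "osd"
ceph osd crush rule dump replicated_rule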



Now create a pool with size 2:



pveceph pool create vm2 --size 2
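It doesn't hurt to confirm the pool picked up the intended settings:

Code:
# should report size: 2
ceph osd pool get vm2 size
# how many copies must be available for I/O to continue
ceph osd pool get vm2 min_size
# which CRUSH rule the pool uses
ceph osd pool get vm2 crush_rule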


Create a dummy file

dd if=/dev/zero of=test bs=1M count=4

Put the file into RADOS:


rados -p vm2 put obj1 test

Verify the data placement

ceph osd map vm2 obj1

osdmap e25 pool 'vm2' (2) object 'obj1' -> pg 2.6cf8deff (2.7f) -> up ([2,1], p2) acting ([2,1], p2)

The [2,1] shows the object has two copies, on osd.2 and osd.1, so you get RAID-1-style redundancy across OSDs even though both live on the same host.


Check the health status:

ceph -s

Code:
  cluster:
    id:     fad45a97-e141-49d0-8485-14389d69204c
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum pve1 (age 18m)
    mgr: pve1(active, since 18m)
    osd: 3 osds: 3 up (since 14m), 3 in (since 14m)

  data:
    pools:   1 pools, 128 pgs
    objects: 1 objects, 4 MiB
    usage:   3.0 GiB used, 297 GiB / 300 GiB avail
    pgs:     128 active+clean
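And once the additional nodes (and that 25Gb switch) are in place, the same procedure can be used in reverse to move the failure domain back to the host level, roughly:

Code:
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# change "step chooseleaf firstn 0 type osd" back to "type host", then:
crushtool -c crush.txt -o crushnew.bin
ceph osd setcrushmap -i crushnew.bin
# and raise the replica count again, for example:
ceph osd pool set vm2 size 3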

I hope it helps
 
