Ceph monitor writing a lot of data?

SanderM

Member
Oct 21, 2016
I have Proxmox installed on a PCIe NVMe SSD. This SSD is not an enterprise-class drive, so it doesn't have a high endurance (TBW) rating.

Now, my Proxmox + Ceph setup has been running for only 4 days, and when I run "iostat -k" I can see that about 280 GB has already been written to my PCIe SSD. That is not good for this kind of SSD...
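To put the 280 GB in 4 days into perspective, here is a rough sketch of the arithmetic. The 150 TBW endurance rating is a hypothetical placeholder, not the rating of the actual drive, and decimal units (1 TB = 1000 GB) are assumed.

```shell
#!/bin/sh
# Rough SSD wear estimate.
# Arguments: GB written so far, days of uptime, rated endurance in TB written (TBW).
wear_estimate() {
    awk -v gb="$1" -v days="$2" -v tbw="$3" 'BEGIN {
        per_day = gb / days                  # average GB written per day
        life_days = (tbw * 1000) / per_day   # days until the TBW rating is reached
        printf "%.0f GB/day, about %.1f years to reach %s TBW\n", per_day, life_days / 365, tbw
    }'
}

# 280 GB in 4 days is from the post; 150 TBW is a made-up example rating.
wear_estimate 280 4 150
```

Substituting the real TBW figure from the drive's datasheet gives an estimate for the actual SSD, assuming the write rate stays constant.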

So after reading and searching online I read that Ceph monitors write to /var/lib/ceph/mon/ceph-0/*.
I can see that it's constantly writing there, so I think this is going to be a problem, because it will wear out my PCIe SSD pretty fast. The idea was to use this SSD only for Proxmox, which shouldn't be doing a lot of writing at all...

So, my question: should I (somehow) make the Ceph monitors write to an enterprise SSD instead? I can add a partition to one of the journal SSDs and have the data written there. Those are enterprise SSDs with a very high endurance (TBW) rating.

Please advise.

Also: if I need to change the location, can I do that without interrupting Ceph? I'm already running many production VMs on this setup...
 
Stop the monitor, copy the mon data to the new partition, mount this partition at /var/lib/ceph/mon/ceph-0/, then start the mon.
 
Okay. So, in all my systems I have identical disk configurations like this:

PCIe SSD (consumer, for boot/Proxmox)
3x 960 GB SSD (enterprise, for pool1)
2x 900 GB SAS + 1x 240 GB SSD (enterprise, for pool2)
1x 480 GB SSD (enterprise, for pool3)

The 240 GB SSD is used as a journal for both SAS disks and only holds 2x 15 GB journal partitions.

So, I guess it's better to create a 3rd partition on that SSD and put the mon data on it like this:

- stop the monitor
- move data from /var/lib/ceph/mon/ceph-0 to temp location
- mount enterprise SSD partition to /var/lib/ceph/mon/ceph-0/ (and adjust /etc/fstab)
- move data back to /var/lib/ceph/mon/ceph-0
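As shell commands, the steps above might look roughly like the sketch below. Several things here are assumptions and not from this thread: the systemd unit name ceph-mon@0, the partition /dev/sdX3 (a placeholder for an already created and formatted partition), the ext4 filesystem, and the temp directory /root/mon-backup. It copies instead of moving, so the original data stays in place underneath the mount point, and it ends by starting the monitor again. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
MON_DIR=/var/lib/ceph/mon/ceph-0
MON_UNIT=ceph-mon@0          # assumed unit name for monitor id 0
NEW_PART=/dev/sdX3           # hypothetical partition on the enterprise SSD
FS_TYPE=ext4                 # assumed filesystem on that partition
TMP_DIR=/root/mon-backup     # hypothetical temp location
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

move_mon() {
    run systemctl stop "$MON_UNIT"          # stop the monitor
    run mkdir -p "$TMP_DIR"
    run cp -a "$MON_DIR/." "$TMP_DIR/"      # copy mon data to the temp location
    run mount "$NEW_PART" "$MON_DIR"        # mount the new partition over the mon dir
    run cp -a "$TMP_DIR/." "$MON_DIR/"      # copy mon data onto the new partition
    run systemctl start "$MON_UNIT"         # start the monitor again
}

# Line to add to /etc/fstab by hand so the mount survives a reboot.
fstab_line() {
    echo "$NEW_PART $MON_DIR $FS_TYPE defaults,noatime 0 2"
}

move_mon
fstab_line
```

Only one monitor is down at a time with this approach, so the cluster should keep running as long as the remaining monitors still form a quorum.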

Does this sound okay?

Or is it a very bad idea to put the mon data on the same SSD used for journal for some reason? Do I need to buy an additional enterprise SSD for the mon data?

Is it normal for the monitors to write so much data? In the last 8 hours they wrote about 15 GB, I think. At least, comparing the "iostat -k" output with the output from 8 hours earlier, about 15 GB has been written to the SSD.
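Instead of comparing "iostat -k" outputs by hand, the same counter can be read from /proc/diskstats and turned into a rate. This is a minimal sketch: the function names are made up for illustration, and nvme0n1 in the usage comment is a hypothetical device name.

```shell
#!/bin/sh
# kB written to a block device since boot.
# In /proc/diskstats, column 3 is the device name and column 10 is the
# number of 512-byte sectors written, so kB = sectors / 2.
# Optional second argument: a diskstats-format file to read instead.
kb_written() {
    awk -v dev="$1" '$3 == dev { printf "%.0f\n", $10 / 2 }' "${2:-/proc/diskstats}"
}

# Average write rate in kB/s between two readings taken "seconds" apart.
# Arguments: first reading (kB), second reading (kB), seconds between them.
write_rate() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (b - a) / s }'
}

# Usage (device name is a placeholder):
#   before=$(kb_written nvme0n1); sleep 60; after=$(kb_written nvme0n1)
#   write_rate "$before" "$after" 60
```

This measures all writes to the device, not only those from the monitor, which is the same limitation the iostat comparison has.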
 
- stop the monitor
- move data from /var/lib/ceph/mon/ceph-0 to temp location
- mount enterprise SSD partition to /var/lib/ceph/mon/ceph-0/ (and adjust /etc/fstab)
- move data back to /var/lib/ceph/mon/ceph-0

Does this sound okay?
Yes, no problem.

Or is it a very bad idea to put the mon data on the same SSD used for journal for some reason? Do I need to buy an additional enterprise SSD for the mon data?
I think it's ok.

Is it normal for the monitors to write so much data? In the last 8 hours they wrote about 15 GB, I think. At least, comparing the "iostat -k" output with the output from 8 hours earlier, about 15 GB has been written to the SSD.

My ceph-mons write at a constant rate of around 500 kB/s, so around 2 GB per hour. That seems to be about the same as yours.
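A quick check of that comparison, as a sketch assuming decimal units (1 GB = 1,000,000 kB); the function names are only for illustration:

```shell
#!/bin/sh
# Convert a constant rate in kB/s to GB per hour.
kbps_to_gb_per_hour() {
    awk -v r="$1" 'BEGIN { printf "%.2f\n", r * 3600 / 1000000 }'
}

# Convert "GB written over N hours" to an average rate in kB/s.
gb_over_hours_to_kbps() {
    awk -v gb="$1" -v h="$2" 'BEGIN { printf "%.1f\n", gb * 1000000 / (h * 3600) }'
}

kbps_to_gb_per_hour 500      # prints 1.80 (GB per hour)
gb_over_hours_to_kbps 15 8   # prints 520.8 (kB/s)
```

So 500 kB/s is about 1.8 GB per hour, and 15 GB in 8 hours is about 520 kB/s, which puts both setups in the same range.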


Personally, I'm using small Intel S3710 200 GB SSDs for OS + mon.