Proxmox Ceph - what happens when you lose a journal disk?

Q-wulf

I'm currently using a test machine with 14 HDDs and 2 SSDs plus OS disks.
Each SSD acts as a journal for 7 HDDs.

What happens when a journal disk fails? Is the data still available on the OSDs? Are the OSDs still readable? Still writeable?

This is especially important as I am creating my erasure-coded pools.


Do I need to take journals into account? (Both options are sketched below.)

Yes
  • K=5, M=8
  • either 8 OSDs can fail, or 1 journal (= 7 OSDs) + 1 OSD.
  • 160% overhead (M/K = 8/5).
No
  • K=11, M=2
  • only 2 OSDs can fail.
  • 18% overhead (M/K = 2/11).
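
A minimal sketch of the two candidates as EC profiles (profile/pool names and PG counts are made up; on Hammer-era Ceph the failure-domain key is ruleset-failure-domain, on newer releases it is crush-failure-domain):

Code:
# "Yes" option: survives one whole journal SSD (7 OSDs) plus one more OSD, overhead M/K = 8/5 = 160%
ceph osd erasure-code-profile set ec-k5-m8 k=5 m=8 ruleset-failure-domain=osd
ceph osd pool create ecpool-k5-m8 128 128 erasure ec-k5-m8

# "No" option: only 2 OSDs may fail, overhead M/K = 2/11 = ~18%
ceph osd erasure-code-profile set ec-k11-m2 k=11 m=2 ruleset-failure-domain=osd
ceph osd pool create ecpool-k11-m2 128 128 erasure ec-k11-m2

# verify what was created
ceph osd erasure-code-profile get ec-k5-m8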
Anyone know?
 
Hi,
if you lose a journal disk, you also lose all OSDs which have their journals on that SSD.
This is the reason why you should use a DC SSD like the Intel DC S3700.
With a replaced SSD you must recreate the journal to use the OSD again.
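A rough sketch of that recreate step for a planned swap with filestore OSDs (osd.7 and the partition path are placeholders; if the old SSD died uncleanly the flush is not possible and the OSDs usually have to be rebuilt instead):

Code:
ceph osd set noout                      # keep CRUSH from rebalancing while the OSD is down
service ceph stop osd.7                 # systemctl stop ceph-osd@7 on systemd setups
ceph-osd -i 7 --flush-journal           # only works while the old journal is still readable
ln -sf /dev/disk/by-partuuid/<new-journal-partition> /var/lib/ceph/osd/ceph-7/journal
ceph-osd -i 7 --mkjournal               # create a fresh journal on the replacement SSD
service ceph start osd.7
ceph osd unset noout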
In your second config you are unable to reboot one node without data interruption.

Udo

PS: EC pools are slow and (for Proxmox) only work together with an SSD cache tier! I use them for archive data only.
 
That's what I figured: lose a journal, lose the disks. I did not know you could rebuild the journals, though.

I understand. I need to get some more SSDs as journal disks so I do not waste too much on overhead. My work cluster uses 4 OSDs per SSD and is actually a 4-node cluster (20 journal SSDs, 80 HDDs).
But this one is a tinkering 1-node cluster to get myself a lot more familiar with Ceph, as I feel a bit uncomfortable right now.


On my tinker machine I am currently considering placing my fast HDDs on SSD journals (4) and using the leftover SSD as a cache tier, but I feel that would probably be like tilting at windmills.

Is there a way to influence which OSDs receive first writes, kind of like "primary-affinity" does for reads?
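
For reference, the read-side knob mentioned above looks roughly like this (osd.3 and the weight are placeholders; older clusters may additionally need "mon osd allow primary affinity = true" in ceph.conf). Whether a write-side equivalent exists is exactly the open question:

Code:
# lower the chance that osd.3 is chosen as primary (valid range 0.0 - 1.0)
ceph osd primary-affinity osd.3 0.5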
 
