Yeah, that means you can cope with one host going dark on you.
In my mind that is smarter than using OSD as the failure domain (only interesting when you have loads of OSDs per node).
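For reference, the relevant bit of a decompiled crush map looks roughly like this when host is the failure domain (rule name and root are the stock defaults here, not taken from any specific map):

[CODE]
# excerpt of a decompiled crush map - host as the failure domain
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        # "type host" = no two copies of a PG end up on the same host;
        # "type osd" would only guarantee they end up on different OSDs
        step chooseleaf firstn 0 type host
        step emit
}
[/CODE]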
Yes .. and no ..
According to the Intel specs on the P3600 you are looking at
0.020-0.030 ms (milliseconds)...
I've done 6 Proxmox ZFS setups with LARA for separate customers on the side in March 2017 via Hetzner. You go into your Robot, then to Support; at the bottom it lists the "LARA" (IP console) option. I typically choose "plug in whenever" + 2 hours needed. Enough time to set up Proxmox twice...
Then these numbers make more sense. (You are doing replication 2 on the host failure domain, right? Not on the OSD failure domain?)
3/4 of your nodes use the same SSD for OSD + journal. I am guessing you are getting about 150 MB/s-ish write performance out of these Crucial MX300's.
the NVME...
That is with 4 nodes and 10G for the Ceph client network + 10G for the Ceph cluster network, right?
How many journal SSDs (SM863 240GB) and disks (which type?) do you have configured per node?
What pool replication size did you end up using ?
It all boils down to this in essence:
The only way to be sure is to constantly measure how much data is written (read: "run software that graphs the SMART values of all your flash devices").
Write amplification exists. Its effects are worse on low-end SSDs because the mitigation tools are...
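If you want a quick and dirty starting point before you set up proper graphing, something along these lines works (sketch only; the SMART attribute name and the 512-byte LBA size are assumptions that differ per vendor, so check them against your drives):

[CODE]
#!/usr/bin/env python3
# Rough sketch: report how many TB have been written to each flash device.
# Assumes smartmontools is installed and that the drive exposes
# "Total_LBAs_Written" in 512-byte units - verify both for your SSD model.
import subprocess

DEVICES = ["/dev/sda", "/dev/sdb"]   # adjust to your journal/OSD SSDs
LBA_SIZE = 512                       # bytes per LBA, vendor dependent

def tb_written(device):
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "Total_LBAs_Written" in line:
            return int(line.split()[-1]) * LBA_SIZE / 1000**4  # decimal TB, like TBW ratings
    return None

for dev in DEVICES:
    tb = tb_written(dev)
    print(f"{dev}: {tb:.2f} TB written" if tb is not None
          else f"{dev}: attribute not found")
[/CODE]

Feed that into whatever graphing you already run and compare the slope against the drive's TBW rating.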
Are you using OVH's vRack solution?
If so, check this thread from about a year ago; it might be helpful:
http://pve.proxmox.com/pipermail/pve-user/2016-April/010251.html
Disclaimer: I have only ever used unicast over OpenVPN with OVH (personal projects).
Here is a primer on SSDs for journals, just in case (scroll to the bottom):
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
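The gist of that test: the journal hammers the device with small synchronous writes, so that is what you benchmark. His dd/fio method is the reference; if you just want a rough ballpark without installing anything, here is a crude Python approximation (it writes a throwaway file on the SSD's filesystem, not the raw device, so treat the result as indicative only):

[CODE]
#!/usr/bin/env python3
# Crude approximation of the "is this SSD journal-worthy" test:
# time 4k writes issued with O_DSYNC, the pattern a filestore journal produces.
import os
import time

TEST_FILE = "/mnt/ssd-under-test/journal-test.bin"  # path on the SSD, adjust
BLOCK = b"\0" * 4096
COUNT = 10000

fd = os.open(TEST_FILE, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
start = time.time()
for _ in range(COUNT):
    os.write(fd, BLOCK)
os.close(fd)
elapsed = time.time() - start
os.unlink(TEST_FILE)

print(f"{COUNT} x 4k O_DSYNC writes in {elapsed:.1f}s "
      f"-> {COUNT * 4096 / elapsed / 1024**2:.1f} MB/s, "
      f"{elapsed / COUNT * 1000:.2f} ms per write")
[/CODE]

A proper datacenter SSD shrugs this off; a consumer drive without power-loss protection usually falls off a cliff here, which is exactly what the linked post is about.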
That is one of your bottlenecks right there.
If you use one SSD as a journal and OSD and another only as an OSD, and their weights are similar, then statistically they are equally likely to be used as the primary OSD.
This "multipurpose SSD" handles the following writes:
4x Journal Writes for the 4x...
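To put rough numbers on that write load (pure back-of-the-envelope; the per-OSD rate and the exact journal layout are assumptions, plug in your own figures):

[CODE]
# Hypothetical load on one "multipurpose" SSD that journals for 4 OSDs and is
# also an OSD itself with its journal colocated. Placeholder numbers, not measurements.
osd_write_rate = 25      # MB/s of client data landing on each OSD (assumed)
osds_journaled = 4       # OSDs whose journals live on this SSD

journal_traffic = osds_journaled * osd_write_rate   # every OSD write hits its journal first
own_osd_traffic = 2 * osd_write_rate                # its own OSD data + its own colocated journal

print(f"journal writes for the other OSDs:  {journal_traffic} MB/s")
print(f"own OSD data + own journal:         {own_osd_traffic} MB/s")
print(f"total hitting the multipurpose SSD: {journal_traffic + own_osd_traffic} MB/s")
# A dedicated-OSD SSD in the same pool only sees ~2x osd_write_rate for the
# same client load (1x if its journal lives on another device).
[/CODE]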
If your Postgres DB is using 10 GB of space and you use 3 nodes with a single dedicated SSD for the task and replication set to 3, then you are looking at 30 GB of data residing on those SSDs in total (10 GB each). In reality you probably want to size this pool larger. You probably want...
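The sizing arithmetic, spelled out (the 70% fill target is just an assumed rule of thumb, not a hard number):

[CODE]
# Replicated pool sizing for the Postgres example above.
payload_gb = 10           # data set size
replication_size = 3      # copies kept across the cluster
target_fill = 0.7         # assumed comfort level; never plan to run flash near 100% full

raw_gb = payload_gb * replication_size       # 30 GB total, ~10 GB per SSD
with_headroom = raw_gb / target_fill         # ~43 GB once you leave room for growth/recovery
print(raw_gb, round(with_headroom, 1))
[/CODE]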
Just to reiterate on the SSDs and the data-written values: have a look at this post where someone ran the numbers:
SSD €/TBW
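That metric is simply purchase price divided by the drive's rated write endurance (TBW). If you want to run your own numbers, something like this does it (figures are placeholders, not quotes; look up the real TBW rating and street price per model):

[CODE]
# Euros per terabyte of rated write endurance - placeholder figures only.
drives = {
    "consumer SSD (example)":   {"price_eur": 100, "tbw_rating": 220},
    "datacenter SSD (example)": {"price_eur": 250, "tbw_rating": 1500},
}

for name, d in drives.items():
    print(f"{name}: {d['price_eur'] / d['tbw_rating']:.3f} EUR per TB written")
[/CODE]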
Please have a look at how journals work in Ceph:
https://www.sebastien-han.fr/blog/2014/02/17/ceph-io-patterns-the-bad/
And also keep this in mind:
If you have a...
Just saw this.
What you do is order a LARA at Hetzner on the support page (read up on LARA on the Hetzner wiki first, especially the part about telling them to mount a specific ISO on a USB stick). It is basically a remote console, which streams your VGA output to a web page and forwards USB keyboard + mouse.
That...
Okay, let me recap this based on the information you provided:
You have 10 servers
each has a 1G link for the public network
each has a 1G link for the cluster network (exception: node Ceph30, which shares it with the public network)
each server acts as a MON (10 MONs total)
you have split spinners from SSDs...
Can you post the following, please?
Ceph server specs (CPU, RAM, networking)
Ceph config (including networking)
Do you use the 20 cache-tier SSDs as pure cache or also as a journal?
Ceph crush map.
Preferably encapsulated in code/quote bb-code.
You can put a pool as a cache on top of any other pool...
That should actually work no problem.
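Roughly along these lines, following the standard cache-tiering workflow (pool names are placeholders, and the Python wrapper is just for illustration; the three ceph commands are what matters):

[CODE]
# Minimal sketch: put "ssd-cache" in front of "slow-pool" as a writeback cache tier.
# Pool names are placeholders - both pools must already exist.
import subprocess

COLD, HOT = "slow-pool", "ssd-cache"

for cmd in (
    ["ceph", "osd", "tier", "add", COLD, HOT],
    ["ceph", "osd", "tier", "cache-mode", HOT, "writeback"],
    ["ceph", "osd", "tier", "set-overlay", COLD, HOT],
):
    subprocess.run(cmd, check=True)
# Don't forget sensible hit_set / target_max_bytes settings on the cache pool afterwards.
[/CODE]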
AFAIK you should be able to just LACP the 10G links in balance-tcp mode and assign them to the OVS bridge vmbr0. Add OVS IntPorts for the following networks (rough config sketch after the list):
Ceph_Public (jumbo frames are your friend here)
Ceph_Cluster (jumbo frames are your friend here)...
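For reference, a minimal /etc/network/interfaces sketch of that layout using the ifupdown-style OVS config (interface names, addresses and MTU are assumptions, adjust to your hardware):

[CODE]
auto vmbr0
allow-ovs vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 ceph_public ceph_cluster

# LACP bond of the two 10G ports (names assumed)
allow-vmbr0 bond0
iface bond0 inet manual
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_bonds eth2 eth3
        ovs_options bond_mode=balance-tcp lacp=active
        mtu 9000

# one OVSIntPort per network, jumbo frames on (addresses are placeholders)
allow-vmbr0 ceph_public
iface ceph_public inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        address 10.10.10.11
        netmask 255.255.255.0
        mtu 9000

allow-vmbr0 ceph_cluster
iface ceph_cluster inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        address 10.10.11.11
        netmask 255.255.255.0
        mtu 9000
[/CODE]

Same pattern for the VM/management networks; tag VLANs on the IntPorts if you need to keep them separated on the bond.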
Yeah, you are right. I was looking at it from my POV, where we have higher link speeds and way more nodes and OSDs per node. In that situation it equalizes before it turns into a disadvantage.
With 3 nodes and only 4 OSDs on a suspected 1G line, a 2/3 setup is faster, since his 1G would be the...
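To put numbers on that (illustrative only, assuming replica traffic has to squeeze through a single 1 Gbit link, and ignoring protocol overhead and journal effects):

[CODE]
# Rough ceiling on sustained client writes imposed by replication traffic on 1 Gbit.
link_mb_s = 1000 / 8            # ~125 MB/s best case on 1 Gbit

for size in (2, 3):
    extra_copies = size - 1     # the primary OSD ships this many copies to its peers
    print(f"size={size}: client writes capped around "
          f"{link_mb_s / extra_copies:.0f} MB/s by replication alone")
[/CODE]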
Question:
Is the IO delay only present during the restore?
Your screenshot suggests that to me.
If so, there seems to be an issue: according to your screenshot the IO delay lasted roughly 50 minutes and the backup was 50 GB. That suggests a restore rate of about 17 MB/s. Unless you're using...
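The arithmetic behind that estimate, for reference:

[CODE]
# 50 GB restored while the IO delay lasted ~50 minutes
backup_gb = 50
duration_min = 50
rate_mb_s = backup_gb * 1024 / (duration_min * 60)
print(f"~{rate_mb_s:.0f} MB/s")   # ~17 MB/s
[/CODE]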
Let's break down what Ceph is designed for:
the ability to take the loss of one OSD (or multiple)
the ability to take the loss of one node (or multiple)
the ability to take the loss of one rack (or multiple)
... all the way up to one campus/region/datacenter ...
All you have to do is plan for...
Just a side note (because I feel this might have gotten lost while reading the Open vSwitch link I provided):
You use Open vSwitch inside Proxmox instead of the native Linux bridging.
You do not need to run Open vSwitch on dedicated hardware (i.e. as a hardware switch replacement).
Basically what...
While you can do it manually, you should not.
It should not add more safety (as Ceph is designed to lose any component of the cluster without encountering data loss).
Can you post your hardware details, so we can make more educated guesses on why you might think a "software RAID journal" is...