Ceph30 also has a separate 1G interface for the cluster.
As I wrote before, all SSD drives are identical. Each server has 2 SSDs; please look at the partition tables for these drives:
The first one, with the system, journals, and OSD:
Disk /dev/sda: 199GB
Sector size (logical/physical)...
Yes, but network device usage is at most 30%.
This is a replicated (replica 3) pool with a cache tier and journals on SSD. All SSD drives are INTEL SSDSC2BX200G4.
How can I check this?
With a 4MB block size, local storage gets 69MB/s read and 20MB/s write on SATA.
Tested from a live CD on my laptop, using fio and this config:
invalidate=0 # mandatory
write: io=2048.0MB, bw=7717.5KB/s, iops=1929, runt=271742msec
So it looks...
There is no tuning on Proxmox; it was upgraded from 4.0 last week (but 4.0 showed the same symptoms).
Fio 2.1.11 on local storage (SATA) from the hypervisor:
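As a rough sanity check of the fio write summary quoted above, the reported numbers can be cross-checked against each other. This is only a sketch: the four values are copied from that line, the helper names are made up for illustration, and it assumes fio's KB and MB here are 1024-based units.

```python
# Values copied from the fio summary line quoted in the post.
IO_MB = 2048.0       # io=2048.0MB
BW_KB_S = 7717.5     # bw=7717.5KB/s
IOPS = 1929          # iops=1929
RUNT_MS = 271742     # runt=271742msec


def implied_block_kb(bw_kb_s: float, iops: float) -> float:
    """Average request size in KB implied by bandwidth and IOPS."""
    return bw_kb_s / iops


def implied_runtime_ms(io_mb: float, bw_kb_s: float) -> float:
    """Runtime in ms implied by total I/O and bandwidth (assumes 1 MB = 1024 KB)."""
    return io_mb * 1024 / bw_kb_s * 1000


if __name__ == "__main__":
    print("implied block size (KB):", round(implied_block_kb(BW_KB_S, IOPS), 3))
    print("implied runtime (ms):", round(implied_runtime_ms(IO_MB, BW_KB_S)))
    print("reported runtime (ms):", RUNT_MS)
```

If the implied block size comes out close to 4 KB and the implied runtime close to the reported one, the summary line is internally consistent and the low bandwidth simply reflects small-block IOPS rather than a reporting error.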
bgwriter: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
queryA: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap...
In our environment I see a performance issue; maybe someone can help me find where the problem is.
We have 6 servers on PVE4.4 with ca. 200 VMs (Windows and Linux). All VM disks (rbd) are stored on a separate Ceph cluster (10 servers, 20 SSD OSDs as cache tier and 48 HDD OSDs).
Thank you for this info.
On Sunday I moved the dev cluster to VLAN 2000, but this did not help. After reading these links, I enabled the IGMP L2 general querier, and now there is quorum on each cluster.
Unfortunately omping is not working (both before and after the L2 querier change), I suppose, it should...
Today I shut down all Proxmox servers, then started them one by one. Each server joins the cluster and works, but only for 10 minutes, and then quorum is lost. IGMP snooping is now enabled globally, and the dev cluster is switched off.
Corosync.log is attached.
Probably IGMP snooping is disabled. I'm looking at this now.
The dev cluster was recreated via pvecm. The clusters have unique names ('backup' and 'c01').
Should I restart one-by-one starting from first node?
/etc/pve/cluster.conf from 3.4 cluster
I have a similar issue. We have 2 PVE clusters, dev and production. The clusters are connected to the same switches, in different IP networks, but without VLANs.
Yesterday I upgraded the dev cluster from PVE 3.4 to 4.1. The procedure from the PVE wiki finished without problems, but after a few minutes dev cluster...