Hi. We have the servers listed below; they cannot be changed, as we are using MaaS and they are configured this way. I realize this is not optimal, but I would prefer to spend the time figuring out what can work with this hardware layout and how we can leverage what we have.
I am not new to Proxmox; we previously used servers 4 and 5 as FreeNAS boxes running ZFS and ran a standard cluster. I would now like to rebuild the lot as a hyper-converged cluster, and I'm looking for the best architecture and formula (I do not need a migration plan; the hardware is bare and clean). We will run between 60 and 120 VMs, a mix of web apps and time series databases. We do not have a lot of peaks and valleys in performance, but we do tend to have a steady, medium volume of data coming in (writes), with peak reads when people access dashboards.
All of the servers have dual 10G NICs bonded, carrying 3 VLANs: public, cluster/storage, and network traffic. The OS is installed on the first disk in each machine (~100G), installed with maxvz=0. I converted all free space on disk 1 to Ceph and now have 14 OSDs, one OSD per disk (rough sketches of the network config and the OSD creation are included below). What follows is the output of the system as it stands, which I'm sure is wrong since it's all defaults. I provide this not because I think it is the right way to do it, but because it gives you an idea of how not to do it, judging by the degraded state.
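For reference, each node's /etc/network/interfaces looks roughly like the sketch below; interface names, the bond mode, VLAN IDs and addresses are placeholders rather than our exact values:

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        # the 2x 10G bond (assuming LACP here)

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        # VLAN-aware bridge on top of the bond; the "network traffic" VLAN is tagged per guest NIC

auto vmbr0.10
iface vmbr0.10 inet static
        address 192.0.2.11/24
        gateway 192.0.2.1
        # public VLAN

auto vmbr0.20
iface vmbr0.20 inet static
        address 10.20.0.11/24
        # cluster/storage VLAN (corosync plus the Ceph network)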
Thanks for all the help in advance. It's my first time with Ceph and hyper-converged clusters.
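For completeness, the OSDs were created roughly along the lines below; device and LV names are just examples, not the exact ones used. Each spare data disk became a whole-disk OSD, and the space left free on disk 1 by maxvz=0 was wrapped in an LV and handed to ceph-volume:

root@pve1:~# pveceph osd create /dev/sdb                  # repeated for each spare data disk
root@pve1:~# lvcreate -l 100%FREE -n ceph-osd-local pve   # LV from the free space on the OS disk
root@pve1:~# ceph-volume lvm create --data pve/ceph-osd-local

And this is the state the cluster ended up in: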
root@pve1:~# ceph -s
  cluster:
    id:     b4e4e110-677a-44b0-b904-4b5c25305212
    health: HEALTH_WARN
            Degraded data redundancy: 89/11523 objects degraded (0.772%), 3 pgs degraded, 3 pgs undersized

  services:
    mon: 4 daemons, quorum pve1,pve5,pve3,pve4 (age 15m)
    mgr: pve4(active, since 2h), standbys: pve5
    mds: 4 up:standby
    osd: 14 osds: 14 up (since 15m), 14 in (since 3h); 1 remapped pgs

  data:
    pools:   2 pools, 112 pgs
    objects: 3.84k objects, 15 GiB
    usage:   60 GiB used, 30 TiB / 30 TiB avail
    pgs:     89/11523 objects degraded (0.772%)
             37/11523 objects misplaced (0.321%)
             108 active+clean
             3   active+undersized+degraded
             1   active+clean+remapped
root@pve1:~# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 ssd 0.43619 1.00000 447 GiB 3.9 GiB 2.9 GiB 283 KiB 1024 MiB 443 GiB 0.88 4.50 20 up
1 ssd 0.33409 1.00000 342 GiB 3.5 GiB 2.5 GiB 221 KiB 1024 MiB 339 GiB 1.03 5.28 19 up
2 ssd 0.43619 1.00000 447 GiB 4.4 GiB 3.4 GiB 373 KiB 1024 MiB 442 GiB 0.98 5.02 24 up
3 ssd 0.33409 1.00000 342 GiB 3.6 GiB 2.6 GiB 321 KiB 1024 MiB 339 GiB 1.05 5.40 19 up
4 ssd 0.43619 1.00000 447 GiB 4.2 GiB 3.2 GiB 364 KiB 1024 MiB 442 GiB 0.95 4.86 22 up
5 ssd 0.33409 1.00000 342 GiB 2.5 GiB 1.5 GiB 157 KiB 1024 MiB 340 GiB 0.73 3.76 11 up
6 ssd 3.39059 1.00000 3.4 TiB 4.2 GiB 3.2 GiB 250 KiB 1024 MiB 3.4 TiB 0.12 0.62 24 up
7 ssd 3.49260 1.00000 3.5 TiB 4.7 GiB 3.7 GiB 267 KiB 1024 MiB 3.5 TiB 0.13 0.67 27 up
8 ssd 3.49260 1.00000 3.5 TiB 4.9 GiB 3.9 GiB 286 KiB 1024 MiB 3.5 TiB 0.14 0.70 28 up
9 ssd 3.49260 1.00000 3.5 TiB 5.2 GiB 4.2 GiB 513 KiB 1023 MiB 3.5 TiB 0.15 0.75 30 up
10 ssd 3.39059 1.00000 3.4 TiB 5.9 GiB 4.9 GiB 260 KiB 1024 MiB 3.4 TiB 0.17 0.87 37 up
11 ssd 3.49260 1.00000 3.5 TiB 5.1 GiB 4.1 GiB 530 KiB 1023 MiB 3.5 TiB 0.14 0.73 29 up
12 ssd 3.49260 1.00000 3.5 TiB 3.8 GiB 2.8 GiB 258 KiB 1024 MiB 3.5 TiB 0.11 0.55 20 up
13 ssd 3.49260 1.00000 3.5 TiB 4.2 GiB 3.2 GiB 264 KiB 1024 MiB 3.5 TiB 0.12 0.60 23 up
TOTAL 30 TiB 60 GiB 46 GiB 4.3 MiB 14 GiB 30 TiB 0.20
MIN/MAX VAR: 0.55/5.40 STDDEV: 0.49
Server_1:
CPU: dual silver
mem: 384G
disk1: 450G SSD
disk2: 450G SSD
Server_2:
CPU: dual silver
mem: 384G
disk1: 450G SSD
disk2: 450G SSD
Server_3:
CPU: dual silver
mem: 384G
disk1: 450G SSD
disk2: 450G SSD
Server_4:
CPU: E3-1270V6
mem: 32G
disk1: 3.4T SSD
disk2: 3.4T SSD
disk3: 3.4T SSD
disk4: 3.4T SSD
Server_5:
CPU: E3-1270V6
mem: 32G
disk1: 3.4T SSD
disk2: 3.4T SSD
disk3: 3.4T SSD
disk4: 3.4T SSD