12€ Hyperconverged ProxMox 7 Cloud Cluster with 1TB storage

idoc · Aug 3, 2021

[Attachment: Hyperconvergent ProxMox Cloud Cluster (2).png (overview diagram)]

Hi community,

I have created a hyperconverged ProxMox Cloud Cluster as an experimental project.

Features:

using 1blu.de compute nodes: 1 Euro / month
using 1blu.de storage node 1 TB: 9 Euro / month
using a LAN based on Vodafone Cable Internet with IPv6 DS-Lite (VF NetBox)
the ProxMox cluster uses public internet for cluster communication
cluster can only run LXC containers (no VMX flag, no nested virtualization)
containers can move between compute nodes
containers can fail-over (restarting mode)
compute nodes share a Ceph device that fills all local storage left over on the ProxMox machines after the root filesystem was installed
Ceph is using the public internet; each compute node is a Ceph manager and runs a Ceph monitor for the OSDs
CephFS available alongside the RBD
containers can use Ceph RBD, local or nfs-nas (1 TB) for images
containers can use nfs-rclone mounted Google Drive (1 PB) for dumps, container templates, files, databases etc.
containers can join a software defined (SDN) VXLAN spanning across containers on arbitrary compute nodes and the WireGuard gateway to communicate securely with the LAN
containers can reach the public internet directly via their compute node (NAT)
LAN uses gl.inet Mango router as WireGuard gateway into the ProxMox cluster
LAN clients get their routes into the cluster from the DHCP server on the gl.inet Mango router

Steps I took:

prepare the compute nodes by booting the ProxMox 7 ISO, enter the public IPs, netmasks, hostnames, your email and so on, and select the whole disk for ProxMox
connect via VNC and add MAC address entries to /etc/network/interfaces, as IP-spoofing protection otherwise makes your ProxMox nodes unreachable (see the ProxMox upgrade notes) - after that, reboot and you can reach the ProxMox UI
enable IP forwarding on all compute nodes
install the ProxMox 7 repos, then update and upgrade
install Ceph (v16.2) on all compute nodes
create cluster
Code:
```
apt install fail2ban
```
on all compute nodes (optional)
remove pve/data from all compute nodes via shell with:
Code:
```
lvremove pve/data
```
create new data using:
Code:
```
lvcreate -n data -l100%FREE pve
```
this will be the compute node's OSD; with 3 compute nodes we get 3 OSDs
ProxMox no longer supports partitions as infrastructure for OSDs, so we have to force it to do so and cannot use the ProxMox UI to create our OSDs.
now the trickery begins: ProxMox does not let us create our OSDs the easy way, as we do NOT have a local network for our Ceph traffic. It is not recommended to use the public internet for Ceph traffic, BUT we have to, as we do not have a second NIC installed in our compute nodes. So we force Ceph to use the public internet by editing /etc/pve/ceph.conf
this is not recommended, as it creates vast amounts of traffic on our production network (which is also the public internet between our compute nodes). Remember, this is an experimental project!
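For illustration, the kind of edit meant here could look like the sketch below. It is not the configuration from this cluster: `public_network` and `cluster_network` are the standard Ceph options for the client-facing and replication networks, while the helper name `ceph_network_fragment` and the `203.0.113.0/24` range are placeholders.

```shell
# Print a [global] fragment that points both Ceph networks at the same
# (public) address range. The range passed in is a placeholder; substitute
# the range(s) that cover your nodes' public IPs.
ceph_network_fragment() {
    printf '[global]\n'
    printf '\tpublic_network = %s\n' "$1"
    printf '\tcluster_network = %s\n' "$1"
}

# Example with a documentation-only range (RFC 5737):
ceph_network_fragment 203.0.113.0/24
```

The printed lines would be merged by hand into the existing [global] section of /etc/pve/ceph.conf rather than appended blindly.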
Ceph complains about not finding ceph.keyring, so I ran
Code:
```
ceph auth get-or-create client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring
```
to create one on every compute node
force create a new Ceph OSD using our newly created pve/data partition
Code:
```
ceph-volume lvm create --data /dev/pve/data
```
with the 1€ VPS I used as compute nodes I got 80+ GB OSDs:
using the ProxMox UI at the DataCenter level I created a RADOS block device (RBD) using all compute nodes' OSDs

we just created the green Ceph RBD 'network' shown in the overview between 3 VPS, using the public internet as the Ceph traffic network.

using the UI you may watch Ceph juggling blocks between your compute nodes - not recommended for heavy use, as iowait will kill you. But it's fun to watch ...
--- we just created a Hyperconverged ProxMox Ceph Cluster in the cloud using 3 VPS ---
you can also create metadata servers on each compute node using the Ceph ProxMox UI on each node and furthermore create a CephFS alongside the RBD we just created (optional). The RBD is a thin-provisioned infrastructure you can use to deploy containers that automatically replicate across all included nodes at the block level. You can also create linked clones of containers on RBD that only use a minimal amount of storage for individual configurations and data and share the common blocks with the master container. All distributed across your 3-bucks cluster in the cloud.
set up the storage node (I chose the 1 TB variant for 9 €) with an NFS kernel server and rclone
on your storage node set up an rclone mount for your Google Drive (optional). I am using VFS caching and the compute nodes may use all of the 1 TB storage as Rclone cache - but you may restrict Rclone as you like. I can tell my Google Drive is lightning fast shared across all compute nodes.
Unfortunately I did not manage to have ISOs and container/VM disks on that mount - some filesystem features like truncating and random access to blocks in a file seem not to be provided by an Rclone mount.
I shared the rest of the 1 TB as an NFS export with all compute nodes (shown as nfs-nas). This can also be used for running containers with thin provisioning. Speed is reasonably fast. We have just created the violet SAN shown in the overview.
I have Rclone clean the VFS cache after 24 h. Here is the systemd script for rclone on the storage node:
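A minimal sketch of such a unit, not the original one: the remote name `gdrive:`, the mount point `/mnt/gdrive`, the binary paths and the helper name `rclone_mount_unit` are all assumptions. The rclone flags used (`--allow-other`, `--vfs-cache-mode full`, `--vfs-cache-max-age`) are regular `rclone mount` options; the 24h value matches the cache cleanup described above.

```shell
# Print a systemd unit for an rclone mount with a 24h VFS cache.
# $1 = rclone remote (placeholder, e.g. "gdrive:")
# $2 = mount point   (placeholder, e.g. /mnt/gdrive)
rclone_mount_unit() {
    cat <<EOF
[Unit]
Description=rclone mount of $1 on $2
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/rclone mount $1 $2 --allow-other --vfs-cache-mode full --vfs-cache-max-age 24h
ExecStop=/bin/fusermount -uz $2
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Print the unit; on a real storage node the output would go to a file such as
# /etc/systemd/system/rclone-gdrive.service (file name is an assumption).
rclone_mount_unit gdrive: /mnt/gdrive
```

To cap how much of the disk the cache may take, `--vfs-cache-max-size` could be added to the ExecStart line.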
You can watch Rclone filling and cleaning the cache using ProxMox UI:
--- we just created a shitty 12€ ProxMox Cluster with 1 TB shared storage and 1 PB secondary/optional storage ---

to be continued ...

idoc · Aug 4, 2021

The second part of this experiment focuses on linking the LAN to the ProxMox Cloud Cluster. I decided to go with WireGuard and got myself a gl.Inet Mango Router from Amazon (~20 €).

Steps I took:

as I use the Vodafone ConnectBox (or NetBox) I got for free from VF, I had no chance to implement WireGuard on the router itself - so I attached the Mango LAN interface to one of the LAN ports of the ConnectBox.
switched off WiFi on the Mango - it's not going to be used
configured a WireGuard client with the hostname of one of the compute nodes of the ProxMox Cloud Cluster as endpoint
created a Debian LXC on the endpoint host to function as WireGuard gateway into the software defined net of the ProxMox Cluster: every container joining the SDN will automatically join my home LAN from the cloud in a secure way. No LXC inside that SDN should have a link into the public internet, to keep security high (well ... most of the time, as we will see later). I used wireguard-install.sh from GitHub - it's a great project.
Code:
```
wget git.io/wireguard -O wireguard-install.sh && bash wireguard-install.sh
```
will get you to this menu

add client for the gl.Inet router:

allow traffic for the endpoint IP of the gl.Inet (see screenshot from gl.Inet) and traffic from my LAN (192.168.0.0/24) into the cluster's SDN.
The network 10.7.0.0/24 is solely used by WireGuard - we don't care...
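On the gateway side, one way to express "allow traffic from my LAN" is the peer's AllowedIPs list, which WireGuard uses to filter inbound packets and which wg-quick also turns into routes via the tunnel device. The sketch below only prints such a peer stanza; the tunnel address `10.7.0.2/32` for the gl.Inet client, the key placeholder and the helper name `wg_peer_stanza` are assumptions, not values from this setup.

```shell
# Print a [Peer] stanza for the gateway's WireGuard config.
# $1 = client's public key (placeholder here)
# $2 = client's tunnel address, $3 = LAN subnet behind the client
wg_peer_stanza() {
    printf '[Peer]\n'
    printf 'PublicKey = %s\n' "$1"
    printf 'AllowedIPs = %s, %s\n' "$2" "$3"
}

# LAN subnet taken from the post; tunnel address is an assumed example.
wg_peer_stanza '<public key of the gl.Inet client>' 10.7.0.2/32 192.168.0.0/24
```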
on the compute node running the Debian WireGuard gateway I enabled NAT for the WireGuard tunnel to reach the Debian LXC WireGuard gateway from the ProxMox host

Note: The WireGuard Debian gateway is not highly available in this experiment. I thought of creating 3 client configurations on the gl.Inet, one for each compute node of the Cloud Cluster, so that the WireGuard gateway could move around the cluster and one of the three gl.Inet WireGuard client configs would always be able to reach it, regardless of which compute node it runs on (as the endpoint addresses are different static public IPs).
To have the gl.Inet always open the tunnel, I make use of the fact that the VPS seem to have a static IPv4 address, which avoids obstacles regarding IPv6 DS-Lite on the VF NetBox. The gl.Inet also tries to open all WireGuard tunnels automatically after reboot, which comes in handy here.
I configured 192.168.0.10 as the static LAN IP of the gl.Inet, as this IP will be the gateway for LAN clients to reach the cloud cluster.

connected via SSH to the gl.Inet, I can ping through the WireGuard tunnel into the ProxMox Cluster
--- we just created the blue connection tunnel between the WireGuard Debian in the cloud and the gl.Inet inside the LAN as shown in the overview ---
at DataCenter level I created a VXLAN Zone for the SDN to span all compute nodes

The MTU must be reduced by 50 bytes - see the ProxMox online documentation. Be sure to install the requirements for SDN on all compute nodes (see the ProxMox online documentation). I rebooted all nodes and got the SDN menu entry inside the DataCenter UI of ProxMox.
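The 50 bytes are the VXLAN encapsulation overhead on an IPv4 underlay (inner Ethernet header 14 + VXLAN 8 + UDP 8 + outer IPv4 20). A tiny sketch of the arithmetic, assuming the usual underlay MTU of 1500; the helper name `vxlan_mtu` is made up for this example:

```shell
# MTU to configure inside the VXLAN zone = underlay MTU minus 50 bytes
# of VXLAN-over-IPv4 encapsulation overhead.
vxlan_mtu() {
    echo $(( $1 - 50 ))
}

vxlan_mtu 1500   # prints 1450
```

If the underlay MTU of the VPS network is smaller than 1500, the zone MTU shrinks accordingly.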
inside the VXLAN Zone, which I called WireGuard nodes (wgnodes), I created a VNET:

and used a random tag value
I did not create any subnets inside the wgnet VNET, as this is only an experimental project - but sure enough there can be lots of subnets communicating with the LAN and with each other inside the SDN

to be continued ...

idoc · Aug 4, 2021

I created Debian test machines on the other nodes called sdntest01 and sdntest02. I chose 10.10.20.0/24 as the subnet for all LXC joining the SDN

as you can see sdntest01 is able to see sdntest02 across the compute nodes.
to enable the SDN to send traffic into the WireGuard tunnel I had the WireGuard gateway join the SDN. The subnet for LXC that are allowed to reach the public internet was defined as 10.10.10.0/24. I decided to give the WireGuard gateway two interfaces on two bridges and have the kernel handle the IP forwarding for me, avoiding NAT:

the wgnet bridge is provided by our SDN VXLAN and vmbr1 I configured on every compute node like this:
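A sketch of what such a vmbr1 stanza in /etc/network/interfaces could look like, modelled on the usual ProxMox masquerading setup rather than copied from this cluster. The bridge address `10.10.10.1/24` and the helper name `vmbr1_stanza` are assumptions; the 10.10.10.0/24 subnet and the NAT out via vmbr0 follow the description in this post.

```shell
# Print an /etc/network/interfaces stanza for an internal bridge whose
# subnet is masqueraded out through vmbr0. The address is an assumed example.
vmbr1_stanza() {
    cat <<'EOF'
auto vmbr1
iface vmbr1 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0
        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j MASQUERADE
EOF
}

vmbr1_stanza
```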

to make clear: the SDN is a VXLAN and has no exit node like a BGP border node - it is closed. vmbr1 provides NAT into the internet via vmbr0, which each compute node uses as its connection to the world.
So the WireGuard gateway communicates via eth0 (bridge: vmbr1) with the gl.Inet via the public internet and communicates with the LXC inside the cloud based SDN via eth1 (bridge: wgnet).
Now: if I want to install software on one of the LXC inside the SDN, I temporarily add a NIC connected to vmbr1, do my installing and remove the NIC again to lock the LXC back into the SDN.
SDN-enabled LXC do not need any extra routing, as they use 10.10.20.1 as their default gateway, which is eth1 on the WireGuard gateway that just joined the SDN. The WireGuard gateway itself routes traffic towards the LAN into the tunnel device wg0:
now my workstation inside the LAN (has 192.168.0.235 from DHCP) can ping sdntest01 through the gl.Inet WireGuard tunnel, through the Debian WireGuard gateway and through the SDN VXLAN on another compute node, and vice versa (see the notes below on providing the routing into the ProxMox Cluster for LAN clients):
--- we just created the whole blue SDN and the WireGuard tunnel as shown in the overview with nearly pure routing & bridging ---

to be continued ...

idoc · Aug 4, 2021

for my LAN clients (computers, IoT devices and so on) to reach the LXC servers inside the ProxMox Cloud Cluster SDN, they all need a route into 10.10.20.0/24. Alas, the VF NetBox does not provide any means to create static routes. So I decided to advertise my routing via DHCP
I switched OFF the DHCP on the Vodafone box and started the DHCP server on the gl.Inet to serve my LAN with IP addresses, DNS servers and (!) routing:

On the Advanced Setup page I installed LuCI and went to the Network/Interfaces tab
I configured DNS servers to provide to the clients:
and, more importantly, I configured DHCP options for the routing:

I configured the default gateway (option 3) as the Vodafone NetBox and the gateway into the ProxMox cluster SDN as the address of the gl.Inet (option 121). OpenWrt lists all available options for you if you ask Google.
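For reference, option 121 (classless static routes) takes pairs of destination and gateway. The sketch below builds the value in the comma-separated form that dnsmasq and OpenWrt's `dhcp_option` list accept. The helper name `dhcp_option_121` and the UCI section name `lan` are assumptions; 10.10.20.0/24 and 192.168.0.10 are the SDN subnet and gl.Inet address from this post.

```shell
# Build a dhcp_option value for DHCP option 121.
# Arguments come in pairs: destination CIDR, then gateway address.
dhcp_option_121() {
    out=121
    while [ "$#" -ge 2 ]; do
        out="$out,$1,$2"
        shift 2
    done
    printf '%s\n' "$out"
}

dhcp_option_121 10.10.20.0/24 192.168.0.10
# prints 121,10.10.20.0/24,192.168.0.10

# On the router this could be applied with (not run here):
#   uci add_list dhcp.lan.dhcp_option="$(dhcp_option_121 10.10.20.0/24 192.168.0.10)"
#   uci commit dhcp && /etc/init.d/dnsmasq restart
```

Per RFC 3442, clients that honor option 121 ignore the router from option 3, so it can be worth passing the default route as an extra pair as well, for example `0.0.0.0/0` with the NetBox address as gateway.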

Now every LAN client gets the routing towards the LXCs waiting in subnet 10.10.20.0/24 secured inside the VXLAN of the 12€ Hyperconverged ProxMox Cloud Cluster with 20€ WireGuard connector.

This ends my experimental journey for now. I think I will experiment with an LXC ioBroker inside the ProxMox cloud SDN to drive IoT devices inside my LAN, or something else - let's see...

I hope someone finds my details interesting. Would be happy to hear from your experiments and stay healthy ...

SINOS · Aug 4, 2021

Not sure if I'm shocked or amazed. Impressive experiment nonetheless!

tuxillo · Oct 26, 2022

What are those blue.de compute nodes for 1eur a month?

bzb-rs · Oct 26, 2022

Appreciate the time taken to post this. Will give some insight on how a different model of setup can be achieved.

bobmc · Oct 26, 2022

Looks like you've done a lot of work and thanks for sharing this with the community.

tuxillo · Oct 27, 2022

bzb-rs said:
Appreciate the time taken to post this. Will give some insight on how a different model of setup can be achieved.

Yeah, I'm doing something similar right now, I might share my setup too!

bzb-rs · Oct 27, 2022

tuxillo said:
Yeah, I'm doing something similar right now, I might share my setup too!

Then I will mostly wait before I run to my lab to test these. Thanks!!

sha256shah · Oct 27, 2022

tuxillo said:
What are those blue.de compute nodes for 1eur a month?

Finally found it here, but it is only €1 for first 6 months out of the 12 months that you have to sign up for:

https://www.1blu.de/webhosting/managedhosting/

pille99 · Oct 28, 2022

tuxillo said:
Yeah, I'm doing something similar right now, I might share my setup too!

This setup is not recommended for production services or servers. It's just a playground.
