[SOLVED] Create a virtual cluster in lxc's?

lifeboy

Renowned Member
Hi all,

Is it possible to create a virtual Proxmox cluster in LXC instances? I'm planning to create a test cluster to experiment with Terraform, so if I can do that in 3 Linux containers (one node in each), it would use the fewest resources. Of course I can use full QEMU/KVM guests, but if I don't have to, I don't want to :)
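Something like this is roughly what I have in mind (a sketch only; the image alias is an assumption, and I'm guessing security.nesting will be needed for the PVE services):

Bash:
# three unprivileged Debian containers, one per "node"
for i in 1 2 3; do
    lxc launch images:debian/11 vpmx$i -c security.nesting=true
done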
 
no, PVE can't really be installed in a container..
 
I actually did install it, and everything seems to be installable (using the installation instructions for Proxmox on Debian Bullseye), but corosync doesn't run despite using a separate VLAN for the LXCs... I was hoping that there is a way around whatever the problem is.
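For reference, the install itself was roughly the standard "Install Proxmox VE on Debian Bullseye" procedure (sketch from memory, repository line and package names as per the wiki, so treat it as approximate):

Bash:
# no-subscription repository and signing key
echo "deb [arch=amd64] http://download.proxmox.com/debian/pve bullseye pve-no-subscription" \
    > /etc/apt/sources.list.d/pve-install-repo.list
wget https://enterprise.proxmox.com/debian/proxmox-release-bullseye.gpg \
    -O /etc/apt/trusted.gpg.d/proxmox-release-bullseye.gpg
# update and install the PVE packages (no kernel needed inside a container)
apt update && apt full-upgrade
apt install proxmox-ve postfix open-iscsi

This is what corosync then reports: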


Code:
* corosync.service - Corosync Cluster Engine
     Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-08-01 14:46:51 UTC; 20h ago
       Docs: man:corosync
             man:corosync.conf
             man:corosync_overview
   Main PID: 596 (code=exited, status=15)
        CPU: 16ms

Aug 01 14:46:51 vpmx1 corosync[596]:   [TOTEM ] Initializing transport (Kronosnet).
Aug 01 14:46:51 vpmx1 corosync[596]:   [TOTEM ] knet_handle_new failed: File name too long (36)
Aug 01 14:46:51 vpmx1 corosync[596]:   [KNET  ] transport: Failed to set socket buffer via force option 33: Operation not permitted
Aug 01 14:46:51 vpmx1 corosync[596]:   [KNET  ] transport: Unable to set local socketpair receive buffer: File name too long
Aug 01 14:46:51 vpmx1 corosync[596]:   [KNET  ] handle: Unable to initialize internal dstsockpair: File name too long
Aug 01 14:46:51 vpmx1 corosync[596]:   [MAIN  ] Can't initialize TOTEM layer
Aug 01 14:46:51 vpmx1 corosync[596]:   [MAIN  ] Corosync Cluster Engine exiting with status 15 at main.c:1608.
Aug 01 14:46:51 vpmx1 systemd[1]: corosync.service: Main process exited, code=exited, status=15/n/a
Aug 01 14:46:51 vpmx1 systemd[1]: corosync.service: Failed with result 'exit-code'.
Aug 01 14:46:51 vpmx1 systemd[1]: Failed to start Corosync Cluster Engine.
 
you'll need to make tons of workarounds that leave the resulting "PVE" nowhere close to an actual one (which makes it unsuitable for a test environment).
 
Yes, indeed, I'm doing that at the moment. I prefer to use LXCs whenever possible, which is why I gave it a shot...
On a single node I also do that, but in a cluster LX(C) containers offer no "cluster benefits": I need live migration, and containers do not offer that.
 
containers need to be restarted as part of the migration; VMs do not.
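On the CLI that difference shows up as a restart-mode migration for containers, e.g. (VMID and node name are just examples):

Bash:
# the CT is stopped on the source node and started again on the target
pct migrate 101 node2 --restart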
 
Running a proxmox cluster inside lxd containers is actually rather neat. You can pass through the kvm device from the host:

Bash:
lxc config device add proxmox1 kvm unix-char source=/dev/kvm

Then VMs inside proxmox run at full speed, without the overhead of nested virtualization. Furthermore, live migration of running VMs between cluster nodes "just works" (including local storage migration).
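For example, an online migration that also moves local disks is just (VMID and target node are placeholders):

Bash:
qm migrate 100 proxmox2 --online --with-local-disks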

However, you do need to set the containers to privileged mode because of the corosync problem mentioned earlier in this thread. That's a shame, because Ubuntu have fixed it in their corosync package: see LP#1918735. EDIT: the fix was upstreamed, but it's hidden behind a flag, allow_knet_handle_fallback. You can get corosync to start by adding

Code:
system {
  allow_knet_handle_fallback: yes
}

to /etc/corosync/corosync.conf. That file then gets overwritten by PVE's cluster config sync, but you can fix that by copying the (edited) file to /etc/pve/corosync.conf. With this setting, a proxmox cluster node works inside an unprivileged lxd container, yay!
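In practice that works out to something like this (a sketch; the usual guidance when touching /etc/pve/corosync.conf is to bump config_version in the file first):

Bash:
# add the system { allow_knet_handle_fallback: yes } block locally
nano /etc/corosync/corosync.conf
# publish it cluster-wide via pmxcfs so it isn't lost when the local copy is rewritten
# (increment config_version in the edited file before copying)
cp /etc/corosync/corosync.conf /etc/pve/corosync.conf
systemctl restart corosync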

Unfortunately you can't run Proxmox CTs (containers) inside Proxmox running in an lxd container, because Proxmox needs to manipulate loopback devices on the host to set up the container's filesystem.

And whilst you can run Ceph mons and managers, you can't run Ceph OSDs (even in a privileged container): Ceph wants to directly manage LVM on the host (*), and that's too risky to allow.

Bash:
# On the host
lvcreate --name ceph1a --size 4G /dev/vg0
lxc config device add proxmox1 a unix-block \
  source=/dev/mapper/vg0-ceph1a path=/dev/sda

# Inside the container
pveceph osd create /dev/sda
...
Use of uninitialized value in hash element at /usr/share/perl5/PVE/Diskmanage.pm line 455, <DATA> line 960.
  /dev/mapper/control: open failed: Operation not permitted
  Failure to communicate with kernel device-mapper driver.
  Check that device-mapper is available in the kernel.
  Incompatible libdevmapper 1.02.185 (2022-05-18) and kernel driver (unknown version).
  /dev/mapper/control: open failed: Operation not permitted
  Failure to communicate with kernel device-mapper driver.
  Check that device-mapper is available in the kernel.
  Incompatible libdevmapper 1.02.185 (2022-05-18) and kernel driver (unknown version).
unable to get device info for '/dev/sda'

What you can do instead is use lxd to run full-fat VMs for additional Proxmox nodes. These nodes can be used for Proxmox CTs, Ceph OSDs, and anything else which needs direct access to block devices (e.g. ZFS replication). This works fine.
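A minimal sketch of that (the image alias, resource limits, and LV path below are my assumptions):

Bash:
# launch an LXD VM instead of a container for the "storage" node
lxc launch images:debian/11 proxmox4 --vm -c limits.cpu=2 -c limits.memory=4GiB
# hand it a raw block device that a Ceph OSD can own
lxc config device add proxmox4 osd1 disk source=/dev/mapper/vg0-ceph4a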

(*) It appears that BlueStore OSDs configure LVM inside the block device. Here is proxmox running inside a VM where I gave it 4 additional block devices:

Code:
root@proxmox6:~# pvs
  PV         VG                                        Fmt  Attr PSize  PFree
  /dev/sdb   ceph-b558e89a-bd13-496b-ab85-7df08e9e9a9b lvm2 a--  <4.00g    0
  /dev/sdc   ceph-3388bbde-cd81-485f-b994-51fc82af43ab lvm2 a--  <4.00g    0
  /dev/sdd   ceph-1307d475-eda5-4797-bcc2-6d013a3e4a6c lvm2 a--  <4.00g    0
  /dev/sde   ceph-34e41bf4-9f08-4e3a-a55a-a7116e671f5d lvm2 a--  <4.00g    0
root@proxmox6:~# vgs
  VG                                        #PV #LV #SN Attr   VSize  VFree
  ceph-1307d475-eda5-4797-bcc2-6d013a3e4a6c   1   1   0 wz--n- <4.00g    0
  ceph-3388bbde-cd81-485f-b994-51fc82af43ab   1   1   0 wz--n- <4.00g    0
  ceph-34e41bf4-9f08-4e3a-a55a-a7116e671f5d   1   1   0 wz--n- <4.00g    0
  ceph-b558e89a-bd13-496b-ab85-7df08e9e9a9b   1   1   0 wz--n- <4.00g    0
root@proxmox6:~# lvs
  LV                                             VG                                        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-0842c3a9-413d-4c8a-a5b7-9cf1a3bf0aa1 ceph-1307d475-eda5-4797-bcc2-6d013a3e4a6c -wi-ao---- <4.00g
  osd-block-78259d5f-f136-48f3-b1a5-054fbd19668a ceph-3388bbde-cd81-485f-b994-51fc82af43ab -wi-ao---- <4.00g
  osd-block-4990e664-e086-4741-b03d-0a18997fa9b9 ceph-34e41bf4-9f08-4e3a-a55a-a7116e671f5d -wi-ao---- <4.00g
  osd-block-67a9fe12-3a75-4d01-8e72-769a8af9d14a ceph-b558e89a-bd13-496b-ab85-7df08e9e9a9b -wi-ao---- <4.00g
 
@lifeboy @candlerb I'm about to get another thick client and was thinking about setting up a Proxmox cluster in LXD, so this thread is really helpful. I can imagine a situation where you have a bunch of computers that generally sit idle but could probably support other cluster-type services like Proxmox, Docker swarms, or even Kasm agents.
 
