Should I move from KVM to LXC?

silverstone

Well-Known Member
Apr 28, 2018
I have a LOT of VMs running on Proxmox VE across several servers.

Most of the VMs run Debian GNU/Linux Bookworm.

Overall the KVM VMs seem quite inefficient, especially in terms of:
- RAM usage (a bit less of an issue on Xeon E3 v5/v6, since up to 64 GB RAM can be used there)
- Disk space (each VM takes 16 GB as a minimum, and the average is more in the 16-32 GB range) ... so a handful of VMs already require 1000 GB as a minimum nowadays

CPU usage isn't really an issue; that's usually NOT *that* high.

But I'm wondering whether the switch is worth it or whether, since I already plan to do that anyway, I should rather move most apps to Podman (Docker) containers instead, like I am doing with newly deployed services.

Of course, Podman/Docker doesn't allow for much system configuration, since containers are destroyed/recreated each time. Most configuration via bind mounts works fine for most apps though.
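To illustrate the bind-mount approach (the image, port, and host path here are placeholders): a Podman Quadlet unit keeps the container disposable while all state lives on the host:

```ini
# /etc/containers/systemd/syncthing.container (Quadlet, see podman-systemd.unit(5))
[Unit]
Description=Syncthing in a throwaway container, state bind-mounted from the host

[Container]
Image=docker.io/syncthing/syncthing:latest
# All persistent state lives in the bind mount; the container can be
# destroyed and recreated at will without losing configuration.
Volume=/srv/syncthing:/var/syncthing:Z
PublishPort=8384:8384

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
```

After a `systemctl daemon-reload`, Quadlet generates `syncthing.service`, which starts and stops like any other unit.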

I'd be a bit more wary for stuff like a BIND (DNS) server; there I'd probably still want full hands-on control. I tried to set it up in LXC on one server and it worked fine there, at least.

What are your thoughts? Most of the legacy stuff that I set up YEARS ago is still KVM-based, and it seems like it's eating a LOT of RAM/disk resources for very little use.
 
I get the general sense that most Proxmox users prefer VMs. And I get it. They are easy to set up and provide a very familiar interface. They also have pretty obvious security properties. But as you correctly point out, all of this convenience comes at a significant cost.

Long before discovering Proxmox, I had used LXC in exactly the way that you describe. And when I switched to Proxmox, I stuck with this approach. Once you wrap your head around it, it works really well. I personally think it's the sweet spot between VMs and Docker. But you probably need to spend a little time adjusting your workflow and setting up suitable automations. I only use VMs for things that can't reasonably be done with an LXC container. And since LXCs are so lightweight, I have lots of them. They are like microservices, only, unlike with Docker, I get a full and persistent operating system, which makes administration easier for me.

Of course, your personal preference could very well vary. I understand why people like Podman/Docker. It makes different trade-offs that don't quite work well for me, but that are attractive for some use cases.
 
Before you make any changes, it may be worthwhile to understand whether the RAM usage issue is specific to the workloads you are running or whether it is caching being done by ZFS. Also, I run a number of VMs with as little as 8 GB of disk space. LXC containers are fine, but they do bring their own set of challenges. Normally you would run an LXC as an unprivileged container for security purposes, which can make it more difficult to mount shares into it, etc. And since the LXC uses the host kernel itself, you lose a bit of isolation. Finally, LXC containers are NOT recommended for running Docker. I think we need a bit more info before you can get really good advice.
 
Just to give OP all the information available.

I agree that unprivileged containers are the way to go. I actually find that makes sharing filesystems easier than with VMs, as you can directly bind mount. But that comes with its own set of pros and cons. It makes it impossible to migrate between nodes. If migration is necessary, you need to set up proper file sharing just the same as with VMs. And if using direct bind mounts, you need to do a bit of planning for how you want to deal with user and group IDs. It's not rocket science, but it's part of the overall design process.
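As a sketch of that planning (container ID and path are hypothetical): in Proxmox a bind mount is an `mpN` line in the container's config, and an unprivileged container shifts container uid/gid N to host uid/gid 100000+N by default, so host-side ownership has to account for the offset:

```
# /etc/pve/lxc/101.conf (hypothetical unprivileged container)
unprivileged: 1
mp0: /rpool/share/media,mp=/mnt/media
```

With the default idmap, files written by uid 1000 inside the container show up as uid 101000 on the host, so something like `chown -R 101000:101000 /rpool/share/media` (or a custom `lxc.idmap`) is part of the design work mentioned above.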

As for Docker, it is possible to install Docker or similar technology in LXC. I have done this with Podman, and it works satisfactorily. But I agree that it isn't a perfect fit, and it took some effort to smooth over a few rough edges. A VM would probably be considerably easier. Or maybe future versions of Proxmox will follow Incus' lead and gain native Docker support. That would be the ideal scenario.
 
Before you make any changes, it may be worthwhile to understand whether the RAM usage issue is specific to the workloads you are running or whether it is caching being done by ZFS. Also, I run a number of VMs with as little as 8 GB of disk space. LXC containers are fine, but they do bring their own set of challenges. Normally you would run an LXC as an unprivileged container for security purposes, which can make it more difficult to mount shares into it, etc. And since the LXC uses the host kernel itself, you lose a bit of isolation. Finally, LXC containers are NOT recommended for running Docker. I think we need a bit more info before you can get really good advice.
Well, for Podman (Docker) I usually set up a KVM virtual machine for that purpose. Initially Debian, now slowly migrating to Fedora because of its more recent Podman support.

Many services are not even "deployed"; rather, they're some kind of "work in progress" that hasn't progressed for several months/years :(.

Besides losing lots of disk space (and RAM as well), I also lose lots of time keeping things up to date.

Most of the existing services could/might/should be migrated to Podman containers.
Examples of such services include (assuming it is possible to run a WireGuard VPN + NFS client inside LXC, of course):
- Pi-hole (already running in LXC)
- MediaWiki
- Syncthing
- Seafile
- BIND nameserver
- Syslog server
- Nextcloud (maybe; it also tends to break quite easily between upgrades)

Now ... some services are definitely more challenging than others, so I don't think these will ever be migrated, simply because of how they tend to break between upgrades and are almost unfixable without full access:
- GitLab

And some stuff definitely cannot be migrated to LXC or Podman:
- Virtualized desktops (possibly including PCIe passthrough of a GPU)
- Systems that require full isolation for security purposes
- Systems that might create their own chroot environments (e.g. package builder machines)
- Non-Linux kernels (OPNsense, FreeBSD, ...)
 
- Virtualized desktops (possibly including PCIe passthrough of a GPU)

If you don't need any passthrough, then this actually works absolutely beautifully in an LXC container. Been doing this for years. I use Chrome Remote Desktop to connect to the GUI, but I am sure you could use different remoting protocols if you preferred.

On the other hand, I also have a VM that emulates ARM hardware, and that's something you can't do as a container. It's crazy slow, unfortunately, but every few months I need a virtualized Raspberry Pi for some testing, and conveniently that's something you can do in Proxmox.

- Systems that might create their own chroot environments (e.g. package builder machines)

I don't often need chroot environments these days, as I can simply spin up another container for that. But I don't believe there is any reason why you couldn't use chroot in a container.

As for spinning up additional containers on demand, I have all sorts of automations that make heavy use of implementation details in Proxmox. Maybe that'll come back to bite me at some point, but it's just so incredibly convenient that all my LXC containers live on ZFS. Snapshots are such a versatile and powerful tool.
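For reference, the snapshot side of this doesn't need any Proxmox internals; stock `pct` subcommands cover it (vmid 101 and the snapshot name are placeholders):

```shell
pct snapshot 101 pre_upgrade      # take a snapshot (a ZFS snapshot on ZFS-backed storage)
pct listsnapshot 101              # list existing snapshots
pct rollback 101 pre_upgrade      # roll the container back if an upgrade misbehaves
pct delsnapshot 101 pre_upgrade   # clean up once the change has proven itself
```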
 
Just to give OP all the information available.
I concur, so here are some more points:
  • you cannot live-migrate an LX(C) container; it is stopped and started again (live migration is a work in progress). This is the main reason people use KVM in clusters.
  • the best LX(C) experience is with ZFS (and BTRFS is on par nowadays), due to no TRIM being necessary and transparent compression
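For context, the ZFS side of that is just a `zfspool` entry in the storage configuration (the pool and storage names below are the common defaults, adjust to taste):

```
# /etc/pve/storage.cfg -- ZFS-backed storage for container root disks
zfspool: local-zfs
        pool rpool/data
        content rootdir,images
        sparse 1
```

Transparent compression is a dataset property, e.g. `zfs set compression=zstd rpool/data` (lz4 is the usual default).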

Overall the KVM VMs seem quite inefficient, especially in terms of:
- RAM usage (a bit less of an issue on Xeon E3 v5/v6, since up to 64 GB RAM can be used there)
- Disk space (each VM takes 16 GB as a minimum, and the average is more in the 16-32 GB range) ... so a handful of VMs already require 1000 GB as a minimum nowadays
I don't know what you're hosting, but I run a lot of bookworm machines with only 256 MB RAM and 4 GB of storage. If RAM is a problem, use ballooning.
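For example (VM ID and sizes are placeholders), ballooning is two lines in the VM config; the guest just needs the `virtio_balloon` driver, which stock Debian kernels ship:

```
# /etc/pve/qemu-server/100.conf (hypothetical VM)
memory: 1024
balloon: 256
```

The VM can use up to 1 GB, but under host memory pressure it is ballooned down toward 256 MB. The same can be set from the CLI with `qm set 100 --memory 1024 --balloon 256`.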


Besides losing lots of disk space (and RAM as well), I also lose lots of time keeping things up to date.
I'm not sure which platform you meant (LXC, KVM, Docker?), but for every platform there are auto-update tools that work well (as long as the updates themselves work).
 
I concur, so here are some more points:
  • you cannot live-migrate an LX(C) container; it is stopped and started again (live migration is a work in progress). This is the main reason people use KVM in clusters.
  • the best LX(C) experience is with ZFS (and BTRFS is on par nowadays), due to no TRIM being necessary and transparent compression
Not a factor for the first point. I always use ZFS on the root rpool (and sometimes I also have a separate zdata storage pool, again on ZFS).


I don't know what you're hosting, but I run a lot of bookworm machines with only 256 MB RAM and 4 GB of storage. If RAM is a problem, use ballooning.
Lucky you :). Heck, even the Nextcloud VM takes 3+ GB of RAM. Same with MediaWiki at 3+ GB. Also, the test Seafile VM that's not doing anything is taking 3+ GB. Let's not even talk about GitLab at 8+ GB RAM or the virtualized NAS at 8-16 GB RAM.

Is this a case of "VMs will use as much RAM as you throw at them for buffering/caching"? Probably so ... The OOM killer sometimes triggers on this host ...

The Syncthing/Nextcloud/Seafile/MediaWiki VMs take 18 GB on disk (output of zfs list), even more "inside the VM" so to speak (once ZFS compression has been taken out of the equation).


And when you say "I run a lot of bookworm machines", do you mean LXC or VMs? I'd tend to agree if it was LXC ...

I run mostly Debian Bookworm VMs nowadays ...


I'm not sure which platform you meant (LXC, KVM, Docker?), but for every platform there are auto-update tools that work well (as long as the updates themselves work).
Sure; using Debian/Ubuntu, I have turned on unattended-upgrades on most stuff now, even though by default that only performs security updates, not "regular updates" (which is the recommended way). On the few Fedora (for Podman) VMs, dnf-automatic also works well.
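To illustrate the "security-only by default" point: on Debian, which repositories unattended-upgrades pulls from is controlled by the `Origins-Pattern` list in `/etc/apt/apt.conf.d/50unattended-upgrades`; widening it beyond the security origins opts into regular updates too (exact defaults vary by release, so check your own file):

```
// /etc/apt/apt.conf.d/50unattended-upgrades (abridged sketch)
Unattended-Upgrade::Origins-Pattern {
        "origin=Debian,codename=${distro_codename},label=Debian-Security";
        "origin=Debian,codename=${distro_codename}-security,label=Debian-Security";
        // Uncomment to also pull point-release and stable-updates packages:
        // "origin=Debian,codename=${distro_codename},label=Debian";
        // "origin=Debian,codename=${distro_codename}-updates";
};
```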
 
Of course, Podman/Docker doesn't allow for much system configuration, since containers are destroyed/recreated each time. Most configuration via bind mounts works fine for most apps though.
If your concern is conserving resources (RAM, disk space, etc.), then moving everything to Docker containers is definitely the way to go. I don't use Podman; I am a Docker/Portainer fan myself. Either way, I use two Debian VMs to run all my Docker containers: one in a trusted VLAN for internal-use-only applications, and one in an untrusted VLAN for anything facing the internet (WordPress, cloudflared, Nextcloud, etc.). I have found there is not a use case (for me, anyway) that I haven't been able to fully migrate from a VM to Docker.

However, I run a separate NFS server outside of Proxmox that I mount all my volumes to. I also run my firewall/router (pfSense in my case) on a separate bare-metal machine. I had originally virtualized pfSense and it worked fine, but my wife was going to kill me if I kept bringing the network down by rebooting my server for one reason or another. So my network AND my storage are independent of my Proxmox servers. (I have three servers, but not in a cluster: one for my "production" workloads, one for my development/testing work, which I turn off at night, and one for running Ansible and serving as a backup destination.)

Personally, I wouldn't want the learning curve of switching from Debian to Fedora, but that's just me. Also, don't bother with Kubernetes if you are trying to conserve resources. I run a K3s Kubernetes cluster on my test machine, and it uses a lot more memory than just running containers in Docker. Personally, I think LXC containers are fine as well, but I find them more difficult to work with in some ways than Docker containers. I have successfully run Nextcloud, WordPress and a few other applications in LXC containers from TurnKey Linux.
 
And when you say "I run a lot of bookworm machines", do you mean LXC or VMs? I'd tend to agree if it was LXC ...
I run mostly Debian Bookworm VMs nowadays ...
VMs. And of course I also have "monsters"; our GitLab and its runners are huge in comparison. I don't run LX(C) containers in our clusters due to the non-live-migratability.
 
