Live migration LXC

there is no working live migration for LXC, and therefore this feature is also not implemented in PVE. we regularly test with newer CRIU versions, but the current state is still far from usable (e.g., open network connections prevent container migration).
Has there been any progress on this?
LXD has live migration, but Proxmox still does not.
When is it tentatively planned to add LXC live migration?

We (still) do not plan to implement that. Please use VMs if you want live migration.
I read this and I'm upset...
 
When is it tentatively planned to add LXC live migration?

still not in the plans, the situation with CRIU hasn't changed that much.
 

Being able to live-migrate containers would make it an easier sell to some of our customers. What work would need to be done to make it happen?
 
I am not sure only counting things labelled as crash there is the relevant metric ;) every other open issue is either restore failing, or checkpoint failing, or something that is not yet supported but probably happens with real-world system containers.
 
  1. I have directly responded to the post: "bug, hang, crash"
  2. I agree, and I have tried to browse the issues but (for random sampling) found no relevant critical problem. Do you have any specific list of critical issues which are blocking the normal everyday use?
  3. If you have a list of critical issues, it may be useful to examine why they aren't labelled as bugs...
If the issues are minor or involve corner cases, then it might be feasible to hide the feature in an advanced settings page, with a warning and directions on how to report issues to the CRIU project. That would let real-world issues materialise and get reported; otherwise this will never be fixed.
And might the feature already work in the majority of cases?

Is there any relatively simple way to test? (Command line, etc.)
I really wonder when you last tested it and what issues you observed.

This is a relatively high demand issue.
 
literally 9 out of the first 10 open issues (from the last 2 months!) are about checkpoint or restore crashing or failing. it has (admittedly) been a while since I personally tested it, but the issue is not so much with the implementation, but the general concept - there is no abstraction layer sitting between the container and host that can easily serialize and deserialize the state of the guest, like with a VM (it's a hard enough feature for Qemu to support!). CRIU will always be limited in what it can support in the first place (which means - only specific kinds of processes / heavily restricted containers), and playing catch-up with new kernel features or changes that break it.
 
  • 2478 - open (kernel question)
  • 2477 - packaging problem
  • 2464 - apparmor config problem?
  • 2457 - worksforme
  • 2450 - amd gpu related, investigating
  • 2448 - some dbus corner case, but no comment on it
  • 2446 - unreproducible
  • 2443 - dupe 1649, basically that iowait cannot be checkpointed, but they cannot be stopped either, that's expected (closed)
  • 2433 - feature request
  • 2432 - I cannot judge, seems like gpu clash?
  • 2426 - nested containers (corner case) if I see it right
  • 2421 - lxc/cgroupd problem and seems to be fixed.
Bold is the only problem I see as a confirmed and valid bug. Out of the first 10. (11)

When I look at the lxc/criu wiki page, there are two specific known problems, FUSE and shared mounts, which are important but not an issue for most users.

Nevertheless, when I searched the net for lxc+criu problems I found surprisingly few unresolved ones, and many of them involved pretty atypical uses. 95% of my servers use only standard system calls and frameworks and seem to fall into the "working" set.

Still, I would test it in the wild if there were some simple way to do it, and if this helps you to progress. This is important for me as well as for many people out there, and I would really prefer not to have it ignored because "it seems problematic".

(qemu is extremely different, let's not compare and get into very unproductive debates.)

About serializing: a 500 ms clock skew (freeze-thaw delay) may be far better in many cases than a shutdown/startup.
 
yeah, you can try with criu, lxc-checkpoint or pct suspend (in order from low to high level, but pct suspend is basically just combining locking and calling lxc-checkpoint) and you will pretty much find out that any regular container is not working.. note how the example in the wiki is using a 10 year old ubuntu version (predating most nested namespacing and some namespace types), has no console, and there is basically nothing happening on the LXC side w.r.t. criu: https://github.com/lxc/lxc/commits/main/src/lxc/criu.c
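For anyone who wants to reproduce this, the three levels mentioned above could be tried roughly as follows. This is only a minimal sketch, not an official procedure: the container ID `100`, the image directory, and the helper names `checkpoint_commands` and `run` are assumptions for illustration, and a plain `criu dump` of a real container would most likely need further options. By default the script only prints the commands instead of running them.

```python
import shlex
import subprocess


def checkpoint_commands(vmid, init_pid=None, image_dir="/tmp/ct-checkpoint"):
    """Return candidate checkpoint commands, ordered from low to high level.

    vmid, init_pid and image_dir are placeholder values for illustration.
    """
    commands = []
    if init_pid is not None:
        # Lowest level: CRIU dump of the process tree rooted at the container's init.
        commands.append(["criu", "dump", "-t", str(init_pid), "-D", image_dir])
    # Middle level: LXC's wrapper around CRIU (-s stops the container after dumping).
    commands.append(["lxc-checkpoint", "-n", str(vmid), "-D", image_dir, "-s"])
    # Highest level: the PVE command, which adds locking around lxc-checkpoint.
    commands.append(["pct", "suspend", str(vmid)])
    return commands


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(command))
    if dry_run:
        return None
    return subprocess.run(command, capture_output=True, text=True).returncode


if __name__ == "__main__":
    # Dry run: just show what would be attempted for container 100.
    for cmd in checkpoint_commands(100):
        run(cmd)
```

Only one of these would be attempted at a time on a test container, with `dry_run=False`; the list form is just a convenient way to show the three levels side by side.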
 
@fabian what is the "example in the wiki" you are referring to? Do you have a link? I couldn't find anything in the Wiki...

I have tested `pct suspend` on a simple Debian container and indeed it fails. The reason seems to be that UTS namespace support is still not fully implemented. This is very surprising.

Anyway, assuming this finally gets fixed in CRIU, can we then do live migration "unofficially" by invoking `pct suspend` first and, assuming this does not fail, migrating to another node?

I am really, really bummed by the lack of container live migration. I used it successfully ten years ago in OpenVZ, so it didn't even cross my mind that Proxmox wouldn't support it a decade later. It's mainly my whole workspace in screen sessions which is lost with every offline migration.
 
the wiki page I was referring to is the one in @grin 's reply right above mine ;)

yes, if `pct suspend` (and `pct resume`!) ever work reliably across a range of container loads, implementing "live" migration on top of that would maybe be feasible.
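To get a feel for what "reliably across a range of container loads" could mean in practice, one could run a suspend/resume round trip over a set of test containers and tally the results. The following is a rough sketch under stated assumptions: the function names and the injectable `runner` are hypothetical, only the `pct suspend` and `pct resume` commands come from the discussion above, and an exit status of 0 says nothing about whether the workload inside the container actually survived.

```python
import subprocess


def default_runner(command):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def suspend_resume_ok(vmid, runner=default_runner):
    """Return True only if both `pct suspend` and `pct resume` succeed."""
    if not runner(["pct", "suspend", str(vmid)]):
        return False
    return runner(["pct", "resume", str(vmid)])


def survey(vmids, runner=default_runner):
    """Map each container ID to the result of one suspend/resume round trip."""
    return {vmid: suspend_resume_ok(vmid, runner) for vmid in vmids}


def success_rate(results):
    """Fraction of containers that passed the round trip (0.0 if none tested)."""
    return sum(results.values()) / len(results) if results else 0.0
```

Passing a fake `runner` makes it possible to exercise the tallying logic without touching any real container; on a real node this should only be pointed at disposable test containers, since a failed suspend may leave the container stopped.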
 
Not to protect the Proxmox people :D but while it is well known that OpenVZ was in many ways superior to cgroups, it has fallen seriously behind kernel development due to politics and various technical problems, and Proxmox is not really in a position to develop OpenVZ or fix cgroups. Do not be "surprised" that LXC can't do a lot of things OpenVZ did; they are very different systems. We have cgroups now, and we need a system developed by third parties to do the checkpointing and restore.

My usual problem is that for me (as a technical outsider) it is pretty hard to play messenger between the CRIU and Proxmox people: to get them to list the problems, assess them, and separate the things we can fix, the things we should fix, and the things we cannot fix because they are blocked by something else. That would give us a roadmap to a suspend that works in the majority of cases, which is a requirement for live migration.

I would, if I could, kindly ask the Proxmox people to start actively reporting problems to the CRIU people and to list on the wiki whether they get a yes, later, or no response. I hope they get annoyed enough by the periodic and often-repeated questions about this that, instead of dismissing them as noise, they try to get things moving. Proxmox is big enough now to make an impression on the CRIU project that this is important and worth investigating, isn't it?
 

I think the issue with that is that we (as developers) are not convinced that the approach by CRIU will ever bear fruit in a meaningful way for generic containers. by its very design it can only ever work with a ton of restrictions and footnotes and not for "arbitrary groups of processes doing whatever the kernel allows" (effectively, OpenVZ avoided a lot of this with a huge kernel patch set and baking these restrictions into their concept of containers, which was more restrictive than LXC is).

if you require live migration, qemu is just a way better fit since it comes with a builtin abstraction layer handling all these issues. and spending lots of time and effort in getting containers to support half of it with no real "exit strategy" for the second half seems like time and energy that should be better spent elsewhere.
 
