Debian 13.1 LXC template fails to create/start (FIX)

I've just updated to the new packages from pve-no-subscription for PVE version 9. All my Debian 13.1 containers are working fine again after the reboot. I really appreciate the quick fix. It is by no means a matter of course to provide a fix for non-paying customers on a Sunday evening. Thanks again to the whole team behind Proxmox.
 
It is by no means a matter of course to provide a fix for non-paying customers on a Sunday evening.
You do realize that paying customers always get the fix later? I don't want to detract from the effort of the Proxmox team to release this fix over the weekend, but (they did have the same issue when Debian 12.1 was released) it's normal procedure that it goes out to no-subscription first. Why was <= 13 used in the first place instead of < 14, when Debian does not make breaking changes within a major release?
 
they did have the same issue when Debian 12.1 was released
Not that I can recall; back then we bumped the version soon enough. It was included with pve-container 4.4-5 for the previous PVE release, which got bumped on the 16th of June, and with pve-container 5.0.2 for the PVE 8 release (which was not yet out at that point), where the package also got bumped on the 16th of June. The first Debian Bookworm 12.1 point release OTOH only happened over a month later, on 2023-07-22.
Why was <= 13 used in the first place instead of < 14, when Debian does not make breaking changes within a major release?
It was simply a bug that went unnoticed, as mentioned in my initial reply here: the check was never ideal the way it was.
It probably simply went unnoticed because the upper boundary was set high enough before each point release, and it might well have already happened before (I really only checked Debian 12) and been "quick fixed" without much thought about the subtle implications of the way the check was written.
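To make the pitfall concrete, here is a stripped-down illustration of that kind of inclusive upper bound (simplified, not the actual pve-container code):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Simplified illustration, not the actual pve-container code: an inclusive
# upper bound on the full version accepts 13 and 13.0 but rejects 13.1.
sub old_style_check {
    my ($version) = @_;
    return $version <= 13;    # numeric compare: 13.0 <= 13, but 13.1 > 13
}

for my $v (qw(12.11 13 13.0 13.1)) {
    printf "%-6s -> %s\n", $v, old_style_check($v) ? 'accepted' : 'rejected';
}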
 
Not that I can recall; back then we bumped the version soon enough. It was included with pve-container 4.4-5 for the previous PVE release, which got bumped on the 16th of June, and with pve-container 5.0.2 for the PVE 8 release (which was not yet out at that point), where the package also got bumped on the 16th of June. The first Debian Bookworm 12.1 point release OTOH only happened over a month later, on 2023-07-22.
I did not look into the details, so I might have jumped to the wrong conclusion. In which case I'm sorry and would like to make it right (if you can tell me how).
It was simply a bug that went unnoticed, as mentioned in my initial reply here: the check was never ideal the way it was.
It probably simply went unnoticed because the upper boundary was set high enough before each point release, and it might well have already happened before (I really only checked Debian 12) and been "quick fixed" without much thought about the subtle implications of the way the check was written.
The <=13 looks wrong, just as the <=12 looked wrong. It was mentioned that it was changed to <15 (which includes the current Sid, while I would have expected <14). It still feels like this is an accident waiting to happen in another few years. Did the team add bumping the version to the next-release checklist? Or maybe an upper bound is not necessary (until it actually is)? You did elaborate on some points (for which I'm thankful) but I still got the impression that it's a quick fix for now but not a real change in procedures. Not that I'm an expert on these things and I could easily be wrong (so please correct me). I mean no harm, but I do like process/procedure improvements.

Either way, this is not a big issue and small mistakes happen every now and then (I have made quite a few myself at work), and I do appreciate you making the effort to get this fixed during the weekend. The product is great, and the way it is supported and how issues are handled is almost always great too. And I'm known to nitpick anyway...
 
Out of ignorance and curiosity (and as a Debian container user): what is actually going on here? Is Proxmox refusing to start containers that run a Debian version that is too high? Why?
 
The check as it was previously was indeed not ideal, so we not only fixed the issue itself but also changed the check such that it cannot happen in this form again: now PVE either supports a major Debian release or it doesn't, with no more breakage from e.g. going from .0 to .1 like here.
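In spirit, the reworked check goes in this direction (a rough sketch, not the literal diff; only the raised <15 upper bound is taken from what was discussed here):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative only, not the literal pve-container diff: derive the major
# release and compare just that, so 13, 13.0 and 13.1 are treated identically.
sub debian_major_supported {
    my ($version) = @_;
    my ($major) = $version =~ /^(\d+)/;   # "13.1" -> 13
    return 0 if !defined($major);
    return $major < 15;                   # upper bound raised to include Forky (14)
}

print "$_: ", (debian_major_supported($_) ? "supported" : "unsupported"), "\n"
    for qw(13.0 13.1 14 15);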
I wonder if there's an easy way for you to provide a mechanism for us users to be able to "force" boot an unsupported distro. Many command line tools have "--force" options. In this case we have to resort to editing a packaged file to change the test. I wonder if you could have a "conf.d" directory or some other easily extensible mechanism for us end users to be able to "force" boot a newer (or unknown) distro? Maybe a text field in the LXC Features dialog where we have to enter "forceboot=true" or something. Just thinking out loud here.
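Just to make the idea a bit more concrete, something along these lines, completely hypothetical of course; neither this file nor such an option exists in PVE today:
Code:
use strict;
use warnings;

# Completely hypothetical: a CT ID listed in the (made-up) override file
# would skip the version check for that container.
sub version_check_forced_off {
    my ($vmid) = @_;
    my $override = '/etc/pve/local/lxc-version-check-override.conf';   # made-up path
    open(my $fh, '<', $override) or return 0;   # no file -> no override
    while (my $line = <$fh>) {
        return 1 if $line =~ /^\s*\Q$vmid\E\s*$/;
    }
    return 0;
}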
 
The <=13 looks wrong, just as the <=12 looked wrong.
Yes, after the fact it certainly does and quickly stands out; these things often do, but nobody pointed that out (or patched it) until now, which is what I meant by saying it was simply a bug. It can go unnoticed for a long time, but once you see it you might wonder why it was overlooked, what others there are in the vicinity, and how one can avoid the whole class of bugs instead of just squashing the single one, but those all require different orders of magnitude of work.
It was mentioned that it was changed to <15 (which includes the current Sid, while I would have expected <14).
Not that it's very relevant, but while I get what you mean, that's not entirely true. As Debian Sid is forever the unstable development version and not an actual release, it normally does not have any version number associated with it. Debian 14 is currently "testing", i.e. what will become Forky, and Sid is the current development version, which means it's not Forky+1 but rather also basically what will become Forky, as its packages continuously migrate into the testing repository. And we (and some users) often test with newer Debian releases already; that's why I expanded the upper range to include Forky already now.
Did the team add bumping the version to the next-release checklist?
We normally have newer releases included and also test them, and while we did the latter for PVE 9 too, the former obviously wasn't done; with the updated check, that would have been noticed.
Or maybe an upper bound is not necessary (until it actually is)?
In theory, incompatibilities in new releases could also cause some fallout; that said, it's a bit harder to imagine something that would have actually persistent effects in practice, especially nowadays where the Linux distro landscape is much more homogeneous, mostly due to systemd. So, for existing CTs it will almost always be better to avoid a hard error here, as hinted in the original reply. For new CTs it might be good to check more explicitly, but to be honest, it would be quite strange if one could set up a CT with version X, upgrade it to X+1 and have it work, while setting up a CT from an X+1 template directly does not work, so it should quite definitely always be a warning at most.
You did elaborate on some points (for which I'm thankful) but I still got the impression that it's a quick fix for now but not a real change in procedures.
It's indeed not the "best", most elaborate change one can make, but it wasn't a "quick'n'dirty" fix either. I tried to keep the change small to reduce the potential for (unrelated) regressions, but also to improve on it such that even if we do not get around to further improving this overall (which is rather unlikely, especially now that I've written much more about this than I expected), we will still catch this much more easily in our internal QA.
Not that I'm an expert on these things and I could easily be wrong (so please correct me). I mean no harm, but I do like process/procedure improvements.
That's fine and appreciated; such an issue definitely warrants revisiting whether it can be avoided in the first place and whether there's a reason why that did not happen.
 
Out of ignorance and curiosity (and as a Debian container user): what is actually going on here? Is Proxmox refusing to start containers that run a Debian version that is too high? Why?
As mentioned in my last reply to this thread, it certainly isn't as important as it was ten years ago or so, due to the landscape being much more homogeneous, but the underlying idea was that there can be unexpected consequences of trying to run something that cannot run in the CT environment that PVE sets up, and the check was meant to avoid such theoretical long-lasting consequences. Such checks against newer versions are in general not bad, but naturally only when they are correct themselves and can happen at every stage where one might end up with such a version. Here, the check itself was flawed in accepting an X.0 version but not an X.1 one, and we cannot really intercept updates from inside the container that pull in a new version and execute the check there: while that might theoretically be partially possible, doing so would breach the CT boundary, with many security implications, and it would be hard to get fully right, especially as users can also manually update some components, circumventing the distro package manager.

So, the original intent of adding this check was not bad per se, but (and that's partially guesswork, it happened more than 10 years ago) it probably just missed that CTs can also reach a new version by getting upgraded from the inside, or overestimated our ability to always raise the versions soon enough.
 
I wonder if there's an easy way for you to provide a mechanism for us users to be able to "force" boot an unsupported distro. Many command line tools have "--force" options. In this case we have to resort to editing a packaged file to change the test. I wonder if you could have a "conf.d" directory or some other easily extensible mechanism for us end users to be able to "force" boot a newer (or unknown) distro? Maybe a text field in the LXC Features dialog where we have to enter "forceboot=true" or something. Just thinking out loud here.
Producing a task warning might be enough. As e.g. Debian basically restarts almost everything on upgrade anyway, the system state and PVE compatibility of a CT that keeps running after an in-place upgrade is not that different from that of the same CT after a fresh start. But yeah, such a flag would also be an option; for a lot of users, though, it's hard to evaluate when it's OK to set it, and if it's always OK to try with no harm done, then PVE could just default to that behavior in the first place.
 
Producing a task warning might be enough. As e.g. Debian basically restarts almost everything on upgrade anyway, the system state and PVE compatibility of a CT that keeps running after an in-place upgrade is not that different from that of the same CT after a fresh start. But yeah, such a flag would also be an option; for a lot of users, though, it's hard to evaluate when it's OK to set it, and if it's always OK to try with no harm done, then PVE could just default to that behavior in the first place.
Just warning and booting anyway instead of erroring out would be even better! What's the worst that could happen?
 
Very cool that Debian 13 is now supported, but AlmaLinux 10 is not yet; it was not even "hackable" this past week ...

Code:
run_buffer: 571 Script exited with status 25
lxc_init: 845 Failed to run lxc.hook.pre-start for container "556"
__lxc_start: 2039 Failed to initialize container "556"
TASK ERROR: startup for container '556' failed
 
The fix is contained in pve-container version 6.0.10 for PVE 9 and version 5.3.1 for PVE 8, which are both currently available on the respective pve-no-subscription repository. While we tested this closely, it would still be great to get additional feedback about those versions.
Thanks. Works. Same fix as I did manually.
 
Very cool that Debian 13 is now supported, but AlmaLinux 10 is not yet; it was not even "hackable" this past week ...
The 10 series of AlmaLinux (and other RHEL derivatives) drops support for the network configuration system we used previously, so that one needs a few more changes to get full support. There is an initial patch series on our development mailing list though, so it should not be that far out.
 
Note that with Debian 13 containers, if you configure IPv6 DHCP, you could run into bug 8844, which breaks the container.

I don't think blocking the container from starting would be better, but I can imagine you want the user to be aware that this is not a finished solution.
 
Just stepping in to say thank you for this quick patch. It saved me from going down yet another rabbit hole! :-)

---
Proxmox VE 9.0.6 w/ Debian 13.1 "Trixie"
 
I observed the same after upgrading a cloned CT to Ubuntu 25.10.

Might want to add this to /usr/share/perl5/PVE/LXC/Setup/Ubuntu.pm since it's going to be released soon anyways:

Code:
'25.10' => 1, # questing
 
@t.lamprecht: wouldn't it be possible to also allow testing (next-stable) releases of containers to boot without issue? I am also running some CTs (and VMs) on the next stable release (Debian 14 and Ubuntu 25.10, the latter to be released soon; currently only Debian and Ubuntu, but I could imagine doing the same for Fedora, for instance), but I have to manually patch the /usr/share/perl5/PVE/LXC/Setup/Ubuntu.pm file in order to be able to start the container. Then of course whenever any update of pve-container is released, that change comes undone, so I have to patch it again and will probably set up some systemd service to automatically reapply it whenever the file gets changed by an update.

Also, as mentioned by me, @Colin 't Hart and @binaryanomaly, this should be just a warning, not a hard error. :)

EDIT 1: I just updated to pve-container-6.0.12, and it seems like both 25.10 (next stable) and 26.04 (next LTS) were added:
Code:
my $known_versions = {
    '26.04' => 1, # r LTS
    '25.10' => 1, # questing
    '25.04' => 1, # plucky
    '24.10' => 1, # oracular
    '24.04' => 1, # noble LTS
    '23.10' => 1, # mantic
    '23.04' => 1, # lunar
    '22.10' => 1, # kinetic
    '22.04' => 1, # jammy LTS
    '21.10' => 1, # impish
    '21.04' => 1, # hirsute
    '20.10' => 1, # groovy
    '20.04' => 1, # focal LTS
    '19.10' => 1, # eoan
    '19.04' => 1, # disco
    '18.10' => 1, # cosmic
    '18.04' => 1, # bionic LTS
    '17.10' => 1, # artful
    '17.04' => 1, # zesty
    # TODO: actively drop below entries that ship with systemd, as their version is to old for CGv2
    '16.10' => 1, # yakkety
    '16.04' => 1, # xenial LTS
    '15.10' => 1, # wily
    '15.04' => 1, # vivid
    '14.04' => 1, # trusty LTS
    '12.04' => 1, # precise LTS
};

Looking quickly at /usr/share/perl5/PVE/LXC/Setup/Fedora.pm, it does NOT seem that there are any hard version checks on the Fedora version, at least.
 
@t.lamprecht: wouldn't it be possible to also allow testing (next-stable) releases of containers to boot without issue? I am also running some CTs (and VMs) on the next stable release (Debian 14 and Ubuntu 25.10, the latter to be released soon; currently only Debian and Ubuntu, but I could imagine doing the same for Fedora, for instance), but I have to manually patch the /usr/share/perl5/PVE/LXC/Setup/Ubuntu.pm file in order to be able to start the container. Then of course whenever any update of pve-container is released, that change comes undone, so I have to patch it again and will probably set up some systemd service to automatically reapply it whenever the file gets changed by an update.

Also, as mentioned by me, @Colin 't Hart and @binaryanomaly, this should be just a warning, not a hard error. :)

EDIT 1: I just updated to pve-container-6.0.12, and it seems like both 25.10 (next stable) and 26.04 (next LTS) were added (full $known_versions hash quoted above).

Looking quickly at /usr/share/perl5/PVE/LXC/Setup/Fedora.pm, it does NOT seem that there are any hard version checks on the Fedora version, at least.
The latest pve-container version 5.3.3 not only adds the new Ubuntu releases, but for PVE 9 we also defused the hard checks into soft checks, which now only produce a warning in case an unknown (too new) release is encountered. This was done for Debian and Ubuntu.
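In spirit, the change amounts to something like this (an illustrative sketch, not the literal diff):
Code:
use strict;
use warnings;

# Illustrative only, not the literal pve-container diff: an unknown (too new)
# release now just produces a task warning instead of aborting the CT start.
my $known_versions = { '25.10' => 1, '26.04' => 1 };   # trimmed for the example
my $version = '26.10';                                  # some not-yet-known release

# old behaviour:
#   die "unsupported Ubuntu version '$version'\n" if !$known_versions->{$version};
warn "unsupported Ubuntu version '$version', continuing anyway\n"
    if !$known_versions->{$version};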