Thank you for your reply; I appreciate it.
You might find it quicker to use the ticket system, no? You paid for tickets as part of being on enterprise: https://my.proxmox.com/en
After performing a dist-upgrade on a production Proxmox host (non-ZFS, UEFI boot), the server failed to boot and got stuck at a black GRUB screen. This happened despite being a paying Proxmox Enterprise subscriber, and I must express my frustration: I received absolutely no support from Proxmox, not even basic guidance. For a paid product, this is simply unacceptable.
Hi,
How did you proceed with the update?
Best regards,
Without it, I would not have been able to access the OS at all.
Thank you for your response.
I don't know why your system ran into problems with this particular and relatively low-risk update. Do feel free to share your experience and maybe people here can figure out what went wrong for you today. I do feel sorry that this happened to you, but please also improve your process to prevent such stressful situations in the future.
It feels to me like you have some expectations that don't match reality (as I see it):
- Community support gives you entitlement to support from Proxmox. No, it's the other way around: you pay to support Proxmox. Get a higher-tier support subscription with actual support tickets for support from Proxmox based on an SLA. (I'm guessing here that you have 'community support' since you appear not to have used any support ticket and only posted on this forum, which is not a good place for urgent help. If I'm wrong and you got no support on your ticket, then you should definitely find a different, more local support company that can actually support you during your business hours!)
- Paying Proxmox or using the enterprise repository ensures that nothing ever goes wrong. No, if there is a bug that you run into, then it takes longer for you to get the fix when you use the enterprise repository. Note that nobody else reported problems about this security update today. Things sometimes happen. What if it wasn't the update but broken hardware instead?
- This forum will give you support from Proxmox. No, this forum will give you, often but not always, help from mostly random volunteers on the internet and sometimes from Proxmox staff. You did not add your subscription to your forum account and Proxmox staff might have missed your post. Either way, it's mostly people helping each other in their spare time when they feel like it.
Here are some things that are under your control and could be improved:
- Updating a production server while it was in use. You could have planned for maintenance downtime and done the update without exposing your customers.
- Updating a production server. You could have first tested the update on a test server. There is also Proxmox Offline Mirror, which can help manage when updates are put into production.
- Fall-back scenario. Implement a fall-back server in case anything goes wrong that might be out of your control, like a botched Proxmox update or a hardware failure.
- Don't post on a busy forum, where your cry for help scrolls off the first page very quickly, and expect volunteers to immediately fix it for you.
- Buy support tickets to make sure you can get support (from Proxmox or another company) from experts within a known amount of time.
I did not write this to claim that it was (partly) your fault. It's honest feedback on some things that might help you and others in the future, as problems will always happen regardless.
FYI: We discussed this briefly internally, as another dev asked whether the changes to GRUB could in any way be a cause here. We ruled that out as being as close to impossible as it gets: the recent GRUB update really just dropped the NTFS module from being preloaded, so that a security check in lockdown mode (secure boot) cannot be circumvented. As nothing changed in how GRUB is installed, which is the stage at which this hangs, it's close to impossible that the update itself is problematic. And that's just to never say never; if this were a generic problem, we would have dozens if not more threads here.
Anyhow, this seems to me rather like a broken storage where the grub image cannot be fully written out.
First thing to check would be whether the grub (or some other) process hangs (D state), e.g. using top, as that would confirm that the update hangs in the IO path, meaning likely some broken storage (hw or software). Also check the system journal for any errors around the time this got stuck. This needs more info to be solved.
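As a rough alternative to eyeballing top, the D-state check could be scripted. The sketch below assumes a Linux host with /proc mounted; the function names `parse_stat` and `uninterruptible_processes` are made up for illustration and are not part of any Proxmox tooling. It only reports which processes are currently in uninterruptible sleep and says nothing about why.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left].strip())
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux system, or /proc is not mounted
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited while we were reading it
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that shows up here once is not necessarily stuck, since short D states are normal during disk IO. A grub or dpkg process that stays listed across repeated runs would be the sign of a hang in the IO path.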
And yes, while we take reports very seriously, if we can be quite sure that this is not a general issue (which we were here, especially as we looked at that one-line GRUB change very closely), we do not put a high priority on community forum threads. Enterprise support is really the only channel where we actually have guaranteed response times governed by the subscription agreement; we communicate that very clearly and transparently.
"...and only after the apt dist-upgrade (which included GRUB updates), the system refused to boot."
You provided almost no actually useful information, but from what there is, it seems the upgrade hung, was interrupted and the system rebooted; GRUB was not correctly updated due to that, and the system failed to boot. Use the rescue boot option of a Proxmox VE ISO, or a live system, to repair this.
"That's precisely the kind of environment affected by the recent GRUB changes, especially since the update removed the NTFS module to prevent Secure Boot circumvention."
Do you boot from NTFS? If not, it is not affected by the recent GRUB changes. Please stop trying to blame the GRUB update and start providing some actual relevant details of what happened.
"This deserves deeper investigation, not just dismissal. Thanks again for responding; this kind of technical engagement is appreciated."
Start by providing more details if you want help here... Providing the /var/log/apt/history.log* and /var/log/apt/term.log* files and the system log from around the time the update was done would be a simple start to shed some light on what really happened. Then post hardware details and what root filesystem/storage is used, and if you have an affected system where the update is still hung, check for hanging processes.
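To narrow down which apt run to post, the history logs could be scanned for transactions that never finished. This is only a sketch under assumptions: it relies on the usual apt history layout of blank-line-separated blocks with `Start-Date`, `Commandline` and `End-Date` fields, and it assumes that an interrupted run leaves no `End-Date` line. The helper names are invented for this example.

```python
import glob
import gzip


def read_log(path):
    """Read a plain or gzip-rotated log file as text."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        return fh.read()


def parse_history(text):
    """Split apt history text into one dict of fields per transaction."""
    runs = []
    for block in text.split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                fields[key] = value.strip()
        if "Start-Date" in fields:
            runs.append(fields)
    return runs


def unfinished_runs(runs):
    """Return transactions without an End-Date (assumed interrupted)."""
    return [run for run in runs if "End-Date" not in run]


if __name__ == "__main__":
    # Path pattern taken from the post above; rotated files may be gzipped.
    for path in sorted(glob.glob("/var/log/apt/history.log*")):
        for run in unfinished_runs(parse_history(read_log(path))):
            print(path, run["Start-Date"], run.get("Commandline", "?"))
```

Any run it prints is only a candidate. The matching section of term.log and the system journal for the same time window would still be needed to see what actually happened.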
And if your nodes have a valid subscription that makes those hosts eligible for enterprise support, our enterprise support will gladly take a look; they can help to get the relevant logs and data faster.