[URGENT] Broken Pipe - cannot access any VMs or any settings on the HOST

deanfourie

Member
Jan 28, 2023
So, I logged into Proxmox tonight and made one change to a VM. That change was to remove the virtual CD-ROM drive holding the bootable ISO file.

After that, everything just broke. I could not do ANYTHING in Proxmox, not even on the host itself. It just shows a "Broken Pipe" error. Even trying to load the network tab, the DNS settings, or any other host setting just shows this "Broken Pipe" error.

I cannot access the host console, cannot access any VMs or any settings on any VM or even the HOST.

Eventually, the web interface just stopped responding at all, and now all VMs appear to be dead.

Any ideas?
Thanks
 
Hi,

this sounds like a total host lockup - did you try simply rebooting the machine yet?
This should (hopefully) restore it to working order, for now. Then, to investigate what caused this, the syslog might have some information (journalctl -b -1 shows the log of the previous boot).
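As a rough sketch of how that log check could look: the function name prev_boot_report is made up for illustration, and it assumes the journal is persistent (i.e. /var/log/journal exists), otherwise nothing from before the reboot will be available.

```shell
# Sketch: summarize the boot that preceded the lockup.
# Assumption: persistent journal; without it, "-b -1" has nothing to show.
prev_boot_report() {
    if ! command -v journalctl >/dev/null 2>&1; then
        echo "journalctl not found" >&2
        return 1
    fi
    journalctl --list-boots --no-pager            # which boots are recorded
    journalctl -b -1 -p err --no-pager            # previous boot, priority "err" and worse
    journalctl -b -1 -k --no-pager | tail -n 50   # last kernel messages before the hang
}
```

Run prev_boot_report as root after the reboot. With a hard lockup the log often simply stops without any error, which is a hint in itself.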

Changing a simple VM hardware setting like this should _not_ cause anything like that. It really sounds more like a hardware failure (or maybe a kernel panic ..)
 
OK, I rebooted, and all is fine. But the question still remains: why?

I looked through the logs and nothing jumped out at me. Not sure if you can see anything. I'm not sure but the temperatures do look rather high.

Appreciate your input.

Thanks
 

Attachments

Good to hear at least now everything is working again.
My guess would be some hardware problem, maybe related to the enabled IOMMU (which is unfortunately often not that well tested by OEMs).

Thanks for the log!
Looking through it a bit, I'd first suggest using an updated microcode (apt update && apt install intel-microcode should be enough), as well as updating the BIOS/UEFI. That may fix some things.
You can also try out the opt-in 6.1 kernel.
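To see whether the microcode update actually took effect, one option is to compare the revision the kernel reports before and after installing the package and rebooting. A minimal sketch, where microcode_rev is a hypothetical helper name and the optional file argument exists only so it can be tried on a saved copy of /proc/cpuinfo:

```shell
# Print the microcode revision reported for the first CPU.
# Reads /proc/cpuinfo by default (x86 layout: "microcode<TAB>: 0x...").
microcode_rev() {
    awk -F': ' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}
```

Calling microcode_rev before the install and again after the reboot should show a different value if a newer revision was loaded; an unchanged value may just mean the BIOS already ships the same revision.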

If this is a non-production system, running a memtest could reveal if some RAM might be bad (when booting, there should be an option to select memtest86+ in the GRUB menu). Just be aware that may take a few hours.

The temperature reported by smartd is probably just bogus (at least the >100 one). The other one (~ 66) might be real, which is indeed a bit high - and thus can reduce the lifetime of the disk.
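To look at the raw temperature attributes directly instead of relying on the smartd messages, something like the following could help. This is only a sketch: smart_temps is a made-up helper name, and it assumes the ATA attribute table layout printed by smartctl -A (raw value in the tenth column); NVMe output looks different and will not match.

```shell
# Print SMART attributes whose name mentions "temperature", with their raw value.
# Expects `smartctl -A` output for an ATA disk on stdin.
smart_temps() {
    awk '$2 ~ /[Tt]emperature/ { print $2, $10 }'
}
```

Usage would be along the lines of smartctl -A /dev/sda | smart_temps, where /dev/sda is just a placeholder for the disk in question.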

Otherwise, this really just seems like a hard lockup, either through a hardware failure or a kernel panic.
The best you can do is update and monitor the system. If it happens again, you could connect a monitor (if possible) and see if the kernel console printed an oops trace or similar.
 
Thanks, I'll do some testing over the next few days and see how things go. I'll try to reproduce the problem and grab a screenshot of the kernel panic.

What is intel microcode?

Also, do you recommend giving the 6.1 kernel a shot?

Thanks
 
Thanks, I did have a little dig.

It's interesting because this issue seemed to happen again tonight. I changed something on my uplink router, which connects to my VM's WAN interface; that is the only interface using PCIe passthrough.

It appears that dropping and re-establishing the link a few times could have caused it to panic. I will try to replicate this tomorrow, but this could back up your comments about it being something to do with QEMU?
 
I attempted to install intel-microcode but got the following error:

apt install intel-microcode
intel-microcode has no installation candidate.

Do I need to add custom repositories?

Thanks
 
Do I need to add custom repositories?
You need to enable the non-free packages of the repository, they are not enabled by default - sorry for that.

You can just run this snippet in a shell:
Code:
sed -i.bak 's/main contrib$/main contrib non-free/g' /etc/apt/sources.list
After that, run apt update and apt install intel-microcode again.

Basically, Debian has three divisions for its packages: main, contrib and non-free, of which the first two are enabled by default. non-free generally contains all proprietary firmware, such as microcode.
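If you would rather see what the sed expression changes before editing the real file, a dry run is possible. A small sketch, where preview_non_free is a made-up helper name that applies the same substitution but prints the result instead of writing it back:

```shell
# Apply the same substitution as above without -i, so the file is left untouched.
# Only lines that END in "main contrib" are changed; a line that already ends in
# "main contrib non-free" no longer matches, so repeating it adds nothing.
preview_non_free() {
    sed 's/main contrib$/main contrib non-free/g' "$1"
}
```

For example, preview_non_free /etc/apt/sources.list | diff /etc/apt/sources.list - shows exactly which lines would change. If the diff is empty, the lines probably have trailing whitespace or a different component list, and the substitution would need adjusting.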
 
Excellent, thank you!
 
