Hi, just to follow up, for clarity: there is no patch and no pinning with the workaround I am using. Simply make sure your kernel boot parameters are set in /etc/default/grub and designate pci=noaer as a required option. Then rebuild your grub boot config file / update-grub / and then...
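For anyone wanting the exact steps, here is a minimal sketch of the workaround, assuming a stock Debian/Proxmox grub layout (the sed one-liner is my own illustration, not from the thread; back up the file first):

```shell
# Back up the grub defaults before touching them.
cp /etc/default/grub /etc/default/grub.bak

# Append pci=noaer to the default kernel command line, if not already present.
grep -q 'pci=noaer' /etc/default/grub || \
  sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="\([^"]*\)"/GRUB_CMDLINE_LINUX_DEFAULT="\1 pci=noaer"/' /etc/default/grub

# Rebuild the grub boot config, then reboot for the flag to take effect.
update-grub
```

After the reboot you can confirm the flag took effect with `cat /proc/cmdline`, which should show pci=noaer in the command line.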
OK! Yes, I agree - it seems this flag is a good workaround, but ultimately there is some underlying problem that should be addressed if possible. I am not sure, but it seems like the next steps would be: (a) someone who is familiar with PCIe debugging reviews the logs I captured, (b) they might spot...
Footnote,
I unpinned and was able to boot, with the revised boot stanza thus:
dmesg hint tells me:
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.5.11-8-pve root=UUID=5b7d97ba-41f2-4b1f-8b38-ba885925c617 ro nomodeset iommu=pt console=tty0 console=ttyS0,115200n8 pci=noaer...
Hi, I only just tried this now, added the kernel boot stanza, pci=noaer
and now it boots, and it works, I am using the pinned 'older' kernel
I will try without pinning and see if it still works
below are pastebin links for lspci and dmesg output from the successful boot in case of interest...
Hi, OK, I just got the content; it is in pastebin, and there are 3 different pastes due to size
before lspci > https://pastebin.com/eZuddMG0
after lspci > https://pastebin.com/RRDsDYAy
dmesg after > https://pastebin.com/z07BJ8mY
please let me know if this is more useful / and possibly if you see...
I will see what I can do. It means rebooting the box again temporarily in rescue mode to retrieve the captures that are generated while it has no network. Will do so and update after that.
Tim
Hi, thank you for the added follow-up. The lspci gave a fair mass of output; I am pasting below a series of 6 screenshots to capture the ixgbe-relevant chunk from lspci -vvxxxx output
then bit further below, output from remove and reinstall the module, and then below that, dmesg capture after...
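For reference, a rough sketch of that capture-and-reload sequence (the grep pattern is an assumption; adjust it to match your NIC model):

```shell
# Pull the ixgbe-relevant chunk out of the verbose lspci dump.
lspci -vvxxxx | grep -i -A 40 'ixgbe'   # adjust pattern/context to your NIC

# Remove and reinstall the module to re-trigger the driver probe.
modprobe -r ixgbe
modprobe ixgbe

# Capture the probe messages that follow.
dmesg | tail -n 50   # look for lines like "ixgbe ... probe ... failed with error -5"
```

Running `dmesg -w` in a second terminal while reloading the module also works well for watching the probe in real time.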
footnote for clarity, I am happy to continue to poke some test/config adjust changes on this host for another few days. Ideally I need to get this thing into production early next week but I have buffer in my schedule right now so can keep mucking about with "try this, does anything change?"...
Hi, thank you for the reply and suggestion! I have just tried this, ie, added the "amd_iommu=off" stanza to my boot flags / rebooted. So far, from what I can tell, there is no change; we see ~same messages in dmesg about ixgbe failed probe, error -5
and my public facing nic is absent
and no...
Hi, I just reinstalled OVH Box with Proxmox-7 template
did upgrade to 8 in-place
then pinned it to Kernel
6.1.10-1-pve
thinking this would let me boot with network / as per this thread
rebooted
no joy - network error; got KVM access, captured dmesg to a text file
reboot on rescue mode > get...
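For anyone repeating the pin step above, a sketch using proxmox-boot-tool (the supported mechanism on Proxmox 7.2+; the version string is the one from this thread, so substitute whichever older kernel you have installed):

```shell
# List the kernels available on this boot disk.
proxmox-boot-tool kernel list

# Pin the older kernel so it is selected on every boot.
proxmox-boot-tool kernel pin 6.1.10-1-pve

# Refresh the boot entries (pin normally triggers this itself), then reboot.
proxmox-boot-tool refresh
reboot
```

`proxmox-boot-tool kernel unpin` reverses this once a fixed kernel lands, which is what the later posts in the thread do.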
Oh, wow, thank you. That looks a tiny bit fiddly. Do you think - if I just sit on Proxmox 7 for a few months and upgrade to Proxmox 8 later, getting the new kernel (not right away) - the issue will ?probably? be sorted in the kernel by then? OR otherwise I guess I could maybe do the Proxmox 7>8 upgrade...
Quick note - in case it helps someone else - I just rented a new box from OVH yesterday that I'm setting up for a client, an AMD Epyc based 16-core with 256gb ram on a Supermicro board - and the stock install of Proxmox 7 using the OVH template was great, but once I did an in-place upgrade to...
Hi LnxBil, thank you for this added detail. I just did some digging/reading online; so far the clearest discussion I could find of HugePages and their possible impact on KSM was in the Red Hat docs
If I am reading it correctly, HugePage enabled will likely give better performance to...
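A few read-only checks that bear on the HugePages-vs-KSM question (paths are standard Linux sysfs/procfs; files may be absent if the kernel was built without THP or KSM support, hence the fallbacks):

```shell
# Current transparent hugepage policy, e.g. "[always] madvise never".
cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || echo "THP not available"

# Whether the KSM daemon is running (1 = active) and how much it is merging.
cat /sys/kernel/mm/ksm/run 2>/dev/null || echo "KSM not available"
cat /sys/kernel/mm/ksm/pages_sharing 2>/dev/null || echo "KSM not available"

# Overall huge page usage on the host.
grep -i 'HugePages\|AnonHugePages' /proc/meminfo
```

Comparing pages_sharing before and after changing the THP policy would show directly whether huge pages are reducing what KSM can deduplicate on this host.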
Hi everyone, thank you for all the great replies. I must admit I had forgotten about the obvious baseline problem of "what if ram is heavily sliced up" - this was definitely a factor here, ie,
server was online for ~90 days
initially all VM were spun up, with less than 64gb allocated to my...
Hi, I wonder if someone can help me understand maybe this situation. I've got a proxmox node with a pretty classic config, ie,
OVH hardware hosting environment, MD RAID storage, 2 x 4TB mirrored NVMe SSDs as my primary Proxmox datastore
Host is a 6-core Xeon 12 thread, with 128gb physical ram
I've...