VM freezes irregularly

Hello,

After upgrading the microcode, I did not make any change to the BIOS settings.
As far as I remember, it is running the BIOS defaults.
As you can see in my previous post, I have C-states enabled.
It is common practice to try disabling some C-states in case of stability problems, especially C6 and higher.
However, in this particular case, I do not think it is related.
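
For anyone who wants to see which C-states the host kernel actually exposes before touching the BIOS, here is a minimal sketch. It assumes the standard Linux cpuidle sysfs layout under /sys/devices/system/cpu/cpu0/cpuidle; the function name list_cstates is just something I made up for illustration.

```shell
# list_cstates [cpuidle_dir]
# Prints one line per idle state: directory name, state name,
# exit latency in microseconds, and whether the state is disabled.
# Default directory is the cpuidle sysfs tree of CPU 0 (assumed layout).
list_cstates() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$dir"/state*; do
        # Skip the unexpanded glob when no states exist (e.g. inside a VM)
        [ -d "$state" ] || continue
        printf '%s %s latency=%sus disabled=%s\n' \
            "${state##*/}" \
            "$(cat "$state/name")" \
            "$(cat "$state/latency")" \
            "$(cat "$state/disable")"
    done
}
```

Running list_cstates on the host without arguments reads CPU 0. A state can also be switched off at runtime, for testing, with cpupower idle-set -d <state number>, which avoids a trip into the BIOS.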

I will check my BIOS settings and will let you know.
Thanks!
 
Alright, I have updated to the 6.2 kernel and the 24 microcode. Let's see how this turns out. I still need to reset the BIOS settings, but I need to pull the thing out of the closet to connect a monitor etc.

Thanks everyone!
 
Code:
root@pve:~# apt search cpupower
Sorting... Done
Full Text Search... Done
cpupower-gui/stable 0.7.2-2 amd64
  GUI utility to change the CPU frequency

libcpupower-dev/stable-security 5.10.162-1 amd64
  CPU frequency and voltage scaling tools for Linux (development files)

libcpupower1/bullseye-backports,now 6.1.12-1~bpo11+1 amd64 [installed,automatic]
  CPU frequency and voltage scaling tools for Linux (libraries)

linux-cpupower/bullseye-backports,now 6.1.12-1~bpo11+1 amd64 [installed]
  CPU power management tools for Linux

The last one, linux-cpupower, is the one.
On my box this was already installed.
At least I don't remember installing it.
It's possible that it was installed by some other package as a dependency (possibly cpufrequtils, which I did install at some point).
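
A rough way to tell the two cases apart from the apt output itself, assuming I read the flags correctly: packages pulled in as dependencies are tagged [installed,automatic], while a plain [installed] usually means the package is marked as manually installed. The helper name install_reason is hypothetical.

```shell
# install_reason
# Reads "apt search" / "apt list" style output on stdin and prints
# "<package> automatic" or "<package> manual" for each installed package.
# Lines without an [installed...] tag are ignored.
install_reason() {
    awk -F/ '
        /\[installed,automatic/ { print $1, "automatic"; next }
        /\[installed/           { print $1, "manual" }
    '
}
```

For example, apt list --installed 2>/dev/null | install_reason | grep cpupower. Note that a package first installed as a dependency can become "manual" if it is later named explicitly in an apt install, so this is only a hint. apt-mark showmanual gives the same information directly.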
 
My pfSense VM has been running for 16 days now with the below setup. Very promising as before it would crash right at the two week mark. If it runs for another week I'll call it fixed.

pfSense 23.01 restarted short of 15 days with no crash dump.

Updated everything, specifically kernel, QEMU, and microcode. Let's try again...
Code:
microcode: microcode updated early to revision 0x24000024, date = 2022-09-02
6.1.14-1-pve

In case anyone wants to manually update microcode to 0x24000024 (assumes you already have 0x24000023 package installed):
Code:
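# The cp path below assumes the archive is unpacked under /root
cd /root
# Download the current Intel microcode files
wget https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/archive/main.zip
unzip main.zip -d MCU
# Copy the new files over the ones installed by the intel-microcode package
cp -r /root/MCU/Intel-Linux-Processor-Microcode-Data-Files-main/intel-ucode/. /lib/firmware/intel-ucode/
# Rebuild the initramfs for all installed kernels so the microcode is loaded early
update-initramfs -u -k all
reboot
# After the reboot, confirm the loaded revision
dmesg -T | grep microcode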
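
If you would rather check the result from a script than read dmesg, here is a small sketch that compares what every logical CPU reports in /proc/cpuinfo against the revision you expect. The function names are mine, not part of any package.

```shell
# microcode_revisions [cpuinfo_file]
# Prints each distinct microcode revision reported by the logical CPUs.
microcode_revisions() {
    awk -F': *' '/^microcode/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}

# microcode_is <expected> [cpuinfo_file]
# Succeeds only if all logical CPUs report exactly the expected revision.
microcode_is() {
    [ "$(microcode_revisions "${2:-/proc/cpuinfo}")" = "$1" ]
}
```

Usage on an N5105/N6005 host would be something like: microcode_is 0x24000024 && echo "microcode OK".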
 
Hi.

I'm not sure if I'm doing this right, but I found these forums about random reboots which I am also experiencing on my device.

I've upgraded to Proxmox 7.4 on a Topton-like device running an Intel J6413. I tried to update the Intel microcode to the latest, but it seems to be stuck on 0x17.

Code:
root@proxmox:~# dmesg | grep microcode
[    0.000000] microcode: microcode updated early to revision 0x17, date = 2022-07-15
[    0.304380] SRBDS: Vulnerable: No microcode
[    1.475015] microcode: Microcode Update Driver: v2.2.


I have run these commands (as suggested in another post) to try to get it to the latest microcode:
Code:
wget https://r-1.ch/intel-microcode_3.20221108.2_amd64.deb
dpkg -i intel-microcode_3.20221108.2_amd64.deb


After the install I rebooted, but it still says 0x17 from 2022-07-15.

Are the microcode numbers specific to processors or should I be on 0x24 too?

Other info that might be useful:
Code:
root@proxmox:/etc/apt# cat sources.list
deb http://ftp.ca.debian.org/debian bullseye main contrib non-free


deb http://ftp.ca.debian.org/debian bullseye-updates main contrib non-free


# security updates
deb http://security.debian.org bullseye-security main contrib non-free


#Add Backports
deb http://deb.debian.org/debian bullseye-backports main contrib non-free


# non production use updates
deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription

Code:
root@proxmox:/etc/apt# grep 'stepping\|model\|microcode' /proc/cpuinfo
model           : 150
model name      : Intel(R) Celeron(R) J6413 @ 1.80GHz
stepping        : 1
microcode       : 0x17
model           : 150
model name      : Intel(R) Celeron(R) J6413 @ 1.80GHz
stepping        : 1
microcode       : 0x17
model           : 150
model name      : Intel(R) Celeron(R) J6413 @ 1.80GHz
stepping        : 1
microcode       : 0x17
model           : 150
model name      : Intel(R) Celeron(R) J6413 @ 1.80GHz
stepping        : 1
microcode       : 0x17

What should I do?

UPDATE: I think I figured it out. The version number is processor dependent, as per the Intel GitHub page.

Thanks.
 
@ckl

Every generation of processors has its own microcode revision, even when the updates ship in the same package.

The J6413 belongs to Elkhart Lake (0x00000017, microcode-20221108 release).
The N6005 belongs to Jasper Lake (latest microcode 0x24000024, microcode-20230214 release).

The issue you are facing is not related to the one in this topic.

For your random reboots, is it the Proxmox host or a Proxmox VM?
For the Jasper Lake systems it is the VM; the host is stable and does not have issues.

Usually, we don't recommend processor microcode updates as a requirement unless there is a known erratum or security issue in the processor generation.
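
To see which microcode file applies to a given CPU: the files in /lib/firmware/intel-ucode are named family-model-stepping in two-digit hex. A small sketch of the conversion from the decimal values in /proc/cpuinfo follows. The function name is made up, and "cpu family" 6 for the J6413 is my assumption, since the output posted above only shows model and stepping.

```shell
# ucode_file_for <family> <model> <stepping>
# Arguments are the decimal values from /proc/cpuinfo
# ("cpu family", "model", "stepping"); output is the matching
# file name in the intel-ucode directory.
ucode_file_for() {
    printf '%02x-%02x-%02x\n' "$1" "$2" "$3"
}
```

For the J6413 above (model 150, stepping 1, family assumed to be 6), ucode_file_for 6 150 1 gives 06-96-01, so the file to look at would be /lib/firmware/intel-ucode/06-96-01.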
 
Sorry, I should have been clearer. The reboots are of the pfSense firewall running in a VM on Proxmox, same as a lot of users on this thread are experiencing. Proxmox itself was not rebooting.
 
Did you have 0x00000017 installed before the update? If it did update the microcode it may still help. It's interesting that another CPU generation seems to be affected by this issue.
 
@ckl
Sorry to hear about that, it is actually the first time I am hearing that the J6413 has the same issues as the N5095/N5105/N6005.

What kernel version are you on? Have you tried upgrading to 6.1? I do know that the 12th Gen and 13th Gen processors require kernel 6.1 or above to properly use the features of the processor.

You are already on the latest microcode release at this time, and from what I read, that should come from the BIOS itself. Sorry, other than updating the kernel to see if it helps, I do not know what else to advise. Perhaps run Prime95 to see if any worker threads fail, and memtest86 to see if you have faulty RAM. I really do not know what else to check at this time.
 
Did you have 0x00000017 installed before the update? If it did update the firmware it may still help. It's interesting that another CPU generation seems to be affected by this issue.

No. I had tried 0x15 and 0x16 with pve kernel 6.1 on Proxmox 7.3. Then I updated to Proxmox 7.4 and pve kernel 6.2, then updated the microcode to 0x17.

I can't really tell you what led me to this point because I've done so much tinkering that I can't remember the order in which things happened. But I do know that my unit was originally set to turbo performance with C-states disabled on Proxmox 7.3, and it was stable. It was only when I started to spin up additional VMs that I noticed the pfSense uptime counter occasionally reset, I would say every 2 weeks. Logs would show kernel trap 9 and kernel trap 1 entries before the reboot. This is when I started messing with the settings.

At some point I found this thread and the discussion about pfSense rebooting, so I updated the pve kernel to 6.1 and then tried microcode updates. Then I noticed that my CPU speed was pegged at 2700 MHz (because of the turbo performance BIOS setting), so I started tinkering with BIOS settings. I enabled C-states and changed power management to "max battery", and that's when pfSense started crashing nightly. So this problem with pfSense crashing is not strictly an N5105 problem; it happens with the J6413 as well.

In my environment, I know the setup required to crash pfSense. So after I updated Proxmox to 7.4, pve-kernel to 6.2, and microcode to 0x17, I recreated the setup expecting pfSense to crash last night, but it didn't. I will leave it running like this for a while and see what happens, as I normally don't leave any unnecessary VMs running 24/7.
 
The intel-microcode 3.20230214.1~deb11u1 package has been added to stable-proposed-updates. It should ship with the next point release, Debian 11.7, on April 29th.

https://release.debian.org/proposed-updates/stable.html
https://lists.debian.org/debian-live/2023/03/msg00025.html

The Debian distribution moves like molasses. Ubuntu 22.04 LTS has had the new microcode since February 27th as a security update:
https://launchpad.net/ubuntu/+source/intel-microcode/3.20230214.0ubuntu0.22.04.1

@t.lamprecht

Would it be possible for the Proxmox team to release a package through their own repository that would install automatically? This would substantially improve Proxmox VM stability for users running affected CPUs.

My pfSense VM uptime is nearing 19 days with new microcode.
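
Until then, a quick way to check whether an installed intel-microcode package is new enough: the package version embeds the upstream release date, and 0x24000024 comes with the microcode-20230214 release. This is only a sketch with made-up function names, and it assumes the version strings keep the "3.YYYYMMDD..." shape seen in this thread.

```shell
# ucode_pkg_date <version>
# Extracts YYYYMMDD from an intel-microcode package version such as
# "3.20230214.1~deb11u1" or "3.20221108.2".
ucode_pkg_date() {
    rest="${1#*.}"                   # drop the leading "3."
    printf '%s\n' "${rest%%[.~]*}"   # keep what precedes the next "." or "~"
}

# ucode_pkg_at_least <version> <YYYYMMDD>
# Succeeds if the upstream date in the version is not older than the given date.
ucode_pkg_at_least() {
    [ "$(ucode_pkg_date "$1")" -ge "$2" ]
}
```

For example: ucode_pkg_at_least "$(dpkg-query -W -f='${Version}' intel-microcode)" 20230214 && echo "new enough". The 3.20221108.2 package that has been passed around in this thread fails this check. For general version comparisons dpkg --compare-versions is the proper tool; this only looks at the embedded date.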
 
@AdriftAtlas

LOL, Proxmox is fine as it is. Leave the Debian stuff to Debian and the Proxmox stuff to Proxmox.
The package has been escalated already: 10 days from submission by a non-maintainer and 30 days in testing before going to stable. That is pretty fast, considering the submission was done by a non-maintainer.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1033079

Fast is a matter of perspective here. The updated microcode package fixes five medium severity CVEs. Should it take several months from public disclosure for a distribution to push an available update? Why did it take a non-maintainer to submit a security fix?

One could argue that many CPU vulnerabilities are hard to exploit. SecOps will still wonder why their scanner is showing the same unpatched vulnerabilities every month.

Proxmox is using Ubuntu's kernel sources on Debian. Might as well use their microcode sources. ;)
 
My uptime is currently "up 2 weeks, 3 days, 2 hours, 55 minutes" on a Beelink U59 (with N5105). Previous uptime prior to the changes below was never more than 6 hours.

Changes I made:
  1. Upgrade Ubuntu VM kernel to 6.2
    1. wget https://kernel.ubuntu.com/~kernel-p...0-generic_6.2.0-060200.202302191831_amd64.deb
    2. wget https://kernel.ubuntu.com/~kernel-p....2.0-060200_6.2.0-060200.202302191831_all.deb
    3. wget https://kernel.ubuntu.com/~kernel-p...0-generic_6.2.0-060200.202302191831_amd64.deb
    4. wget https://kernel.ubuntu.com/~kernel-p...0-generic_6.2.0-060200.202302191831_amd64.deb
    5. sudo dpkg -i *.deb
  2. Update Microcode on Proxmox Debian
    1. wget https://r-1.ch/intel-microcode_3.20221108.2_amd64.deb
    2. dpkg -i intel-microcode_3.20221108.2_amd64.deb
    3. optional
      1. cat /proc/cpuinfo
  3. Update Ubuntu VM GRUB
    1. Edit /etc/default/grub and set GRUB_CMDLINE_LINUX_DEFAULT to "intel_pstate=disable quiet"
    2. Save the changes and run the update-grub command
    3. reboot
  4. Update Ubuntu VM Microcode Blacklist
    1. cd /etc/modprobe.d
    2. mv intel-microcode-blacklist.conf intel-microcode-blacklist.conf~
I don't believe #3 or #4 were necessary, but I haven't tested rolling them back yet.
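
To confirm that the GRUB change from #3 actually took effect after the reboot, a small sketch that checks the running kernel's command line inside the VM. The helper name has_kernel_param is made up.

```shell
# has_kernel_param <param> [cmdline_file]
# Succeeds if <param> appears as a whole word on the kernel command line.
has_kernel_param() {
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example: has_kernel_param intel_pstate=disable && echo "parameter active". If it is missing, update-grub was probably not run or the VM booted a different GRUB entry.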

Thanks to everyone for the help on this.
 
You're a skilled tester, do you think microcode 24 solves the issue?
 
I'm just an IT guy. :)

My pfSense VM has been running for 23 days now. Other users on here have reported similar results. The 24 microcode likely does solve the issue or at least makes its occurrence very rare.
 
