All VMs stopped working: restoring a backup does not help

NinthWave

Yesterday, I switched from PVE internal backup to PBS.

I don't see how it's related, but my VMs stopped working that same night. All of them.
I restored a PVE (not PBS) backup from two days earlier, to no avail. They still won't start.

journalctl on pastebin
 
Earlier this week, I upgraded the host from 7.x to 8.x.
It worked fine.

If I remove the Quadro and the tuner card from "pcie hardware" in the VM, it boots. If I try to add them back, there are no mapped devices to choose from. So something on the host has broken the passthrough.

Any help investigating this?
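
One possible starting point, as a rough sketch: after a major kernel upgrade it is worth confirming that the IOMMU is still active and that the passed-through cards still show up in IOMMU groups. The helper below only walks `/sys/kernel/iommu_groups`, the standard sysfs location; the function name `list_iommu_groups` is made up for this example. An empty result would suggest the IOMMU is not enabled on the running kernel.

```shell
#!/bin/sh
# List every PCI device together with its IOMMU group.
# list_iommu_groups is a hypothetical helper name; the optional argument
# lets you point it at another directory (defaults to the real sysfs path).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist.
        [ -e "$dev" ] || continue
        rel="${dev#"$root"/}"      # e.g. 1/devices/0000:01:00.0
        printf 'group %s: %s\n' "${rel%%/*}" "${dev##*/}"
    done
}

list_iommu_groups
```

If the Quadro and the tuner card are listed, the IOMMU side is probably fine and the problem is more likely in the VM configuration.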
 
I checked this on the Proxmox host:
Code:
root@pve1:~# lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller [8086:0c08] (rev 06)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller [8086:0c05] (rev 06)
00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05)
00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 05)
00:1c.0 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 [8086:8c10] (rev d5)
00:1c.1 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #2 [8086:8c12] (rev d5)
00:1c.2 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 [8086:8c14] (rev d5)
00:1c.3 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 [8086:8c16] (rev d5)
00:1c.4 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 [8086:8c18] (rev d5)
00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 05)
00:1f.0 ISA bridge [0601]: Intel Corporation C222 Series Chipset Family Server Essential SKU LPC Controller [8086:8c52] (rev 05)
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 05)
00:1f.3 SMBus [0c05]: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller [8086:8c22] (rev 05)
00:1f.6 Signal processing controller [1180]: Intel Corporation 8 Series Chipset Family Thermal Management Controller [8086:8c24] (rev 05)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P600] [10de:1cb2] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
03:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 05)
04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
06:00.0 Multimedia controller [0480]: Philips Semiconductors SAA7164 [1131:7164] (rev 81)
07:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]

This part seems OK
 
This seems OK too:
Code:
root@pve1:/etc/modprobe.d# cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt pcie_acs_override=downstreem,multifunction nofb nomodeset video=vesafb:off,efifb:off intremap=no_x2apic_optout
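
One detail that may deserve a second look: the `pcie_acs_override` value above is spelled `downstreem`. As far as I know, the ACS override patch only recognizes `downstream`, `multifunction` and `id:xxxx:xxxx`, so a misspelled value would probably be ignored. A small sketch that flags values outside that list (the function name `check_acs_override` is made up for this example):

```shell
#!/bin/sh
# Print any pcie_acs_override value that is not one of the known options.
# check_acs_override is a hypothetical helper; it takes the kernel command
# line as a single string argument.
check_acs_override() {
    for param in $1; do
        case "$param" in
            pcie_acs_override=*)
                values="${param#pcie_acs_override=}"
                old_ifs=$IFS; IFS=,
                for v in $values; do
                    case "$v" in
                        downstream|multifunction|id:*) ;;
                        *) printf 'unrecognized value: %s\n' "$v" ;;
                    esac
                done
                IFS=$old_ifs
                ;;
        esac
    done
}

# Check the command line of the currently running kernel.
check_acs_override "$(cat /proc/cmdline 2>/dev/null)"
```

Comparing `/proc/cmdline` with `/etc/kernel/cmdline` also shows whether the file was actually applied; edits to it only take effect after `proxmox-boot-tool refresh` and a reboot.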
 
I don't know how this is possible, but for passthrough to work the machine type is supposed to be q35, which it was. And now it's i440fx... How is this possible?
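
To see what a VM is actually configured with, `qm config <vmid>` prints its settings, including a `machine:` line when one is set. As far as I know, a config without that line falls back to the default i440fx type, which could explain the change if the line got lost. A small sketch (the function name `machine_type` and the VMID 100 are placeholders, not from this thread):

```shell
#!/bin/sh
# Read a VM config on stdin and print its machine type.
# machine_type is a hypothetical helper name.
machine_type() {
    awk -F': *' '$1 == "machine" { print $2; found = 1 }
                 END { if (!found) print "i440fx (default, no machine line)" }'
}

# Usage on the host (replace 100 with the real VMID):
# qm config 100 | machine_type
```

If the line is missing, `qm set 100 --machine q35` should put it back, again with the real VMID in place of 100.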
 
Saw an OOM kill in the logs.
OK...

The host has 16 GB.
VM1 is allowed 8 GB.
VM2 is allowed 3 GB.
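
A quick back-of-the-envelope check of those numbers, as a sketch: with PCI passthrough the guest memory generally has to be fully allocated and pinned, so both VMs together claim 11 GB up front. The root filesystem is on ZFS, so the ARC cache competes for the remainder as well; how much it takes depends on `zfs_arc_max`, and no value is assumed for it here.

```shell
#!/bin/sh
# Memory left for the host after both VMs take their full allocation.
# Figures are the ones quoted above, in GB.
host_gb=16
vm1_gb=8
vm2_gb=3

headroom_gb=$((host_gb - vm1_gb - vm2_gb))
echo "left for host, ZFS ARC and QEMU overhead: ${headroom_gb} GB"
```

`cat /sys/module/zfs/parameters/zfs_arc_max` shows the configured ARC limit in bytes (0 means the built-in default), which helps judge whether that headroom is realistic.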

Apart from upgrading Proxmox from 7.x to 8.x, nothing changed in the VMs.
Restoring the VMs to a state from before the failure should have brought them back to normal.

Should I investigate more? If so, how?
Or should I simply do a fresh install of PVE 8?
 
Check the memory usage (physical and swap) when:
  • no VM is running
  • 1 VM is running
  • the 2nd VM is started, and monitor resource usage with htop
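
The steps above could be logged with something like the following sketch, which reads `/proc/meminfo` so the three states can be compared afterwards. The function name `mem_snapshot` and the labels are my own choice.

```shell
#!/bin/sh
# Print available RAM and free swap in MiB with a label.
# mem_snapshot is a hypothetical helper; /proc/meminfo reports values in kB.
# The optional second argument points at another file in the same format.
mem_snapshot() {
    awk -v label="$1" '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapFree:"     { swap = $2 }
        END { printf "%s: available=%d MiB swap_free=%d MiB\n",
                     label, avail / 1024, swap / 1024 }
    ' "${2:-/proc/meminfo}"
}

# Run once per step, e.g.:
# mem_snapshot "no vm running"
# mem_snapshot "1 vm running"
# mem_snapshot "2 vms running"
```

If the available figure is already close to zero after the first VM starts, the OOM killer acting on the second one would not be surprising.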
 