[TUTORIAL] How to recover from auto-booting GPU passthrough

lukyjay

Aug 18, 2020
Hi

My IPMI motherboard died, so I'm now using a 'gaming' motherboard with a 5 Gb Realtek Ethernet port. Unfortunately that port doesn't work with the provided Linux kernel, and the latest PVE upgrade removed my kernel module. Normally I can just reinstall it, but I had enabled GPU passthrough on a VM which auto-boots. Ultimately, this meant I had no network access (i.e. no SSH) and no GPU on the host (i.e. no local console).

It was a real ordeal to figure out how to fix this. I spent several hours of my life working out what ended up being a 15-minute fix, so I thought I'd post it on these forums to help others (and myself, if I get in the same pickle!)

Consider alternatives

Editing your database directly is very dangerous, so consider other options to fix your installation:
  • SSH
  • Use a temporary PCI-e ethernet card
  • Plug a keyboard and monitor directly into the machine
  • Use another GPU

Boot into recovery / emergency mode

  1. When you boot your PC you'll briefly see the boot menu, which lets you select your kernel. Press an arrow key to stop the timer so the boot pauses here.
  2. Highlight one of the boot options (the top one is fine) and press 'e' to edit it.
  3. Add this to the end of the kernel command line (in GRUB, that's the line starting with "linux"): systemd.unit=rescue.target
  4. Boot the edited entry: press Enter in systemd-boot, or Ctrl+X (or F10) in GRUB. Your installation will boot into a recovery mode where most things work but, most importantly, your guests won't start.
  5. Take backups of your files, specifically /var/lib/pve-cluster/config.db
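For the backup in step 5, here is a minimal sketch in Python, assuming python3 is available in your rescue shell (a plain cp works just as well). The function name backup_config_db and the /root destination are my own choices for illustration, not anything Proxmox provides. It also copies config.db-wal and config.db-shm if they exist, since SQLite can keep recent writes in those sidecar files.

```python
import shutil
import time
from pathlib import Path


def backup_config_db(db_path="/var/lib/pve-cluster/config.db", dest_dir="/root"):
    """Copy config.db (plus any -wal/-shm sidecar files) into a timestamped folder.

    Hypothetical helper: the name and the default destination are assumptions.
    Returns the list of files that were written.
    """
    src = Path(db_path)
    if not src.exists():
        raise FileNotFoundError(f"no database found at {src}")

    # One folder per run, e.g. /root/config.db-backup-20240101-120000
    target = Path(dest_dir) / f"config.db-backup-{time.strftime('%Y%m%d-%H%M%S')}"
    target.mkdir(parents=True)

    copied = []
    for suffix in ("", "-wal", "-shm"):
        part = src.with_name(src.name + suffix)
        if part.exists():
            shutil.copy2(part, target / part.name)  # copy2 keeps timestamps
            copied.append(target / part.name)
    return copied
```

Calling backup_config_db() with no arguments uses the paths from this post; pass your own paths if you would rather keep the copy on a USB stick.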

Edit your VM or CT conf file

  1. You'll notice the /etc/pve directory is empty. That's because it's not really a directory: it's a virtual filesystem mounted by the pve-cluster service, which isn't running in rescue mode. The files are actually kept as rows in an SQLite database, and that's what you'll need to edit.
  2. Install sqlite3 if you haven't already. I already had it installed so I can't help you here, but I think you can tether from an Android phone with a USB cable if you need a temporary internet connection (google it!)
  3. Open the database: sqlite3 /var/lib/pve-cluster/config.db
    1. Take a backup of this file first if you haven't already. If you break it you will brick your Proxmox installation.
  4. Edit the setting you need to change, e.g. I wanted to change 'onboot: 1' to 'onboot: 0' in all guests
    1. Per my example: UPDATE tree SET data = REPLACE(data, 'onboot: 1', 'onboot: 0');
  5. Exit sqlite3: .exit
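If you can't install the sqlite3 command-line tool, the same edit can be sketched with Python's built-in sqlite3 module, assuming python3 is present on your install. This is only a sketch of the UPDATE above, not an official Proxmox procedure. The function name disable_onboot is made up, and I'm assuming the tree table has a name column holding the file name (the post above only relies on tree and data). The WHERE clause is my addition so that only rows which actually contain 'onboot: 1' get rewritten.

```python
import sqlite3


def disable_onboot(db_path):
    """Flip 'onboot: 1' to 'onboot: 0' in the `tree` table.

    Hypothetical helper. Assumes `tree` has `name` and `data` columns.
    Returns the names of the rows that were changed.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if anything raises
            # Record which rows will be touched before changing them.
            rows = conn.execute(
                "SELECT name FROM tree WHERE instr(data, 'onboot: 1') > 0"
            ).fetchall()
            # Same REPLACE as the one-liner in the post, limited to matching rows.
            conn.execute(
                "UPDATE tree SET data = REPLACE(data, 'onboot: 1', 'onboot: 0') "
                "WHERE instr(data, 'onboot: 1') > 0"
            )
        return [name for (name,) in rows]
    finally:
        conn.close()
```

I'd try it on a copy of config.db first, e.g. disable_onboot("/root/config-copy.db"), and check that the returned names are the guest conf files you expect before pointing it at /var/lib/pve-cluster/config.db. Running it a second time should return an empty list, which is a cheap way to confirm nothing is left set to auto-boot.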

Reboot and cross your fingers.