[TTM] Buffer eviction failed

I am also running into this problem on a Proxmox machine running 7.4-17, with a VM running Linux Mint 22 (Cinnamon 6.2.9, kernel 6.8.0-47-generic). This Proxmox machine has previously run VMs for weeks at a time without issues, so I'm inclined to think something is wrong with the VM: when I start VMs with older OSes, they still run for long periods (weeks or months) without error. The issue persisted on a second Proxmox machine running 8.2.7, to which the first Linux Mint VM was copied. As a test I installed a VM with PopOS 22.04 LTS (kernel 6.9.3-76060903-generic), and the QXL error has occurred there too.

I'm going to transfer one of the VMs that I have not had a problem with over to the 8.2.7 Proxmox machine and see if I get a QXL error.

If anyone has some recommended tests they would like me to do to help solve this problem I would be more than happy to assist!
 
No joy on any permutation or combination of RAM/VRAM and vgamem settings - for me, the QXL error occurs in all cases and still seems to be random.
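To put a number on "random", a minimal sketch like the one below could pull the failure timestamps out of saved kernel log text and show the gaps between them. It assumes default dmesg-style lines with a seconds-since-boot timestamp in square brackets; the function names and the sample log lines are made up for illustration and are not from a real VM.

```python
import re

# Assumed line format: "[  300.250000] ... [TTM] Buffer eviction failed"
# (default dmesg timestamps, seconds since boot).
EVICTION_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*\[TTM\] Buffer eviction failed")


def eviction_timestamps(log_text):
    """Return the seconds-since-boot timestamp of each eviction failure line."""
    stamps = []
    for line in log_text.splitlines():
        match = EVICTION_RE.match(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def gaps(stamps):
    """Return the time in seconds between consecutive failures."""
    return [round(later - earlier, 6) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Hypothetical sample, only to show the shape of the input.
    sample = (
        "[  120.500000] some unrelated kernel message\n"
        "[  300.250000] [TTM] Buffer eviction failed\n"
        "[  900.750000] [TTM] Buffer eviction failed\n"
    )
    stamps = eviction_timestamps(sample)
    print(stamps, gaps(stamps))
```

Feeding it the saved output of `dmesg` from a few affected VMs would at least show whether the failures cluster or really are spread out.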

Edit:
Assuming I'm reading it right, the kernel changelog for Ubuntu's 6.8.0-48 kernel (which covers e.g. Ubuntu 24.04, Linux Mint 22 and others, if the most up-to-date kernel is installed) shows the following history for the alleged QXL driver bug fix, the discussion of which I previously linked to:

14 Jun 2024
Reverted "drm/qxl: simplify qxl_fence_wait" in upstream kernel 6.8.7, which was pulled into Ubuntu 6.8.0-1008.8-22.04.1 [6.8.0-38.38]

19 Jul 2024
Reapplied "drm/qxl: simplify qxl_fence_wait" in upstream kernel 6.8.10, which was pulled into Ubuntu 6.8.0-1010.10-22.04.1 [6.8.0-40.40]

What's not clear is whether the bug had actually been fixed when the code was reapplied (in 6.8.0-40) or whether it was reapplied in original (buggy) form awaiting a future fix.

To answer my own question, from kernel.org's changelog for upstream kernel 6.8.10:
commit 3dfe35d8683daf9ba69278643efbabe40000bbf6
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon May 6 13:28:59 2024 -0700

Reapply "drm/qxl: simplify qxl_fence_wait"

commit 3628e0383dd349f02f882e612ab6184e4bb3dc10 upstream.

This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea.

Stephen Rostedt reports:
"I went to run my tests on my VMs and the tests hung on boot up.
Unfortunately, the most I ever got out was:

[ 93.607888] Testing event system initcall: OK
[ 93.667730] Running tests on all trace events:
[ 93.669757] Testing all events: OK
[ 95.631064] ------------[ cut here ]------------
Timed out after 60 seconds"

and further debugging points to a possible circular locking dependency
between the console_owner locking and the worker pool locking.

Reverting the commit allows Steve's VM to boot to completion again.

[ This may obviously result in the "[TTM] Buffer eviction failed"
messages again, which was the reason for that original revert. But at
this point this seems preferable to a non-booting system... ]

Reported-and-bisected-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/20240502081641.457aa25f@gandalf.local.home/

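So, any downstream (distro) kernel that pulls from upstream Linux kernel <6.8.7 or >=6.8.10 will have the buggy QXL code, i.e. the "simplify qxl_fence_wait" change that is associated with the "[TTM] Buffer eviction failed" messages. That's for the 6.8 series; other kernel series probably also have the buggy code (e.g. 5.15).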
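As a rough sketch of that version rule, the check could be written as below. It only encodes the reading given in this post (6.8.7 through 6.8.9 carry the revert, everything else in 6.8.x carries the simplified code); the function names are made up, and other kernel series are deliberately rejected because nothing here has been verified for them.

```python
def parse_upstream(version):
    """Turn an upstream version string such as '6.8.10' into (6, 8, 10)."""
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 3:
        parts.append(0)  # treat '6.8' as 6.8.0
    return tuple(parts[:3])


def has_simplified_fence_wait(version):
    """True if a 6.8.x upstream kernel carries "simplify qxl_fence_wait".

    Assumption from this thread: reverted in 6.8.7, reapplied in 6.8.10.
    """
    major, minor, patch = parse_upstream(version)
    if (major, minor) != (6, 8):
        raise ValueError("only the 6.8 series is covered by this sketch")
    return patch < 7 or patch >= 10


if __name__ == "__main__":
    for candidate in ("6.8.6", "6.8.7", "6.8.9", "6.8.10"):
        print(candidate, has_simplified_fence_wait(candidate))
```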

For Ubuntu and derivatives, it looks like kernels 6.8.0-38 and 6.8.0-39 have the reverted code, so I'll see if I can test those.
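For anyone wanting to check which side of that line a guest is on, here is a minimal sketch that reads the Ubuntu ABI number out of a kernel release string such as the one `uname -r` prints. The set of ABI numbers (38 and 39) is just my reading from above, not something confirmed by testing, and the helper names are made up.

```python
import platform
import re

# ABI numbers that, per this post, should carry the reverted (older) QXL code.
REVERTED_ABIS = {38, 39}


def ubuntu_abi(release):
    """Extract the ABI number from a string like '6.8.0-47-generic'.

    Returns None for anything that is not an Ubuntu-style 6.8.0 release string.
    """
    match = re.match(r"^6\.8\.0-(\d+)(?:-|$)", release)
    return int(match.group(1)) if match else None


def has_reverted_code(release):
    """True if the release string names a 6.8.0 kernel with ABI 38 or 39."""
    return ubuntu_abi(release) in REVERTED_ABIS


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports
    print(running, has_reverted_code(running))
```

Run inside the guest, this prints the running kernel release and whether it falls in the 38/39 range.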
 

I have seen this in Debian bullseye, bookworm, trixie, and sid. It has been around for a lot of different kernel versions.
Yes, originally some 3-4 years ago when the QXL driver was first simplified. It's probably in most kernel series since then. But I'm only testing the 6.8 series at the moment.
 
