[SOLVED] VM crashes since yesterday when another runs

jpbaril

Renowned Member
Feb 7, 2015
Hi,

I'm clueless. I have been running Proxmox at home for years, and in this exact setup for a year, but since yesterday, whenever I start another VM, a first one (VM ID 103) crashes.
The same happens in reverse: if another VM is already running, VM ID 103 does not start.
(BTW, I'm not that techie.)

Here are the logs from "journalctl -e" for the case where the other VM (ID 104) is already running and I try to start VM ID 103:

Code:
Dec 20 21:27:09 pve2 pvedaemon[1645]: <root@pam> starting task UPID:pve2:00002B57:00047A21:6583A27D:qmstart:103:root@pam:
Dec 20 21:27:09 pve2 pvedaemon[11095]: start VM 103: UPID:pve2:00002B57:00047A21:6583A27D:qmstart:103:root@pam:
Dec 20 21:27:10 pve2 systemd[1]: Started 103.scope.
Dec 20 21:27:11 pve2 kernel: tap103i0: entered promiscuous mode
Dec 20 21:27:11 pve2 kernel: vmbr0: port 3(tap103i0) entered blocking state
Dec 20 21:27:11 pve2 kernel: vmbr0: port 3(tap103i0) entered disabled state
Dec 20 21:27:11 pve2 kernel: tap103i0: entered allmulticast mode
Dec 20 21:27:11 pve2 kernel: vmbr0: port 3(tap103i0) entered blocking state
Dec 20 21:27:11 pve2 kernel: vmbr0: port 3(tap103i0) entered forwarding state
Dec 20 21:27:12 pve2 kernel: kvm invoked oom-killer: gfp_mask=0x140dc2(GFP_HIGHUSER|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
Dec 20 21:27:12 pve2 kernel: CPU: 2 PID: 11103 Comm: kvm Tainted: P           O       6.5.11-7-pve #1
Dec 20 21:27:12 pve2 kernel: Hardware name: ASUS All Series/H97I-PLUS, BIOS 3602 04/08/2018
Dec 20 21:27:12 pve2 kernel: Call Trace:
Dec 20 21:27:12 pve2 kernel:  <TASK>
Dec 20 21:27:12 pve2 kernel:  dump_stack_lvl+0x48/0x70
Dec 20 21:27:12 pve2 kernel:  dump_stack+0x10/0x20
Dec 20 21:27:12 pve2 kernel:  dump_header+0x4f/0x260
Dec 20 21:27:12 pve2 kernel:  oom_kill_process+0x10d/0x1c0
Dec 20 21:27:12 pve2 kernel:  out_of_memory+0x270/0x560
Dec 20 21:27:12 pve2 kernel:  __alloc_pages+0x114f/0x12e0
Dec 20 21:27:12 pve2 kernel:  __folio_alloc+0x1d/0x60
Dec 20 21:27:12 pve2 kernel:  ? policy_node+0x69/0x80
Dec 20 21:27:12 pve2 kernel:  vma_alloc_folio+0x9f/0x3a0
Dec 20 21:27:12 pve2 kernel:  do_anonymous_page+0x76/0x3c0
Dec 20 21:27:12 pve2 kernel:  __handle_mm_fault+0xb50/0xc30
Dec 20 21:27:12 pve2 kernel:  handle_mm_fault+0x164/0x360
Dec 20 21:27:12 pve2 kernel:  __get_user_pages+0x1f5/0x630
Dec 20 21:27:12 pve2 kernel:  __gup_longterm_locked+0x27e/0xc20
Dec 20 21:27:12 pve2 kernel:  ? __domain_mapping+0x280/0x4a0
Dec 20 21:27:12 pve2 kernel:  pin_user_pages_remote+0x7a/0xb0
Dec 20 21:27:12 pve2 kernel:  vaddr_get_pfns+0x78/0x290 [vfio_iommu_type1]
Dec 20 21:27:12 pve2 kernel:  vfio_pin_pages_remote+0x370/0x4e0 [vfio_iommu_type1]
Dec 20 21:27:12 pve2 kernel:  ? intel_iommu_iotlb_sync_map+0x8f/0x100
Dec 20 21:27:12 pve2 kernel:  vfio_iommu_type1_ioctl+0x10c7/0x1af0 [vfio_iommu_type1]
Dec 20 21:27:12 pve2 kernel:  ? restore_fpregs_from_fpstate+0x47/0xf0
Dec 20 21:27:12 pve2 kernel:  vfio_fops_unl_ioctl+0x6b/0x380 [vfio]
Dec 20 21:27:12 pve2 kernel:  ? __fget_light+0xa5/0x120
Dec 20 21:27:12 pve2 kernel:  __x64_sys_ioctl+0xa3/0xf0
Dec 20 21:27:12 pve2 kernel:  do_syscall_64+0x5b/0x90
Dec 20 21:27:12 pve2 kernel:  ? exit_to_user_mode_prepare+0x39/0x190
Dec 20 21:27:12 pve2 kernel:  ? irqentry_exit_to_user_mode+0x17/0x20
Dec 20 21:27:12 pve2 kernel:  ? irqentry_exit+0x43/0x50
Dec 20 21:27:12 pve2 kernel:  ? exc_page_fault+0x94/0x1b0
Dec 20 21:27:12 pve2 kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Dec 20 21:27:12 pve2 kernel: RIP: 0033:0x7f389bc39b5b
Dec 20 21:27:12 pve2 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3>
Dec 20 21:27:12 pve2 kernel: RSP: 002b:00007ffef22b6550 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 20 21:27:12 pve2 kernel: RAX: ffffffffffffffda RBX: 000055b92b6cba60 RCX: 00007f389bc39b5b
Dec 20 21:27:12 pve2 kernel: RDX: 00007ffef22b65b0 RSI: 0000000000003b71 RDI: 0000000000000024
Dec 20 21:27:12 pve2 kernel: RBP: 0000000100000000 R08: 0000000000000000 R09: ffffffffffffffff
Dec 20 21:27:12 pve2 kernel: R10: 0000000200000000 R11: 0000000000000246 R12: 0000000200000000
Dec 20 21:27:12 pve2 kernel: R13: 0000000200000000 R14: 00007ffef22b65b0 R15: 000055b92b6cba60
Dec 20 21:27:12 pve2 kernel:  </TASK>
Dec 20 21:27:12 pve2 kernel: Mem-Info:
Dec 20 21:27:12 pve2 kernel: active_anon:681236 inactive_anon:3033252 isolated_anon:0
                              active_file:0 inactive_file:116 isolated_file:0
                              unevictable:43288 dirty:95 writeback:0
                              slab_reclaimable:6407 slab_unreclaimable:57351
                              mapped:23316 shmem:18020 pagetables:9923
                              sec_pagetables:1485 bounce:0
                              kernel_misc_reclaimable:0
                              free:33248 free_pcp:449 free_cma:0
Dec 20 21:27:12 pve2 kernel: Node 0 active_anon:2724944kB inactive_anon:12133008kB active_file:0kB inactive_file:464kB unevictable:173152kB isolated(anon):0kB isolated(fi>
Dec 20 21:27:12 pve2 kernel: Node 0 DMA free:13308kB boost:0kB min:64kB low:80kB high:96kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inact>
Dec 20 21:27:12 pve2 kernel: lowmem_reserve[]: 0 3069 15781 15781 15781
Dec 20 21:27:12 pve2 kernel: Node 0 DMA32 free:63592kB boost:0kB min:13132kB low:16412kB high:19692kB reserved_highatomic:0KB active_anon:516344kB inactive_anon:2540680kB>
Dec 20 21:27:12 pve2 kernel: lowmem_reserve[]: 0 0 12711 12711 12711
Dec 20 21:27:12 pve2 kernel: Node 0 Normal free:56092kB boost:0kB min:54384kB low:67980kB high:81576kB reserved_highatomic:2048KB active_anon:2208600kB inactive_anon:9592>
Dec 20 21:27:12 pve2 kernel: lowmem_reserve[]: 0 0 0 0 0
Dec 20 21:27:12 pve2 kernel: Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 2*2048kB (UM) 2*4096kB (M) = 13>
Dec 20 21:27:12 pve2 kernel: Node 0 DMA32: 906*4kB (M) 856*8kB (M) 422*16kB (M) 371*32kB (UM) 235*64kB (M) 132*128kB (M) 10*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB =>
Dec 20 21:27:12 pve2 kernel: Node 0 Normal: 8935*4kB (MEH) 1459*8kB (ME) 177*16kB (UME) 69*32kB (UME) 25*64kB (MEH) 12*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096k>
Dec 20 21:27:12 pve2 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Dec 20 21:27:12 pve2 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 20 21:27:12 pve2 kernel: 23695 total pagecache pages
Dec 20 21:27:12 pve2 kernel: 0 pages in swap cache
Dec 20 21:27:12 pve2 kernel: Free swap  = 0kB
Dec 20 21:27:12 pve2 kernel: Total swap = 0kB
Dec 20 21:27:12 pve2 kernel: 4162736 pages RAM
Dec 20 21:27:12 pve2 kernel: 0 pages HighMem/MovableOnly
Dec 20 21:27:12 pve2 kernel: 102114 pages reserved
Dec 20 21:27:12 pve2 kernel: 0 pages hwpoisoned
Dec 20 21:27:12 pve2 kernel: Tasks state (memory values in pages):
Dec 20 21:27:12 pve2 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Dec 20 21:27:12 pve2 kernel: [    586]     0   586    10351      928   110592        0          -250 systemd-journal
Dec 20 21:27:12 pve2 kernel: [    599]     0   599    20145     6272    94208        0         -1000 dmeventd
Dec 20 21:27:12 pve2 kernel: [    602]     0   602     7077     1504    77824        0         -1000 systemd-udevd
Dec 20 21:27:12 pve2 kernel: [   1205]     0  1205    19796      352    57344        0             0 pvefw-logger
Dec 20 21:27:12 pve2 kernel: [   1207]   103  1207     1969      768    57344        0             0 rpcbind
Dec 20 21:27:12 pve2 kernel: [   1273]   102  1273     2314      992    53248        0          -900 dbus-daemon
Dec 20 21:27:12 pve2 kernel: [   1276]     0  1276    38187      480    69632        0         -1000 lxcfs
Dec 20 21:27:12 pve2 kernel: [   1277]     0  1277    69539      768    86016        0             0 pve-lxc-syscall
Dec 20 21:27:12 pve2 kernel: [   1279]     0  1279    55444     1152    86016        0             0 rsyslogd
Dec 20 21:27:12 pve2 kernel: [   1281]     0  1281     3031     1216    65536        0             0 smartd
Dec 20 21:27:12 pve2 kernel: [   1283]     0  1283     2399      275    53248        0             0 ksmtuned
Dec 20 21:27:12 pve2 kernel: [   1285]     0  1285     1327      352    45056        0             0 qmeventd
Dec 20 21:27:12 pve2 kernel: [   1293]     0  1293     8391     1312    81920        0             0 systemd-logind
Dec 20 21:27:12 pve2 kernel: [   1295]     0  1295      583      320    40960        0         -1000 watchdog-mux
Dec 20 21:27:12 pve2 kernel: [   1301]     0  1301    60164     1088   102400        0             0 zed
Dec 20 21:27:12 pve2 kernel: [   1376]     0  1376     1256      416    49152        0             0 lxc-monitord
Dec 20 21:27:12 pve2 kernel: [   1394]     0  1394     3333      428    57344        0             0 iscsid
Dec 20 21:27:12 pve2 kernel: [   1395]     0  1395     3459     3375    65536        0           -17 iscsid
Dec 20 21:27:12 pve2 kernel: [   1410]     0  1410     3852     1920    69632        0         -1000 sshd
Dec 20 21:27:12 pve2 kernel: [   1414]     0  1414     2101      384    49152        0             0 agetty
Dec 20 21:27:12 pve2 kernel: [   1429]   101  1429     4715      618    57344        0             0 chronyd
Dec 20 21:27:12 pve2 kernel: [   1431]   101  1431     2633      499    57344        0             0 chronyd
Dec 20 21:27:12 pve2 kernel: [   1479]     0  1479   110158      595   143360        0             0 rrdcached
Dec 20 21:27:12 pve2 kernel: [   1493]     0  1493   159213    16638   450560        0             0 pmxcfs
Dec 20 21:27:12 pve2 kernel: [   1586]     0  1586    10664      549    69632        0             0 master
Dec 20 21:27:12 pve2 kernel: [   1587]   106  1587    10762      672    77824        0             0 pickup
Dec 20 21:27:12 pve2 kernel: [   1588]   106  1588    10774      704    73728        0             0 qmgr
Dec 20 21:27:12 pve2 kernel: [   1593]     0  1593   139903    41725   401408        0             0 corosync
Dec 20 21:27:12 pve2 kernel: [   1594]     0  1594     2285      512    53248        0             0 cron
Dec 20 21:27:12 pve2 kernel: [   1615]     0  1615    39949    24167   303104        0             0 pve-firewall
Dec 20 21:27:12 pve2 kernel: [   1616]     0  1616    38719    23443   307200        0             0 pvestatd
Dec 20 21:27:12 pve2 kernel: [   1620]     0  1620      615      256    45056        0             0 bpfilter_umh
Dec 20 21:27:12 pve2 kernel: [   1643]     0  1643    58940    33852   413696        0             0 pvedaemon
Dec 20 21:27:12 pve2 kernel: [   1644]     0  1644    63957    37885   458752        0             0 pvedaemon worke
Dec 20 21:27:12 pve2 kernel: [   1645]     0  1645    61124    34973   434176        0             0 pvedaemon worke
Dec 20 21:27:12 pve2 kernel: [   1646]     0  1646    61145    35037   434176        0             0 pvedaemon worke
Dec 20 21:27:12 pve2 kernel: [   1650]     0  1650    55462    27601   364544        0             0 pve-ha-crm
Dec 20 21:27:12 pve2 kernel: [   1652]    33  1652    59303    34220   409600        0             0 pveproxy
Dec 20 21:27:12 pve2 kernel: [   1653]    33  1653    61556    35596   442368        0             0 pveproxy worker
Dec 20 21:27:12 pve2 kernel: [   1654]    33  1654    62485    36556   466944        0             0 pveproxy worker
Dec 20 21:27:12 pve2 kernel: [   1655]    33  1655    62491    36620   454656        0             0 pveproxy worker
Dec 20 21:27:12 pve2 kernel: [   1658]    33  1658    20833    12915   180224        0             0 spiceproxy
Dec 20 21:27:12 pve2 kernel: [   1659]    33  1659    20887    13012   180224        0             0 spiceproxy work
Dec 20 21:27:12 pve2 kernel: [   1660]     0  1660    55325    27590   364544        0             0 pve-ha-lrm
Dec 20 21:27:12 pve2 kernel: [   1761]     0  1761   578893   403148  4022272        0             0 kvm
Dec 20 21:27:12 pve2 kernel: [   1816]     0  1816    54262    27979   368640        0             0 pvescheduler
Dec 20 21:27:12 pve2 kernel: [   3660]     0  3660     4495     2080    73728        0             0 sshd
Dec 20 21:27:12 pve2 kernel: [   3665]     0  3665     5034     1792    81920        0           100 systemd
Dec 20 21:27:12 pve2 kernel: [   3666]     0  3666    26092     1281    86016        0           100 (sd-pam)
Dec 20 21:27:12 pve2 kernel: [   3686]     0  3686     1529      640    53248        0             0 login
Dec 20 21:27:12 pve2 kernel: [   3691]     0  3691     2658      768    65536        0             0 bash
Dec 20 21:27:12 pve2 kernel: [   8255]     0  8255   926519   533269  5234688        0             0 kvm
Dec 20 21:27:12 pve2 kernel: [   9929]     0  9929     4495     2016    77824        0             0 sshd
Dec 20 21:27:12 pve2 kernel: [   9935]     0  9935     1529      736    49152        0             0 login
Dec 20 21:27:12 pve2 kernel: [   9944]     0  9944     2658      736    53248        0             0 bash
Dec 20 21:27:12 pve2 kernel: [   9978]     0  9978   214139     1248   929792        0             0 journalctl
Dec 20 21:27:12 pve2 kernel: [   9979]     0  9979     2127      448    57344        0             0 pager
Dec 20 21:27:12 pve2 kernel: [  10928]     0 10928     4495     2112    77824        0             0 sshd
Dec 20 21:27:12 pve2 kernel: [  10934]     0 10934     1529      736    45056        0             0 login
Dec 20 21:27:12 pve2 kernel: [  10939]     0 10939     2658      736    61440        0             0 bash
Dec 20 21:27:12 pve2 kernel: [  10948]     0 10948     1999      320    53248        0             0 sleep
Dec 20 21:27:12 pve2 kernel: [  11095]     0 11095    63097    34525   430080        0             0 task UPID:pve2:
Dec 20 21:27:12 pve2 kernel: [  11098]     0 11098    63097    34349   421888        0             0 task UPID:pve2:
Dec 20 21:27:12 pve2 kernel: [  11099]     0 11099    44897     3505   249856        0             0 kvm
Dec 20 21:27:12 pve2 kernel: [  11103]     0 11103  2790748  2554666 20783104        0             0 kvm
Dec 20 21:27:12 pve2 kernel: [  11109]     0 11109     7078      998    61440        0             0 (udev-worker)
Dec 20 21:27:12 pve2 kernel: [  11110]     0 11110     7078     1062    61440        0             0 (udev-worker)
Dec 20 21:27:12 pve2 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/103.scope,task=kvm,pid>
Dec 20 21:27:12 pve2 kernel: Out of memory: Killed process 11103 (kvm) total-vm:11162992kB, anon-rss:10217000kB, file-rss:1664kB, shmem-rss:0kB, UID:0 pgtables:20296kB oo>
Dec 20 21:27:12 pve2 systemd[1]: 103.scope: A process of this unit has been killed by the OOM killer.
Dec 20 21:27:12 pve2 systemd[1]: 103.scope: Failed with result 'oom-kill'.
Dec 20 21:27:12 pve2 systemd[1]: 103.scope: Consumed 1.938s CPU time.
Dec 20 21:27:12 pve2 kernel: vmbr0: port 3(tap103i0) entered disabled state
Dec 20 21:27:12 pve2 kernel: tap103i0 (unregistering): left allmulticast mode
Dec 20 21:27:12 pve2 kernel: vmbr0: port 3(tap103i0) entered disabled state
Dec 20 21:27:12 pve2 pvedaemon[1646]: VM 103 qmp command failed - VM 103 not running
Dec 20 21:27:13 pve2 pvedaemon[11095]: start failed: QEMU exited with code 1
Dec 20 21:27:13 pve2 pvedaemon[1645]: <root@pam> end task UPID:pve2:00002B57:00047A21:6583A27D:qmstart:103:root@pam: start failed: QEMU exited with code 1

Again, I'm not that techie, so I don't understand these logs.

I seem to see something about running out of memory, but that does not make sense because 1) as I said, it was working before, and 2) the host has 16 GB of RAM, VM 103 has 8 GB and VM 104 has 2 GB, so there should be plenty of RAM left.

That's a big issue for me because VM 103 is my virtualized desktop...

Thank you.
 
… and your currently used PVE & kernel version. Are you using GPU passthrough for your desktop VM?
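You can grab those with, for example (pveversion -v also lists the running kernel):

Code:
free -h
pveversion -v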
 
Code:
               total        used        free      shared  buff/cache   available
Mem:            15Gi        14Gi       661Mi        67Mi       171Mi       538Mi
Swap:             0B          0B          0B

PVE: 8.1.3
Kernel: 6.5.11-7-pve (2023-12-05T09:44Z)

Yes, I'm passing through the GPU. Q35 machine with SeaBIOS.
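For reference, the relevant part of the VM config (/etc/pve/qemu-server/103.conf) looks roughly like this -- quoting from memory, so the PCI address is approximate:

Code:
bios: seabios
machine: q35
memory: 8192
hostpci0: 0000:01:00.0,pcie=1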
 
Can you lower the assigned RAM of your VM by 2 GB and try to start it again?
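From the shell that would be, for example (assuming the VM currently has 8192 MB assigned):

Code:
qm set 103 --memory 6144
qm start 103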
 
I don't remember exactly what I did -- it was very late, or rather early -- but now it kind of works. All I remember is deactivating memory ballooning for VM ID 103; I also migrated a small VM (1 GB RAM) -- one that was not even always running during my troubleshooting -- to another server. Yet now I'm able to start VM ID 103 and VM ID 104 just as before. Sorry for not being able to provide a clear resolution debrief.
I'll maybe try to bring back that third VM and see if it still works, and I'll report back then. The commands I used are sketched below.
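In case it helps anyone, the shell equivalents of those two steps were something like the following -- the small VM's ID and the target node name are made up here, since I don't remember them exactly:

Code:
qm set 103 --balloon 0
qm migrate 105 pve1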
 
