OOM Problem? WIN10 VM updated and won't restart

abraxas1

Member
Feb 17, 2022
New to PVE, but this system was stable for a couple of months. Then I let WIN10 install an update and now it won't boot.
This is where I host BlueIris, so no security cameras in my new house for now...

My main storage is a single 5TB drive, currently formatted as ext4. Planning on setting up a NAS "soon".
But I can't tell what's full or why.
Here's my journal log during the boot attempt on the WIN10 VM:

root@pve:/backup/images# cat /tmp/journal.log
-- Journal begins at Sun 2022-02-06 21:15:37 PST. --
Apr 19 09:42:03 pve qmeventd[2893084]: Starting cleanup for 102
Apr 19 09:42:03 pve kernel: fwbr102i0: port 1(fwln102i0) entered disabled state
Apr 19 09:42:03 pve kernel: vmbr0: port 4(fwpr102p0) entered disabled state
Apr 19 09:42:03 pve kernel: device fwln102i0 left promiscuous mode
Apr 19 09:42:03 pve kernel: fwbr102i0: port 1(fwln102i0) entered disabled state
Apr 19 09:42:03 pve kernel: device fwpr102p0 left promiscuous mode
Apr 19 09:42:03 pve kernel: vmbr0: port 4(fwpr102p0) entered disabled state
Apr 19 09:42:03 pve qmeventd[2893084]: Finished cleanup for 102
Apr 19 09:46:07 pve smartd[1102]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 120 to 117
Apr 19 09:46:08 pve smartd[1102]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 33 to 34
Apr 19 09:47:04 pve pvedaemon[2896000]: start VM 102: UPID:pve:002C3080:0E5B07C1:625EE788:qmstart:102:root@pam:
Apr 19 09:47:04 pve pvedaemon[2876537]: <root@pam> starting task UPID:pve:002C3080:0E5B07C1:625EE788:qmstart:102:root@pam:
Apr 19 09:47:05 pve systemd[1]: Started 102.scope.
Apr 19 09:47:05 pve pvedaemon[2876537]: <root@pam> end task UPID:pve:002C3080:0E5B07C1:625EE788:qmstart:102:root@pam: OK
Apr 19 09:47:10 pve pvedaemon[2876537]: VM 102 qmp command failed - VM 102 qmp command 'guest-ping' failed - got timeout
Apr 19 09:47:16 pve pvedaemon[2872287]: VM 102 qmp command failed - VM 102 qmp command 'guest-ping' failed - got timeout
Apr 19 09:47:17 pve pvedaemon[2872287]: <root@pam> successful auth for user 'root@pam'
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:29 pve kernel: kvm [2896012]: vcpu1, guest rIP: 0xfffff8073f9f7b62 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x1, nop
Apr 19 09:47:59 pve kernel: kvm invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr 19 09:47:59 pve kernel: CPU: 3 PID: 1524 Comm: kvm Tainted: P O 5.13.19-2-pve #1
Apr 19 09:48:00 pve kernel: Mem-Info:
Apr 19 09:48:00 pve kernel: active_anon:2008660 inactive_anon:1629301 isolated_anon:0
active_file:196 inactive_file:274 isolated_file:0
unevictable:7993 dirty:0 writeback:0
slab_reclaimable:69671 slab_unreclaimable:118259
mapped:11925 shmem:8899 pagetables:16099 bounce:0
free:32848 free_pcp:0 free_cma:0
Apr 19 09:48:00 pve kernel: Node 0 active_anon:8034640kB inactive_anon:6517204kB active_file:784kB inactive_file:1096kB unevictable:31972kB isolated(anon):0kB isolated(file):0kB mapped:47700kB dirty:0kB writeback:0kB shmem:35596kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1261568kB writeback_tmp:0kB kernel_stack:7328kB pagetables:64396kB all_unreclaimable? no
Apr 19 09:48:00 pve kernel: Node 0 DMA free:13264kB min:64kB low:80kB high:96kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 19 09:48:00 pve kernel: lowmem_reserve[]: 0 3081 15783 15783 15783
Apr 19 09:48:00 pve kernel: Node 0 DMA32 free:63856kB min:13184kB low:16480kB high:19776kB reserved_highatomic:0KB active_anon:2813740kB inactive_anon:137456kB active_file:156kB inactive_file:64kB unevictable:8kB writepending:0kB present:3280544kB managed:3214432kB mlocked:8kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 19 09:48:00 pve kernel: lowmem_reserve[]: 0 0 12701 12701 12701
Apr 19 09:48:00 pve kernel: Node 0 Normal free:54272kB min:54332kB low:67912kB high:81492kB reserved_highatomic:0KB active_anon:5220832kB inactive_anon:6379760kB active_file:756kB inactive_file:2064kB unevictable:31964kB writepending:0kB present:13336576kB managed:13013184kB mlocked:31848kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 19 09:48:00 pve kernel: lowmem_reserve[]: 0 0 0 0 0
Apr 19 09:48:00 pve kernel: Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 2*32kB (UE) 2*64kB (UE) 2*128kB (UE) 2*256kB (UE) 2*512kB (UE) 1*1024kB (E) 3*2048kB (UME) 1*4096kB (M) = 13264kB
Apr 19 09:48:00 pve kernel: Node 0 DMA32: 1175*4kB (UME) 1086*8kB (UME) 1451*16kB (UME) 465*32kB (UME) 110*64kB (UME) 20*128kB (UM) 4*256kB (U) 3*512kB (M) 1*1024kB (M) 0*2048kB 0*4096kB = 64668kB
Apr 19 09:48:00 pve kernel: Node 0 Normal: 119*4kB (UM) 40*8kB (UM) 1345*16kB (UM) 1021*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 54988kB
Apr 19 09:48:00 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Apr 19 09:48:00 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 19 09:48:00 pve kernel: 176643 total pagecache pages
Apr 19 09:48:00 pve kernel: 163143 pages in swap cache
Apr 19 09:48:00 pve kernel: Swap cache stats: add 1752779350, delete 1752609936, find 1118336183/1558075436
Apr 19 09:48:00 pve kernel: Free swap = 0kB
Apr 19 09:48:00 pve kernel: Total swap = 8388604kB
Apr 19 09:48:00 pve kernel: 4158277 pages RAM
Apr 19 09:48:00 pve kernel: 0 pages HighMem/MovableOnly
Apr 19 09:48:00 pve kernel: 97533 pages reserved
Apr 19 09:48:00 pve kernel: 0 pages hwpoisoned
Apr 19 09:48:00 pve kernel: Tasks state (memory values in pages):
Apr 19 09:48:00 pve kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Apr 19 09:48:00 pve kernel: [ 445] 0 445 20094 6282 94208 0 -1000 dmeventd
Apr 19 09:48:00 pve kernel: [2896012] 0 2896012 4516892 2563138 29265920 38970 0 kvm
Apr 19 09:48:00 pve kernel: [2896344] 0 2896344 1326 16 49152 0 0 sleep
Apr 19 09:48:00 pve kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/102.scope,task=kvm,pid=2896012,uid=0
Apr 19 09:48:00 pve kernel: Out of memory: Killed process 2896012 (kvm) total-vm:18067568kB, anon-rss:10252548kB, file-rss:0kB, shmem-rss:4kB, UID:0 pgtables:28580kB oom_score_adj:0
Apr 19 09:48:00 pve kernel: oom_reaper: reaped process 2896012 (kvm), now anon-rss:0kB, file-rss:68kB, shmem-rss:4kB
Apr 19 09:47:59 pve systemd[1]: 102.scope: A process of this unit has been killed by the OOM killer.
Apr 19 09:48:00 pve kernel: fwbr102i0: port 2(tap102i0) entered disabled state
Apr 19 09:48:00 pve kernel: fwbr102i0: port 2(tap102i0) entered disabled state
Apr 19 09:47:59 pve pvedaemon[2881645]: closing with write buffer at /usr/share/perl5/IO/Multiplex.pm line 928.
Apr 19 09:47:59 pve pvedaemon[2881645]: VM 102 qmp command failed - VM 102 qmp command 'query-proxmox-support' failed - client closed connection
Apr 19 09:47:59 pve postfix/qmgr[1416]: 45C0A380EB6: from=<root@pve.cecere.invalid>, size=40657, nrcpt=1 (queue active)
Apr 19 09:48:00 pve pve-firewall[1431]: firewall update time (9.549 seconds)
Apr 19 09:48:00 pve systemd[1]: 102.scope: Succeeded.
Apr 19 09:48:00 pve systemd[1]: 102.scope: Consumed 3min 15.387s CPU time.
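
For anyone reading the dump above, the key figures are easier to judge once they are in GiB. The following is a minimal sketch, not an official tool: the function names are made up for illustration, the sample lines are copied from this journal, and the 4 kB page size is an assumption (it agrees with this dump, where active_anon:2008660 pages is reported as 8034640kB).

```sh
#!/bin/sh
# Sample lines copied from the journal above. On a live host you would feed
# real kernel messages in instead (for example from `journalctl -k`).
log='kernel: Free swap = 0kB
kernel: Total swap = 8388604kB
kernel: 4158277 pages RAM
kernel: Out of memory: Killed process 2896012 (kvm) total-vm:18067568kB, anon-rss:10252548kB, file-rss:0kB, shmem-rss:4kB, UID:0 pgtables:28580kB oom_score_adj:0'

# Convert a kB figure to GiB with two decimals.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1048576 }'
}

# Convert a page count to GiB, assuming 4 kB pages (hypothetical helper).
pages_to_gib() {
    awk -v pages="$1" 'BEGIN { printf "%.2f\n", pages * 4 / 1048576 }'
}

# Pull the anon-rss value (in kB) out of the "Killed process" line.
killed_rss_kb() {
    printf '%s\n' "$1" | sed -n 's/.*Killed process.*anon-rss:\([0-9]*\)kB.*/\1/p'
}

# Pull the total page count out of the "pages RAM" line.
ram_pages=$(printf '%s\n' "$log" | sed -n 's/.* \([0-9]*\) pages RAM$/\1/p')

echo "host RAM:     $(pages_to_gib "$ram_pages") GiB"
echo "kvm anon-rss: $(kb_to_gib "$(killed_rss_kb "$log")") GiB"
```

With the numbers from this log, the host has about 15.86 GiB of RAM, the killed kvm process was holding about 9.78 GiB of it, and the 8 GiB of swap was completely used (Free swap = 0kB).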
 
Short reply: try to give it more RAM.

Good luck
Yes, I tried that right away.
I see now that Windows is trying to install an update after booting, but it dies within a few seconds. Even if I hit F8 and get into recovery, it still dies at around the same time, so it doesn't look like it is reaching the same point of needing more memory and not getting it.
Moving my whole house functionality to Proxmox has been great, but now all my cameras are down and I need new debugging skills....
 
You could post the contents of /etc/pve/qemu-server/xxx.conf (where xxx is the ID of your problem VM).
What is the spec of your server: how much RAM/storage does it have, and how many other VMs are running on your host?
 
Dell Inc. OptiPlex 7050/0NW6H5
i7-7700 16GB
3 other guests,
container with Plex, 8GB
VM with Home Assistant, 2GB
VM with OpenMediaVault, 8GB.
OMV is serving a single 5TB drive with Samba, which is also used for backups on PVE.
(Yes, the NAS is on my xmas list, and working its way higher up every day.)
Not sure if it's related to the Windows update that seems to have started this boot loop, but it even loops if I get into Windows recovery with F8.

~# cat /etc/pve/qemu-server/102.conf
agent: 1
balloon: 0
boot: order=scsi0;ide2;net0;ide1
cores: 8
ide1: local:iso/virtio-win-0.1.208.iso,media=cdrom,size=543390K
ide2: local:iso/Windows10.iso,media=cdrom
machine: pc-i440fx-6.1
memory: 24000
meta: creation-qemu=6.1.0,ctime=1644560151
name: Win10BlueIris1
net0: virtio=96:2B:FB:97:2C:71,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win10
scsi0: local-lvm:vm-102-disk-1,cache=writeback,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=ffc9b00d-2abf-4a88-a391-b96085abdc58
sockets: 1
unused0: local-lvm:vm-102-disk-0
vmgenid: 2442a3a3-972a-4151-a44b-d01359b4f850
vmstatestorage: local-lvm
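
The two lines that matter for the OOM question in this config are `memory: 24000` (in MiB) and `balloon: 0` (ballooning disabled). As a rough sketch, they can be pulled out of the config text like this. `conf_value` is a hypothetical helper, and the sample text is a subset of the config above so the snippet runs anywhere; on the host itself the same text comes from `qm config 102`.

```sh
#!/bin/sh
# Subset of /etc/pve/qemu-server/102.conf as posted above.
conf='agent: 1
balloon: 0
cores: 8
memory: 24000
name: Win10BlueIris1
sockets: 1'

# Print the value of one "key: value" line (hypothetical helper).
conf_value() {
    printf '%s\n' "$1" | awk -F': ' -v key="$2" '$1 == key { print $2; exit }'
}

echo "memory (MiB): $(conf_value "$conf" memory)"
echo "balloon:      $(conf_value "$conf" balloon)"
```

With ballooning off, the guest is allowed to touch the full configured amount, so the setting needs to fit within what the host can actually provide.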
 
Well, you are massively over-committed on memory. I'd suggest you look at how much memory each VM needs as a minimum to run. For example, I know Plex will run quite happily with 1 or 2GB of RAM, the official OpenMediaVault minimum requirement is 1GB, and don't forget that Proxmox itself needs a couple of GB.
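
To put a number on the over-commit, here is a small sketch that adds up the guest allocations mentioned in this thread and compares them with the host. Treating "8GB"/"2GB" as 8192/2048 MiB and the 16GB host as 16384 MiB is an assumption, the function names are invented for illustration, and the Win10 figure is the literal `memory: 24000` from the posted config.

```sh
#!/bin/sh
# Configured memory per guest in MiB, taken from the posts above
# (GB-to-MiB conversions are assumptions, see the note before this block).
host_mib=16384
guests_mib="24000 8192 2048 8192"   # Win10 (102), Plex CT, Home Assistant, OMV

# Sum a whitespace-separated list of MiB values.
sum_mib() {
    total=0
    for m in $1; do
        total=$((total + m))
    done
    echo "$total"
}

# Ratio of allocated to physical memory, two decimals.
overcommit_ratio() {
    awk -v a="$1" -v h="$2" 'BEGIN { printf "%.2f\n", a / h }'
}

allocated=$(sum_mib "$guests_mib")
echo "allocated: ${allocated} MiB, host: ${host_mib} MiB"
echo "overcommit ratio: $(overcommit_ratio "$allocated" "$host_mib")"
```

That comes to 42432 MiB promised to guests on a 16384 MiB host, a ratio of about 2.59, before counting what Proxmox itself needs.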
 
If only being over-committed on memory were the worst thing going on here.
All balanced out now. Making proper backups, learned a lot.
yeah
 
