what do you mean? have you checked the logs from last week?
Both syslog/kern have no logs.
yes, when it happens again make sure to send the logs from that timeframe here. (tip: use [code][/code] tags, or attach the syslog/journal file)
Any way to find out the cause? Surely it will happen again.
# cat kern.log.1
Jan 25 12:01:56 XXX kernel: [1122677.458424] device tap41010i0 entered promiscuous mode
Jan 25 12:01:56 XXX kernel: [1122677.528142] fwbr41010i0: port 1(tap41010i0) entered blocking state
Jan 25 12:01:56 XXX kernel: [1122677.528156] fwbr41010i0: port 1(tap41010i0) entered disabled state
Jan 25 12:01:56 XXX kernel: [1122677.528470] fwbr41010i0: port 1(tap41010i0) entered blocking state
Jan 25 12:01:56 XXX kernel: [1122677.528476] fwbr41010i0: port 1(tap41010i0) entered forwarding state
Jan 25 12:01:56 XXX kernel: [1122677.558311] device fwln41010o0 entered promiscuous mode
Jan 25 12:01:56 XXX kernel: [1122677.591262] fwbr41010i0: port 2(fwln41010o0) entered blocking state
Jan 25 12:01:56 XXX kernel: [1122677.591275] fwbr41010i0: port 2(fwln41010o0) entered disabled state
Jan 25 12:01:56 XXX kernel: [1122677.591554] fwbr41010i0: port 2(fwln41010o0) entered blocking state
Jan 25 12:01:56 XXX kernel: [1122677.591560] fwbr41010i0: port 2(fwln41010o0) entered forwarding state
# cat kern.log | head -n 1000 | sed -e 's/pve-stor-node-01041/XXX/g'
Jan 30 05:25:48 XXX kernel: [1530908.395381] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
Jan 30 05:25:49 XXX kernel: [1530908.564738] fwbr41010i0: port 1(tap41010i0) entered disabled state
Jan 30 05:25:49 XXX kernel: [1530908.565060] fwbr41010i0: port 1(tap41010i0) entered disabled state
Jan 30 05:25:50 XXX kernel: [1530909.764557] fwbr41010i0: port 2(fwln41010o0) entered disabled state
Jan 30 05:25:50 XXX kernel: [1530909.766005] device fwln41010o0 left promiscuous mode
Jan 30 05:25:50 XXX kernel: [1530909.766020] fwbr41010i0: port 2(fwln41010o0) entered disabled state
Feb 2 18:05:28 XXX kernel: [1835687.261013] device tap41010i0 entered promiscuous mode
Feb 2 18:05:28 XXX kernel: [1835687.330664] fwbr41010i0: port 1(tap41010i0) entered blocking state
Feb 2 18:05:28 XXX kernel: [1835687.330678] fwbr41010i0: port 1(tap41010i0) entered disabled state
Feb 2 18:05:28 XXX kernel: [1835687.330982] fwbr41010i0: port 1(tap41010i0) entered blocking state
Feb 2 18:05:28 XXX kernel: [1835687.330989] fwbr41010i0: port 1(tap41010i0) entered forwarding state
Feb 2 18:05:28 XXX kernel: [1835687.360419] device fwln41010o0 entered promiscuous mode
Feb 2 18:05:28 XXX kernel: [1835687.393673] fwbr41010i0: port 2(fwln41010o0) entered blocking state
Feb 2 18:05:28 XXX kernel: [1835687.393684] fwbr41010i0: port 2(fwln41010o0) entered disabled state
Feb 2 18:05:28 XXX kernel: [1835687.393970] fwbr41010i0: port 2(fwln41010o0) entered blocking state
Feb 2 18:05:28 XXX kernel: [1835687.393977] fwbr41010i0: port 2(fwln41010o0) entered forwarding state
# cat syslog
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: KVM: entry failed, hardware error 0x80000021
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: If you're running a guest on an Intel machine without unrestricted mode
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: support, the failure can be most likely due to the guest entering an invalid
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: state for Intel VT. For example, the guest maybe running in big real mode
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: which is not supported on less recent Intel processors.
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: EAX=0000000d EBX=00000000 ECX=00000000 EDX=00000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: ESI=c45eff58 EDI=0000000d EBP=c45eff48 ESP=c45eff38
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: ES =0000 00000000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: CS =c000 7ffc0000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: SS =0000 00000000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: DS =0000 00000000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: FS =0000 00000000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: GS =0000 00000000 ffffffff 00809300
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: LDT=0000 00000000 000fffff 00000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: TR =0040 0009c000 0000206f 00008b00
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: GDT= 0009a000 0000007f
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: IDT= 00000000 00000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: CR0=00050032 CR2=0b31b400 CR3=649ae000 CR4=00000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: DR6=00000000ffff0ff0 DR7=0000000000000400
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: EFER=0000000000000000
Jan 30 05:25:48 pve-stor-node-01041 QEMU[641241]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
Jan 30 05:25:50 pve-stor-node-01041 qmeventd[2466270]: Starting cleanup for 41010
Jan 30 05:25:50 pve-stor-node-01041 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln41010o0
Jan 30 05:25:50 pve-stor-node-01041 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln41010i0
Jan 30 05:25:50 pve-stor-node-01041 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln41010i0
Jan 30 05:25:50 pve-stor-node-01041 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap41010i0
Jan 30 05:25:50 pve-stor-node-01041 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named tap41010i0
Jan 30 05:25:50 pve-stor-node-01041 qmeventd[2466270]: Finished cleanup for 41010
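For sending "the logs from that timeframe", a small filter can cut a syslog file down to the lines around each `KVM: entry failed` message. This is only a sketch in Python: the function names and the 120-second window are made up for illustration, and because classic syslog timestamps carry no year, the year has to be passed in by hand.

```python
from datetime import datetime, timedelta

# Marker string copied from the syslog excerpt above.
MARKER = "KVM: entry failed"


def parse_stamp(line, year):
    """Return the timestamp of a classic syslog line, or None if it has none.

    Assumes the 'Mon DD HH:MM:SS host ...' layout seen in this thread.
    """
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def lines_around_failure(lines, year, window_s=120):
    """Return the lines stamped within window_s seconds of any marker line."""
    stamped = [(parse_stamp(line, year), line) for line in lines]
    hits = [t for t, line in stamped if t is not None and MARKER in line]
    delta = timedelta(seconds=window_s)
    return [
        line
        for t, line in stamped
        if t is not None and any(abs(t - h) <= delta for h in hits)
    ]
```

Feeding it the lines of `/var/log/syslog` (for example via `open(...).read().splitlines()`) and printing the result gives a short excerpt that can be pasted between [code][/code] tags.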
# qm config 01041
400 Parameter verification failed.
vmid: invalid format - value does not look like a valid VM ID
qm config <vmid> [OPTIONS]
root@pve-stor-node-01041:/var/log# qm config 41010
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 6
cpu: kvm64,flags=+aes
efidisk0: local-lvm:vm-41010-disk-1,efitype=4m,pre-enrolled-keys=1,size=4M
ide2: none,media=cdrom
machine: q35
memory: 12288
meta: creation-qemu=6.1.0,ctime=1641958361
name: ...
net0: virtio=CA:C0:35:7D:84:1A,bridge=vmbr1,firewall=1
numa: 1
onboot: 1
ostype: l26
scsi0: local-lvm:vm-41010-disk-0,discard=on,size=80G,ssd=1
scsi2: /dev/disk/by-id/wwn-0x600508b1001cefb58834258d393a01d0,size=586029016K
scsi3: /dev/disk/by-id/wwn-0x600508b1001c4be13955d4caee7222ca,size=586029016K
scsi4: /dev/disk/by-id/wwn-0x600508b1001c0f14f83d467a11c8ca45,size=586029016K
scsi5: /dev/disk/by-id/wwn-0x600508b1001cefcb19caf75cbcbbfe9e,size=586029016K
scsi6: /dev/disk/by-id/wwn-0x600508b1001c78bc89ad00e6fc0751a4,size=586029016K
scsi7: /dev/disk/by-id/wwn-0x600508b1001c8412768344d96e0f8667,size=586029016K
scsihw: virtio-scsi-pci
smbios1: uuid=...
sockets: 1
vmgenid: ...
After following that comment, there is no graphical output, as shown below, and the VM does not start normally.
also it would help us with identifying the issue if you could provide the following info:
* CPU model (lscpu)
* is this a nested setup? (are you running PVE on another hypervisor? if so, which?)
* which PVE kernel version is running? uname -a on the PVE host
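Two of the items above can be read straight out of the `lscpu` output. The Python sketch below pulls out the model name and checks two CPU flags: `vmx` (Intel VT-x) and `hypervisor`, which Linux lists when the system itself runs as a guest, so its presence hints at a nested setup. The function name is hypothetical, and the parser assumes the `Flags:` entry is on a single, unwrapped line.

```python
def summarize_lscpu(text):
    """Pick a few fields out of raw `lscpu` output (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain further colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    flags = fields.get("Flags", "").split()
    return {
        "model_name": fields.get("Model name"),
        "has_vmx": "vmx" in flags,
        # Present when the machine itself is a guest of another hypervisor.
        "hypervisor_flag": "hypervisor" in flags,
    }
```

The kernel version still has to come from `uname -a`; this only covers the CPU model and the nesting hint.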
qm showcmd 41010
/usr/bin/kvm -id 41010 -name k8s-stor-node-02063 -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/41010.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/41010.pid -daemonize -smbios 'type=1,uuid=ff5a427b-3f02-4409-a7ff-5f56703a96df' -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.secboot.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=540672,file=/dev/pve/vm-41010-disk-1' -smp '6,sockets=1,cores=6,maxcpus=6' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc 'unix:/var/run/qemu-server/41010.vnc,password=on' -cpu kvm64,+aes,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep -m 12288 -object 'memory-backend-ram,id=ram-node0,size=12288M' -numa 'node,nodeid=0,cpus=0-5,memdev=ram-node0' -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=4f65b6d4-f20a-43bc-b9da-d3b74312b80d' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/41010.qga,server=on,wait=off,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:d6d372cf8ece' -drive 'if=none,id=drive-ide2,media=cdrom,aio=io_uring' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/pve/vm-41010-disk-0,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=io_uring,detect-zeroes=unmap' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' -drive 
'file=/dev/disk/by-id/wwn-0x600508b1001cefb58834258d393a01d0,if=none,id=drive-scsi2,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi2,id=scsi2' -drive 'file=/dev/disk/by-id/wwn-0x600508b1001c4be13955d4caee7222ca,if=none,id=drive-scsi3,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=3,drive=drive-scsi3,id=scsi3' -drive 'file=/dev/disk/by-id/wwn-0x600508b1001c0f14f83d467a11c8ca45,if=none,id=drive-scsi4,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=4,drive=drive-scsi4,id=scsi4' -drive 'file=/dev/disk/by-id/wwn-0x600508b1001cefcb19caf75cbcbbfe9e,if=none,id=drive-scsi5,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=5,drive=drive-scsi5,id=scsi5' -drive 'file=/dev/disk/by-id/wwn-0x600508b1001c78bc89ad00e6fc0751a4,if=none,id=drive-scsi6,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=6,drive=drive-scsi6,id=scsi6' -drive 'file=/dev/disk/by-id/wwn-0x600508b1001c8412768344d96e0f8667,if=none,id=drive-scsi7,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=7,drive=drive-scsi7,id=scsi7' -netdev 'type=tap,id=net0,ifname=tap41010i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=CA:C0:35:7D:84:1A,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102' -machine 'type=q35+pve0'
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
Stepping: 7
CPU MHz: 3000.000
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4987.60
Virtualization: VT-x
L1d cache: 384 KiB
L1i cache: 384 KiB
L2 cache: 3 MiB
L3 cache: 30 MiB
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Feb 6 11:00:03 pve-node-24 QEMU[1596]: KVM: entry failed, hardware error 0x80000021
Feb 6 11:00:03 pve-node-24 QEMU[1596]: If you're running a guest on an Intel machine without unrestricted mode
Feb 6 11:00:03 pve-node-24 QEMU[1596]: support, the failure can be most likely due to the guest entering an invalid
Feb 6 11:00:03 pve-node-24 QEMU[1596]: state for Intel VT. For example, the guest maybe running in big real mode
Feb 6 11:00:03 pve-node-24 QEMU[1596]: which is not supported on less recent Intel processors.
Feb 6 11:00:03 pve-node-24 QEMU[1596]: EAX=00000001 EBX=00000000 ECX=00000000 EDX=00000001
Feb 6 11:00:03 pve-node-24 QEMU[1596]: ESI=83bb7f58 EDI=2f66ae00 EBP=83bb7f18 ESP=83bb7f10
Feb 6 11:00:03 pve-node-24 QEMU[1596]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Feb 6 11:00:03 pve-node-24 QEMU[1596]: ES =0000 00000000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: CS =9800 7ff98000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: SS =0000 00000000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: DS =0000 00000000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: FS =0000 00000000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: GS =0000 00000000 ffffffff 00809300
Feb 6 11:00:03 pve-node-24 QEMU[1596]: LDT=0000 00000000 000fffff 00000000
Feb 6 11:00:03 pve-node-24 QEMU[1596]: TR =0040 001ce000 0000206f 00008b00
Feb 6 11:00:03 pve-node-24 QEMU[1596]: GDT= 001cc000 0000007f
Feb 6 11:00:03 pve-node-24 QEMU[1596]: IDT= 00000000 00000000
Feb 6 11:00:03 pve-node-24 QEMU[1596]: CR0=00050032 CR2=562401e8 CR3=6cfe8000 CR4=00000000
Feb 6 11:00:03 pve-node-24 QEMU[1596]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Feb 6 11:00:03 pve-node-24 QEMU[1596]: DR6=00000000ffff0ff0 DR7=0000000000000400
Feb 6 11:00:03 pve-node-24 QEMU[1596]: EFER=0000000000000000
Feb 6 11:00:03 pve-node-24 QEMU[1596]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
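To compare the two register dumps in this thread without reading them by eye, the `NAME=hex` pairs can be parsed and intersected. The Python sketch below does that; the function names are made up, and it deliberately skips the segment lines (`ES =...`, `GDT= ...`) because they do not follow the plain `NAME=value` shape. Both dumps above report `SMM=1` and `EIP=00008000`, which is only an observation about the guest state at the time of the failure, not proof of the cause.

```python
import re

# Matches pairs such as EIP=00008000 or SMM=1; names are upper-case only.
FIELD = re.compile(r"\b([A-Z][A-Z0-9]*)=([0-9a-fA-F]+)\b")


def dump_fields(lines):
    """Collect NAME=value pairs from the QEMU register dump lines."""
    fields = {}
    for line in lines:
        # Drop the syslog prefix up to "QEMU[pid]: " when it is present.
        payload = line.partition("]: ")[2] or line
        for name, value in FIELD.findall(payload):
            fields.setdefault(name, value)
    return fields


def shared_fields(a, b):
    """Return the fields that have the same value in both dumps."""
    return {name: value for name, value in a.items() if b.get(name) == value}
```

Running `shared_fields(dump_fields(first), dump_fields(second))` on the two excerpts lists the registers that match across both nodes, while values such as EAX that differ drop out.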
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 1364.216
BogoMIPS: 4794.19
Virtualization: VT-x
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 4 MiB
L3 cache: 40 MiB
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d
Xeon(R) CPU E5-2640
Xeon(R) CPU E5-2630
but do you get the same error messages after disabling smm for that VM?
After following that comment, there is no graphical output, as shown below, and the VM does not start normally.
could you attempt the same fix there too and let us know if that works?
And today, I had the same problem on another node.
That node had been working for several months, and this is the first time I've had this problem there.
The node where it first happened has had periodic problems ever since installation.