VM keeps shutting down

damjank

Member
Apr 2, 2020
Hello,

I have PVE 7.1-10 with several VMs on it, but one in particular keeps shutting down. It is an Ubuntu 20.04.3 server used for Docker. It was cloned from the same template as the other VMs there. The only difference for this machine is that it is backed up every 2 hours. The VM works perfectly with no issues reported in the logs, but every now and then, every few days, it is shut down, with no error reported in either PVE or the VM itself.
Can anyone offer any insight into this behaviour? I have run several PVE hosts for a long time and this is the first time I have observed this "issue".
Thanks in advance for any info or insight towards resolving this.

D
 
Did you search the syslog for OOM messages? Most of the time VMs will stop working because the OOM killer kills them because your host is running out of RAM.
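A minimal sketch of such a search, assuming the host logs are readable through `journalctl` or a plain `/var/log/syslog` file (both depend on the setup). The helper name `find_oom_lines` is made up for illustration; the patterns match the usual kernel OOM-killer messages.

```shell
# Print log lines that look like OOM-killer activity.
# Reads log text on stdin, so it works with journalctl output or a saved syslog file.
find_oom_lines() {
    grep -i -E 'out of memory|oom-killer|oom_reaper|killed process'
}

# Possible use on the host (adjust to where your logs actually live):
#   journalctl -k | find_oom_lines
#   find_oom_lines < /var/log/syslog
```

No output means none of these patterns appeared in the log that was searched.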
 
The output of requested commands:
Bash:
root@pve-hq:~# lvs
  LV            VG   Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root          pve  -wi-ao---- <110.74g                                                   
  swap          pve  -wi-ao----    8.00g                                                   
  vm-303-disk-0 vol0 -wi-ao----    4.00m                                                   
  vm-303-disk-1 vol0 -wi-ao----   60.00g                                                   
  vm-303-disk-2 vol0 -wi-ao----    4.00m                                                   
  vm-311-disk-0 vol0 -wi-ao----    4.00m                                                   
  vm-311-disk-1 vol0 -wi-ao----   60.00g                                                   
  vm-311-disk-2 vol0 -wi-ao----    4.00m                                                   
  vm-999-disk-0 vol0 -wi-a-----   60.00g                                                   
  vm-999-disk-1 vol0 -wi-a-----    4.00m                                                   
  vm-999-disk-2 vol0 -wi-a-----    4.00m                                                   
root@pve-hq:~# vgs
  VG   #PV #LV #SN Attr   VSize    VFree
  pve    1   2   0 wz--n- <118.74g     0
  vol0   1   9   0 wz--n-   <2.00t  1.82t
  vol1   1   0   0 wz--n-   <4.00t <4.00t

About the RAM: the server has 256GB and is currently using 50GB, so I would not expect OOM messages (I checked anyway, there are none).
 
Well, nothing extreme there, though I found this. I copied a few lines before and a few after, in case it makes sense to anyone:
Bash:
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'vcpus' - not set in current configuration!
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'cpulimit' - not set in current configuration!
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'cpuunits' - not set in current configuration!
Jan 25 19:31:25 pve-hq kernel: [288983.394055] usb 2-14: USB disconnect, device number 33
Jan 25 19:31:29 pve-hq kernel: [288987.277225] usb 2-14: new low-speed USB device number 34 using xhci_hcd
Jan 25 19:31:29 pve-hq kernel: [288987.432293] usb 2-14: New USB device found, idVendor=06da, idProduct=ffff, bcdDevice= 0.03
Jan 25 19:31:29 pve-hq kernel: [288987.432296] usb 2-14: New USB device strings: Mfr=1, Product=2, SerialNumber=4
Jan 25 19:31:29 pve-hq kernel: [288987.432298] usb 2-14: Product: Offline UPS
Jan 25 19:31:29 pve-hq kernel: [288987.432299] usb 2-14: Manufacturer: PPC
Jan 25 19:31:29 pve-hq kernel: [288987.432300] usb 2-14: SerialNumber: 000000000
Jan 25 19:31:29 pve-hq kernel: [288987.453365] hid-generic 0003:06DA:FFFF.3E2B: hiddev1,hidraw4: USB HID v1.00 Device [PPC Offline UPS] on usb-0000:00:14.0-14/input0
Jan 25 19:31:43 pve-hq kernel: [289001.569092] usb 2-14: USB disconnect, device number 34
Jan 25 19:31:47 pve-hq kernel: [289005.457011] usb 2-14: new low-speed USB device number 35 using xhci_hcd
Jan 25 19:31:47 pve-hq kernel: [289005.615887] usb 2-14: New USB device found, idVendor=06da, idProduct=ffff, bcdDevice= 0.03
Jan 25 19:31:47 pve-hq kernel: [289005.615896] usb 2-14: New USB device strings: Mfr=1, Product=2, SerialNumber=4
Jan 25 19:31:47 pve-hq kernel: [289005.615900] usb 2-14: Product: Offline UPS
Jan 25 19:31:47 pve-hq kernel: [289005.615903] usb 2-14: Manufacturer: PPC
Jan 25 19:31:47 pve-hq kernel: [289005.615906] usb 2-14: SerialNumber: 000000000
Jan 25 19:31:47 pve-hq kernel: [289005.637048] hid-generic 0003:06DA:FFFF.3E2C: hiddev1,hidraw4: USB HID v1.00 Device [PPC Offline UPS] on usb-0000:00:14.0-14/input0
Jan 25 19:31:47 pve-hq pvedaemon[3562490]: shutdown VM 201: UPID:pve-hq:00365BFA:01B8FE2A:61F04213:qmshutdown:201:root@pam:
Jan 25 19:31:47 pve-hq pvedaemon[3964402]: <root@pam> starting task UPID:pve-hq:00365BFA:01B8FE2A:61F04213:qmshutdown:201:root@pam:
Jan 25 19:32:01 pve-hq kernel: [289019.743094] usb 2-14: USB disconnect, device number 35
Jan 25 19:32:05 pve-hq kernel: [289023.628861] usb 2-14: new low-speed USB device number 36 using xhci_hcd
Jan 25 19:32:05 pve-hq kernel: [289023.783898] usb 2-14: New USB device found, idVendor=06da, idProduct=ffff, bcdDevice= 0.03
Jan 25 19:32:05 pve-hq kernel: [289023.783901] usb 2-14: New USB device strings: Mfr=1, Product=2, SerialNumber=4
Jan 25 19:32:05 pve-hq kernel: [289023.783903] usb 2-14: Product: Offline UPS
Jan 25 19:32:05 pve-hq kernel: [289023.783904] usb 2-14: Manufacturer: PPC
Jan 25 19:32:05 pve-hq kernel: [289023.783905] usb 2-14: SerialNumber: 000000000
Jan 25 19:32:05 pve-hq kernel: [289023.804997] hid-generic 0003:06DA:FFFF.3E2D: hiddev1,hidraw4: USB HID v1.00 Device [PPC Offline UPS] on usb-0000:00:14.0-14/input0
Jan 25 19:32:09 pve-hq pvedaemon[3964403]: VM 201 qmp command failed - VM 201 qmp command 'guest-ping' failed - got timeout
Jan 25 19:32:19 pve-hq kernel: [289037.917118] usb 2-14: USB disconnect, device number 36
Jan 25 19:32:23 pve-hq kernel: [289041.800638] usb 2-14: new low-speed USB device number 37 using xhci_hcd
Jan 25 19:32:23 pve-hq kernel: [289041.955627] usb 2-14: New USB device found, idVendor=06da, idProduct=ffff, bcdDevice= 0.03
Jan 25 19:32:23 pve-hq kernel: [289041.955630] usb 2-14: New USB device strings: Mfr=1, Product=2, SerialNumber=4
Jan 25 19:32:23 pve-hq kernel: [289041.955631] usb 2-14: Product: Offline UPS
Jan 25 19:32:23 pve-hq kernel: [289041.955633] usb 2-14: Manufacturer: PPC
Jan 25 19:32:23 pve-hq kernel: [289041.955633] usb 2-14: SerialNumber: 000000000
Jan 25 19:32:23 pve-hq kernel: [289041.977308] hid-generic 0003:06DA:FFFF.3E2E: hiddev1,hidraw4: USB HID v1.00 Device [PPC Offline UPS] on usb-0000:00:14.0-14/input0
Jan 25 19:32:28 pve-hq pvedaemon[3964402]: VM 201 qmp command failed - VM 201 qmp command 'guest-ping' failed - got timeout
Jan 25 19:32:37 pve-hq kernel: [289056.092356] usb 2-14: USB disconnect, device number 37
Jan 25 19:32:41 pve-hq kernel: [289059.976439] usb 2-14: new low-speed USB device number 38 using xhci_hcd
Jan 25 19:32:41 pve-hq kernel: [289060.131665] usb 2-14: New USB device found, idVendor=06da, idProduct=ffff, bcdDevice= 0.03
Jan 25 19:32:41 pve-hq kernel: [289060.131668] usb 2-14: New USB device strings: Mfr=1, Product=2, SerialNumber=4
Jan 25 19:32:41 pve-hq kernel: [289060.131670] usb 2-14: Product: Offline UPS
Jan 25 19:32:41 pve-hq kernel: [289060.131671] usb 2-14: Manufacturer: PPC
Jan 25 19:32:41 pve-hq kernel: [289060.131672] usb 2-14: SerialNumber: 000000000
Jan 25 19:32:41 pve-hq kernel: [289060.153149] hid-generic 0003:06DA:FFFF.3E2F: hiddev1,hidraw4: USB HID v1.00 Device [PPC Offline UPS] on usb-0000:00:14.0-14/input0
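The excerpt above contains a `qmshutdown` task for VM 201, identified by its UPID string. A small sketch for pulling that string apart might help when reading such lines. It assumes the UPID layout `UPID:node:pid:pstart:starttime:type:id:user:` with hexadecimal pid, pstart and starttime; the function name `parse_upid` is made up for illustration.

```shell
# Split a Proxmox task UPID into its fields.
# Assumed layout: UPID:node:pid:pstart:starttime:type:id:user:
# (pid, pstart and starttime are hexadecimal).
parse_upid() {
    # Split on ':'; the trailing colon leaves an empty last field in upid_rest.
    IFS=: read -r upid_tag upid_node upid_pid upid_pstart upid_start upid_type upid_id upid_user upid_rest <<EOF
$1
EOF
    # Convert the hex start time to a decimal Unix timestamp.
    printf 'node=%s type=%s vmid=%s user=%s start_epoch=%d\n' \
        "$upid_node" "$upid_type" "$upid_id" "$upid_user" "$((0x$upid_start))"
}

# Example with the UPID from the log above:
parse_upid 'UPID:pve-hq:00365BFA:01B8FE2A:61F04213:qmshutdown:201:root@pam:'
```

The `type` and `user` fields show which kind of task was started and under which account, which can narrow down where a shutdown request came from.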
 
Well, nothing extreme there, though I found this. I copied a few lines before and a few after, in case it makes sense to anyone:
Bash:
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'vcpus' - not set in current configuration!
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'cpulimit' - not set in current configuration!
Jan 25 19:31:16 pve-hq pvedaemon[3964404]: cannot delete 'cpuunits' - not set in current configuration!
...
Is this when you booted up your VM?

Can you post the VM config:
qm config Your_VM_ID
 
I think those lines are not related to the VM, since it was running at that time. Nevertheless, here is the output of the config:
Bash:
root@pve-hq:~# qm config 201
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 8
cpu: host,flags=+aes
efidisk0: pve-hq_prod_storage_0:vm-201-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: none,media=cdrom
machine: q35
memory: 16384
meta: creation-qemu=6.1.0,ctime=1641376448
name: pdocker-hq
net0: virtio=F6:9F:57:14:A9:A1,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
protection: 1
scsi0: pve-hq_prod_storage_0:vm-201-disk-1,iothread=1,size=60G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=d3778bb1-465c-4397-97c1-39b4a2d5f1f9
sockets: 1
tpmstate0: pve-hq_prod_storage_0:vm-201-disk-2,size=4M,version=v2.0
vmgenid: d08aa352-2a65-4afd-a792-838f88583e34
 
