Containers working but VMs not after upgrade to 5.0

mihalski

New Member
May 22, 2017
I only have one VM, which contains all my other containers via Docker, so this is quite a blow.
If I run qm status 102 I get:

status: io-error

Not sure where to go from there in terms of debugging.
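
For what it's worth, here's what I was planning to poke at next; untested, and I'm not sure these are even the right tools:

Code:
# more runtime detail than plain "qm status"
qm status 102 --verbose
# host kernel messages from around the time it fails
dmesg | tail -n 50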

Config is:

qm config 102
balloon: 512
bootdisk: scsi0
cores: 2
description: Debian Stretch Linux Docker server running%3A%0A- portainer%0A- influxdb%0A- grafana%0A- graphite%0A- sabnzbd%0A- transmission%0A- nzbget%0A- sonarr%0A- radarr%0A- ombi%0A- hassio%0AFiles stored in /docker/<containername>
ide2: none,media=cdrom
memory: 5120
name: dock
net0: virtio=5e:9b:02:44:e0:75,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
parent: WIP
scsi0: local-lvm:vm-102-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=3eb5e922-b8fa-4834-bcfa-10a12f28395f
sockets: 1
usb0: host=05ac:8218
usb1: host=0658:0200

And since it's an io-error, maybe this is worth something:
lvm> lvscan
  ACTIVE            '/dev/pve/swap' [7.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [23.00 GiB] inherit
  ACTIVE            '/dev/pve/data' [51.41 GiB] inherit
  ACTIVE            '/dev/pve/vm-103-disk-1' [2.00 GiB] inherit
  ACTIVE            '/dev/pve/vm-101-disk-2' [10.00 GiB] inherit
  ACTIVE            '/dev/pve/vm-102-disk-1' [32.00 GiB] inherit
  ACTIVE            '/dev/pve/vm-102-state-WIP' [10.49 GiB] inherit
  inactive          '/dev/pve/snap_vm-102-disk-1_WIP' [32.00 GiB] inherit

When I start it, it boots up and runs but very quickly goes to io-error:

root@pve:~# qm status 102
status: running
root@pve:~# qm status 102
status: io-error

Is there any way to trace exactly what's happening to help figure out what's causing it to fail?
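
One idea (untested): ask QEMU itself what state the disk is in via the monitor, and watch the host kernel log; the grep pattern below is just a guess at what thin-pool errors might look like:

Code:
# host side: device-mapper/thin-pool errors usually land in the kernel log
dmesg | grep -i -e 'thin' -e 'dm-'
# guest disk state as QEMU sees it
qm monitor 102
qm> info block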

EDIT: Worst case, is there any way to recover any data from the volume assigned to the VM? Since it's all Docker containers, it's pretty easy to reinstall those, but the /docker directory on there represents hundreds of hours of work that I hadn't backed up. The first thing I'll do if I manage to get it working is set up a backup solution.
 
So I'm thinking... maybe, possibly, I've run out of space in the VM storage? How could I check that, or fix it? I don't see how I could have used that much space, but it's not impossible.
 
Code:
lvs
  LV                     VG  Attr       LSize  Pool Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  data                   pve twi-aotzD- 51.41g                    100.00 52.93
  root                   pve -wi-ao---- 23.00g
  snap_vm-102-disk-1_WIP pve Vri---tz-k 32.00g data vm-102-disk-1
  swap                   pve -wi-ao----  7.00g
  vm-101-disk-2          pve Vwi-a-tz-- 10.00g data               13.64
  vm-102-disk-1          pve Vwi-aotz-- 32.00g data               89.27
  vm-102-state-WIP       pve Vwi-a-tz-- 10.49g data               47.39
  vm-103-disk-1          pve Vwi-aotz--  2.00g data               73.14
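
If I'm reading that right, the thin pool (data) is at 100.00 Data%, and the D in its attributes apparently means it has run out of data space, which would explain the io-error: writes to the thin volumes can't be allocated any more. From what I've read, if the volume group had free extents the pool could be grown roughly like this, though I suspect mine is fully allocated (vgs would show):

Code:
# check for free extents in the VG first (VFree column)
vgs pve
# grow the thin data pool, only possible if VFree > 0
lvextend -L +10G pve/data
# metadata (52.93%) can be grown the same way if it ever gets tight
lvextend --poolmetadatasize +1G pve/data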

EDIT: I JUST found the CODE tags
 
1. What can I do about that without wiping necessary data? (Backups/snapshots seem like they could safely be deleted?)
2. Why does the system allow total failure like this when you run out of space? Is there a way to prevent it? (See the sketch below.) On a non-virtual system things keep working to some extent, just with a bunch of errors.
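
Partially answering my own question 2 after some searching: LVM can apparently auto-extend thin pools before they fill up, provided the VG has free space and monitoring (dmeventd) is running. If I understand it correctly, it's these settings in the activation section of /etc/lvm/lvm.conf:

Code:
# when a thin pool passes 80% usage, grow it by 20% of its current size
thin_pool_autoextend_threshold = 80
thin_pool_autoextend_percent = 20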
 
I have a sinking feeling that the Proxmox web UI let me obliterate my system without so much as a prompt :/
Unless the high CPU usage eventually leads to a working system... I deleted the snapshot, leaving only NOW for the Docker VM.
I can't even get a directory listing in the Docker VM now.
 
So is all hope lost now?
Code:
root@pve:~# qm stop 102
VM still running - terminating now with SIGTERM
VM still running - terminating now with SIGKILL
root@pve:~# qm start 102
start failed: org.freedesktop.systemd1.UnitExists: Unit 102.scope already exists.
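
The start failure at least looks like just a stale systemd scope left behind by the SIGKILLed QEMU process; since the error names the unit, I'm guessing it can be cleaned up like this:

Code:
# remove the leftover scope, then try again
systemctl stop 102.scope
qm start 102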

Code:
root@pve:~# lvs
  LV            VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- 51.41g             60.47  29.92
  root          pve -wi-ao---- 23.00g
  swap          pve -wi-ao----  7.00g
  vm-102-disk-1 pve Vwi-aotz-- 32.00g data        92.57
  vm-103-disk-1 pve Vwi-aotz--  2.00g data        73.14
 
I've suffered through a lot, but I've managed to recover my data and back it up. Now I want to clear out the space used by the old VM and can't:
Code:
root@pve:~# lvremove /dev/pve/vm-102-state-WIP
Do you really want to remove and DISCARD logical volume pve/vm-102-state-WIP? [y/n]: y      
  Thin pool pve-data-tpool (253:4) transaction_id is 30, while expected 28.
  Failed to suspend pve/data with queued messages.
  Failed to update pool pve/data.
root@pve:~# lvremove /dev/pve/vm-102-disk-1
  Logical volume pve/vm-102-disk-1 in use.
Any clues? I'd just like to start over, but this isn't giving me enough free space.
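
The only explanation I've found for the transaction_id mismatch is that the thin pool metadata LVM has on disk is out of sync with what the kernel reports, and that it can supposedly be fixed by dumping the VG metadata, editing the transaction_id by hand, and restoring it. This looks risky and I haven't dared try it yet, but here's the sketch I've seen suggested:

Code:
# dump the current VG metadata to a file
vgcfgbackup -f /tmp/pve.vg pve
# edit /tmp/pve.vg: change transaction_id = 28 to transaction_id = 30,
# so it matches what the kernel reported above
# restore it (--force is required for VGs containing thin pools)
vgcfgrestore -f /tmp/pve.vg pve --force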

EDIT: I just don't know how to deal with this:
Code:
root@pve:~# lvscan
  ACTIVE            '/dev/pve/swap' [7.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [23.00 GiB] inherit
  ACTIVE            '/dev/pve/data' [51.41 GiB] inherit
  ACTIVE            '/dev/pve/vm-103-disk-1' [2.00 GiB] inherit
  ACTIVE            '/dev/pve/vm-102-disk-1' [32.00 GiB] inherit
  inactive          '/dev/pve/vm-102-state-WIP' [10.49 GiB] inherit
root@pve:~# lvchange -a n /dev/pve/vm-102-disk-1
  Logical volume pve/vm-102-disk-1 in use.
root@pve:~# lvscan
  ACTIVE            '/dev/pve/swap' [7.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [23.00 GiB] inherit
  ACTIVE            '/dev/pve/data' [51.41 GiB] inherit
  ACTIVE            '/dev/pve/vm-103-disk-1' [2.00 GiB] inherit
  ACTIVE            '/dev/pve/vm-102-disk-1' [32.00 GiB] inherit
  inactive          '/dev/pve/vm-102-state-WIP' [10.49 GiB] inherit
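
What I still can't figure out is what's holding vm-102-disk-1 open. For anyone landing here later, this is what I intend to try next to find the holder; no idea yet if it's the right approach:

Code:
# open count according to device-mapper (note the doubled dashes)
dmsetup info pve-vm--102--disk--1
# any process that has the block device open
fuser -v /dev/pve/vm-102-disk-1
# kernel-level holders, e.g. another dm device stacked on top
ls -l /sys/class/block/dm-*/holders/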
 
I ran into a similar problem, and it turned out to be krbd.

Virtual disks that were created prior to the upgrade don't seem to want to work with krbd enabled on the pool. Containers, of course, will not run without it enabled.

Solution: create separate pools for qm and lxc; alternatively, back up the qm vdisks with krbd disabled and restore them with krbd enabled (sketch below).
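
If you go the backup/restore route, the round trip can presumably be done with vzdump/qmrestore, roughly like this; the storage name and dump filename are placeholders, so check man vzdump for your setup:

Code:
# with krbd still disabled on the pool, dump the VM
vzdump 102 --mode stop --storage local
# enable krbd on the pool (Datacenter -> Storage), then restore over the old VM
qmrestore /var/lib/vz/dump/vzdump-qemu-102-<timestamp>.vma.lzo 102 --force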
 
