PVE6 is munching my VM config files

micush

I have a 5 node cluster that has been in production for more than 5 years now.

Since PVE6, I'm starting to see things like this in my VM config files:

Code:
[^A^@^@^@^@^@^@&^@^@^@^@^@^@^@^@^@^@^@nodes/lphxpve16/qemu-server/1150.conf^@agent: 1
balloon: 0
boot: dcn
bootdisk: virtio0
cores: 15
ide2: none,media=cdrom
memory: 49152
name: lphxgns3
net0: virtio=A6:65:D5:DD:A5:FC,bridge=vmbr0,queues=8,tag=15
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=9a08bcf0-03f2-41db-8dfd-bad2d71dc9d0
sockets: 1

Besides the garbage at the top of the file, you will notice that there is no longer a disk definition in the file. The disk image is on disk, but PVE will not add it to the config, even if I add it manually by modifying the config file by hand.
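
For anyone who wants to check how many files are affected, here is a minimal sketch that flags config files containing control bytes like the `^A` and `^@` shown above. It assumes the guest configs live under `/etc/pve/nodes/<node>/qemu-server/`, as the path embedded in the garbage suggests. The function names `looks_corrupted` and `scan` are made up for illustration and are not part of any PVE tooling.

```python
from pathlib import Path

# Bytes that should not appear in a plain-text config file.
# Tab (9), newline (10) and carriage return (13) are allowed.
CONTROL_BYTES = bytes(b for b in range(32) if b not in (9, 10, 13))


def looks_corrupted(data: bytes) -> bool:
    """Return True if the data contains any unexpected control byte."""
    return any(b in CONTROL_BYTES for b in data)


def scan(root: str = "/etc/pve/nodes") -> list:
    """List VM config files under root that contain control bytes.

    The default root is an assumption based on the path seen in the
    corrupted header; adjust it if your layout differs.
    """
    suspects = []
    for conf in sorted(Path(root).glob("*/qemu-server/*.conf")):
        if looks_corrupted(conf.read_bytes()):
            suspects.append(conf)
    return suspects
```

Calling `scan()` on a node returns the paths of the suspicious files. It only reads the files, so it does not change anything in `/etc/pve`.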

This is causing me some real issues. Please help.

Code:
# pveversion --verbose
proxmox-ve: 6.0-2 (running kernel: 5.0.21-1-pve)
pve-manager: 6.0-6 (running version: 6.0-6/c71f879f)
pve-kernel-5.0: 6.0-7
pve-kernel-helper: 6.0-7
pve-kernel-4.15: 5.4-6
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
pve-kernel-4.15.18-18-pve: 4.15.18-44
ceph-fuse: 12.2.11+dfsg1-2.1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.11-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-7
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-64
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
openvswitch-switch: 2.10.0+2018.08.28+git.8ca7c82b7d+ds1-12
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-6
pve-container: 3.0-5
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-2
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve2
 
Since PVE6, I'm starting to see things like this in my VM config files:

Can you narrow down when this happens? You could, for example, create a new test VM/CT and try to add a few things (NIC, disk) or change the config in other ways (through the API or web interface).

Also, do you have any monitoring program or configuration management tooling outside of PVE that could itself interfere with the configuration?
 
Further, is this node in a cluster?
And are the disks healthy? Not that the disk I/O load of the not-so-small 5-to-6 upgrade pushed the disks over the brink... (just a guess, but it would explain things)
 
I have a similar issue. For me it is even breaking the corosync.conf file and filling its first line with garbage. Testing pve-cluster=6.0-7 now.
 
No joy, pve-cluster-6.0-7 keeps on breaking/garbling my corosync.conf file.
 
Apparently the corosync3 file system is not working as expected. I just had a look at /root/.ssh/authorized_keys (a symbolic link to /etc/pve/priv/authorized_keys) and found the same issue there:

Code:
^R^C^@^@^@^@^@^@^^^@^@^@^@^@^@^@^@^@^@^@priv/authorized_keys.tmp.1501^@
ssh-rsa AAAAB3N[...]
ssh-rsa AAAAB3N[...]
 
Hello,

I was about to open a ticket about this. Yesterday I mistakenly updated our 4 Proxmox 6 lab nodes from the pvetest repository. It trashed almost every .conf file, prefixing them with garbage like this:

Code:
root@pm04:/etc/pve/local/qemu-server# cat 107.conf
)nodes/pm04/qemu-server/107.conf.tmp.2196agent: 1
balloon: 0
bios: ovmf
bootdisk: virtio0
cores: 8

and cropping the end of the files.

Also, after changing a setting for one of the VMs, the conf file of that VM was suddenly 0 bytes.

I'll try cleaning the conf files and updating to pve-cluster-6.0-7.
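
A rough sketch of what "cleaning" could look like, assuming the raw garbage always ends with a NUL byte right before the real content (as in the first post's dump, where the embedded path is followed by `^@` and then `agent: 1`). The helper name `strip_garbage_header` is hypothetical. It cannot bring back lines that were cropped or dropped, such as a missing disk entry, so those still have to be restored by hand or from backup.

```python
def strip_garbage_header(data: bytes) -> bytes:
    """Drop everything up to and including the last NUL byte.

    Assumption: the injected header ends with a NUL byte and the
    legitimate config text contains no NUL bytes of its own.
    """
    cut = data.rfind(b"\x00")
    # rfind returns -1 when there is no NUL, so clean data is unchanged.
    return data[cut + 1:]


# Shortened stand-in for the corrupted header shown in this thread.
sample = (
    b"[\x01\x00\x00&\x00nodes/lphxpve16/qemu-server/1150.conf\x00"
    b"agent: 1\nballoon: 0\n"
)
print(strip_garbage_header(sample).decode())
```

It is probably safer to write the cleaned result to a separate file first and compare it with the original before copying anything back into /etc/pve.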
 
Okay, for now, after having cleaned all the conf files and recreated a conf for the 0-byte one, it seems OK with pve-cluster-6.0-7.
I made some changes to a few VMs and they seem to stay clean.
Knocking on wood.
 
No joy, pve-cluster-6.0-7 keeps on breaking/garbling my corosync.conf file.

Yes, it won't repair the existing state automatically, but it works now once you rewrite the files (new or from backup).
 
Okay, for now, after having cleaned all the conf files and recreated a conf for the 0-byte one, it seems OK with pve-cluster-6.0-7.
I made some changes to a few VMs and they seem to stay clean.
Knocking on wood.

Yes, it was a clear regression, which made it quite straightforward to fix. It only shows up in clustered pmxcfs, not standalone. That, and the fact that the update contained some other, relatively important fixes, made me miss it and move it a bit too fast from the internal to the pvetest repo. That's what one gets for doing a late Friday upload... Sorry for any inconvenience caused; as it was "only" on pvetest, it shouldn't be too much :)
 
