Cannot write inside /etc/pve

fcorbelli

I should mention that this is a freshly installed Proxmox system that I did not set up myself (it was provisioned from a template by the provider), so I do not know exactly how it was configured.

For some reason unknown to me, it is not possible to write inside /etc/pve or its subdirectories; I get errors like:

unable to create VM unable to open file /etc/pve/nodes/nsxxxxx/qemu-server/102.conf.tmp
Input/output error (500)


Quick-and-dirty check: /etc/pve seems to be a "special" folder.

The system should have two ZFS-mirrored drives plus one spare SSD, also on ZFS.

Code:
root@ns337400:/etc# sudo su
root@ns337400:/etc# echo prova >/etc/pve/test1.txt
bash: /etc/pve/test1.txt: Input/output error
root@ns337400:/etc#

Code:
root@ns337400:/etc# df -h /etc/pve
Filesystem      Size  Used Avail Use% Mounted on
/dev/fuse       128M   16K  128M   1% /etc/pve
root@ns337400:/etc# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  1.4M  3.2G   1% /run
zp0/zd1          20G  3.6G   17G  18% /
tmpfs            16G   46M   16G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
zp0/zd0         1.0G   92M  933M   9% /boot
tank            431G  128K  431G   1% /tank
zp0/zd2         7.2T  123G  7.1T   2% /var/lib/vz
/dev/fuse       128M   16K  128M   1% /etc/pve
tmpfs           3.2G     0  3.2G   0% /run/user/0



Code:
root@ns337400:/etc# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=16226532k,nr_inodes=4056633,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3252000k,mode=755,inode64)
zp0/zd1 on / type zfs (rw,xattr,posixacl)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=19184)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
/dev/sdc1 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
tank on /tank type zfs (rw,xattr,noacl)
zp0/zd0 on /boot type zfs (rw,xattr,posixacl)
zp0/zd2 on /var/lib/vz type zfs (rw,xattr,posixacl)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=3251996k,nr_inodes=812999,mode=700,inode64)

Code:
root@ns337400:/etc# zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
tank       620K   430G   104K  /tank
zp0        126G  7.02T    96K  none
zp0/zd0   91.7M   932M  91.7M  /boot
zp0/zd1   3.51G  16.5G  3.51G  /
zp0/zd2    122G  7.02T   122G  /var/lib/vz


Code:
root@ns337400:/etc# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 00:00:00 with 0 errors on Tue May 9 13:28:25 2023
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          sda       ONLINE       0     0     0

errors: No known data errors

  pool: zp0
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:23 with 0 errors on Tue May 9 13:28:22 2023
config:

        NAME          STATE     READ WRITE CKSUM
        zp0           ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            sdb2      ONLINE       0     0     0
            sdc2      ONLINE       0     0     0

errors: No known data errors
Any ideas? Thanks!
 
Hi,
/etc/pve is a FUSE-based mount backed by an SQLite database. It is provided by pmxcfs, as described in more detail here: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#chapter_pmxcfs
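As a side note, you can see the mount and the backing database file directly (just an illustrative check; the database path is the one documented in the admin guide linked above):
Bash:
# Illustrative only: show the FUSE mount and the SQLite file behind pmxcfs
findmnt /etc/pve
ls -l /var/lib/pve-cluster/config.db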

In order to identify the cause of your write issues, please post the output of:
Bash:
journalctl -b -u pve-cluster.service
systemctl status pve-cluster.service
pvecm status
 
Something seems really wrong

Code:
journalctl -b -u pve-cluster.service
-- Journal begins at Tue 2023-05-09 12:37:58 UTC, ends at Tue 2023-05-09 16:40:46 UTC. --
May 09 13:39:26 ns337400 systemd[1]: Starting The Proxmox VE cluster filesystem...
May 09 13:39:27 ns337400 systemd[1]: Started The Proxmox VE cluster filesystem.
May 09 14:36:18 ns337400 pmxcfs[2412]: [database] crit: commit transaction failed: disk I/O>
May 09 14:36:18 ns337400 pmxcfs[2412]: [database] crit: rollback transaction failed: cannot>

Code:
root@ns337400:~# systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabl>
     Active: active (running) since Tue 2023-05-09 13:39:27 UTC; 3h 2min ago
    Process: 2382 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
   Main PID: 2412 (pmxcfs)
      Tasks: 6 (limit: 38030)
     Memory: 57.0M
        CPU: 7.134s
     CGroup: /system.slice/pve-cluster.service
             └─2412 /usr/bin/pmxcfs

May 09 13:39:26 ns337400 systemd[1]: Starting The Proxmox VE cluster filesystem...
May 09 13:39:27 ns337400 systemd[1]: Started The Proxmox VE cluster filesystem.
May 09 14:36:18 ns337400 pmxcfs[2412]: [database] crit: commit transaction failed: disk I/O>
May 09 14:36:18 ns337400 pmxcfs[2412]: [database] crit: rollback transaction failed: cannot

Code:
root@ns337400:~# pvecm status
Error: Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?

It should (...should!) not be a hardware error; I can copy ~300 GB without problems.
 
If you have major problems with your Proxmox VE host, for example hardware issues, it could be helpful to copy the pmxcfs database file /var/lib/pve-cluster/config.db, and move it to a new Proxmox VE host. On the new host (with nothing running), you need to stop the pve-cluster service and replace the config.db file (required permissions 0600). Following this, adapt /etc/hostname and /etc/hosts according to the lost Proxmox VE host, then reboot and check (and don’t forget your VM/CT data).

Seems a bit... radical...
 
OK, so far I have done:
Code:
systemctl stop pve-cluster
cd /var/lib/pve-cluster
mv config.db config.kaputt
sftp (another config.db from another proxmox server)...
shutdown -r now

But now, of course, I no longer see my local storage; instead I get the other server's storage definitions and its "dangling" VMs.
 
This is intended for disaster recovery, in your case there might still be a chance to recover the database.
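One way to see whether the database file itself is still consistent would be an SQLite integrity check (a sketch only; the sqlite3 command-line tool is not installed by default and would need to be installed first):
Code:
# Sketch: check the pmxcfs database for corruption while the service is stopped
# (requires the sqlite3 CLI, e.g. from the sqlite3 package)
systemctl stop pve-cluster
sqlite3 /var/lib/pve-cluster/config.db 'PRAGMA integrity_check;'
systemctl start pve-cluster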
Hmm, you were too quick; I would have suggested simply trying to restart the service with the existing DB... Can you swap the databases back and see if the transaction errors persist?
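Swapping back would look roughly like this (a sketch that mirrors your own steps above; config.kaputt is the name you used):
Code:
# Sketch: put the original database back in place and restart pmxcfs
systemctl stop pve-cluster
cd /var/lib/pve-cluster
mv config.db config.db.other-server   # set aside the database copied from the other host
mv config.kaputt config.db            # restore the original one
systemctl start pve-cluster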
 
root@ns337400:~# pvecm status
Error: Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?
BTW, this is fine if this is a standalone node.
 
Copied the "kaputt" database back and restarted; it seems to work.
Weird

Thank you very much
Just one more question: is it therefore advisable to back up config.db (via a ZFS snapshot, for example) as a disaster-recovery measure?
 
Well, the transaction was not persisted because of the I/O error and the rollback failed as well, so restarting the service presumably got rid of the stuck transaction. The question remains why the I/O error arose to begin with; it can have multiple causes: https://www.sqlite.org/rescode.html#ioerr
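A few places one could start looking for the underlying cause (a sketch; the pool and device names are the ones from your zpool status output, and smartctl requires the smartmontools package):
Code:
# Illustrative starting points for tracking down the disk I/O error
dmesg -T | grep -i error    # kernel-level disk/controller errors
zpool status -v zp0         # ZFS read/write/checksum error counters
smartctl -a /dev/sdb        # drive health (needs the smartmontools package)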

A backup of the config.db is definitely recommended for quick disaster recovery.
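As a rough sketch of what that could look like, assuming the layout shown earlier where /var/lib/pve-cluster lives on the root dataset zp0/zd1 (the snapshot and file names are just examples):
Code:
# Option 1: ZFS snapshot of the dataset that holds /var/lib/pve-cluster
zfs snapshot zp0/zd1@pmxcfs-backup-$(date +%F)

# Option 2: copy the database file itself while pmxcfs is not running
systemctl stop pve-cluster
cp -a /var/lib/pve-cluster/config.db /root/config.db.backup
systemctl start pve-cluster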
 
