unable to open file '/etc/pve/nodes/ax41/qemu-server/ - Input/output error

barnybla

New Member
I have 2 VMs with the same issue. I looked in that directory, but there are no temp files:
Code:
root@ax41:/etc/pve/nodes/ax41/qemu-server# ls -l
total 3
-rw-r----- 1 root www-data 492 May 25 21:00 100.conf
-rw-r----- 1 root www-data 449 May 25 21:00 101.conf
-rw-r----- 1 root www-data 454 May 25 21:01 102.conf
-rw-r----- 1 root www-data 416 May 25 21:04 103.conf
-rw-r----- 1 root www-data 487 May 25 21:04 104.conf

update VM 102: -agent 1
unable to open file '/etc/pve/nodes/ax41/qemu-server/102.conf.tmp.365574' - Input/output error

root@ax41:/etc/pve/nodes/ax41/qemu-server# qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 xxxx                 running       2048           32.00 1813
       101 yyyyy                running       2048           32.00 1896
       102 zzzzzzzz             running       4096           40.00 150594
       103 nnnn                 running       4096           80.00 6684
       104 mmmmmm               stopped       8192          150.00 0
      
root@ax41:/etc/pve/nodes/ax41/qemu-server# qm unlock 104
unable to open file '/etc/pve/nodes/ax41/qemu-server/104.conf.tmp.372020' - Input/output error

What can I do to fix the problems? VM 102 is at least running, 104 is stopped, and I can't reboot or shut down or do anything else.

Is there a command that cleans up the issue so I can go on working?

Thanks for help

Bernd
 
root@ax41:/etc/pve/nodes/ax41/qemu-server# qm unlock 104
unable to open file '/etc/pve/nodes/ax41/qemu-server/104.conf.tmp.372020' - Input/output error
Try to run that command from another place, i.e. not while sitting inside that FUSE-mounted database which /etc/pve is. ;-)

Read: "cd ~" and then run "qm unlock..."
 
Thanks for your answers.

I think the storage should be fine; it is a new server with ZFS storage and there is enough space:
Code:
root@ax41:~# df -h
Filesystem        Size  Used Avail Use% Mounted on
udev               32G     0   32G   0% /dev
tmpfs             6.3G  1.3M  6.3G   1% /run
rpool/ROOT/pve-1  343G  2.6G  341G   1% /
tmpfs              32G   46M   32G   1% /dev/shm
tmpfs             5.0M     0  5.0M   0% /run/lock
vms               1.3T  128K  1.3T   1% /vms
rpool             341G  128K  341G   1% /rpool
rpool/var-lib-vz  446G  106G  341G  24% /var/lib/vz
rpool/ROOT        341G  128K  341G   1% /rpool/ROOT
rpool/data        341G  128K  341G   1% /rpool/data
/dev/fuse         128M   32K  128M   1% /etc/pve
tmpfs             6.3G     0  6.3G   0% /run/user/0

I ran it from /root; it is the same:

Code:
root@ax41:~# qm unlock 102
unable to open file '/etc/pve/nodes/ax41/qemu-server/102.conf.tmp.398970' - Input/output error
root@ax41:~# qm unlock 104
unable to open file '/etc/pve/nodes/ax41/qemu-server/104.conf.tmp.399122' - Input/output error
root@ax41:~# df -h

What can I do now?

Bernd
 
No, that is not a cluster; it is a root server on the internet.

Code:
root@ax41:~# pvecm status
Error: Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?

That is the output
 
No, that is not a cluster; it is a root server on the internet.
Okay. Did you perform a successful installation from the official proxmox-ve_8.2-1.iso?

Even without a cluster I would expect several daemons to run, but I am unsure as I do not have a standalone machine:
  • Any services with errors? systemctl status pve*.service (No need to post all, only those with errors.)
  • What filesystems are mounted? mount
  • Disk full? df -h
Try to touch a dummy file, stepping down the hierarchy (a combined sketch follows this list):
  • touch /etc/dummy
  • touch /etc/pve/dummy
  • touch /etc/pve/local/dummy
  • touch /etc/pve/local/qemu-server/dummy
(And delete that file afterwards with "rm ...")
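A minimal shell sketch of the same stepping test (it just touches and removes a dummy file at each level and reports any level that fails; /etc/pve/local is the per-node symlink, so the last two paths end up in your /etc/pve/nodes/ax41/... directory):
Code:
for d in /etc /etc/pve /etc/pve/local /etc/pve/local/qemu-server; do
    touch "$d/dummy" && rm "$d/dummy" && echo "OK:     $d" || echo "FAILED: $d"
done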

"Input/output error" often means Hardware. What storage devices are present? lsblk -f
 
Yes, I did perform a successful installation from the official proxmox-ve_8.2-1.iso

I have attached the output of all your commands in the PDF file. You can see that the dummy stepping works only in /etc; all the other levels fail with "Input/output error". What does that mean?

Thanks Bernd
 

Attachments

  • proxmox.pdf
I have attached the output of all your commands in the PDF file.
Posting a .pdf is fine for me. But you should have used a monospaced font, for better readability ;-)

I don't see any problems listed. (Except the /etc/pve problem of course.)

As I had already said, /etc/pve is listed (correctly) as a FUSE mountpoint. It should be writable. (In a cluster, as long as quorum is reached.) You have a standalone setup, which is a totally valid use case.

Unfortunately I have no idea how to repair this situation. Hopefully some other forum user will chime in here, sorry.
 
You are right, one NVMe (nvme2n1p1) is degraded, so I need Hetzner support.

Code:
root@ax41:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:12 with 0 errors on Sun May 12 00:24:13 2024
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                ONLINE       0     0     0
          mirror-0                                           ONLINE       0     0     0
            nvme-eui.00000000000000018ce38e01000937c2-part3  ONLINE       0     0     0
            nvme-eui.00000000000000018ce38e01000937bf-part3  ONLINE       0     0     0

errors: No known data errors

  pool: vms
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 00:00:47 with 0 errors on Sun May 12 00:24:49 2024
config:

        NAME                     STATE     READ WRITE CKSUM
        vms                      DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            nvme1n1              ONLINE       0     0     0
            5352710524674690282  UNAVAIL      0     0     0  was /dev/nvme2n1p1

errors: No known data errors

Thanks for your help.

Bernd
 
After the nvme2n1 was changed I couldn't reach the server. I started the rescue system and tried to do a "zpool replace", but I don't know the exact command for the replacement:

Code:
root@rescue ~ # nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme3n1          /dev/ng3n1            S677NF0RB00376       SAMSUNG MZVL22T0HBLB-00B00               1           1.62  TB /   2.05  TB    512   B +  0 B   GXB7801Q
/dev/nvme2n1          /dev/ng2n1            S677NN0WA01564       SAMSUNG MZVL22T0HBLB-00B00               1           1.65  TB /   2.05  TB    512   B +  0 B   GXB7801Q
/dev/nvme1n1          /dev/ng1n1            604Y10CKYZ5L         KXG60ZNV512G TOSHIBA                     1         512.11  GB / 512.11  GB    512   B +  0 B   AGGA4104
/dev/nvme0n1          /dev/ng0n1            S675NX0T571239       SAMSUNG MZVL2512HCJQ-00B00               1           0.00   B / 512.11  GB    512   B +  0 B   GXA7801Q

root@rescue ~ # zpool status -v rpool
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 00:00:12 with 0 errors on Sun May 12 00:24:13 2024
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                DEGRADED     0     0     0
          mirror-0                                           DEGRADED     0     0     0
            nvme-eui.00000000000000018ce38e01000937c2-part3  ONLINE       0     0     0
            2547238053384881551                              UNAVAIL      0     0     0  was /dev/disk/by-id/nvme-eui.00000000000000018ce38e01000937bf-part3

errors: No known data errors
root@rescue ~ #  zpool replace rpool /dev/nvme0n1
cannot replace /dev/nvme0n1 with /dev/nvme0n1: no such device in pool
root@rescue ~ #  zpool replace rpool nvme0n1
cannot replace nvme0n1 with nvme0n1: no such device in pool
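(Note for reference: zpool replace needs the old pool member as well as the new disk when the failed device is no longer present, i.e. zpool replace <pool> <old-device-or-guid> <new-device>. A hedged sketch only, using the numeric GUID shown as UNAVAIL above; <new-device> is a placeholder, and for the bootable rpool it should be the matching partition on the new disk, see the wiki procedure referenced in a later reply.)
Code:
# general form: zpool replace <pool> <old-device-or-guid> <new-device>
# 2547238053384881551 is the UNAVAIL GUID from 'zpool status -v rpool' above;
# <new-device> is a placeholder - verify the target disk with 'nvme list' first
zpool replace rpool 2547238053384881551 <new-device>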

Can you please help me?

Bernd
 
After replacing the NVMe and bringing the rpool back online, the one VM which was locked during the backup is still locked. If I try to unlock the VM, I get:
Code:
root@ax41:~# qm start 104
VM is locked (backup)

root@ax41:~# qm unlock 104
unable to open file '/etc/pve/nodes/ax41/qemu-server/104.conf.tmp.262716' - Input/output error

All other VMs are running perfectly. What can I do to unlock the VM, since that file doesn't exist?

Regards
Bernd
 
If you have a valid previous backup of the problem VM, I would suggest deleting the bad one and restoring from that backup.

Otherwise you might try to mount the vdisk on the host and copy any critical files out, then re-create the VM (a rough sketch follows below).

You could try touching the missing file, but IDK if a 0-byte file vs a file with valid content matters
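For the mount-the-vdisk route, a rough sketch, assuming the disk is a zvol on the vms pool (the name vm-104-disk-0 below is only a guess, check it with zfs list) and that the guest has a mountable Linux partition:
Code:
zfs list -t volume                                   # find the exact zvol, e.g. vms/vm-104-disk-0
ls /dev/zvol/vms/                                    # partitions appear as vm-104-disk-0-part1 etc.
mount -o ro /dev/zvol/vms/vm-104-disk-0-part1 /mnt   # read-only mount of the guest partition
# ... copy the critical files out, then:
umount /mnt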
 
I have no backup, but it is a VM imported from ESXi, so I can import it once more.
Thanks

Bernd
 
Just an observation:
Initially you showed that (only) your vms pool was degraded, but later your rpool is degraded (too?)! Did the support exchange the wrong disk? I would check the status of your vms pool...

Also, when replacing a disk in the rpool, more action is required, since the system boots from it:
https://pve.proxmox.com/wiki/ZFS_on_Linux#_zfs_administration -> "Changing a failed bootable device"
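For reference, the linked section describes roughly this sequence for a failed bootable device (all device names are placeholders; the partition numbers must match your actual layout, on a default PVE install part2 is the ESP and part3 the ZFS partition):
Code:
# copy the partition table from the healthy disk and generate new GUIDs
sgdisk /dev/disk/by-id/<healthy-disk> -R /dev/disk/by-id/<new-disk>
sgdisk -G /dev/disk/by-id/<new-disk>
# replace the failed ZFS partition (the old member can also be given as its numeric GUID)
zpool replace -f rpool <old-zfs-partition> /dev/disk/by-id/<new-disk>-part3
# make the new disk bootable again
proxmox-boot-tool format /dev/disk/by-id/<new-disk>-part2
proxmox-boot-tool init /dev/disk/by-id/<new-disk>-part2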
 
There was a lot of trouble with the server: first with one NVMe from the vms pool, then with the rpool, and then with the mainboard. So they changed the mainboard and both NVMes. After that the server was running without problems.
But every time the backup ran, the 2 VMs ended up locked after the backup. The backup space was too small, so I created a new backup space in the vms pool. There is enough space now. This evening the first backup is running, so I can see whether that was the reason.
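(If it helps, a quick way to keep an eye on the free space of the new backup storage, assuming it is added as a PVE storage and lives on the vms pool:)
Code:
pvesm status                          # free space per configured storage, as PVE sees it
zfs list -o name,used,avail -r vms    # free space on the vms pool and its datasets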

Bernd
 
I forgot to mention: the pools are healthy:

Code:
root@ax41:/etc/pve/nodes/ax41/qemu-server# zpool status -v
  pool: rpool
 state: ONLINE
  scan: resilvered 23.5M in 00:00:00 with 0 errors on Sat Jun  1 15:17:41 2024
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                ONLINE       0     0     0
          mirror-0                                           ONLINE       0     0     0
            nvme-eui.00000000000000018ce38e01000937c2-part3  ONLINE       0     0     0
            nvme1n1p3                                        ONLINE       0     0     0

errors: No known data errors

  pool: vms
 state: ONLINE
  scan: resilvered 519G in 13:28:24 with 0 errors on Mon May 27 10:08:04 2024
config:

        NAME                                                STATE     READ WRITE CKSUM
        vms                                                 ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            nvme-SAMSUNG_MZVL22T0HBLB-00B00_S677NN0WA01564  ONLINE       0     0     0
            nvme-SAMSUNG_MZVL22T0HBLB-00B00_S677NF0RB00376  ONLINE       0     0     0

errors: No known data errors
 
