Hi, folx,
We have some strange issues with a standalone Proxmox server.
- VM 1 is running without problems
- VM 2 is "running", but not reachable via RDP
- The Proxmox web GUI is not usable either: logging in with valid credentials fails with a "wrong password" error, although the same credentials work via SSH.
The error logs say:
"unable to write lrm status file - unable to open file '/etc/pve/nodes/pve/lrm_status.tmp.1774' - Input/output error"
- so basically the cause is this error (or these errors):
unable to write lrm status file
unable to open file '/etc/pve/nodes/pve/ Input/output error
We can't even do a
touch /etc/pve/testfile
- same error (Input/output error). We had the same problem a while ago when we overprovisioned a VM, i.e. we assigned 2 TB to a virtual disk, but there was only 1.8 TB left in the ZFS pool. Interesting experience, by the way: the whole Proxmox host stopped with all VMs, and we could not use the server for a couple of days because we had to restore the backups on a second machine and then reinstall the whole system from scratch :\ The fact that it is possible to overprovision a VM is very poor IMHO; that was never possible with VMware, Xen or Hyper-V! But that's another discussion ...
Now, back to our problem. We can restart the whole Proxmox host; then, for a short time, we are able to reach the GUI and the second VM. After some hours to half a day, the same problem occurs again.
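To make the arithmetic of that earlier incident concrete, here is a minimal sketch of the check we would have liked to run before creating the disk. The helper name check_fit is hypothetical (it is not part of Proxmox or ZFS), and the byte values assume decimal units for the 2 TB and 1.8 TB figures above.

```shell
#!/bin/sh
# Hypothetical helper (not a Proxmox or ZFS tool): compare a planned virtual
# disk size with the space reported as available, both given in bytes.
check_fit() {
    requested=$1
    available=$2
    if [ "$requested" -le "$available" ]; then
        echo "fits"
    else
        echo "overcommitted by $((requested - available)) bytes"
    fi
}

# The case from the earlier incident: 2 TB requested, 1.8 TB available
# (decimal units assumed).
check_fit 2000000000000 1800000000000

# On a real host the second argument could be read from ZFS, for example:
#   check_fit 2000000000000 "$(zfs get -Hp -o value available rpool)"
```

With the numbers above, the helper reports an overcommit of 200000000000 bytes, i.e. roughly 0.2 TB more than the pool could hold.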
# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     3.73T   325G   166K  /rpool
rpool/ROOT                 868G   325G   153K  /rpool/ROOT
rpool/ROOT/pve-1           868G   325G   868G  /
rpool/data                2.88T   325G   153K  /rpool/data
rpool/data/vm-100-disk-0   310G   325G   310G  -
rpool/data/vm-100-disk-1   985G   325G   985G  -
rpool/data/vm-101-disk-0   303G   325G   303G  -
rpool/data/vm-101-disk-1  1.32T   325G  1.32T  -
rpool/var-lib-vz           204K   325G   204K  /var/lib/vz
# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
rpool  5.22T  4.67T   567G        -         -   19%  89%  1.00x  ONLINE  -
# zpool status -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:42:56 with 0 errors on Sun Dec 8 01:06:57 2024
config:
        NAME                                                     STATE   READ WRITE CKSUM
        rpool                                                    ONLINE     0     0     0
          raidz1-0                                               ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717623-part3  ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717626-part3  ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717625-part3  ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717622-part3  ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717631-part3  ONLINE     0     0     0
            ata-SAMSUNG_MZ7L3960HCJR-00A07_S662NN0W717630-part3  ONLINE     0     0     0
errors: No known data errors
When we run "df -h" or "df -i", there seems to be enough space, I guess. When changing to /etc/pve, I am in the FUSE environment as usual, but I cannot do anything there:
root@pve:/etc# df -h .
Filesystem Size Used Avail Use% Mounted on
rpool/ROOT/pve-1 1.2T 869G 325G 73% /
root@pve:/etc# cd /etc/pve
root@pve:/etc/pve# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/fuse 128M 16K 128M 1% /etc/pve
root@pve:/etc/pve# touch small-file
touch: cannot touch 'small-file': Input/output error
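Since /etc/pve is the FUSE mount provided by the pve-cluster service (pmxcfs), a minimal sketch of the read-only checks we could collect next might look like this. It assumes the default location of the pmxcfs database (/var/lib/pve-cluster/config.db) and that this path lives on rpool/ROOT/pve-1; the function names run_if_present and pmxcfs_checks are hypothetical.

```shell
#!/bin/sh
# Read-only checks; nothing here modifies /etc/pve or the pool.

# Hypothetical helper: run a command only if its binary exists.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# Hypothetical wrapper that gathers the information in one go.
pmxcfs_checks() {
    # State of the service that provides the /etc/pve FUSE mount.
    run_if_present systemctl status pve-cluster --no-pager
    # Its log messages since the last boot.
    run_if_present journalctl -u pve-cluster -b --no-pager
    # Space on the dataset assumed to hold the pmxcfs database.
    run_if_present zfs list -o name,used,avail,refer rpool/ROOT/pve-1
    # The backing database file itself (default path assumed).
    ls -l /var/lib/pve-cluster/config.db 2>/dev/null || echo "config.db not found"
}

# Uncomment on the affected host (as root):
# pmxcfs_checks
```

The idea is only to capture the service state and logs while the Input/output error is present, so they can be compared with the state right after a reboot.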
Any ideas how this could happen?