Web GUI & system commands not working

Rattlehead.ie

Hi guys
By the looks of it, I have a similar issue to the problem described in another thread, but that thread is (I think) in German and my German isn't great.
I cannot reboot the system normally and have to force it, using:
systemctl --force --force reboot

If I try to run an update I get errors, and the web GUI isn't available.
After a reboot the system behaves for a period of time and then goes back into this errored state. Any help would be great. I'll post some command output below.

Code:
root@vm:~# apt-get update -y
Ign:1 http://security.debian.org bullseye-security InRelease
Err:2 http://security.debian.org bullseye-security Release
  Could not open file /var/lib/apt/lists/partial/security.debian.org_dists_bullseye-security_Release - open (30: Read-only file system) [IP: 199.232.26.132 80]
Hit:3 http://ftp.ie.debian.org/debian bullseye InRelease
Err:3 http://ftp.ie.debian.org/debian bullseye InRelease
  Couldn't create temporary file /tmp/apt.conf.ggGjru for passing config to apt-key
Ign:4 http://ftp.ie.debian.org/debian bullseye-updates InRelease
Err:5 http://ftp.ie.debian.org/debian bullseye-updates Release
  Could not open file /var/lib/apt/lists/partial/ftp.ie.debian.org_debian_dists_bullseye-updates_Release - open (30: Read-only file system) [IP: 130.89.148.12 80]
Hit:6 http://download.proxmox.com/debian bullseye InRelease
Err:6 http://download.proxmox.com/debian bullseye InRelease
  Couldn't create temporary file /tmp/apt.conf.J1AvaA for passing config to apt-key
Hit:7 https://repo.zabbix.com/zabbix-agent2-plugins/1/debian bullseye InRelease
Err:7 https://repo.zabbix.com/zabbix-agent2-plugins/1/debian bullseye InRelease
  Couldn't create temporary file /tmp/apt.conf.jSjpJL for passing config to apt-key
Hit:8 https://repo.zabbix.com/zabbix/6.2/debian bullseye InRelease
Err:8 https://repo.zabbix.com/zabbix/6.2/debian bullseye InRelease
  Couldn't create temporary file /tmp/apt.conf.xK3wxR for passing config to apt-key
Reading package lists... Done
W: chown to _apt:root of directory /var/lib/apt/lists/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: chmod 0700 of directory /var/lib/apt/lists/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: chown to _apt:root of directory /var/lib/apt/lists/auxfiles failed - SetupAPTPartialDirectory (30: Read-only file system)
W: chmod 0755 of directory /var/lib/apt/lists/auxfiles failed - SetupAPTPartialDirectory (30: Read-only file system)

Code:
root@vm:~# df -h
-bash: /usr/bin/df: Input/output error

Any help would be appreciated, or let me know which other commands to run to help diagnose this. The system is an Asus PN50 with an M.2 SSD and a separate disk drive.
 
From the read-only filesystem and input/output errors, I would guess your root filesystem is corrupted or the drive itself is corrupted. Maybe boot the system from a Linux Live CD/USB and test the drive and check the filesystem. I do hope you have backups.
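One low-risk check while the system is still responsive is whether the root filesystem has already been remounted read-only. The sketch below is only an illustration, not part of the advice above: `mount_mode` is a hypothetical helper that reads `/proc/mounts`-style lines on stdin and prints `ro` or `rw` for a given mount point. It assumes `awk` is available, and it says nothing about why a remount happened; the kernel log is the place to look for that.

```shell
# mount_mode MOUNTPOINT
# Hypothetical helper: reads /proc/mounts-style lines on stdin and prints
# "ro" or "rw" for MOUNTPOINT, or "not-mounted" if no line matches.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            mode = "unknown"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
        }
        END {
            if (mode == "") mode = "not-mounted"
            print mode
        }
    '
}

# Usage on a live system (run while commands still work):
#   mount_mode / < /proc/mounts
#   dmesg | grep -iE "nvme|i/o error|remount"   # kernel messages, if any
```

On many Debian-based installs an ext4 root is mounted with `errors=remount-ro`, so an `ro` result after I/O errors would be consistent with the symptoms in this thread, but that is an assumption about this particular setup.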
 
Hi @Ieesteken
I do, although I have a strange feeling my backups are on that drive.
The funny thing is that after a reboot the system is working correctly now, and will for a period of time. Is that still symptomatic of drive or filesystem corruption?
Is there anything I can run while it's in this working state that can help me? Commands work currently.

UPDATE:
72 errors in the log. I think that's what I should be looking at; should it be my concern?
Code:
root@vm:~# smartctl -a /dev/nvme0n1p2
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.108-1-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       CT250P2SSD8
Serial Number:                      xxxxxxxxx
Firmware Version:                   P2CR045
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x00a075
Total NVM Capacity:                 250,059,350,016 [250 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 56b000000b
Local Time is:                      Fri Aug 25 10:19:22 2023 UTC
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     1.90W       -        -    1  1  1  1        0       0
 2 +     1.50W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     5000    1900
 4 -   0.0020W       -        -    4  4  4  4    13000  100000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        61 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    9%
Data Units Read:                    5,632,567 [2.88 TB]
Data Units Written:                 32,016,208 [16.3 TB]
Host Read Commands:                 29,210,878
Host Write Commands:                1,179,474,997
Controller Busy Time:               24,237
Power Cycles:                       44
Power On Hours:                     14,390
Unsafe Shutdowns:                   33
Media and Data Integrity Errors:    0
Error Information Log Entries:      72
Warning  Comp. Temperature Time:    4137
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               76 Celsius
Thermal Temp. 1 Transition Count:   8
Thermal Temp. 2 Transition Count:   1
Thermal Temp. 1 Total Time:         1483587
Thermal Temp. 2 Total Time:         218433

Error Information (NVMe Log 0x01, 16 of 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         72     0  0x7000  0x4005  0x028            0     0     -
 
It's a QLC SSD, so I hope you are not using it with ZFS. The fact that there are 72 entries in the error log does not mean they are all problematic. The drive does seem to warn about high temperatures. Maybe it only causes problems when it is used heavily and gets too hot?
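To keep an eye on the temperature, one option is to compare the composite temperature in the `smartctl -a` output against the drive's own warning threshold. The following is a minimal sketch under that assumption: `nvme_temp_status` is a hypothetical helper name, and it only parses the `Temperature:` and `Warning  Comp. Temp. Threshold:` lines in the format shown in the output above.

```shell
# nvme_temp_status
# Hypothetical helper: reads `smartctl -a` output on stdin and prints
# "ok TEMP WARN" or "hot TEMP WARN" (degrees Celsius), or "unknown" if
# either value is missing from the input.
nvme_temp_status() {
    awk '
        /^Temperature:/                      { temp = $2 }
        /^Warning +Comp\. Temp\. Threshold:/ { warn = $5 }
        END {
            if (temp == "" || warn == "") { print "unknown"; exit }
            status = (temp + 0 >= warn + 0) ? "hot" : "ok"
            print status, temp, warn
        }
    '
}

# Usage (device name is an assumption; adjust to your system):
#   smartctl -a /dev/nvme0n1 | nvme_temp_status
```

With the values posted above (61 Celsius against a 70 Celsius threshold) this would report `ok 61 70`, although the non-zero "Warning Comp. Temperature Time" suggests the threshold has been exceeded at other times.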
 
You are going to have to help me with the ZFS question, @Ieesteken.
Maybe it has heated up since I moved the VM host from the office into the comms cabinet. I'll move it back and check.
Is there a way to clear the error count and monitor it?
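On the monitoring side, the NVMe specification describes "Error Information Log Entries" as a count over the life of the controller, so as far as I know there is no standard command to reset it. A workable alternative is to note the current value and check later whether it has grown. The sketch below assumes the `smartctl -a` output format shown earlier; `nvme_error_entries` is a hypothetical helper name.

```shell
# nvme_error_entries
# Hypothetical helper: reads `smartctl -a` output on stdin and prints the
# "Error Information Log Entries" count as a plain integer.
nvme_error_entries() {
    awk -F: '/^Error Information Log Entries:/ {
        gsub(/[ ,]/, "", $2)   # drop padding and thousands separators
        print $2
    }'
}

# Usage (device name is an assumption; adjust to your system):
#   baseline=$(smartctl -a /dev/nvme0n1 | nvme_error_entries)
#   ... some hours or days later ...
#   now=$(smartctl -a /dev/nvme0n1 | nvme_error_entries)
#   [ "$now" -gt "$baseline" ] && echo "new entries: $((now - baseline))"
```

If the count stays at its baseline while the machine runs cooler, that would point towards heat rather than a steadily failing drive, though it would not prove it.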
 