I must say, I'm very unimpressed with how quickly Proxmox breaks for the smallest reason.
I really thought it would be more capable and stable than this after so many years in production. It's now causing me massive headaches, all because I unplugged and re-plugged a graphics card?
It really cannot be this fragile.
I've been building PCs for over two decades, so I know my way around this. I needed the Nvidia 1070 Ti that I have in my dedicated Proxmox PC for another PC for a short while as I traveled. So while the machine was shut down, I removed the GPU and put it in the other PC. When done, I simply plugged it back into my Proxmox machine. Technically the Proxmox machine didn't even know the GPU was removed.
This caused massive, endless issues, so much so that I cannot even find the Proxmox machine on the network anymore. It's gone from bad to worse within hours and I can't understand how this system is so fragile. Going from a clean running system with 10 VMs for over a year to not being able to find it on the network, all because I re-plugged a GPU, is mind-blowing.
The first problem showed up at boot. Normally it would boot and stay on the screen saying
loading initial ramdisk
until I fired up a VM. Now it shows that message for a few seconds, but then transitions to this message:
Found volume group "pve" using metadata type lvm2
3 logical volume(s) in volume group "pve" now active
I found this odd as I had never seen it before.
I then tried to boot up one of my Windows VMs; it simply refused to show on screen. I then tried Ubuntu, same issue. I then tried removing the PCI device in the Hardware section and re-adding it, still nothing. I then set Display to Default instead of none to try to run it through VNC; that still didn't work. I checked all my settings.
I tried my NAS Scale, which still worked via its own IP address, so the issue was clearly limited to anything that was trying to use the GPU.
After a day of struggling, I got a message saying that it could not capture any more logs as my hard disk was full. Again I found this odd because it's never had issues. I deleted some ISO files to free up space, but saw the freed space quickly filling up with log data while NAS Scale and a VM were running. I checked the log and it had endless messages saying:
pve kernel: vfio-pci 0000:0b:00.0: BAR 1: can't reserve [mem 0xd0000000-0xdfffffff 64bit pref]
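To put a number on how badly a message like the one above is flooding the log, counting the matching lines is a quick check. This is only a minimal sketch, assuming a POSIX shell and a plain-text log such as /var/log/syslog; `count_bar_errors` is a hypothetical helper name, not a Proxmox tool.

```shell
# Count the lines in a log file that contain the repeated vfio-pci BAR message.
# The first argument is the log path; it defaults to /var/log/syslog (assumption).
count_bar_errors() {
    cbe_log="${1:-/var/log/syslog}"
    # -c prints the number of matching lines,
    # -F treats the pattern as a fixed string so "[" is not read as a regex.
    grep -cF "can't reserve [mem" "$cbe_log"
}
```

Running `count_bar_errors /var/log/syslog` a few seconds apart would show how fast the count is growing.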
I then tried to shut them down and refresh the GUI, and now it refused to connect to the GUI, with the error "The connection has timed out".
I then used PuTTY to access the machine, which worked. I ran
ls -allh /var/log/sys*
which output:
[LIST]
[*]-rw-r----- 1 root adm 23G Jun 4 11:51 /var/log/syslog
[*]-rw-r----- 1 root adm 300K Jun 3 13:12 /var/log/syslog.1
[*]-rw-r----- 1 root adm 14K May 15 00:01 /var/log/syslog.2.gz
[*]-rw-r----- 1 root adm 66K May 13 17:47 /var/log/syslog.3.gz
[*]-rw-r----- 1 root adm 162K Apr 30 09:54 /var/log/syslog.4.gz
[/LIST]
That's 23 GB of log data. So then I ran
rm /var/log/syslog
to remove that and clear up space. Now I can't even access it via PuTTY and my router no longer sees it on the LAN. Absolute mess!
I cannot believe that my entire Proxmox server is inaccessible just because I pulled and re-plugged my GPU while it was off. It seriously has to be built better than this. Everything was 100% before I did this, and nothing changed other than that.
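A side note on the rm /var/log/syslog step, for anyone who ends up in the same spot: a file deleted while a logging daemon still has it open keeps occupying disk space until that daemon closes it, so emptying the file in place is the usual alternative. The following is a minimal sketch, assuming a POSIX shell; `empty_log` is a hypothetical helper name. It does nothing about whatever keeps writing the messages, and it is not a claim about why the machine dropped off the network.

```shell
# Empty a log file in place instead of deleting it.
# The file keeps its inode, so a daemon that has it open carries on writing
# to the same (now empty) file and the space is released straight away.
empty_log() {
    el_log="$1"
    # Refuse to act on anything that is not an existing regular file.
    [ -f "$el_log" ] || { echo "no such file: $el_log" >&2; return 1; }
    # ": >" truncates the file to zero bytes without removing it.
    : > "$el_log"
}
```

Usage would be `empty_log /var/log/syslog`, run as root.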