Where do I start...
Here is my current setup: I have a Windows 10 VM and an Unraid VM on a single node that I've been working on for the past week.
My next step was to install my RTX 3070 GPU and pass it through to the Windows 10 VM. In the process I realized that I had set the VM up with machine type i440fx instead of q35, and the BIOS was set to the default (SeaBIOS) instead of OVMF (UEFI), which many people have pointed out is not easy to convert after the fact. I was actually successful at first in following someone's instructions for converting the disk to GPT, and I was able to get it to boot and follow the GPU passthrough steps, but eventually I ran into more issues and decided to just destroy the VM and start over.
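For reference, my understanding is that the CLI equivalent of those settings is roughly the following; VMID 100 and the PCI address 01:00 are placeholders for my setup, so treat this as a sketch rather than exactly what I ran:

# check the current machine type and BIOS in the VM config
root@pve:~# qm config 100
# switch to q35 + OVMF, and add the small EFI vars disk that OVMF needs
root@pve:~# qm set 100 --machine q35 --bios ovmf
root@pve:~# qm set 100 --efidisk0 local-lvm:1
# pass the whole GPU through as a PCIe device (01:00 is just my card's address)
root@pve:~# qm set 100 --hostpci0 01:00,pcie=1,x-vga=1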
Next I set up a new Windows 10 VM from scratch. Just in case, I used a different VMID, although backups were the only thing I could find that might be a problem when reusing a VMID (and I don't have backups set up yet). After following the correct steps from the start (OVMF/UEFI + q35), I got Windows 10 going with the GPU passed through and could see it in Device Manager. But almost immediately the system stopped responding: connecting via RDP was extremely slow and nearly unusable. After following more troubleshooting steps from the forums, I realized that the drive Proxmox was installed on was running out of space.
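I don't remember the exact commands from the forum thread, but the checks amount to something like this (the dump path is what I understand the default local backup location to be, so adjust if yours differs):

# how full is the root filesystem Proxmox lives on?
root@pve:~# df -h /
# the usual suspects: vzdump backup files and runaway logs
root@pve:~# du -sh /var/lib/vz/dump/* 2>/dev/null
root@pve:~# du -sh /var/log/* 2>/dev/null | sort -h | tail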
At that point I couldn't really do anything because the VM was locked, and I realized I had forgotten to uncheck backup on the Windows 10 VM's 100 GB storage drive, so I'm pretty sure that was the real issue. The only option was to forcefully shut down and restart, but when I did, the web GUI would not launch. I was so tired at that point that I just turned the box off and slept. This morning I tried SSH instead and was able to get into the Proxmox machine. Since then I've been looking into why the web interface suddenly became inaccessible.
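In case anyone else ends up here with a VM stuck in a backup lock, my understanding is that the CLI way out would have been roughly this (100 is a placeholder VMID, and I haven't verified this is the cleanest approach):

# see what lock is set on the VM
root@pve:~# qm config 100 | grep -i lock
# clear the lock, then force the VM off
root@pve:~# qm unlock 100
root@pve:~# qm stop 100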
Here is my /etc/hosts:
root@pve:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.1.30 pve.server.local pve
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
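The hosts file looks normal to me. The only extra check I've seen suggested for GUI problems is making sure the hostname resolves to the LAN IP rather than a loopback address, though I'm not sure that's related here:

# confirm the node name resolves to 192.168.1.30 and not 127.0.0.1
root@pve:~# hostname
root@pve:~# getent hosts $(hostname)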
Here is the lsof -i output:
root@pve:~# lsof -i
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 36u IPv4 26789 0t0 TCP *:sunrpc (LISTEN)
systemd 1 root 37u IPv4 23130 0t0 UDP *:sunrpc
systemd 1 root 38u IPv6 24805 0t0 TCP *:sunrpc (LISTEN)
systemd 1 root 39u IPv6 26114 0t0 UDP *:sunrpc
rpcbind 1278 _rpc 4u IPv4 26789 0t0 TCP *:sunrpc (LISTEN)
rpcbind 1278 _rpc 5u IPv4 23130 0t0 UDP *:sunrpc
rpcbind 1278 _rpc 6u IPv6 24805 0t0 TCP *:sunrpc (LISTEN)
rpcbind 1278 _rpc 7u IPv6 26114 0t0 UDP *:sunrpc
sshd 1449 root 3u IPv4 33580 0t0 TCP *:ssh (LISTEN)
sshd 1449 root 4u IPv6 33582 0t0 TCP *:ssh (LISTEN)
chronyd 1467 _chrony 5u IPv4 21412 0t0 UDP localhost.localdomain:323
chronyd 1467 _chrony 6u IPv6 21413 0t0 UDP ip6-localhost:323
pvedaemon 1513 root 6u IPv4 41286 0t0 TCP localhost.localdomain:85 (LISTEN)
pvedaemon 1514 root 6u IPv4 41286 0t0 TCP localhost.localdomain:85 (LISTEN)
pvedaemon 1515 root 6u IPv4 41286 0t0 TCP localhost.localdomain:85 (LISTEN)
pvedaemon 1516 root 6u IPv4 41286 0t0 TCP localhost.localdomain:85 (LISTEN)
spiceprox 1528 www-data 6u IPv6 29286 0t0 TCP *:3128 (LISTEN)
spiceprox 1529 www-data 6u IPv6 29286 0t0 TCP *:3128 (LISTEN)
sshd 1641 root 4u IPv4 33620 0t0 TCP pve.server.local:ssh->192.168.1.36:50362 (ESTABLISHED)
pveproxy 1918 www-data 6u IPv6 40230 0t0 TCP *:8006 (LISTEN)
pveproxy 2256 www-data 6u IPv6 40230 0t0 TCP *:8006 (LISTEN)
pveproxy 2257 www-data 6u IPv6 40230 0t0 TCP *:8006 (LISTEN)
pveproxy 2258 www-data 6u IPv6 40230 0t0 TCP *:8006 (LISTEN)
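So pveproxy is at least listening on 8006. The next check I'm planning is to hit it locally over HTTPS and see whether it answers at all (self-signed cert, hence -k):

root@pve:~# curl -vk https://localhost:8006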
Here is the output from systemctl status pveproxy:
root@pve:~# systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2023-02-09 08:50:02 CST; 2min 55s ago
Process: 1912 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
Process: 1914 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
Main PID: 1918 (pveproxy)
Tasks: 4 (limit: 76996)
Memory: 132.9M
CPU: 3.039s
CGroup: /system.slice/pveproxy.service
├─1918 pveproxy
├─2042 pveproxy worker
├─2043 pveproxy worker
└─2044 pveproxy worker
Feb 09 08:52:57 pve pveproxy[1918]: worker 2041 finished
Feb 09 08:52:57 pve pveproxy[1918]: worker 2039 finished
Feb 09 08:52:57 pve pveproxy[1918]: worker 2040 finished
Feb 09 08:52:57 pve pveproxy[1918]: starting 3 worker(s)
Feb 09 08:52:57 pve pveproxy[1918]: worker 2042 started
Feb 09 08:52:57 pve pveproxy[1918]: worker 2043 started
Feb 09 08:52:57 pve pveproxy[1918]: worker 2044 started
Feb 09 08:52:57 pve pveproxy[2042]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/pe>
Feb 09 08:52:57 pve pveproxy[2043]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/pe>
Feb 09 08:52:57 pve pveproxy[2044]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/pe>
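The lines that worry me are the "failed to load local private key" errors and the ExecStartPre pvecm updatecerts exiting with status 111. My guess, and it is only a guess, is that /etc/pve isn't populated because pve-cluster didn't come up cleanly, quite possibly because the root disk filled up, and that is why pve-ssl.key can't be loaded. Once I've freed some space, this is the order I'm planning to check and repair things in, based on what I found on the forums, so corrections are welcome:

# is the cluster filesystem that provides /etc/pve actually running?
root@pve:~# systemctl status pve-cluster
root@pve:~# ls -l /etc/pve/local/
# with free space available again, regenerate the node certificates
root@pve:~# pvecm updatecerts -f
# then restart the API daemon and the web proxy
root@pve:~# systemctl restart pvedaemon pveproxy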
- First task is to get the web GUI accessible again.
- Second task is to confirm that the backup filling the disk (or some runaway logging) is what caused the VM to lock up, then set the Windows 10 VM's drive to not be backed up and clear out the excess data.
- Third task is to verify that GPU passthrough is actually working and that the VM can handle GPU workloads (rendering, games, etc.). The commands I have in mind for this and the previous task are below.
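These are the commands I'm expecting to use for the second and third tasks; scsi0, the volume name, and VMID 100 are placeholders for whatever my config actually says, so treat this as a sketch:

# task 2: check the disk line, then re-set it with backup=0 so vzdump skips it
# (the GUI equivalent is unticking Backup on the disk)
root@pve:~# qm config 100 | grep scsi0
root@pve:~# qm set 100 --scsi0 local-lvm:vm-100-disk-0,backup=0
# remove any partial dump files that ate the space
root@pve:~# ls -lh /var/lib/vz/dump/
# task 3: confirm the GPU is bound to vfio-pci on the host;
# inside the guest I'll check Device Manager and run a GPU benchmark
root@pve:~# lspci -nnk | grep -A3 -i nvidia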
Thanks,
kirkyg