I have a 4-node PVE cluster, and one of my nodes needs to be re-built from scratch (I'm re-arranging how the ZFS pool is allocated and rebuilding some of the VDEVs, including the rpool partition).
This host's hostname is embedded across various maintenance scripts which I use, so it would be...
With respect - you're coming off as almost hostile, @fabian.
@crackers819903 said they've already tried various combinations of directives in their systemd setup. This doesn't strike me as a "Help I'm a noob, I'm in completely over my head" post. It's more "Hey I've tried x, y, and z, and here...
I'd be interested in seeing what's up with this as well. I want to enable ZFS's built-in NFS server, but I'll hold off if it's known to require manual restarting after reboots.
Hmmm, I'll have a look at this. I was hoping to use plain-vanilla ZFS snapshots because I intend to use this same process to snapshot some other (non-PVE) ZFS datasets as well. But your points about fsfreeze and RAM dumps are good - maybe I just need to split my automatic-snapshot...
I keep my LXC container and QM VM disks on a ZFS filesystem. I don't know much about how PVE manages the ZFS snapshots associated with a given container/vm, but I do see in my filesystem that it does have various snapshots which it creates for migrations etc:
root@node1:~# zfs list -r -t...
I have a 4-node cluster with nodes A, B, C, and D. I recently had to take nodes A and B offline to install a new NIC in them.
I wanted to watch their status on the web GUI of node C while I waited for them to boot back up, so after I shut down nodes A and B, I pulled up node C's web GUI. I was...
Thanks for the advice. I will continue using proper PVE snapshots for my containers/VMs. This zfs-auto-snapshot was only intended to create snapshots for my non-PVE ZFS datasets - I just had not anticipated that it would ALSO snapshot my PVE ZFS datasets, and was worried it might mess up...
I use Proxmox to host a bunch of containers and VMs, but I have also created some other (non-PVE-related) ZFS pools/datasets on my PVE host. I was exploring 3rd-party scripts for automatically creating snapshots of these datasets (specifically this one...
Honestly I'm just gonna nuke the node and reinstall. There's a decent chance that it's not even the node's fault, and I've messed up something upstream in my network. We could spend days on it and still find nothing.
I appreciate your help along the way though - Thank you for your time! I'll...
I'm closing this thread, as I think the root issues are probably in the missing permissions mentioned in my previous post. I've opened a fresh thread which will address this issue directly, instead of going down rabbitholes of symptoms rather than the root cause.
Thanks for your help, @mira ...
While troubleshooting another issue (couldn't access web gui), I discovered that the write permissions for most of my /etc/pve/ directory are absent on one of my nodes (XXXX):
root@XXXX:/etc/pve# ll
total 14K
drwxr-xr-x 2 root www-data 0 Dec 31 1969 .
drwxr-xr-x 87 root root 177 Nov...
Well this is very very strange - Poking around the filesystem, I see that write permissions for most of my /etc/pve/ directory are absent on host XXXX:
root@XXXX:/etc/pve# ll
total 14K
drwxr-xr-x 2 root www-data 0 Dec 31 1969 .
drwxr-xr-x 87 root root 177 Nov 18 10:57 ..
-r--r----- 1...
It's a new install. I'm actually thinking of just decommissioning it and rebuilding it - it's not currently hosting anything important, and ultimately may be less trouble. But I hate not knowing what's causing it. :-(
One other interesting behavior I'm noticing (don't know if it's relevant) is that when I SSH into YYYY (the healthy node) from my desktop, the connection is immediate. However when I SSH into XXXX (the unhealthy one), it takes a good solid 30 seconds for it to connect. This is odd, because XXXX...
I don't get an error when accessing the web GUI - just a timeout.
I first noticed something was wrong when I saw a red "X" on the node in the web GUI of one of my OTHER nodes in the cluster. Note that in this screenshot, I'm accessing node YYYY's web GUI - node YYYY seems...
You bet:
root@xxxx:~# ss -tlpn
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 127.0.0.1:85 0.0.0.0:* users:(("pvedaemon worke",pid=9392,fd=6),("pvedaemon worke",pid=9391,fd=6),("pvedaemon...