Individual node in cluster doesn't see shared storage

ravenssettle · Jul 17, 2024

I have a three node cluster where they're named/numbered prox0-2. I have a TrueNAS server in a separate machine. Nodes 0 and 2 see my Share, but node 1 doesn't. In the server view list it shows a ? over Share (prox1).

Additionally I'm having trouble putting the share as the location for saving my VM/LXC data as my machines fail during creation, but one step at a time I suppose

When I click on Share (prox1) I get "mount error: refer to the mount.cifs(8) (e.g. man mount.cifs) manual page and kernel log messages (dmesg) (500)"

I'm not sure where my next steps should be and would appreciate any advice. I'm going to go see what's in the kernel log messages and I'll post here if I find anything.

jsterr · Jul 17, 2024

Please specify your questions and share some more information (logs, errors, screenshots). If node 0 and 2 see the share, whats different on node1? did you check, switch, cables, vlans etc. what share are we talking bout? NFS?

ravenssettle · Jul 17, 2024

Sorry about that - accidentally posted before I was done typing and I've added some more info to the OP.

Share is SMB/CIFS. Switch/cables/vlans are all good. I can connect to any of the nodes and bounce between them. Below is my /etc/pve/storage.cfg from prox0:

GNU nano 7.2 /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content iso,vztmpl,backup
shared 1

lvmthin: local-lvm
thinpool data
vgname pve
content rootdir,images
nodes prox3,prox0,prox2,prox1

cifs: Share
path /mnt/pve/Share
server truenas
share Prox
content images,iso,vztmpl,rootdir,snippets
prune-backups keep-all=1
username prox

pbs: Backups
datastore Backup
server prox2
content backup
encryption-key "gotta rotate that now - oops"
prune-backups keep-all=1
username root

And prox1:

GNU nano 7.2 /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content iso,vztmpl,backup
shared 1

lvmthin: local-lvm
thinpool data
vgname pve
content rootdir,images
nodes prox3,prox0,prox2,prox1

cifs: Share
path /mnt/pve/Share
server truenas
share Prox
content images,iso,vztmpl,rootdir,snippets
prune-backups keep-all=1
username prox

pbs: Backups
datastore Backup
server prox2
content backup
encryption-key "gotta rotate that now - oops"
prune-backups keep-all=1
username root

ravenssettle · Jul 17, 2024

Removing and readding the storage doesn't help. I'm considering deleting the node, reinstalling Prox, and then readding the node, but I've had bad luck when trying that before and had to reload the whole cluster from zero.

Does anyone know what commands I should run to show y'all what's going on?

ravenssettle · Aug 7, 2024

ravenssettle said:
Removing and readding the storage doesn't help. I'm considering deleting the node, reinstalling Prox, and then readding the node, but I've had bad luck when trying that before and had to reload the whole cluster from zero.

Does anyone know what commands I should run to show y'all what's going on?

I uninstalled and reinstalled Proxmox on this server. Still can't mount my share.

Here's the error from syslog:
Aug 07 16:11:50 prox1 pve-firewall[915]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Aug 07 16:11:51 prox1 pvestatd[935]: mount error: Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)

When trying to create a VM I got an error saying "mount error: cifs filesystem not supported by the system" so I'll chase that error for a bit.

When I'm looking in dmesg I don't see anything about my network mounting. Can anyone let me know what to look for?

LnxBil · Aug 9, 2024

Get the actual command that is not working, rerun it and post the ouput as well as the last lines of dmesg.

ravenssettle · Sep 29, 2024

Okay, so I found the root cause of this. Not sure if I should make a new thread or keep this one, but, for now, I'll keep it here.

The issue is that my kernel is stuck in an old ass version. Through hours of hunting I finally found that the kernel was 6.5.blah. I reimaged the server, didn't change. I tried forcing grub to select the newest kernel, didn't change. When I looked in the kernel directory to see what kernels were in there (and to delete 6.5) there were the latest three kernel versions.
I finally managed to get to an updated kernel because I remembered that there's two drives in this machine. I reimaged both drives, which now gives an error on boot saying there's no boot partition. I have to press okay, then select the primary drive. This is annoying, but not the end of the world as I'll try to not reboot as much as possible.

As soon as the kernel was updated and I re-added the server to my cluster it was able to connect to my share immediately.

I *always* update via dist-upgrade, so the kernel should get updated every time it's possible. Should I use full-upgrade instead?

I have zero clue how this old kernel got stuck. I have no idea why I couldn't force grub to select the right kernel.
Now I'm running 6.8.4-2-pve, which is not the current kernel version. I'm not sure how I could update it to the current one. Does anyone have any ideas of where I should look? What's gonna happen when the kernel gets older again if I can't update it?

Also: my cluster has three computers. Two laptops and a mini pc. It's one of the laptops (the newer and more powerful one, which is why I'm so stubborn on getting it working) that is having the kernel problems.

ravenssettle · Sep 29, 2024

I've found kernel pinning, so I guess that'll be my answer moving forward. Is there a way to mark this thread as solved?

Search

Search

Individual node in cluster doesn't see shared storage

ravenssettle

New Member

jsterr

Renowned Member

ravenssettle

New Member

ravenssettle

New Member

ravenssettle

New Member

LnxBil

Distinguished Member

ravenssettle

New Member

ravenssettle

New Member

We value your privacy