Node showing question mark

Sakamoto

New Member
Oct 25, 2024
Hi all,

I need some help with my Proxmox server. It sometimes becomes unresponsive without warning and shows up with a question mark in the GUI. When I open the KVM console, I can see a "watchdog: bug" error. I can't do anything in the console: if I run a command it responds for a moment, and then the watchdog error keeps popping up. Has anyone experienced the same problem?

My current workaround is to restart the server and wait until the error occurs again.


1742527800275.png



1742527738204.png
 
Hi

About your post: this is the national (German) part of the forum. You may not get as many answers in English here.
About your problem: do you need the multipath daemon? Do you have a storage array attached through multiple connections? Is it a cluster or a standalone system? As a test, you could disable the multipath daemon and see whether that helps.
 
Hi @Welby, sorry for that.

To answer your questions: yes, I need multipath for my storage, and yes, I have a storage array attached through multiple connections.
This is a cluster of 3 nodes.
I believe that disabling multipath would affect my storage.
 
Please check whether the storage is OK and whether the storage paths are OK.
If you can still access the CLI, check with multipath -l.
To me this clearly looks like a loss of all paths.
 
Hi @Falk R.

For now, the server is working, and this is the output of multipath -l:

1742872866283.png

I will try to run multipath -l when this issue happens again. I can confirm that the CLI can no longer be accessed at that point, but I can still reach the server directly through the KVM console.
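Because the CLI is no longer reachable once the problem starts, one option is to record the path state periodically so that the last snapshot before a hang survives. This is only a sketch: the function name `snapshot` and the log path in the usage note are hypothetical, and it assumes the log file lives on a local disk rather than on the SAN.

```shell
# Append a timestamped snapshot of a command's output to a log file.
# Usage: snapshot <logfile> <command> [args...]
snapshot() {
  log="$1"; shift
  {
    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    "$@" 2>&1   # capture both stdout and stderr of the command
  } >> "$log"
}
```

It could be called from cron or a simple loop, for example `snapshot /var/log/multipath-snapshots.log multipath -l`.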
 
You have a problem with your SAN. 2 LUNs have only 3 paths; all the others have 4.
That is not normal.
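Counting the paths per LUN by eye is easy to get wrong, so the check can be scripted. The sketch below reads multipath -l output on stdin and assumes its usual layout: a header line per map containing " dm-N ", followed by path lines that contain a SCSI address of the form host:channel:target:lun. The function names `count_paths` and `short_maps` are made up for illustration, and the default of 4 expected paths comes from this thread.

```shell
# Print "<map name> <path count>" for each multipath map read from stdin.
count_paths() {
  awk '
    # Map header line, e.g. "mpatha (3600...) dm-2 IBM,1746 FAStT"
    / dm-[0-9]+ / {
      if (name != "") print name, n
      name = $1; n = 0
      next
    }
    # Path line: contains a SCSI address such as "3:0:0:1"
    /[0-9]+:[0-9]+:[0-9]+:[0-9]+/ { n++ }
    END { if (name != "") print name, n }
  '
}

# Print only the maps with fewer paths than expected (default: 4).
short_maps() {
  expected="${1:-4}"
  count_paths | awk -v want="$expected" '$2 < want'
}
```

Usage would be `multipath -l | short_maps 4`; no output means every map has at least 4 paths.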
 
It now looks like all LUNs have 4 paths after I ran ( echo "- - -" > /sys/class/scsi_host/host4/scan ). I have 2 hosts, host3 and host4, and I ran the command for both.

1742887591088.png
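The same rescan can be applied to every SCSI host in one go instead of naming host3 and host4 by hand. This is a minimal sketch: the function name `rescan_scsi_hosts` and the `SCSI_HOST_DIR` override are hypothetical (the override exists only so the loop can be dry-run against a scratch directory), and writing to the real scan files needs root.

```shell
# Ask every SCSI host adapter to rescan all channels, targets, and LUNs.
# SCSI_HOST_DIR is a hypothetical override for dry runs; by default the
# real sysfs directory is used.
rescan_scsi_hosts() {
  dir="${SCSI_HOST_DIR:-/sys/class/scsi_host}"
  for scan in "$dir"/host*/scan; do
    [ -e "$scan" ] || continue   # the glob matched nothing
    echo "rescanning ${scan%/scan}"
    echo "- - -" > "$scan"       # wildcard channel, target, and LUN
  done
}
```

A rescan only rediscovers paths that are reachable again; it does not explain why a path went missing in the first place.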
 
However, this still indicates that you had an error in your storage network and this error may still be present.
 
But the IBM DS35xx is also very old (the last ones were manufactured 10 years ago). I haven't seen one for many years.
Perhaps SFPs have aged or other signs of ageing are becoming noticeable.
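If the array is attached over Fibre Channel (an assumption; the thread does not say which transport is in use), ageing SFPs or cables often show up as rising link error counters on the HBA ports. The sketch below prints the port state and a few of the counters the Linux FC transport class exposes in sysfs. The function name `fc_link_report` and the `FC_HOST_DIR` override are hypothetical; the override is there only for testing.

```shell
# Print port state and link error counters for every Fibre Channel HBA port.
# FC_HOST_DIR is a hypothetical override used only for testing.
fc_link_report() {
  dir="${FC_HOST_DIR:-/sys/class/fc_host}"
  for host in "$dir"/host*; do
    [ -d "$host" ] || continue   # no FC HBA ports present
    printf '%s port_state=%s\n' "${host##*/}" "$(cat "$host/port_state")"
    for counter in link_failure_count loss_of_sync_count \
                   loss_of_signal_count invalid_crc_count; do
      printf '  %s=%s\n' "$counter" "$(cat "$host/statistics/$counter")"
    done
  done
}
```

Comparing two reports taken some hours apart shows whether any counter is still increasing, which would point at a link that is degrading rather than one that failed once.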
 