Hi Everyone,
We have two issues with our Cluster:
- Creating a new disk image on newly attached NFS storage does not work
- Disabling newly attached NFS storage caused a Node / server restart
This is our setup:
Cluster of PVE Nodes
Diskless PVE Nodes booting over iSCSI from Netapp LUNs
VM images stored on NFS shares from Netapp volumes
Nodes are Dell PowerEdge R[46]XXs with bonded 10GbE NICs to the SAN
PVE Manager Version: pve-manager/7.1-10
Kernel Version: Linux 5.13.19-3-pve #1 SMP PVE 5.13.19-7
Details of first issue:
I created a new NFS share on the Netapp and added it to the Cluster in the Storage section of the web UI.
I waited a few seconds for all Nodes to report the new Storage as online.
After that, I tried to add a new raw disk image on this share to a stopped VM hosted on Node pve2.
The task started, but nothing happened for a while. After waiting about a minute, I stopped the task manually.
The question is what could be wrong with the image creation. This kind of task normally completes immediately.
I sometimes observe this behavior on newly mounted NFS storage. Afterwards I usually apply a workaround, such as deleting and re-adding the NFS storage to the Cluster, or restarting the whole process and recreating the volume on the Netapp.
Details of second issue:
This time I chose another workaround: I disabled the Storage in the web UI. This was a bad choice, because the Node restarted without any warning. What could be the problem here?
These are the related logs in syslog:
Code:
Oct 4 11:44:41 pve2 pvedaemon[936905]: <ops@pam> update VM 189: -scsi0 mail_srv:50,format=raw,backup=0
Oct 4 11:44:41 pve2 pvedaemon[936905]: <ops@pam> starting task UPID:pve2:0018F8D6:06B21A27:633C0089:qmconfig:189:ops@pam:
Oct 4 11:45:36 pve2 pvedaemon[1636566]: VM 189 creating disks failed
Oct 4 11:45:36 pve2 pvedaemon[1636566]: unable to create image: received interrupt
Oct 4 11:45:36 pve2 pvedaemon[936905]: <ops@pam> end task UPID:pve2:0018F8D6:06B21A27:633C0089:qmconfig:189:ops@pam: unable to create image: received interrupt
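To put a number on how long the task hung, the start and end lines above can be compared. This is only a minimal sketch in Python: the year 2022 is an assumption taken from the Dell Lifecycle Log timestamp (syslog lines omit the year), and `parse_ts` and `task_duration_seconds` are illustrative helper names, not part of any PVE tool.

```python
from datetime import datetime

# The two task lines from the syslog excerpt above.
LOG_LINES = [
    "Oct 4 11:44:41 pve2 pvedaemon[936905]: <ops@pam> starting task "
    "UPID:pve2:0018F8D6:06B21A27:633C0089:qmconfig:189:ops@pam:",
    "Oct 4 11:45:36 pve2 pvedaemon[936905]: <ops@pam> end task "
    "UPID:pve2:0018F8D6:06B21A27:633C0089:qmconfig:189:ops@pam: "
    "unable to create image: received interrupt",
]


def parse_ts(line, year=2022):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp (year is assumed)."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def task_duration_seconds(lines):
    """Return the seconds between the 'starting task' and 'end task' lines."""
    start = end = None
    for line in lines:
        if "starting task UPID:" in line:
            start = parse_ts(line)
        elif "end task UPID:" in line:
            end = parse_ts(line)
    if start is None or end is None:
        raise ValueError("start or end task line not found")
    return (end - start).total_seconds()


if __name__ == "__main__":
    print(task_duration_seconds(LOG_LINES), "seconds between task start and end")
```

The result matches the roughly one minute I waited before stopping the task, so the whole duration was spent in the image creation itself.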
These are the last few lines before the crash, followed by the first lines after the reboot:
Code:
Oct 4 11:45:45 pve2 corosync[3492]: [KNET ] pmtud: Starting PMTUD for host: 2 link: 0
Oct 4 11:45:45 pve2 corosync[3492]: [KNET ] udp: detected kernel MTU: 9000
Oct 4 11:45:45 pve2 corosync[3492]: [KNET ] pmtud: PMTUD completed for host: 2 link: 0 current link mtu: 8885
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Oct 4 11:49:56 pve2 systemd-modules-load[835]: Inserted module 'ib_iser'
Oct 4 11:49:56 pve2 kernel: [ 0.000000] Linux version 5.13.19-3-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.13.19-7 (Thu, 20 Jan 2022 16:37:56 +0100) ()
...
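The run of `^@` (NUL) characters between the last corosync line and the first boot line is where logging was cut off. I assume it means the Node went down without flushing the log, which would fit a hard reset. A minimal Python sketch for locating such gaps in a syslog file follows; the year 2022 and the helper names `parse_ts` and `find_nul_gaps` are assumptions for illustration, and the sample lines are shortened copies of the excerpt above.

```python
import re
from datetime import datetime

# Shortened sample taken from the excerpt above; "^@" is caret notation for NUL.
SAMPLE = [
    "Oct 4 11:45:45 pve2 corosync[3492]: [KNET ] pmtud: PMTUD completed for host: 2 link: 0 current link mtu: 8885",
    "^@^@^@^@^@^@^@^@",
    "Oct 4 11:49:56 pve2 systemd-modules-load[835]: Inserted module 'ib_iser'",
]

# Matches a leading run of NULs, either raw bytes or caret notation.
NUL_PREFIX = re.compile(r"^(?:\^@|\x00)+")


def parse_ts(line, year=2022):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp (year is assumed)."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_nul_gaps(lines):
    """Return (last_before, first_after) timestamp pairs around each NUL run."""
    gaps = []
    last_ts = None
    pending = False
    for line in lines:
        match = NUL_PREFIX.match(line)
        if match:
            pending = True
            line = line[match.end():]  # NULs may prefix the next real entry
        if not line.strip():
            continue
        try:
            ts = parse_ts(line)
        except ValueError:
            continue  # not a timestamped entry, e.g. an elided "..." line
        if pending and last_ts is not None:
            gaps.append((last_ts, ts))
        pending = False
        last_ts = ts
    return gaps


if __name__ == "__main__":
    for before, after in find_nul_gaps(SAMPLE):
        print(before, "->", after, f"({(after - before).total_seconds():.0f} s without log entries)")
```

For this excerpt the gap runs from 11:45:45 to 11:49:56, so nothing was logged in the seconds after the failed task ended and before the reset.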
I cannot find any more logs about the incident. There is one entry in the Dell Lifecycle Log:
SYS1003: System CPU Resetting.
2022-10-04T11:47:31-0500
Log Sequence Number: 1343
Detailed Description:
System is performing a CPU reset because of system power off, power on or a warm reset like CTRL-ALT-DEL.
Any suggestions are welcome and appreciated.
Best Regards,
Attila