Scenario:
A Proxmox host is connected to multiple LVM-backed datastores over Fibre Channel (FC) from a mirrored SAN setup. Each datastore is used by a separate VM, and synthetic I/O load was running on additional disks attached to these VMs.
During a planned SAN upgrade, while the I/O load was still active, the following happened (the node 1 upgrade went fine; the issue appears whenever I start the upgrade on the 2nd server):
- All VMs experienced I/O failures (e.g., Error 1117 – I/O device error).
- On the Proxmox host, pvs showed no LVM physical volumes (indicating that all SAN-backed volumes had disappeared).
- On the SAN side, multiple previously active FC ports became unavailable.
Question:
What are the most relevant logs to collect on the Proxmox host when LVM volumes suddenly disappear or become inaccessible due to SAN or FC-related events?
Specifically:
- Which logs show storage device removal, I/O errors, or path failures?
- Are there specific Fibre Channel or multipath-related logs worth checking?
- Are there any recommended commands or logging methods to capture such events in more detail, either during the issue or for postmortem analysis?
Notes:
Already checked:
- dmesg – Shows some I/O-related messages
- journalctl – Reviewing kernel and udev events
- multipath -ll – Shows that devices were removed or had lost paths
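To make the checks above repeatable, here is a minimal sketch of how their output could be saved the moment the issue appears, so the snapshots can be compared afterwards. This is an illustration, not what was run during the incident: the function name collect_snapshot, the file naming, and the chosen flags (dmesg -T, journalctl -k -b --no-pager) are my own assumptions and should be adjusted to the environment.

```python
import datetime
import pathlib
import subprocess

# Commands mentioned in this post (dmesg, journalctl, multipath -ll, pvs).
# The extra flags are assumptions: -T for readable timestamps, -k -b for
# kernel messages of the current boot.
DEFAULT_COMMANDS = {
    "dmesg": ["dmesg", "-T"],
    "journal_kernel": ["journalctl", "-k", "-b", "--no-pager"],
    "multipath": ["multipath", "-ll"],
    "pvs": ["pvs"],
}


def collect_snapshot(commands, base_dir):
    """Run each command and save its output under a timestamped directory.

    Returns the directory that was used. A command that is missing or
    times out is noted in its output file instead of aborting the run.
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    out_dir = pathlib.Path(base_dir) / f"snapshot-{stamp}"
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=60
            )
            body = result.stdout + result.stderr
            body += f"\n[exit code {result.returncode}]\n"
        except (OSError, subprocess.TimeoutExpired) as exc:
            body = f"[could not run {argv!r}: {exc}]\n"
        (out_dir / f"{name}.txt").write_text(body)
    return out_dir
```

Calling collect_snapshot(DEFAULT_COMMANDS, "/var/tmp/san-debug") as root on the Proxmox host (the path is a hypothetical example) would leave one text file per command in a new snapshot directory. Taking one snapshot before the upgrade and one right after the volumes vanish gives two sets of files to diff.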
Would appreciate any advice on what to collect or monitor during these kinds of events.