Proxmox does not come up properly with a dying disk

Sebastian Schubert

Aug 28, 2017
Hi there,

I just ran into an issue with a failing device (an SSD decided to die).
After rebooting the node, it won't bring up the network interfaces, because "ifupdown2-pre.service" doesn't succeed. That service is basically a "/bin/udevadm settle", which waits until the udev event queue is empty.
But because of the failing disk, udev never settles, and voila: your network won't come up.

Code:
[ 1939.690975] blk_update_request: I/O error, dev sdc, sector 2000409088 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 1939.692053] Buffer I/O error on dev sdc, logical block 250051136, async page read
[ 1940.149921] sd 0:0:3:0: Power-on or device reset occurred
[ 1940.190951] mpt3sas_cm0: log_info(0x31120b10): originator(PL), code(0x12), sub_code(0x0b10)
[ 1940.190958] sd 0:0:3:0: [sdc] tag#2196 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[ 1940.190960] sd 0:0:3:0: [sdc] tag#2196 CDB: Read(10) 28 00 77 3b d2 00 00 00 08 00
[ 1940.190962] blk_update_request: I/O error, dev sdc, sector 2000409088 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
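
To see quickly which device is behind errors like the ones above, the kernel log can be filtered for the device name. This is only a minimal sketch: `failing_devs` is a made-up helper name, and it assumes the messages look like the two "I/O error" formats shown in this log.

```shell
# Hypothetical helper: read kernel log text on stdin and print each block
# device that appears in an "I/O error, dev X" or "I/O error on dev X"
# message, once per device.
failing_devs() {
    grep -oE 'I/O error(,| on) dev [a-z0-9]+' | awk '{print $NF}' | sort -u
}

# Usage (on the affected node):
#   dmesg | failing_devs
```

On the log above this should list only sdc, which is the device to take out of the way.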

Manually deleting the device (echo 1 > /sys/block/sdc/device/delete) and restarting the services gave me a working system again.
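
The workaround above could be wrapped up roughly like this. It is a sketch, not an official Proxmox procedure: `drop_disk` and the `SYSFS_ROOT` override are hypothetical names added so the function can be tried against a scratch directory first, and the exact services to restart are an assumption (the post only names ifupdown2-pre.service).

```shell
# Hypothetical helper: detach a disk from the kernel by writing 1 to its
# sysfs "delete" node, as in the manual workaround.
# SYSFS_ROOT is an assumed override for dry runs; it defaults to /sys.
drop_disk() {
    dev="$1"
    node="${SYSFS_ROOT:-/sys}/block/$dev/device/delete"
    if [ ! -e "$node" ]; then
        echo "no delete node for device: $dev" >&2
        return 1
    fi
    echo 1 > "$node"
}

# Usage (as root, on the affected node):
#   drop_disk sdc
#   # Assumption: these are the services that need restarting afterwards.
#   systemctl restart ifupdown2-pre.service networking.service
```

Once the device is gone, udev has nothing left to wait on for that disk, so the settle step should be able to finish.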

I think this behaviour should be fixed, as it prevents recovery of a node in trouble by denying network connectivity :-/
 
I had the exact same issue with my new install. I finally gave up and am just going to rip out the drives.