Node is very slow after upgrade (apt-get dist-upgrade)

Xing

Member
Jan 23, 2019
Hello

I have a problem after upgrading Proxmox (apt-get dist-upgrade): my server node hangs after the message "No /etc/kernel/pve-efiboot-uuids found, skipping ESP sync."


I can't open the VM console and the server node is very slow.


I have attached the log below.

Please help me

Thank you,
 

Attachments

hi,

from the syslog it looks to me like a failing drive..

Code:
May 12 19:41:11 pxGAME47 kernel: [2260311.627269] sd 0:0:0:0: [sda] tag#131 Sense Key : Medium Error [current] 
May 12 19:41:11 pxGAME47 kernel: [2260311.627271] sd 0:0:0:0: [sda] tag#131 Add. Sense: Unrecovered read error - auto reallocate failed
May 12 19:41:11 pxGAME47 kernel: [2260311.627274] sd 0:0:0:0: [sda] tag#131 CDB: Read(10) 28 00 00 00 08 08 00 00 08 00
May 12 19:41:11 pxGAME47 kernel: [2260311.627276] blk_update_request: I/O error, dev sda, sector 2061 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
May 12 19:41:11 pxGAME47 kernel: [2260311.628199] Buffer I/O error on dev sda1, logical block 1, async page read
May 12 19:41:11 pxGAME47 kernel: [2260311.629068] ata7: EH complete
May 12 19:41:11 pxGAME47 kernel: [2260311.629079] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
May 12 19:41:14 pxGAME47 kernel: [2260314.430134] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
May 12 19:41:14 pxGAME47 kernel: [2260314.430139] sas: ata7: end_device-0:0: cmd error handler
May 12 19:41:14 pxGAME47 kernel: [2260314.430158] sas: ata7: end_device-0:0: dev error handler
May 12 19:41:14 pxGAME47 kernel: [2260314.430163] sas: ata8: end_device-0:1: dev error handler
May 12 19:41:14 pxGAME47 kernel: [2260314.430165] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
May 12 19:41:14 pxGAME47 kernel: [2260314.430169] sas: ata9: end_device-0:2: dev error handler
May 12 19:41:14 pxGAME47 kernel: [2260314.430169] ata7.00: failed command: READ DMA
May 12 19:41:14 pxGAME47 kernel: [2260314.430174] ata7.00: cmd c8/00:08:08:08:00/00:00:00:00:00/e0 tag 29 dma 4096 in
May 12 19:41:14 pxGAME47 kernel: [2260314.430174]          res 51/40:00:0d:08:00/00:00:00:00:00/00 Emask 0x9 (media error)
May 12 19:41:14 pxGAME47 kernel: [2260314.430175] ata7.00: status: { DRDY ERR }
May 12 19:41:14 pxGAME47 kernel: [2260314.430176] ata7.00: error: { UNC }
May 12 19:41:14 pxGAME47 kernel: [2260314.485489] ata7.00: configured for UDMA/133
May 12 19:41:14 pxGAME47 kernel: [2260314.485511] sd 0:0:0:0: [sda] tag#143 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 12 19:41:14 pxGAME47 kernel: [2260314.485514] sd 0:0:0:0: [sda] tag#143 Sense Key : Medium Error [current] 
May 12 19:41:14 pxGAME47 kernel: [2260314.485517] sd 0:0:0:0: [sda] tag#143 Add. Sense: Unrecovered read error - auto reallocate failed
May 12 19:41:14 pxGAME47 kernel: [2260314.485521] sd 0:0:0:0: [sda] tag#143 CDB: Read(10) 28 00 00 00 08 08 00 00 08 00
May 12 19:41:14 pxGAME47 kernel: [2260314.485525] blk_update_request: I/O error, dev sda, sector 2061 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
May 12 19:41:14 pxGAME47 kernel: [2260314.487108] Buffer I/O error on dev sda1, logical block 1, async page read
May 12 19:41:14 pxGAME47 kernel: [2260314.488448] ata7: EH complete
May 12 19:41:14 pxGAME47 kernel: [2260314.488459] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
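
to see how often the same sector fails, the kernel lines above can be tallied with a small script. this is only a sketch: `count_io_errors` and the regex are names made up for illustration, and it assumes the log uses the same "I/O error, dev ..., sector ..." wording as the excerpt.

```python
import re
from collections import Counter

# Matches kernel block-layer failures such as:
#   blk_update_request: I/O error, dev sda, sector 2061 op 0x0:(READ) ...
# "Buffer I/O error on dev sda1" lines use different wording and are skipped.
IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def count_io_errors(lines):
    """Return a Counter keyed by (device, sector) for every I/O error line."""
    hits = Counter()
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            hits[(match.group(1), int(match.group(2)))] += 1
    return hits
```

passing it an open log file, for example `count_io_errors(open("/var/log/syslog", errors="replace"))`, gives one entry per failing sector. in the excerpt above both I/O errors hit sda sector 2061, which is consistent with one bad area on the disk.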

try to check with smartctl, but my guess is most likely you will need to fix/replace the drive.
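
a rough sketch of what such a smartctl check could look like when scripted. it assumes smartmontools is installed, the disk is /dev/sda as in the log, and the script runs as root. the function names and the `WATCHED` list are illustrative choices, not an official set of failure indicators.

```python
import subprocess

# Attributes that commonly rise when a disk develops bad sectors (assumed list).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Extract {attribute_name: raw_value} from `smartctl -A` table output."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            values[fields[1]] = int(raw) if raw.isdigit() else raw
    return values


def suspicious_attributes(values):
    """Return the watched attributes whose raw value is a non-zero integer."""
    return {
        name: raw
        for name, raw in values.items()
        if name in WATCHED and isinstance(raw, int) and raw > 0
    }


def read_smart_attributes(device="/dev/sda"):
    """Run `smartctl -A` on the device (needs root) and parse its output."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)
```

`suspicious_attributes(read_smart_attributes("/dev/sda"))` returning anything non-empty would be a reason to look closer. an empty result does not prove the disk is healthy.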

taking a dd or clonezilla backup can come in handy (probably can still recover data from the drive)
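
for the dd route, a minimal sketch of an imaging command that keeps going past unreadable sectors. the helper names and the paths in the usage note are hypothetical, the target image must live on a different, healthy disk, and the copy needs root.

```python
import subprocess


def build_dd_command(source, image, block_size="64K"):
    """Build a dd command line that continues past read errors.

    conv=noerror keeps dd running after a failed read; conv=sync pads the
    failed or short block with zeros so offsets in the image stay aligned
    with the source device.
    """
    return [
        "dd",
        f"if={source}",
        f"of={image}",
        f"bs={block_size}",
        "conv=noerror,sync",
        "status=progress",
    ]


def image_disk(source, image):
    """Run the copy and raise CalledProcessError if dd exits non-zero."""
    return subprocess.run(build_dd_command(source, image), check=True)
```

for example `image_disk("/dev/sda", "/mnt/backup/sda.img")`, where /mnt/backup is an assumed mount point on another drive. with these options a whole block is zero-filled when any sector in it fails, so a smaller `block_size` loses less data at the cost of speed. GNU ddrescue is often the better tool for a disk that is already throwing read errors.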
 
Hello

Here is the log from before the upgrade. Before the upgrade the disk looked OK.

Best Regards,

there are still disk errors here around 16:41

smartctl also says ATA Error Count: 8084 so it doesn't look too good.
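
if you want to keep an eye on that counter over time, the line can be pulled out of saved smartctl output with something like the sketch below. `ata_error_count` is a made-up helper name, and it assumes the output contains the same "ATA Error Count: N" wording quoted above.

```python
import re

ATA_ERROR_COUNT_RE = re.compile(r"ATA Error Count:\s*(\d+)")


def ata_error_count(smartctl_output):
    """Return the logged ATA error count, or None if the line is absent."""
    match = ATA_ERROR_COUNT_RE.search(smartctl_output)
    return int(match.group(1)) if match else None
```

a count that keeps growing between two checks is a stronger warning sign than the absolute number on its own.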

please check the cables (power and data) and make sure to take a backup

i'd also look into getting a replacement disk
 
Hello

I came to check the node yesterday and it had recovered on its own. I have no idea why it is already fixed.

When I check the disk health it is back to normal, with no error count.

BTW, thank you for your support. I will replace the disk later on.

Thank you,