[SOLVED] Zpool unavailable, SMART is OK

mproxk

Active Member
Jun 2, 2019
Hello

I have a second disk, a WD Red 2TB with ZFS, for the large VMs. One of the VMs crashed and could not be started again. I rebooted the whole node, and now the whole pool is unavailable.

Code:
root@silver:~# smartctl -a /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.4.195-1-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red (SMR)
Device Model:     WDC WD20EFAX-68B2RN1
Serial Number:    WD-WX22D81DTRF6
LU WWN Device Id: 5 0014ee 2bf31a201
Firmware Version: 83.00A83
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Aug  9 11:29:15 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (63104) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 250) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3039) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   179   176   021    Pre-fail  Always       -       2033
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2160
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9467
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2148
192 Power-Off_Retract_Count 0x0032   198   198   000    Old_age   Always       -       2136
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       111
194 Temperature_Celsius     0x0022   110   091   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      9430         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay


The disk in question is /dev/sdb:

Code:
root@silver:~# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                     8:0    0 119.2G  0 disk
├─sda1                                                                                                  8:1    0  1007K  0 part
├─sda2                                                                                                  8:2    0   512M  0 part
└─sda3                                                                                                  8:3    0  99.5G  0 part
sdb                                                                                                     8:16   0   1.8T  0 disk
├─sdb1                                                                                                  8:17   0   1.8T  0 part
└─sdb9                                                                                                  8:25   0     8M  0 part
sdc                                                                                                     8:32   1  57.3G  0 disk
└─ceph--def4e8f2--c945--4c45--ae49--7bf5622300eb-osd--block--91734852--c703--4333--98a2--6920a2670f6b 253:0    0  57.3G  0 lvm
zd0                                                                                                   230:0    0     8G  0 disk
├─zd0p1                                                                                               230:1    0     1M  0 part
├─zd0p2                                                                                               230:2    0     1G  0 part
└─zd0p3                                                                                               230:3    0     7G  0 part
zd16                                                                                                  230:16   0   273M  0 disk
├─zd16p1                                                                                              230:17   0    16M  0 part
└─zd16p2                                                                                              230:18   0   256M  0 part

The pool shown there is rpool, which is on the boot/Proxmox SSD:

Code:
root@silver:~# df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                          3.8G     0  3.8G   0% /dev
tmpfs                         780M   17M  763M   3% /run
rpool/ROOT/pve-1               69G  9.0G   60G  14% /
tmpfs                         3.9G   60M  3.8G   2% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                         3.9G     0  3.9G   0% /sys/fs/cgroup
rpool                          60G  128K   60G   1% /rpool
rpool/ROOT                     60G  128K   60G   1% /rpool/ROOT
rpool/data                     60G  128K   60G   1% /rpool/data
rpool/ROOT/subvol-105-disk-0  8.0G  912M  7.2G  12% /rpool/ROOT/subvol-105-disk-0
rpool/ROOT/subvol-100-disk-1   12G  9.5G  2.6G  79% /rpool/ROOT/subvol-100-disk-1
rpool/ROOT/subvol-111-disk-0  4.0G  3.0G  1.1G  73% /rpool/ROOT/subvol-111-disk-0
rpool/ROOT/subvol-120-disk-0  4.0G  2.3G  1.8G  56% /rpool/ROOT/subvol-120-disk-0
rpool/ROOT/subvol-121-disk-1   60G  128K   60G   1% /rpool/ROOT/subvol-121-disk-1
rpool/ROOT/subvol-116-disk-0  5.0G  2.7G  2.4G  54% /rpool/ROOT/subvol-116-disk-0
rpool/ROOT/subvol-121-disk-0  4.5G  1.4G  3.2G  30% /rpool/ROOT/subvol-121-disk-0
rpool/ROOT/subvol-123-disk-0  8.0G  2.6G  5.5G  32% /rpool/ROOT/subvol-123-disk-0
/dev/fuse                      30M   88K   30M   1% /etc/pve
tmpfs                         3.9G   28K  3.9G   1% /var/lib/ceph/osd/ceph-1
tmpfs                         780M     0  780M   0% /run/user/0



Code:
root@silver:~# fdisk -l /dev/sdb
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EFAX-68B
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: FB1BB149-4AE5-2D44-AD2A-E11FA6D2EC92


Device          Start        End    Sectors  Size Type
/dev/sdb1        2048 3907012607 3907010560  1.8T Solaris /usr & Apple ZFS
/dev/sdb9  3907012608 3907028991      16384    8M Solaris reserved 1


Code:
root@silver:~# gdisk /dev/sdb
GPT fdisk (gdisk) version 1.0.3


Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present


Found valid GPT with protective MBR; using GPT

Any zfs / zpool command just hangs.


I ran a long SMART test. I also took the disk out and did a long read test with Acronis on Windows; no issues were found.

I do have a backup, but it is strange to lose it all. There might have been a power cut; the server is on a UPS, but the UPS does not shut down the nodes before its battery runs out.


What can I try?
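Would a read-only import attempt make sense here, once the hung zfs commands are out of the way? Just a sketch of what I have in mind (not run yet):

Code:
# Sketch only, not run yet: look for the pool via by-id paths, then try a read-only import.
zpool import -d /dev/disk/by-id                               # list importable pools without importing
zpool import -d /dev/disk/by-id -o readonly=on -f data2       # read-only import attempt
zpool import -d /dev/disk/by-id -o readonly=on -F -f data2    # same, but rewinding the last transactions if needed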
 
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red (SMR)
Device Model: WDC WD20EFAX-68B2RN1
SMR is not good for ZFS (or NAS use in general), and you might want/need to replace the drive with a CMR one (Red Plus). Please search the forum for SMR problem reports.
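A quick way to tell (my rule of thumb, so double-check against WD's spec sheets): the EFAX suffix marks the SMR Reds, while EFRX / Red Plus models are CMR, and smartctl's drive database already spells it out:

Code:
# Sketch: the model family line from smartctl identifies the recording technology.
# Device path assumed to be /dev/sdb as in the post above.
smartctl -i /dev/sdb | grep -E 'Model Family|Device Model'
# In this case:
#   Model Family:     Western Digital Red (SMR)
#   Device Model:     WDC WD20EFAX-68B2RN1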
 
Can you see your pool with zpool status -v?
No, it just hangs.

The CTs on the main disk run fine, but the UI for this node looks like the screenshot below, with high IO delay. `data2` is the missing pool.

Also, if I go to the `Disks` section on the node, it just shows "loading" and nothing ever comes up.

[Screenshot: node summary in the Proxmox UI, showing high IO delay and the data2 storage marked with a question mark]
 
Interesting. Seems that something really broke. Can I see your journalctl -b?
 
Yes, the host works fine. SSH is OK, and even a VM is running from the root disk. I think the interface shows a ? just because it is timing out while checking the free disk space for `data2`.
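If it helps, the storage status timeout also shows up when asking for it directly on the CLI; something like this (just a sketch) should reproduce it:

Code:
# Sketch: query storage status from the CLI and check pvestatd's own log for data2 errors.
pvesm status
journalctl -b -u pvestatd | tail -n 20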


Here is the journal

Code:
Aug 13 08:12:06 silver kernel: scsi 1:0:0:0: Direct-Access     ATA      WDC WD20EFAX-68B 0A83 PQ: 0 ANSI: 5
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] 4096-byte physical blocks
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 13 08:12:06 silver kernel: sda: sda1 sda2 sda3
Aug 13 08:12:06 silver kernel: sd 0:0:0:0: [sda] Attached SCSI disk
Aug 13 08:12:06 silver kernel: usb 1-3: new full-speed USB device number 3 using xhci_hcd
Aug 13 08:12:06 silver kernel: sdb: sdb1 sdb9
Aug 13 08:12:06 silver kernel: sd 1:0:0:0: [sdb] Attached SCSI disk
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], opened
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], WDC WD20EFAX-68B2RN1, S/N:WD-WX22D81DTRF6, WWN:5-0014ee-2bf31a201, FW:83.00A83, 2.
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], found in smartd database: Western Digital Red (SMR)
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], opened
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], WDC WD20EFAX-68B2RN1, S/N:WD-WX22D81DTRF6, WWN:5-0014ee-2bf31a201, FW:83.00A83, 2.
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], found in smartd database: Western Digital Red (SMR)
Aug 13 08:12:08 silver systemd[1]: Started ZFS file system shares.
Aug 13 08:12:08 silver systemd[1]: Reached target ZFS startup target.
Aug 13 08:12:08 silver kernel: vmbr0: port 1(enp1s0) entered blocking state
Aug 13 08:12:08 silver kernel: vmbr0: port 1(enp1s0) entered disabled state
Aug 13 08:12:08 silver kernel: device enp1s0 entered promiscuous mode
Aug 13 08:12:08 silver zed[1019]: eid=2 class=config_sync pool='rpool'
Aug 13 08:12:08 silver kernel: Generic FE-GE Realtek PHY r8169-0-100:00: attached PHY driver [Generic FE-GE Realtek PHY] (mii_bus:phy_addr=r81
Aug 13 08:12:08 silver zed[1021]: eid=3 class=pool_import pool='rpool'   <- this is the root ZFS SSD disk, works fine
Aug 13 08:12:23 silver pvestatd[1623]: zfs error: cannot open 'data2': no such pool   <- the WD disk that is unreadable
Aug 13 08:16:04 silver kernel: INFO: task zfs:7821 blocked for more than 120 seconds.
Aug 13 08:16:04 silver kernel:       Tainted: P           O      5.4.195-1-pve #1
Aug 13 08:16:04 silver kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 13 08:16:04 silver kernel: zfs             D    0  7821   7795 0x00000004
Aug 13 08:16:04 silver kernel: Call Trace:
Aug 13 08:16:04 silver kernel:  __schedule+0x2e6/0x6f0
Aug 13 08:16:04 silver kernel:  schedule+0x33/0xa0
Aug 13 08:16:04 silver kernel:  schedule_preempt_disabled+0xe/0x10
Aug 13 08:16:04 silver kernel:  __mutex_lock.isra.12+0x297/0x490
Aug 13 08:16:04 silver kernel:  __mutex_lock_slowpath+0x13/0x20
Aug 13 08:16:04 silver kernel:  mutex_lock+0x2c/0x30
Aug 13 08:16:04 silver kernel:  spa_all_configs+0x3b/0x120 [zfs]
Aug 13 08:16:04 silver kernel:  zfs_ioc_pool_configs+0x1b/0x70 [zfs]
Aug 13 08:16:04 silver kernel:  zfsdev_ioctl_common+0x5b2/0x820 [zfs]
Aug 13 08:16:04 silver kernel:  ? __kmalloc_node+0x267/0x330
Aug 13 08:16:04 silver kernel:  ? lru_cache_add_active_or_unevictable+0x39/0xb0
Aug 13 08:16:04 silver kernel:  zfsdev_ioctl+0x54/0xe0 [zfs]
Aug 13 08:16:04 silver kernel:  do_vfs_ioctl+0xa9/0x640
Aug 13 08:16:04 silver kernel:  ? handle_mm_fault+0xc9/0x1f0
Aug 13 08:16:04 silver kernel:  ksys_ioctl+0x67/0x90
Aug 13 08:16:04 silver kernel:  __x64_sys_ioctl+0x1a/0x20
Aug 13 08:16:04 silver kernel:  do_syscall_64+0x57/0x190
Aug 13 08:16:04 silver kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 13 08:16:04 silver kernel: RIP: 0033:0x7fa7076cae57
Aug 13 08:16:04 silver kernel: Code: Bad RIP value.
Aug 13 08:16:04 silver kernel: RSP: 002b:00007ffd91f89fc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Aug 13 08:16:04 silver kernel: RAX: ffffffffffffffda RBX: 000055d8114fc2e0 RCX: 00007fa7076cae57
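The zfs task is blocked on a mutex in spa_all_configs [zfs], so it sits in uninterruptible sleep (D state). For completeness, processes stuck like that can be listed with something along these lines (sketch):

Code:
# Sketch: list processes in uninterruptible sleep (D state) and the kernel
# function they are waiting in, to spot the stuck zfs/zpool ioctls.
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'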
 
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], WDC WD20EFAX-68B2RN1, S/N:WD-WX22D81DTRF6, WWN:5-0014ee-2bf31a201, FW:83.00A83, 2.
Aug 13 08:12:08 silver smartd[976]: Device: /dev/sdb [SAT], found in smartd database: Western Digital Red (SMR)

Shingled platters are expected to produce problems with ZFS sooner or later. You can find several threads describing the problem here and in other places. Just try everything you can to avoid using them...

Good luck!
 
I am restoring the files from a backup now. I was not aware that such fancy drives exist :)

I hope I can still use the disk as temporary storage with a different file system.
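I guess that means clearing the ZFS labels and the old partition table first; something like this is what I have in mind (sketch only, and it wipes the disk, so the device path needs double-checking):

Code:
# Sketch only - this destroys everything on the disk, verify it is really /dev/sdb first.
wipefs --all /dev/sdb                            # clear the ZFS / GPT signatures
sgdisk --zap-all /dev/sdb                        # wipe the old partition table
sgdisk --new=1:0:0 --typecode=1:8300 /dev/sdb    # one Linux partition spanning the disk
mkfs.ext4 /dev/sdb1                              # format it as ext4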
 
