Cannot access disk on zfs

dean_za

Hi

I have a zpool mirror that is not reporting any errors:

root@proxmox03:/dev/zvol/zfspool0# zpool status -v
  pool: zfspool0
 state: ONLINE
  scan: scrub in progress since Thu Oct 20 08:36:59 2016
        81.0G scanned out of 133G at 25.4M/s, 0h34m to go
        0 repaired, 60.86% done
config:

        NAME        STATE     READ WRITE CKSUM
        zfspool0    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors

When I list the ZFS volumes, I can see all of them:

zfs list
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zfspool0                         133G   405G    96K  /zfspool0
zfspool0/vm-102-disk-1          7.58G   405G  7.58G  -
zfspool0/vm-103-disk-1          24.1G   405G  24.1G  -
zfspool0/vm-105-disk-1          10.2G   405G  10.2G  -
zfspool0/vm-105-disk-2          30.5G   405G  30.5G  -
zfspool0/vm-105-disk-3          10.1G   405G  10.1G  -
zfspool0/vm-107-disk-1          26.9G   405G  25.7G  -
zfspool0/vm-107-state-centrify  2.97G   405G  2.97G  -
zfspool0/vm-110-disk-1          21.0G   405G  21.0G  -


When I try to start vm-103, it fails saying it can't access the device:

kvm: -drive file=/dev/zvol/zfspool0/vm-103-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on: Could not open '/dev/zvol/zfspool0/vm-103-disk-1': No such file or directory
TASK ERROR: start failed: command '/usr/bin/kvm -id 103 -chardev 'socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/103.pid -daemonize -smbios 'type=1,uuid=3bf569fc-7457-43a9-b137-d91cb5ea12bd' -name PDC -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga cirrus -vnc unix:/var/run/qemu-server/103.vnc,x509,password -cpu qemu64,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k en-us -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:5f915d41d0f2' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/zfspool0/vm-103-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap103i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=46:14:FE:DA:45:6F,netdev=net0,bus=pci.0,addr=0x12,id=net0'' failed: exit code 1


How can I check that the vm-103 block device is not damaged?

Thanx
Dean
 
Do /dev/zvol and /dev/zvol/zfspool0 exist? What are their contents?
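Something like this (pool name taken from your zfs list output above) should show quickly whether the links are missing or just broken:

# check the zvol symlink tree that KVM expects
ls -ld /dev/zvol /dev/zvol/zfspool0
ls -l /dev/zvol/zfspool0/
# the raw zvol block devices themselves show up as /dev/zdNN
ls -l /dev/zd*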
 
Me too. After upgrading from Proxmox 5.0 to 5.1, I lost all my VM disks and /dev/zvol is missing.

I haven't rebooted the server yet, and I know I won't recover what's missing, but I need to create new servers and I can't.
 
Check your kernel version. I had one node that failed to grab the kernel update and therefore couldn't mount ZFS volumes with the new libraries. The correct version should be 4.13.4-1-pve #1 SMP PVE 4.13.4-25 or higher.
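On a standard PVE install, something along these lines shows both the running kernel and the installed ZFS packages:

# running kernel, including the PVE build string
uname -a
# kernel and ZFS package versions as Proxmox sees them
pveversion -v | grep -Ei 'kernel|zfs'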
 
@hardstone - Don't reboot. I was seeing the same thing you are and rebooted. Big mistake. I had to restore all my virtual machines to a different server. (I might have been able to do a ZFS send to the other server, but backups had just run.)

There seems to be a bug in 4.13 which is causing isci_task_abort_task panics on a couple of my hosts. Both are Supermicro x9srw with E5-2660 v2.

The behavior is much like this report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1726519

Reverting to 4.10.17-4-pve boots to Proxmox at least, but /dev/zvol is missing, so I can't do much with the local ZFS filesystems.
 
Hi Joshin,

If you are back on 4.10 and did not upgrade the pool, do this:

apt-get install libzpool2linux=0.6.5.11-pve17~bpo90
apt-get install libnvpair1linux=0.6.5.11-pve17~bpo90
apt-get install libuutil1linux=0.6.5.11-pve17~bpo90
apt-get install libzfs2linux-dev=0.6.5.11-pve17~bpo90
apt-get install zfs-initramfs=0.6.5.11-pve17~bpo90
apt-get install zfsutils-linux=0.6.5.11-pve17~bpo90
apt-get install spl=0.6.5.11-pve10~bpo90

Then reboot; ZFS will work and you can boot up your VMs.
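Before rebooting it is worth double-checking that the downgrades actually took, and optionally holding the packages so a later apt-get upgrade doesn't pull 0.7 back in (package names as in the list above):

# confirm the installed versions are the 0.6.5.11 builds
dpkg -l | grep -E 'zfs|spl|nvpair|uutil'
# optionally keep apt from upgrading them again
apt-mark hold zfsutils-linux zfs-initramfs spl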

Best regards.
 
apt-get install libzfs2linux-dev=0.6.5.11-pve17~bpo90

Thanks J. Carlos!

I suspect you meant:
apt-get install libzfs2linux=0.6.5.11-pve17~bpo90 instead of libzfs2linux-dev, as -dev doesn't seem to exist on a standard install.
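If in doubt, apt itself will tell you which of the two names exists and what versions are available, e.g.:

apt-cache policy libzfs2linux
apt-cache policy libzfs2linux-dev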

My two servers that are misbehaving on 4.13 are hitting an interesting isci_task_abort_task kernel panic, so they're not perfectly aligned with the others suffering this problem, but similar enough.
 
So I had this issue, and it was because, for some reason, the symbolic link to the actual zvol device for the VM went away after a system crash caused by good old ZFS eating up all the RAM during heavy I/O... a backup was running at the time.

I simply re-added it.
In the /dev/zvol/rpool/data folder, I did:
ln -s ../../../zd64 vm-100-disk-1

And the VM reappeared!
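If you're not sure which /dev/zdNN belongs to which disk, comparing sizes is a rough sanity check before linking, and re-running the udev trigger will usually recreate the missing links for you (both assume the standard zvol udev rules are installed):

# raw device size vs. the zvol's configured volsize
blockdev --getsize64 /dev/zd64
zfs get -H -o value volsize rpool/data/vm-100-disk-1
# or just ask udev to rebuild the /dev/zvol symlinks
udevadm trigger --subsystem-match=block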

Now I have to figure out the other issue.
This machine has 64 GB of RAM and I was running just 4 VMs. I set the ZFS ARC max down to 8 GB, turned off compression and such for the rpool/swap volume, etc., and it STILL bombed. I am going to upgrade to 5.whatever; this is a 4.3 box.
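For what it's worth, the usual way to make the ARC cap stick across reboots is a modprobe option rather than a runtime setting; this is just a sketch with the 8 GB value I mentioned (run update-initramfs -u afterwards so it applies at boot):

# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=8589934592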