TASK ERROR: can't activate LV '/dev/pve3/vm-109-disk-0': Cannot process volume group pve3

ShaunG
Member
Jul 12, 2022
Hi,

I tried to restart a VM yesterday, which failed; neither the UI nor "qm stop" would stop it, so I had to kill the process.

Now when I try to start the VM all I get is this error:

TASK ERROR: can't activate LV '/dev/pve3/vm-109-disk-0': Cannot process volume group pve3

I pulled the disk, as it wasn't showing up with "lsblk", and reseated it (Dell PowerEdge server), but it just refuses to show up anywhere; even iDRAC isn't showing it. The caddy is flashing green, which as I understand it may mean "rebuilding", but I would have thought I could confirm this somewhere?

Just want to make sure there isn't something I'm missing with Proxmox; any other advice would be greatly appreciated.
 
Hello

That sounds strange. Would you mind sharing your /etc/pve/storage.cfg and the VM configuration at /etc/pve/qemu-server/<VMID>.conf?

Also, you could try to run journalctl -f to see if something insightful is logged when you attach the disk.
 

Hi, sure, thanks for replying:

Bash:
dir: local
    path /var/lib/vz
    content iso,vztmpl,backup
    prune-backups keep-last=6
    shared 0

lvmthin: local-lvm
    thinpool data
    vgname pve
    content rootdir,images

lvm: zm-lvm
    vgname zmdata
    content images,rootdir
    shared 0

lvm: local-lvm2
    vgname pve2
    content images,rootdir
    shared 0

lvm: local-lvm3
    vgname pve3
    content images,rootdir
    shared 0
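As a side note, the mapping from storage name to volume group in a storage.cfg like this can be summarized with a one-liner, which makes it easy to confirm that local-lvm3 is the storage backed by the failing VG pve3. A minimal sketch over an inline copy of two of the entries above (on the host you would read /etc/pve/storage.cfg instead):

```shell
# Print "storage -> volume group" for every lvm/lvmthin entry.
# Self-contained sample; on a real node, point awk at /etc/pve/storage.cfg.
cat > storage.cfg <<'EOF'
lvmthin: local-lvm
    thinpool data
    vgname pve
lvm: local-lvm3
    vgname pve3
EOF
awk '/^lvm(thin)?:/ {store = $2} /^[[:space:]]*vgname/ {print store, "->", $2}' storage.cfg
```

This prints "local-lvm -> pve" and "local-lvm3 -> pve3", matching the VG named in the task error.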


For some reason "journalctl" shows the first log in July 2023 and the last in October 2023??
 
Hello

What storage is the VM using? Can you post the VM configuration at /etc/pve/qemu-server/<VMID>.conf? Also, can you show me the output of lsblk?

For some reason "journalctl" shows first log July 2023, and last log October 2023??
Hmm… did you just issue journalctl without parameters? In that case, it shows its entire log. You can navigate it with the arrow keys.

However, you can use arguments to narrow down the results. For example:

To get the log since the last boot: journalctl -b.
To get the log since 2024-01-15: journalctl --since '2024-01-15'.
To get any new logs from now on: journalctl -f
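When the journal is too noisy to read interactively, it can also help to dump it to a file and filter it. The snippet below is a sketch using a synthetic two-line sample; on the host, the file would come from journalctl -b > journal.txt:

```shell
# Filter a saved journal for kernel disk messages and LVM failures.
# The here-doc stands in for a real capture made with:
#   journalctl -b > journal.txt
cat > journal.txt <<'EOF'
Jan 16 11:48:28 proxmox kernel: sd 0:0:3:0: [sdb] Attached SCSI disk
Jan 16 21:22:46 proxmox pvestatd[1521]: command '/sbin/vgscan --ignorelockingfailure --mknodes' failed: exit code 5
Jan 16 11:48:28 proxmox systemd[1]: Started some unrelated unit.
EOF
# Keep only SATA/SCSI kernel lines and vgscan failures:
grep -E 'kernel: (sd|ata)|vgscan' journal.txt
```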
 

VM.conf:

Bash:
agent: 1
boot: order=scsi0;ide2;net0
cores: 4
ide2: local:iso/ubuntu-20.04.3-live-server-amd64.iso,media=cdrom
memory: 12288
name: a
net0: virtio=4A:45:DD:81:C0:F1,bridge=vmbr1,tag=10
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm3:vm-109-disk-0,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=ac5d3bdc-2696-47e5-89a7-6b025e6b5bc3
sockets: 1
unused0: local-lvm:vm-109-disk-0
vga: qxl
vmgenid: 83ce984f-d350-4f1c-a148-ed7d9c832985

Some relevant logs from the journal on drive mounts; you can see 3 disks, but there are supposed to be 4.

Bash:
Jan 16 11:48:28 proxmox kernel: scsi 0:0:0:0: Direct-Access     ATA      INTEL SSDSC2BA20 DL2D PQ: 0 ANSI: 6
Jan 16 11:48:28 proxmox kernel: usb 2-1: new high-speed USB device number 2 using ehci-pci
Jan 16 11:48:28 proxmox kernel: ata1: SATA link down (SStatus 0 SControl 300)
Jan 16 11:48:28 proxmox kernel: scsi 0:0:3:0: Direct-Access     ATA      INTEL SSDSC2BA20 DL08 PQ: 0 ANSI: 6
Jan 16 11:48:28 proxmox kernel: ata5: SATA link down (SStatus 0 SControl 300)
Jan 16 11:48:28 proxmox kernel: usb 1-1: New USB device found, idVendor=8087, idProduct=800a, bcdDevice= 0.05
Jan 16 11:48:28 proxmox kernel: usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Jan 16 11:48:28 proxmox kernel: hub 1-1:1.0: USB hub found
Jan 16 11:48:28 proxmox kernel: hub 1-1:1.0: 6 ports detected
Jan 16 11:48:28 proxmox kernel: usb 2-1: New USB device found, idVendor=8087, idProduct=8002, bcdDevice= 0.05
Jan 16 11:48:28 proxmox kernel: usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Jan 16 11:48:28 proxmox kernel: hub 2-1:1.0: USB hub found
Jan 16 11:48:28 proxmox kernel: hub 2-1:1.0: 8 ports detected
Jan 16 11:48:28 proxmox kernel: scsi 0:0:6:0: Direct-Access     ATA      ST91000640NS     AA02 PQ: 0 ANSI: 6
Jan 16 11:48:28 proxmox kernel:  sdb: sdb1
Jan 16 11:48:28 proxmox kernel: sd 0:0:3:0: [sdb] Attached SCSI disk
Jan 16 11:48:28 proxmox kernel:  sda: sda1 sda2 sda3
Jan 16 11:48:28 proxmox kernel: sd 0:0:0:0: [sda] Attached SCSI disk
Jan 16 11:48:28 proxmox kernel:  sdc: sdc1
Jan 16 11:48:28 proxmox kernel: sd 0:0:6:0: [sdc] Attached SCSI disk

Thousands of these warnings:

Jan 16 21:22:46 proxmox pvestatd[1521]: command '/sbin/vgscan --ignorelockingfailure --mknodes' failed: exit code 5
 
I've just tried to restore to another disk, and I can't even shut down the "new" VM!

Code:
Jan 18 10:08:09 proxmox pvedaemon[1531]: <root@pam> end task UPID:proxmox:000032CD:0002871E:65A8F84C:qmshutdown:109:root@pam: VM quit/powerdown failed - got timeout

Have no idea what's going on here!
 
@Philipp Hufnagl I've just tried to restore a backup now to another drive, and that has also now disappeared from the server completely, with the errors attached... It was running absolutely fine; what is Proxmox doing to my drives??

Bash:
Jan 18 10:52:03 proxmox pvedaemon[1533]: <root@pam> starting task UPID:proxmox:0000686B:0006A388:65A902D3:vzcreate:109:root@pam:
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#312 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=99s
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#312 CDB: Write(10) 2a 00 0c 41 31 38 00 00 08 00
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#313 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=99s
Jan 18 10:53:43 proxmox kernel: I/O error, dev sdb, sector 205599032 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 2
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#313 CDB: Write(10) 2a 00 01 44 60 68 00 00 30 00
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: SCSI device is removed
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#294 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
Jan 18 10:53:43 proxmox kernel: sd 0:0:3:0: [sdb] tag#294 CDB: Write(10) 2a 00 00 23 30 00 00 00 08 00
Jan 18 10:53:43 proxmox kernel: EXT4-fs warning (device dm-5): ext4_end_bio:343: I/O error 10 writing to inode 791576 starting block 286720)
Jan 18 10:53:43 proxmox kernel: Buffer I/O error on device dm-5, logical block 286720
Jan 18 10:53:43 proxmox kernel: EXT4-fs (dm-3): I/O error while writing superblock
Jan 18 10:53:43 proxmox kernel: EXT4-fs (dm-3): I/O error while writing superblock
Jan 18 10:53:43 proxmox kernel: EXT4-fs (dm-3): I/O error while writing superblock


Tried it again, so yes: when I create a new CT on this drive, it kills the whole drive and I can no longer see it without rebooting. I can't easily run fsck on the drive either, as when I boot again it is in use by other containers. Nightmare...
 
Hello

Code:
Jan 18 10:08:09 proxmox pvedaemon[1531]: <root@pam> end task UPID:proxmox:000032CD:0002871E:65A8F84C:qmshutdown:109:root@pam: VM quit/powerdown failed - got timeout

You can take a look at the task log with pvenode task log UPID:proxmox:000032CD:0002871E:65A8F84C:qmshutdown:109:root@pam:.
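As an aside, the fifth colon-separated field of a UPID is the task's start time as hexadecimal epoch seconds, so you can decode when a failed task began. A sketch (a convenience trick, not an official interface):

```shell
# Decode the start time embedded in a Proxmox task UPID.
# Field 5 (here 65A8F84C) is the Unix start time in hex.
upid='UPID:proxmox:000032CD:0002871E:65A8F84C:qmshutdown:109:root@pam:'
start_hex=$(printf '%s' "$upid" | cut -d: -f5)
date -u -d "@$((0x$start_hex))" +'%Y-%m-%d %H:%M:%S'   # 2024-01-18 10:07:08
```

That lines up with the "end task … got timeout" entry logged about a minute later.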

Honestly, that sounds quite concerning. In my experience, issues like this are often caused by faulty hardware. Would you mind providing the output of lsblk and the file generated with journalctl -b > $(hostname)-journal.txt?
 

As I said above, when I run this Create CT from Proxmox, it kills the drive. It doesn't show up anywhere: lsblk, fdisk, even iDRAC no longer shows it.

I've just taken it out and plugged it into a desktop, and it shows up fine. Is there anything I can run from here to "repair" it?

I posted tons of journal logs above, pretty much anything related to the drives. Nothing of note.
 
@Philipp Hufnagl So I see that, when connected to the desktop, the "new" logical volume is there (109). I ran fsck on a few of the LVs and they are fine, but 109 shows the following:

Code:
fsck from util-linux 2.37.2
e2fsck 1.46.5 (30-Dec-2021)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/mapper/pve2-vm--109--disk--0

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>


The question is: Proxmox just created this. I can remove it, but how do I then run fsck on the free space of this drive? I tried running it on /dev/sda, but it says the device is busy.
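For what it's worth, the backup-superblock recovery that e2fsck suggests can be rehearsed safely on a throwaway image file before touching a real LV. This is only a sketch, assuming e2fsprogs is installed; every file name here is scratch, and nothing touches an actual disk:

```shell
# Rehearse e2fsck backup-superblock recovery on a scratch image (not a disk).
truncate -s 256M scratch.img               # sparse 256 MiB scratch file
mkfs.ext4 -q -b 4096 scratch.img           # 4 KiB blocks: first backup superblock at block 32768
dd if=/dev/zero of=scratch.img bs=1024 seek=1 count=1 conv=notrunc status=none
                                           # wipe the primary superblock (bytes 1024..2047)
e2fsck -y -b 32768 -B 4096 scratch.img || true   # repair from the backup superblock (exit 1 = fixed)
e2fsck -fn scratch.img                     # forced read-only re-check should now report clean
```

The -b/-B pair must match the filesystem's actual block size, which is why the image above is created with -b 4096 explicitly.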
 
Hello


I would not jump to the conclusion that, just because you verified the disk itself is not faulty (which is great news, btw), it is not a hardware issue. I would also recommend checking the other hardware involved, like the connector and the cables.
 
