Proxmox crashes constantly, pve/data corrupted

Vin

New Member
Mar 6, 2023
8
0
1
Hello everybody,

Proxmox keeps crashing, even though I have set everything up from scratch several times.

After a crash the pve/data thin pool is often corrupted. At first it can be repaired with lvconvert --repair pve/data, but eventually the corruption becomes persistent and I have to set everything up all over again.

Can anybody make sense of the logs, or tell me how to troubleshoot the issues below and the crashes?

Thank you in advance

dmesg
https://pastebin.com/HnP5SF3v

journal
https://pastebin.com/ghGvbLjU

root@Snake:~# vgchange -a y pve
Check of pool pve/data failed (status:1). Manual repair required!
2 logical volume(s) in volume group "pve" now active
root@Snake:~# lvchange -a y pve/data
Check of pool pve/data failed (status:1). Manual repair required!
root@Snake:~# lvconvert --repair pve/data
Child 4234 exited abnormally
Repair of thin metadata volume of thin pool pve/data failed (status:-1). Manual repair required!
 

Richard

Proxmox Staff Member
Staff member
Mar 6, 2015
1,010
58
73
Austria
Most probably a hardware error (disk? controller?). As far as I understand, the LVM volume group pve becomes corrupt after a while. It may help to look at LVM's metadata history, which is kept in the archive and backup files:
Code:
cat /etc/lvm/archive/*
cat /etc/lvm/backup/*
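
Dumping every archive file in full can get long. As a rough sketch, the helper below prints only the top-level description and creation_time lines of each archive file, which is usually enough to see which LVM command ran when. The function name summarize_lvm_history is made up for this example, and it assumes those two fields sit unindented at the top of each file, as they normally do in LVM metadata backups.

```shell
# Print file name, description and creation time for each LVM metadata archive.
# summarize_lvm_history is a hypothetical helper name, not an LVM tool.
# $1: archive directory (defaults to /etc/lvm/archive)
summarize_lvm_history() {
    dir="${1:-/etc/lvm/archive}"
    for f in "$dir"/*.vg; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        printf '%s\n' "$f"
        # Only unindented lines: per-LV creation_time entries are indented.
        grep -E '^(description|creation_time) ' "$f"
    done
}
```

Run it as root with summarize_lvm_history (or pass another directory). LVM's own view of the same archives is available with vgcfgrestore --list pve.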
 

Vin

I have reinstalled the entire setup multiple times by now.

I also just replaced the NVMe with a brand-new one. I installed nothing but Proxmox and it crashed again, with a corrupted file system.

So unfortunately I don't have the log files anymore.

Is there a way to see the I/O errors in a running system?

I also apparently have some corrupted sections on my NAS HDDs. Can those lead to the crashes?
I tried to fix them via gparted, but I just ran into errors.
 

Attachments: 1.png, 2.png, 3.png, 5.png, 6.png

Richard

Is there a way to see the I/O errors in a running system?

Try booting from external live media and then inspect the files left over from the previous Proxmox installation.
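
As a rough sketch of how that could look: the filter below keeps only kernel-log lines that commonly indicate storage trouble. The function name filter_io_errors and the pattern list are assumptions for illustration, not an exhaustive catalogue, and /mnt is a placeholder for wherever the old root file system gets mounted. Reading the old journal only works if it was stored persistently under /var/log/journal.

```shell
# Keep only log lines that usually point at disk or controller problems.
# The patterns are an assumption; extend them for your hardware.
filter_io_errors() {
    grep -Ei 'i/o error|medium error|blk_update_request|nvme.*(timeout|reset)|ext4-fs error'
}

# On the running system (as root):
#   dmesg --level=err,warn | filter_io_errors
#   journalctl -k -b -1 -p err | filter_io_errors    # kernel log of the previous boot
# From live media, with the old root mounted at /mnt (placeholder path):
#   journalctl -D /mnt/var/log/journal -k | filter_io_errors
```

If the smartmontools package is installed, smartctl -a on the NVMe device gives the drive's own error counters as a second opinion.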

Also I apparently do have some corrupted sections in my NAS HDDs, can those lead to the crashes?
I tried to fix them via gparted, but I just ran into errors
The NAS is used only as data storage (isn't it?), so it cannot cause the Proxmox crashes you reported. To rule out any influence from the NAS, I suggest running Proxmox without it for the moment and configuring it later, once Proxmox is stable.
 

Vin

I reinstalled everything from scratch, and it still crashes.

dmesg Proxmox
https://pastebin.com/AkPiDT6j

Regarding the NAS: I pass the disks through to a Debian VM with OMV installed on it.
For this particular Proxmox dmesg, I passed only one SSD through to the VM.

I suspect an I/O problem caused by I/O load, as described years ago here:
https://bugzilla.kernel.org/show_bug.cgi?id=199727#c0

I changed all my VM disks to VirtIO SCSI single, Cache = Write back, Discard = 1, IO Thread = 1, Async IO = threads, SSD emulation.
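
For reference, the same disk settings can probably be applied from the command line as well. This is only a sketch: the helper name disk_opts, the VM ID 100 and the volume local-lvm:vm-100-disk-0 are placeholders, not values from my system.

```shell
# Build the disk option string for the settings listed above.
# disk_opts is a hypothetical helper; $1 is the volume ID of the disk.
disk_opts() {
    printf '%s,cache=writeback,discard=on,iothread=1,aio=threads,ssd=1\n' "$1"
}

# Example (VM ID 100 and the volume name are placeholders):
#   qm set 100 --scsihw virtio-scsi-single --scsi0 "$(disk_opts local-lvm:vm-100-disk-0)"
#   qm config 100 | grep -E '^(scsihw|scsi0)'    # check what is actually configured
```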
 
