dev loop0

SamTzu

Hi guys.

My big is itching :)
My big is a server with about 10 KVM clients that each host about 10 LXC containers (nested virtualization is the way to go these days). Lately I have seen several "dev loop0" problems, usually with I/O errors indicating a malfunctioning disk. These corrupt the disk image and lead to the infernal "Structure needs cleaning" error when trying to rsync or back up. So far I have just stopped the client and run fsck -v -y on the image file, and all is good again, but it's happening with different disks... hence the itch. I'm thinking it's a memory issue somewhere.

Some background on the host:
Uptime: 445 days :)
Proxmox 5.1-52
RAM: 128GB
Storage: Lots of different types of "storage" including NAS. Mostly SSD but some larger SATA thrown in.
At first I thought the problem was just with the SSD drives.

Some background on the KVM clients:
Proxmox 5.4-6 cluster
RAM: 16-32GB RAM each.
Storage: Share the same NAS drives + host drives.

Some background on the nested LXC clients:
Debian based containers, mostly v8-9. Some Ubuntu in the mix.
RAM:1M-8M each.
Storage: Share the same NAS drives + host drives.

Any thoughts are appreciated.

Sam
 
I would take a close look at your log files; your case does sound like either a disk or a memory issue. Are you using storage redundancy (ZFS, RAID, etc.)?

If you are using ECC RAM you can also check your kernel log on the host, it might help you regarding memory errors.
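
As a rough sketch of that log check, the Python below sorts kernel log lines into "disk" and "memory" buckets. The patterns are my own guesses at typical signatures (block-layer I/O errors, EDAC and machine-check messages), not an exhaustive list, and the sample lines are made up so the script runs without touching a real host. The helper names are hypothetical.

```python
import re
import subprocess

# Assumed error signatures; adjust them to what your kernel actually logs.
PATTERNS = {
    "disk": re.compile(r"I/O error|lost async page write", re.IGNORECASE),
    "memory": re.compile(r"\bEDAC\b|machine check|\bmce\b", re.IGNORECASE),
}


def classify_kernel_log(text):
    """Group kernel log lines by the kind of fault they hint at."""
    hits = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


def read_dmesg():
    """Return the kernel ring buffer as text (may need root)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


# Made-up sample lines, only here to demonstrate the classification.
SAMPLE = """\
[100.0] Buffer I/O error on dev dm-0, logical block 5, lost async page write
[200.0] EDAC MC0: 1 CE memory read error on CPU_SrcID#0
[300.0] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang
"""

for kind, lines in classify_kernel_log(SAMPLE).items():
    print(f"{kind}: {len(lines)} matching line(s)")
```

On the real host you would pass `read_dmesg()` instead of `SAMPLE`. Hits in the memory bucket on an ECC system point towards RAM, while hits only in the disk bucket point towards the drive, controller or storage path.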
 
ZFS/Ext4 disks on the Host (& KVM hosts) and LVM/Ext4 on the NAS.
I did find several error messages for dm-17 (one of the disks) that said "lost async page write."
The rest of the dmesg complaints are about a network interface detecting a hardware unit hang (probably from the VPN server).
 
It would help to know if your damaged disk images are located on the NAS or locally.

The errors you are seeing definitely indicate some kind of hardware fault though (see also https://serverfault.com/a/866150).

ZFS *should* prevent such errors from affecting anything, provided you are not just striping your disks. Does 'zpool status' report anything?
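
If you want to scan the 'zpool status' output across several hosts, here is a minimal Python sketch that lists every pool member which is not ONLINE or has a non-zero READ/WRITE/CKSUM counter. It assumes the usual five-column device table; the function names and the sample output are made up for illustration.

```python
import subprocess

# Device states that can appear in the second column of the config table.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) rows that look unhealthy."""
    suspects = []
    for line in status_text.splitlines():
        fields = line.split()
        # Only rows shaped like "NAME STATE READ WRITE CKSUM" are of interest.
        if len(fields) < 5 or fields[1] not in KNOWN_STATES:
            continue
        name, state, read, write, cksum = fields[:5]
        # Counters are compared as text because zpool may abbreviate them.
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            suspects.append((name, state, read, write, cksum))
    return suspects


def read_zpool_status():
    """Run 'zpool status' and return its output as text."""
    result = subprocess.run(["zpool", "status"], capture_output=True, text=True, check=True)
    return result.stdout


# Made-up sample output, only here to demonstrate the parsing.
SAMPLE = """\
  pool: rpool
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    rpool       ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sda2    ONLINE       0     0     3
        sdb2    ONLINE       0     0     0

errors: No known data errors
"""

for row in devices_with_errors(SAMPLE):
    print(row)
```

On a real host, call `devices_with_errors(read_zpool_status())`. Checksum errors on a single mirror member, as in the sample, would suggest that one disk or its cabling; errors spread over all members would rather suggest RAM or the controller.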
 
In that case I assume it's either a hardware fault or an issue with your NAS.

I would suggest upgrading and rebooting your host system, then running a memtest to check for RAM errors. It might also help to run a 'smartctl' diagnostic on your disks.
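
For the 'smartctl' part, a small Python sketch along these lines could collect the overall health verdict from `smartctl -H` for each disk. It assumes the two result lines smartctl commonly prints (the ATA "overall-health" line and the SCSI "Health Status" line); the function names and the device path are placeholders.

```python
import re
import subprocess


def parse_smart_health(output):
    """Return True (healthy), False (failing) or None (no verdict found)."""
    # ATA/SATA drives report a PASSED/FAILED self-assessment.
    match = re.search(r"SMART overall-health self-assessment test result:\s*(\S+)", output)
    if match:
        return match.group(1) == "PASSED"
    # SCSI/SAS drives report a health status instead.
    match = re.search(r"SMART Health Status:\s*(\S+)", output)
    if match:
        return match.group(1) == "OK"
    return None


def read_smart_health(device):
    """Run 'smartctl -H' on a device (needs root) and return its output."""
    # smartctl encodes findings in its exit status, so no check=True here.
    result = subprocess.run(["smartctl", "-H", device], capture_output=True, text=True)
    return result.stdout


# Made-up sample line, only here to demonstrate the parsing.
SAMPLE = "SMART overall-health self-assessment test result: PASSED\n"
print(parse_smart_health(SAMPLE))
```

On a real system you would call `parse_smart_health(read_smart_health("/dev/sda"))` per disk, with the path adjusted to your devices. Keep in mind that a passing verdict is only the drive's own opinion; the attribute table from `smartctl -a` and a long self-test started with `smartctl -t long` tell you more.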
 
