Mdadm scanning is very slow after Proxmox upgrade

Kubernat

Jan 10, 2024
Hi everyone. This has happened to us on some other servers as well. I have several machines with a RAID array managed through mdadm, and the problem appeared after installing Proxmox.

These machines start to fail, and what I notice is that the mdadm check starts but never finishes. It begins at normal speed, then gradually drops to 5K/sec.

My Proxmox version is as follows:

pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.8.4-3-pve)

I've checked the hard drives, and they're all in good condition. I have one of them as a spare in case of a failure, but so far it still indicates that the disk hasn't been detected.

root@Proxmox:~# cat /proc/mdstat

Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
md0 : active raid5 sde1[4] sdd1[5] sdc1[1] sdb1[0] sdf[6](S)
29297335296 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
[==>..................] check = 11.2% (1099637740/9765778432) finish=28322975.1min speed=5K/sec
bitmap: 3/73 pages [12KB], 65536KB chunk


Do you have any idea why this error occurs and if there's a way to resolve it so that the mdadm scan can complete again?

Thank you so much.
 
"Do you have any idea why this error occurs"
  • power supply
  • mechanical problems with the contact of the cables - remove/reconnect them on both ends
  • firmware of the motherboard - "why only now" is a weak argument
  • all drives: run SMART long selftest and check the results tomorrow
  • RAM is one of the usual suspects - memtest86 may verify this
  • cosmic radiation ;-)
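
A minimal sketch of the SMART item in the list above, assuming smartmontools is installed and the members use sdX-style names as in the posted /proc/mdstat. The function names `md_member_disks` and `start_long_selftests` are made up for illustration, and the field positions assume a line of the form "md0 : active raid5 ...".

```shell
#!/bin/sh
# Sketch: list the member disks of an md array from mdstat-style text on
# stdin, then start a SMART long self-test on each of them.

# Print one base disk name per line for array $1, e.g. "sde1[4]" -> "sde".
# Assumption: sdX-style names; NVMe names such as nvme0n1p1 need other rules.
md_member_disks() {
    awk -v md="$1" '$1 == md && $2 == ":" {
        for (i = 5; i <= NF; i++) {
            dev = $i
            sub(/\[.*/, "", dev)     # drop the "[4]" and "(S)" suffixes
            sub(/[0-9]+$/, "", dev)  # drop the partition number
            print dev
        }
    }'
}

# Start a long self-test on every member of array $1.
# Needs root; only defined here, not called.
start_long_selftests() {
    md_member_disks "$1" < /proc/mdstat | while read -r disk; do
        smartctl -t long "/dev/$disk"
    done
}
```

Calling `start_long_selftests md0` as root would queue the tests; the results can be read the next day with `smartctl -a /dev/sdX` for each disk.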
"finish=28322975.1min"

That's one of a dozen (or more) reasons why I prefer ZFS: building a new pool is done in a second and scrubbing/repairing only reads already occupied data, not the whole underlying device.
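
For scale, the quoted finish value can be converted to years with a few lines of shell. This is only a sketch; `md_finish_years` is a hypothetical helper name, and it simply divides the minutes by 60, 24 and 365.

```shell
#!/bin/sh
# Sketch: pull "finish=...min" out of mdstat-style text on stdin
# and print the estimate in years (365-day years).
md_finish_years() {
    sed -n 's/.*finish=\([0-9.]*\)min.*/\1/p' |
        awk '{ printf "%.1f\n", $1 / 60 / 24 / 365 }'
}

# Example with the value from the posted output:
echo 'finish=28322975.1min speed=5K/sec' | md_finish_years   # prints 53.9
```

On a live system the same function can be fed with `md_finish_years < /proc/mdstat` while a check is running.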
 
Thanks for the help.

From what I understand, the best way is to switch from mdadm to zpool, but I don't know how to do it, or whether it's even possible, since the mdadm scan never completes.

Do you know of any post or website that explains whether I can do this and what steps to follow?

Thank you very much.
 
"From what I understand, the best way is to switch from mdadm to zpool"
You cannot simply "switch".

To use ZFS you would need new disks, or you would have to erase the old ones, in which case the data loss will be 100%.

Some hints pointing to a safe strategy, but for sure not a complete list:
  • look up / search for ZFS beginner friendly introductions
  • play around in some VM (which would require a working PVE setup...)
  • alternatively play around with some old gear
  • read https://pve.proxmox.com/pve-docs/pve-admin-guide.html#chapter_zfs and https://pve.proxmox.com/wiki/ZFS_on_Linux
  • learn about ZFS pitfalls - there are different ones depending on your self-defined (but unknown) requirements!
  • buy some new disks, possibly plus two good SSD/NVMe for a "Special Device" - we do not know if you have had / plan to use rotating rust or SSD only
  • set up ZFS in parallel to the old RAID (if you get it working again, to transfer the old content)
  • test that new storage
  • transfer old VMs to new storage
You did not tell us whether you have backups or what your strategy is in this regard; the alternative approach is to delete everything (VM storage, not OS), create that ZFS pool and restore from backup.
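
Either way, creating the pool on the new disks might look roughly like the dry run below. It only prints the commands instead of running them. The pool name "tank", the storage ID "tank-vm", the raidz1 layout, the ashift value and the disk paths are all placeholders and assumptions, not a recommendation for this particular setup.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands for a new pool on NEW disks.
# Pool name, storage ID, layout and disk paths are hypothetical placeholders.
plan_new_pool() {
    pool="$1"; shift
    echo "zpool create -o ashift=12 $pool raidz1 $*"
    echo "zpool status $pool"
    echo "pvesm add zfspool ${pool}-vm --pool $pool --content images,rootdir"
}

plan_new_pool tank /dev/disk/by-id/NEW-DISK-1 /dev/disk/by-id/NEW-DISK-2 /dev/disk/by-id/NEW-DISK-3
```

Review the printed commands against the documentation linked above before running anything; `zpool create` overwrites whatever is on the listed disks.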

Possibly I am drifting away from your actual needs. That happens when too little information is shared...