Cannot stop hard drives spinning down

I have 5 Toshiba 4TB SATA drives attached to my Proxmox server, and for some reason they keep spinning down when I want them all to be running all the time.

Using hdparm I can apply "-B" to each one, "/dev/sdd" through "/dev/sdh", and they respond with "APM_level = off". They stay that way until I reboot the machine. When I check immediately after a reboot they are all set to "APM_level = 254", and then a short while later "APM_level = 20".
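
For reference, the per-drive commands described above can be wrapped in a loop. This is only a minimal sketch, assuming the five drives are still /dev/sdd through /dev/sdh; the names `DRIVES`, `DRY_RUN` and `apm_commands` are made up for illustration. In hdparm, "-B 255" asks the drive to disable APM, "-S 0" disables the standby timeout, and "-B" with no value queries the current level.

```shell
#!/bin/sh
# Sketch: disable APM and the standby timer on each data drive.
# Assumption: the drives are /dev/sdd../dev/sdh, as in the post above.
DRIVES="/dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh"

# Print the hdparm commands for one drive (hypothetical helper).
apm_commands() {
    printf 'hdparm -B 255 %s\n' "$1"   # 255 = APM disabled
    printf 'hdparm -S 0 %s\n' "$1"     # 0 = standby timeout disabled
    printf 'hdparm -B %s\n' "$1"       # query the resulting APM level
}

# DRY_RUN=1 (the default) only prints; DRY_RUN=0 runs the commands as root.
for d in $DRIVES; do
    apm_commands "$d" | while read -r cmd; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
done
```

Run it once with the default dry run to check the command list, then as root with DRY_RUN=0. Settings applied this way do not persist across a reboot, which matches the behaviour described above.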

Any ideas what is overriding the disable setting? I also checked using sdparm but got nowhere, even when trying to find out whether it could see the power settings. I also have "smartmontools" installed for monitoring by Zabbix, which is how I detected this, because it keeps logging the drives going in and out of idle.

Proxmox v8.2.4 - 6.8.8-2-pve
 
I do see messages from smartd that show the disk status changing, which is why I want to be sure the disks cannot be placed into any state other than active, and so want to disable the power management.

Code:
Jun 28 09:07:02 abe smartd[2318]: Device: /dev/sdd [SAT], is back in ACTIVE or IDLE mode, resuming checks (1 check skipped)
Jun 28 09:07:03 abe smartd[2318]: Device: /dev/sde [SAT], is back in ACTIVE or IDLE mode, resuming checks (1 check skipped)
Jun 28 09:07:03 abe smartd[2318]: Device: /dev/sdf [SAT], is back in ACTIVE or IDLE mode, resuming checks (1 check skipped)
Jun 28 09:07:03 abe smartd[2318]: Device: /dev/sdg [SAT], is back in ACTIVE or IDLE mode, resuming checks (1 check skipped)
Jun 28 09:07:04 abe smartd[2318]: Device: /dev/sdh [SAT], is back in ACTIVE or IDLE mode, resuming checks (1 check skipped)
Jun 28 10:37:07 abe smartd[2318]: Device: /dev/sdd [SAT], is in STANDBY mode, suspending checks
Jun 28 10:37:12 abe smartd[2318]: Device: /dev/sde [SAT], is in STANDBY mode, suspending checks
Jun 28 10:37:17 abe smartd[2318]: Device: /dev/sdf [SAT], is in STANDBY mode, suspending checks
Jun 28 10:37:22 abe smartd[2318]: Device: /dev/sdg [SAT], is in STANDBY mode, suspending checks
Jun 28 10:37:27 abe smartd[2318]: Device: /dev/sdh [SAT], is in STANDBY mode, suspending checks

I also just saw this, which is interesting, as I did try using hdparm.conf to control it but I thought it was being ignored:

Code:
Jun 28 10:13:56 abe (udev-worker)[771091]: sdf: Process '/usr/bin/hdparm -B 255 /dev/sdf' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771091]: sdf: Process '/usr/bin/hdparm -S 0 /dev/sdf' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771093]: sdh: Process '/usr/bin/hdparm -B 255 /dev/sdh' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771093]: sdh: Process '/usr/bin/hdparm -S 0 /dev/sdh' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771069]: sdd: Process '/usr/bin/hdparm -B 255 /dev/sdd' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771069]: sdd: Process '/usr/bin/hdparm -S 0 /dev/sdd' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771071]: sdg: Process '/usr/bin/hdparm -B 255 /dev/sdg' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771071]: sdg: Process '/usr/bin/hdparm -S 0 /dev/sdg' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771092]: sde: Process '/usr/bin/hdparm -B 255 /dev/sde' failed with exit code 1.
Jun 28 10:13:56 abe (udev-worker)[771092]: sde: Process '/usr/bin/hdparm -S 0 /dev/sde' failed with exit code 1.

I added this at the end of hdparm.conf:

Code:
/dev/sdd {
       apm = 255
}
 
Wow, that is a lot of spin-up and start/stop cycles.

Are there any other non-default packages installed?

What about your complete hdparm.conf? Mine is empty besides a quiet:

Code:
root@proxmox-beta ~ > grep -Eve '^(#|$)' /etc/hdparm.conf
quiet
 
Yes, I probably have a few non-default packages. I'm not sure how best to show just those, so I've attached the full list of installed packages, but from memory I do have the Zabbix Agent2, and tools like htop, smartmontools, and screen.
Maybe something was installed that triggers those spindowns. Why is hdparm running so often from udev? This is not normal.
 
Hmmm. Getting a bit creative here, but you could temporarily replace the hdparm executable with a shell script or similar that dumps the output of "ps -ef f" to a text file when it runs.

That'll give you the complete list of all processes running on the system at the time it runs, and do it in tree format so you can fairly easily see what called it.
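
Something along these lines might do as the stand-in script. It is a rough sketch only: the paths /usr/sbin/hdparm.real and /tmp/hdparm-callers.log, the variables `HDPARM_REAL` and `HDPARM_LOG`, and the function `log_caller` are all hypothetical, and it assumes you have moved the real binary aside first.

```shell
#!/bin/sh
# Sketch of a temporary stand-in for the hdparm binary.
# Assumptions: the real binary was moved to /usr/sbin/hdparm.real and
# the log goes to /tmp/hdparm-callers.log (both hypothetical paths).
REAL="${HDPARM_REAL:-/usr/sbin/hdparm.real}"
LOG="${HDPARM_LOG:-/tmp/hdparm-callers.log}"

# Append a timestamp, the arguments, and the process tree to the log.
log_caller() {
    {
        printf '=== %s hdparm %s\n' "$(date)" "$*"
        # Forest view if this ps supports it, plain listing otherwise.
        ps -ef f 2>/dev/null || ps -ef
    } >> "$LOG"
}

log_caller "$@"

# Hand over to the real binary if present, so behaviour is unchanged.
if [ -x "$REAL" ]; then
    exec "$REAL" "$@"
fi
```

Each call appends one block to the log, so after the next spin-down you can search for the "===" lines and read the tree beneath them to find the parent process. Remember to put the real binary back afterwards.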
 
Hello,

My advice would be to double-check that the BIOS has no power-saving features enabled and that everything is set to "performance" mode. If the disks are attached to a hardware controller, that is the second place I would look.
 
Sorry it's been a while, but I just figured out why these were not working:

Code:
Jul 17 11:13:56 abe (udev-worker)[3650063]: sdd: Process '/usr/bin/hdparm -B 255 /dev/sdd' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650063]: sdd: Process '/usr/bin/hdparm -S 0 /dev/sdd' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650069]: sdf: Process '/usr/bin/hdparm -B 255 /dev/sdf' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650069]: sdf: Process '/usr/bin/hdparm -S 0 /dev/sdf' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650103]: sdh: Process '/usr/bin/hdparm -B 255 /dev/sdh' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650103]: sdh: Process '/usr/bin/hdparm -S 0 /dev/sdh' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650099]: sdg: Process '/usr/bin/hdparm -B 255 /dev/sdg' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650099]: sdg: Process '/usr/bin/hdparm -S 0 /dev/sdg' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650059]: sde: Process '/usr/bin/hdparm -B 255 /dev/sde' failed with exit code 1.
Jul 17 11:13:56 abe (udev-worker)[3650059]: sde: Process '/usr/bin/hdparm -S 0 /dev/sde' failed with exit code 1.

I found that "hdparm" is not in /usr/bin but in /usr/sbin. I'm not sure why I didn't notice this before. I have adjusted the udev rule accordingly and am monitoring the log to see whether the message has gone.
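
A quick way to check this kind of path mismatch is sketched below. The helper name `check_rule_binary` is made up for illustration and is not part of udev or hdparm; the two paths are the ones from the log and from the finding above.

```shell
#!/bin/sh
# Sketch: confirm that a program path named in a udev rule actually exists.
# check_rule_binary is a hypothetical helper, not part of udev or hdparm.
check_rule_binary() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Where the shell actually finds hdparm, if it is installed.
command -v hdparm || echo "hdparm not found in PATH"

# The path the failing rule used, then the location found above.
check_rule_binary /usr/bin/hdparm || true
check_rule_binary /usr/sbin/hdparm || true
```

After editing the rule, "udevadm control --reload" followed by "udevadm trigger" should make udev pick up the change without a reboot.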
 