[SOLVED] High disk temp?

LooneyTunes

Active Member
Jun 1, 2019
Hi,
I have just switched from a failed HDD that was running hot (>80C) to an SSD. I am a little concerned because Samsung says the max temp should be 70C. What would you say, cooling issues? Thanks

Code:
Aug 11 15:45:40 pve smartd[586]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 69
Aug 11 16:15:40 pve smartd[586]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 69 to 68
Aug 11 16:45:40 pve smartd[586]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 68 to 64
Aug 11 17:45:40 pve smartd[586]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 64 to 67
Aug 11 17:59:08 pve smartd[586]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 67 to 66
 
Also keep in mind that an SSD will start thermal throttling and get slow. If you want the full performance you should add some active/passive cooling so it never comes near the 65+ degree range.
 
Hi,
Thanks for the advice. As much as I don't like it, I can't add another fan to it, as it's a small Intel NUC. I did increase cooling in the BIOS though, which got it down some. But if Samsung says the operating temperature is between 0-70C, then ~66-ish should be ok. No idea why it runs so hot; the fan spins and looks and sounds alright for its age. Perhaps time to start thinking of upgrading the casing...
 
The point is that you don't know whether it is throttling or not. It might be that you see temps around 66 degrees just because it is already throttling. As soon as it starts thermal throttling it will get slow and stay somewhere between 65 and 70 degrees if it is rated for up to 70 degrees.
 
I see. I was thinking of benchmarking it to see if it really is that hot... I read what is put in the syslog. The case feels reasonably cool, and still syslog reports 66C, which according to the attached screenshot is closest to what they call "normalized"... If it were really that hot I should feel it... The HDD that this replaced really did heat the case up.

I am not sure I fully understand the Wikipedia explanation of ID #190... (https://en.wikipedia.org/wiki/S.M.A.R.T.). Is what is shown not the actual temp?? I took a screenshot of the current SMART values for Airflow...
 

Attachment: SSD_temp_smart.png
What is the output of smartctl -a /dev/sda?
It was a huge list, I'm guessing this was what you were asking for?
Code:
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       120
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       9
177 Wear_Leveling_Count     0x0013   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   066   053   000    Old_age   Always       -       34
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       5
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       489842020

When measuring with an IR thermometer on top of the NUC (where the drive sits) it was a mere 33C...
 
RAW_VALUE is the human-readable value. So a normalized VALUE of 66 represents 34 degrees.
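To make the column mix-up concrete, here is a minimal Python sketch that splits the attribute 190 row posted above into its columns. The helper name `parse_attribute_row` is made up for illustration, and the "100 minus temperature" relation is only an assumption that happens to fit this drive's numbers (100 - 66 = 34); other vendors may normalize differently. As far as I know smartd logs the normalized VALUE by default, which would explain the 61 to 69 readings in the syslog.

```python
# Column names follow the header line of the smartctl attribute table.
COLUMNS = ["id", "name", "flag", "value", "worst", "thresh",
           "type", "updated", "when_failed", "raw_value"]


def parse_attribute_row(row: str) -> dict:
    """Split one attribute row into named fields (hypothetical helper)."""
    # The first nine columns contain no spaces; the raw value may,
    # so the split is capped to keep the raw value in one piece.
    parts = row.split(None, len(COLUMNS) - 1)
    if len(parts) != len(COLUMNS):
        raise ValueError(f"unexpected attribute row: {row!r}")
    fields = dict(zip(COLUMNS, parts))
    for key in ("id", "value", "worst", "thresh"):
        fields[key] = int(fields[key])
    return fields


# Row copied from the smartctl output posted earlier in the thread.
row = "190 Airflow_Temperature_Cel 0x0032   066   053   000    Old_age   Always       -       34"
attr = parse_attribute_row(row)

raw_celsius = int(attr["raw_value"].split()[0])  # RAW_VALUE: temperature in degrees C
guessed_celsius = 100 - attr["value"]            # assumption: VALUE = 100 - temperature

print(f"normalized VALUE: {attr['value']}")
print(f"RAW_VALUE:        {raw_celsius} C")
print(f"100 - VALUE:      {guessed_celsius} C")
```

If the same relation holds for WORST, the 053 there would correspond to a peak of 47 degrees, still well below the 70C limit.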
That wasn't entirely obvious, thanks! Wonder why stuff needs to be this complicated, but guess there is a point to it as well. I suppose it should be in good working order then, and no risk of "throttling"... Thanks for clarifying this! :)
 
It's even more complicated. SMART values aren't standardized. Every manufacturer and every model is different and uses different attributes. So it could even be that "Airflow_Temperature_Cel 34" means something like "12345 GB written". If you really want to be sure what the SMART attributes mean, you need to look at the SSD's datasheet and hope that the manufacturer wrote good documentation explaining what the attributes mean and how exactly they are calculated. But of all the 30+ SSDs that I have, Intel is the only manufacturer that actually documents this.
 
Agreed. Since it feels cool (enough) to the touch, I think I'll leave it at that (and regular backups). It has a 5 year warranty, whatever that may include (or not). Anyhow, thanks for the insight! :)
 
You will lose your warranty as soon as you write more than your SSD's TBW rating. If your SSD is rated for, say, "300 TBW", the warranty is valid only as long as the 5 years haven't passed AND the SSD has written less than 300 TB. But you also need to take write amplification into account. Let's say you have a lot of sync writes and an average write amplification factor of 20 (that's what I got). If you then write 15 TB inside the VM, your host will write 300 TB to the SSD. So the warranty will be lost after writing only 15 TB of real data. So you really should monitor the writes to your SSD, because every single write wears it down a bit.
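The arithmetic above can be sketched in a few lines of Python. The 300 TBW rating and the factor of 20 are just the example numbers from this post, not values from any datasheet. The second half converts attribute 241 (Total_LBAs_Written) from the smartctl output earlier in the thread into terabytes, assuming one LBA is 512 bytes; that unit is an assumption and should be checked against the drive's documentation.

```python
# Example numbers from the post, not real datasheet values.
TBW_RATING_TB = 300        # hypothetical warranty limit in TB written
WRITE_AMPLIFICATION = 20   # average factor reported by the poster for their setup

# Data written inside the VM that would use up the whole rating.
guest_tb = TBW_RATING_TB / WRITE_AMPLIFICATION
print(f"{guest_tb:.0f} TB written in the VM uses up a {TBW_RATING_TB} TBW rating")

# Attribute 241 RAW_VALUE from the smartctl output earlier in the thread.
total_lbas_written = 489842020
LBA_BYTES = 512  # assumption: one LBA is 512 bytes

host_tb = total_lbas_written * LBA_BYTES / 1e12
used_fraction = host_tb / TBW_RATING_TB
print(f"Host writes so far: {host_tb:.3f} TB ({used_fraction:.2%} of {TBW_RATING_TB} TBW)")
```

Under those assumptions the drive has written roughly 0.25 TB after 120 power-on hours, so comparing this number every few weeks should show how fast the rating is being consumed.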
 