Pve kernel 4.15.17 comes with high and unstable disk latency.

PEB · May 31, 2018

Hi,

Today I realized that since the night of May 27th to 28th, when we rebooted one of our nodes, its disk latency has gone nuts. Alas, it's not just the host's own disks: the disks of any VM started on this node are affected too. After investigating, I was able to correlate the start of the issue with the reboot and the subsequent switch to pve kernel 4.15.17-1.

For the host's disk latency, see the daily graph or the weekly version (images not included). You can see that the problem started on the 28th of May. The "return to normal" visible on the daily graph at 18:00 is after we rebooted into pve kernel 4.13.16-2-pve, having first tried upgrading to 4.15.17-2-pve to see whether the issue had been fixed.

For an example of a VM that was on the machine, see its graph (image not included).

Note that the VM was moved to this host from another one on the 25th of May, hence the burst between the 25th and the 28th, which is perfectly normal. The second burst, between the 27th and the 28th, is the issue.

The graph ends when we moved the VM off the host, as this VM is critical and can't tolerate a 10x increase in the I/O time/latency of its disks.

So, there is an issue with the 4.15 pve kernel.

Have you already been informed? If not, I hope this post serves its bug report purpose.

Cheers, and thanks for your work!

marsian · May 31, 2018

Interesting, but I would assume some more technical information on your environment could be helpful here

HDD/SSD types, Controller, Cache, Type of Storage (local 7k2/10k, iSCSI etc.), ...?

micush · Jun 1, 2018

Phoronix benchmarking shows that the Spectre and Meltdown patches can affect both CPU and disk performance. Perhaps that is what is going on here.

PEB · Jun 1, 2018

marsian said:
Interesting, but I would assume some more technical information on your environment could be helpful here HDD/SSD types, Controller, Cache, Type of Storage (local 7k2/10k, iSCSI etc.), ...?

As the host's own disk is impacted, I'd bet the details don't matter much. But if some specific information is needed, I'd be happy to provide it.

Regarding the Meltdown/Spectre patches, that could be it. But as you can see, it's not just a simple increase in the average access times, it's an increase that comes with instability. Also, I can imagine slowdowns in many things, but x5 to x20 for latency? I'm doubtful.
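For anyone who wants to put a number on the latency change between two kernels without the graphs, here is a rough sketch in Python. It assumes the usual `/proc/diskstats` field layout (reads completed, ms spent reading, writes completed, ms spent writing) and averages over two snapshots, roughly the way `iostat` derives its await figure. The function names and the `sda` device are made up for illustration.

```python
import time
from pathlib import Path

DISKSTATS = Path("/proc/diskstats")


def parse_diskstats(text):
    """Map device name -> (completed I/Os, ms spent on I/O) from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip short or malformed lines
        name = fields[2]
        ios = int(fields[3]) + int(fields[7])   # reads + writes completed
        ms = int(fields[6]) + int(fields[10])   # ms reading + ms writing
        stats[name] = (ios, ms)
    return stats


def average_latency_ms(before, after, device):
    """Average ms per completed I/O between two snapshots, or None if idle."""
    ios0, ms0 = before[device]
    ios1, ms1 = after[device]
    delta_ios = ios1 - ios0
    if delta_ios <= 0:
        return None
    return (ms1 - ms0) / delta_ios


if __name__ == "__main__":
    device = "sda"  # hypothetical device name, adjust to the host
    try:
        first = parse_diskstats(DISKSTATS.read_text())
        time.sleep(5)
        second = parse_diskstats(DISKSTATS.read_text())
    except OSError:
        print("cannot read /proc/diskstats on this system")
    else:
        if device in first and device in second:
            print(device, average_latency_ms(first, second, device))
        else:
            print("device not found:", device)
```

Running it under the same load on 4.13 and on 4.15 would give two figures to compare directly.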

SamTzu · Jun 1, 2018

I hate the entire 4.15.x line of kernels. They suck like a vacuum-cleaner.

joshin · Jun 1, 2018

Do you have PTI enabled? Turn it off and see if that fixes it.

We took a 30% hit when we enabled Spectre/Meltdown mitigation. Same symptoms you're seeing.
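A quick way to check whether PTI is actually active on a node, sketched in Python. It assumes the kernel exposes `/sys/devices/system/cpu/vulnerabilities/meltdown` (4.15 kernels do) and that PTI would have been turned off with `nopti` or `pti=off` on the kernel command line. The helper names are made up for illustration.

```python
from pathlib import Path

# Kernel interfaces; assumed to be present on a 4.15 host.
MELTDOWN_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")
KERNEL_CMDLINE = Path("/proc/cmdline")


def pti_disabled_on_cmdline(cmdline):
    """True if the boot command line asks the kernel to skip PTI."""
    tokens = cmdline.split()
    return "nopti" in tokens or "pti=off" in tokens


def pti_active(status):
    """True if the sysfs status line reports PTI as the Meltdown mitigation."""
    status = status.strip()
    return status.startswith("Mitigation") and "PTI" in status


def read_text_or_none(path):
    """Read a file, returning None when it is missing or unreadable."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_text_or_none(MELTDOWN_STATUS)
    cmdline = read_text_or_none(KERNEL_CMDLINE)
    if status is None or cmdline is None:
        print("kernel interfaces not readable on this system")
    else:
        print("PTI active:", pti_active(status))
        print("PTI disabled at boot:", pti_disabled_on_cmdline(cmdline))
```

On a GRUB-based install, turning it off for a test would usually mean adding `nopti` to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, running `update-grub` and rebooting. Keep in mind that this removes the Meltdown protection.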

mac.linux.free · Jun 3, 2018

joshin said:
Do you have PTI enabled? Turn it off and see if that fixes it.

We took a 30% hit when we enabled Spectre/Meltdown mitigation. Same symptoms you're seeing.

For me at least, it fixes it... hard to believe, but true.
