Worse performance with a higher-spec server

ggroke

Member
Jul 1, 2022
Hello,

I had a Dell R610 server with 2x Xeon X5670, 192 GB of RAM, and six 10k RPM SAS hard disks (hardware RAID via the H700 controller), which gave me decent performance.

But it's an old machine, not as reliable as in its glory days, so I decided to migrate everything to an HP DL380 Gen9 server (2x E5-2673 v3, 256 GB of RAM, 4x enterprise Intel SSDs, also hardware RAID via the P440 controller). Supposedly much faster, right?

It turns out that the performance on the HP is much lower than what I achieved with the Dell. Some VMs visibly run slower than on the old server.

I can't understand where the bottleneck is. Or is HP just that bad?

I ran pveperf on both machines

DELL R610 (the old server)
Code:
CPU BOGOMIPS: 140439.84
REGEX/SECOND: 1818054
HD SIZE: 95.94 GB (/dev/mapper/pve-root)
BUFFERED READS: 601.11 MB/sec
AVERAGE SEEK TIME: 0.19 ms
FSYNCS/SECOND: 2932.92
DNS EXT: 22.67 ms
DNS INT: 0.77 ms (safari.local)

HP DL380 (the new server)
Code:
CPU BOGOMIPS: 230133.60
REGEX/SECOND: 3031673
HD SIZE: 93.93 GB (/dev/mapper/pve-root)
BUFFERED READS: 958.41 MB/sec
AVERAGE SEEK TIME: 0.09 ms
FSYNCS/SECOND: 909.08
DNS EXT: 138.56 ms
DNS INT: 1.38 ms (safari.local)

Most of the specs are way superior, so what am I missing here?
 
The only thing that stands out here is that the network is seemingly slower (you did not write what the NIC hardware is),
and the fsyncs/second is much lower (possibly because the RAID controller is not good? Does it have a BBU cache? What's the cache mode? etc.)

Everything else looks much faster, at least.
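
As background for that FSYNCS/SECOND number: as far as I understand, pveperf essentially measures how many small synchronous writes per second the storage can commit. A minimal sketch of that idea (hypothetical helper name, not pveperf's actual code):

```python
# Sketch of what an fsyncs/second benchmark roughly measures: how many
# times per second a tiny write can be made durable with fsync().
import os
import time

def fsyncs_per_second(path="fsync_test.tmp", seconds=1.0):
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    count = 0
    start = time.monotonic()
    try:
        while time.monotonic() - start < seconds:
            os.write(fd, b"x")   # tiny write...
            os.fsync(fd)         # ...forced down to stable storage
            count += 1
    finally:
        os.close(fd)
        os.unlink(path)
    return count / seconds

print(round(fsyncs_per_second(seconds=0.5)))
```

This is why a RAID controller's write-back cache (BBU/FBWC) matters so much for this metric: with it, each fsync can complete from controller RAM instead of waiting for the disk.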
 
The new CPU has twice the cores but a lower base clock, and the boost clock is also a little lower. Maybe it boosts less (because many cores are active and/or cooling is less effective), and therefore single-threaded workloads feel slower?
 
The only thing that stands out here is that the network is seemingly slower (you did not write what the NIC hardware is),
and the fsyncs/second is much lower (possibly because the RAID controller is not good? Does it have a BBU cache? What's the cache mode? etc.)

Everything else looks much faster, at least.
NICs on both servers are SFP+ with 10 Gbps links: the Dell with an Intel X520, the HP with an HP 560SFP+.
Dell (old): PERC H700 with 512 MB of cache.
HP (new): P440 with 2 GB of cache.
Both have battery backup installed and active, with the cache set to 90% write, 10% read. Pretty much the same configuration on both servers.
 
The new CPU has twice the cores but a lower base clock, and the boost clock is also a little lower. Maybe it boosts less (because many cores are active and/or cooling is less effective), and therefore single-threaded workloads feel slower?
Yes, that was one of my hypotheses. Still, there are issues with poor disk performance on the new server. It doesn't make sense to me, since the new server uses enterprise SSDs while the old one has the good old spinning disks. Considering all this, I think it should still perform better; after all, it's a much newer machine.
 
What exact SSD models are they?
The fsyncs seem too low.
Here I get ~5000 with one SATA HP MK000480GWSSC (Samsung SM883) on an ext4 filesystem,
connected directly to the motherboard (HP DL325 G10), not using the HW RAID P408i-a + cache.
 
Just a note: on my backup server (ML350 G9), which has 24 3.5-inch bays, there was some sort of HP RAID controller with a multiplexer built in.

I don't remember which controller it was, but it's the one that comes as default with the ML350 G9. Both ZFS pools (one consists of 12x 4TB SATA drives, the other of 6x 14TB SAS drives, so those aren't even big pools...) were utterly slow; I couldn't get above 400 MB/s read/write speed, and IO was horrible too.

Then I bought an LSI 9305-24i plus 6 cables to connect the bays to the LSI 9305 card, because the existing cables were incompatible.
I didn't even have to recreate the pools, because I had already been using HBA mode on the HP controller, so it simply worked.
I just lost the ability to Secure Boot the system, because the 9305-24i doesn't support Secure Boot.

However, after the exchange from HP P4xx + multiplexer to LSI 9305-24i, the read/write speed increased to the maximum: 1.2 GB/s write speed on the SATA pool and 800-900 MB/s write speed on the SAS pool. (I only care about write speed, because it's a PBS server.)

The HP RAID controller is the worst possible crap on earth, and the same goes for HP SATA/HDD drives and SSDs.
I have HP SAS drives here:

Code:
/dev/sdp -> 127 MB/s (2M) - 118 MB/s (1M) -> HPE JWUDB 512B
/dev/sdq -> 130 MB/s (2M) - 96 MB/s (1M) -> HPE JWUDB 512B
/dev/sdu -> 124 MB/s (2M) - 120 MB/s (1M) -> HPE JWUDB 512B
/dev/sdr -> 252 MB/s (2M) - 255 MB/s (1M) -> WDC WUH 4K
/dev/sds -> 198 MB/s (2M) - 230 MB/s (1M) -> HPE JWTFD 512B
/dev/sdo -> 254 MB/s (2M) - 256 MB/s (1M) -> WDC HC550 512B
/dev/sdp -> 132 MB/s (2M) - 88 MB/s (1M) -> HPE JWUDB 4K
/dev/sdq -> 127 MB/s (2M) - 150 MB/s (1M) -> HPE JWUDB 4K
/dev/sdu -> 130 MB/s (2M) - 115 MB/s (1M) -> HPE JWUDB 4K
Explanation: these are write speeds only, measured with dd; 2M and 1M are the dd block sizes.
512B / 4K is simply the logical block size.
I used sg_format to change the logical block size, simply to see if there was a performance improvement... In short, there isn't.
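
For reference, the same kind of test can be approximated without dd. This is a rough sketch of the idea (an approximation, not my exact command): sequential writes at a given block size, fsynced at the end, reporting MB/s. Point `path` at a file on the disk under test; writing to a raw /dev/sdX device, as in the table above, destroys data.

```python
# Rough equivalent of `dd if=/dev/zero of=... bs=2M` followed by a sync:
# sequential writes of `blocks` buffers of `block_size` bytes, timed.
import os
import time

def write_speed_mb_s(path, block_size=2 * 1024 * 1024, blocks=16):
    buf = b"\0" * block_size
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    start = time.monotonic()
    try:
        for _ in range(blocks):
            os.write(fd, buf)
        os.fsync(fd)  # make sure the data actually hit the disk
    finally:
        os.close(fd)
    elapsed = time.monotonic() - start
    os.unlink(path)
    return (block_size * blocks) / (1024 * 1024) / elapsed

print(f"{write_speed_mb_s('dd_test.tmp'):.0f} MB/s")
```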

I have a lot more drives; most of them do around 250 MB/s, just the HP drives underperform heavily.
The same goes for SSDs from HP; basically everything from HP is crap.

I even exchanged the HP 640SFP28 NICs for original Mellanox ConnectX-4 Lx NICs, because the HP ones have a gigantic VLAN bug even with the newest 1010 firmware, while the original Mellanox ConnectX-4 Lx NICs don't have the VLAN bug with the 1010 firmware.
The HP 640SFP28 is a rebranded Mellanox ConnectX-4 Lx NIC.

Basically every "accessory" from HPE is utter crap.
Another side note: on HP servers you should always enable "Static: High Performance Mode" in iLO, because the "Dynamic Power Savings Mode" is simply crap and buggy as well; with High Performance Mode the whole server gets a lot more responsive (no matter the task), like 2x faster.
I have a lot of those Gen9 servers, I think 6-8 in our company, and they are all crap.
Cheers

PS: the benchmarked drives are all 14TB SAS drives, so not even some old crap. The WDC HC550 is a 16TB drive.
But I've tested a lot more drives here, which is why I can say for sure that all HP drives are crap. The worst part is that they have very inconsistent write speeds, varying a lot from test to test.
All the other drives (basically I've only tested WDC drives; dunno why we don't have HGST/Seagate here) at least have super consistent write speeds.
Those HPE drives are made by WDC as well, just with HPE firmware; dunno which WDC drives HP buys, but either they're the worst WDC drives or the HP firmware is simply that bad.
I just wanted to note that HP = WDC here too, so it's not that WDC is great or something; HP is just bad, that's all.
That's the reason why HP servers are in general so affordable. But Dell isn't much better; their RAID controllers are crap as well.

Cheers
 
HP disables the disks' own cache by default (on the embedded controller and on the HW RAID card); it's more secure and required for hotplug.

edited: For the story, the HP RAID P408i-a shipped with a DL325 Gen10 I installed died after one week :-( Fortunately I was able to connect the drives to the AMD controller on the motherboard. I never re-installed the replacement P408i-a because its temperature was too high, about 80° idle, same as the faulty first one... I hope it was just me being unlucky...
 
HP disables the disks' own cache by default (on the embedded controller and on the HW RAID card); it's more secure and required for hotplug.

edited: For the story, the HP RAID P408i-a shipped with a DL325 Gen10 I installed died after one week :-( Fortunately I was able to connect the drives to the AMD controller on the motherboard. I never re-installed the replacement P408i-a because its temperature was too high, about 80° idle, same as the faulty first one... I hope it was just me being unlucky...
TBH, they are all very hot, no matter which RAID card; every one I've seen in my life runs hot.
So I wouldn't worry at all about the heat. Maybe true HBA cards run cool, but even the 9305 that I use now with IT firmware is very hot.
Not sure about those new Broadcom tri-mode controllers, since they are too expensive to test xD
 
What exact SSD models are they?
The fsyncs seem too low.
Here I get ~5000 with one SATA HP MK000480GWSSC (Samsung SM883) on an ext4 filesystem,
connected directly to the motherboard (HP DL325 G10), not using the HW RAID P408i-a + cache.
The SSDs are 4x Intel DC S4500 (1.92 TB each).
 
HP disables the disks' own cache by default (on the embedded controller and on the HW RAID card); it's more secure and required for hotplug.

edited: For the story, the HP RAID P408i-a shipped with a DL325 Gen10 I installed died after one week :-( Fortunately I was able to connect the drives to the AMD controller on the motherboard. I never re-installed the replacement P408i-a because its temperature was too high, about 80° idle, same as the faulty first one... I hope it was just me being unlucky...
Yes, I know HP disables the write cache when you install SSDs and enables "HP SmartPath" or something (which is some crappy implementation that uses part of the SSD for caching). But I've disabled SmartPath and enabled FBWC. Believe it or not, before that it was even worse!
 
The HP RAID controller is the worst possible crap on earth, and the same goes for HP SATA/HDD drives and SSDs. [...]
Basically every "accessory" from HPE is utter crap. [...] But Dell isn't much better; their RAID controllers are crap as well.
Like I suspected from the beginning: HP is shit...

But the drives are good (maybe the only good thing); they are Intel (and not HP-branded Intel), so they were supposed to perform well.

Even so, the server is here now, and I don't intend to replace it (at least not right away). What would you suggest? Ditch the HP controller and attach the disks directly to the motherboard? Or take a chance and continue with the old Dell R610?
 
FYI, here with one Intel D3-S4510 (480 GB) in a Lenovo M75q Tiny, fsyncs/s is about ~3200.

edited: On an HP ML350 G8, HW RAID1 on a P420i with 512 MB cache + HDDs (same on two systems):
Power Regulator HP Static High Performance Mode: fsyncs/s about ~2700
Power Regulator HP Dynamic Low Power Mode: fsyncs/s about ~1400
 
Like I suspected from the beginning: HP is shit...

But the drives are good (maybe the only good thing); they are Intel (and not HP-branded Intel), so they were supposed to perform well.

Even so, the server is here now, and I don't intend to replace it (at least not right away). What would you suggest? Ditch the HP controller and attach the disks directly to the motherboard? Or take a chance and continue with the old Dell R610?
Check if there is a multiplexer card; if there is, remove it if possible.
Check if you can use HBA mode somehow; firmware updates for the RAID controller often help.

In my particular case I need a RAID card for the old ESXi servers, either a RAID card or iSCSI/fs storage, because it's ESXi xD
But on the backup server I had already switched to HBA mode by the time I started; tbh I did not test RAID-mode speeds, those 400 MB/s were already in HBA mode on the HP controller I had.
You can at least test whether it gives more performance.

Otherwise it's just testing, testing, and testing. I had to do the same, but in my case, for the backup server, I had no choice; I needed another controller because of the 24 drive bays...
In your case, if you don't have that many disks, you should try the onboard SATA connection (not sure if your board has one); on the backup server (ML350 G9) I don't have any onboard SATA connections, but on the DL360 G9 I have some.
If you can, try it.

Check with smartctl -a /dev/... whether your Intel drives support other LBA modes;
there is usually a Rel_Perf indicator from 0 (best) to 4 (worst):
Code:
Samsung PM 1735:
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 -     512       0         1
 1 -     512       8         3
 2 +    4096       0         0
 3 -    4096       8         2
 4 -    4096      64         3
Code:
Samsung PM 9A3:
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
 1 -    4096       0         0
Code:
Micron 7450 Max:
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 -     512       0         2
 1 +    4096       0         0

That's an example, but from NVMe drives; sadly I don't have any SATA-based enterprise SSDs.
But if your SSDs support a 4K physical block size, you can usually reformat the logical block size to 4K.
sg_format is probably something that could do it for you, or even better hdparm, if it supports your SSDs.

On HDDs I see no speed difference, so it's not worth it at all to reformat those from 512B to 4K.
On NVMe there is indeed an improvement; on SATA SSDs I simply don't know, but probably yes.
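
To make those Rel_Perf tables easier to act on, here is a small sketch (hypothetical helper, assuming the table layout shown above) that picks the LBA format with the best Rel_Perf:

```python
# Pick the best LBA format from a smartctl / `nvme id-ns` style table.
# Lower Rel_Perf is better (0 = best); ties broken by format id.
def best_lba_format(table: str):
    best = None
    for line in table.splitlines():
        parts = line.split()
        # data rows look like: " 1 -    4096       0         0"
        if len(parts) == 5 and parts[0].isdigit():
            fmt_id, _, data, _, rel_perf = parts
            row = (int(rel_perf), int(fmt_id), int(data))
            if best is None or row < best:
                best = row
    return None if best is None else {"id": best[1], "data": best[2], "rel_perf": best[0]}

table = """\
Id Fmt  Data  Metadt  Rel_Perf
 0 -     512       0         2
 1 +    4096       0         0
"""
print(best_lba_format(table))  # -> {'id': 1, 'data': 4096, 'rel_perf': 0}
```

The actual reformat would then be done with sg_format (SAS/SATA) or `nvme format` (NVMe); both destroy all data on the drive.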

And inside iLO, the Static High Performance Mode is a must. It will consume around 30W more power, so not much.
That's all the tips I can give you; have fun testing and report back if you find something to improve your situation!
 
PS: I forgot to mention.

FSYNCS/SECOND: 909.08
-> is not really bad; it looks okay to me for, let's say, 4 SSDs in RAID 10.

FSYNCS/SECOND: 2932.92
-> For 6x 10k SAS disks that looks far too high to me, like it's just a cached result.

Maybe we should start with that, because I get:
Code:
CPU BOGOMIPS:      492083.20
REGEX/SECOND:      6470191
HD SIZE:           855.07 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     12757.88
DNS EXT:           7.10 ms
DNS INT:           0.43 ms (xxx.xxxxxx.net)
But those are 2x PM9A3 in a ZFS mirror.

And on the Backupserver (ML350 G9):
Code:
CPU BOGOMIPS:      111875.68
REGEX/SECOND:      3780378
HD SIZE:           255.01 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     244.70
DNS EXT:           7.06 ms
DNS INT:           0.54 ms (xxx.xxxxxx.net)
And these are 2x HP EH0300JDYTH (15K SAS) crap disks (300 GB each).

This is a front server (HP DL360 G10):
Code:
CPU BOGOMIPS:      96000.00
REGEX/SECOND:      3251732
HD SIZE:           446.13 GB (/dev/sda3)
BUFFERED READS:    327.50 MB/sec
AVERAGE SEEK TIME: 0.04 ms
FSYNCS/SECOND:     2640.22
DNS EXT:           7.06 ms
DNS INT:           0.47 ms (xxx.xxxxxx.net)
I don't know what drives those are, but it's some btrfs mirror with 2x MK000480GWTTH.
 
900 fsyncs is utter crap for a single SATA enterprise disk, let alone for a RAID 10.

My mirrored Intel S3610s (1.6TB) get 5200 fsyncs (the same as a single drive) on an onboard SATA connector.

I'm willing to bet the RAID controller is throwing a wrench into the gears here.
 
Check if there is a multiplexer card; if there is, remove it if possible.
Check if you can use HBA mode somehow; firmware updates for the RAID controller often help.

No, there isn't a multiplexer card, just the controller itself. The multiplexer is only necessary to control more than 8 disks (not the case here).

And inside iLO, the Static High Performance Mode is a must. It will consume around 30W more power, so not much.
That's all the tips I can give you; have fun testing and report back if you find something to improve your situation!

Indeed, it was set to balanced mode or something. I set it to performance mode, and that was a game changer! Here's pveperf after changing this setting alone:

Code:
CPU BOGOMIPS: 230134.08
REGEX/SECOND: 3204800
HD SIZE: 93.93 GB (/dev/mapper/pve-root)
BUFFERED READS: 1601.76 MB/sec
AVERAGE SEEK TIME: 0.08 ms
FSYNCS/SECOND: 4534.59
DNS EXT: 25.53 ms
DNS INT: 1.00 ms

Fsyncs/sec have improved drastically.

As this computer currently has no VMs running (I've migrated them all back to the old server while I try to figure these things out), I can run some other tests: removing the controller and connecting the drives directly to the motherboard, using HBA mode, and seeing if I can make lemonade out of this lemon!
 
Thanks for the reminder about the Power Regulator @Ramalama
I've just re-checked the tested ML350 G8; the Power Regulator was set to HP Dynamic Low Power.
It was meant to be temporary, since the fans went crazy for some reason, but it was forgotten, my bad.
Re-set to HP Static High Performance, and fsyncs/s jumped from 1400 to 2700 (because the test uses the HW RAID cache).
The single-CPU-thread score from PassMark inside a Windows VM jumped from 1300 to 2000 (Xeon E5-2667 v2, end of 2013).
 
