Getting new host, how to use SSD disk?

The OVH host I currently use has become too slow on the disk side. I plan to get a new host with SSD or NVMe disks and the ZFS file system for Proxmox 5.2, but I'm unsure how best to use the non-rotating disks.

The current host runs Proxmox 4.4 and is otherwise the same as the new host below, except it has no SSD disks and only 64 GB of memory. It works, but any heavier disk activity makes it very slow.

Planned host: https://www.ovh.co.uk/dedicated_servers/hg/1801mhg01.xml with Proxmox 5.2.

Options: 128 GB RAM, 2 * 4 TB SAS - 7.2K Enterprise, 2 * 480 GB SSD - Datacenter - Intel - S3xxx (0.3 DWPD min)

If I go with ZFS on the rotating disks and an 8 GB ZIL, there is 472 GB left on each SSD, and using it all for ZFS cache seems too much. Would there be a more useful way to use the SSDs, say 128 GB for ZFS cache and the rest for something that benefits from being on fast disk? What Proxmox storage would that be?
 
>> What proxmox storage would that be?

That would be an additional "directory" type PVE storage. You need to partition, format, and set a mount point for the SSD, then add it via the GUI.
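For reference, roughly the same thing from the CLI (a sketch only; the device /dev/sdc1, the mount point /mnt/ssd and the storage ID ssd-dir are made-up examples, not taken from your host):
Code:
# format the SSD partition and mount it permanently (device name is an example)
mkfs.ext4 /dev/sdc1
mkdir -p /mnt/ssd
echo '/dev/sdc1 /mnt/ssd ext4 defaults 0 2' >> /etc/fstab
mount /mnt/ssd
# register it as a "directory" storage instead of using the GUI
pvesm add dir ssd-dir --path /mnt/ssd --content images,rootdir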

 
I'll have to study to understand the answer by @acidrop .

Meanwhile, I have received the new host, slightly different from what I stated in my earlier messages. It is an OVH EG-128 host with 2 * 4 TB SATA disks and 2 * 420 GB NVMe disks.
I set it up like so:
Code:
root@rautaantenni:/var/lib/vz/SPEED# zpool status
  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
    still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(5) for details.
  scan: none requested
config:

    NAME           STATE     READ WRITE CKSUM
    rpool          ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        sda2       ONLINE       0     0     0
        sdb2       ONLINE       0     0     0
    logs
      mirror-1     ONLINE       0     0     0
        nvme0n1p1  ONLINE       0     0     0
        nvme1n1p1  ONLINE       0     0     0
    cache
      nvme0n1p2    ONLINE       0     0     0
      nvme1n1p2    ONLINE       0     0     0

errors: No known data errors
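A layout like this can be built with commands along these lines (a sketch only; it assumes rpool already exists on the SATA mirror and that the NVMe partitions shown in the fdisk output below have been created):
Code:
# mirrored SLOG on the two 8 GB NVMe partitions
zpool add rpool log mirror /dev/nvme0n1p1 /dev/nvme1n1p1
# L2ARC cache devices (cache is never mirrored, both partitions are used in full)
zpool add rpool cache /dev/nvme0n1p2 /dev/nvme1n1p2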

Code:
root@rautaantenni:/var/lib/vz/SPEED# fdisk -l
Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2ABBB4FF-4226-413B-886B-C4B3987501F2

Device          Start        End    Sectors  Size Type
/dev/sda1          34    1048609    1048576  512M EFI System
/dev/sda2     1050624 7814020749 7812970126  3.7T Solaris /usr & Apple ZFS
/dev/sda9  7814020750 7814037134      16385    8M Solaris reserved 1


Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 135968F3-0521-4649-AB10-2FC5032EC1D1

Device          Start        End    Sectors  Size Type
/dev/sdb1          34    1048609    1048576  512M EFI System
/dev/sdb2     1050624 7814020749 7812970126  3.7T Solaris /usr & Apple ZFS
/dev/sdb9  7814020750 7814037134      16385    8M Solaris reserved 1


Disk /dev/nvme0n1: 419.2 GiB, 450098159616 bytes, 879097968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xdb3b2452

Device         Boot    Start       End   Sectors   Size Id Type
/dev/nvme0n1p1          2048  16779263  16777216     8G 83 Linux
/dev/nvme0n1p2      16779264 879097967 862318704 411.2G 83 Linux


Disk /dev/nvme1n1: 419.2 GiB, 450098159616 bytes, 879097968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x6da2a5ce

Device         Boot    Start       End   Sectors   Size Id Type
/dev/nvme1n1p1          2048  16779263  16777216     8G 83 Linux
/dev/nvme1n1p2      16779264 879097967 862318704 411.2G 83 Linux


Disk /dev/zd0: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
root@rautaantenni:/var/lib/vz/SPEED#

Tested some speeds; FSYNCS/SECOND is now way better, about twice what I got on another test host with HW RAID.
Code:
root@rautaantenni:/var/lib/vz/SPEED# pveperf 
CPU BOGOMIPS:      95993.76
REGEX/SECOND:      3210587
HD SIZE:           3591.75 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     11116.54
DNS EXT:           35.46 ms
root@rautaantenni:/var/lib/vz/SPEED#

Speed test writing and reading:
Code:
root@rautaantenni:/var/lib/vz/SPEED# df -hT .
Filesystem       Type  Size  Used Avail Use% Mounted on
rpool/ROOT/pve-1 zfs   3.6T 1014M  3.6T   1% /
root@rautaantenni:/var/lib/vz/SPEED# dd if=/dev/zero of=junk bs=1024000 count=1024
1024+0 records in
1024+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.275118 s, 3.8 GB/s

root@rautaantenni:/var/lib/vz/SPEED# ls -lh
total 512
-rw-r--r-- 1 root root 1000M Jul  1 10:07 junk

root@rautaantenni:/var/lib/vz/SPEED# dd if=junk of=/dev/null bs=1024000
1024+0 records in
1024+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.177973 s, 5.9 GB/s

Since the ZFS cache (L2ARC) does not benefit from mirroring, I get twice as much cache, about 800 GB now. If I understand correctly, the cache is empty at boot, so it takes a long time to fill. I also suspect it never gets even half full with the usage pattern this host sees.
I have not yet installed any virtual machines on this new host, so I have no idea about real-world performance.
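To keep an eye on how the cache actually fills once VMs are running, something like this should work (standard ZFS-on-Linux interfaces):
Code:
# per-device bandwidth and allocation, including the cache partitions, every 5 seconds
zpool iostat -v rpool 5
# raw L2ARC counters: current size, hits and misses
grep -E '^l2_(size|hits|misses)' /proc/spl/kstat/zfs/arcstats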

But before it gets to production use, is that disk system setup reasonably sensible?
 
>> I'll have to study to understand the answer by @acidrop .

Sorry, I misunderstood your question. I thought you were about to partition the SSDs into 3 slices (1 for L2ARC, 1 for SLOG, and the 3rd for generic VM use). In that case, the 3rd partition could be used as additional storage on the PVE host for VMs which require high IOPS, but it looks like that's not the case.

Code:
NAME           STATE     READ WRITE CKSUM
rpool          ONLINE       0     0     0
  mirror-0     ONLINE       0     0     0
    sda2       ONLINE       0     0     0
    sdb2       ONLINE       0     0     0
logs
  mirror-1     ONLINE       0     0     0
    nvme0n1p1  ONLINE       0     0     0
    nvme1n1p1  ONLINE       0     0     0
cache
  nvme0n1p2    ONLINE       0     0     0
  nvme1n1p2    ONLINE       0     0     0

Your zpool setup looks OK; however, I would suggest you read the following article for more details about L2ARC and SLOG, or perhaps post to the zfs-on-linux mailing list (lots of ZFS experts there).

>> ... and 2 * 420 GB NVMe disk

To me those look quite big to "waste" just on SLOG and L2ARC. I would dedicate only about 1/3 of the space to L2ARC/ZIL and the rest to something more "meaningful" ... keep an eye on their "wear leveling", though ...
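As an example of the "more meaningful" part: if a third partition were freed on each NVMe drive, you could mirror them into a small fast pool for VM disks and watch the wear via SMART (a sketch only; the partitions nvme0n1p3/nvme1n1p3 and the storage ID nvme-fast do not exist in your current layout):
Code:
# mirrored ZFS pool on the spare NVMe partitions (partition names are examples)
zpool create -o ashift=12 nvmepool mirror /dev/nvme0n1p3 /dev/nvme1n1p3
# make it available to PVE for VM/container disks
pvesm add zfspool nvme-fast --pool nvmepool --content images,rootdir
# NVMe wear ("Percentage Used") from SMART data
smartctl -a /dev/nvme0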

 
>> Speed test writing and reading:
>> ... 1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.275118 s, 3.8 GB/s
>> ... 1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.177973 s, 5.9 GB/s
>> But before it gets to production use, is that disk system setup reasonably sensible?
Hi Tapio,
your speed test is not useful, because ZFS compresses the data, and zeros compress extremely well.
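You can see the effect directly on the test you already ran (standard commands; the dataset name comes from your df output):
Code:
# is compression enabled, and what ratio is the dataset achieving?
zfs get compression,compressratio rpool/ROOT/pve-1
# the 1 GB file of zeros occupies almost nothing on disk
du -h --apparent-size /var/lib/vz/SPEED/junk
du -h /var/lib/vz/SPEED/junk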

Use fio as a test.

Udo
 
Har har!
Top! You made my day!

The results are bullshit and don't show what I expect to measure? A fat lot I care! The values are nice and I can read them much better than others...

Super.


Udo
Hi again,
I will give you an example.
I have a ZFS RAID 10 with three pairs of mirrored 10K 300 GB SAS drives.
Such a SAS drive can do approx. 150 MB/s write (perhaps a little more or less). Three stripes mean a maximum of approx. 450 MB/s.

Your test gives much, much better values:
Code:
dd if=/dev/zero of=junk bs=1024000 count=1024                                                                                           
1024+0 records in                                                                                                                                         
1024+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.461368 s, 2.3 GB/s
because it measures caching... which is perhaps OK, if I actually want to measure caching...

With fio I get more realistic values:
Code:
fio --filename=test --sync=1 --rw=write --bs=10M --numjobs=4 --iodepth=16 --size=2000MB --name=test
...
WRITE: io=8000.0MB, aggrb=363991KB/s, minb=90997KB/s, maxb=91314KB/s, mint=22428msec, maxt=22506msec
363991/1024 = 355 MB/s! This is a value I can work with - it looks ugly next to 2.3 GB/s, but 2.3 GB/s has nothing to do with real life.

Udo
 
