VMs that are created on striped storage (RAID0) are not striped

robotmoon

New Member
Sep 19, 2024
I have two disks which are arranged into a striped logical volume, but when I create a VM and select that volume group as storage, the VM is stored as a new linear logical volume within that volume group instead of being stored on the already existing striped logical volume.

lvs -o+lv_layout,stripes
Code:
LV                 VG         Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Layout     #Str
  striped_storage_lv storage_vg -wi-a-----  3.49t                                                     striped       2
  vm-203-disk-0      storage_vg -wi-a----- 32.00g                                                     linear        1

The setup is two disk drives combined into a volume group that's set up as LVM storage in Proxmox. These are meant for storing temporary VMs that need fast read/write speeds for various data-processing jobs. Backups and the Proxmox OS are on separate disks (i.e. if there's total data loss because of the RAID0 setup, that's an acceptable risk).

Here are the steps I took to create it:
Code:
wipefs --all /dev/nvme0n1
wipefs --all /dev/nvme1n1
pvcreate /dev/nvme0n1 /dev/nvme1n1
vgcreate storage_vg /dev/nvme0n1 /dev/nvme1n1
lvcreate -i 2 -I 64 -L 3.49T -n striped_storage_lv storage_vg
I then add the storage via the GUI under Datacenter as LVM storage for holding VMs and containers.
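For reference, an LVM storage added that way should end up in /etc/pve/storage.cfg looking roughly like the entry below (striped_lvm is just a placeholder ID). As far as I understand it, the "lvm" storage type is scoped to the whole volume group, which is why Proxmox allocates a fresh LV per VM disk instead of placing disks inside an LV that already exists:
Code:
lvm: striped_lvm
        vgname storage_vg
        content images,rootdir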

I've also tried this with a thin pool and ran into the same issue:
Code:
lvcreate -i 2 -I 64 -L 3.49T -c 128K --thinpool striped_storage_lv storage_vg
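If the thin-pool route is used, I think the pool has to be registered as LVM-Thin storage (not plain LVM) so the VM disks are carved out of the striped pool's data LV; something along these lines from the CLI, where striped_thin is just a placeholder ID:
Code:
pvesm add lvmthin striped_thin --vgname storage_vg --thinpool striped_storage_lv --content images,rootdir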

The NVMe drives are 4TB each, so the 3.49TB I allocated to the striped logical volume is only about half of the total storage available within the volume group. The new VMs are therefore correctly added to the volume group, just not to the existing logical volume.
ls /dev/storage_vg/
Code:
striped_storage_lv  vm-203-disk-0
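For anyone reproducing this, the leftover space in the volume group (and the layout of each LV) can be checked with the standard LVM reporting commands:
Code:
vgs storage_vg -o vg_name,vg_size,vg_free
lvs storage_vg -o lv_name,lv_size,lv_layout,stripes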

How do I set this up properly to have striped VM storage?
 
Partially solved. Thanks to this thread.

I mounted the logical volume as a directory. It wouldn't let me do that without specifying a filesystem, so I chose XFS.
Code:
mkfs.xfs /dev/storage_vg/striped_storage_lv
mount /dev/storage_vg/striped_storage_lv /mnt/striped_storage_lv/
Then, in Datacenter, I added /mnt/striped_storage_lv as directory storage.
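If it helps anyone, the same thing can be done from the CLI, plus an fstab entry so the mount survives reboots (striped_dir is a placeholder ID; I believe the is_mountpoint flag keeps Proxmox from writing into the empty mount point if the LV ever fails to mount):
Code:
# /etc/fstab entry for the striped LV
/dev/storage_vg/striped_storage_lv  /mnt/striped_storage_lv  xfs  defaults  0  0

# register the directory storage with Proxmox
pvesm add dir striped_dir --path /mnt/striped_storage_lv --content images,rootdir --is_mountpoint yes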

Then I created a VM to verify it didn't make another logical volume.
Code:
  LV                 VG         Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Layout     #Str
  striped_storage_lv storage_vg -wi-ao---- 3.49t                                                     striped       2

In the VM, I ran fio with these settings:
Code:
[global]
name=nvme-seq-read
time_based
ramp_time=5
runtime=180
readwrite=read
bs=128K
ioengine=libaio
direct=1
numjobs=1
iodepth=32
group_reporting=1

[vg]
filename=/dev/sda

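I saved those settings as a job file and ran it against the VM's disk; the equivalent one-liner would be roughly this (the job file name is just what I happened to call it):
Code:
fio nvme-seq-read.fio
# or, as a single command:
fio --name=nvme-seq-read --filename=/dev/sda --rw=read --bs=128K --ioengine=libaio --direct=1 --numjobs=1 --iodepth=32 --time_based --ramp_time=5 --runtime=180 --group_reporting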
The block size of 128k was chosen to match the disk's datasheet (max seq read @ 128k was listed as 6.8 GB/s). My results:
Code:
READ: bw=6015MiB/s (6307MB/s), 6015MiB/s-6015MiB/s (6307MB/s-6307MB/s), io=1057GiB (1135GB), run=180001-180001msec

So it matches the factory benchmarks. But it should be twice that since it's striped (stripe size is 64K, so setting bs=128K should have utilized the striping).

The VM is stored properly in the logical volume now. But the data doesn't seem to be getting striped.
 
Ok, I've done more tests. There's something I'm not understanding here.

I ran the same fio test on a completely separate machine with an NVMe to get a sense of what to expect.
I also ran the fio test on a logical volume with no striping. I removed (lvremove) and recreated the logical volume each time I changed the stripe settings, so it's apples to apples: same disks, same volume group size, same VM setup (rebuilt each time).

Here are the raw results:
Code:
DESKTOP (control)
=====
bs=128K:
   READ: bw=2940MiB/s (3083MB/s), 2940MiB/s-2940MiB/s (3083MB/s-3083MB/s), io=517GiB (555GB), run=180002-180002msec
bs=256K:
   READ: bw=2994MiB/s (3139MB/s), 2994MiB/s-2994MiB/s (3139MB/s-3139MB/s), io=526GiB (565GB), run=180003-180003msec

NO STRIPE
=====
bs=64K:
   READ: bw=5130MiB/s (5379MB/s), 5130MiB/s-5130MiB/s (5379MB/s-5379MB/s), io=902GiB (968GB), run=180001-180001msec
bs=128K:
   READ: bw=6460MiB/s (6773MB/s), 6460MiB/s-6460MiB/s (6773MB/s-6773MB/s), io=1136GiB (1219GB), run=180001-180001msec
bs=256K:
   READ: bw=7688MiB/s (8061MB/s), 7688MiB/s-7688MiB/s (8061MB/s-8061MB/s), io=1351GiB (1451GB), run=180001-180001msec
bs=512K:
   READ: bw=5873MiB/s (6158MB/s), 5873MiB/s-5873MiB/s (6158MB/s-6158MB/s), io=1032GiB (1108GB), run=180002-180002msec

64K STRIPE (i=2 I=64)
=====
bs=128K:
   READ: bw=6015MiB/s (6307MB/s), 6015MiB/s-6015MiB/s (6307MB/s-6307MB/s), io=1057GiB (1135GB), run=180001-180001msec
bs=256K:
   READ: bw=7928MiB/s (8314MB/s), 7928MiB/s-7928MiB/s (8314MB/s-8314MB/s), io=1394GiB (1496GB), run=180001-180001msec

128K STRIPE (i=2 I=128)
=====
bs=128K:
   READ: bw=6392MiB/s (6703MB/s), 6392MiB/s-6392MiB/s (6703MB/s-6703MB/s), io=1124GiB (1207GB), run=180001-180001msec
bs=256K:
   READ: bw=8618MiB/s (9036MB/s), 8618MiB/s-8618MiB/s (9036MB/s-9036MB/s), io=1515GiB (1627GB), run=180001-180001msec

Looking at no striping as a baseline:
- the spec sheet for the disk lists 6.8 GB/s max sequential read @ 128K. Even if it really were exactly 6.8 GB/s, the 6.773 GB/s I measured is still 99.6% of that inside the VM (and I'm assuming vendors round their speeds up, lol). This rules the disk itself out of the troubleshooting.
- the matched speeds also rule out alignment issues

Taking a closer look at lvcreate, we have this:
-i|--stripes Number
       Specifies the number of stripes in a striped LV. This is the number of PVs (devices) that a striped LV is spread across. Data that appears sequential in the LV is spread across multiple devices in units of the stripe size (see --stripesize). This does not change existing allocated space, but only applies to space being allocated by the command. When creating a RAID 4/5/6 LV, this number does not include the extra devices that are required for parity. The largest number depends on the RAID type (raid0: 64, raid10: 32, raid4/5: 63, raid6: 62), and when unspecified, the default depends on the RAID type (raid0: 2, raid10: 2, raid4/5: 3, raid6: 5.) To stripe a new raid LV across all PVs by default, see lvm.conf(5) allocation/raid_stripe_all_devices.

-I|--stripesize Size[k|UNIT]
       The amount of data that is written to one device before moving to the next in a striped LV.
So the "stripesize" here isn't actually the stripe size; it's the stripe UNIT size. The terminology is misleading in lvcreate (e.g. when striping across 3 disks with a stripe UNIT size of 1K, the full stripe size is 3K, but in lvcreate the "stripesize" is the stripe UNIT size).
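To make that concrete with my numbers (this is just the man page's definition applied to my command): -i 2 with -I 64k means each disk gets 64 KiB before allocation moves to the next disk, so one full pass across both disks covers 128 KiB:
Code:
# -I/--stripesize is the per-device stripe UNIT, not the full stripe
lvcreate -i 2 -I 64k -L 3.49T -n striped_storage_lv storage_vg
# lvs reports that same unit in the stripe_size field
lvs -o lv_name,stripes,stripe_size storage_vg
# full stripe width = stripes x stripe unit = 2 x 64 KiB = 128 KiB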

Based on this, I SHOULD be getting somewhere close to 13.546 GB/s (2 x 6.773 GB/s) with bs=128k and --stripesize=64k. But I'm not.

There seems to be something I'm not getting. @CCupp
 
