Slow IO performance with LVM and no RAID

did-vmonroig

Renowned Member
Aug 14, 2013
Hello.

I've been reading about performance, and I think this system's IO should be better, but I can't find the problem.

It's a Core i5-2400 with two Toshiba DT01ACA200 SATA hard disks. There is no RAID in use. One disk is for the system and the VMs, the other for backups.

There are no SMART errors, and there is almost no difference with the write cache on or off.

Code:
root@servidor02:~# hdparm -W /dev/sda

/dev/sda:
 write-caching =  1 (on)

I think the partitions are aligned, as I've read that misalignment can be a problem with LVM.

Code:
root@servidor02:~# parted /dev/sda u b print
Model: ATA TOSHIBA DT01ACA2 (scsi)
Disk /dev/sda: 2000398934016B
Sector size (logical/physical): 512B/4096B
Partition Table: msdos

Number  Start         End             Size            Type     File system     Flags
 1      2097152B      10486809087B    10484711936B    primary  ext4            boot
 2      10487857152B  12584174079B    2096316928B     primary  linux-swap(v1)
 3      12585009152B  2000398934015B  1987813924864B  primary                  lvm

ext4 is in use, but I don't think it can degrade performance that much, can it?

Code:
root@servidor02:~# cat /etc/fstab 
# <file system>    <mount point>    <type>    <options>    <dump>    <pass>
/dev/sda1    /    ext4    errors=remount-ro    0    1
/dev/sda2    swap    swap    defaults    0    0
/dev/vg1/lv1    /var/lib/vz    ext4    defaults,usrquota    0    2
/dev/sdb1    /backups    ext4    defaults    0    3
proc            /proc   proc    defaults        0       0
sysfs           /sys    sysfs   defaults        0       0

But performance, especially fsyncs/second, is very low.

Code:
root@servidor02:~# pveversion -v
pve-manager: 2.2-32 (pve-manager/2.2/3089a616)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-80
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-34
qemu-server: 2.0-72
pve-firmware: 1.0-21
libpve-common-perl: 1.0-41
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.3-10
ksm-control-daemon: 1.1-1

Code:
root@servidor02:~# pveperf /var/lib/vz
CPU BOGOMIPS:      24743.28
REGEX/SECOND:      1362869
HD SIZE:           984.31 GB (/dev/mapper/vg1-lv1)
BUFFERED READS:    154.23 MB/sec
AVERAGE SEEK TIME: 14.03 ms
FSYNCS/SECOND:     35.92
DNS EXT:           36.25 ms
DNS INT:           3.01 ms (ovh.net)

At the moment this server sees very little use, but that is going to change shortly, so I would like to improve those values.

Any thoughts, please?

Regards.
 

Hi,
I guess the HDD is in use by other processes (VMs/CTs, logging, ...). Look with iostat (apt-get install sysstat; iostat -dm 5 sda) to see how much IO you have on the disks.

Perhaps the cache on the disk is disabled (this is normally done if the disks are used on a RAID controller with its own battery-protected cache)?!
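
For reference, checking and enabling the write cache with hdparm looks like this (standard hdparm usage; /dev/sda is the device from this thread):

Code:
# query the current write-cache setting
root@servidor02:~# hdparm -W /dev/sda

# enable the write cache (hdparm -W 0 /dev/sda disables it again)
root@servidor02:~# hdparm -W 1 /dev/sda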

Udo
 
Thanks, Udo.

I guess the HDD is in use by other processes (VMs/CTs, logging, ...). Look with iostat (apt-get install sysstat; iostat -dm 5 sda) to see how much IO you have on the disks.

The server is under very low load:

Code:
root@servidor02:~# iostat -dm 5 sda
Linux 2.6.32-16-pve (servidor02.XXX.XXX)     15/08/13        _x86_64_        (4 CPU)


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               4,20         0,00         0,06          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               2,20         0,00         0,02          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              28,40         0,00         0,23          0          1


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1,20         0,00         0,00          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1,20         0,00         0,02          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              10,00         0,00         0,05          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1,80         0,00         0,02          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0,80         0,00         0,00          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0,80         0,00         0,00          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               3,20         0,00         0,02          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1,40         0,00         0,01          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               7,80         0,00         0,08          0          0


Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               3,80         0,00         0,06          0          0

Code:
root@servidor02:~# pveperf /var/lib/vz
CPU BOGOMIPS:      24743.28
REGEX/SECOND:      1371343
HD SIZE:           984.31 GB (/dev/mapper/vg1-lv1)
BUFFERED READS:    153.26 MB/sec
AVERAGE SEEK TIME: 15.87 ms
FSYNCS/SECOND:     12.66
DNS EXT:           48.74 ms
DNS INT:           2.96 ms (ovh.net)

Perhaps the cache on the disk is disabled (this is normally done if the disks are used on a RAID controller with its own battery-protected cache)?!

I've tried enabling it with hdparm, and it reports that it is enabled. Anyway, there is very little difference with the cache on or off, and there is no RAID.

Regards.
 
I had a very similar problem, except I only had a single SATA hard disc. Fsyncs were like yours - very low. I tried lots of things (including posting here) and in the end tried a different disc. Everything went back to normal. The disc I was using was a Seagate Barracuda 7200 rpm 250 GB one. I haven't tried it in another system yet to see whether it really is the disc, but if you have another one I'd suggest trying it and seeing what you get.
 
Such slow fsyncs are expected on single disks (with default mount options and ext4).
 
I don't agree with the assumption about ext4. This is the result from one of my nodes using ext4 (an SSD, though). Except for adding discard to the mount options because of the SSD, it uses the default mount options.
[Attached screenshot: Screenshot from 2013-08-15 10:47:13.png]
 
If you can't believe it, just test it with a plain SATA disk.

And of course, hardware RAID or SSD is different.
 
Yes, fsyncs are much better with ext3 on single SATA disks. Or change the mount options for ext4 via fstab (only if you know what you are doing).
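
As an illustration, such a change could look like this hypothetical fstab line (based on the fstab posted above; barrier=0 speeds up fsyncs but removes a crash-safety guarantee, so data can be lost or corrupted on power failure):

Code:
# /etc/fstab -- example only, not a recommendation
/dev/vg1/lv1    /var/lib/vz    ext4    defaults,usrquota,noatime,barrier=0    0    2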
 
Might mounting with noatime or other options improve IO performance? Are there any caveats?

Thanks.
 
I just looked up your disks. Toshiba describes them as using 'Advanced Format 512e'. This means the drive has native 4K physical sectors and only emulates 512B logical sectors.
So maybe you are facing an alignment problem, which needs special settings in /etc/lvm/lvm.conf.
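
One way to check this (a sketch using the device names from this thread; the output will of course differ) is to look at where the partitions and the LVM data area actually start:

Code:
# partition start sectors should be divisible by 8 (8 x 512B = 4096B)
root@servidor02:~# parted /dev/sda unit s print

# offset of the LVM data area inside the physical volume
root@servidor02:~# pvs -o +pe_start --units b /dev/sda3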
 
What are your current mount options?

This is /etc/fstab:

Code:
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/sda1       /       ext4    errors=remount-ro       0       1
/dev/sda2       swap    swap    defaults        0       0
/dev/vg1/lv1    /var/lib/vz     ext4    defaults,usrquota       0       2
/dev/sdb1       /backups        ext4    defaults        0       3
proc            /proc   proc    defaults        0       0
sysfs           /sys    sysfs   defaults        0       0

I just looked up your disks. Toshiba describes them as using 'Advanced Format 512e'. This means the drive has native 4K physical sectors and only emulates 512B logical sectors.
So maybe you are facing an alignment problem, which needs special settings in /etc/lvm/lvm.conf.

Yes, I've read something like that, but I tried to align everything to 4096 bytes when creating the partitions:

Code:
root@servidor02:~# parted /dev/sda u b print
Model: ATA TOSHIBA DT01ACA2 (scsi)
Disk /dev/sda: 2000398934016B
Sector size (logical/physical): 512B/4096B
Partition Table: msdos

Number  Start         End             Size            Type     File system     Flags
 1      2097152B      10486809087B    10484711936B    primary  ext4            boot
 2      10487857152B  12584174079B    2096316928B     primary  linux-swap(v1)
 3      12585009152B  2000398934015B  1987813924864B  primary                  lvm

Maybe I'm missing something about alignment.
 
Creating the partitions is not the problem, since for a couple of years now all disk partitioning tools in Linux have aligned partitions to the first MB; it is more LVM I am thinking of.

"2 10487857152B 12584174079B 2096316928B primary linux-swap(v1)"

This one looks strange. Why do you have a swap partition in version 1?
 
Number  Start         End             Size            Type     File system     Flags
 1      2097152B      10486809087B    10484711936B    primary  ext4            boot
 2      10487857152B  12584174079B    2096316928B     primary  linux-swap(v1)
 3      12585009152B  2000398934015B  1987813924864B  primary                  lvm

Another observation: It seems you have aligned your disk to 2 MB? Why is that?

From lvm.conf:
# Default alignment of the start of a data area in MB. If set to 0,
# a value of 64KB will be used. Set to 1 for 1MiB, 2 for 2MiB, etc.
# default_data_alignment = 1

This means LVM is aligning to 1 MB, but your disk is aligned to 2 MB, which means the above value should be:
default_data_alignment = 2
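
As a sketch, that setting goes into the devices section of /etc/lvm/lvm.conf, or it can be given directly to pvcreate; note that it only affects newly created PVs, not the existing one:

Code:
# /etc/lvm/lvm.conf (devices section)
devices {
    default_data_alignment = 2    # align new PV data areas to 2 MiB
}

# or set it explicitly when (re)creating a physical volume
# (this wipes the existing LVM metadata on the device!)
root@servidor02:~# pvcreate --dataalignment 2m /dev/sda3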
 
Creating the partitions is not the problem, since for a couple of years now all disk partitioning tools in Linux have aligned partitions to the first MB; it is more LVM I am thinking of.

"2 10487857152B 12584174079B 2096316928B primary linux-swap(v1)"

This one looks strange. Why do you have a swap partition in version 1?

My ISP offers an online tool to create the partitions. It seems that when I selected a swap partition, it created it with that type. Can this be a problem?

Number  Start         End             Size            Type     File system     Flags
 1      2097152B      10486809087B    10484711936B    primary  ext4            boot
 2      10487857152B  12584174079B    2096316928B     primary  linux-swap(v1)
 3      12585009152B  2000398934015B  1987813924864B  primary                  lvm

Another observation: It seems you have aligned your disk to 2 MB? Why is that?

From lvm.conf:
# Default alignment of the start of a data area in MB. If set to 0,
# a value of 64KB will be used. Set to 1 for 1MiB, 2 for 2MiB, etc.
# default_data_alignment = 1

This means LVM is aligning to 1 MB, but your disk is aligned to 2 MB, which means the above value should be:
default_data_alignment = 2

To be completely honest, I don't remember, but it was probably because I used some notes meant for partitioning an SSD, although this is a mechanical disk. Can I change that parameter without risk to the existing VMs?
 
I have no idea of the difference between version 1 and version 2 swap partitions. My best guess is that it has no influence on the speed of the LVM partitions. It can easily be changed: disabling swap and regenerating the swap partition with default settings will create a version 2 swap partition. Remember to turn swap on again ;-)
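
A sketch of that procedure, using the swap partition from the fstab above (mkswap also assigns a new UUID, which does not matter here because fstab references /dev/sda2 directly):

Code:
root@servidor02:~# swapoff /dev/sda2
root@servidor02:~# mkswap /dev/sda2     # writes a new (version 2) swap signature
root@servidor02:~# swapon /dev/sda2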

AFAIK the setting in lvm.conf only influences new LVM partitions, but what is worse, it seems that this setting is applied when the PV (physical volume) is created. This means you would have to recreate your PV before the right alignment is used :-\

If you have to recreate the PV, my recommendation would be to start from scratch and use the defaults: swap version 2 and partition alignment of 1 MB, which is the default.

Forgot: 1 MB alignment is also the recommendation for 512e disks, since this is the optimum for disks with a sector size (logical/physical) of 512B/4096B.
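
For reference, a fresh 1 MiB-aligned layout similar to the one in this thread could be created roughly like this (hypothetical example; /dev/sdX and the sizes are placeholders, and this destroys any existing data on the disk):

Code:
root@servidor02:~# parted -s -a optimal /dev/sdX -- \
    mklabel msdos \
    mkpart primary ext4 1MiB 10GiB \
    mkpart primary linux-swap 10GiB 12GiB \
    mkpart primary 12GiB 100%
root@servidor02:~# pvcreate /dev/sdX3    # default 1 MiB data alignment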
 