SoftRAID+lvm+kvm+proxmox+high IO = crash ?

New Member
Nov 26, 2013
Hi, I wanted to give Proxmox a try.
But just as we were about to go "live", after configuring five VMs, we are seeing crashes that are reproducible under high IO (bonnie++).

I did some IO stress tests before (without md-RAID) and saw no crashes.

But then, for HA reasons, I moved to software RAID (and went from ext3 to ext4).

Everything is fine until there is some load. Importing a mysqldump is sufficient to crash the MySQL server.

The system is Ubuntu 12.04 LTS.

Is anybody experiencing the same issue?

Thanks for any hints
d.


"
[441973.188022] [sched_delayed] sched: RT throttling activated
[453494.377366] ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[453494.377504] ata3.00: failed command: WRITE FPDMA QUEUED
[453494.377582] ata3.00: cmd 61/08:00:20:1c:c1/00:00:01:00:00/40 tag 0 ncq 4096 out
[453494.377582] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[453494.377751] ata3.00: status: { DRDY }
[453494.377818] ata3.00: failed command: WRITE FPDMA QUEUED
[453494.377897] ata3.00: cmd 61/08:08:40:1c:c1/00:00:01:00:00/40 tag 1 ncq 4096 out
[453494.377897] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[453494.378064] ata3.00: status: { DRDY }
[453494.378140] ata3: hard resetting link....

[453555.112562] ata3: EH complete
[453592.606815] BUG: soft lockup - CPU#0 stuck for 29s! [scsi_eh_2:176]
[453592.606883] Modules linked in: microcode(F) psmouse(F) joydev(F) serio_raw(F) i2c_piix4(F) mac_hid(F) virtio_balloon(F) lp(F) parport(F) hid_generic(F) usbhid(F) hid(F) floppy(F) e1000(F) ahci(F) libahci(F)
[453592.606955] CPU 0
[453592.606959] Pid: 176, comm: scsi_eh_2 Tainted: GF 3.8.0-33-generic #48~precise1-Ubuntu Bochs Bochs
[453592.606961] RIP: 0010:[<ffffffff816f4319>] [<ffffffff816f4319>] _raw_spin_unlock_irqrestore+0x19/0x30
[453592.606997] RSP: 0018:ffff88011ba69cc0 EFLAGS: 00000286


"
 
First of all: If this is a server, why are you using a kernel optimized for desktop use? ([sched_delayed] sched: RT throttling activated)
Second: When going from ext3 to ext4, did you remember to mount with barrier=0? ext4 mounts with barrier=1 by default, whereas the Proxmox default for ext3 is barrier=0.
Third: What kind of RAID and what kind of disks?
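
To check which barrier option a filesystem is actually mounted with, something like the following Python sketch might help. It only parses text in the /proc/mounts format and reports a barrier option if one is listed explicitly; the function name, the sample lines, and the mount points are made up for illustration, and how the kernel displays the default setting varies between versions, so a result of None only means "nothing listed".

```python
def barrier_setting(mounts_text, mount_point):
    """Return the barrier option listed for mount_point, or None if none is listed.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # A later entry for the same mount point overrides an earlier one.
        found = None
        for opt in fields[3].split(","):
            if opt in ("barrier", "nobarrier") or opt.startswith("barrier="):
                found = opt
    return found


# Hypothetical sample, not taken from the system in this thread.
SAMPLE = """\
/dev/md1 / ext4 rw,noatime,barrier=0,data=writeback 0 0
/dev/md0 /boot ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    for mp in ("/", "/boot"):
        print(mp, barrier_setting(SAMPLE, mp))
```

On the host you would pass in the contents of /proc/mounts instead of SAMPLE, for example `barrier_setting(open("/proc/mounts").read(), "/")`.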
 
1. This is a plain Ubuntu 12.04, so no tweaking up to now.
2. Ext4 (barrier=0) on the host seems to be the problem, the FS of the VM doesn't matter. Didn't notice that.
The plan was to add "rw,noatime,data=writeback,barrier=0,nob" later anyway, but why is barrier=1 so disastrous?
(Speaking of improvements, do you have any other hints for performance/reliability?)
3. Just RAID1

Code:
Personalities : [raid1] 
md1 : active raid1 sda2[2] sdb2[1]
      487731008 blocks super 1.2 [2/2] [UU]
      
md0 : active raid1 sda1[2] sdb1[1]
      522944 blocks super 1.2 [2/2] [UU]
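
For keeping an eye on the arrays, a small Python sketch like the one below could flag a degraded mirror from /proc/mdstat-style output. It is only an illustration: the function name is made up, it assumes array names of the form mdN, and it relies on the "[n/m] [UU]" status format shown in the output above.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like "[2/2] [UU]"; "_" marks a missing device.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected, active, flags = status.groups()
            if expected != active or "_" in flags:
                degraded.append(current)
            current = None
    return degraded


# The mdstat output posted above.
SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sda2[2] sdb2[1]
      487731008 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[2] sdb1[1]
      522944 blocks super 1.2 [2/2] [UU]
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the input would be the contents of /proc/mdstat rather than SAMPLE.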

thanks for the quick answer!
 
Yeah, but I have no choice at the moment, and I will think twice before doing that again. Next time I will insist on hardware RAID.