The first thing that comes to my mind when talking about md/dm raid is this: the default caching mode we use for VMs is "none", i.e. the VMs access their disks with the O_DIRECT flag. In that case the MD/DM raid implementations simply forward a pointer to the guest's memory region to each individual member block device, and each of those copies the data out of that memory separately. If a second thread is modifying that memory at the same time, the underlying disks will sooner or later write different data, immediately corrupting the raid.[1]
In [1] I also mentioned a real case where this can happen: an in-progress write to swap for memory that is simultaneously being freed, so the swap entry is already discarded while the disk I/O is still in flight, causing the raid to be degraded.
Ideally the kernel would just ignore O_DIRECT here, since it is in fact documented as *trying* to minimize cache effects... not forcibly skipping caches, consistency be damned, completely disregarding the one job that for example a RAID1 has: actually writing the *same* data on both disks...
And yes, writing out data that is being modified concurrently *normally* means you have to expect garbage on disk. However, the point of a RAID is to at least end up with the *same* garbage on *both* disks, not to give userspace a trivial way to degrade the array.
If you take care not to use this caching mode you'll be fine, though the host will then use some additional memory for caching the VM's I/O.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=99171