[TUTORIAL] Understanding QCOW2 Risks with QEMU cache=none in Proxmox

bbgeek17

Distinguished Member
Nov 20, 2020
Blockbridge
www.blockbridge.com
Hey everyone,

A few recent developments prompted us to examine QCOW2’s behavior and reliability characteristics more closely:

1. Community feedback
  • Several community discussions question the reliability of QCOW2. We also have customers (predating our native integration) who are interested in using QCOW2 on LVM.
2. Integrity testing failures with QCOW2/LVM snapshots
  • When we ran our data integrity tests against the tech preview of QCOW2/LVM snapshots, we observed consistent failures starting immediately after the first snapshot was taken.
3. Confusing documentation
  • The existing resources documenting the behavior and semantics of the various cache modes lack clarity.

After extensive lab testing, we now have a clear understanding of QEMU and QCOW2 behavior, as well as the inherent risks.

Lessons Learned:

Compared to physical storage devices, QCOW2 exhibits unusual write semantics due to delayed metadata updates.

The integrity issues with LVM snapshots arose from a common misconception that cache=none disables write caching entirely. In reality, this assumption only holds for RAW disks. With QCOW2, QEMU defers metadata updates and keeps them in cached structures that remain volatile for much longer than expected, even across a guest reboot!
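
To make the point about deferred metadata concrete, here is a minimal toy model in Python. It is not QEMU code and does not reflect the real QCOW2 on-disk layout. The class name ToyImage and its methods are hypothetical, and the model assumes only what is described above: data clusters reach the disk immediately, while the mapping that makes them reachable stays in memory until a flush.

```python
class ToyImage:
    """Toy model of deferred metadata (hypothetical, NOT the real QCOW2 format)."""

    def __init__(self):
        self.disk_data = {}     # host cluster -> bytes, persistent
        self.disk_map = {}      # guest cluster -> host cluster, persistent
        self.cached_map = {}    # guest cluster -> host cluster, volatile (in RAM)
        self.next_host = 0      # next free host cluster

    def write(self, guest_cluster, data):
        host = self.cached_map.get(guest_cluster)
        if host is None:
            # Allocate a new cluster; the mapping update stays in RAM for now.
            host = self.next_host
            self.next_host += 1
            self.cached_map[guest_cluster] = host
        # The data itself goes straight to disk (no host page cache in this model).
        self.disk_data[host] = data

    def flush(self):
        # Only a flush makes the cached mapping persistent.
        self.disk_map.update(self.cached_map)

    def power_loss(self):
        # Everything held in RAM is gone; only the on-disk mapping survives.
        self.cached_map = dict(self.disk_map)

    def read(self, guest_cluster):
        host = self.cached_map.get(guest_cluster)
        return None if host is None else self.disk_data[host]


def demo():
    img = ToyImage()
    img.write(0, b"A")
    img.flush()                 # cluster 0 is now durable
    img.write(1, b"B")          # completed, but its mapping was never flushed
    before = img.read(1)        # visible while the process is alive
    img.power_loss()
    return before, img.read(0), img.read(1)


if __name__ == "__main__":
    print(demo())
```

In this sketch the second write is readable right up until the simulated power loss, after which its data is still on disk but no longer reachable, because the mapping never left memory. That is the general shape of the behavior described above, under the stated assumptions.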

Subcluster allocation in the new snapshot chain feature ("Volume as Snapshot Chains") significantly increases metadata churn. It amplifies the risk of torn writes and data inconsistency after power loss or unplanned guest termination.

Technical Report:

We've published a technical article summarizing what we've learned, including a reproducible experiment that demonstrates the semantics leading to corruption on power loss:
Please feel free to ask questions, and we'll do our best to answer. If you spot a gap in our understanding, let us know.

