Hello,
A long time ago I found a guide on configuring a two-node Proxmox cluster with DRBD, and I was really happy. We only needed two nodes for online migration and quick recovery after a hardware failure. I saw it as a solution that could be widely used, and it worked for a while.
Then I noticed that online migration sometimes failed for no visible reason. While reading the DRBD documentation I found a recommendation to check DRBD synchronization consistency at least once a month, so I started doing that. Surprisingly, I found new out-of-sync sectors every week.
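For anyone who wants to track this over time, here is a minimal sketch of reading the out-of-sync counter. It assumes DRBD 8.x, where /proc/drbd reports an "oos:" field (in KiB) per device, and that an online verify (drbdadm verify, with verify-alg configured) has already been run. The function name and the sample text are made up for illustration; they are not taken from my setup.

```python
import re

# Illustrative /proc/drbd content in DRBD 8.x style (values are invented).
SAMPLE = """\
version: 8.3.13 (api:88/proto:86-96)
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:1024 nr:2048 dw:3072 dr:4096 al:5 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:10 nr:20 dw:30 dr:40 al:1 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2048
"""


def out_of_sync_kib(proc_drbd_text):
    """Map each DRBD minor number to its 'oos' (out-of-sync) counter in KiB."""
    result = {}
    minor = None
    for line in proc_drbd_text.splitlines():
        # A device section starts with a line such as " 0: cs:Connected ..."
        header = re.match(r"\s*(\d+):\s+cs:", line)
        if header:
            minor = int(header.group(1))
        # The statistics line that follows carries the oos counter.
        found = re.search(r"\boos:(\d+)", line)
        if found and minor is not None:
            result[minor] = int(found.group(1))
    return result


def devices_out_of_sync(proc_drbd_text):
    """Return the sorted minors whose oos counter is non-zero."""
    counters = out_of_sync_kib(proc_drbd_text)
    return sorted(minor for minor, kib in counters.items() if kib > 0)
```

On a real node you would pass in the contents of /proc/drbd instead of SAMPLE, for example from a weekly cron job after the verify run has finished.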
I went deeper and found that:
- most of the time the out-of-sync sectors are in the swap space of Linux VMs (not critical in primary/secondary mode, but critical in primary/primary because it can cause memory corruption)
- sometimes (quite rarely) they appear on Windows VMs
- they never appear on ext4 volumes of VMs
At first I thought it was a hardware issue. We tried disabling every kind of offload, we disabled rr-bonding, and we even asked Dell to replace the hard drives (we assumed there was a firmware issue). However, nothing helped.
Finally we found (thanks to Lars Ellenberg) that KVM can change buffers while the data is in flight if the virtual hard drive uses a cache mode with O_DIRECT. Switching the cache mode solves (or works around) the issue.
So far I have the following recommendations for KVM on top of DRBD:
- WRONG: use writethrough or directsync for all drives of all VMs on DRBD (means no write cache)
- CORRECT: use writethrough or writeback for all drives of all VMs on DRBD (means no O_DIRECT)
- use hardware RAID with write cache and BBU (this is essential, as we disabled write cache for VMs)
- you can enable modes other than writethrough or writeback for virtual drives that have reliable barrier support, for example, a drive that has only an ext4 partition with barriers enabled and no swap
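To audit existing VMs against the recommendations above, something like the following could be used. It is only a sketch under several assumptions: the VM config uses the usual Proxmox "virtio0: storage:volume,option=value" lines, the DRBD-backed storage is named "drbd" (a hypothetical name, change it to yours), and a disk without an explicit cache= option falls back to cache=none. In QEMU, the cache modes that open the backing device with O_DIRECT are none and directsync.

```python
import re

# QEMU cache modes that open the backing device with O_DIRECT.
O_DIRECT_MODES = {"none", "directsync"}

# Assumption: a disk line without cache= ends up with cache=none.
ASSUMED_DEFAULT_MODE = "none"

# Disk keys such as virtio0, scsi1, ide2, sata0 (but not bootdisk or scsihw).
DISK_KEY = re.compile(r"^(ide|sata|scsi|virtio)\d+$")

# Illustrative VM config; the storage and volume names are hypothetical.
SAMPLE = """\
bootdisk: virtio0
memory: 2048
scsihw: virtio-scsi-pci
virtio0: drbd:vm-101-disk-1,cache=writethrough,size=32G
virtio1: drbd:vm-101-disk-2,size=8G
ide2: local:iso/debian.iso,media=cdrom
scsi0: drbd:vm-101-disk-3,cache=directsync
"""


def disks_using_o_direct(config_text, storage="drbd"):
    """Return (disk, cache_mode) pairs on the given storage that use O_DIRECT."""
    risky = []
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not DISK_KEY.match(key):
            continue
        parts = value.strip().split(",")
        # The first field is "storage:volume"; skip disks on other storages.
        if not parts[0].startswith(storage + ":"):
            continue
        options = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
        mode = options.get("cache", ASSUMED_DEFAULT_MODE)
        if mode in O_DIRECT_MODES:
            risky.append((key, mode))
    return risky
```

With the sample above, virtio1 (no cache= option, so the assumed default applies) and scsi0 (directsync) would be reported, while virtio0 with writethrough would not.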
Any more ideas and suggestions are welcome.
More information here: http://www.gossamer-threads.com/lists/drbd/users/25227
Best regards,
Stanislav German-Evtushenko