I've been working with Proxmox VE for about four months now, and I'm looking for some suggestions (or maybe reassurance) that what I'm doing is "right." The technical content of this post is mostly DRBD-related, but the end result leads back to PVE. (I apologize if this sounds like a broken record, especially to the Proxmox 2.0/HA developers...)
What I want, but can't have*
Proxmox Host 1
--------------------
Local OS disk
DRBD-R0 - Primary
DRBD-R1 - Primary
DRBD-R2 - Secondary
Proxmox Host 2
--------------------
Local OS disk
DRBD-R0 - Primary
DRBD-R1 - Secondary
DRBD-R2 - Primary
R0 would be a testing area; R1 and R2 would be for production machines. Running a Primary/Secondary setup makes it easy to avoid split-brain problems. However, PVE isn't happy unless all of its storage options are writable,* which means DRBD has to run in dual-primary mode. Things seemed alright, but trouble was brewing in the dark. DRBD doesn't complain much (or at all, in some cases), letting this little charm slip by for almost three weeks:
(from kern.log.3)
Apr 15 10:56:07 pve2 kernel: block drbd0: Digest integrity check FAILED.
Apr 15 10:56:07 pve2 kernel: block drbd0: error receiving Data, l: 4136!
Instant split-brain, probably for no "good" reason. It happened right after a new VM was created and started. Lars Ellenberg has explained the problem in a few different places; here's something close: "Digest-integrity with dual-primary is not a very good idea." (www.mail-archive.com) It boils down to in-flight data being modified after the digest is calculated but before the peer has checked it, so the integrity check fails even though nothing is actually corrupt.
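For anyone who hasn't hit this, the combination that bites is roughly the following (a minimal sketch, not my exact config - the backing disks, addresses and the sha1 choice are just placeholders):

    resource r0 {
        protocol C;

        startup {
            become-primary-on both;              # dual-primary so PVE can write on either host
        }

        net {
            allow-two-primaries;
            data-integrity-alg sha1;             # the option behind the spurious failures
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
        }

        on pve1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;                 # placeholder backing device
            address   10.0.0.1:7788;             # placeholder replication address
            meta-disk internal;
        }

        on pve2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }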
I removed the data-integrity-alg option from the net{} section and started looking at verify-alg in the syncer{} section. Scheduling "drbdadm verify all" to run during periods of low activity reduces the risk of spurious out-of-sync reports, but it's still a possibility.
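What I'm looking at instead is roughly this (again a sketch, not a recommendation - the rate, algorithm and schedule are placeholder values):

    syncer {
        rate       30M;       # placeholder bandwidth cap for resync/verify
        verify-alg md5;       # needed before "drbdadm verify" will do anything
    }

    # /etc/cron.d/drbd-verify - example schedule, Sundays at 04:00
    0 4 * * 0   root   /sbin/drbdadm verify all

As far as I can tell, verify only flags blocks as out-of-sync; it takes a disconnect/connect of the resource to actually resynchronize whatever it finds.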
I haven't had much luck searching for similar stories with Proxmox and DRBD, so I started reading about Xen. I found a page that seems to have all of the components: http://backdrift.org/live-migration-and-synchronous-replicated-storage-with-xen-drbd-and-lvm One of the visitor comments brings up the Primary/Secondary problem, which some magic Xen script seems to handle internally. The DRBD user manual mentions the disable-sendpage parameter, but I can't figure out what it really does.
How does everyone else handle this? The Wiki/How-To about DRBD doesn't mention data integrity checks. Maybe I'm worried for nothing and I should trust the piles of RAID-1 and daily backups to do their job. Thoughts?
PVE1 / PVE2
-----------------------
Dell PowerEdge R515
2x Opteron 4122
16GB DDR3
Dell H700 RAID (LSI MegaSAS 9260)
65GB RAID-1 - PVE Install
400GB RAID-1 - DRBD-R0
465GB RAID-1 - DRBD-R1
465GB RAID-1 - DRBD-R2
2x hot spares
(PVE and DRBD-R0 live on the same two disks)
2x Dual Broadcom 5709 NIC w/TOE
- WAN + 3x DRBD
Proxmox 1.8 installed and worked well right away.
*If one storage option fails, it seems to take all subsequent storage with it.
Local Storage
DRBD-R0
DRBD-R1
DRBD-R2 <-- if this is "secondary"
NFS-ISO <-- these storage groups
NFS-Backup <-- aren't initialized
I noticed trouble before I had even restarted the machines: having any DRBD resource in Secondary effectively disables the web GUI and backups, producing a stream of "vgchange" errors on both PVE hosts. The errors themselves make sense - PVE can't activate a volume group on a device that is read-only. The part I don't understand is why a host needs access to volume groups for machines it isn't running and isn't trying to modify.
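To make the failure mode concrete, here is roughly what it looks like (resource name and error text paraphrased, not copied from my logs):

    # On the host where r2 is Secondary, the DRBD device is read-only:
    cat /proc/drbd         # "2: cs:Connected ro:Secondary/Primary ..."

    # PVE's storage scan effectively runs a vgchange across everything,
    # which chokes on the read-only device:
    vgchange -aly
    #   /dev/drbd2: open failed: Wrong medium type

    # ...and once that fails, the storage entries listed after it
    # (NFS-ISO, NFS-Backup) never get initialized either.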