Filesystem corruption on HA

ejc317

Member
Oct 18, 2012
We have 12 nodes and 1 VM (testing HA). It is a CentOS cPanel server. Storage is 500 GB of space on an iSCSI SAN.

When HA first started, it migrated the VM from Node 1 to Node 3 (not sure why), but it ran fine. After a reboot I got tons of fsck errors. I manually migrated the VM back to Node 1 and took HA off, and it said the VM was already running ... and then the fsck errors went away.

I checked /dev/SAN NAME/ and a connection was open on both systems.

Could this be both nodes accessing the same disk at the same time? This is a huge issue ....
 
We have 12 nodes and 1 VM (testing HA). It is a CentOS cPanel server. Storage is 500 GB of space on an iSCSI SAN.

When HA first started, it migrated the VM from Node 1 to Node 3 (not sure why), but it ran fine.

Never enable HA while the VM is already running. Make sure the VM is stopped, then enable HA; from that point on, the resource manager controls the VM (and starts it). Your VM was probably started twice and/or hard powered off by the resource manager.
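The ordering rule above (stop the VM first, enable HA second) can be sketched as a small pre-flight check. This is only an illustration: it assumes the VM state is available as text containing a line such as `status: stopped`, which is the style of output that Proxmox VE's `qm status` prints. The function names `parse_vm_status` and `may_enable_ha` are hypothetical helpers, not part of any tool.

```python
def parse_vm_status(output: str) -> str:
    """Extract the state from status text such as 'status: stopped'.

    The 'status: <state>' format is an assumption about the CLI output.
    """
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "status":
            return value.strip()
    raise ValueError("no 'status:' line found in output")


def may_enable_ha(status_output: str) -> bool:
    """Return True only when the VM is reported as stopped.

    Any other state (running, paused, unknown) means HA should not
    be enabled yet, because the resource manager would try to start
    a VM that is already up.
    """
    return parse_vm_status(status_output) == "stopped"
```

A script could call `may_enable_ha()` on the captured status text and refuse to touch the HA configuration unless it returns `True`.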

After a reboot I got tons of fsck errors. I manually migrated the VM back to Node 1 and took HA off, and it said the VM was already running ... and then the fsck errors went away.

I checked /dev/SAN NAME/ and a connection was open on both systems.

Could this be both nodes accessing the same disk at the same time? This is a huge issue ....
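One way to test the "both nodes at the same time" suspicion is to compare, per node, whether the VM's volume is currently open. The sketch below assumes the SAN is used as an LVM volume group and that you have collected the `lv_attr` string for the VM's logical volume from each node (for example from `lvs -o lv_name,lv_attr`). In `lv_attr`, the sixth character is `o` when the device is open. The node names and helper functions here are hypothetical.

```python
from typing import Dict, List


def is_open(lv_attr: str) -> bool:
    """True if the sixth lv_attr character marks the device as open."""
    return len(lv_attr) >= 6 and lv_attr[5] == "o"


def nodes_with_open_volume(attrs_by_node: Dict[str, str]) -> List[str]:
    """Return the sorted names of nodes that hold the volume open."""
    return sorted(node for node, attr in attrs_by_node.items() if is_open(attr))


def concurrent_access(attrs_by_node: Dict[str, str]) -> bool:
    """True if more than one node has the same volume open."""
    return len(nodes_with_open_volume(attrs_by_node)) > 1


# Hypothetical example: lv_attr collected from two nodes for one volume.
observed = {"node1": "-wi-ao", "node3": "-wi-ao"}
if concurrent_access(observed):
    print("volume open on:", ", ".join(nodes_with_open_volume(observed)))
```

If more than one node reports the volume as open while the VM is not being live migrated, two VM instances writing to the same disk is a plausible cause of the fsck errors.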

HA is not trivial, but it works very well when configured correctly. You need to know what you are doing, so we always recommend a support subscription and a review by our team to make sure you are on the right track.