Slow KVM Backup times and HA issue during back

ejc317 · Oct 22, 2012

Hi all,

We're getting there, slowly, but surely.

So we have the cluster up - nodes 1-4 and have a VM up (KVM) and during a backup it

a) shuts the node off - so no live backups?
b) as it's backing up HA kicks in and says "ok" that the VM has moved from node 3 to node 1 ... but in the GUI it still shows up under node 3. When I click start the VM, it does the HA start VM 101 again and says OK that it's moved.
c) this 32gb harddrive is taking more than 30 minutes to backup?
d) it's only backing up the IDE drive not the virtual drive (100GB virtio drive and 30GB IDE drive) ... is there a reason? or do I have to choose separately

Is this normal or something we should set specifically?

Thank you very much in advance

dietmar · Oct 22, 2012

ejc317 said:
a) shuts the node off - so no live backups?

What? The node or the VM?

ejc317 said:
b) as it's backing up HA kicks in and says "ok" that the VM has moved from node 3 to node 1 ... but in the GUI it still shows up under node 3. When I click start the VM, it does the HA start VM 101 again and says OK that it's moved.

What happens exactly? That requires further analysis. I guess your network is overloaded, and that breaks cluster communication. For HA, you should use a separate network for cluster to avoid such things.

ejc317 said:
c) this 32gb harddrive is taking more than 30 minutes to backup?

That depends on the speed of the storage.

ejc317 said:
d) it's only backing up the IDE drive not the virtual drive (100GB virtio drive and 30GB IDE drive) ... is there a reason? or do I have to choose separately

This is not normal. Any hint in the backup log?

ejc317 · Oct 22, 2012

1) # 1 is solved - seems like user error
2) It says server restarted on node 1 and logs say its fine no errors but server never moved. I actually have a question. We have 2 public NICs bonded in active / failover and 2 NICs currently unbonded but we will either do 802.3ad or Balance-alb ... (any suggestions?) Only issue I see with 802.3ad is that our 2 private nics go to separate switches and I can't bond them switch side (switches are stacked but can't trunk across stacks)

3) Storage is an SSD array with 8 x Samsung 830 SSDs with an LSI 9266-8i with 1GB cache and a cachevault so the speed is there. I checked seems like it was 25% IO wait and the network port was saturated

It seems that the cluster hostname and related IP becomes the initiator and the target is the target that was added originally to proxmox. IE if I have 2 nics, unless its bonded, it won't go out the other private nic. Similarly for the target it'll just chose 10.10.1.100 (for example) as the SAN ip even though I have 15 other 1gbps interfaces that are not getting any traffic (i've chekced switch logs)

I will try to get multipathing to work but it seems that the scsi device no longer shows up (I assume the LVM group is busy)

ejc317 · Oct 22, 2012

Or conversely, how do I have proxmox use a different NIC for iscsi communication than it does for others?

We had the proxmox IP as the public IP but were afraid that transfers will take place over the public network (67.x.x.x) so we changed the proxmox IP to (10.10.1.x) on the private network.

Ideally, we can have a public IP that is used for cluster communication and a private network for iSCSI but how do we accomplish that via the cluster config? The other reason we had the cluster IP as a private IP is in case we need to re-number, it won't mess up the cluster but do you suggest we go back to having public IP for cluster IPs??

Search

Search

Slow KVM Backup times and HA issue during back

ejc317

Member

dietmar

Proxmox Staff Member

ejc317

Member

ejc317

Member