We have two PVE servers, P1 and P2, running OpenVZ only. P1 hosts a lot of Apache containers, while P2 hosts two MySQL containers and a few small Apache containers. Both are set up identically (Core2 Quad CPU, 8GB RAM, Adaptec SATA RAID, PVE 1.9, kernel 2.6.32-4). The daily vzdump backup routine starts at 10pm on P1 and at 2am on P2; both back up to an NFS share hosted on a third server.
We have been informed that during these backups the sites slow down and even time out for a couple of minutes, and sometimes they can't reach the MySQL server. After some investigation we found that right after the snapshot backup starts at 10pm, the sites on P1 become so unresponsive that they use up all the available connections on the MySQL servers on P2.
PVE reports an IO delay of around 30% while the vzdump backups are running, but the load on the containers is between 40 and 160. Network performance is sluggish as well; ssh logins take minutes. Basically, vzdump kills the complete IO subsystem of these servers.
We have already changed the default IO scheduler from CFQ to deadline, because:
- we read that CFQ is not aware of the IO queue the Adaptec RAID uses so it's unnecessary
- servers were completely unusable with CFQ during backups
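For reference, a minimal sketch of how the scheduler can be switched at runtime through sysfs. The helper name `set_io_scheduler`, the `SYSFS_ROOT` override and the device name `sda` are assumptions for illustration only; the RAID volume may show up under a different name on your hardware.

```shell
# Hypothetical helper: set the IO scheduler for one block device.
#   $1 = device name (e.g. sda), $2 = scheduler name (e.g. deadline)
# SYSFS_ROOT is an assumed override for dry runs; it defaults to /sys.
set_io_scheduler() {
    dev="$1"
    sched="$2"
    file="${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler"

    # The scheduler file must exist and be writable (normally root only).
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }

    # Writing a scheduler name selects it for this device until reboot.
    echo "$sched" > "$file"

    # On a real system this prints the available schedulers,
    # with the active one shown in [brackets].
    cat "$file"
}

# Usage (as root); sda is an assumption for the Adaptec RAID volume:
#   set_io_scheduler sda deadline
```

A change made this way does not survive a reboot; the usual way to make it permanent on kernels of this generation is the `elevator=deadline` boot parameter.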
Any ideas how to run vzdump so it doesn't kill the IO of the server?