KVM Zero Downtime Live Migration

Giovanni

Good Morning Guys,

I am looking to get the most out of KVM and PVE. Right now I have a single server that I am planning to migrate to a high-availability cluster.

For this, I need some recommendations.

1) One of the features released with PVE 1.4 was KVM zero-downtime live migration; however, there is not much further information on the matter. What is required for me to be able to transfer a KVM guest with no downtime?


2) For PVE 1.6, which should have High Availability features, what kind of setup is Proxmox testing/using? Are you using Fibre Channel shared storage, DRBD, or something else? What is the best/fastest choice for the highest IOPS while still having data redundancy and automatic fault tolerance?


3) For my next hardware selection for Proxmox, I want to ensure that we expand our ability to host more guests on a single node/server. Our bottleneck has always been the IOPS of our hard drives; this could be overcome with a high-performance Intel SSD. Has Proxmox been tested with SSD drives? My dream setup would be as follows:
SATA II HDD: Proxmox bare-metal OS
Local SSD: storage for guests

A few concerns about SSDs:
- Data protection/backup: are there any open-source real-time replication tools so that data can be backed up continuously in case of SSD failure?
- High availability and live migration: will it be possible to live-migrate KVM guests with little to no downtime (without transferring the guest's 80 GB disk image over the network)?

Your input is appreciated.
 
1) One of the features released with PVE 1.4 was KVM zero-downtime live migration; however, there is not much further information on the matter. What is required for me to be able to transfer a KVM guest with no downtime?

For KVM zero-downtime live migration, you need shared storage (iSCSI, Fibre Channel, NFS, ...), so you don't move the image file, only the memory.
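
As a rough back-of-the-envelope illustration of why that matters (the guest RAM size, disk size, and network speed below are assumed example numbers, not figures from this thread):

Code:
# Why shared storage helps: only the guest's RAM has to cross the network,
# the (much bigger) disk image stays put. All numbers are assumptions.
LINK_BYTES_PER_S = 1e9 / 8          # ~125 MB/s usable on gigabit Ethernet
disk_bytes = 80 * 1024**3           # an 80 GB disk image
ram_bytes  = 4 * 1024**3            # a 4 GB guest

print(f"moving the disk image: ~{disk_bytes / LINK_BYTES_PER_S / 60:.0f} minutes")
print(f"moving only the RAM:   ~{ram_bytes / LINK_BYTES_PER_S:.0f} seconds")

(In practice the memory is copied in several passes while the guest keeps running, so the visible downtime is far smaller than even the second number.)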

Alain
 
...
2) For PVE 1.6, which should have High Availability features, what kind of setup is Proxmox testing/using? Are you using Fibre Channel shared storage, DRBD, or something else? What is the best/fastest choice for the highest IOPS while still having data redundancy and automatic fault tolerance?
Hi,
I'm using Fibre Channel. I also ran some tests with iSCSI and DRBD, but FC is by far the fastest (depending on the RAID you use). However, if your iSCSI/FC RAID fails, you lose everything. DRBD is more robust: you have two redundant storages and can also use them for live migration, but the performance is well below FC. Perhaps it is better with a 10 Gb NIC, but I have no experience with that.
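
(A rough way to see why the replication link dominates DRBD write performance: with synchronous replication - DRBD protocol C - every write has to be acknowledged by the peer before it completes, so sustained writes are capped by the link, whatever the local RAID can do. Illustrative arithmetic only:)

Code:
# Upper bound on replicated write throughput for a synchronous DRBD link.
# Link speeds are examples; real numbers are lower due to protocol overhead.
for name, gbit in (("1 Gbit NIC", 1), ("10 Gbit NIC", 10), ("40 Gbit InfiniBand", 40)):
    print(f"{name:18s} -> at most ~{gbit * 1000 / 8:.0f} MB/s of replicated writes")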

3) For my next hardware selection for Proxmox, I want to ensure that we expand our ability to host more guests on a single node/server. Our bottleneck has always been the IOPS of our hard drives; this could be overcome with a high-performance Intel SSD. Has Proxmox been tested with SSD drives? My dream setup would be as follows:
SATA II HDD: Proxmox bare-metal OS
Local SSD: storage for guests
A good RAID controller is just as important as fast disks! I ran one test with an SSD (not the expensive Intel one) on a backup server (Bacula) as a spool disk. The results were very bad: if you only write to or only read from the SSD, the performance is good, but with mixed I/O the performance is very poor.
I think (or rather hope, because I dream about that setup too) that the Intel SSD does not have this problem.
One drawback of internal SSDs: you can only use that storage for live migration with DRBD, so test the performance (I guess you don't need SSDs for that).
An external FC RAID with SSDs makes more sense (if you need live migration).
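
As a sketch of how to reproduce that write-only vs. mixed-I/O comparison yourself (the file path and sizes below are made up; a dedicated benchmark tool like fio or Iometer will give more reliable numbers):

Code:
# Crude comparison: sequential writes only vs. writes interleaved with reads
# on the same file. Path and sizes are assumptions - point PATH at the SSD.
import os, time

PATH  = "/mnt/ssd/iotest.bin"       # hypothetical mount point of the SSD
BLOCK = 1024 * 1024                 # 1 MiB per I/O
COUNT = 512                         # 512 MiB written per pass
buf   = os.urandom(BLOCK)

def write_only():
    with open(PATH, "wb") as f:
        for _ in range(COUNT):
            f.write(buf)
        os.fsync(f.fileno())

def mixed_io():
    with open(PATH, "r+b") as f:    # file already exists after write_only()
        pos = 0
        for i in range(COUNT):
            f.seek(pos); f.write(buf); pos = f.tell()
            f.seek(i * BLOCK // 2)  # read somewhere behind the write position
            f.read(BLOCK)
        os.fsync(f.fileno())

for name, fn in (("write only", write_only), ("mixed r/w ", mixed_io)):
    start = time.time()
    fn()
    print(f"{name}: ~{COUNT / (time.time() - start):.0f} MiB/s written")

(The page cache will flatter both numbers, which is one more reason a real benchmark tool is the better choice.)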
A few concerns about SSDs:
- Data protection/backup: are there any open-source real-time replication tools so that data can be backed up continuously in case of SSD failure?
- High availability and live migration: will it be possible to live-migrate KVM guests with little to no downtime (without transferring the guest's 80 GB disk image over the network)?

Your input is appreciated.
To A: You should protect the data on an SSD with a RAID controller as well. And DRBD is real-time replication!
To B: It is already possible with the current Proxmox version (without HA). You only need external storage or DRBD.

Udo
 

Thanks for your input!

I am curious: regarding your single SSD, can you give me the brand and model so I can compare it to the Intel X-25M? Also, do you have any stats (IOPS, etc.)? If you have Windows guests on PVE you could run Iometer.

Right now I am trying to be conservative. It seems like the cheapest way to get HA on the affordable side is DRBD, but you mentioned that it was slow; would you mind sharing how slow it was?

My main concern is to be able to run at least 3 Virtual Machines on a single server and have at least 3 physical Proxmox VE connected to the shared storage.

High Availability is coming in Proxmox VE. I do not know what kind of features it will have, but if you needed a similar setup today, what kind of hardware would you get? Mind you, this should all fit in a 22U cabinet.

Thanks for your help!
 
For fast DRBD (and it will be very fast) you need a good hardware RAID controller with BBU and a fast dedicated network - best is 10 Gbit, InfiniBand, etc.

Supermicro offers an integrated 40 Gbps InfiniBand controller on some new server mainboards, but note: we do not have such hardware in our labs and therefore we have never tested or validated it.

I suggest you ask the DRBD experts at www.linbit.com (the creators of DRBD) for consultancy here; they also know Proxmox VE - oh, and feel free to donate some bucks to extend our test lab equipment.
 
Yes, I'm currently using Dolphin adapters connected to their switch both for DRBD and (with modified NFS stacks on the hosts and server) as NFS-based shared storage for Proxmox. It works fantastically well.
 
Hi,
what read/write rates do you reach?
Do you use a 10 Gb or 20 Gb connection?
What value do you use for the syncer rate?

Perhaps I should use the trial offer from Dolphin...

Udo
 

My read/write rates are *relatively* low due to the disk subsystem, around 240 MB/s. But that's more than I require, and definitely NOT a limitation of the Dolphin hardware at all. For DRBD, I use a syncer rate of 60. It's fairly conservative, but adequate for what I'm doing.

I have a total of 10 hosts, 2 storage hosts and 8 VM hosts, each single-attached to the Dolphin switch (i.e., 10 Gb/s).

What is really, really fast, though, is live migration. Because I can achieve pretty close to 10Gb/s with extremely low latency, it's essentially instantaneous.
 
We're trying PVE 1.5 + KVM live migration WITH DRBD and the results are incredible.

We have a real downtime of around 2-4 seconds, not more than that.
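
(One simple way to put a number on that kind of downtime, as a sketch: ping the guest from a third machine during the migration and record the longest gap between replies. The guest IP below is an assumption.)

Code:
# Ping the guest once a second during a live migration and report the
# longest gap between successful replies. GUEST_IP is a made-up example.
import subprocess, time

GUEST_IP = "192.168.1.50"
last_ok, worst_gap = time.time(), 0.0

while True:                          # stop with Ctrl+C when the migration is done
    ok = subprocess.call(["ping", "-c", "1", "-W", "1", GUEST_IP],
                         stdout=subprocess.DEVNULL) == 0
    now = time.time()
    if ok:
        worst_gap = max(worst_gap, now - last_ok)
        last_ok = now
        print(f"reachable - worst gap so far: {worst_gap:.1f}s")
    time.sleep(1)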

We have a "problem" though, what if server 1 with all the machines goes down?

What happens to the server 2 who's currently taking a nap - in synch with drbd on server 1 - in order to get the machines up?

So what we basically did was, synch'd DRBD, and suddendly shutted down server 1

server 2 had no idea what to do.

Is there any workaround, idea, what so ever to be able to get up the machines from the "dead" server 1, even when drbd was fully synched?
 
Hi,
the config files of the VMs are lost. So it's a good idea to rsync the configs to the other node (in the case of KVM they are in /etc/qemu-server/). Then you only need to move the configs into /etc/qemu-server/ on the surviving node and you can start the VMs again.
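
A minimal sketch of that rsync step (the peer hostname and the backup path on the peer are assumptions; it could be run from cron and assumes passwordless root SSH between the nodes):

Code:
# Copy this node's KVM guest configs to the other node so they survive a crash.
# PEER and the target directory are made-up examples.
import subprocess

PEER = "pve-node2"
subprocess.check_call([
    "rsync", "-a", "--delete",
    "/etc/qemu-server/",                         # guest configs on this node
    f"root@{PEER}:/root/qemu-server-backup/",    # keep a copy outside /etc on the peer
])

Copying into a separate directory on the peer (rather than straight into its /etc/qemu-server/) avoids clashing with the VMs that node already runs; after a failure you move only the configs you actually want to start.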
Full HA support should be in PVE 2.0.

Udo
 
Does anyone have any notes on setting up a standby master with PVE 1.5? It would make a nice wiki article. I'm thinking of using a cron script and Heartbeat to "wake" the standby control node.
 
Hi,
the config files of the VMs are lost. So it's a good idea to rsync the configs to the other node (in the case of KVM they are in /etc/qemu-server/). Then you only need to move the configs into /etc/qemu-server/ on the surviving node and you can start the VMs again.
Full HA support should be in PVE 2.0.

Udo


That was nice, and unknown to me - now everything makes sense INDEED.

Cheers, and thanks so much for the info!
 
Hi,
sorry for the late answer.
I have not tested the SSD with Proxmox, only as a spool disk for Bacula (several write streams to the disk and one read stream feeding an LTO-4 drive).
If nothing writes to the SSD simultaneously, I get a transfer rate of 110 MB/s to the drive - that's OK.
But when I write several streams to the disk (all data coming in over one 1 Gb network connection), my transfer rate drops to 23 MB/s in the worst case. Most of the time I reach 60 to 100 MB/s, but I expected a lot more. In a test ( http://www.behardware.com/articles/753-3/ssd-2009-act-1-ocz-apex-and-samsung-pb22-j.html ) the performance figures look better, but that test only reads and then writes - not both at the same time, which is what happens in real life.
The SSD is a "Samsung SSD 64GB PB22-J 2.5" SATA II (MLC)".

Udo
 
