When will KVM live / suspend migration on ZFS work?

gkovacs

Upon upgrading our cluster to PVE 4, I just realized that live migration of KVM guests on ZFS local storage (zvol) still does not work. Since vzdump live backups do work (presumably using ZFS snapshots), I wonder why the same is not implemented for migration, and when it can be expected. Is it on the roadmap at all?
 
Live migration with local storage cannot work.

It would basically be possible to do a "storage live migration", but that makes very limited sense to me, as you would need to copy the complete virtual disk contents over the network just for the migration. So either power off and do an offline migration, or use a distributed or shared storage.

Maybe we will add this in the future, but AFAIK no one is currently working on it.
 
Live migration with local storage cannot work.

It would basically be possible to do a "storage live migration", but that makes very limited sense to me, as you would need to copy the complete virtual disk contents over the network just for the migration. So either power off and do an offline migration, or use a distributed or shared storage.

Maybe we will add this in the future, but AFAIK no one is currently working on it.

Well, even if a true live migration is not possible from local storage, an almost-live (suspend) migration is badly needed. We have KVM guests of several hundred gigabytes, and turning them off for migration causes so much downtime that our users simply cannot accept it.

Since QEMU/QCOW2, LVM and ZFS all support snapshots, would it be possible to create a two-phase (snapshot, suspend, copy the rest) migration for KVM, just like it already works very well for backups?
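For illustration, here is a minimal sketch of what such a two-phase copy could look like using plain zfs and qm commands (this is not an existing Proxmox feature; the VM ID, zvol name and target host below are made-up examples):

Code:
# Two-phase local-storage migration sketch: phase 1 copies a snapshot while
# the guest keeps running, phase 2 suspends the guest and sends only the
# blocks written in between. VM ID, zvol and target host are hypothetical.
import subprocess

VMID = "100"
ZVOL = "rpool/data/vm-100-disk-0"
TARGET = "node2"

def sh(cmd: str) -> None:
    """Run a shell pipeline and stop on the first failure."""
    subprocess.run(cmd, shell=True, check=True)

# Phase 1: snapshot and bulk copy, guest still running.
sh(f"zfs snapshot {ZVOL}@migrate1")
sh(f"zfs send {ZVOL}@migrate1 | ssh {TARGET} zfs recv -F {ZVOL}")

# Phase 2: suspend, then send only the delta written since the first snapshot.
sh(f"qm suspend {VMID}")
sh(f"zfs snapshot {ZVOL}@migrate2")
sh(f"zfs send -i @migrate1 {ZVOL}@migrate2 | ssh {TARGET} zfs recv {ZVOL}")

# Still missing for a real migration: moving the VM config and suspended
# state to the target, starting it there, and cleaning up the source
# (the pieces Proxmox would have to wire together).

The downtime would then be roughly the time it takes to send the second, incremental snapshot plus the suspended state, instead of the whole disk.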
 
Since QEMU/QCOW2, LVM and ZFS all support snapshots, would it be possible to create a two-phase (snapshot, suspend, copy the rest) migration for KVM, just like it already works very well for backups?

QEMU already includes some block migration code, so it would be best to use that. But most people simply use shared/distributed storage instead, which handles live migration much faster ...
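If that refers to QEMU's built-in block mirroring, a rough and simplified sketch of driving it over QMP could look like the following; the socket path and drive id follow common Proxmox naming but are assumptions here, and a real implementation would also have to watch the block-job events and complete or cancel the mirror:

Code:
# Hedged sketch: issue a single QMP command to a running QEMU instance.
# Assumes a QMP socket at /var/run/qemu-server/100.qmp and a drive named
# "drive-virtio0"; both are assumptions, not verified Proxmox internals.
import json
import socket

def qmp_command(sock_path, command, arguments=None):
    """Connect, negotiate QMP capabilities, run one command, return the reply."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    f = s.makefile("rw")
    f.readline()                                           # greeting banner
    f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    f.flush()
    f.readline()                                           # capabilities ack
    msg = {"execute": command}
    if arguments:
        msg["arguments"] = arguments
    f.write(json.dumps(msg) + "\n")
    f.flush()
    reply = json.loads(f.readline())
    s.close()
    return reply

# Start mirroring the disk to a path the target node can also reach
# (e.g. an NFS export); QEMU copies blocks while the guest keeps running.
print(qmp_command("/var/run/qemu-server/100.qmp", "drive-mirror", {
    "device": "drive-virtio0",
    "target": "/mnt/pve/shared/vm-100-disk-0.raw",
    "sync": "full",
    "format": "raw",
}))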
 
QEMU already includes some block migration code, so it would be best to use that. But most people simply use shared/distributed storage instead, which handles live migration much faster ...

I seriously doubt that "most people" use distributed/shared storage, since that requires 10G networking to reach acceptable performance, and it's still on the expensive side. We use bonded 1G interfaces (tried 2x and 3x), and Ceph is sluggish even for backup purposes (both latency and throughput are poor).

I do hope that at least a snapshot/suspend migration of KVM gets implemented in Proxmox in the near future.
 
We use bonded 1G interfaces (tried 2x and 3x), and Ceph is sluggish even for backup purposes (both latency and throughput are poor).

When you use bonded interfaces, I do not believe that a single connection will use more than a single NIC. Everything I've read about bonded interfaces says that they only spread the load across multiple clients.
 
When you use bonded interfaces, I do not believe that a single connection will use more than a single NIC. Everything I've read about bonded interfaces says that they only spread the load across multiple clients.
Partly true, or you could say true for writes, but if you are using bond-xmit-hash-policy layer3+4 and a threaded application, you should theoretically be able to spread reads over more NICs.
 
LACP also only partially helps. Most of the techniques work on a per-MAC basis, so if you have one target (e.g. an iSCSI portal), all traffic to that one MAC gets routed through a single NIC, because source and target are always the same, but you do get more bandwidth for e.g. the clients of a web server.
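To illustrate (a simplified model of the bonding transmit hash, not the kernel's exact algorithm): one flow always maps to the same slave NIC, while many distinct flows can spread out.

Code:
# Toy model of the bonding transmit hash policies discussed above;
# illustration only, not the kernel's actual implementation.

def layer2_slave(src_mac: str, dst_mac: str, n_slaves: int) -> int:
    """layer2 policy: hashes only the MAC pair, so one peer means one NIC."""
    s = int(src_mac.replace(":", ""), 16)
    d = int(dst_mac.replace(":", ""), 16)
    return (s ^ d) % n_slaves

def layer3_4_slave(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                   n_slaves: int) -> int:
    """layer3+4 policy: ports take part, so parallel connections can differ."""
    return (hash((src_ip, dst_ip)) ^ src_port ^ dst_port) % n_slaves

# One iSCSI portal, one connection: always the same slave, however many NICs.
print(layer2_slave("aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02", 3))

# Many client connections to a web server: layer3+4 spreads them across slaves.
for port in (50001, 50002, 50003, 50004):
    print(layer3_4_slave("10.0.0.5", "10.0.0.9", port, 443, 3))

This is also why a single TCP stream (e.g. one migration or zfs send connection) over an LACP bond tops out at one NIC's worth of bandwidth.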
 
I seriously doubt that "most people" use distributed/shared storage, since that requires 10G networking to reach acceptable performance, and it's still on the expensive side. We use bonded 1G interfaces (tried 2x and 3x), and Ceph is sluggish even for backup purposes (both latency and throughput are poor).

I do hope that at least a snapshot/suspend migration of KVM gets implemented in Proxmox in the near future.
Ceph is distributed storage. Shared storage would be something like a NAS/SAN sharing the images over NFS, which is a very common setup. Then you can live migrate, and all that has to move is the RAM.

If you want live migration, redesign your cluster. Local storage isn't designed for live migration.
 
Ceph is distributed storage. Shared storage would be something like a NAS/SAN sharing the images over NFS, which is a very common setup. Then you can live migrate, and all that has to move is the RAM.

If you want live migration, redesign your cluster. Local storage isn't designed for live migration.

I understand that, but my original post was also about suspend migration... the problem is, unless you use shared or distributed storage, migrating a KVM guest means considerable downtime.

And since QCOW2, LVM and ZFS all support snapshots, I reckon it would be trivial to have at least a two-phase (snapshot, migrate, suspend, migrate the rest) migration for KVM. Proxmox already uses these snapshot features for backups, so my question was: why can't we use the same technology for migration? I know it's not live, but it would be very close to it...
 
I understand that, but my original post was also about suspend migration... the problem is, unless you use shared or distributed storage, migrating a KVM guest means considerable downtime.

And since QCOW2, LVM and ZFS all support snapshots, I reckon it would be trivial to have at least a two-phase (snapshot, migrate, suspend, migrate the rest) migration for KVM. Proxmox already uses these snapshot features for backups, so my question was: why can't we use the same technology for migration? I know it's not live, but it would be very close to it...
Except your use case is very unusual. You have a cluster yet no shared storage? Could you not repurpose one of your cluster nodes and make that the shared storage?

From a business perspective, it would probably take a good length of time to satisfy your very edge case when solutions are already available.
 
Except your use case is very unusual. You have a cluster yet no shared storage? Could you not repurpose one of your cluster nodes and make that the shared storage?

From a business perspective, it would probably take a good length of time to satisfy your very edge case when solutions are already available.

You have no idea what's usual or unusual. You have no information about what counts as an edge case regarding the storage decisions of Proxmox users. And how could you? Are you a Proxmox developer? Do you have any hard data, or have you talked to that many different people running Proxmox clusters?

You are talking nonsense about edge cases and business perspectives, and, as a two-week-old forum user, you feel the need to tell us what to do with our cluster, while we have been using Proxmox in production for 8 years now. The only thing you have detailed information about is your own cluster (if you have one at all), and you haven't got the faintest idea about our customers and the performance requirements behind our node and cluster design.

Yes, we have a cluster, but no, we don't run KVM disks from NAS/SAN or Ceph, because these solutions carry a severe I/O performance penalty, especially on gigabit Ethernet. We do have shared NFS storage, of course, but it's only used for backups, not live disks. We have based these decisions on a lot of benchmark data gathered over the past years, not on forum posts by people who act like experts yet have neither the data nor the experience.

So thanks for pointing out that you personally don't need KVM snapshot migration, but please try to be tolerant of the fact that others (like us) do. And since the technology is readily available and already used in backup jobs, it's probably not a big ask of the developers to use those same snapshots in migration jobs.
 
You seem suitably mad, so I'll stop offering advice. The response from a Proxmox staff member above said that no one is working on it, and there doesn't seem to be much demand for such a feature. Make of that what you will. If your business/customers have high performance requirements yet you aren't willing to spend the relatively small amount of money on a 10G backbone (if that's what you think it would take to get decent performance) or on a flash array for the IOPS-intensive disks, there's no helping you.

Have you considered running whatever this VM does with a cloud provider? AWS/Azure can give you a great deal of IOPS and compute power if the tasks aren't continuous.
 
You seem suitably mad, so I'll stop offering advice. The response from a Proxmox staff member above said that no one is working on it, and there doesn't seem to be much demand for such a feature. Make of that what you will. If your business/customers have high performance requirements yet you aren't willing to spend the relatively small amount of money on a 10G backbone (if that's what you think it would take to get decent performance) or on a flash array for the IOPS-intensive disks, there's no helping you.

Have you considered running whatever this VM does with a cloud provider? AWS/Azure can give you a great deal of IOPS and compute power if the tasks aren't continuous.

Looks like there is no stopping you; you can't even keep your own promise to stop offering "advice". Dude, seriously:

1. No one asked your opinion on the price of a 10G backbone in relation to our business. Asking you would be pointless, since you do not possess the facts, the benchmarks, or the information needed for such advice. Yet you give it anyway.

2. I'm not even sure how you came to the conclusion that we need to spend money on a flash array for a workload you aren't familiar with in the first place. FYI, we are already running all-flash RAIDZ arrays, precisely because of our performance needs (and those arrays would probably perform poorly even over 10G compared to what they can do as local storage). Yet there you go again, giving advice that no one asked for, in matters you are not familiar with.

How can I put this: please stop hijacking my thread. And while you are at it, you could stop posing as an expert in matters where you have no experience, no facts, no benchmark data and no statistics to speak of. I would rather converse with people who have needs similar to ours, and who don't try to persuade me that we are doing something wrong, when we have 10 years of virtualization experience in the field we operate in.
 
@gkovacs - the feature you request is reasonable, and why you need it has been well explained. As you point out, the implementation should not be terribly difficult since all of the required parts already exist; it's mainly a matter of pulling them together and testing (which, to be fair, may not be trivial). While I don't have any specific need for this, I have also often wondered why it's not supported, since it seems not too difficult to include. Both ESX and Hyper-V allow migration from local storage (a combined storage/live migration). I'm sure they didn't do it just for fun - so there must be some demand. In fact, when it was introduced to Hyper-V as part of Server 2012, MS did a lot of press on just how important this feature was.

Don't let pretenders/haters get under your skin. It's clear that he speaks without facts or data and has no understanding of your application. I don't either - but I can confirm you've presented a reasonable ask for what appear to be reasonable reasons. I'd suggest you don't even respond to or acknowledge him if he chooses to jump in with further comment.

Keep asking the real implementers & staff members here. I've seen in the past that while they may not immediately get the motivation, they can be reasoned with.
 
Basically, both of the technologies needed for the job are available: storage migration and VM migration. Since both support online and offline migration, it should be fairly easy to make a combined storage and VM migration, both offline and online. The only tricky part is how to make it 'atomic', e.g. it might not be possible, or even desirable, to roll back a successful storage migration.
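A sketch of that ordering concern, with the step functions as hypothetical stand-ins rather than Proxmox code: before the switch-over a failed attempt can simply drop the half-finished copy, but once the guest runs on the target the source copy is already stale, so the only safe direction is forward.

Code:
# Illustration of the 'atomic' problem when combining the two existing pieces
# (storage migration + VM migration): where rollback is still possible.
class MigrationError(RuntimeError):
    pass

def copy_disk_to_target():   print("bulk disk copy (guest keeps running)")
def migrate_vm_state():      print("suspend, final disk delta, RAM/state transfer")
def destroy_target_copy():   print("rollback: drop the half-finished target copy")
def delete_source_disk():    print("commit: source disk no longer needed")

def combined_migration():
    copy_disk_to_target()
    try:
        migrate_vm_state()
    except MigrationError:
        destroy_target_copy()     # before the switch-over, rollback is cheap
        raise
    # After the guest runs on the target, the source copy is stale; rolling
    # back a successful storage migration is neither possible nor desirable.
    delete_source_disk()

combined_migration()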
 
In fact, when it was introduced to Hyper-V as part of Server 2012, MS did a lot of press on just how important this feature was.

And it sure was. I really like the idea of shared-nothing live migration.

I'd very much like such a feature as @gkovacs requests, too. Based on ZFS, it could be used for a very low-cost two-node system without shared storage. Yet @mir is right: the atomic part is crucial, and the live-migration downtime would be a bit longer than with "real" shared storage.
 
QEMU already includes some block migration code, so it would be best to use that. But most people simply use shared/distributed storage instead, which handles live migration much faster ...

@dietmar is this what you were talking about?

Live disk migration with libvirt blockcopy
https://kashyapc.com/2014/07/06/live-disk-migration-with-libvirt-blockcopy/

According to that document, all the facilities needed for non-shared storage migration are already included in QEMU / libvirt.
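For what it's worth, a rough sketch of the workflow from that post, driven through virsh (Proxmox itself does not use libvirt, so this only demonstrates that the underlying QEMU capability exists; the domain name, disk target and destination path are made-up examples):

Code:
# Hedged sketch of libvirt's blockcopy: mirror a running guest's disk to a
# new location, then pivot the guest onto the copy once it is in sync.
# The linked post also covers making the domain transient first, omitted here.
import subprocess

DOMAIN = "vm100"                        # hypothetical libvirt domain
DISK = "vda"                            # disk target inside the guest
DEST = "/mnt/target/vm100-vda.qcow2"    # path reachable from the destination

subprocess.run(
    ["virsh", "blockcopy", "--domain", DOMAIN, DISK, DEST,
     "--wait", "--verbose", "--pivot"],
    check=True,
)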
 
