Most likely I will use a 128 GB data center SSD in every dedicated server as the installation/boot drive, with ext4 as the format for the Proxmox base.
For VMs, I will format every disk as ext4 (screenshots: https://prnt.sc/S9E7IdSxzCIz , https://prnt.sc/QQGXL2cg8aKT ).
I will use TrueNAS for backup; it should be RAID 5 or 10, and the Proxmox backup manager will handle the schedule.
Btw, if I reinstall Proxmox, can I keep using the VMs on the disk without losing data? How should I do that?
Probably I will use Ceph with 10G network cards and 15-20 servers per cluster.
If you have those data center SSDs left over, that's a good way of putting them to some use; the load on them should be very light.
Note that Proxmox does seem to support a redundant ZFS set for boot; it's just that I've never tried it, because my first hardware targets are actually NUCs, which only have a single NVMe drive, which I therefore had to partition.
But testing the functionality and effectiveness of a redundant ZFS boot is easy if you use a "virtual infra" for your failure-scenario testing, which is much faster than doing it on real hardware.
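As an example, here is roughly how such a drill could look on a disposable test VM. This is a minimal sketch, assuming two spare virtual disks at /dev/vdb and /dev/vdc; the device names and pool name are made up, so adjust them to your test VM:

```python
#!/usr/bin/env python3
"""Failure-scenario drill for a redundant ZFS pool on a throwaway test VM.

Assumes two spare virtual disks (/dev/vdb, /dev/vdc -- placeholders) and
zfsutils installed. Run as root inside the test VM only, never on a host
with data you care about.
"""
import subprocess

POOL = "bootdrill"                    # throwaway pool name (assumption)
DISKS = ["/dev/vdb", "/dev/vdc"]      # spare virtual disks (assumption)

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Build a two-way mirror, the layout you'd want for a redundant boot pool.
run("zpool", "create", "-f", POOL, "mirror", *DISKS)

# Simulate losing one side of the mirror...
run("zpool", "offline", POOL, DISKS[0])
run("zpool", "status", POOL)          # pool should show DEGRADED but stay usable

# ...and bringing it back; ZFS resilvers automatically.
run("zpool", "online", POOL, DISKS[0])
run("zpool", "status", POOL)

run("zpool", "destroy", POOL)         # clean up the drill
```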
I've done all my higher level erasure code failure testing for Ceph using virtual hardware, because I just didn't have sufficient hardware available for QA testing.
Don't format any storage you plan to use with Ceph; Ceph will eat those disks whole and write its own on-disk data structures for storage.
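If you want to see which disks Ceph could take whole before committing, something like this sketch can help. The selection heuristic (no partitions, no mountpoint) is my assumption, and it only prints the pveceph commands instead of running them, so you can review first:

```python
#!/usr/bin/env python3
"""List unpartitioned, unmounted disks that Ceph could take whole.

Prints the 'pveceph osd create' commands instead of running them. The
device-selection logic is a rough heuristic (assumption); adjust it for
your hardware naming before trusting it.
"""
import json
import subprocess

lsblk = json.loads(subprocess.run(
    ["lsblk", "--json", "-o", "NAME,TYPE,MOUNTPOINT,SIZE"],
    capture_output=True, text=True, check=True).stdout)

for dev in lsblk["blockdevices"]:
    if dev["type"] != "disk":
        continue
    # Skip anything with partitions or a mountpoint: likely boot/OS disks.
    if dev.get("children") or dev.get("mountpoint"):
        continue
    print(f"# {dev['name']}: {dev['size']}, no partitions -> Ceph candidate")
    print(f"pveceph osd create /dev/{dev['name']}")
```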
What you use inside the VMs is entirely up to you and the OS you run there. Whatever is simplest is best, and I guess with thousands of VMs you'll want an automated setup anyway. You'll also want a good plan for how you allocate and monitor storage inside those VMs, to ensure you're not overcommitting without good proactive monitoring. Few systems, applications or users deal nicely with storage they believe is there turning out not to be.
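As a sketch of what such proactive monitoring could look like: a rough overcommit check using the community proxmoxer library. The host, credentials and the one-node storage view are placeholders and simplifications, not a recommendation for a real setup:

```python
#!/usr/bin/env python3
"""Rough overcommit check: disk space provisioned to VMs vs. real storage.

Sketch only, using the community 'proxmoxer' library. Host and credentials
are placeholders (assumptions); it also simplifies by reading storage
status as seen from a single node.
"""
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI("pve.example.com", user="root@pam",
                     password="secret", verify_ssl=False)  # placeholders

nodes = [n["node"] for n in proxmox.nodes.get()]

# Sum up the virtual disk space promised to every VM on every node.
provisioned = sum(vm.get("maxdisk", 0)
                  for node in nodes
                  for vm in proxmox.nodes(node).qemu.get())

# Compare with what the storage backends actually have (seen from one node).
backends = proxmox.nodes(nodes[0]).storage.get()
total = sum(s.get("total", 0) for s in backends)
used = sum(s.get("used", 0) for s in backends)

print(f"provisioned to VMs : {provisioned / 2**40:.1f} TiB")
print(f"backing storage    : {used / 2**40:.1f} / {total / 2**40:.1f} TiB")
if total and provisioned > total:
    print(f"overcommit ratio   : {provisioned / total:.2f}x -- thin pools at risk")
```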
But do make sure that you have discard/TRIM support within that OS and for the virtual disks you create, leave some physical spare area, and discard eagerly to keep your NVMe storage fast and healthy.
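A quick way to audit that on the Proxmox side is to scan the VM configs for discard=on. Again a sketch via proxmoxer with placeholder credentials, flagging disks that don't request discard:

```python
#!/usr/bin/env python3
"""Audit VM disk options for 'discard=on' (sketch, proxmoxer assumed).

Host and credentials are placeholders. Flags any virtio/scsi/sata/ide
disk line in a VM config that doesn't request discard, so the guest's
trim actually reaches the thin storage underneath.
"""
import re
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI("pve.example.com", user="root@pam",
                     password="secret", verify_ssl=False)  # placeholders

DISK_KEY = re.compile(r"^(virtio|scsi|sata|ide)\d+$")

for node in proxmox.nodes.get():
    for vm in proxmox.nodes(node["node"]).qemu.get():
        config = proxmox.nodes(node["node"]).qemu(vm["vmid"]).config.get()
        for key, value in config.items():
            if not DISK_KEY.match(key) or "media=cdrom" in str(value):
                continue
            if "discard=on" not in str(value):
                # Print the fix instead of applying it, so you can review.
                print(f"VM {vm['vmid']} {key}: no discard -> "
                      f"qm set {vm['vmid']} --{key} {value},discard=on")
```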
Proxmox doesn't do OVA exports or similar, but it is very good at backups. You always have a backup, so the following is only about being faster.
A good hypervisor won't touch storage on re-installs. The best I've seen there was XCP-ng, which would just pick up whatever was left there from before, even across releases. With oVirt/RHV there is a very smart management engine that uses a database, so even if no data is destroyed on an HCI re-installation, without a backup of the management engine it's not much use. And it's been rather bad on full generational updates...
With Proxmox, I have not tried yet. But perhaps now I will. If I do, I'll make sure to do it on a virtual farm first (and with snapshots of the not-yet-destroyed state), because it's too much loss and work otherwise.
But generally I just try to avoid a full global re-install of Proxmox or any other hypervisor (oVirt/RHV mostly) and go node-by-node instead: I move my VMs away from whatever I want to re-install and then just move them back afterwards. If it's a full generational (or platform) switch without backward compatibility (that did happen elsewhere), I've worked with OVA exports and imports, which worked there.
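Draining a node that way is easy to script. A minimal sketch (placeholder node names, proxmoxer again) that live-migrates everything off the node about to be rebuilt, equivalent to running 'qm migrate <vmid> <target> --online' per VM:

```python
#!/usr/bin/env python3
"""Drain a node before re-install: live-migrate its VMs elsewhere.

Sketch with placeholder host, credentials and node names (assumptions).
Stopped VMs are skipped here for brevity; they migrate offline anyway.
"""
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI("pve.example.com", user="root@pam",
                     password="secret", verify_ssl=False)  # placeholders

SOURCE, TARGET = "pve03", "pve04"   # node to rebuild, node taking the VMs

for vm in proxmox.nodes(SOURCE).qemu.get():
    if vm["status"] == "running":
        print(f"migrating VM {vm['vmid']} ({vm.get('name', '?')}) -> {TARGET}")
        proxmox.nodes(SOURCE).qemu(vm["vmid"]).migrate.post(
            target=TARGET, online=1)
```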
While Proxmox doesn't do OVA, the disk images are really just QCOW2, and those are easy to handle.
I've transported quite a few VMs from VMware and VirtualBox to oVirt/RHV, and again quite a few of those to Proxmox.
Far too manual for my taste, but if you have really massive volumes, you can write tools to automate that.
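As a sketch of such a tool: batch-convert the foreign images with qemu-img and stage them for qm importdisk. All paths and the VMID numbering below are assumptions for illustration:

```python
#!/usr/bin/env python3
"""Batch-convert VMware disk images to QCOW2 for import into Proxmox.

Sketch: source/staging paths and the VMID numbering are assumptions.
After conversion, 'qm importdisk <vmid> <image> <storage>' attaches the
disk to a VM shell you created beforehand.
"""
import pathlib
import subprocess

SRC = pathlib.Path("/mnt/exports")        # where the .vmdk files sit (assumption)
DST = pathlib.Path("/var/lib/vz/import")  # staging area for qcow2 (assumption)

for vmid, image in enumerate(sorted(SRC.glob("*.vmdk")), start=9000):
    out = DST / f"{image.stem}.qcow2"
    # qemu-img autodetects the source format; -O picks the output format.
    subprocess.run(["qemu-img", "convert", "-p", "-O", "qcow2",
                    str(image), str(out)], check=True)
    print(f"qm importdisk {vmid} {out} local-lvm   # attach to pre-created VM {vmid}")
```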
Remember, Proxmox is mostly just an API on top of commodity Linux technology: Ceph is Ceph, with or without Proxmox, and ZFS, LVM and file systems are just holding standard QCOW2 files representing VM disks for use with KVM/QEMU.