We're in the process of setting up a Proxmox cluster, and I wanted to share my thoughts about the hardware we chose, and suggest a few improvements to the wiki.
For our cluster, in the short term, we've decided on three nodes, with one of them being a simple FreeNAS quorum and backup storage server. The two KVM nodes will run DRBD. In the long term, we'll move to a NetApp-served iSCSI implementation.
As far as hardware goes, for each of the two KVM servers we went with:
- Supermicro 6027R-3RF4 for the chassis
- Two six-core processors
- 64 GB of memory
- Adaptec 6405 RAID card with AFM-600 flash module for write-back cache
- Redundant 250GB system drives
- Four Seagate ES.3 2TB drives in a RAID 10 configuration
The motherboard has the Intel i350 quad-port gigabit NIC controller: one port is connected to a protected public network, two ports are bonded directly between the two KVM servers for DRBD and live migration, and one port is connected to the internal VLAN for the quorum server. The dedicated Realtek IPMI interface is connected to that private VLAN as well.
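For anyone curious what the direct bond between the two KVM nodes looks like, here's a minimal sketch of the relevant /etc/network/interfaces stanza on one node; the slave interface names (eth2/eth3), the bond mode, and the 10.0.10.x address are just placeholders for whatever your setup actually uses:
# /etc/network/interfaces (excerpt) - back-to-back bond used for DRBD and live migration
auto bond0
iface bond0 inet static
        address 10.0.10.1
        netmask 255.255.255.0
        slaves eth2 eth3
        bond_miimon 100
        bond_mode balance-rr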
The i350 has a feature called Virtual Machine Device Queues, which I'm curious about. Here's a snippet from Intel:
"Virtual Machine Device Queues (VMDq) is a technology designed to offload some of the switching done in the VMM (Virtual Machine Monitor) to networking hardware specifically designed for this function. VMDq drastically reduces overhead associated with I/O switching in the VMM which greatly improves throughput and overall system performance."
In the official Intel brief I linked to above, it mentions that the feature is supported by VMware and Microsoft...I'm curious whether it's supported in Proxmox?
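I haven't dug into it yet, but a quick way to see what the igb driver on the Proxmox host actually exposes is below; whether anything VMDq-related shows up will depend on the kernel and driver version, so treat this as a rough check rather than an answer:
# which driver (and version) is bound to the i350 ports
ethtool -i eth0
# list the module parameters the running igb driver offers
modinfo igb | grep '^parm'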
We set up RAID-1 on the system drives with mdadm based on these instructions, which worked very well. This leads me to my other big point: suggestions for wiki improvements. I agree with the decision not to support software RAID for image hosting, but I think there should at least be a wiki page on setting up system-drive redundancy for the case where you have shared storage or dedicated hardware RAID for the images.
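For reference, converting an already-installed system disk into a mirror roughly boils down to the steps below; the device names (/dev/sda, /dev/sdb) are just examples, and the instructions we followed cover the parts I'm glossing over here (bootloader, initramfs, fstab):
# create a degraded RAID-1 on the second, empty disk, leaving a slot for the original
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
# ...copy the system over, update grub and the initramfs, reboot onto /dev/md0...
# then pull the original disk's partition into the mirror and watch it sync
mdadm --add /dev/md0 /dev/sda1
cat /proc/mdstat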
On the DRBD page, in the "Disk for DRBD" section, you should mention that fdisk doesn't work on partitions larger than 2 TB, and that if you need to go larger, you need to use gparted.
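gparted did the job for us; if someone wants a command-line example for that section, something along these lines should be equivalent with parted (it's the GPT label, rather than MBR, that lifts the 2 TB limit; /dev/sdc is just a placeholder for the DRBD backing device):
# GPT partition tables allow partitions beyond 2 TB
parted /dev/sdc mklabel gpt
# use the whole disk for the DRBD backing partition
parted -a optimal /dev/sdc mkpart primary 0% 100%
parted /dev/sdc print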
On the Proxmox VE 2.0 Cluster page, you say "Changing the hostname and IP is not possible after cluster creation." However, I found this not to be the case...you may not be able to change the hostname, but you can change the IP by editing /etc/hosts and running ssh-copy-id. In my case, I had to move the cluster to a different IP configuration and ran into this issue, which I had missed when reading the wiki the first time. I was relieved to find out I didn't have to start over, which is what it was starting to look like. Also, I had mistakenly set up the cluster over the public interface by using the public IP with 'pvecm add'...you may want to make it clear that if you have a private connection for DRBD, you should specify the IP of that interface when you set up the cluster, so that live migration can benefit from the improved bandwidth.
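In other words, when joining the second node, something like the following, where 10.0.10.1 stands in for whatever private address the first node has on the DRBD/migration bond:
# on the node being added: join via the existing node's private bond address, not its public IP
pvecm add 10.0.10.1
# afterwards, confirm which addresses the cluster members are actually using
pvecm status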
That's all I've got so far. Next week I'll set up the quorum and fencing, then finish implementing the HA configuration, and finally I'll need to implement backups. I'll update this thread with my thoughts on those projects.