Long-time lurker on this forum, first-time poster. I have read many threads that are close variations of my issue, but none that match it exactly. I have already resolved the issue to my satisfaction, but I have some questions about how Proxmox's deeper features work and whether I should really be relying on the fix I found, which lets me make backups without crashing the host.
Anyway here we go:
This was a budget build for a small business, and I recognize it is not really up to snuff compared with purpose-built VEs, but it is what I am working with, and the business would like to get several more years of use out of it.
VE Hardware:
CPU - AMD Opteron 6168 1.9 GHz Processor
-12 Core Processor, 128 KB L1 Cache, 512KB L2 Cache (per Core), 12MB L3 Cache (per Socket)
MOBO - Supermicro H8SGL-F Motherboard - AMD Magny-Cours Single Socket with on-board IPMI
-CPU: Single 1944-pin Socket G34, supports one twelve/eight-core AMD Opteron 6100 Series processor, supports HT3.0 Link Technology
-Chipset: AMD chipset SR5650 + SP5100
-Memory: 8x 240-pin DDR3-1333/1066/800 DIMMs, supports up to 128GB ECC/REG Memory
-Slots: 1x PCI-Express 2.0 x16 Slot; 2x PCI-Express 2.0 x8 Slots; 3x PCI Slots
-Video: Matrox G200 Graphics Controller, w/ 16MB DDR2 Video Memory
-LAN: Dual Intel 82574L Gigabit Ethernet Controllers
RAM - 32GB 1333MHz Quad-Rank non-ECC
RAID - LSI MegaRAID SAS 9260-4i
-4x Seagate SATA 3.0 RAID capable 2TB HDs (forgot which ones exactly)
-4x disks running in a 4-disk RAID6
SATA - 1x Seagate SATA 3.0 2TB HD connected to MOBO SATA (for backups only)
PSU - 800W AthenaPower Server PSU (non-redundant)
VE Software:
PROXMOX - Virtual Environment 4.4-1/eb2d6f1e
RAID - megaraid_sas driver and megactl management and reporting tools
UPS - Cyberpower 1500PFCLCD managed via nut
VMs:
-2x Windows Server 2008 R2 x64 installed via best practices guide from this forum
-2x Windows Server 2016 x64 installed via best practices guide from this forum
The Problem:
Until recently I had been completely unable to run a backup of any of the VMs without the VE almost immediately becoming unresponsive and eventually crashing (but not rebooting on its own). Every single time I tried to run a backup, the VE would become unresponsive within ~3 minutes and crash completely within ~5 minutes.
The Solution:
What eventually solved this problem was simply disabling RAM ballooning and KSM. I had installed the ballooning driver in all of the Windows VMs and believe it was functioning correctly. But once I disabled KSM, backups started running reliably (and very quickly). I have run about a dozen backup jobs since disabling ballooning across all VMs and disabling KSM, and every backup has worked perfectly.
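In case it helps anyone checking the same thing on their own host, here is a minimal Python sketch of how the current state could be verified. It assumes the standard Linux KSM sysfs directory `/sys/kernel/mm/ksm` and the usual Proxmox VM config location `/etc/pve/qemu-server/<vmid>.conf`, where a `balloon: 0` line means ballooning is off. The function names and the VM ID 100 are made up for illustration, so treat this as a sketch rather than a tested tool.

```python
from pathlib import Path

# Assumption: standard Linux sysfs location for KSM counters.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_ksm_status(ksm_dir=KSM_DIR):
    """Read a few KSM counters as ints; unreadable files are reported as None."""
    status = {}
    for name in ("run", "pages_shared", "pages_sharing"):
        path = Path(ksm_dir) / name
        try:
            status[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            status[name] = None
    return status


def ksm_is_active(status):
    """KSM merges pages only while 'run' is 1 (0 = stopped, 2 = stopped and unmerged)."""
    return status.get("run") == 1


def balloon_disabled(config_text):
    """Return True if the main section of a VM config contains 'balloon: 0'."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # Snapshot sections begin with '[name]'; only the main section counts.
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "balloon":
            return value.strip() == "0"
    return False


if __name__ == "__main__":
    status = read_ksm_status()
    print("KSM status:", status, "active:", ksm_is_active(status))
    # Hypothetical VM ID 100; substitute the IDs of your own VMs.
    conf = Path("/etc/pve/qemu-server/100.conf")
    if conf.exists():
        print("VM 100 ballooning disabled:", balloon_disabled(conf.read_text()))
```

With KSM stopped, `run` should read 0 (or 2 if the pages were also unmerged), and `pages_sharing` should drop toward 0.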
Next Steps:
First of all, is this normal? Should I have disabled ballooning and KSM from the start because of known issues with Windows VMs, and I had just never noticed that recommendation somehow?
If not, and I should be able to use ballooning and KSM with Windows VMs and still back them up normally, I would like to conduct testing to try to figure out why they were causing my backup jobs to crash. If anyone can point me to the appropriate logs and how to access them so I can trace the root issue, that info would be greatly appreciated.
And of course, if any additional info about my setup or configuration would be helpful, please don't hesitate to ask.
And thanks already to this forum for helping me with numerous past issues that did not necessitate me posting a thread or even a comment!