Upgrade and error fixing advice - v2.3

MattE

Member
Jan 12, 2018
I've been running Proxmox v2.3 since Feb 2013 on an i7-2600K, Asus Sabertooth P67 mobo, 2x4GB F3-12800CL9-4GBXL RAM. Proxmox is installed on a 100GB partition on a 1TB Samsung Spinpoint F1 HDD and the rest is partitioned for pve-data.

I run a couple of Ubuntu Server VMs and a couple of Win7 VMs. One of the Ubuntu servers hosts a web page via Apache2, and twice now (months apart) I've noticed the website lose access to data on its virtual HDD (virtio0, raw). Apache still responds but it serves up old data. After rebooting the VM, it finds disk errors and repairs them. Then everything seems to work again until the next time. This last time I connected a keyboard/mouse/LCD to the Proxmox host and saw the errors shown in the attached screenshot. I'm thinking my main HDD is at fault? I think it has only affected the one VM so far, unless the others somehow deal with it gracefully.

So if I'm replacing my main HDD, should I try an SSD and install ver 5 at the same time? Should my VMs be compatible with the newer version? Is it a bad idea to try upgrading from v2.3 to v5.x?

Thanks!
 

Attachment: 20180103_110654.jpg

VMs (virtual disks and .conf files) are compatible. An SSD is an advantage; whether to use one is mainly an economic question.
 
Is this a single drive that stores both Proxmox and the VMs? I would start by backing up critical user data within each VM to a backup location. Then add a second disk as a directory, and start backing up the VMs themselves.

This link shows how to add a second disk as local storage (you just have to add the new disk as a directory and configure it for backups in the Proxmox GUI under Datacenter > Storage after a reboot). Just a warning: do not attempt ANYTHING without backing up your VM data and the VMs first. ;)

https://www.linuxtutorial.co.uk/proxmox-add-a-second-hard-drive-to-node-for-backups/

BTW you can quickly check your current drive health status using "smartctl -H /dev/sda"
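
A minimal sketch of scripting that check in Python, assuming smartmontools is installed and the drive is /dev/sda as in the command above. The function names are made up for illustration, and the parser only looks for the summary lines smartctl normally prints for ATA ("PASSED") and SCSI ("OK") devices.

```python
import subprocess


def parse_smart_health(output):
    """Return True/False for a pass/fail verdict, or None if no verdict line is found."""
    for line in output.splitlines():
        line = line.strip()
        # ATA drives: "SMART overall-health self-assessment test result: PASSED"
        if line.startswith("SMART overall-health self-assessment test result:"):
            return line.split(":", 1)[1].strip() == "PASSED"
        # SCSI/SAS drives: "SMART Health Status: OK"
        if line.startswith("SMART Health Status:"):
            return line.split(":", 1)[1].strip() == "OK"
    return None


def check_drive(device="/dev/sda"):
    """Run `smartctl -H` on the device (usually needs root) and parse the verdict."""
    # smartctl uses its exit code as a set of status bits, so a non-zero
    # code is not treated as a Python error here (no check=True).
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return parse_smart_health(result.stdout)
```

Keep in mind that a passing verdict only says no SMART threshold has been crossed yet; a drive can still throw I/O errors while reporting PASSED.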

After a full VM backup, I would take the disks out and not touch them (for now). Consider adding extra memory (should be at least 16GB total) and installing the latest Proxmox on at least 2 new drives (as ZFS raid1) or 4 drives (as ZFS raid10). Then mount the disk used to store your VM backups and restore to the new Proxmox install.

If you need something quick and dirty, theoretically you can simply clone the drive onto another drive (same size or larger) using CloneZilla.
But I would not attempt this unless both the VM data and VMs are already backed up somewhere safe.
 
Yes, a single drive for both the OS and storage of the VMs. So far, only the one VM has this problem. My other Ubuntu VM and Windows 7 VMs don't exhibit any filesystem issues. When I notice that the filesystem is locked (web server serves up old data), I can log into the VM via ssh, and it tells me that the filesystem is locked as read-only. After rebooting and connecting via Proxmox's web console, the VM is waiting for user input on whether to fix filesystem errors automatically, ignore them, or fix them manually. If I tell it to fix automatically, it fixes a page of orphaned inodes and everything seems to be working fine again on the VM. While the VM is fixing its filesystem, the Proxmox host's console output shows more of the failed commands seen in the screenshot attached to my first post.
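
To catch the read-only state before the web server starts serving stale data, something like the following could run inside the VM from cron. This is only a sketch: it assumes the usual /proc/mounts layout (device, mount point, type, options, ...), and the function names and the /dev/vda1 sample line are illustrative, not taken from the affected VM.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains a bare 'ro'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Compare whole options, so 'errors=remount-ro' does not count as 'ro'
        if "ro" in options:
            found.append(mount_point)
    return found


def current_read_only_mounts(path="/proc/mounts"):
    """Read the live mount table and report read-only mount points."""
    with open(path) as f:
        return read_only_mounts(f.read())


# Made-up sample of a root filesystem that was remounted read-only
sample = "/dev/vda1 / ext4 ro,relatime,errors=remount-ro 0 0\n"
print(read_only_mounts(sample))
```

Some pseudo-filesystems are legitimately mounted read-only, so in practice the result would probably be filtered down to the mount points that matter (for example "/").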

I have ordered 2 new 500GB Samsung 850 EVO SSDs for this. Can the Proxmox installer guide me through creating the ZFS raid1 array?
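
For a rough sense of what the two layouts suggested earlier in the thread would yield, here is a small sketch of the capacity arithmetic for mirrored pairs. It is an approximation only: it ignores ZFS metadata overhead and GB/GiB differences, and the function name is hypothetical.

```python
def mirror_usable_gb(drive_sizes_gb, drives_per_mirror=2):
    """Approximate usable space of striped mirrors.

    One mirror is the raid1 case; several mirrors striped together
    is the raid10 case.
    """
    if not drive_sizes_gb or len(drive_sizes_gb) % drives_per_mirror != 0:
        raise ValueError("drive count must be a non-zero multiple of the mirror width")
    total = 0
    for i in range(0, len(drive_sizes_gb), drives_per_mirror):
        # Each mirror can only hold as much as its smallest member
        total += min(drive_sizes_gb[i:i + drives_per_mirror])
    return total


print(mirror_usable_gb([500, 500]))            # two 500GB drives as raid1 -> 500
print(mirror_usable_gb([500, 500, 500, 500]))  # four 500GB drives as raid10 -> 1000
```

In other words, a raid1 pair of 500GB drives gives roughly the space of one drive, in exchange for surviving the loss of either one.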
 

Attachment: smart data.txt
Here's the disk IO for the troubled VM. Shortly before 7 am is when the filesystem became locked.
 

Attachment: diskIO.JPG
@MattE you might want to reconsider the brand/type of SSD for Proxmox. It's never a good idea to use a "consumer" SSD in a production server. Here's a list of recommended SSDs:

https://forum.proxmox.com/threads/proxmox-slow-with-ssd-disks-on-lsi-9260-8i.39127/#post-193743

Also consider a solid backup strategy to protect yourself from these kinds of issues:

https://www.backblaze.com/blog/the-3-2-1-backup-strategy/

Regarding your troubled VM, I would make a copy of the VM to external storage, and then you can try to back up your data files using a Linux rescue CD or scan the VM for errors. Again, you need to make a backup of the physical drive and the VMs before doing anything:

https://forum.proxmox.com/threads/corrupted-vm.21718/
 
After a couple of quick searches, this unit seems to offer decent performance for the money?
https://m.newegg.ca/products/2U3-000X-00009

I don't think I have the budget for two of these. I could possibly afford one in place of those two EVOs.

How about platter drives? Is there a similar compiled list of benchmarks on HDDs?

Most of my VMs are not very write intensive except the one. It has about 75 RRDs (Round Robin Databases) that get updated every 2-10 mins. What are good options for that? Is there a Linux method/system of using RAM for 1 hr blocks of data to minimize disk writes?
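
The idea being asked about is write-back buffering: hold updates in RAM and write them out in one batch per interval. RRDtool ships a caching daemon, rrdcached, that is built for this purpose; the Python sketch below only illustrates the general technique and is not how rrdcached is implemented. The class name, `flush_fn`, and the one-hour default are assumptions for illustration.

```python
import time


class WriteBackBuffer:
    """Hold updates in RAM and hand them to `flush_fn` in batches."""

    def __init__(self, flush_fn, max_age_seconds=3600, clock=time.monotonic):
        self.flush_fn = flush_fn              # called with a list of pending updates
        self.max_age_seconds = max_age_seconds
        self.clock = clock                    # injectable so the timing can be tested
        self.pending = []
        self.oldest = None                    # time the oldest pending update arrived

    def add(self, name, timestamp, value):
        """Queue one update; flush if the oldest queued update is too old."""
        if self.oldest is None:
            self.oldest = self.clock()
        self.pending.append((name, timestamp, value))
        if self.clock() - self.oldest >= self.max_age_seconds:
            self.flush()

    def flush(self):
        """Write out everything queued so far and start a fresh batch."""
        if self.pending:
            self.flush_fn(self.pending)
        self.pending = []
        self.oldest = None
```

The trade-off is that anything still sitting in RAM is lost if the VM crashes or loses power, so the interval is a balance between fewer disk writes and how much recent data you can afford to lose.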
 
For traditional spinners, 10K/15K SAS drives on SAS controllers in direct HBA/IT mode will get the most out of ZFS. For traditional SATA, I would go with enterprise drives like the Seagate "Constellation", Western Digital "RE4" series, or Hitachi "Ultrastar", and always in raid1 (absolute minimum) or raid10 (4 drive minimum). I've had good luck with the WD "Red" series as well. If cost is an issue, you can probably get away with WD Blue in raid1 or raid10 (note that the rated URE spec on WD Blue drives is worse than on the enterprise drives, btw). YMMV, but I've had excellent luck with WD RE4 drives with low power-on hours (under 10,000) from eBay. You'll have to contact the seller to check the number of hours and make sure they pack the drive well, and then do a full WD diagnostic/surface scan once you receive the drives.
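
Once a used drive arrives, the power-on hours can be read from its SMART attribute table (`smartctl -A`). A hedged sketch of parsing that output follows; it assumes the common ATA attribute layout where attribute 9 is Power_On_Hours and the raw value is the tenth column. The sample line is made up, and the function names are hypothetical.

```python
import re


def power_on_hours(smartctl_a_output):
    """Return the raw Power_On_Hours value from `smartctl -A` output, or None."""
    for line in smartctl_a_output.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == "9" and fields[1].startswith("Power_On_Hours"):
            # Some drives append minutes/seconds to the raw value; keep the leading digits
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


def low_hours(smartctl_a_output, limit=10000):
    """True if the drive reports fewer power-on hours than `limit`."""
    hours = power_on_hours(smartctl_a_output)
    return hours is not None and hours < limit


# Illustrative attribute row, not from a real drive
sample = "  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7421"
print(power_on_hours(sample), low_hours(sample))
```

The 10,000 hour limit simply mirrors the figure mentioned above; pick whatever cutoff you are comfortable with.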

Whichever you choose, you'll still need a solid 3-2-1 backup plan for the data and the VMs themselves. I can't stress that enough ;)
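
As a quick way to sanity-check a plan against the 3-2-1 rule (at least 3 copies, on at least 2 different kinds of media, with at least 1 copy offsite), here is a small sketch. The `Copy` record and the media labels are hypothetical; only the rule itself comes from the backup strategy linked earlier in the thread.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    label: str     # free-form description of the copy
    media: str     # e.g. "internal-ssd", "internal-hdd", "cloud"
    offsite: bool  # True if stored away from the server's location


def meets_3_2_1(copies):
    """Check: >= 3 copies, >= 2 distinct media types, >= 1 offsite copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


plan = [
    Copy("live VM disks", "internal-ssd", False),
    Copy("VM backups on second disk", "internal-hdd", False),
]
print(meets_3_2_1(plan))   # two local copies only
plan.append(Copy("offsite copy of backups", "cloud", True))
print(meets_3_2_1(plan))   # third copy added, stored offsite
```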
 
How about the WD Black series? I already have three 4TB WD Black drives I could put into an array. Is there any place for a consumer SSD as a cache or something else?
 
https://forum.proxmox.com/threads/proxmox-slow-with-ssd-disks-on-lsi-9260-8i.39127/#post-193743

I ran the fio tests discussed here on a Mushkin Chronos MX 240GB SSD. I ran some tests while it was plugged into the P67 SATA2 (3G) controller and also while connected to the SATA3 (6G) controller. I feel like the speeds were pretty decent for a consumer SSD. That being said, both times the drive eventually "died" and I lost the connection while fio was running.
fio: pid=6400, err=5/file:engines/sync.c:62, func=xfer, error=Input/output error
The local console showed errors: end_request: I/O error, dev sda, sector X
And fdisk doesn't see the drive anymore until I reboot the OS.
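
To see whether these errors follow one device or show up on several (which would point more at the controller, cable, or board than at a single disk), the kernel log can be tallied per device. A minimal sketch, assuming the message format quoted above; the sector numbers in the sample are made up, and newer kernels word this message differently, so the pattern would need adjusting there.

```python
import re
from collections import Counter

# Matches lines like: end_request: I/O error, dev sda, sector 12345
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def io_errors_by_device(log_text):
    """Count I/O error lines per block device in a kernel log dump."""
    counts = Counter()
    for match in IO_ERROR.finditer(log_text):
        counts[match.group(1)] += 1
    return dict(counts)


# Illustrative log excerpt with made-up sector numbers
sample_log = (
    "end_request: I/O error, dev sda, sector 1000\n"
    "end_request: I/O error, dev sda, sector 1008\n"
)
print(io_errors_by_device(sample_log))
```

The input could come from `dmesg` output saved to a file on the Proxmox host.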
 

Attachment: fio tests.JPG
The WD Black drives can be used in raid1 or raid10. As for a consumer SSD, that's totally up to you, balancing risk against cost. You should probably run ASUS hardware diagnostics on your server to make sure the SATA controllers, ports, and cables are working normally. If the motherboard is old and starting to flake out, it won't matter what disks you use; they could still become corrupt.
 
