Would it be OK if I start logging all the "Ah Ha" moments for me as I learn the basics of PM, Ceph, and even basic admin stuff?
- How to resize root and move swap space on a default Proxmox install.
My problem: Proxmox was installed to an 80GB SSD and, after running fine for a while, I started getting out-of-space errors during regular system updates. (Link needed) Somewhere I read that the standard Proxmox install sizes its initial HDD or SSD partitions on a percentage basis. In my case, PM gave me 3 partitions on the main SSD it was installed to. The third, sda3, was created as an LVM physical volume and was set to 74G (pretty much the whole drive, save some space on the first 2 partitions for boot and EFI). Look at yours with this command:
fdisk -l
This is what my setup looked like with one 80GB SSD and one 1TB HDD for data:
This showed me some basics of how PM set up the system.
Fine and dandy - but how are these partitions used? Let's look at
Code:
lsblk
Here is some interesting info to look at - partition "sda3" contains 4 more divisions/volumes... logical volumes (LVs) in the pve group... "pve-swap" was set to 8G, the root LV is 18.5G, while the local data/VM thin pool and its metadata came to 36.3G and change. Huh? There's my problem... only 18.5G for the root partition... it seemed to fill up substantially when I updated Ceph. Wait wait wait... those numbers don't all add up to the 74G for that whole drive - WHAT?... Hold on..
sda3: 8G + 18.5G + 1G + 36.3G = 63.8G, not the full 74G allocated to sda3... what happened to the other 10G?
Now I'm really confused about who is telling the truth... the PM GUI shows sda3 at 79.49GB, not 74G... it turns out this isn't bytes vs bits - lsblk reports sizes in binary GiB (labeled just "G") while the GUI reports decimal GB, so 79.49GB and 74G are the same partition.
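That still leaves the roughly 10G gap inside the volume group itself. As best I can tell, on a default install the installer leaves a chunk of the pve VG unallocated (and the thin pool carries hidden data/metadata volumes), which you can see for yourself - your numbers will differ from mine:
Code:
vgs       # the VFree column is the space left unallocated in the pve VG
lvs -a    # -a also lists the hidden thin-pool data/metadata volumes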
Anyhow - back to my initial problem - running out of space on root during updates for the OS/PM and installed packages...
I initially looked to see where I could DELETE some crap that was building up... went to the logs folder and started getting rid of old stuff... pfft, who needs old logs anyhow? (Me, as it turns out... more on that some other day.)
Cleared out enough space on root to at least finish the updates... Then... Ceph started having a conniption fit, telling me it only had 1% space remaining on the monitor node... ugh..
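For what it's worth, a gentler way to claw back space than hand-deleting log files (assuming a stock PVE/Debian setup with journald and apt - adjust to your own setup) is to let the tools prune themselves:
Code:
journalctl --vacuum-size=100M   # shrink the systemd journal to roughly 100M
apt clean                       # drop cached .deb files from /var/cache/apt/archives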
How do I free up some space on pve-root or make it bigger? Move the VMs off the data pool on that drive and extend, reduce the swap partition size and then extend root... add more HDDs/SSDs to the server, or upgrade the drive and then extend the volume group after formatting...
Helpful links I am going to expand on later...
https://linuxize.com/post/how-to-add-swap-space-on-debian-10/
https://www.cyberpratibha.com/create-swap-partition-in-linux-by-mkswap-command/
https://tldp.org/HOWTO/Partition/setting_up_swap.html
https://pve.proxmox.com/pve-docs/chapter-pve-installation.html
https://pve.proxmox.com/pve-docs/pve-admin-guide.html
Though I have been playing with Linux and *nix systems for decades, I really don't think I'm all that great with them... I had to go back in and relearn CLI stuff as simple as what fstab was really doing, and relearn all the new ways to set up swap on Linux - pretty cool stuff, tbh...
Decision: Reduce volume pve-swap and then extend/grow the pve-root volume.
- Turn off swap on /dev/pve/swap
Code:
swapoff -v /dev/pve/swap
- Verify swap is off with swapon -s - it should not show any active swap; if it does, locate that swap device and turn it off with swapoff (swapoff - Debian man page)
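A quick copy-paste check (swapon -s is the older form; swapon --show does the same job on current systems):
Code:
swapon -s   # should list no active swap devices at this point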
- Reduce /dev/pve/swap to 4G (down from the default 8G), which frees about 4G - just enough to give root the space I needed to finish updates...
lvreduce --size 4G /dev/pve/swap
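Note that --size 4G sets the new size outright; since the default swap LV was 8G, that frees roughly 4G. If you prefer to express it as a relative change, lvreduce also accepts a signed size (equivalent here, on a default 8G swap LV):
Code:
lvreduce --size -4G /dev/pve/swap   # the minus sign means reduce BY 4G rather than TO 4G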
- verify
lsblk
- Extend the root volume by the 4G freed up on sda3
- pvdisplay - shows the Physical Volume (sda3 for me) backing the pve VG
- pvs - also helps summarize (pvs - Debian man page)
- lvdisplay - shows the Logical Volumes (lvdisplay - Debian man page)
- lvs - summarizes the Logical Volumes (lvs - Debian man page)
- vgs - shows the volume groups
- With all that info you should be able to sort out which volume to extend. For me, I did
lvresize -L +4G /dev/pve/root
- and this grows pve-root by 4G, even live (** your setup may vary from mine)... (lvresize - Debian man page). Note this only grows the LV itself - the ext4 filesystem inside still needs growing; see the resize2fs note after the verify step.
- Verify everything looks as it should with those commands that display volume and group info as well as disk data (lvs should show root grown by 4G and swap now smaller).
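Same gotcha as in the second procedure further down: growing the LV does not grow the filesystem inside it. On a default install root is ext4, so either add -r/--resizefs to the lvresize command above, or run the same command used later on:
Code:
resize2fs /dev/pve/root   # grow the ext4 filesystem to fill the enlarged LV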
- Be sure to reconfigure swap for the new 4G size:
mkswap /dev/pve/swap
- Remakes the swap area with the current logical volume size (mkswap - Debian man page). You should not need to mess with fstab at this point since no mount points or device names changed - just remember to turn swap back on (see below).
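After mkswap the swap is still disabled from the earlier swapoff. Assuming the default fstab entry for /dev/pve/swap is still in place (it is on a stock install), just turn it back on:
Code:
swapon /dev/pve/swap   # or: swapon -a  to enable everything listed in fstab
swapon -s              # confirm it is active again, now at 4G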
Decision: Add another SSD/HDD to extend/grow the pve-root volume and partition part of new ssd for linux swap.
- Install new SSD
- Run gparted (or fdisk) to wipe the drive, then create a new swap partition and a new partition to hand over to LVM for extending the logical volume of my choice - in my case, pve-root. (I will add a proper how-to here in a while... use the links above if you need one for now; there is also a rough sketch right after this list.)
- Prepare the new partitions for LVM and swap use... in my case I created the swap partition first and put it at the end of the drive (note there are reasons to put swap on the outer edge of a spinning HDD, but for an SSD it makes no difference... will add links here for more reading later). So I have sdc1 as the swap partition and sdc2, labeled pve2, which I will add as a physical volume to the pve volume group and then extend the pve-root logical volume to also make use of the second SSD's space... or so that is the plan.
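A rough sketch of that partitioning step, assuming the new disk shows up as /dev/sdc like mine did and that sgdisk (from the gdisk package) is installed - double-check the device name before wiping anything, and adjust the sizes to taste:
Code:
sgdisk --zap-all /dev/sdc                         # wipe old partition tables (destructive!)
sgdisk -n 1:-16G:0 -t 1:8200 -c 1:swap /dev/sdc   # 16G swap partition at the end of the disk
sgdisk -n 2:0:0 -t 2:8E00 -c 2:pve2 /dev/sdc      # the rest of the disk, typed as Linux LVM
pvcreate /dev/sdc2                                # mark the LVM partition as a physical volume
(vgextend will usually initialize the PV for you, but being explicit doesn't hurt.)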
vgextend pve /dev/sdc2
- Extends the volume group pve (where all the PM stuff and the Proxmox pve-root live) onto the new SSD/HDD (vgextend - Debian man page)
- pvs - shows physical volume info - should now show the pve VG name on both the original SSD/HDD and the new SSD/HDD device
- pvdisplay - same as above but more details
- Now - if all looks good - we have two physical disks, with partitions on them, attached to the volume group called "pve", and that volume group still holds the default-install Proxmox data, swap, and pve-root... check before moving on...
- Resize pve-root to use all of the remaining free space in the pve VG.
lvresize -l +100%FREE /dev/pve/root
- Adds all the available space in the pve VG to the root LV. Run "vgs" and "pvs" to verify, or however you like... they should now show the VG "pve" spanning both drives/partitions (sda3 and sdc2 in my case) with root taking up the freed space.
- Because the mount point for pve-root is already defined in fstab, I should not need to add anything else at this point.
- DON'T FORGET to run resize2fs if it applies to your setup... for me it does. When I logged into the PM GUI it did not recognize the extended size of root - DOH! Just do this:
resize2fs /dev/pve/root
then refresh the PM gui and all should be well.
- Adding the 16GB physical partition on the new SSD as Linux swap
- Make the new swap space on the new drive live. Since 4G (left over from the LVM swap above) really isn't much swap (this particular machine doesn't use it that much, and swap is hard on SSDs anyhow), I still wanted to add some more on the newly installed SSD, so I made the sdc1 partition (from the steps above) 16GB. All I had to do was mkswap that sdc1 partition (my physical 16GB partition on the sdc drive I installed - verify your drive info, it will likely differ from mine), and since it was already defined as a swap partition on the physical drive (I prefer a physical swap partition over a file or logical volume), all that's left is to tell Linux it is there. See the links above for more on swap manipulation - the only thing to do now is make the swap partition live. Note: I have been told I am wasting resources and doing things all wrong, but in my testing, having swap on 2 drives at the same priority seems to speed things up a little... the man pages seem to confirm this too.
- mkswap /dev/sdc1
- edit fstab to ADD the new swap partition
nano /etc/fstab
- (Use vi or whatever you want... nano is just easy for me.) I added /dev/sdc1 none swap sw,pri=3 0 0, then Ctrl+X to save and exit. This ensures sdc1 is mounted for swap... please see this very helpful link for more info: (Setting up Swap Space). Note that I added pri=3 on both the pve LVM swap entry and the sdc1 physical swap entry... several articles (need links and references here) explain how using multiple drives at the same swap priority speeds things up, RAID-style... Linux just does it for you.
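For reference, the swap section of my fstab ended up looking roughly like this (the /dev/pve/swap line is the one from the default install with pri=3 added; sdc1 is the partition I added - your device names and priorities may differ):
Code:
# swap entries in /etc/fstab
/dev/pve/swap none swap sw,pri=3 0 0
/dev/sdc1     none swap sw,pri=3 0 0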
- Because these are SSDs I wanted to make sure not to over-use swap if at all possible. The Proxmox default "swappiness" is 60... swappiness (0-100) controls how aggressively the kernel swaps memory out versus reclaiming page cache - it is a weighting, not a hard "start swapping at 60% RAM" threshold, which is roughly how I had been thinking of it. To make fewer writes to the SSD and save it a little, I set swappiness to 10, so swap is held back until it is really needed.
sysctl vm.swappiness=10
- Sets swappiness to 10 for the running system (see below for making it stick across reboots)
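That sysctl call only lasts until the next reboot. To make it persistent, put it in a sysctl config file - the filename below is just my choice, any /etc/sysctl.d/*.conf works (appending the line to /etc/sysctl.conf is fine too):
Code:
echo "vm.swappiness = 10" > /etc/sysctl.d/85-swappiness.conf
sysctl -p /etc/sysctl.d/85-swappiness.conf   # load it now without rebooting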
more notes to come...
How to: Regenerate Self-Signed SSL/TLS certificate for Proxmox VE (PVE)
pvecm updatecerts --force
and just pvecm updatecerts without the --force option seems to do the trick most of the time...
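If the web GUI still serves the old certificate after that, restarting the proxy service usually gets the new one picked up - this is just my usual follow-up, not an officially required step:
Code:
systemctl restart pveproxy   # reload the regenerated certificate for the web GUI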
(credit to many sources but here too - Blog-D DannyDa's blog of tips)
This happens a LOT to me - not entirely sure why - but the certs/keys get out of sync or something, and I can no longer SSH from machine to machine to manage them; that goofy error pops up about being unable to connect - someone may be acting maliciously, or something along those lines... here is the error when it happens:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:t8AhmXxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending RSA key in /etc/ssh/ssh_known_hosts:12
remove with:
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "10.0.1.xx"
RSA host key for 10.0.1.xx has changed and you have requested strict checking.
Host key verification failed.