Hello everybody,
I have a Proxmox cluster with a single node and I want to start a new VM running PostgreSQL with TimescaleDB. After a lot of reading about how to tune ZFS volumes for this purpose, I still have some doubts about the cache options. We have 3 caches: the Proxmox one (the ZFS ARC), the Linux VM one (the kernel page cache, roughly LRU) and the PostgreSQL one (clock sweep), in order from furthest to nearest to the DB.
I have read a lot of information, some of it contradictory, so I don't know if this is true, but it seems that the PG cache isn't designed like a kernel cache, which tries to cache everything and evicts only when there isn't enough room to keep caching. Instead it seems to be more of a buffer for the data being processed at the moment, not a long-term cache. Indeed, it's called shared_buffers. I guess that's why the docs don't recommend setting shared_buffers to a high percentage of the available RAM the way the ARC does, but to somewhere around 25-40%. It seems that the real PG cache is the kernel one, not shared_buffers.
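To make that concrete, here is a minimal postgresql.conf sketch of what I mean (placeholder values for a hypothetical 12 GB VM; effective_cache_size is only a planner hint about the RAM likely available as OS/ARC cache, not an allocation):

```ini
# Sketch, not my real config; values assume a hypothetical 12 GB VM
shared_buffers = 3GB          # ~25% of RAM, per the PostgreSQL docs' guidance
effective_cache_size = 8GB    # planner hint: RAM expected to be available as kernel/ARC cache
```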
Taking this into consideration, there are some possible configurations to consider:
- Create a VM with a moderate amount of RAM (let's say 12 GB) and set shared_buffers to 10 GB. The aim with that:
  - Have a good amount of memory to act as a buffer for the ongoing queries.
  - Starve the VM's RAM so it barely caches anything itself, since its LRU policy should be the worst of the three, and instead rely on the ARC, which has better eviction heuristics. The problem with this configuration may be that the cache sits outside the VM, which could reduce performance instead of improving it. I'm also not sure how much room I have to leave above the shared_buffers size for the VM OS and the other DB processes.
- Create a VM with a high amount of RAM (let's say 48 GB) and keep shared_buffers at the same 10 GB. Also set the ZFS primarycache property to metadata. This way the cache will be nearer to the DB and inside the VM, but with worse eviction logic; it seems LRU is rather bad for databases.
- Create a VM with a high amount of RAM and primarycache=all. I think this would be a bad idea because:
  - The VM and Proxmox caches will compete for resources.
  - Cache duplication: the same blocks would end up both in the ARC and in the VM's page cache.
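For reference, the ZFS side of the last two options would be something like this (the dataset name below is just an example, not my actual pool layout):

```shell
# Option 2: keep the ARC to metadata only for the disk backing the DB VM,
# so data caching happens inside the VM (dataset name is an example)
zfs set primarycache=metadata rpool/data/vm-100-disk-0

# Option 3: cache both data and metadata in the ARC (the ZFS default)
zfs set primarycache=all rpool/data/vm-100-disk-0

# Check the current value
zfs get primarycache rpool/data/vm-100-disk-0
```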
So, are my initial assumptions correct? Which configuration will work better? What would you change?
Best regards, thanks for your time,
Héctor