How to correctly use templates on shared storage in a cluster?

sdettmer · Aug 11, 2023

Hi,

Thanks to this forum I have already received a lot of great help, so I would like to ask a general question about recommended usage and approaches related to templates.

Background/work load:
I want to run 8 instances on each of 12 nodes from one template. The template contains a VM which is configured via DHCP, connects to a central host with a local agent and then gets jobs to execute (precisely: Jenkins swarm clients).

I thought I would create a template on shared storage and thin-clone the instances, but then the COW target must be on shared storage as well. Then 96 VMs share my slow little Gigabit NAS and are much too slow. I would like to use local flash instead.
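
To put a number on "much too slow", here is a rough back-of-the-envelope sketch. It assumes the single Gigabit link to the NAS is the only bottleneck and that all VMs do I/O at the same time; both are simplifications, and the function name is made up for illustration.

```python
# Rough per-VM share of one Gigabit link. Assumptions: the NAS link is the
# only bottleneck, all VMs are busy at once, protocol overhead is ignored.

def per_vm_throughput_mb_s(link_gbit_s: float, nodes: int, instances_per_node: int) -> float:
    """Return the even share of the link per VM in MB/s (decimal megabytes)."""
    link_mb_s = link_gbit_s * 1000 / 8  # 1 Gbit/s = 125 MB/s
    total_vms = nodes * instances_per_node
    return link_mb_s / total_vms


if __name__ == "__main__":
    share = per_vm_throughput_mb_s(link_gbit_s=1.0, nodes=12, instances_per_node=8)
    print(f"{12 * 8} VMs, about {share:.2f} MB/s each")
```

So in the worst case each VM gets a bit more than 1 MB/s, which is far below what even a single local flash device delivers.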

So I tried cloning the template, which is on shared storage (more precisely: its disk image data is on shared storage), to each of the nodes. Unfortunately, the configuration for the template is not on shared storage, so this cloning works only on the one node that created the template.

To work around this, I currently clone the template on the first node (the one that created it). I create a separate clone for each target node, so I need 12 times the flash space there, because I have 12 nodes and clone locally once for each of them.
Then I migrate each of these clones to its target node and convert it to a template. Each node has (just barely) sufficient flash space for 12 instances, but migrating seems to need a lot of temporary space on the target. I migrate a non-running clone and apparently also need a lot of additional local disk space at the source. It seems that the full instance is copied locally before migration, so in total I need twice the space (?) for each instance. After migration, all disk space becomes available again. Besides space, another disadvantage is that the whole distribution is limited by one single node (I could speed this up by first copying to two nodes, then from these two to four others and so on, assuming the network is the limiting factor).
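
The "copy to two nodes, then from these to four others" idea can be sketched as a simple schedule. This is only an illustration under assumptions: every transfer takes the same time, a node sends one copy per round, and the node names are hypothetical.

```python
# Doubling fan-out: in every round, each node that already holds the copy
# sends it to one node that does not have it yet.

def fanout_schedule(nodes):
    """Return a list of rounds; each round is a list of (source, target) pairs.

    nodes[0] is assumed to hold the copy already.
    """
    have = [nodes[0]]
    pending = list(nodes[1:])
    rounds = []
    while pending:
        transfers = []
        for src in have:
            if not pending:
                break
            transfers.append((src, pending.pop(0)))
        # New holders can only act as sources from the next round on.
        have.extend(dst for _, dst in transfers)
        rounds.append(transfers)
    return rounds


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(1, 13)]  # hypothetical node names
    for number, transfers in enumerate(fanout_schedule(nodes), start=1):
        print(f"round {number}: {transfers}")
```

With 12 nodes this needs 4 rounds (1, 2, 4 and 4 transfers) instead of 11 transfers that all go out from the first node one after another.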

After this, each node has its own "local copy template" (on local storage) of the "central template" (on shared storage).
As a last step, on each node, I clone the "local copy template" once per instance (8 times); here thin cloning works fine.
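
To make the whole workaround easier to follow, here is a small sketch that just lists the steps in order as plain text. It does not call any real tool; the node and VM names are hypothetical, and it assumes the copy for the first node stays where it is and is not migrated.

```python
# Lists the steps of the workaround as strings. Nothing is executed.

def build_plan(nodes, instances_per_node, first_node):
    """Return the ordered steps: full clone, migrate, convert, thin clones."""
    steps = []
    for node in nodes:
        copy = f"copy-for-{node}"  # hypothetical clone name
        steps.append(f"{first_node}: full clone 'central template' -> '{copy}'")
        if node != first_node:  # assumption: no migration needed on the first node
            steps.append(f"{first_node}: migrate '{copy}' to {node}")
        steps.append(f"{node}: convert '{copy}' to 'local copy template'")
        for i in range(1, instances_per_node + 1):
            steps.append(f"{node}: thin clone 'local copy template' -> '{node}-vm{i}'")
    return steps


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(1, 13)]  # hypothetical node names
    plan = build_plan(nodes, instances_per_node=8, first_node="node1")
    print(len(plan), "steps in total")
    for step in plan[:12]:
        print(step)
```

For 12 nodes with 8 instances each this gives 12 full clones, 11 migrations, 12 conversions and 96 thin clones, of which only the thin clones are cheap.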

Having a template on shared storage but needing to copy it locally because of the node-local configuration feels like I'm doing it wrong. I think I am missing some point about how to use templates correctly. Here it seems that NOT using shared storage at all has only advantages: since cloning must be local anyway, why use slower shared storage at all? Is this meant for SAN systems whose shared storage is faster than local storage? Of course I see advantages in safety, availability and quick live migration when the cloned VMs are on shared storage (and that this makes sense for thin clones only if the COW base is shared as well). But even then, it should be possible to create the clone on any node. I think I am missing something.

So this is my first question: how to use templates in a cluster correctly?

My second question, just out of curiosity, is how I could do this efficiently. In my case the template probably won't change often, so I have no issue. But how would it be done efficiently?
I ask because I think this may be a common use case. Instead of using a template, I could set up some local distribution. I found that someone proposed using a locally tracked torrent system. This is surely great, but I think quite complicated to set up. Distributed file systems like Ceph are surely too slow for this case because they care about synchronicity. Any thoughts?
