API interface and pool

kaltsi

Hi,

I use the API to manage VMs.
I have a process that clones the collected VM IDs into a new pool.

Some of the cloned VMs will not be in the pool.
I use a workaround: after the clone processes finish, I check which VMs are in the pool and add the missing ones.
But the pool member responses are sometimes not exact.

POST /api2/json/nodes/{node}/qemu/{vmid}/clone
Set up the "pool" attribute.
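For reference, the request could be sketched with curl like this (a sketch only; the ticket cookie, CSRF token, node name, and IDs are placeholders for whatever the client already sends):

Code:
# POST the clone request with the target pool in the form-encoded body (all values are placeholders).
curl -k -X POST https://<node>:8006/api2/json/nodes/<node>/qemu/<vmid>/clone \
  -H "Cookie: PVEAuthCookie=<ticket>" \
  -H "CSRFPreventionToken: <csrf-token>" \
  --data "newid=<newid>&pool=<poolname>"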

For example:
Pooled VM IDs: [778, 741, 756, 766, 694, 724, 557, 753, 740, 739, 755, 754]
Cloned VM IDs: [550, 654, 694, 724, 739, 557, 656, 740, 741, 753, 754, 755, 756, 766, 778]

A few seconds later, VM 557 is no longer a member of the pool.

Is this normal behavior?

How can I solve this situation?
 
I have a process that clones the collected VM IDs into a new pool.

Some of the cloned VMs will not be in the pool.
So this is the actual problem, and everything else is only an issue with the workaround? Cloning and adding to a pool works flawlessly for me with pvesh. What kind of process is this?
 
I tested this function because it is very important to collect all the cloned VMs in a pool.
My flow is this:
- create a new pool
- collect all the necessary VMs to clone (9-15 virtual machines) and send the clone requests over HTTP with the pool parameter

After the application created linked clones from all the necessary templates, I saw that the pool contained all the VMs. But after a few seconds some VMs were dropped from the pool.
There are 4 nodes in the cluster. For example: the application asks node2 to create the pool and also asks node2 to create a linked clone of VM1, which is located on node4. At that moment node4 is busy, and node4 receives the pool information from node2 later than the clone request.

My workaround is to clone the VMs without the pool parameter and to add the cloned VMs to the pool after the clone process. It works without issue.

For example:
Bash:
root@node-1:~# pvesh set /pools/newpool1

root@node-1:~# for vm in 101 102 103 104 ... 115; do pvesh create /nodes/node-2/qemu/$vm/clone --pool=newpool1 --newid=....; waiting_for_clone_task; done
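The workaround itself (cloning without the pool parameter and adding the members afterwards) could be sketched like this; the --vms parameter of the pool update call takes a comma-separated list of VM IDs, and all IDs here are placeholders:

Bash:
# Clone without the pool parameter first (placeholder IDs) ...
pvesh create /nodes/node-2/qemu/101/clone --newid=601
pvesh create /nodes/node-2/qemu/102/clone --newid=602
# ... then add the new VM IDs to the pool in a single update.
pvesh set /pools/newpool1 --vms 601,602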
 
We are only talking about linked clones, not full clones? Does this also happen if you do everything on a single node or only from one node to another?
 
In this case the application uses only linked clones.
I only have logs that show what happened.
Last time, the application was connected to node-1 to create the new pool and to create linked clones that are located on node-3.
I don't have logs from single-node activity.
I could not figure out why this issue is happening.
Usually the first cloned VM is the one that is not part of the pool, but sometimes it is another one. And sometimes everything works well.
 
I just created 200 linked clones of a small VM on another node and each one of them remained in the pool so far.

Code:
root@clusterA:~# for NEWID in {120..320}; do qm clone 103 $NEWID --target clusterB --pool test; done

While I repeat the same with the API, could you check if the qm commands work for you?



At that moment node4 is busy, and node4 receives the pool information from node2 later than the clone request.

qm clone throws an error if the pool does not exist

Code:
403 Permission check failed (pool 'wasdftest' does not exist)
 
I could not reproduce your problem.
Usually the first cloned VM is the one that is not part of the pool
The first VM is certainly in the pool in my tests. I did not see any other VM go missing either. I ran variations of the following commands multiple times. 103 is a template.

Code:
for NEWID in {120..320}; do echo "new id is $NEWID"; pvesh create /nodes/clusterA/qemu/103/clone --pool=test --newid=$NEWID; done

for NEWID in {321..521}; do echo "new id is $NEWID"; pvesh create /nodes/clusterA/qemu/103/clone --pool=test --newid=$NEWID --target=clusterB; done


node4 receives the pool information from node2 later than the clone request.
Using an invalid pool throws an appropriate error
Code:
for NEWID in {120..320}; do echo "new id is $NEWID"; pvesh create /nodes/clusterA/qemu/103/clone --pool=testasdf --newid=$NEWID; done
proxy handler failed: 403 Permission check failed (pool 'testasdf' does not exist)
 
I created a test to reproduce the situation.
You can find the log in the attached file.
The 127.0.0.1 address points to node-1.
There was a deletion process running during the test.

The process cloned all VMs that are located in the Template-Basic17-v3s1.1 pool.
  1. create a new pool
  2. collect the selected pool's content
  3. (linked) clone the VMs and put them in the new pool
  4. set up the configuration
  5. collect the new pool's content
  6. add the missing VMs to the pool again (see the sketch below)
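A rough sketch of steps 5 and 6 with pvesh (the pool name is taken from the log; the missing ID and the use of jq are assumptions):

Code:
# Step 5: list the VM IDs that actually ended up in the new pool.
pvesh get /pools/IstvanK-Basic17-v3s1.1-2 --output-format json | jq '.members[].vmid'
# Step 6: re-add any ID that is missing from the expected list (557 is just an example).
pvesh set /pools/IstvanK-Basic17-v3s1.1-2 --vms 557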
Please let me know if you need more details.
 

Attachments

  • proxmox_pool_missing.txt
    132.9 KB
Your logs contain these lines:

Code:
15:45:12.320 [main] DEBUG com.myapp.sd.testvmgui.proxmox.Proxmox - Header: <{newid=[669], name=[IstvanK-Basic17-v3s1.1-2-BED11], pool=[IstvanK-Basic17-v3s1.1-2], target=[vNTDB-host-4]},[Content-Type:"application/x-www-form-urlencoded", Cookie:"PVEAuthCookie=PVE:ik@pve:5D974C77::.....LoFA==", CSRFPreventionToken:"5D974C77:iXDpjf1XOy57h2+S5aG1Fv6Mxkg"]>
15:45:12.321 [main] DEBUG org.springframework.web.client.RestTemplate - HTTP POST https://127.0.0.1:8006/api2/json/nodes/vNTDB-host-4/qemu/202/clone
15:45:12.321 [main] DEBUG org.springframework.web.client.RestTemplate - Accept=[application/json, application/*+json]
15:45:12.321 [main] DEBUG org.springframework.web.client.RestTemplate - Writing [{newid=[669], name=[IstvanK-Basic17-v3s1.1-2-BED11], pool=[IstvanK-Basic17-v3s1.1-2], target=[vNTDB-host-4]}] as "application/x-www-form-urlencoded"
15:45:12.457 [main] DEBUG org.springframework.web.client.RestTemplate - Response 200 OK
15:45:12.457 [main] DEBUG org.springframework.web.client.RestTemplate - Reading to [com.myapp.sd.testvmgui.proxmox.data.ProxmoxTaskId]
15:45:12.457 [main] DEBUG com.myapp.sd.testvmgui.proxmox.Proxmox - Proxmox request: 'https://127.0.0.1:8006/api2/json/nodes/vNTDB-host-4/qemu/202/clone' with response code: 200

We can see the return code 200 for the HTTP request. If cloning is unsuccessful, then the code will remain 200. However, the JSON response payload will differ. Could you add logs for the response payload in your program?
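Independently of the application logs, the task ID (UPID) returned in the "data" field can also be checked on the task status endpoint (a sketch; the UPID is a placeholder for the one your clone call returned):

Code:
# Query the final state of the clone task that the response body referred to.
pvesh get /nodes/vNTDB-host-4/tasks/<UPID>/status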

Additionally, why is VM 227 in another pool?
Code:
15:45:38.447 [main] ERROR com.myapp.sd.testvmgui.proxmox.Proxmox - Proxmox request: 'https://127.0.0.1:8006/api2/json/pools/IstvanK-Basic17-v3s1.1-2' with error: update pools failed: error with cfs lock 'file-user_cfg': VM 227 belongs to pool 'IstvanK-Basic17-v3s1.1-1' error code:500
 
I have an idea.
The pool distribution is correct. This situation occurs when there is a removal process running in the background.
The system starts to remove VM 557 and releases the ID 557. My app picks up the "next free VMID", which is 557.
The cloning process uses the ID 557, but the removal process has not completed yet.
Maybe that's why the cloning process and the app cannot add the VM to the new pool: the ID 557 is still assigned to the old one.
What do you think about this? Is it possible?
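To illustrate the suspected race (a sketch; whether the application actually uses /cluster/nextid to pick the ID is an assumption, and the IDs are taken from the log below):

Code:
# 1. The destroy task for VM 557 starts; its ID is released before the task finishes.
# 2. The application asks the cluster for the next free VMID ...
pvesh get /cluster/nextid        # may already return 557 at this point
# 3. ... and immediately clones into 557 while the destroy task is still running.
pvesh create /nodes/vNTDB-host-1/qemu/570/clone --newid=557 --pool=IstvanK-Basic17-v3s1.1-2 --target=vNTDB-host-1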

Code:
Oct  8 11:04:44 vNTDB-host-4 pvedaemon[1144905]: destroy VM 557: UPID:vNTDB-host-4:00117849:267D7C05:5D9C512C:qmdestroy:557:ik@pve:
Oct  8 11:04:44 vNTDB-host-4 pvedaemon[1123516]: <ik@pve> starting task UPID:vNTDB-host-4:00117849:267D7C05:5D9C512C:qmdestroy:557:ik@pve:
Oct  8 11:05:08 vNTDB-host-4 pvedaemon[1123516]: <ik@pve> end task UPID:vNTDB-host-4:00117849:267D7C05:5D9C512C:qmdestroy:557:ik@pve: OK

Oct  8 11:05:05 vNTDB-host-1 pvedaemon[2698714]: <ik@pve> starting task UPID:vNTDB-host-1:0029627E:267EA79B:5D9C5141:qmclone:570:ik@pve:
Oct  8 11:05:06 vNTDB-host-1 pvedaemon[2712190]: clone base-617-disk-0/base-570-disk-0: base-570-disk-0 snapname __base__ to vm-557-disk-0
Oct  8 11:05:06 vNTDB-host-1 pvedaemon[2698714]: <ik@pve> end task UPID:vNTDB-host-1:0029627E:267EA79B:5D9C5141:qmclone:570:ik@pve: OK


11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.Proxmox - Header: <{newid=[557], name=[IstvanK-Basic17-v3s1.1-2-PGW02], pool=[IstvanK-Basic17-v3s1.1-2], target=[vNTDB-host-1]},[Content-Type:"application/x-www-form-urlencoded", Cookie:"PVEAuthCookie=PVE:ik@pve:5D9C5135::...8zqhg==", CSRFPreventionToken:"5D9C5135:rGRJNJW/hDzar7mKPzdU8Yq2lxU"]>
11:05:05.920 [main] DEBUG org.springframework.web.client.RestTemplate - HTTP POST https://127.0.0.1:8006/api2/json/nodes/vNTDB-host-1/qemu/570/clone
11:05:05.920 [main] DEBUG org.springframework.web.client.RestTemplate - Accept=[application/json, application/*+json]
11:05:05.920 [main] DEBUG org.springframework.web.client.RestTemplate - Writing [{newid=[557], name=[IstvanK-Basic17-v3s1.1-2-PGW02], pool=[IstvanK-Basic17-v3s1.1-2], target=[vNTDB-host-1]}] as "application/x-www-form-urlencoded"
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - ==========================request begin==============================================
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - URI                 : https://127.0.0.1:8006/api2/json/nodes/vNTDB-host-1/qemu/570/clone
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Method            : POST
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Headers         : [Accept:"application/json, application/*+json", Cookie:"PVEAuthCookie=PVE:ik@pve:5D9C5135::...8zqhg==", Content-Type:"application/x-www-form-urlencoded;charset=UTF-8", CSRFPreventionToken:"5D9C5135:rGRJNJW/hDzar7mKPzdU8Yq2lxU", Content-Length:"96"]
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Request body: newid=557&name=IstvanK-Basic17-v3s1.1-2-PGW02&pool=IstvanK-Basic17-v3s1.1-2&target=vNTDB-host-1
11:05:05.920 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - ==========================request end================================================
11:05:05.996 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - ==========================response begin=============================================
11:05:05.996 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Status code    : 200 OK
11:05:05.997 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Status text    : OK
11:05:05.997 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Headers            : [Cache-Control:"max-age=0", Connection:"close", "Keep-Alive", Date:"Tue, 08 Oct 2019 09:05:05 GMT", Pragma:"no-cache", Server:"pve-api-daemon/3.0", Content-Length:"76", Content-Type:"application/json;charset=UTF-8", Expires:"Tue, 08 Oct 2019 09:05:05 GMT"]
11:05:05.997 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - Response body: {"data":"UPID:vNTDB-host-1:0029627E:267EA79B:5D9C5141:qmclone:570:ik@pve:"}

11:05:05.997 [main] DEBUG com.myapp.sd.testvmgui.proxmox.LoggingRestTemplate - ==========================response end===============================================
11:05:05.997 [main] DEBUG org.springframework.web.client.RestTemplate - Response 200 OK
11:05:05.997 [main] DEBUG org.springframework.web.client.RestTemplate - Reading to [com.myapp.sd.testvmgui.proxmox.data.ProxmoxTaskId]
 
Thank you for extending the logs!

What do you think about this? Is it possible?

Could be. We are currently checking if there is a bug in the code that performs locking.
 
We really have a bug here. You can follow it on Bugzilla.

Thank you for bringing this to our attention!
 
