VM Migration fails with "only root can set 'affinity' config"

snarman

Mar 2, 2025
Hello Everyone,

Thanks to the Proxmox team for this alpha release. It makes managing VMs across multiple PVE nodes really convenient.

Today I encountered an issue migrating a VM between two nodes. The error is apparently due to the CPU affinity setting on the source VM. The Datacenter Manager tries to set this on the target VM too, as I would expect. However, it fails with the error message below. The interesting part is probably:
only root can set 'affinity' config
And here's the catch: the target PVE node is connected via root@pam!pdm-admin, which is a token of the root user. The source PVE node is connected via an API token with the 'Administrator' role. I guess that shouldn't be the problem here.

The target PVE node has 16 cores, so cores 4,5,6,7 exist, and apparently that is not the error. I have already worked around this by removing the CPU affinity setting; after that, the migration succeeds.
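For reference, the workaround can be sketched with the standard `qm` CLI (VMID 111 and the pinning "4,5,6,7" are taken from the log below; the exact command sequence is an untested sketch):

```shell
# Save the current pinning, e.g. "4,5,6,7", then drop the root-only option:
AFF=$(qm config 111 | sed -n 's/^affinity: //p')
qm set 111 --delete affinity

# ... run the remote migration via the Datacenter Manager ...

# Afterwards, as root on the target node, restore the pinning:
qm set 111 --affinity "$AFF"
```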

Do you have any hints on how to investigate this further and get the Datacenter Manager to keep the CPU affinity settings during migrations?

2025-03-02 13:33:33 ERROR: error - tunnel command '{"conf":"affinity: 4,5,6,7\nagent: 1\nboot: order=scsi0;ide2;net0\ncores: 4\ncpu: kvm64,flags=+aes\ncpuunits: 2048\nide2: none,media=cdrom\nlock: migrate\nmemory: 8192\nmeta: creation-qemu=7.2.0,ctime=1685182982\nname: cloud\nnet0: virtio=16:0D:EC:CB:C1:CA,bridge=vmbr0,firewall=1,tag=111\nnuma: 0\nonboot: 1\nostype: l26\nscsi0: local-zfs:vm-111-disk-0,format=raw,iothread=1,size=16G\nscsi1: local-zfs:vm-111-disk-1,format=raw,iothread=1,size=115G\nscsihw: virtio-scsi-single\nsmbios1: uuid=8a631fb2-4f33-4bc9-8b24-3430497633f3\nsockets: 1\nvmgenid: 33f8fec1-2010-40a7-ae4c-833be14b5a53\n","cmd":"config","firewall-config":null}' failed - failed to handle 'config' command - only root can set 'affinity' config
 
The issue is that "root only" settings are really limited to the root user itself (a token for that user might have restricted privileges, after all), and when crossing the cluster boundary we need to re-check privileges. Maybe it's time to expose affinity via some privilege? Could you file a bug for that?
 
I ran into a somewhat similar issue when migrating a VM, albeit not with the affinity setting but with a very special set of low-level QEMU args. I guess it boils down to the same privilege issue?

2025-05-29 11:00:32 ERROR: error - tunnel command '{"firewall-config":null,"cmd":"config","conf":"#docker on windows with nested virtualization\nagent: 1\nargs: -cpu host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer,-hypervisor\nballoon: 0\nboot: order=scsi1;ide1;ide2;net0\ncores: 3\ncpu: max\nide1: none,media=cdrom\nide2: none,media=cdrom\nlock: migrate\nmachine: pc-i440fx-7.1\nmemory: 32768\nname: winbuilder1\nnet0: virtio=72:41:63:06:D7:19,bridge=vmbr0,mtu=1,tag=15\nnuma: 1\nostype: win10\nscsi0: backups:vm-116-disk-0,cache=unsafe,format=raw,size=4G\nscsi1: backups:vm-116-disk-1,cache=unsafe,discard=on,format=raw,size=200G,ssd=1\nscsihw: virtio-scsi-pci\nsmbios1: uuid=6fdd499a-b103-46a8-8acd-e33348aaed80\nsockets: 2\nvga: qxl2\nvmgenid: 4786d44a-1bbb-49ef-b12f-df54f172469b\n"}' failed - failed to handle 'config' command - only root can set 'args' config

For now the workaround is trivial: I just removed the args for the duration of the migration and re-added them afterwards, so it's no big issue.
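The same save/delete/restore pattern works here (VMID 116 taken from the log above; an untested sketch):

```shell
# Save the root-only 'args' line, then remove it before the migration:
ARGS=$(qm config 116 | sed -n 's/^args: //p')
qm set 116 --delete args

# ... perform the remote migration via the Datacenter Manager ...

# Re-add the saved args as root on the target node:
qm set 116 --args "$ARGS"
```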
 
There is a very dirty workaround: you need to patch this check https://github.com/proxmox/qemu-ser...a1000e0e9d5fc7f50f/PVE/API2/Qemu.pm#L670-L679

Just edit the file /usr/share/perl5/PVE/API2/Qemu.pm on your node and add 'affinity' => 1, to the my $cpuoptions = { ... } section.

I don't know which PVE daemon has to be restarted after this (it's not only pveproxy, that's all I know), but it worked for me after a reboot.
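Applying that patch from a shell might look like the sketch below. The sed pattern assumes the hash really is spelled `my $cpuoptions = {` in the installed file, and restarting pvedaemon (the privileged API daemon) alongside pveproxy is a guess at avoiding the full reboot; verify both against your system.

```shell
# Back up the file first -- and note that any package update will overwrite it again:
cp /usr/share/perl5/PVE/API2/Qemu.pm /root/Qemu.pm.bak

# Insert 'affinity' => 1, right after the opening of the $cpuoptions hash
# (GNU sed; the pattern is an assumption about the file's layout):
sed -i "/my \\\$cpuoptions = {/a\\    'affinity' => 1," /usr/share/perl5/PVE/API2/Qemu.pm

# Restart the API daemons so the patched module is reloaded:
systemctl restart pvedaemon pveproxy
```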

I also know that the same issue exists if you try to set hookscript; the error is the same. Maybe you can add this "permission" there too, try it. The same goes for args from the previous message, etc.
 
I tried D12310N3's suggestion on the target node, but unfortunately I still get an error:

Code:
2025-05-29 22:17:59 remote: started tunnel worker 'UPID:pve2:00000FB4:000065FE:6838CF07:vzmtunnel:111:root@pam!pdm-token:'
tunnel: -> sending command "version" to remote
tunnel: <- got reply
2025-05-29 22:17:59 local WS tunnel version: 2
2025-05-29 22:17:59 remote WS tunnel version: 2
2025-05-29 22:17:59 minimum required WS tunnel version: 2
2025-05-29 22:17:59 websocket tunnel started
2025-05-29 22:17:59 shutdown CT 111
2025-05-29 22:18:03 starting migration of CT 111 to node 'pve2' (192.168.108.42)
tunnel: -> sending command "bwlimit" to remote
tunnel: <- got reply
2025-05-29 22:18:03 found local volume 'local-zfs:subvol-111-disk-0' (in current VM config)
tunnel: -> sending command "disk-import" to remote
tunnel: <- got reply
tunnel: accepted new connection on '/run/pve/111.storage'
tunnel: requesting WS ticket via tunnel
tunnel: established new WS for forwarding '/run/pve/111.storage'
full send of rpool/data/subvol-111-disk-0@__migration__ estimated size is 2.54G
total estimated size is 2.54G
TIME        SENT   SNAPSHOT rpool/data/subvol-111-disk-0@__migration__
tunnel: -> sending command "query-disk-import" to remote
tunnel: done handling forwarded connection from '/run/pve/111.storage'
tunnel: <- got reply
2025-05-29 22:18:14 volume 'local-zfs:subvol-111-disk-0' is 'Ext500:subvol-111-disk-0' on the target
2025-05-29 22:18:14 mapped: net0 from vmbr0 to vmbr0
tunnel: -> sending command "config" to remote
tunnel: <- got reply
2025-05-29 22:18:14 ERROR: error - tunnel command '{"cmd":"config","conf":"#<div align='center'>\n#  <a href='https%3A//Helper-Scripts.com' target='_blank' rel='noopener noreferrer'>\n#    <img src='https%3A//raw.githubusercontent.com/community-scripts/ProxmoxVE/main/misc/images/logo-81x112.png' alt='Logo' style='width%3A81px;height%3A112px;'/>\n#  </a>\n#\n#  <h2 style='font-size%3A 24px; margin%3A 20px 0;'>Ubuntu LXC</h2>\n#\n#  <p style='margin%3A 16px 0;'>\n#    <a href='https%3A//ko-fi.com/community_scripts' target='_blank' rel='noopener noreferrer'>\n#      <img src='https%3A//img.shields.io/badge/&#x2615;-Buy us a coffee-blue' alt='spend Coffee' />\n#    </a>\n#  </p>\n#  \n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-github fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>GitHub</a>\n#  </span>\n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-comments fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE/discussions' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>Discussions</a>\n#  </span>\n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-exclamation-circle fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE/issues' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>Issues</a>\n#  </span>\n#</div>\narch: amd64\ncores: 2\nfeatures: nesting=1\nhostname: Mail-Backup\nlock: migrate\nmemory: 2048\nnet0: name=eth0,bridge=vmbr0,hwaddr=BC:24:11:01:BA:95,ip=dhcp,type=veth\nonboot: 1\nostype: ubuntu\nrootfs: Ext500:subvol-111-disk-0,size=2G\nswap: 512\ntags: community-script;os\nlxc.cgroup2.devices.allow: a\nlxc.cap.drop: \nlxc.cgroup2.devices.allow: c 188:* rwm\nlxc.cgroup2.devices.allow: c 189:* rwm\nlxc.mount.entry: /dev/serial/by-id  dev/serial/by-id  none bind,optional,create=dir\nlxc.mount.entry: /dev/ttyUSB0       dev/ttyUSB0       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyUSB1       dev/ttyUSB1       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyACM0       dev/ttyACM0       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyACM1       dev/ttyACM1       none bind,optional,create=file\n","firewall-config":null}' failed - failed to handle 'config' command - 403 Permission check failed (changing feature flags for privileged container is only allowed for root@pam)
2025-05-29 22:18:14 aborting phase 1 - cleanup resources
2025-05-29 22:18:14 ERROR: found stale volume copy 'Ext500:subvol-111-disk-0' on node 'pve2'
tunnel: -> sending command "quit" to remote
tunnel: <- got reply
2025-05-29 22:18:15 start final cleanup
2025-05-29 22:18:15 start container on source node
2025-05-29 22:18:17 ERROR: migration aborted (duration 00:00:18): error - tunnel command '{"cmd":"config","conf":"#<div align='center'>\n#  <a href='https%3A//Helper-Scripts.com' target='_blank' rel='noopener noreferrer'>\n#    <img src='https%3A//raw.githubusercontent.com/community-scripts/ProxmoxVE/main/misc/images/logo-81x112.png' alt='Logo' style='width%3A81px;height%3A112px;'/>\n#  </a>\n#\n#  <h2 style='font-size%3A 24px; margin%3A 20px 0;'>Ubuntu LXC</h2>\n#\n#  <p style='margin%3A 16px 0;'>\n#    <a href='https%3A//ko-fi.com/community_scripts' target='_blank' rel='noopener noreferrer'>\n#      <img src='https%3A//img.shields.io/badge/&#x2615;-Buy us a coffee-blue' alt='spend Coffee' />\n#    </a>\n#  </p>\n#  \n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-github fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>GitHub</a>\n#  </span>\n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-comments fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE/discussions' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>Discussions</a>\n#  </span>\n#  <span style='margin%3A 0 10px;'>\n#    <i class=\"fa fa-exclamation-circle fa-fw\" style=\"color%3A #f5f5f5;\"></i>\n#    <a href='https%3A//github.com/community-scripts/ProxmoxVE/issues' target='_blank' rel='noopener noreferrer' style='text-decoration%3A none; color%3A #00617f;'>Issues</a>\n#  </span>\n#</div>\narch: amd64\ncores: 2\nfeatures: nesting=1\nhostname: Mail-Backup\nlock: migrate\nmemory: 2048\nnet0: name=eth0,bridge=vmbr0,hwaddr=BC:24:11:01:BA:95,ip=dhcp,type=veth\nonboot: 1\nostype: ubuntu\nrootfs: Ext500:subvol-111-disk-0,size=2G\nswap: 512\ntags: community-script;os\nlxc.cgroup2.devices.allow: a\nlxc.cap.drop: \nlxc.cgroup2.devices.allow: c 188:* rwm\nlxc.cgroup2.devices.allow: c 189:* rwm\nlxc.mount.entry: /dev/serial/by-id  dev/serial/by-id  none bind,optional,create=dir\nlxc.mount.entry: /dev/ttyUSB0       dev/ttyUSB0       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyUSB1       dev/ttyUSB1       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyACM0       dev/ttyACM0       none bind,optional,create=file\nlxc.mount.entry: /dev/ttyACM1       dev/ttyACM1       none bind,optional,create=file\n","firewall-config":null}' failed - failed to handle 'config' command - 403 Permission check failed (changing feature flags for privileged container is only allowed for root@pam)
TASK ERROR: migration aborted


Are there plans to implement some kind of fix for this, @fabian?