Hi,
We are working on migrating our current OpenVZ containers (from Proxmox 3) to the latest Proxmox 5.2 with LXC. Basically, we follow this process:
- Old Proxmox 3
1. vzdump XXX -mode suspend -dumpdir /var/lib/vz/dump/ -tmpdir /var/lib/vz/vztmp/
2. copy dump to new server
- New Proxmox 5
3. pct restore XXX vzdump-openvz-... --storage vmdata --onboot 1 --swap 0
4. pct set XXX -net0 name=eth0,bridge=vmbr0,ip=IP,gw=GW
5. Modify resources (CPU, memory and others); they are not set correctly after the import
7. Final steps: stop the old VM, do a final rsync, move the IP and start the new container.
We tested with some containers and everything was fine at first, but then we started to see strange behavior that is driving us crazy. Without any explanation, services like ssh and others occasionally hang and stop responding. We did not find anything in the log files, neither in the container nor on the host; only in dmesg did we find strange things like this:
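As a rough sketch, the commands from steps 1, 3 and 4 above could be collected in a small dry-run helper. The function name `print_migration_plan` and its four arguments (container ID, dump file, IP, gateway) are placeholders made up for illustration; it only prints the commands and runs nothing.

```shell
# Dry run only: prints the commands from the steps above instead of running them.
# Usage: print_migration_plan CTID DUMP_FILE IP GW   (all arguments are placeholders)
print_migration_plan() {
    ctid=$1
    dump=$2
    ip=$3
    gw=$4
    echo "# on the old Proxmox 3 host"
    echo "vzdump $ctid -mode suspend -dumpdir /var/lib/vz/dump/ -tmpdir /var/lib/vz/vztmp/"
    echo "# copy the dump to the new server, then on the new Proxmox 5 host"
    echo "pct restore $ctid $dump --storage vmdata --onboot 1 --swap 0"
    echo "pct set $ctid -net0 name=eth0,bridge=vmbr0,ip=$ip,gw=$gw"
}
```

Printing the plan first makes it easy to review the exact commands for each container before running them by hand.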
[2857313.075597] audit: type=1400 audit(1523968769.090:9700): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-container-default-cgns" name="/" pid=28053 comm="(ionclean)" flags="rw, rslave"
[2857313.077774] audit: type=1400 audit(1523968769.091:9701): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=28055 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2857313.080029] audit: type=1400 audit(1523968769.091:9702): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=28055 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2857313.082305] audit: type=1400 audit(1523968769.091:9703): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=28055 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2857313.084552] audit: type=1400 audit(1523968769.091:9704): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=28055 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2859113.033119] audit: type=1400 audit(1523970569.089:9705): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=22327 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2859113.035492] audit: type=1400 audit(1523970569.089:9706): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=22327 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2859113.037876] audit: type=1400 audit(1523970569.089:9707): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=22327 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
[2859113.040220] audit: type=1400 audit(1523970569.089:9708): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=22327 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none
After some investigation, this seems to be related to the PHP sessionclean cron task. We have not found a solution for it, but it is not the real problem.
We ran strace on the ssh process while it was hung and found this:
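To see at a glance which operations and commands the denials come from, a small helper along these lines could be used. It is only a sketch: the function name `summarize_denials` is made up for illustration, and it assumes the log lines have the same `operation="..."` and `comm="..."` fields as the ones above.

```shell
# Reads kernel log lines on stdin and prints "COUNT OPERATION COMM"
# for every AppArmor denial, most frequent first.
summarize_denials() {
    grep 'apparmor="DENIED"' |
        # keep only the operation and the command name of each denial
        sed -n 's/.*operation="\([^"]*\)".*comm="\([^"]*\)".*/\1 \2/p' |
        sort | uniq -c | sort -rn |
        # normalize the padded output of uniq -c
        awk '{print $1, $2, $3}'
}
```

It can be fed directly from the kernel log, for example with `dmesg | summarize_denials`.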
# strace -s 300 -p 2075
Process 2075 attached - interrupt to quit
sendto(10, "<39>Apr 17 14:43:01 sshd[2075]: debug3: fd 9 is not O_NONBLOCK", 62, MSG_NOSIGNAL, NULL, 0
and after several minutes it continues with normal operation.
ssh is not the only service with this problem.
For info, the container is currently Debian wheezy and the host server runs the latest Proxmox 5.1:
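The `<39>` at the start of the string in the strace output looks like a syslog priority value, which would suggest the process is blocked while sending a log message to the syslog socket. As a small sketch, assuming the standard syslog encoding (priority = facility * 8 + severity), the value can be decoded like this; the function name `decode_pri` is made up for illustration.

```shell
# Splits a syslog priority value into facility and severity,
# assuming the standard encoding: priority = facility * 8 + severity.
decode_pri() {
    pri=$1
    echo "facility=$((pri / 8)) severity=$((pri % 8))"
}

# In the standard numbering, facility 4 is "auth" and severity 7 is "debug".
decode_pri 39
```

Under that assumption the message is an auth/debug log line, which is consistent with the `debug3:` text sshd was trying to write.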
# uname -a
Linux XXX 4.13.13-6-pve #1 SMP PVE 4.13.13-42 (Fri, 9 Mar 2018 11:55:18 +0100) x86_64 GNU/Linux
We use ZFS for container storage, and we tested disabling posixacl on the disk.
Any ideas, please? This is really blocking us from migrating all the other VPSes from OpenVZ.
One final note: the same container has been working for years with OpenVZ without any problems.
Maybe we could disable AppArmor, but that is not a solution: we would lose most of the container isolation and security.
Thanks in advance.