Problems with permissions for mountpoint in unprivileged container

BerndS · Sep 21, 2023

Hello,

I don't understand the mapping of the UID and GID. It seems so simple, but it does not work for me.

After reading https://pve.proxmox.com/wiki/Unprivileged_LXC_containers I tried to configure my container for my needs:

My user bernd has UID and GID 1000 (across my whole network, on every device, including the PVE host). In addition, I created a group "shares" with GID 1002 on the network to simplify access. User bernd is in this group.

After adding the mappings, I log in as bernd. The ownership of my home directory is nobody:bernd instead of the expected bernd:bernd, but the mountpoint is correct: bernd:shares.

Where is my mistake?

1. My /etc/subuid:

Code:

root:100000:65536
bernd:165536:65536
root:1000:1 # <-- this was added by me after reading tutorials

2. My /etc/subgid:

Code:

root:100000:65536
bernd:165536:65536
root:1002:1 # <!-- this was added by me

3. My /etc/pve/lxc/1xx.conf:

Code:

...
unprivileged: 1
...
mp0: /mnt/data,mp=/mnt/data
...
lxc.idmap: u 0 100000 1000
lxc.idmap: g 0 100000 1002
lxc.idmap: u 1000 1000 1
lxc.idmap: g 1002 1002 1
lxc.idmap: u 1001 101001 64534
lxc.idmap: g 1003 101003 64532

Dunuin · Sep 22, 2023

BerndS said:
User bernd is in this group.

Both on the host and in the LXC? Having bernd in "shares" on the host doesn't mean bernd is in that group inside the LXC too.

BerndS said:
root:1002:1 # <!-- this was added by me

Shouldn't there also be "root:1000:1" for the group bernd with GID 1000?

BerndS said:
lxc.idmap: g 0 100000 1002

BerndS said:
lxc.idmap: g 1002 1002 1

BerndS said:
lxc.idmap: g 1003 101003 64532

And what about this?

Code:

lxc.idmap: g 0 100000 1000
lxc.idmap: g 1000 1000 1
lxc.idmap: g 1001 101001 1
lxc.idmap: g 1002 1002 1
lxc.idmap: g 1003 101003 64533

Was the home directory created before you changed the remapping? If so, it is still owned by UID 101000 and GID 101000 on the host, which is what the user bernd was mapped to before. Since bernd is now UID 1000 and no longer UID 101000, bernd isn't the owner of that directory anymore. That's why "nobody" is shown as the owner.
And because you didn't change the GID remapping for GID 1000, bernd's group is still GID 101000 on the host (shown as GID 1000 inside the LXC). That matches the GID 101000 that owns the folder, so the group is still reported as "bernd". If you change the GID remapping for GID 1000 from 1000->101000 to 1000->1000, the LXC should show the folder as owned by "nogroup".
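
The arithmetic behind this can be sketched as a small shell function. map_id is a hypothetical helper name (not an LXC or Proxmox tool); it reads lxc.idmap lines on stdin and prints the host ID that a given container ID ends up as:

```shell
#!/bin/sh
# map_id KIND CONTAINER_ID   (reads "lxc.idmap: ..." lines on stdin)
# KIND is "u" or "g". Prints the host ID that CONTAINER_ID maps to,
# or "unmapped" if no line covers it.
# Hypothetical helper for illustration only; not part of LXC or Proxmox VE.
map_id() {
    awk -v kind="$1" -v id="$2" '
        $1 == "lxc.idmap:" && $2 == kind {
            # fields: kind, first container id, first host id, range length
            if (id >= $3 && id < $3 + $5) {
                print $4 + (id - $3)
                found = 1
                exit
            }
        }
        END { if (!found) print "unmapped" }
    '
}
```

With the default mapping (u 0 100000 65536), container UID 1000 comes out as host UID 101000. Once a line like "u 1000 1000 1" is present it comes out as 1000, and files still owned by 101000 on the host no longer belong to bernd.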

If you want to fix this, you need to shut down the LXC and mount its filesystem on the PVE host (see the "pct mount" command). Then chown every file/folder owned by UID 101000 to UID 1000, every file/folder owned by GID 101000 to GID 1000, and every file/folder owned by GID 101002 to GID 1002. After unmounting the filesystem and starting the LXC, the home folder should then be owned by bernd:bernd.

To avoid having to chown everything, make sure to set up the remapping before starting the LXC for the first time.
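
As a rough outline of the repair, here is a sketch that only prints the commands for review instead of running them. The container ID 123 in the usage note is made up, print_fix_plan is a hypothetical helper name, and the rootfs path is an assumption about where "pct mount" places the filesystem (check the path it actually reports):

```shell
#!/bin/sh
# print_fix_plan VMID
# Prints (does not run) the steps for re-owning files after a remapping change.
# The rootfs path below is an assumption; use the path "pct mount" reports.
print_fix_plan() {
    vmid="$1"
    rootfs="/var/lib/lxc/$vmid/rootfs"
    # -xdev keeps find on the root filesystem; chown -h changes symlinks
    # themselves instead of following them.
    cat <<EOF
pct shutdown $vmid
pct mount $vmid
find $rootfs -xdev -uid 101000 -exec chown -h 1000 {} +
find $rootfs -xdev -gid 101000 -exec chown -h :1000 {} +
find $rootfs -xdev -gid 101002 -exec chown -h :1002 {} +
pct unmount $vmid
pct start $vmid
EOF
}
```

Running print_fix_plan 123 lists the seven commands in order, so they can be read through and then executed one by one as root on the PVE host.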

zodiac · Sep 22, 2023

I assume, this is a Proxmox installation that isn't meant to be used in a cluster? As far as I can tell, that's really the only situation where it makes sense to use bind mounts from the host into the containers. For everything else, you probably want something like NFS or Plan9 instead (or ceph, if your cluster is a little bigger). But if you do plan to only have a single host, then yes, bind mounts can have advantages.

This is not something that Proxmox natively supports very well. Exactly because the architecture assumes that you want a cluster-aware system. And bind mounts by definition can't work across a cluster.

As an immediate consequence, as soon as you have bind mounts, you lose the ability to generate snapshots of your container(s). That's of course a huge loss in useful features. Snapshots are some of the most powerful and important administrative tools that you should not give up lightly.

The compromise that I am OK with is to:

1. Never run any services directly on the Proxmox host. It stays as close to an unmodified stock Proxmox VE installation as possible. That simplifies future updates tremendously, but more immediately, it entirely side-steps the need for any user maps. If there aren't any users on the host (other than root and administrative roles managed by Proxmox), then it doesn't matter if user ids for the shared home directories start at 100000.
2. Create a separate ZFS file system on the host that is used for data that all containers share between each other. In particular in your scenario, this would mean the home directories. If you have the hardware to do so, put it into a dedicated ZFS pool on separate drives. That would make things easier if you ever need to recover from catastrophic failure. But even if you can't dedicate these resources, recovery is still possible. It's just harder.
3. Devise your own solution for snapshotting and backing up this data on a regular schedule. proxmox-backup-client is pretty powerful and it's easy to write a script that can back up this data without Proxmox VE having to get involved. You can still rely on PBS to do all the heavy lifting of managing your backups. But if you also want automated snapshots in addition to automated backups, then that's something you'd have to script yourself as well. Fortunately, ZFS has excellent support for snapshotting and makes this quite seamless. For convenience, I have ~/.snapshot directories that always automatically give access to old data, in case of accidental edits/deletes.
4. Instead of informing Proxmox VE about the bind mounts, use a lower-level API, and add an "lxc.mount.entry" to your container configuration. Proxmox VE intentionally doesn't see any mounts that have been added this way. They logically aren't part of the container, and thus won't prevent you from managing snapshots.
Now, all your containers can access the same data and you can set permissions as appropriate. Just be aware that there are security implications. A compromised container means that it can modify the shared data in other containers as well. But that's presumably something you are OK with. Sometimes, mounting just a part of the filesystem can be a wiser decision (e.g. the web server container only needs to mount ~/www and not all of /home/*). Some of the users (again, the webserver) might also be OK with read-only mounts. So, take advantage of that option.
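
For illustration, an "lxc.mount.entry" line could be generated like this. mount_entry is a hypothetical helper and the paths in the usage note are made up; the line follows the fstab-like format that lxc.mount.entry uses, with the target given relative to the container's root:

```shell
#!/bin/sh
# mount_entry HOST_DIR CT_DIR [ro]
# Prints an lxc.mount.entry line for a bind mount. LXC expects the target
# relative to the container's root, so a leading "/" is removed.
# Hypothetical helper; not part of LXC or Proxmox VE.
mount_entry() {
    host_dir="$1"
    ct_dir="${2#/}"                 # "/home" -> "home"
    opts="bind,create=dir"
    if [ "${3:-}" = "ro" ]; then
        opts="bind,ro,create=dir"   # should request a read-only bind mount
    fi
    printf 'lxc.mount.entry: %s %s none %s 0 0\n' "$host_dir" "$ct_dir" "$opts"
}
```

For example, mount_entry /tank/shared/home /home prints "lxc.mount.entry: /tank/shared/home home none bind,create=dir 0 0", which would go into the container's configuration file. Passing ro as a third argument adds the read-only option for containers that only need to read the data.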

Let me know, if this makes sense or if you want me to fill in some details. Also, I have to re-iterate that this is a non-standard way to deploy Proxmox. I don't expect it to break any time soon as the underlying lxc technology is pretty stable, but if it does, you might have to switch to proper network filesystems after all.

BerndS · Sep 22, 2023

I must honestly say that I'm a bit overwhelmed right now and can't quite understand what exactly is meant.

It's all so complicated if you are not familiar with Proxmox. All I want to do is share a ZFS dataset for my containers, VMs and network clients over NFS and Samba.

What is the best practice here in Proxmox?

Everywhere I start reading, security restrictions have to be eliminated first or many other partly complicated things have to be done. Examples?

https://forum.proxmox.com/threads/nfs-server-in-lxc.105073/
https://forum.proxmox.com/threads/nfs-server-on-alpine-lxc-not-starting.103443/

I found many, many more outside this board, on various websites and YouTube.

For me, a file server had always been a super simple thing, but under Proxmox it feels extremely complicated, as if you need a PhD first. It's a pain. Everything else, like managing containers, VMs and backups, is really great and simple to understand.

Please let me know what I have to do to achieve my goal in Proxmox in a secure, best-practice way...

Thanks a lot in advance

Kind regards
Bernd

Dunuin · Sep 22, 2023

BerndS said:
I must honestly say that I'm a bit overwhelmed right now and can't quite understand what exactly is meant.

Not specifically a Proxmox problem. When working with LXCs you need to learn the limitations and how to work around them.

BerndS said:
It's all so complicated if you are not familiar with Proxmox. All I want to do is share a ZFS dataset for my containers, VMs and network clients over NFS and Samba.

That's because PVE isn't a NAS/storage server. It has some limited capabilities to access SMB/NFS shares, but it's not meant to be used as something like TrueNAS or Unraid. You can use it like that, but only by doing everything manually via the CLI, outside of what Proxmox offers out of the box in the web UI.

BerndS said:
Everywhere I start reading, security restrictions have to be eliminated first or many other partly complicated things have to be done. Examples?

If you want it secure and less complicated use VMs and not LXCs.

I already explained what your problem is and how to fix it. It isn't simple to understand and I can't explain it any more simply. If you want an unprivileged LXC with an SMB/NFS share, that's still the only way to do it.

BerndS · Sep 22, 2023

@Dunuin Yes, you are right. Thank you for trying to help.

Unfortunately I don't even understand your "simple" explanation, so there are 2 things I can do:

Use a privileged container, which is less secure, or...

...I ask you to give me the correct contents of the files for my constellation with the one relevant user bernd (uid=1000, gid=1000, gid=1002 for shares):

I would really appreciate it if you could help me. I'm currently rebuilding my server and I'm not getting anywhere until I solve this problem. I will take care of the understanding later, but now I need the result ;-)

What must the complete /etc/subuid look like?
What must the complete /etc/subgid look like?
What must the complete lxc.idmap entries in /etc/pve/lxc/xxx.conf look like?

Thanks a lot in advance

Kind regards
Bernd

zodiac · Sep 22, 2023

I think you have a bit of an XY-problem here. If you are looking at /etc/subuid, you are probably already going down the wrong path. I can't envision a lot of scenarios where /etc/subuid would be one of the primary concerns, if any at all.

If you want to manage storage outside of PVE, do what I suggested earlier. Create a ZFS file system on the host, and reference it inside containers with lxc.mount.entry. Then don't worry about the fact that your host won't see the same user ids as your containers. You should only ever access this filesystem from within the containers -- and all the containers, by default, use the exact same mapping. /etc/subuid never comes into play. If you then want to also export the files with NFS or Samba for other computers on your LAN, you can configure one of your containers as a file server. This is a non-standard configuration though, and you have to do some extra work to integrate it well. But since you keep asking about /etc/subuid, it seems that you have a strong preference for managing storage outside PVE, and this would be the way to do it.

Alternatively, if you want PVE to be aware of your storage and manage it for you, then allocate the data as part of one of the containers and export it from there with either NFS, Samba, or Plan9 (or SSHfs, if you think that is easier to configure). This is how it is more commonly done. Everybody else, both other containers and computers on your LAN, imports it as a network filesystem. No bind mounts anywhere. The translation of user ids (if any) would have to be done by the file server, and each of these network filesystems has its own way of configuring this aspect. If you get stuck there, post a new thread asking for help.

As Dunuin explained, you'll typically have to configure your fileserver software from the command line. But you might be able to find a prepackaged "appliance" that can be installed in either a container or VM and that gives you some sort of web front-end for configuring file sharing. I'd start by looking on turnkeylinux.org

It is my understanding that Turnkey uses VMs instead of containers, though. This means your files will live in an image file or a ZFS device, instead of directly as ZFS files on the host. This of course means that you, again, don't have to worry about /etc/subuid since the PVE host won't even see your files. But the downside is some loss in performance and flexibility because of the extra level of indirection. In practice, PVE is pretty well-optimized and the performance penalty isn't too horrible. But if that still bothers you, you might want to investigate Turnkey LXC. I have never tried it myself, but supposedly it's a way of converting a Turnkey VM to a container.

BerndS · Sep 22, 2023

I am not committed to anything, I just want to share my existing dataset ;-)

Since I don't want to dwell on this any longer, I have now just created a privileged Debian container, mounted my dataset via a mount point (mp0: /host,mp=/container) and enabled NFS and Samba/CIFS for it. Then I installed nfs-kernel-server, created the user and shared everything via /etc/exports. This works great. Samba is also no problem.

But this is not the Proxmox way, and I still don't understand that one. I appreciate all your competent help and long, detailed posts, but unfortunately I can't find a solution for me in them, so for now I have to settle for my own solution. Maybe my English is just not good enough to understand everything. I don't know.

As I understand it, what I did now corresponds to your second paragraph:

If you want to manage storage outside of PVE, do what I suggested earlier. Create a ZFS file system on the host, and reference it inside containers with lxc.mount.entry. Then don't worry about the fact that your host won't see the same user ids as your containers. You should only ever access this filesystem from within the containers -- and all the containers, by default, use the exact same mapping.

zodiac · Sep 22, 2023

Yes, this is a perfectly fine way of doing things. But I suspect that you'll discover snapshots are now disabled for this container.

If you add the mountpoint from the Proxmox VE GUI, Proxmox thinks of the mounted filesystem as part of the container's state. It wants to include the files in any snapshot. But since the files are on the host and not part of the container's managed storage, it can't create this snapshot.

From what I can tell by reading older postings in this forum, the Proxmox developers have been discussing how to allow more fine-grained control over ownership of shared storage outside of the container. But I don't believe there is an official way to do so at this point. That's why I was saying to edit the container's configuration file and to manually add an lxc.mount.entry line. It's not perfect, but it makes it clear to Proxmox VE that you don't want to pull these files into the state of the container. And that eliminates the conflict with snapshotting.

But what you have is fine. It should work. Just be aware that your files on the host aren't part of any backups, and you have to sort that out by yourself.

Dunuin · Sep 22, 2023

BerndS said:
...I ask you to give me the correct contents of the files for my constellation with the one relevant user bernd (uid=1000, gid=1000, gid=1002 for shares):

I already did that above. But no matter what you put into the LXC config file and the subuid and subgid files, it won't work without changing the owner of maybe thousands of files and folders in the LXC's filesystem.
By changing the config files you only change bernd's UID from 101000 to 1000. It won't change the ownership of all the files that were previously owned by bernd's old UID, so bernd no longer owns the files and folders he owned before the remapping was changed. You have to fix this manually, changing everything owned by UID 101000 to UID 1000.

BerndS said:
What must the complete /etc/subuid look like?

Code:

root:100000:65536
bernd:165536:65536
root:1000:1

BerndS said:
What must the complete /etc/subgid look like?

Code:

root:100000:65536
bernd:165536:65536
root:1000:1
root:1002:1
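
Each line has the form user:first_id:count and says which host IDs that user may map into a container. A minimal sketch for checking whether a given host ID is covered, with subid_allows being a made-up helper name rather than an existing tool:

```shell
#!/bin/sh
# subid_allows FILE USER ID
# Succeeds (exit 0) if FILE, in /etc/subuid or /etc/subgid format
# ("user:first_id:count"), delegates host ID to USER; fails otherwise.
# Hypothetical helper for illustration only.
subid_allows() {
    awk -F: -v user="$2" -v id="$3" '
        { sub(/[ \t]*#.*/, "") }    # ignore trailing "# ..." remarks
        $1 == user && id >= $2 && id < $2 + $3 { ok = 1 }
        END { exit !ok }
    ' "$1"
}
```

With the file above, subid_allows /etc/subgid root 1002 should succeed, and it should fail if the root:1002:1 line is missing.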

BerndS said:
What must the complete lxc.idmap entries in /etc/pve/lxc/xxx.conf look like?

Code:

lxc.idmap: g 0 100000 1000
lxc.idmap: g 1000 1000 1
lxc.idmap: g 1001 101001 1
lxc.idmap: g 1002 1002 1
lxc.idmap: g 1003 101003 64533
lxc.idmap: u 0 100000 1000
lxc.idmap: u 1000 1000 1
lxc.idmap: u 1001 101001 64535
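
When editing these lines by hand it is easy to leave a gap or an overlap. Here is a small sanity check, assuming the lines for each kind are listed in ascending order of container ID and that the usual range 0..65535 should be covered completely (check_idmap is a hypothetical helper, not a Proxmox command):

```shell
#!/bin/sh
# check_idmap KIND   (reads "lxc.idmap: ..." lines on stdin)
# Verifies that the container-side ranges for KIND ("u" or "g") start at 0,
# follow each other without gaps or overlaps, and end at 65535.
# Prints "ok" or a description of the first problem found.
check_idmap() {
    awk -v kind="$1" '
        BEGIN { next_id = 0 }
        $1 == "lxc.idmap:" && $2 == kind {
            if ($3 != next_id) {
                printf "gap or overlap at %d (expected %d)\n", $3, next_id
                bad = 1
                exit
            }
            next_id = $3 + $5       # first container id after this range
        }
        END {
            if (bad) exit 1
            if (next_id != 65536) {
                printf "covers 0..%d, expected 0..65535\n", next_id - 1
                exit 1
            }
            print "ok"
        }
    '
}
```

It would be run once per kind, for example: grep '^lxc.idmap' /etc/pve/lxc/xxx.conf | check_idmap u, and the same again with g.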

zodiac · Sep 23, 2023

How did those thousands of files get there in the first place? If they are the result of a backup that you restored, the best option would be to simply remove the files and restore them again, now that you have fixed the configuration.

If you'd rather try changing the permissions in place, then with a few minutes work, you can probably whip up a script that works universally and subtracts 100000 from all user and group ids. But since you have so few distinct cases to deal with, I wouldn't even bother. Just handle them one by one.

This is untested, so please test in some subdirectory first. But I think the following should work when executed on the host (i.e. outside of the container):

Code:

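# -r: skip chown if nothing matches; -h: change symlinks themselves, not their targets
find . -uid 101000 -print0 | xargs -0 -r chown -h 1000
find . -gid 101000 -print0 | xargs -0 -r chown -h :1000
find . -gid 101002 -print0 | xargs -0 -r chown -h :1002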

You will almost certainly have to be "root" to make these changes.

If you then wonder, whether you have found all the relevant permissions that need to be adjusted, you can search for any stragglers with:

Code:

find . -uid 1000 -o -print
find . -gid 1000 -o -gid 1002 -o -print

Ideally, neither command should produce any output.

Dunuin · Sep 23, 2023

zodiac said:
How did those thousands of files get there in the first place?

You just shouldn't use an LXC and then change the user/group remapping later. Do the remapping before you start installing packages and creating users...

Dunuin · Sep 23, 2023

Dunuin said:
You just shouldn't use an LXC and then change the user/group remapping later. Do the remapping before you start installing packages and creating users...

As an analogy:
Say I start creating files and installing programs as a user alice, and later decide I don't want to use alice anymore but want to create and use a new user bob. Then I shouldn't be surprised that bob isn't able to access the files and run the programs alice used before. The same is the case here, except that we don't have usernames alice and bob but UID 101000 and UID 1000. Simplified: we have two users both named bernd, and the new bernd can't access the stuff of the old bernd as it's still owned by the old bernd. And @BerndS doesn't understand that those two bernds aren't the same and is wondering why the new bernd can't use the old stuff.

zodiac · Sep 23, 2023

That's a very odd mental model that you have here, and it doesn't really match what is happening. I suspect that a lot of your difficulties would go away, if you worked your way up to understanding the philosophy of virtualization products such as Proxmox or any variety of similar commercial offerings. They all solve a particular problem, and they solve it really well. But they don't work all that well, if you want them to do something that doesn't match these concepts.

To try to stay within the constraints of your mental model, it's more akin to user "alice" who works for Microsoft wondering why they can't access the home directory of user "bob" who works at Ford. In their desperation, "alice" obtained a name change from the government and is now called "bob (formerly known as alice)", but they still can't get to the files that the real "bob" has at Ford...

Think of containers and virtual machines as completely independent computers. They don't really share common state or concepts. Each one does something different. The host is the odd one out here. It manages the containers and for technical reasons it can see into the containers (Proxmox is actually a little behind here; competing products often make the contents of containers inaccessible to the host).

Conceptually though, the host isn't really sharing state with any of the containers or VMs unless you go out of your way to break that abstraction. And if you do so, a lot of assumptions no longer work correctly and various functions of Proxmox will stop operating as intended.

I generally discourage people from messing with the host. For good reason, it should be kept as close as possible to a clean out-of-the-box installation. In your particular example, there really shouldn't be a user "bernd" or a group "shares" on the host. The concept of a virtualization host doesn't allow for having local users, and it's merely an artifact of Proxmox being built on top of Debian that you can attempt to create these users. You can of course violate this rule, but if you do so, you are on your own.

Bind mounts are a performance optimization that you would consider when you don't plan on ever running a real cluster of machines. It can be useful in that scenario. But you pay for this extra bit of performance by running into the type of problems that you just encountered. It's a bit of a self-inflicted wound. If you didn't want these issues, you should create a dedicated VM or container that is your network file server. It's the only part of the system that can directly access these shared files. Then export different views onto the filesystem over the internal network connection. Voila, no more permission problems, no more problems with snapshots, backups, or migrations either. But slightly lower performance though, and you now end up backing up your files whenever you back up your file server, instead of managing these backups as separate entities.

Dunuin · Sep 23, 2023

zodiac said:
That's a very odd mental model that you have here, and it doesn't really match what is happening. I suspect that a lot of your difficulties would go away, if you worked your way up to understanding the philosophy of virtualization products such as Proxmox or any variety of similar commercial offerings. They all solve a particular problem, and they solve it really well. But they don't work all that well, if you want them to do something that doesn't match these concepts.

But it sounds like he isn't an IT guy who wants to run a proper enterprise environment. I agree with you that running a proper 5+ node ceph cluster with multiple fast NICs, stacked routers, multiple onsite/offsite PBSs, proper isolation, defined roles between host and guests, everything rolled out in software by an orchestrator, no single point of failure, fully redundant enterprise/datacenter hardware and so on would be much better. But this sounds more like a small home server for private use where stuff should just work with the least effort possible. Doing it the proper way requires a lot of expensive hardware and years of education... probably two things he isn't willing to invest just to watch some self-hosted movies after work.
Using an unprivileged LXC with user remapping is the second easiest thing (after running a VM or privileged LXC) he could do. This is what he asked for and what I explained to him.

zodiac said:
To try to stay within the constraints of your mental model, it's more akin to user "alice" who works for Microsoft wondering why they can't access the home directory of user "bob" who works at Ford. In their desperation, "alice" obtained a name change from the government and is now called "bob (formerly known as alice)", but they still can't get to the files that the real "bob" has at Ford...

Yup. Usernames don't matter at all, they just map to UIDs. Different UIDs = different users. And it doesn't make it any easier to understand that UID 1000 on the host is something different than UID 1000 in the LXC unless you manually change the remapping from 1000->101000 to 1000->1000.
But he wanted an explanation that is even simpler. At some point you can't make it simpler without oversimplifying and dropping some details. While your example is better at explaining what's actually happening under the hood, it's probably also harder to understand.
