Unprivileged containers made simple on Debian 12 (Bookworm)

IMPORTANT NOTE: This is the full version. If you just want to come in, copy some commands, and end up with unprivileged containers running under root, THERE IS A SEPARATE POST FOR THAT HERE.

0- Intro

Don’t let the length fool you. I am trying to make this the simplest and fastest, yet most comprehensive, tutorial for getting LXC (both privileged and unprivileged) up and running on Debian Bookworm!

I sent a previous version of this to a friend to spare myself the need to explain to him what to do, and he found the tutorial confusing! So instead of the old arrangement, which used colors to denote which lines are for which task, I have decided to SEPARATE THIS INTO PARTS:

  0. Intro – About this post (You are already in it)
  1. About LXC
  2. Shared system setup (Privileged and unprivileged)
  3. Privileged LXC step by step
  4. Shared setup for unprivileged containers
  5. Unprivileged LXC run by new user, step by step
  6. Unprivileged LXC run by root user, step by step

I hope this clears things up. The color codes will still exist, mostly because I have already done the work!

Why yet another tutorial?

Most of the tutorials online focus on creating an extra user to run LXC. That is one way to do it, with a few drawbacks. The other way is to create a range of subordinate IDs for the root user; the advantages of this approach relate to “Autostart” and to filesystem sharing between host and guest.

As usual, the primary goal of every post on this blog is my own reference. The internet is full of misleading and inaccurate stuff, and when I come back to a similar situation, I don’t want to do the research all over again.

1- About LXC

Privileged VS unprivileged

Privileged containers are generally unsafe; their only advantage is that they are very easy to set up.

Privileged containers share the same root user with the host, so if the container’s root user gets compromised, the attacker can sneak into the host system. Unprivileged containers are more secure, but involve some initial setup work.

What is the problem with privileged containers?

It is relatively easy to deploy LXC (which also happens to be what powers LXD). You install it, run a command to create a container, and voilà: a whole new Linux system within your host Linux system, sharing the host’s kernel. But there is one caveat: if a malicious user or application compromises root inside your container, they have effectively compromised the host machine as well. How? The root user on both is the same user!

The solution, unprivileged containers

In come unprivileged containers. In this setup, we map unprivileged host IDs to root within the container: either the subordinate IDs of a separate user, or the subordinate IDs of root itself. So instead of the host’s root ID (zero) also being root inside the container, we pick an ID range outside the container and instruct the kernel to treat the first ID of that range as ID zero inside the container. If a malicious user gets access to the container and ends up breaking out of it, they will find themselves running as a different user with privileges very close to those of the user nobody, or in other words, barely any privileges.

Relevant topic: User namespaces

A relevant topic to unprivileged LXC containers is user namespaces, which unprivileged users have been able to create since kernel 3.8. A user namespace is created by passing the CLONE_NEWUSER flag to the clone() or unshare() system calls.
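If you want to see a user namespace in action before touching LXC, here is a small sketch using the unshare command from util-linux. That tool is my choice for illustration only; LXC does the equivalent for you. It assumes unprivileged user namespaces are allowed on your machine, and prints a fallback message if they are not.

```shell
# Try to run "id -u" inside a new user namespace.
# --user creates the namespace (CLONE_NEWUSER under the hood);
# --map-root-user maps the calling user to UID 0 inside it.
if inner_uid=$(unshare --user --map-root-user id -u 2>/dev/null); then
    echo "UID outside the namespace: $(id -u)"
    echo "UID inside the namespace:  $inner_uid"
else
    echo "Could not create a user namespace here (disabled, or blocked by a sandbox)."
fi
```

When it works, the same process is an ordinary user on the host and root inside the namespace, which is exactly the trick unprivileged containers rely on.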

’Nuff with the theory, what do I need to do?

You set up LXC, then, depending on the type of container and user you need, you tell the Linux kernel to treat that user as root inside the container. To make that happen, you take a few steps to give that user the required privileges and nothing more. None of those steps is complicated, so let us get started.

2- Shared system setup

Before writing this tutorial, I installed a fresh copy of Bookworm, enabled SSH, and got to work doing the steps you see below. The steps in this section are the same whether you plan to create privileged containers, unprivileged containers, or both.

Step 2-1: Install everything

apt-get update
apt-get install bridge-utils lxc libvirt-clients libvirt-daemon-system debootstrap qemu-kvm virtinst nmap resolvconf iotop net-tools

Step 2-2: Enable IP forwarding

Next, we need to enable IPv4 forwarding by un-commenting a line in /etc/sysctl.conf and then running sysctl -p. Open /etc/sysctl.conf in your favorite Linux-compatible editor and uncomment the line

net.ipv4.ip_forward=1

Now run the following command for the change to take effect

sysctl -p
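If you would rather not open an editor, the un-commenting step can be sketched with sed. The helper name enable_ip_forward is hypothetical, and the example below only touches a scratch file; on the real system you would point it at /etc/sysctl.conf as root and then run sysctl -p as shown above.

```shell
# enable_ip_forward FILE: un-comment the net.ipv4.ip_forward=1 line in FILE.
# (Hypothetical helper, not part of any package.)
enable_ip_forward() {
    tmp=$(mktemp) || return 1
    # Strip the leading "#" (and any spaces after it) from the forwarding line only.
    sed 's/^#[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1/net.ipv4.ip_forward=1/' "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Dry run on a scratch file that imitates the relevant part of sysctl.conf
scratch=$(mktemp)
printf '%s\n' '# Uncomment the next line to enable packet forwarding for IPv4' \
              '#net.ipv4.ip_forward=1' > "$scratch"
enable_ip_forward "$scratch"
grep 'ip_forward' "$scratch"
```

After applying the change for real, “sysctl net.ipv4.ip_forward” should report a value of 1.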

Step 2-3: Host Networking

Before creating any containers, we need to make sure the host can bridge the network to them. In Debian, this is done by editing the file /etc/network/interfaces. There are a few ways to connect the containers: your host can act as a DHCP server, or you can connect the containers directly to your router.

In the setup below, I am connecting the containers directly to the router. The host machine will have the IP 192.168.7.140, and eno1 is the name of my physical network interface (replace it with yours). IF YOU ARE USING HYPER-V, YOU WILL NEED TO ENABLE “MAC address spoofing” IN THE HYPER-V VM SETTINGS.

auto br0
iface br0 inet static
	bridge_ports eno1
	bridge_fd 0
	address 192.168.7.140
	netmask 255.255.255.0
	gateway 192.168.7.1
	bridge_stp off
	bridge_maxwait 0
	dns-nameservers 8.8.8.8 8.8.4.4

3- Privileged LXC

To clarify, making a privileged container does not stop you from making unprivileged containers later, BUT the unprivileged containers need to be different containers 😉 So you might make a privileged one, then replace it with an unprivileged one.

Step 3-1: Download container

This step is all about downloading your LXC container template! I chose the mirror with the lowest ping time from me, but you can omit the MIRROR= part altogether.

MIRROR=http://ftp.debian.org/debian lxc-create --name vm142 --template download -- --dist debian --release bookworm --arch amd64

Something unexpected happened while I was doing this: I received an error about a problem downloading. By coincidence, I rebooted the machine and it worked. My theory is that the reboot was irrelevant, but if this happens to you, tell me your conclusions in the comments.

"../src/lxc/lxccontainer.c: create_run_template: 1628 Failed to create container from template"

Right after, you have a brand new LXC container, which is unfortunately privileged. You can list it with the command “lxc-ls -f”, where the f stands for fancy 😉

lxc-ls -f

Step 3-2: Edit virtual machine config

This container might not be able to start yet; some editing of its config file may be necessary!

Here is this machine’s config file (for a privileged container created as root, it lives at /var/lib/lxc/vm142/config). Mind the comments. It is meant to be modified to fit your networking setup, so you will need to change the IP address and related network information, the machine name, the rootfs path, etc.

#this is a modified LXC container config file
# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template: --dist debian --release bookworm --arch amd64
# For additional config options, please look at lxc.container.conf(5)

# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)


# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.arch = linux64

# Container specific configuration
lxc.apparmor.profile = generated
#nesting is for having docker and other similar containerization tech inside the container, disable it if you don't want such virtual machines in the virtual machine
lxc.apparmor.allow_nesting = 1
lxc.rootfs.path = dir:/var/lib/lxc/vm142/rootfs
lxc.uts.name = vm142

# Initial Network configuration, disabled...
#lxc.net.0.type = veth
#lxc.net.0.link = lxcbr0
#lxc.net.0.flags = up

#the above config was disabled, so net.0 altogether is better left empty
lxc.net.0.type = empty


#Now, add networking

lxc.net.1.type = veth
lxc.net.1.flags = up
lxc.net.1.link = br0
lxc.net.1.name = eth0
lxc.net.1.ipv4.address = 192.168.7.142/24
lxc.net.1.ipv4.gateway = 192.168.7.1


#App armor profile for this PRIVILEGED container
lxc.apparmor.profile = generated


#If you want this container to start with the host, uncomment the following
#lxc.start.auto = 1
#lxc.start.delay = 10
#the order, the higher the earlier ;)
#lxc.start.order = 30


# Container specific configuration (Not initially there)
lxc.tty.max = 4
lxc.pty.max = 1024

Problem: the container was getting two IP addresses, the static one that we set above and a dynamic one via DHCP. It turns out the file /etc/systemd/network/eth0.network inside the container forced DHCP, so I went in and commented out all the lines in that file!

Another problem that came up was DNS resolution. The file you need to edit is /etc/systemd/resolved.conf inside the container; simply add the following two lines under its [Resolve] section

DNS=8.8.8.8
FallbackDNS=8.8.4.4

Step 3-3: Start the machine and change credentials

Now start the container and attach to it so you can log in. To do that, issue the commands

lxc-start -n vm142 -d
lxc-attach -n vm142

Now you can use the passwd command to change the container’s root password. You will probably also want to run “apt-get install ssh openssh-server” so that you can log in with PuTTY or any other SSH client.

4- Unprivileged LXC containers (Both)

Everything in this section applies to unprivileged containers, whether they run under the root user or any other user.

Step 4-1: Enable Unprivileged User Namespaces

It is enabled by default. To make sure that it is, run the command below; if it returns “kernel.unprivileged_userns_clone = 1”, you are good to go.

sysctl kernel.unprivileged_userns_clone

If for any reason it is not enabled (0), you can enable it by editing the file “/etc/sysctl.d/00-local-userns.conf” (create it if it does not exist) and adding the following line

kernel.unprivileged_userns_clone=1

Once done, run the command

service procps restart

5- Unprivileged container under new user

Step 5-1: Create the user

You can call the user whatever you want; I chose lxcadmin, which is an arbitrary choice. To create the user, we issue the following command.

adduser lxcadmin

The output of the adduser command should be something like

Adding user `lxcadmin' ...
Adding new group `lxcadmin' (1001) ...
Adding new user `lxcadmin' (1001) with group `lxcadmin (1001)' ...
Creating home directory `/home/lxcadmin' ...
Copying files from `/etc/skel' ...
...
Adding new user `lxcadmin' to supplemental / extra groups `users' ...
Adding user `lxcadmin' to group `users' ...

So here, our user gets the ID 1001 (since I already have a user with the ID 1000, and the root user has the ID 0). Now if we inspect the two files /etc/subuid (the subordinate UID file) and /etc/subgid (the subordinate GID file), we will find the following content in both (the files are identical).

yazeed:100000:65536
lxcadmin:165536:65536

What the above means is that user lxcadmin has a range of subordinate UIDs starting at 165536 and containing 65536 IDs in total, so the last UID that lxcadmin can use is 165536 + 65536 - 1 = 231071, and the next user we add will start at 231072.

So to recap, this user has a subordinate ID range from 165536 to 231071, inclusive. Notice that the first number in the line is itself the first subordinate ID; the user’s own ID (1001 here) is not part of the range.
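To double-check the arithmetic for your own entries, here is a tiny sketch that takes a line in the /etc/subuid format (name:start:count) and prints the first ID, the last ID, and where the next range would begin. The function name subid_range is made up for this post.

```shell
# subid_range "name:start:count": print the boundaries of a subordinate ID range.
# (Hypothetical helper; the input format is the one used by /etc/subuid and /etc/subgid.)
subid_range() {
    echo "$1" | awk -F: '{ printf "%s first=%d last=%d next=%d\n", $1, $2, $2 + $3 - 1, $2 + $3 }'
}

subid_range "yazeed:100000:65536"     # prints: yazeed first=100000 last=165535 next=165536
subid_range "lxcadmin:165536:65536"   # prints: lxcadmin first=165536 last=231071 next=231072
```

Notice how the “next” value of one user is exactly the “first” value of the user added after it.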

Step 5-2: Network adapter quota

New users generally cannot attach a container to a bridge; for that, you need to give the user a network device quota. This quota is defined in the file /etc/lxc/lxc-usernet, and the initial quota for unprivileged users is zero. Edit the file and add lines like the following, depending on which bridges you would like to allow lxcadmin to connect containers to. The format is: user type bridge quota

lxcadmin veth lxcbr0 10
lxcadmin veth br0 10
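If you script your setup, you may not want to end up with duplicate quota lines every time the script runs. Here is a minimal sketch of an “add it only if it is missing” helper. The name add_usernet is hypothetical, it only handles veth entries, and the example works on a scratch file rather than the real /etc/lxc/lxc-usernet (which needs root).

```shell
# add_usernet FILE USER BRIDGE QUOTA: append "USER veth BRIDGE QUOTA" to FILE
# unless FILE already has a veth entry for that user and bridge.
# (Hypothetical helper, shown against a scratch file.)
add_usernet() {
    file=$1; user=$2; bridge=$3; quota=$4
    if awk -v u="$user" -v b="$bridge" \
        '$1 == u && $2 == "veth" && $3 == b { found = 1 } END { exit !found }' "$file"; then
        return 0    # entry already present, nothing to do
    fi
    echo "$user veth $bridge $quota" >> "$file"
}

scratch=$(mktemp)
add_usernet "$scratch" lxcadmin lxcbr0 10
add_usernet "$scratch" lxcadmin br0 10
add_usernet "$scratch" lxcadmin br0 10   # repeated call changes nothing
cat "$scratch"
```

The scratch file ends up with the same two lines shown above, no matter how many times the calls are repeated.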

Notice that you can replace the user with a group name (written as @groupname), but that is a subject for a different post…

Now you will need to copy the file /etc/lxc/default.conf to the user’s home directory, in my case to /home/lxcadmin/.config/lxc/default.conf (if the .config/lxc directory does not exist, create it). Then edit the file you just created and add the ID mapping for your user. I am using the second user added to the system, hence the numbers below; yours will differ unless your user is also the second one added, so copy the values from /etc/subuid and /etc/subgid.

    lxc.idmap = u 0 165536 65536
    lxc.idmap = g 0 165536 65536
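To avoid copying the numbers by hand, the two lines can be derived from the subordinate ID files. This is only a sketch: idmap_for is a name I made up, and it maps the user’s whole range starting at container ID 0, which is what the lines above do. The demo uses a scratch file with my values; without the extra arguments it reads the real /etc/subuid and /etc/subgid.

```shell
# idmap_for USER [SUBUID_FILE] [SUBGID_FILE]: print lxc.idmap lines for USER.
# (Hypothetical helper; defaults to the system files.)
idmap_for() {
    user=$1; subuid=${2:-/etc/subuid}; subgid=${3:-/etc/subgid}
    awk -F: -v u="$user" '$1 == u { print "lxc.idmap = u 0 " $2 " " $3; exit }' "$subuid"
    awk -F: -v u="$user" '$1 == u { print "lxc.idmap = g 0 " $2 " " $3; exit }' "$subgid"
}

# Demo with the entries from this post
scratch=$(mktemp)
printf '%s\n' 'yazeed:100000:65536' 'lxcadmin:165536:65536' > "$scratch"
idmap_for lxcadmin "$scratch" "$scratch"
```

On the real machine, “idmap_for lxcadmin >> /home/lxcadmin/.config/lxc/default.conf” would append the lines to the file you just created.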

Now we are closer than ever to making it run. We need to create our first container; unlike the privileged “lxc-create” call earlier, this one is slightly more complicated (the solution below makes things unprivileged and secure again)

systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- lxc-create -t download -n my-container
lxc-create -t download -n myunprivcontainer -- -d debian -r bookworm -a amd64

Don’t expect this to work yet… The following container config file was automatically created

# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template: -d debian -r bookworm -a amd64
# For additional config options, please look at lxc.container.conf(5)

# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)


# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.include = /usr/share/lxc/config/userns.conf
lxc.arch = linux64

# Container specific configuration
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 1
lxc.idmap = u 0 165536 65536
lxc.idmap = g 0 165536 65536
lxc.rootfs.path = dir:/home/lxcadmin/.local/share/lxc/myunprivcontainer/rootfs
lxc.uts.name = myunprivcontainer

# Network configuration
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up

libpam-cgfs is already installed (it was pulled in by the apt-get install above). libpam-cgfs is a Pluggable Authentication Module (PAM) that provides logged-in users with a set of cgroups which they can administer. This allows, for instance, unprivileged containers and session management using cgroup process tracking.

Configure AppArmor

AppArmor is enabled by default on Debian 10 (buster) and later. AppArmor is recommended, as it adds a layer of security which may prove vital for a system running your containers.

To check whether it is enabled on your system, you can run the following command

cat /sys/module/apparmor/parameters/enabled

If the above returns the letter Y, AppArmor is enabled, and you need to set it up to allow for our unprivileged setup
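For use in a setup script, the same check can be wrapped in a small function. This is just a sketch: apparmor_status is a hypothetical name, and it treats a missing file the same as “not enabled”.

```shell
# apparmor_status [FILE]: report whether the AppArmor kernel module says it is enabled.
# (Hypothetical helper; FILE defaults to the path used in the command above.)
apparmor_status() {
    f=${1:-/sys/module/apparmor/parameters/enabled}
    if [ -r "$f" ] && [ "$(cat "$f")" = "Y" ]; then
        echo "enabled"
    else
        echo "not enabled"
    fi
}

apparmor_status
```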

6- Unprivileged container under root subordinates

This is the most interesting setup. It is a no-compromise setup where a container can run with all the features you see in privileged containers, while still maintaining the security provided by the unprivileged setup above (more or less).

Step 6-1: Root subordinates

The first step is to allocate a UID and GID range to the root user in /etc/subuid and /etc/subgid. This is necessary because the root user, unlike users added with adduser, does not have subordinate IDs by default. In short, figure out where the next free range of IDs starts, and assign it to root by adding a line similar to the following at the top of the list in those two files. In my case, lxcadmin has the last range, 165536:65536, which means the next free ID is 165536 + 65536 = 231072. I would like a million subordinate IDs so that I can hand every container a different set of IDs, which should increase security even further.

root:231072:1000000

adduser will recognize the new range the next time you use it, and start from there.

Then reflect that range in /etc/lxc/default.conf using lxc.idmap entries similar to those above.
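Here is a rough sketch of how the numbers for root could be worked out from an existing subuid file. The helper name next_subid_start is made up, the demo runs on a scratch copy of my entries, and handing the first 65536 IDs of root’s range to the default config is my assumption for illustration; pick whatever slice suits your plan of one ID set per container.

```shell
# next_subid_start FILE: print the first ID after the highest existing range in FILE.
# (Hypothetical helper; FILE uses the name:start:count format of /etc/subuid.)
next_subid_start() {
    awk -F: '/^#/ || NF < 3 { next }
             { end = $2 + $3; if (end > max) max = end }
             END { print max + 0 }' "$1"
}

# Demo with the entries from this post
scratch=$(mktemp)
printf '%s\n' 'yazeed:100000:65536' 'lxcadmin:165536:65536' > "$scratch"
start=$(next_subid_start "$scratch")

echo "root:$start:1000000"            # line for /etc/subuid and /etc/subgid
echo "lxc.idmap = u 0 $start 65536"   # lines for /etc/lxc/default.conf (assumed slice size)
echo "lxc.idmap = g 0 $start 65536"
```

With my two users, the start comes out as 231072, matching the line above.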

root does not need a network device quota and uses the global configuration file, so those steps from the section above are not needed.

Any container you create as root from that point on will run unprivileged, be able to auto-start, and share filesystems!
