Version: 25.03

Privileged Container

Scenarios

By default, iSulad starts common containers, which are suitable for running ordinary processes. A common container has only the default capabilities defined in the /etc/default/isulad/config.json file. To perform privileged operations (such as using devices in the /sys directory), a privileged container is required. In a privileged container, user root in the container has the root permissions of the host. In a common container, user root in the container has only the permissions of a common user on the host.

Usage Restrictions

A privileged container has access to all container functions, and all restrictions enforced by the device cgroup controller are removed. A privileged container has the following features:

  • Seccomp does not block any system call.

  • The /sys and /proc directories are writable.

  • All devices on the host can be accessed in the container.

  • All system capabilities will be enabled.
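Two of the features above can be checked from inside a running container. The following is a minimal sketch, not part of iSulad: the helper names `seccomp_mode` and `mount_mode` are hypothetical, and the script only reads the standard Linux files /proc/self/status and /proc/mounts. In a privileged container you would expect a Seccomp mode of 0 and /sys mounted rw. A common container would typically show a filter mode and a read-only /sys, although the exact values depend on the configuration.

```shell
#!/bin/sh
# seccomp_mode STATUS_FILE
# Print the Seccomp field of a /proc/<pid>/status file:
# 0 = disabled, 1 = strict mode, 2 = filter mode.
seccomp_mode() {
    awk '/^Seccomp:/ { print $2 }' "$1"
}

# mount_mode MOUNTS_FILE MOUNTPOINT
# Print the first mount option (rw or ro) of MOUNTPOINT in a
# /proc/mounts-style file. If the mount point is listed more than
# once, the last entry wins. Prints an empty line if it is not found.
mount_mode() {
    awk -v mp="$2" '
        $2 == mp { split($4, opts, ","); mode = opts[1] }
        END      { print mode }
    ' "$1"
}
```

Inside the container, call `seccomp_mode /proc/self/status` and `mount_mode /proc/mounts /sys` to inspect the current process.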

Default capabilities of a common container are as follows:

| Capability Key | Description |
| --- | --- |
| SETPCAP | Modifies the process capabilities. |
| MKNOD | Allows using the mknod() system call to create special files. |
| AUDIT_WRITE | Writes records to kernel auditing logs. |
| CHOWN | Modifies UIDs and GIDs of files. For details, see chown(2). |
| NET_RAW | Allows using RAW and PACKET sockets and binding to any address for transparent proxying. |
| DAC_OVERRIDE | Ignores the discretionary access control (DAC) restrictions on files. |
| FOWNER | Ignores the restriction that the file owner ID must be the same as the process user ID. |
| FSETID | Does not clear the set-user-ID and set-group-ID bits when a file is modified. |
| KILL | Allows sending signals to processes owned by other users. |
| SETGID | Allows changing the GIDs of a process. |
| SETUID | Allows changing the UIDs of a process. |
| NET_BIND_SERVICE | Allows binding to a port whose number is smaller than 1024. |
| SYS_CHROOT | Allows using the chroot() system call. |
| SETFCAP | Allows setting capabilities on files. |

When a privileged container is enabled, the following capabilities are added:

| Capability Key | Description |
| --- | --- |
| SYS_MODULE | Loads and unloads kernel modules. |
| SYS_RAWIO | Allows direct access to /dev/port, /dev/mem, /dev/kmem, and raw block devices. |
| SYS_PACCT | Allows enabling and disabling BSD process accounting. |
| SYS_ADMIN | Allows executing system management tasks, such as mounting or unmounting file systems and setting disk quotas. |
| SYS_NICE | Allows raising process priority and setting the priorities of other processes. |
| SYS_RESOURCE | Ignores resource limits. |
| SYS_TIME | Allows changing the system clock. |
| SYS_TTY_CONFIG | Allows configuring TTY devices. |
| AUDIT_CONTROL | Enables and disables kernel auditing, modifies audit filter rules, and retrieves audit status and filter rules. |
| MAC_ADMIN | Allows mandatory access control (MAC) configuration or state changes. Implemented for the Smack Linux Security Module (LSM). |
| MAC_OVERRIDE | Overrides MAC. Implemented for the Smack LSM. |
| NET_ADMIN | Allows executing network management tasks. |
| SYSLOG | Performs privileged syslog(2) operations. |
| DAC_READ_SEARCH | Ignores the DAC restrictions on file reading and directory search. |
| LINUX_IMMUTABLE | Allows modifying the IMMUTABLE and APPEND attributes of a file. |
| NET_BROADCAST | Allows network broadcast and multicast access. |
| IPC_LOCK | Allows locking memory, including shared memory segments. |
| IPC_OWNER | Ignores the IPC ownership check. |
| SYS_PTRACE | Allows tracing any process. |
| SYS_BOOT | Allows restarting the OS. |
| LEASE | Allows establishing leases on arbitrary files (see fcntl(2)). |
| WAKE_ALARM | Allows triggering events that wake up the system, for example, by setting CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM timers. |
| BLOCK_SUSPEND | Allows blocking system suspension. |
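To see which of the capabilities listed above a container process actually holds, you can decode the CapEff bitmask from /proc/self/status. The following is a minimal sketch under stated assumptions: the function name `decode_caps` is hypothetical, the name list follows the bit numbering of linux/capability.h for bits 0 to 36 (CHOWN to BLOCK_SUSPEND), and any higher bits defined by newer kernels are ignored. As a worked example, the 14 default capabilities of a common container correspond to the mask 00000000a80425fb.

```shell
#!/bin/sh
# Capability names in bit order (bit 0 first), as numbered in linux/capability.h.
CAP_NAMES="CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID
SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW
IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT
SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD LEASE
AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM
BLOCK_SUSPEND"

# decode_caps HEXMASK
# Print, on one line, the names of the capabilities whose bits are set in
# HEXMASK (a hexadecimal string without the 0x prefix, as shown in CapEff).
decode_caps() {
    mask=$((0x$1))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        # Test bit number $bit of the mask.
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$name"
        fi
        bit=$((bit + 1))
    done
    echo "$out"
}
```

Inside a container, `decode_caps "$(awk '/^CapEff:/ { print $2 }' /proc/self/status)"` prints the effective capabilities of the current shell.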

Usage Guide

Use the --privileged option of the isula run command to enable privileged mode for a container. Do not grant privileges to containers unless necessary. Follow the principle of least privilege to reduce security risks.

```shell
isula run --rm -it --privileged busybox
```
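When a workload needs only one or two of the additional capabilities, granting them individually is closer to the principle of least privilege than using --privileged. The sketch below assumes that your isula version supports a `--cap-add` option on `isula run`; confirm this with `isula run --help` before relying on it. The helper name `run_with_caps` is hypothetical, and it only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# run_with_caps IMAGE CAP...
# Print (do not execute) an isula command line that adds only the listed
# capabilities instead of enabling full privileged mode.
# Assumption: `isula run` accepts --cap-add in the installed version.
run_with_caps() {
    image=$1
    shift
    cmd="isula run --rm -it"
    for cap in "$@"; do
        cmd="$cmd --cap-add $cap"
    done
    echo "$cmd $image"
}
```

For example, `run_with_caps busybox SYS_TIME` prints a command that would grant only SYS_TIME on top of the default capabilities.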