By default, iSulad starts common containers, which are suitable for running ordinary processes. A common container has only the default permissions granted by the capabilities defined in the /etc/default/isulad/config.json file. Privileged operations (such as using devices in the /sys directory) require a privileged container. In a privileged container, user root in the container has the root permissions of the host. Otherwise, user root in the container has only the permissions of a common user on the host.
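To see which capabilities a common container starts with, the configuration file can be inspected programmatically. The following is a minimal sketch, assuming the file follows the OCI runtime configuration layout, in which capability names are listed under process.capabilities with a CAP_ prefix. That layout, the function name default_capabilities, and the sample content are illustrative assumptions, not taken from the iSulad configuration itself.

```python
import json


def default_capabilities(config_text, cap_set="bounding"):
    """Return the capability names listed in one set of an OCI-style config.

    Assumption: capabilities live under process.capabilities.<cap_set>,
    as in the OCI runtime specification.
    """
    config = json.loads(config_text)
    caps = config.get("process", {}).get("capabilities", {}).get(cap_set, [])
    # Strip the "CAP_" prefix so the names match the short keys used in this section.
    return sorted(name[4:] if name.startswith("CAP_") else name for name in caps)


# Hypothetical, shortened sample; a real file lists more capabilities.
SAMPLE_CONFIG = """
{
  "process": {
    "capabilities": {
      "bounding": ["CAP_CHOWN", "CAP_KILL", "CAP_NET_RAW"]
    }
  }
}
"""

if __name__ == "__main__":
    print(default_capabilities(SAMPLE_CONFIG))
```

To inspect a real host, read /etc/default/isulad/config.json and pass its text to the function in place of the sample string.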
A privileged container lifts the restrictions that apply to common containers, including all restrictions enforced by the device cgroup controller. A privileged container has the following features:
Seccomp does not block any system call.
The /sys and /proc directories are writable.
All devices on the host can be accessed in the container.
All system capabilities are enabled.
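Two of these features can be observed from inside a container through /proc/self/status, which reports the effective capability mask (CapEff) and the seccomp mode (Seccomp, where 0 means disabled). The following is a rough sketch of such a check. The function names and sample status texts are hypothetical, and the default full_mask assumes only the 36 capabilities listed in this section (bits 0 to 35); newer kernels define additional capabilities.

```python
def parse_status(status_text):
    """Parse the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def looks_privileged(status_text, full_mask=0xFFFFFFFFF):
    """Heuristic: seccomp is disabled and every capability in full_mask is effective."""
    fields = parse_status(status_text)
    cap_eff = int(fields.get("CapEff", "0"), 16)
    seccomp_off = fields.get("Seccomp", "0") == "0"
    has_all_caps = (cap_eff & full_mask) == full_mask
    return seccomp_off and has_all_caps


# Hypothetical status excerpts for a common and a privileged container.
COMMON_STATUS = "Name:\tsh\nCapEff:\t00000000a80425fb\nSeccomp:\t2\n"
PRIVILEGED_STATUS = "Name:\tsh\nCapEff:\t0000003fffffffff\nSeccomp:\t0\n"

if __name__ == "__main__":
    print(looks_privileged(COMMON_STATUS))
    print(looks_privileged(PRIVILEGED_STATUS))
```

Inside a running container, the same check can be applied to the text read from /proc/self/status. It is only a heuristic and does not verify device access or whether /sys and /proc are writable.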
Default capabilities of a common container are as follows:
| Capability Key | Description |
| --- | --- |
| SETPCAP | Modifies the process capabilities. |
| MKNOD | Allows using the system call mknod() to create special files. |
| AUDIT_WRITE | Writes records to kernel auditing logs. |
| CHOWN | Modifies UIDs and GIDs of files. For details, see chown(2). |
| NET_RAW | Uses RAW and PACKET sockets and binds to any address for transparent proxying. |
| DAC_OVERRIDE | Ignores the discretionary access control (DAC) restrictions on files. |
| FOWNER | Ignores the restriction that the file owner ID must be the same as the process user ID. |
| FSETID | Keeps the setuid and setgid bits of a file when the file is modified. |
| KILL | Allows sending signals to processes owned by other users. |
| SETGID | Allows changing the process group ID. |
| SETUID | Allows changing the process user ID. |
| NET_BIND_SERVICE | Allows binding to a port whose number is smaller than 1024. |
| SYS_CHROOT | Allows using the system call chroot(). |
| SETFCAP | Allows setting capabilities on files. |
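The kernel reports a process's capabilities as a hexadecimal bit mask (for example, the CapEff field in /proc/self/status), so it can help to translate such a mask back into the capability keys used here. The following is a minimal sketch. The bit numbers follow the Linux kernel's capability numbering for the 36 capabilities named in this section, and the function name decode_capabilities is hypothetical. The mask 0xa80425fb is the value that corresponds to exactly the 14 default capabilities listed above.

```python
# Capability keys indexed by kernel bit number (0 = CHOWN ... 35 = WAKE_ALARM).
# Capabilities added in newer kernels (bit 36 and above) are not covered here.
CAP_NAMES = [
    "CHOWN", "DAC_OVERRIDE", "DAC_READ_SEARCH", "FOWNER", "FSETID", "KILL",
    "SETGID", "SETUID", "SETPCAP", "LINUX_IMMUTABLE", "NET_BIND_SERVICE",
    "NET_BROADCAST", "NET_ADMIN", "NET_RAW", "IPC_LOCK", "IPC_OWNER",
    "SYS_MODULE", "SYS_RAWIO", "SYS_CHROOT", "SYS_PTRACE", "SYS_PACCT",
    "SYS_ADMIN", "SYS_BOOT", "SYS_NICE", "SYS_RESOURCE", "SYS_TIME",
    "SYS_TTY_CONFIG", "MKNOD", "LEASE", "AUDIT_WRITE", "AUDIT_CONTROL",
    "SETFCAP", "MAC_OVERRIDE", "MAC_ADMIN", "SYSLOG", "WAKE_ALARM",
]


def decode_capabilities(mask):
    """Return the capability keys whose bits are set in mask, in bit order."""
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Mask built from the 14 default capabilities of a common container.
    for name in decode_capabilities(0xA80425FB):
        print(name)
```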
When privileged mode is enabled for a container, the following capabilities are added:
| Capability Key | Description |
| --- | --- |
| SYS_MODULE | Loads and unloads kernel modules. |
| SYS_RAWIO | Allows direct access to /dev/port, /dev/mem, /dev/kmem, and raw block devices. |
| SYS_PACCT | Allows BSD process accounting. For details, see acct(2). |
| SYS_ADMIN | Allows executing system management tasks, such as mounting or unmounting file systems and setting disk quotas. |
| SYS_NICE | Allows raising the process priority and setting the priorities of other processes. |
| SYS_RESOURCE | Ignores resource restrictions. |
| SYS_TIME | Allows changing the system clock. |
| SYS_TTY_CONFIG | Allows configuring TTY devices. |
| AUDIT_CONTROL | Enables and disables kernel auditing, modifies audit filter rules, and extracts audit status and filtering rules. |
| MAC_ADMIN | Allows mandatory access control (MAC) configuration or status changes, which is implemented for the Smack Linux Security Module (LSM). |
| MAC_OVERRIDE | Overrides MAC, which is implemented for Smack LSM. |
| NET_ADMIN | Allows executing network management tasks. |
| SYSLOG | Performs the privileged syslog(2) operation. |
| DAC_READ_SEARCH | Ignores the DAC restrictions on file reading and on directory reading and searching. |
| LINUX_IMMUTABLE | Allows modifying the immutable and append-only attributes of a file. |
| NET_BROADCAST | Allows network broadcast and multicast access. |
| IPC_LOCK | Allows locking memory, including shared memory segments. |
| IPC_OWNER | Ignores the IPC ownership check. |
| SYS_PTRACE | Allows tracing any process. |
| SYS_BOOT | Allows restarting the OS. |
| LEASE | Allows establishing leases on arbitrary files. For details, see fcntl(2). |
| WAKE_ALARM | Allows triggering events that wake up the system, for example, setting the CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM timers. |
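Before resorting to a privileged container, it can be useful to work out which of the capabilities a workload needs are actually missing from the default set. The following is a small sketch that encodes the two lists in this section as sets and reports the difference. The function name missing_from_default and the example workload are hypothetical.

```python
# Default capabilities of a common container, as listed in this section.
DEFAULT_CAPS = frozenset({
    "SETPCAP", "MKNOD", "AUDIT_WRITE", "CHOWN", "NET_RAW", "DAC_OVERRIDE",
    "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "NET_BIND_SERVICE",
    "SYS_CHROOT", "SETFCAP",
})

# Capabilities that this section lists as added for a privileged container.
PRIVILEGED_ONLY_CAPS = frozenset({
    "SYS_MODULE", "SYS_RAWIO", "SYS_PACCT", "SYS_ADMIN", "SYS_NICE",
    "SYS_RESOURCE", "SYS_TIME", "SYS_TTY_CONFIG", "AUDIT_CONTROL",
    "MAC_ADMIN", "MAC_OVERRIDE", "NET_ADMIN", "SYSLOG", "DAC_READ_SEARCH",
    "LINUX_IMMUTABLE", "NET_BROADCAST", "IPC_LOCK", "IPC_OWNER",
    "SYS_PTRACE", "SYS_BOOT", "LEASE", "WAKE_ALARM",
})


def missing_from_default(required):
    """Return the required capabilities that a common container lacks."""
    required = set(required)
    unknown = required - DEFAULT_CAPS - PRIVILEGED_ONLY_CAPS
    if unknown:
        raise ValueError("unknown capability keys: " + ", ".join(sorted(unknown)))
    return sorted(required - DEFAULT_CAPS)


if __name__ == "__main__":
    # Hypothetical workload that changes file owners, configures the
    # network, and sets the system clock.
    print(missing_from_default(["CHOWN", "NET_ADMIN", "SYS_TIME"]))
```

An empty result means that the default capabilities of a common container already cover the workload.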
iSulad uses the --privileged option to enable privileged mode for a container. Do not add privileges to containers unless necessary. Comply with the principle of least privilege to reduce security risks.
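As an illustration of the least-privilege principle, the following sketch assembles a container start command that grants individual capabilities and falls back to --privileged only when explicitly requested. Only the --privileged option is named in this section. The isula run subcommand and the --cap-add option are assumed here to be available in the iSulad command line client, and the function name, image name, and command are hypothetical.

```python
def build_run_command(image, command, cap_add=(), privileged=False):
    """Build an argument list for starting a container.

    Assumption: the client accepts 'isula run', '--privileged', and
    '--cap-add <KEY>'; verify the options against the installed version.
    """
    args = ["isula", "run"]
    if privileged:
        # Grants every capability and lifts the other restrictions described above.
        args.append("--privileged")
    else:
        # Grant only the capabilities that the workload needs.
        for cap in cap_add:
            args += ["--cap-add", cap]
    args.append(image)
    args += list(command)
    return args


if __name__ == "__main__":
    # Hypothetical image and command; prints the command line without running it.
    print(" ".join(build_run_command("busybox", ["date"], cap_add=["SYS_TIME"])))
```

The resulting list can be passed to subprocess.run on a host where the client is installed. Granting a single capability keeps the seccomp and device cgroup restrictions of a common container in place.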