      1 Imperceptible Container Management Plane Offload

      1.1 Overview

      The container management plane refers to container management tools such as dockerd, containerd, and isulad. Container management plane offload means moving these tools from the HOST where the containers run to the DPU.

      qtfs is used to mount the HOST directories related to container running to the DPU so that the container management plane can access them and prepare the environment required for running containers. In addition, the remote proc and sys file systems need to be mounted. To avoid affecting the current system, create a dedicated rootfs (referred to as /another_rootfs) as the running environment of dockerd and containerd.

      The rexec command is used to start or delete containers, allowing the management plane and the containers to run on different machines under remote management. You can use either of the following modes to verify the offload.

      1.1.1 Test Mode

      Prepare two physical machines or VMs that can communicate with each other.

      One physical machine functions as the DPU, and the other functions as the host. In this document, DPU and HOST refer to the two physical machines.

      NOTE: In the test mode, network ports are exposed without connection authentication, which is risky and should be used only for internal tests and verification. Do not use this mode in the production environment. In the production environment, use closed communication to prevent external connections, such as the vsock mode.

      1.1.2 vsock Mode

      The DPU and HOST are required and must be able to provide vsock communication through virtio.

      This document describes only the test mode usage. If the test environment supports vsock communication (a virtual environment or a DPU-HOST environment with vsock support), the following procedure still applies; you only need to change the IP addresses to vsock CIDs, and TEST_MODE is not required when compiling the binary files.

      2 Environment Setup

      2.1 qtfs File System Deployment

      For details, see qtfs.

      NOTE: If the test mode is used, set QTFS_TEST_MODE to 1 when compiling the .ko files on the qtfs client and server. If the vsock mode is used, you do not need to set QTFS_TEST_MODE.

      To establish a qtfs connection, you need to disable the firewall between the DPU and HOST, or open related network ports on the firewall.
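
      For example, in a disposable test environment you might simply stop firewalld on both machines (a sketch; in production, open only the specific ports that qtfs and udsproxyd use instead of disabling the firewall):

      # Test environments only: stop the firewall on the DPU and the HOST.
      systemctl stop firewalld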

      2.2 udsproxyd Service Deployment

      2.2.1 Overview

      udsproxyd is a cross-host Unix domain socket (UDS) proxy service that must be deployed on both the HOST and the DPU. The udsproxyd components on the two sides are peers. They let processes on the HOST and DPU communicate transparently through standard UDSs: if two processes can communicate over a UDS on the same host, they can communicate in the same way between the HOST and DPU without any code changes. A process can use udsproxyd either by running with LD_PRELOAD=libudsproxy.so or by configuring the udsconnect allowlist in advance. The methods for configuring the allowlist are described later.

      2.2.2 Deploying udsproxyd

      Compile udsproxyd in the dpu-utilities project.

      cd qtfs/ipc
      
      make -j UDS_TEST_MODE=1 && make install
      

      NOTE: If the vsock mode is used, you do not need to set UDS_TEST_MODE during compilation.

      The latest engine service on the qtfs server has integrated the udsproxyd capability. Therefore, you do not need to start the udsproxyd service on the server. Start the udsproxyd service on the client.

      nohup /usr/bin/udsproxyd <thread num> <addr> <port> <peer addr> <peer port> 2>&1 &
      

      Parameters:

      thread num: number of threads. Currently, only one thread is supported.
      
      addr: IP address of the local host. If the vsock communication mode is used, the value is the CID.
      
      port: port used on the local host.
      
      peer addr: IP address of the udsproxyd peer. If the vsock communication mode is used, the value is the CID.
      
      peer port: port used on the udsproxyd peer.
      

      Example:

      nohup /usr/bin/udsproxyd 1 192.168.10.10 12121 192.168.10.11 12121 2>&1 &
      

      If the qtfs engine service is not started and you want to test udsproxyd separately, start udsproxyd on the server.

      nohup /usr/bin/udsproxyd 1 192.168.10.11 12121 192.168.10.10 12121 2>&1 &
      

      2.2.3 Using udsproxyd

      2.2.3.1 Using the udsproxyd Service Independently

      When starting the client process of the Unix socket application that uses the UDS service, add the LD_PRELOAD=libudsproxy.so environment variable to intercept the connect API of glibc for UDS interconnection. Alternatively, run the qtcfg command to configure the udsconnect allowlist to instruct the system to take over UDS connections in specified directories.
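
      A minimal sketch of the LD_PRELOAD method (the library path /usr/lib64/libudsproxy.so and the client binary name are assumptions; use the path where make install placed the library):

      # Preload the interposer so glibc connect() calls are redirected through udsproxyd
      LD_PRELOAD=/usr/lib64/libudsproxy.so ./my_uds_client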

      2.2.3.2 Using the udsproxyd Service Transparently

      Configure the allowlist of the UDS service for qtfs. The socket file bound to the Unix socket server needs to be added to the allowlist. You can use either of the following methods:

      • Load the allowlist by using the qtcfg utility. First compile the utility in qtfs/qtinfo.

      Run the following command on the qtfs client:

      make role=client 
      make install
      

      Run the following command on the qtfs server:

      make role=server
      make install
      

      After qtcfg is installed automatically, run qtcfg to configure the allowlist. Assume that /var/lib/docker needs to be added to the allowlist:

      qtcfg -w udsconnect -x /var/lib/docker
      

      Query the allowlist:

      qtcfg -w udsconnect -z
      

      Delete an allowlist entry:

      qtcfg -w udsconnect -y 0
      

      The parameter is the index number listed when you query the allowlist.

      • Add an allowlist entry through the configuration file. The configuration file needs to be set before the qtfs or qtfs_server kernel module is loaded. The allowlist is loaded when the kernel modules are initialized.

      Add the following content to the /etc/qtfs/whitelist file.

      [Udsconnect]
      /var/lib/docker
      

      NOTE: The allowlist prevents irrelevant Unix sockets from establishing remote connections, which would cause errors or waste resources. Set the allowlist as precisely as possible. For example, in this document, /var/lib/docker is added for the container scenario; directly adding /var/lib, /var, or the root directory would be risky.

      2.3 rexec Service Deployment

      2.3.1 Overview

      rexec is a remote execution component written in C. It consists of the rexec client and the rexec server. The server is a daemon process; the client is a binary file. When started, the client establishes a UDS connection with the server through the udsproxyd service, and the server daemon starts the specified program on its machine. During container management plane offload, dockerd is offloaded to the DPU. When dockerd needs to start a service container process on the HOST, it invokes the rexec client to start the process remotely.
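
      For example, once both sides are deployed, invoking the client with a program path and arguments runs that program on the peer machine (a sketch of typical usage; the program must be permitted by the rexec allowlist):

      # Runs /usr/bin/ls on the peer machine and relays its output back
      rexec /usr/bin/ls /var/lib/docker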

      2.3.2 Deploying rexec

      2.3.2.1 Configuring the Environment Variables and Allowlist

      Configure the rexec server allowlist on the HOST: place the whitelist file in the /etc/rexec directory and change its permission to read-only.

      chmod 400 /etc/rexec/whitelist
      

      After downloading the dpu-utilities code, go to the qtfs/rexec directory and run make && make install to install all binary files required by rexec (rexec and rexec_server) to the /usr/bin directory.

      Before starting the rexec_server service on the server, check whether the /var/run/rexec directory exists. If not, create it.

      mkdir /var/run/rexec
      

      The underlying communication of the rexec service uses Unix sockets. Cross-host communication between rexec and rexec_server therefore depends on the udsproxyd service, and the related socket directory must be added to the udsconnect allowlist.

      qtcfg -w udsconnect -x /var/run/rexec
      

      2.3.2.2 Starting the Service

      You can start the rexec_server service on the server in either of the following ways.

      • Method 1: Configure rexec as a systemd service.

      Add the rexec.service file to /usr/lib/systemd/system.

      rexec.service
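
      A minimal sketch of such a unit file (the actual rexec.service shipped with dpu-utilities may differ):

      [Unit]
      Description=rexec server daemon
      After=network.target
      
      [Service]
      ExecStart=/usr/bin/rexec_server
      Restart=always
      
      [Install]
      WantedBy=multi-user.target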

      Then, use systemctl to manage the rexec service.

      Start the service for the first time:

      systemctl daemon-reload
      systemctl enable --now rexec
      

      Restart the service:

      systemctl stop rexec
      systemctl start rexec
      
      • Method 2: Manually start the service in the background.

      nohup /usr/bin/rexec_server 2>&1 &
      

      3 Changes to Management Plane Components

      3.1 dockerd

      The changes to dockerd are based on version 18.09.

      For details about the changes to Docker, see the patch file in this directory.

      3.2 containerd

      The changes to containerd are based on containerd-1.2-rc.1.

      For details about the changes to containerd, see the patch file in this directory.

      4 Container Management Plane Offload Guide

      NOTE:

      1. Start rexec_server on both the HOST and the DPU.
      2. rexec_server on the HOST is used by the DPU to start containerd-shim through rexec when a container is created.
      3. rexec_server on the DPU serves the calls that containerd-shim on the HOST makes back to dockerd and containerd.

      4.1 Preparing the Rootfs for Running dockerd and containerd

      Note: Perform this step only on the DPU.

      In the following document, the rootfs is called /another_rootfs (the directory name can be changed as required).

      4.1.1 Using the Official openEuler QCOW2 Image

      You are advised to use the QCOW2 image provided by openEuler to prepare the new rootfs.

      4.1.1.1 Installing the Tools

      Use yum to install xz, kpartx, and qemu-img.

      yum install xz kpartx qemu-img
      

      4.1.1.2 Downloading the QCOW2 Image

      Download the openEuler 22.03 LTS VM image for x86 or openEuler 22.03 LTS VM image for Arm64 from the openEuler official website.
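
      For example, to fetch the x86_64 image (a sketch; the URL layout is an assumption, so take the actual link from the download page):

      wget https://repo.openeuler.org/openEuler-22.03-LTS/virtual_machine_img/x86_64/openEuler-22.03-LTS-x86_64.qcow2.xz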

      4.1.1.3 Decompressing the QCOW2 Image

      Run xz -d to decompress the package and obtain the openEuler-22.03-LTS-<arch>.qcow2 file. The following uses the x86 image as an example.

      xz -d openEuler-22.03-LTS-x86_64.qcow2.xz
      

      4.1.1.4 Mounting the QCOW2 Image and Copying Files

      1. Load the nbd module: modprobe nbd max_part=<any number>.
      2. Connect the image: qemu-nbd -c /dev/nbd0 <VM image path>.
      3. Create a folder, for example, /random_dir.
      4. Mount the rootfs partition: mount /dev/nbd0p2 /random_dir.
      5. Copy the files.
      mkdir /another_rootfs
      cp -r /random_dir/* /another_rootfs/

      The contents of the VM image are now available in /another_rootfs.

      4.1.1.5 Unmounting QCOW2

      After the rootfs is prepared, run the following commands to unmount the QCOW2 image:

      umount /random_dir
      qemu-nbd -d /dev/nbd0
      

      4.1.2 Installing Software in /another_rootfs

      1. Copy /etc/resolv.conf from the root directory to /another_rootfs/etc/resolv.conf.
      2. Remove the files in /another_rootfs/etc/yum.repos.d and copy the files in /etc/yum.repos.d/ to /another_rootfs/etc/yum.repos.d.
      3. Run yum install <software package> --installroot=/another_rootfs to install a software package.
      yum install --installroot=/another_rootfs iptables
      

      4.2 Starting qtfs_server on the HOST

      Copy rexec, containerd-shim, runc, and engine to the /usr/bin directory, and make sure they have execute permission. rexec and engine are provided; compile the Docker binary files based on the patches described in 3 Changes to Management Plane Components.

      4.2.1 Inserting the qtfs_server Module

      Create a folder required by the container management plane, insert qtfs_server.ko, and start the engine process.

      You can run this script to perform this operation. If an error occurs during the execution, try using dos2unix to convert the format of the script (the same applies to all the following scripts).

      Replace the paths of the module and binary file in the script with the actual qtfs path.
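
      A sketch of what such a script does (the qtfs_server.ko parameter names and the engine invocation below are assumptions; take the exact commands from the qtfs documentation and replace the paths with your own):

      #!/bin/bash
      # Create the directories that the container management plane will access remotely.
      mkdir -p /var/lib/docker /var/run/docker /var/run/containerd
      # Load the qtfs server module (parameter names are assumptions -- see the qtfs docs).
      insmod /YOUR_QTFS_PATH/qtfs_server.ko qtfs_server_ip=<HOST IP> qtfs_server_port=12121 qtfs_log_level=WARN
      # Start the engine process in the background (arguments depend on your deployment).
      nohup /usr/bin/engine 2>&1 &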

      In addition, create the /usr/bin/dockerd and /usr/bin/containerd scripts for executing the rexec command on the HOST.

      /usr/bin/dockerd:

      #!/bin/bash
      rexec /usr/bin/dockerd $*
      

      /usr/bin/containerd:

      #!/bin/bash
      rexec /usr/bin/containerd $*
      

      After the two scripts are created, run the chmod command to grant execute permission on them.

      chmod +x /usr/bin/containerd
      chmod +x /usr/bin/dockerd
      

      4.3 Mounting the Dependency Directories on the HOST to the DPU

      4.3.1 Installing the Software Packages

      4.3.1.1 Installing in the Root Directory

      In the DPU root directory (not /another_rootfs), install iptables, libtool, libcgroup, and tar using yum.

      yum install iptables libtool libcgroup tar
      

      You can also download all dependency packages and install them by running the rpm command. The required packages and their dependencies are iptables, libtool, emacs, autoconf, automake, libtool-ltdl, m4, tar, and libcgroup.

      After downloading the preceding software packages, run the following command:

      rpm -ivh iptables-1.8.7-5.oe2203.x86_64.rpm libtool-2.4.6-34.oe2203.x86_64.rpm emacs-27.2-3.oe2203.x86_64.rpm autoconf-2.71-2.oe2203.noarch.rpm automake-1.16.5-3.oe2203.noarch.rpm libtool-ltdl-2.4.6-34.oe2203.x86_64.rpm m4-1.4.19-2.oe2203.x86_64.rpm tar-1.34-1.oe2203.x86_64.rpm libcgroup-0.42.2-1.oe2203.x86_64.rpm
      

      4.3.1.2 Configuring the /another_rootfs Environment

      1. Install iptables in /another_rootfs, which is mandatory for dockerd startup.

        Run yum install iptables --installroot=/another_rootfs to install it.

      2. Copy rexec to /another_rootfs/usr/bin and grant execute permission.

        cp rexec /another_rootfs/usr/bin
        chmod +x /another_rootfs/usr/bin/rexec
        
      3. In addition, copy the containerd and dockerd binaries, compiled from the community source code with the preceding patches, to /another_rootfs/usr/bin, and copy the docker client binary to /usr/bin.

        cp {YOUR_PATH}/dockerd /another_rootfs/usr/bin
        cp {YOUR_PATH}/containerd /another_rootfs/usr/bin
        cp {YOUR_PATH}/docker /usr/bin
        
      4. Delete /another_rootfs/usr/sbin/modprobe from /another_rootfs.

        rm -f /another_rootfs/usr/sbin/modprobe
        
      5. Create the following scripts in /another_rootfs:

        /another_rootfs/usr/local/bin/containerd-shim:

        #!/bin/bash
        /usr/bin/rexec /usr/bin/containerd-shim $*
        

        /another_rootfs/usr/bin/remote_kill:

        #!/bin/bash
        /usr/bin/rexec /usr/bin/kill $*
        

        /another_rootfs/usr/sbin/modprobe:

        #!/bin/bash
        /usr/bin/rexec /usr/sbin/modprobe $*
        

        After the creation is complete, grant execute permission to them.

        chmod +x /another_rootfs/usr/local/bin/containerd-shim
        chmod +x /another_rootfs/usr/bin/remote_kill
        chmod +x /another_rootfs/usr/sbin/modprobe
        

      4.3.2 Mounting Directories

      Run the prepare.sh script on the DPU to mount the HOST directories required by dockerd and containerd to the DPU.

      In addition, ensure that the remote directories mounted by the script exist on both the HOST and DPU.
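
      A sketch of the kind of mounts prepare.sh performs (the directory list is illustrative, not the authoritative set; the real script in the repository is authoritative):

      #!/bin/bash
      # Mount HOST directories needed by dockerd and containerd into the DPU rootfs via qtfs.
      for d in /var/lib/docker /run/containerd; do   # directory list is an assumption
          mkdir -p "$d" "/another_rootfs$d"          # mount points must exist on both sides
          mount -t qtfs "$d" "/another_rootfs$d"     # qtfs syntax: mount -t qtfs <remote dir> <mount point>
      done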

      4.4 dockerd and containerd Startup

      On the DPU, open two sessions and, in each one, chroot to the /another_rootfs environment prepared for running dockerd and containerd.

      chroot /another_rootfs
      

      Run the following commands in the two sessions to start containerd and then dockerd:

      containerd

      #!/bin/bash
      SHIM_HOST=${YOUR_SERVER_IP} containerd --config /var/run/docker/containerd/containerd.toml --address /var/run/containerd/containerd.sock
      

      dockerd

      #!/bin/bash
      # this needs to be done only once
      /usr/bin/rexec mount -t qtfs /var/lib/docker/overlay2 /another_rootfs/var/lib/docker/overlay2/
      SHIM_HOST=${YOUR_SERVER_IP} /usr/bin/dockerd --containerd /var/run/containerd/containerd.sock -s overlay2 --iptables=false --debug 2>&1 | tee docker.log
      

      Because /var/run/ and /another_rootfs/var/run/ are bind-mounted to each other, you can run the docker client in the normal rootfs and access the docker.sock interface to manage containers.
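
      For example, from the normal rootfs on the DPU (the image name is a placeholder):

      # Standard docker CLI commands now operate on the offloaded management plane
      docker ps
      docker run -itd --net none <image> bash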

      5 Environment Restoration

      To restore the environment, delete the existing containers, stop the containerd and dockerd processes, and then run the following commands to unmount the related directories:

      # Kill any residual containerd-shim (v1.linux runtime) processes.
      for i in `lsof | grep v1.linux | awk '{print $2}'`
      do
              kill -9 $i
      done
      # Unmount all qtfs and /another_rootfs mount points.
      mount | grep qtfs | awk '{print $3}' | xargs umount
      mount | grep another_rootfs | awk '{print $3}' | xargs umount
      
      sleep 1
      
      umount /another_rootfs/etc
      umount /another_rootfs/sys
      # Stop the UDS proxy and remove the qtfs client module.
      pkill udsproxyd
      rmmod qtfs
      
