      Imperceptible Container Management Plane Offload Deployment Guide

      NOTE:

      This user guide modifies the container management plane components and the rexec tool of specific versions. You can adapt the changes to other versions based on your actual execution environment. The patch provided in this document is for verification only and is not for commercial use.

      NOTE:

      The communication between shared file systems is implemented through the network. You can perform a simulated offload using two physical machines or VMs connected through the network.

      Before the verification, you are advised to set up a working Kubernetes cluster and container running environment, and then offload the management plane process of a single node. You can use a physical machine or VM that is connected to the network as an emulated DPU.

      Introduction

      The container management plane consists of container management tools such as Kubernetes, dockerd, containerd, and isulad. Container management plane offload moves these tools from the host where the containers run to another host, namely the DPU, which is a set of hardware with an independent running environment.

      The directories on the host that are related to container running are mounted to the DPU through qtfs, so that the container management plane tools running on the DPU can access these directories and prepare the running environment for the containers running on the host. To remotely mount special file systems such as proc and sys, a dedicated rootfs (referred to as /another_rootfs) is created as the running environment of Kubernetes and dockerd.

      In addition, rexec is used to start and delete containers so that the container management plane and containers can run on two different hosts for remote container management.

      rexec

      rexec is a remote execution tool written in the Go language based on the rexec example tool of Docker/libchan. rexec is used to remotely invoke binary files. For ease of use, capabilities such as transferring environment variables and monitoring the exit of original processes are added to rexec.

      To use the rexec tool, run the CMD_NET_ADDR=tcp://0.0.0.0:<port_number> rexec_server command on the server to start the rexec service process, and then run the CMD_NET_ADDR=tcp://<server_IP_address>:<port_number> rexec [command] command on the client. This instructs rexec_server to execute the command.
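
As a rough illustration of the two commands above, the following sketch builds the CMD_NET_ADDR value and prints the server and client command lines instead of running them. The helper name rexec_addr, the address 192.0.2.10, the port 7777, and the sample command ls / are assumptions made for illustration. Only the CMD_NET_ADDR variable and the rexec and rexec_server binaries come from this guide.

```shell
#!/bin/bash
# Hypothetical helper (not part of rexec): build a CMD_NET_ADDR value.
rexec_addr() {
    local ip="$1" port="$2"
    printf 'tcp://%s:%s\n' "$ip" "$port"
}

# Assumed example values; replace them with your own.
SERVER_IP="192.0.2.10"
REXEC_PORT="7777"

# Server side: rexec_server listens on all interfaces.
echo "CMD_NET_ADDR=$(rexec_addr 0.0.0.0 "$REXEC_PORT") rexec_server"

# Client side: ask rexec_server to run a command (here "ls /").
echo "CMD_NET_ADDR=$(rexec_addr "$SERVER_IP" "$REXEC_PORT") rexec ls /"
```

Printing the command lines first makes it easy to confirm that the address and port match on both sides before either process is started.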

      dockerd

      The changes to dockerd are based on version 18.09.

      In containerd, the part that invokes libnetwork-setkey through hook is commented out. This does not affect container startup. In addition, to ensure the normal use of docker load, an error in the mount function in mounter_linux.go is commented out.

      In the running environment of the container management plane, /proc is mounted to the proc file system on the server, and the local proc file system is mounted to /local_proc. In dockerd and containerd, /proc is changed to /local_proc for accessing /proc/self/xxx, /proc/getpid()/xxx, or related file systems.
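
The path substitution described above can be pictured with a small shell sketch. The real changes are made in the dockerd and containerd source code, so this is only an illustration of the mapping, and the function name to_local_proc is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: rewrite a /proc path to its /local_proc equivalent.
to_local_proc() {
    local path="$1"
    case "$path" in
        /proc)   printf '/local_proc\n' ;;
        /proc/*) printf '/local_proc/%s\n' "${path#/proc/}" ;;
        *)       printf '%s\n' "$path" ;;   # not under /proc: unchanged
    esac
}

to_local_proc /proc/self/exe
to_local_proc "/proc/$$/status"
to_local_proc /etc/hosts
```

Only paths that refer to the local process need this mapping. Paths that are meant to describe the server keep using /proc.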

      containerd

      The changes to containerd are based on containerd-1.2-rc.1.

      When obtaining mounting information, /proc/self/mountinfo can obtain only the local mounting information of dockerd but cannot obtain that on the server. Therefore, /proc/self/mountinfo is changed to /proc/1/mountinfo to obtain the mounting information on the server by obtaining the mounting information of process 1 on the server.
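
To see what the change exposes, you can list the mount points recorded in a mountinfo file. The sketch below is an assumption-based illustration rather than part of the patch: the helper name list_mount_points is hypothetical, and it relies on the mount point being the fifth field of each mountinfo entry.

```shell
#!/bin/bash
# Hypothetical helper: print the mount point (field 5) of each mountinfo entry.
list_mount_points() {
    local file="${1:-/proc/1/mountinfo}"
    awk '{ print $5 }' "$file"
}

# Inside the offloaded environment, /proc is the proc file system of the
# server, so process 1 there is the init process of the server.
if [ -r /proc/1/mountinfo ]; then
    list_mount_points /proc/1/mountinfo | head -n 5
fi
```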

      In containerd-shim, the Unix socket used to communicate with containerd is changed to TCP. containerd obtains the IP address of the running environment of containerd-shim, that is, the IP address of the server, through the SHIM_HOST environment variable. The hash value of shim is used to generate a port number, which is used as the communication port to start containerd-shim.
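
The guide does not state which hash function or port range the patched containerd uses. The sketch below only shows the general idea of deriving a stable port from an identifier. The function name shim_port, the use of the cksum CRC, and the range 10000 to 59999 are all assumptions.

```shell
#!/bin/bash
# Hypothetical sketch: map a shim identifier to a TCP port.
# The CRC from cksum and the 10000-59999 range are assumptions,
# not the scheme used by the patched containerd.
shim_port() {
    local id="$1" crc
    crc=$(printf '%s' "$id" | cksum | cut -d ' ' -f 1)
    echo $(( 10000 + crc % 50000 ))
}

shim_port "example-container-id"
```

Because the port depends only on the identifier, both sides can compute it independently. Two identifiers can still map to the same port, which a real implementation would have to handle.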

      In addition, the original method of sending signals to containerd-shim is changed to the method of remotely invoking the kill command to send signals to shim, ensuring that Docker can correctly kill containers.

      Kubernetes

      kubelet is not modified. The first attempt to configure the container QoS manager may fail. This error does not affect the subsequent pod startup process.

      Container Management Plane Offload Operation Guide

      Start rexec_server on both the server and client. rexec_server on the server handles the rexec requests that start containerd-shim. rexec_server on the client executes the dockerd and containerd invocations issued by containerd-shim.

      Server

      Create a folder required by the container management plane, insert qtfs_server.ko, and start the engine process.

      In addition, you need to create the rexec script /usr/bin/dockerd on the server.

      #!/bin/bash
      CMD_NET_ADDR=tcp://<client_IP_address>:<rexec_port_number> rexec /usr/bin/dockerd "$@"
      

      Client

      Prepare a rootfs as the running environment of dockerd and containerd. Use the following script to mount the server directories required by dockerd and containerd to the client. Ensure that the remote directories mounted in the script exist on both the server and client.

      #!/bin/bash
      mkdir -p /another_rootfs/var/run/docker/containerd
      iptables -t nat -N DOCKER
      echo "---------insmod qtfs ko----------"
      insmod /YOUR/QTFS/PATH/qtfs.ko qtfs_server_ip=<server_IP_address> qtfs_log_level=INFO
      
      # The proc file system in the chroot environment is replaced by the proc shared file system of the DPU. The actual proc file system of the local host needs to be mounted to /local_proc.
      mount -t proc proc /another_rootfs/local_proc/
      
      # Bind the chroot internal environment to the external environment to facilitate configuration and running.
      mount --bind /var/run/ /another_rootfs/var/run/
      mount --bind /var/lib/ /another_rootfs/var/lib/
      mount --bind /etc /another_rootfs/etc
      
      mkdir -p /another_rootfs/var/lib/isulad
      
      # Create and mount the dev, sys, and cgroup file systems in the chroot environment.
      mount -t devtmpfs devtmpfs /another_rootfs/dev/
      mount -t sysfs sysfs /another_rootfs/sys
      mkdir -p /another_rootfs/sys/fs/cgroup
      mount -t tmpfs tmpfs /another_rootfs/sys/fs/cgroup
      list="perf_event freezer files net_cls,net_prio hugetlb pids rdma cpu,cpuacct memory devices blkio cpuset"
      for i in $list
      do
              echo $i
              mkdir -p /another_rootfs/sys/fs/cgroup/$i
              mount -t cgroup cgroup -o rw,nosuid,nodev,noexec,relatime,$i /another_rootfs/sys/fs/cgroup/$i
      done
      
      ## common system dir
      mount -t qtfs -o proc /proc /another_rootfs/proc
      echo "proc"
      mount -t qtfs /sys /another_rootfs/sys
      echo "cgroup"
      
      # Mount the shared directory required by the container management plane.
      mount -t qtfs /var/lib/docker/containers /another_rootfs/var/lib/docker/containers
      mount -t qtfs /var/lib/docker/containerd /another_rootfs/var/lib/docker/containerd
      mount -t qtfs /var/lib/docker/overlay2 /another_rootfs/var/lib/docker/overlay2
      mount -t qtfs /var/lib/docker/image /another_rootfs/var/lib/docker/image
      mount -t qtfs /var/lib/docker/tmp /another_rootfs/var/lib/docker/tmp
      mkdir -p /another_rootfs/run/containerd/io.containerd.runtime.v1.linux/
      mount -t qtfs /run/containerd/io.containerd.runtime.v1.linux/ /another_rootfs/run/containerd/io.containerd.runtime.v1.linux/
      mkdir -p /another_rootfs/var/run/docker/containerd
      mount -t qtfs /var/run/docker/containerd /another_rootfs/var/run/docker/containerd
      mount -t qtfs /var/lib/kubelet/pods /another_rootfs/var/lib/kubelet/pods
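
The qtfs mounts above fail if a shared directory is missing on either side. A minimal sketch for creating them is shown here. The function name ensure_shared_dirs and its optional prefix argument are hypothetical, and the directory list is copied from the mount -t qtfs lines for the container management plane.

```shell
#!/bin/bash
# Hypothetical helper: create the shared directories under an optional prefix.
# Use an empty prefix on the server and /another_rootfs on the client.
ensure_shared_dirs() {
    local root="${1:-}" dir
    for dir in \
        /var/lib/docker/containers \
        /var/lib/docker/containerd \
        /var/lib/docker/overlay2 \
        /var/lib/docker/image \
        /var/lib/docker/tmp \
        /run/containerd/io.containerd.runtime.v1.linux \
        /var/run/docker/containerd \
        /var/lib/kubelet/pods
    do
        mkdir -p "${root}${dir}" || return 1
    done
}

# Example: ensure_shared_dirs /another_rootfs
```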
      

      In /another_rootfs, create the following scripts to support cross-host operations:

      • /another_rootfs/usr/local/bin/containerd-shim
      #!/bin/bash
      CMD_NET_ADDR=tcp://<server_IP_address>:<rexec_port_number> /usr/bin/rexec /usr/bin/containerd-shim "$@"
      
      • /another_rootfs/usr/local/bin/remote_kill
      #!/bin/bash
      CMD_NET_ADDR=tcp://<server_IP_address>:<rexec_port_number> /usr/bin/rexec /usr/bin/kill "$@"
      
      • /another_rootfs/usr/sbin/modprobe
      #!/bin/bash
      CMD_NET_ADDR=tcp://<server_IP_address>:<rexec_port_number> /usr/bin/rexec /usr/sbin/modprobe "$@"
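
The three wrapper scripts share one pattern, so they can also be generated. The following sketch assumes a hypothetical helper named make_rexec_wrapper. The address 192.0.2.10 and port 7777 in the usage comment are placeholders. The generated wrapper passes its arguments as "$@" so that arguments containing spaces stay intact.

```shell
#!/bin/bash
# Hypothetical helper: write a wrapper that forwards a command through rexec.
#   $1: path of the wrapper to create
#   $2: binary to run on the server
#   $3: server IP address
#   $4: rexec port number
make_rexec_wrapper() {
    local dest="$1" remote_bin="$2" ip="$3" port="$4"
    mkdir -p "$(dirname "$dest")" || return 1
    cat > "$dest" <<EOF
#!/bin/bash
CMD_NET_ADDR=tcp://${ip}:${port} /usr/bin/rexec ${remote_bin} "\$@"
EOF
    chmod +x "$dest"
}

# Example (assumed address and port):
# make_rexec_wrapper /another_rootfs/usr/local/bin/remote_kill /usr/bin/kill 192.0.2.10 7777
```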
      

      After changing the root directories of dockerd and containerd to the required rootfs, run the following commands to start dockerd and containerd:

      • containerd
      #!/bin/bash
      SHIM_HOST=<server_IP_address> containerd --config /var/run/docker/containerd/containerd.toml --address /var/run/containerd/containerd.sock
      
      • dockerd
      #!/bin/bash
      SHIM_HOST=<server_IP_address> CMD_NET_ADDR=tcp://<server_IP_address>:<rexec_port_number> /usr/bin/dockerd --containerd /var/run/containerd/containerd.sock
      
      • kubelet

      Use the original parameters to start kubelet in the chroot environment.
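
One possible way to combine the root directory change with the start commands is sketched below. It assumes that the rootfs is /another_rootfs, that /usr/bin/env exists inside it, and that 192.0.2.10 and 7777 stand in for the server IP address and rexec port. The run, start_containerd, and start_dockerd names are hypothetical. By default the sketch only prints the commands.

```shell
#!/bin/bash
# Assumed example values; replace them with your own.
ROOTFS="/another_rootfs"
SERVER_IP="192.0.2.10"
REXEC_PORT="7777"
DRY_RUN="${DRY_RUN:-1}"   # 1: only print the commands, 0: run them

# Print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

start_containerd() {
    run chroot "$ROOTFS" /usr/bin/env "SHIM_HOST=$SERVER_IP" \
        containerd --config /var/run/docker/containerd/containerd.toml \
        --address /var/run/containerd/containerd.sock
}

start_dockerd() {
    run chroot "$ROOTFS" /usr/bin/env "SHIM_HOST=$SERVER_IP" \
        "CMD_NET_ADDR=tcp://$SERVER_IP:$REXEC_PORT" \
        /usr/bin/dockerd --containerd /var/run/containerd/containerd.sock
}

# Both daemons run in the foreground. With DRY_RUN=0, start each one
# in its own terminal or service instead of calling both here.
start_containerd
start_dockerd
```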

      Because /var/run/ is bind-mounted to /another_rootfs/var/run/, the docker client in the regular rootfs can access the docker.sock interface to manage containers.

      The container management plane is offloaded to the DPU. You can run docker commands to create and delete containers, or use kubectl on the current node to schedule and destroy pods. The actual container service process runs on the host.

      NOTE:

      This guide describes only the container management plane offload. The offload of container network and data volumes requires additional offload capabilities, which are not included. You can perform cross-node startup of containers that are not configured with network and storage by referring to this guide.
