    Imperceptible Virtualization Management Plane Offload

    1 Imperceptible DPU Offload for libvirtd

    1.1 Introduction

    libvirtd is the daemon that implements the virtualization management plane. DPU offload for libvirtd means running libvirtd on a separate machine (the DPU) rather than on the machine where the VMs run (the host).

    qtfs is used to mount directories related to VM running on the host to the DPU so that libvirtd can access them and prepare the environment required for running KVM. A dedicated rootfs (/another_rootfs) is created to mount the remote /proc and /sys directories required for running libvirtd.

    In addition, rexec is used to start and delete VMs, allowing libvirtd and VMs to be separated on different machines for remote VM control.

    1.2 Components

    1.2.1 rexec

    rexec is a C-based tool for executing commands on a remote server. It consists of the rexec client and the rexec server. The server runs as a daemon; the client connects to it over a Unix domain socket (UDS) through the udsproxyd service, and the server daemon then starts the specified program. During libvirt virtualization offload, libvirtd runs on the DPU. When libvirtd needs to start a QEMU process on the host, it calls the rexec client to start the process remotely.

    2 Environment Requirements

    Physical machine OS: openEuler 22.03 LTS or later

    libvirt version: 6.9.0

    QEMU version: 6.2.0

    Prepare the files:

    1. On the DPU and host, download 0001-libvirt_6.9.0_1201_offload.patch, 0003-fix-get-affinity.patch, and 0004-qmp-port-manage.patch.

    2. On the DPU and host, clone the dpu-utilities repository and perform build in the qtfs/rexec directory:

      git clone https://gitee.com/openeuler/dpu-utilities.git
      cd dpu-utilities/qtfs/rexec/
      make
      yes | cp ./rexec* /usr/bin/
      
    3. Download the libvirt-6.9.0.tar.xz software package and the qtfs.ko kernel module to the DPU.

    4. Install QEMU on the host:

      yum install qemu
      

    3 Operation Guide

    Note:

    1. rexec_server must be started on both the host and DPU. You can specify [DPU_IP_address]:[DPU_rexec_port_number] to remotely operate a binary file on the DPU from the host, and vice versa.
    2. When creating a VM, the DPU uses rexec_server on the host to start qemu-kvm.

    3.1 Starting rexec_server

    3.1.1 Copying Binary Files

    Copy rexec_server to the DPU and host.

    cp rexec_server /usr/bin/
    chmod +x rexec_server
    

    3.1.2 Configuring the rexec_server Service

    rexec_server can be configured as a systemd service for convenience.

    1. Create a rexec.service file in the /usr/lib/systemd/system/ directory on the DPU and host. The file content is as follows. Change the port number as required:

      [Unit]
      Description=Rexec_server Service
      After=network.target
      
      [Service]
      Type=simple
      Environment=CMD_NET_ADDR=tcp://0.0.0.0:<port_number>
      ExecStart=/usr/bin/rexec_server
      ExecReload=/bin/kill -s HUP $MAINPID
      KillMode=process
      
      [Install]
      WantedBy=multi-user.target
      
    2. Reload systemd, then enable and start the rexec service:

    systemctl daemon-reload
    systemctl enable --now rexec
    

    3.1.3 rexec Usage Example

    Once the rexec_server service is configured, you can invoke binary files on the host from the DPU. To do this, copy the rexec binary file to /usr/bin and then run the following command:

    CMD_NET_ADDR=tcp://<host_ip>:<host_rexec_server_port> rexec [command_to_be_executed]
    

    For example, to run ls on the host (assuming that the host IP address is 192.168.1.1 and the rexec_server port number is 6666) from the DPU, run the following command:

    CMD_NET_ADDR=tcp://192.168.1.1:6666 rexec /usr/bin/ls
    

    Note:

    If you do not want to start rexec_server as a systemd service, run the following command to manually start rexec_server:

    CMD_NET_ADDR=tcp://0.0.0.0:<port_number> rexec_server

    3.2 Preparing the Rootfs for Running libvirtd

    Note: Perform this step only on the DPU.

    Assume that the rootfs is /another_rootfs (you can change the name as required). Prepare the rootfs by following the instructions in 3.2.1 or 3.2.2 (the latter is recommended). After the rootfs is prepared, you can install software packages in /another_rootfs by referring to 3.5.1.

    3.2.1 Copying the Root Directory

    In most cases, you only need to copy the root directory to /another_rootfs.

    Run the following commands to perform the copy operations:

    mkdir /another_rootfs
    cp -r /usr /another_rootfs
    cp -r /sbin /another_rootfs
    cp -r /bin /another_rootfs
    cp -r /lib64 /another_rootfs
    cp -r /lib /another_rootfs
    mkdir /another_rootfs/boot
    mkdir /another_rootfs/dev
    mkdir /another_rootfs/etc
    mkdir /another_rootfs/home
    mkdir /another_rootfs/mnt
    mkdir /another_rootfs/opt
    mkdir /another_rootfs/proc
    mkdir /another_rootfs/root
    mkdir /another_rootfs/run
    mkdir /another_rootfs/var
    mkdir /another_rootfs/sys
    mkdir /another_rootfs/local_proc
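
    As a quick sanity check, you can verify that the assembled rootfs contains every directory the later mount scripts expect. The sketch below is an assumption-laden helper (check_rootfs is a hypothetical name; the directory list mirrors the mkdir commands above):

    ```shell
    #!/bin/bash
    # check_rootfs: report any directory missing from a candidate rootfs
    # (hypothetical helper; the list mirrors the directories created above).
    check_rootfs() {
        root="$1"; missing=0
        for d in usr sbin bin lib64 lib boot dev etc home mnt opt proc root run var sys local_proc; do
            if [ ! -d "$root/$d" ]; then
                echo "missing: $root/$d"
                missing=1
            fi
        done
        return "$missing"
    }
    ```

    Usage: `check_rootfs /another_rootfs && echo "rootfs layout OK"`.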
    

    3.2.2 Using the Official QCOW2 Image of openEuler

    If the local root directory is not clean enough to copy, you can instead build the rootfs from a QCOW2 image provided by the openEuler community.

    3.2.2.1 Installing Tools

    Use yum to install xz, kpartx, and qemu-img.

    yum install xz kpartx qemu-img
    
    3.2.2.2 Downloading the QCOW2 Image

    Download the openEuler 22.03 LTS image for x86 VMs or the openEuler 22.03 LTS image for ARM64 VMs from the openEuler website.

    3.2.2.3 Decompressing the QCOW2 Image

    Decompress the downloaded package using the xz -d command to obtain an openEuler-22.03-LTS-<arch>.qcow2 file. The following uses the x86 image as an example.

    xz -d openEuler-22.03-LTS-x86_64.qcow2.xz
    
    3.2.2.4 Mounting the QCOW2 Image and Copying Files
    1. Run modprobe nbd max_part=<number> to load the nbd module.
    2. Run qemu-nbd -c /dev/nbd0 <VM_image_path> to attach the image to the nbd device.
    3. Create a mount point, for example /random_dir.
    4. Run mount /dev/nbd0p2 /random_dir to mount the root partition of the image.
    5. Copy files.
    mkdir /another_rootfs
    cp -r /random_dir/* /another_rootfs/
    

    The contents of the VM image have now been copied to /another_rootfs.

    3.2.2.5 Unmounting the QCOW2 Image

    After the rootfs is prepared, run the following commands to unmount the QCOW2 image:

    umount /random_dir
    qemu-nbd -d /dev/nbd0
    

    3.3 Starting qtfs_server on the Host

    Create the folder required by the virtualization management plane, insert qtfs_server.ko, and start the engine process.

    You can run the following script to perform these operations. If an error occurs during the execution, you may need to convert the format of the script using dos2unix (the same applies to all the following scripts). In the last two lines, replace the paths of qtfs_server.ko and engine with the actual paths.

    #!/bin/bash
    mkdir /var/lib/libvirt
    
    insmod <ko_path>/qtfs_server.ko qtfs_server_ip=0.0.0.0 qtfs_log_level=INFO # Replace with the actual path.
    <engine_path>/engine 4096 16 # Replace with the actual path.
    

    3.4 Deploying the udsproxyd Service

    3.4.1 Introduction

    udsproxyd is a cross-host UDS proxy service that must be deployed on both the host and the DPU. The udsproxyd components on the host and DPU are peers. Together they provide seamless UDS communication between the two machines: if two processes can communicate over UDS on the same host, they can communicate the same way between the host and DPU. The process code does not need to be modified; the client process only needs to run with the LD_PRELOAD=libudsproxy.so environment variable.
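
    In practice, the only change on the client side is to preload the library when launching the client process. A sketch (the library path and the program name are placeholders; use the actual location of libudsproxy.so):

    ```shell
    # Run an unmodified UDS client with libudsproxy.so interposed on its
    # socket calls, so it can reach a UDS server on the peer machine.
    LD_PRELOAD=/usr/lib64/libudsproxy.so <uds_client_program>
    ```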

    3.4.2 Deploying udsproxyd

    Build udsproxyd in the dpu-utilities project:

    cd qtfs/ipc
    make && make install
    

    The engine service on the qtfs server already incorporates udsproxyd, so you do not need to start udsproxyd manually where the qtfs server is deployed. On the client side, however, start udsproxyd by running the following command:

    nohup /usr/bin/udsproxyd <thread num> <addr> <port> <peer addr> <peer port> 2>&1 &
    

    Parameters:

    thread num: number of threads. Currently, only one thread is supported.
    addr: IP address of the local end.
    port: port used on the local end.
    peer addr: IP address of the udsproxyd peer.
    peer port: port used on the udsproxyd peer.
    

    Example:

    nohup /usr/bin/udsproxyd 1 192.168.10.10 12121 192.168.10.11 12121 2>&1 &

    If the qtfs engine service is not started, you can start udsproxyd on the server to test udsproxyd separately. Run the following command:

    nohup /usr/bin/udsproxyd 1 192.168.10.11 12121 192.168.10.10 12121 2>&1 &
    

    Then, copy libudsproxy.so, which the libvirtd service will use, to the /usr/lib64 directory under libvirt's new root directory, that is, /another_rootfs/usr/lib64.

    3.5 Mounting the Dependent Directories on the Host to the DPU

    3.5.1 Installing the Software Package

    3.5.1.1 Installing to the Root Directory
    1. Install libvirt-client in /another_rootfs.
    yum install libvirt-client
    
    3.5.1.2 Configuring the another_rootfs Environment
    1. Install libvirtd in /another_rootfs.

      cd /another_rootfs
      tar -xf <path_to>/libvirt-6.9.0.tar.xz # Replace <path_to> with the actual path to the package.
      cd libvirt-6.9.0
      # Apply the libvirt patches in sequence, replacing the names with their actual paths.
      patch -p1 < 0001-libvirt_6.9.0_1201_offload.patch
      patch -p1 < 0003-fix-get-affinity.patch
      patch -p1 < 0004-qmp-port-manage.patch
      chroot /another_rootfs
      yum groupinstall "Development tools" -y
      yum install -y vim meson qemu qemu-img strace edk2-aarch64 tar
      
      yum install -y rpcgen python3-docutils glib2-devel gnutls-devel libxml2-devel libpciaccess-devel libtirpc-devel yajl-devel systemd-devel dmidecode glusterfs-api numactl
      
      cd /libvirt-6.9.0
      
      CFLAGS='-Wno-error=format -Wno-error=int-conversion -Wno-error=implicit-function-declaration -Wno-error=nested-externs -Wno-error=declaration-after-statement -Wno-error=unused-result -Wno-error=missing-prototypes -Wno-error=int-conversion -Wno-error=unused-parameter -Wno-error=unused-variable -Wno-error=pointer-sign -Wno-error=discarded-qualifiers -Wno-error=unused-function' meson build --prefix=/usr -Ddriver_remote=enabled -Ddriver_network=enabled -Ddriver_qemu=enabled -Dtests=disabled -Ddocs=enabled -Ddriver_libxl=disabled -Ddriver_esx=disabled -Dsecdriver_selinux=disabled -Dselinux=disabled
      
      ninja -C build install
      exit
      
    2. Copy rexec to /another_rootfs/usr/bin and grant it the execute permission.

      cp rexec /another_rootfs/usr/bin
      chmod +x /another_rootfs/usr/bin/rexec
      
    3. In /another_rootfs, run the following script to create /usr/bin/qemu-kvm and /usr/libexec/qemu-kvm. Before running the script, replace <host_ip> and <rexec_server_port> with the host IP address and rexec_server port number on the host, respectively.

      chroot /another_rootfs
      cat > /usr/bin/qemu-kvm <<EOF
      #!/bin/bash
      host=<host_ip>
      port=<rexec_server_port>
      CMD_NET_ADDR=tcp://\$host:\$port exec /usr/bin/rexec /usr/bin/qemu-kvm "\$@"
      EOF
      cat > /usr/libexec/qemu-kvm <<EOF
      #!/bin/bash
      host=<host_ip>
      port=<rexec_server_port>
      CMD_NET_ADDR=tcp://\$host:\$port exec /usr/bin/rexec /usr/bin/qemu-kvm "\$@"
      EOF
      chmod +x /usr/libexec/qemu-kvm
      chmod +x /usr/bin/qemu-kvm
      exit
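
    The two wrapper files above follow one pattern: a stub that hands its entire argument list to a forwarding command. The sketch below captures that pattern generically so it can be tried anywhere (make_wrapper is a hypothetical helper, and echo stands in for the rexec invocation); note that "$@" preserves each argument intact, unlike $*:

    ```shell
    #!/bin/bash
    # make_wrapper: write an executable stub at $1 that forwards its own
    # argument list to the command in $2 (hypothetical helper for illustration).
    make_wrapper() {
        path="$1"; forward="$2"
        cat > "$path" <<EOF
#!/bin/bash
exec $forward "\$@"
EOF
        chmod +x "$path"
    }

    # Demo: "echo forwarded" stands in for the CMD_NET_ADDR=... rexec call.
    make_wrapper /tmp/fake-qemu-kvm "echo forwarded"
    /tmp/fake-qemu-kvm -m 2048 -smp 4   # prints: forwarded -m 2048 -smp 4
    ```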
      
    3.5.2 Mounting Directories

    Run the following script on the DPU to mount the host directories required by libvirtd to the DPU.

    Ensure that the remote directories that will be mounted in the following script (prepare.sh) exist on both the host and DPU.

    #!/bin/bash
    insmod <qtfs.ko_path>/qtfs.ko qtfs_server_ip=<server_ip_address> qtfs_log_level=INFO # Change <qtfs.ko_path> and <server_ip_address>.
    
    systemctl stop libvirtd
     
    mkdir -p /var/run/rexec/pids
    cat >/var/run/rexec/qmpport << EOF
    <qmp_port_number>
    EOF
    cat > /var/run/rexec/hostaddr <<EOF
    <server_ip_address>
    EOF
    cat > /var/run/rexec/rexecport << EOF
    <rexec_port_number>
    EOF
    
    rm -f `find /var/run/libvirt/ -name "*.pid"`
    rm -f /var/run/libvirtd.pid
    
    # Create the mount points first; mkdir -p is idempotent.
    mkdir -p /another_rootfs/local_proc
    mkdir -p /another_rootfs/local/proc /another_rootfs/local/sys
    mount -t proc proc /another_rootfs/local_proc/
    mount -t proc proc /another_rootfs/local/proc
    mount -t sysfs sysfs /another_rootfs/local/sys
    mount --bind /var/run/ /another_rootfs/var/run/
    mount --bind /var/lib/ /another_rootfs/var/lib/
    mount --bind /var/cache/ /another_rootfs/var/cache
    mount --bind /etc /another_rootfs/etc
    
    mkdir -p /another_rootfs/home/VMs/
    mount -t qtfs /home/VMs/ /another_rootfs/home/VMs/
    
    mount -t qtfs /var/lib/libvirt /another_rootfs/var/lib/libvirt
    
    mount -t devtmpfs devtmpfs /another_rootfs/dev/
    mount -t hugetlbfs hugetlbfs /another_rootfs/dev/hugepages/
    mount -t mqueue mqueue /another_rootfs/dev/mqueue/
    mount -t tmpfs tmpfs /another_rootfs/dev/shm
    
    mount -t sysfs sysfs /another_rootfs/sys
    mkdir -p /another_rootfs/sys/fs/cgroup
    mount -t tmpfs tmpfs /another_rootfs/sys/fs/cgroup
    list="perf_event freezer files net_cls,net_prio hugetlb pids rdma cpu,cpuacct memory devices blkio cpuset"
    for i in $list
    do
            echo $i
            mkdir -p /another_rootfs/sys/fs/cgroup/$i
            mount -t cgroup cgroup -o rw,nosuid,nodev,noexec,relatime,$i /another_rootfs/sys/fs/cgroup/$i
    done
    
    # common system dir
    mount -t qtfs -o proc /proc /another_rootfs/proc
    echo "proc"
    
    mount -t qtfs /sys /another_rootfs/sys
    echo "cgroup"
    mount -t qtfs /dev/pts /another_rootfs/dev/pts
    mount -t qtfs /dev/vfio /another_rootfs/dev/vfio
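
    After the script completes, you can confirm that the expected mount points are active. The helper below is a sketch (mounted is a hypothetical name) that checks /proc/mounts for an exact mount-point match:

    ```shell
    #!/bin/bash
    # mounted: succeed if the given path is an active mount point
    # according to /proc/mounts.
    mounted() {
        awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' /proc/mounts
    }

    # Example checks after running the mount script:
    for mp in /another_rootfs/proc /another_rootfs/sys /another_rootfs/home/VMs; do
        mounted "$mp" || echo "not mounted: $mp"
    done
    ```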
    

    3.6 Starting libvirtd

    On the DPU, open a session and change the root directory to /another_rootfs.

    chroot /another_rootfs
    

    Run the following commands to start virtlogd and libvirtd:

    #!/bin/bash
    virtlogd -d
    libvirtd -d
    

    Because /var/run/ is bind-mounted to /another_rootfs/var/run/, you can use virsh from the normal rootfs to access libvirtd and manage VMs.
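
    For example, a typical management flow from the normal rootfs (a sketch; <domain> and its XML file are placeholders for a VM definition you have prepared):

    ```shell
    virsh define <domain>.xml   # register the VM with the offloaded libvirtd
    virsh start <domain>        # libvirtd starts qemu-kvm on the host via rexec
    virsh list --all            # check the VM state
    ```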

    4 Environment Restoration

    To unmount the related directories on the DPU and remove the qtfs module, run the following commands (submounts are unmounted before their parents):

    #!/bin/bash
    
    umount /another_rootfs/dev/hugepages
    umount /another_rootfs/dev/pts
    umount /another_rootfs/dev/mqueue
    umount /another_rootfs/dev/shm
    umount /another_rootfs/dev/vfio
    umount /another_rootfs/dev
    umount /another_rootfs/etc
    umount /another_rootfs/home/VMs
    umount /another_rootfs/local_proc
    umount /another_rootfs/local/proc
    umount /another_rootfs/local/sys
    umount /another_rootfs/var/lib/libvirt
    umount /another_rootfs/var/lib
    umount /another_rootfs/var/run
    umount /another_rootfs/var/cache
    umount /another_rootfs/sys/fs/cgroup/*
    umount /another_rootfs/sys/fs/cgroup
    umount /another_rootfs/sys
    umount /another_rootfs/proc
    rmmod qtfs
    
