    Gazelle User Guide

    Overview

    Gazelle is a high-performance user-mode protocol stack. Based on DPDK, it reads and writes NIC packets directly in user mode, transmits the packets through shared hugepage memory, and uses the LwIP protocol stack. Gazelle greatly improves the network I/O throughput of applications and accelerates the network for databases such as MySQL and Redis.

    • High performance: Zero-copy, lock-free packet processing that can be flexibly scaled out and scheduled adaptively.
    • Universality: Compatible with POSIX without modification, and applicable to different types of applications.

    In the single-process scenario where the NIC supports multiple queues, use liblstack.so only to shorten the packet path.

    Installation

    Configure the Yum source of openEuler and run the yum command to install Gazelle.

    yum install dpdk
    yum install libconfig
    yum install numactl
    yum install libboundscheck
    yum install libpcap
    yum install gazelle
    

    NOTE: The version of dpdk must be 21.11-2 or later.
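
    The version requirement in the note can be checked with a short script. The following is a minimal sketch, assuming that the rpm command and a sort command with the -V (version sort) option are available. The names version_ge and MIN_DPDK are illustrative and are not part of Gazelle.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on version sort (sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum DPDK version stated in the note above.
MIN_DPDK="21.11-2"

# Query the installed package only when rpm is available and dpdk is installed.
if command -v rpm >/dev/null 2>&1 && rpm -q dpdk >/dev/null 2>&1; then
    installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' dpdk)
    if version_ge "$installed" "$MIN_DPDK"; then
        echo "dpdk $installed is new enough"
    else
        echo "dpdk $installed is older than $MIN_DPDK" >&2
    fi
fi
```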

    How to Use

    To configure the operating environment and use Gazelle to accelerate applications, perform the following steps:

    1. Installing the .ko File as the root User

    Install one of the following .ko files based on the site requirements so that the NIC can be bound from the kernel driver to the user-mode driver.

    #If the IOMMU is available
    modprobe vfio-pci
    
    #If the IOMMU is not available and the VFIO supports the no-IOMMU mode
    modprobe vfio enable_unsafe_noiommu_mode=1
    modprobe vfio-pci
    
    #Other cases
    modprobe igb_uio
    

    NOTE: You can check whether the IOMMU is enabled based on the BIOS configuration.
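
    Besides the BIOS configuration, the running system usually gives a hint about which of the three cases applies. The following is a rough sketch, assuming that a non-empty /sys/kernel/iommu_groups directory means the IOMMU is active and that the enable_unsafe_noiommu_mode parameter file is present once the vfio module is loaded with no-IOMMU support. The function name suggest_uio_module is hypothetical, and the result is only a suggestion.

```shell
# suggest_uio_module [IOMMU_GROUPS_DIR] [NOIOMMU_PARAM_FILE]
# Prints the module from Step 1 that matches the detected environment.
suggest_uio_module() {
    groups_dir=${1:-/sys/kernel/iommu_groups}
    noiommu_param=${2:-/sys/module/vfio/parameters/enable_unsafe_noiommu_mode}

    # A non-empty iommu_groups directory indicates an active IOMMU.
    if [ -d "$groups_dir" ] && [ -n "$(ls -A "$groups_dir" 2>/dev/null)" ]; then
        echo "vfio-pci"
    # The parameter file is present when vfio is loaded with no-IOMMU support.
    elif [ -e "$noiommu_param" ]; then
        echo "vfio-pci (no-IOMMU mode)"
    else
        echo "igb_uio"
    fi
}

suggest_uio_module
```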

    2. Binding the NIC Using DPDK

    Bind the NIC to the driver selected in Step 1 to provide an interface for the user-mode NIC driver to access the NIC resources.

    #Using vfio-pci
    dpdk-devbind -b vfio-pci enp3s0 
    
    #Using igb_uio
    dpdk-devbind -b igb_uio enp3s0
    
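    After binding, the kernel network interface name (enp3s0 in the commands above) is no longer available, so the result is easier to confirm by PCI address. The following sketch reads the driver symlink that sysfs exposes for each PCI device. The function name pci_driver and the PCI address 0000:03:00.0 are illustrative assumptions.

```shell
# pci_driver BDF [SYSFS_PCI_DIR]
# Prints the name of the driver currently bound to a PCI device, or "none".
pci_driver() {
    bdf=$1
    pci_dir=${2:-/sys/bus/pci/devices}
    link="$pci_dir/$bdf/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink; its last path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example with a hypothetical PCI address:
# pci_driver 0000:03:00.0
```

    The output is expected to be vfio-pci or igb_uio once the binding in this step has succeeded.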

    3. Configuring Memory Huge Pages

    Gazelle uses hugepage memory to improve efficiency. With root permissions, you can reserve any amount of hugepage memory in the system. Each huge page requires a file descriptor, so if the memory is large, you are advised to use 1 GB huge pages to avoid occupying too many file descriptors. Select a page size based on the site requirements and configure sufficient huge pages. Run the following commands to configure huge pages:

    #Configuring 1024 2 MB huge pages on node0. The total memory is 2 GB.
    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
    
    #Configuring 5 1 GB huge pages on node0. The total memory is 5 GB.
    echo 5 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
    

    NOTE: Run the cat command to query the actual number of reserved pages. If contiguous memory is insufficient, the number may be less than expected.
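
    The page counts in the commands above follow from dividing the target memory by the page size. The following sketch shows that arithmetic and the cat query mentioned in the note. The function names hugepages_needed and reserved_hugepages are illustrative.

```shell
# hugepages_needed TOTAL_MB PAGE_KB
# Prints how many huge pages of PAGE_KB kilobytes cover TOTAL_MB megabytes,
# rounding up.
hugepages_needed() {
    total_kb=$(( $1 * 1024 ))
    echo $(( (total_kb + $2 - 1) / $2 ))
}

# reserved_hugepages NODE PAGE_KB [SYSFS_NODE_DIR]
# Prints the number of pages the kernel actually reserved on a NUMA node.
reserved_hugepages() {
    node_dir=${3:-/sys/devices/system/node}
    cat "$node_dir/node$1/hugepages/hugepages-${2}kB/nr_hugepages"
}

hugepages_needed 2048 2048      # 2 GB in 2 MB pages, prints 1024
hugepages_needed 5120 1048576   # 5 GB in 1 GB pages, prints 5
```

    Comparing the output of reserved_hugepages 0 2048 with the value written to nr_hugepages shows whether the kernel reserved fewer pages than requested.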

    4. Mounting Memory Huge Pages

    Create a directory for the lstack process to access the memory huge pages. Run the following commands:

    mkdir -p /mnt/hugepages-lstack
    chmod -R 700 /mnt/hugepages-lstack
    
    mount -t hugetlbfs nodev /mnt/hugepages-lstack -o pagesize=2M
    
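    Whether the mount succeeded can be confirmed from the kernel mount table. The following is a small sketch, assuming the standard /proc/mounts layout (device, mount point, file system type, options). The function name hugetlbfs_mounted is illustrative.

```shell
# hugetlbfs_mounted DIR [MOUNTS_FILE]
# Succeeds when DIR is listed as a hugetlbfs mount point.
hugetlbfs_mounted() {
    awk -v dir="$1" \
        '$2 == dir && $3 == "hugetlbfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

if hugetlbfs_mounted /mnt/hugepages-lstack 2>/dev/null; then
    echo "/mnt/hugepages-lstack is mounted as hugetlbfs"
else
    echo "/mnt/hugepages-lstack is not mounted as hugetlbfs"
fi
```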

    5. Enabling Gazelle for an Application

    Enable Gazelle for an application using either of the following methods as required.

    • Recompile the application and replace the sockets interface.
    #Add the Makefile of Gazelle to the application makefile.
    -include /etc/gazelle/lstack.Makefile
    
    #Add the LSTACK_LIBS variable when compiling the source code.
    gcc test.c -o test ${LSTACK_LIBS}
    
    • Use the LD_PRELOAD environment variable to load the Gazelle library. Use the GAZELLE_BIND_PROCNAME environment variable to specify the process name, and LD_PRELOAD to specify the Gazelle library path.
    GAZELLE_BIND_PROCNAME=test LD_PRELOAD=/usr/lib64/liblstack.so ./test
    
    • Use the GAZELLE_THREAD_NAME environment variable to specify the thread bound to Gazelle. If only one thread of a multi-thread process meets the conditions for using Gazelle, use GAZELLE_THREAD_NAME to specify that thread. Other threads use the kernel-mode protocol stack.
    GAZELLE_BIND_PROCNAME=test GAZELLE_THREAD_NAME=test_thread LD_PRELOAD=/usr/lib64/liblstack.so ./test
    

    6. Configuring Gazelle

    • The lstack.conf file is used to specify the startup parameters of lstack. The default path is /etc/gazelle/lstack.conf. The parameters in the configuration file are as follows:
    Option | Value | Remarks
    dpdk_args | --socket-mem (mandatory), --huge-dir (mandatory), --proc-type (mandatory), --legacy-mem, --map-perfect, -d | DPDK initialization parameter. For details, see the DPDK description. --map-perfect is an extended feature. It is used to prevent the DPDK from occupying excessive address space and ensure that extra address space is available for lstack. The -d option is used to load the specified .so library file.
    listen_shadow | 0/1 | Whether to use the shadow file descriptor for listening. This function is enabled when there is a single listen thread and multiple protocol stack threads.
    use_ltran | 0/1 | Whether to use ltran. This parameter is no longer supported.
    num_cpus | "0,2,4 ..." | IDs of the CPUs bound to the lstack threads. The number of IDs is the number of lstack threads (less than or equal to the number of NIC queues). You can select CPUs by NUMA nodes.
    low_power_mode | 0/1 | Whether to enable the low-power mode. This parameter is not supported currently.
    kni_switch | 0/1 | Whether to enable the rte_kni module. The default value is 0. This parameter is no longer supported.
    unix_prefix | "string" | Prefix string of the Unix socket file used for communication between Gazelle processes. By default, this parameter is left blank. The value must be the same as the value of unix_prefix in ltran.conf of the ltran process that participates in communication, or the value of the -u option for gazellectl. The value cannot contain special characters and can contain a maximum of 128 characters.
    host_addr | "192.168.xx.xx" | IP address of the protocol stack, which is also the IP address of the application.
    mask_addr | "255.255.xx.xx" | Subnet mask.
    gateway_addr | "192.168.xx.1" | Gateway address.
    devices | "aa:bb:cc:dd:ee:ff" | MAC address for NIC communication. The NIC is used as the primary bond NIC in bond 1 mode.
    app_bind_numa | 0/1 | Whether to bind the epoll and poll threads of an application to the NUMA node where the protocol stack is located. The default value is 1, indicating that the threads are bound.
    send_connect_number | 4 | Number of connections for sending packets in each protocol stack loop. The value is a positive integer.
    read_connect_number | 4 | Number of connections for receiving packets in each protocol stack loop. The value is a positive integer.
    rpc_number | 4 | Number of RPC messages processed in each protocol stack loop. The value is a positive integer.
    nic_read_num | 128 | Number of data packets read from the NIC in each protocol stack cycle. The value is a positive integer.
    bond_mode | -1 | Bond mode. Currently, two network ports can be bonded. The default value is -1, indicating that the bond mode is disabled. bond1/4/6 is supported.
    bond_slave_mac | "aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF" | MAC addresses of the bond network ports. Separate the MAC addresses with semicolons (;).
    bond_miimon | 10 | Listening interval in bond mode. The default value is 10. The value ranges from 0 to 1500.
    udp_enable | 0/1 | Whether to enable the UDP function. The default value is 1.
    nic_vlan_mode | -1 | Whether to enable the VLAN mode. The default value is -1, indicating that the VLAN mode is disabled. The value ranges from -1 to 4095. IDs 0 and 4095 are commonly reserved in the industry and have no actual effect.
    tcp_conn_count | 1500 | Maximum number of TCP connections. The value of this parameter multiplied by mbuf_count_per_conn is the size of the mbuf pool applied for during initialization. If the value is too small, the startup fails. The value of (tcp_conn_count x mbuf_count_per_conn x 2048) cannot be greater than the huge page size.
    mbuf_count_per_conn | 170 | Number of mbuf required by each TCP connection. The value of this parameter multiplied by tcp_conn_count is the size of the mbuf address pool applied for during initialization. If the value is too small, the startup fails. The value of (tcp_conn_count x mbuf_count_per_conn x 2048) cannot be greater than the huge page size.

    lstack.conf example:

    dpdk_args=["--socket-mem", "2048,0,0,0", "--huge-dir", "/mnt/hugepages-lstack", "--proc-type", "primary", "--legacy-mem", "--map-perfect"]
    
    use_ltran=1
    kni_switch=0
    
    low_power_mode=0
    
    num_cpus="2,22"
    num_wakeup="3,23"
    
    host_addr="192.168.1.10"
    mask_addr="255.255.255.0"
    gateway_addr="192.168.1.1"
    devices="aa:bb:cc:dd:ee:ff"
    
    send_connect_number=4
    read_connect_number=4
    rpc_number=4
    nic_read_num=128
    mbuf_pool_size=1024000
    bond_mode=1
    bond_slave_mac="aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF"
    udp_enable=1
    nic_vlan_mode=-1
    
    • The ltran mode is deprecated. If multiple processes are required, try the virtual network mode using SR-IOV network hardware.
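
    The size constraint stated for tcp_conn_count and mbuf_count_per_conn can be checked with simple arithmetic before lstack is started. The following sketch computes the product given in the parameter descriptions and compares it with an amount of hugepage memory in MB. The function names mbuf_pool_bytes and mbuf_pool_fits are illustrative, and the comparison value is whatever hugepage memory you have configured.

```shell
# mbuf_pool_bytes TCP_CONN_COUNT MBUF_COUNT_PER_CONN
# Prints tcp_conn_count x mbuf_count_per_conn x 2048, the product that the
# parameter descriptions compare against the huge page size.
mbuf_pool_bytes() {
    echo $(( $1 * $2 * 2048 ))
}

# mbuf_pool_fits TCP_CONN_COUNT MBUF_COUNT_PER_CONN HUGEPAGE_MB
# Succeeds when the product does not exceed HUGEPAGE_MB megabytes.
mbuf_pool_fits() {
    [ "$(mbuf_pool_bytes "$1" "$2")" -le $(( $3 * 1024 * 1024 )) ]
}

mbuf_pool_bytes 1500 170   # values from the parameter list, prints 522240000
```

    With the listed values of 1500 and 170, the product is about 498 MB, which fits in the 2048 MB passed to --socket-mem in the example configuration.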

    7. Starting an Application

    • Start the application. If the environment variable LSTACK_CONF_PATH is not used to specify the configuration file before the application is started, the default configuration file path /etc/gazelle/lstack.conf is used.
    export LSTACK_CONF_PATH=./lstack.conf
    LD_PRELOAD=/usr/lib64/liblstack.so GAZELLE_BIND_PROCNAME=redis-server redis-server redis.conf
    
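    A small wrapper can catch a missing configuration file or library before the application is started. The following is a sketch only: gazelle_launch_cmd is a hypothetical helper that prints the command line for review instead of running it, and it sets GAZELLE_BIND_PROCNAME to the program name without its directory, following the examples in this guide.

```shell
# gazelle_launch_cmd CONF LIB PROGRAM [ARGS...]
# Prints the command line used to start PROGRAM under Gazelle, or fails
# when the configuration file or the library cannot be read.
gazelle_launch_cmd() {
    conf=$1
    lib=$2
    shift 2
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    [ -r "$lib" ] || { echo "cannot read $lib" >&2; return 1; }
    # Use the program name without its directory as the process name.
    echo "LSTACK_CONF_PATH=$conf LD_PRELOAD=$lib GAZELLE_BIND_PROCNAME=$(basename "$1") $*"
}

# Example with the paths used in this guide:
# gazelle_launch_cmd /etc/gazelle/lstack.conf /usr/lib64/liblstack.so redis-server redis.conf
```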

    8. APIs

    Gazelle wraps the POSIX interfaces of the application. The code of the application does not need to be modified.

    9. Debugging Commands

    Usage: gazellectl [-h | help]
      or:  gazellectl lstack {show | set} {ip | pid} [LSTACK_OPTIONS] [time] [-u UNIX_PREFIX]
    
      where  LSTACK_OPTIONS :=
                      show lstack all statistics
      -r, rate        show lstack statistics per second
      -s, snmp        show lstack snmp
      -c, connetct    show lstack connect
      -l, latency     show lstack latency
      -x, xstats      show lstack xstats
      -k, nic-features show state of protocol offload and other feature
      -a, aggregatin  [time] show lstack send/recv aggregation
      set:
      loglevel        {error | info | debug} set lstack loglevel
      lowpower        {0 | 1} set lowpower enable
      [time]          measure latency time default 1S
    

    The -u option specifies the prefix of the Unix socket for communication between Gazelle processes. The value of this parameter must be the same as that of unix_prefix in the lstack.conf file.

    Packet Capturing Tool

    The NIC used by Gazelle is managed by DPDK. Therefore, tcpdump cannot capture Gazelle packets. As a substitute, Gazelle uses gazelle-pdump provided in the dpdk-tools software package as the packet capturing tool. gazelle-pdump uses the multi-process mode of DPDK to share memory with the lstack process.
    Usage

    Thread Binding

    When starting an lstack process, you can use the environment variable GAZELLE_THREAD_NAME to specify the thread bound to lstack. When the service process has multiple threads, use this variable to specify the thread whose network interface needs to be managed by lstack. Other threads use the kernel-mode protocol stack. By default, this variable is left blank, that is, all threads in the process are bound.

    10. Precautions

    1. Location of the DPDK Configuration File

    For the root user, the configuration file is stored in the /var/run/dpdk directory after the DPDK is started. For a non-root user, the path of the DPDK configuration file is determined by the environment variable XDG_RUNTIME_DIR.

    • If XDG_RUNTIME_DIR is not set, the DPDK configuration file is stored in /tmp/dpdk.
    • If XDG_RUNTIME_DIR is set, the DPDK configuration file is stored in the path specified by XDG_RUNTIME_DIR.
    • Note that XDG_RUNTIME_DIR is set by default on some servers.
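
    The rules above can be summarized as a small lookup. The following sketch assumes, as in upstream DPDK, that a dpdk subdirectory is created under the directory named by XDG_RUNTIME_DIR; the text above only states that the file is stored in the path specified by that variable. The function name dpdk_runtime_dir is illustrative.

```shell
# dpdk_runtime_dir USER_ID [XDG_RUNTIME_DIR_VALUE]
# Prints the directory where the DPDK configuration file is expected.
dpdk_runtime_dir() {
    if [ "$1" -eq 0 ]; then
        # root user
        echo "/var/run/dpdk"
    elif [ -n "${2:-}" ]; then
        # non-root user with XDG_RUNTIME_DIR set (dpdk subdirectory assumed)
        echo "$2/dpdk"
    else
        # non-root user without XDG_RUNTIME_DIR
        echo "/tmp/dpdk"
    fi
}

dpdk_runtime_dir "$(id -u)" "${XDG_RUNTIME_DIR:-}"
```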

    Restrictions

    Restrictions of Gazelle are as follows:

    Function Restrictions

    • Blocking accept() or connect() is not supported.
    • A maximum of 1500 TCP connections are supported.
    • Currently, only TCP, ICMP, ARP, IPv4, and UDP are supported.
    • When a peer end pings Gazelle, the specified packet length must be less than or equal to 14,000 bytes.
    • Transparent huge pages are not supported.
    • VM NICs do not support multiple queues.

    Operation Restrictions

    • By default, the command lines and configuration files provided by Gazelle require root permissions. Privilege escalation and changing of file owner are required for non-root users.
    • To bind the NIC from the user-mode driver back to the kernel driver, you must exit Gazelle first.
    • Memory huge pages cannot be remounted to subdirectories created in the mount point.
    • The minimum hugepage memory of each application instance protocol stack thread is 800 MB.
    • Gazelle supports only 64-bit OSs.
    • The -march=native option is used when building the x86 version of Gazelle to optimize Gazelle based on the CPU instruction set of the build environment (Intel® Xeon® Gold 5118 CPU @ 2.30GHz). Therefore, the CPU of the operating environment must support the SSE4.2, AVX, AVX2, and AVX-512 instruction set extensions.
    • The maximum number of IP fragments is 10 (the maximum ping packet length is 14,790 bytes). TCP does not use IP fragments.
    • You are advised to set the rp_filter parameter of the NIC to 1 using the sysctl command. Otherwise, the Gazelle protocol stack may not be used as expected. Instead, the kernel protocol stack is used.
    • The hybrid bonding of multiple types of NICs is not supported.
    • The active/standby mode (bond1 mode) supports active/standby switchover only when a fault occurs at the link layer (for example, the network cable is disconnected), but does not support active/standby switchover when a fault occurs at the physical layer (for example, the NIC is powered off or removed).
    • If the length of UDP packets to be sent exceeds 45952 (32 x 1436) bytes, increase the value of send_ring_size to at least 64.
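
    The instruction set requirement for the x86 build listed above can be checked against /proc/cpuinfo before deployment. The following is a sketch under two assumptions: the kernel reports the extensions under the flag names sse4_2, avx, avx2, and avx512f, and avx512f (the AVX-512 foundation flag) is used as a stand-in for AVX-512 support as a whole. The function name missing_cpu_flags is illustrative.

```shell
# missing_cpu_flags [CPUINFO_FILE]
# Prints the required flags that are absent from the first "flags" line.
missing_cpu_flags() {
    flags=$(grep -m 1 '^flags' "${1:-/proc/cpuinfo}" | cut -d: -f2)
    missing=""
    for want in sse4_2 avx avx2 avx512f; do
        case " $flags " in
            *" $want "*) ;;                   # flag present
            *) missing="$missing $want" ;;    # flag absent
        esac
    done
    # Trim the leading space before printing.
    echo "${missing# }"
}

# Example: an empty result means that all four flags were found.
# missing_cpu_flags
```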

    Precautions

    You need to evaluate the use of Gazelle based on application scenarios.

    The ltran mode and kni module are no longer supported due to changes in the dependencies and upstream community.

    Shared Memory

    • Current situation: The memory huge pages are mounted to the /mnt/hugepages-lstack directory. During process initialization, files are created in the /mnt/hugepages-lstack directory. Each file corresponds to a huge page, and the mmap function is performed on the files.
    • Current mitigation measures: The huge page file permission is 600. Only the owner can access the files. The default owner is the root user. Other users can be configured. Huge page files are locked by DPDK and cannot be directly written or mapped.
    • Caution: Malicious processes belonging to the same user can imitate the DPDK implementation logic to share huge page memory through huge page files and perform write operations that damage the huge page memory. As a result, the Gazelle program crashes. It is recommended that the processes of a user belong to the same trust domain.

    Traffic Limit

    Gazelle does not limit the traffic. Users can send packets at the maximum NIC line rate to the network, which may congest the network.

    Process Spoofing

    Ensure that all lstack processes are trusted.
