System Performance FAQ
When the NFS service is enabled on openEuler 22.03 SP1, why does the server response speed drop sharply from more than a gigabit to around 2 MB/s after less than a day of write operations
Performance drops sharply once the NFS server's cache occupies more than 50% of memory. The main cause lies in the memory allocation and reclamation mechanisms. The system reclaims memory in the background rather than during allocation, which is slower, so requests must wait longer (for example, 500 ms of latency) when enough memory cannot be allocated in time. The issue is particularly pronounced under heavy load because the NFS server needs substantial memory to process client requests. The longer the service runs, the less memory is available, which leads to a dramatic performance drop, especially during intensive write operations.
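To check whether a server is approaching the 50% level described above, the cache share can be estimated from /proc/meminfo. The following is a minimal sketch, not a definitive diagnostic: it assumes that the Cached and Buffers fields are a reasonable proxy for the NFS server's cache, and the function names and the 0.5 default threshold are chosen here for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def cache_fraction(meminfo):
    """Fraction of total memory held by page cache and buffers.

    Assumption: Cached + Buffers approximates the cache discussed above.
    """
    cached = meminfo.get("Cached", 0) + meminfo.get("Buffers", 0)
    return cached / meminfo["MemTotal"]


def report(path="/proc/meminfo", threshold=0.5):
    """Print the current cache share and how it compares with the threshold."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    frac = cache_fraction(info)
    status = "above" if frac > threshold else "at or below"
    print(f"cache uses {frac:.1%} of memory ({status} the {threshold:.0%} threshold)")
    return frac
```

Calling report() on the NFS server at intervals during a long write test shows how the cache share grows over time.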
What do I do if a file fails to be created on openEuler due to an inode error of the Ext4 file system
Set the rec_len field of the dx_node block as follows: use ext4_rec_len_from_disk() to convert rec_len to 65536 and then perform the comparison. This ensures that the rec_len field is set correctly when a new dx_node block is added. Then calculate and set a correct checksum for the node. This correction prevents the inode error caused by incorrect checksums, allowing the system to create and manage a large number of files correctly.
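The conversion matters because rec_len is a 16-bit on-disk field and cannot hold 65536 directly. The sketch below models the decoding in Python for illustration only. It is based on the large-block branch of the kernel helper as commonly found in the ext4 sources, so the exact logic should be confirmed against the kernel version in use. The names rec_len_from_disk and dx_node_rec_len_ok are hypothetical.

```python
EXT4_MAX_REC_LEN = (1 << 16) - 1  # largest value a 16-bit rec_len can hold


def rec_len_from_disk(dlen, blocksize):
    """Decode a 16-bit on-disk rec_len into a byte length.

    Simplified model of the large-block branch of the kernel helper
    ext4_rec_len_from_disk(); an illustration, not the kernel code.
    """
    if dlen == EXT4_MAX_REC_LEN or dlen == 0:
        # Both encodings mean "the entry spans the whole block".
        return blocksize
    # The low two bits carry bits 16 and 17 of the real length.
    return (dlen & 65532) | ((dlen & 3) << 16)


def dx_node_rec_len_ok(dlen, blocksize):
    """Compare the decoded length, not the raw field, with the block size."""
    return rec_len_from_disk(dlen, blocksize) == blocksize
```

With a 64 KB block, a raw value of 65535 decodes to 65536 and matches the block size, whereas comparing the raw field with 65536 can never succeed.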
How to reduce the excessive memory usage of service processes on openEuler due to the use of glibc tcache
Disable the tcache feature by setting an environment variable before starting the program. Specifically, add GLIBC_TUNABLES=glibc.malloc.tcache_count=0 to bash_profile. After the process starts, check its environment variables in /proc/pid/environ to confirm that the variable is present. Once tcache is disabled, the memory of the process is managed as in glibc 2.17, with no additional side effects. According to user feedback, this solution significantly reduces the memory usage of services on openEuler, to a level lower than on CentOS.
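The verification step can be scripted. The following is a minimal sketch under the assumption that the caller has permission to read the target process's environ file; the function names are chosen here for illustration. Entries in /proc/pid/environ are separated by NUL bytes, and GLIBC_TUNABLES holds a colon-separated list of settings.

```python
def parse_environ(raw):
    """Split the NUL-separated bytes of a /proc/<pid>/environ file into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue
        name, sep, value = entry.partition(b"=")
        if sep:
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def tcache_disabled(env):
    """True if GLIBC_TUNABLES sets glibc.malloc.tcache_count to 0."""
    tunables = env.get("GLIBC_TUNABLES", "")
    return "glibc.malloc.tcache_count=0" in tunables.split(":")


def check_pid(pid):
    """Read the environment of a running process and check the tunable."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return tcache_disabled(parse_environ(f.read()))
```

For example, check_pid(1234) returns True only if the process with PID 1234 (a placeholder) was started with the tunable in its environment.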
When performing a fio multi-drive stress test, why is the performance of the Arm architecture only half that of the x86 architecture
The lower performance on the Arm architecture in the fio stress test is caused by differences in interrupt handling. x86 balances the interrupt load through the APIC, whereas on Arm, locality-specific peripheral interrupts (LPIs) are managed by the Interrupt Translation Service (ITS), which by default assigns interrupts to the lowest-numbered core of each CPU. As a result, these cores become the bottleneck during the multi-drive stress test.
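One way to observe this concentration is to sum the per-core counters in /proc/interrupts for the drives under test. The sketch below is illustrative only: the keyword used to select lines (for example, "nvme") is an assumption that depends on the drive type and driver, and the function name is hypothetical.

```python
def per_cpu_totals(text, keyword):
    """Sum interrupt counts per CPU for /proc/interrupts lines containing keyword."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # the header row lists one column per online CPU
    totals = [0] * ncpu
    for line in lines[1:]:
        if keyword not in line:
            continue
        fields = line.split()
        counts = fields[1:1 + ncpu]  # fields[0] is the IRQ label such as "45:"
        if len(counts) == ncpu and all(c.isdigit() for c in counts):
            for cpu, count in enumerate(counts):
                totals[cpu] += int(count)
    return totals


def read_totals(keyword, path="/proc/interrupts"):
    """Convenience wrapper that reads the live counters."""
    with open(path) as f:
        return per_cpu_totals(f.read(), keyword)
```

If nearly all counts returned by read_totals("nvme") fall on one or two low-numbered cores during the test, the pattern matches the bottleneck described above.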
After the server is powered off and restarted, why does an I/O error occur when I run ls in the XFS file system
This error usually indicates that parts of the XFS file system failed to be loaded or read, possibly due to damaged file system metadata or incomplete write operations. If a power outage occurs during write operations, data may not be fully written to XFS, causing the XFS data to be inconsistent after the server reboots. Running a command such as ls at this time will result in an input/output error.
What do I do if the new kernel is not used upon system boot
Replace the contents of the /boot/grub2/grub.cfg file with those of the /boot/efi/EFI/xxxx/grub.cfg file. This ensures that the system reads the correct configuration file containing the new kernel information upon system boot. In addition, check the system boot mode and confirm that UEFI is being used.
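The boot mode check can be done by testing for the /sys/firmware/efi directory, which Linux exposes only when the system was booted through UEFI. The following is a minimal sketch; the function name and its parameter are chosen here for illustration.

```python
import os


def boot_mode(efi_dir="/sys/firmware/efi"):
    """Return "UEFI" if the EFI firmware directory exists, else "legacy BIOS"."""
    return "UEFI" if os.path.isdir(efi_dir) else "legacy BIOS"
```

Calling boot_mode() on the affected machine before editing the grub.cfg files confirms which boot path the firmware is actually using.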