Memory Management
Basic Concepts
Memory is a key component of a computer. It temporarily stores the data that the CPU operates on and the data exchanged with external storage such as hard disks. Non-uniform memory access (NUMA) is a memory architecture designed for multiprocessor computers, in which the memory access time depends on the location of the memory relative to the processor. Under NUMA, a processor accesses its local memory faster than non-local memory (memory attached to another processor or shared between processors).
Viewing Memory
free: displays the system memory status. Example:
# Display the system memory status in MB.
free -m
The output is as follows:
[root@openEuler ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           2633         436         324          23        2072        2196
Swap:          4043           0        4043
The fields in the command output are described as follows:
Field | Description |
---|---|
total | Total memory size. |
used | Used memory. |
free | Free memory. |
shared | Total memory shared by multiple processes. |
buff/cache | Total memory used by buffers and caches. |
available | Estimated memory available for starting a new application without swapping. |
vmstat: dynamically monitors the system memory and views the system memory usage.
Example:
# Monitor the system memory and display active and inactive memory.
vmstat -a
The output is as follows:
[root@openEuler ~]# vmstat -a
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 2  0    520 331980 1584728 470332    0    0     0     2   15   19  0  0 100  0  0
In the command output, the memory-related fields are described as follows:
Field | Description |
---|---|
memory | Memory information, consisting of the following columns. |
swpd | Usage of the virtual memory, in KB. |
free | Free memory capacity, in KB. |
inact | Inactive memory capacity, in KB. |
active | Active memory capacity, in KB. |
sar: monitors the memory usage of the system.
Example:
# Monitor the memory usage of the system. Collect statistics every two seconds, three times in total.
sar -r 2 3
The output is as follows:
[root@openEuler ~]# sar -r 2 3
04:02:09 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
04:02:11 PM    332180   2249308    189420      7.02    142172   1764312    787948     11.52    470404   1584924        36
04:02:13 PM    332148   2249276    189452      7.03    142172   1764312    787948     11.52    470404   1584924        36
04:02:15 PM    332148   2249276    189452      7.03    142172   1764312    787948     11.52    470404   1584924        36
Average:       332159   2249287    189441      7.03    142172   1764312    787948     11.52    470404   1584924        36
The fields in the command output are described as follows:
Field | Description |
---|---|
kbmemfree | Unused memory space, in KB. |
kbmemused | Used memory space, in KB. |
%memused | Percentage of memory in use. |
kbbuffers | Memory used by the kernel as buffers, in KB. |
kbcached | Memory used by the kernel to cache data, in KB. |
numactl: displays the NUMA node configuration and status.
Example:
# Check the current NUMA configuration.
numactl -H
The output is as follows:
[root@openEuler ~]# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 2633 MB
node 0 free: 322 MB
node distances:
node   0
  0:  10
The server contains one NUMA node with four cores and 2633 MB (about 2.6 GB) of memory. The command also displays the distances between NUMA nodes. The greater the distance, the higher the latency of cross-node memory access, so cross-node access should be avoided where possible.
numastat: displays NUMA node status.
# Check the NUMA node status.
numastat
[root@openEuler ~]# numastat
                           node0
numa_hit                 5386186
numa_miss                      0
numa_foreign                   0
interleave_hit             17483
local_node               5386186
other_node                     0
The fields in the command output are described as follows:
Field | Description |
---|---|
numa_hit | Number of pages successfully allocated on this node as intended. |
numa_miss | Number of pages allocated on this node although the process preferred a different node. |
numa_foreign | Number of pages intended for this node but allocated on a different node. Each numa_foreign event corresponds to a numa_miss event on another node. |
interleave_hit | Number of interleave-policy pages successfully allocated on this node as intended. |
local_node | Number of pages allocated on this node while the process was running on it. |
other_node | Number of pages allocated on this node while the process was running on another node. |
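As a rough illustration of how these counters can be post-processed, the following Python sketch parses text in the numastat layout shown above and computes the ratio of numa_hit to numa_hit + numa_miss for each node. The helper names `parse_numastat` and `hit_ratio` are hypothetical and not part of numastat; the sample text is copied from the example output.

```python
def parse_numastat(text):
    """Parse numastat-style text into {node: {counter: value}}."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    nodes = rows[0]                      # header row, for example ["node0"]
    stats = {node: {} for node in nodes}
    for fields in rows[1:]:
        counter, values = fields[0], fields[1:]
        for node, value in zip(nodes, values):
            stats[node][counter] = int(value)
    return stats


def hit_ratio(counters):
    """Return numa_hit / (numa_hit + numa_miss) for one node."""
    total = counters["numa_hit"] + counters["numa_miss"]
    return counters["numa_hit"] / total if total else 1.0


SAMPLE = """\
                           node0
numa_hit                 5386186
numa_miss                      0
numa_foreign                   0
interleave_hit             17483
local_node               5386186
other_node                     0
"""

if __name__ == "__main__":
    for node, counters in parse_numastat(SAMPLE).items():
        print(f"{node}: hit ratio {hit_ratio(counters):.2%}")
```

On a multi-node system, a ratio that drops well below 1 for a node suggests that allocations are frequently falling back across nodes.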
etMem
Introduction
As CPU computing power grows, and especially as ARM cores become cheaper, memory cost and capacity have become the main bottleneck constraining service costs and performance. The most pressing issues are therefore how to reduce memory cost and how to expand memory capacity.
etMem is a tiered memory expansion technology that uses DRAM+memory compression/high-performance storage media to form tiered memory storage. Memory data is tiered, and cold data is migrated from memory media to high-performance storage media to release memory space and reduce memory costs.
The etMem software package consists of the etmem client and the etmemd server. The etmemd server stays resident after it is started and performs operations such as hot and cold memory identification and eviction for the target process. The etmem client runs once each time it is invoked and, depending on the command options, instructs the etmemd server to perform different operations.
Compilation Tutorial
Download the etMem source code.
git clone https://gitee.com/openeuler/etmem.git
Install the compilation and running dependencies.
The compilation and running of etMem depend on the libboundscheck component.
Build the source code.
cd etmem
mkdir build
cd build
cmake ..
make
Precautions
Running Dependencies
As a memory expansion tool, etMem depends on kernel-mode features. To identify memory accesses and proactively write memory to the swap partition for vertical memory expansion, the etmem_scan and etmem_swap modules must be loaded while etMem is running.
modprobe etmem_scan
modprobe etmem_swap
Permission Control
The root permission is required for running the etMem process. The root user has the highest permission in the system. When performing operations as the root user, strictly follow the operation guide to prevent system management and security risks caused by other operations.
Constraints
- The etMem client and server must be deployed on the same server. Cross-server communication is not supported.
- etMem can scan only target processes whose names contain 15 or fewer characters. The process name can contain letters, digits, the special characters ./%-_, and any combination of these three types of characters. Other combinations are invalid.
- When the AEP media is used for memory expansion, the system must be able to correctly identify AEP devices and initialize the AEP devices as NUMA nodes. In addition, the vm_flags field in the configuration file can only be set to ht.
- Private engine commands are valid only for the corresponding engine and the tasks under that engine, for example, showhostpages and showtaskpages supported by cslide.
- In the third-party policy implementation code, the fd field in the eng_mgt_func interface cannot be set to 0xff or 0xfe.
- Multiple third-party policy dynamic libraries can be added to a project. They are differentiated by eng_name in the configuration file.
- Do not scan the same process concurrently.
- Do not use the /proc/xxx/idle_pages and /proc/xxx/swap_pages files when etmem_scan.ko and etmem_swap.ko are not loaded.
- The owner of the etMem configuration file must be the root user, the permission must be 600 or 400, and the size of the configuration file cannot exceed 10 MB.
- When etMem injects third-party policies, the owner of the so files of the third-party policies must be the root user and the permission must be 500 or 700.
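Several of these constraints can be checked before a configuration is handed to etMem. The following Python sketch is one possible pre-flight check, not part of etMem itself: all function names are hypothetical, the rule that an empty process name is invalid is an assumption, and 10 MB is interpreted here as 10 x 1024 x 1024 bytes.

```python
import os
import re
import stat

# Letters, digits, and the special characters ./%-_ ; at most 15 characters.
# Rejecting the empty string is an assumption of this sketch.
_NAME_RE = re.compile(r"[A-Za-z0-9./%\-_]{1,15}")


def is_valid_process_name(name):
    """Check a target process name against the documented naming rule."""
    return _NAME_RE.fullmatch(name) is not None


def check_file(path, allowed_modes, max_size=None, owner_uid=0):
    """Return a list of constraint violations for path (empty if none)."""
    st = os.stat(path)
    problems = []
    if st.st_uid != owner_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {owner_uid}")
    mode = stat.S_IMODE(st.st_mode)
    if mode not in allowed_modes:
        problems.append(f"permission is {mode:o}")
    if max_size is not None and st.st_size > max_size:
        problems.append(f"size {st.st_size} exceeds {max_size} bytes")
    return problems


def check_config_file(path, owner_uid=0):
    """Configuration file: owned by root, mode 600 or 400, at most 10 MB."""
    return check_file(path, {0o600, 0o400}, 10 * 1024 * 1024, owner_uid)


def check_policy_library(path, owner_uid=0):
    """Third-party policy so file: owned by root, mode 500 or 700."""
    return check_file(path, {0o500, 0o700}, None, owner_uid)
```

For example, `check_config_file("/etc/etmem/slide_conf.yaml")` returns an empty list when the file satisfies the ownership, permission, and size rules. The `owner_uid` parameter exists only so that the check can be tried out as a non-root user.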
Instructions
etmem Configuration File
Before running the etMem process, the administrator needs to decide which processes require memory expansion, configure the process information in the etMem configuration file, and configure the number of memory scan loops, the scan interval, and the hot and cold memory thresholds.
The sample configuration files are stored in the /etc/etmem directory in the source package. There are three sample files, one for each function.
/etc/etmem/cslide_conf.yaml
/etc/etmem/slide_conf.yaml
/etc/etmem/thirdparty_conf.yaml
The samples are as follows:
# Example of the slide engine
# slide_conf.yaml
[project]
name=test
loop=1
interval=1
sleep=1
sysmem_threshold=50
swapcache_high_wmark=10
swapcache_low_wmark=6
[engine]
name=slide
project=test
[task]
project=test
engine=slide
name=background_slide
type=name
value=mysql
T=1
max_threads=1
swap_threshold=10g
swap_flag=yes
# Example of the cslide engine
# cslide_conf.yaml
[engine]
name=cslide
project=test
node_pair=2,0;3,1
hot_threshold=1
node_mig_quota=1024
node_hot_reserve=1024
[task]
project=test
engine=cslide
name=background_cslide
type=pid
value=23456
vm_flags=ht
anon_only=no
ign_host=no
# Example of the thirdparty engine
# thirdparty_conf.yaml
[engine]
name=thirdparty
project=test
eng_name=my_engine
libname=/usr/lib/etmem_fetch/my_engine.so
ops_name=my_engine_ops
engine_private_key=engine_private_value
[task]
project=test
engine=my_engine
name=background_third
type=pid
value=12345
task_private_key=task_private_value
Fields in the configuration file are described as follows.
Item | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Value Range | Example Description |
---|---|---|---|---|---|
[project] | Start flag of the project common configuration section | No | No | N/A | Start flag of the project configuration item, indicating that the following configuration items, before another [xxx] or to the end of the file, belong to the project section. |
name | Name of a project | Yes | Yes | A string of fewer than 64 characters | Identifies a project. When configuring an engine or task, you need to specify the project to which the engine or task is mounted. |
loop | Number of memory scan loops | Yes | Yes | 1 to 120 | loop=3 // Scan for three times. |
interval | Interval for memory scans | Yes | Yes | 1 to 1200 | interval=5 // The scan interval is 5s. |
sleep | Interval between large loops of memory scans and operations | Yes | Yes | 1 to 1200 | sleep=10 // The interval between large loops is 10s. |
sysmem_threshold | Configuration item of the slide engine, which specifies the threshold of the system memory swap-out | No | Yes | 0 to 100 | sysmem_threshold=50 // etMem triggers memory swap-out only when the remaining system memory is less than 50%. |
swapcache_high_wmark | Configuration item of the slide engine, which specifies the high watermark of the system memory occupied by the swap cache | No | Yes | 1 to 100 | swapcache_high_wmark=5 // The swap cache memory usage can be 5% of the system memory. If the usage exceeds 5%, etMem triggers swap cache reclamation. Note: The value of swapcache_high_wmark must be greater than that of swapcache_low_wmark. |
swapcache_low_wmark | Configuration item of the slide engine, which specifies the low watermark of the system memory occupied by the swap cache | No | Yes | [1, swapcache_high_wmark) | swapcache_low_wmark=3 // After swap cache reclamation is triggered, the system reclaims the swap cache memory until the usage is reduced to less than 3%. |
[engine] | Start flag of the engine common configuration section | No | No | N/A | Start flag of the engine configuration item, indicating that the following configuration items, before another [xxx] or to the end of the file, belong to the engine section. |
project | Project | Yes | Yes | A string of fewer than 64 characters | If a project named test already exists, enter project=test. |
engine | Engine | Yes | Yes | slide , cslide , or thirdparty | Identifies the slide, cslide, or thirdparty policy. |
node_pair | Configuration item of the cslide engine, which specifies the node pair of the AEP and DRAM in the system | Mandatory when engine is set to cslide | Yes | Node IDs of the AEP and DRAM are configured in pairs and separated by commas (,). Node pairs are separated by semicolons (;). | node_pair=2,0;3,1 |
hot_threshold | Configuration item of the cslide engine, which specifies the threshold of the hot and cold memory | Mandatory when engine is set to cslide | Yes | An integer greater than or equal to 0 and less than or equal to INT_MAX | hot_threshold=3 // Memory accessed fewer than 3 times is identified as cold memory. |
node_mig_quota | Configuration item of the cslide engine, which specifies the maximum unidirectional traffic during each migration between the DRAM and AEP | Mandatory when engine is set to cslide | Yes | An integer greater than or equal to 0 and less than or equal to INT_MAX | node_mig_quota=1024 // The unit is MB. A maximum of 1,024 MB data can be migrated from the AEP to the DRAM or from the DRAM to the AEP at a time. |
node_hot_reserve | Configuration item of the cslide engine, which specifies the size of the reserved space for the hot memory in the DRAM | Mandatory when engine is set to cslide | Yes | An integer greater than or equal to 0 and less than or equal to INT_MAX | node_hot_reserve=1024 // The unit is MB. When the hot memory of all VMs is greater than the value of this configuration item, the hot memory is migrated to the AEP. |
eng_name | Configuration item of the thirdparty engine, which specifies the engine name and is used for task mounting | Mandatory when engine is set to thirdparty | Yes | A string of fewer than 64 characters | eng_name=my_engine // To mount a task to the thirdparty engine, you can enter engine=my_engine in the task. |
libname | Configuration item of the thirdparty engine, which specifies the path of the dynamic library of the third-party policy. The path must be absolute. | Mandatory when engine is set to thirdparty | Yes | A string of fewer than 256 characters | libname=/usr/lib/etmem_fetch/code_test/my_engine.so |
ops_name | Configuration item of the thirdparty engine, which specifies the name of the operator in the dynamic library of the third-party policy | Mandatory when engine is set to thirdparty | Yes | A string of fewer than 256 characters | ops_name=my_engine_ops // Name of the structure of the third-party policy implementation interface |
engine_private_key | (Optional) Configuration item of the thirdparty engine. This configuration item is reserved for the third-party policy to parse private parameters. | No | No | Configured based on the private parameters of the third-party policy | Set this configuration item based on the private engine configuration items of the third-party policy. |
[task] | Start flag of the task common configuration section | No | No | N/A | Start flag of the task configuration item , indicating that the following configuration items, before another [xxx] or to the end of the file, belong to the task section. |
project | Project to which a task is mounted | Yes | Yes | A string of fewer than 64 characters | If a project named test already exists, enter project=test. |
engine | Engine to which a task is mounted | Yes | Yes | A string of fewer than 64 characters | Specifies the engine to which a task is mounted. |
name | Name of a task | Yes | Yes | A string of fewer than 64 characters | name=background1 // The task name is background1. |
type | Method of identifying the target process | Yes | Yes | pid or name | pid indicates that the process is identified by the process ID, and name indicates that the process is identified by the process name. |
value | Value that identifies the target process | Yes | Yes | Actual process ID/name | This configuration item is used together with the type configuration item to specify the ID or name of the target process. Ensure that the configuration is correct and unique. |
T | Task configuration item of the slide engine, which specifies the threshold of the hot and cold memory | Mandatory when engine is set to slide | Yes | 0 to loop x 3 | T=3 // Memory accessed fewer than 3 times is identified as cold memory. |
max_threads | Task configuration item of the slide engine, which specifies the maximum number of threads in the internal thread pool of etmemd. Each thread processes a memory scan+operation task of a process or child process. | No | Yes | 1 to 2 x Number of cores + 1. The default value is 1 . | This configuration item controls the number of internal processing threads of etmemd. When the target process has multiple child processes, a larger value of this configuration item indicates a larger number of concurrent executions but more occupied resources. |
vm_flags | Task configuration item of the cslide engine, which specifies the flag of the VMA to be scanned. If this configuration item is not configured, the VMA is not distinguished. | No | Yes | The value is a string of fewer than 256 characters. Different flags are separated by spaces. | vm_flags=ht // Scan the VMA memory whose flag is ht (huge page). |
anon_only | Task configuration item of the cslide engine, which specifies whether to scan only anonymous pages | No | Yes | yes or no | anon_only=no // yes indicates that only anonymous pages are scanned. no indicates that non-anonymous pages are also scanned. |
ign_host | Task configuration item of the cslide engine, which specifies whether to ignore the page table scan information on the host | No | Yes | yes or no | ign_host=no // yes : Ignore; no : Do not ignore. |
task_private_key | (Optional) Task configuration item of the thirdparty engine. This configuration item is reserved for the task of the third-party policy to parse private parameters. | No | No | Configured based on the private parameters of the third-party policy | Configured based on the private task parameters of the third-party policy |
swap_threshold | Configuration item of the slide engine, which specifies the threshold of the process memory swap-out | No | Yes | Absolute value of the available memory of a process | swap_threshold=10g // If the memory usage of a process is less than 10 GB, swap-out is not triggered. In the current version, only g/G can be used as the absolute memory unit. This configuration item is used together with sysmem_threshold. When the system memory is lower than the threshold, the system checks the threshold of the processes in the allowlist. |
swap_flag | Configuration item of the slide engine, which specifies process memory to be swapped out | No | Yes | yes or no | swap_flag=yes // Specify process memory to be swapped out. |
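The value ranges in the table lend themselves to a quick sanity check before a file is passed to etmem obj add. The following Python sketch parses the [section] and key=value layout used by the sample files and validates the [project] items against the documented ranges. It is only an illustration: `parse_etmem_conf` and `validate_project` are hypothetical helpers, treating lines that start with # as comments is an assumption, and a non-numeric value simply raises ValueError.

```python
def parse_etmem_conf(text):
    """Parse [section] / key=value text into a list of (section, items) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):   # assumption: '#' starts a comment
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return sections


def validate_project(items):
    """Return a list of problems found in the items of a [project] section."""
    problems = []
    if not 0 < len(items.get("name", "")) < 64:
        problems.append("name must be a string of fewer than 64 characters")
    for key, lo, hi in (("loop", 1, 120), ("interval", 1, 1200), ("sleep", 1, 1200)):
        if key not in items:
            problems.append(f"{key} is mandatory")
        elif not lo <= int(items[key]) <= hi:
            problems.append(f"{key} must be between {lo} and {hi}")
    if "sysmem_threshold" in items and not 0 <= int(items["sysmem_threshold"]) <= 100:
        problems.append("sysmem_threshold must be between 0 and 100")
    high = items.get("swapcache_high_wmark")
    low = items.get("swapcache_low_wmark")
    if high is not None and not 1 <= int(high) <= 100:
        problems.append("swapcache_high_wmark must be between 1 and 100")
    if high is not None and low is not None and not 1 <= int(low) < int(high):
        problems.append("swapcache_low_wmark must be in [1, swapcache_high_wmark)")
    return problems


SAMPLE = """\
# slide_conf.yaml
[project]
name=test
loop=1
interval=1
sleep=1
sysmem_threshold=50
swapcache_high_wmark=10
swapcache_low_wmark=6
[engine]
name=slide
project=test
"""

if __name__ == "__main__":
    for section, items in parse_etmem_conf(SAMPLE):
        if section == "project":
            print(validate_project(items) or "project section looks valid")
```

Sections are kept in a list rather than a dictionary so that a file containing several sections of the same kind would not lose entries.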
Starting the etmemd Server
When using the services provided by etMem, modify the corresponding configuration file as required, and then run the etmemd server to operate on the memory of the target process. In addition to starting the etmemd process directly from the binary on the CLI, you can configure a service file so that etmemd is started through systemctl. In this scenario, you need to use the --mode-systemctl (-m) parameter to enable the function.
How to Use
You can run the following command to start the etmemd server:
etmemd -l 0 -s etmemd_socket
Or
etmemd --log-level 0 --socket etmemd_socket
In the preceding commands, 0 (the value of -l) and etmemd_socket (the value of -s) are user-defined. For details about the parameters, see the following table.
Command-Line Options
Option | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Value Range | Example Description |
---|---|---|---|---|---|
-l or \-\-log-level | etmemd log level. | No | Yes | 0 to 3 | 0: debug level. 1: info level. 2: warning level. 3: error level. Only logs of the level that is higher than or equal to the configured level are recorded in the /var/log/message file. |
-s or \-\-socket | Name of the etmemd listener, which is used to interact with the client. | Yes | Yes | A string of fewer than 107 characters | Specifies the name of the server listener. |
-m or \-\-mode-systemctl | Starts the etmemd server in systemctl mode. | No | No | N/A | The -m option must be specified in the service file. |
-h or \-\-help | Prints help information. | No | No | N/A | If this option is specified, the command execution exits after the command output is printed. |
Adding or Deleting a Project, Engine, or Task on the etMem Client
Scenarios
The administrator adds an etMem project, engine, or task. (A project can contain multiple etMem engines, and an engine can contain multiple tasks.)
The administrator deletes an existing etMem project, engine, or task. (Before a project is deleted, all tasks in the project automatically stop.)
How to Use
After the etmemd server runs properly, you can use the obj parameter on the etMem client to add or delete a project, engine, or task. The project, engine, or task is identified based on the content configured in the configuration file.
Add an object.
etmem obj add -f /etc/etmem/slide_conf.yaml -s etmemd_socket
Or
etmem obj add --file /etc/etmem/slide_conf.yaml --socket etmemd_socket
Delete an object.
etmem obj del -f /etc/etmem/slide_conf.yaml -s etmemd_socket
Or
etmem obj del --file /etc/etmem/slide_conf.yaml --socket etmemd_socket
Command-Line Options
Option | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Example Description |
---|---|---|---|---|
-f or \-\-file | Configuration file of the specified object | Yes | Yes | Specifies the file path. |
-s or \-\-socket | Name of the socket for communicating with the etmemd server. The value must be the same as that specified when the etmemd server is started. | Yes | Yes | This option is mandatory. When there are multiple etmemd servers, the administrator selects an etmemd server to communicate with. |
Querying, Starting, or Stopping a Project on the etMem Client
Scenarios
After adding a project by running the etmem obj add command, the administrator can start or stop the etMem project before running the etmem obj del command to delete the project.
The administrator starts an added project.
The administrator stops a project that has been started.
When the administrator runs the obj del command to delete a project, the project automatically stops if it has been started.
How to Use
For a project that has been successfully added, you can run the etmem project command to query, start, or stop the project. Example commands are as follows:
Query a project.
etmem project show -n test -s etmemd_socket
Or
etmem project show --name test --socket etmemd_socket
Start a project.
etmem project start -n test -s etmemd_socket
Or
etmem project start --name test --socket etmemd_socket
Stop a project.
etmem project stop -n test -s etmemd_socket
Or
etmem project stop --name test --socket etmemd_socket
Print help information.
etmem project help
Command-Line Options
Option | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Example Description |
---|---|---|---|---|
-n or \-\-name | Project name | Yes | Yes | Project name, which corresponds to the configuration file. |
-s or \-\-socket | Name of the socket for communicating with the etmemd server. The value must be the same as that specified when the etmemd server is started. | Yes | Yes | This option is mandatory. When there are multiple etmemd servers, the administrator selects an etmemd server to communicate with. |
Performing Memory Swap-out on the etMem Client based on the Memory Swap-out Threshold and Flag
Among the currently supported policies, only the slide policy supports private functions and features.
- Swapping out process or system memory based on threshold
To achieve optimal service performance, you need to consider when etMem swaps out memory. Memory is not swapped out when the available system memory is sufficient and the system memory pressure is low, or when the memory usage of a process is low. Thresholds are provided to control system memory swap-out and process memory swap-out.
- Swapping out the specified process memory
In storage environments, I/O latency-sensitive service processes should not have their memory swapped out. Therefore, a mechanism is provided for services to specify the memory that can be swapped out.
You can add the sysmem_threshold, swap_threshold, and swap_flag parameters to the configuration file. For details, see the description of the etMem configuration file.
# slide_conf.yaml
[project]
name=test
loop=1
interval=1
sleep=1
sysmem_threshold=50
[engine]
name=slide
project=test
[task]
project=test
engine=slide
name=background_slide
type=name
value=mysql
T=1
max_threads=1
swap_threshold=10g
swap_flag=yes
Swapping Out System Memory Based on Threshold
In the configuration file, sysmem_threshold indicates the threshold for system memory swap-out. The value of sysmem_threshold is a percentage that ranges from 0 to 100. If sysmem_threshold is configured in the configuration file, etMem triggers memory swap-out only when the percentage of available system memory is less than the value of sysmem_threshold.
Procedure:
1. Prepare the configuration file. Configure the sysmem_threshold parameter in the configuration file, for example, sysmem_threshold=20.
2. Start the server, and add and start a project.
etmemd -l 0 -s monitor_app &
etmem obj add -f etmem_config -s monitor_app
etmem project start -n test -s monitor_app
etmem project show -s monitor_app
3. Check the memory swap-out result. etMem triggers memory swap-out only when the available system memory is less than 20%.
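To reason about when the system-level threshold would take effect, the comparison can be sketched in Python. This sketch assumes that the available share is MemAvailable divided by MemTotal as reported in /proc/meminfo; the accounting that etMem actually uses may differ. `parse_meminfo` and `should_swap_out` are hypothetical helper names, and the sample values are illustrative.

```python
def parse_meminfo(text):
    """Return {field: value in kB} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def should_swap_out(meminfo, sysmem_threshold):
    """True when the available share of memory (percent) is below the threshold."""
    available_pct = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    return available_pct < sysmem_threshold


SAMPLE = """\
MemTotal:        2696192 kB
MemFree:          331980 kB
MemAvailable:    2249308 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)           # on a live system, read /proc/meminfo
    print(should_swap_out(info, 20))       # about 83% is available, so no swap-out
```

With sysmem_threshold=20, swap-out would start under this model only after the available share falls below 20%.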
Swapping Out Process Memory Based on Threshold
In the configuration file, swap_threshold indicates the threshold for process memory swap-out. swap_threshold specifies the absolute value of the process memory usage (number+g/G). If swap_threshold is configured in the configuration file, etMem will not trigger memory swap-out for a process when the memory usage of the process is less than the value of swap_threshold.
Procedure:
1. Prepare the configuration file. Configure the swap_threshold parameter in the configuration file, for example, swap_threshold=5g.
2. Start the server, and add and start a project.
etmemd -l 0 -s monitor_app &
etmem obj add -f etmem_config -s monitor_app
etmem project start -n test -s monitor_app
etmem project show -s monitor_app
3. Check the memory swap-out result. etMem triggers memory swap-out only when the absolute value of the memory occupied by the process is greater than 5 GB.
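The process-level threshold can be modeled in the same way. The Python sketch below converts a swap_threshold string to bytes and compares it with a process memory figure. It assumes that g/G means 2^30 bytes and that a process whose usage exactly equals the threshold is eligible; neither detail is stated in this document. The helper names are hypothetical.

```python
import re

GIB = 1024 ** 3  # assumption: "g" is interpreted as 2**30 bytes


def parse_swap_threshold(value):
    """Convert a string such as '5g' or '10G' to bytes; only g/G is accepted."""
    match = re.fullmatch(r"(\d+)[gG]", value.strip())
    if match is None:
        raise ValueError(f"invalid swap_threshold: {value!r}")
    return int(match.group(1)) * GIB


def process_may_swap(usage_bytes, swap_threshold):
    """True when the process memory usage is not below the threshold."""
    return usage_bytes >= parse_swap_threshold(swap_threshold)


if __name__ == "__main__":
    print(process_may_swap(6 * GIB, "5g"))   # True: usage above the threshold
    print(process_may_swap(1 * GIB, "5g"))   # False: usage below the threshold
```

Rejecting units other than g/G mirrors the note that only g/G can currently be used as the absolute memory unit.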
Swapping Out the Specified Process Memory
In the configuration file, swap_flag specifies the process memory that can be swapped out. swap_flag can be set to yes or no. If swap_flag is set to no or not set in the configuration file, the memory swap-out function of etMem remains unchanged. If swap_flag is set to yes, only the specified process memory can be swapped out.
Procedure:
1. Prepare the configuration file. Configure the swap_flag parameter in the configuration file, for example, swap_flag=yes.
2. Mark the process memory to be swapped out.
madvise(addr_start, addr_len, MADV_SWAPFLAG)
3. Start the server, and add and start a project.
etmemd -l 0 -s monitor_app &
etmem obj add -f etmem_config -s monitor_app
etmem project start -n test -s monitor_app
etmem project show -s monitor_app
4. Check the memory swap-out result. Only the marked process memory is swapped out. Other memory is retained in the DRAM and will not be swapped out.
When a specified page of a process is to be swapped out, an ioctl call is added to the original scan interface idle_pages to ensure that VMAs without the specific flag are neither scanned nor swapped out.
Scan Management Interface
Prototype
ioctl(fd, cmd, void *arg);
Input parameters
1. fd: file descriptor, which is obtained by calling open on /proc/pid/idle_pages.
2. cmd: controls the scanning behavior. Currently, the following commands are supported:
VMA_SCAN_ADD_FLAGS: adds a VMA swap-out flag. Only VMAs with the specified flag are scanned.
VMA_SCAN_REMOVE_FLAGS: removes the added VMA swap-out flag.
3. arg: int pointer argument, which is used to transfer the specific flag mask. Currently, only the following argument is supported:
VMA_SCAN_FLAG: Before the etmem_scan.ko module starts scanning, the `walk_page_test` interface is called to check whether the VMA address meets the scanning requirements. If this flag is set, only the VMA address segments with the specific swap-out flag are scanned, and other VMA addresses are ignored.
Return value
1. If the operation is successful, 0 is returned.
2. If the operation fails, a non-zero value is returned.
Note
All unsupported flags are ignored, but no error is returned.
Reclaiming Swap Cache Memory on the etMem Client
The user-mode etMem initiates memory eviction and reclamation and interacts with the kernel-mode memory reclamation module by writing to the procfs interface. The kernel-mode memory reclamation module parses the virtual addresses delivered by the user-mode etMem, obtains the pages corresponding to the addresses, and calls the native kernel interface to swap out the memory corresponding to the pages for reclamation. During memory swap-out, the swap cache occupies a certain amount of system memory. To further save memory, the swap cache memory reclamation function is added.
You can add the swapcache_high_wmark and swapcache_low_wmark parameters to the configuration file to use this function.
swapcache_high_wmark: high watermark of the system memory that can be occupied by the swap cache.
swapcache_low_wmark: low watermark of the system memory that can be occupied by the swap cache.
After performing a memory swap-out, etMem checks the memory usage of the swap cache. If the memory usage exceeds the high watermark, etMem delivers the ioctl command in swap_pages to trigger swap cache memory reclamation. The reclamation stops when the memory usage comes down to the low watermark.
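The two watermarks form a hysteresis band: nothing happens until usage rises above the high watermark, and once reclamation starts it continues down to the low watermark. The following Python sketch simulates that behavior for a single check. It is a model only; `reclaim_swapcache` is a hypothetical name and the fixed `step_pct` stands in for whatever amount one real reclamation pass frees.

```python
def reclaim_swapcache(usage_pct, high_wmark, low_wmark, step_pct=1.0):
    """Simulate one check of swap cache usage (percent of system memory).

    Returns the usage after the check: unchanged if it does not exceed the
    high watermark, otherwise reduced until it is no longer above the low one.
    """
    if not 1 <= low_wmark < high_wmark <= 100:
        raise ValueError("require 1 <= low_wmark < high_wmark <= 100")
    if usage_pct <= high_wmark:
        return usage_pct                             # below the trigger point
    while usage_pct > low_wmark:
        usage_pct = max(usage_pct - step_pct, 0.0)   # one simulated reclaim pass
    return usage_pct


if __name__ == "__main__":
    print(reclaim_swapcache(4.0, high_wmark=5, low_wmark=3))  # 4.0, no reclamation
    print(reclaim_swapcache(7.0, high_wmark=5, low_wmark=3))  # 3.0, reclaimed
```

Keeping a gap between the two watermarks avoids triggering reclamation repeatedly when usage hovers around a single limit.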
The following is an example of parameter configuration. For details, see the sections related to the etMem configuration file.
# slide_conf.yaml
[project]
name=test
loop=1
interval=1
sleep=1
swapcache_high_wmark=5
swapcache_low_wmark=3
[engine]
name=slide
project=test
[task]
project=test
engine=slide
name=background_slide
type=name
value=mysql
T=1
max_threads=1
In the swap-out scenario, the swap cache memory needs to be reclaimed to further save memory. An ioctl call is added to the swap_pages interface to set the swap cache watermark and to enable or disable swap cache memory reclamation.
Prototype
ioctl(fd, cmd, void *arg);
Input parameters
1. fd: file descriptor, which is obtained by calling open on /proc/pid/swap_pages.
2. cmd: controls the swap cache reclamation behavior. Currently, the following commands are supported:
RECLAIM_SWAPCACHE_ON: enables swap cache memory reclamation.
RECLAIM_SWAPCACHE_OFF: disables swap cache memory reclamation.
SET_SWAPCACHE_WMARK: specifies the swap cache memory watermark.
3. arg: int pointer argument. Currently, it is used only to transfer the swap cache watermark.
Return value
1. If the operation is successful, 0 is returned.
2. If the operation fails, a non-zero value is returned.
Note
All unsupported flags are ignored, but no error is returned.
Executing Private Engine Commands or Functions on the etMem Client
Among the supported policies, only the cslide policy supports private commands.
showtaskpages
showhostpages
You can run the commands to view the page access information related to the task and the system huge page usage on the host of the VM.
The following are example commands:
etmem engine showtaskpages <-t task_name> -n proj_name -e cslide -s etmemd_socket
etmem engine showhostpages -n proj_name -e cslide -s etmemd_socket
Note: `showtaskpages` and `showhostpages` support only the cslide engine.
Command-Line Options
Option | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Example Description |
---|---|---|---|---|
-n or \-\-proj_name | Project name | Yes | Yes | Specifies the name of an existing project to be executed. |
-s or \-\-socket | Name of the socket for communicating with the etmemd server. The value must be the same as that specified when the etmemd process is started. | Yes | Yes | This option is mandatory. When there are multiple etmemd servers, the administrator selects an etmemd server to communicate with. |
-e or \-\-engine | Name of the engine to be executed | Yes | Yes | Specifies the name of an existing engine to be executed. |
-t or \-\-task_name | Name of the task to be executed | No | Yes | Specifies the name of an existing task to be executed. |
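
The two private commands are often run together, so a small wrapper can help. The following is a minimal sketch that only assembles the command lines shown above. The function name `show_cslide_pages` and the `ETMEM` variable are hypothetical helpers for illustration and are not part of etMem.

```shell
#!/bin/sh
# ETMEM is a hypothetical override for the client binary (defaults to "etmem").
ETMEM="${ETMEM:-etmem}"

# show_cslide_pages PROJ_NAME SOCKET [TASK_NAME]
# Runs showtaskpages, then showhostpages, against the cslide engine.
# -t is passed only when a task name is given, because the option is not mandatory.
show_cslide_pages() {
    _proj=$1
    _sock=$2
    _task=${3:-}
    if [ -z "$_proj" ] || [ -z "$_sock" ]; then
        echo "usage: show_cslide_pages PROJ_NAME SOCKET [TASK_NAME]" >&2
        return 2
    fi
    if [ -n "$_task" ]; then
        "$ETMEM" engine showtaskpages -t "$_task" -n "$_proj" -e cslide -s "$_sock" || return 1
    else
        "$ETMEM" engine showtaskpages -n "$_proj" -e cslide -s "$_sock" || return 1
    fi
    "$ETMEM" engine showhostpages -n "$_proj" -e cslide -s "$_sock"
}
```

For example, `show_cslide_pages proj_name etmemd_socket task_name` runs both commands for one task. Setting `ETMEM=echo` prints the command lines instead of executing them, which is useful for a dry run.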
Enabling and Disabling the Kernel Swap Function
When etMem is used for memory expansion, you can choose whether to enable the kernel swap function. Disabling the native swap mechanism of the kernel prevents it from swapping out memory that should not be swapped out, which could otherwise cause exceptions in user-mode processes.
The sys interface is provided to implement the preceding control. The kobj object named kernel_swap_enable is created in the /sys/kernel/mm/swap directory. It is used to enable or disable kernel swap. The default value is true.
Examples:
# Enable kernel swap.
echo true > /sys/kernel/mm/swap/kernel_swap_enable
Or
echo 1 > /sys/kernel/mm/swap/kernel_swap_enable
# Disable kernel swap.
echo false > /sys/kernel/mm/swap/kernel_swap_enable
Or
echo 0 > /sys/kernel/mm/swap/kernel_swap_enable
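
When the switch is toggled from a script, it helps to validate the value and check that the file is writable first. The following is a minimal sketch under those assumptions. The function names `set_kernel_swap` and `get_kernel_swap` and the `KERNEL_SWAP_ENABLE_PATH` variable are hypothetical helpers, not part of etMem. The sysfs path is the one described above.

```shell
#!/bin/sh
# Path from the description above. It can be overridden, for example to try
# the helper on a scratch file without touching the running system.
KERNEL_SWAP_ENABLE_PATH="${KERNEL_SWAP_ENABLE_PATH:-/sys/kernel/mm/swap/kernel_swap_enable}"

# set_kernel_swap true|false|1|0
# Accepts only the four values used in the examples above.
set_kernel_swap() {
    case "$1" in
        true|false|1|0) ;;
        *)
            echo "usage: set_kernel_swap true|false|1|0" >&2
            return 2
            ;;
    esac
    if [ ! -w "$KERNEL_SWAP_ENABLE_PATH" ]; then
        echo "cannot write $KERNEL_SWAP_ENABLE_PATH (missing file or insufficient permission)" >&2
        return 1
    fi
    echo "$1" > "$KERNEL_SWAP_ENABLE_PATH"
}

# get_kernel_swap: prints the current content of the control file as is.
get_kernel_swap() {
    cat "$KERNEL_SWAP_ENABLE_PATH"
}
```

For example, `set_kernel_swap false && get_kernel_swap` disables kernel swap and then prints the value that the kernel reports.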
Automatically Starting etMem with System
Scenarios
etmemd allows you to configure the systemd configuration file and start the systemd service in fork mode.
How to Use
Write the service configuration file to start etmemd. Use the -m option to specify the mode. For example:
etmemd -l 0 -s etmemd_socket -m
Command-Line Options
Option | Description | Mandatory (Yes/No) | With Parameters (Yes/No) | Value Range | Example Description |
---|---|---|---|---|---|
-l or \-\-log-level | etmemd log level. | No | Yes | 0 to 3 | 0: debug level. 1: info level. 2: warning level. 3: error level. Only logs at or above the configured level are recorded in the /var/log/message file. |
-s or \-\-socket | Name of the etmemd listener, which is used to interact with the client. | Yes | Yes | A string of fewer than 107 characters | Name of the server listener |
-m or \-\-mode-systemctl | When etmemd is started as a service, this option must be specified in the command. | No | No | N/A | N/A |
-h or \-\-help | Prints help information. | No | No | N/A | If this option is specified, the command execution exits after the command output is printed. |
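
The following is a minimal sketch of how such a service file could be generated. It assumes that etmemd is installed at /usr/bin/etmemd, that `Type=forking` corresponds to the fork mode mentioned above, and that the unit is named etmemd.service. These three points, the `write_etmemd_unit` function name, and the `ETMEMD_BIN` and `ETMEMD_SOCKET` variables are assumptions for illustration. Adjust them to your installation.

```shell
#!/bin/sh
# Assumed install path and socket name; override them to match your system.
ETMEMD_BIN="${ETMEMD_BIN:-/usr/bin/etmemd}"
ETMEMD_SOCKET="${ETMEMD_SOCKET:-etmemd_socket}"

# write_etmemd_unit FILE
# Writes a systemd unit that starts etmemd with the -m option.
write_etmemd_unit() {
    cat > "$1" <<EOF
[Unit]
Description=etmemd memory expansion daemon

[Service]
Type=forking
ExecStart=$ETMEMD_BIN -l 0 -s $ETMEMD_SOCKET -m

[Install]
WantedBy=multi-user.target
EOF
}

# Typical use as root (the unit name is an assumption):
#   write_etmemd_unit /etc/systemd/system/etmemd.service
#   systemctl daemon-reload
#   systemctl enable --now etmemd.service
```

The `[Install]` section makes `systemctl enable` start the service at boot, which is the goal of this scenario.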
Supporting Third-Party Memory Extension Policies
Scenarios
etMem allows you to register third-party memory extension policies and provides the dynamic library of the scan module. When etMem is running, the third-party policy eviction algorithm is used to evict the memory.
You can use the dynamic library of the scan module provided by etMem and implement the interfaces in the structure required for connecting to etMem.
How to Use
To use a third-party extended eviction policy, perform the following steps:
1. Invoke the scan interface provided by the scan module as required.
2. Implement each interface based on the function template provided in the etMem header file and encapsulate the interfaces into structures.
3. Compile the dynamic library of the third-party extended eviction policy.
4. Specify the thirdparty engine in the configuration file as required.
5. Enter the dynamic library name and interface structure name in the task field in the configuration file as required.
Other operations are similar to those of other etMem engines.
Interface structure templates:
struct engine_ops {
    /* Parse the private parameters of the engine. If there are private parameters, implement this interface; otherwise, set it to NULL. */
    int (*fill_eng_params)(GKeyFile *config, struct engine *eng);
    /* Clear the private parameters of the engine. If there are private parameters, implement this interface; otherwise, set it to NULL. */
    void (*clear_eng_params)(struct engine *eng);
    /* Parse the private parameters of the task. If there are private parameters, implement this interface; otherwise, set it to NULL. */
    int (*fill_task_params)(GKeyFile *config, struct task *task);
    /* Clear the private parameters of the task. If there are private parameters, implement this interface; otherwise, set it to NULL. */
    void (*clear_task_params)(struct task *tk);
    /* Interface for starting a task */
    int (*start_task)(struct engine *eng, struct task *tk);
    /* Interface for stopping a task */
    void (*stop_task)(struct engine *eng, struct task *tk);
    /* Fill in the private parameters related to the PID. */
    int (*alloc_pid_params)(struct engine *eng, struct task_pid **tk_pid);
    /* Destroy the private parameters related to the PID. */
    void (*free_pid_params)(struct engine *eng, struct task_pid **tk_pid);
    /* Private commands required by third-party policies. If no private command is required, set it to NULL. */
    int (*eng_mgt_func)(struct engine *eng, struct task *tk, char *cmd, int fd);
};
External interfaces of the scan module
Interface Name | Interface Description |
---|---|
etmemd_scan_init | Initializes the scan module. |
etmemd_scan_exit | Destructs the scan module. |
etmemd_get_vmas | Obtains the VMAs to be scanned. |
etmemd_free_vmas | Releases the VMAs scanned by etmemd_get_vmas. |
etmemd_get_page_refs | Scans pages in VMAs. |
etmemd_free_page_refs | Releases the linked list of page access information obtained by etmemd_get_page_refs. |
In the VM scanning scenario, the ioctl call is added to the original scan interface idle_pages to provide a mechanism for distinguishing the EPT scanning granularity and determining whether to ignore the page access flag on the host.
In the scenario where a specified page of a process is swapped out, the ioctl call is added to the original scan interface idle_pages to ensure that the VMA without a specific flag is not scanned or swapped out.
Scan management interface:
Prototype
ioctl(fd, cmd, void *arg);
Input parameters
1. fd: file descriptor, which is obtained by the open call in /proc/pid/idle_pages.
2. cmd: controls the scanning behavior. Currently, the following commands are supported:
   - IDLE_SCAN_ADD_FLAG: adds a scan flag.
   - IDLE_SCAN_REMOVE_FLAGS: removes a scan flag.
   - VMA_SCAN_ADD_FLAGS: adds a VMA swap-out flag. Only VMAs with the specified flag are scanned.
   - VMA_SCAN_REMOVE_FLAGS: removes the added VMA swap-out flag.
3. arg: int pointer argument, which is used to pass the specific flag mask. Currently, the following flags are supported:
   - SCAN_AS_HUGE: checks whether a page has been accessed at the 2 MB huge page granularity when scanning the EPT page table. If this flag is not set, scanning is performed at the granularity of the EPT page table.
   - SCAN_IGN_HUGE: ignores the access flag in the page table on the host side during VM scanning. If this flag is not set, the access flag in the page table on the host side is not ignored.
   - VMA_SCAN_FLAG: before the etmem_scan.ko module starts scanning, the `walk_page_test` interface is called to check whether the VMA address meets the scanning requirements. If this flag is set, only the VMA address segment with a specific swap-out flag is scanned, and other VMA addresses are ignored.
Return value
1. If the operation is successful, 0 is returned.
2. If the operation fails, a non-zero value is returned.
Note
All unsupported flags are ignored, but no error is returned.
The following is an example of the configuration file. For details, see the configuration file description.
# thirdparty
[engine]
name=thirdparty
project=test
eng_name=my_engine
libname=/user/lib/etmem_fetch/code_test/my_engine.so
ops_name=my_engine_ops
engine_private_key=engine_private_value
[task]
project=test
engine=my_engine
name=background1
type=pid
value=1798245
task_private_key=task_private_value
Notes:
- You must use the dynamic library of the scan module provided by etMem and implement the interfaces in the structure required for connecting to etMem.
- The fd field in the eng_mgt_func interface cannot be set to 0xff or 0xfe.
- Multiple third-party policy dynamic libraries can be added to a project. They are differentiated by eng_name in the configuration file.
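
Before a third-party configuration file is loaded, a quick consistency check can catch missing keys. The following is a minimal sketch that reads the keys shown in the example above. It assumes that keys are written as `key=value` without surrounding spaces, that the file holds one `[engine]` and one `[task]` section, and that the `engine` value of the task must match `eng_name`, as it does in the example. The function names `ini_get` and `check_thirdparty_conf` are hypothetical and not part of etMem.

```shell
#!/bin/sh
# ini_get FILE SECTION KEY
# Prints the first value of KEY found inside [SECTION].
ini_get() {
    awk -F= -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section && $1 == key { print substr($0, length(key) + 2); exit }
    ' "$1"
}

# check_thirdparty_conf FILE
# Returns 0 when the keys used by the thirdparty engine example are present.
check_thirdparty_conf() {
    _conf=$1
    if [ "$(ini_get "$_conf" engine name)" != "thirdparty" ]; then
        echo "[engine] name is not thirdparty" >&2
        return 1
    fi
    for _key in eng_name libname ops_name; do
        if [ -z "$(ini_get "$_conf" engine "$_key")" ]; then
            echo "[engine] is missing $_key" >&2
            return 1
        fi
    done
    # Assumption taken from the example: the task refers to the engine by eng_name.
    if [ "$(ini_get "$_conf" task engine)" != "$(ini_get "$_conf" engine eng_name)" ]; then
        echo "[task] engine does not match [engine] eng_name" >&2
        return 1
    fi
    return 0
}
```

For example, `check_thirdparty_conf my_engine.conf && echo ok` prints `ok` only when all checks pass. The check does not verify that the library named by libname exists or exports the structure named by ops_name.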
etMem Client and Server Help
Run the following command to print the help information about the etMem server:
etmemd -h
Or
etmemd --help
Run the following command to print the help information about the etMem client:
etmem help
Run the following command to print help information about projects, engines, and tasks on the etMem client:
etmem obj help
Run the following command to print the help information about the project on the etMem client:
etmem project help
How to Contribute
- Fork this repository.
- Create a branch.
- Commit your code.
- Create a pull request (PR).