AI4C User Guide
1 Introduction to AI4C
AI4C is an AI-assisted compiler suite: a framework that enables compilers to integrate machine-learning-driven compilation optimization.
2 Software Architecture
This framework consists of the following modules. The automatic compilation tuning tool depends on the Python environment.
- The inference engine for AI-assisted compilation optimization drives the compiler to use the results obtained from AI model inference in the optimization pass to implement compilation optimization.
  - Currently, the AI-enabled optimization pass in GCC is implemented in the form of compiler plugins and is decoupled from the main version of the compiler.
- The automatic compilation tuning tool uses the external tuning tool (OpenTuner) to drive the compiler to perform multi-layer automatic compilation tuning. Currently, the GCC and LLVM compilers are supported.
  - Option tuning tool, which is used to optimize application-level compilation options.
  - Compilation tuning tool, which is implemented based on Autotuner and can implement fine-grained and coarse-grained compilation tuning.
    - Fine-grained tuning: tunes key optimization parameters in the optimization pass, for example, the number of times that a loop is unrolled (unroll count).
    - Coarse-grained tuning: tunes function-level compilation options.
Future planning:
- [ ] Integrate the LLVM compilation optimization model of ACPO and extract the related code of ACPO LLVM into a plugin to decouple it from the main version of LLVM.
- [ ] Enable the AI4C framework to support inference of more open-source machine learning frameworks (PyTorch-LibTorch and TensorFlow-LiteRT).
- [ ] Provide more AI-assisted compilation optimization models and corresponding compiler plugins.
- [ ] Integrate new search algorithms (based on white-box information) and optimize the parameter search space (tuning of hotspot functions).
- [ ] Support tuning of JDK compilation parameters.
3 Installation and Build of AI4C
3.1 Directly Installing AI4C
If you are using the latest openEuler system (24.03-LTS-SP2) and only want to use the existing features of AI4C, you can directly install the AI4C package.
yum install -y AI4C
If you use the AI4C feature of another version or install AI4C on another OS, you need to rebuild AI4C. Perform the following steps:
3.2 (Recommended) RPM Package Build and Installation Process
Run the following commands as the root user to install rpmbuild and rpmdevtools.
# Install rpmbuild:
yum install dnf-plugins-core rpm-build
# Install rpmdevtools:
yum install rpmdevtools
Generate the rpmbuild folder in the /root directory:
rpmdev-setuptree
# Check the automatically generated directory structure:
ls ~/rpmbuild/
BUILD BUILDROOT RPMS SOURCES SPECS SRPMS
Run the git clone https://atomgit.com/src-openeuler/AI4C.git command to pull code from the openEuler-24.03-LTS-SP2 branch of the target repository and save the target files to the corresponding folder in rpmbuild.
cp AI4C/AI4C-v%{version}-alpha.tar.gz ~/rpmbuild/SOURCES/
cp AI4C/*.patch ~/rpmbuild/SOURCES/
cp AI4C/AI4C.spec ~/rpmbuild/SPECS/
You can perform the following steps to generate the RPM package of AI4C:
# Install the dependencies required by AI4C.
yum-builddep ~/rpmbuild/SPECS/AI4C.spec
# Build the AI4C dependency package.
# If check-rpaths errors are reported, add QA_RPATHS=0x0002 before rpmbuild as follows:
# QA_RPATHS=0x0002 rpmbuild -ba ~/rpmbuild/SPECS/AI4C.spec
rpmbuild -ba ~/rpmbuild/SPECS/AI4C.spec
# Install the RPM package.
cd ~/rpmbuild/RPMS/<arch>
rpm -ivh AI4C-<version>-<release>.<arch>.rpm
Note: If file conflicts arise from older RPMs already installed on your system, address them with the following methods:
# Method 1: Install the new version forcibly.
rpm -ivh AI4C-<version>-<release>.<arch>.rpm --force
# Method 2: Update the installation package.
rpm -Uvh AI4C-<version>-<release>.<arch>.rpm
After the installation is complete, the following files will be generated in the system:
- /usr/bin/ai4c-*: wrapper of the AI-enabled compiler and automatic tuning tool
- /usr/lib64/libonnxruntime.so: dynamic library of the ONNX Runtime inference framework
- /usr/lib64/AI4C/*.onnx: AI-assisted compilation optimization model (in ONNX format)
- /usr/lib64/python<version>/site-packages/ai4c/lib/*.so:
  - Dynamic library of the inference engine for AI-assisted compilation optimization
  - Dynamic library of the compiler plugin for AI-assisted compilation optimization and compiler tuning
- /usr/lib64/python<version>/site-packages/ai4c/autotuner/*: files related to the coarse-grained and fine-grained tuning tools
- /usr/lib64/python<version>/site-packages/ai4c/optimizer/*: files related to AI-assisted compilation optimization
- /usr/lib64/python<version>/site-packages/ai4c/option_tuner/*: files related to application-level compilation option tuning
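The file list above can be turned into a quick post-installation check. The following is a minimal sketch, not part of AI4C: the helper name missing_artifacts is hypothetical, and python* is used as a glob stand-in for python<version>.

```python
import glob
import os

# Glob patterns taken from the file list above, relative to the filesystem root.
# "python*" stands in for python<version>.
EXPECTED_PATTERNS = [
    "usr/bin/ai4c-*",
    "usr/lib64/libonnxruntime.so",
    "usr/lib64/AI4C/*.onnx",
    "usr/lib64/python*/site-packages/ai4c/lib/*.so",
    "usr/lib64/python*/site-packages/ai4c/autotuner/*",
    "usr/lib64/python*/site-packages/ai4c/optimizer/*",
    "usr/lib64/python*/site-packages/ai4c/option_tuner/*",
]


def missing_artifacts(root="/", patterns=EXPECTED_PATTERNS):
    """Return the patterns that match no file under `root` (hypothetical helper)."""
    return [p for p in patterns if not glob.glob(os.path.join(root, p))]
```

Calling missing_artifacts() on an installed system should return an empty list; any pattern it returns points at a component that was not installed where this guide expects it.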
3.3 Source Code Build and Installation Process
The source code address of AI4C is https://atomgit.com/openeuler/AI4C.
3.3.1 Installing the ONNX Runtime Dependency
Solution 1:
Download version 1.16.3 from GitHub and decompress the .tgz file of the corresponding architecture. For example, in the AArch64 architecture, download onnxruntime-linux-aarch64-1.16.3.tgz.
Address: https://github.com/microsoft/onnxruntime/releases/tag/v1.16.3
Note: After the .tgz file is decompressed, the dynamic library libonnxruntime.so is stored in the lib directory. Before building the AI4C framework, rename the lib directory to lib64. Otherwise, the build may fail with an error indicating that -lonnxruntime cannot be found.
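The rename described in the note can be scripted so that it is safe to run more than once. This is a minimal sketch; the function name ensure_lib64 is hypothetical and the argument is whatever directory the .tgz file was decompressed into.

```python
from pathlib import Path


def ensure_lib64(onnxruntime_root):
    """Rename <root>/lib to <root>/lib64 if only lib exists; return the lib64 path."""
    root = Path(onnxruntime_root)
    lib, lib64 = root / "lib", root / "lib64"
    if lib64.is_dir():
        return lib64  # already in the layout the AI4C build expects
    if not lib.is_dir():
        raise FileNotFoundError(f"neither lib nor lib64 found under {root}")
    lib.rename(lib64)
    return lib64
```

The returned path is the directory that holds libonnxruntime.so after the rename, which is also the directory to add to LD_LIBRARY_PATH when ONNX Runtime lives outside the system folders.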
Solution 2:
Ensure that the following ONNX Runtime dependency packages have been installed:
yum install -y cmake make gcc gcc-c++ abseil-cpp-devel boost-devel bzip2 python3-devel python3-numpy python3-setuptools python3-pip
Use CMake to install ONNX Runtime.
cd path/to/your/AI4C/third_party/onnxruntime
cmake \
-DCMAKE_INSTALL_PREFIX=path/to/your/onnxruntime \
-Donnxruntime_BUILD_SHARED_LIB=ON \
-Donnxruntime_BUILD_UNIT_TESTS=ON \
-Donnxruntime_INSTALL_UNIT_TESTS=OFF \
-Donnxruntime_BUILD_BENCHMARKS=OFF \
-Donnxruntime_USE_FULL_PROTOBUF=ON \
-DPYTHON_VERSION=%{python3_version} \
-Donnxruntime_ENABLE_CPUINFO=ON \
-Donnxruntime_DISABLE_ABSEIL=ON \
-Donnxruntime_USE_NEURAL_SPEED=OFF \
-Donnxruntime_ENABLE_PYTHON=OFF \
-DCMAKE_BUILD_TYPE=Release \
-S cmake
make -j %{max_jobs} && make install
3.3.2 Installing Other Build Dependencies of AI4C
Check that the following dependencies have been installed:
yum install -y python3-wheel openssl openssl-devel yaml-cpp yaml-cpp-devel gcc-plugin-devel libstdc++-static
3.3.3 Building the AI4C Framework
cd path/to/your/AI4C/python
python3 setup.py bdist_wheel \
-Donnxruntime_ROOTDIR=path/to/your/onnxruntime \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=path/to/your/g++ \
-DCMAKE_C_COMPILER=path/to/your/gcc
pip3 install dist/ai4c-<version>-<python_version>-<python_version>-<os>_<arch>.whl --force-reinstall --no-deps
After the installation is complete, the following files exist in the system:
- path/to/your/pythonbin/ai4c-*: wrapper of the AI-enabled compiler and auto-tuning tool
- path/to/your/onnxruntime/lib64/libonnxruntime.so: dynamic library of the ONNX Runtime inference framework
- path/to/your/AI4C/models/*.onnx: AI-assisted compilation optimization model (in ONNX format)
- path/to/your/pythonlib/ai4c/lib/*.so:
  - Dynamic library of the inference engine for AI-assisted compilation optimization
  - Dynamic library of the compiler plugin for AI-assisted compilation optimization and compilation tuning
- path/to/your/pythonlib/ai4c/autotuner/*: files related to coarse-grained and fine-grained tuning tools
- path/to/your/pythonlib/ai4c/optimizer/*: files related to AI-assisted compilation optimization
- path/to/your/pythonlib/ai4c/option_tuner/*: files related to application-level compilation option tuning
Notes:
- path/to/your/pythonbin: After the installation is complete, you can run the which ai4c-gcc command to view the path of the bin directory.
- path/to/your/pythonlib: After the installation is complete, you can run the pip show ai4c command to view the path of the lib directory, namely, Location in the command output.
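The two lookups in these notes can also be done from Python. The sketch below is an illustration only: the function name locate_ai4c is hypothetical, and it assumes that the importable package is named ai4c, as the site-packages/ai4c paths in this guide suggest.

```python
import importlib.util
import shutil
from pathlib import Path


def locate_ai4c():
    """Return the pythonbin and pythonlib directories (None where not found)."""
    # Equivalent of `which ai4c-gcc`.
    wrapper = shutil.which("ai4c-gcc")
    # Find the package without importing it; None if it is not installed.
    spec = importlib.util.find_spec("ai4c")
    pkg_dirs = list(spec.submodule_search_locations or []) if spec else []
    return {
        "pythonbin": str(Path(wrapper).parent) if wrapper else None,
        # The parent of the package directory corresponds to `Location` in `pip show ai4c`.
        "pythonlib": str(Path(pkg_dirs[0]).parent) if pkg_dirs else None,
    }
```

Both values are None on a system where AI4C is not installed, which makes the function usable as a presence check as well.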
4 Usage Process
4.1 AI-Assisted Compilation Optimization
The current AI-assisted compilation optimization module consists of three parts:
- ONNX model, which is the trained model for AI-assisted compilation optimization.
- Compiler plugin (only GCC is supported currently), which is used to run the ONNX model inference and obtain tuning parameters.
- AI4Compiler framework, which provides the ONNX inference engine and GCC optimization compile commands.
You can train an AI model in advance using an open-source machine learning framework and export the model in ONNX format. In addition, provide a compiler plugin for the AI model. The plugin must contain at least three modules that have the following functions:
- Extracts the compiler input features required by the AI model.
- Drives the inference engine to call the AI model to perform inference.
- Labels the data structure of the inference result returned to the compiler.
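The division of labor between the three modules can be pictured with a small conceptual sketch. Real AI4C plugins are GCC plugins written against the compiler's internal data structures; everything below (the Loop record, the function names, and the toy_model heuristic) is hypothetical and only illustrates the data flow.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    """Hypothetical stand-in for a compiler-side loop record."""
    num_insns: int
    trip_count: int
    annotations: dict = field(default_factory=dict)


def extract_features(loop):
    # Module 1: turn compiler data into the model's input vector.
    return [float(loop.num_insns), float(loop.trip_count)]


def run_inference(model, features):
    # Module 2: hand the features to the inference engine (here: any callable).
    return model(features)


def annotate(loop, prediction):
    # Module 3: attach the result where the optimization pass can read it.
    loop.annotations["unroll_count"] = int(prediction)
    return loop


def toy_model(features):
    # Placeholder heuristic, NOT a trained model: unroll small loops more.
    num_insns, _trip_count = features
    return 8 if num_insns < 16 else 2
```

Chaining the three calls, annotate(loop, run_inference(toy_model, extract_features(loop))), mirrors the order in which a plugin would process each candidate during compilation.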
In the following example, you only need to add three plugin-related options to the command that compiles the target binary: the plugin path, the path of the AI model that the plugin uses, and the inference engine path. With these options, the compiler applies the model's inference results during compilation.
# If onnxruntime is installed in a non-system folder, set the environment variable.
# export LD_LIBRARY_PATH=path/to/your/onnxruntime/lib64/:$LD_LIBRARY_PATH
gcc_compiler=path/to/your/gcc
infer_engine_path=$(ai4c-gcc --inference-engine)
model_path=path/to/your/model.onnx
plugin_path=path/to/your/<model_plugin>.so
$gcc_compiler test.c -O2 -o test \
-fplugin=$plugin_path \
-fplugin-arg-<model_plugin>-model=$model_path \
-fplugin-arg-<model_plugin>-engine=$infer_engine_path
Currently, the supported plugins are stored in the same directory as $(ai4c-gcc --inference-engine), and the supported models are stored in path/to/your/AI4C/models.
Notes:
- The plugin must be built with the same compiler that is used to compile the target application to be optimized. Otherwise, compilation errors will occur due to inconsistent compiler versions.
- Currently, AI4C supports only the use of AI-assisted compilation optimization passes implemented in the cc1 phase of the GCC compiler in the form of plugins.
For details about the compiler plugin development and usage processes, see the AI-assisted compilation optimization manual and test cases.
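When the compile command is generated by a build script, the plugin options shown earlier in this section can be assembled programmatically. The sketch below relies on one GCC convention: the name in -fplugin-arg-<name>-<key> is the base name of the plugin's shared object without the .so suffix. The helper name plugin_flags and the sample paths are hypothetical.

```python
import os


def plugin_flags(plugin_path, engine_path, **models):
    """Build -fplugin/-fplugin-arg-* flags for one AI4C GCC plugin.

    `models` maps an argument key (for example model, inline_model,
    unroll_model) to an ONNX file path.
    """
    # GCC derives the plugin name from the shared object's base name
    # without the .so suffix.
    name = os.path.splitext(os.path.basename(plugin_path))[0]
    flags = [
        f"-fplugin={plugin_path}",
        f"-fplugin-arg-{name}-engine={engine_path}",
    ]
    flags += [f"-fplugin-arg-{name}-{key}={path}" for key, path in models.items()]
    return flags
```

For example, plugin_flags("/p/demo_plugin.so", "/e/engine.so", model="/m/a.onnx") yields the three options to append to the gcc command line; passing several keyword arguments covers plugins that accept more than one model.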
The following provides two examples of using AI-assisted compilation optimization models in different compilation phases. The loop unrolling and function inlining model is located in the cc1 compilation optimization phase, and the AI model adaptation and inference are implemented in the form of GCC plugins. The BOLT sampling basic block precision correction model is located in the BOLT post-link optimization phase, and the model adaptation layer is located in the LLVM-BOLT repository.
4.1.1 Loop Unrolling and Function Inlining Model
The compilation optimization options corresponding to the loop unrolling and function inlining model are as follows:
| Option Name | Description |
|---|---|
| -fplugin | Specifies the absolute path of the loop unrolling and function inlining plugin (-fplugin=/path/to/<ipa_inline_unroll_plugin>.so). |
| -fplugin-arg-<ipa_inline_unroll_plugin>-engine | Specifies the absolute path of the inference engine for the ONNX models (-fplugin-arg-<ipa_inline_unroll_plugin>-engine=/path/to/inference_engine.so), which must be enabled together with -fplugin. You can obtain the path of /path/to/inference_engine.so using ai4c-gcc --inference-engine. |
| -fplugin-arg-<ipa_inline_unroll_plugin>-inline_model | Specifies the absolute path of the ONNX model for function inlining (-fplugin-arg-<ipa_inline_unroll_plugin>-inline_model=/path/to/inline_model.onnx), which must be enabled together with -fplugin and -fplugin-arg-<ipa_inline_unroll_plugin>-engine. |
| -fplugin-arg-<ipa_inline_unroll_plugin>-unroll_model | Specifies the absolute path of the ONNX model for loop unrolling (-fplugin-arg-<ipa_inline_unroll_plugin>-unroll_model=/path/to/unroll_model.onnx), which must be enabled together with -fplugin and -fplugin-arg-<ipa_inline_unroll_plugin>-engine. |
You can enable multiple AI-assisted compilation and optimization models in a GCC plugin at the same time. For example:
gxx_compiler=path/to/your/g++
infer_engine_path=$(ai4c-gcc --inference-engine)
inline_model_path=path/to/your/inline_model.onnx
unroll_model_path=path/to/your/unroll_model.onnx
plugin_path=path/to/your/<ipa_inline_unroll_plugin>.so
$gxx_compiler test.cc -O3 -o test -funroll-loops \
-fplugin=$plugin_path \
-fplugin-arg-<ipa_inline_unroll_plugin>-engine=$infer_engine_path \
-fplugin-arg-<ipa_inline_unroll_plugin>-inline_model=$inline_model_path \
-fplugin-arg-<ipa_inline_unroll_plugin>-unroll_model=$unroll_model_path
4.1.2 Basic Block Accuracy Correction Model for BOLT Sampling
The BOLT optimization options corresponding to the basic block accuracy correction model for BOLT sampling are as follows:
| Option Name | Description |
|---|---|
| -block-correction | Enables the AI-based CFG BB count optimization. This option must be enabled together with the -model-path option to specify the ONNX model. |
| -model-path | Specifies the absolute path of the ONNX model (-model-path=/path/to/model.onnx). This option must be enabled together with the -block-correction option. |
| -annotate-threshold | Confidence threshold of the model prediction result. The default value is 0.95. |
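The pairing rule in the table (-block-correction and -model-path must be enabled together) is easy to check before invoking the compiler. The following is a minimal sketch with a hypothetical helper name; it inspects only the two options whose dependency the table states.

```python
def check_bolt_options(options):
    """Return a list of problems found in a list of BOLT option strings."""
    has_correction = "-block-correction" in options
    has_model = any(o.startswith("-model-path=") for o in options)
    problems = []
    if has_correction and not has_model:
        problems.append("-block-correction requires -model-path")
    if has_model and not has_correction:
        problems.append("-model-path requires -block-correction")
    return problems
```

An empty list means the combination is consistent with the table; the check says nothing about whether the model file itself exists.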
The custom optimization options in BOLT can be enabled by using the GCC -fbolt-option option. For example:
g++ -fbolt-use=<gcov_file> -fbolt-target=<bin_file> -fbolt-option=\"-block-correction -model-path=path/to/your/block_correction_model.onnx\"
4.2 Fine-Grained Tuning
Here, we use the fine-grained tuning of the loop unrolling optimization pass in GCC as an example to describe the usage process of the tuning tool.
The current fine-grained tuning module consists of two parts:
- Tuning configuration file (.ini) of the application: processes the compilation and execution of the application.
- Search space configuration file (YAML): configures the parameter search space in the Autotuner phase, which can be used to replace the default parameter search space.
The current fine-grained tuning is implemented based on Autotuner.
- In the generate phase of the compiler, a group of tunable compilation data structures and tunable coefficient sets are generated and saved in opp/*.yaml.
- Based on the provided compilation parameter search space (search_space.yaml) and tunable data structures, the Autotuner generates a group of tuning coefficients for each tunable data structure using the tuning algorithm, and saves the coefficients in input.yaml.
- In the autotune phase of the compiler, the tuning coefficients are marked to the corresponding data structures based on the hash value of the data structure in input.yaml.
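The three steps above can be pictured with a toy model of the data flow. This sketch is not the Autotuner implementation: it substitutes random selection for the real tuning algorithm, and the field names Hash and Args, as well as both function names, are hypothetical.

```python
import random

# Search space in the shape of search_space.yaml (UnrollCount enum values).
SEARCH_SPACE = {"UnrollCount": [0, 1, 2, 4, 8, 16, 32]}


def propose_coefficients(region_hashes, space, rng):
    """Step 2: pick one value per tunable parameter for every code-region hash.

    Random choice is a stand-in for the tuning algorithm.
    """
    return {
        h: {name: rng.choice(values) for name, values in space.items()}
        for h in region_hashes
    }


def apply_coefficients(regions, proposal):
    """Step 3: attach coefficients to the regions whose hash is in the proposal."""
    for region in regions:
        region["Args"] = proposal.get(region["Hash"], {})
    return regions
```

Here the list of hashes plays the role of the opp/*.yaml output from the generate phase, the proposal plays the role of input.yaml, and regions without a matching hash are left with no coefficients.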
Before enabling fine-grained tuning, install the following dependencies:
yum install -y BiSheng-Autotuner bisheng-opentuner
In the following test case, we will tune the loop unrolling parameters of CoreMark. First, prepare the tuning configuration file coremark_sample.ini for CoreMark as follows:
- Provide the application path and the commands for building and running the application.
- Add the dynamic library for fine-grained tuning -fplugin=%(PluginPath)s/rtl_unroll_autotune_plugin_gcc12.so to the basic compile command.
  - In the generate and autotune phases, add the corresponding input file of -fplugin-arg-rtl_unroll_autotune_plugin_gcc12-<stage>.
- Optionally, customize the paths of the configuration file for tuning structures (./opp/*.yaml) and the input file generated by the autotuner (input.yaml).
[DEFAULT] # optional
# PluginPath = /path/to/gcc-plugins
[Environment Setting] # optional
# prepend a list of paths into the PATH in order.
# PATH = /path/to/bin
# you can also set other environment variables here too
[Compiling Setting] # required
# NOTE: ConfigFilePath is set to the path to the current config file automatically by default.
CompileDir = /path/to/coremark
LLVMInputFile = %(CompileDir)s/input.yaml
# OppDir and OppCompileCommand are optional,
# do not have to specify this if not using auto_run sub-command
OppDir = autotune_datadir/opp
CompilerCXX = /path/to/bin/gcc
BaseCommand = %(CompilerCXX)s -I. -I./posix -DFLAGS_STR=\"" -lrt"\" \
-DPERFORMANCE_RUN=1 -DITERATIONS=10000 -g \
core_list_join.c core_main.c core_matrix.c \
core_state.c core_util.c posix/core_portme.c \
-funroll-loops -O2 -o coremark \
-fplugin=%(PluginPath)s/rtl_unroll_autotune_plugin_gcc12.so
# auto-tuning
CompileCommand = %(BaseCommand)s \
-fplugin-arg-rtl_unroll_autotune_plugin_gcc12-autotune=%(LLVMInputFile)s
RunDir = %(CompileDir)s
RunCommand = ./coremark 0x0 0x0 0x66 100000 # run 100000 iterations for coremark
# generate
OppCompileCommand = %(BaseCommand)s \
-fplugin-arg-rtl_unroll_autotune_plugin_gcc12-generate=%(OppDir)s
Second, we can prepare an additional parameter search space file search_space.yaml to customize the parameter space. For example, the default search space for the loop unrolling coefficient in the dynamic library is
CodeRegion:
  CodeRegionType: loop
  Pass: loop2_unroll
  Args:
    UnrollCount:
      Value: [0, 1, 2, 4, 8, 16, 32]
      Type: enum
Finally, we place the coremark, coremark_sample.ini, and search_space.yaml files in the same folder and run the following script:
ai4c-autotune autorun coremark_sample.ini \
-scf search_space.yaml --stage-order loop \
--time-after-convergence=100
The time-after-convergence parameter indicates the number of seconds after which the tuning is terminated if no new optimal configuration is found after the historical optimal value is obtained.
After the tuning is complete, the optimal configuration is saved in the loop.yaml file. You can run the compile command in the autotune phase and modify the input file (i.e., -fplugin-arg-rtl_unroll_autotune_plugin_gcc12-autotune=loop.yaml) of the autotune option to reproduce the performance value of the tuning combination.
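The stopping rule behind time-after-convergence can be sketched as a search loop that remembers when the best value last improved. This is an illustration of the rule as described in this guide, not the Autotuner's code; the function name tune and the injectable clock are hypothetical conveniences.

```python
import time


def tune(candidates, measure, time_after_convergence, clock=time.monotonic):
    """Evaluate candidates until none improves the best for the given seconds."""
    best, best_value = None, float("inf")
    last_improvement = clock()
    for candidate in candidates:
        if clock() - last_improvement >= time_after_convergence:
            break  # converged: no new optimum within the allowed window
        value = measure(candidate)  # lower is better (for example, run time)
        if value < best_value:
            best, best_value = candidate, value
            last_improvement = clock()
    return best, best_value
```

A larger window explores more candidates at the cost of tuning time; a smaller window ends the search sooner and may miss a better configuration that would have been tried later.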
You can obtain the historical tuning configuration file (autotune_config.csv) and performance data file (autotune_data.csv) in the following ways:
ai4c-autotune dump -c coremark/input.yaml \
--database=opentuner.db/localhost.localdomain.db -o autotune
Notes:
- By default, the program running time is used as the performance value.
For details, see the Fine-Grained Tuning User Guide and the test case at https://atomgit.com/openeuler/AI4C/tree/master/python/test/autotuner/loop_unroll.
For details about fine-grained tuning of the LLVM compiler, see the tutorial in the Autotuner repository.
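The %(Name)s references in coremark_sample.ini use the same syntax as the basic interpolation of Python's configparser module. Assuming the tuning tool reads the file that way (this guide does not state it), the expansion can be previewed with a short script. The sample values and the plugin name demo_plugin below are hypothetical.

```python
import configparser

SAMPLE = """
[DEFAULT]
PluginPath = /opt/plugins

[Compiling Setting]
CompileDir = /work/coremark
LLVMInputFile = %(CompileDir)s/input.yaml
BaseCommand = gcc -O2 -fplugin=%(PluginPath)s/demo_plugin.so
CompileCommand = %(BaseCommand)s -fplugin-arg-demo_plugin-autotune=%(LLVMInputFile)s
"""


def expand(text, section, key):
    """Return the value of `key` with all %(Name)s references resolved."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Values from [DEFAULT] are visible in every section.
    return parser.get(section, key)


if __name__ == "__main__":
    print(expand(SAMPLE, "Compiling Setting", "CompileCommand"))
```

Previewing the expanded CompileCommand this way is a quick check that PluginPath and the other variables resolve to the intended paths before a long tuning run is started.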
4.3 Function-Level Coarse-Grained Tuning
The current function-level coarse-grained tuning module consists of three parts:
- Tuning configuration file (.ini) of the application: processes the compilation and execution of the application.
- Search space configuration file (YAML): configures the option search space in the Autotuner phase, which can be used to replace the default search space.
- Compilation option set file (YAML): a preset compilation option search space. The default file is located in path/to/your/python<version>/site-packages/ai4c/autotuner/yaml/coarse_options.yaml.
The current function-level coarse-grained tuning is implemented based on Autotuner. It helps each function use different combinations of compilation options for compilation and optimization. The tuning principle is the same as that of fine-grained tuning. Because there are many compilation options that can be tuned for each function, the option space can be pruned in advance.
Before enabling function-level coarse-grained tuning, you need to install the following dependencies:
yum install -y BiSheng-Autotuner bisheng-opentuner
The process of using coarse-grained tuning is similar to that of fine-grained tuning. In the following test case, we will tune the compilation option parameters of each function in test_coarse_tuning.cc. First, prepare the tuning configuration file test_coarse_tuning.ini for test_coarse_tuning.cc as follows:
- Provide the application path and the commands for compiling and running the application.
- Add the coarse-grained tuning dynamic library -fplugin=%(PluginPath)s/coarse_option_tuning_plugin_gcc12.so and the compilation option set file -fplugin-arg-coarse_option_tuning_plugin_gcc12-yaml=<YAML_FILE> to the basic compile command.
  - In the generate and autotune phases, add the corresponding input files of -fplugin-arg-coarse_option_tuning_plugin_gcc12-<stage>.
- Optionally, customize the paths of the configuration file for tuning structures (./opp/*.yaml) and the input file generated by the autotuner (input.yaml).
[DEFAULT] # optional
# TuningYAMLFile = /path/to/coarse_option_tuning_yaml_config_file
[Environment Setting] # optional
[Compiling Setting] # required
CompileDir = ./autotune_datadir
LLVMInputFile = %(CompileDir)s/input.yaml
OppDir = opp
Compiler = g++
BaseCommand = %(Compiler)s ../test_coarse_tuning.cc -O2 -o test_coarse_tuning \
-fplugin=%(PluginPath)s/coarse_option_tuning_plugin_gcc12.so \
-fplugin-arg-coarse_option_tuning_plugin_gcc12-yaml=%(TuningYAMLFile)s
# auto-tuning
CompileCommand = %(BaseCommand)s \
-fplugin-arg-coarse_option_tuning_plugin_gcc12-autotune=input.yaml
RunDir = %(CompileDir)s
RunCommand = ./test_coarse_tuning 3
# generate
OppCompileCommand = %(BaseCommand)s \
-fplugin-arg-coarse_option_tuning_plugin_gcc12-generate=%(OppDir)s
Second, we can prepare an additional parameter search space file search_space.yaml to customize the parameter space. For example, in the following file, we limit the search space to tuning of prefetch-related options.
CodeRegion:
  CodeRegionType: function
  Pass: coarse_option_generate
  Args:
    flag_prefetch_loop_arrays:
      Type: bool
    param_prefetch_latency:
      Min: 100
      Max: 2000
      Type: int
    param_simultaneous_prefetches:
      Min: 1
      Max: 80
      Type: int
Finally, we place test_coarse_tuning.cc, test_coarse_tuning.ini, and search_space.yaml in the same folder and run the following script:
ai4c-autotune autorun test_coarse_tuning.ini \
-scf search_space.yaml \
--stage-order function \
--time-after-convergence=10
The time-after-convergence parameter indicates the number of seconds after which the tuning is terminated if no new optimal configuration is found. That is, the tuning is terminated in advance.
After the tuning is complete, the optimal tuning configuration is saved in the function.yaml file. You can invoke the compile command in the autotune phase again and modify the input file (i.e., -fplugin-arg-coarse_option_tuning_plugin_gcc12-autotune=function.yaml) of the autotune option to reproduce the performance value of the tuning combination.
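When a configuration is edited by hand before it is replayed, it can be useful to confirm that every value still lies inside the search space. The sketch below checks a plain dictionary against the Type, Min, Max, and Value keys used in search_space.yaml. It assumes that Min and Max are inclusive bounds, and it does not parse function.yaml, whose layout this guide does not describe; the names PREFETCH_SPACE and violations are hypothetical.

```python
# Search space in the shape of the prefetch example in search_space.yaml.
PREFETCH_SPACE = {
    "flag_prefetch_loop_arrays": {"Type": "bool"},
    "param_prefetch_latency": {"Type": "int", "Min": 100, "Max": 2000},
    "param_simultaneous_prefetches": {"Type": "int", "Min": 1, "Max": 80},
}


def violations(config, space):
    """List the entries of `config` that fall outside `space`."""
    errors = []
    for name, value in config.items():
        spec = space.get(name)
        if spec is None:
            errors.append(f"{name}: not in search space")
        elif spec["Type"] == "bool":
            if not isinstance(value, bool):
                errors.append(f"{name}: expected bool")
        elif spec["Type"] == "int":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{name}: expected int")
            elif not spec["Min"] <= value <= spec["Max"]:
                errors.append(f"{name}: {value} outside [{spec['Min']}, {spec['Max']}]")
        elif spec["Type"] == "enum":
            if value not in spec["Value"]:
                errors.append(f"{name}: {value} not in {spec['Value']}")
    return errors
```

An empty result means the configuration is consistent with the declared space; each returned string names the offending option and the reason.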
Notes:
- Currently, the program running time is used as the performance value by default.
- The historical data stored in the dump database is not supported in coarse-grained tuning.
- The current coarse-grained tuning can be used with GCC 12.3.1. For other compiler versions, some compilation options may not be supported. You can comment out the compilation options that are not recognized by the compiler in path/to/your/AI4C/aiframe/include/option_utils.h.
For details, see the test case at https://atomgit.com/openeuler/AI4C/tree/master/python/test/autotuner/coarse_tuning.
For details about how to perform coarse-grained tuning on the LLVM compiler, see the Autotuner repository.
4.4 Application-Level Option Tuning
The current application-level option tuning module consists of three parts:
- The compilation and running script (shell) of the application: handles the compilation, execution, and performance data collection of the application, and substitutes each newly generated group of options into the compile command.
- The configuration file (YAML) of the search space for compilation options and dynamic library options: configures the search space for option tuning, including the switch options (compilation optimization/dynamic library), compilation parameters, and enumeration options.
- The configuration file (YAML) of performance values: configures the weights of multiple performance items and the target optimization direction (maximum or minimum value). The configuration must be consistent with the number and sequence of performance values obtained in the performance data collection process.
The application-level option tuning tool continuously collects the performance data of the application, updates the performance model, and generates a new compilation option combination with a high expected benefit. The compilation and running script substitutes the new combination into the compile command, builds a new binary file, and performs the next round of running. This cycle is repeated, and the best performance value observed across all rounds is kept.
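The cycle described above (propose options, build and run, record the result, repeat) can be sketched in a few lines. This sketch deliberately replaces the tool's performance model with plain random search, so it shows only the shape of the loop; the function names and the measure callback are hypothetical.

```python
import random


def propose(bool_flags, intervals, rng):
    """Draw one candidate option string from the bool and interval spaces."""
    parts = [flag for flag in bool_flags if rng.random() < 0.5]
    parts += [f"{name}={rng.randint(lo, hi)}" for name, (lo, hi) in intervals.items()]
    return " ".join(parts)


def tune_options(measure, bool_flags, intervals, rounds, seed=0):
    """Random-search stand-in for the model-guided loop; lower score is better."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        options = propose(bool_flags, intervals, rng)
        # measure() stands for: build with these options, run, collect the score.
        history.append((measure(options), options))
    return min(history), history
```

In the real tool, the proposal step is where the updated performance model steers the search toward combinations with a high expected benefit, instead of sampling uniformly as done here.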
Before enabling application-level tuning, install the following dependencies:
pip install xgboost scikit-learn
yum install -y time
The following example will use different compilation option combinations to build and tune test.cc for three rounds. The compilation and running script of the application is as follows:
# ---------- run_test.sh ---------- #
parent_dir=$1 # path for intermediate tuning files
config=$(cat ${parent_dir}/tuning/config.txt) # current compiler configuration file
performance_file="${parent_dir}/tuning/performance.txt" # current performance data file
measure_raw_file="time.txt"
compiler=g++
compile_command="${compiler} test.cc -O2 -o test_opt_tuner"
eval "${compile_command} ${config}" # program compilation, appending tuning options
run_command="time -p -o ${measure_raw_file} ./test_opt_tuner 3"
eval "${run_command}" # program execution
info_collect_command="grep real ${measure_raw_file} | awk '{printf \"1 1 %s\", \$2}' > ${performance_file}"
eval "${info_collect_command}" # program performance collection
# ---------- run_option_tuner.sh ---------- #
ai4c-option-tune --test_limit 3 --runfile run_test.sh
# --optionfile path/to/your/python<version>/site-packages/ai4c/option_tuner/input/options.yaml \
# --libfile path/to/your/python<version>/site-packages/ai4c/option_tuner/input/options_lib.yaml \
# --measurefile path/to/your/python<version>/site-packages/ai4c/option_tuner/input/config_measure.yaml
The default options and performance value configuration file are stored in the following path: path/to/your/python<version>/site-packages/ai4c/option_tuner/input/*.yaml
You can modify the configuration files of compilation options and dynamic library options as required. The related keywords are as follows:
- required_*: mandatory tuning item, which will be retained in the tuning process.
- bool_*: optional compilation optimization switch.
- interval_*: optional compilation parameter (value option, data range).
- enum_*: optional compilation parameter (enumerated option).
Example:
required_config:
- -O2
bool_config:
- -funroll-loops
interval_config:
- name: --param max-inline-insns-auto
  default: 15
  min: 10
  max: 190
You can modify the performance value configuration file as required. The related keywords are as follows:
- weight: performance value weight
- optim: target optimization direction (maximum or minimum value)
Example:
config_measure:
- name: throughput
  weight: 1
  optim: maximize
After the tuning is complete, the historical and optimal tuning data is saved in ${parent_dir}/tuning/train.csv and ${parent_dir}/tuning/result.txt.
For details, see the test case at https://atomgit.com/openeuler/AI4C/tree/master/python/test/option_tuner.
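One plausible way to read the weight and optim keywords is as a signed weighted sum over the collected performance values. The formula below is an assumption made for illustration, not the tool's documented scoring rule; it also assumes that any optim value other than maximize means the item should be minimized. The function name combined_score is hypothetical.

```python
def combined_score(values, measures):
    """Fold several performance values into one number where larger is better.

    `measures` is a list of dicts with the keys name, weight, and optim,
    in the same order as `values`.
    """
    if len(values) != len(measures):
        raise ValueError("values and config_measure entries must align one to one")
    total = 0.0
    for value, m in zip(values, measures):
        # Assumption: items not marked "maximize" are to be minimized.
        sign = 1.0 if m["optim"] == "maximize" else -1.0
        total += sign * m["weight"] * value
    return total
```

The length check reflects the requirement stated earlier that the configuration must match the number and sequence of performance values written by the run script.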