1 Introduction to aops-ceres
As the client of the A-Ops module, aops-ceres exchanges data with the A-Ops management center through HTTP and provides functions such as collecting host information, managing data collection tools (such as gala-gopher), and responding to and processing commands delivered by the management center.
2 Environment Requirements
You are advised to use openEuler 22.03 LTS or later.
3 Environment Deployment
3.1 Deploying aops-ceres
Use Yum to install aops-ceres.
yum install aops-ceres
Modify the configuration file.
vim /etc/aops/ceres.conf
[ceres]
;Start aops-ceres and bind the IP address and port number.
ip=192.168.1.3
port=12000

[gopher]
;Default path of the gala-gopher configuration file. If you need to change the path, ensure that the file path is correct.
config_path=/opt/gala-gopher/gala-gopher.conf

;aops-ceres log collection configuration
[log]
;Level of the logs to be collected, which can be set to DEBUG, INFO, WARNING, ERROR, or CRITICAL
log_level=INFO
;Location for storing collected logs
log_dir=/var/log/aops
;Maximum size of a log file
max_bytes=31457280
;Number of backup logs
backup_count=40
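Before restarting the service, it can help to sanity-check the edited file. The following is a minimal sketch, assuming the file is plain INI that Python's configparser can read; the helper name check_ceres_conf and the specific checks are illustrative and not part of aops-ceres. The sample text mirrors the example above; on a real host you would read /etc/aops/ceres.conf instead.

```python
import configparser

# Sample text mirroring the ceres.conf example; comments omitted for brevity.
SAMPLE_CONF = """\
[ceres]
ip=192.168.1.3
port=12000

[gopher]
config_path=/opt/gala-gopher/gala-gopher.conf

[log]
log_level=INFO
log_dir=/var/log/aops
max_bytes=31457280
backup_count=40
"""

# Levels listed in the configuration comments.
VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def check_ceres_conf(text):
    """Return a list of problems found in the configuration text (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = []
    port = parser.getint("ceres", "port")
    if not 1 <= port <= 65535:
        problems.append(f"port out of range: {port}")
    level = parser.get("log", "log_level")
    if level not in VALID_LEVELS:
        problems.append(f"unknown log_level: {level}")
    if parser.getint("log", "max_bytes") <= 0:
        problems.append("max_bytes must be positive")
    return problems


print(check_ceres_conf(SAMPLE_CONF))  # an empty list means no problems were found
```

The default max_bytes value of 31457280 corresponds to 30 x 1024 x 1024 bytes.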
3.2 Registering with aops-zeus
To identify users and prevent arbitrary API calls, aops-ceres authenticates users with tokens, which reduces the load on the deployed hosts.
Before registration, obtain the token of the management user, then run the register command on the aops-ceres host to register with aops-zeus. No database is configured for aops-ceres. After the registration is successful, the token is automatically saved to the specified file and the registration result is displayed on the GUI. In addition, the local host information is saved to the aops-zeus database for subsequent management.
Change the values of the data items to the actual values by referring to the /opt/aops/register_example.json template file.
{
    "ssh_user": "root",             // Remote login user name
    "password": "password",         // User password
    "ssh_port": 22,                 // Remote login port
    "zeus_ip": "192.168.1.2",       // IP address of the host where aops-zeus is running
    "zeus_port": 11111,             // aops-zeus port
    "host_name": "host_name",       // Host name to be registered
    "host_group_name": "aops",      // Existing host group name
    "management": false,            // Whether to register as a management host
    "access_token": "token-string"  // Management user token obtained after login
}
Note: Ensure that aops-zeus is running on the target host, for example, 192.168.1.2, and the registered host group exists.
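The template is shown with // comments, which standard JSON parsers reject; this guide does not say whether aops-ceres tolerates them. A cautious approach is to generate a comment-free file. The sketch below assumes all nine fields from the template are required (an assumption, not a documented rule); build_register_info is a hypothetical helper name.

```python
import json

# Field names taken from the register_example.json template above.
REQUIRED_KEYS = {
    "ssh_user", "password", "ssh_port", "zeus_ip", "zeus_port",
    "host_name", "host_group_name", "management", "access_token",
}


def build_register_info(**fields):
    """Return registration data as a JSON string, failing early on missing fields."""
    missing = REQUIRED_KEYS - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return json.dumps(fields, indent=4)


register_json = build_register_info(
    ssh_user="root",
    password="password",
    ssh_port=22,
    zeus_ip="192.168.1.2",
    zeus_port=11111,
    host_name="host_name",
    host_group_name="aops",
    management=False,
    access_token="token-string",
)
print(register_json)  # write this text to register.json and pass it with -f
```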
Run the following command:
aops-ceres register [-f <FILE>] [-d <register_host_info>]
The registration result is displayed on the GUI. If the registration is successful, the token string is saved to the specified file. If the registration fails, locate the fault based on the message and the log (/var/log/aops/aops.log).
The following is an example of the registration result:
Registration successful:
$ aops-ceres register -f register.json
Register Success
Registration failed. In the following example, the failure occurs because aops-zeus is not running:
$ aops-ceres register -f register.json
Register Fail
Log content:
2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
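When several hosts fail to register, scanning the log by hand gets tedious. The following sketch splits a log line of the shape shown above into its parts and flags refused connections. The line layout (timestamp, level, location, message) is inferred from this single example, and diagnose is a hypothetical helper, so treat the patterns as assumptions.

```python
import re

LOG_LINE = (
    "2022-09-05 16:11:52,576 ERROR command_manage/register/331: "
    "HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded "
    "with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection."
    "HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: "
    "[Errno 111] Connection refused'))"
)

# "<date> <time> <LEVEL> <location>: <message>", inferred from the sample line.
LINE_PATTERN = re.compile(r"(\S+ \S+) (\w+) (\S+): (.*)")
TARGET_PATTERN = re.compile(r"host='([^']+)', port=(\d+)")


def diagnose(line):
    """Split a log line into fields and add a hint for refused connections."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    timestamp, level, location, message = match.groups()
    report = {"time": timestamp, "level": level, "location": location}
    target = TARGET_PATTERN.search(message)
    if target and "Connection refused" in message:
        report["hint"] = (
            f"connection refused by {target.group(1)}:{target.group(2)}; "
            "check that aops-zeus is running there"
        )
    return report


print(diagnose(LOG_LINE))
```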
4 Plug-in Support
4.1 gala-gopher
4.1.1 Introduction
gala-gopher is a low-load probe framework based on eBPF. It can be used to monitor the CPU, memory, and network status of hosts and collect data. You can configure the collection status of existing probes based on service requirements.
4.1.2 Deployment
- Run the following command to install gala-gopher:
  yum install gala-gopher
- Enable probes based on service requirements. You can view information about probes in /opt/gala-gopher/gala-gopher.conf.
- Run the following command to start the gala-gopher service:
  systemctl start gala-gopher
4.1.3 More
For more information about gala-gopher, see https://gitee.com/openeuler/gala-gopher/blob/master/README.md.
5 Command Support
5.1 List of Commands
aops-ceres COMMAND [options]
List of Main Commands:
plugin manage plugin
collect collect some information
apollo cve/bugfix related action
General plugin options:
--start <args>
--stop <args>
--change-collect-items <args>
--info <args>
General info options:
--file <args>
--application <args>
--host <args>
General apollo options:
--set-repo <args>
--scan <args>
--fix <args>
5.1.1 aops-ceres plugin --start <args>
Description: Starts a plug-in that is installed but not running. Currently, only the gala-gopher plug-in is supported.
args: gala-gopher
5.1.2 aops-ceres plugin --stop <args>
Description: Stops a running plug-in. Currently, only the gala-gopher plug-in is supported.
args: gala-gopher
5.1.3 aops-ceres plugin --change-collect-items <args>
Description: Changes the collection status of the plug-in collection items. Currently, only the status of the gala-gopher collection items can be changed. For the gala-gopher collection items, see /opt/gala-gopher/gala-gopher.conf.
Parameter example:
{
    "gala-gopher": {
        "redis": "auto",
        "system_inode": "on",
        "tcp": "on",
        "haproxy": "auto"
    }
}
Execution result example:
{
    "resp": {
        "gala-gopher": {
            "failure": [
                "redis"
            ],
            "success": [
                "system_inode",
                "tcp",
                "haproxy"
            ]
        }
    }
}
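Because the argument is a JSON string passed on the command line, quoting is easy to get wrong. The sketch below builds a shell-safe command line from a Python dict and picks the failed items out of a response shaped like the example above. Both helper names are hypothetical, and the code only prints the command rather than running it.

```python
import json
import shlex


def change_items_command(items):
    """Build a quoted command line for changing gala-gopher collection items."""
    arg = json.dumps({"gala-gopher": items})
    argv = ["aops-ceres", "plugin", "--change-collect-items", arg]
    return " ".join(shlex.quote(part) for part in argv)


def failed_items(response_text):
    """Map each plug-in name to the collection items that could not be changed."""
    resp = json.loads(response_text)["resp"]
    return {name: result["failure"] for name, result in resp.items() if result["failure"]}


RESPONSE = (
    '{"resp": {"gala-gopher": {"failure": ["redis"], '
    '"success": ["system_inode", "tcp", "haproxy"]}}}'
)

print(change_items_command({"redis": "auto", "tcp": "on"}))
print(failed_items(RESPONSE))
```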
5.1.4 aops-ceres plugin --info
Description: Obtains the running status of the plug-ins on the host, including the status of each collection item and the resource usage of the plug-in. Currently, only the gala-gopher plug-in is supported.
Execution result example:
[
    {
        "collect_items": [
            {
                "probe_name": "system_tcp",
                "probe_status": "off",
                "support_auto": false
            },
            {
                "probe_name": "haproxy",
                "probe_status": "auto",
                "support_auto": true
            },
            {
                "probe_name": "nginx",
                "probe_status": "auto",
                "support_auto": true
            }
        ],
        "is_installed": true,
        "plugin_name": "gala-gopher",
        "resource": [
            {
                "current_value": "0.0%",
                "limit_value": null,
                "name": "cpu"
            },
            {
                "current_value": "13 MB",
                "limit_value": null,
                "name": "memory"
            }
        ],
        "status": "active"
    }
]
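A result like this is easier to read when the probes are grouped by status. The following sketch assumes the output is valid JSON with the field names shown in the example; probes_by_status is a hypothetical helper, and the sample is a trimmed copy of the result above.

```python
import json

# Trimmed copy of the example result (resource usage omitted).
INFO_RESULT = """
[
    {
        "collect_items": [
            {"probe_name": "system_tcp", "probe_status": "off", "support_auto": false},
            {"probe_name": "haproxy", "probe_status": "auto", "support_auto": true},
            {"probe_name": "nginx", "probe_status": "auto", "support_auto": true}
        ],
        "is_installed": true,
        "plugin_name": "gala-gopher",
        "status": "active"
    }
]
"""


def probes_by_status(info_text):
    """Group probe names by probe_status for every plug-in in the result."""
    summary = {}
    for plugin in json.loads(info_text):
        grouped = {}
        for item in plugin["collect_items"]:
            grouped.setdefault(item["probe_status"], []).append(item["probe_name"])
        summary[plugin["plugin_name"]] = grouped
    return summary


print(probes_by_status(INFO_RESULT))
```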
5.1.5 aops-ceres collect --host <args>
Description: Collects basic information about the host, such as memory, OS, CPU, and disk information.
Parameter example:
["mem", "os", "cpu", "disk"]
Note: mem, os, cpu, and disk are optional. If the input parameter is an empty list, all information is obtained.
Execution result example:
{
    "cpu": {
        "architecture": "aarch64",
        "core_count": "128",
        "l1d_cache": "8 MiB (128 instances)",
        "l1i_cache": "8 MiB (128 instances)",
        "l2_cache": "64 MiB (128 instances)",
        "l3_cache": "128 MiB (4 instances)",
        "model_name": "Kunpeng-920",
        "vendor_id": "HiSilicon"
    },
    "memory": {
        "info": [
            {
                "manufacturer": "Hynix",
                "size": "16 GB",
                "speed": "2933 MT/s",
                "type": "DDR4"
            },
            {
                "manufacturer": "Hynix",
                "size": "16 GB",
                "speed": "2933 MT/s",
                "type": "DDR4"
            }
        ],
        "size": "32G",
        "total": 2
    },
    "os": {
        "bios_version": "1.82",
        "kernel": "5.10.0-60.18.0.50",
        "os_version": "openEuler 22.03 LTS"
    },
    "disk": [
        {
            "capacity": "xxGB",
            "model": "xxxxxx"
        }
    ]
}
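Note that the request uses the short name mem while the result uses the key memory. A small guard against typos in the request list could look like the sketch below; host_info_arg is a hypothetical helper, and the allowed names are the four listed in the note above.

```python
import json

# Item names accepted by collect --host according to the note above.
ALLOWED_ITEMS = ("mem", "os", "cpu", "disk")


def host_info_arg(items=()):
    """Return the JSON argument for collect --host; an empty list requests everything."""
    unknown = [item for item in items if item not in ALLOWED_ITEMS]
    if unknown:
        raise ValueError(f"unsupported items: {unknown}")
    return json.dumps(list(items))


print(host_info_arg(["mem", "cpu"]))
print(host_info_arg())
```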
5.1.6 aops-ceres collect --application
Description: Collects running applications in the target application collection. Currently, the target application collection contains MySQL, Kubernetes, Hadoop, Nginx, Docker, and gala-gopher.
Execution result example:
["gala-gopher", "mysql"]
5.1.7 aops-ceres collect --file <args>
Description: Collects information such as the content, permission, and owner of the target configuration file. Currently, only UTF-8 encoded text files smaller than 1 MB and without execute permission can be read.
args: List of the full paths of the files to be collected
Example:
[ "/home/test.conf", "/home/test.ini", "/home/test.json"]
Execution result example:
{
    "infos": [
        {
            "content": "this is a test file",  // Text file content
            "file_attr": {                     // File attributes
                "group": "root",               // File owner group
                "mode": "0644",                // Permissions on the file
                "owner": "root"                // File owner
            },
            "path": "/home/test.txt"           // Path of the file
        }
    ],
    "success_files": [
        "/home/test.txt"
    ],
    "fail_files": [
        "/home/test.txt"
    ]
}
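The restrictions in the description (UTF-8 text, smaller than 1 MB, no execute permission) can be checked locally before a file is requested. The sketch below is one reading of those rules, not the check that aops-ceres itself performs: it treats 1 MB as 1024 x 1024 bytes and rejects a file if any execute bit is set. Both choices are assumptions, and is_collectable is a hypothetical helper.

```python
import os
import stat

MAX_SIZE = 1024 * 1024  # assumption: "1 MB" means 1024 * 1024 bytes


def is_collectable(path):
    """Return True if the file looks readable under the stated restrictions."""
    try:
        info = os.stat(path)
    except OSError:
        return False
    if not stat.S_ISREG(info.st_mode):
        return False
    if info.st_size >= MAX_SIZE:
        return False
    # Assumption: an execute bit for user, group, or others disqualifies the file.
    if info.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return False
    try:
        with open(path, encoding="utf-8") as handle:
            handle.read()
    except (OSError, UnicodeDecodeError):
        return False
    return True


for candidate in ["/home/test.conf", "/home/test.ini", "/home/test.json"]:
    print(candidate, is_collectable(candidate))
```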
5.1.8 aops-ceres apollo --set-repo <args>
Description: Sets the repo source.
Request parameter example
{
    "repo_info": {
        "name": "update",
        "dest": "/etc/yum.repos.d/aops-update.repo",
        "repo_content": "repo content"
    },
    "check_items": [],
    "check": false
}
Execution result example
{ "code": "Succeed", "msg": "operate success" }
5.1.9 aops-ceres apollo --scan <args>
Description: Scans CVEs.
Request parameter example
{
    "check": false,
    "check_items": [],
    "basic": true  // Currently, only true is supported.
}
Execution result example
{
    "code": "Succeed",
    "msg": "operate success",
    "result": {
        "cves": [
            {
                "cve_id": "CVE-2022-4904",
                "hotpatch": false
            },
            {
                "cve_id": "CVE-2022-25308",
                "hotpatch": false
            }
        ],
        "installed_packages": [
            {
                "name": "zip",
                "version": "3.0-30"
            },
            {
                "name": "python-sqlalchemy",
                "version": "1.3.24-1"
            }
        ],
        "os_version": "openEuler 22.03 LTS"
    }
}
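To act on a scan, you usually need just the CVE IDs and the installed package versions. The sketch below extracts both from a result shaped like the example above. It assumes that a code other than Succeed means the scan failed, which this guide does not state explicitly; summarize_scan is a hypothetical helper.

```python
import json

SCAN_RESULT = """
{
    "code": "Succeed",
    "msg": "operate success",
    "result": {
        "cves": [
            {"cve_id": "CVE-2022-4904", "hotpatch": false},
            {"cve_id": "CVE-2022-25308", "hotpatch": false}
        ],
        "installed_packages": [
            {"name": "zip", "version": "3.0-30"},
            {"name": "python-sqlalchemy", "version": "1.3.24-1"}
        ],
        "os_version": "openEuler 22.03 LTS"
    }
}
"""


def summarize_scan(text):
    """Return (list of CVE IDs, {package name: version}) from a scan result."""
    body = json.loads(text)
    if body["code"] != "Succeed":  # assumption: any other code signals failure
        raise RuntimeError(body["msg"])
    result = body["result"]
    cve_ids = [cve["cve_id"] for cve in result["cves"]]
    packages = {pkg["name"]: pkg["version"] for pkg in result["installed_packages"]}
    return cve_ids, packages


cve_ids, packages = summarize_scan(SCAN_RESULT)
print(cve_ids)
print(packages)
```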
5.1.10 aops-ceres apollo --fix <args>
Description: Repairs CVEs.
Request parameter example
{
    "check_items": [],
    "check": false,
    "cves": [
        {
            "cve_id": "CVE-2021-3781",
            "hotpatch": true
        },
        {
            "cve_id": "CVE-2021-3782",
            "hotpatch": true
        }
    ]
}
Execution result example
{
    "code": "Succeed",
    "msg": "operate success",
    "result": [
        {
            "cve_id": "CVE-2021-3782",
            "log": "fix succeed",
            "result": "succeed"
        },
        {
            "cve_id": "CVE-2021-3781",
            "log": "fix succeed",
            "result": "succeed"
        }
    ]
}
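The request and result above can be produced and checked programmatically. The sketch below builds a fix request for a list of CVE IDs and reports which CVEs were not fixed. The guide only shows the value succeed for a per-CVE result, so treating every other value as a failure is an assumption; both helper names are hypothetical.

```python
import json


def build_fix_request(cve_ids, hotpatch=False):
    """Return the JSON argument for apollo --fix for the given CVE IDs."""
    return json.dumps({
        "check_items": [],
        "check": False,
        "cves": [{"cve_id": cve_id, "hotpatch": hotpatch} for cve_id in cve_ids],
    })


def unfixed_cves(result_text):
    """Return the CVE IDs whose per-CVE result is anything other than 'succeed'."""
    body = json.loads(result_text)
    return [entry["cve_id"] for entry in body["result"] if entry["result"] != "succeed"]


FIX_RESULT = """
{
    "code": "Succeed",
    "msg": "operate success",
    "result": [
        {"cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed"},
        {"cve_id": "CVE-2021-3781", "log": "fix succeed", "result": "succeed"}
    ]
}
"""

print(build_fix_request(["CVE-2021-3781", "CVE-2021-3782"], hotpatch=True))
print(unfixed_cves(FIX_RESULT))
```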
5.1.11 aops-ceres register [-f <file>] [-d <register_host_info>]
Description: Registers with aops-zeus
Request parameter example
{
    "ssh_user": "root",             // Remote login user name
    "password": "password",         // User password
    "ssh_port": 22,                 // Remote login port
    "zeus_ip": "127.0.0.1",         // IP address of the host where aops-zeus is running
    "zeus_port": 11111,             // aops-zeus port
    "host_name": "host_name",       // Host name to be registered
    "host_group_name": "aops",      // Existing host group name
    "management": false,            // Whether to register as a management host
    "access_token": "token-string"  // Management user token obtained after login
}
FAQs
If an error is reported, view the /var/log/aops/aops.log file, rectify the fault based on the error message in the log file, and restart the service.
You are advised to run aops-ceres in Python 3.7 or later. Pay attention to the version of the Python dependency library when installing it.
The value of access_token can be obtained from the /etc/aops/agent.conf file after the registration is complete.
The args of all the commands are JSON strings.
To limit the CPU and memory resources of a plug-in, add MemoryHigh and CPUQuota to the Service section in the service file corresponding to the plug-in.
For example, set MemoryHigh of gala-gopher to 40M and CPUQuota to 20%.
[Unit]
Description=a-ops gala gopher service
After=network.target

[Service]
Type=exec
ExecStart=/usr/bin/gala-gopher
Restart=on-failure
RestartSec=1
RemainAfterExit=yes
;Limit the maximum memory that can be used by processes in the unit. The limit can be exceeded. However, after the limit is exceeded, the process running speed is limited, and the system reclaims the excess memory as much as possible.
;The option value can be an absolute memory size in bytes (K, M, G, or T suffix based on 1024) or a relative memory size in percentage.
MemoryHigh=40M
;Set the CPU time limit for the processes of this unit. The value must be a percentage ending with %, indicating the maximum percentage of the total time that the unit can use a single CPU.
CPUQuota=20%

[Install]
WantedBy=multi-user.target
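To make the two limits concrete, the sketch below converts them into plain numbers using the rules quoted in the unit file comments (size suffixes based on 1024, CPU quota as a percentage of one CPU). It handles only absolute memory sizes, and the helper names are hypothetical.

```python
# Suffix multipliers based on 1024, as described in the unit file comments.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def memory_high_bytes(value):
    """Convert an absolute MemoryHigh value such as '40M' into bytes."""
    value = value.strip()
    if value.endswith("%"):
        raise ValueError("relative sizes depend on host memory and are not handled here")
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def cpu_quota_fraction(value):
    """Convert a CPUQuota value such as '20%' into a fraction of one CPU."""
    value = value.strip()
    if not value.endswith("%"):
        raise ValueError("CPUQuota must end with %")
    return float(value[:-1]) / 100


print(memory_high_bytes("40M"))   # 40 * 1024 * 1024 = 41943040 bytes
print(cpu_quota_fraction("20%"))  # 0.2 of a single CPU
```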