1 Introduction to aops-ceres
As the client of the A-Ops module, aops-ceres exchanges data with the A-Ops management center through HTTP and provides functions such as collecting host information, managing data collection tools (such as gala-gopher), and responding to and processing commands delivered by the management center.
2 Environment Requirements
You are advised to use openEuler 22.03 LTS or later and 4 GB or more memory.
3 Environment Deployment
3.1 Disabling the firewall
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
3.2 Deploying aops-ceres
Use Yum to install aops-ceres.
yum install aops-ceres
Modify the configuration file. Change the value of the ip in the ceres section to the IP address of the local host.
vim /etc/aops/ceres.conf
The IP address 192.168.1.3 is used as an example.
[ceres]
;Start aops-ceres and bind the IP address and port number.
ip=192.168.1.3
port=12000
[gopher]
;Default path of the gala-gopher configuration file. If you need to change the path, ensure that the file path is correct.
config_path=/opt/gala-gopher/gala-gopher.conf
;aops-ceres log collection configuration
[log]
;Level of the logs to be collected, which can be set to DEBUG, INFO, WARNING, ERROR, or CRITICAL
log_level=INFO
;Location for storing collected logs
log_dir=/var/log/aops
;Maximum size of a log file
max_bytes=31457280
;Number of backup logs
backup_count=40
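Before starting the service, it can help to confirm that the edited file still parses and holds sensible values. The following is a minimal sketch using Python's standard configparser module. The helper name load_ceres_conf and the inline sample text are illustrative assumptions and are not part of aops-ceres.

```python
import configparser

# Sample text mirroring the /etc/aops/ceres.conf example (comments trimmed).
SAMPLE_CONF = """
[ceres]
;Start aops-ceres and bind the IP address and port number.
ip=192.168.1.3
port=12000

[gopher]
config_path=/opt/gala-gopher/gala-gopher.conf

[log]
log_level=INFO
log_dir=/var/log/aops
max_bytes=31457280
backup_count=40
"""

# Log levels listed in the configuration comments.
VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def load_ceres_conf(text):
    """Parse ceres.conf text and return selected settings as typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)  # lines starting with ';' are treated as comments
    level = parser.get("log", "log_level")
    if level not in VALID_LEVELS:
        raise ValueError(f"unsupported log_level: {level}")
    return {
        "ip": parser.get("ceres", "ip"),
        "port": parser.getint("ceres", "port"),
        "gopher_conf": parser.get("gopher", "config_path"),
        "log_level": level,
        "max_bytes": parser.getint("log", "max_bytes"),
        "backup_count": parser.getint("log", "backup_count"),
    }
```

To check the real file, read /etc/aops/ceres.conf into a string and pass it to load_ceres_conf. The default max_bytes value of 31457280 corresponds to 30 MiB per log file.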
Run the following command to start the service:
systemctl start aops-ceres
3.3 Registering with aops-zeus
To identify users and prevent arbitrary API calls, aops-ceres authenticates users with tokens, which also reduces the load on the deployed hosts.
For security purposes, the active registration mode is used to obtain the token. Before the registration, prepare the information to be registered on aops-ceres and run the register command to register the information with aops-zeus. No database is configured for aops-ceres, so after the registration is successful, the token is automatically saved to the specified file and the registration result is displayed on the GUI. In addition, the local host information is saved to the aops-zeus database for subsequent management.
- Change the values of the data items to the actual values by referring to the /opt/aops/register_example.json template file.
{
// GUI login user name
"username":"admin",
// User password
"password": "changeme",
// Name of the host to be registered
"host_name": "host1",
// Name of the group to which the host belongs. Ensure that the host group has been added.
"host_group_name": "group1",
// IP address of the host where aops-zeus is running
"zeus_ip":"192.168.1.2",
// Whether to register as a management host
"management":false,
// Port of the aops-zeus service
"zeus_port":"11111",
// Port of aops-ceres
"ceres_port":"12000"
}
Note: Ensure that aops-zeus is running on the target host, for example, 192.168.1.2, and the registered host group exists.
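A quick sanity check of the registration data can catch typos before the register command is run. The sketch below validates a dictionary that carries the fields of the template above. The function name check_register_info and its rules (valid IP address, port range, Boolean management flag) are assumptions made for illustration, not checks taken from aops-ceres.

```python
import ipaddress

# Field names taken from the register_example.json template.
REQUIRED_FIELDS = (
    "username", "password", "host_name", "host_group_name",
    "zeus_ip", "management", "zeus_port", "ceres_port",
)


def check_register_info(info):
    """Return a list of problems found in a registration dict (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in info]
    if "zeus_ip" in info:
        try:
            ipaddress.ip_address(info["zeus_ip"])
        except ValueError:
            problems.append(f"invalid zeus_ip: {info['zeus_ip']}")
    for key in ("zeus_port", "ceres_port"):
        if key in info:
            value = str(info[key])
            # Ports are strings in the template, so accept decimal text only.
            if not (value.isdecimal() and 0 < int(value) < 65536):
                problems.append(f"invalid {key}: {info[key]}")
    if "management" in info and not isinstance(info["management"], bool):
        problems.append("management must be true or false")
    return problems


EXAMPLE = {
    "username": "admin", "password": "changeme",
    "host_name": "host1", "host_group_name": "group1",
    "zeus_ip": "192.168.1.2", "management": False,
    "zeus_port": "11111", "ceres_port": "12000",
}
```

An empty list means that no problem was detected. The check does not verify that aops-zeus is reachable or that the host group exists.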
- Run the following command:
aops-ceres register -f register_example.json
- The registration result is displayed on the GUI. If the registration is successful, the token character string is saved to a specified file. If the registration fails, locate the fault based on the message and log (/var/log/aops/aops.log).
The following is an example of the registration result:
Registration successful:
[root@localhost ~]# aops-ceres register -f register.json
Register Success
Registration failed. The following shows an aops-zeus start failure.
[root@localhost ~]# aops-ceres register -f register.json
Register Fail
[root@localhost ~]#
Log content:
2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
[root@localhost ~]#
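When a registration fails, the relevant entry has to be found in /var/log/aops/aops.log. The following sketch splits a log line of the format shown above into its fields and returns a hint for the connection failure in this example. The regular expression is inferred from this single sample line, and the diagnose helper is hypothetical.

```python
import re

# Layout inferred from the sample: "<date> <time>,<ms> <LEVEL> <source>: <message>"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<source>\S+): (?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def diagnose(entry):
    """Return a hint for a refused connection, or None for anything else."""
    if entry and entry["level"] in ("ERROR", "CRITICAL") \
            and "Connection refused" in entry["message"]:
        return ("aops-zeus is not reachable: check that it is running "
                "and that zeus_ip and zeus_port are correct")
    return None


SAMPLE = (
    "2022-09-05 16:11:52,576 ERROR command_manage/register/331: "
    "HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded "
    "with url: /manage/host/add (Caused by NewConnectionError("
    "'<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: "
    "Failed to establish a new connection: [Errno 111] Connection refused'))"
)
```

Calling diagnose(parse_log_line(SAMPLE)) returns the hint, because the message reports a refused connection to the aops-zeus address.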
4 Plug-in Support
4.1 gala-gopher
4.1.1 Introduction
gala-gopher is a low-load probe framework based on eBPF. It can be used to monitor the CPU, memory, and network status of hosts and collect data. You can configure the collection status of existing probes based on service requirements.
4.1.2 Deployment
- Run the following command to install gala-gopher:
yum install gala-gopher
- Enable probes based on service requirements. You can view information about probes in /opt/gala-gopher/gala-gopher.conf.
- Run the following command to start the gala-gopher service:
systemctl start gala-gopher
4.1.3 More
For more information about gala-gopher, see https://gitee.com/openeuler/gala-gopher/blob/master/README.md.
5 API Support
5.1 List of External APIs
No. | API | Type | Description |
---|---|---|---|
1 | /v1/ceres/plugin/start | POST | Starts a plug-in. |
2 | /v1/ceres/plugin/stop | POST | Stops a plug-in. |
3 | /v1/ceres/application/info | GET | Collects running applications in the target application collection. |
4 | /v1/ceres/host/info | POST | Obtains host information. |
5 | /v1/ceres/plugin/info | GET | Obtains the plug-in running information in aops-ceres. |
6 | /v1/ceres/file/collect | POST | Collects content of the configuration file. |
7 | /v1/ceres/collect/items/change | POST | Changes the running status of plug-in collection items. |
8 | /v1/ceres/cve/repo/set | POST | Sets the repo source. |
9 | /v1/ceres/cve/scan | POST | Scans the CVEs on the local host. |
10 | /v1/ceres/cve/fix | POST | Repairs the CVEs. |
5.1.1 /v1/ceres/plugin/start
Description: Starts the plug-in that is installed but not running. Currently, only the gala-gopher plug-in is supported.
HTTP request mode: POST
Data submission mode: query
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
plugin_name | True | str | Plug-in name |
Request parameter example
Parameter | Value |
---|---|
plugin_name | gala-gopher |
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
Response example
{ "code": 200, "msg": "xxxx" }
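As an illustration of the query submission mode, the sketch below builds a POST request for the plug-in start and stop APIs with Python's urllib, without sending it. The host and port come from the ceres.conf example. The function name is hypothetical, and passing the token in an access_token header is an assumption that this document does not confirm.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_plugin_request(host, port, action, plugin_name, token):
    """Build (but do not send) a POST request for /v1/ceres/plugin/<action>."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    # "query" submission mode: parameters go into the URL query string.
    query = urlencode({"plugin_name": plugin_name})
    url = f"http://{host}:{port}/v1/ceres/plugin/{action}?{query}"
    # Assumption: the registration token is carried in an "access_token" header.
    return Request(url, method="POST", headers={"access_token": token})


request = build_plugin_request("192.168.1.3", 12000, "start", "gala-gopher", "<token>")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. The response body is JSON with the code and msg fields described above.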
5.1.2 /v1/ceres/plugin/stop
Description: Stops a running plug-in. Currently, only the gala-gopher plug-in is supported.
HTTP request mode: POST
Data submission mode: query
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
plugin_name | True | str | Plug-in name |
Request parameter example
Parameter | Value |
---|---|
plugin_name | gala-gopher |
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
Response example
{ "code": 200, "msg": "xxxx" }
5.1.3 /v1/ceres/application/info
Description: Collects running applications in the target application collection. Currently, the target application collection contains MySQL, Kubernetes, Hadoop, Nginx, Docker, and gala-gopher.
HTTP request mode: GET
Data submission mode: query
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
Request parameter example
Parameter | Value |
---|---|
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
resp | dict | Response body |
- resp
Parameter | Type | Description |
---|---|---|
running | List[str] | List of the running applications |
Response example
{ "code": 200, "msg": "xxxx", "resp": { "running": [ "mysql", "docker" ] } }
5.1.4 /v1/ceres/host/info
Description: Obtains information about the host where aops-ceres is installed, including the system version, BIOS version, kernel version, CPU information, and memory information.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
info_type | True | List[str] | List of the information to be collected. Currently, only cpu, disk, memory, and os are supported. |
Request parameter example
["os", "cpu","memory", "disk"]
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
resp | dict | Response body |
resp
Parameter | Type | Description |
---|---|---|
cpu | dict | CPU information |
memory | dict | Memory information |
os | dict | OS information |
disk | List[dict] | Disk information |
cpu
Parameter | Type | Description |
---|---|---|
architecture | str | CPU architecture |
core_count | int | Number of cores |
l1d_cache | str | L1 data cache size |
l1i_cache | str | L1 instruction cache size |
l2_cache | str | L2 cache size |
l3_cache | str | L3 cache size |
model_name | str | Model name |
vendor_id | str | Vendor ID |
memory
Parameter | Type | Description |
---|---|---|
size | str | Total memory |
total | int | Number of DIMMs |
info | List[dict] | Information about all DIMMs |
- info
Parameter | Type | Description |
---|---|---|
size | str | Memory size |
type | str | Type |
speed | str | Speed |
manufacturer | str | Manufacturer |
os
Parameter | Type | Description |
---|---|---|
bios_version | str | BIOS version |
os_version | str | OS name |
kernel | str | Kernel version |
Response example
{
  "code": 200,
  "msg": "operate success",
  "resp": {
    "cpu": {
      "architecture": "aarch64",
      "core_count": "128",
      "l1d_cache": "8 MiB (128 instances)",
      "l1i_cache": "8 MiB (128 instances)",
      "l2_cache": "64 MiB (128 instances)",
      "l3_cache": "128 MiB (4 instances)",
      "model_name": "Kunpeng-920",
      "vendor_id": "HiSilicon"
    },
    "memory": {
      "info": [
        { "manufacturer": "Hynix", "size": "16 GB", "speed": "2933 MT/s", "type": "DDR4" },
        { "manufacturer": "Hynix", "size": "16 GB", "speed": "2933 MT/s", "type": "DDR4" }
      ],
      "size": "32G",
      "total": 2
    },
    "os": {
      "bios_version": "1.82",
      "kernel": "5.10.0-60.18.0.50",
      "os_version": "openEuler 22.03 LTS"
    },
    "disk": [
      { "capacity": "xxGB", "model": "xxxxxx" }
    ]
  }
}
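A client typically reduces this response to a few fields. The sketch below turns a parsed /v1/ceres/host/info response into a one-line summary. It assumes that os, cpu, and memory were all requested and are therefore present in resp. The function name and the abbreviated sample, which is cut down from the response example, are illustrative.

```python
def summarize_host_info(body):
    """Return a one-line summary built from a host/info response dict."""
    if body.get("code") != 200:
        raise RuntimeError(f"request failed: {body.get('msg')}")
    resp = body["resp"]
    cpu, mem, os_info = resp["cpu"], resp["memory"], resp["os"]
    return (
        f"{os_info['os_version']} (kernel {os_info['kernel']}), "
        f"{cpu['model_name']} x{cpu['core_count']} {cpu['architecture']}, "
        f"{mem['size']} RAM in {mem['total']} DIMMs"
    )


# Abbreviated copy of the response example (cache sizes and disks omitted).
SAMPLE_RESPONSE = {
    "code": 200,
    "msg": "operate success",
    "resp": {
        "cpu": {"architecture": "aarch64", "core_count": "128",
                "model_name": "Kunpeng-920", "vendor_id": "HiSilicon"},
        "memory": {"size": "32G", "total": 2, "info": []},
        "os": {"bios_version": "1.82", "kernel": "5.10.0-60.18.0.50",
               "os_version": "openEuler 22.03 LTS"},
    },
}

print(summarize_host_info(SAMPLE_RESPONSE))
```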
5.1.5 /v1/ceres/plugin/info
Description: Obtains the plug-in running status of the host. Currently, only the gala-gopher plug-in is supported.
HTTP request mode: GET
Data submission mode: query
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
Request parameter example
Parameter | Value |
---|---|
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
resp | List[dict] | Response body |
resp
Parameter | Type | Description |
---|---|---|
plugin_name | str | Plug-in name |
collect_items | list | Running status of plug-in collection items |
is_installed | bool | Whether the plug-in is installed |
resource | List[dict] | Plug-in resource usage |
status | str | Plug-in running status |
- resource
Parameter | Type | Description |
---|---|---|
name | str | Resource name |
current_value | str | Resource usage |
limit_value | str | Resource limit |
Response example
{
  "code": 200,
  "msg": "operate success",
  "resp": [
    {
      "collect_items": [
        { "probe_name": "system_tcp", "probe_status": "off", "support_auto": false },
        { "probe_name": "haproxy", "probe_status": "auto", "support_auto": true },
        { "probe_name": "nginx", "probe_status": "auto", "support_auto": true }
      ],
      "is_installed": true,
      "plugin_name": "gala-gopher",
      "resource": [
        { "current_value": "0.0%", "limit_value": null, "name": "cpu" },
        { "current_value": "13 MB", "limit_value": null, "name": "memory" }
      ],
      "status": "active"
    }
  ]
}
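To see at a glance which probes are on, off, or automatic, the collect_items list can be grouped by status. The following sketch does this for a parsed /v1/ceres/plugin/info response. The function name is hypothetical, and the sample is a shortened form of the response example.

```python
def probes_by_status(body, plugin_name="gala-gopher"):
    """Map each probe_status to the sorted probe names of one plug-in."""
    for plugin in body.get("resp", []):
        if plugin.get("plugin_name") == plugin_name:
            grouped = {}
            for item in plugin.get("collect_items", []):
                grouped.setdefault(item["probe_status"], []).append(item["probe_name"])
            return {status: sorted(names) for status, names in grouped.items()}
    raise KeyError(f"plug-in not found in response: {plugin_name}")


# Shortened copy of the response example (resource usage omitted).
SAMPLE_RESPONSE = {
    "code": 200,
    "msg": "operate success",
    "resp": [{
        "plugin_name": "gala-gopher",
        "is_installed": True,
        "status": "active",
        "collect_items": [
            {"probe_name": "system_tcp", "probe_status": "off", "support_auto": False},
            {"probe_name": "haproxy", "probe_status": "auto", "support_auto": True},
            {"probe_name": "nginx", "probe_status": "auto", "support_auto": True},
        ],
    }],
}

print(probes_by_status(SAMPLE_RESPONSE))
```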
5.1.6 /v1/ceres/file/collect
Description: Collects information such as the content, permission, and owner of the target configuration file. Currently, only UTF-8 encoded text files that are smaller than 1 MB and do not have execute permission can be read.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
configfile_path | True | List[str] | List of the full paths of the files to be collected |
Request parameter example
[ "/home/test.conf", "/home/test.ini", "/home/test.json"]
Response body parameters
Parameter | Type | Description |
---|---|---|
infos | List[dict] | File collection information |
success_files | List[str] | List of files successfully collected |
fail_files | List[str] | List of files that fail to be collected |
infos
Parameter | Type | Description |
---|---|---|
path | str | File path |
content | str | File content |
file_attr | dict | File attributes |
- file_attr
Parameter | Type | Description |
---|---|---|
mode | str | Permission of the file type |
owner | str | File owner |
group | str | Group to which the file belongs |
Response example
{ "infos": [ { "content": "this is a test file", "file_attr": { "group": "root", "mode": "0644", "owner": "root" }, "path": "/home/test.txt" } ], "success_files": [ "/home/test.txt" ], "fail_files": [ "/home/test.txt" ] }
5.1.7 /v1/ceres/collect/items/change
Description: Changes the collection status of the plug-in collection items. Currently, only the status of the gala-gopher collection items can be changed. For the gala-gopher collection items, see /opt/gala-gopher/gala-gopher.conf.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
plugin_name | True | dict | Expected modification result of the plug-in collection items |
- plugin_name
Parameter | Mandatory | Type | Description |
---|---|---|---|
collect_item | True | string | Expected modification result of the collection item |
Request parameter example
{ "gala-gopher":{ "redis":"auto", "system_inode":"on", "tcp":"on", "haproxy":"auto" } }
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
resp | List[dict] | Response body |
resp
Parameter | Type | Description |
---|---|---|
plugin_name | dict | Modification result of the corresponding collection item |
- plugin_name
Parameter | Type | Description |
---|---|---|
success | List[str] | Collection items that are successfully modified |
failure | List[str] | Collection items that fail to be modified |
Response example
{ "code": 200, "msg": "operate success", "resp": { "gala-gopher": { "failure": [ "redis" ], "success": [ "system_inode", "tcp", "haproxy" ] } } }
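The request and response of this API mirror each other: the request maps collection items to target states, and the response sorts the same items into success and failure lists. The sketch below builds the request body and extracts the failed items. The helper names are hypothetical. The set of valid states (on, off, auto) is an assumption based only on the values that appear in this document.

```python
import json

# Assumption: only the states that appear in this document are accepted.
VALID_STATES = {"on", "off", "auto"}


def build_change_payload(plugin_name, changes):
    """Return the JSON request body for /v1/ceres/collect/items/change."""
    bad = {item: state for item, state in changes.items() if state not in VALID_STATES}
    if bad:
        raise ValueError(f"unsupported state(s): {bad}")
    return json.dumps({plugin_name: dict(changes)})


def failed_items(body, plugin_name):
    """Return the collection items that the response reports as not modified."""
    if body.get("code") != 200:
        raise RuntimeError(f"request failed: {body.get('msg')}")
    return list(body["resp"][plugin_name]["failure"])


payload = build_change_payload("gala-gopher", {"redis": "auto", "tcp": "on"})
# Response example from this section.
response = {
    "code": 200,
    "msg": "operate success",
    "resp": {"gala-gopher": {"failure": ["redis"],
                             "success": ["system_inode", "tcp", "haproxy"]}},
}
print(payload, failed_items(response, "gala-gopher"))
```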
5.1.8 /v1/ceres/cve/repo/set
Description: Sets the repo source.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
repo_info | True | dict | Repo information to be set |
check_items | True | List[str] | Pre-check |
check | True | bool | Whether to perform the check |
- repo_info
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | True | str | Repo name |
dest | True | str | Repo location. Ensure that the repo file is stored in the /etc/yum.repos.d directory with the file name extension repo. |
repo_content | True | str | Repo file content |
Request parameter example
{ "repo_info":{ "name":"update", "dest":"/etc/yum.repos.d/aops-update.repo", "repo_content":"repo content" }, "check_items":[], "check":false }
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
Response example
{ "code": 200, "msg": "operate success" }
5.1.9 /v1/ceres/cve/scan
Description: Scans CVEs.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
check | True | bool | Whether to perform the check |
check_items | True | List[str] | Pre-check |
cves | True | List[str] | List of CVEs to be repaired |
Request parameter example
{ "check":false, "check_items":[], "basic":true // Currently, only true is supported. }
Response body parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
result | dict | Response body |
- result
Parameter | Type | Description |
---|---|---|
cves | List[str] | List of CVE IDs |
installed_packages | List[str] | List of software packages installed on the target host |
os_version | str | OS version of the target host |
Response example
{ "code": 200, "msg": "operate success", "result": { "cves":[], "installed_packages":[], "os_version":"openEuler 22.03 LTS" } }
5.1.10 /v1/ceres/cve/fix
Description: Repairs CVEs.
HTTP request mode: POST
Data submission mode: application/json
Request parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
check | True | bool | Whether to perform the check |
check_items | True | List[str] | Pre-check |
cves | True | List[str] | List of CVEs to be repaired |
Request parameter example
{ "check":false, "check_items":[], "cves":["CVE-2021-3782", "CVE-2021-3781"] }
Response parameters
Parameter | Type | Description |
---|---|---|
code | int | Return code |
msg | str | Message corresponding to the status code |
result | dict | CVE repair result |
- result
Parameter | Type | Description |
---|---|---|
cve_id | str | CVE ID |
log | str | Repair log (mainly used to record repair failures) |
result | str | Repair result |
Response example
{ "code": 200, "msg": "operate success", "result": [ { "cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed" }, { "cve_id": "CVE-2021-3781", "log": "fix succeed", "result": "succeed" } ] }
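Because the log field mainly records repair failures, a caller will usually separate repaired CVEs from failed ones and keep only the failure logs. The sketch below does this for a parsed response. The function name is hypothetical. Only the result value succeed appears in this document, so any other value is treated as a failure, which is an assumption.

```python
def split_fix_results(body):
    """Return (fixed, failed): a list of fixed CVE IDs and a dict of ID -> log."""
    fixed, failed = [], {}
    for entry in body.get("result", []):
        # Assumption: any result other than "succeed" means the repair failed.
        if entry["result"] == "succeed":
            fixed.append(entry["cve_id"])
        else:
            failed[entry["cve_id"]] = entry["log"]
    return fixed, failed


# Response example from this section.
SAMPLE_RESPONSE = {
    "code": 200,
    "msg": "operate success",
    "result": [
        {"cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed"},
        {"cve_id": "CVE-2021-3781", "log": "fix succeed", "result": "succeed"},
    ],
}

fixed, failed = split_fix_results(SAMPLE_RESPONSE)
print(fixed, failed)
```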
FAQs
If an error is reported, view the /var/log/aops/aops.log file, rectify the fault based on the error message in the log file, and restart the service.
You are advised to run aops-ceres in Python 3.7 or later. Pay attention to the version of the Python dependency library when installing it.
The value of access_token can be obtained from the /etc/aops/agent.conf file after the registration is complete.
If you choose not to disable the firewall, enable the ports involved in the aops-ceres deployment.
To limit the CPU and memory resources of a plug-in, add MemoryHigh and CPUQuota to the Service section in the service file corresponding to the plug-in.
For example, set MemoryHigh of gala-gopher to 40M and CPUQuota to 20%.
[Unit]
Description=a-ops gala gopher service
After=network.target
[Service]
Type=exec
ExecStart=/usr/bin/gala-gopher
Restart=on-failure
RestartSec=1
RemainAfterExit=yes
;Limit the maximum memory that can be used by processes in the unit. The limit can be exceeded. However, after the limit is exceeded, the process running speed is limited, and the system reclaims the excess memory as much as possible.
;The option value can be an absolute memory size in bytes (K, M, G, or T suffix based on 1024) or a relative memory size in percentage.
MemoryHigh=40M
;Set the CPU time limit for the processes of this unit. The value must be a percentage ending with %, indicating the maximum percentage of the total time that the unit can use a single CPU.
CPUQuota=20%
[Install]
WantedBy=multi-user.target
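After editing the service file, the limits can be read back to confirm that they were placed in the Service section. The sketch below uses Python's configparser for this. It is an approximation: configparser is not a full systemd unit parser (for example, it keeps only the last value of a repeated key), and the function name read_limits is hypothetical.

```python
import configparser

# Shortened copy of the gala-gopher service file shown above.
UNIT_TEXT = """
[Unit]
Description=a-ops gala gopher service
After=network.target

[Service]
Type=exec
ExecStart=/usr/bin/gala-gopher
Restart=on-failure
;Memory limit for the processes in the unit
MemoryHigh=40M
;CPU time limit, as a percentage of a single CPU
CPUQuota=20%

[Install]
WantedBy=multi-user.target
"""


def read_limits(unit_text):
    """Return (MemoryHigh, CPUQuota) from the Service section, or None if unset."""
    # interpolation=None keeps the literal '%' in values such as "20%".
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the original key case (MemoryHigh, CPUQuota)
    parser.read_string(unit_text)
    service = parser["Service"]
    return service.get("MemoryHigh"), service.get("CPUQuota")


print(read_limits(UNIT_TEXT))
```

After a unit file is changed, systemd needs to reload it (systemctl daemon-reload) and the service needs a restart before new limits take effect.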