1 Introduction to aops-ceres

As the client of the A-Ops module, aops-ceres exchanges data with the A-Ops management center through HTTP and provides functions such as collecting host information, managing data collection tools (such as gala-gopher), and responding to and processing commands delivered by the management center.

2 Environment Requirements

You are advised to use openEuler 22.03 LTS or later and 4 GB or more memory.

3 Environment Deployment

3.1 Disabling the firewall

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

3.2 Deploying aops-ceres

  1. Use Yum to install aops-ceres.

    yum install aops-ceres

  2. Modify the configuration file. In the [ceres] section, set ip to the IP address of the local host.

    vim /etc/aops/ceres.conf

    The IP address 192.168.1.3 is used as an example.

    [ceres]
    ;Start aops-ceres and bind the IP address and port number.
    ip=192.168.1.3
    port=12000
    
    [gopher]
    ;Default path of the gala-gopher configuration file. If you need to change the path, ensure that the file path is correct.
    config_path=/opt/gala-gopher/gala-gopher.conf
    
    ;aops-ceres log collection configuration
    [log]
    ;Level of the logs to be collected, which can be set to DEBUG, INFO, WARNING, ERROR, or CRITICAL
    log_level=INFO
    ;Location for storing collected logs
    log_dir=/var/log/aops
    ;Maximum size of a log file
    max_bytes=31457280
    ;Number of backup logs
    backup_count=40
    
  3. Run the systemctl start aops-ceres command to start the service.
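
Before starting the service, the edited file can be sanity-checked by parsing it. The following is a minimal Python sketch, assuming the file is plain INI as in the example for step 2. The helper name load_ceres_settings and the inline SAMPLE text are illustrative only and are not part of aops-ceres.

```python
import configparser

# Shortened copy of the example configuration from step 2.
SAMPLE = """\
[ceres]
;Start aops-ceres and bind the IP address and port number.
ip=192.168.1.3
port=12000

[log]
log_level=INFO
log_dir=/var/log/aops
max_bytes=31457280
backup_count=40
"""


def load_ceres_settings(text):
    """Parse ceres.conf-style text and return the bind address and log limits."""
    parser = configparser.ConfigParser()  # lines starting with ';' are comments
    parser.read_string(text)
    return {
        "ip": parser.get("ceres", "ip"),
        "port": parser.getint("ceres", "port"),
        "max_log_mib": parser.getint("log", "max_bytes") / (1024 * 1024),
        "backup_count": parser.getint("log", "backup_count"),
    }


settings = load_ceres_settings(SAMPLE)
print(settings)
```

With the example values, max_bytes corresponds to 30 MiB per log file. To check the real file, read /etc/aops/ceres.conf and pass its text to the same helper.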

3.3 Registering with aops-zeus

To identify users and prevent APIs from being invoked randomly, aops-ceres uses tokens to authenticate users, reducing the pressure on the deployed hosts.

For security purposes, the active registration mode is used to obtain the token. Before registering, prepare the registration information on the aops-ceres host, then run the register command to register it with aops-zeus. No database is configured for aops-ceres. After the registration succeeds, the token is automatically saved to the specified file and the registration result is displayed on the GUI. The local host information is also saved to the aops-zeus database for subsequent management.

  1. Change the values of the data items to the actual values by referring to the /opt/aops/register_example.json template file.
{
    // GUI login user name
    "username":"admin",
    // User password
    "password": "changeme",
    // Name of the host to be registered
    "host_name": "host1",
    // Name of the group to which the host belongs. Ensure that the host group has been added.
    "host_group_name": "group1",
    // IP address of the host where aops-zeus is running
    "zeus_ip":"192.168.1.2",
    // Whether to register as a management host
    "management":false,
    // Port of the aops-zeus service
    "zeus_port":"11111",
    // Port of aops-ceres
    "ceres_port":"12000"
}

Note: Ensure that aops-zeus is running on the target host, for example, 192.168.1.2, and the registered host group exists.

  2. Run aops-ceres register -f register_example.json.
  3. The registration result is displayed on the GUI. If the registration is successful, the token string is saved to a specified file. If the registration fails, locate the fault based on the message and the log (/var/log/aops/aops.log).

The following is an example of the registration result:

Registration successful:

[root@localhost ~]# aops-ceres register -f register.json
Register Success

Registration failed. The following example shows a failure that occurs when aops-zeus has not started:

[root@localhost ~]# aops-ceres register -f register.json
Register Fail
[root@localhost ~]#

Log content:

2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
[root@localhost ~]#
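
The registration file from step 1 can also be generated by a script, which avoids typos in key names. The following is a minimal Python sketch. It assumes that all eight keys in the template are required and that plain JSON without comments is accepted. Both are assumptions, and write_register_file is a hypothetical helper, not an aops-ceres command.

```python
import json
import os
import tempfile

# Keys taken from the register_example.json template. Treating all of them
# as required is an assumption of this sketch.
REQUIRED_KEYS = {
    "username", "password", "host_name", "host_group_name",
    "zeus_ip", "management", "zeus_port", "ceres_port",
}


def write_register_file(path, fields):
    """Validate the registration fields and write them as plain JSON."""
    missing = REQUIRED_KEYS - set(fields)
    if missing:
        raise ValueError(f"missing registration fields: {sorted(missing)}")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(fields, fh, indent=4)
    return path


# Example values copied from the template; replace them with real ones.
demo_fields = {
    "username": "admin",
    "password": "changeme",
    "host_name": "host1",
    "host_group_name": "group1",
    "zeus_ip": "192.168.1.2",
    "management": False,
    "zeus_port": "11111",
    "ceres_port": "12000",
}
demo_path = write_register_file(
    os.path.join(tempfile.mkdtemp(), "register.json"), demo_fields
)
print(demo_path)
```

The generated file can then be passed to aops-ceres register -f as in step 2.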

4 Plug-in Support

4.1 gala-gopher

4.1.1 Introduction

gala-gopher is a low-load probe framework based on eBPF. It can be used to monitor the CPU, memory, and network status of hosts and collect data. You can configure the collection status of existing probes based on service requirements.

4.1.2 Deployment

  1. Run yum install gala-gopher to install gala-gopher.
  2. Enable probes based on service requirements. You can view information about probes in /opt/gala-gopher/gala-gopher.conf.
  3. Run systemctl start gala-gopher to start the gala-gopher service.

4.1.3 More

For more information about gala-gopher, see https://gitee.com/openeuler/gala-gopher/blob/master/README.md.

5 API Support

5.1 List of External APIs

| No. | API | Type | Description |
| --- | --- | --- | --- |
| 1 | /v1/ceres/plugin/start | POST | Starts a plug-in. |
| 2 | /v1/ceres/plugin/stop | POST | Stops a plug-in. |
| 3 | /v1/ceres/application/info | GET | Collects running applications in the target application collection. |
| 4 | /v1/ceres/host/info | POST | Obtains host information. |
| 5 | /v1/ceres/plugin/info | GET | Obtains the plug-in running information in aops-ceres. |
| 6 | /v1/ceres/file/collect | POST | Collects content of the configuration file. |
| 7 | /v1/ceres/collect/items/change | POST | Changes the running status of plug-in collection items. |
| 8 | /v1/ceres/cve/repo/set | POST | Sets the repo source. |
| 9 | /v1/ceres/cve/scan | POST | Scans the CVEs on the local host. |
| 10 | /v1/ceres/cve/fix | POST | Repairs the CVEs. |

5.1.1 /v1/ceres/plugin/start

  • Description: Starts the plug-in that is installed but not running. Currently, only the gala-gopher plug-in is supported.

  • HTTP request mode: POST

  • Data submission mode: query

  • Request parameters

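    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | plugin_name | True | str | Plug-in name |

  • Request parameter example

    | Parameter | Value |
    | --- | --- |
    | plugin_name | gala-gopher |

  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
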
  • Response example

    {
        "code": 200,
        "msg": "xxxx"
    }
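
Because plugin_name is submitted as a query parameter, the request can be sketched with the Python standard library as follows. This is only an illustration: the access_token header name is an assumption (this document does not state how the token is transmitted), and build_plugin_request is a hypothetical helper. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_plugin_request(host, port, action, plugin_name, token):
    """Build (without sending) a POST request that starts or stops a plug-in."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    query = urlencode({"plugin_name": plugin_name})
    url = f"http://{host}:{port}/v1/ceres/plugin/{action}?{query}"
    # ASSUMPTION: the header name used to carry the token is not documented here.
    return Request(url, method="POST", headers={"access_token": token})


req = build_plugin_request("192.168.1.3", 12000, "start", "gala-gopher", "<token>")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request; the response body is expected to be JSON containing code and msg.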
    

5.1.2 /v1/ceres/plugin/stop

  • Description: Stops a running plug-in. Currently, only the gala-gopher plug-in is supported.

  • HTTP request mode: POST

  • Data submission mode: query

  • Request parameters

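    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | plugin_name | True | str | Plug-in name |

  • Request parameter example

    | Parameter | Value |
    | --- | --- |
    | plugin_name | gala-gopher |

  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
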
  • Response example

    {
        "code": 200,
        "msg": "xxxx"
    }
    

5.1.3 /v1/ceres/application/info

  • Description: Collects running applications in the target application collection. Currently, the target application collection contains MySQL, Kubernetes, Hadoop, Nginx, Docker, and gala-gopher.

  • HTTP request mode: GET

  • Data submission mode: query

  • Request parameters

    None

  • Request parameter example

    None

  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | resp | dict | Response body |

    • resp

      | Parameter | Type | Description |
      | --- | --- | --- |
      | running | List[str] | List of the running applications |

  • Response example

    {
        "code": 200,
        "msg": "xxxx",
        "resp": {
            "running": [
                "mysql",
                "docker"
            ]
        }
    }
    

5.1.4 /v1/ceres/host/info

  • Description: Obtains information about the host where aops-ceres is installed, including the system version, BIOS version, kernel version, CPU information, and memory information.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | info_type | True | List[str] | List of the information to be collected. Currently, only cpu, disk, memory, and os are supported. |

  • Request parameter example

    ["os", "cpu","memory", "disk"]
    
  • Response body parameters

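    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | resp | dict | Response body |

    • resp

      | Parameter | Type | Description |
      | --- | --- | --- |
      | cpu | dict | CPU information |
      | memory | dict | Memory information |
      | os | dict | OS information |
      | disk | List[dict] | Disk information |

    • cpu

      | Parameter | Type | Description |
      | --- | --- | --- |
      | architecture | str | CPU architecture |
      | core_count | int | Number of cores |
      | l1d_cache | str | L1 data cache size |
      | l1i_cache | str | L1 instruction cache size |
      | l2_cache | str | L2 cache size |
      | l3_cache | str | L3 cache size |
      | model_name | str | Model name |
      | vendor_id | str | Vendor ID |

    • memory

      | Parameter | Type | Description |
      | --- | --- | --- |
      | size | str | Total memory |
      | total | int | Number of DIMMs |
      | info | List[dict] | Information about all DIMMs |

      • info

        | Parameter | Type | Description |
        | --- | --- | --- |
        | size | str | Memory size |
        | type | str | Type |
        | speed | str | Speed |
        | manufacturer | str | Manufacturer |

    • os

      | Parameter | Type | Description |
      | --- | --- | --- |
      | bios_version | str | BIOS version |
      | os_version | str | OS name and version |
      | kernel | str | Kernel version |
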
  • Response example

    {
        "code": 200,
        "msg": "operate success",
        "resp": {
            "cpu": {
                "architecture": "aarch64",
                "core_count": "128",
                "l1d_cache": "8 MiB (128 instances)",
                "l1i_cache": "8 MiB (128 instances)",
                "l2_cache": "64 MiB (128 instances)",
                "l3_cache": "128 MiB (4 instances)",
                "model_name": "Kunpeng-920",
                "vendor_id": "HiSilicon"
            },
            "memory": {
                "info": [
                    {
                        "manufacturer": "Hynix",
                        "size": "16 GB",
                        "speed": "2933 MT/s",
                        "type": "DDR4"
                    },
                    {
                        "manufacturer": "Hynix",
                        "size": "16 GB",
                        "speed": "2933 MT/s",
                        "type": "DDR4"
                    }
                ],
                "size": "32G",
                "total": 2
            },
            "os": {
                "bios_version": "1.82",
                "kernel": "5.10.0-60.18.0.50",
                "os_version": "openEuler 22.03 LTS"   
            },
            "disk": [
                {
                    "capacity": "xxGB",
                    "model": "xxxxxx"
                }
            ]
        }
    }
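
A caller usually needs only a few fields from this response. The following Python sketch condenses a parsed response into a short summary. It assumes that each DIMM size has the form "<number> GB" as in the example above; summarize_host_info is a hypothetical helper, and the sample dictionary is a shortened copy of the example response.

```python
def summarize_host_info(response):
    """Return a short summary of a parsed /v1/ceres/host/info response."""
    if response.get("code") != 200:
        raise RuntimeError(response.get("msg", "request failed"))
    resp = response["resp"]
    dimms = resp.get("memory", {}).get("info", [])
    # ASSUMPTION: DIMM sizes look like "16 GB"; other formats are skipped.
    dimm_total_gb = sum(
        int(d["size"].split()[0]) for d in dimms if d["size"].endswith(" GB")
    )
    return {
        "os": resp.get("os", {}).get("os_version"),
        "cpu": resp.get("cpu", {}).get("model_name"),
        "dimm_count": len(dimms),
        "dimm_total_gb": dimm_total_gb,
    }


sample = {
    "code": 200,
    "msg": "operate success",
    "resp": {
        "cpu": {"model_name": "Kunpeng-920"},
        "memory": {
            "info": [{"size": "16 GB"}, {"size": "16 GB"}],
            "size": "32G",
            "total": 2,
        },
        "os": {"os_version": "openEuler 22.03 LTS"},
    },
}
summary = summarize_host_info(sample)
print(summary)
```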
    

5.1.5 /v1/ceres/plugin/info

  • Description: Obtains the plug-in running status of the host. Currently, only the gala-gopher plug-in is supported.

  • HTTP request mode: GET

  • Data submission mode: query

  • Request parameters

    None

  • Request parameter example

    None

  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | resp | List[dict] | Response body |

    • resp

      | Parameter | Type | Description |
      | --- | --- | --- |
      | plugin_name | str | Plug-in name |
      | collect_items | list | Running status of plug-in collection items |
      | is_installed | bool | Whether the plug-in is installed |
      | resource | List[dict] | Plug-in resource usage |
      | status | str | Plug-in running status |

      • resource

        | Parameter | Type | Description |
        | --- | --- | --- |
        | name | str | Resource name |
        | current_value | str | Resource usage |
        | limit_value | str | Resource limit |

  • Response example

    {
        "code": 200,
        "msg": "operate success",
        "resp": [
            {
                "collect_items": [
                    {
                        "probe_name": "system_tcp",
                        "probe_status": "off",
                        "support_auto": false
                    },
                    {
                        "probe_name": "haproxy",
                        "probe_status": "auto",
                        "support_auto": true
                    },
                    {
                        "probe_name": "nginx",
                        "probe_status": "auto",
                        "support_auto": true
                    }
                ],
                "is_installed": true,
                "plugin_name": "gala-gopher",
                "resource": [
                    {
                        "current_value": "0.0%",
                        "limit_value": null,
                        "name": "cpu"
                    },
                    {
                        "current_value": "13 MB",
                        "limit_value": null,
                        "name": "memory"
                    }
                ],
                "status": "active"
            }
        ]
    }
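
To see at a glance which probes are on, off, or automatic, the collect_items entries can be grouped by status. The following is a minimal Python sketch over a parsed response; probes_by_status is a hypothetical helper, and the sample is a shortened copy of the example above.

```python
from collections import defaultdict


def probes_by_status(response):
    """Group probe names by probe_status across all plug-ins in the response."""
    grouped = defaultdict(list)
    for plugin in response.get("resp", []):
        for item in plugin.get("collect_items", []):
            grouped[item["probe_status"]].append(item["probe_name"])
    return dict(grouped)


sample = {
    "code": 200,
    "msg": "operate success",
    "resp": [
        {
            "plugin_name": "gala-gopher",
            "collect_items": [
                {"probe_name": "system_tcp", "probe_status": "off", "support_auto": False},
                {"probe_name": "haproxy", "probe_status": "auto", "support_auto": True},
                {"probe_name": "nginx", "probe_status": "auto", "support_auto": True},
            ],
        }
    ],
}
grouped = probes_by_status(sample)
print(grouped)
```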
    

5.1.6 /v1/ceres/file/collect

  • Description: Collects information such as the content, permission, and owner of the target configuration file. Currently, only text files that are smaller than 1 MB, have no execute permission, and are UTF-8 encoded can be read.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | configfile_path | True | List[str] | List of the full paths of the files to be collected |

  • Request parameter example

    [ "/home/test.conf", "/home/test.ini", "/home/test.json"]
    
  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | infos | List[dict] | File collection information |
    | success_files | List[str] | List of files successfully collected |
    | fail_files | List[str] | List of files that fail to be collected |

    • infos

      | Parameter | Type | Description |
      | --- | --- | --- |
      | path | str | File path |
      | content | str | File content |
      | file_attr | dict | File attributes |

      • file_attr

        | Parameter | Type | Description |
        | --- | --- | --- |
        | mode | str | File permission mode |
        | owner | str | File owner |
        | group | str | Group to which the file belongs |

  • Response example

    {
        "infos": [
            {
                "content": "this is a test file",
                "file_attr": {
                    "group": "root",
                    "mode": "0644",
                    "owner": "root"
                },
                "path": "/home/test.txt"
            }
        ],
        "success_files": [
            "/home/test.txt"
        ],
        "fail_files": [
            "/home/test.txt"
        ]
    }
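
The three constraints in the description (smaller than 1 MB, no execute permission, UTF-8 text) can be pre-checked on the host before a path is submitted. The following Python sketch shows one way to do that. It is not the check that aops-ceres itself performs, and is_collectable is a hypothetical helper; the demo files are created in a temporary directory.

```python
import os
import stat
import tempfile

MAX_SIZE = 1024 * 1024  # the 1 MB limit stated in the description


def is_collectable(path):
    """Return True if the file meets the size, permission, and encoding limits."""
    try:
        st = os.stat(path)
    except OSError:
        return False
    if not stat.S_ISREG(st.st_mode) or st.st_size >= MAX_SIZE:
        return False
    if st.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return False  # any execute bit disqualifies the file
    try:
        with open(path, encoding="utf-8") as fh:
            fh.read()
    except (UnicodeDecodeError, OSError):
        return False
    return True


demo_dir = tempfile.mkdtemp()
text_file = os.path.join(demo_dir, "test.conf")
with open(text_file, "w", encoding="utf-8") as fh:
    fh.write("key=value\n")
os.chmod(text_file, 0o644)

binary_file = os.path.join(demo_dir, "blob.bin")
with open(binary_file, "wb") as fh:
    fh.write(b"\xff\xfe\x00")  # not valid UTF-8
os.chmod(binary_file, 0o644)

print(is_collectable(text_file), is_collectable(binary_file))
```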
    

5.1.7 /v1/ceres/collect/items/change

  • Description: Changes the collection status of the plug-in collection items. Currently, only the status of the gala-gopher collection items can be changed. For the gala-gopher collection items, see /opt/gala-gopher/gala-gopher.conf.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | plugin_name | True | dict | Expected modification result of the plug-in collection items |

    • plugin_name

      | Parameter | Mandatory | Type | Description |
      | --- | --- | --- | --- |
      | collect_item | True | str | Expected modification result of the collection item |

  • Request parameter example

    {
        "gala-gopher":{
            "redis":"auto",
            "system_inode":"on",
            "tcp":"on",
            "haproxy":"auto"
        }
    } 
    
  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | resp | dict | Response body |

    • resp

      | Parameter | Type | Description |
      | --- | --- | --- |
      | plugin_name | dict | Modification result of the corresponding collection item |

      • plugin_name

        | Parameter | Type | Description |
        | --- | --- | --- |
        | success | List[str] | Collection items that are successfully modified |
        | failure | List[str] | Collection items that fail to be modified |

  • Response example

    {
        "code": 200,
        "msg": "operate success",
        "resp": {
            "gala-gopher": {
                "failure": [
                    "redis"
                ],
                "success": [
                    "system_inode",
                    "tcp",
                    "haproxy"
                ]
            }
        }
    }
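
The request body maps a plug-in name to the desired state of each collection item, and the response reports which items changed. The following Python sketch builds such a body and extracts the failures from a parsed response. The set of valid states (on, off, auto) is an assumption inferred from the examples in this document, and both helper names are hypothetical.

```python
import json

# ASSUMPTION: these are the only states; they are inferred from the examples.
VALID_STATES = {"on", "off", "auto"}


def build_change_payload(plugin_name, items):
    """Return the JSON request body for /v1/ceres/collect/items/change."""
    invalid = sorted(name for name, state in items.items() if state not in VALID_STATES)
    if invalid:
        raise ValueError(f"unsupported state for: {invalid}")
    return json.dumps({plugin_name: items})


def failed_items(response):
    """Map each plug-in to the collection items that could not be modified."""
    return {
        plugin: result["failure"]
        for plugin, result in response["resp"].items()
        if result["failure"]
    }


payload = build_change_payload("gala-gopher", {"redis": "auto", "tcp": "on"})
sample_response = {
    "code": 200,
    "msg": "operate success",
    "resp": {"gala-gopher": {"failure": ["redis"], "success": ["tcp"]}},
}
print(payload)
print(failed_items(sample_response))
```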
    

5.1.8 /v1/ceres/cve/repo/set

  • Description: Sets the repo source.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | repo_info | True | dict | Repo information to be set |
    | check_items | True | List[str] | Pre-check |
    | check | True | bool | Whether to perform the check |

    • repo_info

      | Parameter | Mandatory | Type | Description |
      | --- | --- | --- | --- |
      | name | True | str | Repo name |
      | dest | True | str | Repo location. Ensure that the repo file is stored in the /etc/yum.repos.d directory with the file name extension .repo. |
      | repo_content | True | str | Repo file content |

  • Request parameter example

    {
        "repo_info":{
            "name":"update",
            "dest":"/etc/yum.repos.d/aops-update.repo",
            "repo_content":"repo content"
        },
        "check_items":[],
        "check":false
    }
    
  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |

  • Response example

    {
        "code": 200,
        "msg": "operate success"
    }
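
Because dest must point to a .repo file in the yum repo directory, a client can validate the path before sending the request. The following Python sketch builds the request body and rejects other locations. It uses /etc/yum.repos.d, the directory shown in the request example; build_repo_payload is a hypothetical helper, and the pre-check is disabled as in the example.

```python
import json
import posixpath

REPO_DIR = "/etc/yum.repos.d"  # directory used in the request example


def build_repo_payload(name, dest, repo_content):
    """Return the JSON request body for /v1/ceres/cve/repo/set."""
    if posixpath.dirname(dest) != REPO_DIR or not dest.endswith(".repo"):
        raise ValueError(f"dest must be a .repo file directly under {REPO_DIR}")
    return json.dumps(
        {
            "repo_info": {"name": name, "dest": dest, "repo_content": repo_content},
            "check_items": [],
            "check": False,
        }
    )


payload = build_repo_payload(
    "update", "/etc/yum.repos.d/aops-update.repo", "repo content"
)
print(payload)
```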
    

5.1.9 /v1/ceres/cve/scan

  • Description: Scans CVEs.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | check | True | bool | Whether to perform the check |
    | check_items | True | List[str] | Pre-check |
    | basic | True | bool | Currently, only true is supported. |

  • Request parameter example

    {
        "check":false,
        "check_items":[],
        "basic":true // Currently, only true is supported.
    }
    
  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | result | dict | Response body |

    • result

      | Parameter | Type | Description |
      | --- | --- | --- |
      | cves | List[str] | List of CVE IDs |
      | installed_packages | List[str] | List of software packages installed on the target host |
      | os_version | str | OS version of the target host |

  • Response example

    {
        "code": 200,
        "msg": "operate success",
        "result": {
            "cves":[],
            "installed_packages":[],
            "os_version":"openEuler 22.03 LTS"
        }
    }
    

5.1.10 /v1/ceres/cve/fix

  • Description: Repairs CVEs.

  • HTTP request mode: POST

  • Data submission mode: application/json

  • Request parameters

    | Parameter | Mandatory | Type | Description |
    | --- | --- | --- | --- |
    | check | True | bool | Whether to perform the check |
    | check_items | True | List[str] | Pre-check |
    | cves | True | List[str] | List of CVEs to be repaired |

  • Request parameter example

    {
        "check":false,
        "check_items":[],
        "cves":["CVE-2021-3782", "CVE-2021-3781"]
    }
    
  • Response body parameters

    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | Return code |
    | msg | str | Message corresponding to the status code |
    | result | List[dict] | CVE repair result |

    • result

      | Parameter | Type | Description |
      | --- | --- | --- |
      | cve_id | str | CVE ID |
      | log | str | Repair log (mainly used to record repair failures) |
      | result | str | Repair result |

  • Response example

    {
        "code": 200,
        "msg": "operate success",
        "result": [
            {
                "cve_id": "CVE-2021-3782",
                "log": "fix succeed",
                "result": "succeed"
            },
            {
                "cve_id": "CVE-2021-3781",
                "log": "fix succeed",
                "result": "succeed"
            }
        ]
    }
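
Since each CVE gets its own entry in result, a caller can collect the repairs that did not succeed together with their logs. The following Python sketch does this for a parsed response. Only the value "succeed" appears in this document, so the sketch treats every other value as a failure; the "fail" value and its log text in the sample are hypothetical, as is the helper name.

```python
def failed_fixes(response):
    """Map each CVE that was not repaired to its repair log."""
    return {
        entry["cve_id"]: entry["log"]
        for entry in response.get("result", [])
        if entry["result"] != "succeed"
    }


sample = {
    "code": 200,
    "msg": "operate success",
    "result": [
        {"cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed"},
        # Hypothetical failure entry, for illustration only.
        {"cve_id": "CVE-2021-3781", "log": "package not found", "result": "fail"},
    ],
}
failures = failed_fixes(sample)
print(failures)
```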
    

FAQs

  1. If an error is reported, view the /var/log/aops/aops.log file, rectify the fault based on the error message in the log file, and restart the service.

  2. You are advised to run aops-ceres with Python 3.7 or later. When installing the Python dependency libraries, make sure that their versions are compatible.

  3. The value of access_token can be obtained from the /etc/aops/agent.conf file after the registration is complete.

  4. If you choose not to disable the firewall, enable the ports involved in the aops-ceres deployment.

  5. To limit the CPU and memory resources of a plug-in, add MemoryHigh and CPUQuota to the Service section in the service file corresponding to the plug-in.

    For example, set MemoryHigh of gala-gopher to 40M and CPUQuota to 20%.

    [Unit]
    Description=a-ops gala gopher service
    After=network.target
    
    [Service]
    Type=exec
    ExecStart=/usr/bin/gala-gopher
    Restart=on-failure
    RestartSec=1
    RemainAfterExit=yes
    ;Limit the maximum memory that can be used by processes in the unit. The limit can be exceeded. However, after the limit is exceeded, the process running speed is limited, and the system reclaims the excess memory as much as possible.
    ;The option value can be an absolute memory size in bytes (K, M, G, or T suffix based on 1024) or a relative memory size in percentage.
    MemoryHigh=40M
    ;Set the CPU time limit for the processes of this unit. The value must be a percentage ending with %, indicating the maximum percentage of the total time that the unit can use a single CPU.
    CPUQuota=20%
    
    [Install]
    WantedBy=multi-user.target
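
The comment on MemoryHigh says that K, M, G, and T suffixes are based on 1024. The following Python sketch converts such a value to bytes to make the arithmetic concrete. It handles only absolute sizes; percentages and other special values are out of scope, and memory_high_bytes is a hypothetical helper.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def memory_high_bytes(value):
    """Convert an absolute MemoryHigh value such as '40M' to bytes."""
    value = value.strip()
    if value and value[-1].upper() in UNITS:
        return int(value[:-1]) * UNITS[value[-1].upper()]
    return int(value)  # a plain number is already in bytes


print(memory_high_bytes("40M"))  # 40 * 1024 * 1024 = 41943040
```

With the example settings, gala-gopher is throttled once it uses more than 41943040 bytes of memory, and CPUQuota=20% limits it to one fifth of a single CPU.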
    
