    1 Introduction to aops-ceres

    As the client of the A-Ops module, aops-ceres exchanges data with the A-Ops management center through HTTP and provides functions such as collecting host information, managing data collection tools (such as gala-gopher), and responding to and processing commands delivered by the management center.

    2 Environment Requirements

    You are advised to use openEuler 22.03 LTS or later.

    3 Environment Deployment

    3.1 Deploying aops-ceres

    1. Use Yum to install aops-ceres.

      yum install aops-ceres

    2. Modify the configuration file.

      vim /etc/aops/ceres.conf

      [ceres]
      ;IP address and port number that aops-ceres binds to when it starts.
      ip=192.168.1.3
      port=12000
      
      [gopher]
      ;Default path of the gala-gopher configuration file. If you need to change the path, ensure that the file path is correct.
      config_path=/opt/gala-gopher/gala-gopher.conf
      
      ;aops-ceres log collection configuration
      [log]
      ;Level of the logs to be collected, which can be set to DEBUG, INFO, WARNING, ERROR, or CRITICAL
      log_level=INFO
      ;Location for storing collected logs
      log_dir=/var/log/aops
      ;Maximum size of a log file
      max_bytes=31457280
      ;Number of backup logs
      backup_count=40
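
    The file uses INI syntax with ; comments, so a standard INI parser can read it. The following Python sketch is illustrative only and is not part of aops-ceres: it parses an inline copy of the settings above with the standard configparser module. The load_ceres_config name and the keys of the returned dictionary are hypothetical.

```python
import configparser

SAMPLE_CONF = """\
[ceres]
ip=192.168.1.3
port=12000

[gopher]
config_path=/opt/gala-gopher/gala-gopher.conf

[log]
;Level of the logs to be collected
log_level=INFO
log_dir=/var/log/aops
max_bytes=31457280
backup_count=40
"""


def load_ceres_config(text):
    """Parse ceres.conf-style text into a plain dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "ip": parser.get("ceres", "ip"),
        "port": parser.getint("ceres", "port"),
        "gopher_config_path": parser.get("gopher", "config_path"),
        "log_level": parser.get("log", "log_level"),
        "log_dir": parser.get("log", "log_dir"),
        "max_bytes": parser.getint("log", "max_bytes"),
        "backup_count": parser.getint("log", "backup_count"),
    }


config = load_ceres_config(SAMPLE_CONF)
# max_bytes is given in bytes: 31457280 bytes is 30 MiB.
print(config["port"], config["max_bytes"] // (1024 * 1024))
```

    To check a real deployment, the same function can be given the text read from /etc/aops/ceres.conf.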
      

    3.2 Registering with aops-zeus

    To identify users and prevent arbitrary API calls, aops-ceres authenticates requests with tokens, which also reduces the load on the deployed hosts.

    Before registering, obtain the token of the management user (issued after login), then run the register command on the aops-ceres host to register with aops-zeus. aops-ceres has no database of its own: after a successful registration, the token is automatically saved to the specified file and the registration result is displayed on the GUI. The local host information is also saved to the aops-zeus database for subsequent management.

    1. Using /opt/aops/register_example.json as a template, set each data item to its actual value.

      {
      
      "ssh_user": "root", // Remote login user name
      "password": "password", // User password
      "ssh_port":22,  // Remote login port
      "zeus_ip": "192.168.1.2", // IP address of the host where aops-zeus is running
      "zeus_port": 11111, // aops-zeus port
      "host_name": "host_name", // Host name to be registered
      "host_group_name": "aops", // Existing host group name
      "management": false, // Whether to register as a management host
      "access_token": "token-string" // Management user token obtained after login
      }
      

      Note: Ensure that aops-zeus is running on the target host, for example, 192.168.1.2, and the registered host group exists.
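
    Standard JSON has no comment syntax, so it is safest to leave the // annotations out of the file that is actually passed to the command. A minimal Python sketch for producing a comment-free registration file follows. It assumes only the field names from the template above; write_register_file is a hypothetical helper, not an aops-ceres API.

```python
import json
import os
import tempfile

REQUIRED_FIELDS = (
    "ssh_user", "password", "ssh_port", "zeus_ip", "zeus_port",
    "host_name", "host_group_name", "management", "access_token",
)


def write_register_file(info, path):
    """Check that all template fields are present, then write plain JSON."""
    missing = [key for key in REQUIRED_FIELDS if key not in info]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(info, handle, indent=4)
    return path


# Placeholder values copied from the template; replace them with real ones.
register_info = {
    "ssh_user": "root",
    "password": "password",
    "ssh_port": 22,
    "zeus_ip": "192.168.1.2",
    "zeus_port": 11111,
    "host_name": "host_name",
    "host_group_name": "aops",
    "management": False,
    "access_token": "token-string",
}

register_path = os.path.join(tempfile.mkdtemp(), "register.json")
write_register_file(register_info, register_path)
```

    The resulting file can then be passed to aops-ceres register with the -f option.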

    2. Run aops-ceres register [-f <FILE>] [-d <register_host_info>].

    3. The registration result is displayed on the GUI. If the registration is successful, the token character string is saved to a specified file. If the registration fails, locate the fault based on the message and log (/var/log/aops/aops.log).

    The following is an example of the registration result:

    Registration successful:

    $ aops-ceres register -f register.json
    Register Success
    

    Registration failed (in this example, because aops-zeus failed to start):

    $ aops-ceres register -f register.json
    Register Fail
    

    Log content:

    2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
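
    A "Connection refused" error like the one above means that nothing is listening on the aops-zeus address and port. Before registering, reachability can be checked with a short Python sketch such as the following. It is not part of aops-ceres, and zeus_reachable is a hypothetical helper name.

```python
import socket


def zeus_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


# Example call with the address and port from the registration example:
# zeus_reachable("192.168.1.2", 11111)
```

    A True result only shows that the port accepts connections; it does not confirm that the aops-zeus service itself is healthy.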
    

    4 Plug-in Support

    4.1 gala-gopher

    4.1.1 Introduction

    gala-gopher is a low-overhead probe framework based on eBPF. It monitors the CPU, memory, and network status of hosts and collects the related data. You can configure the collection status of existing probes based on service requirements.

    4.1.2 Deployment

    1. Run yum install gala-gopher to install gala-gopher.
    2. Enable probes based on service requirements. You can view information about probes in /opt/gala-gopher/gala-gopher.conf.
    3. Run systemctl start gala-gopher to start the gala-gopher service.

    4.1.3 More

    For more information about gala-gopher, see https://gitee.com/openeuler/gala-gopher/blob/master/README.md.

    5 Command Support

    5.1 List of Commands

    aops-ceres COMMAND [options]
    List of Main Commands:
    plugin          manage plugin
    collect         collect some information
    apollo          cve/bugfix related action
    
    General plugin options:
    --start <args>
    --stop  <args>
    --change-collect-items <args>
    --info <args>
    
    General info options: 
    --file <args>
    --application <args>
    --host <args>
    
    General apollo options:
    --set-repo <args>
    --scan <args>
    --fix <args>
    

    5.1.1 aops-ceres plugin --start <args>

    • Description: Starts a plug-in that is installed but not running. Currently, only the gala-gopher plug-in is supported.

    • args: gala-gopher

    5.1.2 aops-ceres plugin --stop <args>

    • Description: Stops a running plug-in. Currently, only the gala-gopher plug-in is supported.

    • args: gala-gopher

    5.1.3 aops-ceres plugin --change-collect-items <args>

    • Description: Changes the collection status of the plug-in collection items. Currently, only the status of the gala-gopher collection items can be changed. For the gala-gopher collection items, see /opt/gala-gopher/gala-gopher.conf.

    • Parameter example:

      {
          "gala-gopher":{
              "redis":"auto",
              "system_inode":"on",
              "tcp":"on",
              "haproxy":"auto"
          }
      } 
      
    • Execution result example:

      {
          "resp": {
              "gala-gopher": {
                  "failure": [
                      "redis"
                  ],
                  "success": [
                      "system_inode",
                      "tcp",
                      "haproxy"
                  ]
              }
          }
      }
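
    Because the argument is a JSON string, it is easiest to build it programmatically. The following Python sketch assembles the command line and extracts the failed items from a response shaped like the example above. Both function names are hypothetical, and the sketch assumes that the command prints the result as JSON.

```python
import json


def build_change_collect_items_cmd(plugin, items):
    """Return the argv list for 'aops-ceres plugin --change-collect-items'."""
    payload = json.dumps({plugin: items})
    return ["aops-ceres", "plugin", "--change-collect-items", payload]


def failed_items(response_text, plugin):
    """Return the collection items reported under 'failure' for a plug-in."""
    response = json.loads(response_text)
    return response["resp"][plugin]["failure"]


cmd = build_change_collect_items_cmd(
    "gala-gopher",
    {"redis": "auto", "system_inode": "on", "tcp": "on", "haproxy": "auto"},
)

# Response copied from the execution result example.
sample_response = """
{"resp": {"gala-gopher": {"failure": ["redis"],
                          "success": ["system_inode", "tcp", "haproxy"]}}}
"""
print(failed_items(sample_response, "gala-gopher"))
```

    On a host where aops-ceres is installed, cmd can be passed to subprocess.run. Passing the arguments as a list avoids shell quoting problems with the JSON string.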
      

    5.1.4 aops-ceres plugin --info

    • Description: Obtains the running status of the plug-ins on the host, including their collection items and resource usage. Currently, only the gala-gopher plug-in is supported.

    • Execution result example:

      [
        {
            "collect_items": [
                {
                    "probe_name": "system_tcp",
                    "probe_status": "off",
                    "support_auto": false
                },
                {
                    "probe_name": "haproxy",
                    "probe_status": "auto",
                    "support_auto": true
                },
                {
                    "probe_name": "nginx",
                    "probe_status": "auto",
                    "support_auto": true
                }
              ],
            "is_installed": true,
            "plugin_name": "gala-gopher",
            "resource": [
                {
                    "current_value": "0.0%",
                    "limit_value": null,
                    "name": "cpu"
                },
                {
                    "current_value": "13 MB",
                    "limit_value": null,
                    "name": "memory"
                }
              ],
            "status": "active"
          }
        ]
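
    The result is a list with one entry per plug-in. As a rough illustration of how it might be consumed, the following Python sketch groups the probes of each plug-in by status. It uses a condensed copy of the example above; probes_by_status is a hypothetical helper.

```python
import json

SAMPLE_INFO = """
[{"collect_items": [
      {"probe_name": "system_tcp", "probe_status": "off", "support_auto": false},
      {"probe_name": "haproxy", "probe_status": "auto", "support_auto": true},
      {"probe_name": "nginx", "probe_status": "auto", "support_auto": true}],
  "is_installed": true,
  "plugin_name": "gala-gopher",
  "resource": [
      {"current_value": "0.0%", "limit_value": null, "name": "cpu"},
      {"current_value": "13 MB", "limit_value": null, "name": "memory"}],
  "status": "active"}]
"""


def probes_by_status(plugin_info):
    """Group probe names of one plug-in entry by their probe_status."""
    grouped = {}
    for item in plugin_info["collect_items"]:
        grouped.setdefault(item["probe_status"], []).append(item["probe_name"])
    return grouped


for plugin in json.loads(SAMPLE_INFO):
    print(plugin["plugin_name"], plugin["status"], probes_by_status(plugin))
```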
      

    5.1.5 aops-ceres collect --host <args>

    • Description: Collects information about the host, such as memory, OS, CPU, and disk information.

    • Parameter example:

        ["mem", "os", "cpu", "disk"]  
      

      Note: mem, os, cpu, and disk are optional. If the input parameter is an empty list, all information is obtained.

    • Execution result example:

      {
          "cpu": {
              "architecture": "aarch64",
              "core_count": "128",
              "l1d_cache": "8 MiB (128 instances)",
              "l1i_cache": "8 MiB (128 instances)",
              "l2_cache": "64 MiB (128 instances)",
              "l3_cache": "128 MiB (4 instances)",
              "model_name": "Kunpeng-920",
              "vendor_id": "HiSilicon"
            },
          "memory": {
              "info": [
                  {
                      "manufacturer": "Hynix",
                      "size": "16 GB",
                      "speed": "2933 MT/s",
                      "type": "DDR4"
                  },
                  {
                      "manufacturer": "Hynix",
                      "size": "16 GB",
                      "speed": "2933 MT/s",
                      "type": "DDR4"
                  }
              ],
              "size": "32G",
              "total": 2
            },
          "os": {
              "bios_version": "1.82",
              "kernel": "5.10.0-60.18.0.50",
              "os_version": "openEuler 22.03 LTS"   
            },
          "disk": [
              {
                  "capacity": "xxGB",
                  "model": "xxxxxx"
              }
            ]
      }
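
    Most values in the result are strings, so they need converting before they can be used in calculations. The following Python sketch builds a one-line summary from a condensed copy of the example above. It assumes that each memory module size has the form "<integer> GB", as in the example; summarize_host is a hypothetical helper.

```python
import json

SAMPLE_HOST_INFO = """
{"cpu": {"architecture": "aarch64", "core_count": "128",
         "model_name": "Kunpeng-920", "vendor_id": "HiSilicon"},
 "memory": {"info": [{"manufacturer": "Hynix", "size": "16 GB",
                      "speed": "2933 MT/s", "type": "DDR4"},
                     {"manufacturer": "Hynix", "size": "16 GB",
                      "speed": "2933 MT/s", "type": "DDR4"}],
            "size": "32G", "total": 2},
 "os": {"bios_version": "1.82", "kernel": "5.10.0-60.18.0.50",
        "os_version": "openEuler 22.03 LTS"}}
"""


def summarize_host(info):
    """Build a one-line summary from the cpu, memory, and os sections."""
    cpu = info["cpu"]
    # core_count is reported as a string in the example, so convert it.
    cores = int(cpu["core_count"])
    # Assumes every module size looks like "16 GB".
    dimm_gb = sum(int(dimm["size"].split()[0]) for dimm in info["memory"]["info"])
    return "{} ({} cores, {}), {} GB RAM in {} modules, {}".format(
        cpu["model_name"], cores, cpu["architecture"],
        dimm_gb, info["memory"]["total"], info["os"]["os_version"],
    )


print(summarize_host(json.loads(SAMPLE_HOST_INFO)))
```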
      

    5.1.6 aops-ceres collect --application

    • Description: Collects running applications in the target application collection. Currently, the target application collection contains MySQL, Kubernetes, Hadoop, Nginx, Docker, and gala-gopher.

    • Execution result example:

        ["gala-gopher", "mysql"]
      

    5.1.7 aops-ceres collect --file <args>

    • Description: Collects information such as the content, permission, and owner of the target configuration file. Currently, only UTF-8 encoded text files smaller than 1 MB and without execute permission can be read.

    • args: List of the full paths of the files to be collected

      Example:

      [ "/home/test.conf", "/home/test.ini", "/home/test.json"]
      
    • Execution result example:

      {
          "infos": [
              {
                  "content": "this is a test file",   // Text file content
                  "file_attr": {  // File attributes
                      "group": "root", // File owner group
                      "mode": "0644", // Permissions on the file
                      "owner": "root" // File owner
                  },
                  "path": "/home/test.txt" // Path of the file
              }
          ],
          "success_files": [
              "/home/test.txt"
          ],
          "fail_files": []
      }
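
    The restrictions in the description (UTF-8 text, smaller than 1 MB, no execute permission) can be checked in advance. The following Python sketch shows one way to do so. It is not the aops-ceres implementation; is_collectable and file_attr are hypothetical names, and the sketch assumes that 1 MB means 1024 × 1024 bytes.

```python
import os
import stat
import tempfile

MAX_SIZE = 1024 * 1024  # assumed meaning of the 1 MB limit


def is_collectable(path):
    """Return True if path is a UTF-8 text file under 1 MB without execute bits."""
    try:
        info = os.stat(path)
    except OSError:
        return False
    if not stat.S_ISREG(info.st_mode) or info.st_size >= MAX_SIZE:
        return False
    if info.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return False
    try:
        with open(path, encoding="utf-8") as handle:
            handle.read()
    except (OSError, UnicodeDecodeError):
        return False
    return True


def file_attr(path):
    """Return the permission bits as an octal string, like the 'mode' field."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "04o")


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "test.conf")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("this is a test file")
    os.chmod(demo_path, 0o644)
    plain_ok = is_collectable(demo_path)
    plain_mode = file_attr(demo_path)
    # Adding execute permission makes the file ineligible.
    os.chmod(demo_path, 0o755)
    executable_ok = is_collectable(demo_path)

print(plain_ok, plain_mode, executable_ok)
```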
      

    5.1.8 aops-ceres apollo --set-repo <args>

    • Description: Sets the repo source.

    • Request parameter example

      {
          "repo_info":{
              "name":"update",
              "dest":"/etc/yum.repos.d/aops-update.repo",
              "repo_content":"repo content"
          },
          "check_items":[],
          "check":false
      }
      
    • Execution result example

      {
       "code": "Succeed",
       "msg": "operate success"
      }
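
    The repo_content field carries the full text of the repo file, so embedded line breaks must be escaped in the JSON string. The following Python sketch lets json.dumps handle the escaping. The repository definition and its baseurl are hypothetical placeholders, and build_set_repo_args is not an aops-ceres API.

```python
import json


def build_set_repo_args(name, dest, repo_content):
    """Return the JSON string passed to 'aops-ceres apollo --set-repo'."""
    request = {
        "repo_info": {"name": name, "dest": dest, "repo_content": repo_content},
        "check_items": [],
        "check": False,
    }
    return json.dumps(request)


# Hypothetical repository definition; replace baseurl with a real mirror.
repo_content = "\n".join([
    "[update]",
    "name=update",
    "baseurl=http://repo.example.com/update/$basearch/",
    "enabled=1",
])

args = build_set_repo_args(
    "update", "/etc/yum.repos.d/aops-update.repo", repo_content
)
print(args)
```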
      

    5.1.9 aops-ceres apollo --scan <args>

    • Description: Scans CVEs.

    • Request parameter example

      {
          "check":false,
          "check_items":[],
          "basic":true // Currently, only true is supported.
      }
      
    • Execution result example

      {
        "code": "Succeed",
        "msg": "operate success",
        "result": {
          "cves":[
            {
              "cve_id": "CVE-2022-4904", 
              "hotpatch": false
            },
            {
              "cve_id": "CVE-2022-25308", 
              "hotpatch": false
            }
          ],
          "installed_packages":[
            {
              "name": "zip", 
              "version": "3.0-30"
            },
            {
              "name": "python-sqlalchemy", 
              "version": "1.3.24-1"
            }
          ],
          "os_version":"openEuler 22.03 LTS"
        }
      }
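
    The cves list in a scan result has the same entry shape as the cves list in the --fix request described in 5.1.10, so one can be derived from the other. The following Python sketch does this for a condensed copy of the example above. build_fix_request is a hypothetical helper, and whether every scanned CVE should be fixed is a decision for the administrator.

```python
import json


def build_fix_request(scan_result):
    """Turn the 'cves' list of a scan result into a --fix request dict."""
    cves = [
        {"cve_id": cve["cve_id"], "hotpatch": cve["hotpatch"]}
        for cve in scan_result["result"]["cves"]
    ]
    return {"check_items": [], "check": False, "cves": cves}


sample_scan = json.loads("""
{"code": "Succeed", "msg": "operate success",
 "result": {"cves": [{"cve_id": "CVE-2022-4904", "hotpatch": false},
                     {"cve_id": "CVE-2022-25308", "hotpatch": false}],
            "installed_packages": [{"name": "zip", "version": "3.0-30"}],
            "os_version": "openEuler 22.03 LTS"}}
""")

fix_args = json.dumps(build_fix_request(sample_scan))
print(fix_args)
```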
      

    5.1.10 aops-ceres apollo --fix <args>

    • Description: Repairs CVEs.

    • Request parameter example

        {
          "check_items": [],
          "check": false,
          "cves": [
              {
                "cve_id": "CVE-2021-3781",
                "hotpatch": true
              },
              {
                "cve_id": "CVE-2021-3782",
                "hotpatch": true
              }
            ]
        }
      
    • Execution result example

      {
        "code": "Succeed",
        "msg": "operate success",
        "result": [
          {
            "cve_id": "CVE-2021-3782",
            "log": "fix succeed",
            "result": "succeed"
          },
          {
            "cve_id": "CVE-2021-3781",
            "log": "fix succeed",
            "result": "succeed"
          }
        ]
      }
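
    The top-level code reports the outcome of the whole operation, while each entry in result reports the outcome for one CVE. The following Python sketch lists the CVEs that were not repaired. It assumes that a failed entry carries a result value other than "succeed", which the example above does not show; failed_cves is a hypothetical helper.

```python
import json


def failed_cves(fix_response):
    """Return (cve_id, log) pairs for entries whose result is not 'succeed'."""
    return [
        (entry["cve_id"], entry["log"])
        for entry in fix_response.get("result", [])
        if entry["result"] != "succeed"
    ]


sample_fix = json.loads("""
{"code": "Succeed", "msg": "operate success",
 "result": [{"cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed"},
            {"cve_id": "CVE-2021-3781", "log": "fix succeed", "result": "succeed"}]}
""")
print(failed_cves(sample_fix))
```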
      

    5.1.11 aops-ceres register [-f <file>] [-d <register_host_info>]

    • Description: Registers with aops-zeus.

    • Request parameter example

        {
            "ssh_user": "root",           //Remote login user name
            "password": "password",       //User password
            "ssh_port":22,                //Remote login port
            "zeus_ip": "127.0.0.1",       //IP address of the host where aops-zeus is running
            "zeus_port": 11111,           //aops-zeus port
            "host_name": "host_name",     //Host name to be registered
            "host_group_name": "aops",    //Existing host group name
            "management": false,          //Whether to register as a management host
            "access_token": "token-string"//Management user token obtained after login
        }
      

    FAQs

    1. If an error is reported, view the /var/log/aops/aops.log file, rectify the fault based on the error message in the log file, and restart the service.

    2. You are advised to run aops-ceres with Python 3.7 or later. When installing the Python dependency libraries, make sure that their versions are compatible.

    3. The value of access_token can be obtained from the /etc/aops/agent.conf file after the registration is complete.

    4. The args of all commands are JSON strings.

    5. To limit the CPU and memory resources of a plug-in, add MemoryHigh and CPUQuota to the Service section in the service file corresponding to the plug-in.

      For example, set MemoryHigh of gala-gopher to 40M and CPUQuota to 20%.

      [Unit]
      Description=a-ops gala gopher service
      After=network.target
      
      [Service]
      Type=exec
      ExecStart=/usr/bin/gala-gopher
      Restart=on-failure
      RestartSec=1
      RemainAfterExit=yes
      ;Limit the maximum memory that can be used by processes in the unit. The limit can be exceeded. However, after the limit is exceeded, the process running speed is limited, and the system reclaims the excess memory as much as possible.
      ;The option value can be an absolute memory size in bytes (K, M, G, or T suffix based on 1024) or a relative memory size in percentage.
      MemoryHigh=40M
      ;Set the CPU time limit for the processes of this unit. The value must be a percentage ending with %, indicating the maximum percentage of the total time that the unit can use a single CPU.
      CPUQuota=20%
      
      [Install]
      WantedBy=multi-user.target
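
    To make the two values concrete, the following Python sketch converts them to plain numbers according to the rules in the comments above (suffixes based on 1024, percentage of a single CPU). It handles only absolute MemoryHigh values, not percentages, and both function names are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def memory_high_bytes(value):
    """Convert an absolute MemoryHigh value such as '40M' to bytes."""
    value = value.strip()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def cpu_quota_fraction(value):
    """Convert a CPUQuota value such as '20%' to a fraction of one CPU."""
    value = value.strip()
    if not value.endswith("%"):
        raise ValueError("CPUQuota must end with %")
    return float(value[:-1]) / 100.0


print(memory_high_bytes("40M"), cpu_quota_fraction("20%"))
```

    With the settings above, gala-gopher is throttled once it uses more than 41943040 bytes of memory and is limited to one fifth of a single CPU.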
      
