      1 Introduction to aops-ceres

      As the client of the A-Ops module, aops-ceres exchanges data with the A-Ops management center through HTTP and provides functions such as collecting host information, managing data collection tools (such as gala-gopher), and responding to and processing commands delivered by the management center.

      2 Environment Requirements

      You are advised to use openEuler 22.03 LTS or later.

      3 Environment Deployment

      3.1 Deploying aops-ceres

      1. Use Yum to install aops-ceres.

        yum install aops-ceres

      2. Modify the configuration file.

        vim /etc/aops/ceres.conf

        [ceres]
        ;Start aops-ceres and bind the IP address and port number.
        ip=192.168.1.3
        port=12000
        
        [gopher]
        ;Default path of the gala-gopher configuration file. If you need to change the path, ensure that the file path is correct.
        config_path=/opt/gala-gopher/gala-gopher.conf
        
        ;aops-ceres log collection configuration
        [log]
        ;Level of the logs to be collected, which can be set to DEBUG, INFO, WARNING, ERROR, or CRITICAL
        log_level=INFO
        ;Location for storing collected logs
        log_dir=/var/log/aops
        ;Maximum size of a log file
        max_bytes=31457280
        ;Number of backup logs
        backup_count=40
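
Before starting aops-ceres, it can help to sanity-check the edited file. The following is a minimal sketch, not part of aops-ceres: it uses Python's standard configparser (which treats lines starting with ; as comments) on an inline copy of the sample above. The function name check_ceres_conf and the specific checks are assumptions chosen for illustration.

```python
import configparser

# Inline copy of the sample configuration shown above (comments omitted).
SAMPLE_CONF = """
[ceres]
ip=192.168.1.3
port=12000

[gopher]
config_path=/opt/gala-gopher/gala-gopher.conf

[log]
log_level=INFO
log_dir=/var/log/aops
max_bytes=31457280
backup_count=40
"""

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def check_ceres_conf(text):
    """Return a list of problems found in the configuration text (empty if none)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = []
    for section in ("ceres", "gopher", "log"):
        if not parser.has_section(section):
            problems.append(f"missing section [{section}]")
    if problems:
        return problems
    port = parser.getint("ceres", "port")
    if not 1 <= port <= 65535:
        problems.append(f"port out of range: {port}")
    level = parser.get("log", "log_level")
    if level not in VALID_LEVELS:
        problems.append(f"unknown log_level: {level}")
    if parser.getint("log", "max_bytes") <= 0:
        problems.append("max_bytes must be positive")
    return problems


print(check_ceres_conf(SAMPLE_CONF))
# max_bytes is given in bytes; the sample value corresponds to 30 MiB.
print(31457280 / (1024 * 1024))
```

To check the real file, the same function could be given the text read from /etc/aops/ceres.conf.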
        

      3.2 Registering with aops-zeus

      To identify users and prevent unauthorized API calls, aops-ceres authenticates requests with tokens, which also reduces the load on the hosts where it is deployed.

      Before registering, obtain the token of the management user, then run the register command on the aops-ceres host to register with aops-zeus using that token. Because aops-ceres has no database of its own, the token is automatically saved to a specified file after a successful registration, and the registration result is displayed on the GUI. The local host information is also saved to the aops-zeus database for subsequent management.

      1. Change the values of the data items to the actual values by referring to the /opt/aops/register_example.json template file.

        {
        
        "ssh_user": "root", // Remote login user name
        "password": "password", // User password
        "ssh_port":22,  // Remote login port
        "zeus_ip": "192.168.1.2", // IP address of the host where aops-zeus is running
        "zeus_port": 11111, // aops-zeus port
        "host_name": "host_name", // Host name to be registered
        "host_group_name": "aops", // Existing host group name
        "management": false, // Whether to register as a management host
        "access_token": "token-string" // Management user token obtained after login
        }
        

        Note: Ensure that aops-zeus is running on the target host, for example, 192.168.1.2, and the registered host group exists.

      2. Run aops-ceres register [-f <FILE>] [-d <register_host_info>].

      3. The registration result is displayed on the GUI. If the registration is successful, the token character string is saved to a specified file. If the registration fails, locate the fault based on the message and log (/var/log/aops/aops.log).

      The following is an example of the registration result:

      Registration successful:

      $ aops-ceres register -f register.json
      Register Success
      

      Registration failed (in this example, because aops-zeus failed to start):

      $ aops-ceres register -f register.json
      Register Fail
      

      Log content:

      2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.2', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff0504ce4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
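
The "Connection refused" error in the log above means that nothing was listening on the aops-zeus address and port. As a rough pre-check before running the register command, a plain TCP connection attempt can be made. This sketch uses only the Python standard library; the helper name can_connect is an assumption and is not provided by aops-ceres.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the values from the example, can_connect("192.168.1.2", 11111) returning False would correspond to the failure shown in the log. A successful TCP connection only shows that the port is open, not that aops-zeus is healthy.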
      

      4 Plug-in Support

      4.1 gala-gopher

      4.1.1 Introduction

      gala-gopher is a low-load probe framework based on eBPF. It can be used to monitor the CPU, memory, and network status of hosts and collect data. You can configure the collection status of existing probes based on service requirements.

      4.1.2 Deployment

      1. Run yum install gala-gopher to install gala-gopher.
      2. Enable probes based on service requirements. You can view information about probes in /opt/gala-gopher/gala-gopher.conf.
      3. Run systemctl start gala-gopher to start the gala-gopher service.

      4.1.3 More

      For more information about gala-gopher, see https://gitee.com/openeuler/gala-gopher/blob/master/README.md.

      5 Command Support

      5.1 List of Commands

      aops-ceres COMMAND [options]
      List of Main Commands:
      plugin          manage plugin
      collect         collect some information
      apollo          cve/bugfix related action
      
      General plugin options:
      --start <args>
      --stop  <args>
      --change-collect-items <args>
      --info <args>
      
      General collect options:
      --file <args>
      --application <args>
      --host <args>
      
      General apollo options:
      --set-repo <args>
      --scan <args>
      --fix <args>
      

      5.1.1 aops-ceres plugin --start <args>

      • Description: Starts the plug-in that is installed but not running. Currently, only the gala-gopher plug-in is supported.

      • args: gala-gopher

      5.1.2 aops-ceres plugin --stop <args>

      • Description: Stops a running plug-in. Currently, only the gala-gopher plug-in is supported.

      • args: gala-gopher

      5.1.3 aops-ceres plugin --change-collect-items <args>

      • Description: Changes the collection status of the plug-in collection items. Currently, only the status of the gala-gopher collection items can be changed. For the gala-gopher collection items, see /opt/gala-gopher/gala-gopher.conf.

      • Parameter example:

        {
            "gala-gopher":{
                "redis":"auto",
                "system_inode":"on",
                "tcp":"on",
                "haproxy":"auto"
            }
        } 
        
      • Execution result example:

        {
            "resp": {
                "gala-gopher": {
                    "failure": [
                        "redis"
                    ],
                    "success": [
                        "system_inode",
                        "tcp",
                        "haproxy"
                    ]
                }
            }
        }
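
Because the parameter is a JSON string, it is easy to get the quoting wrong when typing it by hand. The sketch below builds the command line from a Python dict and extracts the failed items from a result shaped like the example above. It assumes that the JSON is passed as a single command-line argument; the helper names are hypothetical and the command itself is only printed, not executed.

```python
import json
import shlex


def build_change_items_command(items):
    """Build the argument list for changing gala-gopher collection items."""
    payload = json.dumps({"gala-gopher": items})
    return ["aops-ceres", "plugin", "--change-collect-items", payload]


def failed_items(result_text, plugin="gala-gopher"):
    """Return the collection items listed under "failure" for a plug-in."""
    result = json.loads(result_text)
    return result.get("resp", {}).get(plugin, {}).get("failure", [])


argv = build_change_items_command({"redis": "auto", "tcp": "on"})
# shlex.quote wraps the JSON so a shell treats it as one argument.
print(" ".join(shlex.quote(part) for part in argv))

sample_result = '{"resp": {"gala-gopher": {"failure": ["redis"], "success": ["tcp"]}}}'
print(failed_items(sample_result))
```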
        

      5.1.4 aops-ceres plugin --info

      • Description: Queries plug-in information, including whether the plug-in is installed, its running status, the status of its collection items, and its resource usage.

      • Execution result example:

        [
          {
              "collect_items": [
                  {
                      "probe_name": "system_tcp",
                      "probe_status": "off",
                      "support_auto": false
                  },
                  {
                      "probe_name": "haproxy",
                      "probe_status": "auto",
                      "support_auto": true
                  },
                  {
                      "probe_name": "nginx",
                      "probe_status": "auto",
                      "support_auto": true
                  }
              ],
              "is_installed": true,
              "plugin_name": "gala-gopher",
              "resource": [
                  {
                      "current_value": "0.0%",
                      "limit_value": null,
                      "name": "cpu"
                  },
                  {
                      "current_value": "13 MB",
                      "limit_value": null,
                      "name": "memory"
                  }
                ],
              "status": "active"
            }
          ]
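
A result like the one above can be condensed into a per-plug-in summary. The following sketch groups probe names by their probe_status. It works on a shortened inline copy of the example output; the function name summarize_plugins is an assumption for illustration.

```python
import json

# Shortened copy of the example output above.
SAMPLE_INFO = """
[
  {
    "collect_items": [
      {"probe_name": "system_tcp", "probe_status": "off", "support_auto": false},
      {"probe_name": "haproxy", "probe_status": "auto", "support_auto": true}
    ],
    "is_installed": true,
    "plugin_name": "gala-gopher",
    "resource": [
      {"current_value": "0.0%", "limit_value": null, "name": "cpu"},
      {"current_value": "13 MB", "limit_value": null, "name": "memory"}
    ],
    "status": "active"
  }
]
"""


def summarize_plugins(text):
    """Map each plug-in name to its status and its probes grouped by status."""
    summary = {}
    for plugin in json.loads(text):
        probes = {}
        for item in plugin.get("collect_items", []):
            probes.setdefault(item["probe_status"], []).append(item["probe_name"])
        summary[plugin["plugin_name"]] = {
            "status": plugin.get("status"),
            "probes": probes,
        }
    return summary


print(summarize_plugins(SAMPLE_INFO))
```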
        

      5.1.5 aops-ceres collect --host <args>

      • Description: Collects information about the host, such as the CPU, memory, OS, and disks.

      • Parameter example:

          ["mem", "os", "cpu", "disk"]  
        

        Note: mem, os, cpu, and disk are optional. If the input parameter is an empty list, all information is obtained.

      • Execution result example:

        {
            "cpu": {
                "architecture": "aarch64",
                "core_count": "128",
                "l1d_cache": "8 MiB (128 instances)",
                "l1i_cache": "8 MiB (128 instances)",
                "l2_cache": "64 MiB (128 instances)",
                "l3_cache": "128 MiB (4 instances)",
                "model_name": "Kunpeng-920",
                "vendor_id": "HiSilicon"
              },
            "memory": {
                "info": [
                    {
                        "manufacturer": "Hynix",
                        "size": "16 GB",
                        "speed": "2933 MT/s",
                        "type": "DDR4"
                    },
                    {
                        "manufacturer": "Hynix",
                        "size": "16 GB",
                        "speed": "2933 MT/s",
                        "type": "DDR4"
                    }
                ],
                "size": "32G",
                "total": 2
              },
            "os": {
                "bios_version": "1.82",
                "kernel": "5.10.0-60.18.0.50",
                "os_version": "openEuler 22.03 LTS"   
              },
            "disk": [
                {
                    "capacity": "xxGB",
                    "model": "xxxxxx"
                }
              ]
        }
        

      5.1.6 aops-ceres collect --application

      • Description: Collects running applications in the target application collection. Currently, the target application collection contains MySQL, Kubernetes, Hadoop, Nginx, Docker, and gala-gopher.

      • Execution result example:

          ["gala-gopher", "mysql"]
        

      5.1.7 aops-ceres collect --file <args>

      • Description: Collects information such as the content, permission, and owner of the target configuration file. Currently, only UTF-8 encoded text files smaller than 1 MB and without execute permission can be read.

      • args: List of the full paths of the files to be collected

        Example:

        [ "/home/test.conf", "/home/test.ini", "/home/test.json"]
        
      • Execution result example:

        {
            "infos": [
                {
                    "content": "this is a test file",   // Text file content
                    "file_attr": {  // File attributes
                        "group": "root", // File owner group
                        "mode": "0644", // Permissions on the file
                        "owner": "root" // File owner
                    },
                    "path": "/home/test.txt" // Path of the file
                }
            ],
            "success_files": [
                "/home/test.txt"
            ],
            "fail_files": [
                "/home/test.txt"
            ]
        }
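
The constraints in the description (UTF-8 text, smaller than 1 MB, no execute permission) can be checked locally before a file is requested. This is a sketch of such a pre-check under the assumption that "1 MB" means 1024 * 1024 bytes; it is not the check that aops-ceres itself performs, and is_collectable is a hypothetical helper.

```python
import os
import stat

MAX_SIZE = 1024 * 1024  # Assumed interpretation of the 1 MB limit.


def is_collectable(path):
    """Return (ok, reason) for whether a file meets the stated constraints."""
    if not os.path.isfile(path):
        return False, "not a regular file"
    info = os.stat(path)
    if info.st_size >= MAX_SIZE:
        return False, "file is 1 MB or larger"
    if info.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return False, "file has execute permission"
    try:
        with open(path, encoding="utf-8") as handle:
            handle.read()
    except UnicodeDecodeError:
        return False, "file is not valid UTF-8 text"
    return True, "ok"
```

Paths that fail this check would be expected to appear in fail_files rather than success_files.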
        

      5.1.8 aops-ceres apollo --set-repo <args>

      • Description: Sets the repo source.

      • Request parameter example:

        {
            "repo_info":{
                "name":"update",
                "dest":"/etc/yum.repos.d/aops-update.repo",
                "repo_content":"repo content"
            },
            "check_items":[],
            "check":false
        }
        
      • Execution result example:

        {
         "code": "Succeed",
         "msg": "operate success"
        }
        

      5.1.9 aops-ceres apollo --scan <args>

      • Description: Scans CVEs.

      • Request parameter example:

        {
            "check":false,
            "check_items":[],
            "basic":true // Currently, only true is supported.
        }
        
      • Execution result example:

        {
          "code": "Succeed",
          "msg": "operate success",
          "result": {
            "cves":[
              {
                "cve_id": "CVE-2022-4904", 
                "hotpatch": false
              },
              {
                "cve_id": "CVE-2022-25308", 
                "hotpatch": false
              }
            ],
            "installed_packages":[
              {
                "name": "zip",
                "version": "3.0-30"
              },
              {
                "name": "python-sqlalchemy",
                "version": "1.3.24-1"
              }
            ],
            "os_version":"openEuler 22.03 LTS"
          }
        }
        

      5.1.10 aops-ceres apollo --fix <args>

      • Description: Repairs CVEs.

      • Request parameter example:

          {
            "check_items": [],
            "check": false,
            "cves": [
                {
                  "cve_id": "CVE-2021-3781",
                  "hotpatch": true
                },
                {
                  "cve_id": "CVE-2021-3782",
                  "hotpatch": true
                }
              ]
          }
        
      • Execution result example:

        {
          "code": "Succeed",
          "msg": "operate success",
          "result": [
            {
              "cve_id": "CVE-2021-3782",
              "log": "fix succeed",
              "result": "succeed"
            },
            {
              "cve_id": "CVE-2021-3781",
              "log": "fix succeed",
              "result": "succeed"
            }
          ]
        }
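
The scan result and the fix request share the cves list format, so one can be derived from the other. The sketch below does this and also lists the CVEs whose fix did not succeed. Two assumptions are made: the hotpatch value from the scan result is passed through unchanged, and any result value other than "succeed" counts as a failure (the "fail" value in the sample is hypothetical). Neither function is part of aops-ceres.

```python
import json


def build_fix_request(scan_result_text):
    """Turn a scan result into a request body shaped like the --fix example."""
    scan = json.loads(scan_result_text)
    if scan.get("code") != "Succeed":
        raise ValueError("scan did not succeed: " + str(scan.get("msg")))
    cves = [
        {"cve_id": item["cve_id"], "hotpatch": item["hotpatch"]}
        for item in scan["result"]["cves"]
    ]
    return {"check_items": [], "check": False, "cves": cves}


def failed_fixes(fix_result_text):
    """Return the IDs of CVEs whose per-CVE result is not "succeed"."""
    fix = json.loads(fix_result_text)
    return [
        item["cve_id"]
        for item in fix.get("result", [])
        if item.get("result") != "succeed"
    ]


sample_scan = json.dumps({
    "code": "Succeed",
    "msg": "operate success",
    "result": {
        "cves": [{"cve_id": "CVE-2022-4904", "hotpatch": False}],
        "installed_packages": [{"name": "zip", "version": "3.0-30"}],
        "os_version": "openEuler 22.03 LTS",
    },
})
fix_request = build_fix_request(sample_scan)
print(json.dumps(fix_request))

# The second entry uses a hypothetical "fail" value for illustration only.
sample_fix = json.dumps({
    "code": "Succeed",
    "msg": "operate success",
    "result": [
        {"cve_id": "CVE-2021-3782", "log": "fix succeed", "result": "succeed"},
        {"cve_id": "CVE-2021-3781", "log": "hypothetical error", "result": "fail"},
    ],
})
print(failed_fixes(sample_fix))
```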
        

      5.1.11 aops-ceres register [-f <file>] [-d <register_host_info>]

      • Description: Registers with aops-zeus.

      • Request parameter example:

          {
              "ssh_user": "root",           //Remote login user name
              "password": "password",       //User password
              "ssh_port":22,                //Remote login port
              "zeus_ip": "127.0.0.1",       //IP address of the host where aops-zeus is running
              "zeus_port": 11111,           //aops-zeus port
              "host_name": "host_name",     //Host name to be registered
              "host_group_name": "aops",    //Existing host group name
              "management": false,          //Whether to register as a management host
              "access_token": "token-string"//Management user token obtained after login
          }
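
The // comments in the example are explanatory and are not valid JSON, so a file generated by a JSON library will not contain them. The following sketch writes a registration file from a Python dict after checking that every field from the example is present. It assumes that the file passed with -f is plain JSON with exactly these fields; write_register_file is a hypothetical helper, and the values are the placeholders from the example.

```python
import json
import os
import tempfile

# Field names taken from the request parameter example above.
REQUIRED_KEYS = {
    "ssh_user", "password", "ssh_port", "zeus_ip", "zeus_port",
    "host_name", "host_group_name", "management", "access_token",
}


def write_register_file(info, path):
    """Validate the registration fields and write them to path as plain JSON."""
    missing = sorted(REQUIRED_KEYS - set(info))
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    # The file holds a password and a token, so create it readable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(info, handle, indent=4)


host_info = {
    "ssh_user": "root",
    "password": "password",
    "ssh_port": 22,
    "zeus_ip": "127.0.0.1",
    "zeus_port": 11111,
    "host_name": "host_name",
    "host_group_name": "aops",
    "management": False,
    "access_token": "token-string",
}

with tempfile.TemporaryDirectory() as workdir:
    register_path = os.path.join(workdir, "register.json")
    write_register_file(host_info, register_path)
    with open(register_path, encoding="utf-8") as handle:
        written = json.load(handle)

print(written == host_info)
```

The resulting file could then be passed to aops-ceres register -f.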
        

      FAQs

      1. If an error is reported, view the /var/log/aops/aops.log file, rectify the fault based on the error message in the log file, and restart the service.

      2. You are advised to run aops-ceres with Python 3.7 or later. When installing the Python dependency libraries, make sure that their versions are compatible.

      3. The value of access_token can be obtained from the /etc/aops/agent.conf file after the registration is complete.

      4. args of all the commands are JSON strings.

      5. To limit the CPU and memory resources of a plug-in, add MemoryHigh and CPUQuota to the Service section in the service file corresponding to the plug-in.

        For example, set MemoryHigh of gala-gopher to 40M and CPUQuota to 20%.

        [Unit]
        Description=a-ops gala gopher service
        After=network.target
        
        [Service]
        Type=exec
        ExecStart=/usr/bin/gala-gopher
        Restart=on-failure
        RestartSec=1
        RemainAfterExit=yes
        ;Limit the maximum memory that can be used by processes in the unit. The limit can be exceeded. However, after the limit is exceeded, the process running speed is limited, and the system reclaims the excess memory as much as possible.
        ;The option value can be an absolute memory size in bytes (K, M, G, or T suffix based on 1024) or a relative memory size in percentage.
        MemoryHigh=40M
        ;Set the CPU time limit for the processes of this unit. The value must be a percentage ending with %, indicating the maximum percentage of the total time that the unit can use a single CPU.
        CPUQuota=20%
        
        [Install]
        WantedBy=multi-user.target
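
To make the suffix rule for MemoryHigh concrete, the sketch below converts an absolute size such as 40M to bytes using the base-1024 suffixes described in the comment above. It is an illustration only and does not reproduce the parser used by systemd; parse_memory_size is a hypothetical helper, and percentage values are rejected because they depend on the total memory of the machine.

```python
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_memory_size(value):
    """Convert an absolute size such as "40M" to bytes (suffixes are base 1024)."""
    value = value.strip()
    if not value:
        raise ValueError("empty value")
    if value.endswith("%"):
        raise ValueError("relative (percentage) values depend on total memory")
    suffix = value[-1]
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value)


print(parse_memory_size("40M"))  # 41943040
```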
        
