      Installation and Deployment

      Overview

      This chapter describes how to install and deploy the Rubik component.

      Software and Hardware Requirements

      Hardware

      • Architecture: x86 or AArch64
      • Drive: 1 GB or more
      • Memory: 100 MB or more
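
      The hardware requirements above can be checked on a node before installation. The following is a minimal Python sketch and is not part of Rubik. The function name meets_requirements is hypothetical, "x86" is assumed to mean the x86_64 machine type, the drive requirement is assumed to refer to free space on the root file system, and GB and MB are assumed to be binary units.

```python
import os
import platform
import shutil

# Assumptions: "x86" is read as x86_64, and GB/MB are treated as binary units.
SUPPORTED_ARCHS = {"x86_64", "aarch64"}
MIN_DISK_BYTES = 1 * 1024**3
MIN_MEMORY_BYTES = 100 * 1024**2


def meets_requirements(arch: str, free_disk_bytes: int, memory_bytes: int) -> list:
    """Return a list of unmet requirements; an empty list means all checks pass."""
    problems = []
    if arch.lower() not in SUPPORTED_ARCHS:
        problems.append(f"unsupported architecture: {arch}")
    if free_disk_bytes < MIN_DISK_BYTES:
        problems.append("less than 1 GB of free drive space")
    if memory_bytes < MIN_MEMORY_BYTES:
        problems.append("less than 100 MB of memory")
    return problems


if __name__ == "__main__":
    # Gather the values for the local machine (Linux-style sysconf names).
    free_disk = shutil.disk_usage("/").free
    memory = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    print(meets_requirements(platform.machine(), free_disk, memory) or "all checks pass")
```

      Keeping the check as a pure function of three values makes it easy to reuse with numbers collected from remote nodes.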

      Software

      • OS: openEuler 22.03-LTS
      • Kernel: openEuler 22.03-LTS kernel

      Environment Preparation

      • Install the openEuler OS. For details, see the openEuler Installation Guide.
      • Install and deploy Kubernetes. For details, see the Kubernetes Cluster Deployment Guide.
      • Install the Docker or iSulad container engine. If the iSulad container engine is used, you need to install the isula-build container image building tool.

      Installing Rubik

      Rubik is deployed on each Kubernetes node as a DaemonSet. Therefore, you need to perform the following steps to install the Rubik RPM package on each node.

      1. Configure the Yum repositories openEuler 22.03-LTS and openEuler 22.03-LTS:EPOL, for example in a .repo file in the /etc/yum.repos.d/ directory (the Rubik component is available only in the EPOL repository). Each repository needs its own section header.

        # openEuler 22.03-LTS official repository
        [openEuler22.03]
        name=openEuler22.03
        baseurl=https://repo.openeuler.org/openEuler-22.03-LTS/everything/$basearch/
        enabled=1
        gpgcheck=1
        gpgkey=https://repo.openeuler.org/openEuler-22.03-LTS/everything/$basearch/RPM-GPG-KEY-openEuler

        # openEuler 22.03-LTS:EPOL official repository
        [Epol]
        name=Epol
        baseurl=https://repo.openeuler.org/openEuler-22.03-LTS/EPOL/$basearch/
        enabled=1
        gpgcheck=0
        
      2. Install Rubik with root permissions.

        sudo yum install -y rubik
        

      Note:

      Files related to Rubik are installed in the /var/lib/rubik directory.

      Deploying Rubik

      Rubik runs as a container in a Kubernetes cluster in hybrid deployment scenarios. It isolates and restricts resources for services with different priorities so that offline services do not interfere with online services. This improves overall resource utilization while protecting the quality of online services. Currently, Rubik supports isolation and restriction of CPU and memory resources, and must be used together with the openEuler 22.03-LTS kernel.

      To enable or disable the memory priority feature (that is, memory tiering for services with different priorities), write 0 or 1 to the /proc/sys/vm/memcg_qos_enable file. The default value 0 disables the feature, and the value 1 enables it. For example, to enable the feature:

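      # The redirection in "sudo echo 1 > file" is performed by the unprivileged shell and fails, so tee is used instead.
      echo 1 | sudo tee /proc/sys/vm/memcg_qos_enable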
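
      The same switch can be read and set from a script. The following is a minimal Python sketch under stated assumptions: the helper names read_memcg_qos and set_memcg_qos are hypothetical, and writing to the real file requires root permissions and a kernel that provides it.

```python
from pathlib import Path

# Path taken from the text above; the file exists only on kernels that support the feature.
MEMCG_QOS_SWITCH = Path("/proc/sys/vm/memcg_qos_enable")


def read_memcg_qos(path: Path = MEMCG_QOS_SWITCH) -> bool:
    """Return True if the file contains 1 (feature enabled), False if it contains 0."""
    value = path.read_text().strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected value {value!r} in {path}")
    return value == "1"


def set_memcg_qos(enabled: bool, path: Path = MEMCG_QOS_SWITCH) -> None:
    """Write 1 or 0 to the switch file (needs root permissions on a real system)."""
    path.write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    if MEMCG_QOS_SWITCH.exists():
        print("memcg_qos_enable =", int(read_memcg_qos()))
    else:
        print(f"{MEMCG_QOS_SWITCH} not found; the running kernel may not provide this switch")
```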
      

      Deploying Rubik DaemonSet

      1. Use the Docker or isula-build engine to build the Rubik image. Because Rubik is deployed as a DaemonSet, every node needs the image. Either build the image on one node and use the docker save and docker load commands to copy it to every other Kubernetes node, or build the image on each node. The following example uses isula-build:

        isula-build ctr-img build -f /var/lib/rubik/Dockerfile --tag rubik:0.1.0 .

      2. On the Kubernetes master node, change the Rubik image name in the /var/lib/rubik/rubik-daemonset.yaml file to the name of the image built in the previous step.

        ...
        containers:
        - name: rubik-agent
          image: rubik:0.1.0  # The image name must be the same as the Rubik image name built in the previous step.
          imagePullPolicy: IfNotPresent
        ...

      3. On the Kubernetes master node, run the kubectl command to deploy the Rubik DaemonSet so that Rubik is automatically deployed on all Kubernetes nodes.

        kubectl apply -f /var/lib/rubik/rubik-daemonset.yaml

      4. Run the kubectl get pods -A command to check whether Rubik has been deployed on each node in the cluster. (The number of rubik-agent pods must equal the number of nodes, and all of them must be in the Running state.)

        $ kubectl get pods -A
        NAMESPACE     NAME                                            READY   STATUS    RESTARTS   AGE
        ...
        kube-system   rubik-agent-76ft6                               1/1     Running   0          4s
        ...
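
      The check in the last step can be automated by parsing the kubectl output. The following is a minimal Python sketch: the function name check_rubik_agents is hypothetical, the column layout is assumed to match the sample output above, and the expected node count is supplied by the caller.

```python
SAMPLE_OUTPUT = """\
NAMESPACE     NAME                 READY   STATUS    RESTARTS   AGE
kube-system   rubik-agent-76ft6    1/1     Running   0          4s
"""


def check_rubik_agents(pods_output: str, node_count: int) -> bool:
    """Return True if there is one rubik-agent pod per node and all are Running."""
    statuses = []
    for line in pods_output.splitlines():
        fields = line.split()
        # Expected columns: NAMESPACE NAME READY STATUS RESTARTS AGE
        if len(fields) >= 4 and fields[1].startswith("rubik-agent-"):
            statuses.append(fields[3])
    return len(statuses) == node_count and all(s == "Running" for s in statuses)


if __name__ == "__main__":
    # One node in the sample, so one Running rubik-agent pod is expected.
    print(check_rubik_agents(SAMPLE_OUTPUT, node_count=1))
```

      On a real cluster, the text passed to the function could come from subprocess.run(["kubectl", "get", "pods", "-A"], capture_output=True, text=True).stdout.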
      

      Common Configuration Description

      When deployed using the preceding method, Rubik starts with the default configuration. To change the configuration, modify the config.json section in the rubik-daemonset.yaml file and then redeploy the Rubik DaemonSet.

      This section describes common configurations in config.json.

      Configuration Item Description

      # The configuration items are in the config.json section of the rubik-daemonset.yaml file.
      {
          "autoConfig": true,
          "autoCheck": false,
          "logDriver": "stdio",
          "logDir": "/var/log/rubik",
          "logSize": 1024,
          "logLevel": "info",
          "cgroupRoot": "/sys/fs/cgroup"
      }
      
      Item        Value Type  Value Range            Description
      autoConfig  Boolean     true or false          true: enables automatic pod awareness. false: disables automatic pod awareness.
      autoCheck   Boolean     true or false          true: enables pod priority check. false: disables pod priority check.
      logDriver   String      stdio or file          stdio: prints logs to the standard output. The scheduling platform collects and dumps logs. file: writes logs to files in the directory specified by logDir.
      logDir      String      Absolute path          Directory for storing logs.
      logSize     Integer     [10, 1048576]          Total size of logs, in MB. If the total size of logs reaches the upper limit, the earliest logs are discarded.
      logLevel    String      error, info, or debug  Log level.
      cgroupRoot  String      Absolute path          cgroup mount point.
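
      The value ranges above can be checked before the DaemonSet is redeployed. The following is a minimal Python sketch and is not part of Rubik. The function name validate_rubik_config is hypothetical, and an absolute path is assumed to be any string that starts with a slash.

```python
import json

DEFAULT_CONFIG = """
{
    "autoConfig": true,
    "autoCheck": false,
    "logDriver": "stdio",
    "logDir": "/var/log/rubik",
    "logSize": 1024,
    "logLevel": "info",
    "cgroupRoot": "/sys/fs/cgroup"
}
"""


def validate_rubik_config(cfg: dict) -> list:
    """Return a list of problems found in cfg; an empty list means every present item is valid."""
    errors = []
    for key in ("autoConfig", "autoCheck"):
        if key in cfg and not isinstance(cfg[key], bool):
            errors.append(f"{key} must be true or false")
    if "logDriver" in cfg and cfg["logDriver"] not in ("stdio", "file"):
        errors.append("logDriver must be stdio or file")
    for key in ("logDir", "cgroupRoot"):
        if key in cfg and not (isinstance(cfg[key], str) and cfg[key].startswith("/")):
            errors.append(f"{key} must be an absolute path")
    if "logSize" in cfg:
        size = cfg["logSize"]
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(size, bool) or not isinstance(size, int) or not 10 <= size <= 1048576:
            errors.append("logSize must be an integer in [10, 1048576]")
    if "logLevel" in cfg and cfg["logLevel"] not in ("error", "info", "debug"):
        errors.append("logLevel must be error, info, or debug")
    return errors


if __name__ == "__main__":
    print(validate_rubik_config(json.loads(DEFAULT_CONFIG)) or "configuration is valid")
```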

      Automatic Configuration of Pod Priorities

      If autoConfig is set to true in config.json to enable automatic pod awareness, you only need to specify the priority using annotations in the YAML file when deploying the service pods. After being deployed successfully, Rubik automatically detects the creation and update of the pods on the current node, and sets the pod priorities based on the configured priorities.

      Pod Priority Configuration Depending on kubelet

      Automatic pod priority configuration depends on the pod creation event notifications from the API server, which have a certain delay. The pod priority cannot be configured before the process is started. As a result, the service performance may fluctuate. To avoid this problem, you can disable the automatic priority configuration option and modify the kubelet source code, so that pod priorities can be configured using Rubik HTTP APIs after the cgroup of each container is created and before each container process is started. For details about how to use the HTTP APIs, see HTTP APIs.

      Automatic Verification of Pod Priorities

      Rubik supports consistency check on the pod QoS priority configurations of the current node during startup. It checks whether the configuration in the Kubernetes cluster is consistent with the pod priority configuration of Rubik. This function is disabled by default. You can enable or disable it using the autoCheck option. If this function is enabled, Rubik automatically verifies and corrects the pod priority configuration of the current node when it is started or restarted.

      Configuring Rubik for Online and Offline Services

      After Rubik is successfully deployed, specify the service type in the YAML file of each service, as shown in the following configuration example. When the service is deployed, Rubik configures its priority and isolates resources accordingly.

      The following is an example of deploying an online Nginx service:

      apiVersion: v1
      kind: Pod
      metadata:
        name: nginx
        namespace: qosexample
        annotations:
          volcano.sh/preemptable: "false"   # If volcano.sh/preemptable is set to true, the service is an offline service. If it is set to false, the service is an online service. The default value is false.
      spec:
        containers:
        - name: nginx
          image: nginx
          resources:
            limits:
              memory: "200Mi"
              cpu: "1"
            requests:
              memory: "200Mi"
              cpu: "1"
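
      The only difference between an online and an offline service in the manifest is the value of the volcano.sh/preemptable annotation. The following minimal Python sketch generates either variant of the example above. The function name make_pod and the pod name nginx-offline are hypothetical and used only for illustration.

```python
import json


def make_pod(name: str, namespace: str, image: str, offline: bool) -> dict:
    """Build a pod manifest whose volcano.sh/preemptable annotation marks the service type."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Annotation values must be strings: "true" = offline, "false" = online.
            "annotations": {"volcano.sh/preemptable": "true" if offline else "false"},
        },
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        "limits": {"memory": "200Mi", "cpu": "1"},
                        "requests": {"memory": "200Mi", "cpu": "1"},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    # Print an offline variant of the Nginx example as JSON.
    print(json.dumps(make_pod("nginx-offline", "qosexample", "nginx", offline=True), indent=2))
```

      The printed JSON can be saved to a file and passed to kubectl apply -f, which accepts JSON manifests as well as YAML.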
      

      Restrictions

      • Rubik can receive a maximum of 1,000 HTTP requests per second (QPS). If the request rate exceeds this limit, an error is reported.

      • The maximum number of pods in a single request received by Rubik is 100. If the number of pods exceeds the upper limit, an error is reported.

      • Only one set of Rubik instances can be deployed on each Kubernetes node. Multiple sets of Rubik instances may conflict with each other.

      • Rubik does not provide port access and can communicate only through sockets.

      • Rubik accepts only the following HTTP request paths and methods: http://localhost/ (POST), http://localhost/ping (GET), and http://localhost/version (GET). For details about the functions of these requests, see HTTP APIs (./http-apis.md).

      • Rubik drive requirement: 1 GB or more.

      • Rubik memory requirement: 100 MB or more.

      • Services cannot be switched from a low priority (offline services) to a high priority (online services). For example, if service A is set to an offline service and then to an online service, Rubik reports an error.

      • When directories are mounted to a Rubik container, the minimum permission on the Rubik local socket directory /run/Rubik is 700 on the service side.

      • When the Rubik service is available, the timeout interval of a single request is 120s. If the Rubik process enters the T (stopped or being traced) or D (uninterruptible sleep) state, the service becomes unavailable and does not respond to any request. To avoid waiting indefinitely in this case, set a timeout interval on the client.

      • If hybrid deployment is used, the original CPU share function of cgroup has the following restrictions:

        If both online and offline tasks are running on the CPU, the CPU share configuration of offline tasks does not take effect.

        If the current CPU has only online or offline tasks, the CPU share configuration takes effect.
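
      Several of the restrictions above affect how a client should call Rubik: communication goes through a local socket, at most 100 pods are allowed per request, and the client should set its own timeout. The following is a minimal Python sketch under stated assumptions. The class and function names are hypothetical, the socket file path must be supplied by the caller because it is not given in this section, and the 120-second default only mirrors the documented per-request interval. The request body format is described in HTTP APIs and is not shown here.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of a TCP port."""

    def __init__(self, socket_path: str, timeout: float = 120.0):
        # "localhost" is used only for the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)  # avoids waiting forever if the service hangs
        sock.connect(self.socket_path)
        self.sock = sock


def ping(socket_path: str, timeout: float = 120.0) -> int:
    """Send GET /ping over the socket and return the HTTP status code."""
    conn = UnixHTTPConnection(socket_path, timeout)
    try:
        conn.request("GET", "/ping")
        return conn.getresponse().status
    finally:
        conn.close()


def split_into_batches(pods: list, size: int = 100):
    """Yield slices of at most `size` pods so that no request exceeds the per-request limit."""
    for start in range(0, len(pods), size):
        yield pods[start:start + size]
```

      A caller would pass the path of the Rubik socket file to ping and send one POST request per batch returned by split_into_batches.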
