Appendixes

DaemonSet Configuration Template

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rubik
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "watch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rubik
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rubik
subjects:
  - kind: ServiceAccount
    name: rubik
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rubik
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rubik-config
  namespace: kube-system
data:
  config.json: |
    {
        "autoCheck": false,
        "logDriver": "stdio",
        "logDir": "/var/log/rubik",
        "logSize": 1024,
        "logLevel": "info",
        "cgroupRoot": "/sys/fs/cgroup",
        "cacheConfig": {
            "enable": false,
            "defaultLimitMode": "static",
            "adjustInterval": 1000,
            "perfDuration": 1000,
            "l3Percent": {
                "low": 20,
                "mid": 30,
                "high": 50
            },
            "memBandPercent": {
                "low": 10,
                "mid": 30,
                "high": 50
            }
        }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rubik-agent
  namespace: kube-system
  labels:
    k8s-app: rubik-agent
spec:
  selector:
    matchLabels:
      name: rubik-agent
  template:
    metadata:
      namespace: kube-system
      labels:
        name: rubik-agent
    spec:
      serviceAccountName: rubik
      hostPID: true
      containers:
      - name: rubik-agent
        image: rubik_image_name_and_tag
        imagePullPolicy: IfNotPresent
        env:
          - name: RUBIK_NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: rubiklog
          mountPath: /var/log/rubik
          readOnly: false
        - name: runrubik
          mountPath: /run/rubik
          readOnly: false
        - name: sysfs
          mountPath: /sys/fs
          readOnly: false
        - name: config-volume
          mountPath: /var/lib/rubik
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rubiklog
        hostPath:
          path: /var/log/rubik
      - name: runrubik
        hostPath:
          path: /run/rubik
      - name: sysfs
        hostPath:
          path: /sys/fs
      - name: config-volume
        configMap:
          name: rubik-config
          items:
          - key: config.json
            path: config.json

Dockerfile Template

FROM scratch
COPY ./build/rubik /rubik
ENTRYPOINT ["/rubik"]

Image Build Script

#!/bin/bash
###################################################################################################
# Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.
# rubik licensed under the Mulan PSL v2.
# You can use this software according to the terms and conditions of the Mulan PSL v2.
# You may obtain a copy of Mulan PSL v2 at:
#     http://license.coscl.org.cn/MulanPSL2
# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR
# PURPOSE.
# See the Mulan PSL v2 for more details.
# Author: Xiang Li
# Create: 2022-11-29
# Description: Build container image for rubik. Enjoy and cheers!
# Steps:
#    1. build image and tag it with rubik version and release number
#    2. modify `rubik-daemonset.yaml` file
###################################################################################################
set -e

CURRENT_DIR=$(cd "$(dirname "$0")" && pwd)
BINARY_NAME="rubik"

RUBIK_FILE="${CURRENT_DIR}/build/rubik"
DOCKERFILE="${CURRENT_DIR}/Dockerfile"
YAML_FILE="${CURRENT_DIR}/rubik-daemonset.yaml"

# Get version and release number of rubik binary
VERSION=$("${RUBIK_FILE}" -v | grep ^Version | awk '{print $NF}')
RELEASE=$("${RUBIK_FILE}" -v | grep ^Release | awk '{print $NF}')
IMG_TAG="${VERSION}-${RELEASE}"

# Get rubik image name and tag
IMG_NAME_AND_TAG="${BINARY_NAME}:${IMG_TAG}"

# Build container image for rubik
docker build -f "${DOCKERFILE}" -t "${IMG_NAME_AND_TAG}" "${CURRENT_DIR}"

echo -e "\n"
# Check image existence
docker images | grep -E "REPOSITORY|${BINARY_NAME}"

# Modify rubik-daemonset.yaml file, set rubik image name
sed -i "s/rubik_image_name_and_tag/${IMG_NAME_AND_TAG}/g" "${YAML_FILE}"
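
The tag derivation used by the script can be tried without the real binary. The following is a minimal sketch, assuming the version output contains one line starting with `Version` and one starting with `Release`, each ending in the value. The helper name `derive_tag` and the sample values are made up for illustration and are not part of Rubik.

```shell
#!/bin/bash
# Hypothetical helper: derive an image tag from version text read on stdin.
# Assumption: one line starts with "Version", one with "Release", and the
# last whitespace-separated field of each line is the value.
derive_tag() {
    local info version release
    info=$(cat)
    version=$(printf '%s\n' "${info}" | grep '^Version' | awk '{print $NF}')
    release=$(printf '%s\n' "${info}" | grep '^Release' | awk '{print $NF}')
    printf '%s-%s\n' "${version}" "${release}"
}

# Made-up sample text, for illustration only.
sample_output=$'Version: 1.0.0\nRelease: 2'
tag=$(printf '%s\n' "${sample_output}" | derive_tag)
echo "rubik:${tag}"
```

The resulting string is what the script substitutes for the `rubik_image_name_and_tag` placeholder in the DaemonSet template.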

Communication Matrix

  • The Rubik service process communicates with the Kubernetes API server as a client through the list-watch mechanism to obtain information about Pods.

| Listening IP | Listening Port | Protocol | Port Description | Listening Port Modifiable | Authentication Method |
|---|---|---|---|---|---|
| Rubik does not listen on any IP address | None | None | - | No | None |

File Permissions

  • All Rubik operations require root permissions.

  • Related file permissions are as follows:

| Path | Permissions | Description |
|---|---|---|
| /var/lib/rubik | 750 | Directory generated after the RPM package is installed, which stores Rubik-related files |
| /var/lib/rubik/build | 700 | Directory for storing the Rubik binary file |
| /var/lib/rubik/build/rubik | 550 | Rubik binary file |
| /var/lib/rubik/rubik-daemonset.yaml | 550 | Rubik DaemonSet configuration template to be used for Kubernetes deployment |
| /var/lib/rubik/Dockerfile | 640 | Dockerfile template |
| /var/lib/rubik/build_rubik_image.sh | 550 | Rubik container image build script |
| /var/log/rubik | 700 | Directory for storing Rubik log files (requires logDriver=file) |
| /var/log/rubik/rubik.log* | 600 | Rubik log files |
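
The permissions listed above can be checked on a node with a short script. This is a sketch rather than a tool shipped with Rubik: `check_mode` is a hypothetical helper, it relies on GNU coreutils `stat -c '%a'`, and it should be run as root because some of the directories are not readable by other users. The log files matched by `rubik.log*` are left out for brevity.

```shell
#!/bin/bash
# Hypothetical helper: compare a path's permission bits with an expected mode.
# `stat -c '%a'` (GNU coreutils) prints the mode in octal, for example 750.
check_mode() {
    local path="$1" expected="$2" actual
    actual=$(stat -c '%a' "${path}") || return 2
    if [ "${actual}" = "${expected}" ]; then
        echo "OK       ${path} (${actual})"
    else
        echo "MISMATCH ${path} (expected ${expected}, found ${actual})"
        return 1
    fi
}

# Paths and modes copied from the table; paths that do not exist are skipped.
while read -r path mode; do
    [ -e "${path}" ] || continue
    check_mode "${path}" "${mode}" || true
done <<'EOF'
/var/lib/rubik 750
/var/lib/rubik/build 700
/var/lib/rubik/build/rubik 550
/var/lib/rubik/rubik-daemonset.yaml 550
/var/lib/rubik/Dockerfile 640
/var/lib/rubik/build_rubik_image.sh 550
/var/log/rubik 700
EOF
```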

Constraints

Specifications

  • Drive space: more than 1 GB

  • Memory: more than 100 MB

Runtime

  • Only one Rubik instance can exist on a Kubernetes node.

  • Rubik does not accept CLI parameters and fails to start if any is specified.

  • When directories are mounted to a container, the service provider must keep the permissions of the Rubik local socket directory /run/rubik to the minimum required, for example, 700.

  • When the Rubik process is in the T (TASK_STOPPED or TASK_TRACED) or D (TASK_UNINTERRUPTIBLE) state, the Rubik service is unavailable and does not respond. The service becomes available again after the process recovers from the abnormal state.
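
Whether a process is currently in one of these states can be read from the `State:` line of `/proc/<pid>/status` on Linux. The sketch below uses two hypothetical helpers, `proc_state` and `is_responsive`, and inspects the current shell as a stand-in. How to obtain the Rubik PID is left to the reader.

```shell
#!/bin/bash
# Hypothetical helper: print the one-letter state of a process as reported
# in the "State:" line of /proc/<pid>/status (Linux only).
proc_state() {
    awk '/^State:/ {print $2}' "/proc/$1/status"
}

# Hypothetical helper: succeed only if the process exists and is not in the
# T (stopped), t (tracing stop), or D (uninterruptible sleep) state.
is_responsive() {
    case "$(proc_state "$1")" in
        T|t|D|"") return 1 ;;
        *) return 0 ;;
    esac
}

# Example: check the current shell; replace $$ with the Rubik PID.
if is_responsive "$$"; then
    echo "process $$ is not stopped, traced, or blocked"
else
    echo "process $$ is in the T or D state, or does not exist"
fi
```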

Pod Priorities

  • Pod priorities cannot be raised. For example, if the priority of service A is changed from -1 to 0, Rubik reports an error.

  • Adding or modifying annotations, or re-applying the Pod YAML configuration file, does not trigger a Pod rebuild. Rubik detects changes in Pod annotations through the list-watch mechanism.

  • After an online service is moved to the offline group, do not move it back to the online group. Otherwise, QoS exceptions may occur.

  • Do not add important system services or kernel processes to the offline group. Otherwise, they may not be scheduled in time, causing system errors.

  • Online and offline configurations for the CPU and memory must be consistent to avoid QoS conflicts between the two subsystems.

  • In the scenario of hybrid service deployment, the original CPU share mechanism is restricted:

    • When both online and offline services run on a CPU, the CPU share of the offline service does not take effect.
    • If only an online or offline service runs on a CPU, its CPU share takes effect.
    • You are advised to set the Pod priority of the offline service to BestEffort.
  • Priority inversion of user-mode processes, SMT, cache, NUMA load balancing, and offline service load balancing are not supported.
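
The first rule in this list (a priority may stay the same or be lowered, but never raised) can be modelled in a few lines. This is a toy illustration of the rule as stated here, not Rubik's own logic. The function name `priority_change_allowed` is hypothetical, and the values -1 and 0 are the ones used in the example above.

```shell
#!/bin/bash
# Hypothetical helper modelling the documented rule: a change is allowed only
# if the new priority is not higher than the old one.
priority_change_allowed() {
    local old="$1" new="$2"
    [ "${new}" -le "${old}" ]
}

# Try a lowering, a raising, and an unchanged priority.
for change in "0 -1" "-1 0" "-1 -1"; do
    read -r old new <<< "${change}"
    if priority_change_allowed "${old}" "${new}"; then
        echo "${old} -> ${new}: allowed"
    else
        echo "${old} -> ${new}: rejected"
    fi
done
```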

Other

To prevent data inconsistency, do not manually modify cgroup or resctrl parameters of the pods, including:

  • CPU cgroup directory, such as /sys/fs/cgroup/cpu/kubepods/burstable//

    • cpu.qos_level
    • cpu.cfs_burst_us
  • Memory cgroup directory, such as /sys/fs/cgroup/memory/kubepods/burstable//

    • memory.qos_level
    • memory.soft_limit_in_bytes
    • memory.force_empty
    • memory.limit_in_bytes
    • memory.high
  • blkio cgroup directory, such as /sys/fs/cgroup/blkio/kubepods/burstable//

    • blkio.throttle.read_bps_device
    • blkio.throttle.read_iops_device
    • blkio.throttle.write_bps_device
    • blkio.throttle.write_iops_device
  • RDT cgroup directory, such as /sys/fs/resctrl
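
Reading these files is harmless, so the current values can still be inspected for troubleshooting. The sketch below only reads and never writes. `show_cgroup_values` is a hypothetical helper, and the default directory in the last line is a placeholder: pass the actual cgroup directory of the Pod as the first argument.

```shell
#!/bin/bash
# Hypothetical helper: print the current values of selected control files in
# a cgroup directory without writing to them. Files that are missing or not
# readable are reported as absent.
show_cgroup_values() {
    local dir="$1" name
    shift
    for name in "$@"; do
        if [ -r "${dir}/${name}" ]; then
            printf '%s = %s\n' "${name}" "$(cat "${dir}/${name}")"
        else
            printf '%s = <absent>\n' "${name}"
        fi
    done
}

# The default directory is a placeholder; pass the Pod's cgroup path instead.
show_cgroup_values "${1:-/sys/fs/cgroup/cpu}" cpu.qos_level cpu.cfs_burst_us
```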
