      Container Images for Large Language Models

      openEuler provides container images to support large language models (LLMs) such as Baichuan, ChatGLM, and iFLYTEK Spark.

      The provided container images come with pre-installed dependencies for both CPU and GPU environments, so they work out of the box.

      Pulling the Image (CPU Version)

      docker pull openeuler/llm-server:1.0.0-oe2203sp3
      

      Pulling the Image (GPU Version)

      docker pull icewangds/llm-server:1.0.0
      

      Downloading the Model

      Download the model and convert it to GGUF format.

      # Install Hugging Face Hub.
      pip install huggingface-hub
      
      # Download the model you want to deploy.
      export HF_ENDPOINT=https://hf-mirror.com
      huggingface-cli download --resume-download baichuan-inc/Baichuan2-13B-Chat --local-dir /root/models/Baichuan2-13B-Chat --local-dir-use-symlinks False
      
      # Convert the model to GGUF format.
      cd /root/models/
      git clone https://github.com/ggerganov/llama.cpp.git
      # Install the dependencies required by the conversion script.
      pip install -r llama.cpp/requirements.txt
      python llama.cpp/convert-hf-to-gguf.py ./Baichuan2-13B-Chat
      # Path to the generated GGUF model: /root/models/Baichuan2-13B-Chat/ggml-model-f16.gguf
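
      Before mounting the converted file into the container, you can sanity-check it: valid GGUF files begin with the 4-byte ASCII magic `GGUF`. The sketch below reads only the file header; the helper name `is_gguf` and the hard-coded path are illustrative assumptions, not part of the image.

      ```python
      # Quick sanity check for a converted model: a valid GGUF file
      # starts with the ASCII magic bytes "GGUF".
      from pathlib import Path

      GGUF_MAGIC = b"GGUF"

      def is_gguf(path: str) -> bool:
          """Return True if the file exists and starts with the GGUF magic."""
          p = Path(path)
          if not p.is_file():
              return False
          with p.open("rb") as f:
              return f.read(4) == GGUF_MAGIC

      if __name__ == "__main__":
          # Path from the conversion step above; adjust to your deployment.
          print(is_gguf("/root/models/Baichuan2-13B-Chat/ggml-model-f16.gguf"))
      ```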
      

      Launch

      Docker v25.0.0 or above is required.

      To use a GPU image, you must first install the NVIDIA Container Toolkit (nvidia-container-toolkit). Detailed installation instructions are available in the official NVIDIA documentation: Installing the NVIDIA Container Toolkit.

      docker-compose.yaml file content:

      version: '3'
      services:
        model:
          image: <image>:<tag>   # Image name and tag
          restart: on-failure:5
          ports:
            - 8001:8000    # Listening port number. Change "8001" to modify the port.
          volumes:
            - /root/models:/models  # LLM mount directory
          environment:
            - MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf  # Model file path inside the container
            - MODEL_NAME=baichuan13b  # Custom model name
            - KEY=sk-12345678  # Custom API Key
            - CONTEXT=8192  # Context size
            - THREADS=8    # Number of CPU threads, required only for CPU deployment
          deploy: # GPU resources, required only for GPU deployment
            resources:
              reservations:
                devices:
                  - driver: nvidia
                    count: all
                    capabilities: [gpu]
      
      docker-compose -f docker-compose.yaml up
      

      docker run command:

      # For CPU deployment
      docker run -d --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 openeuler/llm-server:1.0.0-oe2203sp3
      
      # For GPU deployment
      docker run -d --gpus all --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 icewangds/llm-server:1.0.0
      

      Testing

      Call the LLM API to test the deployment. If a valid response is returned, the LLM service has been deployed successfully.

      curl -X POST http://127.0.0.1:8001/v1/chat/completions \
           -H "Content-Type: application/json" \
           -H "Authorization: Bearer sk-12345678" \
           -d '{
                 "model": "baichuan13b",
                 "messages": [
                   {"role": "system", "content": "You are a openEuler community assistant, please answer the following question."},
                   {"role": "user", "content": "Who are you?"}
                 ],
                 "stream": false,
                 "max_tokens": 1024
               }'
      
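
      The same request can be issued from Python, since the service exposes an OpenAI-compatible /v1/chat/completions endpoint. The sketch below uses only the standard library; the port 8001, API key sk-12345678, and model name baichuan13b are the example values from this guide, so adjust them to match your deployment.

      ```python
      # Minimal Python client for the OpenAI-compatible chat endpoint.
      # The base URL, API key, and model name mirror the example
      # deployment above; change them to match yours.
      import json
      import urllib.request

      def build_chat_request(base_url: str, api_key: str, model: str, user_msg: str):
          """Assemble the URL, headers, and JSON payload for a chat completion."""
          payload = {
              "model": model,
              "messages": [
                  {"role": "system",
                   "content": "You are an openEuler community assistant, please answer the following question."},
                  {"role": "user", "content": user_msg},
              ],
              "stream": False,
              "max_tokens": 1024,
          }
          headers = {
              "Content-Type": "application/json",
              "Authorization": f"Bearer {api_key}",
          }
          return f"{base_url}/v1/chat/completions", headers, payload

      if __name__ == "__main__":
          url, headers, payload = build_chat_request(
              "http://127.0.0.1:8001", "sk-12345678", "baichuan13b", "Who are you?")
          req = urllib.request.Request(
              url, data=json.dumps(payload).encode(), headers=headers, method="POST")
          with urllib.request.urlopen(req) as resp:
              print(json.loads(resp.read())["choices"][0]["message"]["content"])
      ```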
