Containerized packaging for large AI models such as Baichuan, ChatGLM, and Spark.
All dependencies are preconfigured, and separate CPU and GPU variants are provided, lowering the barrier to entry: the images work out of the box.
Pull the image (CPU version)
docker pull openeuler/llm-server:1.0.0-oe2203sp3
Pull the image (GPU version)
docker pull icewangds/llm-server:1.0.0
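To double-check the pull, listing the local images should show the tag you just pulled (the exact ID, date, and size will differ on your machine):

docker images | grep llm-server
# Output similar to:
# openeuler/llm-server   1.0.0-oe2203sp3   <image-id>   <created>   <size>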
Download a model and convert it to GGUF format
# Install the Hugging Face Hub CLI
pip install huggingface-hub
# Download the model you want to deploy
export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download --resume-download baichuan-inc/Baichuan2-13B-Chat --local-dir /root/models/Baichuan2-13B-Chat --local-dir-use-symlinks False
# Convert to GGUF format
cd /root/models/
git clone https://github.com/ggerganov/llama.cpp.git
# Install the Python dependencies required by the conversion script
pip install -r llama.cpp/requirements.txt
python llama.cpp/convert-hf-to-gguf.py ./Baichuan2-13B-Chat
# Path of the resulting GGUF model: /root/models/Baichuan2-13B-Chat/ggml-model-f16.gguf
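Optional: the f16 GGUF file of a 13B model is large (roughly 26 GB), so if memory is tight you can quantize it before deployment. The sketch below assumes you build the cloned llama.cpp from source; note that the quantization binary's name varies across llama.cpp versions (quantize in older releases, llama-quantize in newer ones):

# Build llama.cpp; the build produces the quantization tool
cd /root/models/llama.cpp && make -j
# Quantize the f16 model to 4-bit (q4_0); the output is roughly a quarter of the size
./quantize /root/models/Baichuan2-13B-Chat/ggml-model-f16.gguf \
           /root/models/Baichuan2-13B-Chat/ggml-model-q4_0.gguf q4_0

If you deploy the quantized file, point the MODEL environment variable below at it instead of the f16 file.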
Starting the service
Docker v25.0.0 or later is required.
For the GPU image, nvidia-container-toolkit must be installed on the host OS; for installation instructions, see https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html.
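Before starting the GPU container, you can verify that the toolkit works by running nvidia-smi inside any CUDA-enabled container (the CUDA image tag below is only an example; any CUDA base image you have available will do):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

If this prints the usual GPU table, the Docker runtime can see your GPUs.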
docker-compose.yaml:
version: '3'
services:
  model:
    image: <image>:<tag>    # image name and tag
    restart: on-failure:5
    ports:
      - 8001:8000           # listening port; change "8001" to use a different host port
    volumes:
      - /root/models:/models    # host directory where the model files are mounted
    environment:
      - MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf  # model file path inside the container
      - MODEL_NAME=baichuan13b  # custom model name
      - KEY=sk-12345678         # custom API key
      - CONTEXT=8192            # context window size
      - THREADS=8               # number of CPU threads; only needed for CPU deployment
    deploy:                     # reserve GPU resources; only needed for GPU deployment
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
docker-compose -f docker-compose.yaml up
docker run:
CPU deployment: docker run -d --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 openeuler/llm-server:1.0.0-oe2203sp3
GPU deployment: docker run -d --gpus all --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 icewangds/llm-server:1.0.0
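Whichever way you start it, check that the container is up and the model has finished loading before testing the API (replace <container> with the name or ID shown by docker ps; loading a 13B f16 model can take a while):

docker ps                 # the container should be in the "Up" state
docker logs <container>   # watch for the model-loading messages to complete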
Test the service by calling the model API; a successful response means the model service has been deployed correctly.
curl -X POST http://127.0.0.1:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-12345678" \
-d '{
"model": "baichuan13b",
"messages": [
{"role": "system", "content": "你是一个社区助手,请回答以下问题。"},
{"role": "user", "content": "你是谁?"}
],
"stream": false,
"max_tokens": 1024
}'
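The endpoint and Bearer-key header above follow the OpenAI chat completions convention, so, assuming the server also implements OpenAI-style streaming (an assumption, not confirmed by this document), you can request a streamed response; curl's -N flag disables buffering so chunks print as they arrive:

curl -N -X POST http://127.0.0.1:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-12345678" \
  -d '{
    "model": "baichuan13b",
    "messages": [{"role": "user", "content": "Who are you?"}],
    "stream": true
  }'
# Each chunk arrives as a "data: {...}" server-sent-event line; the stream ends with "data: [DONE]".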