Merge branch 'dev'
commit b077085fbe
README.md (12 changes)
@@ -45,14 +45,13 @@
 🚩 This project does not involve fine-tuning or training, but fine-tuning or training can be used to optimize its results.
 
-🌐 The code used by the `v8` release of the [AutoDL image](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) has been updated to `v0.2.4` of this project.
-🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3)
+🌐 The code used by the `v9` release of the [AutoDL image](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) has been updated to `v0.2.5` of this project.
+🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5)
 
 💻 Run Docker with one command 🌲:
 
 ```shell
-docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3
+docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5
 ```
 
 ---
@@ -61,7 +60,8 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
 To run this code smoothly, please meet at least the following minimum requirements:
 + Python version: >= 3.8.5, < 3.11
-+ CUDA version: >= 11.7, and Python can be installed smoothly
++ CUDA version: >= 11.7
++ Python 3.10 is strongly recommended; some Agent features may not be fully supported on versions below Python 3.10.
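A quick way to verify that the interpreter satisfies this range before installing dependencies (a minimal sketch, not part of this commit):

```python
# Minimal sketch: fail fast if the interpreter is outside the supported range.
import sys

if not ((3, 8, 5) <= sys.version_info[:3] < (3, 11)):
    raise SystemExit(f"Python >= 3.8.5 and < 3.11 required, got {sys.version.split()[0]}")
```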
 
 If you want to run a local model (int4 version) smoothly on the GPU, you need at least the following hardware:
@@ -249,7 +249,7 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
 See [Development Environment Setup](docs/INSTALL.md).
 
-**Please note:** the dependency packages of `0.2.3` and later may conflict with those of `0.1.x`; creating a new environment and reinstalling the dependencies is strongly recommended.
+**Please note:** the dependency packages of `0.2.5` and later may conflict with those of `0.1.x`; creating a new environment and reinstalling the dependencies is strongly recommended.
 
 ### 2. Download Models Locally
README_en.md (13 changes)
@@ -44,14 +44,14 @@ The main process analysis from the aspect of document process:
 🚩 Training and fine-tuning are not involved in this project, but one can still improve its performance with them.
 
-🌐 The [AutoDL image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.0) is supported; in `v7` the code was updated to `v0.2.3`.
+🌐 The [AutoDL image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5) is supported; in `v9` the code was updated to `v0.2.5`.
 
-🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.0)
+🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5)
 
 💻 Run Docker with one command:
 
 ```shell
-docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.0
+docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5
 ```
 
 ---
@@ -60,16 +60,17 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
 To run this code smoothly, please configure it according to the following minimum requirements:
 + Python version: >= 3.8.5, < 3.11
-+ CUDA version: >= 11.7, with Python installed.
++ CUDA version: >= 11.7
++ Python 3.10 is highly recommended; some Agent features may not be fully supported below Python 3.10.
 
 If you want to run a local model (int4 version) on the GPU without problems, you need at least the following hardware configuration:
 
 + chatglm2-6b & LLaMA-7B: minimum GPU memory 7GB; recommended cards: RTX 3060, RTX 2060
 + LLaMA-13B: minimum GPU memory 11GB; recommended cards: RTX 2060 12GB, RTX 3060 12GB, RTX 3080, RTX A2000
 + Qwen-14B-Chat: minimum GPU memory 13GB; recommended card: RTX 3090
 
 + LLaMA-30B: minimum GPU memory 22GB; recommended cards: RTX A5000, RTX 3090, RTX 4090, RTX 6000, Tesla V100, RTX Tesla P40
 + LLaMA-65B: minimum GPU memory 40GB; recommended cards: A100, A40, A6000
 
 For int8, multiply the memory requirement by roughly 1.5; for fp16, by roughly 2.5.
 For example, running inference on the Qwen-7B-Chat model with fp16 requires 16GB of GPU memory.
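Those multipliers line up with a weights-only estimate of bytes per parameter; the sketch below (an illustration, not from the repository) reproduces the 16GB fp16 figure for a 7B model:

```python
# Weights-only memory estimate: params (billions) x bytes per parameter ~ GB.
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * bytes_per_param

print(weights_gb(7, 0.5))  # int4: ~3.5 GB of weights
print(weights_gb(7, 1.0))  # int8: ~7 GB of weights
print(weights_gb(7, 2.0))  # fp16: ~14 GB of weights; with runtime overhead, ~16 GB as cited above
```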
@@ -191,7 +192,7 @@ See [Custom Agent Instructions](docs/自定义Agent.md) for details.
 docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5
 ```
 
-- The image size of this version is `33.9GB`, using `v0.2.0`, with `nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04` as the base image
+- The image size of this version is `33.9GB`, using `v0.2.5`, with `nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04` as the base image
 - This version comes with a built-in `embedding` model, `m3e-large`, and a built-in `chatglm2-6b-32k`
 - This version is designed to facilitate one-click deployment; please make sure you have installed the NVIDIA driver on your Linux distribution
 - Please note that you do not need to install the CUDA toolkit on the host system, but you do need to install the `NVIDIA Driver` and the `NVIDIA Container Toolkit`; please refer to the [Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
@@ -92,6 +92,7 @@ MODEL_PATH = {
 # Name of the selected Embedding model
+EMBEDDING_MODEL = "m3e-base"  # You can also try the latest SOTA embedding model: piccolo-large-zh
 
 # Device the Embedding model runs on. "auto" detects it automatically; it can also be set manually to one of "cuda", "mps", or "cpu".
 EMBEDDING_DEVICE = "auto"
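For reference, "auto" device detection of this kind is commonly implemented with the standard torch checks below (a hedged sketch; the project's actual resolution logic is not shown in this diff):

```python
# Sketch: resolve "auto" to a concrete device string ("cuda", "mps", or "cpu").
import torch

def detect_device() -> str:
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```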
@@ -174,6 +175,14 @@ ONLINE_LLM_MODEL = {
         "api_key": "",  # Create one on the DashScope API-KEY management page in the Alibaba Cloud console
         "provider": "QwenWorker",
     },
+
+    # Baichuan API; for how to apply, see https://www.baichuan-ai.com/home#api-enter
+    "baichuan-api": {
+        "version": "Baichuan2-53B",  # Currently supports "Baichuan2-53B"; see the official docs.
+        "api_key": "",
+        "secret_key": "",
+        "provider": "BaiChuanWorker",
+    },
 }
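The `api_key`/`secret_key` pair configured here is consumed by the `BaiChuanWorker` added later in this commit, which signs every request body with MD5 over `secret_key + body + timestamp`. A condensed sketch of that signing scheme, mirroring the worker code below:

```python
# Sketch of the Baichuan request signing used by BaiChuanWorker in this commit.
import hashlib
import json
import time

def bc_headers(api_key: str, secret_key: str, body: dict) -> dict:
    payload = json.dumps(body)
    ts = str(int(time.time()))
    signature = hashlib.md5((secret_key + payload + ts).encode("utf-8")).hexdigest()
    return {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
        "X-BC-Timestamp": ts,
        "X-BC-Signature": signature,
        "X-BC-Sign-Algo": "MD5",
    }
```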
@@ -0,0 +1,13 @@
# Batch-copy the .example files under configs, renaming the copies to .py files
import os
import shutil

files = os.listdir("configs")

src_files = [os.path.join("configs", file) for file in files if ".example" in file]

for src_file in src_files:
    tar_file = src_file.replace(".example", "")
    shutil.copy(src_file, tar_file)
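The script assumes it is run from the repository root so the relative `configs` path resolves; each `*.example` file is copied to a sibling with the `.example` suffix stripped (for example, a `model_config.py.example` would become `model_config.py`).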
Binary file not shown (image; size 225 KiB before, 84 KiB after).
@@ -1,3 +1,4 @@
+from __future__ import annotations
 from uuid import UUID
 from langchain.callbacks import AsyncIteratorCallbackHandler
 import json
@@ -1,3 +1,4 @@
+from __future__ import annotations
 from langchain.agents import Tool, AgentOutputParser
 from langchain.prompts import StringPromptTemplate
 from typing import List, Union
@@ -0,0 +1,163 @@
# import os
# import sys
# sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(__file__))))
import hashlib
import json
import requests
import sys
import time
from typing import List, Literal

from fastchat import conversation as conv
from server.model_workers.base import ApiModelWorker
from configs import TEMPERATURE


def calculate_md5(input_string):
    # Hex MD5 digest of the UTF-8 encoded input; used to sign request bodies.
    md5 = hashlib.md5()
    md5.update(input_string.encode('utf-8'))
    encrypted = md5.hexdigest()
    return encrypted


def do_request():
    # Standalone smoke test against the Baichuan streaming endpoint.
    url = "https://api.baichuan-ai.com/v1/stream/chat"
    api_key = ""
    secret_key = ""

    data = {
        "model": "Baichuan2-53B",
        "messages": [
            {
                "role": "user",
                "content": "世界第一高峰是"
            }
        ],
        "parameters": {
            "temperature": 0.1,
            "top_k": 10
        }
    }

    json_data = json.dumps(data)
    time_stamp = int(time.time())
    signature = calculate_md5(secret_key + json_data + str(time_stamp))

    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
        "X-BC-Request-Id": "your requestId",
        "X-BC-Timestamp": str(time_stamp),
        "X-BC-Signature": signature,
        "X-BC-Sign-Algo": "MD5",
    }

    response = requests.post(url, data=json_data, headers=headers)

    if response.status_code == 200:
        print("Request succeeded!")
        print("Response headers:", response.headers)
        print("Response body:", response.text)
    else:
        print("Request failed, status code:", response.status_code)


class BaiChuanWorker(ApiModelWorker):
    BASE_URL = "https://api.baichuan-ai.com/v1/chat"
    SUPPORT_MODELS = ["Baichuan2-53B"]

    def __init__(
            self,
            *,
            controller_addr: str,
            worker_addr: str,
            model_names: List[str] = ["baichuan-api"],
            version: Literal["Baichuan2-53B"] = "Baichuan2-53B",
            **kwargs,
    ):
        kwargs.update(model_names=model_names, controller_addr=controller_addr, worker_addr=worker_addr)
        kwargs.setdefault("context_len", 32768)
        super().__init__(**kwargs)

        # TODO: confirm whether the conversation template needs to be modified
        self.conv = conv.Conversation(
            name=self.model_names[0],
            system_message="",
            messages=[],
            roles=["user", "assistant"],
            sep="\n### ",
            stop_str="###",
        )

        config = self.get_config()
        self.version = config.get("version", version)
        self.api_key = config.get("api_key")
        self.secret_key = config.get("secret_key")

    def generate_stream_gate(self, params):
        data = {
            "model": self.version,
            "messages": [
                {
                    "role": "user",
                    "content": params["prompt"]
                }
            ],
            "parameters": {
                "temperature": params.get("temperature", TEMPERATURE),
                "top_k": params.get("top_k", 1)
            }
        }

        json_data = json.dumps(data)
        time_stamp = int(time.time())
        signature = calculate_md5(self.secret_key + json_data + str(time_stamp))
        headers = {
            "Content-Type": "application/json",
            "Authorization": "Bearer " + self.api_key,
            "X-BC-Request-Id": "your requestId",
            "X-BC-Timestamp": str(time_stamp),
            "X-BC-Signature": signature,
            "X-BC-Sign-Algo": "MD5",
        }

        response = requests.post(self.BASE_URL, data=json_data, headers=headers)

        # Parse the JSON body with json.loads rather than eval, and do so in both
        # branches so the error path does not reference an unbound name.
        resp = json.loads(response.text)
        if response.status_code == 200:
            yield json.dumps(
                {
                    "error_code": resp["code"],
                    "text": resp["data"]["messages"][-1]["content"]
                },
                ensure_ascii=False
            ).encode() + b"\0"
        else:
            yield json.dumps(
                {
                    "error_code": resp["code"],
                    "text": resp["msg"]
                },
                ensure_ascii=False
            ).encode() + b"\0"

    def get_embeddings(self, params):
        # TODO: support embeddings
        print("embedding")
        print(params)


if __name__ == "__main__":
    import uvicorn
    from server.utils import MakeFastAPIOffline
    from fastchat.serve.model_worker import app

    worker = BaiChuanWorker(
        controller_addr="http://127.0.0.1:20001",
        worker_addr="http://127.0.0.1:21001",
    )
    sys.modules["fastchat.serve.model_worker"].worker = worker
    MakeFastAPIOffline(app)
    uvicorn.run(app, port=21001)
    # do_request()