Update readme (#1813)

* Updated the readme to the latest version

* Update readme, add a table of contents
zR 2023-10-20 21:37:43 +08:00 committed by GitHub
parent 46225ad784
commit 86ee6fe08c
2 changed files with 142 additions and 744 deletions

README.md

@ -1,29 +1,27 @@
![](img/logo-long-chatchat-trans-v2.png)
[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white "langchain-chatglm")](https://t.me/+RjliQ3jnJ1YyN2E9)
🌍 [READ THIS IN ENGLISH](README_en.md)
📃 **LangChain-Chatchat** (formerly Langchain-ChatGLM)

A local knowledge base Q&A application built on large language models such as ChatGLM and the Langchain framework.
---
## Table of Contents

* [Introduction](README.md#介绍)
* [Change Log](README.md#变更日志)
* [Supported Models](README.md#模型支持)
* [Docker Deployment](README.md#Docker-部署)
* [Development Deployment](README.md#开发部署)
  * [Software Requirements](README.md#软件需求)
  * [1. Development Environment Setup](README.md#1-开发环境准备)
  * [2. Download Models to Local Disk](README.md#2-下载模型至本地)
  * [3. Setting Configuration Items](README.md#3-设置配置项)
  * [4. Knowledge Base Initialization and Migration](README.md#4-知识库初始化与迁移)
  * [5. Launch the API Service or Web UI with One Command](README.md#5-一键启动-API-服务或-Web-UI)
* [FAQ](README.md#常见问题)
* [Roadmap](README.md#路线图)
* [Project WeChat Group](README.md#项目交流群)
* [Pain Points Addressed](README.md#解决的痛点)
* [Quick Start](README.md#快速上手)
  * [1. Environment Setup](README.md#1-环境配置)
  * [2. Model Download](README.md#2-模型下载)
  * [3. Initialize the Knowledge Base and Config Files](README.md#3-初始化知识库和配置文件)
  * [4. One-Click Startup](README.md#4-一键启动)
  * [5. Startup Interface Examples](README.md#5-启动界面示例)
* [Contact Us](README.md#联系我们)
* [List of Partners](README.md#合作伙伴名单)
---
## Introduction
@ -47,216 +45,45 @@
🌐 The code used by the `v8` version of the [AutoDL image](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) has been updated to project version `v0.2.4`.
🐳 The [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3) has been updated to version `0.2.3`; to try the latest features, please install from source.

💻 Run Docker with one command 🌲:

```shell
docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.3
```

🧩 This project has a very complete [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/). The README is only a brief introduction; __it is merely a getting-started tutorial sufficient for a basic run__. If you want to understand the project more deeply, or contribute to it, please visit the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).

## Pain Points Addressed

This project is a knowledge-base enhancement solution that supports __fully localized__ inference, focusing on the enterprise pain points of data security and private deployment.

This open-source solution uses the `Apache License` and is free for commercial use, with no fees required.

We support mainstream local large language models and Embedding models available on the market, as well as open-source local vector databases.
For the full support list, see the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).

## Quick Start

### 1. Environment Setup

+ First, make sure your machine has Python 3.10 installed.
---
## Minimum Environment Requirements

To run this project smoothly, please meet the following minimum requirements:
+ Python version: >= 3.8.5, < 3.11
+ CUDA version: >= 11.7, with Python installable without issues

If you want to run local models (int4 quantization) on a GPU smoothly, you need at least the following hardware:

+ chatglm2-6b & LLaMA-7B: minimum VRAM 7GB; recommended GPUs: RTX 3060, RTX 2060
+ LLaMA-13B: minimum VRAM 11GB; recommended GPUs: RTX 2060 12GB, RTX 3060 12GB, RTX 3080, RTX A2000
+ Qwen-14B-Chat: minimum VRAM 13GB; recommended GPU: RTX 3090
+ LLaMA-30B: minimum VRAM 22GB; recommended GPUs: RTX A5000, RTX 3090, RTX 4090, RTX 6000, Tesla V100, RTX Tesla P40
+ LLaMA-65B: minimum VRAM 40GB; recommended GPUs: A100, A40, A6000

For int8, multiply the VRAM requirement by 1.5; for fp16, by 2.5.
For example, running fp16 inference with the Qwen-7B-Chat model requires 16GB of VRAM.

The figures above are only estimates; actual usage as reported by nvidia-smi prevails.
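As a rough sanity check, the multipliers above can be applied in a couple of lines; a minimal sketch (the factors are the estimates quoted above, not measured values):

```python
# Rough VRAM estimate from the multipliers quoted above (int4 is the baseline).
QUANT_FACTOR = {"int4": 1.0, "int8": 1.5, "fp16": 2.5}

def estimate_vram_gb(int4_vram_gb: float, quant: str = "int4") -> float:
    """Scale an int4 VRAM requirement to another quantization level."""
    return int4_vram_gb * QUANT_FACTOR[quant]

# chatglm2-6b needs roughly 7GB at int4, so fp16 would need about:
print(estimate_vram_gb(7, "fp16"))  # 17.5 (GB); check nvidia-smi for real usage
```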
## Change Log

See the [version change log](https://github.com/imClumsyPanda/langchain-ChatGLM/releases).

Users upgrading from `0.1.x`, please note: you need to follow the [development deployment](README.md#3-开发部署) process to migrate existing knowledge bases to the new format; see [knowledge base initialization and migration](docs/INSTALL.md#知识库初始化与迁移) for details.
### Differences between `0.2.0` and `0.1.x`

1. Uses [FastChat](https://github.com/lm-sys/FastChat) to provide API services for open-source LLM models, accessed through an OpenAI-compatible interface to improve LLM loading;
2. Uses the existing Chain implementations in [langchain](https://github.com/langchain-ai/langchain) to ease later integration of different Chain types, with Agent integration under testing;
3. Uses [FastAPI](https://github.com/tiangolo/fastapi) to provide the API service; all endpoints can be tested in the automatically generated FastAPI docs, and all chat endpoints support streaming or non-streaming output via a parameter;
4. Uses [Streamlit](https://github.com/streamlit/streamlit) to provide the WebUI service, optionally running on top of the API service; adds session management with customizable, switchable session themes, and will later support displaying different forms of output content;
5. The default LLM model is changed to [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and the default Embedding model to [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base); file loading and text segmentation have also been adjusted, and context expansion will be re-implemented later with optional settings;
6. Expanded support for different vector stores: besides [FAISS](https://github.com/facebookresearch/faiss), the project now also integrates [Milvus](https://milvus.io/), [Zilliz](https://zilliz.com/), and [PGVector](https://github.com/pgvector/pgvector);
7. For search engine dialogue, a DuckDuckGo option is added alongside Bing; DuckDuckGo search requires no API Key and works directly in environments with access to foreign services.
---
## Supported Models

The default LLM model in this project is [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b), and the default Embedding model is [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base).

### Supported LLM Models

The latest version of this project supports both **local models** and **online LLM APIs**.

Local LLM models are integrated via [FastChat](https://github.com/lm-sys/FastChat); supported models include:
- [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
- Vicuna, Alpaca, LLaMA, Koala
- [BlinkDL/RWKV-4-Raven](https://huggingface.co/BlinkDL/rwkv-4-raven)
- [camel-ai/CAMEL-13B-Combined-Data](https://huggingface.co/camel-ai/CAMEL-13B-Combined-Data)
- [databricks/dolly-v2-12b](https://huggingface.co/databricks/dolly-v2-12b)
- [FreedomIntelligence/phoenix-inst-chat-7b](https://huggingface.co/FreedomIntelligence/phoenix-inst-chat-7b)
- [h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b](https://huggingface.co/h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b)
- [lcw99/polyglot-ko-12.8b-chang-instruct-chat](https://huggingface.co/lcw99/polyglot-ko-12.8b-chang-instruct-chat)
- [lmsys/fastchat-t5-3b-v1.0](https://huggingface.co/lmsys/fastchat-t5)
- [mosaicml/mpt-7b-chat](https://huggingface.co/mosaicml/mpt-7b-chat)
- [Neutralzz/BiLLa-7B-SFT](https://huggingface.co/Neutralzz/BiLLa-7B-SFT)
- [nomic-ai/gpt4all-13b-snoozy](https://huggingface.co/nomic-ai/gpt4all-13b-snoozy)
- [NousResearch/Nous-Hermes-13b](https://huggingface.co/NousResearch/Nous-Hermes-13b)
- [openaccess-ai-collective/manticore-13b-chat-pyg](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg)
- [OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5](https://huggingface.co/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5)
- [project-baize/baize-v2-7b](https://huggingface.co/project-baize/baize-v2-7b)
- [Salesforce/codet5p-6b](https://huggingface.co/Salesforce/codet5p-6b)
- [StabilityAI/stablelm-tuned-alpha-7b](https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b)
- [THUDM/chatglm-6b](https://huggingface.co/THUDM/chatglm-6b)
- [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b)
- [tiiuae/falcon-40b](https://huggingface.co/tiiuae/falcon-40b)
- [timdettmers/guanaco-33b-merged](https://huggingface.co/timdettmers/guanaco-33b-merged)
- [togethercomputer/RedPajama-INCITE-7B-Chat](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Chat)
- [WizardLM/WizardLM-13B-V1.0](https://huggingface.co/WizardLM/WizardLM-13B-V1.0)
- [WizardLM/WizardCoder-15B-V1.0](https://huggingface.co/WizardLM/WizardCoder-15B-V1.0)
- [baichuan-inc/baichuan-7B](https://huggingface.co/baichuan-inc/baichuan-7B)
- [internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)
- [Qwen/Qwen-7B-Chat/Qwen-14B-Chat](https://huggingface.co/Qwen/)
- [HuggingFaceH4/starchat-beta](https://huggingface.co/HuggingFaceH4/starchat-beta)
- [FlagAlpha/Llama2-Chinese-13b-Chat](https://huggingface.co/FlagAlpha/Llama2-Chinese-13b-Chat) and others
- [BAAI/AquilaChat-7B](https://huggingface.co/BAAI/AquilaChat-7B)
- [all models of OpenOrca](https://huggingface.co/Open-Orca)
- [Spicyboros](https://huggingface.co/jondurbin/spicyboros-7b-2.2?not-for-all-audiences=true) + [airoboros 2.2](https://huggingface.co/jondurbin/airoboros-l2-13b-2.2)
- [VMware's OpenLLaMa OpenInstruct](https://huggingface.co/VMware/open-llama-7b-open-instruct)
- [baichuan2-7b/baichuan2-13b](https://huggingface.co/baichuan-inc)
- Any [EleutherAI](https://huggingface.co/EleutherAI) pythia model, such as [pythia-6.9b](https://huggingface.co/EleutherAI/pythia-6.9b)
- Any [Peft](https://github.com/huggingface/peft) adapter trained on top of the models above. To activate, the model path must contain `peft`. Note: if loading multiple peft models, you can make them share the base model's weights by setting the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` in any model worker.
The above support list may be updated continuously as [FastChat](https://github.com/lm-sys/FastChat) is updated; see the [FastChat supported models list](https://github.com/lm-sys/FastChat/blob/main/docs/model_support.md).

In addition to local models, this project also supports direct access to online models such as the OpenAI API and Zhipu AI; for configuration, see the `llm_model_dict` settings in `configs/model_configs.py.example`.

Currently supported online LLM models:
- [ChatGPT](https://api.openai.com)
- [Zhipu AI](http://open.bigmodel.cn)
- [MiniMax](https://api.minimax.chat)
- [iFLYTEK Spark](https://xinghuo.xfyun.cn)
- [Baidu Qianfan](https://cloud.baidu.com/product/wenxinworkshop?track=dingbutonglan)
- [Alibaba Cloud Tongyi Qianwen](https://dashscope.aliyun.com/)
The default LLM used in the project is `THUDM/chatglm2-6b`; to use another LLM, modify `llm_model_dict` and `LLM_MODEL` in [configs/model_config.py].
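For illustration, such an edit in `model_config.py` might look like the sketch below (field names follow the `model_config.py.example` template; the path is a placeholder, not a real location):

```python
# configs/model_config.py -- illustrative excerpt; the local path is a placeholder.
llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
        "api_base_url": "http://localhost:8888/v1",
        "api_key": "EMPTY",
    },
}

# Switch the default model by pointing this at any key of llm_model_dict.
LLM_MODEL = "chatglm2-6b"
```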
### Supported Embedding Models

This project supports Embedding models from [HuggingFace](https://huggingface.co/models?pipeline_tag=sentence-similarity); currently supported Embedding models:
- [moka-ai/m3e-small](https://huggingface.co/moka-ai/m3e-small)
- [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base)
- [moka-ai/m3e-large](https://huggingface.co/moka-ai/m3e-large)
- [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh)
- [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh)
- [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh)
- [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5)
- [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5)
- [BAAI/bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct)
- [sensenova/piccolo-base-zh](https://huggingface.co/sensenova/piccolo-base-zh)
- [sensenova/piccolo-large-zh](https://huggingface.co/sensenova/piccolo-large-zh)
- [shibing624/text2vec-base-chinese-sentence](https://huggingface.co/shibing624/text2vec-base-chinese-sentence)
- [shibing624/text2vec-base-chinese-paraphrase](https://huggingface.co/shibing624/text2vec-base-chinese-paraphrase)
- [shibing624/text2vec-base-multilingual](https://huggingface.co/shibing624/text2vec-base-multilingual)
- [shibing624/text2vec-base-chinese](https://huggingface.co/shibing624/text2vec-base-chinese)
- [shibing624/text2vec-bge-large-chinese](https://huggingface.co/shibing624/text2vec-bge-large-chinese)
- [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese)
- [nghuyong/ernie-3.0-nano-zh](https://huggingface.co/nghuyong/ernie-3.0-nano-zh)
- [nghuyong/ernie-3.0-base-zh](https://huggingface.co/nghuyong/ernie-3.0-base-zh)
- [OpenAI/text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings)
The default Embedding model used in the project is `sensenova/piccolo-base-zh`; to use another Embedding model, modify `embedding_model_dict` and `EMBEDDING_MODEL` in [configs/model_config.py].
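Likewise, a minimal sketch of the Embedding switch (entries and paths are placeholders following the same template):

```python
# configs/model_config.py -- illustrative excerpt; paths are placeholders.
embedding_model_dict = {
    "piccolo-base-zh": "sensenova/piccolo-base-zh",
    "m3e-base": "/Users/xxx/Downloads/m3e-base",  # a locally downloaded model
}

# Switch the default by pointing this at any key of embedding_model_dict.
EMBEDDING_MODEL = "piccolo-base-zh"
```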
---
### Custom Text Splitter Support

This project supports the Text Splitters from [Langchain](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.text_splitter) as well as custom splitters improved upon them; currently supported Text Splitter types:
- CharacterTextSplitter
- LatexTextSplitter
- MarkdownHeaderTextSplitter
- MarkdownTextSplitter
- NLTKTextSplitter
- PythonCodeTextSplitter
- RecursiveCharacterTextSplitter
- SentenceTransformersTokenTextSplitter
- SpacyTextSplitter
Custom splitters already supported:
- [AliTextSplitter](text_splitter/ali_text_splitter.py)
- [ChineseRecursiveTextSplitter](text_splitter/chinese_recursive_text_splitter.py)
- [ChineseTextSplitter](text_splitter/chinese_text_splitter.py)
The default Text Splitter used in the project is `ChineseRecursiveTextSplitter`; to use another Text Splitter, modify `text_splitter_dict` and `TEXT_SPLITTER` in [configs/model_config.py].
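For illustration, the switch in `model_config.py` might look like this (a sketch following the names in the paragraph above, not the verbatim config file):

```python
# configs/model_config.py -- illustrative excerpt.
text_splitter_dict = {
    "ChineseRecursiveTextSplitter": {
        "source": "huggingface",       # or "tiktoken" for OpenAI's method
        "tokenizer_name_or_path": "",  # empty: fall back to the LLM's tokenizer
    },
}
TEXT_SPLITTER = "ChineseRecursiveTextSplitter"  # any key of text_splitter_dict
```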
For how to use a custom splitter or contribute your own, see the [Text Splitter contribution guide](docs/splitter.md).
## Agent Ecosystem

### Basic Agent

In this version, we implement a simple ReAct-style Agent based on the OpenAI format; in our testing so far, only the following two models support it:
+ OpenAI GPT4
+ Qwen-14B-Chat
### Build Your Own Agent Tools

See the [custom Agent instructions](docs/自定义Agent.md) for details.
## Docker Deployment

🐳 Docker image address: `registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5`

```shell
docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5
```

- This version's image is `35.3GB` in size, uses `v0.2.5`, and is built on the `nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04` base image
- This version ships two built-in `embedding` models, `m3e-large` and `text2vec-bge-large-chinese` (the latter enabled by default), plus a built-in `chatglm2-6b-32k`
- This version aims at convenient one-click deployment; make sure you have installed the NVIDIA driver on your Linux distribution
- Note that you do not need to install the CUDA toolkit on the host system, but you do need to install the `NVIDIA Driver` and the `NVIDIA Container Toolkit`; see the [installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- The first pull and the first startup both take some time; on first startup, use `docker logs -f <container id>` to watch the logs
- If startup hangs at the `Waiting..` step, it is recommended to use `docker exec -it <container id> bash` to enter the container and check the logs for the corresponding stage under `/logs/`

---

## Development Deployment

### Software Requirements

This project has been tested under Python 3.8.1 - 3.10 and CUDA 11.7, on Windows, ARM-based macOS, and Linux.

### 1. Development Environment Setup

See [development environment setup](docs/INSTALL.md).

**Note:** the dependencies of `0.2.3` and later may conflict with those of `0.1.x`; we strongly recommend creating a fresh environment and reinstalling the dependencies.

First, check your Python version:

```shell
$ python --version
Python 3.10.12
```

Then create a virtual environment and install the project's dependencies inside it:

```shell
# Pull the repository
$ git clone https://github.com/chatchat-space/Langchain-Chatchat.git
# Enter the directory
$ cd Langchain-Chatchat
# Install all dependencies
$ pip install -r requirements.txt
$ pip install -r requirements_api.txt
$ pip install -r requirements_webui.txt
# The default dependencies include the basic runtime environment (FAISS vector store). To use vector stores such as milvus/pg_vector, uncomment the corresponding dependencies in requirements.txt before installing.
```

### 2. Model Download
To run this project locally or offline, you first need to download the required models to your local machine. Open-source LLM and Embedding models can usually be downloaded from [HuggingFace](https://huggingface.co/models).
@ -265,134 +92,26 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
To download models, first [install Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage), then run:
```Shell
$ git lfs install
$ git clone https://huggingface.co/THUDM/chatglm2-6b
$ git clone https://huggingface.co/moka-ai/m3e-base
```
### 3. Setting Configuration Items

Copy the model parameter configuration template [configs/model_config.py.example](configs/model_config.py.example) into the `./configs` path under the project and rename it `model_config.py`.

Copy the service parameter configuration template [configs/server_config.py.example](configs/server_config.py.example) into the `./configs` path under the project and rename it `server_config.py`.

Before running the Web UI or command-line interaction, check that the model parameters in [configs/model_config.py](configs/model_config.py) and [configs/server_config.py](configs/server_config.py) meet your needs:

- Make sure the local storage path of each downloaded LLM model is written into the `local_model_path` attribute of the corresponding model in `llm_model_dict`, e.g.:
```
"chatglm2-6b": "/Users/xxx/Downloads/chatglm2-6b",
```
- Make sure the local storage path of each downloaded Embedding model is written at the corresponding model entry in `embedding_model_dict`, e.g.:
```
"m3e-base": "/Users/xxx/Downloads/m3e-base",
```
- Make sure the local tokenizer path is filled in, e.g.:
```
text_splitter_dict = {
    "ChineseRecursiveTextSplitter": {
        "source": "huggingface",  ## select "tiktoken" to use OpenAI's method; if unset, defaults to character-length splitting.
        "tokenizer_name_or_path": "",  ## leave empty to default to the LLM's own tokenizer.
    }
}
```
If you choose to use an OpenAI Embedding model, write the model's `key` into `embedding_model_dict`. To use this model, you need access to the official OpenAI API, or a proxy.
### 4. Knowledge Base Initialization and Migration

The project's knowledge base information is stored in a database; please initialize the database before running the project for the first time (we strongly recommend backing up your knowledge files before performing any operations).

- If you are upgrading from `0.1.x`: for an existing knowledge base, confirm that its vector store type and Embedding model match the defaults in `configs/model_config.py`. If nothing has changed, simply add the existing knowledge base information to the database with:
```shell
$ python init_database.py
```
- If you are running the project for the first time and the knowledge base has not been built yet, or the knowledge base type or embedding model in the config has changed, or the previous vector store was built without `normalize_L2`, initialize or rebuild the knowledge base with:
```shell
$ python init_database.py --recreate-vs
```
### 3. Initialize the Knowledge Base and Config Files

Initialize your own knowledge base and copy the example config files as follows:

```shell
$ python copy_config_example.py
$ python init_database.py --recreate-vs
```

### 4. One-Click Startup

Start the project with the following command:

```shell
$ python startup.py -a
```

All running services can be shut down directly with `Ctrl + C`. If a single press does not stop everything, press it a few more times.

### 5. Launch the API Service or Web UI with One Command

#### 5.1 Launch Command

The one-click startup script startup.py launches all FastChat services, the API service, and the WebUI service at once; the optional arguments are listed below.
Optional arguments include `-a` (or `--all-webui`), `--all-api`, `--llm-api`, `-c` (or `--controller`), `--openai-api`, `-m` (or `--model-worker`), `--api`, and `--webui`, where:

- `--all-webui` launches the WebUI and all of its dependent services;
- `--all-api` launches the API and all of its dependent services;
- `--llm-api` launches all LLM services that FastChat depends on;
- `--openai-api` launches only FastChat's controller and openai-api-server services;
- the other options launch individual services.
#### 5.2 Launching a Non-Default Model

To specify a non-default model, use the `--model-name` option. Example:
```shell
$ python startup.py --all-webui --model-name Qwen-7B-Chat
```
More information is available via `python startup.py -h`.
#### 5.3 Multi-GPU Loading

The project supports loading models across multiple GPUs. Modify the following three parameters in the create_model_worker_app function in startup.py:
```python
gpus=None,
num_gpus= 1,
max_gpu_memory="20GiB"
```
Here, `gpus` controls the IDs of the GPUs to use, e.g. "0,1";
`num_gpus` controls the number of GPUs to use;
and `max_gpu_memory` controls the amount of VRAM to use on each GPU.

Note 1: the FSCHAT_MODEL_WORKERS dict in server_config.py also contains these settings; if needed, multi-GPU loading can also be achieved by modifying the corresponding parameters in the FSCHAT_MODEL_WORKERS dict.

Note 2: in rare cases the gpus parameter does not take effect; in that case, set the environment variable CUDA_VISIBLE_DEVICES to specify the GPUs visible to torch, for example:
```shell
CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
```
#### 5.4 Loading PEFT (including lora, p-tuning, prefix tuning, prompt tuning, ia3, etc.)

This project loads the LLM service via FastChat, so PEFT paths must be loaded the FastChat way. For models other than chatglm, falcon, and codet5p, and for PEFT methods other than p-tuning, the steps are as follows (see the sketch after this list):

1. Rename the config.json file produced by PEFT training to adapter_config.json;
2. Rename the folder so that its name contains the word 'peft';
3. Set the environment variable `PEFT_SHARE_BASE_WEIGHTS=true`, then run python startup.py -a

For p-tuning and chatglm models, fastchat needs substantial modification; see [loading p-tuning with chatchat](docs/chatchat加载ptuning.md) for the detailed steps.
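A minimal Python sketch of steps 1-3 above (the paths are placeholders; point them at your own PEFT training output):

```python
import os
import shutil

# Placeholder paths -- replace with your own PEFT training output.
src = "/models/my-lora-output"
dst = "/models/chatglm2-6b-peft-lora"  # step 2: folder name must contain "peft"

shutil.copytree(src, dst)
# Step 1: FastChat expects the PEFT config to be named adapter_config.json.
os.rename(os.path.join(dst, "config.json"),
          os.path.join(dst, "adapter_config.json"))

# Step 3: share base weights, then launch:
#   PEFT_SHARE_BASE_WEIGHTS=true python startup.py -a
```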
#### **5.5 Notes:**

**1. The startup script launches each module's service in a separate process, which may scramble the print order. Wait until all services have started before making calls, and call each service on its default or configured port (default LLM API service port: `127.0.0.1:8888`; default API service port: `127.0.0.1:7861`; default WebUI service port: `host IP:8501`).**

**2. Service startup time varies by device, usually 3-10 minutes. If it has not started after a long time, go to the `./logs` directory to check the logs and locate the problem.**

**3. On Linux, exiting with ctrl+C may leave orphan processes behind due to Linux's multiprocessing mechanism; you can exit via shutdown_all.sh instead.**
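Once the services are up, the FastChat endpoint speaks the OpenAI protocol, so it can be smoke-tested with the `openai` Python client; a minimal sketch assuming the default port and model above (openai<1.0 style API):

```python
import openai

# Point the client at the local FastChat openai-api-server (default port above).
openai.api_base = "http://127.0.0.1:8888/v1"
openai.api_key = "EMPTY"  # the local server does not check the key

resp = openai.ChatCompletion.create(
    model="chatglm2-6b",  # any model name loaded by a model worker
    messages=[{"role": "user", "content": "你好"}],
)
print(resp.choices[0].message.content)
```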
#### 5.6 Startup Interface Examples

If startup succeeds, you will see the following interfaces:

1. FastAPI docs interface
@ -408,71 +127,32 @@ CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
![](img/init_knowledge_base.jpg)
---
## FAQ

See the [FAQ](docs/FAQ.md).

### Note

The steps above are only meant for a quick start. If you need more features or a customized launch method, please refer to the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).
---
## Roadmap
- [X] Langchain applications
  - [X] Local data ingestion
    - [X] Unstructured document ingestion
      - [X] .md
      - [X] .txt
      - [X] .docx
    - [ ] Structured data ingestion
      - [X] .csv
      - [ ] .xlsx
    - [ ] Text splitting and retrieval
      - [X] Multiple types of TextSplitter
      - [X] ChineseTextSplitter optimized for Chinese punctuation
      - [ ] Re-implemented context-stitching retrieval
    - [ ] Local web page ingestion
    - [ ] SQL ingestion
    - [ ] Knowledge graph / graph database ingestion
  - [X] Search engine integration
    - [X] Bing search
    - [X] DuckDuckGo search
    - [X] Metaphor search
  - [X] Agent implementation
    - [X] Basic ReAct-style Agent, including calls to calculators, etc.
    - [X] Langchain's built-in Agent implementations and calls
    - [X] Intelligent calls to different databases and online knowledge
    - [ ] More tools
- [X] LLM model integration
  - [X] Support calling LLMs through the [FastChat](https://github.com/lm-sys/fastchat) API
  - [X] Support for LLM APIs such as the ChatGLM API
- [X] Embedding model integration
  - [X] Support for open-source Embedding models from HuggingFace
  - [X] Support for Embedding APIs such as the OpenAI Embedding API
- [X] API invocation based on FastAPI
- [X] Web UI
  - [X] Streamlit-based Web UI
---
## Contact Us

### Telegram

[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white "langchain-chatglm")](https://t.me/+RjliQ3jnJ1YyN2E9)

### Project WeChat Group

<img src="img/qr_code_67.jpg" alt="QR code" width="300" height="300" />

🎉 The Langchain-Chatchat project WeChat group: if you are also interested in this project, you are welcome to join the group chat and take part in the discussion.

### Official WeChat Account

<img src="img/official_account.png" alt="image" width="900" height="300" />

🎉 The official WeChat account of the Langchain-Chatchat project; welcome to scan the QR code and follow.
## List of Partners

🎉 Partners of the Langchain-Chatchat project. Thanks to the following partners for their support of this project.

+ [AutoDL: flexible, easy-to-use, cost-saving cloud GPU rental. Short on GPUs? Go to AutoDL.com](https://www.autodl.com)
+ [Baichuan Intelligence](https://www.baichuan-ai.com/home)
+ [ChatGLM: one of the earliest Chinese chat models](https://chatglm.cn/)

README_en.md

@ -1,34 +1,32 @@
![](img/logo-long-chatchat-trans-v2.png)
[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white "langchain-chatglm")](https://t.me/+RjliQ3jnJ1YyN2E9)
🌍 [Chinese README](README.md)
📃 **LangChain-Chatchat** (formerly Langchain-ChatGLM):
An LLM application aiming to implement knowledge-based and search-engine-based QA built on Langchain and open-source or remote LLM APIs.

## Content

* [Introduction](README_en.md#Introduction)
* [Change Log](README_en.md#Change-Log)
* [Supported Models](README_en.md#Supported-Models)
* [Docker Deployment](README_en.md#Docker-Deployment)
* [Development](README_en.md#Development)
  * [Environment Prerequisite](README_en.md#Environment-Prerequisite)
  * [Preparing Deployment Environment](README_en.md#1.-Preparing-Deployment-Environment)
  * [Downloading model to local disk](README_en.md#2.-Downloading-model-to-local-disk)
  * [Setting Configuration](README_en.md#3.-Setting-Configuration)
  * [Knowledge Base Migration](README_en.md#4.-Knowledge-Base-Migration)
  * [Launching API Service or WebUI](README_en.md#5.-Launching-API-Service-or-WebUI-with-One-Command)
* [FAQ](README_en.md#FAQ)
* [Roadmap](README_en.md#Roadmap)
---
## Table of Contents
- [Introduction](README.md#Introduction)
- [Pain Points Addressed](README.md#Pain-Points-Addressed)
- [Quick Start](README.md#Quick-Start)
- [1. Environment Setup](README.md#1-Environment-Setup)
- [2. Model Download](README.md#2-Model-Download)
- [3. Initialize Knowledge Base and Configuration Files](README.md#3-Initialize-Knowledge-Base-and-Configuration-Files)
- [4. One-Click Startup](README.md#4-One-Click-Startup)
- [5. Startup Interface Examples](README.md#5-Startup-Interface-Examples)
- [Contact Us](README.md#Contact-Us)
- [List of Partner Organizations](README.md#List-of-Partner-Organizations)
## Introduction
🤖️ A Q&A application built on a local knowledge base, implemented using the ideas of [langchain](https://github.com/hwchase17/langchain). The goal is to build a KBQA (Knowledge-Based Q&A) solution that is friendly to Chinese scenarios and open-source models and can run both offline and online.

💡 Inspired by [document.ai](https://github.com/GanymedeNil/document.ai) and [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216), we built a local knowledge base question-answering application whose full pipeline can be implemented with an open-source model or a remote LLM API. In the latest version of this project, [FastChat](https://github.com/lm-sys/FastChat) is used to access Vicuna, Alpaca, LLaMA, Koala, RWKV, and many other models. Relying on [langchain](https://github.com/langchain-ai/langchain), this project supports calling services through the API provided via [FastAPI](https://github.com/tiangolo/fastapi), or using the WebUI based on [Streamlit](https://github.com/streamlit/streamlit).

✅ Relying on open-source LLM and Embedding models, this project enables full-process **offline private deployment**. At the same time, this project also supports calling the OpenAI GPT API and the Zhipu API, and will continue to expand access to various models and remote APIs in the future.

⛓️ The implementation principle of this project is shown in the graph below. The main process includes: loading files -> reading text -> text segmentation -> text vectorization -> question vectorization -> matching the `top-k` text vectors most similar to the question vector -> adding the matched text to the `prompt` as context along with the question -> submitting to the `LLM` to generate an answer.
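Read as code, the flow above reduces to a handful of steps; a self-contained toy sketch follows (bag-of-words vectors and a string prompt stand in for the real embedding model, vector store, and LLM):

```python
from collections import Counter
from math import sqrt

def split(text: str, size: int = 50) -> list[str]:
    """Text segmentation: fixed-size character chunks (toy stand-in)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Vectorization: bag-of-words counts stand in for a real embedding."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    chunks = [c for d in documents for c in split(d)]      # load + segment text
    q_vec = embed(question)                                # vectorize question
    ranked = sorted(chunks, key=lambda c: -cosine(embed(c), q_vec))
    context = "\n".join(ranked[:top_k])                    # top-k matched chunks
    return f"Context:\n{context}\n\nQuestion: {question}"  # handed to the LLM

print(build_prompt("What is Chatchat?", ["Langchain-Chatchat is a local KBQA app."]))
```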
@ -47,382 +45,102 @@ The main process analysis from the aspect of document process:
🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.0)

💻 Run Docker with one command:

```shell
docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.0
```

## Pain Points Addressed

This project is a solution for enhancing knowledge bases with fully localized inference, specifically addressing the pain points of data security and private deployment for businesses.

This open-source solution is under the Apache License and can be used commercially for free, with no fees required.

We support mainstream local large language models and Embedding models available on the market, as well as open-source local vector databases. For a detailed list of supported models and databases, please refer to our [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/)

## Quick Start

### Environment Setup

First, make sure your machine has Python 3.10 installed.
---
## Environment Minimum Requirements
To run this code smoothly, please configure it according to the following minimum requirements:
+ Python version: >= 3.8.5, < 3.11
+ CUDA version: >= 11.7, with Python successfully installed.

If you want to run the local model (int4 quantization) on a GPU without problems, you need at least the following hardware configuration:

+ chatglm2-6b & LLaMA-7B: minimum VRAM 7GB; recommended graphics cards: RTX 3060, RTX 2060
+ LLaMA-13B: minimum VRAM 11GB; recommended cards: RTX 2060 12GB, RTX 3060 12GB, RTX 3080, RTX A2000
+ Qwen-14B-Chat: minimum VRAM 13GB; recommended graphics card: RTX 3090
+ LLaMA-30B: minimum VRAM 22GB; recommended cards: RTX A5000, RTX 3090, RTX 4090, RTX 6000, Tesla V100, RTX Tesla P40
+ LLaMA-65B: minimum VRAM 40GB; recommended cards: A100, A40, A6000

For int8, multiply the VRAM requirement by 1.5; for fp16, by 2.5.
For example, running fp16 inference with the Qwen-7B-Chat model requires 16GB of VRAM.

The above are only estimates; actual usage as reported by nvidia-smi prevails.
## Change Log
Please refer to the [version change log](https://github.com/imClumsyPanda/langchain-ChatGLM/releases).
### Current Features
* **Consistent LLM service based on FastChat**. The project uses [FastChat](https://github.com/lm-sys/FastChat) to provide the API service for open-source LLM models, accessed through an OpenAI-compatible API interface to improve LLM loading;
* **Chain and Agent based on Langchain**. Uses the existing Chain implementations in [langchain](https://github.com/langchain-ai/langchain) to facilitate subsequent access to different types of Chains, with Agent access under testing;
* **Full-function API service based on FastAPI**. All interfaces can be tested in the docs automatically generated by [FastAPI](https://github.com/tiangolo/fastapi), and all dialogue interfaces support streaming or non-streaming output via parameters;
* **WebUI service based on Streamlit**. With [Streamlit](https://github.com/streamlit/streamlit), you can choose whether to start the WebUI on top of the API service; adds session management, customizable and switchable session themes, and will support different display forms of output content in the future;
* **Abundant open-source LLM and Embedding models**. The default LLM model in the project is changed to [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and the default Embedding model to [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base); the file loading method and paragraph division method have also been adjusted. In the future, context expansion will be re-implemented and optional settings will be added;
* **Multiple vector stores**. The project has expanded support for different types of vector stores, including [FAISS](https://github.com/facebookresearch/faiss), [Milvus](https://milvus.io/), [Zilliz](https://zilliz.com/), and [PGVector](https://github.com/pgvector/pgvector);
* **Varied search engines**. We currently provide two search engines: Bing and DuckDuckGo. DuckDuckGo search requires no API Key and can be used directly in environments with access to foreign services.
## Supported Models
The default LLM model in the project is changed to [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b), and the default Embedding model to [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base).
### Supported LLM models
The project uses [FastChat](https://github.com/lm-sys/FastChat) to provide the API service for open-source LLM models; supported models include:
- [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
- Vicuna, Alpaca, LLaMA, Koala
- [BlinkDL/RWKV-4-Raven](https://huggingface.co/BlinkDL/rwkv-4-raven)
- [camel-ai/CAMEL-13B-Combined-Data](https://huggingface.co/camel-ai/CAMEL-13B-Combined-Data)
- [databricks/dolly-v2-12b](https://huggingface.co/databricks/dolly-v2-12b)
- [FreedomIntelligence/phoenix-inst-chat-7b](https://huggingface.co/FreedomIntelligence/phoenix-inst-chat-7b)
- [h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b](https://huggingface.co/h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b)
- [lcw99/polyglot-ko-12.8b-chang-instruct-chat](https://huggingface.co/lcw99/polyglot-ko-12.8b-chang-instruct-chat)
- [lmsys/fastchat-t5-3b-v1.0](https://huggingface.co/lmsys/fastchat-t5)
- [mosaicml/mpt-7b-chat](https://huggingface.co/mosaicml/mpt-7b-chat)
- [Neutralzz/BiLLa-7B-SFT](https://huggingface.co/Neutralzz/BiLLa-7B-SFT)
- [nomic-ai/gpt4all-13b-snoozy](https://huggingface.co/nomic-ai/gpt4all-13b-snoozy)
- [NousResearch/Nous-Hermes-13b](https://huggingface.co/NousResearch/Nous-Hermes-13b)
- [openaccess-ai-collective/manticore-13b-chat-pyg](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg)
- [OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5](https://huggingface.co/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5)
- [project-baize/baize-v2-7b](https://huggingface.co/project-baize/baize-v2-7b)
- [Salesforce/codet5p-6b](https://huggingface.co/Salesforce/codet5p-6b)
- [StabilityAI/stablelm-tuned-alpha-7b](https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b)
- [THUDM/chatglm-6b](https://huggingface.co/THUDM/chatglm-6b)
- [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b)
- [tiiuae/falcon-40b](https://huggingface.co/tiiuae/falcon-40b)
- [timdettmers/guanaco-33b-merged](https://huggingface.co/timdettmers/guanaco-33b-merged)
- [togethercomputer/RedPajama-INCITE-7B-Chat](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Chat)
- [WizardLM/WizardLM-13B-V1.0](https://huggingface.co/WizardLM/WizardLM-13B-V1.0)
- [WizardLM/WizardCoder-15B-V1.0](https://huggingface.co/WizardLM/WizardCoder-15B-V1.0)
- [baichuan-inc/baichuan-7B](https://huggingface.co/baichuan-inc/baichuan-7B)
- [internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)
- [Qwen/Qwen-7B-Chat/Qwen-14B-Chat](https://huggingface.co/Qwen/)
- [HuggingFaceH4/starchat-beta](https://huggingface.co/HuggingFaceH4/starchat-beta)
- [FlagAlpha/Llama2-Chinese-13b-Chat](https://huggingface.co/FlagAlpha/Llama2-Chinese-13b-Chat) and other models of FlagAlpha
- [BAAI/AquilaChat-7B](https://huggingface.co/BAAI/AquilaChat-7B)
- [all models of OpenOrca](https://huggingface.co/Open-Orca)
- [Spicyboros](https://huggingface.co/jondurbin/spicyboros-7b-2.2?not-for-all-audiences=true) + [airoboros 2.2](https://huggingface.co/jondurbin/airoboros-l2-13b-2.2)
- [baichuan2-7b/baichuan2-13b](https://huggingface.co/baichuan-inc)
- [VMware's OpenLLaMa OpenInstruct](https://huggingface.co/VMware/open-llama-7b-open-instruct)
- Any [EleutherAI](https://huggingface.co/EleutherAI) pythia model such as [pythia-6.9b](https://huggingface.co/EleutherAI/pythia-6.9b)
- Any [Peft](https://github.com/huggingface/peft) adapter trained on top of a model above. To activate, the model path must contain `peft`. Note: if loading multiple peft models, you can have them share the base model weights by setting the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` in any model worker.
The above model support list may be updated continuously as [FastChat](https://github.com/lm-sys/FastChat) is updated, see [FastChat Supported Models List](https://github.com/lm-sys/FastChat/blob/main/docs/model_support.md).
In addition to local models, this project also supports direct access to online models such as the OpenAI API and Zhipu AI. For specific settings, please refer to the configuration of `llm_model_dict` in `configs/model_configs.py.example`.

Currently supported online LLM models:

- [ChatGPT](https://api.openai.com)
- [Zhipu AI](http://open.bigmodel.cn)
- [MiniMax](https://api.minimax.chat)
- [iFLYTEK Spark](https://xinghuo.xfyun.cn)
- [Baidu Qianfan](https://cloud.baidu.com/product/wenxinworkshop?track=dingbutonglan)
- [Alibaba Cloud Tongyi Qianwen](https://dashscope.aliyun.com/)
The default LLM type used in the project is `THUDM/chatglm2-6b`; if you need to use another LLM type, please modify `llm_model_dict` and `LLM_MODEL` in [configs/model_config.py].
### Supported Embedding models
The following Embedding models from [HuggingFace](https://huggingface.co/models?pipeline_tag=sentence-similarity) have been tested by the developers:
- [moka-ai/m3e-small](https://huggingface.co/moka-ai/m3e-small)
- [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base)
- [moka-ai/m3e-large](https://huggingface.co/moka-ai/m3e-large)
- [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh)
- [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh)
- [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5)
- [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5)
- [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh)
- [BAAI/bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct)
- [sensenova/piccolo-base-zh](https://huggingface.co/sensenova/piccolo-base-zh)
- [sensenova/piccolo-large-zh](https://huggingface.co/sensenova/piccolo-large-zh)
- [shibing624/text2vec-base-chinese-sentence](https://huggingface.co/shibing624/text2vec-base-chinese-sentence)
- [shibing624/text2vec-base-chinese-paraphrase](https://huggingface.co/shibing624/text2vec-base-chinese-paraphrase)
- [shibing624/text2vec-base-multilingual](https://huggingface.co/shibing624/text2vec-base-multilingual)
- [shibing624/text2vec-base-chinese](https://huggingface.co/shibing624/text2vec-base-chinese)
- [shibing624/text2vec-bge-large-chinese](https://huggingface.co/shibing624/text2vec-bge-large-chinese)
- [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese)
- [nghuyong/ernie-3.0-nano-zh](https://huggingface.co/nghuyong/ernie-3.0-nano-zh)
- [nghuyong/ernie-3.0-base-zh](https://huggingface.co/nghuyong/ernie-3.0-base-zh)
- [OpenAI/text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings)
The default Embedding type used in the project is `sensenova/piccolo-base-zh`; if you want to use another Embedding type, please modify `embedding_model_dict` and `EMBEDDING_MODEL` in [configs/model_config.py].
### Build your own Agent tool
Currently, only the following models support Agents:
+ OpenAI GPT4
+ Qwen-14B-Chat
See [Custom Agent Instructions](docs/自定义Agent.md) for details.
---
## Docker Deployment

🐳 Docker image path: `registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5`

```shell
docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5
```

- The image size of this version is `33.9GB`, using `v0.2.0`, with `nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04` as the base image
- This version has a built-in `embedding` model, `m3e-large`, and a built-in `chatglm2-6b-32k`
- This version is designed to facilitate one-click deployment; please make sure you have installed the NVIDIA driver on your Linux distribution
- Please note that you do not need to install the CUDA toolkit on the host system, but you do need to install the `NVIDIA Driver` and the `NVIDIA Container Toolkit`; please refer to the [Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- The first pull and the first startup both take some time. When starting for the first time, use `docker logs -f <container id>` to view the logs.
- If the startup process is stuck at the `Waiting..` step, it is recommended to use `docker exec -it <container id> bash` to enter the container and view the logs for the corresponding stage under the `/logs/` directory

---

## Development

### Environment Prerequisite

The project has been tested under Python 3.8 - 3.10 and CUDA 11.0 - 11.7, on Windows, ARM-based macOS, and Linux.

### 1. Preparing Deployment Environment

Please refer to [install.md](docs/INSTALL.md).

First, check your Python version:

```shell
$ python --version
Python 3.10.12
```

Then, create a virtual environment and install the project's dependencies within it:

```shell
# Pull the repository
$ git clone https://github.com/chatchat-space/Langchain-Chatchat.git
# Enter the directory
$ cd Langchain-Chatchat
# Install all dependencies
$ pip install -r requirements.txt
$ pip install -r requirements_api.txt
$ pip install -r requirements_webui.txt
# The default dependencies include the basic runtime environment (FAISS vector store). To use vector stores such as milvus/pg_vector, uncomment the corresponding dependencies in requirements.txt before installing.
```

### 2. Downloading Models to Local Disk

**For offline deployment only!**

If you want to run this project in a local or offline environment, you first need to download the required models to your local machine. Usually, open-source LLM and Embedding models can be downloaded from [Hugging Face](https://huggingface.co/models).

Take the default LLM model [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and the Embedding model [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base) as examples:

To download the models, you need to first install [Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage) and then run:
```Shell
$ git lfs install
$ git clone https://huggingface.co/THUDM/chatglm2-6b
$ git clone https://huggingface.co/moka-ai/m3e-base
```
### 3. Setting Configuration

Copy the model-related parameter configuration template file [configs/model_config.py.example](configs/model_config.py.example) into the `./configs` path under the project and rename it `model_config.py`.

Copy the service-related parameter configuration template file [configs/server_config.py.example](configs/server_config.py.example) into the `./configs` path under the project and rename it `server_config.py`.

Before running the Web UI or command-line interaction, check that the model parameters in [configs/model_config.py](configs/model_config.py) and [configs/server_config.py](configs/server_config.py) meet your requirements:

- Please make sure that the local storage path of the downloaded LLM model is written in the `local_model_path` attribute of the corresponding model in `llm_model_dict`, e.g.:

```
"chatglm2-6b": "/Users/xxx/Downloads/chatglm2-6b",
```

- Please make sure that the local storage path of the downloaded Embedding model is written at the entry of the corresponding model in `embedding_model_dict`, e.g.:

```
"m3e-base": "/Users/xxx/Downloads/m3e-base",
```

- Please make sure that the local tokenizer path is filled in, e.g.:

```
text_splitter_dict = {
    "ChineseRecursiveTextSplitter": {
        "source": "huggingface",  ## select "tiktoken" to use OpenAI's method; if unset, defaults to character-length splitting.
        "tokenizer_name_or_path": "",  ## leave blank to default to the LLM's own tokenizer.
    }
}
```
### 4. Knowledge Base Migration
The knowledge base information is stored in the database; please initialize the database before running the project (we strongly recommend backing up your knowledge files before performing any operations).

- If you are migrating from `0.1.x`: for an established knowledge base, please confirm that its vector store type and Embedding model are consistent with the default settings in `configs/model_config.py`. If nothing has changed, simply add the existing knowledge base information to the database with the following command:
```shell
$ python init_database.py
```
- If you are running the project for the first time and the knowledge base has not been established yet, or the knowledge base type or embedding model in the configuration file has changed, or the previous vector store was built without `normalize_L2`, run the following command to initialize or rebuild the knowledge base:
```shell
$ python init_database.py --recreate-vs
```
### Initializing the Knowledge Base and Config File

Follow the steps below to initialize your own knowledge base and config file:

```shell
$ python copy_config_example.py
$ python init_database.py --recreate-vs
```

### One-Click Launch

To start the project, run the following command:

```shell
$ python startup.py -a
```

### 5. Launching API Service or WebUI with One Command

#### 5.1 Command

The script is `startup.py`; you can launch all FastChat-related services, the API service, and the WebUI service with it. The optional arguments are listed below.
Optional args include `-a` (or `--all-webui`), `--all-api`, `--llm-api`, `-c` (or `--controller`), `--openai-api`, `-m` (or `--model-worker`), `--api`, and `--webui`, where:

* `--all-webui` launches all related services of the WebUI
* `--all-api` launches all related services of the API
* `--llm-api` launches all related services of FastChat
* `--openai-api` launches only the controller and openai-api-server of FastChat
* `--model-worker` launches only the model worker of FastChat
* any other optional arg launches one particular service only
#### 5.2 Launching a Non-Default Model

If you want to specify a non-default model, use the `--model-name` arg; here is an example:
```shell
$ python startup.py --all-webui --model-name Qwen-7B-Chat
```
#### 5.3 Loading a Model with Multiple GPUs

If you want to load a model across multiple GPUs, change the following three parameters in `startup.create_model_worker_app`:
```python
gpus=None,
num_gpus=1,
max_gpu_memory="20GiB"
```
where:

* `gpus` specifies the IDs of the GPUs to use, such as '0,1';
* `num_gpus` specifies the number of GPUs to use under `gpus`;
* `max_gpu_memory` specifies the GPU memory to use on each GPU.

Notes:

* These parameters can now also be specified via `server_config.FSCHAT_MODEL_WORKERS`.
* In some rare cases, `gpus` does not take effect; in that case, specify the GPUs to use with the environment variable `CUDA_VISIBLE_DEVICES`, here is an example:
```shell
CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
```
#### 5.4 Loading PEFT

Including lora, p-tuning, prefix tuning, prompt tuning, ia3, etc.

This project loads the LLM service based on FastChat, so PEFT must be loaded the FastChat way. For models other than chatglm, falcon, or codet5p, and for PEFT methods other than p-tuning: ensure that the word `peft` is in the path name, that the configuration file is named `adapter_config.json`, and that the path contains PEFT weights in `.bin` format. The PEFT path is specified in `args.model_names` of the `create_model_worker_app` function in `startup.py`, with the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` set.

If the above method fails, you need to start the standard fastchat services step by step; the step-by-step procedure can be found in Section 6. For further steps, please refer to [Model invalid after loading lora fine-tuning](https://github.com/chatchat-space/Langchain-Chatchat/issues/1130#issuecomment-1685291822).
#### **5.5 Some Notes**
1. **The `startup.py` script uses multi-process mode to start the services of each module, which may cause printing order problems. Please wait for all services to be initiated before calling, and call each service according to the default or specified port (default LLM API service port: `127.0.0.1:8888`, default API service port: `127.0.0.1:7861`, default WebUI service port: `127.0.0.1:8501`)**
2. **The startup time of the service differs across devices, usually it takes 3-10 minutes. If it does not start for a long time, please go to the `./logs` directory to monitor the logs and locate the problem.**
3. **Using ctrl+C to exit on Linux may cause orphan processes due to the multi-process mechanism of Linux. You can exit through `shutdown_all.sh`**
#### 5.6 Interface Examples

The API, the chat interface of the WebUI, and the knowledge management interface of the WebUI are shown below, respectively.

1. FastAPI docs interface

![](img/fastapi_docs_026.png)

2. WebUI pages

- Web UI dialogue page:

![img](img/LLM_success.png)

- Web UI knowledge base management page:

![](img/init_knowledge_base.jpg)
## FAQ

Please refer to the [FAQ](docs/FAQ.md).

### Note

The above instructions are provided for a quick start. If you need more features or want to customize the launch method, please refer to the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).
---
## Roadmap

- [X] Langchain applications
  - [X] Load local documents
    - [X] Unstructured documents
      - [X] .md
      - [X] .txt
      - [X] .docx
    - [ ] Structured documents
      - [X] .csv
      - [ ] .xlsx
  - [ ] TextSplitter and Retriever
    - [X] Multiple TextSplitters
    - [X] ChineseTextSplitter
    - [ ] Reconstructed Context Retriever
  - [ ] Webpage
  - [ ] SQL
  - [ ] Knowledge Database
- [X] Search Engines
  - [X] Bing
  - [X] DuckDuckGo
  - [X] Metaphor
- [X] Agent
  - [X] Agent implementation in the form of basic ReAct, including calls to calculators, etc.
  - [X] Langchain's own Agent implementations and calls
  - [X] Intelligent calls to different vector databases and networked knowledge
  - [ ] More tools
- [X] LLM Models
  - [X] [FastChat](https://github.com/lm-sys/fastchat)-based LLM models
  - [ ] Multiple remote LLM APIs
- [X] Embedding Models
  - [X] Hugging Face-based Embedding models
  - [ ] Multiple remote Embedding APIs
- [X] FastAPI-based API
- [X] Web UI
  - [X] Streamlit-based Web UI

---

## Contact Us

### Telegram

[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white "langchain-chatglm")](https://t.me/+RjliQ3jnJ1YyN2E9)

### WeChat Group

<img src="img/qr_code_67.jpg" alt="QR Code" width="300" height="300" />

🎉 The Langchain-Chatchat project WeChat exchange group: if you are also interested in this project, you are welcome to join the group chat and take part in the discussion.

### WeChat Official Account

<img src="img/official_account.png" alt="image" width="900" height="300" />

🎉 The official WeChat account of the Langchain-Chatchat project; welcome to scan the QR code and follow.

## Partners

🎉 A big thank you to the following partners for their support of this project.

+ [AutoDL: flexible, easy-to-use, cost-saving cloud GPU rental. Short on GPUs? Go to AutoDL.com](https://www.autodl.com)
+ [ChatGLM: one of the earliest Chinese chat models](https://chatglm.cn/)
+ [Baichuan Intelligence](https://www.baichuan-ai.com/home)