* Beijing hackathon updates

Knowledge base support:
Added support for the Zilliz vector database
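
A minimal sketch (not the project's actual wiring) of what talking to Zilliz through langchain's vector-store wrapper can look like; the endpoint, API key, collection name, and the exact `connection_args` keys are placeholders to check against your langchain/pymilvus versions:

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Zilliz

embeddings = HuggingFaceEmbeddings(model_name="moka-ai/m3e-base")

# Build a tiny collection and query it back.
store = Zilliz.from_texts(
    texts=["Langchain-Chatchat now supports Zilliz as a knowledge base backend."],
    embedding=embeddings,
    collection_name="chatchat_demo",  # hypothetical collection name
    connection_args={
        "uri": "https://<your-endpoint>.zillizcloud.com",  # from the Zilliz Cloud console
        "token": "<your-api-key>",
    },
)
print(store.similarity_search("Which vector stores are supported?", k=1))
```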
Agent support:
The following tool calls are supported:
1. Internet search calls from the Agent
2. Knowledge base calls from the Agent
3. Travel assistant tool (not yet uploaded)

Knowledge base updates
1. Knowledge bases now have an introduction, used for Agent tool selection (see the sketch below)
2. The UI shows the knowledge base introduction
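
As an illustration of how a knowledge base introduction can drive tool selection, here is a hedged sketch using langchain's `Tool` wrapper; the search functions and descriptions are stand-ins, not the project's actual registration code:

```python
from langchain.agents import Tool

def search_knowledge_base(query: str) -> str:
    """Stand-in: call the project's knowledge base search API here."""
    return f"Top matching passages for: {query}"

def search_internet(query: str) -> str:
    """Stand-in: call a search engine (Bing / DuckDuckGo) here."""
    return f"Web results for: {query}"

# The knowledge base introduction doubles as the tool description, which is
# what the agent reads when deciding which tool to call.
tools = [
    Tool(
        name="knowledge_base_search",
        func=search_knowledge_base,
        description="Documents about Langchain-Chatchat deployment and configuration.",
    ),
    Tool(
        name="internet_search",
        func=search_internet,
        description="General web search for up-to-date information.",
    ),
]
```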

Prompt template selection
1. The UI and templates support switching between prompt templates
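
The shape below is only a guess at what a switchable prompt template table might look like; the real schema lives in the project's prompt configuration and may differ:

```python
# Hypothetical template table; keys and slot syntax are illustrative only.
PROMPT_TEMPLATES = {
    "default": "{{ input }}",
    "knowledge_base_chat": (
        "Answer using only the passages below; if they are not enough, say so.\n"
        "<Passages>{{ context }}</Passages>\n"
        "<Question>{{ question }}</Question>"
    ),
}

def render(name: str, **slots: str) -> str:
    """Tiny stand-in for a template renderer."""
    text = PROMPT_TEMPLATES[name]
    for key, value in slots.items():
        text = text.replace("{{ " + key + " }}", value)
    return text

print(render("knowledge_base_chat", context="...", question="How do I add a knowledge base?"))
```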

* Fixed the issue with updating knowledge base introductions in the database

* About the models supported by Langchain itself

1. Fixed a bug where OpenAI models could not be called
2. Added support for Azure OpenAI and Claude models
(Due to a priority issue, the model switching UI will show other online models instead)
3. Fixed the 422 error with an alternative approach.
4. Updated some dependencies
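
For reference, a minimal way to exercise the Azure OpenAI path through langchain's chat wrapper; the deployment name, endpoint, and API version below are placeholders to replace with your Azure resource's values:

```python
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(
    deployment_name="<your-deployment>",      # name of your Azure OpenAI deployment
    openai_api_base="https://<your-resource>.openai.azure.com/",
    openai_api_version="2023-07-01-preview",  # placeholder; use your resource's version
    openai_api_key="<your-azure-key>",
)
print(llm.predict("Say hello in one sentence."))
```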

* Replaced some images
zR 2023-10-20 20:07:59 +08:00 committed by GitHub
parent 109bb7f6c5
commit 46225ad784
12 changed files with 45 additions and 41 deletions

View File

@@ -90,7 +90,7 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
3. API services are provided via [FastAPI](https://github.com/tiangolo/fastapi); every endpoint can be tested in the docs FastAPI generates automatically, and all chat endpoints support streaming or non-streaming output via a parameter;
4. The WebUI is provided via [Streamlit](https://github.com/streamlit/streamlit); starting the WebUI on top of the API service is optional, session management has been added, session topics can be customized and switched, and the display of more output formats may be supported later;
5. The project's default LLM model has been changed to [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and the default Embedding model to [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base); file loading and text splitting have also been adjusted, and context expansion will be re-implemented later with optional settings;
6. The project has broadened its vector store support: besides [FAISS](https://github.com/facebookresearch/faiss), [Milvus](https://github.com/milvus-io/milvus) and [PGVector](https://github.com/pgvector/pgvector) can also be connected;
6. The project has broadened its vector store support: besides [FAISS](https://github.com/facebookresearch/faiss), [Milvus](https://milvus.io/), [Zilliz](https://zilliz.com/), and [PGVector](https://github.com/pgvector/pgvector) can also be connected;
7. For search engine chat, a DuckDuckGo option has been added alongside Bing search; DuckDuckGo requires no API Key and can be used directly in environments with access to foreign services.
---
@@ -218,13 +218,10 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
## Agent Ecosystem
### Basic Agent
We implemented a simple ReAct-style Agent based on OpenAI; in our tests so far, only the following two models support it:
In this version, we implemented a simple ReAct-style Agent based on OpenAI; in our tests so far, only the following two models support it:
+ OpenAI GPT4
+ ChatGLM2-130B
The Agent in the current version still needs extensive prompt tuning; tuning locations:
+ Qwen-14B-Chat
### Build Your Own Agent Tools
@@ -399,17 +396,17 @@ CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
1. FastAPI docs interface
![](img/fastapi_docs_020_0.png)
![](img/fastapi_docs_026.png)
2. Web UI startup interface examples
- Web UI dialogue interface:
![img](img/webui_0915_0.png)
![img](img/LLM_success.png)
- Web UI knowledge base management page:
![](img/webui_0915_1.png)
![](img/init_knowledge_base.jpg)
---
@@ -440,10 +437,11 @@ CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
- [X] Search engine integration
  - [X] Bing search
  - [X] DuckDuckGo search
  - [X] Metaphor search
- [X] Agent implementation
  - [X] Basic ReAct-style Agent implementation, including calculator calls, etc.
  - [X] Implementation and invocation of Langchain's built-in Agents
  - [ ] Agent support for more models
  - [X] Intelligent calls to different databases and online knowledge
  - [ ] More tools
- [X] LLM model integration
  - [X] Calling LLMs through the [FastChat](https://github.com/lm-sys/fastchat) API
@@ -459,7 +457,7 @@ CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
## Project Chat Group
<img src="img/qr_code_64.jpg" alt="QR Code" width="300" height="300" />
<img src="img/qr_code_67.jpg" alt="QR Code" width="300" height="300" />
🎉 The WeChat group of the langchain-Chatchat project. If you are interested in this project, you are welcome to join the group chat and take part in the discussion.
@@ -467,3 +465,14 @@ CUDA_VISIBLE_DEVICES=0,1 python startup.py -a
<img src="img/official_account.png" alt="image" width="900" height="300" />
🎉 The official WeChat account of the langchain-Chatchat project; scan the code to follow.
## Partner List
🎉 Partners of the langchain-Chatchat project; thanks to the following sponsors for supporting it.
+ AutoDL
+ Elastic, easy to use, and money-saving!
+ Elastic, easy-to-use, money-saving cloud GPU rental. Short on GPUs? Go to [AutoDL.com](https://www.autodl.com/)
+ ChatGLM

View File

@@ -28,8 +28,7 @@
🤖️ A Q&A application based on a local knowledge base, implemented using the ideas of [langchain](https://github.com/hwchase17/langchain). The goal is to build a KBQA (knowledge-based Q&A) solution that is friendly to Chinese scenarios and open source models and can run both offline and online.
💡 Inspried by [document.ai](https://github.com/GanymedeNil/document.ai) and [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216) , we build a local knowledge base question answering application that can be implemented using an open source model or remote LLM api throughout the process. In the latest version of this project, [FastChat](https://github.com/lm-sys/FastChat) is used to access Vicuna, Alpaca, LLaMA, Koala, RWKV and many other models. Relying on [langchain](https:// github.com/langchain-ai/langchain) , this project supports calling services through the API provided based on [FastAPI](https://github.com/tiangolo/fastapi), or using the WebUI based on [Streamlit](https://github.com /streamlit/streamlit) .
💡 Inspired by [document.ai](https://github.com/GanymedeNil/document.ai) and [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216), we build a local knowledge base question-answering application that can run end to end on an open source model or a remote LLM API. In the latest version of this project, [FastChat](https://github.com/lm-sys/FastChat) is used to access Vicuna, Alpaca, LLaMA, Koala, RWKV and many other models. Relying on [langchain](https://github.com/langchain-ai/langchain), this project supports calling services through the API provided based on [FastAPI](https://github.com/tiangolo/fastapi), or using the WebUI based on [Streamlit](https://github.com/streamlit/streamlit).
✅ Relying on open source LLM and Embedding models, this project enables full-process **offline private deployment**. It also supports calls to the OpenAI GPT API and the Zhipu API, and will continue to expand access to various models and remote APIs.
⛓️ The implementation principle of this project is shown in the graph below. The main process: loading files -> reading text -> text segmentation -> text vectorization -> question vectorization -> matching the `top-k` text vectors most similar to the question vector -> adding the matched text to the `prompt` as context together with the question -> submitting it to the `LLM` to generate an answer.
@@ -87,7 +86,7 @@ please refer to [version change log](https://github.com/imClumsyPanda/langchain-C
* **Full-function API service based on FastAPI**. All interfaces can be tested in the docs automatically generated by [FastAPI](https://github.com/tiangolo/fastapi), and all dialogue interfaces support streaming or non-streaming output through parameters;
* **WebUI service based on Streamlit**. With [Streamlit](https://github.com/streamlit/streamlit), you can choose whether to start the WebUI on top of the API service; session management has been added, session themes can be customized and switched, and display of more output formats will be supported in the future;
* **Abundant open source LLM and Embedding models**. The default LLM model in the project has been changed to [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b), and the default Embedding model to [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base); the file loading method and the paragraph division method have also been adjusted. In the future, context expansion will be re-implemented and optional settings will be added;
* **Multiply vector libraries**. The project has expanded support for different types of vector libraries. Including [FAISS](https://github.com/facebookresearch/faiss), [Milvus](https://github.com/milvus -io/milvus), and [PGVector](https://github.com/pgvector/pgvector);
* **Multiple vector stores**. The project has expanded support for different types of vector stores, including [FAISS](https://github.com/facebookresearch/faiss), [Milvus](https://milvus.io/), [Zilliz](https://zilliz.com/), and [PGVector](https://github.com/pgvector/pgvector);
* **Varied search engines**. We now provide two search engines: Bing and DuckDuckGo. DuckDuckGo search does not require configuring an API Key and can be used directly in environments with access to foreign services.
## Supported Models
@@ -137,7 +136,9 @@ The project uses [FastChat](https://github.com/lm-sys/FastChat) to provide the AP
* Any [EleutherAI](https://huggingface.co/EleutherAI) pythia model such as [pythia-6.9b](https://huggingface.co/EleutherAI/pythia-6.9b)(任何 [EleutherAI](https://huggingface.co/EleutherAI) 的 pythia 模型,如 [pythia-6.9b](https://huggingface.co/EleutherAI/pythia-6.9b))
* Any [Peft](https://github.com/huggingface/peft) adapter trained on top of a model above. To activate, the model path must contain `peft`. Note: If loading multiple peft models, you can have them share the base model weights by setting the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` in any model worker.
The above model support list may be updated continuously as [FastChat](https://github.com/lm-sys/FastChat) is updated, see [FastChat Supported Models List](https://github.com/lm-sys/FastChat/blob/main /docs/model_support.md).
The above model support list may be updated continuously as [FastChat](https://github.com/lm-sys/FastChat) is updated, see [FastChat Supported Models List](https://github.com/lm-sys/FastChat/blob/main/docs/model_support.md).
In addition to local models, this project also supports direct access to online models such as the OpenAI API and Zhipu AI. For specific settings, please refer to the configuration of `llm_model_dict` in `configs/model_config.py.example`.
The following online LLM models are currently supported:
@@ -179,7 +180,10 @@ Following models are tested by developers with Embedding class of [HuggingFace](
The default Embedding type used in the project is `sensenova/piccolo-base-zh`. If you want to use another Embedding type, please modify `embedding_model_dict` and `EMBEDDING_MODEL` in [configs/model_config.py].
### Build your own Agent tool!
### Build your own Agent tool
Only these models support Agents:
+ OpenAI GPT4
+ Qwen-14B-Chat
See [Custom Agent Instructions](docs/自定义Agent.md) for details.
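
To make the section above concrete, here is a hedged sketch of a self-built tool using langchain's `@tool` decorator; it is an illustration in the spirit of the linked instructions, not the exact mechanism they describe:

```python
from langchain.tools import tool

@tool
def get_weather(city: str) -> str:
    """Query the current weather for a city; input is the city name."""
    # Stand-in data; wire this to a real weather API yourself.
    return f"It is sunny in {city} today."

print(get_weather.run("Beijing"))
```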
@@ -196,7 +200,7 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
- The image size of this version is `33.9GB`, using `v0.2.0`, with `nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04` as the base image
- This version has a built-in `embedding` model: `m3e-large`, built-in `chatglm2-6b-32k`
- This version is designed to facilitate one-click deployment. Please make sure you have installed the NVIDIA driver on your Linux distribution.
- Please note that you do not need to install the CUDA toolkit on the host system, but you need to install the `NVIDIA Driver` and the `NVIDIA Container Toolkit`, please refer to the [Installation Guide](https://docs.nvidia.com/datacenter/cloud -native/container-toolkit/latest/install-guide.html)
- Please note that you do not need to install the CUDA toolkit on the host system, but you do need to install the `NVIDIA Driver` and the `NVIDIA Container Toolkit`; please refer to the [Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- The first pull and start will take some time. During the first start, use `docker logs -f <container id>` to view the log, as shown in the figure below.
- If the startup process is stuck at the `Waiting..` step, it is recommended to use `docker exec -it <container id> bash` to enter the `/logs/` directory and check the logs for the corresponding stage.
@@ -216,9 +220,9 @@ Please refer to [install.md](docs/INSTALL.md)
**For offline deployment only!**
If you want to run this project in a local or offline environment, you need to first download the models required for the project to your local computer. Usually the open source LLM and Embedding models can be downloaded from [HuggingFace](https://huggingface.co/models).
If you want to run this project in a local or offline environment, you need to first download the models required for the project to your local computer. Usually the open source LLM and Embedding models can be downloaded from [Hugging Face](https://huggingface.co/models).
Take the LLM model [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and Embedding model [moka-ai/m3e-base](https://huggingface. co/moka-ai/m3e-base) for example:
Take the LLM model [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) and Embedding model [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base) for example:
To download the model, you need to [install Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage), and then run:
@@ -333,7 +337,8 @@ Including lora, p-tuning, prefix tuning, prompt tuning, ia3
This project loads the LLM service based on FastChat, so PEFT weights must be loaded the FastChat way. For models other than "chatglm", "falcon" or "code5p" and PEFT types other than "p-tuning", ensure that the word `peft` is in the path name, that the configuration file is named `adapter_config.json`, and that the path contains PEFT weights in `.bin` format. The PEFT path is specified in `args.model_names` of the `create_model_worker_app` function in `startup.py`, and the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` must be set.
For "p-tuning" PEFT, please refer to [load p-tuning with chatchat](docs/chatchat加载ptuning.md).
If the above method fails, you need to start the standard FastChat service step by step; the procedure can be found in Section 6. For further steps, please refer to [Model invalid after loading lora fine-tuning](https://github.com/chatchat-space/Langchain-Chatchat/issues/1130#issuecomment-1685291822).
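
A hedged sketch of the loading convention just described — a path containing `peft`, an `adapter_config.json` inside it, `.bin` weights, and the share-weights flag set before launch (the directory below is illustrative):

```python
import os
from pathlib import Path

# Illustrative adapter directory; the word "peft" must appear in the path name.
adapter_dir = Path("/data/models/my-llm-peft-lora")

# FastChat expects the adapter config under this exact file name,
# plus PEFT weights in .bin format somewhere in the directory.
assert (adapter_dir / "adapter_config.json").exists()
assert any(adapter_dir.glob("*.bin"))

# Let multiple PEFT adapters share the base model's weights.
os.environ["PEFT_SHARE_BASE_WEIGHTS"] = "true"
```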
#### **5.5 Some Notes**
@@ -347,17 +352,17 @@ The API, chat interface of WebUI, and knowledge management interface of WebUI ar
1. FastAPI docs
![](img/fastapi_docs_020_0.png)
![](img/fastapi_docs_026.png)
2. Chat Interface of WebUI
- Dialogue interface of WebUI
![img](img/webui_0915_0.png)
![img](img/LLM_success.png)
- Knowledge management interface of WebUI
![img](img/webui_0915_1.png)
![](img/init_knowledge_base.jpg)
## FAQ
@@ -378,8 +383,7 @@ Please refer to [FAQ](docs/FAQ.md)
- [ ] Structured documents
  - [X] .csv
  - [ ] .xlsx
- [] TextSplitter and Retriever
- [ ] TextSplitter and Retriever
  - [X] multiple TextSplitter
  - [X] ChineseTextSplitter
  - [ ] Reconstructed Context Retriever
@@ -391,19 +395,19 @@ Please refer to [FAQ](docs/FAQ.md)
  - [X] Bing
  - [X] DuckDuckGo
  - [X] Metaphor
- [X] Agent
  - [X] Agent implementation in the form of basic ReAct, including calls to calculators, etc.
  - [X] Langchain's own Agent implementation and calls
  - [ ] Agent support for more models
  - [X] Intelligent calls to different vector databases and online knowledge
  - [ ] More tools
- [X] LLM Models
  - [X] [FastChat](https://github.com/lm-sys/fastchat)-based LLM Models
  - [ ] Multiple remote LLM APIs
- [X] Embedding Models
  - [X] HuggingFace -based Embedding models
  - [X] Hugging Face-based Embedding models
  - [ ] Multiple remote Embedding APIs
- [X] FastAPI-based API
- [X] Web UI
@@ -414,7 +418,7 @@ Please refer to [FAQ](docs/FAQ.md)
## WeChat Group
<img src="img/qr_code_64.jpg" alt="QR Code" width="300" height="300" />
<img src="img/qr_code_67.jpg" alt="QR Code" width="300" height="300" />
🎉 The WeChat group of the langchain-Chatchat project. If you are also interested in this project, you are welcome to join the group chat and take part in the discussion.

View File

@@ -90,9 +90,8 @@ MODEL_PATH = {
"Qwen-14B-Chat":"Qwen/Qwen-14B-Chat",
},
}
# Name of the selected Embedding model
EMBEDDING_MODEL = "m3e-base" # You can try the latest SOTA embedding model: piccolo-large-zh
EMBEDDING_MODEL = "m3e-base" # You can try the latest SOTA embedding model: bge-large-zh-v1.5
# Device on which the Embedding model runs. "auto" detects the device automatically; it can also be set manually to one of "cuda", "mps", or "cpu".
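
A quick, hedged way to compare the two embedding models named above before switching `EMBEDDING_MODEL`; the identifiers are their Hugging Face repo ids (bge-large-zh-v1.5 is published under BAAI):

```python
from langchain.embeddings import HuggingFaceEmbeddings

for name in ("moka-ai/m3e-base", "BAAI/bge-large-zh-v1.5"):
    emb = HuggingFaceEmbeddings(model_name=name)
    vec = emb.embed_query("向量化测试")  # "vectorization test"
    print(name, "->", len(vec), "dimensions")
```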

Binary files (not shown):
- Three images removed (249 KiB, 27 KiB, 204 KiB)
- img/fastapi_docs_026.png added (75 KiB)
- One image modified (27 KiB before and after)
- One image removed (84 KiB)
- img/qr_code_67.jpg added (188 KiB)

View File

@@ -46,8 +46,6 @@ class ApiModelWorker(BaseModelWorker):
def count_token(self, params):
# TODO: needs improvement
# print("count token")
print("\n\n\n")
print(params)
prompt = params["prompt"]
return {"count": len(str(prompt)), "error_code": 0}
@@ -61,7 +59,7 @@ class ApiModelWorker(BaseModelWorker):
def get_embeddings(self, params):
print("embedding")
print(params)
# print(params)
# helper methods
def get_config(self):

View File

@@ -13,8 +13,6 @@ chat_box = ChatBox(
"chatchat_icon_blue_square_v2.png"
)
)
def get_messages_history(history_len: int, content_in_expander: bool = False) -> List[Dict]:
'''
Return the message history
@@ -61,7 +59,7 @@ def dialogue_page(api: ApiRequest):
f"当前运行的模型`{default_model}`, 您可以开始提问了."
)
chat_box.init_session()
with st.sidebar:
# TODO: 对话模型与会话绑定
def on_mode_change():
@@ -154,9 +151,6 @@ def dialogue_page(api: ApiRequest):
key="prompt_template_select",
)
prompt_template_name = st.session_state.prompt_template_select
temperature = st.slider("Temperature", 0.0, 1.0, TEMPERATURE, 0.05)
history_len = st.number_input("历史对话轮数:", 0, 20, HISTORY_LEN)
def on_kb_change():
@@ -314,4 +308,4 @@ def dialogue_page(api: ApiRequest):
file_name=f"{now:%Y-%m-%d %H.%M}_对话记录.md",
mime="text/markdown",
use_container_width=True,
)
)