WebUI update notes (commit 57c3cac3bb by liang tongtong, 2023-04-11):

1. The WebUI automatically reads the LLM and embedding model lists from knowledge_based_chatglm.py; after choosing a model, click "setting" to load it. Models can be switched at any time for testing.
2. The number of retained dialogue turns can be adjusted manually according to available GPU memory.
3. Added a file-upload feature: pick an uploaded file from the drop-down box and click "loading" to load it; the loaded file can be changed at any time.
4. Added "use via API" at the bottom, so the WebUI can be connected to your own system.

TODO:

1. Add a model-loading progress bar
2. Add output content and error messages
3. Internationalized language switching
4. Source citation annotations
5. Add a plugin system (e.g. basic LoRA fine-tuning)
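The controls described above map naturally onto gradio widgets. The sketch below is illustrative only; the widget names and layout are assumptions, not the project's actual webui.py:

```python
def build_demo():
    """Minimal sketch of the WebUI controls described above (illustrative)."""
    try:
        import gradio as gr
    except ImportError:
        return None  # gradio not installed

    with gr.Blocks() as demo:
        chatbot = gr.Chatbot(label="ChatGLM")
        history_len = gr.Slider(0, 10, value=3, step=1,
                                label="History length (turns kept in context)")
        file_box = gr.Dropdown(choices=[], label="Uploaded files")
        load_btn = gr.Button("loading")  # loads the selected file
    return demo
```

Calling `demo.launch()` serves the UI; the "use via API" link mentioned in item 4 is the footer gradio adds to a launched app.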
README_en.md

ChatGLM Application Based on Local Knowledge

Introduction

🌍 中文文档 (Chinese documentation)

🤖 A local-knowledge-based LLM application built with ChatGLM-6B and langchain.

💡 Inspired by document.ai by GanymedeNil and ChatGLM-6B Pull Request by AlexZhangji.

In this project, GanymedeNil/text2vec-large-chinese is used as the embedding model and ChatGLM-6B as the LLM. Based on these models, the project can be deployed offline using only open-source models.
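The flow this implies (load local documents, embed them into a vector store, retrieve the top matches, answer with ChatGLM) can be sketched with langchain's early-2023 API. This is an illustrative sketch, not the repo's actual code; the ChatGLM wrapper is assumed to be the one defined in chatglm_llm.py:

```python
def build_qa_chain(doc_path, top_k=6):
    """Sketch of a local-knowledge QA chain (illustrative, not the repo's code)."""
    try:
        from langchain.chains import RetrievalQA
        from langchain.document_loaders import UnstructuredFileLoader
        from langchain.embeddings import HuggingFaceEmbeddings
        from langchain.vectorstores import FAISS
        from chatglm_llm import ChatGLM  # the project's LLM wrapper
    except ImportError:
        return None  # langchain / project modules not installed

    docs = UnstructuredFileLoader(doc_path).load()
    embeddings = HuggingFaceEmbeddings(
        model_name="GanymedeNil/text2vec-large-chinese")
    vector_store = FAISS.from_documents(docs, embeddings)
    return RetrievalQA.from_chain_type(
        llm=ChatGLM(),
        chain_type="stuff",
        retriever=vector_store.as_retriever(search_kwargs={"k": top_k}),
    )
```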

Update

[2023/04/07]

  1. Fixed a bug that consumed twice the necessary GPU memory (thanks to @suc16 and @myml).
  2. Added a GPU memory clearing step after each call to ChatGLM.
  3. Added nghuyong/ernie-3.0-nano-zh and nghuyong/ernie-3.0-base-zh as embedding model alternatives, which require less GPU memory than GanymedeNil/text2vec-large-chinese (thanks to @lastrei).
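The memory-clearing step in item 2 typically looks like the following. The helper name here is illustrative, but `torch.cuda.empty_cache()` and `torch.cuda.ipc_collect()` are the standard PyTorch calls for returning cached memory to the device:

```python
def clear_gpu_cache():
    """Release cached GPU memory after an LLM call; no-op without CUDA.

    Helper name is illustrative; the PyTorch calls are the usual way to
    return cached allocator memory to the device between calls.
    """
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached blocks held by the allocator
        torch.cuda.ipc_collect()  # reclaim CUDA IPC memory from dead processes
        return True
    return False
```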

[2023/04/09]

  1. Replaced the previously used ChatVectorDBChain with langchain's RetrievalQA; this effectively fixes the problem of the program stopping after 2-3 questions due to insufficient GPU memory.
  2. Added settings for the EMBEDDING_MODEL, VECTOR_SEARCH_TOP_K, LLM_MODEL, LLM_HISTORY_LEN and REPLY_WITH_SOURCE parameters in knowledge_based_chatglm.py.
  3. Added chatglm-6b-int4 and chatglm-6b-int4-qe, which have smaller GPU memory requirements, as LLM alternatives.
  4. Corrected code errors in README.md (thanks to @calcitem).
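Those settings in knowledge_based_chatglm.py take the shape below. The parameter names come from the changelog above; the default values shown are illustrative assumptions, not necessarily the repo's actual defaults:

```python
# Illustrative defaults for the settings named above (actual values may differ).
EMBEDDING_MODEL = "text2vec"   # key selecting the embedding model (text2vec-large-chinese)
VECTOR_SEARCH_TOP_K = 6        # number of matched document chunks passed to the LLM
LLM_MODEL = "chatglm-6b"       # key selecting the LLM (chatglm-6b / chatglm-6b-int4 / ...)
LLM_HISTORY_LEN = 3            # number of past dialogue turns kept as context
REPLY_WITH_SOURCE = True       # whether answers include the source document snippets
```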

Usage

Hardware Requirements

  • ChatGLM Hardware Requirements

    Quantization Level       GPU Memory
    FP16 (no quantization)   13 GB
    INT8                     10 GB
    INT4                      6 GB
  • Embedding Hardware Requirements

    The default embedding model in this repo is GanymedeNil/text2vec-large-chinese, which requires about 3 GB of GPU memory when running on GPU.

Software Requirements

This repo has been tested in a Python 3.8 environment.

1. install python packages

pip install -r requirements.txt

Attention: when langchain.document_loaders.UnstructuredFileLoader is used to load local knowledge files, you may need additional dependencies, as mentioned in the langchain documentation.
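A quick way to check which of those optional extras are present. The package names below are common ones for unstructured file loading, given as examples rather than an exhaustive requirement list:

```python
import importlib.util


def check_optional_deps(pkgs=("unstructured", "nltk")):
    """Report which optional packages are importable (names are examples)."""
    return {p: importlib.util.find_spec(p) is not None for p in pkgs}


for name, present in check_optional_deps().items():
    print(f"{name}: {'installed' if present else 'missing (pip install ' + name + ')'}")
```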

2. Run knowledge_based_chatglm.py script

python knowledge_based_chatglm.py

Known issues

  • Currently txt, docx and md files are supported; for more file formats please refer to the langchain documentation. If a document contains special characters, it may not load correctly.
  • On macOS, this project may not work properly due to a pytorch incompatibility affecting macOS 13.3 and above.

FAQ

Q: How to fix Resource punkt not found.?

A: Download https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip, unzip it, and place the result in one of the directories listed after Searched in: in the error message.

Q: How to fix Resource averaged_perceptron_tagger not found.?

A: Download https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip, unzip it, and place the result in one of the directories listed after Searched in: in the error message.
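Both resources can also be fetched programmatically with NLTK's own downloader, which places them in a directory already on nltk.data.path (requires network access; the helper name is illustrative):

```python
def ensure_nltk_data():
    """Fetch the two NLTK resources from the FAQ; returns True on success."""
    try:
        import nltk
    except ImportError:
        return False  # nltk itself is missing: pip install nltk
    try:
        nltk.download("punkt", quiet=True)
        nltk.download("averaged_perceptron_tagger", quiet=True)
        return True
    except Exception:
        return False  # e.g. no network access
```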

Roadmap

  • local knowledge based application with langchain + ChatGLM-6B
  • unstructured files loaded with langchain
  • more file formats loaded with langchain
  • implement web UI demo with gradio/streamlit
  • implement API with fastapi, and a web UI demo based on the API