add torch_gc to clear gpu cache in knowledge_based_chatglm.py

This commit is contained in:
littlepanda0716 2023-04-07 11:02:23 +08:00
parent 60d752bc18
commit e04085e380
1 changed file with 19 additions and 0 deletions

@@ -10,8 +10,27 @@
✅ In this project, [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese/tree/main) is used as the embedding model and [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B) as the LLM. Based on these models, this project can be deployed **offline** with all **open-source** models.
## Update
**[2023/04/07]**
1. Fix a bug that consumed twice the required GPU memory (thanks to [@suc16](https://github.com/suc16) and [@myml](https://github.com/myml)).
2. Add a GPU memory clearing function that runs after each call to ChatGLM.
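The memory-clearing step above can be sketched roughly as follows. This is a minimal illustration rather than the repository's exact code; only the name `torch_gc` is taken from the commit title, and the body assumes the standard PyTorch CUDA cache APIs:

```python
import torch

def torch_gc():
    """Release cached GPU memory after a generation call; a no-op without CUDA."""
    if torch.cuda.is_available():
        # Return unused cached blocks to the driver and collect CUDA IPC handles.
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
```

Calling a function like this after every ChatGLM response keeps the allocator's cache from accumulating across requests, which is what otherwise makes long-running sessions appear to leak GPU memory.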
## Usage
### Hardware Requirements
- ChatGLM Hardware Requirements
| **Quantization Level** | **GPU Memory** |
|------------------------|----------------|
| FP16 (no quantization) | 13 GB |
| INT8 | 10 GB |
| INT4 | 6 GB |
- Embedding Hardware Requirements
The default embedding model in this repo is [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese/tree/main); about 3 GB of GPU memory is required when running it on GPU.
### 1. Install Python packages
```commandline
pip install -r requirements.txt