Update FAQ and requirements, fix two exceptions in the upload_file endpoint (#593)

2023-06-11 21:25:02 +08:00 · 2023-06-11 21:25:02 +08:00 · 27a9bf2433
parent 66c4e9de92
commit 27a9bf2433
2 changed files with 54 additions and 18 deletions
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -25,6 +25,7 @@ A3: Method 1: https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenize
 The `nltk_data` storage path can be looked up via `nltk.data.path`.

 Method 2: run the following Python code
+
 ```
 import nltk
 nltk.download()
@@ -39,16 +40,17 @@ A4: Method 1: place https://github.com/nltk/nltk_data/blob/gh-pages/packages/tag
 The `nltk_data` storage path can be looked up via `nltk.data.path`.

 Method 2: run the following Python code
+
 ```
 import nltk
 nltk.download()
 ```
+
 ---

 Q5: Can this project run in Colab?

-A5: You can try running it in Colab with the chatglm-6b-int4 model. Note that to run the Web UI in Colab, you need to open `webui.py` and, in `demo.queue(concurrency_count=3).launch(
-    server_name='0.0.0.0', share=False, inbrowser=False)`, set the `share` parameter to `True`.
+A5: You can try running it in Colab with the chatglm-6b-int4 model. Note that to run the Web UI in Colab, you need to open `webui.py` and, in `demo.queue(concurrency_count=3).launch( server_name='0.0.0.0', share=False, inbrowser=False)`, set the `share` parameter to `True`.

 ---

@@ -112,6 +114,7 @@ embedding_model_dict = {
                        "text2vec": "/Users/liuqian/Downloads/ChatGLM-6B/text2vec-large-chinese"
 }
 ```
+
 ---

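 Q10: While running `python cli_demo.py`, the GPU runs out of memory and reports "OutOfMemoryError: CUDA out of memory"
@@ -128,15 +131,46 @@ A11: Reinstall after switching the PyPI index, e.g. to the Aliyun or Tsinghua mirror; network conditions
 # Use the PyPI index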
 $ pip install -r requirements.txt -i https://pypi.python.org/simple
 ```
+
 or
+
 ```shell
 # Use the Aliyun mirror
 $ pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/
 ```
+
 or
+
 ```shell
 # Use the Tsinghua mirror
 $ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
 ```

+
+Q12: When api.py is started, the upload_file endpoint throws `partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)`
+
+This happens because the installed charset_normalizer version is too new; downgrade charset_normalizer. Testing shows that charset_normalizer==2.1.0 works.
+
+---
+
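+Q13: With api.py running, uploading a PDF or an image through the upload_file endpoint throws OSError: [Errno 101] Network is unreachable
+
+In some cases, requests from a Linux host's IP to download files such as ch_PP-OCRv3_rec_infer.tar may raise OSError: [Errno 101] Network is unreachable. When that happens, first edit the script anaconda3/envs/[virtual environment name]/lib/[Python version]/site-packages/paddleocr/ppocr/utils/network.py and change line 57 from: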
+
+```
+download_with_progressbar(url, tmp_path)
+```
+
+to:
+
+```
+        try:
+            download_with_progressbar(url, tmp_path)
+        except Exception as e:
+            print(f"download {url} error,please download it manually:")
+            print(e)
+```
+
+Then download the file manually from the printed URL, e.g. "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar", and upload it to the corresponding folder, e.g. ".paddleocr/whl/rec/ch/ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec_infer.tar".
+
 ---
--- a/requirements.txt
+++ b/requirements.txt
@@ -33,3 +33,5 @@ numpy~=1.23.5
 tqdm~=4.65.0
 requests~=2.28.2
 tenacity~=8.2.2
+# The charset_normalizer version installed by default is too new and raises `partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)`
+charset_normalizer==2.1.0
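
Note on the Q13 workaround: the try/except added to the FAQ can be exercised in isolation. The following is a minimal sketch, not the paddleocr code itself. The wrapper name `download_or_explain` and the fake downloader `unreachable` are hypothetical; the downloader is passed in as a parameter (standing in for `download_with_progressbar(url, tmp_path)`) so that the sketch runs without paddleocr installed.

```python
from typing import Callable


def download_or_explain(url: str, tmp_path: str,
                        downloader: Callable[[str, str], None]) -> bool:
    """Try the download; on failure, print the URL so it can be fetched manually.

    `downloader` is an assumption: it stands in for paddleocr's
    download_with_progressbar(url, tmp_path).
    Returns True on success, False if a manual download is needed.
    """
    try:
        downloader(url, tmp_path)
        return True
    except Exception as e:
        # Same messages as the patch in the FAQ entry.
        print(f"download {url} error,please download it manually:")
        print(e)
        return False


def unreachable(url: str, tmp_path: str) -> None:
    # Hypothetical downloader that simulates the failure described in Q13.
    raise OSError(101, "Network is unreachable")


ok = download_or_explain(
    "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar",
    "ch_PP-OCRv3_rec_infer.tar",
    unreachable,
)
```

Here `ok` is `False` and the URL is printed, which is the cue to download the archive by hand and place it in the folder named in the FAQ. Catching the broad `Exception` mirrors the patch; a narrower `OSError` would also cover the reported error.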