diff --git a/docs/API.md b/docs/API.md
deleted file mode 100644
index 58a1b41..0000000
--- a/docs/API.md
+++ /dev/null
@@ -1,1042 +0,0 @@
----
-title: FastAPI v0.1.0
-language_tabs:
-  - shell: Shell
-  - http: HTTP
-  - javascript: JavaScript
-  - ruby: Ruby
-  - python: Python
-  - php: PHP
-  - java: Java
-  - go: Go
-toc_footers: []
-includes: []
-search: true
-highlight_theme: darkula
-headingLevel: 2
-
----
-
-

-# FastAPI v0.1.0

-
-> Scroll down for code samples, example requests and responses. Select a language for code samples from the tabs above or the mobile navigation menu.
-

-# Default

- -## chat_chat_docs_chat_post - - - -> Code samples - -```shell -# You can also use wget -curl -X POST /chat-docs/chat \ - -H 'Content-Type: application/json' \ - -H 'Accept: application/json' - -``` - -```http -POST /chat-docs/chat HTTP/1.1 - -Content-Type: application/json -Accept: application/json - -``` - -```javascript -const inputBody = '{ - "knowledge_base_id": "string", - "question": "string", - "history": [] -}'; -const headers = { - 'Content-Type':'application/json', - 'Accept':'application/json' -}; - -fetch('/chat-docs/chat', -{ - method: 'POST', - body: inputBody, - headers: headers -}) -.then(function(res) { - return res.json(); -}).then(function(body) { - console.log(body); -}); - -``` - -```ruby -require 'rest-client' -require 'json' - -headers = { - 'Content-Type' => 'application/json', - 'Accept' => 'application/json' -} - -result = RestClient.post '/chat-docs/chat', - params: { - }, headers: headers - -p JSON.parse(result) - -``` - -```python -import requests -headers = { - 'Content-Type': 'application/json', - 'Accept': 'application/json' -} - -r = requests.post('/chat-docs/chat', headers = headers) - -print(r.json()) - -``` - -```php - 'application/json', - 'Accept' => 'application/json', -); - -$client = new \GuzzleHttp\Client(); - -// Define array of request body. -$request_body = array(); - -try { - $response = $client->request('POST','/chat-docs/chat', array( - 'headers' => $headers, - 'json' => $request_body, - ) - ); - print_r($response->getBody()->getContents()); - } - catch (\GuzzleHttp\Exception\BadResponseException $e) { - // handle exception or api errors. - print_r($e->getMessage()); - } - - // ... - -``` - -```java -URL obj = new URL("/chat-docs/chat"); -HttpURLConnection con = (HttpURLConnection) obj.openConnection(); -con.setRequestMethod("POST"); -int responseCode = con.getResponseCode(); -BufferedReader in = new BufferedReader( - new InputStreamReader(con.getInputStream())); -String inputLine; -StringBuffer response = new StringBuffer(); -while ((inputLine = in.readLine()) != null) { - response.append(inputLine); -} -in.close(); -System.out.println(response.toString()); - -``` - -```go -package main - -import ( - "bytes" - "net/http" -) - -func main() { - - headers := map[string][]string{ - "Content-Type": []string{"application/json"}, - "Accept": []string{"application/json"}, - } - - data := bytes.NewBuffer([]byte{jsonReq}) - req, err := http.NewRequest("POST", "/chat-docs/chat", data) - req.Header = headers - - client := &http.Client{} - resp, err := client.Do(req) - // ... -} - -``` - -`POST /chat-docs/chat` - -*Chat* - -> Body parameter - -```json -{ - "knowledge_base_id": "string", - "question": "string", - "history": [] -} -``` - -

-### Parameters

- -|Name|In|Type|Required|Description| -|---|---|---|---|---| -|body|body|[Body_chat_chat_docs_chat_post](#schemabody_chat_chat_docs_chat_post)|true|none| - -> Example responses - -> 200 Response - -```json -{ - "question": "工伤保险如何办理?", - "response": "根据已知信息,可以总结如下:\n\n1. 参保单位为员工缴纳工伤保险费,以保障员工在发生工伤时能够获得相应的待遇。\n2. 不同地区的工伤保险缴费规定可能有所不同,需要向当地社保部门咨询以了解具体的缴费标准和规定。\n3. 工伤从业人员及其近亲属需要申请工伤认定,确认享受的待遇资格,并按时缴纳工伤保险费。\n4. 工伤保险待遇包括工伤医疗、康复、辅助器具配置费用、伤残待遇、工亡待遇、一次性工亡补助金等。\n5. 工伤保险待遇领取资格认证包括长期待遇领取人员认证和一次性待遇领取人员认证。\n6. 工伤保险基金支付的待遇项目包括工伤医疗待遇、康复待遇、辅助器具配置费用、一次性工亡补助金、丧葬补助金等。", - "history": [ - [ - "工伤保险是什么?", - "工伤保险是指用人单位按照国家规定,为本单位的职工和用人单位的其他人员,缴纳工伤保险费,由保险机构按照国家规定的标准,给予工伤保险待遇的社会保险制度。" - ] - ], - "source_documents": [ - "出处 [1] 广州市单位从业的特定人员参加工伤保险办事指引.docx:\n\n\t( 一) 从业单位 (组织) 按“自愿参保”原则, 为未建 立劳动关系的特定从业人员单项参加工伤保险 、缴纳工伤保 险费。", - "出处 [2] ...", - "出处 [3] ..." - ] -} -``` - -
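Several of the generated samples above omit the request body. As a quick orientation, here is a minimal end-to-end sketch using Python `requests`; the base URL, knowledge base name, and question are placeholders, so point `BASE_URL` at wherever the service is actually listening.

```python
import requests

BASE_URL = "http://localhost:7861"  # hypothetical address; use your deployment's host and port

payload = {
    "knowledge_base_id": "samples",   # name of an existing knowledge base (placeholder)
    "question": "工伤保险如何办理?",
    # Optional: previous [question, answer] pairs, as returned in earlier responses
    "history": [],
}

resp = requests.post(f"{BASE_URL}/chat-docs/chat", json=payload)
resp.raise_for_status()
result = resp.json()
print(result["response"])            # generated answer
print(result["source_documents"])    # retrieved source passages
```

Judging by the ChatMessage schema below, reusing the `history` array returned by a previous call is how multi-turn context is carried between requests.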

-### Responses

- -|Status|Meaning|Description|Schema| -|---|---|---|---| -|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|Successful Response|[ChatMessage](#schemachatmessage)| -|422|[Unprocessable Entity](https://tools.ietf.org/html/rfc2518#section-10.3)|Validation Error|[HTTPValidationError](#schemahttpvalidationerror)| - - - -## upload_file_chat_docs_upload_post - - - -> Code samples - -```shell -# You can also use wget -curl -X POST /chat-docs/upload \ - -H 'Content-Type: multipart/form-data' \ - -H 'Accept: application/json' - -``` - -```http -POST /chat-docs/upload HTTP/1.1 - -Content-Type: multipart/form-data -Accept: application/json - -``` - -```javascript -const inputBody = '{ - "files": [ - "string" - ], - "knowledge_base_id": "string" -}'; -const headers = { - 'Content-Type':'multipart/form-data', - 'Accept':'application/json' -}; - -fetch('/chat-docs/upload', -{ - method: 'POST', - body: inputBody, - headers: headers -}) -.then(function(res) { - return res.json(); -}).then(function(body) { - console.log(body); -}); - -``` - -```ruby -require 'rest-client' -require 'json' - -headers = { - 'Content-Type' => 'multipart/form-data', - 'Accept' => 'application/json' -} - -result = RestClient.post '/chat-docs/upload', - params: { - }, headers: headers - -p JSON.parse(result) - -``` - -```python -import requests -headers = { - 'Content-Type': 'multipart/form-data', - 'Accept': 'application/json' -} - -r = requests.post('/chat-docs/upload', headers = headers) - -print(r.json()) - -``` - -```php - 'multipart/form-data', - 'Accept' => 'application/json', -); - -$client = new \GuzzleHttp\Client(); - -// Define array of request body. -$request_body = array(); - -try { - $response = $client->request('POST','/chat-docs/upload', array( - 'headers' => $headers, - 'json' => $request_body, - ) - ); - print_r($response->getBody()->getContents()); - } - catch (\GuzzleHttp\Exception\BadResponseException $e) { - // handle exception or api errors. - print_r($e->getMessage()); - } - - // ... - -``` - -```java -URL obj = new URL("/chat-docs/upload"); -HttpURLConnection con = (HttpURLConnection) obj.openConnection(); -con.setRequestMethod("POST"); -int responseCode = con.getResponseCode(); -BufferedReader in = new BufferedReader( - new InputStreamReader(con.getInputStream())); -String inputLine; -StringBuffer response = new StringBuffer(); -while ((inputLine = in.readLine()) != null) { - response.append(inputLine); -} -in.close(); -System.out.println(response.toString()); - -``` - -```go -package main - -import ( - "bytes" - "net/http" -) - -func main() { - - headers := map[string][]string{ - "Content-Type": []string{"multipart/form-data"}, - "Accept": []string{"application/json"}, - } - - data := bytes.NewBuffer([]byte{jsonReq}) - req, err := http.NewRequest("POST", "/chat-docs/upload", data) - req.Header = headers - - client := &http.Client{} - resp, err := client.Do(req) - // ... -} - -``` - -`POST /chat-docs/upload` - -*Upload File* - -> Body parameter - -```yaml -files: - - string -knowledge_base_id: string - -``` - -

-### Parameters

- -|Name|In|Type|Required|Description| -|---|---|---|---|---| -|body|body|[Body_upload_file_chat_docs_upload_post](#schemabody_upload_file_chat_docs_upload_post)|true|none| - -> Example responses - -> 200 Response - -```json -{ - "code": 200, - "msg": "success" -} -``` - -
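Because this endpoint takes `multipart/form-data`, the body cannot be sent as JSON. A minimal sketch with Python `requests` follows; file paths, the knowledge base name, and the base URL are placeholders. Note that `requests` generates the multipart boundary itself, so the `Content-Type` header is best left to the library rather than set by hand.

```python
import requests

BASE_URL = "http://localhost:7861"  # hypothetical address; use your deployment's host and port

# Repeat the "files" field once per document to upload (paths are placeholders).
files = [
    ("files", ("doc1.docx", open("doc1.docx", "rb"))),
    ("files", ("doc2.pdf", open("doc2.pdf", "rb"))),
]
data = {"knowledge_base_id": "samples"}  # target knowledge base (placeholder)

resp = requests.post(f"{BASE_URL}/chat-docs/upload", files=files, data=data)
print(resp.json())  # expected shape: {"code": 200, "msg": "success"}
```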

-### Responses

- -|Status|Meaning|Description|Schema| -|---|---|---|---| -|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|Successful Response|[BaseResponse](#schemabaseresponse)| -|422|[Unprocessable Entity](https://tools.ietf.org/html/rfc2518#section-10.3)|Validation Error|[HTTPValidationError](#schemahttpvalidationerror)| - - - -## list_docs_chat_docs_list_get - - - -> Code samples - -```shell -# You can also use wget -curl -X GET /chat-docs/list?knowledge_base_id=doc_id1 \ - -H 'Accept: application/json' - -``` - -```http -GET /chat-docs/list?knowledge_base_id=doc_id1 HTTP/1.1 - -Accept: application/json - -``` - -```javascript - -const headers = { - 'Accept':'application/json' -}; - -fetch('/chat-docs/list?knowledge_base_id=doc_id1', -{ - method: 'GET', - - headers: headers -}) -.then(function(res) { - return res.json(); -}).then(function(body) { - console.log(body); -}); - -``` - -```ruby -require 'rest-client' -require 'json' - -headers = { - 'Accept' => 'application/json' -} - -result = RestClient.get '/chat-docs/list', - params: { - 'knowledge_base_id' => 'string' -}, headers: headers - -p JSON.parse(result) - -``` - -```python -import requests -headers = { - 'Accept': 'application/json' -} - -r = requests.get('/chat-docs/list', params={ - 'knowledge_base_id': 'doc_id1' -}, headers = headers) - -print(r.json()) - -``` - -```php - 'application/json', -); - -$client = new \GuzzleHttp\Client(); - -// Define array of request body. -$request_body = array(); - -try { - $response = $client->request('GET','/chat-docs/list', array( - 'headers' => $headers, - 'json' => $request_body, - ) - ); - print_r($response->getBody()->getContents()); - } - catch (\GuzzleHttp\Exception\BadResponseException $e) { - // handle exception or api errors. - print_r($e->getMessage()); - } - - // ... - -``` - -```java -URL obj = new URL("/chat-docs/list?knowledge_base_id=doc_id1"); -HttpURLConnection con = (HttpURLConnection) obj.openConnection(); -con.setRequestMethod("GET"); -int responseCode = con.getResponseCode(); -BufferedReader in = new BufferedReader( - new InputStreamReader(con.getInputStream())); -String inputLine; -StringBuffer response = new StringBuffer(); -while ((inputLine = in.readLine()) != null) { - response.append(inputLine); -} -in.close(); -System.out.println(response.toString()); - -``` - -```go -package main - -import ( - "bytes" - "net/http" -) - -func main() { - - headers := map[string][]string{ - "Accept": []string{"application/json"}, - } - - data := bytes.NewBuffer([]byte{jsonReq}) - req, err := http.NewRequest("GET", "/chat-docs/list", data) - req.Header = headers - - client := &http.Client{} - resp, err := client.Do(req) - // ... -} - -``` - -`GET /chat-docs/list` - -*List Docs* - -

-### Parameters

- -|Name|In|Type|Required|Description| -|---|---|---|---|---| -|knowledge_base_id|query|string|true|Document ID| - -> Example responses - -> 200 Response - -```json -{ - "code": 200, - "msg": "success", - "data": [ - "doc1.docx", - "doc2.pdf", - "doc3.txt" - ] -} -``` - -

-### Responses

- -|Status|Meaning|Description|Schema| -|---|---|---|---| -|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|Successful Response|[ListDocsResponse](#schemalistdocsresponse)| -|422|[Unprocessable Entity](https://tools.ietf.org/html/rfc2518#section-10.3)|Validation Error|[HTTPValidationError](#schemahttpvalidationerror)| - - - -## delete_docs_chat_docs_delete_delete - - - -> Code samples - -```shell -# You can also use wget -curl -X DELETE /chat-docs/delete \ - -H 'Content-Type: application/x-www-form-urlencoded' \ - -H 'Accept: application/json' - -``` - -```http -DELETE /chat-docs/delete HTTP/1.1 - -Content-Type: application/x-www-form-urlencoded -Accept: application/json - -``` - -```javascript -const inputBody = '{ - "knowledge_base_id": "string", - "doc_name": "string" -}'; -const headers = { - 'Content-Type':'application/x-www-form-urlencoded', - 'Accept':'application/json' -}; - -fetch('/chat-docs/delete', -{ - method: 'DELETE', - body: inputBody, - headers: headers -}) -.then(function(res) { - return res.json(); -}).then(function(body) { - console.log(body); -}); - -``` - -```ruby -require 'rest-client' -require 'json' - -headers = { - 'Content-Type' => 'application/x-www-form-urlencoded', - 'Accept' => 'application/json' -} - -result = RestClient.delete '/chat-docs/delete', - params: { - }, headers: headers - -p JSON.parse(result) - -``` - -```python -import requests -headers = { - 'Content-Type': 'application/x-www-form-urlencoded', - 'Accept': 'application/json' -} - -r = requests.delete('/chat-docs/delete', headers = headers) - -print(r.json()) - -``` - -```php - 'application/x-www-form-urlencoded', - 'Accept' => 'application/json', -); - -$client = new \GuzzleHttp\Client(); - -// Define array of request body. -$request_body = array(); - -try { - $response = $client->request('DELETE','/chat-docs/delete', array( - 'headers' => $headers, - 'json' => $request_body, - ) - ); - print_r($response->getBody()->getContents()); - } - catch (\GuzzleHttp\Exception\BadResponseException $e) { - // handle exception or api errors. - print_r($e->getMessage()); - } - - // ... - -``` - -```java -URL obj = new URL("/chat-docs/delete"); -HttpURLConnection con = (HttpURLConnection) obj.openConnection(); -con.setRequestMethod("DELETE"); -int responseCode = con.getResponseCode(); -BufferedReader in = new BufferedReader( - new InputStreamReader(con.getInputStream())); -String inputLine; -StringBuffer response = new StringBuffer(); -while ((inputLine = in.readLine()) != null) { - response.append(inputLine); -} -in.close(); -System.out.println(response.toString()); - -``` - -```go -package main - -import ( - "bytes" - "net/http" -) - -func main() { - - headers := map[string][]string{ - "Content-Type": []string{"application/x-www-form-urlencoded"}, - "Accept": []string{"application/json"}, - } - - data := bytes.NewBuffer([]byte{jsonReq}) - req, err := http.NewRequest("DELETE", "/chat-docs/delete", data) - req.Header = headers - - client := &http.Client{} - resp, err := client.Do(req) - // ... -} - -``` - -`DELETE /chat-docs/delete` - -*Delete Docs* - -> Body parameter - -```yaml -knowledge_base_id: string -doc_name: string - -``` - -

-### Parameters

- -|Name|In|Type|Required|Description| -|---|---|---|---|---| -|body|body|[Body_delete_docs_chat_docs_delete_delete](#schemabody_delete_docs_chat_docs_delete_delete)|true|none| - -> Example responses - -> 200 Response - -```json -{ - "code": 200, - "msg": "success" -} -``` - -
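This endpoint expects a URL-encoded form body rather than JSON. A minimal sketch with Python `requests`, again with placeholder names and a hypothetical base URL:

```python
import requests

BASE_URL = "http://localhost:7861"  # hypothetical address; use your deployment's host and port

resp = requests.delete(
    f"{BASE_URL}/chat-docs/delete",
    data={                                # sent as application/x-www-form-urlencoded
        "knowledge_base_id": "samples",   # knowledge base name (placeholder)
        "doc_name": "doc1.docx",          # optional in the schema above
    },
)
print(resp.json())  # expected shape: {"code": 200, "msg": "success"}
```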

-### Responses

- -|Status|Meaning|Description|Schema| -|---|---|---|---| -|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|Successful Response|[BaseResponse](#schemabaseresponse)| -|422|[Unprocessable Entity](https://tools.ietf.org/html/rfc2518#section-10.3)|Validation Error|[HTTPValidationError](#schemahttpvalidationerror)| - - - -# Schemas - -

-## BaseResponse

- - - - - - -```json -{ - "code": 200, - "msg": "success" -} - -``` - -BaseResponse - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|code|integer|false|none|HTTP status code| -|msg|string|false|none|HTTP status message| - -

-## Body_chat_chat_docs_chat_post

- - - - - - -```json -{ - "knowledge_base_id": "string", - "question": "string", - "history": [] -} - -``` - -Body_chat_chat_docs_chat_post - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|knowledge_base_id|string|true|none|Knowledge Base Name| -|question|string|true|none|Question| -|history|[array]|false|none|History of previous questions and answers| - -

-## Body_delete_docs_chat_docs_delete_delete

- - - - - - -```json -{ - "knowledge_base_id": "string", - "doc_name": "string" -} - -``` - -Body_delete_docs_chat_docs_delete_delete - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|knowledge_base_id|string|true|none|Knowledge Base Name| -|doc_name|string|false|none|doc name| - -

-## Body_upload_file_chat_docs_upload_post

- - - - - - -```json -{ - "files": [ - "string" - ], - "knowledge_base_id": "string" -} - -``` - -Body_upload_file_chat_docs_upload_post - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|files|[string]|true|none|none| -|knowledge_base_id|string|true|none|Knowledge Base Name| - -

-## ChatMessage

- - - - - - -```json -{ - "question": "工伤保险如何办理?", - "response": "根据已知信息,可以总结如下:\n\n1. 参保单位为员工缴纳工伤保险费,以保障员工在发生工伤时能够获得相应的待遇。\n2. 不同地区的工伤保险缴费规定可能有所不同,需要向当地社保部门咨询以了解具体的缴费标准和规定。\n3. 工伤从业人员及其近亲属需要申请工伤认定,确认享受的待遇资格,并按时缴纳工伤保险费。\n4. 工伤保险待遇包括工伤医疗、康复、辅助器具配置费用、伤残待遇、工亡待遇、一次性工亡补助金等。\n5. 工伤保险待遇领取资格认证包括长期待遇领取人员认证和一次性待遇领取人员认证。\n6. 工伤保险基金支付的待遇项目包括工伤医疗待遇、康复待遇、辅助器具配置费用、一次性工亡补助金、丧葬补助金等。", - "history": [ - [ - "工伤保险是什么?", - "工伤保险是指用人单位按照国家规定,为本单位的职工和用人单位的其他人员,缴纳工伤保险费,由保险机构按照国家规定的标准,给予工伤保险待遇的社会保险制度。" - ] - ], - "source_documents": [ - "出处 [1] 广州市单位从业的特定人员参加工伤保险办事指引.docx:\n\n\t( 一) 从业单位 (组织) 按“自愿参保”原则, 为未建 立劳动关系的特定从业人员单项参加工伤保险 、缴纳工伤保 险费。", - "出处 [2] ...", - "出处 [3] ..." - ] -} - -``` - -ChatMessage - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|question|string|true|none|Question text| -|response|string|true|none|Response text| -|history|[array]|true|none|History text| -|source_documents|[string]|true|none|List of source documents and their scores| - -
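Client code may find it convenient to mirror this schema as a typed model. The sketch below is illustrative only and assumes Pydantic; it is not necessarily the class the service itself defines.

```python
from typing import List, Tuple
from pydantic import BaseModel

class ChatMessage(BaseModel):
    question: str
    response: str
    history: List[Tuple[str, str]]   # [question, answer] pairs
    source_documents: List[str]      # source passages backing the answer

# Parse a /chat-docs/chat response body into the model:
# msg = ChatMessage(**resp.json())
```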

-## HTTPValidationError

- - - - - - -```json -{ - "detail": [ - { - "loc": [ - "string" - ], - "msg": "string", - "type": "string" - } - ] -} - -``` - -HTTPValidationError - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|detail|[[ValidationError](#schemavalidationerror)]|false|none|none| - -
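A 422 response carries a list of these validation errors. The sketch below shows how a client might surface them; the base URL is hypothetical and the payload deliberately omits a required field to force the error.

```python
import requests

BASE_URL = "http://localhost:7861"  # hypothetical address; use your deployment's host and port

# knowledge_base_id is required, so this request should come back as a 422.
resp = requests.post(f"{BASE_URL}/chat-docs/chat", json={"question": "hello"})

if resp.status_code == 422:
    for err in resp.json()["detail"]:
        # Each entry follows the ValidationError schema below: loc, msg, type.
        print(err["loc"], err["msg"], err["type"])
```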

-## ListDocsResponse

- - - - - - -```json -{ - "code": 200, - "msg": "success", - "data": [ - "doc1.docx", - "doc2.pdf", - "doc3.txt" - ] -} - -``` - -ListDocsResponse - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|code|integer|false|none|HTTP status code| -|msg|string|false|none|HTTP status message| -|data|[string]|true|none|List of document names| - -

-## ValidationError

- - - - - - -```json -{ - "loc": [ - "string" - ], - "msg": "string", - "type": "string" -} - -``` - -ValidationError - -### Properties - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|loc|[anyOf]|true|none|none| - -anyOf - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|» *anonymous*|string|false|none|none| - -or - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|» *anonymous*|integer|false|none|none| - -continued - -|Name|Type|Required|Restrictions|Description| -|---|---|---|---|---| -|msg|string|true|none|none| -|type|string|true|none|none| - diff --git a/docs/FAQ.md b/docs/FAQ.md index 6e6f3b1..62fe080 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -6,21 +6,9 @@ A1: 目前已测试支持 txt、docx、md、pdf 格式文件,更多文件格 --- -Q2: 执行 `pip install -r requirements.txt` 过程中,安装 `detectron2` 时发生报错怎么办? +Q2: 使用过程中 Python 包 `nltk`发生了 `Resource punkt not found.`报错,该如何解决? -A2: 如果不需要对 `pdf` 格式文件读取,可不安装 `detectron2`;如需对 `pdf` 文件进行高精度文本提取,建议按照如下方法安装: - -```commandline -$ git clone https://github.com/facebookresearch/detectron2.git -$ cd detectron2 -$ pip install -e . -``` - ---- - -Q3: 使用过程中 Python 包 `nltk`发生了 `Resource punkt not found.`报错,该如何解决? - -A3: 方法一:https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip 中的 `packages/tokenizers` 解压,放到 `nltk_data/tokenizers` 存储路径下。 +A2: 方法一:https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip 中的 `packages/tokenizers` 解压,放到 `nltk_data/tokenizers` 存储路径下。 `nltk_data` 存储路径可以通过 `nltk.data.path` 查询。 @@ -33,9 +21,9 @@ nltk.download() --- -Q4: 使用过程中 Python 包 `nltk`发生了 `Resource averaged_perceptron_tagger not found.`报错,该如何解决? +Q3: 使用过程中 Python 包 `nltk`发生了 `Resource averaged_perceptron_tagger not found.`报错,该如何解决? -A4: 方法一:将 https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip 下载,解压放到 `nltk_data/taggers` 存储路径下。 +A3: 方法一:将 https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip 下载,解压放到 `nltk_data/taggers` 存储路径下。 `nltk_data` 存储路径可以通过 `nltk.data.path` 查询。 @@ -48,21 +36,21 @@ nltk.download() --- -Q5: 本项目可否在 colab 中运行? +Q4: 本项目可否在 colab 中运行? -A5: 可以尝试使用 chatglm-6b-int4 模型在 colab 中运行,需要注意的是,如需在 colab 中运行 Web UI,需将 `webui.py`中 `demo.queue(concurrency_count=3).launch( server_name='0.0.0.0', share=False, inbrowser=False)`中参数 `share`设置为 `True`。 +A4: 可以尝试使用 chatglm-6b-int4 模型在 colab 中运行,需要注意的是,如需在 colab 中运行 Web UI,需将 `webui.py`中 `demo.queue(concurrency_count=3).launch( server_name='0.0.0.0', share=False, inbrowser=False)`中参数 `share`设置为 `True`。 --- -Q6: 在 Anaconda 中使用 pip 安装包无效如何解决? +Q5: 在 Anaconda 中使用 pip 安装包无效如何解决? -A6: 此问题是系统环境问题,详细见 [在Anaconda中使用pip安装包无效问题](在Anaconda中使用pip安装包无效问题.md) +A5: 此问题是系统环境问题,详细见 [在Anaconda中使用pip安装包无效问题](在Anaconda中使用pip安装包无效问题.md) --- -Q7: 本项目中所需模型如何下载至本地? +Q6: 本项目中所需模型如何下载至本地? -A7: 本项目中使用的模型均为 `huggingface.com`中可下载的开源模型,以默认选择的 `chatglm-6b`和 `text2vec-large-chinese`模型为例,下载模型可执行如下代码: +A6: 本项目中使用的模型均为 `huggingface.com`中可下载的开源模型,以默认选择的 `chatglm-6b`和 `text2vec-large-chinese`模型为例,下载模型可执行如下代码: ```shell # 安装 git lfs @@ -80,9 +68,9 @@ $ git pull --- -Q8: `huggingface.com`中模型下载速度较慢怎么办? +Q7: `huggingface.com`中模型下载速度较慢怎么办? -A8: 可使用本项目用到的模型权重文件百度网盘地址: +A7: 可使用本项目用到的模型权重文件百度网盘地址: - ernie-3.0-base-zh.zip 链接: https://pan.baidu.com/s/1CIvKnD3qzE-orFouA8qvNQ?pwd=4wih - ernie-3.0-nano-zh.zip 链接: https://pan.baidu.com/s/1Fh8fgzVdavf5P1omAJJ-Zw?pwd=q6s5 @@ -93,9 +81,9 @@ A8: 可使用本项目用到的模型权重文件百度网盘地址: --- -Q9: 下载完模型后,如何修改代码以执行本地模型? +Q8: 下载完模型后,如何修改代码以执行本地模型? 
-A9: 模型下载完成后,请在 [configs/model_config.py](../configs/model_config.py) 文件中,对 `embedding_model_dict`和 `llm_model_dict`参数进行修改,如把 `llm_model_dict`从 +A8: 模型下载完成后,请在 [configs/model_config.py](../configs/model_config.py) 文件中,对 `embedding_model_dict`和 `llm_model_dict`参数进行修改,如把 `llm_model_dict`从 ```python embedding_model_dict = { @@ -117,15 +105,15 @@ embedding_model_dict = { --- -Q10: 执行 `python cli_demo.py`过程中,显卡内存爆了,提示"OutOfMemoryError: CUDA out of memory" +Q9: 执行 `python cli_demo.py`过程中,显卡内存爆了,提示 "OutOfMemoryError: CUDA out of memory" -A10: 将 `VECTOR_SEARCH_TOP_K` 和 `LLM_HISTORY_LEN` 的值调低,比如 `VECTOR_SEARCH_TOP_K = 5` 和 `LLM_HISTORY_LEN = 2`,这样由 `query` 和 `context` 拼接得到的 `prompt` 会变短,会减少内存的占用。或者打开量化,请在 [configs/model_config.py](../configs/model_config.py) 文件中,对`LOAD_IN_8BIT`参数进行修改 +A9: 将 `VECTOR_SEARCH_TOP_K` 和 `LLM_HISTORY_LEN` 的值调低,比如 `VECTOR_SEARCH_TOP_K = 5` 和 `LLM_HISTORY_LEN = 2`,这样由 `query` 和 `context` 拼接得到的 `prompt` 会变短,会减少内存的占用。或者打开量化,请在 [configs/model_config.py](../configs/model_config.py) 文件中,对`LOAD_IN_8BIT`参数进行修改 --- -Q11: 执行 `pip install -r requirements.txt` 过程中遇到 python 包,如 langchain 找不到对应版本的问题 +Q10: 执行 `pip install -r requirements.txt` 过程中遇到 python 包,如 langchain 找不到对应版本的问题 -A11: 更换 pypi 源后重新安装,如阿里源、清华源等,网络条件允许时建议直接使用 pypi.org 源,具体操作命令如下: +A10: 更换 pypi 源后重新安装,如阿里源、清华源等,网络条件允许时建议直接使用 pypi.org 源,具体操作命令如下: ```shell # 使用 pypi 源 @@ -146,43 +134,23 @@ $ pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ $ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/ ``` -Q12 启动api.py时upload_file接口抛出 `partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)` +--- -这是由于 charset_normalizer模块版本过高导致的,需要降低低charset_normalizer的版本,测试在charset_normalizer==2.1.0上可用。 +Q11: 启动 api.py 时 upload_file 接口抛出 `partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)` + +A11: 这是由于 charset_normalizer 模块版本过高导致的,需要降低低 charset_normalizer 的版本,测试在 charset_normalizer==2.1.0 上可用。 --- -Q13 启动api.py时upload_file接口,上传PDF或图片时,抛出OSError: [Errno 101] Network is unreachable +Q12: 调用api中的 `bing_search_chat` 接口时,报出 `Failed to establish a new connection: [Errno 110] Connection timed out` -某些情况下,linux系统上的ip在请求下载ch_PP-OCRv3_rec_infer.tar等文件时,可能会抛出OSError: [Errno 101] Network is unreachable,此时需要首先修改anaconda3/envs/[虚拟环境名]/lib/[python版本]/site-packages/paddleocr/ppocr/utils/network.py脚本,将57行的: - -``` -download_with_progressbar(url, tmp_path) -``` - -修改为: - -``` - try: - download_with_progressbar(url, tmp_path) - except Exception as e: - print(f"download {url} error,please download it manually:") - print(e) -``` - -然后按照给定网址,如"https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar"手动下载文件,上传到对应的文件夹中,如“.paddleocr/whl/rec/ch/ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec_infer.tar”. +A12: 这是因为服务器加了防火墙,需要联系管理员加白名单,如果公司的服务器的话,就别想了GG--! --- -Q14 调用api中的 `bing_search_chat`接口时,报出 `Failed to establish a new connection: [Errno 110] Connection timed out` +Q13: 加载 chatglm-6b-int8 或 chatglm-6b-int4 抛出 `RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients` -这是因为服务器加了防火墙,需要联系管理员加白名单,如果公司的服务器的话,就别想了GG--! 
- ---- - -Q15 加载chatglm-6b-int8或chatglm-6b-int4抛出 `RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients` - -疑为chatglm的quantization的问题或torch版本差异问题,针对已经变为Parameter的torch.zeros矩阵也执行Parameter操作,从而抛出 `RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients`。解决办法是在chatglm-项目的原始文件中的quantization.py文件374行改为: +A13: 疑为 chatglm 的 quantization 的问题或 torch 版本差异问题,针对已经变为 Parameter 的 torch.zeros 矩阵也执行 Parameter 操作,从而抛出 `RuntimeError: Only Tensors of floating point andcomplex dtype can require gradients`。解决办法是在 chatglm 项目的原始文件中的 quantization.py 文件 374 行改为: ``` try: @@ -199,12 +167,6 @@ Q15 加载chatglm-6b-int8或chatglm-6b-int4抛出 `RuntimeError: Only Tensors of --- -Q16 修改配置中路径后,加载text2vec-large-chinese依然提示`WARNING: No sentence-transformers model found with name text2vec-large-chinese. Creating a new one with MEAN pooling.` +Q14: 修改配置中路径后,加载 text2vec-large-chinese 依然提示 `WARNING: No sentence-transformers model found with name text2vec-large-chinese. Creating a new one with MEAN pooling.` -尝试更换embedding,如text2vec-base-chinese,请在 [configs/model_config.py](../configs/model_config.py) 文件中,修改 `text2vec-base`参数为本地路径,绝对路径或者相对路径均可 - ---- - -Q17 启动webui.py时报错 `you need to set ValueError: If you want to offload some keys to cpu or disk, you need to set llm_int8_enable_fp32_cpu_offload=True.` - -疑为GPU相关的问题,重新启动服务器即可 \ No newline at end of file +A14: 尝试更换 embedding,如 text2vec-base-chinese,请在 [configs/model_config.py](../configs/model_config.py) 文件中,修改 `text2vec-base`参数为本地路径,绝对路径或者相对路径均可 diff --git a/docs/StartOption.md b/docs/StartOption.md deleted file mode 100644 index 7564fd3..0000000 --- a/docs/StartOption.md +++ /dev/null @@ -1,76 +0,0 @@ - -#### 项目启动选项 -```test -usage: langchina-ChatGLM [-h] [--no-remote-model] [--model MODEL] [--lora LORA] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--cpu] [--auto-devices] [--gpu-memory GPU_MEMORY [GPU_MEMORY ...]] [--cpu-memory CPU_MEMORY] - [--load-in-8bit] [--bf16] - -基于langchain和chatGML的LLM文档阅读器 - -options: - -h, --help show this help message and exit - --no-remote-model remote in the model on loader checkpoint, if your load local model to add the ` --no-remote-model` - --model MODEL Name of the model to load by default. - --lora LORA Name of the LoRA to apply to the model by default. - --model-dir MODEL_DIR - Path to directory with all the models - --lora-dir LORA_DIR Path to directory with all the loras - --cpu Use the CPU to generate text. Warning: Training on CPU is extremely slow. - --auto-devices Automatically split the model across the available GPU(s) and CPU. - --gpu-memory GPU_MEMORY [GPU_MEMORY ...] - Maxmimum GPU memory in GiB to be allocated per GPU. Example: --gpu-memory 10 for a single GPU, --gpu-memory 10 5 for two GPUs. You can also set values in MiB like --gpu-memory 3500MiB. - --cpu-memory CPU_MEMORY - Maximum CPU memory in GiB to allocate for offloaded weights. Same as above. - --load-in-8bit Load the model with 8-bit precision. - --bf16 Load the model with bfloat16 precision. Requires NVIDIA Ampere GPU. 
- -``` - -#### 示例 - -- 1、加载本地模型 - -```text ---model-dir 本地checkpoint存放文件夹 ---model 模型名称 ---no-remote-model 不从远程加载模型 -``` -```shell -$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model -``` - -- 2、低精度加载模型 -```text ---model-dir 本地checkpoint存放文件夹 ---model 模型名称 ---no-remote-model 不从远程加载模型 ---load-in-8bit 以8位精度加载模型 -``` -```shell -$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --load-in-8bit -``` - - -- 3、使用cpu预测模型 -```text ---model-dir 本地checkpoint存放文件夹 ---model 模型名称 ---no-remote-model 不从远程加载模型 ---cpu 使用CPU生成文本。警告:CPU上的训练非常缓慢。 -``` -```shell -$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --cpu -``` - - - -- 3、加载lora微调文件 -```text ---model-dir 本地checkpoint存放文件夹 ---model 模型名称 ---no-remote-model 不从远程加载模型 ---lora-dir 本地lora存放文件夹 ---lora lora名称 -``` -```shell -$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --lora-dir /media/mnt/loras --lora chatglm-step100 -``` diff --git a/docs/cli.md b/docs/cli.md deleted file mode 100644 index ff3eb9f..0000000 --- a/docs/cli.md +++ /dev/null @@ -1,49 +0,0 @@ -## 命令行工具 - -windows cli.bat -linux cli.sh - -## 命令列表 - -### llm 管理 - -llm 支持列表 - -```shell -cli.bat llm ls -``` - -### embedding 管理 - -embedding 支持列表 - -```shell -cli.bat embedding ls -``` - -### start 启动管理 - -查看启动选择 - -```shell -cli.bat start -``` - -启动命令行交互 - -```shell -cli.bat start cli -``` - -启动Web 交互 - -```shell -cli.bat start webui -``` - -启动api服务 - -```shell -cli.bat start api -``` - diff --git a/docs/fastchat.md b/docs/fastchat.md deleted file mode 100644 index ebd308b..0000000 --- a/docs/fastchat.md +++ /dev/null @@ -1,24 +0,0 @@ -# fastchat 调用实现教程 -langchain-ChatGLM 现已支持通过调用 FastChat API 进行 LLM 调用,支持的 API 形式为 **OpenAI API 形式**。 -1. 首先请参考 [FastChat 官方文档](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md#restful-api-server) 进行 FastChat OpenAI 形式 API 部署 -2. 依据 FastChat API 启用时的 `model_name` 和 `api_base` 链接,在本项目的 `configs/model_config.py` 的 `llm_model_dict` 中增加选项。如: - ```python - llm_model_dict = { - - # 通过 fastchat 调用的模型请参考如下格式 - "fastchat-chatglm-6b": { - "name": "chatglm-6b", # "name"修改为fastchat服务中的"model_name" - "pretrained_model_name": "chatglm-6b", - "local_model_path": None, - "provides": "FastChatOpenAILLM", # 使用fastchat api时,需保证"provides"为"FastChatOpenAILLM" - "api_base_url": "http://localhost:8000/v1" # "name"修改为fastchat服务中的"api_base_url" - }, - } - ``` - 其中 `api_base_url` 根据 FastChat 部署时的 ip 地址和端口号得到,如 ip 地址设置为 `localhost`,端口号为 `8000`,则应设置的 `api_base_url` 为 `http://localhost:8000/v1` - -3. 将 `configs/model_config.py` 中的 `LLM_MODEL` 修改为对应模型名。如: - ```python - LLM_MODEL = "fastchat-chatglm-6b" - ``` -4. 根据需求运行 `api.py`, `cli_demo.py` 或 `webui.py`。 \ No newline at end of file diff --git a/docs/向量库环境docker.md b/docs/向量库环境docker.md index b06bdb9..162b0f0 100644 --- a/docs/向量库环境docker.md +++ b/docs/向量库环境docker.md @@ -1,6 +1,6 @@ -向量库环境docker-compose.yml文件在docs/docker/vector_db中 +向量库环境 docker-compose.yml 文件在 docs/docker/vector_db 中 -以milvus为例 +以 milvus 为例 ```shell cd docs/docker/vector_db/milvus docker-compose up -d