Commit Graph

33 Commits

Author SHA1 Message Date
wvivi2023 cc706ce7ef enhance log 2024-04-02 10:32:34 +08:00
wvivi2023 6ed7002758 fix the issue where uploading a file and embedding take too long 2024-04-01 13:54:24 +08:00
wvivi2023 7b9369e625 enhance pdf loader 2024-03-18 09:28:30 +08:00
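wvivi2023 3b49f2da54 remove the punctuation check on second- and third-level catalog titles 2024-03-12 11:09:37 +08:00
wvivi2023 53fb9f6319 fix duplicate insertion into ES and the document chunk string length calculation counting spaces and newlines 2024-03-06 14:50:45 +08:00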
wvivi2023 bac5b22879 title enhancement 2024-02-28 16:40:45 +08:00
wvivi2023 d123ad7c29 add support for fourth-level catalogs and title enhancement for the other supported catalog levels 2024-02-27 17:24:04 +08:00
wvivi2023 4acc5e7ad9 modify logo 2024-02-19 09:49:44 +08:00
wvivi2023 99969ef1e3 enhance first-level catalog 2024-01-19 15:54:26 +08:00
wvivi2023 173b23ad7d enhance 2024-01-18 15:44:14 +08:00
wvivi2023 51424db243 enhance RapidWordLoader 2024-01-17 10:49:59 +08:00
wvivi2023 565a94c1bb customize word loader 2024-01-10 10:45:47 +08:00
wvivi2023 7b7a180323 merge 0.2.6 2024-01-02 10:10:41 +08:00
wvivi2023 5c8610f47f enhance 3rd catalog content 2023-12-28 10:52:52 +08:00
wvivi2023 9f327e71e4 enhance splitter algorithm 2023-12-26 15:40:45 +08:00
wvivi2023 540ff09486 enhance 2023-12-15 10:28:11 +08:00
wvivi2023 33dc60df5e commit log 2023-12-15 09:48:42 +08:00
wvivi2023 2ac52147d3 fix merging issue 2023-12-15 09:48:22 +08:00
wvivi2023 77bc5891c8 manually split 2023-12-15 08:59:13 +08:00
wvivi2023 bf21b8f116 merge single line to the next content 2023-12-13 18:06:49 +08:00
wvivi2023 c936f040e4 optimize first-level catalog chunking 2023-12-05 16:51:45 +08:00
wvivi2023 dce1d16e29 enhance splitter 2023-11-29 13:25:44 +08:00
wvivi2023 9ba2120129 enhance 2023-11-23 12:38:31 +08:00
wvivi2023 a59767711b enhance
split by 1.2 first then split by 1.23
2023-11-14 18:10:41 +08:00
wvivi2023 60a12c05f6 0.2.6 enhance 2023-11-13 09:20:19 +08:00
wvivi2023 526c4b52a8 search related doc title before similarity search 2023-11-06 08:57:58 +08:00
imClumsyPanda fbaca1009e update requirements.txt, requirements_api.txt, test_different_splitter.py and chinese_recursive_text_splitter.py 2023-09-14 22:59:05 +08:00
zR bfdbe69fa1
Add custom tokenizer adaptation (#1462)
* add custom tokenizer adaptation and test files
---------

Co-authored-by: zR <zRzRzRzRzRzRzR>
2023-09-13 15:42:12 +08:00
imClumsyPanda 4aa14b859e
Add ChineseRecursiveTextSplitter (#1447)
* add RapidOCRPDFLoader

* update mypdfloader.py and requirements.txt

* add myimgloader.py

* add test samples

* add TODO to mypdfloader

* add loaders to KnowledgeFile class

* add loaders to KnowledgeFile class

* add ChineseRecursiveTextSplitter

* add ChineseRecursiveTextSplitter
2023-09-12 17:38:52 +08:00
imClumsyPanda 8d463a31fd update import pkgs and format 2023-08-10 21:50:38 +08:00
imClumsyPanda 8a4d9168fa update import pkgs and format 2023-08-10 21:26:05 +08:00
imClumsyPanda 24a280ce8c re-add zh_title_enhance.py 2023-08-09 23:09:24 +08:00
imClumsyPanda dcf49a59ef v0.2.0 first commit 2023-07-27 23:22:07 +08:00