wvivi2023
7b7a180323
merge 0.2.6
2024-01-02 10:10:41 +08:00
wvivi2023
5c8610f47f
enhance 3rd catalog content
2023-12-28 10:52:52 +08:00
wvivi2023
9f327e71e4
enhance splitter algorithm
2023-12-26 15:40:45 +08:00
wvivi2023
540ff09486
enhance
2023-12-15 10:28:11 +08:00
wvivi2023
33dc60df5e
commit log
2023-12-15 09:48:42 +08:00
wvivi2023
2ac52147d3
fix merging issue
2023-12-15 09:48:22 +08:00
wvivi2023
77bc5891c8
manually split
2023-12-15 08:59:13 +08:00
wvivi2023
bf21b8f116
merge single line to the next content
2023-12-13 18:06:49 +08:00
wvivi2023
c936f040e4
optimize first-level catalog clause splitting
2023-12-05 16:51:45 +08:00
wvivi2023
dce1d16e29
enhance splitter
2023-11-29 13:25:44 +08:00
wvivi2023
9ba2120129
enhance
2023-11-23 12:38:31 +08:00
wvivi2023
a59767711b
enhance
...
split by 1.2 first then split by 1.23
2023-11-14 18:10:41 +08:00
wvivi2023
60a12c05f6
0.2.6 enhance
2023-11-13 09:20:19 +08:00
wvivi2023
526c4b52a8
search related doc title before similarity search
2023-11-06 08:57:58 +08:00
imClumsyPanda
fbaca1009e
update requirements.txt, requirements_api.txt, test_different_splitter.py and chinese_recursive_text_splitter.py
2023-09-14 22:59:05 +08:00
zR
bfdbe69fa1
add custom tokenizer adaptation ( #1462 )
...
* add custom tokenizer adaptation and test files
---------
Co-authored-by: zR <zRzRzRzRzRzRzR>
2023-09-13 15:42:12 +08:00
imClumsyPanda
4aa14b859e
add ChineseRecursiveTextSplitter ( #1447 )
...
* add RapidOCRPDFLoader
* update mypdfloader.py and requirements.txt
* add myimgloader.py
* add test samples
* add TODO to mypdfloader
* add loaders to KnowledgeFile class
* add loaders to KnowledgeFile class
* add ChineseRecursiveTextSplitter
* add ChineseRecursiveTextSplitter
2023-09-12 17:38:52 +08:00
imClumsyPanda
8d463a31fd
update import pkgs and format
2023-08-10 21:50:38 +08:00
imClumsyPanda
8a4d9168fa
update import pkgs and format
2023-08-10 21:26:05 +08:00
imClumsyPanda
24a280ce8c
re-add zh_title_enhance.py
2023-08-09 23:09:24 +08:00
imClumsyPanda
dcf49a59ef
v0.2.0 first commit
2023-07-27 23:22:07 +08:00