Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / opendatalab/MinerU issues and pull requests
#975 - docs: update feature description for table conversion
Pull Request -
State: closed - Opened by myhloli 13 days ago
#974 - docs: improve GPU support list formatting in README_zh-CN.md
Pull Request -
State: closed - Opened by myhloli 13 days ago
#973 - docs(README): update GPU hardware recommendations and table recognition options
Pull Request -
State: closed - Opened by myhloli 13 days ago
#972 - magic_pdf.user_api:parse_pdf:97 - string index out of range
Issue -
State: closed - Opened by yibie 14 days ago
- 5 comments
Labels: bug
#971 - fix: fix issue opendatalab#715
Pull Request -
State: closed - Opened by LollipopsAndWine 14 days ago
#970 - Support for Python 3.11 and later?
Issue -
State: open - Opened by stevenhe1988 14 days ago
- 2 comments
Labels: enhancement
#969 - Release 0.9.3
Pull Request -
State: closed - Opened by myhloli 14 days ago
- 1 comment
#968 - Dev to 0.9.3
Pull Request -
State: closed - Opened by myhloli 14 days ago
#967 - docs(README): update project references and translations
Pull Request -
State: closed - Opened by myhloli 14 days ago
#966 - Dev to 0.9.3
Pull Request -
State: closed - Opened by myhloli 14 days ago
- 1 comment
#965 - docs:update docs for 0.9.3
Pull Request -
State: closed - Opened by myhloli 14 days ago
#964 - refactor(model): rename and restructure model modules
Pull Request -
State: closed - Opened by myhloli 14 days ago
#963 - Questions and suggestions about PDF document conversion
Issue -
State: closed - Opened by Tian14267 14 days ago
- 5 comments
Labels: enhancement
#962 - How to configure StructTable-InternVL2-1B for table extraction
Issue -
State: closed - Opened by 77981836 14 days ago
- 1 comment
Labels: enhancement
#959 - Add progress display
Issue -
State: open - Opened by lamquan1220 14 days ago
- 3 comments
Labels: enhancement
#958 - Can skewed PDFs be automatically deskewed and orientation-corrected?
Issue -
State: open - Opened by jasinliu 15 days ago
- 1 comment
Labels: enhancement
#957 - fix(parse_pipeline): Resolve post-processing exceptions caused by partial PDFs due to file corruption or non-standard format by forcing a re-print.
Pull Request -
State: closed - Opened by myhloli 15 days ago
- 1 comment
#956 - Latest version fails to recognize the annotation content in annotated documents
Issue -
State: closed - Opened by wahahaer 15 days ago
- 3 comments
Labels: bug
#955 - Dockerfile at version d0558ab fails to build because the yaml library is missing
Issue -
State: closed - Opened by cyicz123 15 days ago
- 2 comments
Labels: bug
#954 - Beginner question about the singleton pattern and multiprocessing in the project
Issue -
State: closed - Opened by lidaoming 15 days ago
- 1 comment
Labels: enhancement
#953 - Proposal to Fix IndexError: string index out of range in magic_pdf/para/para_split_v3.py
Issue -
State: closed - Opened by HiroshigeAoki 15 days ago
- 1 comment
Labels: bug
#952 - paddle error
Issue -
State: closed - Opened by zhangtianhong-1998 15 days ago
- 2 comments
Labels: bug
#951 - pull request 20241114
Pull Request -
State: open - Opened by lztiancn 15 days ago
- 1 comment
#950 - Question: why does the self-deployed webdemo differ from the webdemo on opendatalab?
Issue -
State: closed - Opened by sky84892070 16 days ago
- 3 comments
Labels: bug
#949 - multi-gpu error
Issue -
State: open - Opened by simplew2011 16 days ago
- 2 comments
Labels: bug
#948 - feat: tune docs
Pull Request -
State: closed - Opened by icecraft 16 days ago
- 1 comment
#947 - Annotation numbers and watermarks are recognized incorrectly
Issue -
State: closed - Opened by simplew2011 16 days ago
- 1 comment
Labels: bug
#946 - Request for Plain Text/Markdown Table Data Extraction Option
Issue -
State: open - Opened by soyeb-PQ 16 days ago
- 1 comment
Labels: enhancement
#945 - [chore] udpate DockerFile to fix build bugs
Pull Request -
State: open - Opened by ProseGuys 16 days ago
- 4 comments
#944 - Can section headings in the markdown output inherit the heading hierarchy of the PDF? Currently they are all at the same level
Issue -
State: open - Opened by gcy0926 16 days ago
- 4 comments
Labels: enhancement
#943 - fix(ocr_mkcontent): improve handling of single-character content #937
Pull Request -
State: closed - Opened by myhloli 16 days ago
- 1 comment
#942 - Words are split when a paper title contains differing font sizes
Issue -
State: closed - Opened by gcy0926 16 days ago
- 7 comments
Labels: bug
#941 - build(Dockerfile): update model download script and dependencies
Pull Request -
State: closed - Opened by myhloli 16 days ago
#940 - magic-pdf usage does not match the README, and running it raises an error
Issue -
State: closed - Opened by gcy0926 16 days ago
- 2 comments
Labels: bug
#939 - How magic-pdf.json is generated
Issue -
State: closed - Opened by Runningwater2357 16 days ago
Labels: bug
#938 - fix: correct the download_models.py script path in the Dockerfile
Pull Request -
State: closed - Opened by kimi360 16 days ago
- 2 comments
#937 - No space between the number and the heading in `markdown`
Issue -
State: closed - Opened by bwnjnOEI 17 days ago
- 4 comments
Labels: bug
#936 - Reading-order sorting with layoutreader
Issue -
State: open - Opened by lyc728 17 days ago
- 6 comments
#935 - Error: paddlepaddle library function not found
Issue -
State: closed - Opened by vanchy-z 17 days ago
- 3 comments
Labels: bug
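#934 - Content segmentation in documents is inaccurate; how can I fine-tune? Or, after fine-tuning layoutLMv3 separately, does the minerU code need to be changed?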
Issue -
State: open - Opened by aodingpeng 17 days ago
Labels: bug
#933 - Request: an output file with the category and coordinates of each layout color block; currently the output appears to be given per line
Issue -
State: closed - Opened by charliedream1 17 days ago
- 1 comment
Labels: enhancement
#932 - Can model loading be separated from the processing pipeline, so models are not reloaded each time and processing is faster?
Issue -
State: closed - Opened by charliedream1 17 days ago
- 7 comments
Labels: enhancement
#931 - fix: typo
Pull Request -
State: closed - Opened by icecraft 17 days ago
- 1 comment
#929 - delete test dir
Pull Request -
State: closed - Opened by dt-yy 18 days ago
#928 - docs: rewrite zh_cn docs without translate
Pull Request -
State: closed - Opened by icecraft 18 days ago
- 1 comment
#927 - Style/docs
Pull Request -
State: closed - Opened by icecraft 18 days ago
- 1 comment
#926 - Table recognition is very slow, more than ten times slower than with the table model disabled
Issue -
State: closed - Opened by charliedream1 18 days ago
- 6 comments
Labels: bug
#925 - docs(README_ja-JP.md): update warning message and remove outdated content
Pull Request -
State: closed - Opened by myhloli 18 days ago
#924 - docs(readme): update table recognition configuration and documentation
Pull Request -
State: closed - Opened by myhloli 18 days ago
#923 - Update para_split_v3.py
Pull Request -
State: closed - Opened by hyastar 18 days ago
#922 - refactor(model download script)
Pull Request -
State: closed - Opened by myhloli 18 days ago
#921 - Fix indexerror in para split v3
Pull Request -
State: closed - Opened by hyastar 18 days ago
- 5 comments
#920 - chage test from dev to master
Pull Request -
State: closed - Opened by DTwz 18 days ago
- 1 comment
#919 - fix: remove classes hierarchy diagram
Pull Request -
State: closed - Opened by icecraft 18 days ago
- 1 comment
#918 - Dev
Pull Request -
State: closed - Opened by DTwz 18 days ago
- 1 comment
#917 - Fix indexerror in para split v3
Pull Request -
State: closed - Opened by hyastar 18 days ago
- 1 comment
#916 - Fix IndexError in para_split_v3.py for empty line handling
Pull Request -
State: closed - Opened by hyastar 20 days ago
- 3 comments
#915 - feat(table): add RapidOCR support for RapidTable model
Pull Request -
State: closed - Opened by myhloli 20 days ago
#914 - test(table): improve ppTableModel test coverage
Pull Request -
State: closed - Opened by myhloli 21 days ago
#913 - Modify the test directory
Pull Request -
State: closed - Opened by DTwz 21 days ago
- 1 comment
#912 - refactor(magic_pdf_parse_main): optimize model data handling and JSON output
Pull Request -
State: closed - Opened by myhloli 21 days ago
#911 - fix(gradio-app): add missing file type in upload
Pull Request -
State: closed - Opened by myhloli 21 days ago
#910 - feat(table): integrate RapidTable model for table recognition
Pull Request -
State: closed - Opened by myhloli 21 days ago
#909 - Incorrect reading order when recognizing multi-column documents
Issue -
State: open - Opened by guoguo0646 21 days ago
- 3 comments
Labels: bug
#908 - FatalError: `Erroneous arithmetic operation` is detected by the operating system.
Issue -
State: closed - Opened by bottleofwater11 21 days ago
- 9 comments
Labels: bug
#907 - feat: using next_docs
Pull Request -
State: closed - Opened by icecraft 21 days ago
- 1 comment
#906 - Feat/add en docs
Pull Request -
State: closed - Opened by icecraft 21 days ago
- 1 comment
#904 - circled{1}
Issue -
State: open - Opened by zhongxin129 21 days ago
- 4 comments
Labels: bug
#903 - Parsing fails with error 500
Issue -
State: open - Opened by zhongxin129 21 days ago
- 1 comment
Labels: bug
#899 - more interactive web app
Issue -
State: open - Opened by pJahad 21 days ago
- 1 comment
Labels: enhancement
#897 - Installation error: ERROR: Failed building wheel for simsimd
Issue -
State: closed - Opened by windowLiu 22 days ago
- 18 comments
Labels: bug
#889 - Add DocLayout-YOLO hyperlink
Pull Request -
State: closed - Opened by qiangqiang199 23 days ago
#888 - Add DocLayout-YOLO hyperlink
Pull Request -
State: closed - Opened by qiangqiang199 23 days ago
- 1 comment
#879 - Release 0.9.1
Pull Request -
State: closed - Opened by myhloli 23 days ago
- 1 comment
#878 - docs(README): update changelog for v0.9.1 release
Pull Request -
State: closed - Opened by myhloli 23 days ago
#877 - docs(README): update changelog for v0.9.1 release
Pull Request -
State: closed - Opened by myhloli 23 days ago
#876 - docs: update arXiv paper link in README files
Pull Request -
State: closed - Opened by myhloli 23 days ago
#875 - docs: update arXiv paper link in README files
Pull Request -
State: closed - Opened by myhloli 23 days ago
#874 - test(table): improve HTML validation for table extraction
Pull Request -
State: closed - Opened by myhloli 23 days ago
#871 - feat: replace the mineru_demo API documentation with a link
Pull Request -
State: closed - Opened by LollipopsAndWine 24 days ago
#869 - fix: add ci repository
Pull Request -
State: open - Opened by dt-yy 24 days ago
#868 - Origin of middle.json in the maigic-pdf PDF parsing output; can it be used to render the parsed md file on the front end and restore the PDF's page layout?
Issue -
State: open - Opened by liy-a 24 days ago
- 2 comments
Labels: enhancement
#867 - docs(faq): add troubleshooting for illegal instruction error on Linux servers
Pull Request -
State: closed - Opened by myhloli 24 days ago
- 1 comment
#866 - fix(table): improve table image processing
Pull Request -
State: closed - Opened by myhloli 24 days ago
- 1 comment
#865 - How is the reading-order logic in the 0.9.0 branch used?
Issue -
State: closed - Opened by qrsssh 24 days ago
- 2 comments
Labels: bug
#864 - Bug in the get_most_common_bbox code in preprocessing
Issue -
State: closed - Opened by imgits 24 days ago
- 1 comment
Labels: bug
#863 - Parsing bug when parsing PDF hyperlinks
Issue -
State: closed - Opened by ouwen18 24 days ago
- 4 comments
Labels: bug
#862 - Latest version 0.9.0 conflicts with paddlepaddle-gpu 3.0.0b1 over the cudnn package dependency
Issue -
State: closed - Opened by yuanyuan25 24 days ago
- 1 comment
Labels: bug
#861 - magick-pdf 0.9.0: model loading in the new version seems to be missing something
Issue -
State: closed - Opened by 1134018901 24 days ago
- 3 comments
#860 - docs(README): update Colab demo link
Pull Request -
State: closed - Opened by myhloli 24 days ago
- 2 comments
#859 - Heading recognition for Chinese documents is very inaccurate, and layout and spans are scrambled when detecting English files
Issue -
State: closed - Opened by aodingpeng 24 days ago
- 3 comments
Labels: bug
#858 - chore: add CSS and SCSS files to linguist-vendored- Update .gitattributes to mark CSS and SCSS files as vendored
Pull Request -
State: closed - Opened by myhloli 25 days ago
#857 - fix(merge_text): add ligature replacement functionality #305 #241
Pull Request -
State: closed - Opened by myhloli 25 days ago
- 1 comment
#856 - chore: add .gitattributes to configure file linguist attributes
Pull Request -
State: closed - Opened by myhloli 25 days ago
#855 - feat(model): add HTML minification to StructTableModel
Pull Request -
State: closed - Opened by myhloli 25 days ago
#854 - feat(table): upgrade StructEqTable model and integrate into PDF Extract Kit
Pull Request -
State: closed - Opened by myhloli 25 days ago
#853 - Update pdf_extract_kit.py
Pull Request -
State: closed - Opened by CiaranYoung 25 days ago
- 2 comments