Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / ocrmypdf/OCRmyPDF issues and pull requests

#1058 - substitute broken link (#1057)

Pull Request - State: closed - Opened by LucasLarson over 1 year ago

#1057 - [BUG] docs: links to brewformulas.org no longer work

Issue - State: closed - Opened by LucasLarson over 1 year ago

#1056 - output JSON format

Issue - State: closed - Opened by emresaracoglu over 1 year ago - 1 comment

#1055 - Is it possible to add paddleocr as an option for ocr?

Issue - State: closed - Opened by nissansz over 1 year ago - 4 comments

#1053 - fix crash on PDF

Pull Request - State: closed - Opened by frrad over 1 year ago - 1 comment

#1052 - [BUG] crash when trying to process a pdf

Issue - State: closed - Opened by frrad over 1 year ago - 1 comment

#1051 - Feature request: Ask user what likely-incorrect words are

Issue - State: closed - Opened by mattention over 1 year ago - 1 comment
Labels: enhancement

#1049 - [BUG] `--deskew` not compatible with blank pages or with tesseract_timeout = 0

Issue - State: closed - Opened by deexpabada almost 2 years ago - 6 comments

#1048 - Fixed the source installation instructions

Pull Request - State: closed - Opened by yasoob almost 2 years ago - 1 comment

#1047 - Fix tesseract documentation url

Pull Request - State: closed - Opened by CGarces almost 2 years ago

#1046 - Memory leak ocrmypdf.ocr vs subprocess.run

Issue - State: closed - Opened by CGarces almost 2 years ago - 5 comments

#1045 - Fixed some wording

Pull Request - State: closed - Opened by yasoob almost 2 years ago

#1044 - log completion message

Pull Request - State: closed - Opened by drinckes almost 2 years ago

#1043 - Way to test PDF to see if there is any text?

Issue - State: closed - Opened by spedinfargo almost 2 years ago - 1 comment

#1042 - OCR for Comic Book PDFs -- Possible Solution

Issue - State: closed - Opened by yosamsimiti almost 2 years ago - 2 comments

#1041 - Spaces in Japanese

Issue - State: closed - Opened by KajiyaOokami almost 2 years ago - 4 comments
Labels: third party issue

#1040 - Ignore Digital Signed Documents

Issue - State: closed - Opened by flaviobrunopereira almost 2 years ago - 2 comments
Labels: need test file

#1039 - Fixed interchanged words

Pull Request - State: closed - Opened by yasoob almost 2 years ago - 1 comment

#1038 - Draw/Blanking on wrong spot

Issue - State: open - Opened by emre1e almost 2 years ago - 1 comment

#1037 - read_params_file: Can't open pdf/txt -- new issue -- help!

Issue - State: closed - Opened by yosamsimiti almost 2 years ago - 4 comments

#1035 - Garbled order of OCR'ed contents

Issue - State: open - Opened by rkevk almost 2 years ago - 2 comments

#1034 - ocrmypdf cannot convert pages with watermarks.

Issue - State: closed - Opened by marlarius almost 2 years ago - 3 comments

#1032 - Remove blank page without recognizable characters of the ocr

Issue - State: open - Opened by gitmors almost 2 years ago

#1031 - Question: multiple import folders possible?

Issue - State: closed - Opened by Maximus48p almost 2 years ago - 1 comment

#1029 - How to reduce ram usage

Issue - State: closed - Opened by alirf81 almost 2 years ago - 1 comment

#1024 - Issue packaging with pyinstaller

Issue - State: open - Opened by kiyros almost 2 years ago - 5 comments

#1019 - Debian maintainer requested for OCRmyPDF and pikepdf

Issue - State: closed - Opened by jbarlow83 about 2 years ago - 1 comment
Labels: help wanted

#1018 - present OCRmyPDF at normconf

Issue - State: closed - Opened by mu22le about 2 years ago - 1 comment

#1015 - Inverted black and white from optimization

Issue - State: open - Opened by Jmuccigr about 2 years ago - 9 comments

#1010 - How to use a timeout for gs?

Issue - State: closed - Opened by svenha about 2 years ago - 13 comments

#1009 - OCR picks up all the text, but alignment is off

Issue - State: closed - Opened by nchammas about 2 years ago - 2 comments

#1004 - OCRmyPDF assumes really large DPI for native PDF when rasterizing as image

Issue - State: closed - Opened by fabiante about 2 years ago - 3 comments

#1003 - How to keep source file time, date, metadata.... etc for Target File?

Issue - State: closed - Opened by limopc about 2 years ago - 4 comments

#977 - optimize.py doesn't process images with subtype Form

Issue - State: closed - Opened by imz over 2 years ago - 4 comments
Labels: enhancement

#961 - "--force-ocr" switch increases size of pdf by factor 25

Issue - State: open - Opened by wildgruber over 2 years ago - 4 comments

#948 - Double to quadruple file size and worse quality with --deskew --clean-final (due to mask?)

Issue - State: open - Opened by bllngr over 2 years ago - 2 comments
Labels: bug

#944 - "remove-background not implemented"

Issue - State: closed - Opened by bouboulov over 2 years ago - 6 comments

#942 - Creating txt file without an output pdf. Examples missing for correct syntax.

Issue - State: closed - Opened by gevezex over 2 years ago - 3 comments

#931 - `--redo-ocr` adds extra text to the PDF

Issue - State: closed - Opened by DUOLabs333 over 2 years ago - 1 comment

#906 - support monochromatic conversion

Issue - State: closed - Opened by jknockaert over 2 years ago - 6 comments

#897 - --redo-ocr doesn't remove previous ocr-text layer made by ocrmypdf

Issue - State: open - Opened by Mark-Joy over 2 years ago - 2 comments

#872 - cannot run under python 3.10

Issue - State: closed - Opened by starsareintherose almost 3 years ago - 5 comments

#868 - Blank pages cause the process to crash due to tesseract

Issue - State: closed - Opened by philayres almost 3 years ago - 3 comments

#827 - ocrmypdf --redo-ocr fails with DecompressionBombError on small PDF

Issue - State: closed - Opened by nicolasguinot about 3 years ago - 4 comments

#814 - Hanging on Random Files

Issue - State: closed - Opened by jgforbes about 3 years ago - 24 comments

#807 - Hebrew text seems to be reversed(whole line) on OCR-ed pdf

Issue - State: open - Opened by Kors1981 about 3 years ago - 5 comments
Labels: user config

#781 - Correcting recognition errors - possible with sidecar option?

Issue - State: closed - Opened by jdescelliers over 3 years ago - 5 comments

#766 - [ENHANCEMENT] Google Colab notebook

Issue - State: closed - Opened by louispaulet over 3 years ago - 3 comments

#748 - Jbig2 dependency on windows

Issue - State: closed - Opened by mortang2410 over 3 years ago - 21 comments

#721 - --force-ocr converts JBIG2 images to 24-bit

Issue - State: closed - Opened by alawvt over 3 years ago - 6 comments
Labels: bug

#715 - extra space in the result pdf when the input pdf is in Chinese

Issue - State: open - Opened by Eyxxxxx over 3 years ago - 20 comments
Labels: third party issue

#659 - Improving Windows with PyInstaller - Ocrmypdf Distribution Not Found

Issue - State: open - Opened by gabemorris12 almost 4 years ago - 14 comments
Labels: enhancement

#631 - liblept-5.dll load fails on Windows 10 (OSError 0x7F)

Issue - State: closed - Opened by Suyash458 about 4 years ago - 14 comments
Labels: bug, third party issue

#623 - Can you tell me what docker command I should run in order to make the docker image work?

Issue - State: closed - Opened by 5aumy4 about 4 years ago - 5 comments
Labels: question

#595 - Azure ocr with ocrmypdf

Issue - State: open - Opened by sandipan1 about 4 years ago - 13 comments
Labels: enhancement

#590 - Pass existing OCR-Data in ALTO-Format

Issue - State: closed - Opened by M3ssman about 4 years ago - 3 comments
Labels: enhancement

#551 - "--force-ocr" mangles some pages in a pdf

Issue - State: open - Opened by wojciechbielecki over 4 years ago - 5 comments
Labels: bug

#550 - --threshold-final

Issue - State: open - Opened by femifrak over 4 years ago - 4 comments
Labels: enhancement

#541 - Introduce a way to radically reduce the output file size (sacrificing image quality)

Issue - State: closed - Opened by heinrich-ulbricht over 4 years ago - 88 comments
Labels: enhancement

#539 - Chocolatey package for Windows

Issue - State: closed - Opened by jbarlow83 over 4 years ago - 5 comments
Labels: enhancement, help wanted

#528 - Add support for PDF/A-2u or PDF/A-2a

Issue - State: closed - Opened by frederictobiasc over 4 years ago - 1 comment
Labels: enhancement

#514 - Error: File did not complete the page properly and may be damaged.

Issue - State: open - Opened by tice17 over 4 years ago - 8 comments

#495 - Searching math equations

Issue - State: closed - Opened by karasjoh000 over 4 years ago - 4 comments
Labels: enhancement

#488 - cx_Freeze support - packaging on Windows

Issue - State: closed - Opened by Faisalsouz over 4 years ago - 2 comments
Labels: help wanted, third party issue

#487 - Command line option deskew not found but d is available

Issue - State: closed - Opened by paazmaya over 4 years ago - 6 comments
Labels: user config

#483 - Provide tsv and hocr output files

Issue - State: closed - Opened by ArlindNocaj over 4 years ago - 2 comments

#460 - fatal error: qpdf/Constants.h - pip3 install ocrmypdf and pikepdf on ubuntu win10 subsystem failed

Issue - State: closed - Opened by mhechthz almost 5 years ago - 27 comments
Labels: user config

#458 - deskew and rotate but skip ocr?

Issue - State: closed - Opened by barrars almost 5 years ago - 3 comments

#453 - hocr import / export

Issue - State: closed - Opened by aalmir almost 5 years ago - 38 comments
Labels: enhancement

#450 - Text layer not aligned with original document

Issue - State: open - Opened by wpzdm almost 5 years ago - 8 comments
Labels: need test file

#446 - Pdf error with tables

Issue - State: open - Opened by miguelgarces123 almost 5 years ago - 20 comments

#445 - Support for JPEG2000, jp2 output

Issue - State: closed - Opened by aalmir almost 5 years ago - 4 comments

#443 - Implement optional downsampling as part of preprocessing

Issue - State: closed - Opened by jbarlow83 almost 5 years ago - 5 comments
Labels: enhancement

#437 - support converting multiple images

Issue - State: closed - Opened by grexe almost 5 years ago - 3 comments
Labels: enhancement

#428 - Check if OCR images would be >2^31 bytes

Issue - State: closed - Opened by jbarlow83 about 5 years ago - 5 comments

#410 - Error: unable to find trailer dictionary while recovering damaged file

Issue - State: closed - Opened by fuzihaofzh about 5 years ago - 7 comments
Labels: need test file

#364 - Create an AppImage for ocrmypdf

Issue - State: closed - Opened by jbarlow83 over 5 years ago - 6 comments
Labels: help wanted

#351 - Feature request: additional post-processing options

Issue - State: closed - Opened by WillemJansen over 5 years ago - 1 comment
Labels: enhancement

#318 - AttributeError: 'str' object has no attribute 'option_strings'

Issue - State: closed - Opened by jbarlow83 almost 6 years ago - 2 comments

#316 - Output PDF is getting distorted on each ocrmypdf command.

Issue - State: closed - Opened by DEEPAK-KESWANI almost 6 years ago - 15 comments

#293 - file size increase for pdf/a

Issue - State: closed - Opened by femifrak about 6 years ago - 11 comments

#258 - Best way to handle PDF with mixed content ?

Issue - State: closed - Opened by guldil over 6 years ago - 13 comments

#242 - PDF/A without MuPDF deletes bookmarks

Issue - State: closed - Opened by jbarlow83 over 6 years ago - 1 comment

#237 - Excessive file size growth with --force-ocr

Issue - State: closed - Opened by jbarlow83 over 6 years ago - 5 comments

#209 - ERROR: [tesseract] read_params_file: Can't open txt

Issue - State: closed - Opened by dev-code-davis over 6 years ago - 9 comments

#202 - NixOS packaging issues

Issue - State: closed - Opened by sjau almost 7 years ago - 28 comments

#177 - Add HOCR output as a sidecar option

Issue - State: closed - Opened by parkerhancock about 7 years ago - 7 comments
Labels: enhancement

#139 - Just saying thanks!!!

Issue - State: open - Opened by ericmjl over 7 years ago - 14 comments

#125 - Output PDFs have decreased quality

Issue - State: closed - Opened by Wikinaut over 7 years ago - 13 comments

#115 - Reduce memory usage for very large files (high page count and large file size)

Issue - State: open - Opened by jbarlow83 almost 8 years ago - 7 comments

#66 - ocr corrections

Issue - State: closed - Opened by femifrak over 8 years ago - 8 comments