Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / chrismattmann/tika-python issues and pull requests

#419 - Copy #409 from upstream

Pull Request - State: closed - Opened by baughmann about 1 month ago - 1 comment

#418 - Are you still working on this, update tika?

Issue - State: open - Opened by BBC-Esq 3 months ago - 2 comments

#417 - 403 Forbidden Tika server error

Issue - State: open - Opened by PratyushROpeneyes 5 months ago

#416 - 404 error in tika2.6.0

Issue - State: open - Opened by LaniakeaS 7 months ago

#414 - killServer fails to stop tika

Issue - State: open - Opened by mcantrell 8 months ago

#411 - Is there any way to preserve temp files?

Issue - State: open - Opened by qptest 10 months ago

#410 - SSRF vulnerability: CVE-2022-46364

Issue - State: open - Opened by anushakabber 10 months ago

#409 - Add automated documentation

Pull Request - State: open - Opened by aleksandrskrivickis 10 months ago

#408 - Allow for v2.2.0 for parsing

Pull Request - State: closed - Opened by ditikrushna 10 months ago

#407 - Modify the Tika Python code to use only Tika version >2.

Pull Request - State: closed - Opened by ditikrushna 10 months ago

#406 - Tika server 2.9.1 Pdf tesseract Ocr

Issue - State: open - Opened by Tarik37 11 months ago

#405 - Can this receive a io[bytes] type?

Issue - State: open - Opened by Mathacc 11 months ago - 1 comment

#403 - Permission denied

Issue - State: closed - Opened by nautilux2 over 1 year ago - 1 comment
Labels: invalid, question, wontfix

#402 - Unable to start Tika server

Issue - State: open - Opened by kevin-guimard-ext over 1 year ago

#401 - unable to run tika

Issue - State: closed - Opened by riyaj8888 over 1 year ago - 1 comment
Labels: help wanted, invalid, question, wontfix

#400 - Need to run tika server manualy but previously it works without tika

Issue - State: closed - Opened by mahmudtopu3 over 1 year ago - 1 comment
Labels: help wanted, question, wontfix

#399 - Updated tika to use sha1 hash instead of md5 for checksum

Pull Request - State: open - Opened by griffin-rickle over 1 year ago - 2 comments
Labels: enhancement, help wanted

#398 - Inclusion of PDF Metadata Title field in Extracted Content

Issue - State: closed - Opened by teohsinyee over 1 year ago - 1 comment
Labels: question, wontfix

#397 - Increase retry duration in client only mode

Issue - State: closed - Opened by saraswat40 over 1 year ago - 1 comment
Labels: help wanted, question, wontfix

#396 - Timeline for tika 2.8 support

Issue - State: closed - Opened by vasutrave over 1 year ago - 3 comments
Labels: help wanted, question

#395 - Implement test running using GitHub actions

Pull Request - State: closed - Opened by stumpylog almost 2 years ago - 5 comments
Labels: enhancement, question, py3

#394 - Hi i am getting the same error

Issue - State: closed - Opened by dhikshitha29 almost 2 years ago - 1 comment
Labels: bug, invalid, question, wontfix

#393 - Can tika extract "Marked Content" (tagged PDFs)?

Issue - State: closed - Opened by MartinThoma almost 2 years ago - 2 comments
Labels: help wanted, question, wontfix

#392 - Help installing package on macOS M2 Ventura

Issue - State: closed - Opened by shamoon almost 2 years ago - 3 comments
Labels: help wanted, question, wontfix

#391 - fix(tika): Update download link due to broken URL

Pull Request - State: closed - Opened by sa2812 almost 2 years ago - 1 comment
Labels: bug, enhancement, invalid, wontfix

#390 - Airgap Environment Setup is unable to start Tika server

Issue - State: closed - Opened by Marcos-A almost 2 years ago - 6 comments
Labels: help wanted, question

#388 - 'charmap' codec can't decode byte 0x81 in position 279: character maps to <undefined>

Issue - State: closed - Opened by MohammadFneish7 about 2 years ago - 2 comments
Labels: bug, enhancement, help wanted

#387 - fix unpack from_file/from_buffer headers arg

Pull Request - State: closed - Opened by deadc0de6 about 2 years ago - 6 comments
Labels: bug, enhancement, question

#386 - On older versions of Python (2.7), the unpack tests fail

Issue - State: closed - Opened by chrismattmann about 2 years ago
Labels: bug, enhancement, py3, py2

#385 - Fix test case files

Issue - State: closed - Opened by chrismattmann about 2 years ago - 1 comment
Labels: bug, enhancement

#384 - portions of strings getting cut off with "..."

Issue - State: open - Opened by BCorbeek about 2 years ago - 6 comments
Labels: bug, enhancement, help wanted, question

#383 - Tika-python is not extracting texts properly?

Issue - State: closed - Opened by mrm202 about 2 years ago - 1 comment
Labels: bug, help wanted, question, wontfix

#382 - Fixed issue #375

Pull Request - State: closed - Opened by amensiko about 2 years ago - 3 comments
Labels: enhancement, help wanted, py3

#381 - Fixed issue #377

Pull Request - State: closed - Opened by amensiko about 2 years ago - 4 comments
Labels: enhancement, help wanted, py3

#380 - Adds code highlighting to README.md

Pull Request - State: closed - Opened by AmenRa about 2 years ago - 1 comment
Labels: enhancement

#379 - flask file post handling

Issue - State: closed - Opened by JGuibone over 2 years ago - 1 comment
Labels: invalid, question, wontfix

#378 - Some Korean character not recognized

Issue - State: closed - Opened by smbslt3 over 2 years ago - 3 comments
Labels: help wanted, invalid, question, wontfix

#377 - Upgrade to Tika 2.6.0

Issue - State: closed - Opened by tballison over 2 years ago - 9 comments
Labels: enhancement, help wanted, question

#376 - Content returns gibberish for some PDFs

Issue - State: closed - Opened by alfonsrv over 2 years ago - 3 comments

#375 - Allow raw /rmeta output

Issue - State: closed - Opened by tballison over 2 years ago - 2 comments
Labels: enhancement, help wanted, question

#374 - Tika server returned status: 405

Issue - State: closed - Opened by harshgorjiwala over 2 years ago - 2 comments
Labels: bug, invalid, question, wontfix

#373 - PDF Text extraction: Date superscript split into separate lines

Issue - State: closed - Opened by teohsinyee over 2 years ago - 1 comment
Labels: bug, enhancement, help wanted

#372 - How to deal with large pdfs that are all images?

Issue - State: closed - Opened by mfernaal over 2 years ago - 2 comments
Labels: help wanted, question, wontfix

#371 - Unable to start Tika Server and get corrupt file when running tika-server.jar

Issue - State: closed - Opened by devipramita almost 3 years ago - 2 comments
Labels: help wanted, invalid, question, wontfix

#370 - Using `InMemoryUploadFile` with tika.

Issue - State: closed - Opened by hamodey almost 3 years ago - 1 comment
Labels: help wanted, question, wontfix

#369 - How to use tika-python in aws lambda using docker container image

Issue - State: closed - Opened by saikiranLingampalli almost 3 years ago - 2 comments
Labels: question, wontfix

#367 - Docker Tika-server PDF OCR

Issue - State: closed - Opened by RNWTenor almost 3 years ago - 3 comments
Labels: help wanted, question, wontfix

#366 - Fix unnecessary retries after successful startup

Pull Request - State: closed - Opened by michielvandesteeg about 3 years ago - 1 comment
Labels: bug, enhancement

#365 - tika server error after restarting the machine

Issue - State: closed - Opened by ghost about 3 years ago - 1 comment
Labels: question, wontfix

#364 - Allow for headers in unpack.from_file

Pull Request - State: closed - Opened by ln-P about 3 years ago
Labels: enhancement

#363 - Tika parser with TesseractOCR

Issue - State: closed - Opened by tarunsharma2015 about 3 years ago - 2 comments
Labels: enhancement, question

#362 - how to access Apache Tika's recursiveJSON object using python-tika?

Issue - State: closed - Opened by NLPOR about 3 years ago - 1 comment
Labels: invalid, question, wontfix

#361 - How to structure compressed files, such as rar, zip format?

Issue - State: closed - Opened by NLPOR about 3 years ago - 1 comment
Labels: question, wontfix

#360 - Incorrect filename in Content-Disposition header

Issue - State: open - Opened by tongwang over 3 years ago - 1 comment
Labels: bug, enhancement, help wanted

#359 - Compatibility with Apache Tika version 2.1.0

Issue - State: closed - Opened by bikashg over 3 years ago - 6 comments
Labels: enhancement, help wanted, question

#358 - Checkboxes convert to FORMCHECKBOX

Issue - State: closed - Opened by claire-herdeman over 3 years ago - 1 comment
Labels: question, wontfix

#357 - Use another augmented assignment statement

Issue - State: closed - Opened by elfring over 3 years ago - 2 comments
Labels: invalid, question, wontfix

#356 - A break statement is missed

Issue - State: closed - Opened by seanzian2093 over 3 years ago - 1 comment
Labels: bug, enhancement, help wanted, question

#355 - Pass PIL/cv2 Image to Tika-Python

Issue - State: closed - Opened by frederick0291 over 3 years ago - 1 comment
Labels: enhancement, help wanted, question

#354 - [documentation] Add example with byte buffer

Pull Request - State: closed - Opened by bjrne over 3 years ago - 3 comments
Labels: enhancement, help wanted

#353 - RuntimeError: Unable to start Tika server.

Issue - State: closed - Opened by mhrihab over 3 years ago - 2 comments
Labels: help wanted, question, wontfix

#352 - No such file or directory: '/tmp/tika-server.jar'

Issue - State: closed - Opened by ghost over 3 years ago - 3 comments
Labels: question, wontfix

#350 - Tika-Python does not parse the metadata from PDF

Issue - State: closed - Opened by Apurv3377 over 3 years ago - 3 comments
Labels: question, wontfix

#349 - Update README

Issue - State: closed - Opened by ktoulgaridis over 3 years ago - 1 comment
Labels: duplicate, help wanted, question, wontfix

#348 - Use of hashlib.MD5 on FIPS configured installations

Issue - State: closed - Opened by scarton over 3 years ago - 5 comments
Labels: enhancement, help wanted, question

#347 - For the revised word document, Tika still parses the deleted content

Issue - State: closed - Opened by zjms over 3 years ago - 1 comment
Labels: invalid, question, wontfix

#346 - Bold text repeating twice

Issue - State: closed - Opened by Shradha27 over 3 years ago - 1 comment
Labels: invalid, question, wontfix

#345 - How to handles cases where if I iterate over 100k files at once it fails after parsing a large number?

Issue - State: closed - Opened by user06039 over 3 years ago - 7 comments
Labels: help wanted, question, wontfix

#344 - Extract text styling?

Issue - State: closed - Opened by sabetAI over 3 years ago - 1 comment
Labels: question, wontfix

#343 - Issues with Landscape PDFs

Issue - State: closed - Opened by reisner almost 4 years ago - 1 comment
Labels: question, wontfix

#342 - Correct wrong indent

Pull Request - State: closed - Opened by barseghyanartur almost 4 years ago - 2 comments
Labels: enhancement

#339 - Use the new RTG Translator to provide tika-translate functionality and set default translation engine to it

Issue - State: closed - Opened by chrismattmann almost 4 years ago - 5 comments
Labels: enhancement, py3

#338 - Option to only extract text (no table and images)

Issue - State: closed - Opened by karrtikiyerkcm almost 4 years ago - 1 comment
Labels: help wanted, question, wontfix

#337 - Using Tika with multithreading

Issue - State: closed - Opened by AzureAlph almost 4 years ago - 1 comment
Labels: question, wontfix

#336 - LanguageDetectors

Issue - State: closed - Opened by arky almost 4 years ago - 1 comment
Labels: help wanted, question, wontfix

#335 - Bump pyyaml from 5.2 to 5.4

Pull Request - State: closed - Opened by dependabot[bot] almost 4 years ago
Labels: dependencies

#334 - Python Tika error: URLError: <urlopen error unknown url type: c>

Issue - State: closed - Opened by danielepiu almost 4 years ago - 1 comment
Labels: question, wontfix

#333 - resourceName returns byte character

Issue - State: closed - Opened by ddriver3487 about 4 years ago - 2 comments
Labels: help wanted, question, wontfix

#331 - Setting heap space for tika

Issue - State: closed - Opened by sany2k8 about 4 years ago - 2 comments
Labels: question, wontfix

#330 - [ERROR] RuntimeError: Unable to start Tika server.

Issue - State: closed - Opened by parallel-ai about 4 years ago - 2 comments
Labels: bug, question, wontfix

#329 - make tika CLI similar to parser.from_file

Pull Request - State: closed - Opened by vedal about 4 years ago - 3 comments
Labels: enhancement, help wanted, question

#328 - [QUESTION] Pulling bookmarks out of PDF

Issue - State: closed - Opened by andrei-volkau about 4 years ago - 1 comment
Labels: question, wontfix

#327 - classpath functionality is broken on Windows 10

Issue - State: closed - Opened by mirrord about 4 years ago - 1 comment
Labels: bug, enhancement, help wanted

#326 - tika.TikaClientOnly = True shows warning messages

Issue - State: closed - Opened by Zast996 about 4 years ago - 1 comment
Labels: enhancement, question, wontfix

#325 - use -spawnChild mode

Issue - State: closed - Opened by tballison over 4 years ago - 2 comments
Labels: enhancement, help wanted

#324 - Formatted Text Printing

Issue - State: closed - Opened by Zast996 over 4 years ago - 1 comment
Labels: question, wontfix

#323 - Incorrect formatted text for PDF's

Issue - State: closed - Opened by Tushar-Mehndiratta over 4 years ago - 1 comment
Labels: help wanted, question, wontfix

#322 - TIKA mistakes RTF message for email

Issue - State: closed - Opened by altinp over 4 years ago - 1 comment
Labels: question, wontfix

#321 - Incorrectly Parsing Fraction

Issue - State: closed - Opened by dguisti over 4 years ago - 1 comment
Labels: question, wontfix

#320 - How to disable OCR

Issue - State: closed - Opened by pmgautam over 4 years ago - 2 comments
Labels: wontfix

#319 - UnicodeEncodeError: 'charmap' codec can't encode character

Issue - State: closed - Opened by Tushar-Mehndiratta over 4 years ago - 9 comments
Labels: question, wontfix

#318 - Is tika supported with pyspark?

Issue - State: closed - Opened by deepakjindal90 over 4 years ago - 1 comment
Labels: bug, help wanted, question, wontfix

#316 - Tika 1.24.1 - gzip (de)compression

Pull Request - State: closed - Opened by carantunes over 4 years ago - 6 comments
Labels: enhancement

#315 - Mono-account and not multi-accounts :-(

Issue - State: closed - Opened by enahwe over 4 years ago - 3 comments
Labels: question, wontfix

#314 - Duplicate characters returned when extracting text from PDF

Issue - State: closed - Opened by JSB97 over 4 years ago - 2 comments
Labels: help wanted, question, wontfix
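
Each entry in this listing follows a regular shape: a `#number - title` line, a `Type - State: ... - Opened by ... ago - N comments` line, and an optional `Labels:` line. As a minimal sketch, assuming that shape holds for every entry, the listing could be parsed into structured records in Python. The function name `parse_listing` and the dictionary field names are hypothetical and not part of any Ecosyste.ms API. Treating a missing comment count as zero is also an assumption.

```python
import re
from typing import Dict, List

# Line shapes inferred from the listing text (an assumption, not an official format).
TITLE_RE = re.compile(r"^#(\d+) - (.*)$")
META_RE = re.compile(
    r"^(Issue|Pull Request) - State: (\w+) - Opened by (\S+) (.+? ago)"
    r"(?: - (\d+) comments?)?$"
)
LABELS_RE = re.compile(r"^Labels: (.*)$")


def parse_listing(text: str) -> List[Dict]:
    """Parse listing text into one dict per issue or pull request."""
    records: List[Dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        title = TITLE_RE.match(line)
        if title:
            # A new entry starts at every "#number - title" line.
            records.append({
                "number": int(title.group(1)),
                "title": title.group(2),
                "kind": None,
                "state": None,
                "author": None,
                "opened": None,
                "comments": 0,   # assumed zero when no count is shown
                "labels": [],
            })
            continue
        if not records:
            continue  # page header lines before the first entry
        current = records[-1]
        meta = META_RE.match(line)
        if meta:
            current["kind"] = meta.group(1)
            current["state"] = meta.group(2)
            current["author"] = meta.group(3)
            current["opened"] = meta.group(4)
            current["comments"] = int(meta.group(5) or 0)
            continue
        labels = LABELS_RE.match(line)
        if labels:
            current["labels"] = [s.strip() for s in labels.group(1).split(",")]
    return records


if __name__ == "__main__":
    sample = (
        "#403 - Permission denied\n\n"
        "Issue - State: closed - Opened by nautilux2 over 1 year ago - 1 comment\n"
        "Labels: invalid, question, wontfix\n"
    )
    for record in parse_listing(sample):
        print(record["number"], record["state"], record["labels"])
```

With records in this form, the list can be filtered with ordinary comprehensions, for example selecting entries whose `state` is `open` or whose `labels` include `bug`.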