Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / chrismattmann/tika-python issues and pull requests
#419 - Copy #409 from upstream
Pull Request -
State: closed - Opened by baughmann about 1 month ago
- 1 comment
#418 - Are you still working on this, update tika?
Issue -
State: open - Opened by BBC-Esq 3 months ago
- 2 comments
#417 - 403 Forbidden Tika server error
Issue -
State: open - Opened by PratyushROpeneyes 5 months ago
#416 - 404 error in tika2.6.0
Issue -
State: open - Opened by LaniakeaS 7 months ago
#415 - ImportError: cannot import name 'NODE_CLASS_MAPPINGS' from 'nodes'
Issue -
State: closed - Opened by C-Abner 7 months ago
#414 - killServer fails to stop tika
Issue -
State: open - Opened by mcantrell 8 months ago
#413 - `DeprecationWarning: pkg_resources is deprecated as an API`
Issue -
State: open - Opened by Yelinz 8 months ago
#412 - Any way to set IOUtils.setByteArrayMaxOverride(VALUE).
Issue -
State: open - Opened by akgupta0777 10 months ago
#411 - Is there any way to preserve temp files?
Issue -
State: open - Opened by qptest 10 months ago
#410 - SSRF vulnerability: CVE-2022-46364
Issue -
State: open - Opened by anushakabber 10 months ago
#409 - Add automated documentation
Pull Request -
State: open - Opened by aleksandrskrivickis 10 months ago
#408 - Allow for v2.2.0 for parsing
Pull Request -
State: closed - Opened by ditikrushna 10 months ago
#407 - Modify the Tika Python code to use only Tika version >2.
Pull Request -
State: closed - Opened by ditikrushna 10 months ago
#406 - Tika server 2.9.1 Pdf tesseract Ocr
Issue -
State: open - Opened by Tarik37 11 months ago
#405 - Can this receive a io[bytes] type?
Issue -
State: open - Opened by Mathacc 11 months ago
- 1 comment
#404 - How to fix ReadTimeout: HTTPConnectionPool(host='localhost', port=9998): Read timed out. (read timeout=60)
Issue -
State: open - Opened by vriez about 1 year ago
- 1 comment
#403 - Permission denied
Issue -
State: closed - Opened by nautilux2 over 1 year ago
- 1 comment
Labels: invalid, question, wontfix
#402 - Unable to start Tika server
Issue -
State: open - Opened by kevin-guimard-ext over 1 year ago
#401 - unable to run tika
Issue -
State: closed - Opened by riyaj8888 over 1 year ago
- 1 comment
Labels: help wanted, invalid, question, wontfix
#400 - Need to run tika server manualy but previously it works without tika
Issue -
State: closed - Opened by mahmudtopu3 over 1 year ago
- 1 comment
Labels: help wanted, question, wontfix
#399 - Updated tika to use sha1 hash instead of md5 for checksum
Pull Request -
State: open - Opened by griffin-rickle over 1 year ago
- 2 comments
Labels: enhancement, help wanted
#398 - Inclusion of PDF Metadata Title field in Extracted Content
Issue -
State: closed - Opened by teohsinyee over 1 year ago
- 1 comment
Labels: question, wontfix
#397 - Increase retry duration in client only mode
Issue -
State: closed - Opened by saraswat40 over 1 year ago
- 1 comment
Labels: help wanted, question, wontfix
#396 - Timeline for tika 2.8 support
Issue -
State: closed - Opened by vasutrave over 1 year ago
- 3 comments
Labels: help wanted, question
#395 - Implement test running using GitHub actions
Pull Request -
State: closed - Opened by stumpylog almost 2 years ago
- 5 comments
Labels: enhancement, question, py3
#394 - Hi i am getting the same error
Issue -
State: closed - Opened by dhikshitha29 almost 2 years ago
- 1 comment
Labels: bug, invalid, question, wontfix
#393 - Can tika extract "Marked Content" (tagged PDFs)?
Issue -
State: closed - Opened by MartinThoma almost 2 years ago
- 2 comments
Labels: help wanted, question, wontfix
#392 - Help installing package on macOS M2 Ventura
Issue -
State: closed - Opened by shamoon almost 2 years ago
- 3 comments
Labels: help wanted, question, wontfix
#391 - fix(tika): Update download link due to broken URL
Pull Request -
State: closed - Opened by sa2812 almost 2 years ago
- 1 comment
Labels: bug, enhancement, invalid, wontfix
#390 - Airgap Environment Setup is unable to start Tika server
Issue -
State: closed - Opened by Marcos-A almost 2 years ago
- 6 comments
Labels: help wanted, question
#389 - Parsed text for EPUB mixes in metadata strings by default, and contains image tags + alt-text if service parameter is set to text
Issue -
State: closed - Opened by bitsgalore about 2 years ago
- 3 comments
Labels: bug, invalid, wontfix
#388 - 'charmap' codec can't decode byte 0x81 in position 279: character maps to <undefined>
Issue -
State: closed - Opened by MohammadFneish7 about 2 years ago
- 2 comments
Labels: bug, enhancement, help wanted
#387 - fix unpack from_file/from_buffer headers arg
Pull Request -
State: closed - Opened by deadc0de6 about 2 years ago
- 6 comments
Labels: bug, enhancement, question
#386 - On older versions of Python (2.7), the unpack tests fail
Issue -
State: closed - Opened by chrismattmann about 2 years ago
Labels: bug, enhancement, py3, py2
#385 - Fix test case files
Issue -
State: closed - Opened by chrismattmann about 2 years ago
- 1 comment
Labels: bug, enhancement
#384 - portions of strings getting cut off with "..."
Issue -
State: open - Opened by BCorbeek about 2 years ago
- 6 comments
Labels: bug, enhancement, help wanted, question
#383 - Tika-python is not extracting texts properly?
Issue -
State: closed - Opened by mrm202 about 2 years ago
- 1 comment
Labels: bug, help wanted, question, wontfix
#382 - Fixed issue #375
Pull Request -
State: closed - Opened by amensiko about 2 years ago
- 3 comments
Labels: enhancement, help wanted, py3
#381 - Fixed issue #377
Pull Request -
State: closed - Opened by amensiko about 2 years ago
- 4 comments
Labels: enhancement, help wanted, py3
#380 - Adds code highlighting to README.md
Pull Request -
State: closed - Opened by AmenRa about 2 years ago
- 1 comment
Labels: enhancement
#379 - flask file post handling
Issue -
State: closed - Opened by JGuibone over 2 years ago
- 1 comment
Labels: invalid, question, wontfix
#378 - Some Korean character not recognized
Issue -
State: closed - Opened by smbslt3 over 2 years ago
- 3 comments
Labels: help wanted, invalid, question, wontfix
#377 - Upgrade to Tika 2.6.0
Issue -
State: closed - Opened by tballison over 2 years ago
- 9 comments
Labels: enhancement, help wanted, question
#376 - Content returns gibberish for some PDFs
Issue -
State: closed - Opened by alfonsrv over 2 years ago
- 3 comments
#375 - Allow raw /rmeta output
Issue -
State: closed - Opened by tballison over 2 years ago
- 2 comments
Labels: enhancement, help wanted, question
#374 - Tika server returned status: 405
Issue -
State: closed - Opened by harshgorjiwala over 2 years ago
- 2 comments
Labels: bug, invalid, question, wontfix
#373 - PDF Text extraction: Date superscript split into separate lines
Issue -
State: closed - Opened by teohsinyee over 2 years ago
- 1 comment
Labels: bug, enhancement, help wanted
#372 - How to deal with large pdfs that are all images?
Issue -
State: closed - Opened by mfernaal over 2 years ago
- 2 comments
Labels: help wanted, question, wontfix
#371 - Unable to start Tika Server and get corrupt file when running tika-server.jar
Issue -
State: closed - Opened by devipramita almost 3 years ago
- 2 comments
Labels: help wanted, invalid, question, wontfix
#370 - Using `InMemoryUploadFile` with tika.
Issue -
State: closed - Opened by hamodey almost 3 years ago
- 1 comment
Labels: help wanted, question, wontfix
#369 - How to use tika-python in aws lambda using docker container image
Issue -
State: closed - Opened by saikiranLingampalli almost 3 years ago
- 2 comments
Labels: question, wontfix
#367 - Docker Tika-server PDF OCR
Issue -
State: closed - Opened by RNWTenor almost 3 years ago
- 3 comments
Labels: help wanted, question, wontfix
#366 - Fix unnecessary retries after successful startup
Pull Request -
State: closed - Opened by michielvandesteeg about 3 years ago
- 1 comment
Labels: bug, enhancement
#365 - tika server error after restarting the machine
Issue -
State: closed - Opened by ghost about 3 years ago
- 1 comment
Labels: question, wontfix
#364 - Allow for headers in unpack.from_file
Pull Request -
State: closed - Opened by ln-P about 3 years ago
Labels: enhancement
#363 - Tika parser with TesseractOCR
Issue -
State: closed - Opened by tarunsharma2015 about 3 years ago
- 2 comments
Labels: enhancement, question
#362 - how to access Apache Tika's recursiveJSON object using python-tika?
Issue -
State: closed - Opened by NLPOR about 3 years ago
- 1 comment
Labels: invalid, question, wontfix
#361 - How to structure compressed files, such as rar, zip format?
Issue -
State: closed - Opened by NLPOR about 3 years ago
- 1 comment
Labels: question, wontfix
#360 - Incorrect filename in Content-Disposition header
Issue -
State: open - Opened by tongwang over 3 years ago
- 1 comment
Labels: bug, enhancement, help wanted
#359 - Compatibility with Apache Tika version 2.1.0
Issue -
State: closed - Opened by bikashg over 3 years ago
- 6 comments
Labels: enhancement, help wanted, question
#358 - Checkboxes convert to FORMCHECKBOX
Issue -
State: closed - Opened by claire-herdeman over 3 years ago
- 1 comment
Labels: question, wontfix
#357 - Use another augmented assignment statement
Issue -
State: closed - Opened by elfring over 3 years ago
- 2 comments
Labels: invalid, question, wontfix
#356 - A break statement is missed
Issue -
State: closed - Opened by seanzian2093 over 3 years ago
- 1 comment
Labels: bug, enhancement, help wanted, question
#355 - Pass PIL/cv2 Image to Tika-Python
Issue -
State: closed - Opened by frederick0291 over 3 years ago
- 1 comment
Labels: enhancement, help wanted, question
#354 - [documentation] Add example with byte buffer
Pull Request -
State: closed - Opened by bjrne over 3 years ago
- 3 comments
Labels: enhancement, help wanted
#353 - RuntimeError: Unable to start Tika server.
Issue -
State: closed - Opened by mhrihab over 3 years ago
- 2 comments
Labels: help wanted, question, wontfix
#352 - No such file or directory: '/tmp/tika-server.jar'
Issue -
State: closed - Opened by ghost over 3 years ago
- 3 comments
Labels: question, wontfix
#350 - Tika-Python does not parse the metadata from PDF
Issue -
State: closed - Opened by Apurv3377 over 3 years ago
- 3 comments
Labels: question, wontfix
#349 - Update README
Issue -
State: closed - Opened by ktoulgaridis over 3 years ago
- 1 comment
Labels: duplicate, help wanted, question, wontfix
#348 - Use of hashlib.MD5 on FIPS configured installations
Issue -
State: closed - Opened by scarton over 3 years ago
- 5 comments
Labels: enhancement, help wanted, question
#347 - For the revised word document, Tika still parses the deleted content
Issue -
State: closed - Opened by zjms over 3 years ago
- 1 comment
Labels: invalid, question, wontfix
#346 - Bold text repeating twice
Issue -
State: closed - Opened by Shradha27 over 3 years ago
- 1 comment
Labels: invalid, question, wontfix
#345 - How to handles cases where if I iterate over 100k files at once it fails after parsing a large number?
Issue -
State: closed - Opened by user06039 over 3 years ago
- 7 comments
Labels: help wanted, question, wontfix
#344 - Extract text styling?
Issue -
State: closed - Opened by sabetAI over 3 years ago
- 1 comment
Labels: question, wontfix
#343 - Issues with Landscape PDFs
Issue -
State: closed - Opened by reisner almost 4 years ago
- 1 comment
Labels: question, wontfix
#342 - Correct wrong indent
Pull Request -
State: closed - Opened by barseghyanartur almost 4 years ago
- 2 comments
Labels: enhancement
#339 - Use the new RTG Translator to provide tika-translate functionality and set default translation engine to it
Issue -
State: closed - Opened by chrismattmann almost 4 years ago
- 5 comments
Labels: enhancement, py3
#338 - Option to only extract text (no table and images)
Issue -
State: closed - Opened by karrtikiyerkcm almost 4 years ago
- 1 comment
Labels: help wanted, question, wontfix
#337 - Using Tika with multithreading
Issue -
State: closed - Opened by AzureAlph almost 4 years ago
- 1 comment
Labels: question, wontfix
#336 - LanguageDetectors
Issue -
State: closed - Opened by arky almost 4 years ago
- 1 comment
Labels: help wanted, question, wontfix
#335 - Bump pyyaml from 5.2 to 5.4
Pull Request -
State: closed - Opened by dependabot[bot] almost 4 years ago
Labels: dependencies
#334 - Python Tika error: URLError: <urlopen error unknown url type: c>
Issue -
State: closed - Opened by danielepiu almost 4 years ago
- 1 comment
Labels: question, wontfix
#333 - resourceName returns byte character
Issue -
State: closed - Opened by ddriver3487 about 4 years ago
- 2 comments
Labels: help wanted, question, wontfix
#331 - Setting heap space for tika
Issue -
State: closed - Opened by sany2k8 about 4 years ago
- 2 comments
Labels: question, wontfix
#330 - [ERROR] RuntimeError: Unable to start Tika server.
Issue -
State: closed - Opened by parallel-ai about 4 years ago
- 2 comments
Labels: bug, question, wontfix
#329 - make tika CLI similar to parser.from_file
Pull Request -
State: closed - Opened by vedal about 4 years ago
- 3 comments
Labels: enhancement, help wanted, question
#328 - [QUESTION] Pulling bookmarks out of PDF
Issue -
State: closed - Opened by andrei-volkau about 4 years ago
- 1 comment
Labels: question, wontfix
#327 - classpath functionality is broken on Windows 10
Issue -
State: closed - Opened by mirrord about 4 years ago
- 1 comment
Labels: bug, enhancement, help wanted
#326 - tika.TikaClientOnly = True shows warning messages
Issue -
State: closed - Opened by Zast996 about 4 years ago
- 1 comment
Labels: enhancement, question, wontfix
#325 - use -spawnChild mode
Issue -
State: closed - Opened by tballison over 4 years ago
- 2 comments
Labels: enhancement, help wanted
#324 - Formatted Text Printing
Issue -
State: closed - Opened by Zast996 over 4 years ago
- 1 comment
Labels: question, wontfix
#323 - Incorrect formatted text for PDF's
Issue -
State: closed - Opened by Tushar-Mehndiratta over 4 years ago
- 1 comment
Labels: help wanted, question, wontfix
#322 - TIKA mistakes RTF message for email
Issue -
State: closed - Opened by altinp over 4 years ago
- 1 comment
Labels: question, wontfix
#321 - Incorrectly Parsing Fraction
Issue -
State: closed - Opened by dguisti over 4 years ago
- 1 comment
Labels: question, wontfix
#320 - How to disable OCR
Issue -
State: closed - Opened by pmgautam over 4 years ago
- 2 comments
Labels: wontfix
#319 - UnicodeEncodeError: 'charmap' codec can't encode character
Issue -
State: closed - Opened by Tushar-Mehndiratta over 4 years ago
- 9 comments
Labels: question, wontfix
#318 - Is tika supported with pyspark?
Issue -
State: closed - Opened by deepakjindal90 over 4 years ago
- 1 comment
Labels: bug, help wanted, question, wontfix
#316 - Tika 1.24.1 - gzip (de)compression
Pull Request -
State: closed - Opened by carantunes over 4 years ago
- 6 comments
Labels: enhancement
#315 - Mono-account and not multi-accounts :-(
Issue -
State: closed - Opened by enahwe over 4 years ago
- 3 comments
Labels: question, wontfix
#314 - Duplicate characters returned when extracting text from PDF
Issue -
State: closed - Opened by JSB97 over 4 years ago
- 2 comments
Labels: help wanted, question, wontfix