Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / chonkie-ai/chonkie issues and pull requests
#72 - Add TEVL to speed-up sentence chunking
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#71 - Add TEVL to speed up sentence chunker
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#70 - [Fix] Allow for functions as token_counters in BaseChunkers
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#69 - Add support for automated testing with Github Actions
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#68 - Add `min_chunk_size` to SDPMChunker + Lint codebase with ruff + minor changes
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
Labels: chore
#67 - [BUG] example code for WordChunker is not working
Issue -
State: closed - Opened by mozz85 3 months ago
- 3 comments
Labels: bug
#66 - Added automated testing using Github Actions
Pull Request -
State: closed - Opened by pratyushmittal 3 months ago
- 3 comments
#65 - Fixed similarity_percentile with sdpm chunker + added test
Pull Request -
State: closed - Opened by pratyushmittal 3 months ago
- 5 comments
#64 - [BUG] EmbeddingsRegistry custom tokenizer does not work
Issue -
State: closed - Opened by rsharma-autessa 3 months ago
- 6 comments
Labels: bug
#63 - [Update] Change default embedding model in SemanticChunkers
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#62 - [Update] Bump version to 0.2.1.post1 and require Python 3.9 or higher
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#61 - [help] failed on SemanticChunker's example
Issue -
State: closed - Opened by mozz85 3 months ago
- 14 comments
Labels: bug
#60 - [Refactor] Add min_chunk_size parameter to SemanticChunker and SentenceChunker
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#59 - [BUG] SDPM & Semantic Chunking Example not working
Issue -
State: closed - Opened by regstuff 3 months ago
- 2 comments
Labels: bug
#58 - [Fix] Add fix for #55
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#57 - [Fix] AutoEmbeddings not loading `all-minilm-l6-v2` but loads `All-MiniLM-L6-V2`
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#56 - Update DOCS.md - fixed embeddings path after recent change
Pull Request -
State: closed - Opened by pratyushmittal 3 months ago
- 3 comments
#55 - [BUG] Newlines are not removed after pre-processing in SemanticChunker
Issue -
State: closed - Opened by Pringled 3 months ago
- 3 comments
Labels: bug
#54 - [Refactor] Optimize similarity calculation by using np.divide for imp…
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#53 - [Fix] Refactor WordChunker, SentenceChunker pre-chunk splitting for reconstruction tests + minor changes
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#52 - [Fix] Token counts from Tokenizers and Transformers adding special tokens
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
Labels: enhancement
#51 - [fix] Reorganize optional dependencies in pyproject.toml: rename 'sem…
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#50 - [DISC] Benchmarking Chonkie Mega-Thread
Issue -
State: open - Opened by bhavnicksm 3 months ago
- 2 comments
Labels: documentation, enhancement
#49 - [FEAT] Add support for Model2VecEmbeddings + Switch default embeddings to Model2VecEmbeddings
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
Labels: enhancement
#48 - Reconstruction Test
Pull Request -
State: closed - Opened by mrmps 3 months ago
- 3 comments
#47 - [DOCS] Add info about initial embeddings support and how to add custom embeddings
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#46 - Add initial OpenAIEmbeddings support to Chonkie ✨
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#45 - Refactor BaseChunker, SemanticChunker and SDPMChunker to support BaseEmbeddings
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#44 - [FEAT] Add SentenceTransformerEmbeddings, EmbeddingsRegistry and AutoEmbeddings provider support
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
Labels: enhancement
#43 - [DISC] Improving Documentation
Issue -
State: closed - Opened by bhavnicksm 3 months ago
- 5 comments
Labels: documentation, enhancement, help wanted, in progress
#42 - [BUG] Chunkers failing the test of recronstruction
Issue -
State: closed - Opened by mrmps 3 months ago
- 7 comments
Labels: bug
#41 - [FEAT] - Add model2vec embedding models
Pull Request -
State: closed - Opened by sky-2002 3 months ago
- 15 comments
Labels: enhancement
#40 - [FEAT] Min chunk size (for semantic chunkers)
Issue -
State: closed - Opened by kbarendrecht 3 months ago
- 2 comments
Labels: enhancement
#39 - [FEAT] Add async support to SDPMChunker and to SemanticChunker
Issue -
State: open - Opened by rodion-m 3 months ago
- 7 comments
Labels: enhancement
#38 - [FEAT] Add an ability to use OpenAI / VoyageAI / Cohere embeddings with SDPMChunker via LiteLLM
Issue -
State: open - Opened by rodion-m 3 months ago
- 5 comments
Labels: enhancement
#37 - [BUG] start_index and end_index inaccurate for repetitive text chunks
Issue -
State: closed - Opened by bhavnicksm 3 months ago
- 1 comment
Labels: bug
#36 - [FEAT] Allow configuring backend for Sentence_Transformers (e.g. ONNX, openVINO)
Issue -
State: closed - Opened by kbarendrecht 3 months ago
- 3 comments
Labels: enhancement
#35 - Bump version to 0.2.0.post1 in pyproject.toml and __init__.py
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#34 - Use `__slots__` instead of `slots=True` for python3.9 support
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#33 - [BUG] TypeError: dataclass() got an unexpected keyword argument 'slots'
Issue -
State: closed - Opened by AgentT30 3 months ago
- 2 comments
Labels: bug
#32 - Major Update: Fix bugs + Update docs + Add slots to dataclasses + update word & sentence splitting logic + minor changes
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#31 - [BUG]pyo3_runtime.PanicException: no entry found for key
Issue -
State: closed - Opened by wbbeyourself 3 months ago
- 4 comments
Labels: bug
#30 - [DOCS] Fix typo for import tokenizer in quick start example
Pull Request -
State: closed - Opened by jasonacox 3 months ago
- 1 comment
Labels: documentation
#29 - [BUG] Fix the start_index and end_index to point to character indices, not token indices
Pull Request -
State: closed - Opened by mrmps 3 months ago
- 2 comments
Labels: bug
#28 - Add initial batching support via `chunk_batch` fn + update DOCS
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#27 - Update dependency version of SentenceTransformer to at least 2.3.0
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#26 - [BUG]AttributeError: 'SentenceTransformer' object has no attribute 'similarity'
Issue -
State: closed - Opened by heweapon 3 months ago
- 6 comments
Labels: bug
#25 - ImportError: cannot import name 'tokenizer' from 'tokenizers' (/usr/local/lib/python3.10/site-packages/tokenizers/__init__.py)
Issue -
State: closed - Opened by abchbx 3 months ago
- 1 comment
#24 - fix: tokenizer mismatch for `SemanticChunker` + Add BaseEmbeddings
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#23 - Can I load offline tokenizers in it?
Issue -
State: closed - Opened by a136214808 3 months ago
- 3 comments
Labels: bug
#22 - Update README.md + minor updates
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#21 - Remove Spacy dependency from 'sentence' install + Add FAQ to DOCS.md
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#20 - Remove Spacy dependency from Chonkie
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#19 - Add FastEmbed Support for Embedding Generation/Inference
Issue -
State: closed - Opened by adithya-s-k 3 months ago
- 5 comments
Labels: enhancement
#18 - `TokenChunker` does not support multiple inputs
Issue -
State: closed - Opened by not-lain 3 months ago
- 5 comments
Labels: bug, enhancement
#17 - Update README.md + fix DOCS.md typo
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#16 - Incorrect import in Docs, SDPMChunker reference
Issue -
State: closed - Opened by Om-Alve 3 months ago
- 1 comment
#15 - Update acknowledgements in README.md for improved clarity and appreci…
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#14 - Development
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#13 - Run Black + Isort + beautify the code a bit
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#12 - Make imports as a part of Chunker __init__ instead of file imports to make Chonkie import faster
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#11 - Bump version to 0.1.1 in pyproject.toml and __init__.py
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#10 - Update README.md
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#9 - Disentangle the Embedding Model from SemanticChunker + Update DOCS and README
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#8 - Bump version to 0.0.3 in pyproject.toml and __init__.py for release
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#7 - Update README.md + remove .github action
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#6 - Bump version to 0.0.2 in pyproject.toml and __init__.py for release
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#5 - Update README.md
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#4 - Add support for Transformers and TikToken
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#3 - v0.0.1a8
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#2 - Update Logo (for PyPI) + Update README.md + Fix packaging bug
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago
#1 - v0.0.1a4
Pull Request -
State: closed - Opened by bhavnicksm 3 months ago