Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / WorksApplications/SudachiTra issues and pull requests
#67 - Fixes #66 - sudachitra not being compatible with transformers version newer than 4.34
Pull Request - State: closed - Opened by mingboiz about 1 year ago - 5 comments
#66 - sudachitra and other custom tokenizers no longer compatible with transformers later than 4.34
Issue - State: closed - Opened by mingboiz about 1 year ago - 4 comments
#65 - Can I use a user dictionary?
Issue - State: open - Opened by mumumu09chi almost 2 years ago - 2 comments
#64 - The entry of `\n` in `vocab.txt` is causing token index shifting
Issue - State: open - Opened by hiroshi-matsuda-rit almost 2 years ago
#63 - Introduce token-based authentication for PyPI
Issue - State: open - Opened by mh-northlander almost 2 years ago
#62 - setup.py install is deprecated.
Issue - State: open - Opened by mh-northlander almost 2 years ago
#61 - Update python-publish workflow
Pull Request - State: closed - Opened by mh-northlander almost 2 years ago - 2 comments
#60 - Python publish workflow is not kicked on the release
Issue - State: closed - Opened by mh-northlander almost 2 years ago - 1 comment
#59 - Prepare for chiTra-1.1
Pull Request - State: closed - Opened by mh-northlander almost 2 years ago
#58 - Prepare for v0.1.8
Pull Request - State: closed - Opened by mh-northlander almost 2 years ago
#57 - Vocabulary file handling
Issue - State: open - Opened by mh-northlander almost 2 years ago
#56 - Add changelog file
Issue - State: closed - Opened by mh-northlander almost 2 years ago
#55 - Add patch file for the JGLUE evaluation
Pull Request - State: closed - Opened by mh-northlander almost 2 years ago
#54 - Allow to save vocab with non-consecutive indices
Pull Request - State: closed - Opened by mh-northlander almost 2 years ago - 3 comments
#53 - Allow empty line in the vocab file
Issue - State: closed - Opened by mh-northlander almost 2 years ago
#52 - Evaluate model with JGLUE
Issue - State: closed - Opened by mh-northlander almost 2 years ago
#51 - tokenizer.model_max_length is incorrect
Issue - State: open - Opened by mh-northlander about 2 years ago - 1 comment
#50 - Feather/add normalized nouns
Pull Request - State: closed - Opened by katsutan over 2 years ago
#49 - add workflow_dispatch
Pull Request - State: closed - Opened by t-yamamura over 2 years ago - 2 comments
#48 - Support 接尾辞-動詞的 and 接尾辞-形容詞的
Pull Request - State: closed - Opened by KoichiYasuoka almost 3 years ago - 4 comments
Labels: duplicate
#47 - update document with the release of pretraining models
Pull Request - State: closed - Opened by t-yamamura almost 3 years ago
#46 - fix README for pretraining
Pull Request - State: closed - Opened by t-yamamura almost 3 years ago
#45 - Update README for pretraining
Pull Request - State: closed - Opened by t-yamamura about 3 years ago
#44 - Update README for pretraing
Issue - State: closed - Opened by t-yamamura about 3 years ago
#43 - Tokenizer initializations behave differently
Issue - State: open - Opened by mh-northlander about 3 years ago
#42 - Add to the test for alignments of encoded tokens by `JapaneseBertWordPieceTokenizer`
Issue - State: open - Opened by t-yamamura about 3 years ago
Labels: bug
#41 - use `pathlib` instead of `os.path`
Issue - State: open - Opened by t-yamamura about 3 years ago
#40 - pretraining by NVIDIA
Pull Request - State: closed - Opened by katsutan about 3 years ago - 1 comment
#39 - Make `split_dataset.py` support huge file input.
Pull Request - State: closed - Opened by t-yamamura about 3 years ago - 2 comments
#38 - Feature/use huggingface compatible pretokenizer
Pull Request - State: closed - Opened by t-yamamura about 3 years ago - 1 comment
#37 - Add scripts for the model evaluation
Pull Request - State: closed - Opened by mh-northlander about 3 years ago - 2 comments
#36 - use PosMatcher instead of `part_of_speech()`
Pull Request - State: closed - Opened by t-yamamura about 3 years ago
#35 - Feature/conjugation preserving normalize for subword
Pull Request - State: closed - Opened by t-yamamura about 3 years ago
#34 - Fix/modify merged preprocessing codes
Pull Request - State: closed - Opened by t-yamamura about 3 years ago
#33 - Use scripts for pretraining implemented by NVIDIA
Issue - State: closed - Opened by t-yamamura about 3 years ago
#32 - Feature/add cleaning and preprocessing
Pull Request - State: closed - Opened by t-yamamura about 3 years ago
#31 - add normalizer that leaved conjugation
Pull Request - State: closed - Opened by katsutan about 3 years ago - 2 comments
#30 - require sudachipy>=0.6.0
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#29 - remove slow tokenizer
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#28 - remove slow tokenizer
Issue - State: closed - Opened by t-yamamura over 3 years ago
#27 - add NFKC normalization
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#26 - use NFKC as preprocessing
Issue - State: closed - Opened by t-yamamura over 3 years ago
#25 - remove lowercase normalizer
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#24 - Remove lowercase normalizer
Issue - State: closed - Opened by t-yamamura over 3 years ago
#23 - Add preprocessing for cleaning up corpus
Issue - State: closed - Opened by t-yamamura over 3 years ago
#22 - Replace SudachiPy with sudachi.rs
Issue - State: closed - Opened by t-yamamura over 3 years ago
#21 - improve default configurations
Pull Request - State: closed - Opened by hiroshi-matsuda-rit over 3 years ago
#20 - fix slow tokenizer
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#19 - add slow tokenizer
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#18 - Re-register submodule
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#17 - update submodule
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#16 - make dirs before saving vocab
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#15 - fix wrong package name
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#14 - store line_per_file as int
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#13 - Fix/train tokenizer args
Pull Request - State: closed - Opened by katsutan over 3 years ago
#12 - Adapt to the text preprocessing of SudachiPy
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#11 - fix import
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#10 - Bump bunkai from 1.3.0 to 1.4.0
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#9 - Fix typos
Pull Request - State: closed - Opened by sorami over 3 years ago - 1 comment
#8 - Create python-publish.yaml
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#7 - Rename package
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#6 - Refactor/codes for pretraining
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#5 - Feature/add documents and comments
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#4 - add pos pretokenizer
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#3 - fix import structure
Pull Request - State: closed - Opened by t-yamamura over 3 years ago
#2 - transformers should be >= 4.6.1
Pull Request - State: closed - Opened by hiroshi-matsuda-rit over 3 years ago
#1 - rename package name
Pull Request - State: closed - Opened by t-yamamura over 3 years ago