Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / google/sentencepiece issues and pull requests
#798 - IndexError: Out of range: piece id is out of range.
Issue -
State: closed - Opened by lytgyx about 2 years ago
- 1 comment
#797 - self implemente add_tokens by requesting pb, encounter "Runtime error: X is already defined" when load sp model file
Issue -
State: closed - Opened by lsy641 about 2 years ago
- 1 comment
#796 - Cannot install sentencepiece with Python 3.11 on Windows
Issue -
State: closed - Opened by kbatsuren about 2 years ago
- 2 comments
#795 - CMake need endif
Pull Request -
State: closed - Opened by A2va about 2 years ago
#794 - sentencepiece 0.1.97 re-released?
Issue -
State: closed - Opened by kenhys about 2 years ago
- 2 comments
#793 - Disable shared build on windows
Pull Request -
State: closed - Opened by A2va about 2 years ago
#792 - add CIFuzz GitHub action
Pull Request -
State: closed - Opened by DavidKorczynski over 2 years ago
#791 - Continuous Tokenizer Training
Issue -
State: closed - Opened by dszhengyu over 2 years ago
- 1 comment
#790 - Recommended corpus size
Issue -
State: closed - Opened by astariul over 2 years ago
- 1 comment
#789 - Chinese full-width punctuation such as "," and "?" are not contained in the vocab
Issue -
State: closed - Opened by acadaiaca over 2 years ago
- 2 comments
#788 - `not a mach-o file` error on Jupyter M2 Mac
Issue -
State: closed - Opened by mattlinares over 2 years ago
- 2 comments
#787 - Build with protobuf in system
Issue -
State: closed - Opened by acane77 over 2 years ago
- 3 comments
Labels: bug, enhancement
#785 - Linkage error
Issue -
State: closed - Opened by A2va over 2 years ago
- 3 comments
#782 - Even with the sampling I get OOM
Issue -
State: closed - Opened by lfoppiano over 2 years ago
- 3 comments
#780 - Enable iOS builds
Pull Request -
State: closed - Opened by jplu over 2 years ago
- 1 comment
#770 - about running spm.SentencePieceTrainer.Train()?
Issue -
State: closed - Opened by Joll123 over 2 years ago
- 2 comments
#763 - Difficulty installing on M1 mac (solved)
Issue -
State: closed - Opened by johnmcdonnell over 2 years ago
- 2 comments
#756 - Fix a typo
Pull Request -
State: closed - Opened by kenhys over 2 years ago
#748 - Any way to load from Huggingface `tokenizer.json` file?
Issue -
State: closed - Opened by jbmaxwell almost 3 years ago
- 6 comments
#741 - “sentencepiece_processor.h”: No such file or directory
Issue -
State: closed - Opened by Helmsman-Lab almost 3 years ago
- 3 comments
#740 - Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
Issue -
State: closed - Opened by BaseMax almost 3 years ago
- 4 comments
#738 - Exception occurs when reading the saved model again
Issue -
State: closed - Opened by zchuz almost 3 years ago
- 2 comments
#732 - Sentencepiece installation fails on Python 3.10
Issue -
State: closed - Opened by tsharish almost 3 years ago
- 4 comments
#726 - SentencePieceProcessor has no attribute Encode
Issue -
State: closed - Opened by prashantserai about 3 years ago
- 1 comment
#723 - Bug: can't co-exist with pytorch-lightning
Issue -
State: closed - Opened by jordane95 about 3 years ago
- 11 comments
#703 - Segmentation fault on Ubuntu with basic python test
Issue -
State: closed - Opened by johntmyers over 3 years ago
- 6 comments
#702 - Unigram training always crashes when making suffix array
Issue -
State: closed - Opened by MatthewBieda over 3 years ago
- 4 comments
#692 - user defined char set separated from "_".
Issue -
State: closed - Opened by BrightXiaoHan over 3 years ago
- 1 comment
#684 - How to handle multiple whitespaces or newlines
Issue -
State: closed - Opened by AmrMKayid over 3 years ago
- 2 comments
#683 - bazel support for C++ API
Issue -
State: open - Opened by BBerabi over 3 years ago
- 1 comment
Labels: feature request
#668 - How to get the sentencepiece vocabulary from .model file and store it to .vocab?
Issue -
State: closed - Opened by SambhawDrag over 3 years ago
- 4 comments
#650 - Prevent sentencepiece from normalizing whitespaces
Issue -
State: closed - Opened by miguelvictor almost 4 years ago
- 1 comment
#628 - Is the loss computation in UnigramTrainer correct?
Issue -
State: closed - Opened by mbollmann about 4 years ago
- 3 comments
Labels: bug
#608 - Add Mac M1 Compatibility
Issue -
State: closed - Opened by pierreia about 4 years ago
- 22 comments
#604 - RuntimeError when using sentencepiece
Issue -
State: closed - Opened by Serkonosand about 4 years ago
- 2 comments
#591 - Cannot install sentencepiece with Python 3.9 on Windows
Issue -
State: closed - Opened by seemethere about 4 years ago
- 16 comments
#588 - Combine vocabularies from various languges
Issue -
State: closed - Opened by JamesDConley about 4 years ago
- 8 comments
#579 - Shared library use unsafe because of abseil linkage
Issue -
State: closed - Opened by danieldk over 4 years ago
- 6 comments
#572 - pip install failed on Linux
Issue -
State: closed - Opened by zhangguanheng66 over 4 years ago
- 11 comments
#571 - Sentencepiece with pre-defined vocabulary
Issue -
State: open - Opened by vladmosin over 4 years ago
- 6 comments
Labels: help wanted, feature request
#563 - cmake: fix FTBFS on armel, mips, powerpc, m68k and sh4
Pull Request -
State: closed - Opened by kenhys over 4 years ago
#562 - cmake: use GNUInstallDirs.cmake on UNIX
Pull Request -
State: closed - Opened by kenhys over 4 years ago
#555 - What is the meaning of the second column of the .vocab file (using BPE)?
Issue -
State: closed - Opened by dskoo over 4 years ago
- 2 comments
#516 - My training crashes with large corpus.
Issue -
State: closed - Opened by Srj over 4 years ago
- 6 comments
#481 - Specify protobuf version when compiling from source
Issue -
State: closed - Opened by jchwenger almost 5 years ago
- 4 comments
Labels: duplicate, protobuf
#480 - How to get the frequency of a subword ?
Issue -
State: closed - Opened by liuyaox almost 5 years ago
- 2 comments
#474 - Using `set_vocabulary` to modify vocabulary
Issue -
State: closed - Opened by sshleifer almost 5 years ago
- 4 comments
#464 - module 'sentencepiece' has no attribute 'SentencePieceTrainer'
Issue -
State: closed - Opened by rossbrown9879 almost 5 years ago
- 6 comments
#444 - Get vocab and merges file from model file
Issue -
State: closed - Opened by andompesta about 5 years ago
- 3 comments
#444 - Get vocab and merges file from model file
Issue -
State: closed - Opened by andompesta about 5 years ago
- 2 comments
#426 - How to extend tokens dictionary?
Issue -
State: closed - Opened by kpe over 5 years ago
- 11 comments
#425 - do_lower_case in the sentencepiece model files
Issue -
State: closed - Opened by kpe over 5 years ago
- 3 comments
#416 - Fix a typo
Pull Request -
State: closed - Opened by kenhys over 5 years ago
#412 - Regarding `character_coverage`
Issue -
State: closed - Opened by ArbinTimilsina over 5 years ago
- 3 comments
#406 - Explanation of encoding method
Issue -
State: closed - Opened by rmrao over 5 years ago
- 2 comments
#384 - Remove duplicated if (NOT DEFINED CMAKE_INSTALL_LIBDIR) check
Pull Request -
State: closed - Opened by kenhys over 5 years ago
- 2 comments
#378 - Pip install sentencepiece failure
Issue -
State: closed - Opened by saareliad over 5 years ago
- 43 comments
#366 - can we train by Parallel Computing or Multithreading or multi-Progress
Issue -
State: open - Opened by joytianya over 5 years ago
- 7 comments
Labels: feature request
#346 - Possible to have arm support for Android?
Issue -
State: closed - Opened by gitathrun over 5 years ago
- 13 comments
#338 - Option to quite LOG(INFO) and LOG(WARNING) messages
Issue -
State: closed - Opened by ArbinTimilsina over 5 years ago
- 6 comments
#323 - How can i add character to existing model?
Issue -
State: closed - Opened by misssprite almost 6 years ago
- 3 comments
#318 - Bug in BPE algorithm
Issue -
State: closed - Opened by xbelonogov almost 6 years ago
- 5 comments
Labels: bug
#299 - python wrapper export vocabulary list
Issue -
State: closed - Opened by xinsu626 almost 6 years ago
- 3 comments
#285 - Tutorial to train a cross-language model with sentencepiece
Issue -
State: closed - Opened by loretoparisi about 6 years ago
- 4 comments
Labels: sample code
#263 - do not split by apostrophe character
Issue -
State: closed - Opened by EgorLakomkin about 6 years ago
- 3 comments
#255 - replace <unk> with custom unk token "xxunk"
Issue -
State: closed - Opened by kasparlund about 6 years ago
- 6 comments
#252 - Computing representative vocabularies for multiple large files
Issue -
State: closed - Opened by emjotde about 6 years ago
- 8 comments
#242 - Typo on paragraph #44
Pull Request -
State: closed - Opened by kant over 6 years ago
- 1 comment
#215 - What is the difference between --user_defined_symbols and --control_symbols
Issue -
State: closed - Opened by thammegowda over 6 years ago
- 3 comments
#121 - Manually modifying SentencePiece model?
Issue -
State: closed - Opened by neubig over 6 years ago
- 9 comments
#121 - Manually modifying SentencePiece model?
Issue -
State: closed - Opened by neubig over 6 years ago
- 10 comments
#102 - Understanding BOS/EOS symbols
Issue -
State: closed - Opened by sooheon over 6 years ago
- 6 comments
#99 - Added link on string #32
Pull Request -
State: closed - Opened by kant over 6 years ago
- 1 comment
#27 - Typo
Pull Request -
State: closed - Opened by kant over 7 years ago
- 1 comment