Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / NMZivkovic/BertTokenizers issues and pull requests

#28 - [BUG] BertUncasedBaseTokenizer ran forever with input "SixGe1−xH"

Issue - State: open - Opened by darren-zdc 5 months ago - 2 comments

#27 - Wrong vocabulary index after white space

Issue - State: open - Opened by wegylexy 8 months ago

#26 - Word piece tokenizer never exits if a sub-word token doesn't exist

Issue - State: open - Opened by matteocontrini over 1 year ago - 1 comment

#25 - Fixing tokenizers to correctly handle linux line endings (\n)

Pull Request - State: open - Opened by palenshus over 1 year ago

#24 - Strings with linux line endings break the tokenizer

Issue - State: open - Opened by palenshus over 1 year ago

#23 - Fix wrong naming #22

Pull Request - State: open - Opened by tsepton over 1 year ago - 2 comments

#22 - Custom vocabulary classes naming error

Issue - State: open - Opened by tsepton over 1 year ago

#20 - The tokenization for Korean text seems not correct.

Issue - State: open - Opened by terryqj0107 over 1 year ago

#19 - support .net462

Pull Request - State: open - Opened by amitportnoy over 1 year ago - 1 comment

#18 - This does not match behavior of Huggingface's Python version

Issue - State: open - Opened by gevorgter over 1 year ago - 9 comments

#16 - fix unicode of multilingual vocab

Pull Request - State: open - Opened by zhipenghan almost 2 years ago

#15 - Multilingual vocab not code properly

Issue - State: open - Opened by zhipenghan almost 2 years ago

#14 - Classes for custom vocabulary

Pull Request - State: closed - Opened by NMZivkovic about 2 years ago

#13 - Support for loading a custom vocab.txt?

Issue - State: closed - Opened by BrainSlugs83 over 2 years ago - 3 comments

#12 - different behavior: hugging face bert-base-uncased vs. BERT Base Uncased

Issue - State: closed - Opened by PaulCalot over 2 years ago - 3 comments

#11 - Updated Readme.md

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#10 - Update Readme.md

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#9 - Update README.md

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#8 - CI/CD Pipeline - Automatically publishing NuGet Package

Issue - State: open - Opened by NMZivkovic over 2 years ago

#7 - Supporting .NET 5 and .NET 6

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#6 - Remove unnecessary files

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#5 - Migrated to .NET6

Pull Request - State: closed - Opened by NMZivkovic over 2 years ago

#4 - Multilingual model tokenization differs from Python

Issue - State: closed - Opened by ADD-eNavarro over 2 years ago - 3 comments

#3 - Fixing tokenizer to select for >=2 instead of 2. Resolves discrepena…

Pull Request - State: closed - Opened by DanMMSFT over 2 years ago

#2 - BertTokenizers in .NET 3.1 and/or .NET 6.0

Issue - State: closed - Opened by ADD-eNavarro over 2 years ago - 2 comments

#1 - Issue using the code

Issue - State: closed - Opened by bentoo over 2 years ago - 1 comment