Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / openai/tiktoken issues and pull requests
#358 - Community Resource: AutoTikTokenizer - A Bridge Between TikToken and HuggingFace Tokenizers
Issue -
State: open - Opened by bhavnicksm 3 days ago
#357 - Option to use TIKTOKEN_BPE_HOST environment variable for configurable BPE host URL
Pull Request -
State: open - Opened by AitBaali-Hamza-bcg 5 days ago
#356 - Why is the word vector file corresponding to GPT so small?
Issue -
State: closed - Opened by Cristliu 7 days ago
- 1 comment
#355 - Keep getting tiktoken errors when running my code
Issue -
State: closed - Opened by Byzzee-gh 14 days ago
- 1 comment
#354 - Whisper
Issue -
State: open - Opened by sine2pi 19 days ago
#353 - ValueError: not enough values to unpack (expected 2, got 1).
Issue -
State: open - Opened by Itime-ren 23 days ago
#352 - make decoder and sorted_token_bytes re-use existing memory
Pull Request -
State: open - Opened by tmm1 24 days ago
#351 - use HashSet from rustc-hash too
Pull Request -
State: open - Opened by tmm1 24 days ago
#350 - Add aarch64 musllinux wheel for 0.8.0
Issue -
State: open - Opened by rhuddleston about 1 month ago
- 1 comment
#349 - Add lint workflow
Pull Request -
State: closed - Opened by esadek about 1 month ago
- 2 comments
#348 - Remove unused imports
Pull Request -
State: closed - Opened by esadek about 1 month ago
- 3 comments
#347 - Python 3.11 wheel aarch64 missing for tiktoken 0.8
Issue -
State: closed - Opened by hauntsaninja about 1 month ago
- 2 comments
#346 - Build wheels for Python 3.13
Pull Request -
State: closed - Opened by iisakkirotko about 1 month ago
- 2 comments
#345 - Add replace spaces flag
Pull Request -
State: closed - Opened by rishabhy about 1 month ago
#344 - Does tiktoken count only input tokens or output tokens as well?
Issue -
State: closed - Opened by GildeshAbhay about 1 month ago
- 1 comment
#343 - Tiktoken Permission denied error
Issue -
State: open - Opened by NewGHUser4321 about 1 month ago
#342 - on the ipad
Issue -
State: closed - Opened by torot123 about 1 month ago
#341 - Add test ci
Pull Request -
State: closed - Opened by arvid220u about 2 months ago
#337 - Is there a new tokenizer for o1 models?
Issue -
State: closed - Opened by jiadingfang about 2 months ago
- 9 comments
#336 - Fix repeated characters handling in BPE tokenization (e.g., 'RR' in 'Strawberry')
Pull Request -
State: closed - Opened by Sachleens 2 months ago
- 1 comment
#335 - chatgpt-4o-latest is not yet added
Issue -
State: closed - Opened by jvlinsta 3 months ago
- 3 comments
#334 - Facing errors in importing the o200k_base
Issue -
State: closed - Opened by JaynouOliver 3 months ago
- 9 comments
#333 - Leveraging DP for bpe_encode function
Issue -
State: closed - Opened by lordgavy01 3 months ago
#332 - https://www.youtube.com/watch?v=8YnyAjkOap8
Issue -
State: closed - Opened by Anand-her 3 months ago
#331 - Uses Regex instead of fancy-regex - 6x speedup
Pull Request -
State: open - Opened by Majdoddin 3 months ago
- 2 comments
#330 - ValueError: not enough values to unpack (expected 2, got 1) when tiktoken.get_encoding("cl100k_base")
Issue -
State: closed - Opened by hzh12345678 3 months ago
- 2 comments
#329 - fix: add encoding for fine-tuned models based on gpt-4o
Pull Request -
State: open - Opened by hughcrt 3 months ago
#328 - Counting image tokens for gpt-4o
Issue -
State: closed - Opened by BleTib 3 months ago
- 2 comments
#327 - When I was inputting long text into a large model, that is, when the len of the text was 1024*1024, a StackOverflow error occurred.
Issue -
State: open - Opened by YangQiangli 3 months ago
#323 - RecursionError: maximum recursion depth exceeded while calling a Python object
Issue -
State: closed - Opened by Hudrolax 4 months ago
- 5 comments
#322 - Question for everyone: has tiktok now withdrawn tiktok coin?
Issue -
State: closed - Opened by danielng620 4 months ago
#321 - Adjust this according to the application
Issue -
State: closed - Opened by marseko 4 months ago
#320 - Send from DWG FastView(Android)
Issue -
State: closed - Opened by marseko 4 months ago
#319 - Cache for Encoding - Runtime Boosted by 12%
Pull Request -
State: open - Opened by Majdoddin 4 months ago
#318 - DOC: Add a link toward PyPI tiktoken package.
Pull Request -
State: open - Opened by MaxJPRey 4 months ago
#317 - [FR] Add `--offline`
Issue -
State: open - Opened by NightMachinery 4 months ago
- 3 comments
#316 - Optimal byte_pair_encode(), 6% faster, 0.6% better COMPRESSION
Pull Request -
State: closed - Opened by Majdoddin 4 months ago
- 1 comment
#315 - ai
Pull Request -
State: closed - Opened by MITCHELLNEAL1 5 months ago
#314 - Add Terminal-Based Visualization Tool for Tokenized Data Points in Tiktoken Tokenizer
Pull Request -
State: open - Opened by LVivona 5 months ago
#313 - Update README.md
Pull Request -
State: open - Opened by SmartManoj 5 months ago
#313 - Update README.md
Pull Request -
State: closed - Opened by SmartManoj 5 months ago
- 1 comment
#305 - Support for GPT 4o
Issue -
State: closed - Opened by jcrupi 5 months ago
- 1 comment
#303 - TikToken Tokenizer from scratch ?
Issue -
State: open - Opened by IsNoobgrammer 6 months ago
#302 - I want to modify the code in self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
Issue -
State: closed - Opened by FanshuoZeng 6 months ago
- 1 comment
#301 - Unknown encoding gpt2
Issue -
State: closed - Opened by aryagxr 6 months ago
- 1 comment
#300 - tiktoken 0.7.0 isn't compatible with python 3.11.*
Issue -
State: closed - Opened by trenton3983 6 months ago
- 3 comments
#299 - Tiktoken educational BPE trainer takes long time to train with vocab size 30k
Issue -
State: open - Opened by sagorbrur 6 months ago
- 2 comments
#298 - `o200k_base` pretokenizer - regex error?
Issue -
State: closed - Opened by AmitMY 6 months ago
- 2 comments
#297 - GPT4o has a basic bug: junk corpus found in the latest tokens, and in testing GPT4o talks nonsense and hallucinates
Issue -
State: closed - Opened by alexhmyang 6 months ago
- 3 comments
#296 - Or
Issue -
State: closed - Opened by jacob121532 6 months ago
#295 - gpt-4o tokenizer
Issue -
State: closed - Opened by nxfi777 6 months ago
- 1 comment
#294 - A character is split into two tokens
Issue -
State: closed - Opened by kerlion 6 months ago
- 1 comment
#293 - I need tiktoken win32 python3.8 version, can anyone provide it?
Issue -
State: closed - Opened by loveFeng 6 months ago
- 1 comment
#292 - Combining marks and indic vowel marks within words are being split breaking all indic languages and most languages except English and CJKs
Issue -
State: closed - Opened by ajaykg 6 months ago
- 4 comments
#291 - Error
Issue -
State: closed - Opened by pedromothe5 6 months ago
#290 - Use a custom exception ValueError subclass for the special tokens warning
Issue -
State: open - Opened by simonw 6 months ago
#289 - Custom tokenizer fails to encode despite characters being in mergeable_ranks
Issue -
State: closed - Opened by afang-story 6 months ago
- 3 comments
#289 - Custom tokenizer fails to encode despite characters being in mergeable_ranks
Issue -
State: open - Opened by afang-story 6 months ago
- 2 comments
#288 - Understanding the intended behaviour of `_encode_bytes`
Issue -
State: open - Opened by ashleyholman 6 months ago
#287 - Exception has occurred: ConnectionError HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001F4D42B0EE0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
Issue -
State: closed - Opened by anithamudigoudar 7 months ago
- 2 comments
#287 - Exception has occurred: ConnectionError HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001F4D42B0EE0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
Issue -
State: open - Opened by anithamudigoudar 7 months ago
#286 - Tiktoken not installing on a macbook pro with m2 chip
Issue -
State: closed - Opened by chaudhryna 7 months ago
- 2 comments
#284 - Optimize _byte_pair_merge function in BPE implementation
Issue -
State: open - Opened by naveens01 7 months ago
#283 - how to convert qwen.tiktoken to tokenzier.model
Issue -
State: open - Opened by cloudyuyuyu 7 months ago
#281 - SSLError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url
Issue -
State: open - Opened by sijiashen 7 months ago
- 8 comments
#279 - Using offline: `.tiktoken` file gets deleted automatically on Linux
Issue -
State: closed - Opened by nkilm 7 months ago
- 4 comments
#277 - Add handling for empty input text in encode method
Pull Request -
State: closed - Opened by pratyakshagarwal 7 months ago
- 1 comment
#276 - Encode an empty string gives empty tokens
Issue -
State: closed - Opened by flexwang 7 months ago
- 2 comments