Ecosyste.ms: Issues

GitHub / explosion/curated-tokenizers issues and pull requests

#59 - Set version to 2.0.0

Pull Request - State: closed - Opened by danieldk about 1 month ago

#58 - Set version to v2.0.0.dev0

Pull Request - State: closed - Opened by danieldk about 1 month ago

#57 - bbpe: Add token LRU cache and avoid string lookups

Pull Request - State: closed - Opened by danieldk about 1 month ago
Labels: enhancement

#56 - Bump version to 0.9.2, update minimum Python version to 3.8

Pull Request - State: closed - Opened by danieldk 3 months ago

#55 - Bump version to 0.0.9

Pull Request - State: closed - Opened by danieldk 4 months ago

#54 - Backport: make all the piece processors picklable (#53)

Pull Request - State: closed - Opened by danieldk 4 months ago

#53 - Make all the piece processors picklable

Pull Request - State: closed - Opened by danieldk 4 months ago - 1 comment
Labels: enhancement

#52 - Bump version to 0.9.1

Pull Request - State: closed - Opened by danieldk 8 months ago

#51 - {ByteBPE,SentencePiece}Processor: support file objects

Pull Request - State: open - Opened by danieldk 9 months ago
Labels: enhancement

#50 - Switch from distutils to setuptools (#49)

Pull Request - State: closed - Opened by adrianeboyd 10 months ago

#49 - Switch from distutils to setuptools

Pull Request - State: closed - Opened by adrianeboyd 10 months ago - 1 comment

#48 - Bump version to 0.9.0

Pull Request - State: closed - Opened by danieldk 10 months ago

#47 - Set version to `0.0.8`

Pull Request - State: closed - Opened by shadeMe 11 months ago

#46 - Backport fixes to `spacy-3.x`

Pull Request - State: closed - Opened by shadeMe 11 months ago

#45 - Lower the version number to 0.9.0.dev0

Pull Request - State: closed - Opened by danieldk 11 months ago

#44 - Add support for `copy`/`deepcopy` + `SentencePieceProcessor` deserialization bugfix

Pull Request - State: closed - Opened by shadeMe 11 months ago
Labels: bug, enhancement

#43 - `id_to_piece` and `piece_to_id` return `None` in case of invalid inputs

Pull Request - State: closed - Opened by shadeMe 11 months ago
Labels: enhancement

#42 - `SentencePieceProcessor`: `piece_to_id` throws when piece token is unknown

Pull Request - State: closed - Opened by shadeMe 11 months ago - 1 comment
Labels: enhancement

#41 - Set version to 1.0.0.dev0

Pull Request - State: closed - Opened by danieldk 12 months ago

#40 - Set version to `0.0.7`

Pull Request - State: closed - Opened by danieldk 12 months ago

#39 - Rename cutlery to curated-tokenizers

Pull Request - State: closed - Opened by danieldk 12 months ago

#38 - Set version to `0.0.6`

Pull Request - State: closed - Opened by shadeMe 12 months ago

#37 - Add `piece_to_id` and `id_to_piece` to `SentencePieceProcessor`

Pull Request - State: closed - Opened by shadeMe 12 months ago
Labels: enhancement

#36 - Set version to `v0.0.5`

Pull Request - State: closed - Opened by shadeMe about 1 year ago

#35 - New `WordPieceProcessor` methods

Pull Request - State: closed - Opened by shadeMe about 1 year ago
Labels: enhancement

#34 - Bump version to 0.0.4

Pull Request - State: closed - Opened by danieldk about 1 year ago

#33 - Add ByteBPEProcessor.decode_from_ids

Pull Request - State: closed - Opened by danieldk about 1 year ago
Labels: enhancement

#32 - Update sentencepiece to 0.1.98

Pull Request - State: closed - Opened by danieldk about 1 year ago - 1 comment

#31 - CI: Switch from Azure to GHA

Pull Request - State: closed - Opened by adrianeboyd about 1 year ago

#30 - Enable `mypy` in CI

Pull Request - State: closed - Opened by shadeMe about 1 year ago

#29 - Add GitHub PR template

Pull Request - State: closed - Opened by shadeMe about 1 year ago

#28 - Fix `curated-transformers` link

Pull Request - State: closed - Opened by shadeMe about 1 year ago

#27 - Use GitHub Actions for CI

Pull Request - State: closed - Opened by shadeMe about 1 year ago - 2 comments

#26 - Bump version to 0.0.3

Pull Request - State: closed - Opened by danieldk over 1 year ago

#25 - Add regex to install_requires

Pull Request - State: closed - Opened by danieldk over 1 year ago

#24 - Fixup tests directory (for sdist)

Pull Request - State: closed - Opened by danieldk over 1 year ago

#23 - Bump version to 0.0.2

Pull Request - State: closed - Opened by danieldk over 1 year ago

#22 - Add a rudimentary README

Pull Request - State: closed - Opened by danieldk over 1 year ago

#21 - Add ByteBPEProcessor

Pull Request - State: closed - Opened by danieldk over 1 year ago

#20 - Port cutlery to Rust and pyo3

Pull Request - State: closed - Opened by danieldk over 1 year ago

#19 - Add support for BPE

Issue - State: closed - Opened by danieldk over 1 year ago

#18 - Add basic wordpiece processor

Pull Request - State: closed - Opened by danieldk over 1 year ago

#17 - Typing improvements

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#16 - Add __len__ method to get the piece vocab size

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#15 - Fix segmentation fault when using *_id with uninitialized processor

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#14 - Add {bos,eos,pad,unk}_id methods

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#13 - Do not use NumPy arrays in the interface

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#12 - Expose the main class as cysp.SentencePieceProcessor

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#11 - Rename class Processor -> SentencePieceProcessor

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#10 - Do not use cmake/scikit-build

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#9 - Add Windows to CI

Pull Request - State: closed - Opened by danieldk almost 2 years ago - 1 comment

#7 - Add Processor::{decode_from_ids,decode_from_pieces}

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#6 - Add serialization from and to protobuf binary data

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#5 - Add Processor.{as_ids,as_pieces}

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#4 - Test input with NUL character

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#3 - Return piece identifiers as a NumPy array

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#2 - Basic initial encode method

Pull Request - State: closed - Opened by danieldk almost 2 years ago

#1 - Add CI

Pull Request - State: closed - Opened by danieldk almost 2 years ago